This notebook compares the AIC scores calculated by the `compare_multiplicative_vs_additive.R` script, which fit multiplicative and additive models under the null hypothesis ($H_0: \beta_{AB}=0$) to the high-confidence pairs from the Gasperini et al. data.

# Load data

### Additive models
To begin, we load the fitted additive models and check how many of them converged. Models that did not converge will be excluded from the comparison.

In [2]:
add.mods <- readRDS('/iblm/netapp/data1/jezhou/crisprQTL/multiplicative_vs_additive_330_pairs_11-May-2023/fitted_additive_mods_null.rds')

In [3]:
# check for convergence: a `convergence` code of 0 is treated as success
add.convergence <- lapply(add.mods, function(x) {x$convergence})

# get ix of unconverged models: a missing (NULL) code or a nonzero code counts as a failure
add.failed <- vapply(add.convergence, function(code) {
    is.null(code) || code != 0
}, logical(1), USE.NAMES = FALSE)
add.unconverged.ix <- which(add.failed)

print(paste(length(add.unconverged.ix), "of ", length(add.mods), "fitted models failed to converge"))

[1] "39 of  330 fitted models failed to converge"
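
The check above counts two kinds of failure together: fits with no `convergence` field at all and fits with a nonzero code. As a minimal sketch of how the two could be told apart, the toy example below labels each fit and tabulates the labels. `toy.mods` and `toy.status` are hypothetical names, and the list holds made-up stand-ins rather than the real fitted models.

```r
# Toy stand-ins for fitted models (hypothetical; not the real fits)
toy.mods <- list(
    list(convergence = 0),   # converged
    list(convergence = 1),   # nonzero convergence code
    list(),                  # no convergence field at all
    list(convergence = 0)    # converged
)

# Label each fit by the reason it would be kept or dropped
toy.status <- vapply(toy.mods, function(x) {
    code <- x$convergence
    if (is.null(code)) {
        "missing"
    } else if (code == 0) {
        "converged"
    } else {
        "nonzero code"
    }
}, character(1))

# Count how many fits fall under each label
print(table(toy.status))
```

Applied to the real list, the same labelling would show whether the failures are mostly missing fits or explicit non-convergence codes.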


### Multiplicative models

We can do the same for the multiplicative models.

In [4]:
mult.mods <- readRDS("/iblm/netapp/data1/jezhou/crisprQTL/multiplicative_vs_additive_330_pairs_11-May-2023/fitted_multiplicative_mods_null.rds")

In [5]:
# check for convergence - stored under `converged` because a different function was used for fitting
mult.convergence <- lapply(mult.mods, function(x) {x$converged})

# get ix of unconverged models: a missing (NULL) flag or a FALSE flag counts as a failure
mult.failed <- vapply(mult.convergence, function(flag) {
    is.null(flag) || !flag
}, logical(1), USE.NAMES = FALSE)
mult.unconverged.ix <- which(mult.failed)

print(paste(length(mult.unconverged.ix), "of ", length(mult.mods), "fitted models failed to converge"))

[1] "31 of  330 fitted models failed to converge"


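Now we need a single list of the models that failed to converge under either fit, so that model fit is compared only for the pairs where both the additive and multiplicative models converged. Each index corresponds to one of the 330 fitted pairs.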

In [6]:
failed.genes <- union(mult.unconverged.ix, add.unconverged.ix)
print(paste(length(failed.genes), "genes failed to converge"))

[1] "39 genes failed to converge"
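
The union above gives the indices to drop; the indices to keep are its complement. A minimal sketch with made-up indices follows. It assumes, as the union of indices itself does, that both model lists are stored in the same order, so that index `i` refers to the same pair in each. All `toy.*` names and `n.models` are hypothetical.

```r
# Hypothetical values for illustration only
n.models <- 10
toy.add.failed <- c(2, 5, 7)
toy.mult.failed <- c(5, 7)

# Indices that failed under either model
toy.failed <- union(toy.mult.failed, toy.add.failed)

# Indices where both models converged: everything not in the failed set
toy.kept <- setdiff(seq_len(n.models), toy.failed)
print(toy.kept)
```

If the two lists were not aligned, matching on pair identifiers instead of positions would be the safer choice.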


### AIC table
Next we load the table that records, for each gene and enhancer pair, the AIC of both the additive (`AIC.identity`) and multiplicative (`AIC.log`) models.

In [6]:
aic.df <- read.csv("/iblm/netapp/data1/jezhou/crisprQTL/multiplicative_vs_additive_330_pairs/aic_summary.csv", header = TRUE)
head(aic.df, n = 10)

| | gene | enhancer1 | enhancer2 | AIC.log | AIC.identity |
|---|---|---|---|---|---|
| | `<chr>` | `<chr>` | `<chr>` | `<dbl>` | `<dbl>` |
| 1 | ENSG00000005249 | chr7.4040 | chr7.4045 | 580610.4 | 585463.5 |
| 2 | ENSG00000005249 | chr7.4040 | chr7.4046 | 580630.0 | 584817.8 |
| 3 | ENSG00000005249 | chr7.4040 | chr7.4041 | 580654.6 | 585157.6 |
| 4 | ENSG00000005249 | chr7.4040 | chr7.4042 | 580675.8 | 585097.4 |
| 5 | ENSG00000005249 | chr7.4040 | chr7.4048 | 580690.2 | 584724.6 |
| 6 | ENSG00000005249 | chr7.4040 | chr7.4050 | 580696.4 | 584085.0 |
| 7 | ENSG00000005249 | chr7.4045 | chr7.4046 | 580662.3 | 585497.6 |
| 8 | ENSG00000005249 | chr7.4045 | chr7.4041 | 580685.6 | 585944.6 |
| 9 | ENSG00000005249 | chr7.4045 | chr7.4042 | 580707.3 | 586160.7 |
| 10 | ENSG00000005249 | chr7.4045 | chr7.4048 | 580722.0 | 584854.3 |
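
Since a lower AIC indicates the better-supported model, one simple way to read this table is to take the difference between the two columns for each row. The sketch below retypes the first three rows shown above into a small data frame. The column names `delta.aic` and `preferred` are hypothetical, and no convergence filtering is applied here.

```r
# First three rows of the table above, retyped by hand for illustration
toy.aic <- data.frame(
    gene = rep("ENSG00000005249", 3),
    enhancer1 = rep("chr7.4040", 3),
    enhancer2 = c("chr7.4045", "chr7.4046", "chr7.4041"),
    AIC.log = c(580610.4, 580630.0, 580654.6),
    AIC.identity = c(585463.5, 584817.8, 585157.6)
)

# Positive difference: the multiplicative (log) model has the lower AIC
toy.aic$delta.aic <- toy.aic$AIC.identity - toy.aic$AIC.log
toy.aic$preferred <- ifelse(toy.aic$delta.aic > 0, "multiplicative", "additive")

print(toy.aic[, c("enhancer2", "delta.aic", "preferred")])
```

For these three rows the difference is positive in every case, so the multiplicative model has the lower AIC.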


In [7]:
dim(aic.df)