In [1]:
library("DESeq2")

Loading required package: S4Vectors
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, basename, cbind, colnames,
    dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
    rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which, which.max, which.min


Attaching package: ‘S4Vectors’

The followin

In [2]:
# Which count matrices to load: any of 'filtered', 'all'
TYPE = c('filtered', 'all')

# Which layers to analyse: any of 'matrix', 'spliced', 'unspliced', 'ambiguous'
LAYER = c('spliced')

# Which pairwise comparisons to run: each element is a pair of time points drawn from 'm5', 'm3', 'm1'
TIMESERIES = list(c('m5', 'm3'), c('m3', 'm1'), c('m5', 'm1'))

In [3]:
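# Read the count matrix for one type/layer/comparison, e.g.
# tsv_matrices/matrix_filtered_spliced_m5-m3.tsv; the first column holds the gene names.
load_tsv = function(type, layer, comparison) {
    filename = paste(paste('matrix', type, layer, paste(comparison, collapse = "-"), sep='_'), 'tsv', sep='.')
    path = file.path('tsv_matrices', filename)
    df = read.table(path, row.names=1)

    return(df)
}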
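The naming scheme used by `load_tsv` can be checked without touching the file system. The sketch below isolates it in a helper; `build_filename` is a hypothetical name introduced here for illustration and is not part of the notebook.

```r
# Hypothetical helper: reproduces the paste() logic from load_tsv
# so the resulting file name can be inspected on its own.
build_filename = function(prefix, type, layer, comparison, ext) {
    # Join the comparison pair with '-', then all parts with '_'.
    stem = paste(prefix, type, layer, paste(comparison, collapse = '-'), sep = '_')
    paste(stem, ext, sep = '.')
}

input_name = build_filename('matrix', 'filtered', 'spliced', c('m5', 'm3'), 'tsv')
print(input_name)                             # "matrix_filtered_spliced_m5-m3.tsv"
print(file.path('tsv_matrices', input_name))  # "tsv_matrices/matrix_filtered_spliced_m5-m3.tsv"
```

The same pattern, with the prefix `deseq` and the extension `csv`, yields the names of the result files written later in the notebook.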

In [4]:
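# Run DESeq2 on a genes x samples count table and return the results table.
run_deseq_from_df = function(df) {
    # Column names are expected to look like '<condition>_<replicate>';
    # dropping the last '_'-separated field leaves the condition label.
    oocyte_id = names(df)
    condition = factor(sub("_[^_]+$", "", oocyte_id))

    coldata = data.frame(condition, row.names = oocyte_id)

    dds = DESeqDataSetFromMatrix(countData = df, colData = coldata, design = ~ condition)
    dds = DESeq(dds)

    # With no contrast given, the last factor level is compared against the first.
    res = results(dds)

    return(res)
}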
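The condition labels come entirely from the column names, so it helps to see what the regular expression does. This is a minimal sketch in base R; the sample names are hypothetical, since the real ones come from the TSV header.

```r
# Hypothetical sample names; the real column names come from the TSV header.
sample_names = c('m5_1', 'm5_2', 'm3_1', 'm3_2', 'late_m1_7')

# Strip the final '_'-separated field, as run_deseq_from_df does.
labels = sub("_[^_]+$", "", sample_names)
print(labels)  # "m5" "m5" "m3" "m3" "late_m1"

# factor() sorts levels alphabetically by default; the first level is the reference.
two_groups = factor(labels[1:4])
print(levels(two_groups))  # "m3" "m5"

# Setting the levels explicitly pins the reference level instead.
pinned = factor(labels[1:4], levels = c('m5', 'm3'))
print(levels(pinned)[1])  # "m5"
```

Because the default ordering is alphabetical, the comparisons in this notebook are reported as `m5 vs m3`, `m3 vs m1` and `m5 vs m1` regardless of the order inside each `TIMESERIES` pair. If a different direction is wanted, DESeq2's `results()` accepts an explicit contrast, for example `results(dds, contrast = c('condition', 'm3', 'm5'))`.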

In [5]:
# Assumption: create the output directory up front so write.csv does not fail if it is missing.
dir.create('deseq_results', showWarnings = FALSE)

for(type in TYPE) {
    for(layer in LAYER) {
        for(comparison in TIMESERIES) {
            cat('working on type ', type, ', layer', layer, ', comparison', comparison,  '\n')
            df = load_tsv(type, layer, comparison)

            print('tsv loaded, running deseq on dataframe')
            res = run_deseq_from_df(df)

            print("results are:")
            print(res)

            filename = paste(paste('deseq', type, layer, paste(comparison, collapse = "-"), sep='_'), 'csv', sep='.')
            path = file.path('deseq_results', filename)

            # End the message with a newline so it does not run into the next iteration's output.
            cat("saving to ", path, "\n")
            write.csv(as.data.frame(res), path, row.names=TRUE)
        }
    }
}

working on type  filtered , layer spliced , comparison m5 m3 
[1] "tsv loaded, running deseq on dataframe"


estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
-- note: fitType='parametric', but the dispersion trend was not well captured by the
   function: y = a/x + b, and a local regression fit was automatically substituted.
   specify fitType='local' or 'mean' to avoid this message next time.
final dispersion estimates
fitting model and testing


[1] "results are:"
log2 fold change (MLE): condition m5 vs m3 
Wald test p-value: condition m5 vs m3 
DataFrame with 46904 rows and 6 columns
                  baseMean     log2FoldChange             lfcSE
                 <numeric>          <numeric>         <numeric>
Y74C9A.6                 0                 NA                NA
homt-1    53.9300160908429 -0.100514695422401  0.28911050227232
rcor-1    181.933165757509  0.124955511474888 0.329165413486626
Y74C9A.9                 0                 NA                NA
sesn-1    15.9986257630807  0.880131543281224 0.773793525465167
...                    ...                ...               ...
T23E7.8                  0                 NA                NA
T23E7.2   6.53555497407612  -1.82109345422477  1.81810263940779
cgt-2                    0                 NA                NA
6R55.2                   0                 NA                NA
cTel55X.1 61.1546608542507   0.02509652963344 0.268049002827752

estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
-- note: fitType='parametric', but the dispersion trend was not well captured by the
   function: y = a/x + b, and a local regression fit was automatically substituted.
   specify fitType='local' or 'mean' to avoid this message next time.
final dispersion estimates
fitting model and testing


[1] "results are:"
log2 fold change (MLE): condition m3 vs m1 
Wald test p-value: condition m3 vs m1 
DataFrame with 46904 rows and 6 columns
                   baseMean     log2FoldChange             lfcSE
                  <numeric>          <numeric>         <numeric>
Y74C9A.6  0.746061349276833  -3.23454057421006  3.44150902732789
homt-1      58.793210647796   0.18007151991263 0.276125570554118
rcor-1     216.263677690695 -0.290891189987603 0.308641941060943
Y74C9A.9                  0                 NA                NA
sesn-1     16.5338466519946 -0.701730485232513 0.788989818004428
...                     ...                ...               ...
T23E7.8                   0                 NA                NA
T23E7.2    7.84488173165633   1.36858148340891  1.69111292329064
cgt-2                     0                 NA                NA
6R55.2                    0                 NA                NA
cTel55X.1  68.1602419355637 -0.028290210908841 0.255957830774199
             

estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
-- note: fitType='parametric', but the dispersion trend was not well captured by the
   function: y = a/x + b, and a local regression fit was automatically substituted.
   specify fitType='local' or 'mean' to avoid this message next time.
final dispersion estimates
fitting model and testing


[1] "results are:"
log2 fold change (MLE): condition m5 vs m1 
Wald test p-value: condition m5 vs m1 
DataFrame with 46904 rows and 6 columns
                   baseMean      log2FoldChange             lfcSE
                  <numeric>           <numeric>         <numeric>
Y74C9A.6  0.589034200063238   -2.52287297993932   3.4413510886837
homt-1      44.786023996071  0.0941989977750486 0.262876470122455
rcor-1     177.540199812444  -0.151812786377972 0.150469603600228
Y74C9A.9                  0                  NA                NA
sesn-1     17.2850334703627   0.186141327661603 0.466597167419155
...                     ...                 ...               ...
T23E7.8                   0                  NA                NA
T23E7.2    2.97429927828495  -0.431512776177526  1.78989484529144
cgt-2                     0                  NA                NA
6R55.2                    0                  NA                NA
cTel55X.1  54.5224430205443 0.00576149350266051 0.238210991741815


estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing


[1] "results are:"
log2 fold change (MLE): condition m5 vs m3 
Wald test p-value: condition m5 vs m3 
DataFrame with 46904 rows and 6 columns
                  baseMean     log2FoldChange             lfcSE
                 <numeric>          <numeric>         <numeric>
Y74C9A.6                 0                 NA                NA
homt-1     234.34970247437 -0.041524701903254  0.47604148405941
rcor-1    1143.52114109932  0.148090141396799 0.373900218932645
Y74C9A.9                 0                 NA                NA
sesn-1    44.5329722308789   1.14747073169367  0.98059790826643
...                    ...                ...               ...
T23E7.8                  0                 NA                NA
T23E7.2   15.8514956366396  -2.71391862066179  2.34222063514511
cgt-2                    0                 NA                NA
6R55.2                   0                 NA                NA
cTel55X.1 325.311249073947  0.113684319004029 0.482435475445608

estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing


[1] "results are:"
log2 fold change (MLE): condition m3 vs m1 
Wald test p-value: condition m3 vs m1 
DataFrame with 46904 rows and 6 columns
                  baseMean      log2FoldChange             lfcSE
                 <numeric>           <numeric>         <numeric>
Y74C9A.6  2.39302542895418   -4.91234194529653     3.40207996127
homt-1    228.406651225721   0.248084821820594 0.480559138355438
rcor-1    1282.89139571823  -0.339167188056068 0.364786382315377
Y74C9A.9                 0                  NA                NA
sesn-1    38.2188061641722  -0.711351817396375 0.944521728445087
...                    ...                 ...               ...
T23E7.8                  0                  NA                NA
T23E7.2   20.8907006904619    1.12178140920777  2.09002153083486
cgt-2                    0                  NA                NA
6R55.2                   0                  NA                NA
cTel55X.1 335.788233315637 -0.0819476472096226 0.454117269602968
             

estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing


[1] "results are:"
log2 fold change (MLE): condition m5 vs m1 
Wald test p-value: condition m5 vs m1 
DataFrame with 46904 rows and 6 columns
                  baseMean     log2FoldChange             lfcSE
                 <numeric>          <numeric>         <numeric>
Y74C9A.6  2.04815471070318  -4.42849131726789  3.40201922349342
homt-1    193.081600507037  0.224941893605807  0.43222772098022
rcor-1    1151.20041615433 -0.174413419080911 0.248085687178815
Y74C9A.9                 0                 NA                NA
sesn-1    47.9057046019741  0.454056928327253 0.604664685948042
...                    ...                ...               ...
T23E7.8                  0                 NA                NA
T23E7.2   7.42682665598407   -1.5633716011182   2.1812435426457
cgt-2                    0                 NA                NA
6R55.2                   0                 NA                NA
cTel55X.1 299.130248352324 0.0495530194117745 0.434746648134689
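Two things stand out in the output above. First, genes with `baseMean` 0 have no reads in any sample, and DESeq2 reports `NA` for them. Second, the 'filtered' runs print a note suggesting that `fitType` be set explicitly. The sketch below shows one way both could be handled. The toy matrix and its names are made up, and the DESeq2 part runs on simulated data from `makeExampleDESeqDataSet`, not on the notebook's matrices. Whether `fitType = 'local'` suits the real data is a judgment call this sketch does not settle.

```r
# Hypothetical toy count matrix: 4 genes x 4 samples (all names are made up).
counts_toy = matrix(
    c( 0,  0, 0,  0,
      12, 15, 9, 11,
       0,  0, 0,  0,
       3,  0, 5,  2),
    nrow = 4, byrow = TRUE,
    dimnames = list(c('geneA', 'geneB', 'geneC', 'geneD'),
                    c('m5_1', 'm5_2', 'm3_1', 'm3_2'))
)

# Keep only genes with at least one read across all samples.
keep = rowSums(counts_toy) > 0
filtered = counts_toy[keep, , drop = FALSE]
print(rownames(filtered))  # "geneB" "geneD"

# The DESeq2 part only runs when the package is installed.
if (requireNamespace('DESeq2', quietly = TRUE)) {
    suppressPackageStartupMessages(library(DESeq2))

    # Simulated data shipped with DESeq2, used here only so the calls can run.
    dds = makeExampleDESeqDataSet(n = 500, m = 6)

    # Same pre-filter applied to a DESeqDataSet.
    dds = dds[rowSums(counts(dds)) > 0, ]

    # Request the local regression fit directly instead of relying on the fallback.
    dds = DESeq(dds, fitType = 'local', quiet = TRUE)
    print(head(results(dds), 3))
}
```

Dropping all-zero rows before calling `DESeq` mainly shrinks the object and the result files. DESeq2 handles such rows on its own, so the remaining genes should be tested as before.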