AFBuratin/DECMiMo

DECMiMo

Differential expression analysis of circular RNAs (circRNAs) using Generalized Linear Mixed Models (GLMMs).

Here we present a novel approach to detect differentially expressed circRNAs (DECs). The approach combines circRNA expression estimates from different detection tools (CIRI, DCC, CircExplorer2 and findcirc), and we evaluate it in terms of:

  • the ability of differential expression detection methods to control the Type I Error;
  • the consistency of differential expression detection methods;
  • the power of differential expression detection methods.

The data used in this analysis were retrieved from public GEO series of ribo-depleted RNA-seq samples covering two or more conditions (GSE53697, GSE86356, GSE52463).

GLMM matrix composition

To build the combined matrix (CMAT) on which the GLMM is fitted, we used the function get_combined_matrix from the egaffo/CREART R package. The function takes as input x one of the following: the list of the methods' outputs to be combined, the path to the CirComPara2 results, or the full path to the circrnas.gtf file from the CirComPara output. Use the option select_methods to specify the names of the detection tools to keep when composing the CMAT.

Robustness

The directory ./robustness/ contains:

  • getPheno.R, which creates a .txt file containing B combinations of samples for the creation of synthetic datasets;
  • glm_glmm_paired.R and DEscripts.R, which fit the Negative Binomial and GLMM models to each synthetic dataset and save the results as .RData;
  • SensitivityPrecision.Rnw, which computes specificity, sensitivity and other measures from the p-values generated by each method in the simulations;
  • plot_eval.R, which puts the information from all datasets together and then plots the results.
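The sample-combination step can be illustrated with a minimal base R sketch. This is not the code in getPheno.R: the sample sheet `pheno`, the values of `B` and `n_per_group`, the helper `draw_subset` and the output layout are all assumptions made for illustration.

```r
# Hypothetical sketch: draw B random sample subsets per condition and save them.
set.seed(42)

# Assumed sample sheet: 6 samples in each of two conditions.
pheno <- data.frame(
  sample    = paste0("S", 1:12),
  condition = rep(c("A", "B"), each = 6),
  stringsAsFactors = FALSE
)

B <- 10           # assumed number of synthetic datasets
n_per_group <- 3  # assumed number of samples drawn per condition

# Draw n_per_group samples without replacement from each condition.
draw_subset <- function(pheno, n_per_group) {
  picked <- lapply(split(pheno$sample, pheno$condition), sample, size = n_per_group)
  unlist(picked, use.names = FALSE)
}

# One row per synthetic dataset, one column per selected sample.
combos <- t(replicate(B, draw_subset(pheno, n_per_group)))
colnames(combos) <- c(paste0("A", 1:n_per_group), paste0("B", 1:n_per_group))

# Save the combinations as a tab-separated text file.
out_file <- tempfile(fileext = ".txt")
write.table(combos, out_file, quote = FALSE, row.names = FALSE, sep = "\t")
```

Each row of `combos` would then identify the samples used to build one synthetic dataset.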

parametric_sim

The directory ./parametric_sim/ contains:

  • datasets_and_models.R and sampling_func_glmm.R, which estimate the Negative Binomial parametric distributions used as templates for the simulated datasets;
  • simulator_New.R, which creates the simulation framework for the evaluation of both GLM and GLMM models;
  • eval_function_call.R, which tests the differential expression detection methods;
  • evalPVals.R, which computes specificity, sensitivity and other measures from the p-values generated by each method in the simulations.
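As a rough illustration of the parametric simulation and p-value evaluation idea, the base R sketch below simulates Negative Binomial counts with a known set of truly changed features and then computes sensitivity, specificity and precision. All parameter values and the helper `evaluate_pvals` are assumptions, and a Welch t-test on log counts stands in for the GLM and GLMM fits used in the repository.

```r
set.seed(1)

n_feat <- 200  # simulated circRNAs
n_rep  <- 5    # samples per condition
n_de   <- 20   # features with a true expression change
mu0    <- 20   # assumed baseline mean count
size   <- 2    # assumed NB size (inverse dispersion)
fold   <- 4    # assumed fold change for the changed features

truth <- rep(c(TRUE, FALSE), c(n_de, n_feat - n_de))
mu_b  <- ifelse(truth, mu0 * fold, mu0)

# Rows are features, columns are samples.
counts_a <- matrix(rnbinom(n_feat * n_rep, mu = mu0, size = size), nrow = n_feat)
counts_b <- matrix(rnbinom(n_feat * n_rep, mu = rep(mu_b, n_rep), size = size),
                   nrow = n_feat)

# Stand-in test (not the repository's models): Welch t-test on log2 counts.
pvals <- vapply(seq_len(n_feat), function(i) {
  tryCatch(
    t.test(log2(counts_a[i, ] + 1), log2(counts_b[i, ] + 1))$p.value,
    error = function(e) NA_real_
  )
}, numeric(1))
pvals[is.na(pvals)] <- 1  # treat untestable features as not significant

# Confusion-matrix summaries at a given significance threshold.
evaluate_pvals <- function(pvals, truth, alpha = 0.05) {
  called <- pvals < alpha
  tp <- sum(called & truth);  fp <- sum(called & !truth)
  fn <- sum(!called & truth); tn <- sum(!called & !truth)
  c(sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp),
    precision   = if (tp + fp > 0) tp / (tp + fp) else NA_real_)
}

metrics <- evaluate_pvals(pvals, truth)
```

Repeating this over many simulated datasets and methods would give the kind of per-method summaries described above.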

Consistency

The directory ./consinstency/ contains:

  • consistency_replicability.Rmd, which loads the DEC results from the robustness evaluation and then assesses the differential expression detection methods in terms of Concordance At the Top (CAT).
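For readers unfamiliar with Concordance At the Top, a minimal base R sketch follows. It assumes the usual definition: for each rank i, the fraction of features shared by the top i entries of two ranked lists. The function name `cat_curve` and the example rankings are hypothetical and are not taken from consistency_replicability.Rmd.

```r
# Concordance At the Top for two ranked lists of feature identifiers.
# At rank i, the concordance is |top_i(a) intersect top_i(b)| / i.
cat_curve <- function(rank_a, rank_b,
                      max_rank = min(length(rank_a), length(rank_b))) {
  vapply(seq_len(max_rank), function(i) {
    length(intersect(rank_a[seq_len(i)], rank_b[seq_len(i)])) / i
  }, numeric(1))
}

# Two hypothetical rankings of circRNAs, ordered by increasing p-value.
subset1 <- c("circ1", "circ2", "circ3", "circ4", "circ5")
subset2 <- c("circ2", "circ1", "circ5", "circ3", "circ4")

conc <- cat_curve(subset1, subset2)
# conc is 0, 1, 2/3, 0.75, 1 for ranks 1 to 5
```

A curve that stays close to 1 across ranks indicates that the two rankings agree on which features are most significant.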

Type I Error Control

The directory ./type_I_error_control/ contains the TIEC.Rmd file, which loads the DEC results estimated with glm_glmm_paired.R and DEscripts.R and evaluates the ability of the differential expression detection methods to control the Type I error. The evaluation uses mock datasets without differentially abundant features, generated with the getSampleShuffle.R script.
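The logic of a Type I error check on mock data can be sketched in base R as follows. This is an assumption-laden illustration, not the content of TIEC.Rmd or getSampleShuffle.R: the counts are simulated from a single Negative Binomial distribution, the group labels are assigned at random, and a Welch t-test on log counts stands in for the GLM and GLMM fits.

```r
set.seed(7)

n_feat <- 300  # assumed number of circRNAs
n_samp <- 12   # assumed number of samples

# All samples come from the same distribution, so no feature is truly changed.
counts <- matrix(rnbinom(n_feat * n_samp, mu = 30, size = 3), nrow = n_feat)

# Mock design: samples are randomly assigned to two groups of equal size.
mock_group <- sample(rep(c("g1", "g2"), each = n_samp / 2))
in_g1 <- mock_group == "g1"

# Stand-in test (not the repository's models): Welch t-test on log2 counts.
pvals <- apply(counts, 1, function(y) {
  ly <- log2(y + 1)
  tryCatch(t.test(ly[in_g1], ly[!in_g1])$p.value, error = function(e) NA_real_)
})

# Observed false positive rate at several nominal thresholds.
alpha <- c(0.01, 0.05, 0.1)
observed_fpr <- vapply(alpha, function(a) mean(pvals < a, na.rm = TRUE), numeric(1))
names(observed_fpr) <- alpha
```

A method that controls the Type I error should give observed rates close to, or below, each nominal threshold when this is repeated over many mock datasets.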

Data

Since the entire data production took a long time, the ./data/ directory contains several outputs from all the analyses. This should make it easier for the user to replicate the results.

Instructions and R environment

To replicate the analyses, we strongly suggest cloning or downloading the entire GitHub repository. Some of the functions used in this paper are adapted from the work "Assessment of statistical methods from single cell, bulk RNA-seq and metagenomics applied to microbiome data"; the original code is available at https://github.com/mcalgaro93/sc2meta. The analyses were run under several versions of R during development; R 4.1.2 was the final version on which the methods worked.
