ainefairbrother/MitoNuclear_coexpression_pipeline

MitoNuclear_coexpression_pipeline

See associated publication: https://www.nature.com/articles/s42003-021-02792-w

Pipeline to produce mitochondrial-nuclear correlation matrices from TPM matrices

| Function_file | Function_name | Description | Input | Output | Dataset specific |
| --- | --- | --- | --- | --- | --- |
| filterNullGenesAndSamps.py | filterNullGenesAndSamps | Removes samples with TPM=0 in all genes and retains only genes where TPM>0 in all samples. | One or more .csv files in the format: rows=samples, columns=genes | Filtered .csv file | No |
| log10MedNormalise.R | log10MedNormalise | Log10 median normalises counts. The log10 transformation brings sample distributions closer to normal; median normalisation then equalises the sample expression medians so that samples are comparable. | One or more .csv files in the format: columns=samples, rows=genes | Log10 median normalised .csv file | No |
| maskGeneOutliers.py | maskGeneOutliers | Masks gene outlier values (LQ +/- 3 IQR and UQ +/- 3 IQR) with NaN. | One or more .csv files in the format: rows=samples, columns=genes | Masked outliers .csv file | No |
| gtex_regress_covariates.py | gtex_regress_covariates | Runs a linear model to regress out (hardcoded) covariates from GTEx CNS data. | 1. One or more .csv files in the format: rows=samples, columns=genes; 2. Metadata from the GTEx portal; 3. Phenotype data from the GTEx portal | Covariate-corrected residuals .csv file | Yes: GTEx V6p CNS |
| rosmap_regress_covariates.py | rosmap_regress_covariates | Runs a linear model to regress out (hardcoded) covariates from ROS/MAP case-control frontal cortex data. | 1. One or more .csv files in the format: rows=samples, columns=genes; 2. Metadata from the Synapse portal, including the ROS/MAP ID table, clinical metadata and RNAseq metadata (preprocessed using ROSMAP_preprocess_and_covs.ipynb) | Covariate-corrected residuals .csv file | Yes: ROS/MAP case-control frontal cortex |
| genCorrs.py | genCorrs | Generates all pairwise gene correlations (Spearman or Pearson). | One or more .csv files in the format: rows=samples, columns=genes | 1. Correlation matrix .csv file; 2. p-value matrix .csv file | No |
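
The dataset-agnostic steps above can be illustrated with a minimal pandas sketch. This is not the code shipped in this repository: the snake_case function names are hypothetical, the outlier fences are assumed to be LQ - 3 IQR and UQ + 3 IQR, and the median normalisation is assumed to shift each sample so that all sample medians coincide. Unlike the R script, every function here takes rows=samples, columns=genes, and the p-value matrix produced by genCorrs is not computed.

```python
import numpy as np
import pandas as pd


def filter_null_genes_and_samples(tpm: pd.DataFrame) -> pd.DataFrame:
    """Filter a TPM table with rows=samples and columns=genes."""
    # Drop samples whose TPM is 0 for every gene.
    tpm = tpm.loc[~(tpm == 0).all(axis=1)]
    # Keep only genes with TPM > 0 in every remaining sample.
    return tpm.loc[:, (tpm > 0).all(axis=0)]


def log10_median_normalise(tpm: pd.DataFrame) -> pd.DataFrame:
    """Log10 transform, then align sample medians (assumed scheme)."""
    # Safe without a pseudocount only because filtering left TPM > 0 everywhere.
    logged = np.log10(tpm)
    sample_medians = logged.median(axis=1)
    # Assumption: shift each sample so its median equals the mean sample median.
    return logged.sub(sample_medians - sample_medians.mean(), axis=0)


def mask_gene_outliers(expr: pd.DataFrame, k: float = 3.0) -> pd.DataFrame:
    """Replace per-gene outliers with NaN using assumed IQR fences."""
    lq = expr.quantile(0.25)
    uq = expr.quantile(0.75)
    iqr = uq - lq
    # Assumption: an outlier lies below LQ - k*IQR or above UQ + k*IQR.
    outside = expr.lt(lq - k * iqr, axis=1) | expr.gt(uq + k * iqr, axis=1)
    return expr.mask(outside)


def pairwise_gene_correlations(expr: pd.DataFrame, method: str = "spearman") -> pd.DataFrame:
    """Gene-by-gene correlation matrix; NaN values are dropped pairwise."""
    return expr.corr(method=method)
```

Chained in the order of the table, a call would look like `pairwise_gene_correlations(mask_gene_outliers(log10_median_normalise(filter_null_genes_and_samples(tpm))))`, with the covariate regression step (omitted here because it is dataset specific) applied before the correlations.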
