Although many technologies and strategies can be used to identify genetic factors of complex human diseases, results from these studies remain hard to interpret because the diseases are complex and multifactorial. Inflammation has recently been associated with many complex diseases and may cause long-term damage to the human body. Here we examined whether a history of complex disease systematically alters human tissue transcriptomes and whether inflammation is linked to identifiable signatures, using over 16,000 postmortem samples from the Genotype-Tissue Expression (GTEx) project. We analyzed expression profiles of subjects with and without a medical history of cerebrovascular disease (CVD) or major depression (MD). More details can be found in our paper.
- Comparison of the sample numbers between v6 and v8 releases: `comp.v6.v8.Rmd`
- Information about subjects and samples:
- Principal Coordinates Analysis (PCoA) of tissues: `PCoA.Rmd`
  - PCoA was used to assess which tissues could be merged into a group based on similarity; in the end we decided not to merge tissues
- Comparison of tissue-aware normalization methods: `normalization.Rmd`
  - We decided to use tissue-aware quantile normalization
- `ExpressionSet` object generation: `expressionSet.Rmd`
  - Gathers the sample and subject data into one R object
- Filtering: `filter.merge.Rmd`
  - Subjects were selected using the criteria in the script
  - Tissues with very few samples were removed
  - Genes were filtered in a tissue-aware manner, and genes on the Y chromosome were removed
- Differential expression analysis using limma-voom: `limma.voom.Rmd`, `dprssn.de.Rmd`
  - Linear mixed model: `Y ~ Sex + Ischemic time + Age + Batch + Hardy + MHCVD/MHDPRSSN`
- Pre-ranked gene set enrichment analysis using GSEA: `run.prerankedGSEA.sh`
  - Genes were ranked by absolute t-statistics from limma results
  - `gmt` files were processed from annotation files of various databases
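
The ranking step for pre-ranked GSEA can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the code in `run.prerankedGSEA.sh`: the toy table only mimics the `t` column of `limma::topTable()` output, the gene names are made up, and the helper `make_rank_table` is hypothetical.

```r
# Toy stand-in for limma::topTable() output (assumption: real results
# would come from the limma-voom fit for one tissue).
de <- data.frame(
  gene = c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
  t    = c(-4.2, 1.1, 3.5, -0.3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: rank genes by absolute t-statistic, largest first.
make_rank_table <- function(de) {
  ranked <- data.frame(gene = de$gene, score = abs(de$t),
                       stringsAsFactors = FALSE)
  ranked <- ranked[order(ranked$score, decreasing = TRUE), ]
  rownames(ranked) <- NULL
  ranked
}

rnk <- make_rank_table(de)

# Write a two-column, tab-delimited ranked list (gene, score).
rnk_file <- tempfile(fileext = ".rnk")
write.table(rnk, rnk_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

Ranking by the absolute value discards the direction of change, so enrichment computed on such a list reflects the magnitude of differential expression rather than up- or down-regulation.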
Gene expression variation was inferred from several public GWAS summary statistics using the S-PrediXcan software, for tissues with expression differences between the CVD and non-CVD cohorts or between the MD and non-MD cohorts. Genes with significant associations were compared to the differentially expressed genes from limma-voom. The code can be found here.
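
The comparison between S-PrediXcan associations and differentially expressed genes amounts to a set intersection, which might look like the base R sketch below. Everything here is illustrative: the two tables are toy data, the column names (`gene_name`, `pvalue`, `adj.P.Val`) are assumptions about how the results are stored, and the significance thresholds are placeholders rather than the ones used in the study.

```r
# Toy S-PrediXcan-style association table (column names are assumptions).
spredixcan <- data.frame(
  gene_name = c("GENE_A", "GENE_B", "GENE_C", "GENE_E"),
  pvalue    = c(1e-8, 0.20, 3e-6, 1e-7),
  stringsAsFactors = FALSE
)

# Toy limma-voom results (adj.P.Val as in limma::topTable() output).
deg <- data.frame(
  gene_name = c("GENE_A", "GENE_C", "GENE_D"),
  adj.P.Val = c(0.001, 0.20, 0.01),
  stringsAsFactors = FALSE
)

# Illustrative thresholds only.
assoc_alpha <- 0.05 / nrow(spredixcan)  # Bonferroni across tested genes
deg_alpha   <- 0.05

sig_assoc <- spredixcan$gene_name[spredixcan$pvalue < assoc_alpha]
sig_deg   <- deg$gene_name[deg$adj.P.Val < deg_alpha]

# Genes supported by both analyses.
shared <- intersect(sig_assoc, sig_deg)
print(shared)
```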
- Plots for differentially expressed genes: `cvd.deg.plot.Rmd`, `dprssn.deg.plot.Rmd`
- Plots for enriched pathways: `pathway.plot.Rmd`
Warnings from function `lmFit`: if any term in the linear model has only one sample, the following warning is raised: `Warning: Partial NA coefficients for n probe(s)`
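
Since the note above attributes the warning to terms with a single sample, one way to spot such terms before fitting is to count samples per factor level. The base R sketch below assumes the covariates are stored as factors in a data frame; the toy `pheno` table and the helper `single_sample_levels` are hypothetical and not part of the repository.

```r
# Toy covariate table (assumption: in practice this would be the sample
# metadata used to build the design matrix).
pheno <- data.frame(
  Sex   = factor(c("M", "F", "M", "F", "M")),
  Batch = factor(c("B1", "B1", "B2", "B2", "B3")),  # B3 has one sample
  Hardy = factor(c("0", "0", "2", "2", "2"))
)

# Hypothetical helper: list factor levels represented by exactly one sample.
single_sample_levels <- function(pheno) {
  is_fac <- vapply(pheno, is.factor, logical(1))
  out <- lapply(pheno[is_fac], function(x) {
    counts <- table(x)
    names(counts)[as.vector(counts) == 1]
  })
  # Keep only covariates that have at least one such level.
  out[lengths(out) > 0]
}

singles <- single_sample_levels(pheno)
print(singles)
```

Covariates reported by such a check are candidates for closer inspection in tissues where the warning appears.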
Poon C-L and Chen C-Y (2021) Exploring the Impact of Cerebrovascular Disease and Major Depression on Non-diseased Human Tissue Transcriptomes. Front. Genet. 12:696836. doi: 10.3389/fgene.2021.696836