cyclab/GTEx-Complex-Diseases

GTEx-Complex-Diseases

Introduction

Although many technologies and strategies can be used to identify the genetic factors behind human complex diseases, the results of these studies remain hard to interpret because the diseases are complex and multifactorial. Inflammation has recently been associated with many complex diseases and may cause long-term damage to the human body. Here we examined whether a history of complex disease systematically alters human tissue transcriptomes and whether inflammation is linked to identifiable signatures, using over 16,000 postmortem samples from the Genotype-Tissue Expression (GTEx) project. We analyzed expression profiles of subjects with and without a medical history of cerebrovascular disease (CVD) or major depression (MD). More details can be found in our paper (see Publication).

Data Exploration

  • Comparison of the sample numbers between v6 and v8 releases: comp.v6.v8.Rmd
  • Information about subjects and samples:
  • Principal Coordinates Analysis (PCoA) of tissues: PCoA.Rmd
    • PCoA was used to assess which tissues could be merged into groups based on similarity; in the end we decided not to merge tissues
  • Comparison of tissue-aware normalization methods: normalization.Rmd
    • We decided to use tissue-aware quantile normalization
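The idea of normalizing within each tissue can be illustrated with a minimal sketch. This is not the code in normalization.Rmd, and the repository may rely on a different implementation; it is a plain quantile normalization applied separately to each tissue group, written in Python for illustration. The function names and the input layout (`expr`, `tissue_of`) are assumptions.

```python
from collections import defaultdict


def quantile_normalize(columns):
    """Quantile-normalize equal-length sample columns (one value per gene).

    Ties are broken by gene order rather than averaged, which is a
    simplification compared with production implementations.
    """
    n_genes = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [
        sum(col[k] for col in sorted_cols) / len(columns) for k in range(n_genes)
    ]
    normalized = []
    for col in columns:
        order = sorted(range(n_genes), key=lambda i: col[i])
        out = [0.0] * n_genes
        for rank, gene_idx in enumerate(order):
            out[gene_idx] = reference[rank]
        normalized.append(out)
    return normalized


def tissue_aware_quantile_normalize(expr, tissue_of):
    """Normalize samples only against other samples from the same tissue.

    expr:      {sample_id: [expression value per gene]}  (hypothetical layout)
    tissue_of: {sample_id: tissue name}
    """
    by_tissue = defaultdict(list)
    for sample in expr:
        by_tissue[tissue_of[sample]].append(sample)

    result = {}
    for samples in by_tissue.values():
        normed = quantile_normalize([expr[s] for s in samples])
        for sample, col in zip(samples, normed):
            result[sample] = col
    return result
```

Because the reference distribution is computed per tissue, a tissue with globally higher expression is not pulled toward the distribution of other tissues.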

RNA-seq Analysis

Preprocessing

  • ExpressionSet object generation: expressionSet.Rmd
    • Gathers the sample and subject data into one R object
  • Filtering: filter.merge.Rmd
    • Subjects were selected by the criteria in the script
    • Tissues with very few samples were removed
    • Genes were filtered in a tissue-aware manner, and Y-chromosome genes were removed
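As a rough illustration of tissue-aware gene filtering, the sketch below keeps a gene for a given tissue only if it is expressed in enough of that tissue's samples, and drops Y-chromosome genes everywhere. The actual criteria live in filter.merge.Rmd; the thresholds `min_value` and `min_fraction`, the function name, and the input layout are all hypothetical.

```python
def filter_genes_by_tissue(expr, tissue_of, chrom_of, min_value=1.0, min_fraction=0.5):
    """Return {tissue: [genes kept for that tissue]}.

    expr:      {gene: {sample_id: expression value}}  (hypothetical layout)
    tissue_of: {sample_id: tissue name}
    chrom_of:  {gene: chromosome label such as "chr1" or "chrY"}
    The two thresholds are placeholders, not the values used in the study.
    """
    samples_by_tissue = {}
    for sample, tissue in tissue_of.items():
        samples_by_tissue.setdefault(tissue, []).append(sample)

    kept = {}
    for tissue, samples in samples_by_tissue.items():
        kept[tissue] = []
        for gene, values in expr.items():
            if chrom_of.get(gene) == "chrY":
                continue  # Y-chromosome genes are removed for every tissue
            n_expressed = sum(1 for s in samples if values[s] >= min_value)
            if n_expressed / len(samples) >= min_fraction:
                kept[tissue].append(gene)
    return kept
```

Filtering per tissue means a gene that is silent in one tissue can still be retained for another tissue where it is expressed.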

Differential Expression Analysis

  • Differential expression analysis using limma-voom: limma.voom.Rmd, dprssn.de.Rmd
    • Linear mixed model: Y ~ Sex + Ischemic time + Age + Batch + Hardy + MHCVD/MHDPRSSN (the last term is MHCVD for the CVD analysis or MHDPRSSN for the MD analysis)
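To make the model formula more concrete, here is a minimal sketch of how such a formula expands into design-matrix columns, with an intercept, numeric covariates kept as they are, and factors expanded with treatment coding (first sorted level as the reference). It is written in Python for illustration and is not the repository's R code. Which covariates are treated as factors is an assumption here, as are the function name and the subject records.

```python
def design_matrix(subjects, terms):
    """Build a treatment-coded design matrix.

    subjects: list of dicts, one per subject (hypothetical records)
    terms:    ordered list of (name, kind), kind is "numeric" or "factor"
    Returns (column names, rows).
    """
    header = ["(Intercept)"]
    encoders = []
    for name, kind in terms:
        if kind == "numeric":
            header.append(name)
            encoders.append(lambda s, name=name: [float(s[name])])
        else:
            # The first sorted level is the reference and gets no column.
            levels = sorted({s[name] for s in subjects})
            header.extend(f"{name}{lvl}" for lvl in levels[1:])
            encoders.append(
                lambda s, name=name, levels=levels: [
                    1.0 if s[name] == lvl else 0.0 for lvl in levels[1:]
                ]
            )
    rows = [[1.0] + [v for enc in encoders for v in enc(s)] for s in subjects]
    return header, rows


# Example with a subset of the covariates; values are made up.
subjects = [
    {"Sex": "F", "Age": 60, "MHCVD": "0"},
    {"Sex": "M", "Age": 50, "MHCVD": "1"},
]
header, rows = design_matrix(
    subjects, [("Sex", "factor"), ("Age", "numeric"), ("MHCVD", "factor")]
)
```

In this coding, the coefficient of the disease column is the expression difference between subjects with and without the medical history, adjusted for the other covariates.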

Functional Enrichment Analysis

  • Pre-ranked gene set enrichment analysis using GSEA: run.prerankedGSEA.sh
    • Genes were ranked by absolute t-statistics from limma results
    • gmt files were generated from the annotation files of various databases
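The two inputs of a pre-ranked GSEA run can be sketched as follows: a rank file with one gene and one score per line, sorted by absolute t-statistic, and a gmt file with one gene set per line (set name, description, then member genes, all tab-separated). This is an illustrative Python sketch, not the contents of run.prerankedGSEA.sh; the function names and the number formatting are assumptions.

```python
import io


def write_rnk(t_stats, handle):
    """Write genes ranked by absolute t-statistic as tab-separated lines.

    t_stats: {gene: t-statistic}, e.g. taken from a limma results table.
    Returns the genes in ranked order.
    """
    ranked = sorted(t_stats.items(), key=lambda kv: abs(kv[1]), reverse=True)
    for gene, t in ranked:
        handle.write(f"{gene}\t{abs(t):.4f}\n")
    return [gene for gene, _ in ranked]


def read_gmt(handle):
    """Parse gmt lines into {set name: [genes]}."""
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        gene_sets[name] = [g for g in genes if g]
    return gene_sets


# Usage with in-memory files; gene names and values are made up.
rnk = io.StringIO()
order = write_rnk({"A": -3.2, "B": 1.5, "C": 2.0}, rnk)
sets = read_gmt(io.StringIO("SET1\tdesc\tA\tB\nSET2\tna\tC\n"))
```

Ranking by the absolute value discards the direction of change, so enrichment reflects how strongly a gene set is affected rather than whether it is up- or down-regulated.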

Transcriptome-Wide Association Study

Gene expression variation was inferred from several public GWAS summary statistics using S-PrediXcan for tissues that showed expression differences between the CVD and non-CVD cohorts or between the MD and non-MD cohorts. Genes with significant associations were compared to the differentially expressed genes from limma-voom. The code can be found here.
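The comparison step can be sketched as a set overlap between significant TWAS genes and differentially expressed genes. The sketch below is in Python for illustration and is not the repository's code. The Bonferroni cutoff is an assumption, since the significance criterion is not stated here, and the record fields `gene_name` and `pvalue` are assumed column names for the association results.

```python
def significant_genes(records, alpha=0.05):
    """Return gene names passing a Bonferroni-corrected p-value cutoff.

    records: list of dicts with "gene_name" and "pvalue" keys (assumed fields).
    The Bonferroni correction is a placeholder for the study's own criterion.
    """
    if not records:
        return set()
    threshold = alpha / len(records)
    return {r["gene_name"] for r in records if r["pvalue"] < threshold}


def compare_gene_sets(twas_genes, de_genes):
    """Split genes into shared, TWAS-only, and DE-only groups."""
    twas, de = set(twas_genes), set(de_genes)
    return {
        "both": sorted(twas & de),
        "twas_only": sorted(twas - de),
        "de_only": sorted(de - twas),
    }


# Usage with made-up values.
twas = significant_genes(
    [
        {"gene_name": "A", "pvalue": 0.001},
        {"gene_name": "B", "pvalue": 0.04},
        {"gene_name": "C", "pvalue": 0.0001},
    ]
)
overlap = compare_gene_sets(twas, ["C", "D"])
```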

Plots

Notes

Warning from lmFit: if any term in the linear model is represented by only one sample, lmFit emits the warning "Partial NA coefficients for n probe(s)".
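One way to anticipate this situation is to count the samples per factor level before fitting. The following is a minimal Python sketch of such a pre-check, not part of the repository; the function name and the subject records are hypothetical.

```python
from collections import Counter


def singleton_levels(subjects, factors):
    """Return {factor: [levels that occur in exactly one sample]}.

    subjects: list of dicts, one per sample (hypothetical records)
    factors:  names of the categorical covariates to check
    """
    flagged = {}
    for factor in factors:
        counts = Counter(s[factor] for s in subjects)
        singles = sorted(level for level, n in counts.items() if n == 1)
        if singles:
            flagged[factor] = singles
    return flagged


# Usage with made-up values: batch "B2" and sex "F" each occur once.
subjects = [
    {"Batch": "B1", "Sex": "M"},
    {"Batch": "B1", "Sex": "M"},
    {"Batch": "B2", "Sex": "F"},
]
flagged = singleton_levels(subjects, ["Batch", "Sex"])
```

Levels reported by this check are candidates for merging or removal before the model is fitted.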

Publication

Poon C-L and Chen C-Y (2021) Exploring the Impact of Cerebrovascular Disease and Major Depression on Non-diseased Human Tissue Transcriptomes. Front. Genet. 12:696836. doi: 10.3389/fgene.2021.696836
