If you use this pipeline for published work, please cite our paper:
Tang, S., Buchman, A.S., Wang, Y. et al. Differential gene expression analysis based on linear mixed model corrects false positive inflation for studying quantitative traits. Sci Rep 13, 16570 (2023). https://doi.org/10.1038/s41598-023-43686-7
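- For Mac or Windows, use Docker to run GEMMA (https://github.com/genetics-statistics/GEMMA). After adding the GEMMA image to Docker, open a terminal and run the following command, replacing /path/to/your/files with the local directory that holds your input files (ed5bf7499691 is the ID of the GEMMA image; use the ID listed for the image on your machine):
docker run -w /run -v /path/to/your/files:/run ed5bf7499691 gemma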
- For Linux or HPC, download the GEMMA binary from https://github.com/genetics-statistics/GEMMA, then make it executable and run it:
chmod u+x gemma
./gemma
Required files:
Raw read counts: sample_raw_reads.txt
Covariate matrix: cov_matrix.txt
In the read counts file, the first column is the gene ID and the second and third columns are allele types. The allele columns are not used here, so fill each one with a single placeholder base (A, T, C or G).
In the covariate matrix, the intercept must be added manually as a column of 1s.
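As a rough illustration of the two layouts described above, the following sketch builds both files from toy data with awk. The gemma_toy directory, the input names gene_counts.txt and covariates_raw.txt, the toy values, the whitespace-delimited layout without header rows, and the choice of A and T as placeholders are all assumptions made for the example. Check normalization.R for the layout it actually expects.

```shell
mkdir -p gemma_toy
# Toy inputs (hypothetical): a gene ID followed by one raw count per sample,
# and one covariate row per sample (here age and sex). No header rows.
printf 'GENE1 10 0 25\nGENE2 3 7 1\n' > gemma_toy/gene_counts.txt
printf '71 0\n80 1\n77 1\n' > gemma_toy/covariates_raw.txt

# Insert two placeholder allele columns (A and T) after the gene ID.
awk '{ out = $1 " A T"; for (i = 2; i <= NF; i++) out = out " " $i; print out }' \
    gemma_toy/gene_counts.txt > gemma_toy/sample_raw_reads.txt

# Prepend a column of 1s so the covariate matrix carries an intercept.
awk '{ print 1, $0 }' gemma_toy/covariates_raw.txt > gemma_toy/cov_matrix.txt

cat gemma_toy/sample_raw_reads.txt gemma_toy/cov_matrix.txt
```

Each row of the resulting read counts file has three leading columns (ID and two alleles) followed by one value per sample, and each covariate row starts with 1.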
Use DESeq2 to normalize the raw read counts; for details, see normalization.R.
The files generated by normalization.R are:
normalized_reads.txt
cov_bim.txt
Use the gzip command to compress the normalized read counts into the .gz file that is required for GEMMA.
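One possible way to run this compression step is sketched below on a toy stand-in file. The gemma_toy directory and the file contents are assumptions for the example. The -c option writes the compressed stream to standard output, so the uncompressed file is kept.

```shell
mkdir -p gemma_toy
# Toy stand-in for the file written by normalization.R (hypothetical values).
printf 'GENE1 A T 1.2 0.4 2.5\nGENE2 A T 0.3 0.9 0.1\n' > gemma_toy/normalized_reads.txt

# -c writes to stdout, leaving the original file in place.
gzip -c gemma_toy/normalized_reads.txt > gemma_toy/normalized_reads.txt.gz

# Sanity checks: test archive integrity, then count the lines after decompression.
gzip -t gemma_toy/normalized_reads.txt.gz
gzip -dc gemma_toy/normalized_reads.txt.gz | wc -l
```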
Required files:
Gzipped read counts: normalized_reads.txt.gz
Phenotype: phenotype.txt
Covariate matrix: cov_bim.txt
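Before running GEMMA it can help to confirm that the three files describe the same number of samples. The sketch below is one way to check this, assuming whitespace-delimited files without header rows, one row per sample in the phenotype and covariate files, and three leading columns (ID plus two alleles) in the read counts file. The gemma_toy directory and all values are made up for the example.

```shell
mkdir -p gemma_toy
# Toy inputs for three samples (hypothetical values).
printf 'GENE1 A T 1.2 0.4 2.5\nGENE2 A T 0.3 0.9 0.1\n' | gzip -c > gemma_toy/normalized_reads.txt.gz
printf '0.5\n1.7\n0.9\n' > gemma_toy/phenotype.txt
printf '1 71 0\n1 80 1\n1 77 1\n' > gemma_toy/cov_bim.txt

# Samples in the reads file = columns minus the ID and the two allele columns.
n_reads=$(gzip -dc gemma_toy/normalized_reads.txt.gz | awk 'NR == 1 { print NF - 3 }')
# The phenotype and covariate files have one row per sample.
n_pheno=$(awk 'END { print NR }' gemma_toy/phenotype.txt)
n_cov=$(awk 'END { print NR }' gemma_toy/cov_bim.txt)

if [ "$n_reads" -eq "$n_pheno" ] && [ "$n_reads" -eq "$n_cov" ]; then
    echo "OK: $n_reads samples in all three files"
else
    echo "Mismatch: reads=$n_reads phenotype=$n_pheno covariates=$n_cov"
fi
```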
Get Kinship matrix
gemma -g normalized_reads.txt.gz -p phenotype.txt -c cov_bim.txt -gk 2 -notsnp -o cov_mat
The default output file, cov_mat.sXX.txt, is written to the output folder under the data directory. Note that this matrix is computed from the gene expression data; it is not a true kinship matrix.
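A quick structural check of the resulting matrix is sketched below: it should have as many rows as columns (one per sample) and be symmetric. The toy 3 x 3 matrix, the gemma_toy directory, the kinship_check.txt file name and the 1e-6 tolerance are assumptions for the example, not part of the pipeline.

```shell
mkdir -p gemma_toy/output
# Toy 3 x 3 stand-in for cov_mat.sXX.txt (hypothetical values).
printf '1.0 0.2 0.1\n0.2 1.0 0.3\n0.1 0.3 1.0\n' > gemma_toy/output/cov_mat.sXX.txt

# Check that the matrix is square and symmetric; tee keeps a copy of the message.
awk '
    NR == 1 { ncol = NF }
    NF != ncol { ragged = 1 }
    { for (j = 1; j <= NF; j++) m[NR, j] = $j }
    END {
        if (ragged || NR != ncol) {
            print "not square: " NR " rows, " ncol " columns"
            exit
        }
        for (i = 1; i <= NR; i++)
            for (j = i + 1; j <= NR; j++) {
                d = m[i, j] - m[j, i]
                if (d > 1e-6 || d < -1e-6) {
                    print "not symmetric at row " i ", column " j
                    exit
                }
            }
        print "square and symmetric: " NR " x " ncol
    }
' gemma_toy/output/cov_mat.sXX.txt | tee gemma_toy/kinship_check.txt
```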
LMM
gemma -g normalized_reads.txt.gz -p phenotype.txt -k output/cov_mat.sXX.txt -c cov_bim.txt -lmm 4 -notsnp -o output
This runs the DGE analysis with the GEMMA LMM approach. The default output file, output.assoc.txt, is written to the output folder under the data directory.
p_wald, p_lrt and p_score are the p-values from the Wald test, the likelihood ratio test and the score test, respectively, for differential expression.
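To pull candidate genes out of the association table, one option is to look up the p_wald column by its header name and keep the rows below a cutoff, as in the sketch below. The toy table shows only a subset of columns with made-up values, and the 0.01 cutoff is an unadjusted threshold chosen purely for illustration; a real analysis would normally apply a multiple-testing correction. The gemma_toy directory and the significant_genes.txt file name are also assumptions.

```shell
mkdir -p gemma_toy/output
# Toy stand-in for output.assoc.txt: a subset of columns, hypothetical values.
printf 'rs\tbeta\tse\tp_wald\tp_lrt\tp_score\n'              > gemma_toy/output/output.assoc.txt
printf 'GENE1\t0.80\t0.10\t1.0e-06\t2.0e-06\t5.0e-06\n'     >> gemma_toy/output/output.assoc.txt
printf 'GENE2\t0.05\t0.20\t8.0e-01\t8.0e-01\t8.1e-01\n'     >> gemma_toy/output/output.assoc.txt
printf 'GENE3\t-0.40\t0.12\t4.0e-03\t4.5e-03\t5.0e-03\n'    >> gemma_toy/output/output.assoc.txt

# Map header names to column numbers, then keep genes with p_wald below alpha.
awk -F '\t' -v alpha=0.01 '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    $(col["p_wald"]) + 0 < alpha { print $(col["rs"]), $(col["p_wald"]) }
' gemma_toy/output/output.assoc.txt > gemma_toy/significant_genes.txt

cat gemma_toy/significant_genes.txt
```

Looking columns up by name rather than by position keeps the filter working if the set of columns in the output differs from the toy table.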
Required file:
sample data: sample_data.txt
Create a QQ plot, a Manhattan plot and a volcano plot.
Example QQ plot:
Example Manhattan plot:
Example volcano plot:
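For the volcano plot, each gene is placed at its effect size on the x axis and at -log10 of its p-value on the y axis. The layout of sample_data.txt is not described here, so the sketch below instead assumes a tab-separated table with rs, beta and p_wald columns, similar to the GEMMA association output. The file names, the gemma_toy directory and the values are hypothetical.

```shell
mkdir -p gemma_toy
# Toy results table (hypothetical values) with rs, beta and p_wald columns.
printf 'rs\tbeta\tp_wald\n'          > gemma_toy/results.txt
printf 'GENE1\t0.80\t1.0e-06\n'     >> gemma_toy/results.txt
printf 'GENE2\t0.05\t8.0e-01\n'     >> gemma_toy/results.txt
printf 'GENE3\t-0.40\t4.0e-03\n'    >> gemma_toy/results.txt

# Volcano plot coordinates: x = beta, y = -log10(p_wald).
# Rows with a p-value of 0 are skipped because the logarithm is undefined.
awk -F '\t' '
    NR == 1 {
        for (i = 1; i <= NF; i++) col[$i] = i
        print "rs", "beta", "neg_log10_p"
        next
    }
    {
        p = $(col["p_wald"]) + 0
        if (p > 0)
            printf "%s %s %.3f\n", $(col["rs"]), $(col["beta"]), -log(p) / log(10)
    }
' gemma_toy/results.txt > gemma_toy/volcano_points.txt

cat gemma_toy/volcano_points.txt
```

The resulting table can be read into any plotting tool; genes with large absolute beta and large neg_log10_p fall in the upper corners of the plot.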