g-LDSC is a tool for estimating heritability and functional enrichment from GWAS summary statistics. g-LDSC is written under R-4.1.0.
In this tutorial, we demonstrate how to run g-LDSC on a Linux-based system.
Install gldsc (R package) via devtools
#install.packages('devtools')
devtools::install_github("xzw20046/gldsc")
3 files are required to run g-LDSC:
- GWAS summary statistics
- Pre-calculated (partitioned) LD score matrix
- g-LDSC function file
For GWAS summary statistics, the required format is as follows:
SNP A1 A2 N Z
rs1000000 G A 361194 1.5397
rs10000010 T C 361194 -0.850433
rs1000002 C T 361194 0.672368
To convert your GWAS results into this format, you can use munge_sumstats.py from ldsc.
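Before running g-LDSC, it can help to confirm that a summary statistics file really has the five columns shown above. The following is a minimal sketch in base R, not part of the gldsc package: `check_sumstats()` is a hypothetical helper name, and the checks (required columns present, numeric N and Z) are assumptions about what makes a usable input.

```r
# Hypothetical helper, NOT a gldsc function: validates the SNP/A1/A2/N/Z layout.
check_sumstats <- function(path) {
  required <- c("SNP", "A1", "A2", "N", "Z")
  # Read everything as character so that the allele "T" is never parsed as TRUE.
  ss <- read.table(path, header = TRUE, colClasses = "character")
  missing_cols <- setdiff(required, colnames(ss))
  if (length(missing_cols) > 0) {
    stop("Missing required column(s): ", paste(missing_cols, collapse = ", "))
  }
  # Convert the numeric columns and fail if any value cannot be parsed.
  ss$N <- suppressWarnings(as.numeric(ss$N))
  ss$Z <- suppressWarnings(as.numeric(ss$Z))
  if (anyNA(ss$N) || anyNA(ss$Z)) {
    stop("Columns N and Z must contain only numeric values")
  }
  ss[, required]
}

# Demonstration on the three example rows from this tutorial,
# written to a temporary file.
tmp <- tempfile(fileext = ".sumstats")
writeLines(c(
  "SNP A1 A2 N Z",
  "rs1000000 G A 361194 1.5397",
  "rs10000010 T C 361194 -0.850433",
  "rs1000002 C T 361194 0.672368"
), tmp)
ss <- check_sumstats(tmp)
print(dim(ss))
```

Reading all columns as character first is a deliberate choice: with R's default type conversion, an allele column consisting only of "T" would be interpreted as logical.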
The pre-calculated LD score matrix file contains the LD matrix and annotation information. To generate this file, see the tutorial on calculating the LD score matrix in the section below.
The R file functions.R contains all g-LDSC functions.
To calculate the LD score matrix, you can use the following command:
Rscript gldsc.run.R \
LDpath=ldblk_ukbb_eur \
annopath=/your file path/baseline \
mafpath=/your file path/1000G_frq \
function=mlfun.R \
out=/your out path/ \
cores=4
- LDpath: This flag tells g-LDSC which LD matrix files to use in calculating the LD score matrix. All LD matrix files in this folder should be in .hdf5 format. LD matrices for 1000G and UKBB can be downloaded here.
- annopath & mafpath: These two flags give the locations of the annotation and MAF files of the SNPs. The input formats are the same as .annot and .M_5_50 in ldsc. For details, see here.
- out: This flag tells g-LDSC where to write the output.
- cores: This flag tells g-LDSC how many cores to use for parallel computing.
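Since the LD matrix folder is expected to hold .hdf5 files, a quick pre-flight check can catch a wrong path before a long run starts. The sketch below uses only base R; `check_ld_folder()` is a hypothetical helper and is not provided by gldsc, and the file names in the demonstration are placeholders.

```r
# Hypothetical pre-flight check, NOT part of gldsc:
# returns the .hdf5 files found in the folder passed as LDpath.
check_ld_folder <- function(ldpath) {
  if (!dir.exists(ldpath)) {
    stop("LD folder not found: ", ldpath)
  }
  files <- list.files(ldpath, pattern = "\\.hdf5$", full.names = TRUE)
  if (length(files) == 0) {
    stop("No .hdf5 files found in ", ldpath)
  }
  files
}

# Demonstration with a temporary folder holding empty placeholder files.
demo_dir <- file.path(tempdir(), "ld_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, c("blk_1.hdf5", "blk_2.hdf5", "notes.txt"))))
ld_files <- check_ld_folder(demo_dir)
print(basename(ld_files))
```

With a real panel, the same call would take the folder given to LDpath, for example `check_ld_folder("ldblk_ukbb_eur")`.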
This command will output a file called LDSM.pannel.Rdata. The size of this output file is approximately 10GB.
To run g-LDSC on your GWAS summary statistics with the pre-calculated panel, you can use the following command:
Rscript gldsc.run.R \
panel=LDSM.pannel.Rdata \
gwas=BMI.sumstats \
function=mlfun.R \
out=/your out path/BMI \
cores=4
This process returns a data frame in which each row gives the result for one functional annotation, with the following columns:
Taus Partition_H2 Partition_H2_SD Enrichment Enrichment_SD intercept intercept_SD e.stat P tau_SD
- Taus: annotation contributor
- Taus_SD: standard error of annotation contributor
- Partition_H2: partitioned SNP-heritability
- Partition_H2_SD: standard error of partitioned SNP-heritability
- Enrichment: fold of enrichment
- Enrichment_SD: jackknife standard error of fold of enrichment
- intercept: confounding bias
- intercept_SD: standard error of confounding bias
- e.stat: t-statistics of function enrichment
- P: P-value of function enrichment (t-test)
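Once the result data frame is available, a common next step is to rank annotations by their enrichment P-value. The sketch below shows one possible way in base R. The data frame `res` holds made-up placeholder values (they are not g-LDSC output), the annotation names are hypothetical, and the Bonferroni threshold is an assumption of this example rather than something g-LDSC prescribes.

```r
# Placeholder values for illustration only; these are NOT real g-LDSC results.
res <- data.frame(
  Enrichment    = c(1.0, 2.5, 0.8),
  Enrichment_SD = c(0.1, 0.4, 0.3),
  P             = c(0.50, 0.001, 0.40),
  row.names     = c("annot_A", "annot_B", "annot_C")
)

# Bonferroni-corrected threshold across the tested annotations
# (an assumption of this sketch, not a g-LDSC default).
alpha <- 0.05 / nrow(res)
res$significant <- res$P < alpha

# Sort annotations from the smallest to the largest P-value.
res_sorted <- res[order(res$P), ]
print(res_sorted)
```

The same lines should work on a real result as long as it contains the P column described above.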
Author: Zewei Xiong (the University of Hong Kong): xzw20046@163.com