Example Usage

brakitsch edited this page Jun 12, 2016 · 34 revisions

Below is a brief example of how to use GNetLMM. As a case study, we use a subset of the genotypes from the 1000 Genomes Project [1] and simulated phenotypes.

All commands can be found in demos/run_GNetLMM.sh. In the following, we give a short summary of the individual steps.

Go to the demos folder (the relative paths below assume it as the working directory), create the output folder, and set the filenames and parameters:

mkdir -p out
BFILE=./../data/1000G_chr22/chrom22_subsample20_maf0.10 #specify here bed basename
FFILE=./../data/1000G_chr22/ones.txt
PFILE=./out/pheno
CFILE=./out/chrom22
ASSOC0FILE=./out/lmm
GFILE=./out/genes
ANCHOR_THRESH=1e-6
ANCHORFILE=./out/cisanchor_thresh1e-6_wnd2000.txt
WINDOW=2000
VFILE=./out/vstructures_thresh1e-6_wnd2000
ASSOCFILE=./out/gnetlmm_thresh1e-6_wnd2000
PLOTFILE=./out/power.pdf
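
The anchor threshold and the window size appear both as parameters and inside three of the file names above. As a minimal sketch (the `SUFFIX` variable is our own addition and is not part of the demo script), the names can be derived from the parameters so that the two cannot drift apart when a value is changed:

```shell
# Parameters as in the listing above.
ANCHOR_THRESH=1e-6
WINDOW=2000

# SUFFIX is a hypothetical helper variable, not used by the original demo script.
SUFFIX="thresh${ANCHOR_THRESH}_wnd${WINDOW}"

# Same file names as above, now built from the parameters.
ANCHORFILE=./out/cisanchor_${SUFFIX}.txt
VFILE=./out/vstructures_${SUFFIX}
ASSOCFILE=./out/gnetlmm_${SUFFIX}

echo "$ANCHORFILE"
echo "$VFILE"
echo "$ASSOCFILE"
```

With the values shown, the three variables expand to exactly the file names listed above.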

Simulating the phenotypes:

./../GNetLMM/bin/gNetLMM_simPheno --bfile $BFILE --pfile $PFILE

Creating the kinship matrix:

./../GNetLMM/bin/gNetLMM_preprocess --bfile $BFILE --cfile $CFILE

Running the initial association scan:

for i in $(seq 0 10000 40000)
do
    ./../GNetLMM/bin/gNetLMM_analyse --initial_scan --bfile $BFILE --pfile $PFILE --ffile $FFILE --cfile $CFILE.cov --assoc0file $ASSOC0FILE --startSnpIdx $i --nSnps 10000
done
./../GNetLMM/bin/gNetLMM_analyse --merge_assoc0_scan  --assoc0file $ASSOC0FILE --nSnps 10000 --bfile $BFILE

Here, we split the SNPs into 5 blocks of length 10000 to demonstrate how the initial association scan can easily be parallelized and run on a cluster.
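
Because the blocks are independent of each other, each loop iteration can become its own job. The following is a sketch only: it prints one command line per block instead of executing it, so that the lines can be handed to whatever job submission mechanism is available. The function name `print_scan_jobs` and the variable `NSNPS` are introduced here for illustration and are not part of GNetLMM.

```shell
# Variables as defined at the top of this page.
BFILE=./../data/1000G_chr22/chrom22_subsample20_maf0.10
FFILE=./../data/1000G_chr22/ones.txt
PFILE=./out/pheno
CFILE=./out/chrom22
ASSOC0FILE=./out/lmm
NSNPS=10000   # block length (hypothetical variable name)

# Print one gNetLMM_analyse command per SNP block instead of running it.
print_scan_jobs() {
    for i in $(seq 0 "$NSNPS" 40000)
    do
        echo "./../GNetLMM/bin/gNetLMM_analyse --initial_scan --bfile $BFILE --pfile $PFILE --ffile $FFILE --cfile $CFILE.cov --assoc0file $ASSOC0FILE --startSnpIdx $i --nSnps $NSNPS"
    done
}

print_scan_jobs
```

The merge step (`--merge_assoc0_scan`) should only be started once all block jobs have finished.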

Computing the marginal gene-gene correlations when splitting the genes in groups of size 25:

for i in $(seq 0 25 100)
do
    ./../GNetLMM/bin/gNetLMM_analyse --gene_corr --pfile $PFILE --gfile $GFILE.startTrait_$i --startTraitIdx $i --nTraits 25
done
./../GNetLMM/bin/gNetLMM_analyse --merge_corr --gfile $GFILE --pfile $PFILE --nTraits 25

Computing the cis anchors:

./../GNetLMM/bin/gNetLMM_analyse --compute_anchors --bfile $BFILE --pfile $PFILE --assoc0file $ASSOC0FILE --anchorfile $ANCHORFILE --anchor_thresh=$ANCHOR_THRESH --window=$WINDOW --cis

Finding v-structures:

for i in $(seq 0 10 90)
do
    ./../GNetLMM/bin/gNetLMM_analyse --find_vstructures --pfile $PFILE --gfile $GFILE --anchorfile $ANCHORFILE --assoc0file $ASSOC0FILE --window $WINDOW --vfile $VFILE --bfile $BFILE --startTraitIdx $i --nTraits 10
done
./../GNetLMM/bin/gNetLMM_postprocess --concatenate --infiles $VFILE --outfile $VFILE

Updating the associations:

for i in $(seq 0 10 90)
do
    ./../GNetLMM/bin/gNetLMM_analyse --update_assoc --bfile $BFILE --pfile $PFILE --cfile $CFILE.cov --ffile $FFILE --vfile $VFILE --assocfile $ASSOCFILE --startTraitIdx $i --nTraits 10
done
./../GNetLMM/bin/gNetLMM_postprocess --concatenate --infiles $ASSOCFILE --outfile $ASSOCFILE

Here, we split the genes into 10 blocks of length 10 to demonstrate how the steps can be parallelized.
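
The loop bounds follow from the number of genes and the block size: 10 blocks of length 10 cover 100 genes, with start indices 0, 10, ..., 90. As a small sketch, the start indices can be computed for any total and block size; `block_starts` is a hypothetical helper written for this page and is not part of GNetLMM.

```shell
# block_starts TOTAL SIZE: print the start index of each block of SIZE items
# needed to cover TOTAL items, one index per line.
# Hypothetical helper for illustration, not part of GNetLMM.
block_starts() {
    total=$1
    size=$2
    # The last start index must be below TOTAL, hence the upper bound TOTAL-1.
    seq 0 "$size" $((total - 1))
}

# Start indices for 100 genes in blocks of 10.
block_starts 100 10
```

The output of `block_starts` can be used in place of the `seq` call in the loops above, for example `for i in $(block_starts 100 10)`.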

Updating the initial association results:

./../GNetLMM/bin/gNetLMM_postprocess --merge_assoc --assoc0file $ASSOC0FILE --assocfile $ASSOCFILE

Creating a human-readable output file for the v-structures:

./../GNetLMM/bin/gNetLMM_postprocess --nice_output --bfile $BFILE --pfile $PFILE --vfile $VFILE --outfile $VFILE.nice

Running the algorithm again, this time blocking the causal chain anchor SNP -> anchor gene -> focal gene by conditioning on the focal gene:

for i in $(seq 0 10 90)
do
    ./../GNetLMM/bin/gNetLMM_analyse --block_assoc --bfile $BFILE --pfile $PFILE --cfile $CFILE.cov --ffile $FFILE --vfile $VFILE --assocfile $ASSOCFILE.block --startTraitIdx $i --nTraits 10
done
./../GNetLMM/bin/gNetLMM_postprocess --concatenate --infiles $ASSOCFILE.block --outfile $ASSOCFILE.block
./../GNetLMM/bin/gNetLMM_postprocess --merge_assoc --assoc0file $ASSOC0FILE --assocfile $ASSOCFILE.block

Plotting results:

./../GNetLMM/bin/gNetLMM_postprocess --plot_power --assocfile $ASSOCFILE --assoc0file $ASSOC0FILE --plotfile $PLOTFILE --pfile $PFILE --bfile $BFILE --window $WINDOW --blockfile $ASSOCFILE.block

GNet-LMM increases the power compared to a standard LMM, whereas Block-LMM decreases the power because the causal chain is interrupted.

Converting the updated associations into a human-readable format:

./../GNetLMM/bin/gNetLMM_postprocess --nice_output --bfile $BFILE --pfile $PFILE --vfile $VFILE --outfile $VFILE.nice --assocfile $ASSOCFILE --assoc0file $ASSOC0FILE --blockfile $ASSOCFILE.block

[1]: The 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56-65 (2012).