
Simulating case control data

gc5k edited this page Jan 2, 2018 · 1 revision

### Simulation for case-control data ###



`--simu-cc`: Specify the numbers of cases and controls as a comma-separated pair, e.g. `500,500`.


`--simu-order`: Sort the effects in ascending order before assigning them to the QTLs, so that the first QTL has the smallest effect and the last QTL has the largest.


`--poly-loci`: Specify the total number of loci, which is 1000 by default.


`--poly-loci-null`: Specify the number of null loci, which is zero by default.


Specify the LD between markers as Lewontin's D', a value between -1 and 1. It defaults to 0, i.e., linkage equilibrium between markers.
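
To make the role of D' concrete, the Python sketch below converts a signed D' into two-locus haplotype frequencies. It assumes the usual normalization D = D' × Dmax, where Dmax depends on the sign of D. The function name and the allele frequencies are illustrative only, and GEAR's internal procedure may differ.

```python
def haplotype_freqs(p_a, p_b, d_prime):
    """Return haplotype frequencies (AB, Ab, aB, ab) for two biallelic loci.

    p_a, p_b: frequencies of alleles A and B (hypothetical inputs).
    d_prime:  signed Lewontin's D' in [-1, 1].
    """
    if not -1.0 <= d_prime <= 1.0:
        raise ValueError("D' must lie in [-1, 1]")
    # Largest |D| that keeps all four haplotype frequencies non-negative.
    if d_prime >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d = d_prime * d_max
    return (
        p_a * p_b + d,              # AB
        p_a * (1 - p_b) - d,        # Ab
        (1 - p_a) * p_b - d,        # aB
        (1 - p_a) * (1 - p_b) + d,  # ab
    )
```

With D' = 0 the haplotype frequencies reduce to products of the allele frequencies, which is the linkage-equilibrium default.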


`--poly-U`: Turn this option on if you want the effects to be uniformly distributed; otherwise, the additive effects follow a normal distribution N(0, h2/N), in which h2 is the heritability and N is the number of loci.
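
As a rough illustration of the normal case, the sketch below draws N effects from N(0, h2/N) with Python's standard library. The function name and the default seed are hypothetical, and this is not GEAR's own code. If genotypes are standardized and loci are independent, the sum of squared effects should fluctuate around h2.

```python
import random

def draw_effects(n_loci, hsq, seed=2010):
    """Draw n_loci additive effects from N(0, hsq / n_loci)."""
    rng = random.Random(seed)      # seeded for reproducibility
    sd = (hsq / n_loci) ** 0.5     # standard deviation, not variance
    return [rng.gauss(0.0, sd) for _ in range(n_loci)]

effects = draw_effects(n_loci=1000, hsq=0.5)
# Sum of squared effects; close to hsq in expectation.
print(sum(b * b for b in effects))
```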


`--poly-effect`: Specify the file that contains the effect for each locus. This option overrides `--poly-U`.


`--simu-k`: Specify the prevalence of the disease (the proportion of cases) in the population. It defaults to 0.05.


`--simu-hsq`: Specify the heritability of the trait on the liability scale. It defaults to 0.5.
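
Prevalence and liability-scale heritability interact through the standard liability threshold model: an individual is a case when genetic value plus residual exceeds the threshold that cuts off the top K of a standard normal distribution. The sketch below illustrates that model under the assumption of unit total liability variance. The function names are hypothetical, and how GEAR ascertains the requested numbers of cases and controls is not covered here.

```python
import random
from statistics import NormalDist

def liability_threshold(prevalence):
    """Threshold t such that P(liability > t) equals the prevalence."""
    return NormalDist().inv_cdf(1.0 - prevalence)

def simulate_status(n, prevalence, hsq, seed=2010):
    """Simulate case (True) / control (False) status for n individuals."""
    rng = random.Random(seed)
    t = liability_threshold(prevalence)
    statuses = []
    for _ in range(n):
        g = rng.gauss(0.0, hsq ** 0.5)          # genetic value, variance hsq
        e = rng.gauss(0.0, (1.0 - hsq) ** 0.5)  # residual, variance 1 - hsq
        statuses.append(g + e > t)              # liability above threshold
    return statuses

status = simulate_status(n=20000, prevalence=0.05, hsq=0.5)
print(sum(status) / len(status))  # proportion of cases in the random sample
```

In an unascertained sample like this one, the proportion of cases stays near the prevalence, so a balanced case-control design requires drawing far more individuals than it keeps.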


`--seed`: Specify the random seed for the simulation.


`--make-bed`: Write the genotypes in the PLINK binary format.


    gear --simu-cc 500,500 --poly-loci 100 --simu-k 0.01 --simu-hsq 0.8 --seed 2010 --out poly
    gear --simu-cc 500,500 --poly-loci 100 --poly-loci-null 50 --simu-k 0.01 --simu-hsq 0.8 --seed 2010 --out poly
    gear --simu-cc 500,500 --poly-loci 100 --simu-k 0.01 --simu-hsq 0.8 --make-bed --out poly
    gear --simu-cc 500,500 --simu-order --poly-loci 100 --simu-k 0.01 --simu-hsq 0.8 --make-bed --out poly
    gear --simu-cc 500,500 --poly-effect effect.txt --simu-k 0.01 --simu-hsq 0.8 --make-bed --out poly

The output files include `*.bim`, `*.fam`, and `*.bed` (the genotype file in PLINK binary format).

`*.phe`: three columns. The first two are the family ID and the individual ID; the third is the phenotypic value.

`*.breed`: the genotypic (3rd column) and phenotypic (4th column) values on the liability scale.

`*.rnd`: three columns. The first is the marker name, the second is the reference allele, and the third is its additive effect.

`*.add`: the genotypes in the additive-model coding scheme.

`*.cov`: four columns. The first two are the family ID and the individual ID, the third is the probability given the liability, and the fourth is the liability.
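
As a small example of consuming these outputs, the sketch below parses the three-column `.phe` layout and tallies the phenotypic values. It assumes whitespace-delimited fields and no header line, neither of which is stated above. The function name is hypothetical and the sample rows are made up for illustration, including their phenotype coding.

```python
from collections import Counter

def read_phe(lines):
    """Parse (family_id, individual_id, phenotype) from .phe-style lines."""
    records = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:   # skip blank or malformed lines
            continue
        fid, iid, value = fields[:3]
        records.append((fid, iid, value))
    return records

# Made-up rows standing in for a real file's contents.
sample = ["fam1 ind1 1", "fam2 ind2 2", "", "fam3 ind3 2"]
records = read_phe(sample)
print(Counter(value for _, _, value in records))
```

Because `read_phe` accepts any iterable of lines, an open file handle can be passed in place of the sample list.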
