Simulating case-control data
### Simulation for case-control data ###
Options
--simu-cc
Specify the numbers of cases and controls as a comma-separated pair, e.g. 500,500.
--simu-order
Sort the effects in ascending order before assigning them to the QTLs, so that the first QTL has the smallest effect and the last QTL has the largest effect.
--poly-loci
Specify the total number of loci, which is 1000 by default.
--poly-loci-null
Specify the number of null loci, which is zero by default.
--poly-ld
Specify LD as Lewontin's D', a value between -1 and 1. It defaults to 0, i.e. the markers are in linkage equilibrium.
--poly-U
Turn this option on to draw the effects from a uniform distribution; otherwise, the additive effects follow a normal distribution N(0, h2/N), in which h2 is the heritability and N is the number of loci.
--poly-effect
Specify the file that contains the effect of each locus. This option overrides --poly-U.
--simu-k
Specify the prevalence of cases in the population. It defaults to 0.05.
--simu-hsq
Specify the heritability of the trait on the liability scale. It defaults to 0.5.
--seed
Specify the seed for simulation.
--make-bed
The genotypes will be written in the plink binary format.
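To make the roles of --simu-hsq, --simu-order and --simu-k more concrete, here is a minimal Python sketch. It is not GEAR's implementation: the function names are hypothetical, the case cut-off assumes the standard liability-threshold model with a standard normal liability, and the random numbers will not match what GEAR produces for the same seed.

```python
import random
from statistics import NormalDist


def draw_effects(n_loci, hsq, sort_ascending=False, seed=None):
    """Draw additive effects from N(0, hsq / n_loci), as described for the default case.

    Hypothetical helper for illustration only; GEAR's own sampling may differ.
    """
    rng = random.Random(seed)
    sd = (hsq / n_loci) ** 0.5  # standard deviation is the square root of the variance h2/N
    effects = [rng.gauss(0.0, sd) for _ in range(n_loci)]
    if sort_ascending:
        # Mimics the idea behind --simu-order: first locus smallest, last locus largest.
        effects.sort()
    return effects


def liability_threshold(prevalence):
    """Liability cut-off above which an individual counts as a case.

    Assumes liability ~ N(0, 1), so a fraction `prevalence` lies above the cut-off.
    """
    return NormalDist().inv_cdf(1.0 - prevalence)


if __name__ == "__main__":
    effects = draw_effects(n_loci=100, hsq=0.8, sort_ascending=True, seed=2010)
    print(effects[0], effects[-1])  # smallest and largest simulated effect
    print(round(liability_threshold(0.01), 3))  # 2.326
```

Under these assumptions, a lower prevalence pushes the cut-off further into the upper tail of the liability distribution, so cases become rarer in the population from which the requested numbers of cases and controls are sampled.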
Examples
~~~~~~
gear --simu-cc 500,500 --poly-loci 100 --simu-k 0.01 --simu-hsq 0.8 --seed 2010 --out poly
gear --simu-cc 500,500 --poly-loci 100 --poly-loci-null 50 --simu-k 0.01 --simu-hsq 0.8 --seed 2010 --out poly
gear --simu-cc 500,500 --poly-loci 100 --simu-k 0.01 --simu-hsq 0.8 --make-bed --out poly
gear --simu-cc 500,500 --simu-order --poly-loci 100 --simu-k 0.01 --simu-hsq 0.8 --make-bed --out poly
gear --simu-cc 500,500 --poly-effect effect.txt --simu-k 0.01 --simu-hsq 0.8 --make-bed --out poly
~~~~~~
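If you need several replicates, the first example can be generated programmatically with a different --seed and --out for each run. The following Python sketch only assembles the command lines; the helper name and the `poly_rep` output prefix are hypothetical, and it assumes that the first number passed to --simu-cc is the number of cases.

```python
import shlex


def build_simu_cc_command(n_cases, n_controls, n_loci, prevalence, hsq, seed, out):
    """Assemble a `gear --simu-cc` argument list using only options shown in the examples."""
    return [
        "gear",
        "--simu-cc", f"{n_cases},{n_controls}",  # assumed order: cases, then controls
        "--poly-loci", str(n_loci),
        "--simu-k", str(prevalence),
        "--simu-hsq", str(hsq),
        "--seed", str(seed),
        "--out", out,
    ]


if __name__ == "__main__":
    # Three replicates with seeds 2010, 2011 and 2012.
    for rep, seed in enumerate(range(2010, 2013), start=1):
        cmd = build_simu_cc_command(500, 500, 100, 0.01, 0.8, seed, f"poly_rep{rep}")
        print(shlex.join(cmd))
        # To actually run it, pass `cmd` to subprocess.run(cmd, check=True)
        # with the gear executable available on your PATH.
```

Building the command as a list rather than a single string avoids quoting problems if an output prefix ever contains spaces.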
The output files include *.bim, *.fam, and *.bed (the genotype file in plink binary format).
*.phe: three columns. The first two columns are the family ID and individual ID; the third column is the phenotypic value.
*.breed: the genotypic (3rd column) and phenotypic (4th column) values on the liability scale.
*.rnd: three columns. The first column is the marker name, the second is the reference allele, and the third is its additive effect.
*.add: the genotypes in the additive coding scheme.
*.cov: four columns. The first two columns are the family ID and individual ID, the third column is the probability given the liability, and the fourth column is the liability.
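As a quick sanity check on a finished run, the phenotype file can be tabulated to confirm the numbers requested with --simu-cc. The Python sketch below assumes that *.phe is whitespace-delimited with no header line, which the description above does not state explicitly; since the coding of cases and controls is not documented here, it simply counts each distinct value in the third column. The function names are hypothetical.

```python
from collections import Counter


def read_table(path):
    """Read a whitespace-delimited text file into a list of rows (lists of strings).

    Assumes no header line; blank lines are skipped.
    """
    with open(path) as handle:
        return [line.split() for line in handle if line.strip()]


def phenotype_counts(rows):
    """Count how often each value occurs in the third column of *.phe rows."""
    return Counter(row[2] for row in rows)


if __name__ == "__main__":
    rows = read_table("poly.phe")  # file written by a run with --out poly
    for value, count in sorted(phenotype_counts(rows).items()):
        print(value, count)
```

The same `read_table` helper can be pointed at the other text outputs, such as *.rnd, as long as the same layout assumption holds for them.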
[Return to GEAR Home](https://github.com/gc5k/GEAR/wiki)