Data management options
GEAR supports common data management tasks through the options below. These options can usually be combined with a specified analysis.
Sample selection
- --keep [filename]
- --remove [filename]
--keep/--remove accept a file with the family ID and the within-family ID in the first and second columns, separated by whitespace. --keep retains only the listed samples in the current analysis (all unlisted samples are dropped); --remove drops the listed samples.
- --keep-fam [filename]
- --remove-fam [filename]
--keep-fam/--remove-fam accept a file with one family ID per line. --keep-fam retains only the families whose IDs are listed; --remove-fam drops those families from the current analysis.
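The selection semantics described above can be sketched in a few lines of Python. This is only an illustration, not GEAR's own code: the function names (`read_sample_ids`, `select_samples`, `select_families`) and the in-memory sample list are hypothetical, and the file content is passed in as a string for brevity.

```python
def read_sample_ids(text):
    """Parse whitespace-separated 'FID IID' lines into a set of (FID, IID) pairs."""
    ids = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:  # skip blank or malformed lines
            ids.add((fields[0], fields[1]))
    return ids


def select_samples(samples, listed, mode):
    """mode='keep' retains listed samples; mode='remove' drops them."""
    if mode == "keep":
        return [s for s in samples if s in listed]
    if mode == "remove":
        return [s for s in samples if s not in listed]
    raise ValueError(f"unknown mode: {mode}")


def select_families(samples, family_ids, mode):
    """Same idea as select_samples, but matching on the family ID only."""
    if mode == "keep":
        return [s for s in samples if s[0] in family_ids]
    if mode == "remove":
        return [s for s in samples if s[0] not in family_ids]
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical data: (family ID, within-family ID) pairs.
samples = [("F1", "1"), ("F1", "2"), ("F2", "1"), ("F3", "1")]
listed = read_sample_ids("F1 1\nF3\t1\n")

kept = select_samples(samples, listed, "keep")
removed = select_samples(samples, listed, "remove")
kept_fam = select_families(samples, {"F1"}, "keep")
```

Here `kept` holds only the two listed samples, `removed` holds the other two, and `kept_fam` holds both members of family F1.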
SNP selection
- --extract [filename]
- --exclude [filename]
--extract accepts a text file with a list of SNP IDs (usually one SNP per line, though IDs separated by spaces are also accepted) and removes all unlisted variants from the current analysis. --exclude accepts the same kind of file and removes the listed variants instead.
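As a rough sketch of how such a list might be read and applied (an illustration only, not GEAR's implementation; `read_snp_ids` and `select_snps` are hypothetical names), splitting on any whitespace handles both the one-ID-per-line and the space-separated layouts:

```python
def read_snp_ids(text):
    """Split on any whitespace, so newlines and spaces both separate IDs."""
    return set(text.split())


def select_snps(snps, listed, mode):
    """mode='extract' keeps listed SNPs; mode='exclude' drops them."""
    if mode == "extract":
        return [s for s in snps if s in listed]
    if mode == "exclude":
        return [s for s in snps if s not in listed]
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical SNP IDs; the list mixes line breaks and spaces.
snps = ["rs1", "rs2", "rs3", "rs4"]
listed = read_snp_ids("rs1\nrs3 rs4\n")

extracted = select_snps(snps, listed, "extract")
excluded = select_snps(snps, listed, "exclude")
```

With this input, `extracted` contains rs1, rs3, and rs4, while `excluded` contains only rs2.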
- --chr [chromosome ids]
- --not-chr [chromosome ids]
SNPs can be included (--chr) or excluded (--not-chr) by chromosome, given as single IDs or as ranges. For example:
--chr 1 4-6 includes SNPs on chromosomes 1, 4, 5, and 6.
--not-chr 1 4-8 removes SNPs on chromosomes 1, 4, 5, 6, 7, and 8.
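The chromosome arguments in these examples can be expanded into a set as follows. This is a minimal sketch that assumes purely numeric chromosome IDs; `parse_chr_spec` is a hypothetical helper, and how GEAR itself treats non-numeric chromosomes is not covered here.

```python
def parse_chr_spec(tokens):
    """Expand tokens such as ['1', '4-6'] into a set of chromosome numbers."""
    chroms = set()
    for token in tokens:
        if "-" in token:
            start, end = token.split("-", 1)
            chroms.update(range(int(start), int(end) + 1))  # range end is inclusive
        else:
            chroms.add(int(token))
    return chroms


included = parse_chr_spec("1 4-6".split())
excluded_chroms = parse_chr_spec("1 4-8".split())

# Hypothetical SNPs as (ID, chromosome) pairs.
snps = [("rs1", 1), ("rs2", 2), ("rs3", 5), ("rs4", 9)]
chr_kept = [snp for snp in snps if snp[1] in included]
not_chr_kept = [snp for snp in snps if snp[1] not in excluded_chroms]
```

With these inputs, `chr_kept` holds the SNPs on chromosomes 1 and 5, and `not_chr_kept` holds those on chromosomes 2 and 9.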
Quality control
- --maf [cutoff], "--maf 0.1" removes any SNP whose minor allele frequency is lower than 0.1.
- --max-maf [cutoff], "--max-maf 0.4" removes any SNP whose minor allele frequency is greater than 0.4.
- --maf-range [range1 range2], "--maf-range 0.05-0.1 0.25-0.3" keeps any SNP whose minor allele frequency falls within 0.05-0.1 or 0.25-0.3.
- --geno [missing-rate], "--geno 0.1" removes any SNP whose missing rate is greater than 0.1.
- --zero-var, when switched on, removes SNPs with zero variance.
Of note, --zero-var also removes a SNP whose genotypes are all "Aa", even though its MAF is 0.5.
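The filters above can be illustrated with a small Python sketch. It is not GEAR's implementation: it assumes genotypes are coded as counts of one allele (0, 1, 2, with `None` for missing), that the variance is taken over called genotypes, and that range bounds are inclusive. The names `snp_stats` and `passes_qc` are hypothetical.

```python
def snp_stats(genotypes):
    """Return (maf, missing_rate, variance) for 0/1/2 genotypes, None = missing."""
    called = [g for g in genotypes if g is not None]
    missing_rate = 1 - len(called) / len(genotypes)
    if not called:
        return None, missing_rate, 0.0
    freq = sum(called) / (2 * len(called))  # frequency of the counted allele
    maf = min(freq, 1 - freq)
    mean = sum(called) / len(called)
    variance = sum((g - mean) ** 2 for g in called) / len(called)
    return maf, missing_rate, variance


def passes_qc(genotypes, maf=None, max_maf=None, maf_ranges=None,
              geno=None, zero_var=False):
    """Apply whichever thresholds are set; True means the SNP is retained."""
    snp_maf, missing_rate, variance = snp_stats(genotypes)
    if snp_maf is None:
        return False  # no called genotypes at all
    if maf is not None and snp_maf < maf:
        return False
    if max_maf is not None and snp_maf > max_maf:
        return False
    if maf_ranges and not any(lo <= snp_maf <= hi for lo, hi in maf_ranges):
        return False
    if geno is not None and missing_rate > geno:
        return False
    if zero_var and variance == 0:
        return False
    return True


all_het = [1, 1, 1, 1]          # every genotype is "Aa"
rare = [0] * 9 + [1]            # one minor allele among 20
with_missing = [0, 1, None, 2]  # one of four genotypes missing

het_maf, _, het_var = snp_stats(all_het)
```

The all-heterozygous SNP shows the case noted above: its MAF is 0.5, so it passes `maf=0.1`, but its genotype variance is 0, so it is dropped once `zero_var=True`.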