The advent of next-generation sequencing has enabled fast and cost-effective genotyping, which has significantly accelerated gene identification in sequencing-based gene cloning studies. To identify a gene associated with a Mendelian phenotype, we may sequence several unrelated individuals/mutants with the exact same phenotype and search for genes that harbour high-effect variants in most of the sequenced samples. The success of such an approach depends on a number of factors, including the number of samples sequenced, the genomic region sequenced, the sequencing quality and depth, the method used to map the sequencing reads onto the genome, the variant calling method, the approach to filtering out variants that are unlikely to be functional, and the criterion for reporting candidate genes. It remains difficult for an investigator to design an optimal experiment that accounts for all these factors and, in particular, once the sequencing results are in hand, to design an effective analysis procedure that fits the quality of that particular data set.
Gene identification via phenotype sequencing (GIPS) computes four measurements of the effectiveness of a study protocol that identifies Mendelian phenotype-associated genes by sequencing several unrelated individuals/mutants with the exact same phenotype. These four measurements can guide iterative optimization of the study protocol. GIPS computes:
- The chance of reporting the true phenotype-associated gene.
- The expected number of random genes that may be reported.
- The significance of each candidate gene’s association with the phenotype.
- The significance of violating the Mendelian assumption if no gene is reported or if all candidate genes have failed validation.
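To give a feel for the first two measurements, here is a toy sketch. It is not the model GIPS implements (the user manual describes that); it assumes a simple binomial setting in which every sample detects the causal variant independently with the same sensitivity, and a gene is reported when at least `min_hits` of the samples carry a qualifying variant. The function names, `min_hits`, and the per-gene background probabilities are all hypothetical.

```python
from math import comb


def prob_at_least(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def chance_true_gene_reported(n_samples, min_hits, sensitivity):
    """Toy estimate: the true gene is reported when its variant is detected
    in at least min_hits samples (assumed equal, independent sensitivity)."""
    return prob_at_least(n_samples, min_hits, sensitivity)


def expected_random_genes(n_samples, min_hits, background_probs):
    """Toy estimate: expected number of unrelated genes that pass the same
    criterion, given a hypothetical per-gene probability of carrying a
    qualifying variant in one sample."""
    return sum(prob_at_least(n_samples, min_hits, q) for q in background_probs)


if __name__ == "__main__":
    # 5 samples, report genes hit in at least 4 of them, 90% sensitivity.
    print(round(chance_true_gene_reported(5, 4, 0.9), 5))  # 0.91854
    # 1000 hypothetical genes, each with a 5% background probability.
    print(expected_random_genes(5, 4, [0.05] * 1000))
```

Under these assumptions, loosening the criterion (a smaller `min_hits`) raises the chance of reporting the true gene but also raises the expected number of random genes, which is the trade-off the measurements are meant to expose.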
The user manual describes both the theoretical framework of GIPS and its software usage.
- GIPS software package (binary). Requires Java 1.7 on Linux.
- GIPS software package (source). Requires Java 1.7 on Linux.
- GIPS Manual
- Rice script
java -jar GIPS.jar [options]
java -Xms5g -jar GIPS.jar -T <tool> -p /path/to/project_folder
| Option | Argument | Description |
|---|---|---|
| -h (-H) | | Show help. |
| -Test | | Initiate a new project with test setup. |
| -init | /path/to/project_folder | Initiate a new project. |
| -p | /path/to/project_folder | Work with an existing project. |
| -T | gips, vcs or filter | Select GIPS function. gips: full work flow; vcs: only estimate variant calling sensitivity for each sample; filter: only filter sample variants. Defaults to gips. |
| -update | | Run GIPS in update mode. GIPS will try to re-use intermediate results produced in the previous run. |
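When GIPS is driven from a script, the options above can be assembled programmatically. The sketch below only builds the argument list and uses nothing beyond the flags documented here; the helper name `gips_command`, its defaults, and the assumption that `-update` is combined with `-p` are illustrative and not part of GIPS itself.

```python
import shlex

VALID_TOOLS = ("gips", "vcs", "filter")


def gips_command(project_dir, tool="gips", update=False, heap="5g", jar="GIPS.jar"):
    """Build the argument list for a GIPS run on an existing project.

    Hypothetical helper: only wraps the documented -T, -p and -update flags.
    """
    if tool not in VALID_TOOLS:
        raise ValueError(f"tool must be one of {VALID_TOOLS}, got {tool!r}")
    cmd = ["java", f"-Xms{heap}", "-jar", jar, "-T", tool, "-p", project_dir]
    if update:
        # Assumption: update mode is requested alongside -p.
        cmd.append("-update")
    return cmd


if __name__ == "__main__":
    cmd = gips_command("/path/to/project_folder", tool="vcs")
    print(shlex.join(cmd))
    # To actually launch it: subprocess.run(cmd, check=True)
```

Keeping the command as a list rather than a single string avoids shell quoting problems when the project path contains spaces.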
PROJECT, REF_GENOME_ANNOTATION.GFF, SNPEFF_GENOME_VERSION, SNPEFF, CANDIDATE_CRITERIA, VAR_CALL_SCRIPT, EFF_REGION, VAR_FILTERS, SCORE_MATRIX, MAX_AA_SCORE, NUM_SIM_SNPS, MAX_VAR_DENSITY, LIB_PHENOTYPE_VAR, LIB_VAR_SNPEFF_GENOME_VERSION, LIB_GENOME_ANNOTATION.GFF, CONTROL
Sample specific section:
SAMPLE_NAME, SAMPLE.VCF, READS_ALIGNMENT.SAM, VAR_CALL_SCRIPT, SCORE_MATRIX, MAX_AA_SCORE, CONTROL, NUM_SIM_SNPS, MAX_VAR_DENSITY, SPECIFY_HOMO_VDS, SPECIFY_HETERO_VDS, SPECIFY_BVF
Weitao Wang (wwtwin#zju.edu.cn) and Xin Chen (xinchen#zju.edu.cn)
If you have any questions or suggestions, feel free to contact us.