Latest release v.4.1 (October 6, 2019)
The GBS SNP Calling Reference Optional Pipeline (GBS-SNP-CROP) is executed via a sequence of seven Perl scripts that integrate custom parsing and filtering procedures with well-known, vetted bioinformatic tools, giving the user full access to all intermediate files. By employing a novel strategy of variant (SNP and indel) calling based on the correspondence of within-individual to across-population patterns of polymorphism, the pipeline identifies high-confidence variants and distinguishes them from both sequencing and PCR errors, whether or not a reference genome is available. When no reference genome is available, the pipeline adopts a clustering strategy to build a population-tailored "Mock Reference" from the same GBS data, which is then used for downstream calling and genotyping. Designed for libraries of either paired-end (PE) or single-end (SE) reads of arbitrary lengths, GBS-SNP-CROP maximizes data usage by avoiding the unnecessary data culling caused by imposed length-uniformity requirements. GBS-SNP-CROP is a complete bioinformatics pipeline developed primarily to support curation, research, and breeding programs wishing to use GBS for the cost-effective, genome-wide characterization of plant genetic resources.
Stage 1. Process the raw GBS data
- Step 1: Parse the raw reads
- Step 2: Trim based on quality and adaptors
- Step 3: Demultiplex
Stage 2. Build the Mock Reference
- Step 4: Cluster reads and assemble the Mock Reference
Stage 3. Map the processed reads and generate standardized alignment files
- Step 5: Align with BWA-MEM and process with SAMtools
- Step 6: Parse mpileup outputs and produce the variants discovery matrix
Stage 4. Call Variants and Genotypes
- Step 7: Filter variants and call genotypes
Begin by carefully reading the GBS-SNP-CROP User Manual. Before posting a question or starting a discussion, please consult the FAQ page first. Also check your barcode ID file for empty fields or stray blank spaces, and verify that it was saved as a tab-delimited file. If you still face an issue, or have suggestions for improving this tool, please submit your question or comment to our Google Groups page.
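The barcode ID file checks mentioned above can be automated. The following is a minimal sketch in Python, not part of the pipeline itself: the function name `check_barcode_id_file` is hypothetical, and it assumes only that every non-empty line should contain tab-separated fields with no empty values, no embedded whitespace, and the same number of columns as the first line. Consult the User Manual for the exact columns your version expects.

```python
def check_barcode_id_file(text):
    """Return a list of (line_number, message) problems found in the file text.

    Assumption: each line holds tab-separated fields, none of which may be
    empty or contain whitespace, and all lines share one column count.
    """
    problems = []
    expected_columns = None
    for number, line in enumerate(text.splitlines(), start=1):
        if line.strip() == "":
            problems.append((number, "blank line"))
            continue
        if "\t" not in line:
            problems.append((number, "no tab delimiter found"))
            continue
        fields = line.split("\t")
        if expected_columns is None:
            # The first well-formed line sets the expected column count.
            expected_columns = len(fields)
        elif len(fields) != expected_columns:
            problems.append(
                (number, f"expected {expected_columns} columns, found {len(fields)}")
            )
        for position, field in enumerate(fields, start=1):
            if field == "":
                problems.append((number, f"column {position} is empty"))
            elif any(ch.isspace() for ch in field):
                problems.append((number, f"column {position} contains whitespace"))
    return problems
```

To use it, read your barcode ID file into a string and pass it to the function; an empty list means none of these particular problems were found.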
- Java 7 or higher - The latest version of GBS-SNP-CROP (v.4.1) was tested using Java 8 (update 221)
- Trimmomatic - Latest version tested with v.0.39 (Bolger et al., 2014)
- PEAR - Latest version tested with v.0.9.11 (Zhang et al., 2014)
- VSEARCH - Latest version tested with v.2.13.7 (Rognes et al., 2016)
- BWA aligner - Latest version tested with v.0.7.12 (Li & Durbin, 2009)
- SAMtools - Latest version tested with v.1.7 (Li et al., 2009)
- The following five CPAN modules also need to be installed (module names are case-sensitive): Getopt::Long, IO::Zlib, List::Util, List::MoreUtils, Parallel::ForkManager
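Before running the pipeline, it can help to confirm that the requirements listed above are reachable. The sketch below is illustrative only. The executable names in `TOOLS` are assumptions about a typical installation and may differ on your system, Trimmomatic is omitted because it is usually distributed as a Java `.jar` rather than an executable on `PATH`, and the Perl modules are spelled as they are published on CPAN.

```python
import shutil
import subprocess

# Assumed executable names; adjust them to match your installation.
TOOLS = ["java", "perl", "pear", "vsearch", "bwa", "samtools"]
PERL_MODULES = [
    "Getopt::Long",
    "IO::Zlib",
    "List::Util",
    "List::MoreUtils",
    "Parallel::ForkManager",
]


def missing_tools(names, which=shutil.which):
    """Return the executables that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


def perl_module_command(module):
    """Build a command that asks Perl to load one module and exit."""
    return ["perl", f"-M{module}", "-e", "1"]


def missing_perl_modules(modules, run=subprocess.run):
    """Return the modules that Perl fails to load."""
    missing = []
    for module in modules:
        try:
            result = run(perl_module_command(module), capture_output=True)
        except FileNotFoundError:
            # Perl itself is not installed, so no module can be loaded.
            return list(modules)
        if result.returncode != 0:
            missing.append(module)
    return missing
```

Calling `missing_tools(TOOLS)` and `missing_perl_modules(PERL_MODULES)` returns two lists; both are empty when everything checked is available. The sketch does not verify tool versions.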
Melo et al. GBS-SNP-CROP: A reference-optional pipeline for SNP discovery and plant germplasm characterization using genotyping-by-sequencing data. BMC Bioinformatics. 2016. 17:29. DOI 10.1186/s12859-016-0879-y.