
Latest release v.4.1 (October 6, 2019)


The GBS SNP Calling Reference Optional Pipeline (GBS-SNP-CROP) is executed via a sequence of seven Perl scripts that integrate custom parsing and filtering procedures with well-known, vetted bioinformatic tools, giving the user full access to all intermediate files. By employing a novel strategy of variant (SNP and indel) calling based on the correspondence of within-individual to across-population patterns of polymorphism, the pipeline can distinguish high-confidence variants from both sequencing and PCR errors, whether or not a reference genome is available. When no reference genome is available, the pipeline adopts a clustering strategy to build a population-tailored "Mock Reference" from the same genotyping-by-sequencing (GBS) data for downstream calling and genotyping. Designed for libraries of either paired-end (PE) or single-end (SE) reads of arbitrary lengths, GBS-SNP-CROP maximizes data usage by avoiding the unnecessary data culling that imposed length-uniformity requirements would cause. GBS-SNP-CROP is a complete bioinformatics pipeline developed primarily to support curation, research, and breeding programs wishing to use GBS for the cost-effective genome-wide characterization of plant genetic resources.

Pipeline workflow

Stage 1. Process the raw GBS data

  • Step 1: Parse the raw reads
  • Step 2: Trim based on quality and adaptors
  • Step 3: Demultiplex

Stage 2. Build the Mock Reference

  • Step 4: Cluster reads and assemble the Mock Reference

Stage 3. Map the processed reads and generate standardized alignment files

  • Step 5: Align with BWA-MEM and process with SAMtools
  • Step 6: Parse mpileup outputs and produce the variants discovery matrix

Stage 4. Call Variants and Genotypes

  • Step 7: Filter variants and call genotypes
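The stage and step outline above can be written down as a small data structure, which is handy when planning a run. The following is only an illustrative sketch, not part of the pipeline: `WORKFLOW` and `steps_to_run` are hypothetical names, and the rule that Step 4 is skipped when a reference genome is supplied is an assumption drawn from the description of the Mock Reference as the strategy for the no-reference case.

```python
# Stage/step outline transcribed from the workflow list above.
WORKFLOW = [
    ("Stage 1. Process the raw GBS data", [
        (1, "Parse the raw reads"),
        (2, "Trim based on quality and adaptors"),
        (3, "Demultiplex"),
    ]),
    ("Stage 2. Build the Mock Reference", [
        (4, "Cluster reads and assemble the Mock Reference"),
    ]),
    ("Stage 3. Map the processed reads and generate standardized alignment files", [
        (5, "Align with BWA-mem and process with SAMtools"),
        (6, "Parse mpileup outputs and produce the variants discovery matrix"),
    ]),
    ("Stage 4. Call Variants and Genotypes", [
        (7, "Filter variants and call genotypes"),
    ]),
]


def steps_to_run(has_reference_genome):
    """Return the step numbers to execute, in order.

    Assumption (not stated as a rule in this README): Step 4 is only
    needed when no reference genome is available.
    """
    steps = []
    for _stage, stage_steps in WORKFLOW:
        for number, _description in stage_steps:
            if number == 4 and has_reference_genome:
                continue  # a supplied reference replaces the Mock Reference
            steps.append(number)
    return steps


if __name__ == "__main__":
    print("Without a reference genome:", steps_to_run(False))
    print("With a reference genome:   ", steps_to_run(True))
```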

Below is a schematic of the workflow, with inputs and outputs (boxes) indicated for each step (arrows).

Released versions

v.4.1: Released on 10/6/2019
v.4.0: Released on 10/22/2018
v.3.0: Released on 2/8/2018
v.2.0: Released on 2/22/2017
v.1.1: Released on 3/11/2016
v.1.0: Released on 1/12/2016

Getting Help

Begin by carefully reading the GBS-SNP-CROP User Manual. Before posting a question or starting a discussion, please consult the FAQ page. Also check your barcode ID file for empty characters or blank spaces, and verify that it was saved as a tab-delimited file. If you still face an issue or have suggestions for improving this tool, please submit your question or comment to our Google Groups page.
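The barcode ID file check described above can be automated. The sketch below is a generic tab-delimited sanity check, not a tool shipped with GBS-SNP-CROP: the function name `check_barcode_lines` is hypothetical, the sample rows are made up for illustration, and the check assumes only that every row should have the same number of tab-separated fields with no whitespace inside any field. Consult the User Manual for the actual column layout.

```python
def check_barcode_lines(lines):
    """Return (line_number, problem) tuples for a tab-delimited file.

    Flags blank lines, empty fields, whitespace inside fields, and rows
    whose field count differs from the first non-blank row.
    """
    problems = []
    expected_columns = None
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if line.strip() == "":
            problems.append((number, "blank or whitespace-only line"))
            continue
        fields = line.split("\t")
        if expected_columns is None:
            expected_columns = len(fields)
        elif len(fields) != expected_columns:
            problems.append(
                (number, "expected %d tab-separated fields, found %d"
                 % (expected_columns, len(fields)))
            )
        for field in fields:
            if field == "":
                problems.append((number, "empty field"))
            elif any(ch.isspace() for ch in field):
                problems.append((number, "whitespace in field %r" % field))
    return problems


if __name__ == "__main__":
    # Made-up rows for illustration only; line 2 uses a space instead of a tab.
    example = ["ACGT\tsample1\n", "TTGA sample2\n", "CCAT\tsample3\n"]
    for line_number, problem in check_barcode_lines(example):
        print("line %d: %s" % (line_number, problem))
```

To check a real file, pass an open file handle, for example `check_barcode_lines(open("barcodes.txt"))`, where `barcodes.txt` is a placeholder for your own file name.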


Requirements

  • Java 7 or higher: the latest version of GBS-SNP-CROP (v.4.1) was tested with Java 8 (update 221)
  • Trimmomatic: latest version tested with v.0.39 (Bolger et al., 2014)
  • PEAR: latest version tested with v.0.9.11 (Zhang et al., 2014)
  • VSEARCH: latest version tested with v.2.13.7 (Rognes et al., 2016)
  • BWA aligner: latest version tested with v.0.7.12 (Li & Durbin, 2009)
  • SAMtools: latest version tested with v.1.7 (Li et al., 2009)
  • The following five CPAN modules also need to be installed: Getopt::Long, IO::Zlib, List::Util, List::MoreUtils, Parallel::ForkManager
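Before a first run it can help to confirm that the tools and Perl modules listed above are actually installed. The following is a minimal sketch, not part of GBS-SNP-CROP. The executable names in `EXECUTABLES` are assumptions about how the tools are installed locally, and the helper names are hypothetical. Trimmomatic is left out because it is commonly distributed as a Java `.jar` rather than a command on the `PATH`. Perl module names are case-sensitive, so the sketch uses the CPAN spellings `Getopt::Long` and `IO::Zlib`.

```python
import shutil
import subprocess

# Assumed executable names; adjust to match the local installation.
EXECUTABLES = ["java", "perl", "pear", "vsearch", "bwa", "samtools"]

# CPAN modules named in the requirements list (case-sensitive spellings).
PERL_MODULES = [
    "Getopt::Long",
    "IO::Zlib",
    "List::Util",
    "List::MoreUtils",
    "Parallel::ForkManager",
]


def missing_executables(names, which=shutil.which):
    """Return the names that cannot be found on the PATH."""
    return [name for name in names if which(name) is None]


def perl_module_available(module, run=subprocess.run):
    """Return True if `perl -MModule -e 1` exits with status 0."""
    try:
        result = run(
            ["perl", "-M" + module, "-e", "1"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        return False  # perl itself is not installed
    return result.returncode == 0


if __name__ == "__main__":
    for name in missing_executables(EXECUTABLES):
        print("missing executable:", name)
    for module in PERL_MODULES:
        if not perl_module_available(module):
            print("missing Perl module:", module)
```

The `which` and `run` parameters exist only so the checks can be exercised without the real tools installed.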


Citation

Melo et al. GBS-SNP-CROP: A reference-optional pipeline for SNP discovery and plant germplasm characterization using genotyping-by-sequencing data. BMC Bioinformatics. 2016. 17:29. DOI 10.1186/s12859-016-0879-y.

