
Remap the Genome in a Bottle NA12878 validation variant calls to human genome build 38


hbc/giab_remap_38


Genome in a Bottle NA12878 Human Genome 38 remapped validation set

Scripts to remap the Genome in a Bottle NA12878 validation variant calls to build 38 (GRCh38/hg38) of the human genome.

These scripts convert the VCF calls and assessment region BED files from build 37 to build 38 coordinates. For testing purposes we use two remapping approaches: CrossMap liftover with UCSC chain files, and NCBI Remap.
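To illustrate what coordinate remapping does to an assessment region, here is a minimal Python sketch of chain-style liftover. It is not the CrossMap or NCBI Remap implementation: the `ChainBlock` type, the `lift_interval` function, and the block coordinates in the usage note are hypothetical, and the sketch ignores strand changes and chromosome renaming.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ChainBlock:
    """One ungapped aligned block between two builds (0-based, half-open)."""
    chrom: str
    src_start: int  # block start on the source build (e.g. build 37)
    src_end: int    # block end on the source build
    dst_start: int  # start of the same block on the target build (e.g. build 38)


def lift_interval(
    chrom: str, start: int, end: int, blocks: List[ChainBlock]
) -> Optional[Tuple[str, int, int]]:
    """Lift a BED interval that lies entirely inside one block.

    Returns None when no single block contains the interval, which is
    the simplest way to model a region that does not map cleanly.
    """
    for block in blocks:
        if block.chrom == chrom and block.src_start <= start and end <= block.src_end:
            offset = block.dst_start - block.src_start
            return chrom, start + offset, end + offset
    return None
```

With two made-up blocks, `ChainBlock("chr1", 0, 1000, 500)` and `ChainBlock("chr1", 1200, 2000, 1700)`, the interval `chr1:100-200` lifts to `chr1:600-700`, while `chr1:900-1300` spans the gap between blocks and returns `None`. Real tools may split or partially map such intervals instead of dropping them.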

Results

  • Validation results

  • Remapped truth sets and validation files

    • Genome in a Bottle regions for GRCh37 that map to build 38: GiaB_v2_19-37_prep_regions.bed

    • CrossMap hg38 liftover with UCSC chain files, regions and VCF file: GiaB_v2_19-38_crossmap-regions.bed, GiaB_v2_19-38_crossmap.vcf.gz

    • NCBI remap hg38 regions and VCF file: GiaB_v2_19-38_remap-regions.bed, GiaB_v2_19-38_remap.vcf.gz

    • Validation VCFs and statistics using rtg vcfeval: giab-hg38-validation-results.tar.gz

    • Indels that are different but overlapping between Platinum Genomes and Genome in a Bottle: giab-platinum-indel-diffs.tar.gz
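One quick sanity check on the remapped region files is to total the bases each BED file covers and compare the CrossMap and NCBI Remap outputs. The following is a minimal sketch, assuming uncompressed BED input with non-overlapping intervals; `bed_total_bases` is a hypothetical helper and is not part of this repository.

```python
from typing import Iterable


def bed_total_bases(lines: Iterable[str]) -> int:
    """Sum interval lengths (end - start) over the lines of a BED file.

    Blank lines, comments, and track/browser header lines are skipped.
    Overlapping intervals are counted twice, so merge them first if needed.
    """
    total = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        start, end = int(fields[1]), int(fields[2])
        total += end - start
    return total


# Example usage (file names taken from the list above):
# with open("GiaB_v2_19-38_crossmap-regions.bed") as handle:
#     print(bed_total_bases(handle))
```

Running it on both region files gives a rough measure of how much of the assessed genome each remapping approach retains.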

Usage

Download the inputs with:

cd inputs && bash get_inputs.sh

Run the remapping with:

bash run.sh

Requirements

The remapping depends on external tools to do the actual work, including CrossMap and pyfaidx.

The easiest way to install the Python dependencies is with Miniconda. Once it is installed, run:

conda install -c bcbio crossmap pyfaidx
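After installing, it can help to confirm the dependencies are visible before starting a run. The sketch below is one possible check, not something this repository provides: `missing_dependencies` is a hypothetical helper, and the executable name `CrossMap.py` in the usage comment is an assumption that may differ between CrossMap versions.

```python
import importlib.util
import shutil
from typing import Iterable, List


def missing_dependencies(modules: Iterable[str], executables: Iterable[str]) -> List[str]:
    """Return the names of modules that cannot be imported and executables not on PATH."""
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    missing += [name for name in executables if shutil.which(name) is None]
    return missing


# Example usage (the executable name is an assumption):
# print(missing_dependencies(["pyfaidx"], ["CrossMap.py"]))
```

An empty list means everything requested was found in the active environment.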
