Version(V5.0)

Overview

Input

  • PacBio HiFi data
  • Omni-C sequencing data

Output

The final output corresponds to a dual or partially phased diploid assembly (http://lh3.github.io/2021/10/10/introducing-dual-assembly).

This version differs from earlier ones by adding a new step: we attempt to improve the genome assembly by manually curating both haplotypes with the Rapid Curation toolkit from the Wellcome Trust Sanger Institute.

Software versions

Most of the links point to the corresponding repository; the rest point to the tool's main documentation website.

QC and data preparation

| Purpose | Program | Version |
| --- | --- | --- |
| Filtering PacBio HiFi adapters | HiFiAdapterFilt | Commit 64d1c7b |
| K-mer counting | meryl | 1 |
| Estimation of genome size and heterozygosity | GenomeScope | 2 |
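
To make the k-mer counting step more concrete, here is a minimal Python sketch of the idea, not the pipeline's actual code. It counts canonical k-mers (each k-mer merged with its reverse complement) and collapses the counts into a multiplicity histogram, which is the kind of summary a genome-profiling tool such as GenomeScope takes as input. The function names, the toy reads, and the choice of `k` are all illustrative assumptions; on real HiFi data this work is done by meryl.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k):
    """Count canonical k-mers across an iterable of read sequences."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows that contain N or any other non-ACGT symbol.
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def kmer_histogram(counts):
    """Map multiplicity -> number of distinct k-mers seen that many times."""
    return Counter(counts.values())


if __name__ == "__main__":
    toy_reads = ["ACGTAC", "GTACGT"]  # hypothetical reads, for illustration only
    hist = kmer_histogram(count_kmers(toy_reads, k=3))
    for multiplicity in sorted(hist):
        print(multiplicity, hist[multiplicity])
```

The two-column output (multiplicity, number of distinct k-mers) mirrors the general shape of a k-mer histogram file; the exact format expected by a given tool should be checked against its documentation.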

Assembly

| Purpose | Program | Version |
| --- | --- | --- |
| de novo assembly (contigging) | HiFiasm | 0.16.1-r375 |
| Long-read, genome-genome alignment | minimap2 | 2.16 |
| (Optional) Remove low-coverage, duplicated contigs | purge_dups | 1.0.1 |
| Omni-C mapping for SALSA | Arima Genomics mapping pipeline | Commit 2e74ea4 |
| Omni-C scaffolding | SALSA | 2 |
| Gap closing | YAGCloser | Commit 20e2769 |

Omni-C Contact map generation

| Purpose | Program | Version |
| --- | --- | --- |
| Short-read alignment | bwa | 0.7.17-r1188 |
| SAM/BAM processing | samtools | 1.11 |
| SAM/BAM filtering | pairtools | 0.3.0 |
| Pairs indexing | pairix | 0.3.7 |
| Matrix generation | cooler | 0.8.10 |
| Matrix balancing | hicExplorer | 3.6 |
| Contact map visualization | HiGlass | 2.1.11 |
| Contact map generation | PretextMap | 0.1.4 |
| Contact map visualization | PretextView | 0.1.5 |
| Contact map visualization | PretextSnapshot | 0.0.3 |

Organelle assembly

| Purpose | Program | Version |
| --- | --- | --- |
| Mitogenome assembly | MitoHiFi | 2 Commit c06ed3e |

Genome quality assessment

| Purpose | Program | Version |
| --- | --- | --- |
| Basic assembly metrics | QUAST | 5.0.2 |
| Assembly completeness | BUSCO | 5.0.0 |
| k-mer based assembly evaluation | Merqury | 1 |
| Contamination screening | BlobToolKit | 2.3.3 |
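
As an illustration of what "basic assembly metrics" covers, the sketch below computes N50, one of the contiguity statistics QUAST reports. It is a simplified stand-in rather than QUAST's implementation: the function name `n50` and the example lengths are assumptions made for this example, and QUAST itself reports many more metrics.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("n50() needs at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths, for illustration only.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Computing N50 separately for each haplotype of a dual assembly gives a quick way to compare their contiguity before and after scaffolding or curation.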

Rapid curation

| Purpose | Program | Version |
| --- | --- | --- |
| Manual curation | Rapid curation pipeline (Wellcome Trust Sanger Institute, Genome Reference Informatics Team) | Commit 4ddca450 |
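
Because the tables above pin each tool to a specific release or commit, it can help to check an environment against those pins before a run. The following is a hedged sketch of one way to do that. The `PINNED` dictionary copies a subset of the versions listed above, while `matches_pin` and its matching rule are hypothetical helpers introduced here. How each tool actually prints its version is not covered by this document, so the version banner has to be captured separately and passed in as a string.

```python
# Subset of the versions listed in the tables above.
PINNED = {
    "hifiasm": "0.16.1-r375",
    "minimap2": "2.16",
    "purge_dups": "1.0.1",
    "bwa": "0.7.17-r1188",
    "samtools": "1.11",
    "pairtools": "0.3.0",
    "cooler": "0.8.10",
    "QUAST": "5.0.2",
    "BUSCO": "5.0.0",
}


def matches_pin(tool: str, reported: str) -> bool:
    """Check whether a tool's reported version banner matches its pinned version.

    A whitespace-separated token matches if it equals the pinned version,
    or extends it with a build suffix such as "-r123". This rule is an
    assumption for this sketch, not a convention defined by the tools.
    """
    pinned = PINNED[tool]
    for token in reported.split():
        token = token.lstrip("v").rstrip(",;)")
        if token == pinned or token.startswith(pinned + "-"):
            return True
    return False


if __name__ == "__main__":
    # Illustrative banner string; a real one would come from the tool itself.
    print(matches_pin("samtools", "samtools 1.11"))
```

Tools pinned by commit hash (for example HiFiAdapterFilt or YAGCloser) would need a different check, such as comparing against the checked-out commit of the cloned repository.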

Species generated with this pipeline

  • Phrynosoma blainvillii

Richmond JQ, McGuire JA, Escalona M, Marimuthu MPA, Nguyen O, Sacco S, Beraut E, Toffelmier E, Fisher RN, Wang IJ, Shaffer HB (2023) Reference genome of an iconic lizard in western North America, Blainville's horned lizard Phrynosoma blainvillii. Journal of Heredity, esad032. https://doi.org/10.1093/jhered/esad032