Version(V2.0)

Overview

Input

  • PacBio HiFi data
  • OmniC sequencing data

Output

The final output is a diploid assembly that follows the primary/alternate approach: we generate two pseudo-haplotypes (primary and alternate). The primary assembly is more complete and consists of longer phased blocks. The alternate assembly consists of haplotigs (contigs of clones with the same haplotype) in heterozygous regions; it is less complete and more fragmented. As explained by Heng Li (here), given these characteristics the alternate assembly cannot be considered on its own, only as a complement to the primary assembly.

There are three main differences from the previous version of the pipeline:

  • The chromatin conformation capture input is now Omni-C data rather than Hi-C.
  • The version of the de novo assembler (HiFiasm) changed from 0.13 to 0.15.
  • The workflow used to generate the mitogenome assembly changed to MitoHiFi.

Software

Most links point to the corresponding repository. The remaining links point to the main documentation site of the tool.

Assembly

| Purpose | Program | Version |
| --- | --- | --- |
| Filtering PacBio HiFi adapters | HiFiAdapterFilt | Commit 64d1c7b |
| K-mer counting | meryl | 1 |
| Estimation of genome size and heterozygosity | GenomeScope | 2 |
| de novo assembly (contiging) | HiFiasm | 0.15-r327 |
| Long-read, genome-genome alignment | minimap2 | 2.16 |
| Remove low-coverage, duplicated contigs | purge_dups | 1.0.1 |
| HiC mapping for SALSA | Arima Genomics mapping pipeline | Commit 2e74ea4 |
| HiC Scaffolding | SALSA | 2 |
| Gap closing | YAGCloser | Commit 20e2769 |
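
As a rough illustration of how the k-mer counting and contiging steps might be invoked, the sketch below builds (but does not run) command lines for meryl and HiFiasm. The exact parameters used in this pipeline are not stated here, so everything in the sketch is an assumption: the k-mer size of 21, the thread count, the file names, the use of HiFiasm's `--primary` option to obtain primary/alternate output, and the helper names `meryl_count_cmd` and `hifiasm_cmd`, which are hypothetical.

```python
import shlex
from typing import List


def meryl_count_cmd(reads: str, db: str, k: int = 21) -> List[str]:
    """Build a meryl k-mer counting command (k=21 is an assumed default)."""
    return ["meryl", "count", f"k={k}", reads, "output", db]


def hifiasm_cmd(reads: str, prefix: str, threads: int = 32,
                primary: bool = True) -> List[str]:
    """Build a HiFiasm contiging command.

    `--primary` asks HiFiasm for a primary and an alternate assembly;
    whether this pipeline sets it is an assumption.
    """
    cmd = ["hifiasm", "-o", prefix, "-t", str(threads)]
    if primary:
        cmd.append("--primary")
    cmd.append(reads)
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, printed as shell-ready strings.
    print(shlex.join(meryl_count_cmd("hifi.filt.fastq.gz", "hifi.meryl")))
    print(shlex.join(hifiasm_cmd("hifi.filt.fastq.gz", "asm")))
```

The commands are returned as argument lists so they can be passed to `subprocess.run` without shell quoting issues; check each tool's own help output before relying on any option.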

Omni-C Contact map generation

| Purpose | Program | Version |
| --- | --- | --- |
| Short-read alignment | bwa | 0.7.17-r1188 |
| SAM/BAM processing | samtools | 1.11 |
| SAM/BAM filtering | pairtools | 0.3.0 |
| Pairs indexing | pairix | 0.3.7 |
| Matrix generation | cooler | 0.8.10 |
| Matrix balancing | hicExplorer | 3.6 |
| Contact map visualization | HiGlass | 2.1.11 |
| Contact map generation | PretextMap | 0.1.4 |
| Contact map visualization | PretextView | 0.1.5 |
| Contact map visualization | PretextSnapshot | 0.0.3 |

Organelle assembly

| Purpose | Program | Version |
| --- | --- | --- |
| Mitogenome assembly | MitoHiFi | 2 Commit c06ed3e |

Genome quality assessment

| Purpose | Program | Version |
| --- | --- | --- |
| Basic assembly metrics | QUAST | 5.0.2 |
| Assembly completeness | BUSCO | 5.0.0 |
| k-mer based assembly evaluation | Merqury | 1 |
| Contamination screening | BlobToolKit | 2.3.3 |
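
To make "basic assembly metrics" concrete, here is a minimal sketch of N50, one of the contiguity statistics that QUAST reports. It illustrates the standard definition only and is not QUAST's implementation; the function name `n50` and the example lengths are hypothetical.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running against total to stay in integer arithmetic.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable: the loop always returns")


if __name__ == "__main__":
    # Hypothetical contig lengths in base pairs.
    print(n50([2, 3, 4, 5, 6]))
```

Comparing this value between the primary and alternate assemblies is one simple way to see the difference in fragmentation described in the Output section.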

Species generated with this pipeline

  • Actinemys marmorata

Todd BD, Jenkinson TS, Escalona M, Beraut E, Nguyen O, Sahasrabudhe R, Scott PA, Toffelmier E, Wang IJ, Shaffer HB (2022) Reference genome of the northwestern pond turtle, Actinemys marmorata. Journal of Heredity, 113 (6): 624–631, https://doi.org/10.1093/jhered/esac021

  • Haliotis cracherodii

Orland C*, Escalona M* (* Co-first author), Sahasrabudhe R, Marimuthu MPA, Nguyen O, Beraut E, Marshman B, Moore J, Raimondi P, Shapiro B (2022) A Draft Reference Genome Assembly of the Critically Endangered Black Abalone, Haliotis cracherodii. Journal of Heredity, 113 (6): 665–672, https://doi.org/10.1093/jhered/esac024

  • Laterallus jamaicensis

Hall LA, Wang IJ, Escalona M, Beraut E, Sacco S, Sahasrabudhe R, Nguyen O, Toffelmier E, Shaffer HB, Beissinger SR (2023) Reference genome of the black rail, Laterallus jamaicensis. Journal of Heredity, esad025. https://doi.org/10.1093/jhered/esad025

  • Rallus limicola

Hall LA, Wang IJ, Escalona M, Beraut E, Sacco S, Sahasrabudhe R, Nguyen O, Toffelmier E, Shaffer HB, Beissinger SR (2023) Reference genome of the Virginia rail, Rallus limicola. Journal of Heredity, esad026. https://doi.org/10.1093/jhered/esad026