Description of data

Josh Quick's talk at Genome Science 2018.

Zymo Community Standards 2 (Even)

  • 10 species, all with equal genomic DNA input
  • Useful for evaluating nanopore assembly and taxonomic assignments
  • GridION
    • Bead-beating (~10Gb) (GRIDION-EVEN-BB)
    • Metapolyzyme (~10Gb) (GRIDION-EVEN-MPZ)
    • Hybrid BB + MPZ (~17Gb) (GRIDION-EVEN-MPZ-BB)
  • Zymo Specification Sheet

Zymo Community Standards 2 (Log/Staggered)

  • 10 species with genomic DNA abundance staggered from 10^2 to 10^7
  • Useful for evaluating limit of detection at high coverage
  • PromethION
    • Bead-beating (~130Gb) (PION-LOG-BB)
  • Zymo Specification Sheet

Data availability

Basic run stats

PION-LOG-BB (PromethION)

Active channels:        2746
Mean read length:       3699.9
Mean read quality:      8.9
Median read length:     3342.0
Median read quality:    9.4
Number of reads:        35556299
Read length N50:        5071
Total bases:    131556241641
Number, percentage and megabases of reads above quality cutoffs
>Q5:    33095038 (93.1%) 13026.4Mb
>Q7:    29654587 (83.4%) 12130.3Mb
>Q10:   12428259 (35.0%) 5346.4Mb
>Q12:   43721 (0.1%) 13.3Mb
>Q15:   24 (0.0%) 0.0Mb
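The read count, total bases and mean read length above can be cross-checked directly from a FASTQ file. This is a minimal sketch, not the tool that produced the table; it assumes four lines per FASTQ record, and `fastq_basic_stats` and `reads.fq` are hypothetical names.

```shell
# Recompute a few of the summary statistics from a FASTQ file.
# Assumes four lines per record (no wrapped sequence lines).
fastq_basic_stats() {
    awk 'NR % 4 == 2 { n++; bases += length($0) }
         END {
             if (n == 0) exit 1
             printf "Number of reads:\t%.0f\n", n
             printf "Total bases:\t%.0f\n", bases
             printf "Mean read length:\t%.1f\n", bases / n
         }' "$1"
}

# Hypothetical input name; replace with the real FASTQ.
if [ -f reads.fq ]; then
    fastq_basic_stats reads.fq
fi
```

As a consistency check on the figures above, 131556241641 bases / 35556299 reads is roughly 3699.9, which matches the reported mean read length.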

PION-LOG-Yield

PION-LOG-ReadlengthLog

GRIDION-EVEN-BB (Zymo_CS_LSK109)

GRIDION-EVEN-MPZ (GridION-Zymo_CS_MPZ_LSK109)

GRIDION-EVEN-MPZ-BB (Zymo_CS_MPZBB_LSK109)

Initial assembly and consensus

minimap2 -t 24 -x ava-ont GridION-Zymo_CS_ALL3_LSK109.all.fq GridION-Zymo_CS_ALL3_LSK109.all.fq | gzip > GridION-Zymo_CS_ALL3_LSK109.all.fq.paf.gz
miniasm -f GridION-Zymo_CS_ALL3_LSK109.all.fq GridION-Zymo_CS_ALL3_LSK109.all.fq.paf.gz > GridION-Zymo_CS_ALL3_LSK109.all.miniasm.gfa
awk '/^S/{print ">"$2"\n"$3}' GridION-Zymo_CS_ALL3_LSK109.all.miniasm.gfa > GridION-Zymo_CS_ALL3_LSK109.all.miniasm.fa
minimap2 -t 12 -x map-ont GridION-Zymo_CS_ALL3_LSK109.all.miniasm.fa GridION-Zymo_CS_ALL3_LSK109.all.fq > GridION-Zymo_CS_ALL3_LSK109.all.reads_miniasm.paf
racon -t 36 GridION-Zymo_CS_ALL3_LSK109.all.fq GridION-Zymo_CS_ALL3_LSK109.all.reads_miniasm.paf GridION-Zymo_CS_ALL3_LSK109.all.miniasm.fa > GridION-Zymo_CS_ALL3_LSK109.all.miniasm.racon_r1.fa
minimap2 -t 12 -x map-ont GridION-Zymo_CS_ALL3_LSK109.all.miniasm.racon_r1.fa GridION-Zymo_CS_ALL3_LSK109.all.fq > GridION-Zymo_CS_ALL3_LSK109.all.reads_racon1.paf
racon -t 36 GridION-Zymo_CS_ALL3_LSK109.all.fq GridION-Zymo_CS_ALL3_LSK109.all.reads_racon1.paf GridION-Zymo_CS_ALL3_LSK109.all.miniasm.racon_r1.fa > GridION-Zymo_CS_ALL3_LSK109.all.miniasm.racon_r2.fa
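The two racon rounds above repeat the same map-then-polish step, each round taking the previous round's output as its draft. A sketch of that pattern for an arbitrary number of rounds follows; it only prints the commands, and the function name and the `<prefix>.reads_r<i>.paf` / `<prefix>.racon_r<i>.fa` naming are hypothetical rather than the file names used above.

```shell
# Print (not run) the minimap2 + racon commands for N polishing rounds.
# Hypothetical naming: round i writes <prefix>.racon_r<i>.fa
print_polish_rounds() {
    reads=$1      # FASTQ of all reads
    draft=$2      # starting assembly (FASTA)
    prefix=$3     # output prefix
    rounds=$4     # number of racon rounds
    i=1
    while [ "$i" -le "$rounds" ]; do
        paf="${prefix}.reads_r${i}.paf"
        out="${prefix}.racon_r${i}.fa"
        echo "minimap2 -t 12 -x map-ont $draft $reads > $paf"
        echo "racon -t 36 $reads $paf $draft > $out"
        draft=$out   # the next round polishes this round's output
        i=$((i + 1))
    done
}

# Example with placeholder file names.
print_polish_rounds reads.fq draft.fa sample 2
```

The output can be reviewed and then piped to `sh` to execute it. File names containing spaces would need extra quoting.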

kraken2 taxonomic assignment

  • hash.k2d (30 GB, b327a46e5f8122c6ce627aecf13ae5b1)
  • opts.k2d (48 B, e77f42c833b99bf91a8315a3c19f83f7)
  • taxo.k2d (1.7 MB, 764fee20387217bd8f28ec9bf955c484)
mkdir kraken2-microbial-fatfree/
cd kraken2-microbial-fatfree/
wget https://refdb.s3.climb.ac.uk/kraken2-microbial/hash.k2d
wget https://refdb.s3.climb.ac.uk/kraken2-microbial/opts.k2d
wget https://refdb.s3.climb.ac.uk/kraken2-microbial/taxo.k2d
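Since `hash.k2d` is a 30 GB download, it is worth checking the files against the listed digests before running kraken2. The sketch below assumes the 32-character hex strings above are MD5 sums (the list does not say so explicitly) and that GNU `md5sum` is available; `verify_md5` is a hypothetical helper name.

```shell
# Assumption: the hex strings listed for the .k2d files are MD5 digests.
verify_md5() {
    # $1 = file path, $2 = expected MD5 hex digest
    actual=$(md5sum "$1" | awk '{print $1}')
    if [ "$actual" = "$2" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1 (got $actual)" >&2
        return 1
    fi
}

# Run from inside kraken2-microbial-fatfree/ once the downloads finish.
for entry in \
    hash.k2d:b327a46e5f8122c6ce627aecf13ae5b1 \
    opts.k2d:e77f42c833b99bf91a8315a3c19f83f7 \
    taxo.k2d:764fee20387217bd8f28ec9bf955c484
do
    file=${entry%%:*}
    sum=${entry##*:}
    if [ -f "$file" ]; then
        verify_md5 "$file" "$sum"
    else
        echo "skipping $file (not downloaded)"
    fi
done
```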

awk '/^S/{print ">"$2"\n"$3}' GridION-Zymo_CS_ALL3_LSK109.all.gfa > GridION-Zymo_CS_ALL3_LSK109.all.gfa.fa
kraken2 --db kraken2-microbial-fatfree/ --threads 12 GridION-Zymo_CS_ALL3_LSK109.all.gfa.fa > GridION-Zymo_CS_ALL3_LSK109.all.gfa.fa.krak2
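The `.krak2` file is kraken2's standard per-sequence output: tab-separated, with `C` or `U` in the first column and the assigned taxonomy ID in the third. A quick tally of how many contigs were classified, and to which taxa, might look like the sketch below; `summarise_krak2` is a hypothetical helper, and it assumes kraken2 was run without `--use-names`, as in the command above.

```shell
# Count classified/unclassified sequences and sequences per taxid
# in kraken2's standard (tab-separated) output.
summarise_krak2() {
    awk -F '\t' '
        $1 == "C" { classified++; taxa[$3]++ }
        $1 == "U" { unclassified++ }
        END {
            printf "classified\t%d\nunclassified\t%d\n", classified, unclassified
            for (t in taxa) printf "taxid\t%s\t%d\n", t, taxa[t]
        }' "$1"
}

if [ -f GridION-Zymo_CS_ALL3_LSK109.all.gfa.fa.krak2 ]; then
    summarise_krak2 GridION-Zymo_CS_ALL3_LSK109.all.gfa.fa.krak2
fi
```

The per-taxid lines come out in no particular order; pipe them through `sort -k3,3nr` to rank taxa by contig count.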

bracken abundance assessment

Additionally, we've processed the database with bracken (assuming a read length distributed around 2500 bp).

cd kraken2-microbial-fatfree/
wget https://refdb.s3.climb.ac.uk/kraken2-microbial/database.kraken
wget https://refdb.s3.climb.ac.uk/kraken2-microbial/database2500mers.kraken
wget https://refdb.s3.climb.ac.uk/kraken2-microbial/database2500mers.kmer_distrib
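No bracken invocation is given here, so the following is only a sketch of how these files might be used. It assumes read-level classification of a FASTQ, and the input and output names (`reads.fq`, `reads.kreport`, `reads.krak2`, `reads.bracken`) are hypothetical. The read length passed to bracken has to match the downloaded `database2500mers.kmer_distrib` file.

```shell
DB=kraken2-microbial-fatfree
READ_LEN=2500                      # must match the downloaded *mers.kmer_distrib file
KMER_DISTRIB="$DB/database${READ_LEN}mers.kmer_distrib"
READS=reads.fq                     # hypothetical input FASTQ
REPORT=reads.kreport               # hypothetical kraken2 report name

if command -v kraken2 >/dev/null 2>&1 && command -v bracken >/dev/null 2>&1 \
   && [ -f "$KMER_DISTRIB" ] && [ -f "$READS" ]; then
    # bracken works from a kraken2 report, so classify with --report first.
    kraken2 --db "$DB" --threads 12 --report "$REPORT" "$READS" > reads.krak2
    # -l S requests species-level abundance estimates.
    bracken -d "$DB" -i "$REPORT" -o reads.bracken -r "$READ_LEN" -l S
else
    echo "kraken2/bracken, $KMER_DISTRIB or $READS not found; skipping" >&2
fi
```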