NA12878 Human Reference on Oxford Nanopore MinION

Contributors

Mark Akeson (1), Andrew D. Beggs (2), Thomas Nieto (2), Miten Jain (1), Nicholas J. Loman (3), Matt Loose (4), Sunir Malla (4), Justin O’Grady (5), Hugh E. Olsen (1), Josh Quick (3), Hollian Richardson (5), Jared T. Simpson (6,7), Terrance P. Snutch (8), Louise Tee (2), John R. Tyson (8)

  1. University of California, Santa Cruz, Santa Cruz, CA, USA
  2. University of Birmingham, Birmingham, B15 2TT
  3. Institute of Microbiology and Infection, School of Biosciences, University of Birmingham, Birmingham, B15 2TT, United Kingdom
  4. DeepSeq, School of Life Sciences, University of Nottingham, Nottingham, UK
  5. Norwich Medical School, University of East Anglia, Norwich, NR4 7UQ, United Kingdom
  6. Ontario Institute for Cancer Research, Toronto, Canada
  7. Department of Computer Science, University of Toronto, Toronto, Canada
  8. Michael Smith Laboratories, University of British Columbia, Vancouver, Canada

Background

We have sequenced the CEPH1463 (NA12878/GM12878, Ceph/Utah pedigree) human genome reference standard on the Oxford Nanopore MinION using 1D ligation kits (450 bp/s) with R9.4 chemistry (FLO-MIN106).

Human genomic DNA from the GM12878 human cell line (Ceph/Utah pedigree) was either purchased from Coriell (cat. no. NA12878), referred to as "DNA", or extracted from the cultured cell line, referred to as "cells". As the DNA is native, modified bases will be preserved.

Data availability

Check back in the next few days for the remaining reads, alignments and raw signal-level reads.

rel2

We have processed approximately two-thirds of the total dataset.

The current release, rel2, consists of:

  • 25 flowcells
  • 58958035887 bases
  • 9053909 reads

| flowcell_id | reads | bases | Date | Centre | SampleType | Links |
|---|---|---|---|---|---|---|
| FAB39075 | 466324 | 2439308482 | 20/09/16 | UBC | DNA | FASTQ |
| FAB39043 | 305667 | 1543725551 | 23/09/16 | Bham | DNA | FASTQ |
| FAB42706 | 400751 | 1857323339 | 12/10/16 | UBC | DNA | FASTQ |
| FAB42316 | 107013 | 606761274 | 14/10/16 | Notts | DNA | FASTQ |
| FAB42205 | 312666 | 1664297400 | 14/10/16 | Notts | DNA | FASTQ |
| FAB42561 | 231562 | 1510037000 | 19/10/16 | Notts | DNA | FASTQ |
| FAB42473 | 598480 | 3140575707 | 20/10/16 | UBC | DNA | FASTQ |
| FAB42476 | 376897 | 2061807133 | 27/10/16 | UBC | DNA | FASTQ |
| FAB42451 | 769524 | 4256154457 | 28/10/16 | Notts | DNA | FASTQ |
| FAB42704 | 276151 | 1750146174 | 28/10/16 | UBC | DNA | FASTQ |
| FAB42810 | 265456 | 1665251718 | 02/11/16 | Norwich | DNA | FASTQ |
| FAB46683 | 72602 | 286264094 | 17/11/16 | Bham | DNA | FASTQ |
| FAB45332 | 530913 | 2863965040 | 17/11/16 | UBC | DNA | FASTQ |
| FAB43577 | 241646 | 1423672212 | 18/11/16 | UCSC | DNA | FASTQ |
| FAB44989 | 558195 | 3443623448 | 18/11/16 | UCSC | DNA | FASTQ |
| FAF01169 | 16489 | 120873419 | 22/11/16 | Bham | Cells | FASTQ |
| FAF01441 | 43281 | 358912895 | 22/11/16 | Bham | Cells | FASTQ |
| FAB45277 | 53541 | 445614920 | 22/11/16 | Notts | Cells | FASTQ |
| FAB45321 | 299172 | 2583989736 | 22/11/16 | Notts | Cells | FASTQ |
| FAF01132 | 689781 | 5455971336 | 25/11/16 | Bham | Cells | FASTQ |
| FAF01127 | 632728 | 4972081712 | 25/11/16 | Bham | Cells | FASTQ |
| FAB49712 | 592317 | 4589575564 | 28/11/16 | Bham | Cells | FASTQ |
| FAF01253 | 442221 | 3476220233 | 28/11/16 | Bham | Cells | FASTQ |
| FAB49914 | 309162 | 2840857895 | 28/11/16 | Notts | Cells | FASTQ |
| FAB45271 | 461370 | 3601025148 | 28/11/16 | Notts | Cells | FASTQ |
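
The release totals can be cross-checked against the per-flowcell rows. The following is a minimal Python sketch, assuming the rows are copied into a whitespace-delimited string (flowcell ID, reads, bases, date, centre, sample type). The names `ROWS` and `summarise` are illustrative and not part of any tooling shipped with this dataset.

```python
from collections import defaultdict

# Per-flowcell rows copied from the rel2 table above:
# flowcell_id reads bases date centre sample_type
ROWS = """
FAB39075 466324 2439308482 20/09/16 UBC DNA
FAB39043 305667 1543725551 23/09/16 Bham DNA
FAB42706 400751 1857323339 12/10/16 UBC DNA
FAB42316 107013 606761274 14/10/16 Notts DNA
FAB42205 312666 1664297400 14/10/16 Notts DNA
FAB42561 231562 1510037000 19/10/16 Notts DNA
FAB42473 598480 3140575707 20/10/16 UBC DNA
FAB42476 376897 2061807133 27/10/16 UBC DNA
FAB42451 769524 4256154457 28/10/16 Notts DNA
FAB42704 276151 1750146174 28/10/16 UBC DNA
FAB42810 265456 1665251718 02/11/16 Norwich DNA
FAB46683 72602 286264094 17/11/16 Bham DNA
FAB45332 530913 2863965040 17/11/16 UBC DNA
FAB43577 241646 1423672212 18/11/16 UCSC DNA
FAB44989 558195 3443623448 18/11/16 UCSC DNA
FAF01169 16489 120873419 22/11/16 Bham Cells
FAF01441 43281 358912895 22/11/16 Bham Cells
FAB45277 53541 445614920 22/11/16 Notts Cells
FAB45321 299172 2583989736 22/11/16 Notts Cells
FAF01132 689781 5455971336 25/11/16 Bham Cells
FAF01127 632728 4972081712 25/11/16 Bham Cells
FAB49712 592317 4589575564 28/11/16 Bham Cells
FAF01253 442221 3476220233 28/11/16 Bham Cells
FAB49914 309162 2840857895 28/11/16 Notts Cells
FAB45271 461370 3601025148 28/11/16 Notts Cells
"""


def summarise(rows_text):
    """Return (flowcell_count, total_reads, total_bases, per_sample_type)."""
    per_type = defaultdict(lambda: [0, 0])  # sample_type -> [reads, bases]
    count = reads_total = bases_total = 0
    for line in rows_text.strip().splitlines():
        _flowcell_id, reads, bases, _date, _centre, sample_type = line.split()
        reads, bases = int(reads), int(bases)
        count += 1
        reads_total += reads
        bases_total += bases
        per_type[sample_type][0] += reads
        per_type[sample_type][1] += bases
    return count, reads_total, bases_total, dict(per_type)


count, reads_total, bases_total, per_type = summarise(ROWS)
print(f"{count} flowcells, {reads_total} reads, {bases_total} bases")
print(f"mean read length: {bases_total / reads_total:.0f} bases")
for sample_type, (reads, bases) in sorted(per_type.items()):
    print(f"{sample_type}: {reads} reads, {bases} bases")
```

The per-sample-type breakdown is included because the purchased DNA and cell-extracted libraries are reported separately in the table.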

Please verify your downloads against the published MD5 hashes and list of links.
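
One way to perform this check is sketched below in Python. It assumes the hashes are distributed as an `md5sum`-style text file (one `<hex digest>  <file name>` pair per line), which may not match the actual layout of the published list. The function names and the file name in the usage comment are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks
    so that multi-gigabyte FASTQ files do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksum_list(text):
    """Parse md5sum-style lines into a {file name: hex digest} dict.
    The line format is an assumption about how the hashes are published."""
    expected = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, name = line.split(maxsplit=1)
        # md5sum marks binary-mode entries with a leading '*'
        expected[name.strip().lstrip("*")] = digest.lower()
    return expected


def verify(path, expected_md5):
    """Return True if the file at `path` has the expected MD5 digest."""
    return md5_of_file(path) == expected_md5.lower()


# Hypothetical usage:
#   expected = parse_checksum_list(open("md5sums.txt").read())
#   ok = verify("FAB39075.fastq.gz", expected["FAB39075.fastq.gz"])
```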

Read lengths

Cellular library read length distribution

Figure: A typical read length distribution from a flowcell on which a cell-extracted DNA library was run. The y-axis shows the count of bases. The mean read length is ~8.6 kb, with an N50 of ~12.5 kb (vertical line). Reads longer than 60 kb are not expected due to limitations of the QIAGEN extraction kit employed.
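
For readers who want to reproduce the mean read length and N50 for a flowcell, the following Python sketch shows one way to compute them. It uses the usual N50 definition (the length L such that reads of length L or longer contain at least half of all bases) and assumes uncompressed FASTQ with exactly four lines per record. The function names are illustrative, and this is not necessarily how the figure was produced.

```python
def fastq_read_lengths(handle):
    """Yield the sequence length of each record in a FASTQ stream.
    Assumes four lines per record (no wrapped sequence lines)."""
    for index, line in enumerate(handle):
        if index % 4 == 1:  # the sequence is the second line of each record
            yield len(line.rstrip("\r\n"))


def mean_length(read_lengths):
    """Arithmetic mean of the read lengths."""
    lengths = list(read_lengths)
    return sum(lengths) / len(lengths)


def n50(read_lengths):
    """Length L such that reads of length >= L hold at least half the bases."""
    lengths = sorted(read_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if running * 2 >= total:  # integer comparison avoids float rounding
            return length
    raise ValueError("no read lengths given")


# Hypothetical usage:
#   with open("FAB45321.fastq") as handle:
#       lengths = list(fastq_read_lengths(handle))
#   print(mean_length(lengths), n50(lengths))
```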

Acknowledgements

We would like to acknowledge the support of Oxford Nanopore Technologies in generating this dataset, with particular thanks to Rosemary Dokos, Oliver Hartwell, Jonathan Pugh and Clive Brown. We would also like to thank Radoslaw Poplawski and Simon Thompson for technical assistance with configuring and optimising the CLIMB platform file system.

Contact

Please raise any issues concerning this dataset on this GitHub repository. A preprint describing the dataset in more detail will be available shortly.

History

* rel1: 1st December 2016. Initial release.