
C. elegans data set

mbadgettexas edited this page Feb 24, 2015 · 3 revisions

Data Link: http://datasets.pacb.com.s3.amazonaws.com/2014/c_elegans/list.html

README (Last Updated 10/15/2014)

  1. INTRODUCTION

This README file provides descriptive information for the contents found in this directory.

The dataset released in this directory contains the results of PacBio® SMRT® Sequencing for Caenorhabditis elegans, a Bristol mutant strain, as a resource for general community exploration. The genome was sequenced using P6-C4 chemistry and a 20 kb insert library, with size selection performed using a 15-50 kb elution window protocol on a BluePippin™ DNA size-selection system from SAGE Science. This generated 4.57 Gb of unfiltered bases used for assembly.

The genome was assembled using HGAP3 and polished with Quiver. The results of the Celera assembly and the genome sequence after polishing with Quiver (see REFERENCE below) are also provided for those interested in the comparison. The preassembled reads were generated using a seed read cutoff of 13,854 bp. Provided are the polished assembly and raw data from 11 SMRT® Cells.
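For readers unfamiliar with the term, a seed read cutoff is a length threshold: reads at or above it serve as seeds for preassembly. The Python sketch below shows one common way such a threshold can be derived, by taking the longest reads until they reach a target coverage of the genome. This README does not state how the 13,854 bp value was chosen, so the function name `seed_length_cutoff` and the default 30x target are assumptions for illustration only.

```python
def seed_length_cutoff(read_lengths, genome_size, target_coverage=30):
    """Return the shortest read length among the longest reads that
    together reach target_coverage x genome_size bases.

    The 30x default is an illustrative assumption, not a value taken
    from this dataset's README.
    """
    needed = genome_size * target_coverage
    running = 0
    # Walk reads from longest to shortest, accumulating bases.
    for length in sorted(read_lengths, reverse=True):
        running += length
        if running >= needed:
            return length
    # Not enough data to reach the target: every read becomes a seed.
    return min(read_lengths) if read_lengths else 0
```

A call such as `seed_length_cutoff(lengths, genome_size=103_020_000)` would use the genome size listed in the stats of this README; the read lengths themselves would have to be extracted from the raw data.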

Some Basic Stats:

Genome size: 103.02 Mb
GC content: ~36%
Raw data: 4.57 Gb
Assembly Coverage: 39.45x
Polished Contigs: 245
Max Contig Length: 3.17 Mb
N50 Contig Length: 1.61 Mb
Sum of Contig Lengths: 104.17 Mb
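The contig statistics above can be recomputed from a FASTA file. The Python sketch below is a minimal illustration using only the standard library; the function names `fasta_lengths` and `n50` are hypothetical helpers, not part of HGAP or any PacBio tool. N50 is taken here as the contig length at which the cumulative sum of contigs, sorted from longest to shortest, first reaches half of the total assembly length.

```python
def fasta_lengths(lines):
    """Return the sequence length of each record in FASTA-formatted lines."""
    lengths = []
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A header starts a new record; store the previous one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0
```

With the polished assembly downloaded locally, something like `lengths = fasta_lengths(open(path))` followed by `len(lengths)`, `max(lengths)`, `sum(lengths)` and `n50(lengths)` would give the contig count, maximum contig length, sum of contig lengths and N50, where `path` points to the local copy of the file.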

  2. DESCRIPTION OF FILES

-polished-assembly.fasta Celera® Assembler genome assembly and Quiver polished result.
-corrected.fasta All preassembled PacBio reads for direct input into Celera Assembler
-corrected.fastq All preassembled PacBio reads for direct input into Celera Assembler with quality values

  3. REFERENCE

Celera Assembler Resource: http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=Main_Page

HGAP Bioinformatics Wiki: https://github.com/PacificBiosciences/Bioinformatics-Training/wiki/HGAP

HGAP Publication in Nature Methods: Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Chin CS, Alexander DH, Marks P, Klammer AA, Drake J, Heiner C, Clum A, Copeland A, Huddleston J, Eichler EE, Turner SW, Korlach J. Nat Methods. 2013 Jun;10(6):563-9. doi: 10.1038/nmeth.2474. Epub 2013 May 5.

Quiver Bioinformatics Wiki: https://github.com/PacificBiosciences/GenomicConsensus/blob/master/doc/HowToQuiver.rst

We would like to thank Brandeis University, Waltham MA for the supply of the input DNA material.

Please also read the supplemental section of the HGAP publication in Nature Methods for a better understanding of Quiver.
