An introduction about the Genome-in-a-Bottle project
Latest commit 81be291 Oct 13, 2015 @chunlinxiao chunlinxiao Update


An introduction about the Genome in a Bottle Consortium

The Genome in a Bottle Consortium is a collaboration among NIST, FDA, NCBI, other government agencies, academic sequencing groups, sequencing technology developers, and clinical laboratories. A principal motivation for this consortium is to develop widely accepted reference materials and accompanying performance metrics to provide a strong scientific foundation for the development of regulations and professional standards for clinical sequencing. In addition, these genomes, characterized with many methods, are being used extensively for the development and optimization of technologies and bioinformatics.

NIST has developed large batches of human genome DNA from several cell lines for NIST Reference Materials (RMs), which have been characterized by the Consortium for homogeneity, stability, and sequence with as many sequencing technologies and library preparation methods as possible. Information from these datasets is being integrated to form high-confidence genotype calls, which can be used by clinical and research laboratories to understand performance of their sequencing and bioinformatics methods.

NCBI is serving as the DCC and repository for the raw sequencing reads, mapped reads, genotypes, and other details for each sample on a dedicated FTP site. The pilot sample is NA12878 (HG001). NIST received over 8,000 aliquots in April 2013, which were initially distributed to partners in the Consortium to assist in characterization, and the sample became available from NIST as Reference Material 8398 in May 2015.

Samples from an Ashkenazim trio (son HG002-NA24385-huAA53E0, father HG003-NA24149-hu6E4515, and mother HG004-NA24143-hu8E87A9) and a Han Chinese trio (son HG005-NA24631-hu91BD69, father NA24694-huCA017E, and mother NA24695-hu38168C) from the Personal Genome Project (PGP) are also candidate NIST reference materials and are currently being characterized. In early 2016, NIST plans to make the Ashkenazim trio available as NIST RMs 8391 (son only) and 8392 (entire trio). Only the son of the Han Chinese trio will be a NIST RM (8393).

DNA and cell lines for all samples are also available from Coriell, but the NIST RMs are from a single homogenized batch of DNA, so there may be small differences between the samples at Coriell and the NIST RMs.
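The sample labels above combine up to three identifiers: a GIAB ID (`HG…`), a Coriell ID (`NA…`), and a PGP ID (`hu…`). As a minimal sketch, the Python below splits such labels into their parts and records which NIST RM each sample is associated with in the text. The `Sample` class and the `parse_sample` and `samples_in_rm` helpers are hypothetical names introduced here for illustration; they are not part of any GIAB tooling. The label `HG001-NA12878` is rearranged from "NA12878 (HG001)" to match the trio format, and RMs 8391, 8392, and 8393 are recorded as planned rather than released.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Sample:
    giab_id: Optional[str]      # "HG…" identifier; None where the text gives none
    coriell_id: str             # "NA…" identifier
    pgp_id: Optional[str]       # "hu…" Personal Genome Project identifier
    role: str                   # "pilot", "son", "father", or "mother"
    family: str                 # "pilot", "Ashkenazim", or "Han Chinese"
    nist_rms: Tuple[str, ...] = ()  # RM numbers named in the text (some only planned)


def parse_sample(label: str, role: str, family: str,
                 nist_rms: Tuple[str, ...] = ()) -> Sample:
    """Split a hyphen-joined label such as 'HG002-NA24385-huAA53E0'."""
    parts = label.split("-")

    def pick(prefix: str) -> Optional[str]:
        # Return the first part starting with the prefix, or None.
        return next((p for p in parts if p.startswith(prefix)), None)

    coriell = pick("NA")
    if coriell is None:
        raise ValueError(f"no Coriell (NA) identifier in {label!r}")
    return Sample(pick("HG"), coriell, pick("hu"), role, family, nist_rms)


# Samples exactly as listed in the text; nothing is filled in beyond it.
SAMPLES: List[Sample] = [
    parse_sample("HG001-NA12878", "pilot", "pilot", ("8398",)),
    parse_sample("HG002-NA24385-huAA53E0", "son", "Ashkenazim", ("8391", "8392")),
    parse_sample("HG003-NA24149-hu6E4515", "father", "Ashkenazim", ("8392",)),
    parse_sample("HG004-NA24143-hu8E87A9", "mother", "Ashkenazim", ("8392",)),
    parse_sample("HG005-NA24631-hu91BD69", "son", "Han Chinese", ("8393",)),
    parse_sample("NA24694-huCA017E", "father", "Han Chinese"),
    parse_sample("NA24695-hu38168C", "mother", "Han Chinese"),
]


def samples_in_rm(rm: str) -> List[Sample]:
    """Return the samples associated with a given NIST RM number."""
    return [s for s in SAMPLES if rm in s.nist_rms]
```

For example, `samples_in_rm("8392")` returns the three members of the Ashkenazim trio, while the Han Chinese parents carry an empty `nist_rms` tuple because only the son is planned as an RM.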

Details about the NIST Reference Materials, data, and future plans are available online. Once the NIST RMs are available, they can be purchased from NIST, and a Report of Investigation describing the DNA will also be available there.

Bioproject page:

SRA Run Selector page:

Amazon AWS S3 bucket: s3://giab
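
As a rough sketch of how the bucket contents might be browsed without extra tooling, the Python below builds a listing request for the `giab` bucket and parses the XML response using only the standard library. It assumes the bucket allows anonymous listing over HTTPS at the virtual-hosted-style endpoint `giab.s3.amazonaws.com`, which this document does not state. The helper names `listing_url`, `parse_listing`, and `list_bucket` are hypothetical, and a single request returns at most one page of results (pagination is not handled).

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET
from typing import List, Tuple

# XML namespace used by S3 bucket-listing responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def listing_url(bucket: str = "giab", prefix: str = "", delimiter: str = "/") -> str:
    """Build a ListObjectsV2 URL (assumes anonymous access is permitted)."""
    query = urllib.parse.urlencode(
        {"list-type": "2", "prefix": prefix, "delimiter": delimiter}
    )
    return f"https://{bucket}.s3.amazonaws.com/?{query}"


def parse_listing(xml_text: str) -> Tuple[List[str], List[str]]:
    """Return (sub-prefixes, object keys) from a listing response body."""
    root = ET.fromstring(xml_text)
    prefixes = [e.text for e in root.findall("s3:CommonPrefixes/s3:Prefix", S3_NS)]
    keys = [e.text for e in root.findall("s3:Contents/s3:Key", S3_NS)]
    return prefixes, keys


def list_bucket(bucket: str = "giab", prefix: str = "") -> Tuple[List[str], List[str]]:
    """Fetch and parse one page of the listing; requires network access."""
    with urllib.request.urlopen(listing_url(bucket, prefix), timeout=30) as resp:
        return parse_listing(resp.read().decode("utf-8"))


# Example (not run on import, since it needs network access):
#     folders, files = list_bucket()
#     print(folders, files)
```

Separating `parse_listing` from the network call keeps the parsing logic checkable offline against a saved response.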

GIAB Main ftp site: