OTU tables and QIIME-compliant mapping files used to generate figures and statistics for the American Gut project. All data are de-identified. OTUs in these tables were picked against Greengenes 13_8 at 97% similarity using SortMeRNA.
The American Gut tables hosted in the repository have not been updated since May 2015 and reflect an old version of the American Gut survey. The latest American Gut BIOM tables and mapping files can be found at ftp://ftp.microbio.me/AmericanGut/latest.
The following studies are being used to provide context for the American Gut data:
- Human Microbiome Project, v35 reads. The sequence data were generated using 454 instruments.
- Global Gut. The sequence data were generated using HiSeq instruments.
- Personal Genome Project microbiome samples, unpublished. The sequence data were generated using MiSeq instruments.
Each study used is described by an acronym:
- AG, American Gut
- HMP, Human Microbiome Project
- GG, Global Gut
- PGP, Personal Genome Project
The provided BIOM tables have a few different tags in the filenames to describe the included data.
- 100nt - The sequences were trimmed to 100 nucleotides prior to OTU picking
- even1k - The full table was rarefied to 1000 sequences per sample
- even10k - The full table was rarefied to 10000 sequences per sample
The trimming is necessary when combining data from studies that used different sequencing technologies (e.g., HiSeq vs. MiSeq).
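
As a rough illustration of what the even1k and even10k tags mean, the sketch below subsamples the OTU counts of a single sample down to a fixed depth without replacement. It is not the code used to build these tables; the function name `rarefy`, the seed, and the choice to return `None` for samples below the requested depth are assumptions made for the example.

```python
import numpy as np

def rarefy(counts, depth, seed=0):
    """Subsample one sample's OTU counts to exactly `depth` sequences.

    Hypothetical helper for illustration only. Returns None when the
    sample holds fewer than `depth` sequences (an assumed policy).
    """
    counts = np.asarray(counts, dtype=int)
    if counts.sum() < depth:
        return None
    rng = np.random.default_rng(seed)
    # Expand the counts into one OTU index per observed sequence,
    # then draw `depth` of those sequences without replacement.
    pool = np.repeat(np.arange(counts.size), counts)
    drawn = rng.choice(pool, size=depth, replace=False)
    # Tally the drawn sequences back into per-OTU counts.
    return np.bincount(drawn, minlength=counts.size)

original = [1500, 300, 200]          # made-up counts for three OTUs
even1k = rarefy(original, 1000)      # analogous to the even1k depth
too_shallow = rarefy([10, 5], 1000)  # not enough sequences for this depth
```

Every rarefied sample ends up with the same total, which is what makes samples from studies with very different sequencing depths comparable.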
The debug data files are derived from the main data files but contain only a 10% random subset (by sample) of each main file. Their purpose is to reduce the processing load on the results framework during testing.
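
A comparable subset could be drawn along the following lines. This is only a sketch under stated assumptions: the function name `debug_subset`, the fixed seed, and the sample IDs are hypothetical, and the actual procedure used to build the debug files is not documented here.

```python
import random

def debug_subset(sample_ids, fraction=0.1, seed=0):
    """Pick a reproducible random fraction of sample IDs.

    Hypothetical helper for illustration only.
    """
    ids = sorted(sample_ids)
    # Keep at least one sample so the subset is never empty.
    k = max(1, round(len(ids) * fraction))
    return sorted(random.Random(seed).sample(ids, k))

all_ids = [f"sample_{i:03d}" for i in range(50)]  # made-up sample IDs
debug_ids = debug_subset(all_ids)
```

The selected IDs would then be used to filter both the BIOM table and the mapping file so that the two stay in sync.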