Data, code and analysis results for the Earth Microbiome Project.

Earth Microbiome Project

The Earth Microbiome Project (EMP) is a systematic attempt to characterize global microbial taxonomic and functional diversity for the benefit of the planet and humankind.

The EMP is open science: anyone can get involved. The EMP data set is generated from samples that individual researchers have compiled and donated to the EMP. The samples from each group of researchers constitute an individual EMP study. In addition to the analyses that contributing researchers are doing on their individual studies, we are performing a cross-study meta-analysis. All per-study raw data is publicly available through the EMP Portal of the Qiita database. This GitHub repository contains resources for the EMP meta-analysis: links to the processed, combined (across studies) EMP data on our FTP site; code developed specifically for the EMP meta-analyses; and results of initial analyses, with new results added as they are generated.

If you're interested in getting involved in EMP data analyses, begin by reviewing the open issues, which describe analyses that we want to perform across studies. If you want to work on one of these analyses, or have ideas for other analyses that should be performed, get in touch with Luke Thompson, the project leader for the EMP.

Additional information is available on the Earth Microbiome Project website.

Organization of this repository

  • data/ data files used for downstream analysis (BIOM tables, trees, mapping files, etc.)

    • data_locations.txt links to where large data files can be found (e.g., BIOM and tree files)
    • MIxS/ Excel files describing MIxS, EBI, and Qiita metadata standard requirements; used to generate metadata templates
    • sequence-lookup/ files used for the EMP Trading Cards (sequence lookup) notebooks (e.g., RDP taxonomy files)
  • ipynb/ IPython notebooks developed for meta-analysis of EMP data (Thompson et al., in prep.)

    • 01-metadata-processing/
    • 02-sequence-processing/
    • 03-otu-picking/
    • 04-rarefaction-and-subsets/
    • 05-alpha-diversity/
    • 06-beta-diversity/
    • 07-environmental-covariation/
    • 08-cooccurrence-and-nestedness/
    • 09-sequence-lookup/
  • legacy/ code, results, and website documents from the early phase of the EMP (2010-2013)

  • presentations/ collection of presentations on the EMP

  • results/ diversity analyses and high-level results (e.g., figures and tables that are useful for presentations)

    • results_locations.txt links to where large results files can be found (e.g., alpha- and beta-diversity results)
  • scripts/ utility scripts and EMP code other than notebooks

    • 01-metadata-templates/
    • 02-colors-and-styles/
    • 03-phylogenetic-placement/
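
The subdirectories of ipynb/ and scripts/ listed above carry two-digit numeric prefixes. As a minimal sketch, assuming those prefixes indicate the intended order of the steps (the README does not say so explicitly), the following Python lists the numbered subdirectories of a local clone in that order. The function names `parse_step`, `ordered_steps`, and `list_steps` are hypothetical helpers written for this illustration and are not part of the repository.

```python
import re
from pathlib import Path

# Matches names such as "05-alpha-diversity": two digits, a hyphen, a label.
STEP_DIR = re.compile(r"^(\d{2})-(.+)$")


def parse_step(name):
    """Return (step_number, label) for a numbered directory name, else None."""
    match = STEP_DIR.match(name.rstrip("/"))
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def ordered_steps(names):
    """Keep only numbered names and sort them by their numeric prefix."""
    numbered = [n for n in names if parse_step(n) is not None]
    return sorted(numbered, key=parse_step)


def list_steps(root):
    """List the numbered subdirectories of `root`, e.g. a local clone's ipynb/."""
    root = Path(root)
    return ordered_steps(p.name for p in root.iterdir() if p.is_dir())
```

For example, calling `list_steps("ipynb")` from the top of a local clone would return the notebook directories from 01-metadata-processing/ onward, skipping any entry that lacks a numeric prefix.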

File name abbreviation conventions

Finding older data

If you're looking for data generated and used for the ISME 14 EMP presentations, see here.