<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.1d1 20130915//EN" "JATS-archivearticle1.dtd"><article article-type="research-article" dtd-version="1.1d1" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><front><journal-meta><journal-id journal-id-type="nlm-ta">elife</journal-id><journal-id journal-id-type="hwp">eLife</journal-id><journal-id journal-id-type="publisher-id">eLife</journal-id><journal-title-group><journal-title>eLife</journal-title></journal-title-group><issn publication-format="electronic">2050-084X</issn><publisher><publisher-name>eLife Sciences Publications, Ltd</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="publisher-id">01202</article-id><article-id pub-id-type="doi">10.7554/eLife.01202</article-id><article-categories><subj-group subj-group-type="display-channel"><subject>Research article</subject></subj-group><subj-group subj-group-type="heading"><subject>Human biology and medicine</subject></subj-group><subj-group subj-group-type="heading"><subject>Immunology</subject></subj-group></article-categories><title-group><article-title>Expansion of intestinal <italic>Prevotella copri</italic> correlates with enhanced susceptibility to arthritis</article-title></title-group><contrib-group><contrib contrib-type="author" equal-contrib="yes" id="author-5638"><name><surname>Scher</surname><given-names>Jose U</given-names></name><xref ref-type="aff" rid="aff1"/><xref ref-type="fn" rid="equal-contrib">†</xref><xref ref-type="other" rid="par-3"/><xref ref-type="fn" rid="con1"/><xref ref-type="fn" rid="conf1"/><xref ref-type="other" rid="dataro1"/></contrib><contrib contrib-type="author" equal-contrib="yes" id="author-5882"><name><surname>Sczesnak</surname><given-names>Andrew</given-names></name><xref ref-type="aff" rid="aff2"/><xref ref-type="aff" rid="aff3"/><xref ref-type="fn" 
rid="equal-contrib">†</xref><xref ref-type="other" rid="par-6"/><xref ref-type="fn" rid="con2"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" equal-contrib="yes" id="author-6430"><name><surname>Longman</surname><given-names>Randy S</given-names></name><xref ref-type="aff" rid="aff2"/><xref ref-type="aff" rid="aff4"/><xref ref-type="fn" rid="equal-contrib">†</xref><xref ref-type="other" rid="par-5"/><xref ref-type="fn" rid="con3"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" id="author-5582"><name><surname>Segata</surname><given-names>Nicola</given-names></name><xref ref-type="aff" rid="aff5"/><xref ref-type="aff" rid="aff6"/><xref ref-type="fn" rid="con4"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" id="author-6431"><name><surname>Ubeda</surname><given-names>Carles</given-names></name><xref ref-type="aff" rid="aff7"/><xref ref-type="aff" rid="aff8"/><xref ref-type="fn" rid="con5"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" id="author-6432"><name><surname>Bielski</surname><given-names>Craig</given-names></name><xref ref-type="aff" rid="aff6"/><xref ref-type="fn" rid="con6"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" id="author-7888"><name><surname>Rostron</surname><given-names>Tim</given-names></name><xref ref-type="aff" rid="aff9"/><xref ref-type="fn" rid="con7"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" id="author-7889"><name><surname>Cerundolo</surname><given-names>Vincenzo</given-names></name><xref ref-type="aff" rid="aff9"/><xref ref-type="fn" rid="con8"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" id="author-6018"><name><surname>Pamer</surname><given-names>Eric G</given-names></name><xref ref-type="aff" rid="aff7"/><xref ref-type="other" rid="par-1"/><xref ref-type="other" rid="par-4"/><xref ref-type="fn" rid="con9"/><xref ref-type="fn" 
rid="conf1"/></contrib><contrib contrib-type="author" id="author-6433"><name><surname>Abramson</surname><given-names>Steven B</given-names></name><xref ref-type="aff" rid="aff1"/><xref ref-type="other" rid="par-1"/><xref ref-type="fn" rid="con10"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" id="author-6434"><name><surname>Huttenhower</surname><given-names>Curtis</given-names></name><xref ref-type="aff" rid="aff6"/><xref ref-type="other" rid="par-7"/><xref ref-type="other" rid="par-8"/><xref ref-type="fn" rid="con11"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" corresp="yes" id="author-6352"><name><surname>Littman</surname><given-names>Dan R</given-names></name><xref ref-type="aff" rid="aff2"/><xref ref-type="aff" rid="aff10"/><xref ref-type="corresp" rid="cor1">*</xref><xref ref-type="other" rid="par-1"/><xref ref-type="other" rid="par-2"/><xref ref-type="fn" rid="con12"/><xref ref-type="fn" rid="conf1"/></contrib><aff id="aff1"><institution content-type="dept">Department of Medicine</institution>, <institution>New York University School of Medicine and Hospital for Joint Diseases</institution>, <addr-line><named-content content-type="city">New York</named-content></addr-line>, <country>United States</country></aff><aff id="aff2"><institution content-type="dept">Molecular Pathogenesis Program</institution>, <institution>The Kimmel Center for Biology and Medicine of the Skirball Institute, New York University School of Medicine</institution>, <addr-line><named-content content-type="city">New York</named-content></addr-line>, <country>United States</country></aff><aff id="aff3"><institution content-type="dept">Graduate Program in Bioinformatics and Computational Biology</institution>, <institution>University of California, San Francisco</institution>, <addr-line><named-content content-type="city">San Francisco</named-content></addr-line>, <country>United States</country></aff><aff id="aff4"><institution 
content-type="dept">Jill Roberts IBD Center, Department of Medicine</institution>, <institution>Weill Cornell Medical College</institution>, <addr-line><named-content content-type="city">New York</named-content></addr-line>, <country>United States</country></aff><aff id="aff5"><institution content-type="dept">Centre for Integrative Biology</institution>, <institution>University of Trento</institution>, <addr-line><named-content content-type="city">Trento</named-content></addr-line>, <country>Italy</country></aff><aff id="aff6"><institution content-type="dept">Department of Biostatistics</institution>, <institution>Harvard School of Public Health</institution>, <addr-line><named-content content-type="city">Boston</named-content></addr-line>, <country>United States</country></aff><aff id="aff7"><institution content-type="dept">Immunology Program, Infectious Diseases Service, and The Lucille Castori Center for Microbes, Inflammation, and Cancer</institution>, <institution>Memorial Sloan-Kettering Cancer Center</institution>, <addr-line><named-content content-type="city">New York</named-content></addr-line>, <country>United States</country></aff><aff id="aff8"><institution content-type="dept">Centro Superior de Investigacion en Salud Publica</institution>, <institution>University of Valencia</institution>, <addr-line><named-content content-type="city">Valencia</named-content></addr-line>, <country>Spain</country></aff><aff id="aff9"><institution content-type="dept">Department of Medicine</institution>, <institution>Weatherall Institute of Molecular Medicine, University of Oxford</institution>, <addr-line><named-content content-type="city">Oxford</named-content></addr-line>, <country>United Kingdom</country></aff><aff id="aff10"><institution>Howard Hughes Medical Institute, New York University School of Medicine</institution>, <addr-line><named-content content-type="city">New York</named-content></addr-line>, <country>United 
States</country></aff></contrib-group><contrib-group content-type="section"><contrib contrib-type="editor"><name><surname>Mathis</surname><given-names>Diane</given-names></name><role>Reviewing editor</role><aff><institution>Harvard Medical School</institution>, <country>United States</country></aff></contrib></contrib-group><author-notes><corresp id="cor1"><label>*</label>For correspondence: <email>dan.littman@med.nyu.edu</email></corresp><fn fn-type="con" id="equal-contrib"><label>†</label><p>These authors contributed equally to this work</p></fn></author-notes><pub-date date-type="pub" publication-format="electronic"><day>05</day><month>11</month><year>2013</year></pub-date><pub-date pub-type="collection"><year>2013</year></pub-date><volume>2</volume><elocation-id>e01202</elocation-id><history><date date-type="received"><day>08</day><month>07</month><year>2013</year></date><date date-type="accepted"><day>25</day><month>09</month><year>2013</year></date></history><permissions><copyright-statement>© 2013, Scher et al</copyright-statement><copyright-year>2013</copyright-year><copyright-holder>Scher et al</copyright-holder><license xlink:href="http://creativecommons.org/licenses/by/3.0/"><license-p>This article is distributed under the terms of the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/3.0/">Creative Commons Attribution License</ext-link>, which permits unrestricted use and redistribution provided that the original author and source are credited.</license-p></license></permissions><self-uri content-type="pdf" xlink:href="elife01202.pdf"/><related-article ext-link-type="doi" id="ra1" related-article-type="commentary" xlink:href="10.7554/eLife.01608"/><abstract><object-id pub-id-type="doi">10.7554/eLife.01202.001</object-id><p>Rheumatoid arthritis (RA) is a prevalent systemic autoimmune disease, caused by a combination of genetic and environmental factors. 
Animal models suggest a role for intestinal bacteria in supporting the systemic immune response required for joint inflammation. Here we performed 16S sequencing on 114 stool samples from rheumatoid arthritis patients and controls, and shotgun sequencing on a subset of 44 such samples. We identified the presence of <italic>Prevotella copri</italic> as strongly correlated with disease in new-onset untreated rheumatoid arthritis (NORA) patients. Increases in <italic>Prevotella</italic> abundance correlated with a reduction in <italic>Bacteroides</italic> and a loss of reportedly beneficial microbes in NORA subjects. We also identified unique <italic>Prevotella</italic> genes that correlated with disease. Further, colonization of mice revealed the ability of <italic>P. copri</italic> to dominate the intestinal microbiota and resulted in an increased sensitivity to chemically induced colitis. This work identifies a potential role for <italic>P. copri</italic> in the pathogenesis of RA.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.001">http://dx.doi.org/10.7554/eLife.01202.001</ext-link></p></abstract><abstract abstract-type="executive-summary"><object-id pub-id-type="doi">10.7554/eLife.01202.002</object-id><title>eLife digest</title><p>We share our bodies with a diverse set of microorganisms, known collectively as the human microbiome. Indeed, estimates suggest that our bodies contain 10 times as many microbial cells as human cells. Our stomach and intestines alone are home to many hundreds and possibly thousands of microbial species that break down indigestible compounds and help to prevent the growth of harmful bacteria. The immune system must therefore learn to tolerate these microorganisms, while retaining the ability to launch attacks against microorganisms that cause harm. 
Failure of this process may increase the risk of autoimmune diseases in which the body mistakenly attacks its own cells and tissues.</p><p>Rheumatoid arthritis is a chronic autoimmune disease marked by inflammation of the joints. Although the causes of rheumatoid arthritis are unknown, mice with mutations that increase the risk of the disease remain healthy if they are kept under sterile conditions. However, if these mice are exposed to certain species of bacteria sometimes found in the gut, they begin to show signs of joint inflammation.</p><p>Here, Scher et al. used genome sequencing to compare gut bacteria from patients with rheumatoid arthritis and healthy controls. A bacterial species called <italic>Prevotella copri</italic> was more abundant in patients suffering from untreated rheumatoid arthritis than in healthy individuals. Moreover, the presence of <italic>P. copri</italic> corresponded to a reduction in the abundance of other bacterial groups—including a number of beneficial microbes. In a mouse model of gut inflammation, animals colonized with <italic>P. copri</italic> had more severe disease than controls, consistent with a pro-inflammatory function of this organism.</p><p>Current treatments for rheumatoid arthritis target symptoms. However, by highlighting the role played by gut bacteria, the work of Scher et al. suggests that novel treatment options focused on curbing the spread of <italic>P. 
copri</italic> in the gut could delay or prevent the onset of this disease.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.002">http://dx.doi.org/10.7554/eLife.01202.002</ext-link></p></abstract><kwd-group kwd-group-type="author-keywords"><title>Author keywords</title><kwd>microbiome</kwd><kwd>inflammation</kwd><kwd>autoimmunity</kwd><kwd>metagenomics</kwd><kwd>rheumatoid</kwd><kwd>arthritis</kwd></kwd-group><kwd-group kwd-group-type="research-organism"><title>Research organism</title><kwd>Human</kwd><kwd>Mouse</kwd></kwd-group><funding-group><award-group id="par-1"><funding-source><institution-wrap><institution>National Institutes of Health</institution></institution-wrap></funding-source><award-id>1RC2AR058986</award-id><principal-award-recipient><name><surname>Pamer</surname><given-names>Eric G</given-names></name><name><surname>Abramson</surname><given-names>Steven B</given-names></name><name><surname>Littman</surname><given-names>Dan R</given-names></name></principal-award-recipient></award-group><award-group id="par-2"><funding-source><institution-wrap><institution>Howard Hughes Medical Institute</institution></institution-wrap></funding-source><principal-award-recipient><name><surname>Littman</surname><given-names>Dan R</given-names></name></principal-award-recipient></award-group><award-group id="par-3"><funding-source><institution-wrap><institution>National Institutes of Health</institution></institution-wrap></funding-source><award-id>K23AR064318</award-id><principal-award-recipient><name><surname>Scher</surname><given-names>Jose U</given-names></name></principal-award-recipient></award-group><award-group id="par-4"><funding-source><institution-wrap><institution>National Institutes of Health</institution></institution-wrap></funding-source><award-id>R01AI042135</award-id><principal-award-recipient><name><surname>Pamer</surname><given-names>Eric 
G</given-names></name></principal-award-recipient></award-group><award-group id="par-5"><funding-source><institution-wrap><institution>American Gastroenterological Association</institution></institution-wrap></funding-source><principal-award-recipient><name><surname>Longman</surname><given-names>Randy S</given-names></name></principal-award-recipient></award-group><award-group id="par-6"><funding-source><institution-wrap><institution>NSF Graduate Research Fellowship</institution></institution-wrap></funding-source><award-id>1144247</award-id><principal-award-recipient><name><surname>Sczesnak</surname><given-names>Andrew</given-names></name></principal-award-recipient></award-group><award-group id="par-7"><funding-source><institution-wrap><institution>National Institutes of Health</institution></institution-wrap></funding-source><award-id>R01HG005969</award-id><principal-award-recipient><name><surname>Huttenhower</surname><given-names>Curtis</given-names></name></principal-award-recipient></award-group><award-group id="par-8"><funding-source><institution-wrap><institution>Danone Research</institution></institution-wrap></funding-source><award-id>PLF-5972-GD</award-id><principal-award-recipient><name><surname>Huttenhower</surname><given-names>Curtis</given-names></name></principal-award-recipient></award-group><funding-statement>The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.</funding-statement></funding-group><custom-meta-group><custom-meta><meta-name>elife-xml-version</meta-name><meta-value>2</meta-value></custom-meta><custom-meta specific-use="meta-only"><meta-name>Author impact statement</meta-name><meta-value>The sequencing of microbial genomes reveals that the presence of a particular microbial species in the gut may increase the risk of the autoimmune disease rheumatoid arthritis.</meta-value></custom-meta></custom-meta-group></article-meta></front><body><sec id="s1" 
sec-type="intro"><title>Introduction</title><p>Rheumatoid arthritis (RA) is a highly prevalent systemic autoimmune disease with predilection for the joints. If left untreated, RA can lead to chronic joint deformity, disability, and increased mortality. Despite recent advances towards understanding its pathogenesis (<xref ref-type="bibr" rid="bib26">Mcinnes and Schett, 2011</xref>), the etiology of RA remains elusive. Many genetic susceptibility risk alleles have been discovered and validated (<xref ref-type="bibr" rid="bib44">Stahl et al., 2010</xref>) but are insufficient to explain disease incidence. RA is therefore a complex (multifactorial) disease requiring both environmental and genetic factors for onset (<xref ref-type="bibr" rid="bib26">Mcinnes and Schett, 2011</xref>).</p><p>Among environmental factors, the intestinal microbiota has emerged as a possible candidate responsible for the priming of aberrant systemic immunity in RA (<xref ref-type="bibr" rid="bib35">Scher and Abramson, 2011</xref>). The microbiota encompasses hundreds of bacterial species whose products represent an enormous antigenic burden that must largely be compartmentalized to prevent immune system activation (<xref ref-type="bibr" rid="bib23">Littman and Pamer, 2011</xref>). In the healthy state, intestinal lamina propria cells of both innate and adaptive immune systems cooperate to maintain physiological homeostasis. In RA, there is increased production of both self-reactive antibodies and pro-inflammatory T lymphocytes. Although mechanisms for targeting of synovium by inflammatory cells have not been fully elucidated, studies in animal models suggest that both T cell and antibody responses are involved in arthritogenesis. Moreover, an imbalance in the composition of the gut microbiota can alter local T-cell responses and modulate systemic inflammation. 
Mice rendered deficient for the microbiota (germ-free) lack pro-inflammatory Th17 cells, and colonization of the gastrointestinal tract with segmented filamentous bacteria (SFB), a commensal microbe commonly found in mammals, is sufficient to induce accumulation of Th17 cells in the lamina propria (<xref ref-type="bibr" rid="bib20">Ivanov et al., 2009</xref>; <xref ref-type="bibr" rid="bib39">Sczesnak et al., 2011</xref>).</p><p>In several animal models of arthritis, mice are persistently healthy when raised in germ-free conditions. However, the introduction of specific gut bacterial species is sufficient to induce joint inflammation (<xref ref-type="bibr" rid="bib33">Rath et al., 1996</xref>; <xref ref-type="bibr" rid="bib1">Abdollahi-Roodsaz et al., 2008</xref>; <xref ref-type="bibr" rid="bib51">Wu et al., 2010</xref>), and antibiotic treatment both prevents and abrogates a rheumatoid arthritis-like phenotype in several mouse models. Upon mono-colonization of arthritis-prone K/BxN mice with SFB, the induced Th17 cells potentiate inflammatory disease (<xref ref-type="bibr" rid="bib51">Wu et al., 2010</xref>). An imbalance in intestinal microbial ecology, in which SFB is dominant, may result in reduced proportions or functions of anti-inflammatory regulatory T cells (Treg) and a predisposition towards autoimmunity. This appears to affect not only the local immune response, but also systemic inflammatory processes, and may explain, at least in part, reduced Treg cell function in RA patients (<xref ref-type="bibr" rid="bib53">Zanin-Zhorov et al., 2010</xref>). 
Thus, T cells whose functions are dictated by intestinal commensal bacteria can be effectors of pathogenesis in tissue-specific autoimmune disease.</p><p>Although recent studies of the human microbiome (<xref ref-type="bibr" rid="bib4">Arumugam et al., 2011</xref>; <xref ref-type="bibr" rid="bib18">Human Microbiome Project Consortium, 2012</xref>) have characterized the composition and diversity of the healthy gut microbiome, and disease-associated studies revealed correlations between taxonomic abundance and some clinical phenotypes (<xref ref-type="bibr" rid="bib13">Frank et al., 2011</xref>; <xref ref-type="bibr" rid="bib27">Morgan et al., 2012</xref>; <xref ref-type="bibr" rid="bib32">Qin et al., 2012</xref>), a role for distinct microbial taxa and metagenomic markers in systemic inflammatory disease has not been defined. While treatment with antibiotics has been a therapeutic modality in RA for decades, no microbial organism has been shown to be associated with the disease. Based on the discovery that SFB-induced Th17 cells directly contribute to the onset of arthritis in gnotobiotic mice (<xref ref-type="bibr" rid="bib51">Wu et al., 2010</xref>), we analyzed the fecal microbiota in patients with RA. We used 16S ribosomal RNA gene sequencing to classify the microbiota in patients with new-onset (untreated) RA, chronic (treated) RA, psoriatic arthritis, and age- and ethnicity-matched healthy controls. We found a marked association of <italic>Prevotella copri</italic> with new-onset RA (NORA) patients and not with other patient groups. Shotgun sequencing of the microbiome indicated that some <italic>P. copri</italic> genes are differentially present in NORA-associated and healthy samples. Colonization of mice with <italic>P. copri</italic> enhanced susceptibility to chemical colitis, consistent with a pro-inflammatory potential of this organism. Taken together, our results suggest that NORA-associated <italic>P. 
copri</italic> may contribute to the pathogenesis of human arthritis.</p></sec><sec id="s2" sec-type="results"><title>Results</title><sec id="s2-1"><title>Association of <italic>Prevotella</italic> with new-onset rheumatoid arthritis</title><p>To determine if particular bacterial clades are associated with rheumatoid arthritis, we performed sequencing of the 16S gene (regions V1–V2, 454 platform) on 114 fecal DNA samples—44 samples collected from NORA patients at time of initial diagnosis and prior to immunosuppressive treatment, 26 samples from patients with chronic, treated rheumatoid arthritis (CRA), 16 samples from patients with psoriatic arthritis (PsA), and 28 samples from healthy controls (HLT) (<xref ref-type="table" rid="tbl1">Table 1</xref>). Sequences were analyzed with MOTHUR (<xref ref-type="bibr" rid="bib38">Schloss et al., 2009</xref>) to cluster operational taxonomic units (OTUs, species level classification) at a 97% identity threshold, assign taxonomic identifiers, and calculate clade relative abundances. Although PsA patients revealed a reduction in sample diversity similar to that of IBD patients (<xref ref-type="bibr" rid="bib27">Morgan et al., 2012</xref>), diversity was comparable between NORA, CRA and healthy groups at 3.02 +/− 0.66 (mean, SD) overall by Shannon Diversity Index (<xref ref-type="fig" rid="fig1s1">Figure 1—figure supplement 1A</xref>). However, when applying Simpson’s Dominance Index, the NORA group was less diverse (<xref ref-type="fig" rid="fig1s1">Figure 1—figure supplement 1B</xref>), suggesting that these patients harbored a relatively higher abundance of common taxa. Analysis at the major taxonomic hierarchy levels showed no significant differences in either phyla abundance or the ratio of Bacteroidetes/Firmicutes (<xref ref-type="fig" rid="fig1s1">Figure 1—figure supplement 1C</xref>) between all groups. 
At the level of family abundances, however, we noted a significant enrichment of Prevotellaceae in NORA subjects (<xref ref-type="fig" rid="fig1">Figure 1A</xref>, <xref ref-type="fig" rid="fig1s1">Figure 1—figure supplement 1D</xref>). Using the linear discriminant effect size method (LEfSe, see ‘Materials and methods’) (<xref ref-type="bibr" rid="bib41">Segata et al., 2011</xref>) to compare detected clades (33 families, 177 genera, 996 OTUs) among all groups, we found a positive association of two specific <italic>Prevotella</italic> OTUs with NORA and an inverse correlation with Group XIV Clostridia, Lachnospiraceae, and <italic>Bacteroides</italic> as compared to healthy controls (<xref ref-type="fig" rid="fig1">Figure 1A</xref>). Of all detected Prevotellaceae OTUs, OTU4 was the most highly represented with 171,486 supporting reads at 11.49 +/− 17.85 (mean, SD) percent of reads per sample. OTU12, the next most abundant Prevotellaceae, was supported by 12,119 reads at 2.00 +/− 5.42 (mean, SD) percent of reads per sample. Other Prevotellaceae OTUs (including <italic>Prevotella</italic> OTU934) were more scarcely represented with 1,232 +/− 2,305 (mean, SD) total supporting reads at less than 0.5% total reads per sample. We therefore reasoned that OTU4 was the dominant <italic>Prevotella</italic> in our cohort with sixfold more supporting reads than the next most abundant OTU. Principal coordinate analysis with Bray-Curtis distances demonstrated that subjects form distinct clusters, irrespective of health or disease status (<xref ref-type="fig" rid="fig1">Figure 1B</xref>). The largest component of microbial variation corresponded to the carriage (or absence) of <italic>Prevotella</italic>, which significantly differentiated NORA subjects from healthy controls and other forms of arthritis. 
Consistent with other reports of either high <italic>Prevotella</italic> or high <italic>Bacteroides</italic> relative abundance, but rarely a high relative abundance of both, (<xref ref-type="bibr" rid="bib12">Faust et al., 2012</xref>; <xref ref-type="bibr" rid="bib52">Yatsunenko et al., 2012</xref>), we found segregation of <italic>Prevotella</italic> or <italic>Bacteroides</italic> dominance in the intestinal microbiome (<xref ref-type="fig" rid="fig1">Figure 1C</xref>).<table-wrap id="tbl1" position="float"><object-id pub-id-type="doi">10.7554/eLife.01202.003</object-id><label>Table 1.</label><caption><title>Demographic and clinical data among subjects with new-onset rheumatoid arthritis (NORA), chronic, treated rheumatoid arthritis (CRA), psoriatic arthritis (PsA), and healthy controls (HLT)</title><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.003">http://dx.doi.org/10.7554/eLife.01202.003</ext-link></p></caption><table frame="hsides" rules="groups"><thead><tr><th/><th>NORA</th><th>CRA</th><th>PsA</th><th>Healthy</th></tr><tr><th/><th>(n = 44)</th><th>(n = 26)</th><th>(n = 16)</th><th>(n = 28)</th></tr></thead><tbody><tr><td>Age, years, mean (median)</td><td>42.4 (40.0)</td><td>50.0 (49.0)</td><td>46.3 (46.0)</td><td>42.8 (40.0)</td></tr><tr><td>Female, %</td><td align="char" char=".">75</td><td align="char" char=".">88</td><td align="char" char=".">56</td><td align="char" char=".">75</td></tr><tr><td>Disease duration, months, mean (median)</td><td>5.4 (2.0)</td><td>72.3 (48.0)</td><td>0.8 (0.0)</td><td>N/A</td></tr><tr><td colspan="5">Disease activity parameters</td></tr><tr><td> ESR, mm/h, mean</td><td align="char" char=".">34.6</td><td align="char" char=".">33.5</td><td align="char" char=".">19.7</td><td align="char" char=".">10.2</td></tr><tr><td> CRP, mg/l, mean</td><td align="char" char=".">20.6</td><td align="char" char=".">8.2</td><td align="char" char=".">7.6</td><td align="char" char=".">1.1</td></tr><tr><td> 
DAS28, mean (median)</td><td>5.4 (5.7)</td><td>4.7 (5.0)</td><td>4.8 (4.7)</td><td>N/A</td></tr><tr><td> Patient VAS pain, mm, mean (median)</td><td>61.4 (57.5)</td><td>51.5 (62.5)</td><td>50.6 (45.0)</td><td>N/A</td></tr><tr><td> TJC-28, mean (median)</td><td>11.2 (8.5)</td><td>7.6 (7.0)</td><td>8.8 (6.5)</td><td>N/A</td></tr><tr><td> SJC-28, mean (median)</td><td>8.3 (8.0)</td><td>4.6 (3.0)</td><td>4.8 (3.0)</td><td>N/A</td></tr><tr><td colspan="5">Autoantibody status</td></tr><tr><td> IgM-RF positive, %</td><td align="char" char=".">95</td><td align="char" char=".">81</td><td align="char" char=".">13</td><td align="char" char=".">11</td></tr><tr><td> ACPA positive, %</td><td align="char" char=".">100</td><td align="char" char=".">85</td><td align="char" char=".">6</td><td align="char" char=".">7</td></tr><tr><td> IgM-RF and/or ACPA positive, %</td><td align="char" char=".">100</td><td align="char" char=".">96</td><td align="char" char=".">13</td><td align="char" char=".">14</td></tr><tr><td> IgM-RF titer, kU/l, mean (median)</td><td>341.3 (157.0)</td><td>178.2 (89.0)</td><td>3.6 (0.0)</td><td>20.5 (0.0)</td></tr><tr><td> ACPA titer, kAU/l, mean (median)</td><td>117.6 (114.0)</td><td>90.8 (57.0)</td><td>1.6 (0.0)</td><td>9.6 (0.0)</td></tr><tr><td colspan="5">Medication use</td></tr><tr><td> Methotrexate, %</td><td align="char" char=".">0</td><td align="char" char=".">42</td><td align="char" char=".">6</td><td align="char" char=".">0</td></tr><tr><td> Prednisone, %</td><td align="char" char=".">0</td><td align="char" char=".">15</td><td align="char" char=".">6</td><td align="char" char=".">0</td></tr><tr><td> Biological agent, %</td><td align="char" char=".">0</td><td align="char" char=".">12</td><td align="char" char=".">0</td><td align="char" char=".">0</td></tr></tbody></table></table-wrap><fig-group><fig id="fig1" position="float"><object-id pub-id-type="doi">10.7554/eLife.01202.004</object-id><label>Figure 1.</label><caption><title>Differences in the relative 
abundance of <italic>Prevotella</italic> and <italic>Bacteroides</italic> in 114 subjects with and without arthritis, determined by 16S sequencing (regions V1–V2, 454 platform).</title><p>(<bold>A</bold>) LEfSe (<xref ref-type="bibr" rid="bib41">Segata et al., 2011</xref>) was used to compare the abundances of all detected clades among all groups, producing an effect size for each comparison (‘Materials and methods’). All results shown are highly significant (q<0.01) by Kruskal-Wallis test adjusted with the Benjamini-Hochberg procedure for multiple testing, except that indicated with an asterisk, which is significant at q<0.05. Negative values (left) correspond to effect sizes representative of NORA groups, while positive values (right) correspond to effect sizes in HLT subjects. <italic>Prevotella</italic> was found to be over-represented in NORA patients, while <italic>Bacteroides</italic> was over-represented in all other groups. (<bold>B</bold>) The Bray-Curtis distance between all subjects was calculated and used to generate a principal coordinates plot in MOTHUR (<xref ref-type="bibr" rid="bib38">Schloss et al., 2009</xref>). The first two components are shown. Subjects with an abundance of <italic>Prevotella</italic> greater than 10% were colored red. Other subjects were colored according to their <italic>Bacteroides</italic> abundance as shown. NORA subjects (stars) primarily cluster together according to their <italic>Prevotella</italic> abundance, and the x-axis is representative of differences in the relative abundance of <italic>Prevotella</italic> and <italic>Bacteroides</italic>. 
(<bold>C</bold>) The abundances of <italic>Prevotella</italic> (red) and <italic>Bacteroides</italic> (blue) are shown for all subjects, sorted in order of decreasing <italic>Prevotella</italic> abundance (>5%) and increasing <italic>Bacteroides</italic> abundance.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.004">http://dx.doi.org/10.7554/eLife.01202.004</ext-link></p><p><supplementary-material id="SD1-data"><object-id pub-id-type="doi">10.7554/eLife.01202.005</object-id><label>Figure 1—source data 1.</label><caption><title>Intermediate data and analysis tools for <xref ref-type="fig" rid="fig1">Figure 1</xref>.</title><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.005">http://dx.doi.org/10.7554/eLife.01202.005</ext-link></p></caption><media mime-subtype="zip" mimetype="application" xlink:href="elife01202s001.zip"/></supplementary-material><supplementary-material id="SD2-data"><object-id pub-id-type="doi">10.7554/eLife.01202.006</object-id><label>Figure 1—source data 2.</label><caption><title>Intermediate data and analysis tools for <xref ref-type="fig" rid="fig1s1">Figure 1—figure supplement 1</xref>.</title><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.006">http://dx.doi.org/10.7554/eLife.01202.006</ext-link></p></caption><media mime-subtype="zip" mimetype="application" xlink:href="elife01202s002.zip"/></supplementary-material></p></caption><graphic xlink:href="elife01202f001"/></fig><fig id="fig1s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.01202.007</object-id><label>Figure 1—figure supplement 1.</label><caption><title>Gut microbiota richness, diversity and relative abundance in NORA patients and controls.</title><p>(<bold>A</bold> and <bold>B</bold>) Gut microbiota richness and diversity are similar among RA groups and healthy controls. (<bold>C</bold>) Phyla abundance by group. 
No significant differences were found at this taxonomic level. (<bold>D</bold>) Family abundance by group. NORA subjects have a significant increase in Prevotellaceae (red) and a concomitant decrease in Bacteroidaceae (blue) by FDR-adjusted Kruskal-Wallis test (q<0.01).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.007">http://dx.doi.org/10.7554/eLife.01202.007</ext-link></p></caption><graphic xlink:href="elife01202fs001"/></fig></fig-group></p><p>To taxonomically identify <italic>Prevotella</italic> OTU4, OTU12, and OTU934, we generated a phylogenetic tree using the consensus 16S sequences of these OTUs and matched regions from known <italic>Prevotella</italic> taxa (<xref ref-type="fig" rid="fig2s1">Figure 2—figure supplement 1</xref>). The analysis revealed these OTUs to cluster tightly with <italic>Prevotella copri</italic>, a microbe isolated from human feces (<xref ref-type="bibr" rid="bib17">Hayashi et al., 2007</xref>) and sequenced as part of the HMP’s reference genome initiative. To further characterize <italic>Prevotella</italic> OTU4, the most abundant taxon, we selected four high abundance NORA samples (028B, 030B, 061B, and 089B) for shotgun sequencing (single-end, 454 platform). The resulting long reads were used to generate metagenomic assemblies (<xref ref-type="table" rid="tbl2">Table 2</xref>, see ‘Materials and methods’) which served as input to PhyloPhlAn (<xref ref-type="bibr" rid="bib40">Segata et al., 2013</xref>). Briefly, PhyloPhlAn locates 400 ubiquitous bacterial genes in a given assembly by sequence alignment in amino acid space, then builds a tree by concatenating the most discriminative positions in each gene into a single long sequence and applying FastTree (<xref ref-type="bibr" rid="bib29">Price et al., 2010</xref>), a standard tree reconstruction tool. 
This produced a phylogenomic tree placing the taxon most represented in each sample’s metagenomic contigs (i.e., <italic>Prevotella</italic> OTU4) again in close association with <italic>Prevotella copri</italic> (<xref ref-type="fig" rid="fig2">Figure 2A</xref>). We therefore chose to filter the resulting metagenomic assemblies by alignment to the <italic>P. copri</italic> reference genome to generate draft patient-derived genome assemblies (see ‘Materials and methods’). Comparison of these draft assemblies to reference <italic>P. copri</italic> and to one another revealed a high degree of similarity, with possible genome rearrangements (<xref ref-type="fig" rid="fig2">Figure 2B</xref>).<table-wrap id="tbl2" position="float"><object-id pub-id-type="doi">10.7554/eLife.01202.008</object-id><label>Table 2.</label><caption><title>Draft genome assembly statistics of four subjects with a high abundance of <italic>Prevotella</italic> OTU4</title><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.008">http://dx.doi.org/10.7554/eLife.01202.008</ext-link></p></caption><table frame="hsides" rules="groups"><thead><tr><th/><th/><th/><th/><th align="center" colspan="4">Total</th><th colspan="4"><italic>P. 
copri</italic> aligned</th></tr><tr><th>Subject ID</th><th>Group</th><th><italic>Prevotella</italic> OTU4 abundance (%)</th><th># reads</th><th># of contigs</th><th>Size (Mb)</th><th>N50 (kb)</th><th>Mean depth</th><th># of contigs</th><th>Size (Mb)</th><th>N50 (kb)</th><th>Mean depth</th></tr></thead><tbody><tr><td>028B</td><td>NORA</td><td align="char" char=".">27.7</td><td align="char" char=".">1,240,515</td><td align="char" char=".">19,988</td><td align="char" char=".">23.24</td><td align="char" char=".">1.45</td><td align="char" char=".">6.13</td><td align="char" char=".">115</td><td align="char" char=".">3.21</td><td align="char" char=".">59.84</td><td align="char" char=".">36.76</td></tr><tr><td>030B</td><td>NORA</td><td align="char" char=".">50.9</td><td align="char" char=".">1,041,546</td><td align="char" char=".">21,579</td><td align="char" char=".">17.35</td><td align="char" char=".">1.01</td><td align="char" char=".">6.97</td><td align="char" char=".">232</td><td align="char" char=".">2.60</td><td align="char" char=".">16.18</td><td align="char" char=".">44.14</td></tr><tr><td>061B</td><td>NORA</td><td align="char" char=".">66.5</td><td align="char" char=".">1,209,392</td><td align="char" char=".">9,241</td><td align="char" char=".">12.8</td><td align="char" char=".">1.58</td><td align="char" char=".">9.88</td><td align="char" char=".">74</td><td align="char" char=".">3.23</td><td align="char" char=".">79.98</td><td align="char" char=".">172.64</td></tr><tr><td>089B</td><td>NORA</td><td align="char" char=".">56.3</td><td align="char" char=".">1,395,872</td><td align="char" char=".">12,112</td><td align="char" char=".">23.47</td><td align="char" char=".">4.64</td><td align="char" char=".">23.12</td><td align="char" char=".">1,963</td><td align="char" char=".">3.96</td><td align="char" char=".">3.19</td><td align="char" char=".">30.39</td></tr><tr><td>Ref. 
genome</td><td>–</td><td>–</td><td>–</td><td>–</td><td>–</td><td>–</td><td>–</td><td align="char" char=".">83</td><td align="char" char=".">3.51</td><td align="char" char=".">131.4</td><td>–</td></tr></tbody></table></table-wrap><fig-group><fig id="fig2" position="float"><object-id pub-id-type="doi">10.7554/eLife.01202.009</object-id><label>Figure 2.</label><caption><title>Homology-based classification of patient-associated <italic>Prevotella</italic>.</title><p>Four NORA subjects with a high abundance of <italic>Prevotella</italic> OTU4 were selected for shotgun sequencing and metagenome assembly. (<bold>A</bold>) The resulting metagenomic contigs were used to generate a phylogenomic tree with PhyloPhlAn (<xref ref-type="bibr" rid="bib40">Segata et al., 2013</xref>). (<bold>B</bold>) Assemblies were filtered by alignment to the reference <italic>Prevotella copri DSM 18205</italic> genome, keeping contigs with at least one 300 bp region aligned at 97% identity or greater. The resulting draft patient-derived <italic>P. copri</italic> assemblies were aligned to one another, the reference <italic>P. copri</italic> genome, and two distinct <italic>Prevotella</italic> taxa (<italic>Prevotella buccae</italic> and <italic>Prevotella buccalis</italic>). Colored arcs represent assemblies as labeled, lines connecting arcs represent regions of >97% identity >1 kb in length, and gray lines dividing colored arcs represent boundaries between contigs. These results demonstrate that <italic>Prevotella</italic> OTU4, OTU12, and OTU934 form a clade with <italic>P. 
copri</italic> (left, red highlighted subtree) that is genetically distinct from more distant <italic>Prevotella</italic> taxa.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.009">http://dx.doi.org/10.7554/eLife.01202.009</ext-link></p><p><supplementary-material id="SD3-data"><object-id pub-id-type="doi">10.7554/eLife.01202.010</object-id><label>Figure 2—source data 1.</label><caption><title>Intermediate data and analysis tools for <xref ref-type="fig" rid="fig2">Figure 2</xref>.</title><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.010">http://dx.doi.org/10.7554/eLife.01202.010</ext-link></p></caption><media mime-subtype="zip" mimetype="application" xlink:href="elife01202s003.zip"/></supplementary-material><supplementary-material id="SD4-data"><object-id pub-id-type="doi">10.7554/eLife.01202.011</object-id><label>Figure 2—source data 2.</label><caption><title>Intermediate data and analysis tools for <xref ref-type="fig" rid="fig2s1">Figure 2—figure supplement 1</xref>.</title><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.011">http://dx.doi.org/10.7554/eLife.01202.011</ext-link></p></caption><media mime-subtype="zip" mimetype="application" xlink:href="elife01202s004.zip"/></supplementary-material></p></caption><graphic xlink:href="elife01202f002"/></fig><fig id="fig2s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.01202.012</object-id><label>Figure 2—figure supplement 1.</label><caption><title>The representative 16S sequenced reads for <italic>Prevotella</italic> OTU4, OTU12, and OTU934 were aligned with MUSCLE (<xref ref-type="bibr" rid="bib9">Edgar, 2004</xref>) and clustered with FastTree (<xref ref-type="bibr" rid="bib29">Price et al., 2010</xref>)</title><p>All three <italic>Prevotella</italic> OTUs cluster with the full-length reference 16S sequence of <italic>P. 
copri</italic>.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.012">http://dx.doi.org/10.7554/eLife.01202.012</ext-link></p></caption><graphic xlink:href="elife01202fs002"/></fig></fig-group></p><p>Overall, 75% (33/44) of NORA patients and 21.4% (6/28) of healthy controls carried <italic>P. copri</italic> in their intestinal microbiota compared to 11.5% (3/26) and 37.5% (6/16) in CRA and PsA patients, respectively, at a threshold for presence of >5% relative abundance. The prevalence of <italic>P. copri</italic> in NORA compared to CRA, PsA, and healthy controls was statistically significant by chi-squared test, but was not significant in pairwise comparisons of the latter three cohorts (<xref ref-type="table" rid="tbl3">Table 3</xref>).<table-wrap id="tbl3" position="float"><object-id pub-id-type="doi">10.7554/eLife.01202.013</object-id><label>Table 3.</label><caption><title>Statistical comparisons of <italic>Prevotella copri</italic> prevalence between cohort groups</title><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.013">http://dx.doi.org/10.7554/eLife.01202.013</ext-link></p></caption><table frame="hsides" rules="groups"><thead><tr><th>Comparison</th><th>Prevalence #1</th><th>Prevalence #2</th><th>Chi-squared p-value</th><th>Fisher’s exact p-value</th></tr></thead><tbody><tr><td><xref ref-type="table-fn" rid="tblfn1">*</xref>NORA vs HLT</td><td align="char" char="/">33/44</td><td align="char" char="/">6/28</td><td>2.612e-05</td><td>1.025e-05</td></tr><tr><td><xref ref-type="table-fn" rid="tblfn1">*</xref>NORA vs CRA</td><td align="char" char="/">33/44</td><td align="char" char="/">3/26</td><td>1.031e-06</td><td>2.551e-07</td></tr><tr><td><xref ref-type="table-fn" rid="tblfn2">†</xref>NORA vs PsA</td><td align="char" char="/">33/44</td><td align="char" char="/">6/16</td><td>0.01698</td><td>0.013</td></tr><tr><td>HLT vs CRA</td><td align="char" char="/">6/28</td><td align="char" 
char="/">3/26</td><td>0.5425</td><td>0.4704</td></tr><tr><td>HLT vs PsA</td><td align="char" char="/">6/28</td><td align="char" char="/">6/16</td><td>0.4239</td><td>0.3032</td></tr><tr><td>CRA vs PsA</td><td align="char" char="/">3/26</td><td align="char" char="/">6/16</td><td>0.1087</td><td>0.06282</td></tr></tbody></table><table-wrap-foot><fn id="tblfn1"><label>*</label><p>p<0.01.</p></fn><fn id="tblfn2"><label>†</label><p>p<0.05.</p></fn></table-wrap-foot></table-wrap></p></sec><sec id="s2-2"><title><italic>P. copri</italic> strains are variable and potentially diagnostic</title><p>Although initial shotgun sequencing of the patient-derived strains showed their similarity to <italic>P. copri</italic>, there were notable differences observed in assembled genomes upon comparison with the <italic>P. copri</italic> reference genome. This observation suggested that the presence or absence of particular genes in these strains might correlate with health or disease phenotypes in this cohort. To address this question, we performed shotgun sequencing on fecal DNA from NORA and healthy subjects, and chose to compare <italic>Prevotella</italic> sequences from 18 NORA <italic>Prevotella-</italic>positive subjects, which allowed for a depth of at least 7 M <italic>Prevotella-</italic>aligned reads (paired-end, 100 nt, Illumina platform), to those of <italic>P. copri</italic> from 17 healthy subjects (including 15 from the HMP database and 2 HLT from our cohort) (<xref ref-type="supplementary-material" rid="SD10-data">Supplementary file 1A</xref>). Samples sequenced to a depth of less than 7 M such reads were excluded (<xref ref-type="fig" rid="fig3s1">Figure 3—figure supplement 1C</xref>), having insufficient depth for complete recovery of <italic>P. copri</italic> ORFs (see ‘Materials and methods’).</p><p>First, we examined the coverage of the <italic>P. 
copri</italic> reference genome by all subjects, as an indicator of inter-individual strain variability (<xref ref-type="bibr" rid="bib18">Human Microbiome Project Consortium, 2012</xref>). Overall, coverage was similar between healthy and NORA subjects in all but a few regions (<xref ref-type="fig" rid="fig3">Figure 3A</xref>, blue and red horizontal lines). Eight regions were poorly covered in all subjects with mean coverage below the 25<sup>th</sup> percentile of 0.79 FPKM, while several regions showed substantial variability between individuals (<xref ref-type="fig" rid="fig3">Figure 3A</xref>, gray vertical lines). To determine if the presence or absence of these regions within individuals was consistent between samplings, we applied MetaPhlAn (<xref ref-type="bibr" rid="bib42">Segata et al., 2012</xref>) to <italic>Prevotella</italic>-positive HMP samples collected over multiple visits (<xref ref-type="fig" rid="fig3">Figure 3B</xref>). Briefly, MetaPhlAn determines the presence or absence of metagenomic marker genes that are specific to particular bacterial clades by analyzing the coverage of such genes by sequenced reads. Genes are called specific for a bacterial clade if they are not found in any reference genomes outside the clade, but are found in all such genomes within the clade. In concordance with a previous report (<xref ref-type="bibr" rid="bib37">Schloissnig et al., 2013</xref>) documenting the temporal stability of metagenomic SNP patterns in individuals, we found that carriage of <italic>P. copri</italic> genes within an individual varied little between samplings. In addition to a stable set of <italic>P. copri</italic> core marker genes common to all samples, a subset of variable marker genes was observed to co-occur in islands across the <italic>P. copri</italic> genome, suggesting genomic rearrangements as a mechanism of variability (<xref ref-type="fig" rid="fig3">Figure 3A</xref>, blue boxes below plot). 
Together, these results suggest that <italic>P. copri</italic> strains vary between individuals and retain their individuality over time.<fig-group><fig id="fig3" position="float"><object-id pub-id-type="doi">10.7554/eLife.01202.014</object-id><label>Figure 3.</label><caption><title>Comparison of <italic>P. copri</italic> genomes from healthy and NORA subjects.</title><p>(<bold>A</bold>) Comparative coverage of the draft <italic>P. copri DSM 18205</italic> genome between individuals and within healthy and NORA groups. Gray points are median fragments per kilobase per million (FPKM) for 1-kb windows, gray lines within the plot are the interquartile range for each window, and red and blue lines are the LOWESS-smoothed averages for NORA and healthy groups, respectively. Gray lines on the horizontal axis represent boundaries between assembled contigs. Regions are variably covered between subjects and groups, with several genomic islands either lacking overall or especially variable (dark blue lines below the plot). (<bold>B</bold>) The presence (blue) or absence (gray) of previously reported <italic>P. copri</italic>-unique marker genes (<xref ref-type="bibr" rid="bib42">Segata et al., 2012</xref>) in 11 stool samples from five subjects of the Human Microbiome Project (HMP) is shown as a heatmap. We report, in columns, only those <italic>P. copri</italic>-specific markers showing variable presence/absence patterns across the considered HMP samples. Each row represents a different sample collection date, groups of rows represent subjects, and groups of columns correspond to different variably covered genomic islands. Strains of <italic>P. copri</italic> are defined by the presence and absence of particular genes, which remain stable for at least 6 months in these individuals. All inter- and intra-individual comparisons between rows are highly statistically significant (p<<0.001, ‘Materials and methods’). (<bold>C</bold>) The <italic>P. 
copri</italic> pangenome was identified by finding <italic>P. copri</italic> ORFs in all HMP and NORA cohort subjects, and the presence or absence of these ORFs was calculated for each subject (‘Materials and methods’, <xref ref-type="fig" rid="fig3s1">Figure 3—figure supplement 1</xref>). Several ORFs are statistically significant biomarkers between healthy and NORA status (q<0.25) (<xref ref-type="supplementary-material" rid="SD10-data">Supplementary file 1B</xref>, ‘Materials and methods’).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.014">http://dx.doi.org/10.7554/eLife.01202.014</ext-link></p><p><supplementary-material id="SD5-data"><object-id pub-id-type="doi">10.7554/eLife.01202.015</object-id><label>Figure 3—source data 1.</label><caption><title>Intermediate data and analysis tools for <xref ref-type="fig" rid="fig3">Figure 3</xref>.</title><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.015">http://dx.doi.org/10.7554/eLife.01202.015</ext-link></p></caption><media mime-subtype="zip" mimetype="application" xlink:href="elife01202s005.zip"/></supplementary-material><supplementary-material id="SD6-data"><object-id pub-id-type="doi">10.7554/eLife.01202.016</object-id><label>Figure 3—source data 2.</label><caption><title>Intermediate data and analysis tools for <xref ref-type="fig" rid="fig3s1">Figure 3—figure supplement 1</xref>.</title><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.016">http://dx.doi.org/10.7554/eLife.01202.016</ext-link></p></caption><media mime-subtype="zip" mimetype="application" xlink:href="elife01202s006.zip"/></supplementary-material></p></caption><graphic xlink:href="elife01202f003"/></fig><fig id="fig3s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.01202.017</object-id><label>Figure 3—figure supplement 1.</label><caption><title>Recovery of <italic>P. 
copri</italic> pangenome from HMP/RA shotgun reads and determination of presence/absence of <italic>P. copri</italic> ORFs by alignment of reads to pangenome gene catalog.</title><p>(<bold>A</bold>) Genes were called present in a sample if they were covered by aligned reads at an identity threshold of >97% over >97% of their length (red lines). (<bold>B</bold>) ORFs were called on contigs using MetaGeneMark (<xref ref-type="bibr" rid="bib54">Zhu et al., 2010</xref>) and were dereplicated with UCLUST (<xref ref-type="bibr" rid="bib10">Edgar, 2010</xref>) at an identity threshold of 97% (red line). (<bold>C</bold>) Recovery of a sample's <italic>P. copri</italic> pangenome saturated at approximately 7 million reads (red line). We therefore excluded samples with less than 7 million <italic>P. copri</italic> reads, defined as <italic>P. copri</italic> abundance determined by MetaPhlAn (<xref ref-type="bibr" rid="bib42">Segata et al., 2012</xref>) multiplied by the total number of quality-filtered reads. Samples with <italic>P. copri</italic> abundance likely misestimated (i.e., those with <3000 ORFs present) were also excluded (<xref ref-type="supplementary-material" rid="SD10-data">Supplementary file 1A</xref>). (<bold>D</bold>) Contigs were said to have originated from <italic>P. copri</italic> if they had at least one hit >97% identity over >300 bp (red lines).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.017">http://dx.doi.org/10.7554/eLife.01202.017</ext-link></p></caption><graphic xlink:href="elife01202fs003"/></fig><fig id="fig3s2" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.01202.018</object-id><label>Figure 3—figure supplement 2.</label><caption><title>Metagenomic context of discriminative biomarker ORFs.</title><p>ORFs found in the <italic>P. 
copri</italic> DSM 18205 reference genome are colored red, while those identified as differentially present in healthy and NORA groups are indicated with red asterisks. (<bold>A</bold>) Two ORFs, 3690 and 3694, are healthy-specific, occur on the same contig, and encode different components of the same NADH:quinone oxidoreductase. (<bold>B</bold>) Similarly, ORFs 62,568 and 62,569 occur on the same contig, are NORA-specific, and encode components of the same iron ABC transporter.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.018">http://dx.doi.org/10.7554/eLife.01202.018</ext-link></p></caption><graphic xlink:href="elife01202fs004"/></fig></fig-group></p><p>Next, we assembled a catalog of <italic>P. copri</italic> genes present across many individuals (i.e., the <italic>P. copri</italic> pangenome), by performing de novo metagenome assembly and gene calling on a per-sample basis (see ‘Materials and methods’). To determine if any ORFs were differentially present in NORA subjects as compared to healthy controls, we first reduced the set of interrogated ORFs by filtering partially assembled (i.e., containing gaps, lacking stop codons), short (i.e., less than 300 bp), and low-coverage (i.e., present in fewer than five subjects) ORFs to yield a final set of 3,291 high-confidence <italic>P. copri</italic> ORFs (<xref ref-type="fig" rid="fig3s1">Figure 3—figure supplement 1</xref>). We found two ORFs differentially present in healthy controls, and 17 ORFs differentially present in NORA (<xref ref-type="fig" rid="fig3">Figure 3C</xref>; <xref ref-type="supplementary-material" rid="SD10-data">Supplementary file 1B</xref>). The two healthy-specific ORFs appear on the same metagenomic contig, encoding a nearly-complete <italic>nuo</italic> operon for NADH:ubiquinone oxidoreductase (<xref ref-type="fig" rid="fig3s2">Figure 3—figure supplement 2A</xref>), adjacent to a <italic>Bacteroides</italic> conjugative transposon. 
Similarly, two of the NORA-specific ORFs appear together on another metagenomic contig, encoding an ATP-binding cassette iron transporter (<xref ref-type="fig" rid="fig3s2">Figure 3—figure supplement 2B</xref>). These ORFs may represent good biomarkers for discrimination between healthy and disease-associated microbiota in the population at risk for RA.</p></sec><sec id="s2-3"><title>Functional potential of the NORA metagenome</title><p>To determine if the NORA metagenome encodes unique functions compared with that of healthy subjects, we applied HUMAnN (<xref ref-type="bibr" rid="bib2">Abubucker et al., 2012</xref>) to quantitate the coverage and abundances of KEGG (<xref ref-type="bibr" rid="bib21">Kanehisa and Goto, 2000</xref>) modules (small sets of genes in well-defined metabolic pathways) in healthy controls (n = 5) and a representative set of NORA subjects (n = 14) with and without <italic>Prevotella</italic>. We then applied LEfSe (<xref ref-type="bibr" rid="bib41">Segata et al., 2011</xref>) to find statistically significant differences between groups. This analysis revealed a low abundance of vitamin metabolism (i.e., biotin, pyridoxal, and folate) and pentose phosphate pathway modules in NORA, consistent with a lack of these functions in <italic>Prevotella</italic> genomes (<xref ref-type="fig" rid="fig4">Figure 4</xref>). 
At the coverage level (presence or absence), the NORA metagenome is defined by an absence of functions present in <italic>Bacteroides</italic> and Clostridia, clades typically found in low abundance in <italic>Prevotella</italic>-high NORA subjects.<fig id="fig4" position="float"><object-id pub-id-type="doi">10.7554/eLife.01202.019</object-id><label>Figure 4.</label><caption><title>Metabolic pathway representation in the microbiome of healthy and NORA subjects.</title><p>HUMAnN (<xref ref-type="bibr" rid="bib2">Abubucker et al., 2012</xref>) was applied to metagenomic reads (paired-end, 100 nt, Illumina platform) from NORA subjects (n = 14) and healthy controls (n = 5) to quantitate the abundances of hierarchically related KEGG modules in these samples (‘Materials and methods’ and <xref ref-type="supplementary-material" rid="SD10-data">Supplementary file 1A</xref>). LEfSe (<xref ref-type="bibr" rid="bib41">Segata et al., 2011</xref>) was used to find statistically significant differences between groups at an alpha cutoff of 0.001 and an effect size cutoff of 2.0. Results shown here are highly significant (p<0.001) and represent large differences between groups. Modules highlighted in red are over-abundant in NORA samples while modules highlighted in blue are over-abundant in healthy samples. 
<italic>Prevotella</italic>-dominated NORA metagenomes have a dearth of genes encoding vitamin and purine metabolizing enzymes, and an excess of cysteine metabolizing enzymes.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.019">http://dx.doi.org/10.7554/eLife.01202.019</ext-link></p><p><supplementary-material id="SD7-data"><object-id pub-id-type="doi">10.7554/eLife.01202.020</object-id><label>Figure 4—source data 1.</label><caption><title>Intermediate data and analysis tools for <xref ref-type="fig" rid="fig4">Figure 4</xref>.</title><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.020">http://dx.doi.org/10.7554/eLife.01202.020</ext-link></p></caption><media mime-subtype="zip" mimetype="application" xlink:href="elife01202s007.zip"/></supplementary-material></p></caption><graphic xlink:href="elife01202f004"/></fig></p><p><italic>Prevotella</italic> and <italic>Bacteroides</italic> are closely related both functionally and phylogenetically, yet, surprisingly, are rarely found together in high relative abundance despite their ability to dominate the gut microbiome individually (<xref ref-type="bibr" rid="bib12">Faust et al., 2012</xref>). We hypothesized that there might be a genetic difference in these two clades that could account for their apparent co-exclusionary relationship. We therefore sought to find genes differentially present in <italic>P. copri</italic> but not in any of the most abundant <italic>Bacteroides</italic> species. This revealed K05919 (superoxide reductase), K00390 (phosphoadenosine phosphosulfate reductase), and several transporters as uniquely present in <italic>P. copri</italic> (<xref ref-type="supplementary-material" rid="SD10-data">Supplementary file 1C</xref>), and also a set of genes absent in <italic>P. 
copri</italic> but present in <italic>Bacteroides</italic> (<xref ref-type="supplementary-material" rid="SD10-data">Supplementary file 1D</xref>).</p></sec><sec id="s2-4"><title>Relative abundance of <italic>P. copri</italic> in NORA inversely correlates with presence of shared-epitope risk alleles</title><p>Certain alleles within the human leukocyte-antigen (HLA) Class II locus confer higher risk of disease, in particular those belonging to DRB1 (i.e., ‘shared epitope’ alleles or SE) (<xref ref-type="bibr" rid="bib8">du Montcel et al., 2005</xref>; <xref ref-type="bibr" rid="bib15">Gregersen et al., 1987</xref>). To determine whether a higher abundance of <italic>P. copri</italic> is associated with the host genotype, we carried out HLA sequencing on DNA from all participants in our study (<xref ref-type="supplementary-material" rid="SD10-data">Supplementary file 1E</xref>). Consistent with recently published mouse data (<xref ref-type="bibr" rid="bib14">Gomez et al., 2012</xref>), the presence of SE alleles correlated with the composition of the gut microbiota. A subgroup analysis of NORA patients and healthy controls according to presence (or absence) of SE alleles revealed a significantly higher relative abundance of <italic>P. copri</italic> in those subjects lacking predisposing genes (<xref ref-type="fig" rid="fig5">Figure 5</xref>, p<0.001 in NORA, p<0.05 in HLT, ‘Materials and methods’).<fig id="fig5" position="float"><object-id pub-id-type="doi">10.7554/eLife.01202.021</object-id><label>Figure 5.</label><caption><title>Relationship of host HLA genotype to abundance of <italic>P. copri</italic> (OTU4, OTU12, and OTU934 combined relative abundance).</title><p>The HLA-class II genotype of all subjects was determined by sequence-based typing methodology (‘Materials and methods’). Groups were subdivided by the presence or absence of shared-epitope RA risk alleles (+/− SE as indicated above) and correlated with relative abundance of intestinal <italic>P. 
copri</italic>. A statistically significant correlation is seen between <italic>P. copri</italic> abundance and the genetic risk for rheumatoid arthritis in NORA (red stars) and healthy (blue circles) subjects by Welch’s two-tailed <italic>t</italic> test.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.021">http://dx.doi.org/10.7554/eLife.01202.021</ext-link></p><p><supplementary-material id="SD8-data"><object-id pub-id-type="doi">10.7554/eLife.01202.022</object-id><label>Figure 5—source data 1.</label><caption><title>Intermediate data and analysis tools for <xref ref-type="fig" rid="fig5">Figure 5</xref>.</title><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.022">http://dx.doi.org/10.7554/eLife.01202.022</ext-link></p></caption><media mime-subtype="zip" mimetype="application" xlink:href="elife01202s008.zip"/></supplementary-material></p></caption><graphic xlink:href="elife01202f005"/></fig></p></sec><sec id="s2-5"><title><italic>P. copri</italic> exacerbates colitis in mice</title><p>To determine if the <italic>Prevotella</italic>-associated metagenome is sufficient to predispose to increased inflammatory responses, antibiotic-treated C57BL/6 mice were colonized with <italic>P. copri</italic> by oral gavage. Analysis of DNA extracted from fecal samples 2 weeks post-gavage revealed robust colonization with <italic>P. copri</italic> (<xref ref-type="fig" rid="fig6">Figure 6A</xref>). Sequencing of the 16S gene (regions V1–V2, 454 platform) in fecal DNA from two representative mice colonized with <italic>P. copri</italic> revealed the ability of <italic>Prevotella</italic> to dominate the gut microbiota (<xref ref-type="fig" rid="fig6">Figure 6B</xref>). In comparison to fecal DNA from mice gavaged with media alone, <italic>P. 
copri</italic>-colonized mice had reduced Bacteroidales and Lachnospiraceae, similar to what was observed in this patient cohort (<xref ref-type="fig" rid="fig1">Figure 1A</xref>, <xref ref-type="fig" rid="fig1s1">Figure 1—figure supplement 1D</xref>). Consistent with a previous report of a <italic>Prevotella</italic> taxon exacerbating an inflammatory phenotype (<xref ref-type="bibr" rid="bib11">Elinav et al., 2011</xref>), exposure of <italic>P. copri</italic>-colonized mice to 2% dextran sulfate sodium (DSS) in drinking water for 7 days resulted in more severe colitis as assessed by enhanced weight loss (<xref ref-type="fig" rid="fig6">Figure 6C</xref>), worse endoscopic score (<xref ref-type="fig" rid="fig6">Figure 6D</xref>), and increased epithelial damage on histological analysis (<xref ref-type="fig" rid="fig6">Figure 6E,F</xref>) when compared to littermate controls gavaged with media alone. Furthermore, compared with mice colonized with the mouse commensal <italic>Bacteroides thetaiotaomicron</italic> (<xref ref-type="fig" rid="fig6s1">Figure 6—figure supplement 1A</xref>), <italic>P. copri</italic>-colonized mice showed significantly increased weight loss at day 7 following DSS exposure (<xref ref-type="fig" rid="fig6s1">Figure 6—figure supplement 1B</xref>). Analysis of the lamina propria CD4<sup>+</sup> T-cell response revealed an increase in IFNγ production following DSS induction, although no statistically significant differences were seen in IFNγ (Th1) or IL-17 production (Th17) following <italic>P. copri</italic> colonization (<xref ref-type="fig" rid="fig6s1">Figure 6—figure supplement 1C</xref>). Likewise, no differences in Foxp3<sup>+</sup> CD4<sup>+</sup> T-cells were observed. 
These data suggest that a <italic>Prevotella</italic>-defined microbiome may have the propensity to support inflammation in the context of a genetically susceptible host.<fig-group><fig id="fig6" position="float"><object-id pub-id-type="doi">10.7554/eLife.01202.023</object-id><label>Figure 6.</label><caption><title>Colonization with <italic>P. copri</italic> dominates the colonic microbiome and exacerbates local inflammatory responses.</title><p>(<bold>A</bold>) DNA was extracted from fecal pellets of media-gavaged mice and <italic>P. copri</italic>-gavaged mice 2 weeks after colonization and assayed by QPCR with <italic>P. copri</italic> specific primers compared to universal 16S. (<bold>B</bold>) Relative abundance of bacterial families in fecal DNA from media-gavaged and <italic>P. copri</italic>-colonized mice (shown in duplicate) by high-throughput 16S sequencing (regions V1–V2, 454 platform). (<bold>C</bold>) C57BL/6 mice colonized with <italic>P. copri</italic> (n = 15) or media alone (n = 13) controls were exposed to DSS for seven days and percent of starting body weight is shown. Composite data from three representative experiments are shown. (<bold>D</bold>) Representative colonoscopic images of mice colonized with <italic>P. copri</italic> or media gavage following DSS-induced colitis. Endoscopic colitis score for five individual animals is displayed. (<bold>E</bold> and <bold>F</bold>) Gross pathology (<bold>E</bold>) and histology (<bold>F</bold>) of colons from mice colonized with <italic>P. 
copri</italic> or media gavage following DSS-induced colitis.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.023">http://dx.doi.org/10.7554/eLife.01202.023</ext-link></p><p><supplementary-material id="SD9-data"><object-id pub-id-type="doi">10.7554/eLife.01202.024</object-id><label>Figure 6—source data 1.</label><caption><title>Intermediate data and analysis tools for <xref ref-type="fig" rid="fig6">Figure 6</xref>.</title><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.024">http://dx.doi.org/10.7554/eLife.01202.024</ext-link></p></caption><media mime-subtype="zip" mimetype="application" xlink:href="elife01202s009.zip"/></supplementary-material></p></caption><graphic xlink:href="elife01202f006"/></fig><fig id="fig6s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.01202.025</object-id><label>Figure 6—figure supplement 1.</label><caption><title><italic>P. copri</italic> colonization exacerbates chemically induced colitis.</title><p>(<bold>A</bold>) DNA was extracted from fecal pellets of mice gavaged with media, <italic>P. copri</italic>, or <italic>B. thetaiotaomicron</italic> 2 weeks after colonization and assayed by QPCR with <italic>P. copri</italic> or <italic>Bacteroides</italic> specific primers compared to universal 16S amplicon. (<bold>B</bold>) C57BL/6 mice colonized with <italic>P. copri</italic> (n = 10) or <italic>B. theta</italic> (n = 10) were exposed to DSS for seven days and percent of starting body weight is shown. 
(<bold>C</bold>) Percent of total CD4<sup>+</sup> T-cells in the colonic lamina propria expressing IL-17 (Th17) or IFNγ (Th1) following PMA/ionomycin stimulation or expressing Foxp3 (Treg).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.025">http://dx.doi.org/10.7554/eLife.01202.025</ext-link></p></caption><graphic xlink:href="elife01202fs005"/></fig></fig-group></p></sec></sec><sec id="s3" sec-type="discussion"><title>Discussion</title><p>Multiple lines of investigation have revealed that RA is a multifactorial disease that occurs in sequential phases. Notably, there is a prolonged period of autoimmunity (i.e., presence of circulating auto-antibodies such as rheumatoid factor and anti-citrullinated peptide antibodies) in a pre-clinical state that lasts many years, during which time there is no clinical or histologic evidence of inflammatory arthritis (<xref ref-type="bibr" rid="bib6">Deane et al., 2010</xref>). Before the onset of clinical disease, there is an increase in autoantibody titers and epitope spreading coupled with elevation in circulating pro-inflammatory cytokines. These findings have led to the ‘second-event’ hypothesis in RA, which proposes that an environmental factor triggers systemic joint inflammation in the context of pre-existent autoimmunity. Multiple mucosal sites and their residing microbial communities have been implicated, including the airways, the periodontal tissue and the intestinal lamina propria (<xref ref-type="bibr" rid="bib26">Mcinnes and Schett, 2011</xref>; <xref ref-type="bibr" rid="bib36">Scher et al., 2012</xref>).</p><p>Although a role for the gut microbiota has been clearly established in animal models of arthritis, it is not known if dysbiosis influences human RA. The human gut microbiota has been classified into unique enterotypes, one of which is defined by the predominance of <italic>Prevotella</italic> (<xref ref-type="bibr" rid="bib4">Arumugam et al., 2011</xref>). 
In our cohort, we found the microbiota of many subjects to be defined by a single taxon, <italic>P. copri</italic>, which was associated with the majority of untreated, new-onset rheumatoid arthritis (NORA) patients. <italic>P. copri</italic> was also detected in a minority of healthy subjects in cohorts from the Human Microbiome Project (<xref ref-type="bibr" rid="bib18">Human Microbiome Project Consortium, 2012</xref>), the European MetaHIT project (<xref ref-type="bibr" rid="bib31">Qin et al., 2010</xref>), and our study. Surprisingly, the prevalence of <italic>P. copri</italic> in chronic rheumatoid arthritis (CRA) patients, all of whom had been treated and exhibited reduced disease activity, was similar to that observed in the healthy subjects. One hypothesis is that the <italic>Prevotella-</italic>defined microbiota fail to thrive when there is less inflammation, perhaps due to a lack of inflammation-derived terminal electron acceptors, as seen for <italic>E. coli</italic> in inflammatory bowel disease (<xref ref-type="bibr" rid="bib49">Winter et al., 2013</xref>). Alternatively, the gut microbiota changes observed in newly diagnosed RA patients may be the consequence of a unique, NORA-specific systemic inflammatory response. While DAS28 scores were slightly lower in CRA and PsA patients (<xref ref-type="table" rid="tbl1">Table 1</xref>), the most remarkable difference was in levels of C-reactive protein (CRP). This raises the question of whether CRP itself may have microbial modulating properties. CRP is characteristically high in early and flaring RA, but not in other autoimmune diseases (e.g., systemic lupus erythematosus, scleroderma, and PsA). A member of the pentraxin protein family, CRP was first identified in the plasma of patients with <italic>Streptococcus pneumoniae</italic> infection (<xref ref-type="bibr" rid="bib46">Tillett and Francis, 1930</xref>). 
Further, the primary bacterial ligand for CRP is phosphocholine, a constituent of multiple bacterial cell-wall structures, including lipopolysaccharides (LPS). CRP binding to bacterial phosphocholine activates the complement system and enhances phagocytosis by macrophages. Whether or not CRP itself represents a specific response to the presence of <italic>P. copri</italic> in NORA is an area of future investigation. Interestingly, <italic>Prevotella</italic>-dominated healthy omnivore individuals were recently reported to have increased basal levels of serum TMAO (trimethylamine N-oxide), a product of inflammation linked to atherogenesis, compared to <italic>Bacteroides</italic>-dominated healthy individuals (<xref ref-type="bibr" rid="bib22">Koeth et al., 2013</xref>). While TMAO could be derived from increased consumption of meat (<xref ref-type="bibr" rid="bib22">Koeth et al., 2013</xref>), <italic>Prevotella</italic> has been previously associated with a dearth of meat in the diet (<xref ref-type="bibr" rid="bib50">Wu et al., 2011</xref>). Additional studies are needed to determine if prevalence of <italic>P. copri</italic> in the microbiota is associated with changes in specific metabolites.</p><p>Sequence alignment most closely linked NORA-associated <italic>Prevotella</italic> with the <italic>P. copri</italic> genome. Interestingly, large regions of the <italic>P. copri</italic> genome were scarcely covered in both our cohort and subjects of the HMP. As the reference strain of <italic>P. copri</italic> was isolated in Japan and all samples analyzed in our study were collected and sequenced in North America, these differences may reflect geographically associated strain variability, consistent with a report ranking <italic>P. copri</italic> as the second-most variable member of the human gut microbiota between continents (<xref ref-type="bibr" rid="bib37">Schloissnig et al., 2013</xref>). 
Notably, comparison of sequences in NORA samples with those of <italic>P. copri</italic>-dominated healthy individuals evaluated in the HMP allowed us to identify ORFs associated with the NORA phenotype. Two ORFs, both encoding components of an iron transporter, were specific for NORA-associated <italic>P. copri</italic>, while two ORFs were specific for HLT-associated <italic>P. copri</italic> and encode components of a <italic>nuo</italic> operon. Iron transporters are known to be virulence factors in other bacterial clades, while the ubiquinone oxidoreductase pathway encoded by the <italic>nuo</italic> operon may provide a fitness advantage in the context of a healthy microbiome by allowing use of metabolites available therein. While colonization with <italic>P. copri</italic> increases the pre-test probability of NORA from 1% to approximately 3.95% in western cohorts (by Bayes’ theorem, see ‘Materials and methods’), the presence of one of the aforementioned ORFs may markedly increase the pre-test probability of NORA status. The diagnostic application of these biomarkers needs to be confirmed in larger cohorts.</p><p>Analysis of enzymatic functions in the <italic>Prevotella</italic>-dominated metagenome reveals a significant decrease in purine metabolic pathways, including tetrahydrofolate (THF) biosynthesis. This may have therapeutic implications since methotrexate (MTX), a folate analogue and a dihydrofolate (DHF) reductase inhibitor, remains the anchor drug for the treatment of RA (<xref ref-type="bibr" rid="bib43">Singh et al., 2012</xref>) and has inter-individual variability in terms of absorption and bioavailability. The THF biosynthetic pathway encoded by the gut metagenome, which includes a DHF reductase enzyme, may compete with host DHF reductase for MTX binding and metabolism. 
If so, an increase in DHF reductase-high microbiota in some RA subjects (i.e., <italic>Bacteroides</italic> overabundant) may help explain, at least partially, why only about half of RA patients respond adequately to oral MTX, ultimately requiring either parenteral administration or the addition of complementary immunosuppressants. <italic>Prevotella</italic>-high NORA subjects, with a dearth of DHF reductase in the gut, may respond better to oral MTX. Prospective human studies should help to clarify these observations.</p><p>RA is a multifactorial autoimmune disease in which certain alleles within the major histocompatibility complex (MHC) class II locus, specifically those belonging to DRB1 (i.e., shared epitope alleles), confer higher risk for disease. A recently published study with HLA-DR transgenic mice revealed that the gut microbiota was, at least partially, regulated by the HLA genes (<xref ref-type="bibr" rid="bib14">Gomez et al., 2012</xref>). Arthritis-susceptible DRB1*04:01 transgenic mice had a markedly different intestinal microbiota when compared to arthritis-resistant DRB1*04:02 animals, and this was associated with altered mucosal immune function (i.e., increased gene transcripts for Th17-related cytokines) and increased intestinal permeability. Our results suggest that, similarly, SE risk-alleles in humans may have an impact on the composition of the gut microbiota. Intriguingly, patients in the NORA cohort showed a significant inverse correlation between <italic>P. copri</italic> relative abundance and presence of SE alleles (<xref ref-type="fig" rid="fig5">Figure 5</xref>). It is therefore possible that, as in mice, certain human gut microbial communities are determined by specific MHC alleles that favor the expansion of particular species. As in the case of cigarette smoking, this could also represent a gene-environment interaction that contributes to RA pathogenesis. It is conceivable that a certain threshold for <italic>P. 
copri</italic> abundance may be necessary to overcome the lack of genetic predisposition in RA subjects, while a lower abundance may be sufficient to trigger disease in those carrying risk-alleles. Validation in expanded cohorts and mechanistic studies are needed to better understand the significance of these findings.</p><p>Colonization of mice with <italic>P. copri</italic> recapitulated the differences in relative abundances of <italic>Prevotella</italic> and <italic>Bacteroides</italic> previously reported in humans, and confirmed the ability of <italic>P. copri</italic> to dominate the colonic commensal microbiota in the absence of apparent disease (<xref ref-type="bibr" rid="bib12">Faust et al., 2012</xref>). This shift in abundances correlated with a metagenomic shift, which may support and/or perpetuate an inflammatory environment. For example, uniquely present superoxide reductase in <italic>P. copri</italic> may facilitate resistance to or allow the use of host-derived reactive oxygen species (ROS) generated during inflammation, perhaps as terminal electron acceptors for respiration (<xref ref-type="bibr" rid="bib49">Winter et al., 2013</xref>). Similarly, the <italic>P. copri</italic> genome encodes phosphoadenosine phosphosulfate reductase (PAPS), an oxidoreductase absent in <italic>Bacteroides</italic> that participates in sulfur metabolism and leads to the production of thioredoxin. Intriguingly, thioredoxin has been widely implicated in the pathogenesis of RA and high levels of this redox protein have been found in both serum and synovial fluid of RA patients (<xref ref-type="bibr" rid="bib25">Maurice et al., 1999</xref>).</p><p>Mice colonized with <italic>P. copri</italic> displayed increased inflammation in DSS-induced colitis. An appealing hypothesis from an evolutionary and ecological perspective is that the <italic>P. copri</italic>-defined microbiota thrives in a pro-inflammatory environment and may exacerbate inflammation for its own benefit. 
Another key feature of the <italic>P. copri-</italic>dominated microbiome is a community shift away from <italic>Bacteroides</italic>, Group XIV Clostridia, <italic>Blautia</italic>, and <italic>Lachnospiraceae</italic> clades, previously reported to be associated with an anti-inflammatory state and regulatory T-cell (Treg) production (<xref ref-type="bibr" rid="bib5">Atarashi et al., 2011</xref>; <xref ref-type="bibr" rid="bib34">Round et al., 2011</xref>). This could account, in part, for the observed differences in susceptibility to inflammation (<xref ref-type="bibr" rid="bib45">Tao et al., 2011</xref>). Further characterization of changes in the host immune system associated with a <italic>Prevotella</italic>-dominated microbiota should provide deeper insight into whether expansion of <italic>P. copri</italic> contributes causally to the development of autoimmunity in early onset RA.</p></sec><sec id="s4" sec-type="materials|methods"><title>Materials and methods</title><sec id="s4-1"><title>Study participants</title><p>Consecutive patients from the New York University rheumatology clinics and offices were screened for the presence of RA. After informed consent was signed, each patient’s medical history (according to chart review and interview/questionnaire), diet, and medications were determined. A screening musculoskeletal examination and laboratory assessments were also performed or reviewed. 
All RA patients who met the study criteria were offered enrollment.</p></sec><sec id="s4-2"><title>Inclusion and exclusion criteria</title><p>The criteria for inclusion in the study required that patients meet the American College of Rheumatology/European League Against Rheumatism 2010 classification criteria for RA (<xref ref-type="bibr" rid="bib3">Aletaha et al., 2010</xref>), including seropositivity for rheumatoid factor (RF) and/or anti–citrullinated protein antibodies (ACPAs) (assessed using an anti–cyclic citrullinated peptide ELISA; Euroimmun), and that all subjects be age 18 years or older. New-onset RA was defined as disease duration of a minimum of 6 weeks and up to 6 months since diagnosis, and absence of any treatment with disease-modifying anti-rheumatic drugs (DMARDs), biologic therapy or steroids (ever). Chronic RA was defined as any patient meeting the criteria for RA whose disease duration was a minimum of 6 months since diagnosis. Most subjects with chronic RA were receiving DMARDs (oral and/or biologic agents) and/or corticosteroids at the time of enrollment. Healthy controls were age-, sex-, and ethnicity-matched individuals with no personal history of inflammatory arthritis.</p><p>The exclusion criteria applied to all groups were as follows: recent (&lt;3 months prior) use of any antibiotic therapy, current extreme diet (e.g., parenteral nutrition or macrobiotic diet), known inflammatory bowel disease, known history of malignancy, current consumption of probiotics, any gastrointestinal tract surgery leaving permanent residua (e.g., gastrectomy, bariatric surgery, colectomy), or significant liver, renal, or peptic ulcer disease. This study was approved by the Institutional Review Board of New York University School of Medicine.</p></sec><sec id="s4-3"><title>Sample collection and DNA extraction</title><p>Fecal samples were obtained within 24 hr of production. All samples were suspended in MoBio buffer-containing tubes. 
DNA was extracted using a combination of the MoBio Power Soil kit (Mo Bio Laboratories, Inc, Carlsbad, CA, USA) and a mechanical disruption (bead-beater) method based on a previously described protocol (<xref ref-type="bibr" rid="bib47">Ubeda et al., 2010</xref>). Samples were stored at −80°C.</p></sec><sec id="s4-4"><title>V1–V2 16S rDNA region amplification and sequencing</title><p>For each sample, three replicate PCRs were performed to amplify the V1 and V2 regions as previously described (<xref ref-type="bibr" rid="bib47">Ubeda et al., 2010</xref>). PCR products were sequenced on the 454 GS FLX Titanium platform (454 Life Sciences, Branford, CT, USA) to a depth of at least 2,600 reads per subject. Sequences have been deposited in the NCBI Sequence Read Archive under the accession number SRP023463.</p></sec><sec id="s4-5"><title>16S sequence analysis</title><p>Sequence data were compiled and processed using MOTHUR (<xref ref-type="bibr" rid="bib38">Schloss et al., 2009</xref>). Sequences were converted to standard FASTA format. Sequences shorter than 200 bp, containing undetermined bases or homopolymer stretches longer than 8 bp, with no exact match to the forward primer or a barcode, or that did not align with the appropriate 16S rRNA variable region were not included in the analysis. Using the 454 base quality scores, which range from 0–40 (0 being an ambiguous base), sequences were trimmed using a sliding-window technique, such that the minimum average quality score over a window of 50 bases never dropped below 30. Sequences were trimmed from the 3′-end until this criterion was met. Sequences were aligned to the 16S rRNA gene, using as template the SILVA reference alignment (<xref ref-type="bibr" rid="bib30">Pruesse et al., 2007</xref>), and the Needleman-Wunsch algorithm with the default scoring options. Potentially chimeric sequences were removed using the ChimeraSlayer program (<xref ref-type="bibr" rid="bib16">Haas et al., 2011</xref>). 
To minimize the effect of pyrosequencing errors in overestimating microbial diversity (<xref ref-type="bibr" rid="bib19">Huse et al., 2010</xref>), rare abundance sequences that differ in one or two nucleotides from a high abundance sequence were merged to the high abundance sequence using the pre.cluster option in MOTHUR. Sequences were grouped into operational taxonomic units (OTUs) using the average neighbor algorithm. Sequences with distance-based similarity of 97% or greater were assigned to the same OTU. OTU-based microbial diversity was estimated by calculating the Shannon diversity index and Simpson Index using <italic>mothur</italic>. Phylogenetic classification was performed for each sequence using the Bayesian classifier algorithm described by Wang and colleagues with the bootstrap cutoff 60% (<xref ref-type="bibr" rid="bib48">Wang et al., 2007</xref>).</p></sec><sec id="s4-6"><title>Statistical assessment of biomarkers using LEfSe</title><p>Briefly, LEfSe pairwise compares abundances of all biomarkers (e.g., bacterial clades) between all groups using the Kruskal-Wallis test, requiring all such tests to be statistically significant. Vectors resulting from the comparison of abundances (e.g., <italic>Prevotella</italic> relative abundance) between groups are used as input to linear discriminant analysis (LDA), which produces an effect size (<xref ref-type="fig" rid="fig1">Figure 1A</xref>). In analyses performed here, the main utility of LEfSe over traditional statistical tests is that an effect size is produced in addition to a p or q value. This allows us to sort the results of multiple tests by the magnitude of the difference between groups, not only by q values, as the two are not necessarily correlated. In the case of hierarchically organized groups (e.g., bacterial clades, or KEGG pathways), this lack of correlation can arise from differences in the number of hypotheses considered at different levels in the hierarchy. 
For example, at the genus level, there may be 1,000 tests performed, requiring a high level of significance to pass multiple testing correction, whereas at the phylum level, only 10 tests may be performed, requiring a less stringent threshold for significance.</p></sec><sec id="s4-7"><title>Processing of Illumina reads</title><p>Paired-end reads 100 bp in length were trimmed from both ends to yield the largest contiguous segment where all per-base QVs were >= 25. Reads &lt; 50 bp in length after this step were discarded. Quality-filtered reads were then aligned to the human reference genome (hg19) using bowtie2 in --very-sensitive-local mode, keeping only those reads that failed to align. Human-filtered reads were then sorted into complete pairs and singletons (whose mates were removed by filtering) for downstream analyses.</p></sec><sec id="s4-8"><title>Calculation of <italic>P. copri DSM 18205</italic> genome coverage</title><p>The <italic>P. copri DSM 18205</italic>-reference genome (assembly GCA_000157935.1) was first concatenated into a pseudo-contig in order of increasing contig number. Filtered Illumina reads from <italic>P. copri</italic> positive NORA and healthy (including HMP subjects, <xref ref-type="supplementary-material" rid="SD10-data">Supplementary file 1A</xref>) subjects were aligned to the reference using bowtie2 in --very-sensitive-local mode. Paired-end reads aligning to non-overlapping 1 kb windows across the length of the genome were counted and normalized to FPKM (fragments per kilobase per million reads). The interquartile range (25<sup>th</sup> to 75<sup>th</sup> percentile), mean, and median FPKM for each window were calculated and displayed as a boxplot with R.</p></sec><sec id="s4-9"><title>Generation of a <italic>P. copri</italic> pangenome catalog</title><p>Filtered paired-end reads from <italic>P. 
copri</italic> positive subjects were first assembled according to the HMP Whole-Metagenome Assembly SOP (<xref ref-type="bibr" rid="bib28">Pop, 2011</xref>) using SOAPdenovo (<xref ref-type="bibr" rid="bib24">Luo et al., 2012</xref>). Briefly, paired-end and singleton reads were used concurrently with the parameters -K 25 -R -M 3 -d 1. The resulting contigs >300 bp in length were then aligned to the <italic>P. copri</italic> reference genome with BLASTN at an e value cutoff of 1e-5. A stringent cutoff requiring at least one hit of 97% identity across 300 bp was used to infer that a contig originated from a strain of <italic>P. copri</italic> (<xref ref-type="fig" rid="fig3s1">Figure 3—figure supplement 1D</xref>). ORFs were then called on the resulting contigs using MetaGeneMark (<xref ref-type="bibr" rid="bib54">Zhu et al., 2010</xref>). The resulting ORFs were then clustered using USEARCH at an identity threshold of 97% to yield a final set of <italic>P. copri</italic> genes (<xref ref-type="fig" rid="fig3s1">Figure 3—figure supplement 1D</xref>). Samples were excluded from further analyses if they had fewer than 7 million reads aligning to <italic>P. copri</italic> (<xref ref-type="fig" rid="fig3s1">Figure 3—figure supplement 1C</xref>). This resulted in a catalog of 20,387 putative <italic>P. copri</italic> ORFs with 9,274 +/− 1,640 (mean, SD) present in each subject. Further filtering of partially assembled (i.e., containing gaps, lacking stop codons), short (i.e., less than 300 bp), and low-coverage (i.e., present in fewer than five subjects) ORFs yielded a final set of 3,291 high-confidence <italic>P. copri</italic> ORFs.</p></sec><sec id="s4-10"><title>Presence or absence determination of <italic>P. copri</italic> pangenome ORFs</title><p>Filtered reads were aligned to the <italic>P. copri</italic> pangenome catalog using bowtie2 in --very-fast mode. 
ORFs were said to be present in a sample if at least 97% of their length, minus one read length (i.e., 100 bp) to account for edge alignment artifacts, was covered at an identity of 97% or greater (<xref ref-type="fig" rid="fig3s1">Figure 3—figure supplement 1A</xref>).</p></sec><sec id="s4-11"><title>Calculation of differential ORF presence in healthy and NORA</title><p>The presence or absence of ORFs in each sample was determined as above, and Fisher’s exact test was used on 2 × 2 contingency tables for each ORF. The resulting p values were adjusted for multiple hypothesis testing by converting to false discovery rate (FDR) q values using the Benjamini-Hochberg procedure. ORFs with q &lt; 0.25 were considered statistically significant. Effect size was calculated using the equation below.<disp-formula id="equ1"><mml:math id="m1"><mml:mrow><mml:mi mathvariant="italic">Effect</mml:mi><mml:mo> </mml:mo><mml:mi mathvariant="italic">Size</mml:mi><mml:mo> </mml:mo><mml:mo>=</mml:mo><mml:mo> </mml:mo><mml:mfrac><mml:mrow><mml:mi mathvariant="italic">Absent</mml:mi><mml:mo> </mml:mo><mml:mi mathvariant="italic">in</mml:mi><mml:mo> </mml:mo><mml:mi mathvariant="italic">NORA</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">Total</mml:mi><mml:mo> </mml:mo><mml:mi mathvariant="italic">Absent</mml:mi></mml:mrow></mml:mfrac><mml:mo>−</mml:mo><mml:mfrac><mml:mrow><mml:mi mathvariant="italic">Present</mml:mi><mml:mo> </mml:mo><mml:mi mathvariant="italic">in</mml:mi><mml:mo> </mml:mo><mml:mi mathvariant="italic">NORA</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">Total</mml:mi><mml:mo> </mml:mo><mml:mi mathvariant="italic">Present</mml:mi></mml:mrow></mml:mfrac></mml:mrow></mml:math></disp-formula></p></sec><sec id="s4-12"><title>Application of Bayes’ theorem to <italic>P. copri</italic> presence and NORA status</title><p>In western cohorts, such as the Human Microbiome Project and our own, the prevalence of <italic>P. 
copri</italic> is approximately 19%, that is P(<italic>Prevotella</italic>) = 0.19. The approximate incidence of RA is thought to be 1%, that is P(NORA) = 0.01. In our cohort, we found that 75% of new-onset RA (NORA) subjects had 5% or more <italic>Prevotella</italic> OTU4, which we determined to be <italic>P. copri</italic>, that is P(<italic>Prevotella</italic>|NORA) = 0.75. We therefore applied Bayes’ theorem as given below.<disp-formula id="equ2"><mml:math id="m2"><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi mathvariant="italic">NORA</mml:mi><mml:mtext>|</mml:mtext><mml:mi mathvariant="italic">Prevotella</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi mathvariant="italic">Prevotella</mml:mi><mml:mtext>|</mml:mtext><mml:mi mathvariant="italic">NORA</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">NORA</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">Prevotella</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mrow></mml:math></disp-formula></p><p>The solution to this equation gives a 3.95% probability of NORA status if <italic>P. copri</italic> is present in the gut, compared to a 1% probability of NORA (i.e., the incidence of RA) given no prior information.</p></sec><sec id="s4-13"><title>Genome assembly</title><p>Long reads were obtained for several high-<italic>Prevotella</italic> abundance subjects (028B, 030B, 061B, 089B) on the 454 GS FLX Titanium platform. These reads were assembled with Newbler v2.6 to obtain metagenomic assemblies (<xref ref-type="table" rid="tbl2">Table 2</xref>). The resulting contigs were subsequently filtered by alignment to the <italic>P. 
copri DSM 18205</italic> reference genome, keeping those with at least one hit of 97% identity across 300 bp, to obtain draft patient-derived <italic>P. copri</italic> genomes.</p></sec><sec id="s4-14"><title>Statistical significance of marker gene profiles between samplings</title><p>If each gene (boxes in <xref ref-type="fig" rid="fig3">Figure 3B</xref>, rows 61 boxes in length) is considered independently and can be in one of two states (i.e., present or absent), the probability of an exact match between any two individuals is 2<sup>−61</sup>, or 2<sup>−60</sup> with one mismatch. Qualitatively, it can be seen that any intra- or inter-individual comparison is highly statistically significant. Further, if we concede that genes within an island are not truly independent, and there are six such islands which are considered identical with 1–2 mismatches allowed, the probability of such a match is 2<sup>−6</sup>, or 0.015625, less than a 0.05 threshold for significance.</p></sec><sec id="s4-15"><title>Quantification of metagenome function with HUMAnN and LEfSe</title><p>Filtered paired-end reads were aligned separately to all genomes in KEGG with USEARCH 6.0 (<xref ref-type="bibr" rid="bib10">Edgar, 2010</xref>) using parameters -usearch_local -maxaccepts 2 -maxrejects 8 -evalue 0.1 -id 0.80. The results from each read in a pair (and singletons) were combined and processed with HUMAnN 0.96 (<xref ref-type="bibr" rid="bib2">Abubucker et al., 2012</xref>) with default parameters. Output tables containing per-sample abundance estimates of KEGG modules were then processed with LEfSe (<xref ref-type="bibr" rid="bib41">Segata et al., 2011</xref>) using an alpha cutoff of 0.001 and an effect size cutoff of 2.0.</p></sec><sec id="s4-16"><title>Human leukocyte antigen (HLA) allele determination</title><p>Genomic DNA was isolated from the peripheral blood of RA patients and controls using QIAamp Blood Mini Kit (Qiagen GmbH, Hilden, Germany) according to the manufacturer’s instructions. 
HLA-DRB1 alleles were determined by Sequence-Based Typing (SBT) and by Sequence-Specific Primer-Polymerase Chain Reaction (SSP-PCR) methodologies (Fred H Allen Laboratory of Immunogenetics, NY, USA; Weatherall Institute for Molecular Medicine, Oxford, UK) (<xref ref-type="supplementary-material" rid="SD10-data">Supplementary file 1E</xref>). Alleles considered to have the shared-epitope conferring higher risk for RA included: HLA-DRB1*01:01, 01:02, 04:01, 04:04, 04:05, 04:08, 10:01, 13:03, and 14:02, corresponding to S<sub>2</sub> and S<sub>3P</sub> RA risk classification (<xref ref-type="bibr" rid="bib8">du Montcel et al., 2005</xref>). Subjects with at least one copy of these alleles have >1.95 times the relative risk of disease compared to the least at-risk genotype studied.</p></sec><sec id="s4-17"><title>Colonization of mice</title><p>C57BL/6 mice (Jackson Laboratories) were treated with ampicillin, neomycin, and metronidazole (all 1 g/l) for 7 days prior to gavage. <italic>P. copri</italic> (CB7, DSMZ) or <italic>B. thetaiotaomicron</italic> (gift from E Martens) was grown to log phase under anaerobic conditions in PYG liquid media (Anaerobe Systems, CA, USA) and 10<sup>7</sup> CFU were used to inoculate mice. Feces were collected at 1 and 2 weeks post-gavage to confirm colonization. Fecal DNA was extracted with mechanical bead beating with 0.1 mm zirconia silica beads (Biospecs Inc.) in 2% SDS followed by phenol chloroform extraction. Confirmation of colonization was achieved with <italic>P. copri</italic> genome specific primers (F: CCGGACTCCTGCCCCTGCAA, R: GTTGCGCCAGGCACTGCGAT); <italic>Prevotella</italic> 16S primers (F: CACRGTAAACGATGGATGCC, R: GGTCGGGTTGCAGACC), <italic>B. thetaiotaomicron</italic> SusC (F: CACAACAGCCATAGCGTTCCA, R: ATCGCAAAAATAAGATGGGCAAA) (Benjdia et al., JBC 2011), and Universal 16S Primers (F: ACTCCTACGGGAGGCAGCAGT, R: ATTACCGCGGCTGCTGGC). 
QPCR was performed with a Roche Lightcycler (Roche USA, South San Francisco, CA, USA) and the following cycling conditions: 95°C for 5 min, followed by 40 cycles of 95°C for 10 s, 60°C for 30 s, and 72°C for 30 s. Genomic DNA from <italic>P. copri</italic> was used to generate a standard curve to quantitate ng of <italic>P. copri</italic> present per mg of total feces.</p></sec><sec id="s4-18"><title>DSS-induced colitis</title><p>Mice were given 2% dextran sulfate sodium (DSS) in drinking water <italic>ad libitum</italic> for 7 days. Body weight was evaluated every 1–2 days over 14 days. Colonic mucosal damage 0 to 3 cm proximal to the anal verge was evaluated by direct visualization using the Coloview (Karl Storz Veterinary Endoscopy, Tuttlingen, Germany). Endoscopic scoring was performed as previously described: assessment of colon thickening (0–3 points), fibrinization (0–3 points), granularity (0–3 points), morphology of the vascular pattern (0–3 points), and stool consistency (normal to unshaped; 0–3 points) (<xref ref-type="bibr" rid="bib5a">Becker et al., 2006</xref>).</p></sec><sec id="s4-19"><title>Cell isolation and intracellular staining</title><p>Lamina propria mononuclear cells were isolated from colonic tissue as previously described (<xref ref-type="bibr" rid="bib7">Diehl et al., 2013</xref>). Cells were stimulated with phorbol myristate acetate and ionomycin with brefeldin for 4 hr and prepared per the manufacturer’s instructions with Cytoperm/Cytofix (BD Biosciences) for intracellular cytokine evaluation of IL-17A (eBiosciences 17B7) and IFNγ (eBiosciences XMG1.2). 
For Foxp3 analysis, cells were fixed and permeabilized per the manufacturer’s instructions (eBiosciences, Inc., San Diego, CA, USA) and stained intracellularly with anti-Foxp3 (FJK-16s).</p></sec><sec id="s4-20"><title>Source data</title><p>Source files for the figures and figure supplements have been uploaded to GitHub (<ext-link ext-link-type="uri" xlink:href="https://github.com/polyatail/scher_et_al_2013">https://github.com/polyatail/scher_et_al_2013</ext-link>) and are also provided as <xref ref-type="supplementary-material" rid="SD1-data">Figure 1—source data 1</xref>, <xref ref-type="supplementary-material" rid="SD2-data">Figure 1—source data 2</xref>, <xref ref-type="supplementary-material" rid="SD3-data">Figure 2—source data 1</xref>, <xref ref-type="supplementary-material" rid="SD4-data">Figure 2—source data 2</xref>, <xref ref-type="supplementary-material" rid="SD5-data">Figure 3—source data 1</xref>, <xref ref-type="supplementary-material" rid="SD6-data">Figure 3—source data 2</xref>, <xref ref-type="supplementary-material" rid="SD7-data">Figure 4—source data 1</xref>, <xref ref-type="supplementary-material" rid="SD8-data">Figure 5—source data 1</xref>, and <xref ref-type="supplementary-material" rid="SD9-data">Figure 6—source data 1</xref>. 
Any future updates will be made available on GitHub.</p></sec></sec></body><back><ack id="ack"><title>Acknowledgements</title><p>The authors would like to thank Pamela Rosenthal, Soumya Reddy, and Peter Izmirly for help in patient recruitment; Flo Pauli and Sarah Meadows (HudsonAlpha), Agnes Viale and Lauren Lipuma (MSKCC) for sequencing; Mukundan Attur (NYU) for help in sample preparation; Xiang Qin and Joseph Petrosino (Baylor Genome Center) for help with <italic>Prevotella</italic> sequencing; Eric Martens (U Michigan) for his gift of <italic>Bacteroides</italic> strains; Joe DeRisi (UCSF) for computational resources; and Gerard Honig, Gretchen Diehl and Elke Kurz (NYU) for early help with mouse and microbiology experiments.</p></ack><sec sec-type="additional-information"><title>Additional information</title><fn-group content-type="competing-interest"><title>Competing interests</title><fn fn-type="conflict" id="conf1"><p>The authors declare that no competing interests exist.</p></fn></fn-group><fn-group content-type="author-contribution"><title>Author contributions</title><fn fn-type="con" id="con1"><p>JUS, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article</p></fn><fn fn-type="con" id="con2"><p>AS, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article</p></fn><fn fn-type="con" id="con3"><p>RSL, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article</p></fn><fn fn-type="con" id="con4"><p>NS, Analysis and interpretation of data, Drafting or revising the article</p></fn><fn fn-type="con" id="con5"><p>CU, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article</p></fn><fn fn-type="con" id="con6"><p>CB, Acquisition of data, Analysis and interpretation of data</p></fn><fn fn-type="con" id="con7"><p>TR, Acquisition of data, Drafting or revising the 
article</p></fn><fn fn-type="con" id="con8"><p>VC, Acquisition of data, Drafting or revising the article</p></fn><fn fn-type="con" id="con9"><p>EGP, Conception and design, Analysis and interpretation of data</p></fn><fn fn-type="con" id="con10"><p>SBA, Conception and design, Analysis and interpretation of data</p></fn><fn fn-type="con" id="con11"><p>CH, Conception and design, Analysis and interpretation of data, Drafting or revising the article</p></fn><fn fn-type="con" id="con12"><p>DRL, Conception and design, Analysis and interpretation of data, Drafting or revising the article</p></fn></fn-group><fn-group content-type="ethics-information"><title>Ethics</title><fn fn-type="other"><p>Human subjects: Consecutive patients from New York University rheumatology clinics were offered enrollment in this study after informed consent was obtained. This study was approved by the Institutional Review Board of New York University School of Medicine (NYU IRB protocol H#09-0658).</p></fn><fn fn-type="other"><p>Animal experimentation: All animal experiments were performed in accordance with approved protocols for the New York University Institutional Animal Care and Usage Committee (institutional number A3435-01, protocol #110602-03).</p></fn></fn-group></sec><sec sec-type="supplementary-material"><title>Additional files</title><supplementary-material id="SD10-data"><object-id pub-id-type="doi">10.7554/eLife.01202.026</object-id><label>Supplementary file 1.</label><caption><p>(<bold>A</bold>) Read statistics of sequenced samples included in and excluded from biomarker analyses. (<bold>B</bold>) Presence/absence, p-values and FDR statistics for differentially represented ORFs in the P. copri pangenome biomarker analysis, with annotations. (<bold>C</bold>) KOs present in <italic>P. copri DSM 18205</italic> but not in any <italic>Bacteroides</italic> accounting for at least 5% of the total microbiota in any subject of the Human Microbiome Project. 
(<bold>D</bold>) KOs present in all genomes available for <italic>Bacteroides</italic> accounting for at least 5% of the total microbiota in any subject of the Human Microbiome Project and not present in <italic>P. copri DSM 18205</italic>. (<bold>E</bold>) HLA-DRB1 alleles were determined for subjects in the cohort. Counts of RA risk alleles (shared epitope) are indicated as 0 for homozygotes not at risk, 1 for heterozygotes, and 2 for homozygotes at risk (‘Materials and methods’). Shared epitope alleles appear in bold.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01202.026">http://dx.doi.org/10.7554/eLife.01202.026</ext-link></p></caption><media mime-subtype="xlsx" mimetype="application" xlink:href="elife01202s010.xlsx"/></supplementary-material><sec sec-type="datasets"><title>Major datasets</title><p>The following dataset was generated:</p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro1"><name><surname>Scher</surname></name>, <etal/>, <year>2013</year><x>, </x><source>Intestinal microbiota of patients with arthritis</source><x>, </x><object-id pub-id-type="art-access-id">PRJNA203810</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA203810">http://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA203810</ext-link><x>, </x><comment>Publicly available at the NCBI BioProject database (<ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/bioproject">http://www.ncbi.nlm.nih.gov/bioproject</ext-link>).</comment></related-object></p><p>The following previously published dataset was used:</p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro2"><collab>HMP Consortium</collab>, <year>2010</year><x>, </x><source>NIH Human Microbiome Project</source><x>, </x><object-id 
pub-id-type="art-access-id">PRJNA43021</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA43021">http://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA43021</ext-link><x>, </x><comment>Publicly available at the NCBI BioProject database (<ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/bioproject">http://www.ncbi.nlm.nih.gov/bioproject</ext-link>).</comment></related-object></p></sec></sec><ref-list><title>References</title><ref id="bib1"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Abdollahi-Roodsaz</surname><given-names>S</given-names></name><name><surname>Joosten</surname><given-names>LA</given-names></name><name><surname>Koenders</surname><given-names>MI</given-names></name><name><surname>Devesa</surname><given-names>I</given-names></name><name><surname>Roelofs</surname><given-names>MF</given-names></name><name><surname>Radstake</surname><given-names>TR</given-names></name><etal/></person-group><year>2008</year><article-title>Stimulation of TLR2 and TLR4 differentially skews the balance of T cells in a mouse model of arthritis</article-title><source>J Clin Invest</source><volume>118</volume><fpage>205</fpage><lpage>16</lpage><pub-id pub-id-type="doi">10.1172/JCI32639</pub-id></element-citation></ref><ref id="bib2"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Abubucker</surname><given-names>S</given-names></name><name><surname>Segata</surname><given-names>N</given-names></name><name><surname>Goll</surname><given-names>J</given-names></name><name><surname>Schubert</surname><given-names>AM</given-names></name><name><surname>Izard</surname><given-names>J</given-names></name><name><surname>Cantarel</surname><given-names>BL</given-names></name><etal/></person-group><year>2012</year><article-title>Metabolic reconstruction for metagenomic data and its application to the human 
microbiome</article-title><source>PLOS Comput Biol</source><volume>8</volume><fpage>e1002358</fpage><pub-id pub-id-type="doi">10.1371/journal.pcbi.1002358</pub-id></element-citation></ref><ref id="bib3"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Aletaha</surname><given-names>D</given-names></name><name><surname>Neogi</surname><given-names>T</given-names></name><name><surname>Silman</surname><given-names>AJ</given-names></name><name><surname>Funovits</surname><given-names>J</given-names></name><name><surname>Felson</surname><given-names>DT</given-names></name><name><surname>Bingham</surname><given-names>CO</given-names><suffix>III</suffix></name><etal/></person-group><year>2010</year><article-title>2010 rheumatoid arthritis classification criteria: an American College of Rheumatology/European League Against Rheumatism collaborative initiative</article-title><source>Arthritis Rheum</source><volume>62</volume><fpage>2569</fpage><lpage>81</lpage><pub-id pub-id-type="doi">10.1002/art.27584</pub-id></element-citation></ref><ref id="bib4"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Arumugam</surname><given-names>M</given-names></name><name><surname>Raes</surname><given-names>J</given-names></name><name><surname>Pelletier</surname><given-names>E</given-names></name><name><surname>Le Paslier</surname><given-names>D</given-names></name><name><surname>Yamada</surname><given-names>T</given-names></name><name><surname>Mende</surname><given-names>DR</given-names></name><etal/></person-group><year>2011</year><article-title>Enterotypes of the human gut microbiome</article-title><source>Nature</source><volume>473</volume><fpage>174</fpage><lpage>80</lpage><pub-id pub-id-type="doi">10.1038/nature09944</pub-id></element-citation></ref><ref id="bib5"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Atarashi</surname><given-names>K</given-names></name><name><surname>Tanoue</surname><given-names>T</given-names></name><name><surname>Shima</surname><given-names>T</given-names></name><name><surname>Imaoka</surname><given-names>A</given-names></name><name><surname>Kuwahara</surname><given-names>T</given-names></name><name><surname>Momose</surname><given-names>Y</given-names></name><etal/></person-group><year>2011</year><article-title>Induction of colonic regulatory T cells by indigenous <italic>Clostridium</italic> species</article-title><source>Science</source><volume>331</volume><fpage>337</fpage><lpage>41</lpage><pub-id pub-id-type="doi">10.1126/science.1198469</pub-id></element-citation></ref><ref id="bib5a"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Becker</surname><given-names>C</given-names></name><name><surname>Fantini</surname><given-names>MC</given-names></name><name><surname>Neurath</surname><given-names>MF</given-names></name></person-group><year>2006</year><article-title>High resolution colonoscopy in live mice</article-title><source>Nat Protoc</source><volume>1</volume><fpage>2900</fpage><lpage>4</lpage><pub-id pub-id-type="doi">10.1038/nprot.2006.446</pub-id></element-citation></ref><ref id="bib6"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Deane</surname><given-names>KD</given-names></name><name><surname>Norris</surname><given-names>JM</given-names></name><name><surname>Holers</surname><given-names>VM</given-names></name></person-group><year>2010</year><article-title>Preclinical rheumatoid arthritis: identification, evaluation, and future directions for investigation</article-title><source>Rheum Dis Clin North Am</source><volume>36</volume><fpage>213</fpage><lpage>41</lpage><pub-id pub-id-type="doi">10.1016/j.rdc.2010.02.001</pub-id></element-citation></ref><ref id="bib7"><element-citation 
publication-type="journal"><person-group person-group-type="author"><name><surname>Diehl</surname><given-names>GE</given-names></name><name><surname>Longman</surname><given-names>RS</given-names></name><name><surname>Zhang</surname><given-names>JX</given-names></name><name><surname>Breart</surname><given-names>B</given-names></name><name><surname>Galan</surname><given-names>C</given-names></name><name><surname>Cuesta</surname><given-names>A</given-names></name><etal/></person-group><year>2013</year><article-title>Microbiota restricts trafficking of bacteria to mesenteric lymph nodes by CX(3)CR1(hi) cells</article-title><source>Nature</source><volume>494</volume><fpage>116</fpage><lpage>20</lpage><pub-id pub-id-type="doi">10.1038/nature11809</pub-id></element-citation></ref><ref id="bib8"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>du Montcel</surname><given-names>ST</given-names></name><name><surname>Michou</surname><given-names>L</given-names></name><name><surname>Petit-Teixeira</surname><given-names>E</given-names></name><name><surname>Osorio</surname><given-names>J</given-names></name><name><surname>Lemaire</surname><given-names>I</given-names></name></person-group><year>2005</year><article-title>New classification of HLA-DRB1 alleles supports the shared epitope hypothesis of rheumatoid arthritis susceptibility</article-title><source>Arthritis Rheum</source><volume>52</volume><fpage>1063</fpage><lpage>8</lpage><pub-id pub-id-type="doi">10.1002/art.20989</pub-id></element-citation></ref><ref id="bib9"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Edgar</surname><given-names>RC</given-names></name></person-group><year>2004</year><article-title>MUSCLE: multiple sequence alignment with high accuracy and high throughput</article-title><source>Nucleic Acids Res</source><volume>32</volume><fpage>1792</fpage><lpage>7</lpage><pub-id 
pub-id-type="doi">10.1093/nar/gkh340</pub-id></element-citation></ref><ref id="bib10"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Edgar</surname><given-names>RC</given-names></name></person-group><year>2010</year><article-title>Search and clustering orders of magnitude faster than BLAST</article-title><source>Bioinformatics</source><volume>26</volume><fpage>2460</fpage><lpage>1</lpage><pub-id pub-id-type="doi">10.1093/bioinformatics/btq461</pub-id></element-citation></ref><ref id="bib11"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Elinav</surname><given-names>E</given-names></name><name><surname>Strowig</surname><given-names>T</given-names></name><name><surname>Kau</surname><given-names>AL</given-names></name><name><surname>Henao-Mejia</surname><given-names>J</given-names></name><name><surname>Thaiss</surname><given-names>CA</given-names></name><name><surname>Booth</surname><given-names>CJ</given-names></name><etal/></person-group><year>2011</year><article-title>NLRP6 inflammasome regulates colonic microbial ecology and risk for colitis</article-title><source>Cell</source><volume>145</volume><fpage>745</fpage><lpage>57</lpage><pub-id pub-id-type="doi">10.1016/j.cell.2011.04.022</pub-id></element-citation></ref><ref id="bib12"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Faust</surname><given-names>K</given-names></name><name><surname>Sathirapongsasuti</surname><given-names>JF</given-names></name><name><surname>Izard</surname><given-names>J</given-names></name><name><surname>Segata</surname><given-names>N</given-names></name><name><surname>Gevers</surname><given-names>D</given-names></name><name><surname>Raes</surname><given-names>J</given-names></name><etal/></person-group><year>2012</year><article-title>Microbial co-occurrence relationships in the human microbiome</article-title><source>PLOS 
Comput Biol</source><volume>8</volume><fpage>e1002606</fpage><pub-id pub-id-type="doi">10.1371/journal.pcbi.1002606</pub-id></element-citation></ref><ref id="bib13"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Frank</surname><given-names>DN</given-names></name><name><surname>Robertson</surname><given-names>CE</given-names></name><name><surname>Hamm</surname><given-names>CM</given-names></name><name><surname>Kpadeh</surname><given-names>Z</given-names></name><name><surname>Zhang</surname><given-names>T</given-names></name><name><surname>Chen</surname><given-names>H</given-names></name><etal/></person-group><year>2011</year><article-title>Disease phenotype and genotype are associated with shifts in intestinal-associated microbiota in inflammatory bowel diseases</article-title><source>Inflamm Bowel Dis</source><volume>17</volume><fpage>179</fpage><lpage>84</lpage><pub-id pub-id-type="doi">10.1002/ibd.21339</pub-id></element-citation></ref><ref id="bib14"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gomez</surname><given-names>A</given-names></name><name><surname>Luckey</surname><given-names>D</given-names></name><name><surname>Yeoman</surname><given-names>CJ</given-names></name><name><surname>Marietta</surname><given-names>EV</given-names></name><name><surname>Berg Miller</surname><given-names>ME</given-names></name></person-group><year>2012</year><article-title>Loss of sex and age driven differences in the gut microbiome characterize arthritis-susceptible 0401 mice but not arthritis-resistant 0402 mice</article-title><source>PLOS ONE</source><volume>7</volume><fpage>e36095</fpage><pub-id pub-id-type="doi">10.1371/journal.pone.0036095</pub-id></element-citation></ref><ref id="bib15"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Gregersen</surname><given-names>PK</given-names></name><name><surname>Silver</surname><given-names>J</given-names></name><name><surname>Winchester</surname><given-names>RJ</given-names></name></person-group><year>1987</year><article-title>The shared epitope hypothesis. An approach to understanding the molecular genetics of susceptibility to rheumatoid arthritis</article-title><source>Arthritis Rheum</source><volume>30</volume><fpage>1205</fpage><lpage>13</lpage><pub-id pub-id-type="doi">10.1002/art.1780301102</pub-id></element-citation></ref><ref id="bib16"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Haas</surname><given-names>BJ</given-names></name><name><surname>Gevers</surname><given-names>D</given-names></name><name><surname>Earl</surname><given-names>AM</given-names></name><name><surname>Feldgarden</surname><given-names>M</given-names></name><name><surname>Ward</surname><given-names>DV</given-names></name><name><surname>Giannoukos</surname><given-names>G</given-names></name><etal/></person-group><year>2011</year><article-title>Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons</article-title><source>Genome Res</source><volume>21</volume><fpage>494</fpage><lpage>504</lpage><pub-id pub-id-type="doi">10.1101/gr.112730.110</pub-id></element-citation></ref><ref id="bib17"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hayashi</surname><given-names>H</given-names></name><name><surname>Shibata</surname><given-names>K</given-names></name><name><surname>Sakamoto</surname><given-names>M</given-names></name><name><surname>Tomita</surname><given-names>S</given-names></name><name><surname>Benno</surname><given-names>Y</given-names></name></person-group><year>2007</year><article-title><italic>Prevotella copri</italic> sp. nov. and Prevotella stercorea sp. 
nov., isolated from human faeces</article-title><source>Int J Syst Evol Microbiol</source><volume>57</volume><fpage>941</fpage><lpage>6</lpage><pub-id pub-id-type="doi">10.1099/ijs.0.64778-0</pub-id></element-citation></ref><ref id="bib18"><element-citation publication-type="journal"><collab>Human Microbiome Project Consortium</collab><year>2012</year><article-title>Structure, function and diversity of the healthy human microbiome</article-title><source>Nature</source><volume>486</volume><fpage>207</fpage><lpage>14</lpage><pub-id pub-id-type="doi">10.1038/nature11234</pub-id></element-citation></ref><ref id="bib19"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Huse</surname><given-names>SM</given-names></name><name><surname>Welch</surname><given-names>DM</given-names></name><name><surname>Morrison</surname><given-names>HG</given-names></name><name><surname>Sogin</surname><given-names>ML</given-names></name></person-group><year>2010</year><article-title>Ironing out the wrinkles in the rare biosphere through improved OTU clustering</article-title><source>Environ Microbiol</source><volume>12</volume><fpage>1889</fpage><lpage>98</lpage><pub-id pub-id-type="doi">10.1111/j.1462-2920.2010.02193.x</pub-id></element-citation></ref><ref id="bib20"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ivanov</surname><given-names>Ii</given-names></name><name><surname>Atarashi</surname><given-names>K</given-names></name><name><surname>Manel</surname><given-names>N</given-names></name><name><surname>Brodie</surname><given-names>EL</given-names></name><name><surname>Shima</surname><given-names>T</given-names></name><name><surname>Karaoz</surname><given-names>U</given-names></name><etal/></person-group><year>2009</year><article-title>Induction of intestinal Th17 cells by segmented filamentous 
bacteria</article-title><source>Cell</source><volume>139</volume><fpage>485</fpage><lpage>98</lpage><pub-id pub-id-type="doi">10.1016/j.cell.2009.09.033</pub-id></element-citation></ref><ref id="bib21"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kanehisa</surname><given-names>M</given-names></name><name><surname>Goto</surname><given-names>S</given-names></name></person-group><year>2000</year><article-title>KEGG: kyoto encyclopedia of genes and genomes</article-title><source>Nucleic Acids Res</source><volume>28</volume><fpage>27</fpage><lpage>30</lpage><pub-id pub-id-type="doi">10.1093/nar/28.1.27</pub-id></element-citation></ref><ref id="bib22"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Koeth</surname><given-names>RA</given-names></name><name><surname>Wang</surname><given-names>Z</given-names></name><name><surname>Levison</surname><given-names>BS</given-names></name><name><surname>Buffa</surname><given-names>JA</given-names></name><name><surname>Org</surname><given-names>E</given-names></name><name><surname>Sheehy</surname><given-names>BT</given-names></name><etal/></person-group><year>2013</year><article-title>Intestinal microbiota metabolism of l-carnitine, a nutrient in red meat, promotes atherosclerosis</article-title><source>Nat Med</source><volume>19</volume><fpage>576</fpage><lpage>85</lpage><pub-id pub-id-type="doi">10.1038/nm.3145</pub-id></element-citation></ref><ref id="bib23"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Littman</surname><given-names>DR</given-names></name><name><surname>Pamer</surname><given-names>EG</given-names></name></person-group><year>2011</year><article-title>Role of the commensal microbiota in normal and pathogenic host immune responses</article-title><source>Cell Host Microbe</source><volume>10</volume><fpage>311</fpage><lpage>23</lpage><pub-id 
pub-id-type="doi">10.1016/j.chom.2011.10.004</pub-id></element-citation></ref><ref id="bib24"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Luo</surname><given-names>R</given-names></name><name><surname>Liu</surname><given-names>B</given-names></name><name><surname>Xie</surname><given-names>Y</given-names></name><name><surname>Li</surname><given-names>Z</given-names></name><name><surname>Huang</surname><given-names>W</given-names></name><name><surname>Yuan</surname><given-names>J</given-names></name><etal/></person-group><year>2012</year><article-title>SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler</article-title><source>Gigascience</source><volume>1</volume><fpage>18</fpage><pub-id pub-id-type="doi">10.1186/2047-217X-1-18</pub-id></element-citation></ref><ref id="bib25"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Maurice</surname><given-names>MM</given-names></name><name><surname>Nakamura</surname><given-names>H</given-names></name><name><surname>Gringhuis</surname><given-names>S</given-names></name><name><surname>Okamoto</surname><given-names>T</given-names></name><name><surname>Yoshida</surname><given-names>S</given-names></name><name><surname>Kullmann</surname><given-names>F</given-names></name><etal/></person-group><year>1999</year><article-title>Expression of the thioredoxin-thioredoxin reductase system in the inflamed joints of patients with rheumatoid arthritis</article-title><source>Arthritis Rheum</source><volume>42</volume><fpage>2430</fpage><lpage>9</lpage><pub-id pub-id-type="doi">10.1002/1529-0131(199911)42:11<2430::AID-ANR22>3.0.CO;2-6</pub-id></element-citation></ref><ref id="bib26"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>McInnes</surname><given-names>IB</given-names></name><name><surname>Schett</surname><given-names>G</given-names></name></person-group><year>2011</year><article-title>The pathogenesis of rheumatoid arthritis</article-title><source>N Engl J Med</source><volume>365</volume><fpage>2205</fpage><lpage>19</lpage><pub-id pub-id-type="doi">10.1056/NEJMra1004965</pub-id></element-citation></ref><ref id="bib27"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Morgan</surname><given-names>XC</given-names></name><name><surname>Tickle</surname><given-names>TL</given-names></name><name><surname>Sokol</surname><given-names>H</given-names></name><name><surname>Gevers</surname><given-names>D</given-names></name><name><surname>Devaney</surname><given-names>KL</given-names></name><name><surname>Ward</surname><given-names>DV</given-names></name><etal/></person-group><year>2012</year><article-title>Dysfunction of the intestinal microbiome in inflammatory bowel disease and treatment</article-title><source>Genome Biol</source><volume>13</volume><fpage>R79</fpage><pub-id pub-id-type="doi">10.1186/gb-2012-13-9-r79</pub-id></element-citation></ref><ref id="bib28"><element-citation publication-type="web"><person-group person-group-type="author"><name><surname>Pop</surname><given-names>M</given-names></name></person-group><year>2011</year><article-title>HMP Whole-Metagenome Assembly</article-title><ext-link ext-link-type="uri" xlink:href="http://www.hmpdacc.org/doc/HMP_Assembly_SOP.pdf">http://www.hmpdacc.org/doc/HMP_Assembly_SOP.pdf</ext-link></element-citation></ref><ref id="bib29"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Price</surname><given-names>MN</given-names></name><name><surname>Dehal</surname><given-names>PS</given-names></name><name><surname>Arkin</surname><given-names>AP</given-names></name></person-group><year>2010</year><article-title>FastTree 2–approximately maximum-likelihood trees for large alignments</article-title><source>PLOS ONE</source><volume>5</volume><fpage>e9490</fpage><pub-id pub-id-type="doi">10.1371/journal.pone.0009490</pub-id></element-citation></ref><ref id="bib30"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Pruesse</surname><given-names>E</given-names></name><name><surname>Quast</surname><given-names>C</given-names></name><name><surname>Knittel</surname><given-names>K</given-names></name><name><surname>Fuchs</surname><given-names>BM</given-names></name><name><surname>Ludwig</surname><given-names>W</given-names></name><name><surname>Peplies</surname><given-names>J</given-names></name><etal/></person-group><year>2007</year><article-title>SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB</article-title><source>Nucleic Acids Res</source><volume>35</volume><fpage>7188</fpage><lpage>96</lpage><pub-id pub-id-type="doi">10.1093/nar/gkm864</pub-id></element-citation></ref><ref id="bib31"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Qin</surname><given-names>J</given-names></name><name><surname>Li</surname><given-names>R</given-names></name><name><surname>Raes</surname><given-names>J</given-names></name><name><surname>Arumugam</surname><given-names>M</given-names></name><name><surname>Burgdorf</surname><given-names>KS</given-names></name><name><surname>Manichanh</surname><given-names>C</given-names></name><etal/></person-group><year>2010</year><article-title>A human gut microbial gene catalogue established by metagenomic 
sequencing</article-title><source>Nature</source><volume>464</volume><fpage>59</fpage><lpage>65</lpage><pub-id pub-id-type="doi">10.1038/nature08821</pub-id></element-citation></ref><ref id="bib32"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Qin</surname><given-names>J</given-names></name><name><surname>Li</surname><given-names>Y</given-names></name><name><surname>Cai</surname><given-names>Z</given-names></name><name><surname>Li</surname><given-names>S</given-names></name><name><surname>Zhu</surname><given-names>J</given-names></name><name><surname>Zhang</surname><given-names>F</given-names></name><etal/></person-group><year>2012</year><article-title>A metagenome-wide association study of gut microbiota in type 2 diabetes</article-title><source>Nature</source><volume>490</volume><fpage>55</fpage><lpage>60</lpage><pub-id pub-id-type="doi">10.1038/nature11450</pub-id></element-citation></ref><ref id="bib33"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rath</surname><given-names>HC</given-names></name><name><surname>Herfarth</surname><given-names>HH</given-names></name><name><surname>Ikeda</surname><given-names>JS</given-names></name><name><surname>Grenther</surname><given-names>WB</given-names></name><name><surname>Hamm</surname><given-names>TE</given-names><suffix>Jnr</suffix></name><name><surname>Balish</surname><given-names>E</given-names></name><etal/></person-group><year>1996</year><article-title>Normal luminal bacteria, especially <italic>Bacteroides</italic> species, mediate chronic colitis, gastritis, and arthritis in HLA-B27/human beta2 microglobulin transgenic rats</article-title><source>J Clin Invest</source><volume>98</volume><fpage>945</fpage><lpage>53</lpage><pub-id pub-id-type="doi">10.1172/JCI118878</pub-id></element-citation></ref><ref id="bib34"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Round</surname><given-names>JL</given-names></name><name><surname>Lee</surname><given-names>SM</given-names></name><name><surname>Li</surname><given-names>J</given-names></name><name><surname>Tran</surname><given-names>G</given-names></name><name><surname>Jabri</surname><given-names>B</given-names></name><name><surname>Chatila</surname><given-names>TA</given-names></name><etal/></person-group><year>2011</year><article-title>The Toll-like receptor 2 pathway establishes colonization by a commensal of the human microbiota</article-title><source>Science</source><volume>332</volume><fpage>974</fpage><lpage>7</lpage><pub-id pub-id-type="doi">10.1126/science.1206095</pub-id></element-citation></ref><ref id="bib35"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Scher</surname><given-names>JU</given-names></name><name><surname>Abramson</surname><given-names>SB</given-names></name></person-group><year>2011</year><article-title>The microbiome and rheumatoid arthritis</article-title><source>Nat Rev Rheumatol</source><volume>7</volume><fpage>569</fpage><lpage>78</lpage><pub-id pub-id-type="doi">10.1038/nrrheum.2011.121</pub-id></element-citation></ref><ref id="bib36"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Scher</surname><given-names>JU</given-names></name><name><surname>Ubeda</surname><given-names>C</given-names></name><name><surname>Equinda</surname><given-names>M</given-names></name><name><surname>Khanin</surname><given-names>R</given-names></name><name><surname>Buischi</surname><given-names>Y</given-names></name><name><surname>Viale</surname><given-names>A</given-names></name><etal/></person-group><year>2012</year><article-title>Periodontal disease and the oral microbiota in new-onset rheumatoid arthritis</article-title><source>Arthritis Rheum</source><volume>64</volume><fpage>3083</fpage><lpage>94</lpage><pub-id 
pub-id-type="doi">10.1002/art.34539</pub-id></element-citation></ref><ref id="bib37"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Schloissnig</surname><given-names>S</given-names></name><name><surname>Arumugam</surname><given-names>M</given-names></name><name><surname>Sunagawa</surname><given-names>S</given-names></name><name><surname>Mitreva</surname><given-names>M</given-names></name><name><surname>Tap</surname><given-names>J</given-names></name><name><surname>Zhu</surname><given-names>A</given-names></name><etal/></person-group><year>2013</year><article-title>Genomic variation landscape of the human gut microbiome</article-title><source>Nature</source><volume>493</volume><fpage>45</fpage><lpage>50</lpage><pub-id pub-id-type="doi">10.1038/nature11711</pub-id></element-citation></ref><ref id="bib38"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Schloss</surname><given-names>PD</given-names></name><name><surname>Westcott</surname><given-names>SL</given-names></name><name><surname>Ryabin</surname><given-names>T</given-names></name><name><surname>Hall</surname><given-names>JR</given-names></name><name><surname>Hartmann</surname><given-names>M</given-names></name><name><surname>Hollister</surname><given-names>EB</given-names></name><etal/></person-group><year>2009</year><article-title>Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities</article-title><source>Appl Environ Microbiol</source><volume>75</volume><fpage>7537</fpage><lpage>41</lpage><pub-id pub-id-type="doi">10.1128/AEM.01541-09</pub-id></element-citation></ref><ref id="bib39"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Sczesnak</surname><given-names>A</given-names></name><name><surname>Segata</surname><given-names>N</given-names></name><name><surname>Qin</surname><given-names>X</given-names></name><name><surname>Gevers</surname><given-names>D</given-names></name><name><surname>Petrosino</surname><given-names>JF</given-names></name><name><surname>Huttenhower</surname><given-names>C</given-names></name><etal/></person-group><year>2011</year><article-title>The genome of Th17 cell-inducing segmented filamentous bacteria reveals extensive auxotrophy and adaptations to the intestinal environment</article-title><source>Cell Host Microbe</source><volume>10</volume><fpage>260</fpage><lpage>72</lpage><pub-id pub-id-type="doi">10.1016/j.chom.2011.08.005</pub-id></element-citation></ref><ref id="bib40"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Segata</surname><given-names>N</given-names></name><name><surname>Bornigen</surname><given-names>D</given-names></name><name><surname>Morgan</surname><given-names>XC</given-names></name><name><surname>Huttenhower</surname><given-names>C</given-names></name></person-group><year>2013</year><article-title>PhyloPhlAn is a new method for improved phylogenetic and taxonomic placement of microbes</article-title><source>Nat Commun</source><volume>4</volume><fpage>2304</fpage><pub-id pub-id-type="doi">10.1038/ncomms3304</pub-id></element-citation></ref><ref id="bib41"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Segata</surname><given-names>N</given-names></name><name><surname>Izard</surname><given-names>J</given-names></name><name><surname>Waldron</surname><given-names>L</given-names></name><name><surname>Gevers</surname><given-names>D</given-names></name><name><surname>Miropolsky</surname><given-names>L</given-names></name><name><surname>Garrett</surname><given-names>WS</given-names></name><etal/></person-group><year>2011</year><article-title>Metagenomic biomarker discovery and explanation</article-title><source>Genome Biol</source><volume>12</volume><fpage>R60</fpage><pub-id pub-id-type="doi">10.1186/gb-2011-12-6-r60</pub-id></element-citation></ref><ref id="bib42"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Segata</surname><given-names>N</given-names></name><name><surname>Waldron</surname><given-names>L</given-names></name><name><surname>Ballarini</surname><given-names>A</given-names></name><name><surname>Narasimhan</surname><given-names>V</given-names></name><name><surname>Jousson</surname><given-names>O</given-names></name><name><surname>Huttenhower</surname><given-names>C</given-names></name></person-group><year>2012</year><article-title>Metagenomic microbial community profiling using unique clade-specific marker genes</article-title><source>Nat Methods</source><volume>9</volume><fpage>811</fpage><lpage>4</lpage><pub-id pub-id-type="doi">10.1038/nmeth.2066</pub-id></element-citation></ref><ref id="bib43"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Singh</surname><given-names>JA</given-names></name><name><surname>Furst</surname><given-names>DE</given-names></name><name><surname>Bharat</surname><given-names>A</given-names></name><name><surname>Curtis</surname><given-names>JR</given-names></name><name><surname>Kavanaugh</surname><given-names>AF</given-names></name><name><surname>Kremer</surname><given-names>JM</given-names></name><etal/></person-group><year>2012</year><article-title>2012 update of the 2008 American College of Rheumatology recommendations for the use of disease-modifying antirheumatic drugs and biologic agents in the treatment of rheumatoid arthritis</article-title><source>Arthritis Care Res (Hoboken)</source><volume>64</volume><fpage>625</fpage><lpage>39</lpage><pub-id pub-id-type="doi">10.1002/acr.21641</pub-id></element-citation></ref><ref id="bib44"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Stahl</surname><given-names>EA</given-names></name><name><surname>Raychaudhuri</surname><given-names>S</given-names></name><name><surname>Remmers</surname><given-names>EF</given-names></name><name><surname>Xie</surname><given-names>G</given-names></name><name><surname>Eyre</surname><given-names>S</given-names></name><name><surname>Thomson</surname><given-names>BP</given-names></name><etal/></person-group><year>2010</year><article-title>Genome-wide association study meta-analysis identifies seven new rheumatoid arthritis risk loci</article-title><source>Nat Genet</source><volume>42</volume><fpage>508</fpage><lpage>14</lpage><pub-id pub-id-type="doi">10.1038/ng.582</pub-id></element-citation></ref><ref id="bib45"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Tao</surname><given-names>J</given-names></name><name><surname>Kamanaka</surname><given-names>M</given-names></name><name><surname>Hao</surname><given-names>J</given-names></name><name><surname>Hao</surname><given-names>Z</given-names></name><name><surname>Jiang</surname><given-names>X</given-names></name><name><surname>Craft</surname><given-names>JE</given-names></name><etal/></person-group><year>2011</year><article-title>IL-10 signaling in CD4+ T cells is critical for the pathogenesis of collagen-induced arthritis</article-title><source>Arthritis Res Ther</source><volume>13</volume><fpage>R212</fpage><pub-id pub-id-type="doi">10.1186/ar3545</pub-id></element-citation></ref><ref id="bib46"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Tillett</surname><given-names>WS</given-names></name><name><surname>Francis</surname><given-names>T</given-names></name></person-group><year>1930</year><article-title>Serological reactions in pneumonia with a non-protein somatic fraction of Pneumococcus</article-title><source>J Exp Med</source><volume>52</volume><fpage>561</fpage><lpage>71</lpage><pub-id pub-id-type="doi">10.1084/jem.52.4.561</pub-id></element-citation></ref><ref id="bib47"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ubeda</surname><given-names>C</given-names></name><name><surname>Taur</surname><given-names>Y</given-names></name><name><surname>Jenq</surname><given-names>RR</given-names></name><name><surname>Equinda</surname><given-names>MJ</given-names></name><name><surname>Son</surname><given-names>T</given-names></name><name><surname>Samstein</surname><given-names>M</given-names></name><etal/></person-group><year>2010</year><article-title>Vancomycin-resistant Enterococcus domination of intestinal microbiota is enabled by antibiotic treatment in mice and precedes bloodstream invasion in 
humans</article-title><source>J Clin Invest</source><volume>120</volume><fpage>4332</fpage><lpage>41</lpage><pub-id pub-id-type="doi">10.1172/JCI43918</pub-id></element-citation></ref><ref id="bib48"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname><given-names>Q</given-names></name><name><surname>Garrity</surname><given-names>GM</given-names></name><name><surname>Tiedje</surname><given-names>JM</given-names></name><name><surname>Cole</surname><given-names>JR</given-names></name></person-group><year>2007</year><article-title>Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy</article-title><source>Appl Environ Microbiol</source><volume>73</volume><fpage>5261</fpage><lpage>7</lpage><pub-id pub-id-type="doi">10.1128/AEM.00062-07</pub-id></element-citation></ref><ref id="bib49"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Winter</surname><given-names>SE</given-names></name><name><surname>Winter</surname><given-names>MG</given-names></name><name><surname>Xavier</surname><given-names>MN</given-names></name><name><surname>Thiennimitr</surname><given-names>P</given-names></name><name><surname>Poon</surname><given-names>V</given-names></name><name><surname>Keestra</surname><given-names>AM</given-names></name><etal/></person-group><year>2013</year><article-title>Host-derived nitrate boosts growth of <italic>E. 
coli</italic> in the inflamed gut</article-title><source>Science</source><volume>339</volume><fpage>708</fpage><lpage>11</lpage><pub-id pub-id-type="doi">10.1126/science.1232467</pub-id></element-citation></ref><ref id="bib50"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname><given-names>GD</given-names></name><name><surname>Chen</surname><given-names>J</given-names></name><name><surname>Hoffmann</surname><given-names>C</given-names></name><name><surname>Bittinger</surname><given-names>K</given-names></name><name><surname>Chen</surname><given-names>YY</given-names></name><name><surname>Keilbaugh</surname><given-names>SA</given-names></name><etal/></person-group><year>2011</year><article-title>Linking long-term dietary patterns with gut microbial enterotypes</article-title><source>Science</source><volume>334</volume><fpage>105</fpage><lpage>8</lpage><pub-id pub-id-type="doi">10.1126/science.1208344</pub-id></element-citation></ref><ref id="bib51"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname><given-names>HJ</given-names></name><name><surname>Ivanov</surname><given-names>II</given-names></name><name><surname>Darce</surname><given-names>J</given-names></name><name><surname>Hattori</surname><given-names>K</given-names></name><name><surname>Shima</surname><given-names>T</given-names></name><name><surname>Umesaki</surname><given-names>Y</given-names></name><etal/></person-group><year>2010</year><article-title>Gut-residing segmented filamentous bacteria drive autoimmune arthritis via T helper 17 cells</article-title><source>Immunity</source><volume>32</volume><fpage>815</fpage><lpage>27</lpage><pub-id pub-id-type="doi">10.1016/j.immuni.2010.06.001</pub-id></element-citation></ref><ref id="bib52"><element-citation publication-type="journal"><person-group
person-group-type="author"><name><surname>Yatsunenko</surname><given-names>T</given-names></name><name><surname>Rey</surname><given-names>FE</given-names></name><name><surname>Manary</surname><given-names>MJ</given-names></name><name><surname>Trehan</surname><given-names>I</given-names></name><name><surname>Dominguez-Bello</surname><given-names>MG</given-names></name><name><surname>Contreras</surname><given-names>M</given-names></name><etal/></person-group><year>2012</year><article-title>Human gut microbiome viewed across age and geography</article-title><source>Nature</source><volume>486</volume><fpage>222</fpage><lpage>7</lpage><pub-id pub-id-type="doi">10.1038/nature11053</pub-id></element-citation></ref><ref id="bib53"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Zanin-Zhorov</surname><given-names>A</given-names></name><name><surname>Ding</surname><given-names>Y</given-names></name><name><surname>Kumari</surname><given-names>S</given-names></name><name><surname>Attur</surname><given-names>M</given-names></name><name><surname>Hippen</surname><given-names>KL</given-names></name><name><surname>Brown</surname><given-names>M</given-names></name><etal/></person-group><year>2010</year><article-title>Protein kinase C-theta mediates negative feedback on regulatory T cell function</article-title><source>Science</source><volume>328</volume><fpage>372</fpage><lpage>6</lpage><pub-id pub-id-type="doi">10.1126/science.1186068</pub-id></element-citation></ref><ref id="bib54"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Zhu</surname><given-names>W</given-names></name><name><surname>Lomsadze</surname><given-names>A</given-names></name><name><surname>Borodovsky</surname><given-names>M</given-names></name></person-group><year>2010</year><article-title>Ab initio gene identification in metagenomic sequences</article-title><source>Nucleic Acids 
Res</source><volume>38</volume><fpage>e132</fpage><pub-id pub-id-type="doi">10.1093/nar/gkq275</pub-id></element-citation></ref></ref-list></back><sub-article article-type="article-commentary" id="SA1"><front-stub><article-id pub-id-type="doi">10.7554/eLife.01202.027</article-id><title-group><article-title>Decision letter</article-title></title-group><contrib-group content-type="section"><contrib contrib-type="editor"><name><surname>Mathis</surname><given-names>Diane</given-names></name><role>Reviewing editor</role><aff><institution>Harvard Medical School</institution>, <country>United States</country></aff></contrib></contrib-group></front-stub><body><boxed-text><p>eLife posts the editorial decision letter and author response on a selection of the published articles (subject to the approval of the authors). An edited version of the letter sent to the authors after peer review is shown, indicating the substantive concerns or comments; minor concerns are not usually shown. Reviewers have the opportunity to discuss the decision before the letter is sent (see <ext-link ext-link-type="uri" xlink:href="http://elife.elifesciences.org/review-process">review process</ext-link>). Similarly, the author response typically shows only responses to the major concerns raised by the reviewers.</p></boxed-text><p>Thank you for sending your work entitled “<italic>Prevotella copri</italic> defines a metagenomic enterotype that correlates with enhanced susceptibility to arthritis” for consideration at <italic>eLife</italic>. 
Your article has been favorably evaluated by a Senior editor and 3 reviewers, one of whom is a member of our Board of Reviewing Editors.</p><p>The following individuals responsible for the peer review of your submission have agreed to reveal their identity: Diane Mathis, Reviewing editor.</p><p>The Reviewing editor and the other reviewers discussed their comments before reaching this decision, and the Reviewing editor has assembled the following comments to help you prepare a revised submission.</p><p>We have now received the three reviewers' comments on your manuscript “<italic>Prevotella copri</italic> defines a metagenomic enterotype that correlates with enhanced susceptibility to arthritis”. The reviewers all agreed that the manuscript is very interesting and potentially important. They also concurred that the data on mouse models are weak, primarily due to their marginal/questionable significance (see detailed comments below). The human data are rather more convincing, but would be improved by additional bioinformatic analyses (detailed below). Therefore, we invite you to submit a revised version that eliminates the mouse data and addresses the following issues:</p><p>1) The Introduction could be shortened, especially the discussion on Th17 cells, since the mechanistic studies showing that <italic>Prevotella</italic> specifically regulates Th17 cells are not conclusive. Also, human data on the role of IL-17 is really only robust for psoriasis, not RA or IBD.</p><p>2) The presentation is a bit unfocused. A large part of the paper involves genomic analyses that do not advance the main story. If the goal was to determine why <italic>P. copri</italic> increases in arthritis patients, there is very little in <xref ref-type="fig" rid="fig2 fig3 fig4">Figures 2–4</xref> or the main text to convincingly suggest a specific answer. These “side experiments” distract from the more central question as to the putative mechanism linking <italic>P. 
copri</italic> to disease.</p><p>3) The paper also suffers from scientific jargon. For example, use of the word “enterotype” in the title and text is quite different from its original application in the MetaHIT paper. We would recommend removing this word given recent concerns about the methodology used in the original studies (Wu et al., Science 2011; Koren et al., PLoS Comp Biol 2013), the inconsistent usage of this term in the field, the lack of any quantitative enterotype clustering in the current paper, and the focus of this study on <italic>P. copri</italic> instead of more general patterns in community structure. Other more minor offenders are “high-throughput 16S”, “dysbiotic”, and “clade diversity”.</p><p>4) The main text states that “NORA and healthy subjects form distinct clusters” based on <xref ref-type="fig" rid="fig1">Figure 1b</xref>. This is clearly not the case, as the NORA subjects (stars) and healthy controls (circles) are distributed across the entire graph. A more accurate statement would be that samples cluster by <italic>Prevotella</italic> abundance irrespective of disease phenotype.</p><p>5) The prevalence of OTU4 in patients and controls is a key finding that is mentioned in the Abstract. This should be expanded upon and appropriate statistical tests need to be included.</p><p>6) The observation that “<italic>P. copri</italic> strains vary between individuals and retain their individuality over time” seems like an important point, especially in light of other recent findings (e.g., Faith et al. Science 2013). <xref ref-type="fig" rid="fig3">Figure 3b</xref> seems to qualitatively support this point, but no statistical testing is done. Are there quantitative and significant differences between individuals? 
What controls were done?</p><p>7) <xref ref-type="fig" rid="fig3">Figure 3c</xref>: These groups don't hold up to multiple hypotheses so this panel and Table S3 should be removed.</p><p>8) The methotrexate discussion, albeit speculative, is really fascinating and a clearly novel aspect of this work. Are there any correlations with methotrexate usage and Bacteroides abundance? Any differences in efficacy? Additional bioinformatics here could be quite useful in designing follow-up studies.</p><p>9) The NORA samples are the only ones with high systemic inflammation, as indicated by CRP levels. So the correlation might be with that rather than with arthritis per se. This also raises the question as to whether the increased <italic>P. copri</italic> levels reflect cause or effect vis-à-vis the inflammation. These points should be discussed.</p><p>10) RA is an HLA-associated disease. Are the NORA and control individuals HLA-matched, which has been the norm in such studies? At least in mice, H-2 alleles impact the gut microbiota. If the cohorts aren't HLA-matched, could the authors do a correlation assessment with the individuals they have (assuming they have HLA-typed the cohort)?</p><p>11) The interpretation of data on the CIA model in this context is confounded by the fact that a bolus of mycobacterium (CFA) was injected together with collagen.</p><p>12) The data on the CIA model are weak. The differences are barely significant as shown, i.e., in <xref ref-type="fig" rid="fig5">Figure 5d</xref> and Figure S7b are AUC values statistically significant? Why are “data from 2 of 4 representative experiments” shown? What does it look like if all data are compiled?</p><p>13) The CIA studies require a control comparator, e.g., the <italic>B. 
thetaiotaomicron</italic> used for the colitis experiments.</p></body></sub-article><sub-article article-type="reply" id="SA2"><front-stub><article-id pub-id-type="doi">10.7554/eLife.01202.028</article-id><title-group><article-title>Author response</article-title></title-group></front-stub><body><p>We appreciate the constructive comments and valuable points raised by the reviewers and the editor, and we have revised the manuscript accordingly. Overall, we agree that our work represents an initial step toward characterizing a unique microbiome profile in human RA. Our study revealed a strong association of <italic>P. copri</italic> with RA that, along with our metagenomic findings, should set the stage for future broader human studies (for replication and validation purposes) and, concomitantly, for mechanistic experiments aimed at gaining insight into possible causation.</p><p><italic>1) The Introduction could be shortened, especially the discussion on Th17 cells, since the mechanistic studies showing that</italic> Prevotella <italic>specifically regulates Th17 cells are not conclusive. Also, human data on the role of IL-17 is really only robust for psoriasis, not RA or IBD</italic>.</p><p>The Introduction has been shortened as requested, especially the paragraph detailing the role of Th17 cells in the intestinal lamina propria. However, we feel that some reference to T cells is necessary for the reader to understand the background leading up to the experiments that we report here. Specifically, the report that a single intestinal microbial commensal, SFB, can induce spontaneous arthritis in a germ-free mouse model through activation of lamina propria and peripheral Th17 cells served as the inspiration to look for a similar microbe and mechanism in humans.</p><p><italic>2) The presentation is a bit unfocused. A large part of the paper involves genomic analyses that do not advance the main story. If the goal was to determine why</italic> P.
copri <italic>increases in arthritis patients, there is very little in</italic> <xref ref-type="fig" rid="fig2 fig3 fig4"><italic>Figures 2–4</italic></xref> <italic>or the main text to convincingly suggest a specific answer. These “side experiments” distract from the more central question as to the putative mechanism linking</italic> P. copri <italic>to disease</italic>.</p><p>We agree that there is little data to convincingly suggest a specific answer as to why this association is observed. At this stage, we are seeking to discover appealing hypotheses that can be tested in future studies. To that end, <xref ref-type="fig" rid="fig2">Figure 2</xref> demonstrates that the <italic>Prevotella</italic> in our cohort, known only by 16S sequence, is actually one specific taxon: <italic>Prevotella copri</italic>. <xref ref-type="fig" rid="fig3">Figure 3</xref> allows for the possibility that unique genes encoded by these particular bacteria may influence the association, while <xref ref-type="fig" rid="fig4">Figure 4</xref> provides data in support of the notion that <italic>P. copri</italic> thrives in an inflammatory environment, and may exacerbate inflammation. Experiments to uncover the mechanistic basis of this association will require considerably more work, and we would like to leave readers with a sense of what may be possible and what our best leads are for future investigation.</p><p><italic>3) The paper also suffers from scientific jargon. For example, use of the word “enterotype” in the title and text is quite different from its original application in the MetaHIT paper. We would recommend removing this word given recent concerns about the methodology used in the original studies (Wu et al., Science 2011; Koren et al., PLoS Comp Biol 2013), the inconsistent usage of this term in the field, the lack of any quantitative enterotype clustering in the current paper, and the focus of this study on</italic> P. 
copri <italic>instead of more general patterns in community structure. Other more minor offenders are “high-throughput 16S”, “dysbiotic”, and “clade diversity”</italic>.</p><p>We agree with the reviewers that precise terminology is important for conveying our findings clearly. In particular, we are aware that the word ‘enterotype’ has been questioned by recently published work. We have now removed references to enterotype from the title and text, as requested, and clarified instances in which we utilized terminology such as high-throughput 16S, dysbiosis, and clade diversity. In addition, we have sought to explain the utility of the various bioinformatics tools.</p><p><italic>4) The main text states that “NORA and healthy subjects form distinct clusters” based on</italic> <xref ref-type="fig" rid="fig1"><italic>Figure 1b</italic></xref><italic>. This is clearly not the case, as the NORA subjects (stars) and healthy controls (circles) are distributed across the entire graph. A more accurate statement would be that samples cluster by</italic> Prevotella <italic>abundance irrespective of disease phenotype</italic>.</p><p>We have changed the sentence as requested to better characterize this finding.</p><p><italic>5) The prevalence of OTU4 in patients and controls is a key finding that is mentioned in the Abstract. This should be expanded upon and appropriate statistical tests need to be included</italic>.</p><p>We have expanded upon this observation at the end of the first paragraph of the Results section and performed chi-squared tests. Briefly, the comparisons of NORA vs. HLT, CRA, and PsA are statistically significant (p&lt;0.05), while pairwise comparisons between other groups are not significant.</p><p><italic>6) The observation that “</italic>P. copri <italic>strains vary between individuals and retain their individuality over time” seems like an important point, especially in light of other recent findings (e.g., Faith et al.
Science 2013).</italic> <xref ref-type="fig" rid="fig3"><italic>Figure 3b</italic></xref> <italic>seems to qualitatively support this point, but no statistical testing is done. Are there quantitative and significant differences between individuals? What controls were done</italic>?</p><p>We have updated the legend to this figure and our Methods to reflect statistical testing. Briefly, if each of 61 genes is considered independently and can be in one of two states (i.e., present or absent) with equal probability, the probability of an exact match between any two individuals is 2<sup>-61</sup>, or 2<sup>-60</sup> with one mismatch. Qualitatively, it can be seen that any intra- or inter-individual comparison is highly statistically significant. Further, if we concede that genes within an island are not truly independent, and there are six such islands which are considered identical with 1–2 mismatches allowed, the probability of such a match is 2<sup>-6</sup>, or 0.015625, which is below the 0.05 threshold for significance.</p><p><italic>7)</italic> <xref ref-type="fig" rid="fig3"><italic>Figure 3c</italic></xref><italic>: These groups don't hold up to multiple hypotheses so this panel and Table S3 should be removed</italic>.</p><p>We assume the reviewers mean that the FDR-adjusted p-values (q-values) are not less than 0.05, the conventional threshold for statistical significance. In this instance, we feel that a higher threshold for significance is justified. FDR is intended to work in a different way than a standard p-value generated by, for example, a t-test. The threshold can be viewed as the expected proportion of reported biomarkers that are false positives. Given the number of hits returned in this analysis (i.e., 19), we expect only 4.75 (i.e., 25%) to be false positives, with the great majority expected to be true.
In exploratory analyses such as the one we conducted for this paper, a higher FDR threshold is often used—for example, the popular GSEA (Gene Set Enrichment Analysis) software uses an FDR cutoff of 0.25 by default (<ext-link ext-link-type="uri" xlink:href="http://www.pnas.org/content/102/43/15545">http://www.pnas.org/content/102/43/15545</ext-link>). Additionally, the four ORFs we chose for discussion are components of the same pathway, which appear adjacent to one another on the same metagenomic contigs. While it is difficult to devise a statistical test for such a situation, biological intuition suggests that these may be meaningful. We do concede that without validation of these biomarker ORFs our claims cannot be stated too strongly, and we have tried to phrase our conclusions as such.</p><p><italic>8) The methotrexate discussion, albeit speculative, is really fascinating and a clearly novel aspect of this work. Are there any correlations with methotrexate usage and Bacteroides abundance? Any differences in efficacy? Additional bioinformatics here could be quite useful in designing follow-up studies</italic>.</p><p>The reviewers raise a very important point, namely that usage of methotrexate may be associated with both Bacteroides abundance and differences in treatment efficacy. The question of methotrexate efficacy, however, can only be addressed by prospective cohort design, as suggested by the reviewers. Similarly, and given the cross-sectional nature of our current study, the alteration of gut flora by the use of methotrexate cannot be answered by data accrued at this time. We agree with the reviewers that this is perhaps one of the most potentially relevant aspects of our work. We are currently engaged in prospective follow up studies to determine the effects of methotrexate in modulating gut microbiota.</p><p><italic>9) The NORA samples are the only ones with high systemic inflammation, as indicated by CRP levels. 
So the correlation might be with that rather than with arthritis</italic> per se<italic>. This also raises the question as to whether the increased</italic> P. copri <italic>levels reflect cause or effect vis-à-vis the inflammation. These points should be discussed</italic>.</p><p>We thank the reviewers for raising these important questions. In fact, during the study design phase, we discussed at length which diseases would represent the most appropriate control groups, specifically to address systemic inflammation as a possible modulator of the gut microbiome. In our study, NORA samples had (as expected) overall higher disease activity scores, reflecting untreated RA. To address inflammation as a confounder, we included two reasonable positive control groups. CRA samples in our study had, by inclusion criteria, longer disease duration and had been under various treatment regimens at the time of enrollment. We also enrolled recent-onset, mostly untreated PsA samples as our second control group. In both cases, as reflected in <xref ref-type="table" rid="tbl1">Table 1</xref>, disease activity scores were slightly lower than those found in the NORA group. Although we believe this comparison addresses the issue of systemic inflammation as a modulator of the gut microbiome, it is still possible that the microbiota changes observed in newly diagnosed RA patients instead represent a consequence of a unique, NORA-specific systemic inflammatory response. A paragraph has now been included discussing this alternative possibility. A second, related issue that requires further investigation is the role of CRP in the modulation of microbiota. Importantly, while DAS28 scores were slightly lower in CRA and PsA patients, the most remarkable difference was found in levels of CRP. We are particularly intrigued by the possibility that CRP itself may have microbe-modulating properties. 
CRP is synthesized by the liver in response to factors released by macrophages and adipocytes. It is a member of the pentraxin protein family and was first identified in the plasma of patients with <italic>Streptococcus pneumoniae</italic> infection; it was named for its ability to precipitate the somatic C-fraction of the pneumococcal cell wall. Curiously, CRP was the first pattern recognition receptor (PRR) to be identified. The primary bacterial ligand for CRP is now recognized to be phosphocholine, a component of several bacterial cell wall structures. The physiological role of CRP consists of binding phosphocholine and activating the complement system, leading to phagocytosis. Interestingly, and unlike many other autoimmune diseases (such as systemic lupus erythematosus (SLE), scleroderma, polymyositis, dermatomyositis and PsA), CRP is characteristically high in RA. Whether or not CRP itself represents a specific response to the presence of <italic>P. copri</italic> or other taxa is an area of future investigation. We have now added a paragraph addressing the reviewers’ comments.</p><p><italic>10) RA is an HLA-associated disease. Are the NORA and control individuals HLA-matched, which has been the norm in such studies? At least in mice, H-2 alleles impact the gut microbiota. If the cohorts aren't HLA-matched, could the authors do a correlation assessment with the individuals they have (assuming they have HLA-typed the cohort)</italic>?</p><p>RA is considered a complex polygenic multifactorial autoimmune disease. Certain alleles within the HLA Class II locus confer higher risk for disease, in particular those belonging to DRB1 (i.e., shared epitope, or SE, alleles). However, genetic variance can only explain 20–30% of the cases. To address the reviewers’ points, we have now included HLA-sequencing data. 
Indeed, consistent with recently published mouse data, the presence of SE risk-alleles seems to have an impact on the composition of gut microbiota. Our NORA cohort shows a significant inverse correlation between <italic>P. copri</italic> relative abundance and presence of shared epitope alleles. Intriguingly, a subgroup analysis of NORA patients according to presence/absence of SE alleles revealed a significantly higher relative abundance of <italic>P. copri</italic> in those subjects lacking predisposing genes (P&lt;0.001). It is possible therefore that, as in mice, microbiota abundance correlates with certain MHC alleles that favor an expansion of specific taxa. This could also represent a gene–environment interaction for RA incidence, as reported for other factors such as smoking. Although we cannot prove causation, it is conceivable that a certain threshold of <italic>P. copri</italic> abundance may be necessary to overcome the lack of genetic predisposition in RA subjects, while a lower abundance may be sufficient to trigger disease in those carrying risk-alleles. Validation in expanded cohorts and mechanistic studies are needed to better understand the significance of these findings. A new figure and a paragraph expanding our findings are now included in the main text.</p><p><italic>11) The interpretation of data on the CIA model in this context is confounded by the fact that a bolus of mycobacterium (CFA) was injected together with collagen</italic>.</p><p>The CIA results have been removed.</p><p><italic>12) The data on the CIA model are weak. The differences are barely significant as shown, i.e., in</italic> <xref ref-type="fig" rid="fig5"><italic>Figure 5d</italic></xref> <italic>and</italic> <italic>Figure S7b</italic> <italic>are AUC values statistically significant? Why are “data from 2 of 4 representative experiments” shown? 
What does it look like if all data are compiled</italic>?</p><p>We agree; we have removed the CIA data.</p><p><italic>13) The CIA studies require a control comparator, e.g., the</italic> B. thetaiotaomicron <italic>used for the colitis experiments</italic>.</p><p>The CIA results have been removed.</p></body></sub-article></article> |