<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.1d1 20130915//EN" "JATS-archivearticle1.dtd"><article article-type="research-article" dtd-version="1.1d1" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><front><journal-meta><journal-id journal-id-type="nlm-ta">elife</journal-id><journal-id journal-id-type="hwp">eLife</journal-id><journal-id journal-id-type="publisher-id">eLife</journal-id><journal-title-group><journal-title>eLife</journal-title></journal-title-group><issn publication-format="electronic">2050-084X</issn><publisher><publisher-name>eLife Sciences Publications, Ltd</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="publisher-id">02030</article-id><article-id pub-id-type="doi">10.7554/eLife.02030</article-id><article-categories><subj-group subj-group-type="display-channel"><subject>Research article</subject></subj-group><subj-group subj-group-type="heading"><subject>Biophysics and structural biology</subject></subj-group><subj-group subj-group-type="heading"><subject>Genomics and evolutionary biology</subject></subj-group></article-categories><title-group><article-title>Robust and accurate prediction of residue–residue interactions across protein interfaces using evolutionary information</article-title></title-group><contrib-group><contrib contrib-type="author" equal-contrib="yes" id="author-8806"><name><surname>Ovchinnikov</surname><given-names>Sergey</given-names></name><xref ref-type="aff" rid="aff1"/><xref ref-type="aff" rid="aff2"/><xref ref-type="fn" rid="equal-contrib">†</xref><xref ref-type="other" rid="par-1"/><xref ref-type="other" rid="par-3"/><xref ref-type="fn" rid="con1"/><xref ref-type="fn" rid="conf1"/><xref ref-type="other" rid="dataro1"/></contrib><contrib contrib-type="author" equal-contrib="yes" 
id="author-9601"><name><surname>Kamisetty</surname><given-names>Hetunandan</given-names></name><xref ref-type="aff" rid="aff1"/><xref ref-type="aff" rid="aff3"/><xref ref-type="fn" rid="equal-contrib">†</xref><xref ref-type="other" rid="par-1"/><xref ref-type="fn" rid="con2"/><xref ref-type="fn" rid="conf1"/><xref ref-type="other" rid="dataro1"/></contrib><contrib contrib-type="author" corresp="yes" id="author-8249"><name><surname>Baker</surname><given-names>David</given-names></name><xref ref-type="aff" rid="aff1"/><xref ref-type="corresp" rid="cor1">*</xref><xref ref-type="other" rid="par-1"/><xref ref-type="other" rid="par-2"/><xref ref-type="other" rid="par-3"/><xref ref-type="fn" rid="con3"/><xref ref-type="fn" rid="conf1"/><xref ref-type="other" rid="dataro1"/></contrib><aff id="aff1"><institution content-type="dept">Department of Biochemistry</institution>, <institution>Howard Hughes Medical Institute, University of Washington</institution>, <addr-line><named-content content-type="city">Seattle</named-content></addr-line>, <country>United States</country></aff><aff id="aff2"><institution content-type="dept">Molecular and Cellular Biology Program</institution>, <institution>University of Washington</institution>, <addr-line><named-content content-type="city">Seattle</named-content></addr-line>, <country>United States</country></aff><aff id="aff3"><institution>Facebook Inc.</institution>, <addr-line><named-content content-type="city">Seattle</named-content></addr-line>, <country>United States</country></aff></contrib-group><contrib-group content-type="section"><contrib contrib-type="editor"><name><surname>Roux</surname><given-names>Benoit</given-names></name><role>Reviewing editor</role><aff><institution>University of Chicago</institution>, <country>United States</country></aff></contrib></contrib-group><author-notes><corresp id="cor1"><label>*</label>For correspondence: <email>dabaker@u.washington.edu</email></corresp><fn fn-type="con" 
id="equal-contrib"><label>†</label><p>These authors contributed equally to this work</p></fn></author-notes><pub-date date-type="pub" publication-format="electronic"><day>01</day><month>05</month><year>2014</year></pub-date><pub-date pub-type="collection"><year>2014</year></pub-date><volume>3</volume><elocation-id>e02030</elocation-id><history><date date-type="received"><day>08</day><month>12</month><year>2013</year></date><date date-type="accepted"><day>22</day><month>04</month><year>2014</year></date></history><permissions><copyright-statement>© 2014, Ovchinnikov et al</copyright-statement><copyright-year>2014</copyright-year><copyright-holder>Ovchinnikov et al</copyright-holder><license xlink:href="http://creativecommons.org/licenses/by/3.0/"><license-p>This article is distributed under the terms of the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/3.0/">Creative Commons Attribution License</ext-link>, which permits unrestricted use and redistribution provided that the original author and source are credited.</license-p></license></permissions><self-uri content-type="pdf" xlink:href="elife02030.pdf"/><abstract><object-id pub-id-type="doi">10.7554/eLife.02030.001</object-id><p>Do the amino acid sequence identities of residues that make contact across protein interfaces covary during evolution? If so, such covariance could be used to predict contacts across interfaces and assemble models of biological complexes. We find that residue pairs identified using a pseudo-likelihood-based method to covary across protein–protein interfaces in the 50S ribosomal unit and 28 additional bacterial protein complexes with known structure are almost always in contact in the complex, provided that the number of aligned sequences is greater than the average length of the two proteins. 
We use this method to make subunit contact predictions for an additional 36 protein complexes with unknown structures, and present models based on these predictions for the tripartite ATP-independent periplasmic (TRAP) transporter, the tripartite efflux system, the pyruvate formate lyase-activating enzyme complex, and the methionine ABC transporter.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.02030.001">http://dx.doi.org/10.7554/eLife.02030.001</ext-link></p></abstract><abstract abstract-type="executive-summary"><object-id pub-id-type="doi">10.7554/eLife.02030.002</object-id><title>eLife digest</title><p>Proteins are considered the ‘workhorse molecules’ of life and they are involved in virtually everything that cells do. Proteins are strings of amino acids that have folded into a specific three-dimensional shape. Proteins must have the correct shape to function properly, as they often work by binding to other proteins or molecules—much like a key fitting into a lock. Working out the structure of a protein can, therefore, provide major insights into how the protein does its job.</p><p>Two or more proteins can bind together and form a complex to perform various tasks; and solving the structures of these complexes can be challenging, even if the structures of the protein subunits are known. Now, Ovchinnikov, Kamisetty, and Baker have developed a method for predicting which parts of the proteins make contact with each other in a two-protein complex.</p><p>Different species can have copies of the same proteins; but a copy from one species might have different amino acids at certain positions when compared to a related copy from another species. As such, when pairs of interacting proteins from different species are compared, there will be many positions in the two proteins that vary. 
However, if the amino acid at a position in one protein (let's call it ‘X’) varies, and the amino acid at, say, position ‘Y’ in the other protein also varies such that for any given amino acid at position Y there is often a specific amino acid at position X; positions X and Y are said to ‘co-vary’. Ovchinnikov et al. noticed that when a pair of amino acids (one from each protein in a two-protein complex) co-varied, these two amino acids tended to make contact with each other at the protein–protein interface.</p><p>Ovchinnikov et al. used the new method to make predictions about the protein–protein interfaces in 28 protein complexes found in bacteria, and also to make a prediction about the interface between protein subunits in the bacterial ribosome. When these predictions were checked against the actual structures, which were all known beforehand, they were found to be accurate if the number of copies of each protein being compared is greater than the average length of the two proteins.</p><p>Ovchinnikov et al. went on to predict the amino acids on the protein–protein interfaces for another 36 bacterial protein complexes with unknown structures, and to present models for four larger complexes. The next challenge is to extend the method to protein complexes that are found only in eukaryotes (i.e., not in bacteria). 
Since the number of related copies for eukaryotic proteins tends to be smaller, there are fewer proteins to compare and it is therefore harder to detect ‘covariation’ when it occurs.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.02030.002">http://dx.doi.org/10.7554/eLife.02030.002</ext-link></p></abstract><kwd-group kwd-group-type="author-keywords"><title>Author keywords</title><kwd>protein coevolution</kwd><kwd>protein complexes</kwd><kwd>pseudo-likelihood</kwd></kwd-group><kwd-group kwd-group-type="research-organism"><title>Research organism</title><kwd>other</kwd></kwd-group><funding-group><award-group id="par-1"><funding-source><institution-wrap><institution-id institution-id-type="FundRef">http://dx.doi.org/10.13039/100000002</institution-id><institution>National Institutes of Health</institution></institution-wrap></funding-source><award-id>1R01GM092802-04</award-id><principal-award-recipient><name><surname>Ovchinnikov</surname><given-names>Sergey</given-names></name><name><surname>Kamisetty</surname><given-names>Hetunandan</given-names></name><name><surname>Baker</surname><given-names>David</given-names></name></principal-award-recipient></award-group><award-group id="par-2"><funding-source><institution-wrap><institution-id institution-id-type="FundRef">http://dx.doi.org/10.13039/100000002</institution-id><institution>National Institutes of Health</institution></institution-wrap></funding-source><award-id>National Institute of General Medical Sciences, P41 GM103533</award-id><principal-award-recipient><name><surname>Baker</surname><given-names>David</given-names></name></principal-award-recipient></award-group><award-group id="par-3"><funding-source><institution-wrap><institution-id institution-id-type="FundRef">http://dx.doi.org/10.13039/100000774</institution-id><institution>Defense Threat Reduction Agency 
(DTRA)</institution></institution-wrap></funding-source><award-id>N00024-10-D-6318/0024-02</award-id><principal-award-recipient><name><surname>Ovchinnikov</surname><given-names>Sergey</given-names></name><name><surname>Baker</surname><given-names>David</given-names></name></principal-award-recipient></award-group><funding-statement>The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.</funding-statement></funding-group><custom-meta-group><custom-meta><meta-name>elife-xml-version</meta-name><meta-value>2</meta-value></custom-meta><custom-meta specific-use="meta-only"><meta-name>Author impact statement</meta-name><meta-value>Co-evolving residue pairs in the different components of a protein complex almost always make contact across the protein–protein interface, thus providing powerful restraints for the modeling of protein complexes.</meta-value></custom-meta></custom-meta-group></article-meta></front><body><sec id="s1" sec-type="intro"><title>Introduction</title><p>Recent work has demonstrated the accuracy of coevolution-based contact prediction for monomeric proteins using a global statistical model (<xref ref-type="bibr" rid="bib51">Thomas et al., 2008</xref>) to distinguish between direct and indirect couplings (<xref ref-type="bibr" rid="bib30">Marks et al., 2011</xref>; <xref ref-type="bibr" rid="bib33">Morcos et al., 2011</xref>; <xref ref-type="bibr" rid="bib20">Hopf et al., 2012</xref>; <xref ref-type="bibr" rid="bib36">Nugent and Jones, 2012</xref>; <xref ref-type="bibr" rid="bib23">Jones et al., 2012</xref>; <xref ref-type="bibr" rid="bib26">Lapedes et al., 2012</xref>; <xref ref-type="bibr" rid="bib31">Marks et al., 2012</xref>; <xref ref-type="bibr" rid="bib49">Sułkowska et al., 2012</xref>; <xref ref-type="bibr" rid="bib24">Kamisetty et al., 2013</xref>). 
While early approaches relied on estimating an inverse covariance matrix (<xref ref-type="bibr" rid="bib30">Marks et al., 2011</xref>; <xref ref-type="bibr" rid="bib33">Morcos et al., 2011</xref>; <xref ref-type="bibr" rid="bib23">Jones et al., 2012</xref>), more recent studies have shown that a pseudo-likelihood-based approach (<xref ref-type="bibr" rid="bib1">Balakrishnan et al., 2011</xref>) results in more accurate predictions (<xref ref-type="bibr" rid="bib16">Ekeberg et al., 2013</xref>; <xref ref-type="bibr" rid="bib24">Kamisetty et al., 2013</xref>) for a range of alignment sizes and protein lengths.</p><p>In contrast to this rich body of work for monomeric proteins, relatively little is known about the utility of such statistical models in predicting protein–protein interactions. The more general problem of predicting if two proteins interact with each other has been studied extensively using a wide variety of approaches (<xref ref-type="bibr" rid="bib12">de Juan et al., 2013</xref>; <xref ref-type="bibr" rid="bib21a">Hosur et al., 2012</xref>; <xref ref-type="bibr" rid="bib62">Zhang et al., 2012</xref>; <xref ref-type="bibr" rid="bib44">Shoemaker and Panchenko, 2007</xref>, <xref ref-type="bibr" rid="bib54">Valencia and Pazos, 2002</xref>, <xref ref-type="bibr" rid="bib37">Ochoa and Pazos, 2010</xref>). Amino acid residue coevolution has been used to predict residue–residue interactions across interfaces with local statistical models (<xref ref-type="bibr" rid="bib39">Pazos et al., 1997</xref>; <xref ref-type="bibr" rid="bib19">Halperin et al., 2006</xref>). 
As noted above, the accuracy of these models is reduced by the confounding of direct and indirect correlations (<xref ref-type="bibr" rid="bib27">Lapedes et al., 1999</xref>; <xref ref-type="bibr" rid="bib57">Weigt et al., 2009</xref>); the application of global statistical models to coevolution-based contact prediction across interfaces has been limited to the case of the histidine-kinase/response-regulator two component system (<xref ref-type="bibr" rid="bib7">Burger and van Nimwegen, 2008</xref>; <xref ref-type="bibr" rid="bib57">Weigt et al., 2009</xref>; <xref ref-type="bibr" rid="bib43">Schug et al., 2009</xref>; <xref ref-type="bibr" rid="bib10">Dago et al., 2012</xref>).</p><p>In this study, we examine residue–residue covariation across protein–protein interfaces using a pseudo-likelihood-based statistical method. In a large set of complexes of known structure, we find that covarying pairs of positions are almost always in contact in the three-dimensional structure, provided there are sufficient aligned sequences. We find further that significant residue–residue covariance occurs frequently between physically interacting protein pairs but very rarely between non-interacting pairs, and hence should be useful for predicting whether two proteins interact. We use the pseudo-likelihood method to predict contacts across protein-interfaces for 36 evolutionarily conserved complexes of unknown structure and present structure models for four of the complexes particularly well constrained by these data.</p></sec><sec id="s2" sec-type="results"><title>Results</title><p>For a single protein family, it is straightforward to generate a multiple sequence alignment and subsequently identify covarying residue pairs. 
Identifying covarying residue pairs between two proteins A and B is not as easy: only organisms that contain orthologs of both protein A and protein B contribute, and in generating the alignments the protein A and protein B sequences for each organism must be properly paired. To simplify the ortholog identification problem, we focus on pairs of genes with conserved chromosomal locations separated in the genome by fewer than 20 other annotated genes. We then build GREMLIN global statistical models for sequences in the paired protein families. The models have ‘one-body’ parameters for each amino acid at each position in the two proteins, and ‘two-body’ parameters for each pair of amino acids at each pair of positions in the two proteins. These parameters are obtained by maximizing the pseudo-likelihood of the observed sequence pairs, rather than their likelihood, which makes the otherwise formidable estimation problem tractable. In the following sections, we investigate the structural contexts of residue pairs with large values of these two-body coupling parameters.</p><sec id="s2-1"><title>Residue–residue covariation in the bacterial 50S ribosomal unit</title><p>We began by studying residue–residue coupling parameters in the bacterial 50S ribosomal subunit—the largest evolutionarily conserved bacterial multiprotein complex with an atomic resolution structure. For each individual protein in the complex, we constructed multiple sequence alignments by querying the UniProt sequence database (<xref ref-type="bibr" rid="bib58">Wu et al., 2006</xref>) for homologous sequences. For every pair of proteins in the complex, we then constructed a paired multiple sequence alignment (‘Materials and methods’). For each such paired alignment, we built a GREMLIN global statistical model, computed normalized coupling strengths from the two-body coupling parameters, and ranked inter-protein residue pairs based on these scores (‘Materials and methods’). 
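The normalization and ranking step just described can be illustrated with a minimal sketch. It is not the GREMLIN implementation: the function name <monospace>rank_interprotein_pairs</monospace>, the toy score matrix, and the choice to normalize by the mean over all position pairs are assumptions made for illustration, chosen only so that a strength of one corresponds to average coupling; the actual normalization is the one given in ‘Materials and methods’. <preformat preformat-type="code">
```python
from statistics import mean


def rank_interprotein_pairs(raw, len_a):
    """Normalize raw coupling scores and rank inter-protein residue pairs.

    raw:   square list of lists of non-negative coupling scores over the
           concatenated positions of protein A followed by protein B
           (hypothetical input, not GREMLIN's own output format).
    len_a: number of positions that belong to protein A.

    Returns (i, j, strength) tuples sorted by decreasing strength, where
    i indexes protein A and j indexes protein B (both 0-based).
    """
    n = len(raw)
    # ASSUMPTION: divide by the mean over all distinct position pairs,
    # so that a strength of 1.0 means "average coupling".
    avg = mean(raw[i][j] for i in range(n) for j in range(i + 1, n))
    inter = [
        (i, j - len_a, raw[i][j] / avg)
        for i in range(len_a)
        for j in range(len_a, n)
    ]
    return sorted(inter, key=lambda t: t[2], reverse=True)


# Toy example: protein A has positions 0-1, protein B has positions 2-3.
toy = [
    [0.0, 0.2, 0.1, 0.9],
    [0.2, 0.0, 0.3, 0.1],
    [0.1, 0.3, 0.0, 0.2],
    [0.9, 0.1, 0.2, 0.0],
]
ranked = rank_interprotein_pairs(toy, len_a=2)
```
</preformat> In this toy matrix the pair formed by position 0 of protein A and position 1 of protein B has by far the largest raw score and therefore ranks first. 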
A coupling strength larger than one indicates higher-than-average coupling between two residues.</p><p>We find that in the 50S ribosomal subunit only a small fraction of residue pairs coevolve, as indicated by coupling strengths (y axis of <xref ref-type="fig" rid="fig1">Figure 1A</xref>) greater than 1.5. Remarkably, the two residues in each of these pairs are almost all within 8 Å of each other in the 50S crystal structure (<xref ref-type="fig" rid="fig1">Figure 1A</xref>) and all are within 12 Å. The locations of the covarying residue pairs in the 50S structure (with the individual proteins pulled apart for clarity) are shown in <xref ref-type="fig" rid="fig1">Figure 1B</xref>; yellow lines indicate distances less than 8 Å and orange lines, distances less than 12 Å. For the 50S ribosome, the GREMLIN model was built using sequence data from ∼1500 non-redundant genomes; <xref ref-type="fig" rid="fig1">Figure 1D</xref> suggests that for complexes with such large numbers of aligned sequences, residue–residue interactions across interfaces can be predicted with quite high confidence based on amino acid sequence covariation.<fig-group><fig id="fig1" position="float"><object-id pub-id-type="doi">10.7554/eLife.02030.003</object-id><label>Figure 1.</label><caption><title>Residue pairs with high normalized coupling strengths are in contact in the 50S ribosomal subunit.</title><p>(<bold>A</bold>) Coupling strengths and inter-residue distances for each residue pair in the 50S subunit (black dots). Residue pairs with coupling strength greater than 1.5 are nearly always less than 8 Å apart. (<bold>B</bold>) Locations of coevolving (high coupling strength) residue pairs in the protein component of the 50S subunit. The monomers have been pulled apart slightly for clarity. Lines connect residue pairs with coupling strength greater than 1.5; yellow, distance less than 8 Å; orange, distance less than 12 Å. 
(<bold>C</bold>) Protein pairs with strong inter-residue covariation (colors) make contact in the three-dimensional structure (black boxes). For each pair of 50S subunit proteins, the sum of the coupling strengths greater than 1.5 is indicated; black boxes indicate contacts in the crystal structure. (<bold>D</bold>) Dependence of contact prediction accuracy on coupling strength and the number of sequences in the alignments. For each of the indicated coupling strength cutoffs (colors), the frequency of contact in the 50S structure (y axis) was computed for sub-alignments with different sequence depths (x axis).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.02030.003">http://dx.doi.org/10.7554/eLife.02030.003</ext-link></p></caption><graphic xlink:href="elife02030f001"/></fig><fig id="fig1s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.02030.004</object-id><label>Figure 1—figure supplement 1.</label><caption><title>Determining GREMLIN scores from normalized coupling strengths.</title><p>Top row: (<bold>A</bold>) Normalized coupling strengths. (<bold>B</bold>) GREMLIN score obtained by fitting a sigmoidal function of normalized coupling strengths to observed frequencies on the 50S ribosome (left column) evaluated on the benchmark set (complexes from the NADH dehydrogenase, middle column and the remaining, right column). (<bold>C</bold>) The GREMLIN score is well-calibrated: the fraction of predictions with a GREMLIN score of x that are correct (distance &lt;12 Å) is roughly x (x in [0, 1]). 
The overall behavior is similar across the three datasets.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.02030.004">http://dx.doi.org/10.7554/eLife.02030.004</ext-link></p></caption><graphic xlink:href="elife02030fs001"/></fig></fig-group></p><p>For a large protein–protein complex, can the sum of the coupling strengths between pairs of proteins in the complex be used to distinguish directly interacting and non-interacting protein pairs? In the 50S subunit, every pair of proteins with summed coupling strengths (numbers in <xref ref-type="fig" rid="fig1">Figure 1C</xref>) greater than 1.5 interacts with each other (boxes in <xref ref-type="fig" rid="fig1">Figure 1C</xref>). There are, however, several instances of protein pairs that contact in the 50S subunit for which no covariance is observed; clearly not every interaction will be identified by the sum of the coupling strengths, for example between two proteins that are held together primarily by the ribosomal RNA.</p><p>How many aligned sequences are required for accurate contact prediction? To assess the dependence on alignment depth, we generated paired sub-alignments with varying numbers of sequences for every pair of 50S proteins and recomputed coupling strengths for each sub-alignment. For each alignment depth, we calculated the fraction of residue pairs within 12 Å for different ranges of coupling strengths. We find that the greater the number of aligned sequences, the lower the value of the coupling strength above which residue pairs are likely to be in contact in the structure (<xref ref-type="fig" rid="fig1">Figure 1D</xref>). 
For example, if the number of aligned sequences is greater than the sum of the lengths of the two proteins, residue–residue contact predictions are likely to be accurate if the coupling strength is 2 or greater (<xref ref-type="fig" rid="fig1">Figure 1D</xref>: orange dots), while if there are twice as many sequences, contact predictions are accurate above a coupling strength of 1.5 (the cutoff shown in <xref ref-type="fig" rid="fig1">Figure 1A</xref>). A sigmoidal function of the coupling strength and the number of sequences per position in the complex accurately fits the observed contact frequency data (‘Materials and methods’ and <xref ref-type="fig" rid="fig1s1">Figure 1—figure supplement 1</xref>); we refer to the fitted values as GREMLIN scores for the remainder of the paper.</p></sec><sec id="s2-2"><title>Bacterial complex benchmark</title><p>We next generated paired alignments for all <italic>E. coli</italic> gene pairs that had conserved intergenic distances across genomes deposited in UniProt (‘Materials and methods’). As the 50S results (<xref ref-type="fig" rid="fig1">Figure 1D</xref>) suggested that alignment depths greater than the average of the lengths of the two proteins were required for accurate prediction, we focused on paired alignments with at least this number of sequences—1126 gene pairs in total excluding the ribosomal proteins. For each of these 1126 pairs, we generated GREMLIN global statistical models and determined the coupling strength for each residue pair.</p><p>For 64 of the 1126 gene pairs, at least one pair of residues had GREMLIN score >0.85. For 28 of the 64 pairs, three-dimensional structures have been determined experimentally, and the locations of the residue pairs with GREMLIN score >0.6 for several of these complexes are shown in <xref ref-type="fig" rid="fig2">Figure 2A</xref> (pairs within 8 Å are in yellow, between 8 Å and 12 Å in orange, and greater than 12 Å, in red). 
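The distance-based color scheme used for these residue pairs can be written down as a short sketch. The function names and the toy coordinates are hypothetical; the minimal-distance definition follows the figure legend (closest pair of side-chain heavy atoms), and treating a distance of exactly 8 Å as ‘within 8 Å’ is an assumption. <preformat preformat-type="code">
```python
import math


def min_distance(atoms_a, atoms_b):
    """Smallest Euclidean distance between two sets of (x, y, z) atoms.

    Each argument is meant to hold the side-chain heavy-atom coordinates
    (in angstroms) of one residue.
    """
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def contact_color(distance):
    """Map an inter-residue distance in angstroms to the figure color."""
    if distance > 12.0:
        return "red"      # greater than 12: not in contact in the structure
    if distance > 8.0:
        return "orange"   # between 8 and 12
    return "yellow"       # within 8 (boundary inclusion is an assumption)


# Toy coordinates for two residues on different chains.
res_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
res_b = [(7.5, 0.0, 0.0), (11.0, 0.0, 0.0)]
d = min_distance(res_a, res_b)
color = contact_color(d)
```
</preformat> 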
Almost all pairs with GREMLIN scores greater than 0.6 are in contact in the complex structures, with the notable exception of the NADH dehydrogenase subunits (<xref ref-type="fig" rid="fig2">Figure 2B</xref>). The complex is thought to undergo a cascade of conformational changes during electron transfer (<xref ref-type="bibr" rid="bib2">Baradaran et al., 2013</xref>); the high GREMLIN score contacts not made in the solved structure may provide insight into the nature of these changes. As observed for the 50S complex (<xref ref-type="fig" rid="fig1">Figure 1C</xref>), the existence of one or more high GREMLIN scores between two proteins provides evidence that the proteins interact: 44% (28/64) of the protein pairs with high GREMLIN scores form a complex which has been solved crystallographically, compared to 8% (78/1126) over the whole set.<fig id="fig2" position="float"><object-id pub-id-type="doi">10.7554/eLife.02030.005</object-id><label>Figure 2.</label><caption><title>Residue covariation in complexes with known structures.</title><p>(<bold>A</bold>) Residue pairs across protein chains with high GREMLIN scores almost always make contact across protein interfaces in experimentally determined complex structures. All contacts with GREMLIN scores greater than 0.6 are shown; the structures are pulled apart for clarity. Labels are according to chains in the PDB structure. (<bold>B</bold>) Complex I of the electron transport chain has an unusually large number of highly co-varying inter-residue pairs not in contact in the crystal structure of 4HEA; these contacts may be formed in a different state of the complex. Residue pairs within 8 Å are in yellow, between 8 Å and 12 Å in orange, and greater than 12 Å, in red. Distances are the minimal distances between any pair of side-chain heavy atoms. Labels are according to chains in 4HEA. (<bold>C</bold>) Dependence of inter-residue distance distributions on GREMLIN score. 
All residue–residue pairs between subunits in the benchmark set were grouped into four bins based on their GREMLIN score (colors), and the distribution of residue–residue distances (x axis) within each bin computed from the three-dimensional structures. See <xref ref-type="supplementary-material" rid="SD1-data">Figure 2—source data 1</xref> for the table of all the interfaces used in the calculation.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.02030.005">http://dx.doi.org/10.7554/eLife.02030.005</ext-link></p><p><supplementary-material id="SD1-data"><object-id pub-id-type="doi">10.7554/eLife.02030.006</object-id><label>Figure 2—source data 1.</label><caption><title>PDB benchmark set.</title><p>The PDB id and chains in the benchmark set, with number of sequences per length (seq/len) in the multiple sequence alignment. For complexes involving more than one component, an all vs all analysis was performed.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.02030.006">http://dx.doi.org/10.7554/eLife.02030.006</ext-link></p></caption><media mime-subtype="xls" mimetype="application" xlink:href="elife02030s001.xls"/></supplementary-material></p></caption><graphic xlink:href="elife02030f002"/></fig></p></sec><sec id="s2-3"><title>Contact predictions for complexes of unknown structure</title><p>The results with the 50S ribosome and the protein pairs in the benchmark suggest that interactions can be accurately predicted across protein–protein interfaces given a sufficient number of aligned sequences. In <xref ref-type="fig" rid="fig3">Figure 3</xref>, we provide residue–residue contact predictions for the 36 of the 64 complexes with currently unknown structure (the <italic>E. coli</italic> gene sequences were clustered, and hence each complex may represent multiple <italic>E. coli</italic> gene pairs). 
These predictions should contribute to the determination of the structures of these biologically important complexes.<fig id="fig3" position="float"><object-id pub-id-type="doi">10.7554/eLife.02030.007</object-id><label>Figure 3.</label><caption><title>Predicted residue–residue interactions across protein interfaces of unknown structure.</title><p>Strongly co-evolving residue pairs for complexes without known structure that had at least one prediction with GREMLIN score greater than or equal to 0.85. Each row shows the residue pairs, their sequence identity and the GREMLIN score. Structure models for complexes highlighted in red are shown in <xref ref-type="fig" rid="fig5">Figure 5</xref>. Full dataset is provided with the deposited data.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.02030.007">http://dx.doi.org/10.7554/eLife.02030.007</ext-link></p></caption><graphic xlink:href="elife02030f003"/></fig></p></sec><sec id="s2-4"><title>From contacts to structural models</title><p>Are the predicted contacts useful in assembling models of the protein complex from models of each component? We evaluated this on a docking test set containing 18 protein complexes from the benchmark set where at least one component (or a close homolog) had a known structure in the <italic>apo</italic> form (‘Materials and methods’, docking test-set). We developed a docking protocol that used the predicted contacts as distance restraints and sampled the space of physically plausible structures to generate models of the protein–protein complex. The model with the best restraint score had an interface that was within 4 Å (in root mean square deviation) of the native interface in 14 of the 18 cases and had more than half the native contacts in 16 of the 18 cases (<xref ref-type="fig" rid="fig4">Figure 4A</xref>, <xref ref-type="fig" rid="fig4s1">Figure 4—figure supplement 1</xref>). 
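As a reading aid, the ‘fraction of native contacts’ measure used here can be sketched as below. The function name and the residue numbers are hypothetical, and how a contact is defined (for example, the distance cutoff) is left to the caller; this is an illustration of the measure, not the scoring code used in the study. <preformat preformat-type="code">
```python
def fraction_native_contacts(native, model):
    """Fraction of native inter-chain contacts that the model also has.

    native, model: iterables of (residue_in_A, residue_in_B) pairs that
    are in contact in the experimental structure and in the docked
    model, respectively.
    """
    native_set = set(native)
    if not native_set:
        raise ValueError("native contact set is empty")
    shared = native_set.intersection(model)
    return len(shared) / len(native_set)


# Toy example: the model recovers three of the four native contacts
# and adds one contact that is absent from the native structure.
native = [(10, 5), (11, 5), (12, 8), (20, 30)]
model = [(10, 5), (12, 8), (20, 30), (40, 2)]
fnat = fraction_native_contacts(native, model)
```
</preformat> Contacts present only in the model do not enter this fraction, so it measures how much of the native interface is recovered rather than how many predicted contacts are spurious. 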
Two of the cases in which the iRMSD (interface root-mean-square deviation) was the highest (bottom of table in A) are illustrated in <xref ref-type="fig" rid="fig4">Figure 4B–C</xref>: the high iRMSD is due to large changes in the conformation of one of the monomers upon binding; despite these changes the binding interface is reasonably accurately identified. Conformational changes that hinder the rigid-body docking protocol from sampling the bound conformation also occurred for thiazole synthase/sulfur carrier and phenylalanyl-tRNA synthase with iRMSD of 4.8 Å and 4.3 Å, respectively. In <xref ref-type="fig" rid="fig4">Figure 4D</xref>, a second energy minimum corresponds to a second interface in the complex with a different homo-oligomer subunit. In the absence of conformational changes, docking guided by predicted contacts is very accurate. The same protocol, on a positive control set of known bound structures of 41 protein-pairs (including 15 protein-pairs from the NADH electron transport complex), generated models that were within 2 Å of the native complex structure in 38 cases and within 4 Å in all but one case (<xref ref-type="supplementary-material" rid="SD2-data">Figure 4—source data 1</xref>, <xref ref-type="fig" rid="fig4s2">Figure 4—figure supplement 2</xref>).<fig-group><fig id="fig4" position="float"><object-id pub-id-type="doi">10.7554/eLife.02030.008</object-id><label>Figure 4.</label><caption><title>Contact-guided protein–protein docking on a benchmark set of 18 protein complexes.</title><p>(<bold>A</bold>) Structure models for each complex were generated by docking structures of its constituents, at least one of which (blue) was not from the structure of the complex, guided by coevolution-derived distance restraints. The interface C-alpha RMSD (iRMSD) of the structural model with the lowest energy to the experimentally determined structure and the fraction of native contacts are shown. 
Structure models for cases in red are shown in <bold>B</bold>, <bold>C</bold> and <bold>D</bold>. (<bold>B</bold> and <bold>C</bold>) Comparison between native and docked structures for the two largest failures in the benchmark: the large iRMSD is due to large conformational changes in the monomers upon binding, but the interface is still modeled correctly in the region not involved in the conformational change. (<bold>D</bold>) Multiple minima in the docking landscape (right) correspond to distinct interfaces in the complex (left).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.02030.008">http://dx.doi.org/10.7554/eLife.02030.008</ext-link></p><p><supplementary-material id="SD2-data"><object-id pub-id-type="doi">10.7554/eLife.02030.009</object-id><label>Figure 4—source data 1.</label><caption><title>Bound set.</title><p>The iRMSD of the lowest energy structure and the fraction of native contacts in the positive control.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.02030.009">http://dx.doi.org/10.7554/eLife.02030.009</ext-link></p></caption><media mime-subtype="xls" mimetype="application" xlink:href="elife02030s002.xls"/></supplementary-material></p></caption><graphic xlink:href="elife02030f004"/></fig><fig id="fig4s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.02030.010</object-id><label>Figure 4—figure supplement 1.</label><caption><title>Docking landscapes showing iRMSD (x-axis) vs GREMLIN restraint score (y-axis).</title><p>Each point represents a structure model generated by docking the subunits guided by the GREMLIN score. 
Dark blue points are from calculations in which at least one subunit was solved independently of the complex; light blue points, from positive control calculations in which both subunits are from the bound complex.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.02030.010">http://dx.doi.org/10.7554/eLife.02030.010</ext-link></p></caption><graphic xlink:href="elife02030fs002"/></fig><fig id="fig4s2" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.02030.011</object-id><label>Figure 4—figure supplement 2.</label><caption><title>Bound set.</title><p>Docking landscapes with GREMLIN restraint score. X-axis, iRMSD; y-axis GREMLIN restraint score.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.02030.011">http://dx.doi.org/10.7554/eLife.02030.011</ext-link></p></caption><graphic xlink:href="elife02030fs003"/></fig></fig-group></p><p>Taken together, these results suggest that in cases with small conformational change, the docking protocol can recover the entire interface to high accuracy and in cases where binding is accompanied by a large conformational change, the protocol recovers the largest intact and/or unobstructed interface.</p><p>Of the complexes with unknown structure listed in <xref ref-type="fig" rid="fig3">Figure 3</xref>, we selected four cases with two or more high GREMLIN score (≥0.6) contact predictions across the interface that had experimentally determined structures for most of the subunits (‘Materials and methods’) and generated structural models of the complexes. 
These models provide the basis for formulating hypotheses about the structure/function of the complex, but we emphasize they are not experimentally determined structures; in particular the assumption in the modeling procedure that there are not large backbone rearrangements could be incorrect—in such cases the overall organization of the complex is still likely to be correct but the details of the interfaces could be considerably in error.</p></sec><sec id="s2-5"><title>The TRAP complex</title><p>The tripartite ATP-independent periplasmic (TRAP) transporters are composed of three proteins: two integral membrane proteins YIAM and YIAN, and one periplasmic protein YIAO (<xref ref-type="bibr" rid="bib34">Mulligan et al., 2011</xref>). The structure of the periplasmic domain is known, but the membrane portion is unknown. To generate a model of the three-dimensional structure of the complex, we built YIAM models using Rosetta de novo structure prediction (<xref ref-type="bibr" rid="bib46">Simons et al., 1999</xref>; <xref ref-type="bibr" rid="bib40">Raman et al., 2009</xref>) guided by the intra-monomer predicted contacts, and models for YIAN and YIAO using RosettaCM comparative modeling. For YIAN the homologous structure of 4f35 (<xref ref-type="bibr" rid="bib29">Mancusso et al., 2012</xref>) was used. The three monomer structure models were then assembled using PatchDock (<xref ref-type="bibr" rid="bib13">Duhovny et al., 2002</xref>) and RosettaRelax (<xref ref-type="bibr" rid="bib9">Conway et al., 2014</xref>) guided by the predicted intersubunit contacts (‘Materials and methods’). 
In the resultant model of the complex (<xref ref-type="fig" rid="fig5">Figure 5</xref>), YIAO interacts with both of the membrane components; this is supported by a number of intersubunit contacts (yellow lines).<fig id="fig5" position="float"><object-id pub-id-type="doi">10.7554/eLife.02030.012</object-id><label>Figure 5.</label><caption><title>Structure models for complexes with unknown structures.</title><p>Residue pairs with GREMLIN scores ≥ 0.60 are connected by yellow bars; the structures are pulled apart for clarity. For METQ-METI and PFLA-PFLB GREMLIN scores ≥ 0.3 are shown. For each docking calculation the docking energy landscape is shown, with iRMSD to the selected model on the x-axis. The multiple minima correspond to permutations of the labels on the subunits of the homo-oligomer complex. Predicted structures of each complex are provided with the deposited data.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.02030.012">http://dx.doi.org/10.7554/eLife.02030.012</ext-link></p></caption><graphic xlink:href="elife02030f005"/></fig></p></sec><sec id="s2-6"><title>Tripartite efflux system</title><p>Tripartite efflux complexes span both the inner and outer membrane, and are widely used in bacteria to pump toxic compounds out of the cell. The mode of interactions between the outer membrane factor and the membrane fusion protein is unresolved, with reports suggesting either a tip-to-tip interaction, the insertion of one into the other, or a multistage interaction with an initial tip-to-tip interaction, followed by sliding one through the channel of the other (<xref ref-type="bibr" rid="bib28">Long et al., 2012</xref>). We generated homology models for the subunits based on the alignments to 1yc9 (<xref ref-type="bibr" rid="bib17">Federici et al., 2005</xref>) and 3fpp (<xref ref-type="bibr" rid="bib61">Yum et al., 2009</xref>) and docked them to generate models of the multidrug resistance protein complex. 
The predicted residue–residue contacts for this family of complexes support the tip-to-tip interaction (<xref ref-type="fig" rid="fig5">Figure 5</xref>; yellow lines); the coevolution data did not provide any evidence to support the insertion model.</p></sec><sec id="s2-7"><title>Pyruvate formate lyase-activating enzyme complex</title><p>Pyruvate formate-lyase (PFL) catalyzes the formation of acetyl-CoA and formate from pyruvate and CoA in the fermentation pathway. Formate acetyltransferase 1, also known as pyruvate formate-lyase 1 (PFLB), is activated by pyruvate formate-lyase 1-activating enzyme (PFLA). The structure of the complex is unknown, but the structures of the individual proteins have been solved (PDB ids: 3c8f [<xref ref-type="bibr" rid="bib4">Becker and Kabsch, 2002</xref>] and 1h16 [<xref ref-type="bibr" rid="bib55">Vey et al., 2008</xref>]). We carried out rigid-body docking calculations with these two proteins guided by GREMLIN predictions. Interestingly, the region of the activating enzyme that undergoes a conformational change upon substrate binding (3c8f → 3cb8 [<xref ref-type="bibr" rid="bib4">Becker and Kabsch, 2002</xref>]) lies within the region we predict to be in contact with PFL.</p></sec><sec id="s2-8"><title>D-methionine transport system</title><p>The D-methionine transporter is an ATP-driven system that transports methionine. We docked the <italic>E. coli</italic> structure of METI (3tui, chains A and B, <xref ref-type="bibr" rid="bib52">Johnson et al., 2012</xref>) with a RosettaCM model of METQ based on 3k2d (<xref ref-type="bibr" rid="bib60">Yu et al., 2011</xref>). 
The resulting docked model is consistent with the top-ranked GREMLIN predictions (<xref ref-type="fig" rid="fig5">Figure 5</xref>).</p></sec></sec><sec id="s3" sec-type="discussion"><title>Discussion</title><p>Our results demonstrate unequivocally that there is strong selective pressure at protein–protein interfaces beyond simple residue conservation, and that co-evolving residue pairs are nearly always in contact in the protein complex. It is likely that not all contacting residues across protein interfaces co-evolve, and that not all protein–protein interfaces show co-evolution. Nevertheless, as illustrated in <xref ref-type="fig" rid="fig1 fig2">Figures 1 and 2</xref>, there is clearly sufficient coevolutionary signal to significantly constrain models of a large number of protein complexes.</p><p>There is a notable contrast in the utility of intra-monomer and intersubunit predicted contacts for structure modeling. We found previously (<xref ref-type="bibr" rid="bib24">Kamisetty et al., 2013</xref>) that contacts could be predicted with high accuracy for monomeric proteins, provided there were sufficient aligned sequences, but in such cases there was almost always already a structure of a family member from which comparative models could be built, limiting the utility of the predicted contacts in structure prediction (though predicted contacts can be useful in modeling allosteric changes in protein structures [<xref ref-type="bibr" rid="bib20">Hopf et al., 2012</xref>; <xref ref-type="bibr" rid="bib32">Morcos et al., 2013</xref>]). In contrast, here we find that more than half of the complexes for which the protein families of the constituent subunits are sufficiently large for accurate contact prediction do not currently have three-dimensional structures. 
Hence, while predicted contacts can be very accurate for both monomeric globular proteins and protein–protein complexes, they are more useful for structure modeling of the latter due to the much poorer representation of protein complexes in the PDB.</p><p>While our approach of constructing a global statistical model from paired sequence alignments is generally applicable to any taxon, the current study focuses on prokaryotes and mitochondria. Doing so allows us to largely avoid the problem of distinguishing between paralogs by exploiting the operon architecture of bacterial genomes (<xref ref-type="bibr" rid="bib22">Jacob et al., 2005</xref>). Constructing paired-sequence alignments for more complex genomic architectures is more involved and requires the ability to distinguish orthologs from paralogs, the subject of active research (<xref ref-type="bibr" rid="bib41">Remm et al., 2001</xref>; <xref ref-type="bibr" rid="bib11">Datta et al., 2009</xref>). Protocols for generating paired sequence alignments more generally are an important direction for future development.</p></sec><sec id="s4" sec-type="materials|methods"><title>Materials and methods</title><sec id="s4-1"><title>Individual alignment generation</title><p>Multiple sequence alignments were generated for each of the 4303 <italic>E. coli</italic> protein-coding genes as identified by EcoGene 3.0 (<xref ref-type="bibr" rid="bib63">Zhou et al., 2013</xref>) using HHblits (-n 8 -e 1E-20 -maxfilt ∞ -neffmax 20 -nodiff -realign_max ∞), and HHfilter (-id 100 -cov 75) in the HHsuite (version: 2.0.15, <xref ref-type="bibr" rid="bib42">Remmert et al., 2011</xref>). To reduce redundancy, we constructed an HMM from each MSA and clustered genes based on the HHΔ (<xref ref-type="bibr" rid="bib24">Kamisetty et al., 2013</xref>), a measure of HMM–HMM similarity: a pair of genes was assigned to the same cluster if the HHΔ was less than 0.5. 
This procedure resulted in 2340 non-redundant gene clusters.</p><p>For the benchmark set, a new alignment was generated using the sequence associated with each PDB entry. For the 50S ribosome and NADH dehydrogenase, we used <italic>Thermus thermophilus</italic> HB8 sequences from PDB structures 3uxr (<xref ref-type="bibr" rid="bib6">Bulkley et al., 2012</xref>) and 4hea (<xref ref-type="bibr" rid="bib2">Baradaran et al., 2013</xref>), respectively. For paralogous NADH dehydrogenase chains L, M, and N, we used an e-value of 1E-60 in the alignment generation protocol. In addition to complexes from the <italic>E. coli</italic> analysis, we also included the GatCAB amidotransferase complex in our benchmark set, using sequences from the PDB structure 3ip4 (<xref ref-type="bibr" rid="bib35">Nakamura et al., 2010</xref>). For cases where the PDB sequence length was much longer than the average coverage, we relaxed the coverage filter to 50% of the query. The sequences were then realigned using Clustal Omega v1.2 (--iterations 2 --full-iter) (<xref ref-type="bibr" rid="bib45">Sievers et al., 2011</xref>). Residues not present in the query sequence were dropped from subsequent analysis.</p></sec><sec id="s4-2"><title>Paired alignment generation</title><p>We construct alignments of paired protein sequences [x<sub>1</sub>, x<sub>2</sub>, …, x<sub>p</sub>; x<sub>p+1</sub>, …, x<sub>p+q</sub>] from the same genome, with positions 1:p and p+1:p+q corresponding to the first and second proteins, respectively. We refer to such a multiple sequence alignment of paired sequences as a paired alignment.</p><p>For gene families with a single copy in each genome, such as the ribosomal proteins, constructing paired alignments is straightforward, as sequence pairs from the same genome can simply be concatenated. 
In general, generating paired alignments is complicated by the presence of multiple paralogs of a gene in a single genome; in prokaryotes, however, co-regulated genes are often co-located on the genome in operons. We exploit this property to avoid paralogous genes when creating paired sequences by restricting to gene pairs that have small, conserved intergenic distances. A similar approach was used to construct a database of fusion proteins in prokaryotic genomes (<xref ref-type="bibr" rid="bib48">Suhre and Claverie, 2004</xref>). Defining Δgene as the number of annotated genes between a gene pair, we only consider pairs with Δgene less than 20 and conserved in 60% of genomes. To allow for ambiguity in annotation, if the second or third most common intergenic distance is within 1 of the mode, these gene-pairs are included in the conservation calculation. Given that most UniProt accession IDs are serially assigned in a genome (<xref ref-type="bibr" rid="bib53">UniProt Accession</xref>), Δgene can be rapidly evaluated by looking at the difference in accession IDs. The paired alignment is then filtered to reduce redundancy to 90% sequence identity and to remove positions that have more than 75% gaps.</p></sec><sec id="s4-3"><title>Identification of protein complex structures</title><p>To identify protein pairs in the same complex structure, an HMM was constructed for each <italic>E. coli</italic> protein using hmmbuild from the already generated HHblits alignments. We then used hmmsearch to scan PDB sequences in the S2C database (<xref ref-type="bibr" rid="bib56">Wang et al.</xref>; both hmmbuild and hmmsearch are part of the HMMER v3.1b package [<xref ref-type="bibr" rid="bib15">Eddy, 2009</xref>]). Only hits with an e-value less than 1E-10 were considered. 
Protein pairs found in the same complex structure (PDB file) were considered to be in contact if a <inline-formula><mml:math id="inf1"><mml:mrow><mml:mtext>C</mml:mtext></mml:mrow></mml:math></inline-formula>α atom in one structure was within 12 Angstroms of a <inline-formula><mml:math id="inf2"><mml:mrow><mml:mtext>C</mml:mtext></mml:mrow></mml:math></inline-formula>α atom in the other.</p></sec><sec id="s4-4"><title>Gremlin model construction from paired alignments</title><p>GREMLIN constructs a global statistical model of the paired alignment, assigning a probability to every amino-acid sequence in the paired alignment:<disp-formula id="equ1"><mml:math id="m1"><mml:mrow><mml:mtext>p</mml:mtext><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mtext>X</mml:mtext><mml:mn>1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mtext>X</mml:mtext><mml:mn>2</mml:mn></mml:msub><mml:mo>…</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mtext>X</mml:mtext></mml:mrow><mml:mtext>p</mml:mtext></mml:msub><mml:mo>;</mml:mo><mml:msub><mml:mtext>X</mml:mtext><mml:mrow><mml:mtext>p</mml:mtext><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>…</mml:mo><mml:msub><mml:mtext>X</mml:mtext><mml:mrow><mml:mtext>p</mml:mtext><mml:mo>+</mml:mo><mml:mtext>q</mml:mtext></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mtext>Z</mml:mtext></mml:mfrac><mml:mtext>exp</mml:mtext><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msubsup><mml:mtext>Σ</mml:mtext><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mtext>p</mml:mtext><mml:mo>+</mml:mo><mml:mtext>q</mml:mtext></mml:mrow></mml:msubsup></mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:msub><mml:mtext>v</mml:mtext><mml:mtext>i</mml:mtext></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mtext>X</mml:mtext><mml:mtext>i</mml:mtext></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:msubsup><mml:mtext>Σ</mml:mtext><mml:mrow><mml:mtext>j</mml:mtext><mml:mo>=</mml:mo><mml:mn>
1</mml:mn></mml:mrow><mml:mrow><mml:mtext>p</mml:mtext><mml:mo>+</mml:mo><mml:mtext>q</mml:mtext></mml:mrow></mml:msubsup><mml:msub><mml:mtext>w</mml:mtext><mml:mrow><mml:mtext>i</mml:mtext><mml:mo>,</mml:mo><mml:mtext>j</mml:mtext></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mtext>X</mml:mtext><mml:mtext>i</mml:mtext></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mtext>X</mml:mtext><mml:mtext>j</mml:mtext></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>where, the v<sub>i</sub> are vectors encoding position-specific amino-acid propensities and the w<sub>ij</sub> are matrices encoding amino-acid coupling between positions i and j. These parameters are obtained from the aligned sequences by maximizing the regularized pseudo-likelihood (<xref ref-type="bibr" rid="bib1">Balakrishnan et al., 2011</xref>) of the alignment as described in (<xref ref-type="bibr" rid="bib24">Kamisetty et al., 2013</xref>):<disp-formula id="equ2"><mml:math id="m2"><mml:mrow><mml:mtext>v</mml:mtext><mml:mo>,</mml:mo><mml:mtext>w</mml:mtext><mml:mo>=</mml:mo><mml:mtext>arg max </mml:mtext><mml:mrow><mml:msubsup><mml:mtext>Σ</mml:mtext><mml:mn>1</mml:mn><mml:mtext>N</mml:mtext></mml:msubsup></mml:mrow><mml:msubsup><mml:mtext>Σ</mml:mtext><mml:mn>1</mml:mn><mml:mrow><mml:mtext>p</mml:mtext><mml:mo>+</mml:mo><mml:mtext>q</mml:mtext></mml:mrow></mml:msubsup><mml:mtext>log 
P</mml:mtext><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mtext>X</mml:mtext><mml:mtext>i</mml:mtext></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mtext>X</mml:mtext><mml:mn>1</mml:mn></mml:msub><mml:mtext>..</mml:mtext><mml:msub><mml:mtext>X</mml:mtext><mml:mrow><mml:mtext>i</mml:mtext><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:msub><mml:mtext>X</mml:mtext><mml:mrow><mml:mtext>i</mml:mtext><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mtext>..</mml:mtext><mml:msub><mml:mtext>X</mml:mtext><mml:mrow><mml:mtext>p</mml:mtext><mml:mo>+</mml:mo><mml:mtext>q</mml:mtext></mml:mrow></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>+</mml:mo><mml:mtext>R</mml:mtext><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtext>v</mml:mtext><mml:mo>,</mml:mo><mml:mtext>w</mml:mtext></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>where, each term in the summation is a conditional distribution capturing the probability of a particular amino-acid at a position in the context of the entire protein sequence and R(v,w) is a regularization term to prevent over-fitting.</p><p>Previous approaches (<xref ref-type="bibr" rid="bib33">Morcos et al., 2011</xref>; <xref ref-type="bibr" rid="bib23">Jones et al., 2012</xref>) estimated v, w using an approximate moment matching approach (<xref ref-type="bibr" rid="bib24">Kamisetty et al., 2013</xref>) by inverting a generalized covariance matrix. These rely on a Gaussian-like approximation to the global partition function. Unlike these approaches, estimation via the pseudo-likelihood avoids this approximation relying instead on local partition functions (<xref ref-type="bibr" rid="bib1">Balakrishnan et al., 2011</xref>; <xref ref-type="bibr" rid="bib16">Ekeberg et al., 2013</xref>; <xref ref-type="bibr" rid="bib24">Kamisetty et al., 2013</xref>). 
The resulting global optimization problem can be efficiently solved using standard convex optimization techniques and provides estimates for each vector v<sub>i</sub> and matrix w<sub>ij</sub> (<xref ref-type="bibr" rid="bib24">Kamisetty et al., 2013</xref>).</p></sec><sec id="s4-5"><title>Ranking residue pairs with GREMLIN scores</title><p>To reduce the w<sub>ij</sub> matrices to single values reflecting the strength of the coupling between positions i and j, we first compute s<sub>ij</sub>, their vector 2-norm (the square root of the average of the squares of the individual matrix elements). We correct for differences in s<sub>ij</sub> due to sequence variability at different positions using the row and column averages of these values:<disp-formula id="equ3"><mml:math id="m3"><mml:mrow><mml:msubsup><mml:mtext>s</mml:mtext><mml:mrow><mml:mtext>ij</mml:mtext></mml:mrow><mml:mrow><mml:mtext>corr</mml:mtext></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:msub><mml:mtext>s</mml:mtext><mml:mrow><mml:mtext>ij</mml:mtext></mml:mrow></mml:msub><mml:mo>−</mml:mo><mml:mfrac><mml:mrow><mml:mo>&lt;</mml:mo><mml:msub><mml:mtext>s</mml:mtext><mml:mrow><mml:mtext>ik</mml:mtext></mml:mrow></mml:msub><mml:msub><mml:mo>&gt;</mml:mo><mml:mtext>k</mml:mtext></mml:msub><mml:mo>&lt;</mml:mo><mml:msub><mml:mtext>s</mml:mtext><mml:mrow><mml:mtext>kj</mml:mtext></mml:mrow></mml:msub><mml:msub><mml:mo>&gt;</mml:mo><mml:mtext>k</mml:mtext></mml:msub></mml:mrow><mml:mrow><mml:mo>&lt;</mml:mo><mml:msub><mml:mi>s</mml:mi><mml:mrow><mml:mtext>kl</mml:mtext></mml:mrow></mml:msub><mml:msub><mml:mo>&gt;</mml:mo><mml:mrow><mml:mtext>kl</mml:mtext></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mrow></mml:math></disp-formula>where angle brackets indicate averages taken over the subscripted indices, in a manner similar to the Average Product Correction (APC, <xref ref-type="bibr" rid="bib14">Dunn et al., 2008</xref>). 
Unlike the APC, we account for differences in the rates of evolution in the two protein families by computing the averages only over the positions of the proteins corresponding to positions i and j: if i and j are both in the first (second) protein, the averages are computed over the positions in the first (second) protein; if i is in the first protein and j in the second, the column average is computed only over the positions of the first protein and the row average, only over the positions of the second protein. We then compute a normalized coupling strength, <inline-formula><mml:math id="inf3"><mml:mrow><mml:msub><mml:mrow><mml:mtext>ncs</mml:mtext></mml:mrow><mml:mrow><mml:mtext>ij</mml:mtext></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>, by dividing the <inline-formula><mml:math id="inf4"><mml:mrow><mml:msubsup><mml:mtext>s</mml:mtext><mml:mrow><mml:mtext>ij</mml:mtext></mml:mrow><mml:mrow><mml:mtext>corr</mml:mtext></mml:mrow></mml:msubsup></mml:mrow></mml:math></inline-formula> by the average of the top 3L/2 <inline-formula><mml:math id="inf5"><mml:mrow><mml:msubsup><mml:mtext>s</mml:mtext><mml:mrow><mml:mtext>ij</mml:mtext></mml:mrow><mml:mrow><mml:mtext>corr</mml:mtext></mml:mrow></mml:msubsup></mml:mrow></mml:math></inline-formula> values across the two proteins (since there are roughly 3L/2 contacts for a protein of length L [<xref ref-type="bibr" rid="bib24">Kamisetty et al., 2013</xref>; SI]).</p><p>As illustrated in <xref ref-type="fig" rid="fig1">Figure 1D</xref>, the relation between normalized coupling strength and contact frequency varies with the ratio of the number of aligned sequences to the length of the protein complex. We also observed that residues were more frequently in contact for a given coupling strength when the top score for that complex was high. 
To account for these dependencies, we constructed a model, fitted to the bacterial 50S ribosomal complex, that estimates the probability of a residue pair being in contact:<disp-formula id="equ4"><mml:math id="m4"><mml:mrow><mml:mtext>GremlinScore</mml:mtext><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtext>x</mml:mtext><mml:mo>,</mml:mo><mml:mtext>N</mml:mtext><mml:mo>/</mml:mo><mml:mtext>L</mml:mtext></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>/</mml:mo><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>+</mml:mo><mml:mtext>exp</mml:mtext><mml:mo>(</mml:mo><mml:mo>−</mml:mo><mml:mtext>σ</mml:mtext><mml:mo>(</mml:mo><mml:mtext>x</mml:mtext><mml:mo>−</mml:mo><mml:mtext>μ</mml:mtext><mml:mo>)</mml:mo><mml:mo>)</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></disp-formula>where<disp-formula id="equ5"><mml:math id="m5"><mml:mrow><mml:mtext>μ</mml:mtext><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mtext>m</mml:mtext></mml:mrow><mml:mrow><mml:mtext>N</mml:mtext><mml:mo>/</mml:mo><mml:mtext>L</mml:mtext><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>+</mml:mo><mml:mtext>c</mml:mtext></mml:mrow></mml:math></disp-formula>and x is <inline-formula><mml:math id="inf6"><mml:mrow><mml:mo>√</mml:mo><mml:msub><mml:mrow><mml:mtext>ncs</mml:mtext></mml:mrow><mml:mrow><mml:mtext>ij</mml:mtext></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> for the top scoring contact in each complex and <inline-formula><mml:math id="inf7"><mml:mrow><mml:mo>√</mml:mo><mml:msub><mml:mrow><mml:mtext>ncs</mml:mtext></mml:mrow><mml:mrow><mml:mtext>ij</mml:mtext></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> scaled by the GREMLIN score of the top contact in all other cases. The values of m, c, and σ (0.47, 0.96, and 9.77, respectively) were determined by a non-linear fit to the observed frequencies in the 50S ribosomal data from <xref ref-type="fig" rid="fig1">Figure 1D</xref>. 
This function accurately accounts for the observed contact frequencies (<xref ref-type="fig" rid="fig1s1">Figure 1—figure supplement 1</xref>).</p></sec><sec id="s4-6"><title>Conversion of gremlin scores to distance restraints</title><p>We converted coupling strengths into residue-pair specific distance restraints and included them in the Rosetta structure prediction program. We use sigmoidal distance restraints of the form:<disp-formula id="equ6"><label>(1)</label><mml:math id="m6"><mml:mrow><mml:mtext>restraint</mml:mtext><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mtext>d</mml:mtext><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mtext>weight</mml:mtext></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>+</mml:mo><mml:mtext>exp</mml:mtext><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mo>−</mml:mo><mml:mtext>slope</mml:mtext><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtext>d</mml:mtext><mml:mo>−</mml:mo><mml:mtext>cutoff</mml:mtext></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mfrac><mml:mo>+</mml:mo><mml:mtext>intercept</mml:mtext></mml:mrow></mml:math></disp-formula>where, d is the distance between the constrained atoms and the weight is proportional to <inline-formula><mml:math id="inf8"><mml:mrow><mml:msub><mml:mrow><mml:mtext>ncs</mml:mtext></mml:mrow><mml:mrow><mml:mtext>ij</mml:mtext></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>. 
The restraints were introduced between <inline-formula><mml:math id="inf9"><mml:mrow><mml:mtext>C</mml:mtext></mml:mrow></mml:math></inline-formula>β atoms (<inline-formula><mml:math id="inf10"><mml:mrow><mml:mtext>C</mml:mtext></mml:mrow></mml:math></inline-formula>α in the case of glycine) in the reduced-atom representation of Rosetta (centroid mode) and as ambiguous distance restraints (<xref ref-type="bibr" rid="bib25">Lange et al., 2012</xref>) between side-chain heavy atoms (cutoff of 5.5 and slope of 4) in the full-atom stage of Rosetta. For the centroid mode, restraints used the amino acid pair specific <inline-formula><mml:math id="inf11"><mml:mrow><mml:mtext>C</mml:mtext></mml:mrow></mml:math></inline-formula>β-<inline-formula><mml:math id="inf12"><mml:mrow><mml:mtext>C</mml:mtext></mml:mrow></mml:math></inline-formula>β cutoff and slopes, as described in <xref ref-type="bibr" rid="bib24">Kamisetty et al., 2013</xref> SI Table III. These distance restraints supplement the Rosetta all atom energy; the combination ensures the sampling of physically realistic structures consistent with the contact predictions.</p></sec><sec id="s4-7"><title>Comparative modeling</title><p>Comparative models were built using RosettaCM (<xref ref-type="bibr" rid="bib47">Song et al., 2013</xref>) based on alignments to homologous structures generated using HHsearch (<xref ref-type="bibr" rid="bib42">Remmert et al., 2011</xref>). 
For proteins that had missing density in regions predicted to be in contact, we used RosettaCM with co-evolution derived restraints to build the missing region before docking.</p></sec><sec id="s4-8"><title>De novo modeling</title><p>The Rosetta ab initio protocol consists of two stages: in the initial stage (‘centroid’) side-chains are represented by fixed center-of-mass atoms, allowing for rapid generation and evaluation of various protein-like topologies; the second stage (‘full-atom’) builds in explicit side-chains and carries out all-atom energy minimization (<xref ref-type="bibr" rid="bib46">Simons et al., 1999</xref>; <xref ref-type="bibr" rid="bib40">Raman et al., 2009</xref>). YIAM, a membrane protein, was modeled with the Rosetta membrane energy function (<xref ref-type="bibr" rid="bib59">Yarov-Yarovoy et al., 2012</xref>; <xref ref-type="bibr" rid="bib3">Barth et al., 2007</xref>). Strong repulsive interactions (<xref ref-type="disp-formula" rid="equ6">Equation 1</xref>, weight: −100, cutoff: 35, slope: 2 and intercept: 100) were added between the center of the extracellular regions and the center of predicted intracellular regions, and strong attractive restraints (weight: 100, cutoff: 35, slope: 2 and intercept: 0) within predicted intracellular regions and extracellular regions, effectively constructing a membrane-like sampling space. We used the consensus output of MESSA (<xref ref-type="bibr" rid="bib8">Cong and Grishin, 2012</xref>) to predict transmembrane regions. 
We generated 100,000 models; the 20 models that best fit the restraints converged to a single cluster.</p></sec><sec id="s4-9"><title>Docking test set</title><p>Jackhmmer (part of HMMER v3.1b package; <xref ref-type="bibr" rid="bib15">Eddy, 2009</xref>) was used to identify a subset of 18 complexes in the benchmark set where at least one of the proteins or a close homolog had a solved structure of its <italic>apo</italic> form. 
In cases where the structure was of a homologous protein (e-value &lt; 1E-20) and where most of the interface residues were present, we generated a structural model of the target protein using comparative modeling. We only considered cases where at least one of the structures was unbound, as the bound–bound docking problem is not representative of real-world docking challenges (<xref ref-type="bibr" rid="bib5">Betts and Sternberg, 1999</xref>). The positive control shown in <xref ref-type="supplementary-material" rid="SD2-data">Figure 4—source data 1</xref> was run on all protein-pairs from the benchmark set in which at least two predicted inter-protein contacts had a high GREMLIN score (&gt;0.6).</p></sec><sec id="s4-10"><title>Complex assembly by protein–protein docking</title><p>For each inter-protein restraint pair in the top 3L/2 predictions, we used PatchDock v1.0, with clustering parameters (rmsd 0.5; discardClustersSmaller 0) (<xref ref-type="bibr" rid="bib13">Duhovny et al., 2002</xref>) to generate an ensemble of conformations that were then scored using all the restraints. For the tripartite efflux pump, the surface segmentation parameters were further modified (low_patch_thr 0; prune_thr 0.1; flat 1) to allow for more diverse interfaces. The top 5 models by restraint score were energy-minimized in Cartesian space using both inter- and intra-protein restraints with cycles of minimization and side-chain repacking using Rosetta as described in <xref ref-type="bibr" rid="bib9">Conway et al. (2014)</xref>. The best scoring model by restraint score was then selected.</p><p>For the fraction of native contacts (Fnat) and interface root-mean-square deviation (iRMSD) calculations, interface residue–residue contacts are defined as those where the minimal distance between any side-chain heavy atoms is less than 5 Å. The Fnat calculation is performed as described in <xref ref-type="bibr" rid="bib24">Kamisetty et al. 
(2013)</xref> SI Table III.</p><p>All structural figures were drawn with PyMOL (The PyMOL Molecular Graphics System, Version 1.5.0.4, Schrödinger, LLC).</p></sec><sec id="s4-11"><title>Data Availability</title><p>The multiple sequence alignments used in the analysis and the full GREMLIN results for all the calculations described in the paper are provided at <ext-link ext-link-type="uri" xlink:href="http://gremlin.bakerlab.org/complexes/">http://gremlin.bakerlab.org/complexes/</ext-link>, along with a web server for paired-alignment generation, coevolution analysis and contact prediction/Rosetta restraint generation. The paired alignments, along with the PDB coordinates of the predicted structures, are also available at Dryad: <xref ref-type="bibr" rid="bib38">Ovchinnikov et al., 2014</xref>.</p></sec></sec></body><back><ack id="ack"><title>Acknowledgements</title><p>We thank Lei Shi and David La for their comments and helpful suggestions, and Rosetta@home participants for donating their computer time.</p><p>Note added in proof</p><p>Two other studies of protein coevolution using global statistical models have recently appeared: <xref ref-type="bibr" rid="bib50">Tamir et al., 2014</xref> and <xref ref-type="bibr" rid="bib21">Hopf et al., 2014</xref>. 
These studies provide independent validation of the robustness of global statistical methods for prediction of protein–protein contacts.</p></ack><sec sec-type="additional-information"><title>Additional information</title><fn-group content-type="competing-interest"><title>Competing interests</title><fn fn-type="conflict" id="conf1"><p>The authors declare that no competing interests exist.</p></fn></fn-group><fn-group content-type="author-contribution"><title>Author contributions</title><fn fn-type="con" id="con1"><p>SO, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article, Contributed unpublished essential data or reagents</p></fn><fn fn-type="con" id="con2"><p>HK, Conception and design, Analysis and interpretation of data, Drafting or revising the article, Contributed unpublished essential data or reagents</p></fn><fn fn-type="con" id="con3"><p>DB, Conception and design, Analysis and interpretation of data, Drafting or revising the article</p></fn></fn-group></sec><sec sec-type="supplementary-material"><title>Additional files</title><sec sec-type="datasets"><title>Major datasets</title><p>The following dataset was generated:</p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro1"><name><surname>Ovchinnikov</surname><given-names>S</given-names></name>, <name><surname>Kamisetty</surname><given-names>H</given-names></name>, <name><surname>Baker</surname><given-names>D</given-names></name>, <year>2014</year><x>, </x><source>Data from: Robust and accurate prediction of residue-residue interactions across protein interfaces using evolutionary information</source><x>, </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.5061/dryad.s00vr">http://dx.doi.org/10.5061/dryad.s00vr</ext-link><x>, </x><comment>Available at Dryad Digital Repository under a CC0 Public Domain 
Dedication.</comment></related-object></p><p>The following previously published datasets were used:</p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro2"><name><surname>Reshetnikova</surname><given-names>L</given-names></name>, <name><surname>Moor</surname><given-names>N</given-names></name>, <name><surname>Lavrik</surname><given-names>O</given-names></name>, <name><surname>Vassylyev</surname><given-names>DG</given-names></name>, <year>2000</year><x>, </x><source>PHENYLALANYL TRNA SYNTHETASE COMPLEXED WITH PHENYLALANINE</source><x>, </x><object-id pub-id-type="art-access-id">1B70</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1b70/pdb">http://dx.doi.org/10.2210/pdb1b70/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro3"><name><surname>Wilkens</surname><given-names>S</given-names></name>, <name><surname>Capaldi</surname><given-names>RA</given-names></name>, <year>1998</year><x>, </x><source>SOLUTION STRUCTURE OF THE EPSILON SUBUNIT OF THE F1-ATPSYNTHASE FROM ESCHERICHIA COLI AND ORIENTATION OF THE SUBUNIT RELATIVE TO THE BETA SUBUNITS OF THE COMPLEX</source><x>, </x><object-id pub-id-type="art-access-id">1BSN</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1bsn/pdb">http://dx.doi.org/10.2210/pdb1bsn/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro4"><name><surname>Thoden</surname><given-names>JB</given-names></name>, 
<name><surname>Wesenberg</surname><given-names>G</given-names></name>, <name><surname>Raushel</surname><given-names>FM</given-names></name>, <name><surname>Holden</surname><given-names>HM</given-names></name>, <year>1999</year><x>, </x><source>STRUCTURE OF CARBAMOYL PHOSPHATE SYNTHETASE COMPLEXED WITH THE ATP ANALOG AMPPNP</source><x>, </x><object-id pub-id-type="art-access-id">1BXR</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1bxr/pdb">http://dx.doi.org/10.2210/pdb1bxr/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro5"><name><surname>Rebelo</surname><given-names>JM</given-names></name>, <name><surname>Macieira</surname><given-names>S</given-names></name>, <name><surname>Dias</surname><given-names>JM</given-names></name>, <name><surname>Huber</surname><given-names>R</given-names></name>, <name><surname>Romao</surname><given-names>MJ</given-names></name>, <year>2000</year><x>, </x><source>CRYSTAL STRUCTURE OF THE ALDEHYDE OXIDOREDUCTASE FROM DESULFOVIBRIO DESULFURICANS ATCC 27774</source><x>, </x><object-id pub-id-type="art-access-id">1DGJ</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1dgj/pdb">http://dx.doi.org/10.2210/pdb1dgj/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro6"><name><surname>Rowland</surname><given-names>P</given-names></name>, <name><surname>Larsen</surname><given-names>S</given-names></name>, <year>1997</year><x>, </x><source>DIHYDROOROTATE DEHYDROGENASE A FROM LACTOCOCCUS LACTIS</source><x>, </x><object-id 
pub-id-type="art-access-id">1DOR</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1dor/pdb">http://dx.doi.org/10.2210/pdb1dor/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro7"><name><surname>Roberts</surname><given-names>DL</given-names></name>, <name><surname>Salazar</surname><given-names>D</given-names></name>, <name><surname>Fulmer</surname><given-names>JP</given-names></name>, <name><surname>Frerman</surname><given-names>FE</given-names></name>, <name><surname>Kim</surname><given-names>JJ-P</given-names></name>, <year>1999</year><x>, </x><source>ELECTRON TRANSFER FLAVOPROTEIN (ETF) FROM PARACOCCUS DENITRIFICANS</source><x>, </x><object-id pub-id-type="art-access-id">1EFP</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1efp/pdb">http://dx.doi.org/10.2210/pdb1efp/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro8"><name><surname>Rowland</surname><given-names>P</given-names></name>, <name><surname>Norager</surname><given-names>S</given-names></name>, <name><surname>Jensen</surname><given-names>KF</given-names></name>, <name><surname>Larsen</surname><given-names>S</given-names></name>, <year>2001</year><x>, </x><source>CRYSTAL STRUCTURE OF LACTOCOCCUS LACTIS DIHYDROOROTATE DEHYDROGENASE B. 
DATA COLLECTED UNDER CRYOGENIC CONDITIONS</source><x>, </x><object-id pub-id-type="art-access-id">1EP3</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1ep3/pdb">http://dx.doi.org/10.2210/pdb1ep3/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro9"><name><surname>Becker</surname><given-names>A</given-names></name>, <name><surname>Kabsch</surname><given-names>W</given-names></name>, <year>2002</year><x>, </x><source>PYRUVATE FORMATE-LYASE (E.COLI) IN COMPLEX WITH PYRUVATE AND COA</source><x>, </x><object-id pub-id-type="art-access-id">1H16</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1h16/pdb">http://dx.doi.org/10.2210/pdb1h16/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro10"><name><surname>Morollo</surname><given-names>AA</given-names></name>, <name><surname>Eck</surname><given-names>MJ</given-names></name>, <year>2001</year><x>, </x><source>STRUCTURE OF THE COOPERATIVE ALLOSTERIC ANTHRANILATE SYNTHASE FROM SALMONELLA TYPHIMURIUM</source><x>, </x><object-id pub-id-type="art-access-id">1I1Q</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1i1q/pdb">http://dx.doi.org/10.2210/pdb1i1q/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro11"><name><surname>Kleiger</surname><given-names>G</given-names></name>, 
<name><surname>Perry</surname><given-names>J</given-names></name>, <name><surname>Eisenberg</surname><given-names>D</given-names></name>, <year>2001</year><x>, </x><source>3D structure of the E1beta subunit of pyruvate dehydrogenase from the archeon Pyrobaculum aerophilum</source><x>, </x><object-id pub-id-type="art-access-id">1IK6</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1ik6/pdb">http://dx.doi.org/10.2210/pdb1ik6/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro12"><name><surname>Yamada</surname><given-names>K</given-names></name>, <name><surname>Miyata</surname><given-names>T</given-names></name>, <name><surname>Tsuchiya</surname><given-names>D</given-names></name>, <name><surname>Oyama</surname><given-names>T</given-names></name>, <name><surname>Fujiwara</surname><given-names>Y</given-names></name>, <name><surname>Ohnishi</surname><given-names>T</given-names></name>, <name><surname>Iwasaki</surname><given-names>H</given-names></name>, <name><surname>Shinagawa</surname><given-names>H</given-names></name>, <name><surname>Ariyoshi</surname><given-names>M</given-names></name>, <name><surname>Mayanagi</surname><given-names>K</given-names></name>, <name><surname>Morikawa</surname><given-names>K</given-names></name>, <year>2002</year><x>, </x><source>RuvA-RuvB complex</source><x>, </x><object-id pub-id-type="art-access-id">1IXR</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1ixr/pdb">http://dx.doi.org/10.2210/pdb1ixr/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" 
id="dataro13"><name><surname>Vitagliano</surname><given-names>L</given-names></name>, <name><surname>Masullo</surname><given-names>M</given-names></name>, <name><surname>Sica</surname><given-names>F</given-names></name>, <name><surname>Zagari</surname><given-names>A</given-names></name>, <name><surname>Bocchini</surname><given-names>V</given-names></name>, <year>2002</year><x>, </x><source>Crystal structure of Sulfolobus solfataricus elongation factor 1 alpha in complex with GDP</source><x>, </x><object-id pub-id-type="art-access-id">1JNY</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1jny/pdb">http://dx.doi.org/10.2210/pdb1jny/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro14"><name><surname>Parsons</surname><given-names>JF</given-names></name>, <name><surname>Jensen</surname><given-names>PY</given-names></name>, <name><surname>Pachikara</surname><given-names>AS</given-names></name>, <name><surname>Howard</surname><given-names>AJ</given-names></name>, <name><surname>Eisenstein</surname><given-names>E</given-names></name>, <name><surname>Ladner</surname><given-names>JE</given-names></name>, <year>2002</year><x>, </x><source>THE CRYSTAL STRUCTURE OF AMINODEOXYCHORISMATE SYNTHASE FROM FORMATE GROWN CRYSTALS</source><x>, </x><object-id pub-id-type="art-access-id">1K0E</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1k0e/pdb">http://dx.doi.org/10.2210/pdb1k0e/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" 
id="dataro15"><name><surname>Korolev</surname><given-names>S</given-names></name>, <name><surname>Koroleva</surname><given-names>O</given-names></name>, <name><surname>Petterson</surname><given-names>K</given-names></name>, <name><surname>Collart</surname><given-names>F</given-names></name>, <name><surname>Dementieva</surname><given-names>I</given-names></name>, <name><surname>Joachimiak</surname><given-names>A</given-names></name>, , <collab>Midwest Center for Structural Genomics (MCSG)</collab><year>2002</year><x>, </x><source>CRYSTAL STRUCTURE OF ACETATE COA-TRANSFERASE ALPHA SUBUNIT</source><x>, </x><object-id pub-id-type="art-access-id">1K6D</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1k6d/pdb">http://dx.doi.org/10.2210/pdb1k6d/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro16"><name><surname>Takahashi</surname><given-names>H</given-names></name>, <name><surname>Tokunaga</surname><given-names>Y</given-names></name>, <name><surname>Kuroishi</surname><given-names>C</given-names></name>, <name><surname>Babayeva</surname><given-names>N</given-names></name>, <name><surname>Kuramitsu</surname><given-names>S</given-names></name>, <name><surname>Yokoyama</surname><given-names>S</given-names></name>, <name><surname>Miyano</surname><given-names>M</given-names></name>, <name><surname>Tahirov</surname><given-names>TH</given-names></name>, <year>2003</year><x>, </x><source>THE CRYSTAL STRUCTURE OF SUCCINYL-COA SYNTHETASE ALPHA SUBUNIT FROM THERMUS THERMOPHILUS</source><x>, </x><object-id pub-id-type="art-access-id">1OI7</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1oi7/pdb">http://dx.doi.org/10.2210/pdb1oi7/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB 
Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro17"><name><surname>Weyand</surname><given-names>M</given-names></name>, <name><surname>Schlichting</surname><given-names>I</given-names></name>, <year>2000</year><x>, </x><source>CRYSTAL STRUCTURE OF WILD-TYPE TRYPTOPHAN SYNTHASE COMPLEXED WITH INDOLE PROPANOL PHOSPHATE</source><x>, </x><object-id pub-id-type="art-access-id">1QOP</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1qop/pdb">http://dx.doi.org/10.2210/pdb1qop/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro18"><name><surname>Unciuleac</surname><given-names>M</given-names></name>, <name><surname>Warkentin</surname><given-names>E</given-names></name>, <name><surname>Page</surname><given-names>CC</given-names></name>, <name><surname>Dutton</surname><given-names>PL</given-names></name>, <name><surname>Boll</surname><given-names>M</given-names></name>, <name><surname>Ermler</surname><given-names>U</given-names></name>, <year>2004</year><x>, </x><source>Structure of 4-hydroxybenzoyl-CoA reductase from Thauera aromatica</source><x>, </x><object-id pub-id-type="art-access-id">1RM6</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1rm6/pdb">http://dx.doi.org/10.2210/pdb1rm6/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro19"><name><surname>Settembre</surname><given-names>EC</given-names></name>, 
<name><surname>Dorrestein</surname><given-names>PC</given-names></name>, <name><surname>Zhai</surname><given-names>H</given-names></name>, <name><surname>Chatterjee</surname><given-names>A</given-names></name>, <name><surname>McLafferty</surname><given-names>FW</given-names></name>, <name><surname>Begley</surname><given-names>TP</given-names></name>, <name><surname>Ealick</surname><given-names>SE</given-names></name>, <year>2004</year><x>, </x><source>Structure of the thiazole synthase/ThiS complex</source><x>, </x><object-id pub-id-type="art-access-id">1TYG</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1tyg/pdb">http://dx.doi.org/10.2210/pdb1tyg/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro20"><name><surname>Hioki</surname><given-names>Y</given-names></name>, <name><surname>Ogasahara</surname><given-names>K</given-names></name>, <name><surname>Lee</surname><given-names>SJ</given-names></name>, <name><surname>Ma</surname><given-names>J</given-names></name>, <name><surname>Ishida</surname><given-names>M</given-names></name>, <name><surname>Yamagata</surname><given-names>Y</given-names></name>, <name><surname>Matsuura</surname><given-names>Y</given-names></name>, <name><surname>Ota</surname><given-names>M</given-names></name>, <name><surname>Kuramitsu</surname><given-names>S</given-names></name>, <name><surname>Yutani</surname><given-names>K</given-names></name>, , <collab>RIKEN Structural Genomics/Proteomics Initiative (RSGI)</collab><year>2005</year><x>, </x><source>X-ray crystal structure of the Tryptophan Synthase b2 Subunit from Hyperthermophile, Pyrococcus furiosus</source><x>, </x><object-id pub-id-type="art-access-id">1V8Z</object-id><x>; </x><ext-link ext-link-type="uri" 
xlink:href="http://dx.doi.org/10.2210/pdb1v8z/pdb">http://dx.doi.org/10.2210/pdb1v8z/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro21"><name><surname>Frank</surname><given-names>RAW</given-names></name>, <name><surname>Pratap</surname><given-names>JV</given-names></name>, <name><surname>Pei</surname><given-names>XY</given-names></name>, <name><surname>Perham</surname><given-names>RN</given-names></name>, <name><surname>Luisi</surname><given-names>BF</given-names></name>, <year>2004</year><x>, </x><source>THE CRYSTAL STRUCTURE OF PYRUVATE DEHYDROGENASE E1 BOUND TO THE PERIPHERAL SUBUNIT BINDING DOMAIN OF E2</source><x>, </x><object-id pub-id-type="art-access-id">1W85</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1w85/pdb">http://dx.doi.org/10.2210/pdb1w85/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro22"><name><surname>Kuzin</surname><given-names>A</given-names></name>, <name><surname>Abashidze</surname><given-names>M</given-names></name>, <name><surname>Vorobiev</surname><given-names>S</given-names></name>, <name><surname>Forouhar</surname><given-names>F</given-names></name>, <name><surname>Acton</surname><given-names>T</given-names></name>, <name><surname>Ma</surname><given-names>L</given-names></name>, <name><surname>Xiao</surname><given-names>R</given-names></name>, <name><surname>Montelione</surname><given-names>G</given-names></name>, <name><surname>Tong</surname><given-names>L</given-names></name>, <name><surname>Hunt</surname><given-names>J</given-names></name>, , <collab>Northeast Structural 
Genomics Consortium (NESG)</collab><year>2004</year><x>, </x><source>Crystal structure of Northeast Structural Genomics Target SR156</source><x>, </x><object-id pub-id-type="art-access-id">1XM3</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1xm3/pdb">http://dx.doi.org/10.2210/pdb1xm3/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro23"><name><surname>Federici</surname><given-names>L</given-names></name>, <name><surname>Du</surname><given-names>D</given-names></name>, <name><surname>Walas</surname><given-names>F</given-names></name>, <name><surname>Matsumura</surname><given-names>H</given-names></name>, <name><surname>Fernandez-Recio</surname><given-names>J</given-names></name>, <name><surname>McKeegan</surname><given-names>KS</given-names></name>, <name><surname>Borges-Walmsley</surname><given-names>MI</given-names></name>, <name><surname>Luisi</surname><given-names>BF</given-names></name>, <name><surname>Walmsley</surname><given-names>AR</given-names></name>, <year>2005</year><x>, </x><source>The crystal structure of the outer membrane protein VceC from the bacterial pathogen Vibrio cholerae at 1.8 resolution</source><x>, </x><object-id pub-id-type="art-access-id">1YC9</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1yc9/pdb">http://dx.doi.org/10.2210/pdb1yc9/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro24"><name><surname>Mougous</surname><given-names>JD</given-names></name>, <name><surname>Lee</surname><given-names>DH</given-names></name>, 
<name><surname>Hubbard</surname><given-names>SC</given-names></name>, <name><surname>Schelle</surname><given-names>MW</given-names></name>, <name><surname>Vocadlo</surname><given-names>DJ</given-names></name>, <name><surname>Berger</surname><given-names>JM</given-names></name>, <name><surname>Bertozzi</surname><given-names>CR</given-names></name>, <year>2006</year><x>, </x><source>Crystal Structure of a GTP-Regulated ATP Sulfurylase Heterodimer from Pseudomonas syringae</source><x>, </x><object-id pub-id-type="art-access-id">1ZUN</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb1zun/pdb">http://dx.doi.org/10.2210/pdb1zun/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro25"><name><surname>Oberholzer</surname><given-names>AE</given-names></name>, <name><surname>Schneider</surname><given-names>P</given-names></name>, <name><surname>Bachler</surname><given-names>C</given-names></name>, <name><surname>Baumann</surname><given-names>U</given-names></name>, <name><surname>Erni</surname><given-names>B</given-names></name>, <year>2006</year><x>, </x><source>CRYSTAL STRUCTURE OF DHAL FROM E. 
COLI</source><x>, </x><object-id pub-id-type="art-access-id">2BTD</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb2btd/pdb">http://dx.doi.org/10.2210/pdb2btd/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro26"><name><surname>Numata</surname><given-names>T</given-names></name>, <name><surname>Fukai</surname><given-names>S</given-names></name>, <name><surname>Ikeuchi</surname><given-names>Y</given-names></name>, <name><surname>Suzuki</surname><given-names>T</given-names></name>, <name><surname>Nureki</surname><given-names>O</given-names></name>, <year>2006</year><x>, </x><source>crystal structure of heterohexameric TusBCD proteins, which are crucial for the tRNA modification</source><x>, </x><object-id pub-id-type="art-access-id">2D1P</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb2d1p/pdb">http://dx.doi.org/10.2210/pdb2d1p/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro27"><name><surname>Asada</surname><given-names>Y</given-names></name>, <name><surname>Kunishima</surname><given-names>N</given-names></name>, , <collab>RIKEN Structural Genomics/Proteomics Initiative (RSGI)</collab><year>2007</year><x>, </x><source>Structural study of Project ID aq_1548 from Aquifex aeolicus VF5</source><x>, </x><object-id pub-id-type="art-access-id">2EKC</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb2ekc/pdb">http://dx.doi.org/10.2210/pdb2ekc/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data 
Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro28"><collab>Joint Center for Structural Genomics (JCSG)</collab><year>2006</year><x>, </x><source>Crystal structure of Glutamyl-tRNA(Gln) amidotransferase subunit A (tm1272) from THERMOTOGA MARITIMA at 1.80 A resolution</source><x>, </x><object-id pub-id-type="art-access-id">2GI3</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb2gi3/pdb">http://dx.doi.org/10.2210/pdb2gi3/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro29"><name><surname>Lokanath</surname><given-names>NK</given-names></name>, , <collab>RIKEN Structural Genomics/Proteomics Initiative (RSGI)</collab><year>2007</year><x>, </x><source>Structure of PH0203 protein from Pyrococcus horikoshii</source><x>, </x><object-id pub-id-type="art-access-id">2IT1</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb2it1/pdb">http://dx.doi.org/10.2210/pdb2it1/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro30"><name><surname>Fraser</surname><given-names>ME</given-names></name>, <year>2007</year><x>, </x><source>C123aT Mutant of E. 
coli Succinyl-CoA Synthetase Orthorhombic Crystal Form</source><x>, </x><object-id pub-id-type="art-access-id">2NU9</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb2nu9/pdb">http://dx.doi.org/10.2210/pdb2nu9/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro31"><name><surname>Hollenstein</surname><given-names>K</given-names></name>, <name><surname>Frei</surname><given-names>DC</given-names></name>, <name><surname>Locher</surname><given-names>KP</given-names></name>, <year>2007</year><x>, </x><source>ABC transporter ModBC in complex with its binding protein ModA</source><x>, </x><object-id pub-id-type="art-access-id">2ONK</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb2onk/pdb">http://dx.doi.org/10.2210/pdb2onk/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro32"><name><surname>Swindell</surname><given-names>JT</given-names><suffix>II</suffix></name>, <name><surname>Chen</surname><given-names>L</given-names></name>, <name><surname>Zhu</surname><given-names>J</given-names></name>, <name><surname>Ebihara</surname><given-names>A</given-names></name>, <name><surname>Shinkai</surname><given-names>A</given-names></name>, <name><surname>Kuramitsu</surname><given-names>S</given-names></name>, <name><surname>Yokoyama</surname><given-names>S</given-names></name>, <name><surname>Fu</surname><given-names>Z-Q</given-names></name>, <name><surname>Chrzas</surname><given-names>J</given-names></name>, <name><surname>Rose</surname><given-names>JP</given-names></name>, 
<name><surname>Wang</surname><given-names>B-C</given-names></name>, , <collab>Southeast Collaboratory for Structural Genomics (SECSG), RIKEN Structural Genomics/Proteomics Initiative (RSGI)</collab><year>2007</year><x>, </x><source>Crystal structure of conserved uncharacterized protein PH0987 from Pyrococcus horikoshii</source><x>, </x><object-id pub-id-type="art-access-id">2PHC</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb2phc/pdb">http://dx.doi.org/10.2210/pdb2phc/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro33"><name><surname>Jormakka</surname><given-names>M</given-names></name>, <name><surname>Yokoyama</surname><given-names>K</given-names></name>, <name><surname>Yano</surname><given-names>T</given-names></name>, <name><surname>Tamakoshi</surname><given-names>M</given-names></name>, <name><surname>Akimoto</surname><given-names>S</given-names></name>, <name><surname>Shimamura</surname><given-names>T</given-names></name>, <name><surname>Curmi</surname><given-names>P</given-names></name>, <name><surname>Iwata</surname><given-names>S</given-names></name>, <year>2008</year><x>, </x><source>POLYSULFIDE REDUCTASE NATIVE STRUCTURE</source><x>, </x><object-id pub-id-type="art-access-id">2VPZ</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb2vpz/pdb">http://dx.doi.org/10.2210/pdb2vpz/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro34"><name><surname>Ruprecht</surname><given-names>J</given-names></name>, 
<name><surname>Yankovskaya</surname><given-names>V</given-names></name>, <name><surname>Maklashina</surname><given-names>E</given-names></name>, <name><surname>Iwata</surname><given-names>S</given-names></name>, <name><surname>Cecchini</surname><given-names>G</given-names></name>, <year>2009</year><x>, </x><source>E. COLI SUCCINATE: QUINONE OXIDOREDUCTASE (SQR) WITH CARBOXIN BOUND</source><x>, </x><object-id pub-id-type="art-access-id">2WDQ</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb2wdq/pdb">http://dx.doi.org/10.2210/pdb2wdq/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro35"><name><surname>Kaila</surname><given-names>VRI</given-names></name>, <name><surname>Oksanen</surname><given-names>E</given-names></name>, <name><surname>Goldman</surname><given-names>A</given-names></name>, <name><surname>Verkhovsky</surname><given-names>MI</given-names></name>, <name><surname>Sundholm</surname><given-names>D</given-names></name>, <name><surname>Wikstrom</surname><given-names>M</given-names></name>, <year>2011</year><x>, </x><source>Bovine heart cytochrome c oxidase re-refined with molecular oxygen</source><x>, </x><object-id pub-id-type="art-access-id">2Y69</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb2y69/pdb">http://dx.doi.org/10.2210/pdb2y69/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro36"><name><surname>Yamada</surname><given-names>S</given-names></name>, <name><surname>Sugimoto</surname><given-names>H</given-names></name>, 
<name><surname>Kobayashi</surname><given-names>M</given-names></name>, <name><surname>Ohno</surname><given-names>A</given-names></name>, <name><surname>Nakamura</surname><given-names>H</given-names></name>, <name><surname>Shiro</surname><given-names>Y</given-names></name>, <year>2009</year><x>, </x><source>Crystal structure of histidine kinase ThkA (TM1359) in complex with response regulator protein TrrA (TM1360)</source><x>, </x><object-id pub-id-type="art-access-id">3A0R</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb3a0r/pdb">http://dx.doi.org/10.2210/pdb3a0r/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro37"><name><surname>Yamada</surname><given-names>S</given-names></name>, <name><surname>Sugimoto</surname><given-names>H</given-names></name>, <name><surname>Kobayashi</surname><given-names>M</given-names></name>, <name><surname>Ohno</surname><given-names>A</given-names></name>, <name><surname>Nakamura</surname><given-names>H</given-names></name>, <name><surname>Shiro</surname><given-names>Y</given-names></name>, <year>2009</year><x>, </x><source>Crystal structure of response regulator protein TrrA (TM1360) from Thermotoga maritima in complex with Mg(2+)-BeF (SeMet, L89M)</source><x>, </x><object-id pub-id-type="art-access-id">3A10</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb3a10/pdb">http://dx.doi.org/10.2210/pdb3a10/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro38"><name><surname>Vey</surname><given-names>JL</given-names></name>, 
<name><surname>Drennan</surname><given-names>CL</given-names></name>, <year>2008</year><x>, </x><source>4Fe-4S-Pyruvate formate-lyase Activating Enzyme with partially disordered AdoMet</source><x>, </x><object-id pub-id-type="art-access-id">3C8F</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb3c8f/pdb">http://dx.doi.org/10.2210/pdb3c8f/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro39"><name><surname>Qin</surname><given-names>L</given-names></name>, <name><surname>Mills</surname><given-names>DA</given-names></name>, <name><surname>Buhrow</surname><given-names>L</given-names></name>, <name><surname>Hiser</surname><given-names>C</given-names></name>, <name><surname>Ferguson-Miller</surname><given-names>S</given-names></name>, <year>2008</year><x>, </x><source>Catalytic core subunits (I and II) of cytochrome c oxidase from Rhodobacter sphaeroides complexed with deoxycholic acid</source><x>, </x><object-id pub-id-type="art-access-id">3DTU</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb3dtu/pdb">http://dx.doi.org/10.2210/pdb3dtu/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro40"><name><surname>Yum</surname><given-names>S</given-names></name>, <name><surname>Xu</surname><given-names>Y</given-names></name>, <name><surname>Piao</surname><given-names>S</given-names></name>, <name><surname>Ha</surname><given-names>N-C</given-names></name>, <year>2009</year><x>, </x><source>Crystal structure of E.coli MacA</source><x>, </x><object-id 
pub-id-type="art-access-id">3FPP</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb3fpp/pdb">http://dx.doi.org/10.2210/pdb3fpp/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro41"><name><surname>Miallau</surname><given-names>L</given-names></name>, <name><surname>Cascio</surname><given-names>D</given-names></name>, <name><surname>Eisenberg</surname><given-names>D</given-names></name>, <collab>TB Structural Genomics Consortium (TBSGC), Integrated Center for Structure and Function Innovation (ISFI)</collab>, <year>2009</year><x>, </x><source>The crystal structure of the toxin-antitoxin complex RelBE2 (Rv2865-2866) from Mycobacterium tuberculosis</source><x>, </x><object-id pub-id-type="art-access-id">3G5O</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb3g5o/pdb">http://dx.doi.org/10.2210/pdb3g5o/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro42"><name><surname>Sazanov</surname><given-names>LA</given-names></name>, <name><surname>Berrisford</surname><given-names>JM</given-names></name>, <year>2009</year><x>, </x><source>Crystal structure of the hydrophilic domain of respiratory complex I from Thermus thermophilus, oxidized, 4 mol/ASU, re-refined to 3.15 angstrom resolution</source><x>, </x><object-id pub-id-type="art-access-id">3IAS</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb3ias/pdb">http://dx.doi.org/10.2210/pdb3ias/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data 
Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro43"><name><surname>Nakamura</surname><given-names>A</given-names></name>, <name><surname>Yao</surname><given-names>M</given-names></name>, <name><surname>Tanaka</surname><given-names>I</given-names></name>, <year>2009</year><x>, </x><source>The high resolution structure of GatCAB</source><x>, </x><object-id pub-id-type="art-access-id">3IP4</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb3ip4/pdb">http://dx.doi.org/10.2210/pdb3ip4/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro44"><name><surname>Yu</surname><given-names>S</given-names></name>, <name><surname>Rhee</surname><given-names>S</given-names></name>, <year>2010</year><x>, </x><source>Crystal structure of Immunogenic lipoprotein A from Vibrio vulnificus</source><x>, </x><object-id pub-id-type="art-access-id">3K2D</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb3k2d/pdb">http://dx.doi.org/10.2210/pdb3k2d/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro45"><name><surname>Kaufmann</surname><given-names>M</given-names></name>, <name><surname>Chernishof</surname><given-names>I</given-names></name>, <name><surname>Shin</surname><given-names>A</given-names></name>, <name><surname>Germano</surname><given-names>D</given-names></name>, <name><surname>Sawaya</surname><given-names>MR</given-names></name>, 
<name><surname>Waldo</surname><given-names>GS</given-names></name>, <name><surname>Arbing</surname><given-names>MA</given-names></name>, <name><surname>Perry</surname><given-names>J</given-names></name>, <name><surname>Eisenberg</surname><given-names>D</given-names></name>, <collab>Integrated Center for Structure and Function Innovation (ISFI), TB Structural Genomics Consortium (TBSGC)</collab>, <year>2010</year><x>, </x><source>Allophanate Hydrolase Complex from Mycobacterium smegmatis, Msmeg0435-Msmeg0436</source><x>, </x><object-id pub-id-type="art-access-id">3MML</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb3mml/pdb">http://dx.doi.org/10.2210/pdb3mml/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro46"><name><surname>Cingolani</surname><given-names>G</given-names></name>, <name><surname>Duncan</surname><given-names>TM</given-names></name>, <year>2011</year><x>, </x><source>Structure of the E.coli F1-ATP synthase inhibited by subunit Epsilon</source><x>, </x><object-id pub-id-type="art-access-id">3OAA</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb3oaa/pdb">http://dx.doi.org/10.2210/pdb3oaa/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro47"><name><surname>Shi</surname><given-names>R</given-names></name>, <name><surname>McDonald</surname><given-names>L</given-names></name>, <name><surname>Matte</surname><given-names>A</given-names></name>, <name><surname>Cygler</surname><given-names>M</given-names></name>, 
<name><surname>Ekiel</surname><given-names>I</given-names></name>, <collab>Montreal-Kingston Bacterial Structural Genomics Initiative (BSGI)</collab>, <year>2011</year><x>, </x><source>Crystal Structure of E.coli Dha kinase DhaK</source><x>, </x><object-id pub-id-type="art-access-id">3PNK</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb3pnk/pdb">http://dx.doi.org/10.2210/pdb3pnk/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro48"><name><surname>Shi</surname><given-names>R</given-names></name>, <name><surname>McDonald</surname><given-names>L</given-names></name>, <name><surname>Matte</surname><given-names>A</given-names></name>, <name><surname>Cygler</surname><given-names>M</given-names></name>, <name><surname>Ekiel</surname><given-names>I</given-names></name>, <collab>Montreal-Kingston Bacterial Structural Genomics Initiative (BSGI)</collab>, <year>2011</year><x>, </x><source>Crystal Structure of E.coli Dha kinase DhaK-DhaL complex</source><x>, </x><object-id pub-id-type="art-access-id">3PNL</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb3pnl/pdb">http://dx.doi.org/10.2210/pdb3pnl/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro49"><name><surname>Nocek</surname><given-names>B</given-names></name>, <name><surname>Stein</surname><given-names>A</given-names></name>, <name><surname>Marshall</surname><given-names>N</given-names></name>, <name><surname>Jedrzejczak</surname><given-names>R</given-names></name>, 
<name><surname>Babnigg</surname><given-names>G</given-names></name>, <name><surname>Joachimiak</surname><given-names>A</given-names></name>, <collab>Midwest Center for Structural Genomics (MCSG)</collab>, <year>2011</year><x>, </x><source>Protein-protein complex of subunit 1 and 2 of Molybdopterin-converting factor from Helicobacter pylori 26695</source><x>, </x><object-id pub-id-type="art-access-id">3RPF</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb3rpf/pdb">http://dx.doi.org/10.2210/pdb3rpf/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro50"><name><surname>Nocek</surname><given-names>B</given-names></name>, <name><surname>Stein</surname><given-names>A</given-names></name>, <name><surname>Marshall</surname><given-names>N</given-names></name>, <name><surname>Jedrzejczak</surname><given-names>R</given-names></name>, <name><surname>Babnigg</surname><given-names>G</given-names></name>, <name><surname>Joachimiak</surname><given-names>A</given-names></name>, <collab>Midwest Center for Structural Genomics (MCSG)</collab>, <year>2011</year><x>, </x><source>Complex structure of 3-oxoadipate coA-transferase subunit A and B from Helicobacter pylori 26695</source><x>, </x><object-id pub-id-type="art-access-id">3RRL</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb3rrl/pdb">http://dx.doi.org/10.2210/pdb3rrl/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro51"><name><surname>Johnson</surname><given-names>E</given-names></name>, 
<name><surname>Nguyen</surname><given-names>PT</given-names></name>, <name><surname>Rees</surname><given-names>DC</given-names></name>, <year>2011</year><x>, </x><source>Inward facing conformations of the MetNI methionine ABC transporter: CY5 native crystal form</source><x>, </x><object-id pub-id-type="art-access-id">3TUI</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb3tui/pdb">http://dx.doi.org/10.2210/pdb3tui/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro52"><name><surname>Safro</surname><given-names>M</given-names></name>, <name><surname>Klipcan</surname><given-names>L</given-names></name>, <name><surname>Moor</surname><given-names>N</given-names></name>, <name><surname>Finarov</surname><given-names>I</given-names></name>, <name><surname>Kessler</surname><given-names>N</given-names></name>, <name><surname>Sukhanova</surname><given-names>M</given-names></name>, <year>2011</year><x>, </x><source>Crystal structure of human mitochondrial PheRS complexed with tRNAPhe in the active open state</source><x>, </x><object-id pub-id-type="art-access-id">3TUP</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb3tup/pdb">http://dx.doi.org/10.2210/pdb3tup/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro53"><name><surname>Bulkley</surname><given-names>D</given-names></name>, <name><surname>Johnson</surname><given-names>FA</given-names></name>, <name><surname>Steitz</surname><given-names>TA</given-names></name>, <year>2012</year><x>, </x><source>The structure of thermorubin in 
complex with the 70S ribosome from Thermus thermophilus. This file contains the 50S subunit of one 70S ribosome. The entire crystal structure contains two 70S ribosomes</source><x>, </x><object-id pub-id-type="art-access-id">3UXR</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb3uxr/pdb">http://dx.doi.org/10.2210/pdb3uxr/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro54"><name><surname>Mancusso</surname><given-names>RL</given-names></name>, <name><surname>Gregorio</surname><given-names>GG</given-names></name>, <name><surname>Liu</surname><given-names>Q</given-names></name>, <name><surname>Wang</surname><given-names>DN</given-names></name>, <year>2012</year><x>, </x><source>Crystal Structure of a bacterial dicarboxylate/sodium symporter</source><x>, </x><object-id pub-id-type="art-access-id">4F35</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb4f35/pdb">http://dx.doi.org/10.2210/pdb4f35/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro55"><name><surname>Baradaran</surname><given-names>R</given-names></name>, <name><surname>Berrisford</surname><given-names>JM</given-names></name>, <name><surname>Minhas</surname><given-names>GS</given-names></name>, <name><surname>Sazanov</surname><given-names>LA</given-names></name>, <year>2013</year><x>, </x><source>Crystal structure of the entire respiratory complex I from Thermus thermophilus</source><x>, </x><object-id pub-id-type="art-access-id">4HEA</object-id><x>; </x><ext-link ext-link-type="uri" 
xlink:href="http://dx.doi.org/10.2210/pdb4hea/pdb">http://dx.doi.org/10.2210/pdb4hea/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro56"><name><surname>Broussard</surname><given-names>TC</given-names></name>, <name><surname>Kobe</surname><given-names>MJ</given-names></name>, <name><surname>Pakhomova</surname><given-names>S</given-names></name>, <name><surname>Neau</surname><given-names>DB</given-names></name>, <name><surname>Price</surname><given-names>AE</given-names></name>, <name><surname>Champion</surname><given-names>TS</given-names></name>, <name><surname>Waldrop</surname><given-names>GL</given-names></name>, <year>2013</year><x>, </x><source>Crystal Structure of Biotin Carboxyl Carrier Protein-Biotin Carboxylase Complex from E.coli</source><x>, </x><object-id pub-id-type="art-access-id">4HR7</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2210/pdb4hr7/pdb">http://dx.doi.org/10.2210/pdb4hr7/pdb</ext-link><x>, </x><comment>Publicly available at the RCSB Protein Data Bank.</comment></related-object></p></sec></sec><ref-list><title>References</title><ref id="bib1"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Balakrishnan</surname><given-names>S</given-names></name><name><surname>Kamisetty</surname><given-names>H</given-names></name><name><surname>Carbonell</surname><given-names>JG</given-names></name><name><surname>Lee</surname><given-names>Su-I</given-names></name><name><surname>Langmead</surname><given-names>CJ</given-names></name></person-group><year>2011</year><article-title>Learning generative models for protein fold families</article-title><source>Proteins: Structure, Function, and 
Bioinformatics</source><volume>79</volume><fpage>1061</fpage><lpage>1078</lpage><pub-id pub-id-type="doi">10.1002/prot.22934</pub-id></element-citation></ref><ref id="bib2"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Baradaran</surname><given-names>R</given-names></name><name><surname>Berrisford</surname><given-names>JM</given-names></name><name><surname>Minhas</surname><given-names>GS</given-names></name><name><surname>Sazanov</surname><given-names>LA</given-names></name></person-group><year>2013</year><article-title>Crystal structure of the entire respiratory complex I</article-title><source>Nature</source><volume>494</volume><fpage>443</fpage><lpage>448</lpage><pub-id pub-id-type="doi">10.1038/nature11871</pub-id></element-citation></ref><ref id="bib3"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Barth</surname><given-names>P</given-names></name><name><surname>Schonbrun</surname><given-names>J</given-names></name><name><surname>Baker</surname><given-names>D</given-names></name></person-group><year>2007</year><article-title>Toward high-resolution prediction and design of transmembrane helical protein structures</article-title><source>Proceedings of the National Academy of Sciences of the United States of America</source><volume>104</volume><fpage>15682</fpage><lpage>15687</lpage><pub-id pub-id-type="doi">10.1073/pnas.0702515104</pub-id></element-citation></ref><ref id="bib4"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Becker</surname><given-names>A</given-names></name><name><surname>Kabsch</surname><given-names>W</given-names></name></person-group><year>2002</year><article-title>X-ray structure of pyruvate formate-lyase in complex with pyruvate and CoA. 
How the enzyme uses the Cys-418 thiyl radical for pyruvate cleavage</article-title><source>Journal of Biological Chemistry</source><volume>277</volume><fpage>40036</fpage><lpage>40042</lpage><pub-id pub-id-type="doi">10.1074/jbc.M205821200</pub-id></element-citation></ref><ref id="bib5"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Betts</surname><given-names>MJ</given-names></name><name><surname>Sternberg</surname><given-names>MJE</given-names></name></person-group><year>1999</year><article-title>An analysis of conformational changes on protein–protein association: implications for predictive docking</article-title><source>Protein Engineering</source><volume>12</volume><fpage>271</fpage><lpage>283</lpage><pub-id pub-id-type="doi">10.1093/protein/12.4.271</pub-id></element-citation></ref><ref id="bib6"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bulkley</surname><given-names>D</given-names></name><name><surname>Johnson</surname><given-names>F</given-names></name><name><surname>Steitz</surname><given-names>TA</given-names></name></person-group><year>2012</year><article-title>The antibiotic thermorubin inhibits protein synthesis by binding to inter-subunit bridge B2a of the ribosome</article-title><source>Journal of Molecular Biology</source><volume>416</volume><fpage>571</fpage><lpage>578</lpage><pub-id pub-id-type="doi">10.1016/j.jmb.2011.12.055</pub-id></element-citation></ref><ref id="bib7"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Burger</surname><given-names>L</given-names></name><name><surname>van Nimwegen</surname><given-names>E</given-names></name></person-group><year>2008</year><article-title>Accurate prediction of protein–protein interactions from sequence alignments using a Bayesian method</article-title><source>Molecular Systems 
Biology</source><volume>4</volume><fpage>165</fpage><pub-id pub-id-type="doi">10.1038/msb4100203</pub-id></element-citation></ref><ref id="bib8"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Cong</surname><given-names>Q</given-names></name><name><surname>Grishin</surname><given-names>NV</given-names></name></person-group><year>2012</year><article-title>MESSA: MEta-server for protein sequence analysis</article-title><source>BMC Biology</source><volume>10</volume><fpage>82</fpage><pub-id pub-id-type="doi">10.1186/1741-7007-10-82</pub-id></element-citation></ref><ref id="bib9"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Conway</surname><given-names>P</given-names></name><name><surname>Tyka</surname><given-names>MD</given-names></name><name><surname>DiMaio</surname><given-names>F</given-names></name><name><surname>Konerding</surname><given-names>DE</given-names></name><name><surname>Baker</surname><given-names>D</given-names></name></person-group><year>2014</year><article-title>Relaxation of backbone bond geometry improves protein energy landscape modeling</article-title><source>Protein Science</source><volume>23</volume><fpage>47</fpage><lpage>55</lpage><pub-id pub-id-type="doi">10.1002/pro.2389</pub-id></element-citation></ref><ref id="bib10"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Dago</surname><given-names>AE</given-names></name><name><surname>Schug</surname><given-names>A</given-names></name><name><surname>Procaccini</surname><given-names>A</given-names></name><name><surname>Hoch</surname><given-names>JA</given-names></name><name><surname>Weigt</surname><given-names>M</given-names></name><name><surname>Szurmant</surname><given-names>H</given-names></name></person-group><year>2012</year><article-title>Structural basis of histidine kinase autophosphorylation deduced by integrating genomics, 
molecular dynamics, and mutagenesis</article-title><source>Proceedings of the National Academy of Sciences of the United States of America</source><volume>109</volume><fpage>E1733</fpage><lpage>E1742</lpage><pub-id pub-id-type="doi">10.1073/pnas.1201301109</pub-id></element-citation></ref><ref id="bib11"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Datta</surname><given-names>RS</given-names></name><name><surname>Meacham</surname><given-names>C</given-names></name><name><surname>Samad</surname><given-names>B</given-names></name><name><surname>Neyer</surname><given-names>C</given-names></name><name><surname>Sjölander</surname><given-names>K</given-names></name></person-group><year>2009</year><article-title>Berkeley PHOG: PhyloFacts orthology group prediction web server</article-title><source>Nucleic Acids Research</source><volume>37</volume><fpage>W84</fpage><lpage>W89</lpage><pub-id pub-id-type="doi">10.1093/nar/gkp373</pub-id></element-citation></ref><ref id="bib12"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>de Juan</surname><given-names>D</given-names></name><name><surname>Pazos</surname><given-names>F</given-names></name><name><surname>Valencia</surname><given-names>A</given-names></name></person-group><year>2013</year><article-title>Emerging methods in protein co-evolution</article-title><source>Nature Reviews Genetics</source><volume>14</volume><fpage>249</fpage><lpage>261</lpage><pub-id pub-id-type="doi">10.1038/nrg3414</pub-id></element-citation></ref><ref id="bib13"><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Duhovny</surname><given-names>D</given-names></name><name><surname>Nussinov</surname><given-names>R</given-names></name><name><surname>Wolfson</surname><given-names>HJ</given-names></name></person-group><year>2002</year><article-title>Efficient unbound docking of rigid 
molecules</article-title><source>Algorithms in bioinformatics</source><publisher-name>Springer</publisher-name><publisher-loc>Berlin, Heidelberg</publisher-loc><fpage>185</fpage><lpage>200</lpage><pub-id pub-id-type="doi">10.1007/3-540-45784-4_14</pub-id></element-citation></ref><ref id="bib14"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Dunn</surname><given-names>SD</given-names></name><name><surname>Wahl</surname><given-names>LM</given-names></name><name><surname>Gloor</surname><given-names>GB</given-names></name></person-group><year>2008</year><article-title>Mutual information without the influence of phylogeny or entropy dramatically improves residue contact prediction</article-title><source>Bioinformatics</source><volume>24</volume><fpage>333</fpage><lpage>340</lpage><pub-id pub-id-type="doi">10.1093/bioinformatics/btm604</pub-id></element-citation></ref><ref id="bib15"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Eddy</surname><given-names>SR</given-names></name></person-group><year>2009</year><article-title>A new generation of homology search tools based on probabilistic inference</article-title><source>Genome Informatics</source><volume>23</volume><fpage>205</fpage><lpage>211</lpage></element-citation></ref><ref id="bib16"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ekeberg</surname><given-names>M</given-names></name><name><surname>Lövkvist</surname><given-names>C</given-names></name><name><surname>Lan</surname><given-names>Y</given-names></name><name><surname>Weigt</surname><given-names>M</given-names></name><name><surname>Aurell</surname><given-names>E</given-names></name></person-group><year>2013</year><article-title>Improved contact prediction in proteins: using pseudolikelihoods to infer Potts 
models</article-title><source>Physical Review E</source><volume>87</volume><fpage>012707</fpage><pub-id pub-id-type="doi">10.1103/PhysRevE.87.012707</pub-id></element-citation></ref><ref id="bib17"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Federici</surname><given-names>L</given-names></name><name><surname>Du</surname><given-names>D</given-names></name><name><surname>Walas</surname><given-names>F</given-names></name><name><surname>Matsumura</surname><given-names>H</given-names></name><name><surname>Fernandez-Recio</surname><given-names>J</given-names></name><name><surname>McKeegan</surname><given-names>KS</given-names></name><name><surname>Borges-Walmsley</surname><given-names>MI</given-names></name><name><surname>Luisi</surname><given-names>BF</given-names></name><name><surname>Walmsley</surname><given-names>AR</given-names></name></person-group><year>2005</year><article-title>The crystal structure of the outer membrane protein VceC from the bacterial pathogen <italic>Vibrio cholerae</italic> at 1.8 Å resolution</article-title><source>Journal of Biological Chemistry</source><volume>280</volume><fpage>15307</fpage><lpage>15314</lpage><pub-id pub-id-type="doi">10.1074/jbc.M500401200</pub-id></element-citation></ref><ref id="bib18"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gray</surname><given-names>JJ</given-names></name><name><surname>Moughon</surname><given-names>S</given-names></name><name><surname>Wang</surname><given-names>C</given-names></name><name><surname>Schueler-Furman</surname><given-names>O</given-names></name><name><surname>Kuhlman</surname><given-names>B</given-names></name><name><surname>Rohl</surname><given-names>CA</given-names></name><name><surname>Baker</surname><given-names>D</given-names></name></person-group><year>2003</year><article-title>Protein–protein docking with simultaneous optimization of rigid-body displacement and 
side-chain conformations</article-title><source>Journal of Molecular Biology</source><volume>331</volume><fpage>281</fpage><lpage>299</lpage><pub-id pub-id-type="doi">10.1016/S0022-2836(03)00670-3</pub-id></element-citation></ref><ref id="bib19"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Halperin</surname><given-names>I</given-names></name><name><surname>Wolfson</surname><given-names>H</given-names></name><name><surname>Nussinov</surname><given-names>R</given-names></name></person-group><year>2006</year><article-title>Correlated mutations: Advances and limitations. A study on fusion proteins and on the Cohesin-Dockerin families</article-title><source>Proteins: Structure, Function, and Bioinformatics</source><volume>63</volume><fpage>832</fpage><lpage>845</lpage><pub-id pub-id-type="doi">10.1002/prot.20933</pub-id></element-citation></ref><ref id="bib20"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hopf</surname><given-names>TA</given-names></name><name><surname>Colwell</surname><given-names>LJ</given-names></name><name><surname>Sheridan</surname><given-names>R</given-names></name><name><surname>Rost</surname><given-names>B</given-names></name><name><surname>Sander</surname><given-names>C</given-names></name><name><surname>Marks</surname><given-names>DS</given-names></name></person-group><year>2012</year><article-title>Three-dimensional structures of membrane proteins from genomic sequencing</article-title><source>Cell</source><volume>149</volume><fpage>1607</fpage><lpage>1621</lpage><pub-id pub-id-type="doi">10.1016/j.cell.2012.04.012</pub-id></element-citation></ref><ref id="bib21"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Hopf</surname><given-names>TA</given-names></name><name><surname>Schärfe</surname><given-names>CPI</given-names></name><name><surname>Rodrigues</surname><given-names>JPGLM</given-names></name><name><surname>Green</surname><given-names>AG</given-names></name><name><surname>Sander</surname><given-names>C</given-names></name><name><surname>Bonvin</surname><given-names>AMJJ</given-names></name><name><surname>Marks</surname><given-names>DS</given-names></name></person-group><year>2014</year><article-title>Sequence co-evolution gives 3D contacts and structures of protein complexes</article-title><source>bioRxiv</source><pub-id pub-id-type="doi">10.1101/004762</pub-id></element-citation></ref><ref id="bib21a"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hosur</surname><given-names>R</given-names></name><name><surname>Peng</surname><given-names>J</given-names></name><name><surname>Vinayagam</surname><given-names>A</given-names></name><name><surname>Stelzl</surname><given-names>U</given-names></name><name><surname>Xu</surname><given-names>J</given-names></name><name><surname>Perrimon</surname><given-names>N</given-names></name><name><surname>Bienkowska</surname><given-names>J</given-names></name><name><surname>Berger</surname><given-names>B</given-names></name></person-group><year>2012</year><article-title>A computational framework for boosting confidence in high-throughput protein-protein interaction datasets</article-title><source>Genome Biology</source><volume>13</volume><fpage>R76</fpage><pub-id pub-id-type="doi">10.1186/gb-2012-13-8-r76</pub-id></element-citation></ref><ref id="bib22"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Jacob</surname><given-names>F</given-names></name><name><surname>Perrin</surname><given-names>D</given-names></name><name><surname>Sánchez</surname><given-names>C</given-names></name><name><surname>Monod</surname><given-names>J</given-names></name></person-group><year>2005</year><article-title>L'opéron: groupe de gènes à expression coordonnée par un opérateur [CR Acad. Sci. Paris 250 (1960) 1727–1729]</article-title><source>Comptes Rendus Biologies</source><volume>328</volume><fpage>514</fpage><lpage>520</lpage><pub-id pub-id-type="doi">10.1016/j.crvi.2005.04.005</pub-id></element-citation></ref><ref id="bib52"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Johnson</surname><given-names>E</given-names></name><name><surname>Nguyen</surname><given-names>PT</given-names></name><name><surname>Yeates</surname><given-names>TO</given-names></name><name><surname>Rees</surname><given-names>DC</given-names></name></person-group><year>2012</year><article-title>Inward facing conformations of the MetNI methionine ABC transporter: Implications for the mechanism of transinhibition</article-title><source>Protein Science</source><volume>21</volume><fpage>84</fpage><lpage>96</lpage><pub-id pub-id-type="doi">10.1002/pro.765</pub-id></element-citation></ref><ref id="bib23"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Jones</surname><given-names>DT</given-names></name><name><surname>Buchan</surname><given-names>DWA</given-names></name><name><surname>Cozzetto</surname><given-names>D</given-names></name><name><surname>Pontil</surname><given-names>M</given-names></name></person-group><year>2012</year><article-title>PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence 
alignments</article-title><source>Bioinformatics</source><volume>28</volume><fpage>184</fpage><lpage>190</lpage><pub-id pub-id-type="doi">10.1093/bioinformatics/btr638</pub-id></element-citation></ref><ref id="bib24"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kamisetty</surname><given-names>H</given-names></name><name><surname>Ovchinnikov</surname><given-names>S</given-names></name><name><surname>Baker</surname><given-names>D</given-names></name></person-group><year>2013</year><article-title>Assessing the utility of coevolution-based residue–residue contact predictions in a sequence- and structure-rich era</article-title><source>Proceedings of the National Academy of Sciences of the United States of America</source><volume>110</volume><fpage>15674</fpage><lpage>15679</lpage><pub-id pub-id-type="doi">10.1073/pnas.1319550110</pub-id></element-citation></ref><ref id="bib25"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lange</surname><given-names>OF</given-names></name><name><surname>Rossi</surname><given-names>P</given-names></name><name><surname>Sgourakis</surname><given-names>NG</given-names></name><name><surname>Song</surname><given-names>Y</given-names></name><name><surname>Lee</surname><given-names>HW</given-names></name><name><surname>Aramini</surname><given-names>JM</given-names></name><name><surname>Ertekin</surname><given-names>A</given-names></name><name><surname>Xiao</surname><given-names>R</given-names></name><name><surname>Acton</surname><given-names>TB</given-names></name><name><surname>Montelione</surname><given-names>GT</given-names></name><name><surname>Baker</surname><given-names>D</given-names></name></person-group><year>2012</year><article-title>Determination of solution structures of proteins up to 40 kDa using CS-Rosetta with sparse NMR data from deuterated samples</article-title><source>Proceedings of the National Academy of 
Sciences of the United States of America</source><volume>109</volume><fpage>10873</fpage><lpage>10878</lpage><pub-id pub-id-type="doi">10.1073/pnas.1203013109</pub-id></element-citation></ref><ref id="bib26"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lapedes</surname><given-names>A</given-names></name><name><surname>Giraud</surname><given-names>B</given-names></name><name><surname>Jarzynski</surname><given-names>C</given-names></name></person-group><year>2012</year><article-title>Using sequence alignments to predict protein structure and stability with high accuracy</article-title><ext-link ext-link-type="uri" xlink:href="http://arxiv.org/abs/1207.2484">http://arxiv.org/abs/1207.2484</ext-link></element-citation></ref><ref id="bib27"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lapedes</surname><given-names>AS</given-names></name><name><surname>Giraud</surname><given-names>BG</given-names></name><name><surname>Liu</surname><given-names>LC</given-names></name><name><surname>Stormo</surname><given-names>GD</given-names></name></person-group><year>1999</year><article-title>Correlated mutations in models of protein sequences: phylogenetic and structural effects</article-title><source>Lecture Notes-Monograph Series</source><volume>33</volume><fpage>236</fpage><lpage>256</lpage><pub-id pub-id-type="doi">10.1214/lnms/1215455556</pub-id></element-citation></ref><ref id="bib28"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Long</surname><given-names>F</given-names></name><name><surname>Su</surname><given-names>CC</given-names></name><name><surname>Lei</surname><given-names>HT</given-names></name><name><surname>Bolla</surname><given-names>JR</given-names></name><name><surname>Do</surname><given-names>SV</given-names></name><name><surname>Yu</surname><given-names>EW</given-names></name></person-group><year>2012</year><article-title>Structure and mechanism of the tripartite CusCBA heavy-metal efflux complex</article-title><source>Philosophical Transactions of the Royal Society B: Biological Sciences</source><volume>367</volume><fpage>1047</fpage><lpage>1058</lpage><pub-id pub-id-type="doi">10.1098/rstb.2011.0203</pub-id></element-citation></ref><ref id="bib29"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Mancusso</surname><given-names>R</given-names></name><name><surname>Gregorio</surname><given-names>GG</given-names></name><name><surname>Liu</surname><given-names>Q</given-names></name><name><surname>Wang</surname><given-names>Da-N</given-names></name></person-group><year>2012</year><article-title>Structure and mechanism of a bacterial sodium-dependent dicarboxylate transporter</article-title><source>Nature</source><volume>491</volume><fpage>622</fpage><lpage>626</lpage><pub-id pub-id-type="doi">10.1038/nature11542</pub-id></element-citation></ref><ref id="bib30"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Marks</surname><given-names>DS</given-names></name><name><surname>Colwell</surname><given-names>LJ</given-names></name><name><surname>Sheridan</surname><given-names>R</given-names></name><name><surname>Hopf</surname><given-names>TA</given-names></name><name><surname>Pagnani</surname><given-names>A</given-names></name><name><surname>Zecchina</surname><given-names>R</given-names></name><name><surname>Sander</surname><given-names>C</given-names></name></person-group><year>2011</year><article-title>Protein 3D structure computed from evolutionary sequence variation</article-title><source>PLOS ONE</source><volume>6</volume><fpage>e28766</fpage><pub-id pub-id-type="doi">10.1371/journal.pone.0028766</pub-id></element-citation></ref><ref id="bib31"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Marks</surname><given-names>DS</given-names></name><name><surname>Hopf</surname><given-names>TA</given-names></name><name><surname>Sander</surname><given-names>C</given-names></name></person-group><year>2012</year><article-title>Protein structure prediction from sequence variation</article-title><source>Nature Biotechnology</source><volume>30</volume><fpage>1072</fpage><lpage>1080</lpage><pub-id pub-id-type="doi">10.1038/nbt.2419</pub-id></element-citation></ref><ref id="bib32"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Morcos</surname><given-names>F</given-names></name><name><surname>Jana</surname><given-names>B</given-names></name><name><surname>Hwa</surname><given-names>T</given-names></name><name><surname>Onuchic</surname><given-names>JN</given-names></name></person-group><year>2013</year><article-title>Coevolutionary signals across protein lineages help capture multiple protein conformations</article-title><source>Proceedings of the National Academy of Sciences of the United States of 
America</source><volume>110</volume><fpage>20533</fpage><lpage>20538</lpage><pub-id pub-id-type="doi">10.1073/pnas.1315625110</pub-id></element-citation></ref><ref id="bib33"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Morcos</surname><given-names>F</given-names></name><name><surname>Pagnani</surname><given-names>A</given-names></name><name><surname>Lunt</surname><given-names>B</given-names></name><name><surname>Bertolino</surname><given-names>A</given-names></name><name><surname>Marks</surname><given-names>DS</given-names></name><name><surname>Sander</surname><given-names>C</given-names></name><name><surname>Zecchina</surname><given-names>R</given-names></name><name><surname>Onuchic</surname><given-names>JN</given-names></name><name><surname>Hwa</surname><given-names>T</given-names></name><name><surname>Weigt</surname><given-names>M</given-names></name></person-group><year>2011</year><article-title>Direct-coupling analysis of residue coevolution captures native contacts across many protein families</article-title><source>Proceedings of the National Academy of Sciences of the United States of America</source><volume>108</volume><fpage>E1293</fpage><lpage>E1301</lpage><pub-id pub-id-type="doi">10.1073/pnas.1111471108</pub-id></element-citation></ref><ref id="bib34"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Mulligan</surname><given-names>C</given-names></name><name><surname>Fischer</surname><given-names>M</given-names></name><name><surname>Thomas</surname><given-names>GH</given-names></name></person-group><year>2011</year><article-title>Tripartite ATP-independent periplasmic (TRAP) transporters in bacteria and archaea</article-title><source>FEMS Microbiology Reviews</source><volume>35</volume><fpage>68</fpage><lpage>86</lpage><pub-id pub-id-type="doi">10.1111/j.1574-6976.2010.00236.x</pub-id></element-citation></ref><ref id="bib35"><element-citation 
publication-type="journal"><person-group person-group-type="author"><name><surname>Nakamura</surname><given-names>A</given-names></name><name><surname>Sheppard</surname><given-names>K</given-names></name><name><surname>Yamane</surname><given-names>J</given-names></name><name><surname>Yao</surname><given-names>M</given-names></name><name><surname>Söll</surname><given-names>D</given-names></name><name><surname>Tanaka</surname><given-names>I</given-names></name></person-group><year>2010</year><article-title>Two distinct regions in Staphylococcus aureus GatCAB guarantee accurate tRNA recognition</article-title><source>Nucleic Acids Research</source><volume>38</volume><fpage>672</fpage><lpage>682</lpage><pub-id pub-id-type="doi">10.1093/nar/gkp955</pub-id></element-citation></ref><ref id="bib36"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Nugent</surname><given-names>T</given-names></name><name><surname>Jones</surname><given-names>DT</given-names></name></person-group><year>2012</year><article-title>Accurate de novo structure prediction of large transmembrane protein domains using fragment-assembly and correlated mutation analysis</article-title><source>Proceedings of the National Academy of Sciences of the United States of America</source><volume>109</volume><fpage>E1540</fpage><lpage>E1547</lpage><pub-id pub-id-type="doi">10.1073/pnas.1120036109</pub-id></element-citation></ref><ref id="bib37"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ochoa</surname><given-names>D</given-names></name><name><surname>Pazos</surname><given-names>F</given-names></name></person-group><year>2010</year><article-title>Studying the co-evolution of protein families with the Mirrortree web server</article-title><source>Bioinformatics</source><volume>26</volume><fpage>1370</fpage><lpage>1371</lpage><pub-id 
pub-id-type="doi">10.1093/bioinformatics/btq137</pub-id></element-citation></ref><ref id="bib38"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ovchinnikov</surname><given-names>S</given-names></name><name><surname>Kamisetty</surname><given-names>H</given-names></name><name><surname>Baker</surname><given-names>D</given-names></name></person-group><year>2014</year><article-title>Data from: Robust and accurate prediction of residue-residue interactions across protein interfaces using evolutionary information</article-title><source>Dryad Digital Repository</source><pub-id pub-id-type="doi">10.5061/dryad.s00vr</pub-id></element-citation></ref><ref id="bib39"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Pazos</surname><given-names>F</given-names></name><name><surname>Helmer-Citterich</surname><given-names>M</given-names></name><name><surname>Ausiello</surname><given-names>G</given-names></name><name><surname>Valencia</surname><given-names>A</given-names></name></person-group><year>1997</year><article-title>Correlated mutations contain information about protein-protein interaction</article-title><source>Journal of Molecular Biology</source><volume>271</volume><fpage>511</fpage><lpage>523</lpage><pub-id pub-id-type="doi">10.1006/jmbi.1997.1198</pub-id></element-citation></ref><ref id="bib40"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Raman</surname><given-names>S</given-names></name><name><surname>Vernon</surname><given-names>R</given-names></name><name><surname>Thompson</surname><given-names>J</given-names></name><name><surname>Tyka</surname><given-names>M</given-names></name><name><surname>Sadreyev</surname><given-names>R</given-names></name><name><surname>Pei</surname><given-names>J</given-names></name><name><surname>Kim</surname><given-names>D</given-names></name><name><surname>Kellogg</surname><given-names>E</given-names></name><name><surname>DiMaio</surname><given-names>F</given-names></name><name><surname>Lange</surname><given-names>O</given-names></name><name><surname>Kinch</surname><given-names>L</given-names></name><name><surname>Sheffler</surname><given-names>W</given-names></name><name><surname>Kim</surname><given-names>BH</given-names></name><name><surname>Das</surname><given-names>R</given-names></name><name><surname>Grishin</surname><given-names>NV</given-names></name><name><surname>Baker</surname><given-names>D</given-names></name></person-group><year>2009</year><article-title>Structure prediction for CASP8 with all‐atom refinement using Rosetta</article-title><source>Proteins: Structure, Function, and Bioinformatics</source><volume>77</volume><fpage>89</fpage><lpage>99</lpage><pub-id pub-id-type="doi">10.1002/prot.22540</pub-id></element-citation></ref><ref id="bib41"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Remm</surname><given-names>M</given-names></name><name><surname>Storm</surname><given-names>CEV</given-names></name><name><surname>Sonnhammer</surname><given-names>ELL</given-names></name></person-group><year>2001</year><article-title>Automatic clustering of orthologs and in-paralogs from pairwise species comparisons</article-title><source>Journal of Molecular Biology</source><volume>314</volume><fpage>1041</fpage><lpage>1052</lpage><pub-id 
pub-id-type="doi">10.1006/jmbi.2000.5197</pub-id></element-citation></ref><ref id="bib42"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Remmert</surname><given-names>M</given-names></name><name><surname>Biegert</surname><given-names>A</given-names></name><name><surname>Hauser</surname><given-names>A</given-names></name><name><surname>Söding</surname><given-names>J</given-names></name></person-group><year>2011</year><article-title>HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment</article-title><source>Nature Methods</source><volume>9</volume><fpage>173</fpage><lpage>175</lpage><pub-id pub-id-type="doi">10.1038/nmeth.1818</pub-id></element-citation></ref><ref id="bib43"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Schug</surname><given-names>A</given-names></name><name><surname>Weigt</surname><given-names>M</given-names></name><name><surname>Onuchic</surname><given-names>JN</given-names></name><name><surname>Hwa</surname><given-names>T</given-names></name><name><surname>Szurmant</surname><given-names>H</given-names></name></person-group><year>2009</year><article-title>High-resolution protein complexes from integrating genomic information with molecular simulation</article-title><source>Proceedings of the National Academy of Sciences of the United States of America</source><volume>106</volume><fpage>22124</fpage><lpage>22129</lpage><pub-id pub-id-type="doi">10.1073/pnas.0912100106</pub-id></element-citation></ref><ref id="bib44"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Shoemaker</surname><given-names>BA</given-names></name><name><surname>Panchenko</surname><given-names>AR</given-names></name></person-group><year>2007</year><article-title>Deciphering protein–protein interactions. Part II. 
Computational methods to predict protein and domain interaction partners</article-title><source>PLOS Computational Biology</source><volume>3</volume><fpage>e43</fpage><pub-id pub-id-type="doi">10.1371/journal.pcbi.0030043</pub-id></element-citation></ref><ref id="bib45"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sievers</surname><given-names>F</given-names></name><name><surname>Wilm</surname><given-names>A</given-names></name><name><surname>Dineen</surname><given-names>D</given-names></name><name><surname>Gibson</surname><given-names>TJ</given-names></name><name><surname>Karplus</surname><given-names>K</given-names></name><name><surname>Li</surname><given-names>W</given-names></name><name><surname>Lopez</surname><given-names>R</given-names></name><name><surname>McWilliam</surname><given-names>H</given-names></name><name><surname>Remmert</surname><given-names>M</given-names></name><name><surname>Söding</surname><given-names>J</given-names></name><name><surname>Thompson</surname><given-names>JD</given-names></name><name><surname>Higgins</surname><given-names>DG</given-names></name></person-group><year>2011</year><article-title>Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega</article-title><source>Molecular Systems Biology</source><volume>7</volume><fpage>539</fpage><pub-id pub-id-type="doi">10.1038/msb.2011.75</pub-id></element-citation></ref><ref id="bib46"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Simons</surname><given-names>KT</given-names></name><name><surname>Ruczinski</surname><given-names>I</given-names></name><name><surname>Kooperberg</surname><given-names>C</given-names></name><name><surname>Fox</surname><given-names>BA</given-names></name><name><surname>Bystroff</surname><given-names>C</given-names></name><name><surname>Baker</surname><given-names>D</given-names></name></person-group><year>1999</year><article-title>Improved recognition of native-like protein structures using a combination of sequence‐dependent and sequence‐independent features of proteins</article-title><source>Proteins: Structure, Function, and Bioinformatics</source><volume>34</volume><fpage>82</fpage><lpage>95</lpage><pub-id pub-id-type="doi">10.1002/(SICI)1097-0134(19990101)34:1&lt;82::AID-PROT7&gt;3.0.CO;2-A</pub-id></element-citation></ref><ref id="bib47"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Song</surname><given-names>Y</given-names></name><name><surname>DiMaio</surname><given-names>F</given-names></name><name><surname>Wang</surname><given-names>RYR</given-names></name><name><surname>Kim</surname><given-names>D</given-names></name><name><surname>Miles</surname><given-names>C</given-names></name><name><surname>Brunette</surname><given-names>TJ</given-names></name><name><surname>Thompson</surname><given-names>J</given-names></name><name><surname>Baker</surname><given-names>D</given-names></name></person-group><year>2013</year><article-title>High-Resolution comparative modeling with RosettaCM</article-title><source>Structure</source><volume>21</volume><fpage>1735</fpage><lpage>1742</lpage><pub-id pub-id-type="doi">10.1016/j.str.2013.08.005</pub-id></element-citation></ref><ref id="bib48"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Suhre</surname><given-names>K</given-names></name><name><surname>Claverie</surname><given-names>JM</given-names></name></person-group><year>2004</year><article-title>FusionDB: a database for in-depth analysis of prokaryotic gene fusion events</article-title><source>Nucleic Acids Research</source><volume>32</volume><fpage>D273</fpage><lpage>D276</lpage><pub-id pub-id-type="doi">10.1093/nar/gkh053</pub-id></element-citation></ref><ref id="bib49"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sułkowska</surname><given-names>JI</given-names></name><name><surname>Morcos</surname><given-names>F</given-names></name><name><surname>Weigt</surname><given-names>M</given-names></name><name><surname>Hwa</surname><given-names>T</given-names></name><name><surname>Onuchic</surname><given-names>JN</given-names></name></person-group><year>2012</year><article-title>Genomics-aided structure prediction</article-title><source>Proceedings of the National Academy of Sciences of the United States of America</source><volume>109</volume><fpage>10340</fpage><lpage>10345</lpage><pub-id pub-id-type="doi">10.1073/pnas.1207864109</pub-id></element-citation></ref><ref id="bib50"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Tamir</surname><given-names>S</given-names></name><name><surname>Rotem-Bamberger</surname><given-names>S</given-names></name><name><surname>Katz</surname><given-names>C</given-names></name><name><surname>Morcos</surname><given-names>F</given-names></name><name><surname>Hailey</surname><given-names>KL</given-names></name><name><surname>Zuris</surname><given-names>JA</given-names></name><name><surname>Wang</surname><given-names>C</given-names></name><name><surname>Conlan</surname><given-names>AR</given-names></name><name><surname>Lipper</surname><given-names>CH</given-names></name><name><surname>Paddock</surname><given-names>ML</given-names></name><name><surname>Mittler</surname><given-names>R</given-names></name><name><surname>Onuchic</surname><given-names>JN</given-names></name><name><surname>Jennings</surname><given-names>PA</given-names></name><name><surname>Friedler</surname><given-names>A</given-names></name><name><surname>Nechushtai</surname><given-names>R</given-names></name></person-group><year>2014</year><article-title>Integrated strategy reveals the protein interface between cancer targets Bcl-2 and NAF-1</article-title><source>Proceedings of the National Academy of Sciences of the United States of America</source><volume>111</volume><fpage>5177</fpage><lpage>5182</lpage><pub-id pub-id-type="doi">10.1073/pnas.1403770111</pub-id></element-citation></ref><ref id="bib51"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Thomas</surname><given-names>J</given-names></name><name><surname>Ramakrishnan</surname><given-names>N</given-names></name><name><surname>Bailey-Kellogg</surname><given-names>C</given-names></name></person-group><year>2008</year><article-title>Graphical models of residue coupling in protein families</article-title><source>IEEE/ACM Transactions on Computational Biology and Bioinformatics 
(TCBB)</source><volume>5</volume><fpage>183</fpage><lpage>197</lpage><pub-id pub-id-type="doi">10.1109/TCBB.2007.70225</pub-id></element-citation></ref><ref id="bib53"><element-citation publication-type="web"><person-group person-group-type="author"><collab>UniProt Accession</collab></person-group><article-title>UniProt User manual</article-title><ext-link ext-link-type="uri" xlink:href="http://www.uniprot.org/manual/accession_numbers">http://www.uniprot.org/manual/accession_numbers</ext-link><comment>Accessed September 9, 2013</comment></element-citation></ref><ref id="bib54"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Valencia</surname><given-names>A</given-names></name><name><surname>Pazos</surname><given-names>F</given-names></name></person-group><year>2002</year><article-title>Computational methods for the prediction of protein interactions</article-title><source>Current Opinion in Structural Biology</source><volume>12</volume><fpage>368</fpage><lpage>373</lpage><pub-id pub-id-type="doi">10.1016/S0959-440X(02)00333-0</pub-id></element-citation></ref><ref id="bib55"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Vey</surname><given-names>JL</given-names></name><name><surname>Yang</surname><given-names>J</given-names></name><name><surname>Li</surname><given-names>M</given-names></name><name><surname>Broderick</surname><given-names>WE</given-names></name><name><surname>Broderick</surname><given-names>JB</given-names></name><name><surname>Drennan</surname><given-names>CL</given-names></name></person-group><year>2008</year><article-title>Structural basis for glycyl radical formation by pyruvate formate-lyase activating enzyme</article-title><source>Proceedings of the National Academy of Sciences of the United States of America</source><volume>105</volume><fpage>16137</fpage><lpage>16141</lpage><pub-id 
pub-id-type="doi">10.1073/pnas.0806640105</pub-id></element-citation></ref><ref id="bib56"><element-citation publication-type="web"><person-group person-group-type="author"><name><surname>Wang</surname><given-names>G</given-names></name><name><surname>Dunbrack</surname><given-names>RL</given-names><suffix>Jr</suffix></name></person-group><article-title>S2C: a database correlating sequence and atomic coordinate residue numbering in the Protein Data Bank</article-title><comment>Dunbrack Lab. <ext-link ext-link-type="uri" xlink:href="http://dunbrack.fccc.edu/s2c/">http://dunbrack.fccc.edu/s2c/</ext-link>. Accessed October 12, 2013</comment></element-citation></ref><ref id="bib57"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Weigt</surname><given-names>M</given-names></name><name><surname>White</surname><given-names>RA</given-names></name><name><surname>Szurmant</surname><given-names>H</given-names></name><name><surname>Hoch</surname><given-names>JA</given-names></name><name><surname>Hwa</surname><given-names>T</given-names></name></person-group><year>2009</year><article-title>Identification of direct residue contacts in protein–protein interaction by message passing</article-title><source>Proceedings of the National Academy of Sciences of the United States of America</source><volume>106</volume><fpage>67</fpage><lpage>72</lpage><pub-id pub-id-type="doi">10.1073/pnas.0805923106</pub-id></element-citation></ref><ref id="bib58"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Wu</surname><given-names>CH</given-names></name><name><surname>Apweiler</surname><given-names>R</given-names></name><name><surname>Bairoch</surname><given-names>A</given-names></name><name><surname>Natale</surname><given-names>DA</given-names></name><name><surname>Barker</surname><given-names>WC</given-names></name><name><surname>Boeckmann</surname><given-names>B</given-names></name><name><surname>Ferro</surname><given-names>S</given-names></name><name><surname>Gasteiger</surname><given-names>E</given-names></name><name><surname>Huang</surname><given-names>H</given-names></name><name><surname>Lopez</surname><given-names>R</given-names></name><name><surname>Magrane</surname><given-names>M</given-names></name><name><surname>Martin</surname><given-names>MJ</given-names></name><name><surname>Mazumder</surname><given-names>R</given-names></name><name><surname>O'Donovan</surname><given-names>C</given-names></name><name><surname>Redaschi</surname><given-names>N</given-names></name><name><surname>Suzek</surname><given-names>B</given-names></name></person-group><year>2006</year><article-title>The Universal Protein Resource (UniProt): an expanding universe of protein information</article-title><source>Nucleic Acids Research</source><volume>34</volume><fpage>D187</fpage><lpage>D191</lpage><pub-id pub-id-type="doi">10.1093/nar/gkj161</pub-id></element-citation></ref><ref id="bib59"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Yarov-Yarovoy</surname><given-names>V</given-names></name><name><surname>DeCaen</surname><given-names>PG</given-names></name><name><surname>Westenbroek</surname><given-names>RE</given-names></name><name><surname>Pan</surname><given-names>CY</given-names></name><name><surname>Scheuer</surname><given-names>T</given-names></name><name><surname>Baker</surname><given-names>D</given-names></name><name><surname>Catterall</surname><given-names>WA</given-names></name></person-group><year>2012</year><article-title>Structural basis for gating charge movement in the voltage sensor of a sodium channel</article-title><source>Proceedings of the National Academy of Sciences of the United States of America</source><volume>109</volume><fpage>E93</fpage><lpage>E102</lpage><pub-id pub-id-type="doi">10.1073/pnas.1118434109</pub-id></element-citation></ref><ref id="bib60"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Yu</surname><given-names>S</given-names></name><name><surname>Yeon Lee</surname><given-names>N</given-names></name><name><surname>Park</surname><given-names>SJ</given-names></name><name><surname>Rhee</surname><given-names>S</given-names></name></person-group><year>2011</year><article-title>Crystal structure of toll-like receptor 2-activating lipoprotein IIpA from Vibrio vulnificus</article-title><source>Proteins: Structure, Function, and Bioinformatics</source><volume>79</volume><fpage>1020</fpage><lpage>1025</lpage><pub-id pub-id-type="doi">10.1002/prot.22929</pub-id></element-citation></ref><ref id="bib61"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Yum</surname><given-names>S</given-names></name><name><surname>Xu</surname><given-names>Y</given-names></name><name><surname>Piao</surname><given-names>S</given-names></name><name><surname>Sim</surname><given-names>SH</given-names></name><name><surname>Kim</surname><given-names>HM</given-names></name><name><surname>Jo</surname><given-names>WS</given-names></name><name><surname>Kim</surname><given-names>KJ</given-names></name><name><surname>Kweon</surname><given-names>HS</given-names></name><name><surname>Jeong</surname><given-names>MH</given-names></name><name><surname>Jeon</surname><given-names>H</given-names></name><name><surname>Lee</surname><given-names>K</given-names></name><name><surname>Ha</surname><given-names>NC</given-names></name></person-group><year>2009</year><article-title>Crystal structure of the periplasmic component of a tripartite macrolide-specific efflux pump</article-title><source>Journal of Molecular Biology</source><volume>387</volume><fpage>1286</fpage><lpage>1297</lpage><pub-id pub-id-type="doi">10.1016/j.jmb.2009.02.048</pub-id></element-citation></ref><ref id="bib62"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Zhang</surname><given-names>QC</given-names></name><name><surname>Petrey</surname><given-names>D</given-names></name><name><surname>Deng</surname><given-names>L</given-names></name><name><surname>Qiang</surname><given-names>L</given-names></name><name><surname>Shi</surname><given-names>Y</given-names></name><name><surname>Thu</surname><given-names>CA</given-names></name><name><surname>Bisikirska</surname><given-names>B</given-names></name><name><surname>Lefebvre</surname><given-names>C</given-names></name><name><surname>Accili</surname><given-names>D</given-names></name><name><surname>Hunter</surname><given-names>T</given-names></name><name><surname>Maniatis</surname><given-names>T</given-names></name><name><surname>Califano</surname><given-names>A</given-names></name><name><surname>Honig</surname><given-names>B</given-names></name></person-group><year>2012</year><article-title>Structure-based prediction of protein-protein interactions on a genome-wide scale</article-title><source>Nature</source><volume>490</volume><fpage>556</fpage><lpage>560</lpage><pub-id pub-id-type="doi">10.1038/nature11503</pub-id></element-citation></ref><ref id="bib63"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname><given-names>J</given-names></name><name><surname>Rudd</surname><given-names>KE</given-names></name></person-group><year>2013</year><article-title>EcoGene 3.0</article-title><source>Nucleic Acids Research</source><volume>41</volume><fpage>D613</fpage><lpage>D624</lpage><pub-id pub-id-type="doi">10.1093/nar/gks1235</pub-id></element-citation></ref></ref-list></back><sub-article article-type="article-commentary" id="SA1"><front-stub><article-id pub-id-type="doi">10.7554/eLife.02030.013</article-id><title-group><article-title>Decision letter</article-title></title-group><contrib-group content-type="section"><contrib 
contrib-type="editor"><name><surname>Roux</surname><given-names>Benoit</given-names></name><role>Reviewing editor</role><aff><institution>University of Chicago</institution>, <country>United States</country></aff></contrib></contrib-group></front-stub><body><boxed-text><p>eLife posts the editorial decision letter and author response on a selection of the published articles (subject to the approval of the authors). An edited version of the letter sent to the authors after peer review is shown, indicating the substantive concerns or comments; minor concerns are not usually shown. Reviewers have the opportunity to discuss the decision before the letter is sent (see <ext-link ext-link-type="uri" xlink:href="http://elifesciences.org/review-process">review process</ext-link>). Similarly, the author response typically shows only responses to the major concerns raised by the reviewers.</p></boxed-text><p>Thank you for sending your work entitled “Robust and accurate prediction of residue-residue interactions across protein interfaces using evolutionary information” for consideration at <italic>eLife</italic>. Despite some concerns, your article has been favorably evaluated by a Senior editor (John Kuriyan), a Reviewing editor, and 2 reviewers. To minimize delays, you should first respond by email to describe how you plan to revise the article before submitting a full revision. We are hopeful that by your submitting a detailed response this can be discussed by the editors and the reviewers, and that a revised manuscript need not then be submitted to re-review.</p><p>The Reviewing editor and the reviewers discussed their comments, and there is consensus that this is an interesting and potentially very influential manuscript using evolution methods to improve both protein structure prediction as well as protein-protein interactions. 
The manuscript describes the application of the pseudo-likelihood approach, first described by <xref ref-type="bibr" rid="bib1">Balakrishnan et al. (2011)</xref>, to the determination of interacting residues across protein-protein interfaces. The coupling information is then used for the construction of docked models for a number of biologically important systems. The results seem to be very good and interesting, at least at a qualitative level. There are, however, some concerns:</p><p>1) Why were structural models built only for the unknown complexes, and why not for the ones with known structure? Quantitative results for at least some of the complexes with known structure would be helpful to better understand the accuracy of the resulting models and the impact of distance constraints.</p><p>2) There is not sufficient quantitative evidence demonstrating that the method described here is far superior to previous methods such as mfDCA and plDCA.</p><p>3) It is somewhat unfortunate that the paper as currently written overstates the novelty of the work, including unnecessary and unsubstantiated criticism of previous methods. Understandably, the authors did not want to repeat the discussion of the algorithm given previously by Balakrishnan et al. and also briefly in their PNAS paper (<xref ref-type="bibr" rid="bib24">Kamisetty et al., 2013</xref>). However, considering that the primary readers of <italic>eLife</italic> are most likely biologists, it might be useful to add a few sentences in the Introduction about the innovative nature of the pseudo-likelihood approach, and why is it better than the earlier methods. More generally, there are many places in the paper where some more explanation would help the readers. 
Note that <italic>eLife</italic> has no limitation on the length of the paper, and hence it is possible to improve readability.</p></body></sub-article><sub-article article-type="reply" id="SA2"><front-stub><article-id pub-id-type="doi">10.7554/eLife.02030.014</article-id><title-group><article-title>Author response</article-title></title-group></front-stub><body><p><italic>1) Why were structural models built only for the unknown complexes, and why not for the ones with known structure? Quantitative results for at least some of the complexes with known structure would be helpful to better understand the accuracy of the resulting models and the impact of distance constraints</italic>.</p><p>This is an excellent suggestion. We have carried out a detailed analysis of docking results on proteins of known structure, and we will include this in the revised version. Briefly, in 30 of the 33 cases where there were at least two predicted contacts with scores greater than 0.6, the docking model with the best GREMLIN score was within 2 Å RMSD of the experimentally determined complex structure. We will point out, however, that this likely overestimates the accuracy of our models of the unknown complexes, since docking of bound structures (taken from the structure of the complex) is generally more accurate than docking of unbound structures.</p><p><italic>2) There is not sufficient quantitative evidence demonstrating that the method described here is far superior to previous methods such as mfDCA and plDCA</italic>.</p><p>The demonstration that the pseudo-likelihood approach is quantitatively better than previous methods was presented in the two previous papers on this approach. This is not the focus of this paper, as should be clear from the Abstract. The focus of this paper is, instead, the demonstration that co-evolution data can be used to generate reliable models of a wide range of protein-protein complexes of biological interest. 
We go very considerably beyond previous work on this topic, which has focused on the Sensor histidine kinase+Response Regulator (SK/RR) two-component systems.</p><p><italic>3) It is somewhat unfortunate that the paper as currently written overstates the novelty of the work, including unnecessary and unsubstantiated criticism of previous methods. Understandably, the authors did not want to repeat the discussion of the algorithm given previously by Balakrishnan et al. and also briefly in their PNAS paper (</italic><xref ref-type="bibr" rid="bib24"><italic>Kamisetty et al., 2013</italic></xref><italic>). However, considering that the primary readers of</italic> eLife <italic>are most likely biologists, it might be useful to add a few sentences in the Introduction about the innovative nature of the pseudo-likelihood approach, and why is it better than the earlier methods. More generally, there are many places in the paper where some more explanation would help the readers. Note that</italic> eLife <italic>has no limitation on the length of the paper, and hence it is possible to improve readability</italic>.</p><p>The contribution of this paper is not the development of the pseudo-likelihood method, but the study of a large set of protein complexes of significant biological interest. We will provide more explanation of the innovative nature of the approach in the methods section. Where necessary, we will refer to the relevant sections of our PNAS paper (<xref ref-type="bibr" rid="bib24">Kamisetty et al., 2013</xref>) for technical details.</p></body></sub-article></article>