<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.1d1 20130915//EN" "JATS-archivearticle1.dtd"><article article-type="research-article" dtd-version="1.1d1" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><front><journal-meta><journal-id journal-id-type="nlm-ta">eLife</journal-id><journal-id journal-id-type="hwp">elife</journal-id><journal-id journal-id-type="publisher-id">eLife</journal-id><journal-title-group><journal-title>eLife</journal-title></journal-title-group><issn publication-format="electronic">2050-084X</issn><publisher><publisher-name>eLife Sciences Publications, Ltd</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="publisher-id">00311</article-id><article-id pub-id-type="doi">10.7554/eLife.00311</article-id><article-categories><subj-group subj-group-type="display-channel"><subject>Research article</subject></subj-group><subj-group subj-group-type="heading"><subject>Biophysics and structural biology</subject></subj-group></article-categories><title-group><article-title>Modelling dynamics in protein crystal structures by ensemble refinement</article-title></title-group><contrib-group><contrib contrib-type="author" id="author-2554"><name><surname>Burnley</surname><given-names>B Tom</given-names></name><xref ref-type="aff" rid="aff1"/><xref ref-type="fn" rid="con1"/><xref ref-type="fn" rid="conf1"/><xref ref-type="other" rid="dataro1"/></contrib><contrib contrib-type="author" id="author-2555"><name><surname>Afonine</surname><given-names>Pavel V</given-names></name><xref ref-type="aff" rid="aff2"/><xref ref-type="fn" rid="con2"/><xref ref-type="fn" rid="conf1"/><xref ref-type="other" rid="dataro1"/></contrib><contrib contrib-type="author" id="author-2556"><name><surname>Adams</surname><given-names>Paul D</given-names></name><xref ref-type="aff" rid="aff2"/><xref ref-type="aff" rid="aff3"/><xref 
ref-type="other" rid="par-2"/><xref ref-type="fn" rid="con3"/><xref ref-type="fn" rid="conf1"/><xref ref-type="other" rid="dataro1"/></contrib><contrib contrib-type="author" corresp="yes" id="author-2524"><name><surname>Gros</surname><given-names>Piet</given-names></name><xref ref-type="aff" rid="aff1"/><xref ref-type="corresp" rid="cor1">*</xref><xref ref-type="other" rid="par-1"/><xref ref-type="other" rid="par-3"/><xref ref-type="fn" rid="con4"/><xref ref-type="fn" rid="conf1"/><xref ref-type="other" rid="dataro1"/></contrib><aff id="aff1"><institution content-type="dept">Crystal and Structural Chemistry, Bijvoet Center for Biomolecular Research, Department of Chemistry, Faculty of Science</institution>, <institution>Utrecht University</institution>, <addr-line><named-content content-type="city">Utrecht</named-content></addr-line>, <country>The Netherlands</country></aff><aff id="aff2"><institution>Lawrence Berkeley National Laboratory</institution>, <addr-line><named-content content-type="city">Berkeley</named-content></addr-line>, <country>United States</country></aff><aff id="aff3"><institution content-type="dept">Department of Bioengineering</institution>, <institution>University of California Berkeley</institution>, <addr-line><named-content content-type="city">Berkeley</named-content></addr-line>, <country>United States</country></aff></contrib-group><contrib-group content-type="section"><contrib contrib-type="editor"><name><surname>Brunger</surname><given-names>Axel T</given-names></name><role>Reviewing editor</role><aff><institution>Howard Hughes Medical Institute, Stanford University</institution>, <country>United States</country></aff></contrib></contrib-group><author-notes><corresp id="cor1"><label>*</label>For correspondence: <email>p.gros@uu.nl</email></corresp></author-notes><pub-date date-type="pub" publication-format="electronic"><day>18</day><month>12</month><year>2012</year></pub-date><pub-date 
pub-type="collection"><year>2012</year></pub-date><volume>1</volume><elocation-id>e00311</elocation-id><history><date date-type="received"><day>08</day><month>10</month><year>2012</year></date><date date-type="accepted"><day>23</day><month>10</month><year>2012</year></date></history><permissions><license xlink:href="http://creativecommons.org/publicdomain/zero/1.0/"><license-p>This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/publicdomain/zero/1.0/">Creative Commons CC0</ext-link> public domain dedication.</license-p></license></permissions><self-uri content-type="pdf" xlink:href="elife00311.pdf"/><abstract><object-id pub-id-type="doi">10.7554/eLife.00311.001</object-id><p>Single-structure models derived from X-ray data do not adequately account for the inherent, functionally important dynamics of protein molecules. We generated ensembles of structures by time-averaged refinement, where local molecular vibrations were sampled by molecular-dynamics (MD) simulation whilst global disorder was partitioned into an underlying overall translation–libration–screw (TLS) model. Modeling of 20 protein datasets at 1.1–3.1 Å resolution reduced cross-validated <italic>R</italic><sub><italic>free</italic></sub> values by 0.3–4.9%, indicating that ensemble models fit the X-ray data better than single structures. The ensembles revealed that, while most proteins display a well-ordered core, some proteins exhibit a ‘molten core’ likely supporting functionally important dynamics in ligand binding, enzyme activity and protomer assembly. Order–disorder changes in HIV protease indicate a mechanism of entropy compensation for ordering the catalytic residues upon ligand binding by disordering specific core residues. 
Thus, ensemble refinement extracts dynamical details from the X-ray data that allow a more comprehensive understanding of structure–dynamics–function relationships.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.001">http://dx.doi.org/10.7554/eLife.00311.001</ext-link></p></abstract><abstract abstract-type="executive-summary"><object-id pub-id-type="doi">10.7554/eLife.00311.002</object-id><title>eLife digest</title><p>It has been clear since the early days of structural biology in the late 1950s that proteins and other biomolecules are continually changing shape, and that these changes have an important influence on both the structure and function of the molecules. X-ray diffraction can provide detailed information about the structure of a protein, but only limited information about how its structure fluctuates over time. Detailed information about the dynamic behaviour of proteins is essential for a proper understanding of a variety of processes, including catalysis, ligand binding and protein–protein interactions, and could also prove useful in drug design.</p><p>Currently most of the X-ray crystal structures in the Protein Data Bank are ‘snap-shots’ with limited or no information about protein dynamics. However, X-ray diffraction patterns are affected by the dynamics of the protein, and also by distortions of the crystal lattice, so three-dimensional (3D) models of proteins ought to take these phenomena into account. Molecular-dynamics (MD) computer simulations transform 3D structures into 4D ‘molecular movies’ by predicting the movement of individual atoms.</p><p>Combining MD simulations with crystallographic data has the potential to produce more realistic ensemble models of proteins in which the atomic fluctuations are represented by multiple structures within the ensemble. 
Moreover, in addition to improved structural information, this process, which is called ensemble refinement, can provide dynamical information about the protein. Earlier attempts to do this ran into problems because the number of model parameters needed was greater than the number of observed data points. Burnley et al. now overcome this problem by modelling local molecular vibrations with MD simulations and, at the same time, using a coarse-grained model to describe global disorder on longer length scales.</p><p>Ensemble refinement of high-resolution X-ray diffraction datasets for 20 different proteins from the Protein Data Bank produced a better fit to the data than single structures for all 20 proteins. Ensemble refinement also revealed that 3 of the 20 proteins had a ‘molten core’, rather than the well-ordered core found in most proteins: this is likely to be important in various biological functions including ligand binding, filament formation and enzymatic function. Burnley et al. also showed that an HIV enzyme underwent an order–disorder transition that is likely to influence how this enzyme works, and that similar transitions might influence the interactions between the small-molecule drug Imatinib (also known as Gleevec) and the enzymes it targets. 
Ensemble refinement could be applied to the majority of crystallography data currently being collected, or collected in the past, so further insights into the properties and interactions of a variety of proteins and other biomolecules can be expected.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.002">http://dx.doi.org/10.7554/eLife.00311.002</ext-link></p></abstract><kwd-group kwd-group-type="author-keywords"><title>Author keywords</title><kwd>protein</kwd><kwd>crystallography</kwd><kwd>structure</kwd><kwd>function</kwd><kwd>dynamics</kwd></kwd-group><kwd-group kwd-group-type="research-organism"><title>Research organism</title><kwd>None</kwd></kwd-group><funding-group><award-group id="par-1"><funding-source><institution-wrap><institution>European Research Council</institution></institution-wrap></funding-source><award-id>233229</award-id><principal-award-recipient><name><surname>Gros</surname><given-names>Piet</given-names></name></principal-award-recipient></award-group><award-group id="par-2"><funding-source><institution-wrap><institution>National Institutes of Health</institution></institution-wrap></funding-source><award-id>P01GM063210</award-id><principal-award-recipient><name><surname>Adams</surname><given-names>Paul D</given-names></name></principal-award-recipient></award-group><award-group id="par-3"><funding-source><institution-wrap><institution>The Netherlands Organization for Scientific Research (NWO)</institution></institution-wrap></funding-source><award-id>01.80.104.00</award-id><principal-award-recipient><name><surname>Gros</surname><given-names>Piet</given-names></name></principal-award-recipient></award-group><funding-statement>The funders had no role in study design, data collection and interpretation, or the decision to submit the work for 
publication.</funding-statement></funding-group><custom-meta-group><custom-meta><meta-name>eLife-xml-version</meta-name><meta-value>1.0</meta-value></custom-meta><custom-meta specific-use="meta-only"><meta-name>Author impact statement</meta-name><meta-value>A combination of molecular dynamics simulations and X-ray diffraction data has been used to construct more realistic models of proteins and to provide new insights into their interactions with other proteins and biomolecules.</meta-value></custom-meta></custom-meta-group></article-meta></front><body><sec id="s1" sec-type="intro"><title>Introduction</title><p>Since the dawn of structural biology there have been experimental observations of dynamic motion in proteins and other biomolecules (<xref ref-type="bibr" rid="bib35">Linderstrøm-Lang and Schellman, 1959</xref>). Multiple biophysical methods have firmly established that such atomic ‘wigglings and jigglings’ (<xref ref-type="bibr" rid="bib18">Feynman et al., 1963</xref>) play an inherent role in both protein structure and function; and, in conjunction with high-resolution structures, insights into dynamics aid the understanding of biomolecular functions in catalysis, ligand or drug binding and macromolecular interactions. Presently X-ray diffraction and NMR spectroscopy are the primary sources of data for high-resolution protein structures. Whereas microscopy methods may provide information regarding long-range conformational changes, NMR characterizes fluctuations at atomic detail. However, due to the challenging nature of such experiments, dynamics studies are relatively sparse in contrast with the wealth of structural information available in the Protein Data Bank (PDB) (<xref ref-type="bibr" rid="bib6">Berman et al., 2000</xref>). 
The majority of entries in the PDB derived from X-ray diffraction data are presented as static, single structures, although there is often extensive disorder resulting from protein dynamics and crystal-lattice distortions. Extracting atomic fluctuations from these diffraction data would dramatically increase the scope for dynamics studies of biomolecules and potentially reveal atomic details of structure–function–dynamic mechanisms that have previously been obscured.</p><p>The diffraction data of proteins are affected by multiple sources of disorder, notably arising from atomic vibrations, concerted motions of protein domains and inter-molecular lattice distortions. Structural models of proteins should account for both anisotropic and anharmonic distributions around the mean atomic positions to reproduce the observed Bragg intensities accurately (<xref ref-type="bibr" rid="bib56">Vitkup et al., 2002</xref>; <xref ref-type="bibr" rid="bib20">Furnham et al., 2006</xref>). However, explicit modelling of such distributions in macromolecules using current methods requires extensive parameterization inappropriate for the diffraction quality of a typical protein crystal. Multi-conformer structures represent both anisotropic and anharmonic disorder, but despite numerous attempts at automating the inclusion of minor conformations (<xref ref-type="bibr" rid="bib15">DePristo et al., 2004</xref>; <xref ref-type="bibr" rid="bib34">Levin et al., 2007</xref>; <xref ref-type="bibr" rid="bib49">Terwilliger et al., 2007</xref>; <xref ref-type="bibr" rid="bib31">Korostelev et al., 2009</xref>; <xref ref-type="bibr" rid="bib54">van den Bedem et al., 2009</xref>; <xref ref-type="bibr" rid="bib32">Lang et al., 2010</xref>), 95% of all protein residues in the Protein Data Bank (PDB) (<xref ref-type="bibr" rid="bib6">Berman et al., 2000</xref>) derived from diffraction data are modelled with a single conformation (<xref ref-type="bibr" rid="bib32">Lang et al., 2010</xref>). 
As opposed to multiple discrete models, an MD simulation with time-averaged restraints (<xref ref-type="bibr" rid="bib22">Gros et al., 1990</xref>) results in a population of structures in which the individual models are interrelated by a Boltzmann-weighted energy function. This method, introduced by <xref ref-type="bibr" rid="bib51">Torda et al. (1989)</xref> and implemented in macromolecular crystallography by <xref ref-type="bibr" rid="bib22">Gros et al. (1990)</xref>, showed a reduction in <italic>R</italic>-value. However, cross-validation introduced subsequently (<xref ref-type="bibr" rid="bib10">Brünger, 1992</xref>) revealed chronically over-fitted models (<xref ref-type="bibr" rid="bib11">Burling and Brunger, 1994</xref>; <xref ref-type="bibr" rid="bib14">Clarage and Phillips, 1994</xref>; <xref ref-type="bibr" rid="bib43">Schiffer et al., 1995</xref>).</p><p>Here, we present an ensemble-refinement method that restricts the number of structures modelled and thereby prevents over-fitting of the data. We model large-scale motions, attributable to, for example, lattice distortions, by an underlying global disorder model. This approach allows MD simulations to sample local atomic fluctuations only, without the need for sampling large-scale global disorder. We show that the method yields reproducible ensembles with improved fit to the X-ray data, as validated by cross validation, <italic>R</italic><sub><italic>free</italic></sub> (<xref ref-type="bibr" rid="bib10">Brünger, 1992</xref>), and stereochemical analyses. 
Analyses of the ensembles show that detailed features are observed indicating atomic fluctuations that may be relevant for the biological function of the macromolecules.</p></sec><sec id="s2" sec-type="results|discussion"><title>Results and discussion</title><sec id="s2-1"><title>Ensemble refinement of 20 datasets from the PDB</title><p>We performed MD simulations, in which the model was restrained by a time-averaged X-ray (<xref ref-type="bibr" rid="bib22">Gros et al., 1990</xref>), maximum-likelihood (<xref ref-type="bibr" rid="bib40">Pannu and Read, 1996</xref>; <xref ref-type="bibr" rid="bib2">Adams et al., 1997</xref>; <xref ref-type="bibr" rid="bib37">Murshudov et al., 1997</xref>) target function (see ‘Materials and methods’). The X-ray restraint optimized 〈<bold><italic>F</italic></bold><sub><italic>calc</italic></sub>(<italic>hkl</italic>)〉 against <bold><italic>F</italic></bold><sub><italic>obs</italic></sub>(<italic>hkl</italic>), where 〈<bold><italic>F</italic></bold><sub><italic>calc</italic></sub>(<italic>hkl</italic>)〉 are computed as rolling averages from the structures in the MD trajectory, with the length of the averaging window determined by the relaxation time <italic>τ</italic><sub><italic>x</italic></sub>. This approach contrasts with the traditional crystallographic refinement approach, where <bold><italic>F</italic></bold><sub><italic>calc</italic></sub>(<italic>hkl</italic>) are computed from a single structure and optimized against <bold><italic>F</italic></bold><sub><italic>obs</italic></sub>(<italic>hkl</italic>).</p><p>Prior to the simulations we approximated the large-scale disorder by an overall TLS model derived from the atomic <italic>B</italic>-factors of the refined single structure. 
Using one TLS group per protein molecule or domain, we iteratively fitted TLS parameters (<xref ref-type="bibr" rid="bib45">Schomaker and Trueblood, 1968</xref>; <xref ref-type="bibr" rid="bib58">Winn et al., 2001</xref>) to the atomic <italic>B</italic>-factors of the protein atoms excluding atoms with large deviations in <italic>B</italic>-factor from the TLS-derived <italic>B</italic>-factor (the parameter <italic>p</italic><sub><italic>TLS</italic></sub> described the percentage of atoms included in TLS-fitting; see ‘Materials and methods’). The resulting TLS model was applied to all atoms throughout the simulation. Effectively, this TLS model of the protein core excludes the effects of hyper-flexible surface loops and, hence, describes the global disorder that may be attributed to inter-molecular lattice distortions and overall intra-molecular breathing or domain shifts.</p><p>Ensemble refinement was tested using 20 diffraction datasets from the PDB and started from either the PDB or PDB_REDO (<xref ref-type="bibr" rid="bib28">Joosten et al., 2010</xref>) structures (‘Materials and methods’). Upper resolution limits of the datasets ranged from 1.1 to 3.1 Å resolution and structures had 50 to 1,004 amino-acid residues in the asymmetric unit (<xref ref-type="table" rid="tbl1">Table 1</xref>). The simulations were run at an effective temperature of 300 K for the protein atoms, using a temperature bath (<italic>T</italic><sub><italic>bath</italic></sub>) slightly below 300 K to allow for heating due to the non-conservative nature of the time-averaged X-ray restraint modulated by its weight <italic>w</italic><sub><italic>x-ray</italic></sub> (‘Materials and methods’). Explicitly modelled solvent atoms were added and/or removed intermittently during the simulation dependent on the corresponding electron-density and difference maps (‘Materials and methods’). 
Bulk solvent effects were accounted for by an averaged Flat Bulk-Solvent Model (<xref ref-type="bibr" rid="bib27">Jiang and Brünger, 1994</xref>; <xref ref-type="bibr" rid="bib3">Afonine et al., 2005</xref>) (‘Materials and methods’). The parameters <italic>p</italic><sub><italic>TLS</italic></sub>, <italic>τ</italic><sub><italic>x</italic></sub> and the <italic>T</italic><sub><italic>bath</italic></sub> and <italic>w</italic><sub><italic>x-ray</italic></sub> pair were optimized in a grid search resulting in a shallow optimum scored by <italic>R</italic><sub><italic>free</italic></sub> (<xref ref-type="fig" rid="fig1">Figure 1A</xref>). After a period of equilibration, the trajectory of structures was acquired over an extensive period of time (40 times <italic>τ</italic><sub><italic>x</italic></sub>).<table-wrap id="tbl1" position="float"><object-id pub-id-type="doi">10.7554/eLife.00311.003</object-id><label>Table 1.</label><caption><p>Ensemble refinement statistics for 20 datasets. Datasets were taken from the PDB or PDB_REDO and were re-refined using ensemble refinement and phenix.refine. The relaxation time <italic>τ</italic><sub><italic>x</italic></sub> used, the resulting number of structures in the final ensemble and <italic>R<sub>work</sub></italic> and <italic>R<sub>free</sub></italic> values are given. The ensemble models yield improved <italic>R<sub>free</sub></italic> values for all datasets, ranging in improvement from 0.3% to 4.9% with a mean improvement of 1.8%. 
The PDB accession numbers are as follows: 1KZK (<xref ref-type="bibr" rid="bib41">Reiling et al., 2002</xref>), 3K0M (<xref ref-type="bibr" rid="bib19">Fraser et al., 2009</xref>), 3K0N (<xref ref-type="bibr" rid="bib19">Fraser et al., 2009</xref>), 2PC0 (<xref ref-type="bibr" rid="bib26">Heaslet et al., 2007</xref>), 1UOY (<xref ref-type="bibr" rid="bib39">Olsen et al., 2004</xref>), 3CA7 (<xref ref-type="bibr" rid="bib30">Klein et al., 2008</xref>), 2R8Q (<xref ref-type="bibr" rid="bib57">Wang et al., 2007</xref>), 3QL0 (<xref ref-type="bibr" rid="bib7">Bhabha et al., 2011</xref>), 1X6P (<xref ref-type="bibr" rid="bib16">Dunlop et al., 2005</xref>), 1F2F (<xref ref-type="bibr" rid="bib29">Kimber et al., 2000</xref>), 3QL3 (<xref ref-type="bibr" rid="bib7">Bhabha et al., 2011</xref>), 1YTT (<xref ref-type="bibr" rid="bib12">Burling et al., 1996</xref>), 3GWH (<xref ref-type="bibr" rid="bib42">Rodríguez et al., 2009</xref>), 1BV1 (<xref ref-type="bibr" rid="bib21">Gajhede et al., 1996</xref>), 1IEP (<xref ref-type="bibr" rid="bib38">Nagar et al., 2002</xref>), 2XFA (<xref ref-type="bibr" rid="bib48">Singh et al., 2011</xref>), 3ODU (<xref ref-type="bibr" rid="bib59">Wu et al., 2010</xref>), 1M52 (<xref ref-type="bibr" rid="bib38">Nagar et al., 2002</xref>), 3CM8 (<xref ref-type="bibr" rid="bib25">He et al., 2008</xref>) and 3RZE (<xref ref-type="bibr" rid="bib47">Shimamura et al., 2011</xref>)</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.003">http://dx.doi.org/10.7554/eLife.00311.003</ext-link></p></caption><table frame="hsides" rules="groups"><thead><tr><td rowspan="2">PDB ID</td><td rowspan="2">Resolution (Å)</td><td colspan="4">Ensemble refinement</td><td colspan="2">phenix.refine</td><td colspan="2">Ensemble—phenix.refine</td></tr><tr><td><italic>τ</italic><sub><italic>x</italic></sub> (ps)</td><td>No. 
of structures</td><td><italic>R</italic><sub><italic>work</italic></sub></td><td><italic>R</italic><sub><italic>free</italic></sub></td><td><italic>R</italic><sub><italic>work</italic></sub></td><td><italic>R</italic><sub><italic>free</italic></sub></td><td><italic>ΔR</italic><sub><italic>work</italic></sub></td><td><italic>ΔR</italic><sub><italic>free</italic></sub></td></tr></thead><tbody><tr><td>1KZK</td><td align="char" char=".">1.1</td><td align="char" char=".">1.5</td><td align="char" char=".">600</td><td align="char" char=".">0.125</td><td align="char" char=".">0.153</td><td align="char" char=".">0.136</td><td align="char" char=".">0.155</td><td align="char" char=".">−0.011</td><td align="char" char=".">−0.003</td></tr><tr><td>3K0M</td><td align="char" char=".">1.3</td><td align="char" char=".">2.0</td><td align="char" char=".">250</td><td align="char" char=".">0.104</td><td align="char" char=".">0.129</td><td align="char" char=".">0.116</td><td align="char" char=".">0.132</td><td align="char" char=".">−0.012</td><td align="char" char=".">−0.003</td></tr><tr><td>3K0N</td><td align="char" char=".">1.4</td><td align="char" char=".">1.0</td><td align="char" char=".">209</td><td align="char" char=".">0.115</td><td align="char" char=".">0.133</td><td align="char" char=".">0.119</td><td align="char" char=".">0.143</td><td align="char" char=".">−0.004</td><td align="char" char=".">−0.010</td></tr><tr><td>2PC0</td><td align="char" char=".">1.4</td><td align="char" char=".">0.8</td><td align="char" char=".">250</td><td align="char" char=".">0.145</td><td align="char" char=".">0.188</td><td align="char" char=".">0.161</td><td align="char" char=".">0.193</td><td align="char" char=".">−0.016</td><td align="char" char=".">−0.005</td></tr><tr><td>1UOY</td><td align="char" char=".">1.5</td><td align="char" char=".">1.0</td><td align="char" char=".">167</td><td align="char" char=".">0.104</td><td align="char" char=".">0.137</td><td align="char" char=".">0.155</td><td 
align="char" char=".">0.185</td><td align="char" char=".">−0.051</td><td align="char" char=".">−0.049</td></tr><tr><td>3CA7</td><td align="char" char=".">1.5</td><td align="char" char=".">0.8</td><td align="char" char=".">40</td><td align="char" char=".">0.149</td><td align="char" char=".">0.184</td><td align="char" char=".">0.171</td><td align="char" char=".">0.212</td><td align="char" char=".">−0.022</td><td align="char" char=".">−0.029</td></tr><tr><td>2R8Q</td><td align="char" char=".">1.5</td><td align="char" char=".">1.0</td><td align="char" char=".">200</td><td align="char" char=".">0.132</td><td align="char" char=".">0.162</td><td align="char" char=".">0.158</td><td align="char" char=".">0.178</td><td align="char" char=".">−0.026</td><td align="char" char=".">−0.016</td></tr><tr><td>3QL0</td><td align="char" char=".">1.6</td><td align="char" char=".">0.5</td><td align="char" char=".">70</td><td align="char" char=".">0.204</td><td align="char" char=".">0.254</td><td align="char" char=".">0.229</td><td align="char" char=".">0.270</td><td align="char" char=".">−0.024</td><td align="char" char=".">−0.017</td></tr><tr><td>1X6P</td><td align="char" char=".">1.6</td><td align="char" char=".">1.0</td><td align="char" char=".">400</td><td align="char" char=".">0.121</td><td align="char" char=".">0.149</td><td align="char" char=".">0.140</td><td align="char" char=".">0.175</td><td align="char" char=".">−0.019</td><td align="char" char=".">−0.026</td></tr><tr><td>1F2F</td><td align="char" char=".">1.7</td><td align="char" char=".">0.8</td><td align="char" char=".">143</td><td align="char" char=".">0.128</td><td align="char" char=".">0.168</td><td align="char" char=".">0.160</td><td align="char" char=".">0.198</td><td align="char" char=".">−0.032</td><td align="char" char=".">−0.031</td></tr><tr><td>3QL3</td><td align="char" char=".">1.8</td><td align="char" char=".">0.5</td><td align="char" char=".">80</td><td align="char" char=".">0.160</td><td align="char" 
char=".">0.208</td><td align="char" char=".">0.170</td><td align="char" char=".">0.221</td><td align="char" char=".">−0.010</td><td align="char" char=".">−0.013</td></tr><tr><td>1YTT</td><td align="char" char=".">1.8</td><td align="char" char=".">0.3</td><td align="char" char=".">84</td><td align="char" char=".">0.139</td><td align="char" char=".">0.174</td><td align="char" char=".">0.166</td><td align="char" char=".">0.189</td><td align="char" char=".">−0.027</td><td align="char" char=".">−0.014</td></tr><tr><td>3GWH</td><td align="char" char=".">2.0</td><td align="char" char=".">1.0</td><td align="char" char=".">39</td><td align="char" char=".">0.160</td><td align="char" char=".">0.200</td><td align="char" char=".">0.187</td><td align="char" char=".">0.220</td><td align="char" char=".">−0.027</td><td align="char" char=".">−0.021</td></tr><tr><td>1BV1</td><td align="char" char=".">2.0</td><td align="char" char=".">0.4</td><td align="char" char=".">78</td><td align="char" char=".">0.149</td><td align="char" char=".">0.182</td><td align="char" char=".">0.154</td><td align="char" char=".">0.205</td><td align="char" char=".">−0.005</td><td align="char" char=".">−0.023</td></tr><tr><td>1IEP</td><td align="char" char=".">2.1</td><td align="char" char=".">0.5</td><td align="char" char=".">200</td><td align="char" char=".">0.183</td><td align="char" char=".">0.238</td><td align="char" char=".">0.196</td><td align="char" char=".">0.245</td><td align="char" char=".">−0.012</td><td align="char" char=".">−0.007</td></tr><tr><td>2XFA</td><td align="char" char=".">2.1</td><td align="char" char=".">1.0</td><td align="char" char=".">100</td><td align="char" char=".">0.171</td><td align="char" char=".">0.217</td><td align="char" char=".">0.184</td><td align="char" char=".">0.244</td><td align="char" char=".">−0.013</td><td align="char" char=".">−0.027</td></tr><tr><td>3ODU</td><td align="char" char=".">2.5</td><td align="char" char=".">0.3</td><td align="char" char=".">50</td><td 
align="char" char=".">0.208</td><td align="char" char=".">0.269</td><td align="char" char=".">0.219</td><td align="char" char=".">0.281</td><td align="char" char=".">−0.010</td><td align="char" char=".">−0.012</td></tr><tr><td>1M52</td><td align="char" char=".">2.6</td><td align="char" char=".">0.5</td><td align="char" char=".">50</td><td align="char" char=".">0.161</td><td align="char" char=".">0.211</td><td align="char" char=".">0.168</td><td align="char" char=".">0.228</td><td align="char" char=".">−0.007</td><td align="char" char=".">−0.017</td></tr><tr><td>3CM8</td><td align="char" char=".">2.9</td><td align="char" char=".">0.5</td><td align="char" char=".">67</td><td align="char" char=".">0.194</td><td align="char" char=".">0.235</td><td align="char" char=".">0.205</td><td align="char" char=".">0.248</td><td align="char" char=".">−0.011</td><td align="char" char=".">−0.013</td></tr><tr><td>3RZE</td><td align="char" char=".">3.1</td><td align="char" char=".">0.1</td><td align="char" char=".">72</td><td align="char" char=".">0.210</td><td align="char" char=".">0.280</td><td align="char" char=".">0.210</td><td align="char" char=".">0.291</td><td align="char" char=".">0.000</td><td align="char" char=".">−0.011</td></tr><tr><td colspan="7" rowspan="3"/><td><bold>Max</bold></td><td align="char" char=".">−0.051</td><td align="char" char=".">−0.049</td></tr><tr><td><bold>Min</bold></td><td align="char" char=".">0.000</td><td align="char" char=".">−0.003</td></tr><tr><td><bold>Mean</bold></td><td align="char" char=".">−0.018</td><td align="char" char=".">−0.018</td></tr></tbody></table></table-wrap><fig id="fig1" position="float"><object-id pub-id-type="doi">10.7554/eLife.00311.004</object-id><label>Figure 1.</label><caption><p>Example of ensemble refinement for dataset 1UOY. 
(<bold>A</bold>) Optimisation of empirical ensemble refinement parameters (<italic>τ</italic><sub><italic>x</italic></sub>, <italic>p</italic><sub><italic>TLS</italic></sub> and <italic>T</italic><sub><italic>bath</italic></sub>). Simulations are performed independently and in parallel. The plot shows the effect of <italic>τ</italic><sub><italic>x</italic></sub> and <italic>p</italic><sub><italic>TLS</italic></sub> on <italic>R</italic><sub><italic>free</italic></sub> (each grid point corresponds to the lowest <italic>R</italic><sub><italic>free</italic></sub> among all <italic>T</italic><sub><italic>bath</italic></sub> values). Optimum parameters are selected by <italic>R</italic><sub><italic>free</italic></sub>. (<bold>B</bold>) <italic>R</italic>-values obtained during ensemble-refinement simulation, solid lines <italic>R</italic><sub><italic>work</italic></sub> and dashed lines <italic>R</italic><sub><italic>free</italic></sub>; high values are observed for instantaneous models (yellow) contrasting with the rolling average used in the target function (red) and the final ensemble (blue). (<bold>C</bold>) <italic>R</italic>-values are reduced throughout the resolution range for the ensemble model (blue) compared with the phenix.refine re-refined single structure (black); solid lines <italic>R</italic><sub><italic>work</italic></sub> and dashed lines <italic>R</italic><sub><italic>free</italic></sub>. (<bold>D</bold>) Number of structures in the ensemble, reduced by equidistant selection, <italic>versus R</italic><sub><italic>work</italic></sub> (solid line) and <italic>R</italic><sub><italic>free</italic></sub> (dashed line). The final number of structures is selected as the minimum number required to reproduce the <italic>R</italic><sub><italic>free</italic></sub> + 0.1%; in this case resulting in an ensemble containing 167 structures. 
(<bold>E</bold>) Density difference maps for the ensemble structure (<italic>m</italic><bold><italic>F</italic></bold><sub><italic>obs</italic></sub> − <italic>D</italic><bold><italic>F</italic></bold><sub><italic>model</italic></sub>)exp[<italic>iφ</italic><sub><italic>model</italic></sub>], left-hand side, and the single structure right-hand side, contoured at 0.34 e/Å<sup>3</sup> (equivalent to 3.0 σ for the ensemble model), positive and negative densities are coloured green and red respectively. All molecular graphics figures are drawn using PyMol (The PyMOL Molecular Graphics System, Schrödinger, LLC).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.004">http://dx.doi.org/10.7554/eLife.00311.004</ext-link></p></caption><graphic xlink:href="elife00311f001"/></fig></p><p><xref ref-type="fig" rid="fig1">Figure 1B</xref> shows the <italic>R</italic>-values as they developed over the simulation time for a structure with PDB code 1UOY (<xref ref-type="bibr" rid="bib39">Olsen et al., 2004</xref>), for which the largest improvement in <italic>R</italic><sub><italic>free</italic></sub> was observed among the datasets tested (possibly due to the high degree of anisotropic and anharmonic side-chain motion for this case). The <italic>R</italic>-values started at a high value and remained high (∼35%) for the individual structures, which is in agreement with the observation that the derived global TLS <italic>B</italic>-factor model is not optimal for fitting a single structure to the data. Averaging the structure factors over the relaxation time <italic>τ</italic><sub><italic>x</italic></sub> of 1 ps (corresponding to the rolling average structure factors used in the X-ray restraint) dropped the <italic>R</italic><sub><italic>work</italic></sub> and <italic>R</italic><sub><italic>free</italic></sub> to ∼11% and ∼15% respectively. 
The <italic>R</italic><sub><italic>work</italic></sub> and <italic>R</italic><sub><italic>free</italic></sub> of the collected ensemble of structures (corresponding to unweighted averaged structure factors) monotonically decreased to 10.3% and 13.7%, respectively, over the acquisition period of 40 ps. The improvement in <italic>R</italic>-values from the ensemble model with respect to the single-structure model spanned the entire resolution range of the data (<xref ref-type="fig" rid="fig1">Figure 1C</xref>). Acquisition over 40 times <italic>τ</italic><sub><italic>x</italic></sub> yielded a highly redundant set of structures. We reduced the number of structures by calculating the minimum number of structures, that is 167 in the case of 1UOY, required to reproduce the <italic>R</italic>-value of the trajectory (<xref ref-type="fig" rid="fig1">Figure 1D</xref> and ‘Materials and methods’).</p><p>Analysis of all 20 datasets showed that ensemble refinement improved the <italic>R</italic><sub><italic>free</italic></sub> by between 0.3 and 4.9 percentage points compared to single structures re-refined using the same program package, that is Phenix (<xref ref-type="bibr" rid="bib4">Afonine et al., 2012</xref>), with a mean improvement of 1.8 percentage points in <italic>R</italic><sub><italic>free</italic></sub> values (<xref ref-type="table" rid="tbl1">Table 1</xref>, <xref ref-type="fig" rid="fig2">Figure 2A</xref>). The effect of the starting structure on ensemble refinement was assessed by using alternative refinement programs, phenix.refine, Refmac (<xref ref-type="bibr" rid="bib52">Vagin et al., 2004</xref>), and Buster (<xref ref-type="bibr" rid="bib9">Bricogne et al., 2009</xref>), to generate varying input models. No significant differences were observed due to the different starting models (<xref ref-type="table" rid="tbl3 tbl4">Tables 3 and 4</xref>). 
The improvement in <italic>R</italic><sub><italic>free</italic></sub>, number of structures in the final ensemble and the averaging time <italic>τ</italic><sub><italic>x</italic></sub> tended to increase with resolution (<xref ref-type="fig" rid="fig2">Figure 2A–C</xref>). The optimum values for the parameters <italic>p</italic><sub><italic>TLS</italic></sub> and <italic>T</italic><sub><italic>bath</italic></sub> are not correlated with resolution (<xref ref-type="fig" rid="fig2">Figure 2D,E</xref>). Concomitant with the reduction in <italic>R</italic>-values, the ensemble models reduced electron-density differences, decreasing rms fluctuations in difference maps by 0 to 41% with an average of 12% improvement (<xref ref-type="table" rid="tbl2">Table 2</xref>). The difference electron-density maps for the single-structure and ensemble models indicated improvements throughout the asymmetric unit cell, as exemplified in <xref ref-type="fig" rid="fig1">Figure 1E</xref>.<fig id="fig2" position="float"><object-id pub-id-type="doi">10.7554/eLife.00311.005</object-id><label>Figure 2.</label><caption><p>Ensemble refinement parameters and results as function of resolution of the datasets. 
(<bold>A</bold>) Gain in <italic>R</italic><sub><italic>free</italic></sub> of ensemble refinement compared with re-refinement using phenix.refine, (<bold>B</bold>) number of structures in the final ensemble model, (<bold>C</bold>) optimum relaxation time, <italic>τ</italic><sub><italic>x</italic></sub>, (<bold>D</bold>) optimum <italic>p</italic><sub><italic>TLS</italic></sub> and (<bold>E</bold>) optimum <italic>T</italic><sub><italic>bath</italic></sub> plotted as function of resolution of the dataset.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.005">http://dx.doi.org/10.7554/eLife.00311.005</ext-link></p></caption><graphic xlink:href="elife00311f002"/></fig><table-wrap id="tbl2" position="float"><object-id pub-id-type="doi">10.7554/eLife.00311.006</object-id><label>Table 2.</label><caption><p>Rms (<italic>m</italic><italic>F</italic><sub><italic>obs</italic></sub> − <italic>DF</italic><sub><italic>model</italic></sub>)exp[<italic>iφ</italic><sub><italic>model</italic></sub>] difference densities obtained from ensemble refinement and re-refinement in phenix.refine</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.006">http://dx.doi.org/10.7554/eLife.00311.006</ext-link></p></caption><table frame="hsides" rules="groups"><thead><tr><td rowspan="2">PDB ID</td><td rowspan="2">Resolution (Å)</td><td colspan="2">σ<sub>mFo−DFc</sub> (e/Å<sup>3</sup>)</td></tr><tr><td>Ensemble</td><td>phenix.refine</td></tr></thead><tbody><tr><td>1KZK</td><td align="char" char=".">1.1</td><td align="char" char=".">0.138</td><td align="char" char=".">0.161</td></tr><tr><td>3K0M</td><td align="char" char=".">1.3</td><td align="char" char=".">0.016</td><td align="char" char=".">0.018</td></tr><tr><td>3K0N</td><td align="char" char=".">1.4</td><td align="char" char=".">0.007</td><td align="char" char=".">0.008</td></tr><tr><td>2PCO</td><td align="char" char=".">1.4</td><td align="char" char=".">0.099</td><td 
align="char" char=".">0.099</td></tr><tr><td>1UOY</td><td align="char" char=".">1.5</td><td align="char" char=".">0.115</td><td align="char" char=".">0.162</td></tr><tr><td>3CA7</td><td align="char" char=".">1.5</td><td align="char" char=".">0.132</td><td align="char" char=".">0.148</td></tr><tr><td>2R8Q</td><td align="char" char=".">1.5</td><td align="char" char=".">0.104</td><td align="char" char=".">0.118</td></tr><tr><td>3QL0</td><td align="char" char=".">1.6</td><td align="char" char=".">0.124</td><td align="char" char=".">0.138</td></tr><tr><td>1X6P</td><td align="char" char=".">1.6</td><td align="char" char=".">0.098</td><td align="char" char=".">0.105</td></tr><tr><td>1F2F</td><td align="char" char=".">1.7</td><td align="char" char=".">0.104</td><td align="char" char=".">0.126</td></tr><tr><td>3QL3</td><td align="char" char=".">1.8</td><td align="char" char=".">0.131</td><td align="char" char=".">0.139</td></tr><tr><td>1YTT</td><td align="char" char=".">1.8</td><td align="char" char=".">0.170</td><td align="char" char=".">0.215</td></tr><tr><td>3GWH</td><td align="char" char=".">2.0</td><td align="char" char=".">0.125</td><td align="char" char=".">0.138</td></tr><tr><td>1BV1</td><td align="char" char=".">2.0</td><td align="char" char=".">0.109</td><td align="char" char=".">0.119</td></tr><tr><td>1IEP</td><td align="char" char=".">2.1</td><td align="char" char=".">0.084</td><td align="char" char=".">0.091</td></tr><tr><td>2XFA</td><td align="char" char=".">2.1</td><td align="char" char=".">0.069</td><td align="char" char=".">0.074</td></tr><tr><td>3ODU</td><td align="char" char=".">2.5</td><td align="char" char=".">0.105</td><td align="char" char=".">0.113</td></tr><tr><td>1M52</td><td align="char" char=".">2.6</td><td align="char" char=".">0.088</td><td align="char" char=".">0.093</td></tr><tr><td>3CM8</td><td align="char" char=".">2.9</td><td align="char" char=".">0.036</td><td align="char" char=".">0.036</td></tr><tr><td>3RZE</td><td align="char" 
char=".">3.1</td><td align="char" char=".">0.070</td><td align="char" char=".">0.070</td></tr></tbody></table></table-wrap><table-wrap id="tbl3" position="float"><object-id pub-id-type="doi">10.7554/eLife.00311.007</object-id><label>Table 3.</label><caption><p>Effect of input structure on ensemble refinement. For three datasets ensemble refinement was performed using a starting structure from three different refinement programs. For each structure three random number seed repeats of ensemble refinement were performed and the <italic>R</italic>-factors are shown to be highly similar</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.007">http://dx.doi.org/10.7554/eLife.00311.007</ext-link></p></caption><table frame="hsides" rules="groups"><thead><tr><td rowspan="3">PDB</td><td colspan="3" rowspan="2">Re-refinement</td><td colspan="8">Ensemble refinement</td></tr><tr><td colspan="2">Repeat 1</td><td colspan="2">Repeat 2</td><td colspan="2">Repeat 3</td><td colspan="2">Mean</td></tr><tr><td>Program</td><td><italic>R</italic><sub><italic>work</italic></sub></td><td><italic>R</italic><sub><italic>free</italic></sub></td><td><italic>R</italic><sub><italic>work</italic></sub></td><td><italic>R</italic><sub><italic>free</italic></sub></td><td><italic>R</italic><sub><italic>work</italic></sub></td><td><italic>R</italic><sub><italic>free</italic></sub></td><td><italic>R</italic><sub><italic>work</italic></sub></td><td><italic>R</italic><sub><italic>free</italic></sub></td><td><italic>R</italic><sub><italic>work</italic></sub></td><td><italic>R</italic><sub><italic>free</italic></sub></td></tr></thead><tbody><tr><td rowspan="3">1UOY</td><td>Buster</td><td align="char" char=".">0.167</td><td align="char" char=".">0.196</td><td align="char" char=".">0.108</td><td align="char" char=".">0.144</td><td align="char" char=".">0.112</td><td align="char" char=".">0.145</td><td align="char" char=".">0.110</td><td align="char" char=".">0.146</td><td 
align="char" char=".">0.110</td><td align="char" char=".">0.145</td></tr><tr><td>Refmac</td><td align="char" char=".">0.147</td><td align="char" char=".">0.170</td><td align="char" char=".">0.104</td><td align="char" char=".">0.137</td><td align="char" char=".">0.103</td><td align="char" char=".">0.140</td><td align="char" char=".">0.105</td><td align="char" char=".">0.144</td><td align="char" char=".">0.104</td><td align="char" char=".">0.140</td></tr><tr><td>Phenix</td><td align="char" char=".">0.155</td><td align="char" char=".">0.185</td><td align="char" char=".">0.109</td><td align="char" char=".">0.142</td><td align="char" char=".">0.109</td><td align="char" char=".">0.147</td><td align="char" char=".">0.111</td><td align="char" char=".">0.149</td><td align="char" char=".">0.110</td><td align="char" char=".">0.146</td></tr><tr><td rowspan="3">3CA7</td><td>Buster</td><td align="char" char=".">0.177</td><td align="char" char=".">0.208</td><td align="char" char=".">0.137</td><td align="char" char=".">0.186</td><td align="char" char=".">0.137</td><td align="char" char=".">0.192</td><td align="char" char=".">0.141</td><td align="char" char=".">0.197</td><td align="char" char=".">0.138</td><td align="char" char=".">0.192</td></tr><tr><td>Refmac</td><td align="char" char=".">0.170</td><td align="char" char=".">0.205</td><td align="char" char=".">0.139</td><td align="char" char=".">0.187</td><td align="char" char=".">0.135</td><td align="char" char=".">0.189</td><td align="char" char=".">0.138</td><td align="char" char=".">0.193</td><td align="char" char=".">0.137</td><td align="char" char=".">0.189</td></tr><tr><td>Phenix</td><td align="char" char=".">0.171</td><td align="char" char=".">0.212</td><td align="char" char=".">0.138</td><td align="char" char=".">0.180</td><td align="char" char=".">0.142</td><td align="char" char=".">0.189</td><td align="char" char=".">0.148</td><td align="char" char=".">0.193</td><td align="char" char=".">0.142</td><td align="char" 
char=".">0.187</td></tr><tr><td rowspan="3">1BV1</td><td>Buster</td><td align="char" char=".">0.161</td><td align="char" char=".">0.204</td><td align="char" char=".">0.137</td><td align="char" char=".">0.184</td><td align="char" char=".">0.138</td><td align="char" char=".">0.185</td><td align="char" char=".">0.137</td><td align="char" char=".">0.186</td><td align="char" char=".">0.138</td><td align="char" char=".">0.185</td></tr><tr><td>Refmac</td><td align="char" char=".">0.178</td><td align="char" char=".">0.231</td><td align="char" char=".">0.140</td><td align="char" char=".">0.182</td><td align="char" char=".">0.143</td><td align="char" char=".">0.184</td><td align="char" char=".">0.143</td><td align="char" char=".">0.189</td><td align="char" char=".">0.142</td><td align="char" char=".">0.185</td></tr><tr><td>Phenix</td><td align="char" char=".">0.154</td><td align="char" char=".">0.205</td><td align="char" char=".">0.139</td><td align="char" char=".">0.188</td><td align="char" char=".">0.138</td><td align="char" char=".">0.189</td><td align="char" char=".">0.140</td><td align="char" char=".">0.189</td><td align="char" char=".">0.139</td><td align="char" char=".">0.189</td></tr></tbody></table></table-wrap><table-wrap id="tbl4" position="float"><object-id pub-id-type="doi">10.7554/eLife.00311.008</object-id><label>Table 4.</label><caption><p><bold><italic>F</italic></bold><sub><italic>model</italic></sub> cross-correlation scores for ensembles generated with different input models. Three different refinement programs generated alternative starting structures, see <xref ref-type="table" rid="tbl3">Table 3</xref>. The best ensemble was selected as judged by <italic>R</italic><sub><italic>free</italic></sub>. 
<bold><italic>F</italic></bold><sub><italic>model</italic></sub> cross correlation scores are &gt;0.99 for all pairs of ensemble structures for all three datasets</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.008">http://dx.doi.org/10.7554/eLife.00311.008</ext-link></p></caption><table frame="hsides" rules="groups"><thead><tr><td rowspan="2">PDB</td><td colspan="2">Ensemble pair</td><td rowspan="2">CC</td></tr><tr><td>Re-refined input</td><td>Re-refined input</td></tr></thead><tbody><tr><td rowspan="3">1UOY</td><td>Refmac</td><td>Buster</td><td align="char" char=".">0.997</td></tr><tr><td>Refmac</td><td>Phenix</td><td align="char" char=".">0.997</td></tr><tr><td>Buster</td><td>Phenix</td><td align="char" char=".">0.999</td></tr><tr><td rowspan="3">3CA7</td><td>Refmac</td><td>Buster</td><td align="char" char=".">0.993</td></tr><tr><td>Refmac</td><td>Phenix</td><td align="char" char=".">0.992</td></tr><tr><td>Buster</td><td>Phenix</td><td align="char" char=".">0.996</td></tr><tr><td rowspan="3">1BV1</td><td>Refmac</td><td>Buster</td><td align="char" char=".">0.992</td></tr><tr><td>Refmac</td><td>Phenix</td><td align="char" char=".">0.990</td></tr><tr><td>Buster</td><td>Phenix</td><td align="char" char=".">0.992</td></tr></tbody></table></table-wrap></p></sec><sec id="s2-2"><title>Validation of ensemble refinement</title><p>We used the high-quality experimental phases available to high resolution for 1YTT of mannose-binding protein (<xref ref-type="bibr" rid="bib12">Burling et al., 1996</xref>) to validate the ensemble-refinement method. The overall correlation coefficient between the electron-density map from the ensemble model (obtained without experimental phases) and the experimentally phased electron-density map was 0.903, compared with 0.873 and 0.895 for the published and re-refined single structures. These seemingly small improvements in overall quality indicators allow for significant local improvements. 
Real-space correlation coefficients (<xref ref-type="bibr" rid="bib8">Brändén and Jones, 1990</xref>) highlighted marked local improvements for flexible residues in particular (<xref ref-type="fig" rid="fig3">Figure 3A</xref>) with 11 residues improving by more than 0.1 in correlation coefficient. This observation was consistent with local improvements in electron-density differences in regions of flexible or disordered side chains (<xref ref-type="fig" rid="fig3">Figure 3B</xref>). Moreover, the ensemble model contained structural details previously identified in a multiple-model approach by <xref ref-type="bibr" rid="bib12">Burling et al. (1996)</xref>, as shown for the anisotropic distribution for the side chain of Phe121 (<xref ref-type="fig" rid="fig3">Figure 3C</xref>) and diffuse water shells around hydrophobic residues (<xref ref-type="fig" rid="fig3">Figure 3D</xref>). <xref ref-type="fig" rid="fig3">Figure 3E</xref> shows that the most occupied water sites in the ensemble correlated with low atomic <italic>B</italic>-factors for waters in the single-structure model.<fig id="fig3" position="float"><object-id pub-id-type="doi">10.7554/eLife.00311.009</object-id><label>Figure 3.</label><caption><p>Validation of ensemble refinement using dataset 1YTT with exceptionally high quality experimental phases. (<bold>A</bold>) Real space cross-correlation of experimentally phased electron density map (|<bold><italic>F</italic></bold><sub><italic>obs</italic></sub>|exp[<italic>iφ</italic><sub><italic>obs</italic></sub>]) and model map (|<bold><italic>F</italic></bold><sub><italic>model</italic></sub>|exp[<italic>iφ</italic><sub><italic>model</italic></sub>]) for the single-structure (black) and ensemble model (chain A and B, blue and red respectively) shows improvements particularly for disordered areas (atomic <italic>B</italic>-factors from the re-refined single structure are shown in grey dashed lines). 
(<bold>B</bold>) Example of improved vector-difference map (|<bold><italic>F</italic></bold><sub><italic>obs</italic></sub>|exp[<italic>iφ</italic><sub><italic>obs</italic></sub>] − |<bold><italic>F</italic></bold><sub><italic>model</italic></sub>|exp[<italic>iφ</italic><sub><italic>model</italic></sub>]), contoured at 0.71 e/Å<sup>3</sup> (equivalent to 2.5 σ for the single structure) for Gln167, chain A, for the single structure (left-hand side) and the ensemble structure (right-hand side). (<bold>C</bold>) Conformer distribution of Phe121 (chain A) with the experimentally phased map (|<bold><italic>F</italic></bold><sub><italic>obs</italic></sub>|exp[<italic>iφ</italic><sub><italic>obs</italic></sub>]) contoured at 1.4 σ is highly similar to the multi-conformer model shown in Figure 1c in <xref ref-type="bibr" rid="bib12">Burling et al. (1996)</xref>. (<bold>D</bold>) Partially disordered solvent shell (red) around residue Leu203 (chain A) as anticipated in <xref ref-type="bibr" rid="bib12">Burling et al. (1996)</xref>. Ensemble structure with the experimentally phased map (|<bold><italic>F</italic></bold><sub><italic>obs</italic></sub>|exp[<italic>iφ</italic><sub><italic>obs</italic></sub>]) contoured at 1.4 σ (left side) and 0.7 σ (right side), as shown in Figure 2b in <xref ref-type="bibr" rid="bib12">Burling et al. (1996)</xref>. (<bold>E</bold>) Scatter plot showing the anti-correlation between the <italic>B</italic>-factor of explicit solvent molecules in the re-refined single structure and the relative occupancy of water molecules at that same position (within 0.5-Å distance) in the ensemble model. 
Due to the difficulty in differentiating between disorder (<italic>B</italic>-factor) and occupancy for explicitly modelled water atoms in single structures, a high <italic>B</italic>-factor is likely to correspond to a partially occupied site.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.009">http://dx.doi.org/10.7554/eLife.00311.009</ext-link></p></caption><graphic xlink:href="elife00311f003"/></fig></p><p>Next, we analysed the stereochemistry of the computed ensemble models. The robustness of the observed atomic distributions was tested by repeating ensemble refinements 10 times using different random starting velocities. <xref ref-type="fig" rid="fig4">Figure 4A</xref> shows that the observed distributions are highly reproducible. With data extending to 1.5-Å resolution, correlations above 0.99 were observed between residue distributions from separate runs. At lower resolutions, the majority of residues showed correlations above 0.95, with correlations occasionally dropping below 0.8 in very flexible regions (see <xref ref-type="fig" rid="fig4">Figure 4B,C</xref>). Clearly, the ensembles contain multiple values for each geometrical term that form a distribution, instead of a single stereochemical value obtained from a single structure; <xref ref-type="fig" rid="fig5">Figure 5A–D</xref> presents examples of side-chain distributions (by χ<sub>1</sub> and χ<sub>2</sub> angles) observed in the ensembles along with standard deviations computed from the 10 repeats. Averaged over all 20 cases, the rms deviations from idealized bond lengths, bond angles and dihedral angles for the re-refined single-structures were 0.010 Å, 1.26° and 15.2° respectively (<xref ref-type="supplementary-material" rid="SD1-data">Figure 6—source data 1</xref>). These deviations decreased for the ensemble models by 0.002 Å, 0.26° and 6.6° respectively, when considering the centroids of the observed stereochemical distributions. 
Taking all fluctuations around the centroids (i.e. complete distributions) into account, these values increased by 0.002 Å, 0.33° and 4.0° respectively compared to the statistics from single structures. This indicates that the ensemble model has high stereochemical quality, but that the ensemble of structures contained fluctuations exhibiting larger deviations from ideality. <xref ref-type="fig" rid="fig6">Figure 6A,B</xref> shows that high-energy conformations, as indicated, for example, by non-favourable Ramachandran φ,ψ-angle combinations, occurred transiently and were concentrated in regions of structural flexibility. Counting the most frequent Ramachandran classification for each φ,ψ-angle showed that the ensembles have a similar percentage for ‘allowed’ and an increased number of ‘outliers’ compared to the single structures (<xref ref-type="fig" rid="fig6">Figure 6C</xref> and <xref ref-type="supplementary-material" rid="SD2-data">Figure 6—source data 2</xref>). These analyses illustrate that in ensemble refinement conformations were sampled, rather than optimized to a single configuration as in single-structure refinement. Similar to Brünger (<xref ref-type="bibr" rid="bib10">Brünger, 1992</xref>), we observe that lower <italic>R</italic><sub><italic>free</italic></sub> values correlate with better quality of the Ramachandran statistics (<xref ref-type="fig" rid="fig6">Figure 6D</xref>).<fig id="fig4" position="float"><object-id pub-id-type="doi">10.7554/eLife.00311.010</object-id><label>Figure 4.</label><caption><p>Sampling reproducibility of ensemble refinement. (<bold>A</bold>) Cross-correlations (CC) calculated for all pairs from 10 random-number seed repeat ensemble refinements of the 1UOY dataset extending to 1.5-Å resolution. (<bold>B</bold>) Cross correlations computed for 1BV1 (2.0-Å resolution); and, (<bold>C</bold>) for 3CM8 (2.9-Å resolution). Mean CC shown in solid blue (black error bars indicate ±1 σ). 
Cross correlations were computed from real-space <bold><italic>F</italic></bold><sub><italic>model</italic></sub> electron-density map correlations (<xref ref-type="bibr" rid="bib8">Brändén and Jones, 1990</xref>). <italic>B</italic>-factors from the single structures refined using phenix.refine are shown in dotted grey lines.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.010">http://dx.doi.org/10.7554/eLife.00311.010</ext-link></p></caption><graphic xlink:href="elife00311f004"/></fig><fig id="fig5" position="float"><object-id pub-id-type="doi">10.7554/eLife.00311.011</object-id><label>Figure 5.</label><caption><p>Reproducibility of side-chain rotamer distributions. Mean χ<sub>1</sub> and χ<sub>2</sub> distributions of four side-chains from the 10 repeats, with error bars ±1 σ, are shown for 1UOY. The four residues presented are those with the two highest CC values (see <xref ref-type="fig" rid="fig4">Figure 4A</xref>), (<bold>A</bold>) Gln11 (0.9999) and (<bold>B</bold>) Arg32 (0.9999), and the two lowest CC values, (<bold>C</bold>) Lys39 (0.9976) and (<bold>D</bold>) Arg13 (0.9966).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.011">http://dx.doi.org/10.7554/eLife.00311.011</ext-link></p></caption><graphic xlink:href="elife00311f005"/></fig><fig id="fig6" position="float"><object-id pub-id-type="doi">10.7554/eLife.00311.012</object-id><label>Figure 6.</label><caption><p>Ramachandran analysis. Distribution of Ramachandran torsion angles classified as outliers (red) and allowed (blue) for ensemble models, 1UOY (<bold>A</bold>) and 1BV1 (<bold>B</bold>). Plot shows percentage of classification per residue (i.e. relative number of times a φ,ψ-torsion angle combination is scored as outlier or allowed as defined by phenix.ramalyze). 
Structure inserts show (left-hand side) the location of the non-favourable torsion angles, outliers (red) and allowed (blue), and (right-hand side) a <italic>B</italic>-factor putty representation for the single structure refined with phenix.refine. (<bold>C</bold>) Overall Ramachandran statistics for ensemble and re-refined models. The Ramachandran statistics for the ensemble models are calculated in two ways: blue shows the percentage of outliers (left side) or allowed (right side) from all structures in the ensemble (cf. ‘whole distribution’ in <xref ref-type="supplementary-material" rid="SD1-data">Figure 6—source data 1</xref>), whereas red shows these percentages based on the most frequent occurring classification of each φ,ψ combination (cf. ‘centroid distribution’). The grey lines show the percentage of allowed (left side) and outliers (right side) for the re-refined single structures. Ramachandran statistics per re-refined single structure and ensemble are given in <xref ref-type="supplementary-material" rid="SD2-data">Figure 6—source data 2</xref>. (<bold>D</bold>) Correlation of Ramachandran statistics with <italic>R</italic><sub><italic>free</italic></sub> values obtained from ensemble refinement. Three ensemble refinements were performed for the dataset 1UOY using different random-number seeds at <italic>T</italic><sub><italic>bath</italic></sub> values of 220, 260, 280, 290 and 295 K. 
Shown are the number of Ramachandran outliers (left side) and allowed (right side) in the ensemble as function of the <italic>R</italic><sub><italic>free</italic></sub> value.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.012">http://dx.doi.org/10.7554/eLife.00311.012</ext-link></p><p><supplementary-material id="SD1-data"><object-id pub-id-type="doi">10.7554/eLife.00311.013</object-id><label>Figure 6—source data 1.</label><caption><title>Geometries of single-structure models and ensemble models.</title><p>Rms deviations (RMSD) from ideal bond, angle and dihedral geometries calculated for single structures re-refined using phenix.refine. Geometries for ensemble structures were calculated using two methods, the ‘whole distribution’, where the RMSD was calculated for each restraint (averaged over all structures), √〈(<italic>x</italic><sub><italic>ideal</italic></sub> − <italic>x</italic><sub><italic>model</italic></sub>)<sup>2</sup>〉, and ‘centroid’ where the RMSD was calculated using the mean deviation from ideality for each restraint, √〈(〈<italic>x</italic><sub><italic>ideal</italic></sub> − <italic>x</italic><sub><italic>model</italic></sub>〉)<sup>2</sup>〉, which for unimodal functions equals √〈(<italic>x</italic><sub><italic>ideal</italic></sub> − 〈<italic>x</italic><sub><italic>model</italic></sub>〉)<sup>2</sup>〉.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.013">http://dx.doi.org/10.7554/elife.00311.013</ext-link></p></caption><media mime-subtype="xlsx" mimetype="application" xlink:href="elife00311s001.xlsx"/></supplementary-material></p><p><supplementary-material id="SD2-data"><object-id pub-id-type="doi">10.7554/eLife.00311.014</object-id><label>Figure 6—source data 2.</label><caption><title>Ramachandran statistics for re-refined and ensemble models.</title><p>The Ramachandran statistics for the ensemble models are calculated in two ways: ‘Ramachandran (mean)’ shows the percentage 
of outliers, allowed and favoured averaged over all structures in the ensemble (cf. ‘whole distribution’ in <xref ref-type="supplementary-material" rid="SD1-data">Figure 6—source data 1</xref>), whereas ‘Ramachandran (mode)’ shows these percentages based on the most frequent occurring classification of each φ,ψ combination (cf. ‘centroid distribution’ in <xref ref-type="supplementary-material" rid="SD1-data">Figure 6—source data 1</xref>).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.014">http://dx.doi.org/10.7554/elife.00311.014</ext-link></p></caption><media mime-subtype="xlsx" mimetype="application" xlink:href="elife00311s002.xlsx"/></supplementary-material></p></caption><graphic xlink:href="elife00311f006"/></fig></p><p>The presence of non-crystallographic symmetry (NCS) allowed for crystallographically-independent observations of atomic fluctuations in multiple copies of a protein molecule (<xref ref-type="fig" rid="fig7">Figure 7A–C</xref>). In some cases, the applied global TLS models differed significantly between NCS-related copies (<xref ref-type="fig" rid="fig7">Figure 7B</xref>). Nevertheless, we observed atomic fluctuations similar both in magnitude and location for related copies in areas not affected by crystal packing (<xref ref-type="fig" rid="fig7">Figure 7B,C</xref>; additional cases of NCS are presented in <xref ref-type="fig" rid="fig7s1 fig7s2 fig7s3 fig7s4">Figure 7—figure supplement 1–4</xref>). Apparently, variations in overall disorder arising from packing differences of NCS copies (as indicated by different <italic>B</italic>-factor distributions) were well accounted for by the applied global TLS models. Similarly, a global increase in disorder present in a dataset collected at ambient temperature vs an isomorphous dataset collected under cryo-conditions was fully accounted for by an increase in global TLS (<xref ref-type="fig" rid="fig8">Figure 8A</xref>). 
These data indicate that the derived atomic fluctuations are molecular traits and that the global TLS model accounts for overall disorder, which includes for example lattice or packing effects.<fig-group><fig id="fig7" position="float"><object-id pub-id-type="doi">10.7554/eLife.00311.015</object-id><label>Figure 7.</label><caption><p>Comparison of atomic fluctuations for non-crystallographic symmetry related protein copies for dataset 1M52. (<bold>A</bold>) Cα trace of the re-refined single structure coloured by <italic>B</italic>-factor (from blue to red with increasing <italic>B</italic>-factor) for the two chains (left) and the <italic>B</italic>-factors plotted per residue number for protein chain A (blue) and B (red) (right). (<bold>B</bold>) <italic>B</italic>-factors from the basal TLS model (left) and rms atomic fluctuations (right) in the ensemble model averaged per residue. Differences in crystal packing restrict the flexibility of chain B around residue 47. (<bold>C</bold>) Comparison (left) and superposition (right) of a region of the protein (indicated by black box in (<bold>A</bold>)) of the ensemble of structures observed for protein copy A (blue) and B (red). Analogous analyses for 2R8Q, 1YTT, 1IEP and 2XFA are shown in <xref ref-type="fig" rid="fig7s1 fig7s2 fig7s3 fig7s4">Figure 7—figure supplements 1–4</xref>. 
The protein copies in 3GWH and 3ODU showed backbone shifts greater than 4.5 Å and were left out of this analysis.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.015">http://dx.doi.org/10.7554/eLife.00311.015</ext-link></p></caption><graphic xlink:href="elife00311f007"/></fig><fig id="fig7s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00311.016</object-id><label>Figure 7—figure supplement 1.</label><caption><p>Comparison of atomic fluctuations for NCS related protein copies for dataset 2R8Q.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.016">http://dx.doi.org/10.7554/eLife.00311.016</ext-link></p></caption><graphic xlink:href="elife00311fs001"/></fig><fig id="fig7s2" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00311.017</object-id><label>Figure 7—figure supplement 2.</label><caption><p>Comparison of atomic fluctuations for NCS related protein copies for dataset 1YTT.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.017">http://dx.doi.org/10.7554/eLife.00311.017</ext-link></p></caption><graphic xlink:href="elife00311fs002"/></fig><fig id="fig7s3" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00311.018</object-id><label>Figure 7—figure supplement 3.</label><caption><p>Comparison of atomic fluctuations for NCS related protein copies for dataset 1IEP.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.018">http://dx.doi.org/10.7554/eLife.00311.018</ext-link></p></caption><graphic xlink:href="elife00311fs003"/></fig><fig id="fig7s4" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00311.019</object-id><label>Figure 7—figure supplement 4.</label><caption><p>Comparison of atomic fluctuations for NCS related protein copies for dataset 2XFA.</p><p><bold>DOI:</bold> 
<ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.019">http://dx.doi.org/10.7554/eLife.00311.019</ext-link></p></caption><graphic xlink:href="elife00311fs004"/></fig></fig-group><fig id="fig8" position="float"><object-id pub-id-type="doi">10.7554/eLife.00311.020</object-id><label>Figure 8.</label><caption><p>Ensemble refinement of two isomorphous proline isomerase datasets collected at 100 K and 288 K. (<bold>A</bold>) Left, basal TLS <italic>B</italic>-factors of ensemble models for 100 K and 288 K datasets (blue and green, respectively). Right, atomic rms fluctuations of ensemble models for 100 K and 288 K datasets (blue and green, respectively). (<bold>B</bold>) Re-refined single-structure (left) and ensemble model (right) for 100 K dataset. (<bold>C</bold>) Re-refined single-structure and ensemble model for 288 K dataset. In (<bold>B</bold>) and (<bold>C</bold>) atoms are coloured by <italic>B</italic>-factor (5 to 25 Å<sup>2</sup>). As with the published single-structure refinement (<xref ref-type="bibr" rid="bib19">Fraser et al., 2009</xref>), alternative conformations were not found for residues Leu98, Ser99 and Phe113 at 100 K.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.020">http://dx.doi.org/10.7554/eLife.00311.020</ext-link></p></caption><graphic xlink:href="elife00311f008"/></fig></p></sec><sec id="s2-3"><title>Functional dynamics revealed by ensemble refinement</title><p>Inspection of the obtained ensembles showed that most proteins, as expected, are characterized by well-ordered residues in the protein core and flexible residue side chains and loops on the outside (an example is given in <xref ref-type="fig" rid="fig9">Figure 9A</xref>). However, three cases exhibited marked flexibility of residue side chains on the inside of the molecule. 
1BV1 (<xref ref-type="bibr" rid="bib21">Gajhede et al., 1996</xref>), the major birch pollen allergen, has a large forked solvent channel with multiple disordered side chains and a diffuse water network (<xref ref-type="fig" rid="fig9">Figure 9B</xref>). The cavity is consistent with its putative role as a general plant steroid carrier (<xref ref-type="bibr" rid="bib36">Marković-Housley et al., 2003</xref>). Presumably, the flexible internal residues play a role in binding the diverse ligands. More surprising are the disordered cores in 1X6P of PAK pilin (<xref ref-type="bibr" rid="bib16">Dunlop et al., 2005</xref>) (<xref ref-type="fig" rid="fig9">Figure 9C</xref>) and 3K0N of the enzyme proline isomerase (<xref ref-type="bibr" rid="bib19">Fraser et al., 2009</xref>) (<xref ref-type="fig" rid="fig9">Figure 9D</xref>); in both these cases, the datasets were recorded at ambient temperatures. Multiple (16) aliphatic and aromatic side chains are highly flexible, forming a molten core in the pilin molecule. These flexible residues, which are extremely well conserved in the type IV pilin family, create the central interface between the characteristic long α-helix and β-sheet of this protein fold (<xref ref-type="bibr" rid="bib24">Hazes et al., 2000</xref>). We hypothesize that this monomeric pilin structure represents an intermediate molten state, which becomes stabilized upon protomer filament formation. The third case with flexible residues on the interior is 3K0N of proline isomerase (<xref ref-type="bibr" rid="bib19">Fraser et al., 2009</xref>). As with the pilin structure, several (11) aromatic and aliphatic residues showed large side chain fluctuations, yielding a molten core of the protein structure. 
However, the same protein under cryogenic conditions (3K0M) (<xref ref-type="bibr" rid="bib19">Fraser et al., 2009</xref>) showed mostly well-ordered side chains in the ensemble (<xref ref-type="fig" rid="fig9">Figure 9D</xref>, right-hand side), indicating that at cryogenic conditions the molten core has been annealed to its ground state configuration. As discussed in more detail in the next paragraph, the observed flexibility of the core residues at ambient temperature is likely of functional relevance for the enzyme. Thus, the computed ensemble models highlighted a hitherto unnoticed phenomenon of molten cores in folded proteins, which are likely relevant for the biological function of these molecules.<fig id="fig9" position="float"><object-id pub-id-type="doi">10.7554/eLife.00311.021</object-id><label>Figure 9.</label><caption><p>Overview of side-chain dynamics in ensemble structures. Atoms are coloured by their relative probability in the ensemble (see ‘Materials and methods’), reflecting the degree of disorder (ranging from well-ordered in blue to disordered in red). Bottom left insert shows secondary structure cartoon. Three datasets exhibit disordered interior side chains forming a molten core region. (<bold>A</bold>) 3CA7 shows an ordered core with disordered hydrophilic side chains on the outside and is typical of the majority of the datasets. (<bold>B</bold>) 1BV1, the major pollen allergen and putative plant steroid transporter, has a disordered central cavity (location of cavity shown with dotted lines). (<bold>C</bold>) 1X6P, the monomeric form of the fibril-forming PAK pilin, shows multiple disordered aliphatic and aromatic side chains in the interface between the N-terminal α-helix and the four-stranded β-sheet domain. 
(<bold>D</bold>) Proline isomerase exhibits a molten core at 288 K, 3K0N (left); however, these interior dynamics are frozen out at 100 K, 3K0M (right).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.021">http://dx.doi.org/10.7554/eLife.00311.021</ext-link></p></caption><graphic xlink:href="elife00311f009"/></fig></p><p>NMR spectroscopy has previously revealed specific dynamics for active-site residues of proline isomerase (<xref ref-type="bibr" rid="bib17">Eisenmesser et al., 2005</xref>). The solvent-exposed residues Arg55 and Met61 in the active site showed disorder in 3K0M (<xref ref-type="bibr" rid="bib19">Fraser et al., 2009</xref>), where data were collected at 100 K. For 3K0N collected at 288 K, a number of additional residues with multiple conformations were observed (<xref ref-type="fig" rid="fig8">Figure 8A,B</xref>). These included Ser99 and Phe113, which are part of the substrate-binding pocket together with Arg55 and Met61 (<xref ref-type="fig" rid="fig10">Figure 10A</xref>), and Leu98, which neighbours the flexible residue Ser99 but points into the hydrophobic core (<xref ref-type="bibr" rid="bib19">Fraser et al., 2009</xref>). Ensemble refinement of the 288 K data revealed a large number of residue side chains in the core to be flexible. This flexibility in the core appears to be linked to the dynamics of active-site residues through the intervening β-sheet. In particular, the main-chain H-bonding network (C=O·HN) of neighbouring β-strands within the sheet was flexible, as indicated by anisotropy in the C=O bonds of residues 55-62-113-98 (with the largest anisotropy observed for 55 and 62; see <xref ref-type="fig" rid="fig10">Figure 10B</xref>). 
Analysis of the side-chain conformations for the active-site residues Arg55, Met61, Ser99 and Phe113 showed a minor conformation populated at about 10% for all four residues (<xref ref-type="fig" rid="fig10">Figure 10C</xref>), which is in good agreement with NMR relaxation data (see Figure 2 in <xref ref-type="bibr" rid="bib17">Eisenmesser et al., 2005</xref>). Mutation of Ser99 to Thr (&gt;14 Å away from the catalytic Arg55) affects the side-chain distributions and lowers the activity ∼300-fold, similar to an Arg55Lys mutation of the catalytic residue (<xref ref-type="bibr" rid="bib17">Eisenmesser et al., 2005</xref>; <xref ref-type="bibr" rid="bib19">Fraser et al., 2009</xref>). Thus, the ensemble refinement results support the notion put forward by Eisenmesser et al. and Fraser et al. that side chain dynamics play a critical role in the enzymatic function of proline isomerase and, moreover, expand upon this theme to reveal mechanistic insights arising from the underlying detailed dynamics.<fig id="fig10" position="float"><object-id pub-id-type="doi">10.7554/eLife.00311.022</object-id><label>Figure 10.</label><caption><p>Dynamics in the binding pocket of proline isomerase at 288 K. (<bold>A</bold>) The location of the binding pocket comprising residues Arg55, Met61, Ser99 and Phe113. (<bold>B</bold>) Zoomed-in view of the binding pocket (indicated by dotted lines in (<bold>A</bold>)) showing the flexible <bold>β</bold>-sheet C=O·HN network of residues 55-62-113-98 in neighbouring <bold>β</bold>-strands. (<bold>C</bold>) All four residues show a ∼9:1 ratio between major and minor conformations, which is in good agreement with NMR relaxation dispersion data collected at a similar temperature (<xref ref-type="bibr" rid="bib17">Eisenmesser et al., 2005</xref>). Histograms show mean χ<sub>1</sub> angles generated from 10 random number repeats of ensemble refinement (error bars ±1 σ). 
Inserts show the relevant side chains, coloured by atomic probability (see ‘Materials and methods’), as observed in the ensemble reported in <xref ref-type="table" rid="tbl1">Table 1</xref>.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.022">http://dx.doi.org/10.7554/eLife.00311.022</ext-link></p></caption><graphic xlink:href="elife00311f010"/></fig></p><p>Ligand binding to HIV protease is known to have marked effects on the enzyme structure (<xref ref-type="bibr" rid="bib26">Heaslet et al., 2007</xref>). We compared HIV protease in its apo form, 2PC0 (<xref ref-type="bibr" rid="bib26">Heaslet et al., 2007</xref>), and bound to ligand JE-2147, 1KZK (<xref ref-type="bibr" rid="bib41">Reiling et al., 2002</xref>). As for proline isomerase, HIV protease exhibited flexible, moldable, substrate-binding pockets in the apo state. Enthalpic and entropic binding of the ligand with high affinity (<italic>K</italic><sub><italic>D</italic></sub> = 41 pM) (<xref ref-type="bibr" rid="bib55">Velazquez-Campoy et al., 2001</xref>) reduced the flexibility in the substrate binding pockets by protein–ligand H-bond interactions and van der Waals stacking (<xref ref-type="fig" rid="fig11">Figure 11A,B</xref>). Upon ligand binding, Asp90 became ordered through H-bonding with the ligand in P2, whereas its dimeric partner lacked an H-bonding partner in P2′ and remained flexible as in the unbound state. The canonical aspartic protease catalytic residues, Asp25 of both monomers, became ordered upon ligand binding. Concomitantly, we observed significant changes in dynamics of specific core residues (<xref ref-type="fig" rid="fig11">Figure 11C</xref>). Some residues, most notably Thr26, ‘froze’ (Thr26 is part of the conserved Asp25-Thr26-Gly27 sequence). In contrast, the side chains of other residues, most notably Cys95 and Leu97 opposite Thr26, became highly disordered in the bound state, whereas they were relatively ordered in the unbound state. 
This observation is consistent with NMR data showing that conformational variability increases upon inhibitor binding for Leu97, amongst others (<xref ref-type="bibr" rid="bib50">Torchia and Ishima, 2003</xref>). These data suggest that the entropy lost by the catalytic aspartates upon ligand binding is compensated by an increase in disorder of specific core residue side chains. This type of dynamic modulation was also observed for Ca<sup>2+</sup> binding to calmodulin, where this effect was dubbed entropy compensation (<xref ref-type="bibr" rid="bib33">Lee et al., 2000</xref>). Similar to the molten core dynamics for proline isomerase, the structure ensembles generated by the ensemble refinement method revealed specific core dynamics for HIV protease, in particular a conformational exchange that is likely functionally relevant.<fig id="fig11" position="float"><object-id pub-id-type="doi">10.7554/eLife.00311.023</object-id><label>Figure 11.</label><caption><p>Comparison of ensemble structures of bound and unbound forms of HIV protease. (<bold>A</bold>) Residues in the P1 binding sites are disordered in the unbound HIV protease (2PC0), left-hand side, with carbon atoms shown in cyan, oxygen red and nitrogen blue. These residues become ordered in HIV protease in complex with a high affinity inhibitor, JE-2147 (1KZK), right-hand side with carbon atoms of the protease shown in green and of the inhibitor in purple. In 1KZK the two chains of the functional dimer are present in the asymmetric unit, whereas in 2PC0 a monomer is present in the asymmetric unit and the dimer is drawn using the crystallographic twofold axis. (<bold>B</bold>) An alternative orientation showing the P2 binding site. (<bold>C</bold>) The catalytic Asp25 becomes ordered upon binding of the inhibitor, forming a hydrogen bond with the P1 carbonyl and hydroxyl of JE-2147. 
In contrast, the distal residues Cys95 and Leu97 at the dimer interface become less ordered upon binding.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.023">http://dx.doi.org/10.7554/eLife.00311.023</ext-link></p></caption><graphic xlink:href="elife00311f011"/></fig></p><p>The development of new small molecule therapeutics is often supported by the use of macromolecular structure, typically X-ray crystallography of complexes between target proteins and drug candidates. These complexes are typically interpreted as static structures, and the impact of dynamics, if considered at all, is probed using computational methods. Our new ensemble refinement approach makes it possible to study the role of dynamics in drug–target complexes in the context of the experimental data. Therefore, we analysed two structures of Abl kinase in complex with Imatinib (also known as Gleevec), that is 1IEP, and PD173955, that is 1M52 (<xref ref-type="bibr" rid="bib38">Nagar et al., 2002</xref>). These compounds bind the Abl kinase with high affinity, 37 nM (<xref ref-type="bibr" rid="bib44">Schindler et al., 2000</xref>) and 100 nM respectively (<xref ref-type="bibr" rid="bib38">Nagar et al., 2002</xref>). The ensembles provide insights into the flexibility of the protein residues and the ligand moieties in the complex. <xref ref-type="fig" rid="fig12">Figure 12A</xref> shows the variation in H-bonding observed in Abl kinase–Imatinib. Variable H-bonding interactions were observed for the hydrophilic N-methylpiperazine moiety with the backbone carbonyl atoms of Ile360 and His361. In contrast, the ensemble displayed a well-ordered H-bond between the anilino-NH and Thr315 ‘gatekeeper’ side-chain. Moreover, the ordered water network between Glu286, Lys271 and the pyrimidine moiety of Imatinib (<xref ref-type="bibr" rid="bib38">Nagar et al., 2002</xref>) was reproduced in the ensemble model (<xref ref-type="fig" rid="fig12">Figure 12B</xref>). 
We observed that the Abl kinase adopts two different states in these crystal structures. In the Imatinib complex the activation loop, residues 381–402, is highly disordered (<xref ref-type="fig" rid="fig12">Figure 12C</xref>), which was confirmed by comparison to previously published NMR data (<xref ref-type="bibr" rid="bib53">Vajpai et al., 2008</xref>). In general, the ensemble models distinguish tight, highly ordered drug–target interactions from disordered, and thus presumably weaker, interactions elsewhere, which may suggest which sites to modify in a drug-optimization cycle.<fig id="fig12" position="float"><object-id pub-id-type="doi">10.7554/eLife.00311.024</object-id><label>Figure 12.</label><caption><p>Abl kinase Imatinib binding site. (<bold>A</bold>) Imatinib binding site in chain A of the 1IEP dataset showing distribution of the six protein–ligand hydrogen bonds in chain A and chain B (red and blue respectively). (<bold>B</bold>) Hydrogen bond network of the ordered water molecules observed in the re-refined single structure, left, and the ensemble model, right. (<bold>C</bold>) The activation loop (shown in pink) is disordered when Abl kinase is complexed with Imatinib (shown in cyan) as observed previously in solution (<xref ref-type="bibr" rid="bib53">Vajpai et al., 2008</xref>).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.024">http://dx.doi.org/10.7554/eLife.00311.024</ext-link></p></caption><graphic xlink:href="elife00311f012"/></fig></p></sec></sec><sec id="s3" sec-type="conclusions"><title>Conclusions</title><p>We have shown that far more structural information can be reliably extracted from protein diffraction data than has been achieved to date by traditional single-structure modelling methods. Our ensemble refinement method samples distributions that reflect structural details of protein dynamics. 
The resulting ensemble models provide a more comprehensive description of the molecules and allow interpretation of the molecular function in terms of both the three-dimensional arrangements of the protein residues and their flexibilities. Moreover, ensemble models minimize the risk of structural over-interpretation associated with the seemingly rigid single-structure models. We found comparative analyses of protein molecules in different states to be very useful for identifying detailed changes in structural dynamics that may be mechanistically relevant for the molecular function.</p><p>Partitioning large-scale disorder into a global model separates intermolecular variations of protein packing in the crystal from the detailed intra-molecular atomic fluctuations. Effectively, the X-ray gradient dictates the MD sampling to yield featureless, (<italic>m</italic><bold><italic>F</italic></bold><sub><italic>obs</italic></sub>−<italic>D</italic><bold><italic>F</italic></bold><sub><italic>model</italic></sub>)exp[<italic>iφ</italic><sub><italic>model</italic></sub>], electron-density difference maps, while the global disorder model is accounted for by taking <italic>B</italic><sub><italic>TLS</italic></sub> into account when computing the atomic densities. In this way, the ensemble of structures is generated to model the anisotropic and anharmonic electron-density distributions precisely, while being restrained by the bonded and non-bonded energy terms used in the MD simulation. The separation of global disorder and local atomic fluctuations contrasts with the original approach by <xref ref-type="bibr" rid="bib22">Gros et al. (1990)</xref>, where the MD sampling had to account for both the large-scale global disorder and local fluctuations, leading to very long relaxation times <italic>τ</italic><sub><italic>x</italic></sub> of 16 ps. In the current work much shorter relaxation times of 0.25–2 ps can be used, thereby limiting potential over-fitting markedly. 
The method is applicable to data with a wide range of upper resolution limits. We see marked improvements in <italic>R</italic><sub><italic>free</italic></sub> for datasets ranging from 1.5 to 2.6-Å upper-resolution limit. Detailed interpretation of the ensembles is supported by the very high local correlations between independent ensemble refinements. However, at lower resolution limits and for highly disordered loops the local correlation between independent runs drops and detailed interpretation is not feasible. Thus, even though the number of independent parameters in an ensemble model is not clearly defined (and therefore the parameter-to-observation ratio is unclear), the gain in <italic>R</italic><sub><italic>free</italic></sub> and the very high local correlations between independent runs indicate a high reliability of the ensemble models. However, the method is not a panacea for highly disordered protein regions. In the absence of ordered conformations for a certain region of the protein (as implicitly defined by the diffraction data) the ensemble refinement will sample diverse conformations in order to prevent the build-up of negative peaks in the electron-density difference map. In other words, if the data ‘say’ that a region is disordered, ensemble refinement will generate diverse conformations for that region. Furthermore, dataset pathologies caused, for example, by radiation damage may have confounding effects that obscure the dynamics inherent to the protein molecule. Thus, perhaps somewhat counter-intuitively, this modelling method that accounts for inherent protein dynamics does not help to resolve structural details of disordered regions, but is particularly suited to resolve dynamical fluctuations in ordered parts of the protein structure.</p><p>Ensemble refinement of 20 protein datasets highlighted global dynamics features of protein molecules. 
Surprisingly, in some cases the ensembles indicated the existence of folded protein structures that display molten cores. Such molten cores may indicate intermediates of protein molecules that function in larger complexes (such as PAK pilin), or alternatively these molten cores support dynamical fluctuations that are needed for ligand binding and enzyme functioning (as for birch pollen allergen and proline isomerase respectively). Furthermore, the ensembles show details of specific order–disorder transitions, or conformational exchanges, between active site and core residues (as for HIV protease in the unbound and bound state) that suggest a mechanism of entropy compensation to support the enzymatic activity. The difference in dynamics observed between the ensembles of proline isomerase at cryo and ambient temperatures indicates that flash freezing of a crystal anneals local conformational fluctuations and thereby removes protein dynamics that may be functionally relevant.</p><p>In conclusion, this new method of modelling X-ray diffraction data reveals a wealth of detailed information about the dynamics of biomolecules that complements the high-resolution structural information already available from the crystallographic experiment. In-depth understanding of structure–dynamics in biomolecules will enhance our insights into the molecular mechanisms that underlie biological processes.</p></sec><sec id="s4" sec-type="materials|methods"><title>Materials and methods</title><p>The method of ensemble refinement was implemented in the Phenix software (<xref ref-type="bibr" rid="bib1">Adams et al., 2010</xref>). Adaptations and new procedures developed for ensemble refinement are given in section ‘Ensemble refinement methods’. Simulations were performed as described in section ‘Ensemble refinement protocol’. Details of the single-structure re-refinements used for comparison with the ensemble models are given in section ‘Single structure re-refinements’. 
Validation of the global disorder TLS model, the dependency of ensemble refinement on the starting structure and additional ensemble refinement calculations are given in ‘Additional ensemble refinement calculations’.</p><p>Ensemble refinement was performed using phenix.ensemble_refinement, as will be made available in the next release of Phenix.</p><sec id="s4-1"><title>Ensemble refinement methods</title><sec id="s4-1-1"><title>Time-averaged restraints</title><p>The overall model structure factors are calculated as <xref ref-type="disp-formula" rid="eqn1">(1)</xref>, defined by <xref ref-type="bibr" rid="bib3">Afonine et al. (2005)</xref>, incorporating overall anisotropic scaling (<xref ref-type="bibr" rid="bib46">Sheriff and Hendrickson, 1987</xref>) and bulk solvent contributions (<xref ref-type="bibr" rid="bib27">Jiang and Brünger, 1994</xref>).<disp-formula id="eqn1"><label>(1)</label><mml:math id="m1"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">model</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>k</mml:mi><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mo>−</mml:mo><mml:mfrac><mml:mrow><mml:msup><mml:mi mathvariant="bold-italic">h</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:msup><mml:mi mathvariant="bold-italic">A</mml:mi><mml:mrow><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:msub><mml:mi mathvariant="bold-italic">B</mml:mi><mml:mrow><mml:mi mathvariant="italic">cart</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msup><mml:mi mathvariant="bold-italic">A</mml:mi><mml:mrow><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mi>T</mml:mi></mml:msup><mml:mi 
mathvariant="bold-italic">h</mml:mi></mml:mrow><mml:mn>4</mml:mn></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">calc</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>k</mml:mi><mml:mrow><mml:mi mathvariant="italic">sol</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mo>−</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>B</mml:mi><mml:mrow><mml:mi mathvariant="italic">sol</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mi>s</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:mrow><mml:mn>4</mml:mn></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:msub><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">mask</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>where <italic>k</italic> is the overall scale factor, <bold><italic>h</italic></bold> is the column vector with Miller indices, <bold><italic>A</italic></bold> is the orthogonalisation matrix, <bold><italic>B</italic></bold><sub><italic>cart</italic></sub> is the anisotropic scale matrix, <bold><italic>F</italic></bold><sub><italic>calc</italic></sub> are the structure factors calculated from the atomic model, <italic>k</italic><sub><italic>sol</italic></sub> and <italic>B</italic><sub><italic>sol</italic></sub> are the parameters for the flat bulk solvent model and <bold><italic>F</italic></bold><sub><italic>mask</italic></sub> are the structure factors calculated from the bulk solvent mask.</p><p>In order to restrain the instantaneous structures produced during the MD simulation with time and spatially averaged X-ray data, time-averaged restraints are used (<xref ref-type="bibr" rid="bib22">Gros et al., 1990</xref>). 
This produces time-averaged (or rolling-average) structure factors such that <xref ref-type="disp-formula" rid="eqn1">(1)</xref> becomes <xref ref-type="disp-formula" rid="eqn2">(2)</xref>.<disp-formula id="eqn2"><label>(2)</label><mml:math id="m2"><mml:mrow><mml:msubsup><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">model</mml:mi></mml:mrow><mml:mi>t</mml:mi></mml:msubsup><mml:mo>=</mml:mo><mml:mi>k</mml:mi><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mo>−</mml:mo><mml:mfrac><mml:mrow><mml:msup><mml:mi mathvariant="bold-italic">h</mml:mi><mml:mtext>T</mml:mtext></mml:msup><mml:msup><mml:mi mathvariant="bold-italic">A</mml:mi><mml:mrow><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:msub><mml:mi mathvariant="bold-italic">B</mml:mi><mml:mrow><mml:mi mathvariant="italic">cart</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msup><mml:mi mathvariant="bold-italic">A</mml:mi><mml:mrow><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mi>T</mml:mi></mml:msup><mml:mi mathvariant="bold-italic">h</mml:mi></mml:mrow><mml:mn>4</mml:mn></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mrow><mml:mo>〈</mml:mo><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">calc</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>〉</mml:mo></mml:mrow></mml:mrow><mml:mi>t</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>k</mml:mi><mml:mrow><mml:mi mathvariant="italic">sol</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mo>−</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>B</mml:mi><mml:mrow><mml:mi 
mathvariant="italic">sol</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mi>s</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:mrow><mml:mn>4</mml:mn></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:msub><mml:mrow><mml:mrow><mml:mo>〈</mml:mo><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">mask</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>〉</mml:mo></mml:mrow></mml:mrow><mml:mi>t</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p><p>This is a time-dependent memory function, that is a ‘rolling’ average, where the size of the averaging window is controlled by the <italic>τ</italic><sub><italic>x</italic></sub> parameter (typically 1 ps). This averaging function is updated with the current individual structure every 10 time-steps (<italic>∆t</italic>) during the simulation and is implemented as in <xref ref-type="disp-formula" rid="eqn3">(3)</xref>.<disp-formula id="eqn3"><label>(3)</label><mml:math id="m3"><mml:mrow><mml:msub><mml:mrow><mml:mrow><mml:mo>〈</mml:mo><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">calc</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>〉</mml:mo></mml:mrow></mml:mrow><mml:mi>t</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mrow><mml:mrow><mml:mo>−</mml:mo><mml:mi mathvariant="italic">Δt</mml:mi></mml:mrow><mml:mo>/</mml:mo><mml:mrow><mml:msub><mml:mi>τ</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:mrow></mml:mrow></mml:msup><mml:msub><mml:mrow><mml:mrow><mml:mo>〈</mml:mo><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">calc</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>〉</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>−</mml:mo><mml:mi 
mathvariant="italic">Δt</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mrow><mml:mrow><mml:mo>−</mml:mo><mml:mi mathvariant="italic">Δt</mml:mi></mml:mrow><mml:mo>/</mml:mo><mml:mrow><mml:msub><mml:mi>τ</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:mrow></mml:mrow></mml:msup></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:msubsup><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">calc</mml:mi></mml:mrow><mml:mi>t</mml:mi></mml:msubsup><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p></sec><sec id="s4-1-2"><title>Dual explicit-bulk solvent model</title><p>Due to the stochastic behaviour of solvent molecules and the number of partially disordered or low occupancy sites, explicitly modelled solvent atoms are repositioned every 250 time-steps. Electron-density and difference density maps are generated using <bold><italic>F</italic></bold><sub><italic>model</italic></sub><sup><italic>t</italic></sup>, excluding reflections in the free <italic>R</italic> set. Water oxygen atoms with an electron-density peak &gt;1.0 σ in the 2<italic>m</italic><bold><italic>F</italic></bold><sub><italic>obs</italic></sub> − <italic>D</italic><bold><italic>F</italic></bold><sub><italic>model</italic></sub> map or a peak &gt;3.0 σ in the <italic>m</italic><bold><italic>F</italic></bold><sub><italic>obs</italic></sub> − <italic>D</italic><bold><italic>F</italic></bold><sub><italic>model</italic></sub> map are preserved, otherwise the atom is removed. 
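The retention rule for existing waters can be sketched as a simple predicate. This is a minimal illustration rather than the PHENIX implementation: the function name <monospace>keep_water</monospace>, its arguments and the example peak heights are hypothetical, and peak heights are assumed to be supplied already in σ units of the two maps. <code language="python"><![CDATA[
```python
def keep_water(peak_2fofc_sigma, peak_fofc_sigma,
               min_2fofc=1.0, min_fofc=3.0):
    """Return True if a modelled water oxygen passes either density criterion.

    Peak heights are assumed to be in sigma units of the
    2mFobs - DFmodel and mFobs - DFmodel maps, respectively.
    """
    return peak_2fofc_sigma > min_2fofc or peak_fofc_sigma > min_fofc


# Hypothetical (2mFo-DFc, mFo-DFc) peak heights for three water oxygens.
waters = {"W1": (1.4, 0.2), "W2": (0.6, 3.5), "W3": (0.8, 1.1)}
kept = sorted(name for name, peaks in waters.items() if keep_water(*peaks))
print(kept)  # ['W1', 'W2']
```
]]></code> With these hypothetical values, W1 passes the first criterion, W2 passes the second, and W3 passes neither and would be removed.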
New water atoms are added at positions that have a 2<italic>m</italic><bold><italic>F</italic></bold><sub><italic>obs</italic></sub> − <italic>D</italic><bold><italic>F</italic></bold><sub><italic>model</italic></sub> peak &gt;1.0 σ and an <italic>m</italic><bold><italic>F</italic></bold><sub><italic>obs</italic></sub> − <italic>D</italic><bold><italic>F</italic></bold><sub><italic>model</italic></sub> peak &gt;3.0 σ, and that lie 1.8–3.0 Å from an existing atom. For high-resolution cases these criteria are adjusted to include <italic>m</italic><bold><italic>F</italic></bold><sub><italic>obs</italic></sub> − <italic>D</italic><bold><italic>F</italic></bold><sub><italic>model</italic></sub> map peaks &gt;2.5 σ. Newly positioned atoms are assigned a random, Boltzmann-weighted velocity. Explicitly modelled solvent atoms contribute to the atomic model (<bold><italic>F</italic></bold><sub><italic>calc</italic></sub>).</p><p>Bulk solvent is modelled using a solvent mask (<xref ref-type="bibr" rid="bib3">Afonine et al., 2005</xref>). 
The mask structure factors (<bold><italic>F</italic></bold><sub><italic>mask</italic></sub>) are averaged in the same manner as the atomic model (<bold><italic>F</italic></bold><sub><italic>calc</italic></sub>) <xref ref-type="disp-formula" rid="eqn4">(4)</xref>.<disp-formula id="eqn4"><label>(4)</label><mml:math id="m4"><mml:mrow><mml:msub><mml:mrow><mml:mrow><mml:mo>〈</mml:mo><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">mask</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>〉</mml:mo></mml:mrow></mml:mrow><mml:mi>t</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mrow><mml:mrow><mml:mo>−</mml:mo><mml:mi mathvariant="italic">Δt</mml:mi></mml:mrow><mml:mo>/</mml:mo><mml:mrow><mml:msub><mml:mi>τ</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:mrow></mml:mrow></mml:msup><mml:msub><mml:mrow><mml:mrow><mml:mo>〈</mml:mo><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">mask</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>〉</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>−</mml:mo><mml:mi mathvariant="italic">Δt</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mrow><mml:mrow><mml:mo>−</mml:mo><mml:mi mathvariant="italic">Δt</mml:mi></mml:mrow><mml:mo>/</mml:mo><mml:mrow><mml:msub><mml:mi>τ</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:mrow></mml:mrow></mml:msup></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:msubsup><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">mask</mml:mi></mml:mrow><mml:mi>t</mml:mi></mml:msubsup><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p><p>The <italic>k</italic><sub><italic>sol</italic></sub> and <italic>B</italic><sub><italic>sol</italic></sub> bulk solvent parameters and 
<bold><italic>B</italic></bold><sub><italic>cart</italic></sub> scaling parameters used for the duration of the simulation are calculated from the starting structure as described previously (<xref ref-type="bibr" rid="bib3">Afonine et al., 2005</xref>); they are re-optimized for the final ensemble.</p></sec><sec id="s4-1-3"><title>Constrained target functions</title><p>The overall scale factor, <italic>k</italic>, is constrained during the simulation. For the maximum-likelihood target function, shown for acentric reflections in <xref ref-type="disp-formula" rid="eqn5">(5)</xref>, the normalisation step <xref ref-type="disp-formula" rid="eqn6">(6)</xref> scales the sum of the rolling-average structure factor array <xref ref-type="disp-formula" rid="eqn2">(2)</xref> to the sum of the structure factor array from the starting model (<italic>F</italic><sub><italic>ref</italic></sub>), as shown in <xref ref-type="disp-formula" rid="eqn7">(7)</xref>.<disp-formula id="eqn5"><label>(5)</label><mml:math id="m5"><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi mathvariant="italic">x-ray</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>2</mml:mn><mml:msub><mml:mi>E</mml:mi><mml:mrow><mml:mi mathvariant="italic">obs</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:msubsup><mml:mi mathvariant="normal">σ</mml:mi><mml:mi>A</mml:mi><mml:mn>2</mml:mn></mml:msubsup></mml:mrow></mml:mfrac><mml:mtext>exp</mml:mtext><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mo>−</mml:mo><mml:mfrac><mml:mrow><mml:mrow><mml:msubsup><mml:mi>E</mml:mi><mml:mi mathvariant="italic">obs</mml:mi><mml:mn>2</mml:mn></mml:msubsup></mml:mrow><mml:mo>+</mml:mo><mml:msubsup><mml:mi mathvariant="normal">σ</mml:mi><mml:mi>A</mml:mi><mml:mn>2</mml:mn></mml:msubsup><mml:msubsup><mml:mi>E</mml:mi><mml:mrow><mml:mi 
mathvariant="italic">model</mml:mi></mml:mrow><mml:mi>t</mml:mi></mml:msubsup></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:msubsup><mml:mi mathvariant="normal">σ</mml:mi><mml:mi>A</mml:mi><mml:mn>2</mml:mn></mml:msubsup></mml:mrow></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mi>I</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mn>2</mml:mn><mml:msub><mml:mi>E</mml:mi><mml:mrow><mml:mi mathvariant="italic">obs</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi mathvariant="normal">σ</mml:mi><mml:mi>A</mml:mi></mml:msub><mml:msubsup><mml:mi>E</mml:mi><mml:mrow><mml:mi mathvariant="italic">model</mml:mi></mml:mrow><mml:mi>t</mml:mi></mml:msubsup></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:msubsup><mml:mi mathvariant="normal">σ</mml:mi><mml:mi>A</mml:mi><mml:mn>2</mml:mn></mml:msubsup></mml:mrow></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula><disp-formula id="eqn6"><label>(6)</label><mml:math id="m6"><mml:mrow><mml:mrow><mml:msubsup><mml:mi>E</mml:mi><mml:mi mathvariant="italic">model</mml:mi><mml:mi>t</mml:mi></mml:msubsup></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>k</mml:mi><mml:msubsup><mml:mi mathvariant="italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">model</mml:mi></mml:mrow><mml:mi>t</mml:mi></mml:msubsup></mml:mrow><mml:mrow><mml:msqrt><mml:mrow><mml:mi>ε</mml:mi><mml:mtext> </mml:mtext><mml:msub><mml:mi>Σ</mml:mi><mml:mi>N</mml:mi></mml:msub></mml:mrow></mml:msqrt></mml:mrow></mml:mfrac><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula><disp-formula id="eqn7"><label>(7)</label><mml:math id="m7"><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:munder><mml:mi>Σ</mml:mi><mml:mrow><mml:mi mathvariant="italic">hkl</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mi mathvariant="italic">F</mml:mi><mml:mrow><mml:mi 
mathvariant="italic">ref</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:munder><mml:mi>Σ</mml:mi><mml:mrow><mml:mi mathvariant="italic">hkl</mml:mi></mml:mrow></mml:munder><mml:msubsup><mml:mi mathvariant="italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">model</mml:mi></mml:mrow><mml:mi>t</mml:mi></mml:msubsup></mml:mrow></mml:mfrac><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>where, <italic>E</italic><sub><italic>obs</italic></sub> and <italic>E</italic><sub><italic>model</italic></sub> are the normalised structure factors, σ<sub><italic>A</italic></sub> is the Sigma-A weighting factor, <italic>I</italic><sub><italic>0</italic></sub> is a modified Bessel function of order 0 and <italic>ε</italic> is the expected intensity factor.</p></sec><sec id="s4-1-4"><title>Temperature bath and X-ray weight</title><p>The simulations are performed such that the non-solvent atoms are at a target temperature (<italic>T</italic><sub><italic>target</italic></sub>) of 300 K, where the simulation is coupled to a velocity-scaled temperature-bath (<xref ref-type="bibr" rid="bib5">Berendsen et al., 1984</xref>). The temperature bath is set to a value less than 300 K, typically 295–298 K. Because the X-ray restraints are computed from a time-dependent memory function, the X-ray energy term is non-conservative and thus heating occurs. 
During the equilibration phase the X-ray weight (<italic>w</italic><sub><italic>x-ray</italic></sub>) is modulated by the temperature of the protein atoms (<italic>T</italic><sub><italic>protein</italic></sub>) every 10 time-steps (<italic>∆t</italic>), such that the non-solvent atoms sample consistently at the target temperature <xref ref-type="disp-formula" rid="eqn8">(8)</xref>.<disp-formula id="eqn8"><label>(8)</label><mml:math id="m8"><mml:mrow><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi mathvariant="italic">x-ray</mml:mi></mml:mrow><mml:mi>t</mml:mi></mml:msubsup><mml:mo>=</mml:mo><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi mathvariant="italic">x-ray</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>−</mml:mo><mml:mi mathvariant="italic">Δt</mml:mi></mml:mrow></mml:msubsup><mml:mfrac><mml:mrow><mml:msub><mml:mi>T</mml:mi><mml:mrow><mml:mi mathvariant="italic">target</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mi>T</mml:mi><mml:mrow><mml:mi mathvariant="italic">protein</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p><p>Thus, the thermostat offset controls the X-ray weight in a system independent manner whilst maintaining the target temperature. In the acquisition phase the X-ray weight is fixed to the averaged value used in the equilibration phase.</p></sec><sec id="s4-1-5"><title>TLS approximation of the global disorder</title><p>The partitioning of inter-molecular disorder is performed before the start of the simulation using ADPs from the traditionally refined starting structure. TLS groups are assigned per molecule or domain as appropriate to model global packing disorder. For each group, TLS parameters are fitted to the ADPs of the starting structure for all non-solvent, non-hydrogen atoms. 
The agreement of the isotropic equivalents for the fitted TLS ADPs (<italic>B</italic><sub><italic>tls</italic></sub>) and the reference ADPs (<italic>B</italic><sub><italic>ref</italic></sub>) is scored as <xref ref-type="disp-formula" rid="eqn9">(9)</xref> for all non-solvent, non-hydrogen atoms.<disp-formula id="eqn9"><label>(9)</label><mml:math id="m9"><mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:msubsup><mml:mi>B</mml:mi><mml:mrow><mml:mi mathvariant="italic">ref</mml:mi></mml:mrow><mml:mi>i</mml:mi></mml:msubsup><mml:mo>−</mml:mo><mml:msubsup><mml:mi>B</mml:mi><mml:mrow><mml:mi mathvariant="italic">tls</mml:mi></mml:mrow><mml:mi>i</mml:mi></mml:msubsup></mml:mrow><mml:mo>|</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula></p><p>The atoms with the poorest-fitting ADPs are excluded from the next round of TLS parameter fitting, so that only the fraction <italic>p</italic><sub><italic>TLS</italic></sub> of best-fitting atoms is retained, and the procedure is repeated until the fitted TLS parameters converge. The converged TLS parameters are then applied to all atoms within that group for the duration of the simulation. Solvent atoms are assigned to the TLS group of the closest non-water atom; this assignment is updated every 250 time-steps. This TLS model produced a lower mean <italic>R</italic><sub><italic>free</italic></sub> than either the ADP values from the re-refined single structure or a single overall isotropic <italic>B</italic>-factor for all atoms in the model (<xref ref-type="table" rid="tbl5">Table 5</xref>).<table-wrap id="tbl5" position="float"><object-id pub-id-type="doi">10.7554/eLife.00311.025</object-id><label>Table 5.</label><caption><p>Comparison of three <italic>B</italic>-factor models for ensemble refinement. Burling et al. (<xref ref-type="bibr" rid="bib11">Burling and Brunger, 1994</xref>) had shown previously that the choice of ADPs for ensemble refinement can affect the resultant structures. 
Three alternative ADP models were tested for seven datasets. (1) ‘Global isotropic <italic>B</italic>-factor’, one overall isotropic <italic>B</italic>-factor applied to all atoms in the simulation. Multiple trials were performed to establish the optimum single value. For comparison the Wilson <italic>B</italic>-factor of the data is listed. (2) ‘Refined ADPs’, ADPs from the refined single-structures. Best results were obtained by multiplying the refined ADPs by the given scale factor. (3) ‘Fitted TLS ADPs’, fitted TLS model obtained as described in ‘Materials and methods’.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.025">http://dx.doi.org/10.7554/eLife.00311.025</ext-link></p></caption><table frame="hsides" rules="groups"><thead><tr><td rowspan="2">PDB</td><td rowspan="2">Resolution (Å)</td><td colspan="4">Global isotropic <italic>B</italic>-factor</td><td colspan="3">Refined ADPs</td><td colspan="3">Fitted TLS ADPs</td></tr><tr><td><italic>R</italic><sub><italic>work</italic></sub></td><td><italic>R</italic><sub><italic>free</italic></sub></td><td>Wilson <italic>B</italic>-factor (Å<sup>2</sup>)</td><td>Global <italic>B</italic>-factor (Å<sup>2</sup>)</td><td><italic>R</italic><sub><italic>work</italic></sub></td><td><italic>R</italic><sub><italic>free</italic></sub></td><td>Scale factor</td><td><italic>R</italic><sub><italic>work</italic></sub></td><td><italic>R</italic><sub><italic>free</italic></sub></td><td><italic>p</italic><sub><italic>TLS</italic></sub></td></tr></thead><tbody><tr><td>3K0M</td><td align="char" char=".">1.3</td><td align="char" char=".">0.117</td><td align="char" char=".">0.147</td><td align="char" char=".">12.0</td><td align="char" char=".">12.0</td><td align="char" char=".">0.125</td><td align="char" char=".">0.146</td><td align="char" char=".">0.9</td><td align="char" char=".">0.103</td><td align="char" char=".">0.130</td><td align="char" char=".">0.3</td></tr><tr><td>3K0N</td><td align="char" char=".">1.4</td><td align="char" 
char=".">0.121</td><td align="char" char=".">0.153</td><td align="char" char=".">19.1</td><td align="char" char=".">19.1</td><td align="char" char=".">0.126</td><td align="char" char=".">0.153</td><td align="char" char=".">0.9</td><td align="char" char=".">0.114</td><td align="char" char=".">0.133</td><td align="char" char=".">0.1</td></tr><tr><td>1UOY</td><td align="char" char=".">1.5</td><td align="char" char=".">0.103</td><td align="char" char=".">0.148</td><td align="char" char=".">10.4</td><td align="char" char=".">9.4</td><td align="char" char=".">0.107</td><td align="char" char=".">0.144</td><td align="char" char=".">0.9</td><td align="char" char=".">0.101</td><td align="char" char=".">0.136</td><td align="char" char=".">0.3</td></tr><tr><td>3CA7</td><td align="char" char=".">1.5</td><td align="char" char=".">0.129</td><td align="char" char=".">0.194</td><td align="char" char=".">16.8</td><td align="char" char=".">13.4</td><td align="char" char=".">0.142</td><td align="char" char=".">0.192</td><td align="char" char=".">0.9</td><td align="char" char=".">0.142</td><td align="char" char=".">0.190</td><td align="char" char=".">0.5</td></tr><tr><td>1X6P</td><td align="char" char=".">1.6</td><td align="char" char=".">0.108</td><td align="char" char=".">0.158</td><td align="char" char=".">15.9</td><td align="char" char=".">12.7</td><td align="char" char=".">0.113</td><td align="char" char=".">0.152</td><td align="char" char=".">0.8</td><td align="char" char=".">0.121</td><td align="char" char=".">0.150</td><td align="char" char=".">0.8</td></tr><tr><td>1F2F</td><td align="char" char=".">1.7</td><td align="char" char=".">0.116</td><td align="char" char=".">0.184</td><td align="char" char=".">15.6</td><td align="char" char=".">14.8</td><td align="char" char=".">0.123</td><td align="char" char=".">0.167</td><td align="char" char=".">0.8</td><td align="char" char=".">0.126</td><td align="char" char=".">0.167</td><td align="char" 
char=".">0.7</td></tr><tr><td>1BV1</td><td align="char" char=".">2.0</td><td align="char" char=".">0.125</td><td align="char" char=".">0.192</td><td align="char" char=".">22.6</td><td align="char" char=".">18.1</td><td align="char" char=".">0.135</td><td align="char" char=".">0.191</td><td align="char" char=".">0.8</td><td align="char" char=".">0.145</td><td align="char" char=".">0.182</td><td align="char" char=".">0.6</td></tr><tr><td><bold>Mean</bold></td><td><bold>-</bold></td><td align="char" char="."><bold>0.117</bold></td><td align="char" char="."><bold>0.168</bold></td><td>-</td><td>-</td><td align="char" char="."><bold>0.125</bold></td><td align="char" char="."><bold>0.164</bold></td><td>-</td><td align="char" char="."><bold>0.122</bold></td><td align="char" char="."><bold>0.155</bold></td><td>-</td></tr></tbody></table></table-wrap></p></sec><sec id="s4-1-6"><title>Generation of the final ensemble</title><p>Structure factors for the final ensemble are calculated from the population of collected structures as in <xref ref-type="disp-formula" rid="eqn2">(2)</xref>, with 〈<bold><italic>F</italic></bold><sub><italic>calc</italic></sub>〉 and 〈<bold><italic>F</italic></bold><sub><italic>mask</italic></sub>〉 defined as in <xref ref-type="disp-formula" rid="eqn10">(10)</xref> and <xref ref-type="disp-formula" rid="eqn11">(11)</xref>.<disp-formula id="eqn10"><label>(10)</label><mml:math id="m10"><mml:mrow><mml:msub><mml:mrow><mml:mrow><mml:mo>〈</mml:mo><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">calc</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>〉</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi mathvariant="italic">final</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>n</mml:mi></mml:mfrac><mml:mstyle 
displaystyle="true"><mml:munderover><mml:mo>∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:mrow><mml:msubsup><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">calc</mml:mi></mml:mrow><mml:mi>i</mml:mi></mml:msubsup><mml:mo>,</mml:mo></mml:mrow></mml:mstyle></mml:mrow></mml:math></disp-formula><disp-formula id="eqn11"><label>(11)</label><mml:math id="m11"><mml:mrow><mml:msub><mml:mrow><mml:mrow><mml:mo>〈</mml:mo><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">mask</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>〉</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi mathvariant="italic">final</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>n</mml:mi></mml:mfrac><mml:mstyle displaystyle="true"><mml:munderover><mml:mo>∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:mrow><mml:msubsup><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">mask</mml:mi></mml:mrow><mml:mi>i</mml:mi></mml:msubsup><mml:mo>.</mml:mo></mml:mrow></mml:mstyle></mml:mrow></mml:math></disp-formula></p><p>The acquisition phase is split into several time blocks, in each of which 250 structures are typically stored. The <italic>R</italic>-values of all possible contiguous time blocks are calculated and the periods with the lowest <italic>R</italic><sub><italic>work</italic></sub> are selected. This selection reduces the <italic>R</italic><sub><italic>work</italic></sub> by 0–1.0% (mean improvement of 0.3%). 
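The exhaustive search over contiguous blocks can be sketched as follows. This is a simplified illustration under stated assumptions, not the PHENIX code: the names <monospace>r_factor</monospace> and <monospace>best_contiguous_blocks</monospace> are hypothetical, each block is represented by its mean complex structure factor per working-set reflection, blocks are assumed to hold equal numbers of structures, and the bulk-solvent and anisotropic scaling terms of <xref ref-type="disp-formula" rid="eqn2">(2)</xref> are replaced by a single sum-ratio scale. <code language="python"><![CDATA[
```python
def r_factor(f_obs, f_model):
    """R = sum| |Fobs| - k|Fmodel| | / sum|Fobs|, with a simple sum-ratio scale k."""
    amps = [abs(f) for f in f_model]
    k = sum(f_obs) / sum(amps)
    return sum(abs(fo - k * fm) for fo, fm in zip(f_obs, amps)) / sum(f_obs)


def best_contiguous_blocks(block_means, f_obs):
    """Score every contiguous run of blocks; return (R, first, last) for the lowest R.

    block_means[b][h] is the mean complex structure factor of reflection h
    over the structures stored in block b (equal-sized blocks assumed).
    """
    n = len(block_means)
    best = None
    for first in range(n):
        for last in range(first, n):
            run = block_means[first:last + 1]
            # Average the block means reflection by reflection.
            f_avg = [sum(col) / len(run) for col in zip(*run)]
            score = (r_factor(f_obs, f_avg), first, last)
            if best is None or score < best:
                best = score
    return best


# Toy data: three reflections, three time blocks (all values hypothetical).
f_obs = [10.0, 5.0, 2.0]
block_means = [
    [10 + 0j, 8 + 0j, 1 + 0j],
    [12 + 0j, 4 + 0j, 2 + 0j],
    [8 + 0j, 6 + 0j, 2 + 0j],
]
print(best_contiguous_blocks(block_means, f_obs))  # (0.0, 1, 2)
```
]]></code> In this toy input, averaging blocks 1 and 2 reproduces the observed amplitudes exactly, so that run is selected over any single block or the full trajectory.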
For the 1YTT dataset with high quality experimental phases, the block selection for lowest <italic>R</italic><sub><italic>work</italic></sub> corresponds well with the overall map correlation coefficient computed between the experimentally phased map and the map derived from the ensemble model (<xref ref-type="fig" rid="fig13">Figure 13</xref>). Next, to reduce the redundancy in the number of structures in the final ensemble (during the simulation thousands of structures are collected), we calculate the smallest number of structures that reproduce the <italic>R</italic><sub><italic>free</italic></sub> within 0.1%. This is performed by iteratively parsing the stored structures with increasing time spacing (see <xref ref-type="fig" rid="fig1">Figure 1D</xref>). The overall and bulk-solvent scale factors are optimised for the final ensemble. The ensembles of structures are stored using the standard PDB format for multiple models, with <italic>B</italic>-factors listed as computed from the TLS model and overall <italic>B</italic>-factor scaling contributions.<fig id="fig13" position="float"><object-id pub-id-type="doi">10.7554/eLife.00311.026</object-id><label>Figure 13.</label><caption><p>Correlation of <italic>R</italic>-values and overall map correlation coefficient for the 1YTT dataset in the block selection procedure. 
The correlation coefficients are calculated between the experimentally phased electron density map (|<bold><italic>F</italic></bold><sub><italic>obs</italic></sub>|exp[<italic>iφ</italic><sub><italic>obs</italic></sub>]) and ensemble model maps (|<bold><italic>F</italic></bold><sub><italic>model</italic></sub>|exp[<italic>iφ</italic><sub><italic>model</italic></sub>]) computed for different blocks of consecutive simulation times; blue squares indicate <italic>R</italic><sub><italic>work</italic></sub> and red squares indicate <italic>R</italic><sub><italic>free</italic></sub>.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.026">http://dx.doi.org/10.7554/eLife.00311.026</ext-link></p></caption><graphic xlink:href="elife00311f013"/></fig></p></sec><sec id="s4-1-7"><title>Calculation of atomic positional probability</title><p>All atoms comprising the ensemble are assigned a probability (<italic>P</italic><sub><italic>i</italic></sub>) based on the positional likelihood of atom <italic>i</italic> in a given model relative to the complete ensemble of models. <bold><italic>F</italic></bold><sub><italic>calc</italic></sub> electron-density maps are calculated for each model in the ensemble, and a 〈<bold><italic>F</italic></bold><sub><italic>calc</italic></sub>〉 electron-density map is calculated for the complete ensemble as in <xref ref-type="disp-formula" rid="eqn10">(10)</xref>. 
<italic>P</italic><sub><italic>i</italic></sub> is calculated as <xref ref-type="disp-formula" rid="eqn12">(12)</xref>.<disp-formula id="eqn12"><label>(12)</label><mml:math id="m12"><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msubsup><mml:mi>ρ</mml:mi><mml:mrow><mml:mrow><mml:mo>〈</mml:mo><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">calc</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>〉</mml:mo></mml:mrow></mml:mrow><mml:mi>i</mml:mi></mml:msubsup></mml:mrow><mml:mrow><mml:msubsup><mml:mi>ρ</mml:mi><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mrow><mml:mi mathvariant="italic">calc</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mi>i</mml:mi></mml:msubsup></mml:mrow></mml:mfrac><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p><p>Calculating <italic>P</italic><sub><italic>i</italic></sub> from an electron-density function allows for non-Gaussian distributions, unlike the RMSF, which is calculated from the mean atomic position. These probabilities aid the visual inspection of the ensemble models and allow the observer to control the level of detail displayed (<xref ref-type="fig" rid="fig14">Figure 14</xref>).<fig id="fig14" position="float"><object-id pub-id-type="doi">10.7554/eLife.00311.027</object-id><label>Figure 14.</label><caption><p>Interpretation of global and local details of the 1UOY ensemble model is aided by relative atomic probability (as described in ‘Materials and methods’). Ensemble models, left and centre, are coloured by individual atom probability (0–1) from red to blue. Single structures, right, are coloured by individual atomic <italic>B</italic>-factor as refined in phenix.refine. (<bold>A</bold>) Global structure; selecting different probability ranges highlights partially ordered water positions. (<bold>B</bold>) Atomic probabilities of loop region features correlate with <italic>B</italic>-factors in the single structure. 
Anharmonic motion of Ser5 can be observed as well as anisotropic motion at Tyr7, which is shown in more detail in (<bold>C</bold>).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00311.027">http://dx.doi.org/10.7554/eLife.00311.027</ext-link></p></caption><graphic xlink:href="elife00311f014"/></fig></p></sec></sec><sec id="s4-2"><title>Ensemble refinement protocol</title><sec id="s4-2-1"><title>Preparing the starting model</title><p>The starting structures were taken from the PDB server or from the PDB_REDO server if the <italic>R</italic><sub><italic>free</italic></sub> was &lt;0.25% than the equivalent PDB structure. We removed alternative positions and set the corresponding occupancies to one. Overall anisotropic scale factors and the solvent scale and <italic>B</italic>-factor (<italic>k</italic><sub><italic>sol</italic></sub> and <italic>B</italic><sub><italic>sol</italic></sub>) were calculated from these traditional single structures (i.e. using the refined <italic>B</italic>-factor models). Next, the atomic <italic>B</italic>-factors were substituted by <italic>B</italic>-factors derived from the global TLS disorder model (‘Materials and methods’, TLS approximation of the global disorder).</p></sec><sec id="s4-2-2"><title>X-ray restrained MD simulation</title><p>At <italic>t</italic> = 0, 〈<bold><italic>F</italic></bold><sub><italic>calc</italic></sub>〉 and 〈<bold><italic>F</italic></bold><sub><italic>sol</italic></sub>〉 are set to <bold><italic>F</italic></bold><sub><italic>calc</italic></sub> and <bold><italic>F</italic></bold><sub><italic>sol</italic></sub>. Boltzmann-weighted velocities are assigned to the atoms, corresponding to <italic>T</italic> = 300 K. The bath temperature <italic>T</italic><sub><italic>bath</italic></sub> used for velocity scaling is coupled to the X-ray weight (<italic>w</italic><sub><italic>x-ray</italic></sub>) calculation, resulting in a temperature of 300 K for all non-solvent atoms. 
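The initialisation at <italic>t</italic> = 0 and the two recurring updates, the memory update of <xref ref-type="disp-formula" rid="eqn3">(3)</xref> and the weight update of <xref ref-type="disp-formula" rid="eqn8">(8)</xref>, can be sketched as follows. This is a minimal sketch, not the PHENIX implementation: the function names are hypothetical, structure factors are represented as plain lists of complex numbers, <italic>∆t</italic> is assumed to be the interval between successive updates, and all numerical values are illustrative. <code language="python"><![CDATA[
```python
import math


def update_rolling_average(f_avg, f_now, dt, tau_x):
    """Equation (3): blend the stored average with the current structure factors."""
    decay = math.exp(-dt / tau_x)
    return [decay * fa + (1.0 - decay) * fn for fa, fn in zip(f_avg, f_now)]


def update_xray_weight(w_prev, t_target, t_protein):
    """Equation (8): scale the previous X-ray weight by T_target / T_protein."""
    return w_prev * t_target / t_protein


# At t = 0 the average is initialised to the instantaneous values.
f_calc_t0 = [10 + 2j, -3 + 4j]  # hypothetical structure factors
f_avg = list(f_calc_t0)

# One update with illustrative values: dt = 0.005 ps, tau_x = 1.0 ps.
f_calc_now = [11 + 2j, -3 + 5j]
f_avg = update_rolling_average(f_avg, f_calc_now, dt=0.005, tau_x=1.0)

# Protein atoms running hot (303 K) against a 300 K target lower the weight.
w_xray = update_xray_weight(1.0, t_target=300.0, t_protein=303.0)
print(round(w_xray, 4))  # 0.9901
```
]]></code> Because the decay factor is close to one when <italic>∆t</italic> is much shorter than <italic>τ</italic><sub><italic>x</italic></sub>, each new structure shifts the average only slightly; the same update applies to the mask structure factors.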
The simulation time-step used is 0.5 fs and the force-field parameterisation is as described (<xref ref-type="bibr" rid="bib23">Grosse-Kunstleve et al., 2004</xref>). Simulations are started in parallel with varying values of <italic>p</italic><sub><italic>TLS</italic></sub> (e.g. 0.2, 0.6, 0.8, 0.9, 1.0), <italic>τ</italic><sub><italic>x</italic></sub> (e.g. 0.25, 0.5 and 1.0 ps) and <italic>T</italic><sub><italic>bath</italic></sub> (e.g. 295 and 299 K). Water positions are picked according to electron-density criteria and updated every 250 time-steps. Every 10 time-steps, the rolling-average structure factors, 〈<bold><italic>F</italic></bold><sub><italic>calc</italic></sub>〉 and 〈<bold><italic>F</italic></bold><sub><italic>sol</italic></sub>〉, are updated for use in the time-averaged X-ray restraints. The σ<sub><italic>A</italic></sub> values are updated if the <italic>R</italic><sub><italic>free</italic></sub> of the rolling-average model improves by &gt;0.25%. The simulations have an equilibration phase (20<italic>τ</italic><sub><italic>x</italic></sub>) in which the temperature, X-ray weight and averaged structure factors stabilize. 
This is followed by an acquisition phase (40<italic>τ</italic><sub><italic>x</italic></sub>) in which the values for <italic>w</italic><sub><italic>x-ray</italic></sub> and σ<sub><italic>A</italic></sub> are fixed and the structures for the final ensemble model are collected.</p></sec><sec id="s4-2-3"><title>CPU time</title><p>CPU time for a dataset at 2.0-Å resolution with 199 residues in the asymmetric unit is 25 hr for each simulation using a 1.9 GHz processor.</p></sec></sec><sec id="s4-3"><title>Single structure re-refinements</title><p>The single-structure re-refinements started from the same structures as the ensemble refinements (alternative conformations were not removed) and were carried out using phenix.refine (version 1.7.1) (<xref ref-type="bibr" rid="bib4">Afonine et al., 2012</xref>) and Buster (version 2.10.0) (<xref ref-type="bibr" rid="bib9">Bricogne et al., 2009</xref>). Standard parameters were used with the exception of optimizing the target weights and increasing the number of macro-cycles to 8 in phenix.refine. Explicit water refinement was performed, anisotropic ADPs were used if present in the starting structure and TLS parameters were defined as used in the starting structure. PDB_REDO (<xref ref-type="bibr" rid="bib28">Joosten et al., 2010</xref>) models were used as deposited.</p></sec><sec id="s4-4"><title>Additional ensemble refinement calculations</title><sec id="s4-4-1"><title>Testing <italic>B</italic>-factor model in ensemble refinement</title><p>Burling et al. (<xref ref-type="bibr" rid="bib11">Burling and Brunger, 1994</xref>) had previously shown that the choice of ADPs for ensemble refinement can affect the resultant structures. Three alternative ADP models were tested for seven datasets, as shown in <xref ref-type="table" rid="tbl5">Table 5</xref>. ADP model 1, ‘Global isotropic <italic>B</italic>-factor’, uses one overall isotropic <italic>B</italic>-factor applied to all atoms in the simulation. 
Multiple trials were performed to establish the optimum single value. For comparison the Wilson <italic>B</italic>-factor of the data is listed. ADP model 2, ‘Refined ADPs’, uses the ADPs from the refined single-structures. Best results were obtained by multiplying the refined ADPs by the scale factor given. ADP model 3, ‘Basal TLS ADPs’ (listed as ‘Fitted TLS ADPs’ in <xref ref-type="table" rid="tbl5">Table 5</xref>), uses the basal TLS model with one TLS group per chain (including all non-hydrogen, non-solvent atoms) obtained as described in ‘Materials and methods’, TLS approximation of the global disorder, where <italic>p</italic><sub><italic>TLS</italic></sub> is the fraction of atoms included in the iterative fitting procedure. The basal TLS model returns the lowest <italic>R</italic><sub><italic>free</italic></sub> values in all test cases, tying with the refined ADPs for 1F2F.</p></sec><sec id="s4-4-2"><title>Effect of starting model on ensemble refinement</title><p>To test the effect of the starting structure, three datasets (1UOY, 3CA7 and 1BV1) were re-refined with Buster, phenix.refine and Refmac (as given by the PDB_REDO server). Each of these re-refined structures was used as the input structure for ensemble refinement, using the same run-time parameters. Each ensemble refinement was repeated three times using a different random number seed to generate the initial atomic velocities. The results are shown in <xref ref-type="table" rid="tbl3">Table 3</xref>. The mean <italic>R</italic><sub><italic>free</italic></sub> values (averaged over the random number seed repeats) of the resulting ensembles from the three different input structures are within 0.5% of each other. The <bold><italic>F</italic></bold><sub><italic>model</italic></sub> cross correlation of ensemble pairs (best representative from each program, selected by <italic>R</italic><sub><italic>free</italic></sub>) was calculated and is shown in <xref ref-type="table" rid="tbl4">Table 4</xref>. 
All ensemble pairs exhibit a cross correlation of greater than 0.99.</p></sec><sec id="s4-4-3"><title>Partial occupancies</title><p>Because occupancy and <italic>B</italic>-factor are strongly coupled, in a traditional single-structure refinement the occupancies of bound ligands and ions are typically set to unity while the corresponding <italic>B</italic>-factors are refined. In ensemble refinement, the <italic>B</italic>-factors are not refined, but are derived from the global TLS model and the atomic fluctuations.</p><p>All simulations were initially performed with full occupancy for bound ligands and ions. In several cases, this resulted in excessive sampling of the ligand or ion, as seen when inspecting the ensemble and as reported by the kinetic energies during the simulation, which were far in excess of those of neighbouring protein atoms. These observations indicate that the corresponding occupancy of the bound ligand or ion is less than one. In these cases the occupancies were lowered and the simulations repeated until the kinetic energy of the ligand or ion was equivalent to that of the proximal protein components.</p></sec></sec></sec></body><back><ack id="ack"><title>Acknowledgements</title><p>We gratefully acknowledge L. Kroon-Batenburg, R.J. Read and T. Terwilliger for discussions and A.T. Brunger for providing the experimental data for the 1YTT dataset. B.T.B. and P.G. developed the method. B.T.B., with the help of P.V.A. and P.D.A., programmed the method in PHENIX. B.T.B. and P.G. analysed the data and wrote the manuscript. P.D.A. and P.V.A. assisted in the writing of the manuscript. 
The atomic coordinates and structure factors for the ensemble structures are available at <ext-link ext-link-type="uri" xlink:href="http://www.phenix-online.org/phenix_data/">http://www.phenix-online.org/phenix_data/</ext-link> and have been deposited in the Dryad online repository (<xref ref-type="bibr" rid="bib13">Burnley et al., 2012</xref>).</p></ack><sec sec-type="additional-information"><title>Additional information</title><fn-group content-type="competing-interest"><title>Competing interests</title><fn fn-type="conflict" id="conf1"><p>The authors have declared that no competing interests exist</p></fn></fn-group><fn-group content-type="author-contribution"><title>Author contributions</title><fn fn-type="con" id="con1"><p>BTB: developed the method, programmed the method in PHENIX, analysed the data and wrote the manuscript</p></fn><fn fn-type="con" id="con2"><p>PVA: helped program the method in PHENIX and assisted in the writing of the manuscript</p></fn><fn fn-type="con" id="con3"><p>PDA: helped program the method in PHENIX and assisted in the writing of the manuscript</p></fn><fn fn-type="con" id="con4"><p>PG: developed the method, analysed the data and wrote the manuscript</p></fn></fn-group></sec><sec sec-type="supplementary-material"><title>Additional files</title><sec sec-type="datasets"><title>Major datasets</title><p>The following datasets were generated</p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro1"><name><surname>Burnley</surname><given-names>BT</given-names></name>, <name><surname>Afonine</surname><given-names>PV</given-names></name>, <name><surname>Adams</surname><given-names>PD</given-names></name>, <name><surname>Gros</surname><given-names>P</given-names></name>, <year>2012</year><x>, </x><source>Data from: Modelling dynamics in protein crystal structures by ensemble refinement</source><x>, </x><ext-link ext-link-type="uri" 
xlink:href="http://dx.doi.org/10.5061/dryad.5n01h">http://dx.doi.org/10.5061/dryad.5n01h</ext-link><x>, </x><comment>Available at Dryad Digital Repository</comment></related-object></p><p>The following previously published datasets were used:</p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url 2" document-id-type="dataset" document-type="data" id="dataro2"><name><surname>Reiling</surname><given-names>KK</given-names></name>, <name><surname>Endres</surname><given-names>NF</given-names></name>, <name><surname>Dauber</surname><given-names>DS</given-names></name>, <name><surname>Craik</surname><given-names>CS</given-names></name>, <name><surname>Stroud</surname><given-names>RM</given-names></name>, <year>2002</year><x>, </x><source>JE-2147-HIV Protease Complex</source><x>, </x><object-id pub-id-type="art-access-id">1KZK</object-id><x>; </x><comment>Publically available at the RCSB Protein Data Bank (<ext-link ext-link-type="uri" xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url 3" document-id-type="dataset" document-type="data" id="dataro3"><name><surname>Fraser</surname><given-names>JS</given-names></name>, <name><surname>Clarkson</surname><given-names>MW</given-names></name>, <name><surname>Degnan</surname><given-names>SC</given-names></name>, <name><surname>Erion</surname><given-names>R</given-names></name>, <name><surname>Kern</surname><given-names>D</given-names></name>, <name><surname>Alber</surname><given-names>T</given-names></name>, <year>2009</year><x>, </x><source>Cryogenic structure of CypA</source><x>, </x><object-id pub-id-type="art-access-id">3K0M</object-id><x>; </x><comment>Publically available at the RCSB Protein Data Bank (<ext-link ext-link-type="uri" 
xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url 4" document-id-type="dataset" document-type="data" id="dataro4"><name><surname>Fraser</surname><given-names>JS</given-names></name>, <name><surname>Clarkson</surname><given-names>MW</given-names></name>, <name><surname>Degnan</surname><given-names>SC</given-names></name>, <name><surname>Erion</surname><given-names>R</given-names></name>, <name><surname>Kern</surname><given-names>D</given-names></name>, <name><surname>Alber</surname><given-names>T</given-names></name>, <year>2009</year><x>, </x><source>Room temperature structure of CypA</source><x>, </x><object-id pub-id-type="art-access-id">3K0N</object-id><x>; </x><comment>Publically available at the RCSB Protein Data Bank (<ext-link ext-link-type="uri" xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url 5" document-id-type="dataset" document-type="data" id="dataro5"><name><surname>Heaslet</surname><given-names>H</given-names></name>, <name><surname>Rosenfeld</surname><given-names>R</given-names></name>, <name><surname>Giffin</surname><given-names>M</given-names></name>, <name><surname>Lin</surname><given-names>YC</given-names></name>, <name><surname>Tam</surname><given-names>K</given-names></name>, <name><surname>Torbett</surname><given-names>BE</given-names></name>, <name><surname>Elder</surname><given-names>JH</given-names></name>, <name><surname>McRee</surname><given-names>DE</given-names></name>, <name><surname>Stout</surname><given-names>CD</given-names></name>, <year>2007</year><x>, </x><source>Apo Wild-type HIV Protease in the open conformation</source><x>, </x><object-id pub-id-type="art-access-id">2PC0</object-id><x>; </x><comment>Publically available at the RCSB Protein Data 
Bank (<ext-link ext-link-type="uri" xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url 6" document-id-type="dataset" document-type="data" id="dataro6"><name><surname>Olsen</surname><given-names>JG</given-names></name>, <name><surname>Flensburg</surname><given-names>C</given-names></name>, <name><surname>Olsen</surname><given-names>O</given-names></name>, <name><surname>Bricogne</surname><given-names>G</given-names></name>, <name><surname>Henriksen</surname><given-names>A</given-names></name>, <year>2004</year><x>, </x><source>The Bubble Protein from <italic>Penicillium brevicompactum</italic> Dierckx Exudate</source><x>, </x><object-id pub-id-type="art-access-id">1UOY</object-id><x>; </x><comment>Publically available at the RCSB Protein Data Bank (<ext-link ext-link-type="uri" xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url 7" document-id-type="dataset" document-type="data" id="dataro7"><name><surname>Klein</surname><given-names>DE</given-names></name>, <name><surname>Stayrook</surname><given-names>SE</given-names></name>, <name><surname>Shi</surname><given-names>F</given-names></name>, <name><surname>Narayan</surname><given-names>K</given-names></name>, <name><surname>Lemmon</surname><given-names>MA</given-names></name>, <year>2008</year><x>, </x><source>High Resolution Crystal Structure of the EGF domain of spitz</source><x>, </x><object-id pub-id-type="art-access-id">3CA7</object-id><x>; </x><comment>Publically available at the RCSB Protein Data Bank (<ext-link ext-link-type="uri" xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url 8" 
document-id-type="dataset" document-type="data" id="dataro8"><name><surname>Wang</surname><given-names>H</given-names></name>, <name><surname>Yan</surname><given-names>Z</given-names></name>, <name><surname>Geng</surname><given-names>J</given-names></name>, <name><surname>Kunz</surname><given-names>S</given-names></name>, <name><surname>Seebeck</surname><given-names>T</given-names></name>, <name><surname>Ke</surname><given-names>H</given-names></name>, <year>2007</year><x>, </x><source>Structure of LmjPDEB1 in complex with IBMX</source><x>, </x><object-id pub-id-type="art-access-id">2R8Q</object-id><x>; </x><comment>Publically available at the RCSB Protein Data Bank (<ext-link ext-link-type="uri" xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url 9" document-id-type="dataset" document-type="data" id="dataro9"><name><surname>Bhabha</surname><given-names>G</given-names></name>, <name><surname>Lee</surname><given-names>J</given-names></name>, <name><surname>Ekiert</surname><given-names>DC</given-names></name>, <name><surname>Gam</surname><given-names>J</given-names></name>, <name><surname>Wilson</surname><given-names>IA</given-names></name>, <name><surname>Dyson</surname><given-names>HJ</given-names></name>, <name><surname>Benkovic</surname><given-names>SJ</given-names></name>, <name><surname>Wright</surname><given-names>PE</given-names></name>, <year>2011</year><x>, </x><source>Crystal structure of N23PP/S148A mutant of <italic>E. 
coli</italic> dihydrofolate reductase</source><x>, </x><object-id pub-id-type="art-access-id">3QL0</object-id><x>; </x><comment>Publically available at the RCSB Protein Data Bank (<ext-link ext-link-type="uri" xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url 10" document-id-type="dataset" document-type="data" id="dataro10"><name><surname>Dunlop</surname><given-names>KV</given-names></name>, <name><surname>Irvin</surname><given-names>RT</given-names></name>, <name><surname>Hazes</surname><given-names>B</given-names></name>, <year>2005</year><x>, </x><source>Structure 4; room temperature crystal structure of truncated pak pilin from <italic>Pseudomonas aeruginosa</italic> at 1.63A resolution</source><x>, </x><object-id pub-id-type="art-access-id">1X6P</object-id><x>; </x><comment>Publically available at the RCSB Protein Data Bank (<ext-link ext-link-type="uri" xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url 11" document-id-type="dataset" document-type="data" id="dataro11"><name><surname>Kimber</surname><given-names>MS</given-names></name>, <name><surname>Nachman</surname><given-names>J</given-names></name>, <name><surname>Cunningham</surname><given-names>AM</given-names></name>, <name><surname>Gish</surname><given-names>GD</given-names></name>, <name><surname>Pawson</surname><given-names>T</given-names></name>, <name><surname>Pai</surname><given-names>EF</given-names></name>, <year>2000</year><x>, </x><source>SRC SH2 ThrEF1Trp Mutant</source><x>, </x><object-id pub-id-type="art-access-id">1F2F</object-id><x>; </x><comment>Publically available at the RCSB Protein Data Bank (<ext-link ext-link-type="uri" 
xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url 12" document-id-type="dataset" document-type="data" id="dataro12"><name><surname>Bhabha</surname><given-names>G</given-names></name>, <name><surname>Lee</surname><given-names>J</given-names></name>, <name><surname>Ekiert</surname><given-names>DC</given-names></name>, <name><surname>Gam</surname><given-names>J</given-names></name>, <name><surname>Wilson</surname><given-names>IA</given-names></name>, <name><surname>Dyson</surname><given-names>HJ</given-names></name>, <name><surname>Benkovic</surname><given-names>SJ</given-names></name>, <name><surname>Wright</surname><given-names>PE</given-names></name>, <year>2011</year><x>, </x><source>Re-refined coordinates for PDB entry 1RX2</source><x>, </x><object-id pub-id-type="art-access-id">3QL3</object-id><x>; </x><comment>Publically available at the RCSB Protein Data Bank (<ext-link ext-link-type="uri" xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url 13" document-id-type="dataset" document-type="data" id="dataro13"><name><surname>Burling</surname><given-names>FT</given-names></name>, <name><surname>Weis</surname><given-names>WI</given-names></name>, <name><surname>Flaherty</surname><given-names>KM</given-names></name>, <name><surname>Brunger</surname><given-names>AT</given-names></name>, <year>1996</year><x>, </x><source>Yb substituted subtilisin fragment of mannose binding protein A (Sub-MBP-A), MAD structure at 110K</source><x>, </x><object-id pub-id-type="art-access-id">1YTT</object-id><x>; </x><comment>Publically available at the RCSB Protein Data Bank (<ext-link ext-link-type="uri" 
xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url 14" document-id-type="dataset" document-type="data" id="dataro14"><name><surname>Rodriguez</surname><given-names>DD</given-names></name>, <name><surname>Grosse</surname><given-names>C</given-names></name>, <name><surname>Himmel</surname><given-names>S</given-names></name>, <name><surname>Gonzalez</surname><given-names>C</given-names></name>, <name><surname>de Ilarduya</surname><given-names>IM</given-names></name>, <name><surname>Becker</surname><given-names>S</given-names></name>, <name><surname>Sheldrick</surname><given-names>GM</given-names></name>, <name><surname>Uson</surname><given-names>I</given-names></name>, <year>2009</year><x>, </x><source>Crystallographic Ab Initio protein solution far below atomic resolution</source><x>, </x><object-id pub-id-type="art-access-id">3GWH</object-id><x>; </x><comment>Publically available at the RCSB Protein Data Bank (<ext-link ext-link-type="uri" xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url 15" document-id-type="dataset" document-type="data" id="dataro15"><name><surname>Gajhede</surname><given-names>M</given-names></name>, <name><surname>Osmark</surname><given-names>P</given-names></name>, <name><surname>Poulsen</surname><given-names>FM</given-names></name>, <name><surname>Ipsen</surname><given-names>H</given-names></name>, <name><surname>Larsen</surname><given-names>JN</given-names></name>, <name><surname>Joost van Neerven</surname><given-names>RJ</given-names></name>, <name><surname>Schou</surname><given-names>C</given-names></name>, <name><surname>Lowenstein</surname><given-names>H</given-names></name>, <name><surname>Spangfort</surname><given-names>MD</given-names></name>, 
<year>1996</year><x>, </x><source>Birch pollen allergen Bet V 1</source><x>, </x><object-id pub-id-type="art-access-id">1BV1</object-id><x>; </x><comment>Publically available at the RCSB Protein Data Bank (<ext-link ext-link-type="uri" xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url 16" document-id-type="dataset" document-type="data" id="dataro16"><name><surname>Nagar</surname><given-names>B</given-names></name>, <name><surname>Bornmann</surname><given-names>W</given-names></name>, <name><surname>Pellicena</surname><given-names>P</given-names></name>, <name><surname>Schindler</surname><given-names>T</given-names></name>, <name><surname>Veach</surname><given-names>DR</given-names></name>, <name><surname>Miller</surname><given-names>WT</given-names></name>, <name><surname>Clarkson</surname><given-names>B</given-names></name>, <name><surname>Kuriyan</surname><given-names>J</given-names></name>, <year>2002</year><x>, </x><source>Crystal structure of the C-Abl Kinase domain in complex with STI-571</source><x>, </x><object-id pub-id-type="art-access-id">1IEP</object-id><x>; </x><comment>Publically available at the RCSB Protein Data Bank (<ext-link ext-link-type="uri" xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url 17" document-id-type="dataset" document-type="data" id="dataro17"><name><surname>Singh</surname><given-names>BK</given-names></name>, <name><surname>Sattler</surname><given-names>JM</given-names></name>, <name><surname>Chatterjee</surname><given-names>M</given-names></name>, <name><surname>Huttu</surname><given-names>J</given-names></name>, <name><surname>Schuler</surname><given-names>H</given-names></name>, <name><surname>Kursula</surname><given-names>I</given-names></name>, 
<year>2011</year><x>, </x><source>Crystal structure of <italic>Plasmodium berghei</italic> actin depolymerization Factor 2</source><x>, </x><object-id pub-id-type="art-access-id">2XFA</object-id><x>; </x><comment>Publically available at the RCSB Protein Data Bank (<ext-link ext-link-type="uri" xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url 18" document-id-type="dataset" document-type="data" id="dataro18"><name><surname>Wu</surname><given-names>B</given-names></name>, <name><surname>Chien</surname><given-names>EY</given-names></name>, <name><surname>Mol</surname><given-names>CD</given-names></name>, <name><surname>Fenalti</surname><given-names>G</given-names></name>, <name><surname>Liu</surname><given-names>W</given-names></name>, <name><surname>Katritch</surname><given-names>V</given-names></name>, <name><surname>Abagyan</surname><given-names>R</given-names></name>, <name><surname>Brooun</surname><given-names>A</given-names></name>, <name><surname>Wells</surname><given-names>P</given-names></name>, <name><surname>Bi</surname><given-names>FC</given-names></name>, <name><surname>Hamel</surname><given-names>DJ</given-names></name>, <name><surname>Kuhn</surname><given-names>P</given-names></name>, <name><surname>Handel</surname><given-names>TM</given-names></name>, <name><surname>Cherezov</surname><given-names>V</given-names></name>, <name><surname>Stevens</surname><given-names>RC</given-names></name>, <year>2010</year><x>, </x><source>The 2.5 A structure of the CXCR4 chemokine receptor in complex with small molecule antagonist IT1t</source><x>, </x><object-id pub-id-type="art-access-id">3ODU</object-id><x>; </x><comment>Publically available at the RCSB Protein Data Bank (<ext-link ext-link-type="uri" xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p><p><related-object 
content-type="generated-dataset" document-id=" Dataset ID and/or url 19" document-id-type="dataset" document-type="data" id="dataro19"><name><surname>Nagar</surname><given-names>B</given-names></name>, <name><surname>Bornmann</surname><given-names>W</given-names></name>, <name><surname>Pellicena</surname><given-names>P</given-names></name>, <name><surname>Schindler</surname><given-names>T</given-names></name>, <name><surname>Veach</surname><given-names>DR</given-names></name>, <name><surname>Miller</surname><given-names>WT</given-names></name>, <name><surname>Clarkson</surname><given-names>B</given-names></name>, <name><surname>Kuriyan</surname><given-names>J</given-names></name>, <year>2002</year><x>, </x><source>Crystal Structure of the c-Abl Kinase domain in complex with PD173955</source><x>, </x><object-id pub-id-type="art-access-id">1M52</object-id><x>; </x><comment>Publically available at the RCSB Protein Data Bank (<ext-link ext-link-type="uri" xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url 20" document-id-type="dataset" document-type="data" id="dataro20"><name><surname>He</surname><given-names>X</given-names></name>, <name><surname>Zhou</surname><given-names>J</given-names></name>, <name><surname>Bartlam</surname><given-names>M</given-names></name>, <name><surname>Zhang</surname><given-names>R</given-names></name>, <name><surname>Ma</surname><given-names>J</given-names></name>, <name><surname>Lou</surname><given-names>Z</given-names></name>, <name><surname>Li</surname><given-names>X</given-names></name>, <name><surname>Li</surname><given-names>J</given-names></name>, <name><surname>Joachimiak</surname><given-names>A</given-names></name>, <name><surname>Zeng</surname><given-names>Z</given-names></name>, <name><surname>Ge</surname><given-names>R</given-names></name>, 
<name><surname>Rao</surname><given-names>Z</given-names></name>, <name><surname>Liu</surname><given-names>Y</given-names></name>, <year>2008</year><x>, </x><source>A RNA polymerase subunit structure from virus</source><x>, </x><object-id pub-id-type="art-access-id">3CM8</object-id><x>; </x><comment>Publically available at the RCSB Protein Data Bank (<ext-link ext-link-type="uri" xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p><p><related-object content-type="generated-dataset" document-id=" Dataset ID and/or url 21" document-id-type="dataset" document-type="data" id="dataro21"><name><surname>Shimamura</surname><given-names>T</given-names></name>, <name><surname>Shiroishi</surname><given-names>M</given-names></name>, <name><surname>Weyand</surname><given-names>S</given-names></name>, <name><surname>Tsujimoto</surname><given-names>H</given-names></name>, <name><surname>Winter</surname><given-names>G</given-names></name>, <name><surname>Katritch</surname><given-names>V</given-names></name>, <name><surname>Abagyan</surname><given-names>R</given-names></name>, <name><surname>Cherezov</surname><given-names>V</given-names></name>, <name><surname>Liu</surname><given-names>W</given-names></name>, <name><surname>Han</surname><given-names>GW</given-names></name>, <name><surname>Kobayashi</surname><given-names>T</given-names></name>, <name><surname>Stevens</surname><given-names>RC</given-names></name>, <name><surname>Iwata</surname><given-names>S</given-names></name>, <year>2011</year><x>, </x><source>Structure of the human histamine H1 receptor in complex with doxepin</source><x>, </x><object-id pub-id-type="art-access-id">3RZE</object-id><x>; </x><comment>Publically available at the RCSB Protein Data Bank (<ext-link ext-link-type="uri" xlink:href="http://www.rcsb.org/pdb/">http://www.rcsb.org/pdb/</ext-link>)</comment></related-object></p></sec></sec><ref-list><title>References</title><ref id="bib1"><element-citation 
publication-type="journal"><person-group person-group-type="author"><name><surname>Adams</surname><given-names>PD</given-names></name><name><surname>Afonine</surname><given-names>PV</given-names></name><name><surname>Bunkóczi</surname><given-names>G</given-names></name><name><surname>Chen</surname><given-names>VB</given-names></name><name><surname>Davis</surname><given-names>IW</given-names></name><name><surname>Echols</surname><given-names>N</given-names></name><etal/></person-group><year>2010</year><article-title>PHENIX: a comprehensive Python-based system for macromolecular structure solution</article-title><source>Acta Crystallogr D Biol Crystallogr</source><volume>66</volume><fpage>213</fpage><lpage>21</lpage><pub-id pub-id-type="doi">10.1107/S0907444909052925</pub-id></element-citation></ref><ref id="bib2"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Adams</surname><given-names>PD</given-names></name><name><surname>Pannu</surname><given-names>NS</given-names></name><name><surname>Read</surname><given-names>RJ</given-names></name><name><surname>Brünger</surname><given-names>AT</given-names></name></person-group><year>1997</year><article-title>Cross-validated maximum likelihood enhances crystallographic simulated annealing refinement</article-title><source>Proc Natl Acad Sci USA</source><volume>94</volume><fpage>5018</fpage><lpage>23</lpage></element-citation></ref><ref id="bib3"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Afonine</surname><given-names>PV</given-names></name><name><surname>Grosse-Kunstleve</surname><given-names>RW</given-names></name><name><surname>Adams</surname><given-names>PD</given-names></name></person-group><year>2005</year><article-title>A robust bulk-solvent correction and anisotropic scaling procedure</article-title><source>Acta Crystallogr D Biol 
Crystallogr</source><volume>61</volume><fpage>850</fpage><lpage>5</lpage><pub-id pub-id-type="doi">10.1107/S0907444905007894</pub-id></element-citation></ref><ref id="bib4"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Afonine</surname><given-names>PV</given-names></name><name><surname>Grosse-Kunstleve</surname><given-names>RW</given-names></name><name><surname>Echols</surname><given-names>N</given-names></name><name><surname>Headd</surname><given-names>JJ</given-names></name><name><surname>Moriarty</surname><given-names>NW</given-names></name><name><surname>Mustyakimov</surname><given-names>M</given-names></name><etal/></person-group><year>2012</year><article-title>Towards automated crystallographic structure refinement with phenix.refine</article-title><source>Acta Crystallogr D Biol Crystallogr</source><volume>68</volume><fpage>352</fpage><lpage>67</lpage><pub-id pub-id-type="doi">10.1107/S0907444912001308</pub-id></element-citation></ref><ref id="bib5"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Berendsen</surname><given-names>HJC</given-names></name><name><surname>Postma</surname><given-names>JPM</given-names></name><name><surname>van Gunsteren</surname><given-names>WF</given-names></name><name><surname>DiNola</surname><given-names>A</given-names></name><name><surname>Haak</surname><given-names>JR</given-names></name></person-group><year>1984</year><article-title>Molecular dynamics with coupling to an external bath</article-title><source>The Journal of Chemical Physics</source><volume>81</volume><fpage>3684</fpage><pub-id pub-id-type="doi">10.1063/1.448118</pub-id></element-citation></ref><ref id="bib6"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Berman</surname><given-names>HM</given-names></name><name><surname>Westbrook</surname><given-names>J</given-names></name><name><surname>Feng</surname><given-names>Z</given-names></name><name><surname>Gilliland</surname><given-names>G</given-names></name><name><surname>Bhat</surname><given-names>TN</given-names></name><name><surname>Weissig</surname><given-names>H</given-names></name><etal/></person-group><year>2000</year><article-title>The protein data bank</article-title><source>Nucleic Acids Res</source><volume>28</volume><fpage>235</fpage><lpage>42</lpage><pub-id pub-id-type="doi">10.1093/nar/28.1.235</pub-id></element-citation></ref><ref id="bib7"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bhabha</surname><given-names>G</given-names></name><name><surname>Lee</surname><given-names>J</given-names></name><name><surname>Ekiert</surname><given-names>DC</given-names></name><name><surname>Gam</surname><given-names>J</given-names></name><name><surname>Wilson</surname><given-names>IA</given-names></name><name><surname>Dyson</surname><given-names>HJ</given-names></name><etal/></person-group><year>2011</year><article-title>A dynamic knockout reveals that conformational fluctuations influence the chemical step of enzyme catalysis</article-title><source>Science</source><volume>332</volume><fpage>234</fpage><lpage>8</lpage><pub-id pub-id-type="doi">10.1126/science.1198542</pub-id></element-citation></ref><ref id="bib8"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Brändén</surname><given-names>CI</given-names></name><name><surname>Jones</surname><given-names>TA</given-names></name></person-group><year>1990</year><article-title>Between objectivity and subjectivity</article-title><source>Nature</source><volume>343</volume><fpage>687</fpage><lpage>9</lpage><pub-id 
pub-id-type="doi">10.1038/343687a0</pub-id></element-citation></ref><ref id="bib9"><element-citation publication-type="other"><person-group person-group-type="author"><name><surname>Bricogne</surname><given-names>G</given-names></name><name><surname>Blanc</surname><given-names>E</given-names></name><name><surname>Brandi</surname><given-names>M</given-names></name><name><surname>Flensburg</surname><given-names>C</given-names></name><name><surname>Keller</surname><given-names>P</given-names></name><name><surname>Paciorek</surname><given-names>W</given-names></name><etal/></person-group><year>2009</year><comment>BUSTER, version 2.8.0</comment></element-citation></ref><ref id="bib10"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Brünger</surname><given-names>AT</given-names></name></person-group><year>1992</year><article-title>Free R value: a novel statistical quantity for assessing the accuracy of crystal structures</article-title><source>Nature</source><volume>355</volume><fpage>472</fpage><lpage>5</lpage><pub-id pub-id-type="doi">10.1038/355472a0</pub-id></element-citation></ref><ref id="bib11"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Burling</surname><given-names>FT</given-names></name><name><surname>Brunger</surname><given-names>AT</given-names></name></person-group><year>1994</year><article-title>Thermal motion and conformational disorder in protein crystal structures: comparison of multi-conformer and time-averaging models</article-title><source>Israel Journal of Chemistry</source><volume>34</volume><fpage>165</fpage><lpage>75</lpage></element-citation></ref><ref id="bib12"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Burling</surname><given-names>FT</given-names></name><name><surname>Weis</surname><given-names>WI</given-names></name><name><surname>Flaherty</surname><given-names>KM</given-names></name><name><surname>Brünger</surname><given-names>AT</given-names></name></person-group><year>1996</year><article-title>Direct observation of protein solvation and discrete disorder with experimental crystallographic phases</article-title><source>Science</source><volume>271</volume><fpage>72</fpage><lpage>7</lpage><pub-id pub-id-type="doi">10.1126/science.271.5245.72</pub-id></element-citation></ref><ref id="bib13"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Burnley</surname><given-names>BT</given-names></name><name><surname>Afonine</surname><given-names>PV</given-names></name><name><surname>Adams</surname><given-names>PD</given-names></name><name><surname>Gros</surname><given-names>P</given-names></name></person-group><year>2012</year><article-title>Data from: modelling dynamics in protein crystal structures by ensemble refinement</article-title><source>Dryad Digital Repository</source><pub-id pub-id-type="doi">10.5061/dryad.5n01h</pub-id></element-citation></ref><ref id="bib14"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Clarage</surname><given-names>JB</given-names></name><name><surname>Phillips</surname><given-names>GN</given-names><suffix>Jr</suffix></name></person-group><year>1994</year><article-title>Cross-validation tests of time-averaged molecular dynamics refinements for determination of protein structures by X-ray crystallography</article-title><source>Acta Crystallogr D Biol Crystallogr</source><volume>50</volume><fpage>24</fpage><lpage>36</lpage><pub-id pub-id-type="doi">10.1107/S0907444993009515</pub-id></element-citation></ref><ref id="bib15"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>DePristo</surname><given-names>MA</given-names></name><name><surname>de Bakker</surname><given-names>PI</given-names></name><name><surname>Blundell</surname><given-names>TL</given-names></name></person-group><year>2004</year><article-title>Heterogeneity and inaccuracy in protein structures solved by X-ray crystallography</article-title><source>Structure</source><volume>12</volume><fpage>831</fpage><lpage>8</lpage><pub-id pub-id-type="doi">10.1016/j.str.2004.02.031</pub-id></element-citation></ref><ref id="bib16"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Dunlop</surname><given-names>KV</given-names></name><name><surname>Irvin</surname><given-names>RT</given-names></name><name><surname>Hazes</surname><given-names>B</given-names></name></person-group><year>2005</year><article-title>Pros and cons of cryocrystallography: should we also collect a room-temperature data set?</article-title><source>Acta Crystallogr D Biol Crystallogr</source><volume>61</volume><fpage>80</fpage><lpage>7</lpage><pub-id pub-id-type="doi">10.1107/S0907444904027179</pub-id></element-citation></ref><ref id="bib17"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Eisenmesser</surname><given-names>EZ</given-names></name><name><surname>Millet</surname><given-names>O</given-names></name><name><surname>Labeikovsky</surname><given-names>W</given-names></name><name><surname>Korzhnev</surname><given-names>DM</given-names></name><name><surname>Wolf-Watz</surname><given-names>M</given-names></name><name><surname>Bosco</surname><given-names>DA</given-names></name><etal/></person-group><year>2005</year><article-title>Intrinsic dynamics of an enzyme underlies catalysis</article-title><source>Nature</source><volume>438</volume><fpage>117</fpage><lpage>21</lpage><pub-id pub-id-type="doi">10.1038/nature04105</pub-id></element-citation></ref><ref 
id="bib18"><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Feynman</surname><given-names>RP</given-names></name><name><surname>Leighton</surname><given-names>RB</given-names></name><name><surname>Sands</surname><given-names>ML</given-names></name></person-group><source>The Feynman lectures on physics, Vol 1: Mainly mechanics, radiation, and heat</source><publisher-loc>Reading</publisher-loc><publisher-name>Addison-Wesley</publisher-name><year>1963</year></element-citation></ref><ref id="bib19"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Fraser</surname><given-names>JS</given-names></name><name><surname>Clarkson</surname><given-names>MW</given-names></name><name><surname>Degnan</surname><given-names>SC</given-names></name><name><surname>Erion</surname><given-names>R</given-names></name><name><surname>Kern</surname><given-names>D</given-names></name><name><surname>Alber</surname><given-names>T</given-names></name></person-group><year>2009</year><article-title>Hidden alternative structures of proline isomerase essential for catalysis</article-title><source>Nature</source><volume>462</volume><fpage>669</fpage><lpage>73</lpage><pub-id pub-id-type="doi">10.1038/nature08615</pub-id></element-citation></ref><ref id="bib20"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Furnham</surname><given-names>N</given-names></name><name><surname>Blundell</surname><given-names>TL</given-names></name><name><surname>DePristo</surname><given-names>MA</given-names></name><name><surname>Terwilliger</surname><given-names>TC</given-names></name></person-group><year>2006</year><article-title>Is one solution good enough?</article-title><source>Nat Struct Mol Biol</source><volume>13</volume><fpage>184</fpage><lpage>5</lpage><pub-id pub-id-type="doi">10.1038/nsmb0306-184</pub-id></element-citation></ref><ref id="bib21"><element-citation 
publication-type="journal"><person-group person-group-type="author"><name><surname>Gajhede</surname><given-names>M</given-names></name><name><surname>Osmark</surname><given-names>P</given-names></name><name><surname>Poulsen</surname><given-names>FM</given-names></name><name><surname>Ipsen</surname><given-names>H</given-names></name><name><surname>Larsen</surname><given-names>JN</given-names></name><name><surname>Joost van Neerven</surname><given-names>RJ</given-names></name><etal/></person-group><year>1996</year><article-title>X-ray and NMR structure of Bet v 1, the origin of birch pollen allergy</article-title><source>Nat Struct Mol Biol</source><volume>3</volume><fpage>1040</fpage><lpage>5</lpage><pub-id pub-id-type="doi">10.1038/nsb1296-1040</pub-id></element-citation></ref><ref id="bib22"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gros</surname><given-names>P</given-names></name><name><surname>van Gunsteren</surname><given-names>WF</given-names></name><name><surname>Hol</surname><given-names>WG</given-names></name></person-group><year>1990</year><article-title>Inclusion of thermal motion in crystallographic structures by restrained molecular dynamics</article-title><source>Science</source><volume>249</volume><fpage>1149</fpage><lpage>52</lpage></element-citation></ref><ref id="bib23"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Grosse-Kunstleve</surname><given-names>RW</given-names></name><name><surname>Afonine</surname><given-names>PV</given-names></name><name><surname>Adams</surname><given-names>PD</given-names></name></person-group><year>2004</year><article-title>Cctbx news: geometry restraints and other new features</article-title><source>Newsletter of the IUCr Commission on Crystallographic Computing</source><volume>4</volume><fpage>19</fpage><lpage>36</lpage></element-citation></ref><ref id="bib24"><element-citation 
publication-type="journal"><person-group person-group-type="author"><name><surname>Hazes</surname><given-names>B</given-names></name><name><surname>Sastry</surname><given-names>PA</given-names></name><name><surname>Hayakawa</surname><given-names>K</given-names></name><name><surname>Read</surname><given-names>RJ</given-names></name><name><surname>Irvin</surname><given-names>RT</given-names></name></person-group><year>2000</year><article-title>Crystal structure of <italic>Pseudomonas aeruginosa</italic> PAK pilin suggests a main-chain-dominated mode of receptor binding</article-title><source>J Mol Biol</source><volume>299</volume><fpage>1005</fpage><lpage>17</lpage><pub-id pub-id-type="doi">10.1006/jmbi.2000.3801</pub-id></element-citation></ref><ref id="bib25"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>He</surname><given-names>X</given-names></name><name><surname>Zhou</surname><given-names>J</given-names></name><name><surname>Bartlam</surname><given-names>M</given-names></name><name><surname>Zhang</surname><given-names>R</given-names></name><name><surname>Ma</surname><given-names>J</given-names></name><name><surname>Lou</surname><given-names>Z</given-names></name><etal/></person-group><year>2008</year><article-title>Crystal structure of the polymerase PAC–PB1N complex from an avian influenza H5N1 virus</article-title><source>Nature</source><volume>454</volume><fpage>1123</fpage><lpage>6</lpage><pub-id pub-id-type="doi">10.1038/nature07120</pub-id></element-citation></ref><ref id="bib26"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Heaslet</surname><given-names>H</given-names></name><name><surname>Rosenfeld</surname><given-names>R</given-names></name><name><surname>Giffin</surname><given-names>M</given-names></name><name><surname>Lin</surname><given-names>YC</given-names></name><name><surname>Tam</surname><given-names>K</given-names></name><name><surname>Torbett</surname><given-names>BE</given-names></name><etal/></person-group><year>2007</year><article-title>Conformational flexibility in the flap domains of ligand-free HIV protease</article-title><source>Acta Crystallogr D Biol Crystallogr</source><volume>63</volume><fpage>866</fpage><lpage>75</lpage><pub-id pub-id-type="doi">10.1107/S0907444907029125</pub-id></element-citation></ref><ref id="bib27"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Jiang</surname><given-names>JS</given-names></name><name><surname>Brünger</surname><given-names>AT</given-names></name></person-group><year>1994</year><article-title>Protein hydration observed by X-ray diffraction. 
Solvation properties of penicillopepsin and neuraminidase crystal structures</article-title><source>J Mol Biol</source><volume>243</volume><fpage>100</fpage><lpage>15</lpage><pub-id pub-id-type="doi">10.1006/jmbi.1994.1633</pub-id></element-citation></ref><ref id="bib28"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Joosten</surname><given-names>RP</given-names></name><name><surname>te Beek</surname><given-names>TA</given-names></name><name><surname>Krieger</surname><given-names>E</given-names></name><name><surname>Hekkelman</surname><given-names>ML</given-names></name><name><surname>Hooft</surname><given-names>RW</given-names></name><name><surname>Schneider</surname><given-names>R</given-names></name><etal/></person-group><year>2010</year><article-title>A series of PDB related databases for everyday needs</article-title><source>Nucleic Acids Res</source><volume>39</volume><fpage>D411</fpage><lpage>9</lpage><pub-id pub-id-type="doi">10.1093/nar/gkq1105</pub-id></element-citation></ref><ref id="bib29"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kimber</surname><given-names>MS</given-names></name><name><surname>Nachman</surname><given-names>J</given-names></name><name><surname>Cunningham</surname><given-names>AM</given-names></name><name><surname>Gish</surname><given-names>GD</given-names></name><name><surname>Pawson</surname><given-names>T</given-names></name><name><surname>Pai</surname><given-names>EF</given-names></name></person-group><year>2000</year><article-title>Structural basis for specificity switching of the Src SH2 domain</article-title><source>Mol Cell</source><volume>5</volume><fpage>1043</fpage><lpage>9</lpage></element-citation></ref><ref id="bib30"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Klein</surname><given-names>DE</given-names></name><name><surname>Stayrook</surname><given-names>SE</given-names></name><name><surname>Shi</surname><given-names>F</given-names></name><name><surname>Narayan</surname><given-names>K</given-names></name><name><surname>Lemmon</surname><given-names>MA</given-names></name></person-group><year>2008</year><article-title>Structural basis for EGFR ligand sequestration by Argos</article-title><source>Nature</source><volume>453</volume><fpage>1271</fpage><lpage>5</lpage><pub-id pub-id-type="doi">10.1038/nature06978</pub-id></element-citation></ref><ref id="bib31"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Korostelev</surname><given-names>A</given-names></name><name><surname>Laurberg</surname><given-names>M</given-names></name><name><surname>Noller</surname><given-names>HF</given-names></name></person-group><year>2009</year><article-title>Multistart simulated annealing refinement of the crystal structure of the 70S ribosome</article-title><source>Proc Natl Acad Sci USA</source><volume>106</volume><fpage>18195</fpage><lpage>200</lpage><pub-id pub-id-type="doi">10.1073/pnas.0909287106</pub-id></element-citation></ref><ref id="bib32"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lang</surname><given-names>PT</given-names></name><name><surname>Ng</surname><given-names>HL</given-names></name><name><surname>Fraser</surname><given-names>JS</given-names></name><name><surname>Corn</surname><given-names>JE</given-names></name><name><surname>Echols</surname><given-names>N</given-names></name><name><surname>Sales</surname><given-names>M</given-names></name><etal/></person-group><year>2010</year><article-title>Automated electron-density sampling reveals widespread conformational polymorphism in proteins</article-title><source>Protein 
Sci</source><volume>19</volume><fpage>1420</fpage><lpage>31</lpage></element-citation></ref><ref id="bib33"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lee</surname><given-names>AL</given-names></name><name><surname>Kinnear</surname><given-names>SA</given-names></name><name><surname>Wand</surname><given-names>AJ</given-names></name></person-group><year>2000</year><article-title>Redistribution and loss of side chain entropy upon formation of a calmodulin-peptide complex</article-title><source>Nat Struct Biol</source><volume>7</volume><fpage>72</fpage><lpage>7</lpage><pub-id pub-id-type="doi">10.1038/71280</pub-id></element-citation></ref><ref id="bib34"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Levin</surname><given-names>EJ</given-names></name><name><surname>Kondrashov</surname><given-names>DA</given-names></name><name><surname>Wesenberg</surname><given-names>GE</given-names></name><name><surname>Phillips</surname><given-names>GN</given-names><suffix>Jr</suffix></name></person-group><year>2007</year><article-title>Ensemble refinement of protein crystal structures: validation and application</article-title><source>Structure</source><volume>15</volume><fpage>1040</fpage><lpage>52</lpage><pub-id pub-id-type="doi">10.1016/j.str.2007.06.019</pub-id></element-citation></ref><ref id="bib35"><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Linderstrøm-Lang</surname><given-names>K</given-names></name><name><surname>Schellman</surname><given-names>J</given-names></name></person-group><article-title>Protein structure and enzyme activity</article-title><person-group person-group-type="editor"><name><surname>Boyer</surname><given-names>PD</given-names></name><name><surname>Lardy</surname><given-names>H</given-names></name><name><surname>Myrback</surname><given-names>K</given-names></name></person-group><source>The 
Enzymes</source><volume>Vol 1</volume><publisher-loc>New York</publisher-loc><publisher-name>Academic Press</publisher-name><year>1959</year></element-citation></ref><ref id="bib36"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Marković-Housley</surname><given-names>Z</given-names></name><name><surname>Degano</surname><given-names>M</given-names></name><name><surname>Lamba</surname><given-names>D</given-names></name><name><surname>von Roepenack-Lahaye</surname><given-names>E</given-names></name><name><surname>Clemens</surname><given-names>S</given-names></name><name><surname>Susani</surname><given-names>M</given-names></name><etal/></person-group><year>2003</year><article-title>Crystal structure of a hypoallergenic isoform of the major birch pollen allergen Bet v 1 and its likely biological function as a plant steroid carrier</article-title><source>J Mol Biol</source><volume>325</volume><fpage>123</fpage><lpage>33</lpage></element-citation></ref><ref id="bib37"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Murshudov</surname><given-names>GN</given-names></name><name><surname>Vagin</surname><given-names>AA</given-names></name><name><surname>Dodson</surname><given-names>EJ</given-names></name></person-group><year>1997</year><article-title>Refinement of macromolecular structures by the maximum-likelihood method</article-title><source>Acta Crystallogr D Biol Crystallogr</source><volume>53</volume><fpage>240</fpage><lpage>55</lpage><pub-id pub-id-type="doi">10.1107/S0907444996012255</pub-id></element-citation></ref><ref id="bib38"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Nagar</surname><given-names>B</given-names></name><name><surname>Bornmann</surname><given-names>WG</given-names></name><name><surname>Pellicena</surname><given-names>P</given-names></name><name><surname>Schindler</surname><given-names>T</given-names></name><name><surname>Veach</surname><given-names>DR</given-names></name><name><surname>Miller</surname><given-names>WT</given-names></name><etal/></person-group><year>2002</year><article-title>Crystal structures of the kinase domain of c-Abl in complex with the small molecule inhibitors PD173955 and imatinib (STI-571)</article-title><source>Cancer Res</source><volume>62</volume><fpage>4236</fpage><lpage>43</lpage></element-citation></ref><ref id="bib39"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Olsen</surname><given-names>JG</given-names></name><name><surname>Flensburg</surname><given-names>C</given-names></name><name><surname>Olsen</surname><given-names>O</given-names></name><name><surname>Bricogne</surname><given-names>G</given-names></name><name><surname>Henriksen</surname><given-names>A</given-names></name></person-group><year>2004</year><article-title>Solving the structure of the bubble protein using the anomalous sulfur signal from single-crystal in-house Cu Kα diffraction data only</article-title><source>Acta Crystallogr D Biol Crystallogr</source><volume>60</volume><fpage>250</fpage><lpage>5</lpage><pub-id pub-id-type="doi">10.1107/S0907444903025927</pub-id></element-citation></ref><ref id="bib40"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Pannu</surname><given-names>NS</given-names></name><name><surname>Read</surname><given-names>RJ</given-names></name></person-group><year>1996</year><article-title>Improved structure refinement through maximum likelihood</article-title><source>Acta Crystallogr Sect A Found 
Crystallogr</source><volume>52</volume><fpage>659</fpage><lpage>68</lpage><pub-id pub-id-type="doi">10.1107/S0108767396004370</pub-id></element-citation></ref><ref id="bib41"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Reiling</surname><given-names>KK</given-names></name><name><surname>Endres</surname><given-names>NF</given-names></name><name><surname>Dauber</surname><given-names>DS</given-names></name><name><surname>Craik</surname><given-names>CS</given-names></name><name><surname>Stroud</surname><given-names>RM</given-names></name></person-group><year>2002</year><article-title>Anisotropic dynamics of the JE-2147-HIV protease complex: drug resistance and thermodynamic binding mode examined in a 1.09 Å structure</article-title><source>Biochemistry</source><volume>41</volume><fpage>4582</fpage><lpage>94</lpage></element-citation></ref><ref id="bib42"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rodríguez</surname><given-names>DD</given-names></name><name><surname>Grosse</surname><given-names>C</given-names></name><name><surname>Himmel</surname><given-names>S</given-names></name><name><surname>Gonzalez</surname><given-names>C</given-names></name><name><surname>de Ilarduya</surname><given-names>IM</given-names></name><name><surname>Becker</surname><given-names>S</given-names></name><etal/></person-group><year>2009</year><article-title>Crystallographic ab initio protein structure solution below atomic resolution</article-title><source>Nat Methods</source><volume>6</volume><fpage>651</fpage><lpage>3</lpage><pub-id pub-id-type="doi">10.1038/nmeth.1365</pub-id></element-citation></ref><ref id="bib43"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Schiffer</surname><given-names>CA</given-names></name><name><surname>Gros</surname><given-names>P</given-names></name><name><surname>van 
Gunsteren</surname><given-names>WF</given-names></name></person-group><year>1995</year><article-title>Time-averaging crystallographic refinement: possibilities and limitations using alpha-cyclodextrin as a test system</article-title><source>Acta Crystallogr D Biol Crystallogr</source><volume>51</volume><fpage>85</fpage><lpage>92</lpage><pub-id pub-id-type="doi">10.1107/S0907444994007158</pub-id></element-citation></ref><ref id="bib44"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Schindler</surname><given-names>T</given-names></name><name><surname>Bornmann</surname><given-names>W</given-names></name><name><surname>Pellicena</surname><given-names>P</given-names></name><name><surname>Miller</surname><given-names>WT</given-names></name><name><surname>Clarkson</surname><given-names>B</given-names></name><name><surname>Kuriyan</surname><given-names>J</given-names></name></person-group><year>2000</year><article-title>Structural mechanism for STI-571 inhibition of abelson tyrosine kinase</article-title><source>Science</source><volume>289</volume><fpage>1938</fpage><lpage>42</lpage></element-citation></ref><ref id="bib45"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Schomaker</surname><given-names>V</given-names></name><name><surname>Trueblood</surname><given-names>KN</given-names></name></person-group><year>1968</year><article-title>On the rigid-body motion of molecules in crystals</article-title><source>Acta Crystallogr B</source><volume>24</volume><fpage>63</fpage><lpage>76</lpage><pub-id pub-id-type="doi">10.1107/S0567740868001718</pub-id></element-citation></ref><ref id="bib46"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sheriff</surname><given-names>S</given-names></name><name><surname>Hendrickson</surname><given-names>WA</given-names></name></person-group><year>1987</year><article-title>Description of 
overall anisotropy in diffraction from macromolecular crystals</article-title><source>Acta Crystallogr A</source><volume>43</volume><fpage>118</fpage><lpage>21</lpage><pub-id pub-id-type="doi">10.1107/S010876738709977X</pub-id></element-citation></ref><ref id="bib47"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Shimamura</surname><given-names>T</given-names></name><name><surname>Shiroishi</surname><given-names>M</given-names></name><name><surname>Weyand</surname><given-names>S</given-names></name><name><surname>Tsujimoto</surname><given-names>H</given-names></name><name><surname>Winter</surname><given-names>G</given-names></name><name><surname>Katritch</surname><given-names>V</given-names></name><etal/></person-group><year>2011</year><article-title>Structure of the human histamine H1 receptor complex with doxepin</article-title><source>Nature</source><volume>475</volume><fpage>65</fpage><lpage>70</lpage><pub-id pub-id-type="doi">10.1038/nature10236</pub-id></element-citation></ref><ref id="bib48"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Singh</surname><given-names>BK</given-names></name><name><surname>Sattler</surname><given-names>JM</given-names></name><name><surname>Chatterjee</surname><given-names>M</given-names></name><name><surname>Huttu</surname><given-names>J</given-names></name><name><surname>Schuler</surname><given-names>H</given-names></name><name><surname>Kursula</surname><given-names>I</given-names></name></person-group><year>2011</year><article-title>Crystal structures explain functional differences in the two actin depolymerization factors of the malaria parasite</article-title><source>J Biol Chem</source><volume>286</volume><fpage>28256</fpage><lpage>64</lpage><pub-id pub-id-type="doi">10.1074/jbc.M111.211730</pub-id></element-citation></ref><ref id="bib49"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Terwilliger</surname><given-names>TC</given-names></name><name><surname>Grosse-Kunstleve</surname><given-names>RW</given-names></name><name><surname>Afonine</surname><given-names>PV</given-names></name><name><surname>Adams</surname><given-names>PD</given-names></name><name><surname>Moriarty</surname><given-names>NW</given-names></name><name><surname>Zwart</surname><given-names>P</given-names></name><etal/></person-group><year>2007</year><article-title>Interpretation of ensembles created by multiple iterative rebuilding of macromolecular models</article-title><source>Acta Crystallogr D Biol Crystallogr</source><volume>63</volume><fpage>597</fpage><lpage>610</lpage><pub-id pub-id-type="doi">10.1107/S0907444907009791</pub-id></element-citation></ref><ref id="bib50"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Torchia</surname><given-names>DA</given-names></name><name><surname>Ishima</surname><given-names>R</given-names></name></person-group><year>2003</year><article-title>Molecular structure and dynamics of proteins in solution: insights derived from high-resolution NMR approaches</article-title><source>Pure and Applied Chemistry</source><volume>75</volume><fpage>1371</fpage><lpage>81</lpage><pub-id pub-id-type="doi">10.1351/pac200375101371</pub-id></element-citation></ref><ref id="bib51"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Torda</surname><given-names>A</given-names></name><name><surname>Scheek</surname><given-names>R</given-names></name><name><surname>van Gunsteren</surname><given-names>WF</given-names></name></person-group><year>1989</year><article-title>Time-dependent distance restraints in molecular dynamics simulations</article-title><source>Chemical Physics Letters</source><volume>157</volume><fpage>289</fpage><lpage>94</lpage><pub-id 
pub-id-type="doi">10.1016/0009-2614(89)87249-5</pub-id></element-citation></ref><ref id="bib52"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Vagin</surname><given-names>AA</given-names></name><name><surname>Steiner</surname><given-names>RA</given-names></name><name><surname>Lebedev</surname><given-names>AA</given-names></name><name><surname>Potterton</surname><given-names>L</given-names></name><name><surname>McNicholas</surname><given-names>S</given-names></name><name><surname>Long</surname><given-names>F</given-names></name><etal/></person-group><year>2004</year><article-title>REFMAC5 dictionary: organization of prior chemical knowledge and guidelines for its use</article-title><source>Acta Crystallogr D Biol Crystallogr</source><volume>60</volume><fpage>2184</fpage><lpage>95</lpage><pub-id pub-id-type="doi">10.1107/S0907444904023510</pub-id></element-citation></ref><ref id="bib53"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Vajpai</surname><given-names>N</given-names></name><name><surname>Strauss</surname><given-names>A</given-names></name><name><surname>Fendrich</surname><given-names>G</given-names></name><name><surname>Cowan-Jacob</surname><given-names>SW</given-names></name><name><surname>Manley</surname><given-names>PW</given-names></name><name><surname>Grzesiek</surname><given-names>S</given-names></name><etal/></person-group><year>2008</year><article-title>Solution conformations and dynamics of ABL kinase-inhibitor complexes determined by NMR substantiate the different binding modes of imatinib/nilotinib and dasatinib</article-title><source>J Biol Chem</source><volume>283</volume><fpage>18292</fpage><lpage>302</lpage><pub-id pub-id-type="doi">10.1074/jbc.M801337200</pub-id></element-citation></ref><ref id="bib54"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>van den 
Bedem</surname><given-names>H</given-names></name><name><surname>Dhanik</surname><given-names>A</given-names></name><name><surname>Latombe</surname><given-names>JC</given-names></name><name><surname>Deacon</surname><given-names>AM</given-names></name></person-group><year>2009</year><article-title>Modeling discrete heterogeneity in X-ray diffraction data by fitting multi-conformers</article-title><source>Acta Crystallogr D Biol Crystallogr</source><volume>65</volume><fpage>1107</fpage><lpage>17</lpage><pub-id pub-id-type="doi">10.1107/S0907444909030613</pub-id></element-citation></ref><ref id="bib55"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Velazquez-Campoy</surname><given-names>A</given-names></name><name><surname>Kiso</surname><given-names>Y</given-names></name><name><surname>Freire</surname><given-names>E</given-names></name></person-group><year>2001</year><article-title>The binding energetics of first- and second-generation HIV-1 protease inhibitors: implications for drug design</article-title><source>Arch Biochem Biophys</source><volume>390</volume><fpage>169</fpage><lpage>75</lpage><pub-id pub-id-type="doi">10.1006/abbi.2001.2333</pub-id></element-citation></ref><ref id="bib56"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Vitkup</surname><given-names>D</given-names></name><name><surname>Ringe</surname><given-names>D</given-names></name><name><surname>Karplus</surname><given-names>M</given-names></name><name><surname>Petsko</surname><given-names>GA</given-names></name></person-group><year>2002</year><article-title>Why protein R-factors are so large: a self-consistent analysis</article-title><source>Proteins</source><volume>46</volume><fpage>345</fpage><lpage>54</lpage></element-citation></ref><ref id="bib57"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Wang</surname><given-names>H</given-names></name><name><surname>Yan</surname><given-names>Z</given-names></name><name><surname>Geng</surname><given-names>J</given-names></name><name><surname>Kunz</surname><given-names>S</given-names></name><name><surname>Seebeck</surname><given-names>T</given-names></name><name><surname>Ke</surname><given-names>H</given-names></name></person-group><year>2007</year><article-title>Crystal structure of the Leishmania major phosphodiesterase LmjPDEB1 and insight into the design of the parasite-selective inhibitors</article-title><source>Mol Microbiol</source><volume>66</volume><fpage>1029</fpage><lpage>38</lpage><pub-id pub-id-type="doi">10.1111/j.1365-2958.2007.05976.x</pub-id></element-citation></ref><ref id="bib58"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Winn</surname><given-names>MD</given-names></name><name><surname>Isupov</surname><given-names>MN</given-names></name><name><surname>Murshudov</surname><given-names>GN</given-names></name></person-group><year>2001</year><article-title>Use of TLS parameters to model anisotropic displacements in macromolecular refinement</article-title><source>Acta Crystallogr D Biol Crystallogr</source><volume>57</volume><fpage>122</fpage><lpage>33</lpage></element-citation></ref><ref id="bib59"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname><given-names>B</given-names></name><name><surname>Chien</surname><given-names>EYT</given-names></name><name><surname>Mol</surname><given-names>CD</given-names></name><name><surname>Fenalti</surname><given-names>G</given-names></name><name><surname>Liu</surname><given-names>W</given-names></name><name><surname>Katritch</surname><given-names>V</given-names></name><etal/></person-group><year>2010</year><article-title>Structures of the CXCR4 chemokine GPCR with small-molecule and cyclic peptide 
antagonists</article-title><source>Science</source><volume>330</volume><fpage>1066</fpage><lpage>71</lpage><pub-id pub-id-type="doi">10.1126/science.1194396</pub-id></element-citation></ref></ref-list></back><sub-article article-type="article-commentary" id="SA1"><front-stub><article-id pub-id-type="doi">10.7554/eLife.00311.028</article-id><title-group><article-title>Decision letter</article-title></title-group><contrib-group content-type="section"><contrib contrib-type="editor"><name><surname>Brunger</surname><given-names>Axel T</given-names></name><role>Reviewing editor</role><aff><institution>Howard Hughes Medical Institute, Stanford University</institution>, <country>United States</country></aff></contrib></contrib-group></front-stub><body><boxed-text><p>eLife posts the editorial decision letter and author response on a selection of the published articles (subject to the approval of the authors). An edited version of the letter sent to the authors after peer review is shown, indicating the substantive concerns or comments; minor concerns are not usually shown. Reviewers have the opportunity to discuss the decision before the letter is sent (see <ext-link ext-link-type="uri" xlink:href="http://www.elifesciences.org/the-journal/review-process">review process</ext-link>). Similarly, the author response typically shows only responses to the major concerns raised by the reviewers.</p></boxed-text><p>Thank you for choosing to send your work entitled “Modelling dynamics in protein crystal structures by ensemble refinement” for consideration at <italic>eLife</italic>. Your article has been evaluated by a Senior Editor and 3 reviewers, one of whom is a member of our Board of Reviewing Editors. 
The following individuals responsible for the peer review of your submission want to reveal their identity: Phil Evans.</p><p>We are pleased to report that the review of the paper has been favorable, and we invite you to revise the paper by taking into consideration the points made below.</p><p>General assessment: This paper describes a greatly improved version of time-averaging ensemble refinement that was originally developed by Gros et al., 1990. Subsequent work suggested that over-fitting may occur, and it was recognized that the choice of B-factor model is very important for the success of time-average multi-conformer refinement. The current work now dramatically improves the situation by using a B-factor model that is derived from the TLS description of the single conformer isotropic B-factors. The initial implementation required the dynamics simulation to model not only the local atomic fluctuations but also the effect of large-scale lattice vibrations and crystal packing defects. Taking out the large-scale effects (by modeling them with a TLS parameterization) has been crucial, and the authors show convincingly that there is no longer a problem with over-fitting. More importantly, they show that the dynamic behavior implied by the simulation correlates with other physical evidence and provides biologically relevant insights into protein function.</p><p>The improved time-averaging method was then tested on a set of crystal structures. Significant improvements in Rfree are reported. Specific examples illustrate the power of the method to identify sidechain motion and solvent features. Because the improved time-averaging refinement method is more robust than the earlier implementations, it could become a routine tool for the final refinement of macromolecular crystal structures. As such, it will then raise awareness that crystal structures are not static, but rather represent temporal and spatial averages. 
In fact, the examples contain some surprises: internal regions that show more flexibility than would be expected, which is likely to have functional consequences. Thus exploring dynamics by this method may indeed produce important biological insights, at least in some cases.</p><p>Highlights:</p><p>* separation of global domain disorder from local fluctuation</p><p>* validation by comparison for a case with highly accurate MAD phases</p><p>* agreement with NMR relaxation studies for two cases</p><p>* discovery of flexible interior residues for some proteins</p><p>Points to consider in the revisions:</p><p>1. The final description of the structure as an ensemble does involve many more (non-independent) parameters than a single description, and in particular needs, in essence, more parameters to describe poorly ordered regions, which are the least well determined from the data. This has always been the worry about ensemble methods. The approach described here is conservative and controlled, but nevertheless is open to this criticism. It would be nice to have an explicit discussion on this point.</p><p>2. In the opinion of the authors, is this a method and structure description that should be generally used? In relation to point 1, what are the dangers if the method is used by people who are less careful than the present authors?</p><p>3. It is stated that restricting “the number of structures modeled … thereby prevents over-fitting of the data”. It should be mentioned here that the 1990 work required a long relaxation time (up to 40 ps) to model all the motion including that now modeled by TLS, but that the new implementation uses much smaller relaxation times (usually less than 1 ps), and thus far fewer structures contribute to the running average.</p><p>4. A number of test cases are examined for the physical and biological relevance of the dynamics. 
It is probably not coincidental that the most interesting cases are biased towards the ones with higher resolution data. Although interesting features are seen with resolutions as low as 2.6 Angstrom, it might be worth pointing out that, because of the correlation between resolution and the number of structures that can be allowed in the ensemble average (Figure 1D), there is more potential for observing interesting features at higher resolution.</p><p>5. The discussion implies that statistics can be drawn from averaging results from several blocks each containing about 250 structures. That makes sense, since Figure 2C presents averages from more than 1,000 structures. However, a later section implies that only one acquisition period is selected for analysis. The apparent contradiction should be clarified. On this point, is there objective evidence that choosing a time block with the lowest R(work) really gives the best model? As Ian Tickle has shown, R-factors are subject to statistical error. It should not be difficult to analyze the data from the mannose-binding protein (1YTT) to see whether time blocks with low R(work) actually show the highest correlation to maps computed with experimental phases.</p><p>6. Figures 1C and 1D show a very clear trend, which the authors do not seem to comment on. The number of observations per parameter scales as 1/d(min)^3, and it seems reasonable to expect that the optimal number of structures in the ensemble average would scale in the same way because the number of parameters can be increased in line with the number of observations. Indeed, the data in Figure 1D would not fit badly to a curve described by something like 600/d(min)^3. In order for a proportional number of structures to be consulted during the ensemble averaging process, the relaxation time should also scale proportional to 1/d(min)^3, which is roughly what is seen in Figure 1C. 
So it looks like a very good guess of an optimal relaxation time and optimal number of structures to average could be made from just knowing the resolution.</p><p>7. Cross-validation (R_free) was used to determine the optimum relaxation time tau_x (which determines the size of the ensemble), as well as p_TLS, T_bath, and w_xray. The optimum values of tau_x are shown in Figure 1C. However, the optimum values of the other adjustable parameters should be shown as well for all cases to get a sense for the variation of these parameters for different systems.</p><p>8. Some representation (e.g., 2D contour plots) of the variation of R_free with adjustable parameters should be provided for one case (e.g., for the case with the largest improvement, 1UOY).</p><p>9. Figure 2 (1UOY): interesting case, but why is the improvement (∼5% in Rfree) so good compared to some of the other cases? Can this be explained?</p><p>10. For 1YTT, the improvement in the map correlation coefficient from 0.895 (single conformer) to 0.903 (ensemble) and the corresponding improvement in R_free (0.014) seem small, but the structures and maps (Figure 3C, D) show significant local improvements. This should be made clearer in the text, i.e., the seemingly small improvements in overall quality indicators may allow for significant local improvements. Perhaps a real space correlation plot could provide a further illustration of the improvement upon multi-conformer refinement.</p><p>11. “Similar to Brunger, the R_free and w_xray correlated with stereochemical quality… ”. This statement is unclear. R_free is not always correlated with w_xray. The weaker w_xray, the tighter the geometry one would expect, but with increasing R_free. Perhaps the correlation between R_free and model quality could be shown for the ensemble refinements performed?</p><p>12. Figures 7D and 8, the proline isomerase cyclophilin A, CypA. 
Please provide a plot with the ensemble variability (rmsd), NMR relaxation parameters (Figure 2 in Eisenmesser et al., Science, 2002), and the ADP as a function of residue. Such a plot would be an independent validation of the method by comparison to solution data, and it would also illustrate that the method goes beyond single conformer individual ADP refinement.</p><p>13. Similarly, for the case shown in Figure 9 (2PC0), please provide a detailed comparison with NMR data (if they are readily available) and a comparison to single conformer ADP refinement.</p><p>14. Equation 5 is the likelihood function for acentric reflections, not centric reflections. Also in equations 5–7, the notation implies that the calculated structure factors are the instantaneous ones, whereas they should be the rolling averages.</p><p>15. In equation 4, the bulk solvent mask excludes the explicit water molecules, i.e., the explicit water molecules are considered part of the model, correct?</p></body></sub-article><sub-article article-type="reply" id="SA2"><front-stub><article-id pub-id-type="doi">10.7554/eLife.00311.029</article-id><title-group><article-title>Author response</article-title></title-group></front-stub><body><p><italic>1. The final description of the structure as an ensemble does involve many more (non-independent) parameters than a single description, and in particular needs, in essence, more parameters to describe poorly ordered regions, which are the least well determined from the data. This has always been the worry about ensemble methods. The approach described here is conservative and controlled, but nevertheless is open to this criticism. It would be nice to have an explicit discussion on this point</italic>.</p><p>We discuss the fact that the number of independent parameters in the ensemble model is ill defined. However, the Rfree improvement and the very high local correlations between independent runs indicate that ensemble refinement yields reliable models.</p><p><italic>2. 
In the opinion of the authors, is this a method and structure description that should be generally used? In relation to point 1, what are the dangers if the method is used by people who are less careful than the present authors</italic>?</p><p>We show that the method markedly improves Rfree for a range of datasets with upper resolution limits from 1.5 to 2.6 Å. It is rather difficult for us to judge what the dangers might be when the method is used by people who are not careful. We will provide our protocols with the release of the method in the Phenix package. We find two measures very informative: the improvement in Rfree and the local correlation between independent runs. For regions with low local correlation (as for highly disordered regions), detailed structural interpretations are not valid.</p><p><italic>3. It is stated that restricting “the number of structures modeled … thereby prevents over-fitting of the data”. It should be mentioned here that the 1990 work required a long relaxation time (up to 40 ps) to model all the motion including that now modeled by TLS, but that the new implementation uses much smaller relaxation times (usually less than 1 ps), and thus far fewer structures contribute to the running average</italic>.</p><p>We have included a description of the differences between the original and the current work.</p><p><italic>4. A number of test cases are examined for the physical and biological relevance of the dynamics. It is probably not coincidental that the most interesting cases are biased towards the ones with higher resolution data. 
Although interesting features are seen with resolutions as low as 2.6 Angstrom, it might be worth pointing out that, because of the correlation between resolution and the number of structures that can be allowed in the ensemble average (Figure 1D), there is more potential for observing interesting features at higher resolution</italic>.</p><p>We implicitly addressed this issue by discussing the very high local correlations observed at high resolution versus the lower local correlation observed at lower resolution, and for highly disordered loops.</p><p><italic>5. The discussion implies that statistics can be drawn from averaging results from several blocks each containing about 250 structures. That makes sense, since Figure 2C presents averages from more than 1,000 structures. However, a later section implies that only one acquisition period is selected for analysis. The apparent contradiction should be clarified. On this point, is there objective evidence that choosing a time block with the lowest R(work) really gives the best model? As Ian Tickle has shown, R-factors are subject to statistical error. It should not be difficult to analyze the data from the mannose-binding protein (1YTT) to see whether time blocks with low R(work) actually show the highest correlation to maps computed with experimental phases</italic>.</p><p>The apparent contradiction was removed. For the 1YTT dataset we show a strong correlation between the R values of the selected block and the overall map correlation coefficient (see Figure 11).</p><p><italic>6. Figures 1C and 1D show a very clear trend, which the authors do not seem to comment on. The number of observations per parameter scales as 1/d(min)^3, and it seems reasonable to expect that the optimal number of structures in the ensemble average would scale in the same way because the number of parameters can be increased in line with the number of observations. 
Indeed, the data in Figure 1D would not fit badly to a curve described by something like 600/d(min)^3. In order for a proportional number of structures to be consulted during the ensemble averaging process, the relaxation time should also scale proportionally to 1/d(min)^3, which is roughly what is seen in Figure 1C. So it looks like a very good guess of an optimal relaxation time and optimal number of structures to average could be made from just knowing the resolution</italic>.</p><p>We agree that there is a trend between resolution and the relaxation time (τ<sub>x</sub>). However, at present, given the limited number of datasets, we refrain from suggesting fixed limits in the parameterization of the relaxation time and still advise testing multiple parameter values for optimum results. Future work may lead to a more deterministic parameterization.</p><p><italic>7. Cross-validation (R_free) was used to determine the optimum relaxation time tau_x (which determines the size of the ensemble), as well as p_TLS, T_bath, and w_xray. The optimum values of tau_x are shown in Figure 1C. However, the optimum values of the other adjustable parameters should be shown as well for all cases to get a sense for the variation of these parameters for different systems</italic>.</p><p>As requested, this information is now included in Figure 1. We note in the main text that p<sub>TLS</sub> and T<sub>bath</sub> do not correlate with resolution, whereas τ<sub>x</sub> tends to correlate with resolution.</p><p><italic>8. Some representation (e.g., 2D contour plots) of the variation of R_free with adjustable parameters should be provided for one case (e.g., for the case with the largest improvement, 1UOY)</italic>.</p><p>These have now been provided.</p><p><italic>9. Figure 2 (1UOY): interesting case, but why is the improvement (∼5% in Rfree) so good compared to some of the other cases? 
Can this be explained</italic>?</p><p>1UOY exhibits a large degree of anisotropic and anharmonic side chain motion that may explain the magnitude of the observed improvement. We have added a comment to the text accordingly.</p><p><italic>10. For 1YTT, the improvement in the map correlation coefficient from 0.895 (single conformer) to 0.903 (ensemble) and the corresponding improvement in R_free (0.014) seem small, but the structures and maps (Figure 3C, D) show significant local improvements. This should be made clearer in the text, i.e., the seemingly small improvements in overall quality indicators may allow for significant local improvements. Perhaps a real space correlation plot could provide a further illustration of the improvement upon multi-conformer refinement</italic>.</p><p>We have amended the text as suggested. Figure 3 shows the real-space correlation plot of the ensemble model versus the experimentally phased map.</p><p><italic>11. “Similar to Brunger, the R_free and w_xray correlated with stereochemical quality… ”. This statement is unclear. R_free is not always correlated with w_xray. The weaker w_xray, the tighter the geometry one would expect, but with increasing R_free. Perhaps the correlation between R_free and model quality could be shown for the ensemble refinements performed</italic>?</p><p>A figure has been added showing the relationship between the best R_free, w_xray, and stereochemistry.</p><p><italic>12. Figures 7D and 8, the proline isomerase cyclophilin A, CypA. Please provide a plot with the ensemble variability (rmsd), NMR relaxation parameters (Figure 2 in Eisenmesser et al., Science, 2002), and the ADP as a function of residue. Such a plot would be an independent validation of the method by comparison to solution data, and it would also illustrate that the method goes beyond single conformer individual ADP refinement</italic>.</p><p><italic>13. 
Similarly, for the case shown in Figure 9 (2PC0), please provide a detailed comparison with NMR data (if they are readily available) and a comparison to single conformer ADP refinement</italic>.</p><p>Re 12 and 13: in the 2002 manuscript by Eisenmesser et al., the NMR relaxation data are not tabulated. Unfortunately, neither protein is listed in the BMRB database (<ext-link ext-link-type="uri" xlink:href="http://www.bmrb.wisc.edu/search/query_grid/kinetic_grid.html">http://www.bmrb.wisc.edu/search/query_grid/kinetic_grid.html</ext-link>).</p><p><italic>14. Equation 5 is the likelihood function for acentric reflections, not centric reflections. Also in equations 5–7, the notation implies that the calculated structure factors are the instantaneous ones, whereas they should be the rolling averages</italic>.</p><p>The text has been corrected; the equation shown exemplifies the general case of acentric reflections. We define <bold>F</bold><sub>model</sub> as a function of rolling averages &lt;<bold>F</bold><sub>calc</sub>&gt; and &lt;<bold>F</bold><sub>mask</sub>&gt;; see equation 2. This choice simplifies the nomenclature when describing map coefficients.</p><p><italic>15. In equation 4, the bulk solvent mask excludes the explicit water molecules, i.e., the explicit water molecules are considered part of the model, correct</italic>?</p><p>This is correct. We now state this explicitly.</p></body></sub-article></article>