<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.1d1 20130915//EN" "JATS-archivearticle1.dtd"><article article-type="research-article" dtd-version="1.1d1" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><front><journal-meta><journal-id journal-id-type="nlm-ta">elife</journal-id><journal-id journal-id-type="hwp">eLife</journal-id><journal-id journal-id-type="publisher-id">eLife</journal-id><journal-title-group><journal-title>eLife</journal-title></journal-title-group><issn publication-format="electronic">2050-084X</issn><publisher><publisher-name>eLife Sciences Publications, Ltd</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="publisher-id">03568</article-id><article-id pub-id-type="doi">10.7554/eLife.03568</article-id><article-categories><subj-group subj-group-type="display-channel"><subject>Research article</subject></subj-group><subj-group subj-group-type="heading"><subject>Genomics and evolutionary biology</subject></subj-group></article-categories><title-group><article-title>Predicting evolution from the shape of genealogical trees</article-title></title-group><contrib-group><contrib contrib-type="author" corresp="yes" id="author-1701"><name><surname>Neher</surname><given-names>Richard A</given-names></name><contrib-id contrib-id-type="orcid">http://orcid.org/0000-0003-2525-1407</contrib-id><xref ref-type="aff" rid="aff1"/><xref ref-type="corresp" rid="cor1">*</xref><xref ref-type="other" rid="par-1"/><xref ref-type="fn" rid="con1"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" id="author-4919"><name><surname>Russell</surname><given-names>Colin A</given-names></name><xref ref-type="aff" rid="aff2"/><xref ref-type="other" rid="par-2"/><xref ref-type="fn" rid="con2"/><xref ref-type="fn" rid="conf2"/></contrib><contrib contrib-type="author" corresp="yes" 
id="author-15302"><name><surname>Shraiman</surname><given-names>Boris I</given-names></name><xref ref-type="aff" rid="aff3"/><xref ref-type="corresp" rid="cor2">*</xref><xref ref-type="other" rid="par-3"/><xref ref-type="fn" rid="con3"/><xref ref-type="fn" rid="conf2"/></contrib><aff id="aff1"><institution content-type="dept">Evolutionary Dynamics and Biophysics</institution>, <institution>Max Planck Institute for Developmental Biology</institution>, <addr-line><named-content content-type="city">Tübingen</named-content></addr-line>, <country>Germany</country></aff><aff id="aff2"><institution content-type="dept">Department of Veterinary Medicine</institution>, <institution>University of Cambridge</institution>, <addr-line><named-content content-type="city">Cambridge</named-content></addr-line>, <country>United Kingdom</country></aff><aff id="aff3"><institution content-type="dept">Kavli Institute for Theoretical Physics</institution>, <institution>University of California, Santa Barbara</institution>, <addr-line><named-content content-type="city">Santa Barbara</named-content></addr-line>, <country>United States</country></aff></contrib-group><contrib-group content-type="section"><contrib contrib-type="editor"><name><surname>McVean</surname><given-names>Gil</given-names></name><role>Reviewing editor</role><aff><institution>Oxford University</institution>, <country>United Kingdom</country></aff></contrib></contrib-group><author-notes><corresp id="cor1"><label>*</label>For correspondence: <email>richard.neher@tuebingen.mpg.de</email> (RAN);</corresp><corresp id="cor2"><label>*</label>For correspondence: <email>shraiman@kitp.ucsb.edu</email> (BIS)</corresp></author-notes><pub-date date-type="pub" publication-format="electronic"><day>11</day><month>11</month><year>2014</year></pub-date><pub-date pub-type="collection"><year>2014</year></pub-date><volume>3</volume><elocation-id>e03568</elocation-id><history><date 
date-type="received"><day>03</day><month>06</month><year>2014</year></date><date date-type="accepted"><day>30</day><month>09</month><year>2014</year></date></history><permissions><copyright-statement>© 2014, Neher et al</copyright-statement><copyright-year>2014</copyright-year><copyright-holder>Neher et al</copyright-holder><license xlink:href="http://creativecommons.org/licenses/by/4.0/"><license-p>This article is distributed under the terms of the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution License</ext-link>, which permits unrestricted use and redistribution provided that the original author and source are credited.</license-p></license></permissions><self-uri content-type="pdf" xlink:href="elife03568.pdf"/><abstract><object-id pub-id-type="doi">10.7554/eLife.03568.001</object-id><p>Given a sample of genome sequences from an asexual population, can one predict its evolutionary future? Here we demonstrate that the branching patterns of reconstructed genealogical trees contain information about the relative fitness of the sampled sequences and that this information can be used to predict successful strains. Our approach is based on the assumption that evolution proceeds by accumulation of small effect mutations, does not require species-specific input and can be applied to any asexual population under persistent selection pressure. We demonstrate its performance using historical data on seasonal influenza A/H3N2 virus. We predict the progenitor lineage of the upcoming influenza season with near-optimal performance in 30% of cases and make informative predictions in 16 out of 19 years. 
Beyond providing a tool for prediction, our ability to make informative predictions implies persistent fitness variation among circulating influenza A/H3N2 viruses.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03568.001">http://dx.doi.org/10.7554/eLife.03568.001</ext-link></p></abstract><abstract abstract-type="executive-summary"><object-id pub-id-type="doi">10.7554/eLife.03568.002</object-id><title>eLife digest</title><p>When viruses multiply, they copy their genetic material to make clones of themselves. However, the genetic material in the clone is often slightly different from the genetic material in the original virus. These mutations can be caused by mistakes made during copying or by radiation or chemicals. Further mutations arise when the clones multiply, which means that, after many generations, there will be quite large differences in the genetic material carried by many members of the population. Most mutations have little or no effect on the ‘fitness’ of an individual - that is, on its ability to survive and multiply - but some mutations do have an influence.</p><p>Some viruses, like seasonal influenza (flu) viruses, can mutate so rapidly that the most common strains change from year to year. This is why new flu vaccines are needed every year. To date most attempts to predict the evolution of seasonal flu viruses have focused on identifying specific features within the genetic sequences that might indicate fitness. However, such approaches require lots of information about the viruses, and this information is often not available.</p><p>To address this problem, Neher, Russell and Shraiman have developed a more general method to predict fitness from virus genetic sequences. 
First, a ‘family tree’ for a virus population - which shows how each strain of the virus is related to other strains - was constructed by comparing the genetic sequences.</p><p>The next step was based on the observation that as long as differences in fitness arise from the accumulation of multiple mutations, the branching structure of this family tree will bear a visible imprint of the natural selection process as it unfolds. Using this insight and methods borrowed from statistical physics, Neher et al. then analyzed the shape and branching pattern of the tree to work out the fitness of the different strains relative to each other.</p><p>Neher et al. tested the method using historical influenza A virus data. In 16 of the 19 years studied, the family tree approach made meaningful predictions about which viruses were most likely to give rise to future epidemics. The ability to predict influenza virus evolution from tree shape alone suggests that influenza virus evolution may be more predictable than previously expected.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03568.002">http://dx.doi.org/10.7554/eLife.03568.002</ext-link></p></abstract><kwd-group kwd-group-type="author-keywords"><title>Author keywords</title><kwd>vaccine strain selection</kwd><kwd>adaptive evolution</kwd><kwd>population genetics</kwd></kwd-group><kwd-group kwd-group-type="research-organism"><title>Research organism</title><kwd>viruses</kwd></kwd-group><funding-group><award-group id="par-1"><funding-source><institution-wrap><institution-id institution-id-type="FundRef">http://dx.doi.org/10.13039/501100000781</institution-id><institution>European Research Council</institution></institution-wrap></funding-source><award-id>ERC-Stg-260686</award-id><principal-award-recipient><name><surname>Neher</surname><given-names>Richard A</given-names></name></principal-award-recipient></award-group><award-group id="par-2"><funding-source><institution-wrap><institution-id 
institution-id-type="FundRef">http://dx.doi.org/10.13039/501100000288</institution-id><institution>Royal Society</institution></institution-wrap></funding-source><award-id>University Research Fellowship</award-id><principal-award-recipient><name><surname>Russell</surname><given-names>Colin A</given-names></name></principal-award-recipient></award-group><award-group id="par-3"><funding-source><institution-wrap><institution-id institution-id-type="FundRef">http://dx.doi.org/10.13039/100000002</institution-id><institution>National Institutes of Health</institution></institution-wrap></funding-source><award-id>R01 GM086793</award-id><principal-award-recipient><name><surname>Shraiman</surname><given-names>Boris I</given-names></name></principal-award-recipient></award-group><funding-statement>The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.</funding-statement></funding-group><custom-meta-group><custom-meta><meta-name>elife-xml-version</meta-name><meta-value>2</meta-value></custom-meta><custom-meta specific-use="meta-only"><meta-name>Author impact statement</meta-name><meta-value>A general method for inferring fitness from a sample of nucleotide sequences can predict the progenitor of next year's seasonal influenza.</meta-value></custom-meta></custom-meta-group></article-meta></front><body><sec id="s1" sec-type="intro"><title>Introduction</title><p>A general method to predict the evolutionary trajectories of asexual populations would be extremely valuable for understanding the population dynamics of pathogens or of malignant cells. For example, the vaccine against seasonal influenza needs to be updated frequently since virus populations evolve to evade increasing immunity among humans (<xref ref-type="bibr" rid="bib14">Hampson, 2002</xref>; <xref ref-type="bibr" rid="bib23">Nelson and Holmes, 2007</xref>). 
Reliable prediction of the strains most likely to circulate in the upcoming season, and particularly the ability to predict antigenic change, would be transformative to the vaccine strain selection process.</p><p>Predictability from genetic sequence data requires heritable fitness variation among the sampled sequences. Neutral evolution - population dynamics in the absence of selective pressure - is by definition unpredictable: all sequences are equally fit. Yet even when selection determines the success of individual lineages, predictability depends on the effect size of fitness-altering mutations. Two competing scenarios of adaptive evolution are illustrated in <xref ref-type="fig" rid="fig1">Figure 1</xref>. If evolution proceeds via rare mutations with large phenotypic effects, the population is homogeneous in fitness most of the time (<xref ref-type="fig" rid="fig1">Figure 1A</xref>). In this case large effect mutations can convert any genome into the fittest in a single generation. Prediction from sequence alone is only possible if the time of sampling happens to be during a brief sweep of a large effect mutation. In contrast, continuous accumulation of small effect mutations (<xref ref-type="fig" rid="fig1">Figure 1B</xref>) results in a gradual change in fitness of lineages and persistent variation in fitness (<xref ref-type="bibr" rid="bib35">Tsimring et al., 1996</xref>). A genealogical tree then potentially contains predictable patterns: the fitness of most lineages decreases over time (movement to the left in <xref ref-type="fig" rid="fig1">Figure 1</xref>), due to a changing environment or the accumulation of weakly deleterious mutations. 
Only a few adapt rapidly enough to stay among the most fit in the population (<xref ref-type="bibr" rid="bib29">Rouzine et al., 2003</xref>; <xref ref-type="bibr" rid="bib3">Brunet et al., 2007</xref>; <xref ref-type="bibr" rid="bib7">Desai and Fisher, 2007</xref>; <xref ref-type="bibr" rid="bib13">Hallatschek, 2011</xref>; <xref ref-type="bibr" rid="bib12">Goyal et al., 2012</xref>; <xref ref-type="bibr" rid="bib8">Desai et al., 2013</xref>; <xref ref-type="bibr" rid="bib21">Neher and Hallatschek, 2013</xref>) and thus have a chance to continue into the future.<fig id="fig1" position="float"><object-id pub-id-type="doi">10.7554/eLife.03568.003</object-id><label>Figure 1.</label><caption><title>Genealogies in adapting populations.</title><p>(<bold>A</bold> and <bold>B</bold>) illustrate the genealogy of two successive samples embedded into the (Malthusian) fitness distribution of the population indicated in grey. In absence of adaptive mutations, fitness declines due to a changing environment or accumulation of deleterious mutations. Only one lineage (thick line) persists from first sample to second sample. (<bold>A</bold>) Evolution proceeds via rare large effect mutations (dashed arrows) that occur in a population with little fitness variance. All individuals are roughly equally likely to pick up the large effect mutation, rendering evolution unpredictable from sequence data alone. (<bold>B</bold>) Conversely, if adaptation is due to many small effect mutations, the successful lineage (thick) is always among the most fit individuals. 
Being able to predict relative fitness therefore makes it possible to pick a progenitor of the future population.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03568.003">http://dx.doi.org/10.7554/eLife.03568.003</ext-link></p></caption><graphic xlink:href="elife03568f001"/></fig></p><p>In the specific context of human seasonal influenza A/H3N2 viruses, the study of their antigenic evolution has identified specific amino-acid substitutions with large phenotypic effects (<xref ref-type="bibr" rid="bib15">Koel et al., 2013</xref>), which have been responsible for the observed stepwise replacement of antigenic variants over time (<xref ref-type="bibr" rid="bib32">Smith et al., 2004</xref>). Yet, the evolution of seasonal influenza viruses is also marked by the continuous accumulation of mutations that have small or no antigenic effects but nevertheless potentially affect fitness (<xref ref-type="bibr" rid="bib2">Bhatt et al., 2011</xref>; <xref ref-type="bibr" rid="bib34">Strelkowa and Lässig, 2012</xref>), for example compensatory or permissive mutations (<xref ref-type="bibr" rid="bib11">Gong et al., 2013</xref>). Previous attempts at predicting the evolution of seasonal influenza viruses have tried to identify molecular signatures that are predictive of future success (<xref ref-type="bibr" rid="bib4">Bush et al., 1999</xref>) or used clustering approaches based on amino acid sequences (<xref ref-type="bibr" rid="bib26">Plotkin et al., 2002</xref>). Recently, <xref ref-type="bibr" rid="bib18">Łuksza and Lässig (2014)</xref> constructed an explicit fitness model based on sequence data from the hemagglutinin (HA1) surface protein. 
The utility of these explicit models depends on the availability of extensive historical data or a detailed understanding of the influenza virus sequence-to-fitness map.</p><p>Rather than constructing an explicit fitness model, which is currently impossible for most organisms, we developed a general algorithm to infer fitness from the shape of reconstructed genealogical trees without using any molecular information. Our approach is based on a simple idea: since high (Malthusian) fitness implies many offspring, which in turn implies branching, the shape of the tree can be exploited to infer fitness (<xref ref-type="bibr" rid="bib6">Dayarian and Shraiman, 2014</xref>). Here, we developed a quantitative model of fitness dynamics on genealogical trees, which is based on recent progress in understanding the statistical structure of genealogies in adapting populations (<xref ref-type="bibr" rid="bib21">Neher and Hallatschek, 2013</xref>). Following <xref ref-type="bibr" rid="bib21">Neher and Hallatschek (2013)</xref>, our model assumes: 1) that the population is under persistent directional selection and 2) that fitness changes along lineages in small steps through the continuous accumulation of small effect mutations (<xref ref-type="fig" rid="fig1">Figure 1B</xref>). This fitness model resembles the well-known infinitesimal model of quantitative genetics (<xref ref-type="bibr" rid="bib9">Falconer and Mackay, 1996</xref>) in the sense that many small effect mutations give rise to a bell-shaped fitness distribution on which selection acts (<xref ref-type="bibr" rid="bib20">Neher, 2013</xref>). 
However, the infinitesimal model itself provides no insight into the relationship between the structure of genealogical trees and fitness: this insight stems from the more recent work on the dynamics of adaptation in large asexual populations (<xref ref-type="bibr" rid="bib35">Tsimring et al., 1996</xref>; <xref ref-type="bibr" rid="bib29">Rouzine et al., 2003</xref>; <xref ref-type="bibr" rid="bib7">Desai and Fisher, 2007</xref>; <xref ref-type="bibr" rid="bib8">Desai et al., 2013</xref>; <xref ref-type="bibr" rid="bib21">Neher and Hallatschek, 2013</xref>) and in populations with occasional reassortment (<xref ref-type="bibr" rid="bib22">Neher and Shraiman, 2011</xref>). After testing the algorithm on simulated data, we apply it to historical data on human seasonal influenza A/H3N2 virus hemagglutinin sequences. Despite multiple confounding factors – discussed below – we find that our algorithm makes informative predictions about influenza virus evolution.</p></sec><sec id="s2" sec-type="results"><title>Results</title><sec id="s2-1"><title>The fitness distribution on a tree</title><p>Intuitively, we expect that an exceptionally fit internal node in a genealogical tree will be at the root of a rapidly branching, and hence expanding, clade (e.g. node 2 in <xref ref-type="fig" rid="fig2">Figure 2A</xref>). Similarly, extant individuals with high fitness are likely to be recent descendants of internal nodes with high fitness (e.g. node 3 in <xref ref-type="fig" rid="fig2">Figure 2A</xref>). 
By tracing fitness along lineages and integrating across the tree, the algorithm described below makes this intuition precise and quantitative.<fig-group><fig id="fig2" position="float"><object-id pub-id-type="doi">10.7554/eLife.03568.004</object-id><label>Figure 2.</label><caption><title>Inferring fitness from genealogical trees.</title><p>(<bold>A</bold>) The inference algorithm is based on branch propagators associated with each branch of the reconstructed tree (middle). Branch propagators characterize the fitness distribution of child nodes given the fitness of the ancestral node (left). The internal node 2 would have a higher marginal fitness estimate (right) than node 1, as node 2 has more children. The inferred distribution of the fitness of the external node 3 has broadened along the branch from node 2. (<bold>B</bold>–<bold>D</bold>) Analysis of simulated data. Panel B shows for a typical example that inferred fitness is well correlated with the true fitness, with a rank correlation coefficient <inline-formula><mml:math id="inf2"><mml:mrow><mml:mi>ρ</mml:mi><mml:mo>=</mml:mo><mml:mn>0.56</mml:mn></mml:mrow></mml:math></inline-formula>. This correlation increases with increasing mutation rate as shown in panel C for 100 simulated data sets each (boxes cover the interquartile range, red lines indicate the median). Panel D shows that the sequence with the highest inferred fitness tends to be similar to the population 200 generations in the future. The axes show the average Hamming distance to the future population of the predicted sequence (<italic>y</italic>-axis) and of the post-hoc optimal sequence (<italic>x</italic>-axis) for 100 simulated data sets. Both distances are relative to the average distance between the present and future population. 
Parameters: <inline-formula><mml:math id="inf3"><mml:mrow><mml:mi>N</mml:mi><mml:mo>=</mml:mo><mml:mn>20000</mml:mn></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="inf4"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>A</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn>0.08</mml:mn></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="inf5"><mml:mrow><mml:mtext>Γ</mml:mtext><mml:mo>=</mml:mo><mml:mn>0.2</mml:mn></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="inf6"><mml:mrow><mml:mi>u</mml:mi><mml:mo>=</mml:mo><mml:mn>0.064</mml:mn></mml:mrow></mml:math></inline-formula> (B,D).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03568.004">http://dx.doi.org/10.7554/eLife.03568.004</ext-link></p></caption><graphic xlink:href="elife03568f002"/></fig><fig id="fig2s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.03568.005</object-id><label>Figure 2—figure supplement 1.</label><caption><title>Predictability increases with genetic diversity.</title><p>The prediction performance, quantified by the rank correlation coefficient between the inferred and true fitness, increases with pairwise diversity. Large <inline-formula><mml:math id="inf183"><mml:mtext>Γ</mml:mtext></mml:math></inline-formula> is superior at small pairwise distances, which corresponds to a regime of few large effect mutations. 
Smaller <inline-formula><mml:math id="inf184"><mml:mtext>Γ</mml:mtext></mml:math></inline-formula> does better at large pairwise distances, where fitness variation is spread among many loci.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03568.005">http://dx.doi.org/10.7554/eLife.03568.005</ext-link></p></caption><graphic xlink:href="elife03568fs001"/></fig><fig id="fig2s2" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.03568.006</object-id><label>Figure 2—figure supplement 2.</label><caption><title>Prediction from continuously sampled sequences.</title><p>Same as <xref ref-type="fig" rid="fig2">Figure 2B–D</xref>, but with continuous sampling of 200 simulated sequences over 100 generations, as opposed to one sample from exactly one time point. Panels B and C show that the rank correlation does not suffer under continuous sampling, at least at moderate or large mutation rates. The genetic distance of the predicted strain to the future population behaves similarly. Parameters: <inline-formula><mml:math id="inf185"><mml:mrow><mml:mi>N</mml:mi><mml:mo>=</mml:mo><mml:mn>20000</mml:mn></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="inf186"><mml:mrow><mml:mi>ω</mml:mi><mml:mo>=</mml:mo><mml:mn>0.01</mml:mn></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="inf187"><mml:mrow><mml:mtext>Γ</mml:mtext><mml:mo>=</mml:mo><mml:mn>0.2</mml:mn></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="inf188"><mml:mrow><mml:mi>u</mml:mi><mml:mo>=</mml:mo><mml:mn>0.064</mml:mn></mml:mrow></mml:math></inline-formula>.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03568.006">http://dx.doi.org/10.7554/eLife.03568.006</ext-link></p></caption><graphic xlink:href="elife03568fs002"/></fig></fig-group></p><p>As input, our algorithm requires a genealogical tree, e.g. a tree reconstructed from a sample of genomic sequences. 
For a given tree <italic>T</italic>, we derived the joint probability distribution <inline-formula><mml:math id="inf220"><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi mathvariant="bold">x</mml:mi><mml:mo>|</mml:mo><mml:mi>T</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> for the fitnesses <inline-formula><mml:math id="inf7"><mml:mrow><mml:mi mathvariant="bold">x</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mo>…</mml:mo></mml:mrow></mml:math></inline-formula> of all internal nodes (corresponding to reconstructed ancestral sequences) and external nodes (corresponding to the sampled genomes). Fitness <italic>x</italic><sub><italic>i</italic></sub> of each node <italic>i</italic> is measured relative to the population mean fitness at the time when the corresponding individual was sampled. <inline-formula><mml:math id="inf9"><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi mathvariant="bold">x</mml:mi><mml:mo>|</mml:mo><mml:mi>T</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> is given by a product of <italic>propagators</italic> <inline-formula><mml:math id="inf10"><mml:mrow><mml:mi>g</mml:mi><mml:mo>(</mml:mo><mml:mo>·</mml:mo><mml:mo>|</mml:mo><mml:mo>·</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> for each branch<disp-formula id="equ1"><label>(1)</label><mml:math id="m1"><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi 
mathvariant="bold">x</mml:mi><mml:mo>|</mml:mo><mml:mi>T</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>Z</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>T</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:munderover><mml:mstyle displaystyle="true"><mml:mo>∏</mml:mo></mml:mstyle><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mtext>int</mml:mtext></mml:mrow></mml:msub></mml:mrow></mml:munderover><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mtext> </mml:mtext><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>where <inline-formula><mml:math 
id="inf11"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> is the fitness distribution in the population (see ‘Materials and methods’ for details) and the index <italic>i</italic> runs from 0 (the root) through all <inline-formula><mml:math id="inf12"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mtext>int</mml:mtext></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> internal nodes. The indices <inline-formula><mml:math id="inf14"><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="inf13"><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> denote the two children of node <italic>i</italic>, while <inline-formula><mml:math id="inf15"><mml:mrow><mml:mi>Z</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>T</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> ensures normalization of the distribution. <xref ref-type="disp-formula" rid="equ1">Eq. (1)</xref> has a structure similar to the expression for the likelihood of sampled sequences, given a tree <italic>T</italic>, defined in phylogenetic analysis (<xref ref-type="bibr" rid="bib10">Felsenstein, 2003</xref>). 
The main difference is that instead of defining the probability of mutation from one character state to another, the branch propagator <inline-formula><mml:math id="inf16"><mml:mrow><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:msub><mml:mi>t</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> describes the likelihood of the lineage to connect an ancestor with fitness <inline-formula><mml:math id="inf17"><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> at time <inline-formula><mml:math id="inf18"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> to a child with fitness <inline-formula><mml:math id="inf19"><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> at a later time <inline-formula><mml:math id="inf20"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> (child in the sense of a subclade in the tree, rather than direct offspring). Note that a branch connecting nodes <italic>i</italic> and <italic>j</italic> implies that all sampled descendants of <italic>i</italic> are also descendants of <italic>j</italic>, i.e., the ‘branch does not branch’. 
This non-branching condition is part of the branch propagator, which therefore depends on the fraction <italic>ω</italic> of the total population that is represented in the sample (see ‘Materials and methods’ for details).</p><p><xref ref-type="fig" rid="fig2">Figure 2A</xref> illustrates the propagator as a function of child fitness <inline-formula><mml:math id="inf21"><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, which describes the fitness distribution of children, conditioned on ancestral fitness <inline-formula><mml:math id="inf22"><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. At small <inline-formula><mml:math id="inf23"><mml:mrow><mml:mtext>Δ</mml:mtext><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>−</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, the distribution is peaked around the ancestor. At long times, memory of ancestral fitness is lost and the propagator approaches the population distribution. 
Backwards in time, <inline-formula><mml:math id="inf24"><mml:mrow><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:msub><mml:mi>t</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> describes (using the Bayesian inversion formula [<xref ref-type="bibr" rid="bib10">Felsenstein, 2003</xref>]) the fitness distribution of the ancestor <italic>i</italic> given a sampled child with fitness <inline-formula><mml:math id="inf25"><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> at time <inline-formula><mml:math id="inf26"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. Far in the past, the ancestor fitness distribution converges to a narrow peak in the high fitness tail (<xref ref-type="bibr" rid="bib28">Rouzine and Coffin, 2007</xref>; <xref ref-type="bibr" rid="bib21">Neher and Hallatschek, 2013</xref>). See ‘Materials and methods’ for a more detailed discussion.</p><p>The fitness dynamics along a lineage resemble a random walk on which each step corresponds to a mutation with a certain effect on fitness. This walk is biased towards high fitness by selection, which makes fitter lineages more likely to survive and eventually be sampled. If many mutations contribute, the dynamics of fitness along branches can be approximated by selection-biased diffusion (SBD) as described in ‘Materials and methods’, <xref ref-type="disp-formula" rid="equ9">Equation (9)</xref> – <xref ref-type="disp-formula" rid="equ11">Equation (11)</xref>. 
The fitness diffusion constant of a branch is given by <inline-formula><mml:math id="inf27"><mml:mrow><mml:mi>D</mml:mi><mml:mo>=</mml:mo><mml:mi>u</mml:mi><mml:mo>〈</mml:mo><mml:msup><mml:mi>s</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mo>〉</mml:mo><mml:mo>/</mml:mo><mml:mn>2</mml:mn></mml:mrow></mml:math></inline-formula>, where <italic>u</italic> is the genome wide mutation rate, and <inline-formula><mml:math id="inf28"><mml:mrow><mml:mo>〈</mml:mo><mml:mo>·</mml:mo><mml:mo>〉</mml:mo></mml:mrow></mml:math></inline-formula> denotes the average over the effect sizes of mutations (<xref ref-type="bibr" rid="bib35">Tsimring et al., 1996</xref>). Fitness diffusion and stochasticity due to finite populations determine the fitness variance <inline-formula><mml:math id="inf29"><mml:mrow><mml:msup><mml:mi>σ</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> in the population (<xref ref-type="bibr" rid="bib5">Cohen et al., 2005</xref>).</p><p>Based on the SBD approximation derived in ‘Materials and methods’, we implemented a program that numerically solves for the branch propagator and, by going up and down the tree using a ‘Message Passing’ (similar to dynamic programming) technique (<xref ref-type="bibr" rid="bib19">Mézard and Montanari, 2009</xref>), calculates the marginal fitness distribution for each node as illustrated in <xref ref-type="fig" rid="fig2">Figure 2A</xref>, for details see ‘Materials and methods’.</p></sec><sec id="s2-2"><title>Fitness inference is insensitive to model assumptions</title><p>To explore the extent to which the idealized SBD model assuming infinitesimal mutations is able to infer fitness when evolution happens via discrete mutations, we simulated a simple model of evolution with fixed fitness variance (<inline-formula><mml:math id="inf30"><mml:mrow><mml:mi>σ</mml:mi><mml:mo>=</mml:mo><mml:mn>0.03</mml:mn></mml:mrow></mml:math></inline-formula>) (<xref ref-type="bibr" rid="bib37">Zanini and Neher, 2012</xref>). 
In order to mimic adaptive evolution in a changing environment we introduced sites in the simulated genome that allow for beneficial mutations at rate <inline-formula><mml:math id="inf31"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>A</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn>0.02</mml:mn><mml:mo>,</mml:mo><mml:mo>…</mml:mo><mml:mo>,</mml:mo><mml:mn>0.16</mml:mn></mml:mrow></mml:math></inline-formula> per generation in a genome otherwise dominated by deleterious mutations. Every 200 generations, we took a random sample of sequences from the simulated population. We recorded the fitness of each sampled sequence, which we will compare with our inferences below.</p><p>In order to apply the fitness inference method to a reconstructed tree, we needed to parameterize the model and convert branch length measured as similarity between sequences into time. When measuring time in units of <inline-formula><mml:math id="inf32"><mml:mrow><mml:msup><mml:mi>σ</mml:mi><mml:mrow><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, the SBD model has only one free dimensionless parameter <inline-formula><mml:math id="inf33"><mml:mrow><mml:mtext>Γ</mml:mtext><mml:mo>=</mml:mo><mml:mi>D</mml:mi><mml:msup><mml:mi>σ</mml:mi><mml:mrow><mml:mo>−</mml:mo><mml:mn>3</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> that describes the relative importance of selection and stochastic processes. <inline-formula><mml:math id="inf34"><mml:mtext>Γ</mml:mtext></mml:math></inline-formula> is inversely proportional to the square root of the logarithm of the population size and hence does not vary greatly (<xref ref-type="bibr" rid="bib35">Tsimring et al., 1996</xref>; <xref ref-type="bibr" rid="bib5">Cohen et al., 2005</xref>). 
We used <inline-formula><mml:math id="inf35"><mml:mrow><mml:mtext>Γ</mml:mtext><mml:mo>=</mml:mo><mml:mn>0.2</mml:mn></mml:mrow></mml:math></inline-formula> and 0.5 corresponding to moderate and more rapid diffusion relative to selection, respectively. Coalescent theory of adapting populations connects pairwise sequence similarity to <inline-formula><mml:math id="inf36"><mml:mtext>Γ</mml:mtext></mml:math></inline-formula>. The choice of <inline-formula><mml:math id="inf37"><mml:mtext>Γ</mml:mtext></mml:math></inline-formula> fixes the conversion from branch length to time via <xref ref-type="disp-formula" rid="equ20">Equation (20)</xref> (<xref ref-type="bibr" rid="bib21">Neher and Hallatschek, 2013</xref>). In addition to <inline-formula><mml:math id="inf38"><mml:mtext>Γ</mml:mtext></mml:math></inline-formula> we need to fix <italic>ω</italic>. Since we used a sample of 200 sequences out of a total of <inline-formula><mml:math id="inf39"><mml:mrow><mml:mi>N</mml:mi><mml:mo>=</mml:mo><mml:mn>20000</mml:mn></mml:mrow></mml:math></inline-formula> sequences, <inline-formula><mml:math id="inf40"><mml:mrow><mml:mi>ω</mml:mi><mml:mo>=</mml:mo><mml:mn>0.01</mml:mn></mml:mrow></mml:math></inline-formula> (ultimately, <inline-formula><mml:math id="inf41"><mml:mrow><mml:mi>ω</mml:mi><mml:mo>/</mml:mo><mml:mi>σ</mml:mi></mml:mrow></mml:math></inline-formula> enters the algorithm, see ‘Materials and methods’). Using these parameters, we applied our method to a reconstructed tree and report the mean posterior fitness as ‘inferred fitness’ for each internal and external node.</p><p><xref ref-type="fig" rid="fig2">Figure 2B</xref> shows the inferred vs true fitness for a typical simulation. The rank order of fitness is well predicted (Spearman's correlation coefficients around 0.5). <xref ref-type="fig" rid="fig2">Figure 2C</xref> shows that fitness rankings improve with increasing mutation rates. 
This is expected, since increased mutation rates correspond to a larger number of mutations that contribute to fitness and make the SBD model a better approximation. This behavior is consistent across different rates of adaptive mutations and depends weakly on our choice of <inline-formula><mml:math id="inf42"><mml:mtext>Γ</mml:mtext></mml:math></inline-formula> (<xref ref-type="fig" rid="fig2s1">Figure 2—figure supplement 1</xref>). Large <inline-formula><mml:math id="inf43"><mml:mtext>Γ</mml:mtext></mml:math></inline-formula> performs better at low mutation rates when fitness diversity is dominated by only a few mutations, corresponding to more rapid fitness diffusion relative to selection and coalescence.</p></sec><sec id="s2-3"><title>High inferred fitness predicts progenitor sequences</title><p>Next, we asked whether sequences that we predict to have high fitness are close in sequence to the progenitor lineage of future populations. <xref ref-type="fig" rid="fig2">Figure 2D</xref> shows the Hamming distance <inline-formula><mml:math id="inf44"><mml:mrow><mml:mtext>Δ</mml:mtext><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtext>prediction</mml:mtext></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> of the sequence of the individual with the highest fitness estimate to the population 200 generations in the future vs the <inline-formula><mml:math id="inf45"><mml:mrow><mml:mtext>Δ</mml:mtext><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtext>minimal</mml:mtext></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> for the post-hoc optimal pick. The measure <inline-formula><mml:math id="inf46"><mml:mrow><mml:mtext>Δ</mml:mtext><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtext>sequence</mml:mtext></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> is normalized to the average Hamming distance between the present and future population. 
In 40 out of 100 simulations, the top-ranked sequence is an almost optimal pick (points close to the diagonal in <xref ref-type="fig" rid="fig2">Figure 2D</xref>). In 8 out of 100 cases, the prediction is better than a random pick (points below the dashed line in <xref ref-type="fig" rid="fig2">Figure 2D</xref>).</p><p>The fitness inferences shown in <xref ref-type="fig" rid="fig2">Figure 2B–C</xref> used 200 sequences sampled from the same generation. However, the influenza data to which we apply our algorithm below are continuously sampled throughout the year. In <xref ref-type="fig" rid="fig2s2">Figure 2—figure supplement 2</xref> we reproduce panels B–C using 200 sequences sampled from the simulation over a time interval of 100 generations. This gives highly similar results.</p></sec><sec id="s2-4"><title>Local branching density as a heuristic ranking</title><p>In general, faithful inference of the posterior fitness distribution requires numerical solution for the branch propagators and knowledge of the parameters <inline-formula><mml:math id="inf47"><mml:mtext>Γ</mml:mtext></mml:math></inline-formula> and <inline-formula><mml:math id="inf48"><mml:mrow><mml:mi>ω</mml:mi><mml:mo>/</mml:mo><mml:mi>σ</mml:mi></mml:mrow></mml:math></inline-formula>. We observed, however, that the ranking of nodes by fitness and the prediction of progenitor lineages depend little on these parameters. This insensitivity suggests that the fitness ranking depends primarily on a more universal quantity on which the inference algorithm builds.</p><p>In ‘Materials and methods’, we show that the fitness estimates of internal nodes increase with the total branch length downstream of these nodes–at least for short time periods. The downstream tree length acts as a “polarizer” that pushes the fitness distribution of the node away from the population mean towards high fitness. For a given number of descendants, the length of a subtree is maximal if it is star-like. 
This is intuitive, as star-like subtrees indicate rapid branching (or multiple mergers backwards in time) which is expected for high fitness nodes. Conversely, prolonged absence of branching of a lineage indicates relatively low fitness.</p><p>If fitness changes gradually along lineages, high fitness of a node will coincide with both upstream and downstream branching–at least within a certain neighborhood of the tree. The relevant size of the neighborhood will depend on how rapidly fitness decorrelates along lineages. Based on this intuition, we developed a model-independent heuristic ranking algorithm: for each internal and terminal node <italic>i</italic>, we calculate a <italic>local branching index (LBI)</italic> <inline-formula><mml:math id="inf49"><mml:mrow><mml:msub><mml:mi>λ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>τ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> defined as the total surrounding tree length exponentially discounted with increasing distance from the focal node. The scale <italic>τ</italic> of the exponential discounting corresponds to the size of the relevant tree neighborhood or the time over which fitness is ‘remembered’ across the tree. 
Within the SBD model, <italic>τ</italic> corresponds to the equilibration time scale of lineage fitness in the high fitness tail, which is of the order <inline-formula><mml:math id="inf50"><mml:mrow><mml:msub><mml:mi>T</mml:mi><mml:mi>c</mml:mi></mml:msub><mml:mo>/</mml:mo><mml:msqrt><mml:mrow><mml:mtext>log</mml:mtext><mml:mi>N</mml:mi></mml:mrow></mml:msqrt></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="inf51"><mml:mrow><mml:msub><mml:mi>T</mml:mi><mml:mi>c</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the coalescence time scale (<xref ref-type="bibr" rid="bib21">Neher and Hallatschek, 2013</xref>).</p><p>The LBI can be efficiently calculated with the same message passing techniques we used to calculate the posterior fitness distribution. Remarkably, rankings obtained by this simple heuristic are almost as accurate as fitness inference using the more complex SBD model. <xref ref-type="fig" rid="fig3">Figure 3</xref> shows Spearman’s correlation coefficient of <inline-formula><mml:math id="inf52"><mml:mrow><mml:msub><mml:mi>λ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>τ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> with true fitness as a function of pairwise difference for different memory time scales <italic>τ</italic> and compares it to the ranking via mean inferred fitness. 
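As a rough illustration of this message passing calculation, the following Python sketch computes an exponentially discounted surrounding tree length for every node of a rooted tree. It is a sketch under stated assumptions, not the implementation used for the results reported here: the function name <monospace>local_branching_index</monospace>, the dictionary-based tree representation, and the toy tree are hypothetical and chosen only for illustration. <preformat>
```python
import math


def local_branching_index(children, branch_length, root, tau):
    """Exponentially discounted tree length around every node.

    Assumed (hypothetical) inputs:
      children      : dict mapping a node to the list of its child nodes
      branch_length : dict mapping a node to the length of the branch
                      that connects it to its parent (root has none)
      tau           : scale of the exponential discounting
    """
    # Pre-order traversal: every parent appears before its children.
    order = []
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children.get(node, []))

    # Upward pass: discounted tree length of the subtree below each node,
    # as seen from the parent end of the branch leading to that node.
    up = {}
    for node in reversed(order):
        decay = math.exp(-branch_length.get(node, 0.0) / tau)
        below = sum(up[c] for c in children.get(node, []))
        up[node] = tau * (1.0 - decay) + decay * below

    # Downward pass: discounted length of the rest of the tree,
    # as seen from the child end of the branch leading to each node.
    down = {root: 0.0}
    for node in order:
        kids = children.get(node, [])
        total_up = sum(up[c] for c in kids)
        for c in kids:
            decay = math.exp(-branch_length[c] / tau)
            outside = down[node] + total_up - up[c]
            down[c] = tau * (1.0 - decay) + decay * outside

    # The index of a node combines the messages from all adjacent branches.
    return {n: down[n] + sum(up[c] for c in children.get(n, []))
            for n in order}


# Toy tree (hypothetical): one long unbranched lineage "slow" and one
# rapidly branching clade "fast" that carries three short tips.
children = {"root": ["slow", "fast"], "fast": ["f1", "f2", "f3"]}
branch_length = {"slow": 1.0, "fast": 0.1, "f1": 0.1, "f2": 0.1, "f3": 0.1}
lbi = local_branching_index(children, branch_length, "root", tau=0.5)
ranking = sorted(lbi, key=lbi.get, reverse=True)
```
</preformat> Each branch is visited once in the upward and once in the downward pass, so the cost grows linearly with the number of nodes. In this toy tree, the tip of the long unbranched lineage receives the lowest index, in line with the intuition that prolonged absence of branching indicates relatively low fitness. 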
The heuristic <inline-formula><mml:math id="inf53"><mml:mrow><mml:msub><mml:mi>λ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>τ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> not only correlates well with true fitness in simulations but sequences with the highest <inline-formula><mml:math id="inf54"><mml:mrow><mml:msub><mml:mi>λ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>τ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> also tend to be close to the progenitor of future populations (<xref ref-type="fig" rid="fig3s1">Figure 3—figure supplement 1</xref>). Comparing the performance of the LBI to the full fitness inference in <xref ref-type="fig" rid="fig3">Figure 3</xref>, we concluded that a neighborhood size should be <inline-formula><mml:math id="inf55"><mml:mrow><mml:mi>τ</mml:mi><mml:mo>≈</mml:mo><mml:mn>0.0625</mml:mn></mml:mrow></mml:math></inline-formula> of the average pairwise distance in the sample.<fig-group><fig id="fig3" position="float"><object-id pub-id-type="doi">10.7554/eLife.03568.007</object-id><label>Figure 3.</label><caption><title>Local tree length as a fitness ranking.</title><p>Rank correlation between the true fitness and the LBI <inline-formula><mml:math id="inf56"><mml:mrow><mml:msub><mml:mi>λ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>τ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> is shown as a function of pairwise diversity in the sample. 
Different curves correspond to different neighborhood sizes <italic>τ</italic>, which are measured in units of the average pairwise distance.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03568.007">http://dx.doi.org/10.7554/eLife.03568.007</ext-link></p></caption><graphic xlink:href="elife03568f003"/></fig><fig id="fig3s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.03568.008</object-id><label>Figure 3—figure supplement 1.</label><caption><title>The LBI predicts progenitor sequences.</title><p>Sequences with the highest LBI in the sample tend to be close to the progenitor of future populations. The measure <inline-formula><mml:math id="inf189"><mml:mtext>Δ</mml:mtext></mml:math></inline-formula> shows the distance of the predicted sequence to the population 200 generations in the future (relative to the average distance between the two populations).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03568.008">http://dx.doi.org/10.7554/eLife.03568.008</ext-link></p></caption><graphic xlink:href="elife03568fs003"/></fig></fig-group></p></sec><sec id="s2-5"><title>Prediction of seasonal influenza A/H3N2 progenitor lineages</title><p>Having validated our algorithm on simulated data and presented a model-independent method to rank sequences, we attempted to predict progenitor sequences of seasonal influenza A/H3N2 viruses. We used samples of influenza A/H3N2 virus hemagglutinin (HA1) sequences from one year (May–February, Asia and North America, at most 100 sequences from each region) to predict the closest relative of the population circulating in the following (northern hemisphere) winter (October–March, Asia and North America) for the years 1995–2013. 
All HA1 domain sequences used for our analysis came from the public domain and are available from the Influenza Research Database (<ext-link ext-link-type="uri" xlink:href="http://www.fludb.org">www.fludb.org</ext-link> (<xref ref-type="bibr" rid="bib33">Squires et al., 2012</xref>)). Next, we built maximum likelihood trees using fasttree (<xref ref-type="bibr" rid="bib27">Price et al., 2009</xref>), collapsed zero-length branches into polytomies, and ranked external and internal nodes using the LBI. We set the memory time scale to <inline-formula><mml:math id="inf57"><mml:mrow><mml:mi>τ</mml:mi><mml:mo>=</mml:mo><mml:mn>0.0625</mml:mn></mml:mrow></mml:math></inline-formula> in units of average pairwise distance as suggested by the simulation data. Details of the data sets used for making predictions and discussion of potential biases are given in ‘Materials and methods’. <xref ref-type="fig" rid="fig4">Figure 4A,B</xref> show example trees of the prediction and test sets for 2007.<fig-group><fig id="fig4" position="float"><object-id pub-id-type="doi">10.7554/eLife.03568.009</object-id><label>Figure 4.</label><caption><title>Predicting the evolution of seasonal influenza A/H3N2 viruses.</title><p>(<bold>A</bold>) A genealogical tree of a sample of HA1 sequences from May 2006 to end of February 2007. Nodes are colored according to our fitness ranking <inline-formula><mml:math id="inf58"><mml:mrow><mml:msub><mml:mi>λ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>τ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>. The highest ranked node is marked by a black arrow. (<bold>B</bold>) A tree of the same sequences from (<bold>A</bold>) (colored) and sequences from October 2007 to end of March 2008 (in grey). Our algorithm successfully predicts a sequence genetically close and directly ancestral to viruses circulating the following winter. 
(<bold>C</bold>) For each year from 1995 to 2013 we predicted a progenitor sequence and calculated its nucleotide distance to the A/H3N2 population of the following winter. Predictions based on terminal or internal sequences are very similar. The figure shows the average <inline-formula><mml:math id="inf59"><mml:mrow><mml:mtext>Δ</mml:mtext><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtext>prediction</mml:mtext></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> of 50 runs using subsamples of the data. A random pick from the prediction set corresponds to the solid line at 1. The dashed lines indicate the optimal extant sequence at time of prediction. The distance of the dashed line from the line at 1 indicates the closeness of the optimal extant sequence to future populations.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03568.009">http://dx.doi.org/10.7554/eLife.03568.009</ext-link></p></caption><graphic xlink:href="elife03568f004"/></fig><fig id="fig4s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.03568.010</object-id><label>Figure 4—figure supplement 1.</label><caption><title>Variation of predictions upon variation of the memory time scale of the LBI <inline-formula><mml:math id="inf205"><mml:mrow><mml:msub><mml:mi>λ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>τ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>.</title><p>Each year shows two lines–one each for internal and external nodes–that show the variation of the prediction as <italic>τ</italic> varies from <inline-formula><mml:math id="inf191"><mml:mrow><mml:msup><mml:mn>2</mml:mn><mml:mrow><mml:mo>−</mml:mo><mml:mn>6</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> to 4 in multiples of 2.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" 
xlink:href="10.7554/eLife.03568.010">http://dx.doi.org/10.7554/eLife.03568.010</ext-link></p></caption><graphic xlink:href="elife03568fs004"/></fig><fig id="fig4s2" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.03568.011</object-id><label>Figure 4—figure supplement 2.</label><caption><title>Comparison to predictions by <xref ref-type="bibr" rid="bib18">Łuksza and Lässig (2014)</xref>.</title><p>In many years, choosing the sequence with the highest LBI results in a very similar sequence to that predicted by Łuksza and Lässig (2014). In some years the LBI resulted in a pick closer to the future, in other years the sequence predicted by Łuksza and Lässig (2014) was a better choice. Łuksza and Lässig aimed at minimizing amino-acid distance at epitope positions, rather than nucleotide distance as we do here. The two measures are strongly correlated, but nucleotide distance has better resolution and is hence used here.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03568.011">http://dx.doi.org/10.7554/eLife.03568.011</ext-link></p></caption><graphic xlink:href="elife03568fs005"/></fig><fig id="fig4s3" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.03568.012</object-id><label>Figure 4—figure supplement 3.</label><caption><title>High LBI predicts clade expansion.</title><p>Each dot corresponds to one clade with less than 75% frequency in a sample of sequences from May to February of year <italic>t</italic>. The excess of points in the upper right corner shows that high LBI is predictive of clade expansion. The <italic>x</italic>-axis shows its rank according to the LBI in this year, normalized to the interval <inline-formula><mml:math id="inf192"><mml:mrow><mml:mo>[</mml:mo><mml:mn>0,1</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>. 
The <italic>y</italic>-axis shows the rank according to clade growth measured as the ratio of frequency of this clade in year <inline-formula><mml:math id="inf193"><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:math></inline-formula> and year <italic>t</italic>. Again, ranking is done on a yearly basis and normalized to the interval <inline-formula><mml:math id="inf194"><mml:mrow><mml:mo>[</mml:mo><mml:mn>0,1</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>. This plot contains data from years 2003–2013 for which there are sufficiently many sequences to calculate meaningful clade frequencies. The points in the lower half of the plot correspond to all clades that do not continue into the next year.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03568.012">http://dx.doi.org/10.7554/eLife.03568.012</ext-link></p></caption><graphic xlink:href="elife03568fs006"/></fig></fig-group></p><p><xref ref-type="fig" rid="fig4">Figure 4C</xref> shows the nucleotide distance of our prediction to the A/H3N2 virus population of the next season, both for the top-ranked internal and external node of each year. Using the highest ranked external node (<xref ref-type="fig" rid="fig4">Figure 4C</xref>, black squares) is similar to using the highest ranked internal node (<xref ref-type="fig" rid="fig4">Figure 4C</xref>, red diamonds) in all years but 1997. The highest ranked internal nodes predict years 1997–1999, 2003, 2006–2009, and 2013 reasonably well. Notably, they fail in 1995, 1996, and 2002, while being of intermediate accuracy in the remaining years. The dependence of the prediction accuracy on the neighborhood size <italic>τ</italic> is shown in <xref ref-type="fig" rid="fig4s1">Figure 4—figure supplement 1</xref>. 
We also predicted successful progenitor strains using the fitness inference based on the SBD model, which yields results very similar to the ranking by LBI–sometimes slightly better, sometimes worse depending on parameter choice.</p><p>We compared our predictions to vaccine strain predictions obtained by <xref ref-type="bibr" rid="bib18">Łuksza and Lässig (2014)</xref> who predict progenitors of future epidemics as we do here, albeit using an influenza-specific model with four parameters, two of which are trained for each individual prediction on data from several preceding years. On average, using the same time cutoffs for prediction (February to predict October) as we used above, Łuksza and Lässig achieve an accuracy comparable to that of our parameter-free ranking (see <xref ref-type="fig" rid="fig4s2">Figure 4—figure supplement 2</xref>). Interestingly, these two rather different approaches yield very similar predictions on a year to year basis. One potential explanation for this concordance is an ad hoc aspect of Łuksza and Lässig's model meant to capture epistatic interactions: the total number of synonymous mutations downstream of each clade is used as an additional predictor. 
The number of synonymous mutations is strongly correlated with tree length and hence with <inline-formula><mml:math id="inf60"><mml:mrow><mml:msub><mml:mi>λ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>τ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>.</p><p>To quantify prediction quality across years, we define the distance measure <inline-formula><mml:math id="inf61"><mml:mrow><mml:mi>d</mml:mi><mml:mo>=</mml:mo><mml:mo>(</mml:mo><mml:mtext>Δ</mml:mtext><mml:mo>(</mml:mo><mml:mtext>prediction</mml:mtext><mml:mo>)</mml:mo><mml:mo>−</mml:mo><mml:mtext>Δ</mml:mtext><mml:mo>(</mml:mo><mml:mtext>minimal</mml:mtext><mml:mo>)</mml:mo><mml:mo>)</mml:mo><mml:mo>/</mml:mo><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:mtext>Δ</mml:mtext><mml:mo>(</mml:mo><mml:mtext>minimal</mml:mtext><mml:mo>)</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> such that an optimal prediction has <inline-formula><mml:math id="inf62"><mml:mrow><mml:mi>d</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula> and a random pick has <inline-formula><mml:math id="inf63"><mml:mrow><mml:mi>d</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:math></inline-formula>. The average of <italic>d</italic> over all years is denoted by <inline-formula><mml:math id="inf64"><mml:mrow><mml:mover accent="true"><mml:mi>d</mml:mi><mml:mo>¯</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula>. 
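To make this normalization concrete, the following Python sketch evaluates <italic>d</italic> for a few made-up years and averages it over years resampled with replacement. The numbers, the function names <monospace>normalized_distance</monospace> and <monospace>bootstrap_means</monospace>, and the resampling helper are hypothetical illustrations under our own assumptions, not data or code from this study. <preformat>
```python
import random


def normalized_distance(delta_prediction, delta_minimal):
    """d = (Delta(prediction) - Delta(minimal)) / (1 - Delta(minimal)).

    Both inputs are assumed to be scaled so that a random pick
    from the prediction set has Delta = 1.
    """
    return (delta_prediction - delta_minimal) / (1.0 - delta_minimal)


def bootstrap_means(d_values, n_boot=1000, seed=1):
    """Averages of d over years resampled with replacement."""
    rng = random.Random(seed)
    n = len(d_values)
    return [sum(rng.choice(d_values) for _ in range(n)) / n
            for _ in range(n_boot)]


# Hypothetical (Delta(prediction), Delta(minimal)) pairs, one per year.
toy_years = [(0.55, 0.40), (0.90, 0.50), (0.45, 0.45)]
d_per_year = [normalized_distance(p, m) for p, m in toy_years]

# Average of d over all years, and its spread under resampling of years.
d_bar = sum(d_per_year) / len(d_per_year)
boot = bootstrap_means(d_per_year)
```
</preformat> In this toy example the third year is an optimal pick and contributes 0 to the average, while a prediction no better than a random pick would contribute 1. 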
<xref ref-type="fig" rid="fig5">Figure 5</xref> shows bootstrap distributions of <inline-formula><mml:math id="inf65"><mml:mrow><mml:mover accent="true"><mml:mi>d</mml:mi><mml:mo>¯</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula> for our methods and compares them to those of <xref ref-type="bibr" rid="bib18">Łuksza and Lässig (2014)</xref> as well as two naive prediction methods: (i) a growth rate estimate of individual clades obtained by fitting an exponential curve to the fraction of the total sequences that are part of this clade in three time intervals between May and February, and (ii) the sequence of the most advanced node in a ladderized tree. Predictions with the method described here and by <xref ref-type="bibr" rid="bib18">Łuksza and Lässig (2014)</xref> are comparable within error bars, while the two naive estimators do substantially worse on average. The dependence of the average predictive power of the LBI on the neighborhood size <italic>τ</italic> is shown in <xref ref-type="fig" rid="fig5s1">Figure 5—figure supplement 1</xref>.<fig-group><fig id="fig5" position="float"><object-id pub-id-type="doi">10.7554/eLife.03568.013</object-id><label>Figure 5.</label><caption><title>Comparison of predictors.</title><p>Transformed genetic distance <inline-formula><mml:math id="inf66"><mml:mrow><mml:mover accent="true"><mml:mi>d</mml:mi><mml:mo>¯</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula> averaged over 1000 bootstrap samples (bootstrapping years) to the next influenza season. 
We compared our method using the sequence of the top ranked internal node, external node, the predictions by <xref ref-type="bibr" rid="bib18">Łuksza and Lässig (2014)</xref>, the ancestral sequence of clades with the largest estimated growth rate, and the sequence of the most ‘advanced’ node in a ladderized tree.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03568.013">http://dx.doi.org/10.7554/eLife.03568.013</ext-link></p></caption><graphic xlink:href="elife03568f005"/></fig><fig id="fig5s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.03568.014</object-id><label>Figure 5—figure supplement 1.</label><caption><title>Dependence of prediction accuracy on τ.</title><p>Predictions for influenza virus A/H3N2 based on the LBI improve with increasing memory time scale <italic>τ</italic>. Prediction accuracy is assessed as nucleotide distance to the future sample scaled such that the optimal pick has <inline-formula><mml:math id="inf195"><mml:mrow><mml:mi>d</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula> and a random pick has <inline-formula><mml:math id="inf196"><mml:mrow><mml:mi>d</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:math></inline-formula>, averaged over 50 repeated predictions per year on different subsamples of the data (at most 100 sequences from Asia and North America, 70% of the available data in cases where fewer than 100 sequences are available). 
The figure shows the average of <italic>d</italic> over years 1995–2013; the accuracy of predictions by Łuksza and Lässig (2014) is shown as a black line; the value of <italic>τ</italic> used in the remainder of the manuscript is indicated by the dashed vertical line.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03568.014">http://dx.doi.org/10.7554/eLife.03568.014</ext-link></p></caption><graphic xlink:href="elife03568fs007"/></fig></fig-group></p></sec><sec id="s2-6"><title>Inferred fitness increases are associated with epitope mutations</title><p>Changes in fitness along branches can be associated with the types of mutations on those branches. We found that branches corresponding to the top quartile of differentials of <inline-formula><mml:math id="inf67"><mml:mrow><mml:msub><mml:mi>λ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>τ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> are enriched for non-synonymous substitutions over synonymous mutations. Restricting non-synonymous mutations to the epitopes A–D (used in (<xref ref-type="bibr" rid="bib18">Łuksza and Lässig, 2014</xref>) and defined in (<xref ref-type="bibr" rid="bib31">Shih et al., 2007</xref>)) increases this enrichment to approximately 2-fold, see <xref ref-type="table" rid="tbl1">Table 1</xref>. Further restriction to the 7 loci identified by Koel et al. increases the enrichment slightly, but their number is small and the power to detect additional enrichment is low. 
These findings are consistent with the notion that influenza evolution is driven by antigenic novelty (<xref ref-type="bibr" rid="bib36">Wiley et al., 1981</xref>; <xref ref-type="bibr" rid="bib14">Hampson, 2002</xref>; <xref ref-type="bibr" rid="bib32">Smith et al., 2004</xref>) and provide independent confirmation of the power of the sequence ranking and fitness inference algorithm.<table-wrap id="tbl1" position="float"><object-id pub-id-type="doi">10.7554/eLife.03568.015</object-id><label>Table 1.</label><caption><p>Non-synonymous mutations at epitopes correlate with increasing fitness</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03568.015">http://dx.doi.org/10.7554/eLife.03568.015</ext-link></p></caption><table frame="hsides" rules="groups"><thead><tr><th>Quartile</th><th># non-syn</th><th># syn</th><th># epi</th><th># Koel</th></tr></thead><tbody><tr><td>25</td><td>130</td><td>155</td><td>43</td><td>7</td></tr><tr><td>50</td><td>159</td><td>178</td><td>57</td><td>10</td></tr><tr><td>75</td><td>184</td><td>205</td><td>74</td><td>21</td></tr><tr><td>100</td><td>209</td><td>222</td><td>115</td><td>22</td></tr><tr><td>total</td><td>682</td><td>760</td><td>289</td><td>60</td></tr></tbody></table><table frame="hsides" rules="groups"><thead><tr><th>Comparison</th><th>enrichment</th><th>p-value</th></tr></thead><tbody><tr><td>non-syn vs syn</td><td>1.12</td><td>n.s.</td></tr><tr><td>epi vs syn</td><td>1.9</td><td>0.002</td></tr><tr><td>Koel vs syn</td><td>2.2</td><td>0.08</td></tr><tr><td>epi vs non-syn</td><td>1.7</td><td>0.015</td></tr><tr><td>Koel vs non-syn</td><td>2.0</td><td>n.s.</td></tr></tbody></table><table-wrap-foot><fn><p>For each tree constructed for the years 1995–2013, we calculated the increment in λ<sub><italic>i</italic></sub> (<italic>τ</italic>) with <italic>τ</italic> = 0.0625 along each branch and determined the likely mutations on each branch. 
Branches were then sorted into quartiles according to changes in λ<sub><italic>i</italic></sub> (<italic>τ</italic>). The left table shows the counts of non-synonymous (non-syn), synonymous (syn), non-synonymous mutations at epitope sites (epi) and non-synonymous mutations at Koel positions (Koel) for branches in different quartiles. The right table quantifies the enrichment of certain types of mutations on branches in the top quartile relative to the bottom quartile. Non-synonymous mutations at epitopes and Koel positions are approximately twofold enriched relative to synonymous mutations. Enrichment (odds ratio) and p-values were obtained using the Fisher exact test as implemented in scipy.stats (<xref ref-type="bibr" rid="bib25">Oliphant, 2007</xref>).</p></fn></table-wrap-foot></table-wrap></p></sec></sec><sec id="s3" sec-type="discussion"><title>Discussion</title><p>Starting with a model of adaptive evolution, we developed a probabilistic description of the fitness dynamics on genealogical trees and presented an algorithm to infer fitness of individual nodes in the tree. We validated this algorithm using trees reconstructed from simulated sequences and showed that the sequence with the highest inferred fitness tends to be a close match to the progenitor of future populations. Analysis of the model revealed that a simple quantity–the local branching index (LBI)–determines the fitness estimates and can be used to rank sequences by fitness with accuracy similar to that of the full fitness inference algorithm. The only parameter of the LBI is the size of the neighborhood on the tree and a suitable value can be chosen from simulated data.</p><p>Our fitness inference framework is based on the selection-biased diffusion model that assumes evolution proceeds via accumulation of many small effect mutations. As expected, its predictive power increases with increasing level of non-neutral genetic diversity (<xref ref-type="fig" rid="fig2">Figure 2C</xref>). 
However, predictive power is retained down to rather low pairwise distances, see <xref ref-type="fig" rid="fig2s1">Figure 2—figure supplement 1</xref>, where the model is a poor approximation. This suggests that the relationship between fitness and the structure of genealogical trees is more universal than the specific details of the mutation effect distribution that drive evolutionary dynamics (<xref ref-type="bibr" rid="bib21">Neher and Hallatschek, 2013</xref>). The essence of this relationship between fitness and tree shape is picked up by the LBI. When applied to influenza A/H3N2 virus sequences, a ranking by LBI predicts progenitor lineages with high accuracy.</p><p>One of the dominant paradigms for influenza A/H3N2 virus evolution has been the exploration of ‘neutral’ networks, punctuated by bursts of rapid adaptation through large effect mutations (<xref ref-type="bibr" rid="bib16">Koelle et al., 2006</xref>; <xref ref-type="bibr" rid="bib24">Nimwegen et al., 1999</xref>). In contrast, our ability to make meaningful predictions from the shape of genealogical trees of influenza virus sequences suggests that fitness variation persists in A/H3N2 populations. Fitness in the context of seasonal influenza viruses includes antigenic evolution as well as compensatory and deleterious mutations–within HA and other segments–that may contribute to fitness variation, shape the genealogies, and be determinants of future success. This conclusion is consistent with other existing evidence for ubiquitous selection in A/H3N2 populations (<xref ref-type="bibr" rid="bib2">Bhatt et al., 2011</xref>; <xref ref-type="bibr" rid="bib34">Strelkowa and Lässig, 2012</xref>). The applicability of our fitness inference scheme and the LBI ranking is further supported by the substantial enrichment in the number of non-synonymous substitutions at epitope loci in the lineages with predicted high relative fitness. 
These epitopes historically have high <inline-formula><mml:math id="inf71"><mml:mrow><mml:mi>d</mml:mi><mml:mi>n</mml:mi><mml:mo>/</mml:mo><mml:mi>d</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:math></inline-formula>, suggesting positive selection. Our model is agnostic to sequence and protein structure but nevertheless associates branches containing these mutations with increasing fitness.</p><p>It is also clear that large effect mutations, such as the ones associated with antigenic cluster transitions (<xref ref-type="bibr" rid="bib15">Koel et al., 2013</xref>) can play an important role in the evolution of human seasonal influenza viruses. Many of the years in which our predictions are suboptimal (e.g., 1995, 2002, and 2004) correspond to antigenic cluster transitions in which antigenic properties changed drastically via specific large effect mutations. We tried to improve predictions by assigning additional positive fitness increments to substitutions at those loci identified by Koel et al. While this did improve results in some years, it also resulted in false positives which erased the overall improvement in predictive power. In some years in which these mutations are important, they tend to occur on many genetic backgrounds. This could explain why these mutations by themselves are not very predictive in our framework.</p><p>The fact that the branching patterns of reconstructed influenza A/H3N2 trees are predictive is surprising. In addition to occasional large effect mutations, e.g. those that cause substantial antigenic change, confounders such as the heterogeneity of sampling, complicated migration patterns, and demographic substructure should hamper prediction. The insensitivity to local oversampling is expected from the structure of our algorithm which senses the total length of subtrees (rather than the number of leaves). Local oversampling will add many very short branches that perturb the total tree length only slightly. 
Subpopulations of different size, seasonality, and migration patterns, however, will perturb the coalescence patterns in parts of the reconstructed tree and should decrease predictability. Successful prediction therefore reinforces the conclusion that circulating influenza A/H3N2 populations harbor fitness variation. On the other hand, predictions might be improved by combining the shape of genealogical trees with antigenic information (<xref ref-type="bibr" rid="bib1">Bedford et al., 2014</xref>), biophysical and structural knowledge (<xref ref-type="bibr" rid="bib15">Koel et al., 2013</xref>), patterns of past evolution (<xref ref-type="bibr" rid="bib18">Łuksza and Lässig, 2014</xref>), and plausible geographic sources (<xref ref-type="bibr" rid="bib30">Russell et al., 2008</xref>; <xref ref-type="bibr" rid="bib17">Lemey et al., 2014</xref>). However, each of these refinements introduces additional parameters into the model that need to be trained if not known a priori.</p><p>A defining feature of our method to predict evolution is that it can operate on a static set of sequences from a single time point and does not require historical data. We use historical data for influenza A/H3N2 only to validate the predictions. In <xref ref-type="fig" rid="fig5">Figure 5</xref>, we compare our results to a method that explicitly uses historical data (available for influenza A/H3N2) to identify low frequency but expanding clades. By extrapolating their expansion into the future, one can anticipate the dominant strains of next year. 
Interestingly, we found that prediction based on the reconstructed genealogy not only captures similar information, but also performs comparably if not better, even without access to historical data.</p><p>In summary, we have shown that the shape of reconstructed genealogies holds information about the relative fitness of the sampled individuals that can be exploited to predict the genetic composition of future populations, at least when fitness differences depend on multiple mutations. Since our algorithm requires nothing but a reconstructed genealogy as input, it should be applicable in many scenarios ranging from RNA viruses to cancer cell populations.</p></sec><sec id="s4" sec-type="materials|methods"><title>Materials and methods</title><sec id="s4-1"><title>Derivation of the fitness inference algorithm</title><p>Our algorithm is based on a branching process approximation to replicating clones within a finite population. Here, we first show how we use this approximation to calculate the probability that offspring of an individual with a certain fitness are sampled. From there, we derive an equation for the branch propagators, which we solve numerically, and combine the propagators into the expression for the posterior fitness distribution given in <xref ref-type="disp-formula" rid="equ1">Equation (1)</xref>.</p></sec><sec id="s4-2"><title>Offspring number distributions</title><p>The quantitative probabilistic description of clonal propagation is provided by the distribution <inline-formula><mml:math id="inf72"><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> of the number of offspring <italic>n</italic> after time <italic>t</italic> given the ancestor had fitness <italic>x</italic>. 
Using a ‘1st-step’ equation, that is, writing an equation for infinitesimal changes at the initial point <inline-formula><mml:math id="inf73"><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, we find for the backwards master equation for <inline-formula><mml:math id="inf74"><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula><disp-formula id="equ2"><label>(2)</label><mml:math id="m2"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>+</mml:mo><mml:mtext>Δ</mml:mtext><mml:mi>t</mml:mi><mml:mi>v</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mtext>Δ</mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo></mml:mtd><mml:mtd><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:mtext>Δ</mml:mtext><mml:mi>t</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>2</mml:mn><mml:mo>+</mml:mo><mml:mi>x</mml:mi><mml:mo>+</mml:mo><mml:mi>u</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mtext> </mml:mtext><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> 
</mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mtext>Δ</mml:mtext><mml:mi>t</mml:mi><mml:mo>〈</mml:mo><mml:mi>u</mml:mi><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>+</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>〉</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mo>+</mml:mo><mml:mtext>Δ</mml:mtext><mml:mi>t</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>+</mml:mo><mml:mi>x</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:munderover><mml:mstyle displaystyle="true"><mml:mo>∑</mml:mo></mml:mstyle><mml:mrow><mml:mi>n</mml:mi><mml:mo>′</mml:mo><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>−</mml:mo><mml:mi>n</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>where the death rate is set to one and the birth rate is given by <inline-formula><mml:math id="inf75"><mml:mrow><mml:mn>1</mml:mn><mml:mo>+</mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:math></inline-formula> (see also (<xref ref-type="bibr" rid="bib21">Neher and Hallatschek, 2013</xref>)). 
The first term corresponds to the probability of nothing happening in the time interval <inline-formula><mml:math id="inf76"><mml:mrow><mml:mtext>Δ</mml:mtext><mml:mi>t</mml:mi></mml:mrow></mml:math></inline-formula> and the second term in <inline-formula><mml:math id="inf77"><mml:mrow><mml:mo>〈</mml:mo><mml:mo>·</mml:mo><mml:mo>〉</mml:mo></mml:mrow></mml:math></inline-formula> corresponds to mutations averaged over the distribution <inline-formula><mml:math id="inf78"><mml:mrow><mml:mi>μ</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>s</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> of possible fitness effects <italic>s</italic> with the total mutation rate given by <inline-formula><mml:math id="inf79"><mml:mrow><mml:mi>u</mml:mi><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mo>∫</mml:mo></mml:mstyle><mml:mtext> </mml:mtext><mml:mi>d</mml:mi><mml:mi>s</mml:mi><mml:mtext> </mml:mtext><mml:mi>μ</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>s</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>. The last term corresponds to replication of the individual. At the earlier time point <inline-formula><mml:math id="inf80"><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mtext>Δ</mml:mtext><mml:mi>t</mml:mi></mml:mrow></mml:math></inline-formula>, fitness <italic>x</italic> was larger by <inline-formula><mml:math id="inf81"><mml:mrow><mml:mtext>Δ</mml:mtext><mml:mi>t</mml:mi><mml:mi>v</mml:mi></mml:mrow></mml:math></inline-formula> due to the deterioration of the environment with velocity <italic>v</italic>. So far, this equation holds for an arbitrary distribution of fitness effects. To make analytical progress, we assume that the distribution of mutational effects is short-tailed (exponential or steeper) and that the total mutation rate <italic>u</italic> is large compared to the typical effect. 
In this case, <xref ref-type="disp-formula" rid="equ2">Equation (2)</xref> can be rearranged into a differential equation where mutations are captured by the mean mutational effect and the mutational variance (<xref ref-type="bibr" rid="bib35">Tsimring et al., 1996</xref>; <xref ref-type="bibr" rid="bib5">Cohen et al., 2005</xref>; <xref ref-type="bibr" rid="bib21">Neher and Hallatschek, 2013</xref>).<disp-formula id="equ3"><label>(3)</label><mml:math id="m3"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mi>v</mml:mi><mml:mfrac><mml:mrow><mml:mo>∂</mml:mo><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:mfrac><mml:mo>+</mml:mo><mml:mfrac><mml:mrow><mml:mo>∂</mml:mo><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mfrac><mml:mo>=</mml:mo></mml:mtd><mml:mtd><mml:mo>−</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>2</mml:mn><mml:mo>+</mml:mo><mml:mi>x</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>u</mml:mi><mml:mo>〈</mml:mo><mml:mi>s</mml:mi><mml:mo>〉</mml:mo><mml:mfrac><mml:mrow><mml:mo>∂</mml:mo><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> 
</mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:mfrac><mml:mo>+</mml:mo><mml:mfrac><mml:mrow><mml:mi>u</mml:mi><mml:mo>〈</mml:mo><mml:msup><mml:mi>s</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mo>〉</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:mfrac><mml:mfrac><mml:mrow><mml:msup><mml:mo>∂</mml:mo><mml:mn>2</mml:mn></mml:msup><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:msup><mml:mi>x</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:mfrac></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mo>+</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>+</mml:mo><mml:mi>x</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:munderover><mml:mstyle displaystyle="true"><mml:mo>∑</mml:mo></mml:mstyle><mml:mrow><mml:mi>n</mml:mi><mml:mo>′</mml:mo><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>−</mml:mo><mml:mi>n</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p><p>The second term on the right hand side corresponds to the directional effect of mutations on fitness, while the third term to the diffusive dynamics of fitness due to mutations. 
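</p><p>The step from <xref ref-type="disp-formula" rid="equ2">Equation (2)</xref> to <xref ref-type="disp-formula" rid="equ3">Equation (3)</xref> can be written out explicitly. As a sketch, assuming that the angle brackets denote integration against <italic>μ</italic>(<italic>s</italic>), so that <italic>u</italic>〈<italic>s</italic><sup><italic>k</italic></sup>〉 is the integral of <italic>μ</italic>(<italic>s</italic>)<italic>s</italic><sup><italic>k</italic></sup> over <italic>s</italic> (our reading of the notation, which the text does not state explicitly), one expands the left hand side to first order in Δ<italic>t</italic> and the mutation term to second order in <italic>s</italic>:</p><preformat><![CDATA[
```latex
\begin{align}
P(n|x+\Delta t\,v,\;t+\Delta t) &\approx P(n|x,t)
   + \Delta t\left[v\,\frac{\partial P(n|x,t)}{\partial x}
   + \frac{\partial P(n|x,t)}{\partial t}\right], \\
\langle u\,P(n|x+s,t)\rangle &= \int ds\,\mu(s)\,P(n|x+s,t) \\
 &\approx u\,P(n|x,t)
   + u\langle s\rangle\,\frac{\partial P(n|x,t)}{\partial x}
   + \frac{u\langle s^2\rangle}{2}\,\frac{\partial^2 P(n|x,t)}{\partial x^2}.
\end{align}
```
]]></preformat><p>Substituting both expansions into <xref ref-type="disp-formula" rid="equ2">Equation (2)</xref>, the term <italic>uP</italic> cancels against the contribution −Δ<italic>t</italic> <italic>uP</italic> of the first bracket, and dividing by Δ<italic>t</italic> leaves <xref ref-type="disp-formula" rid="equ3">Equation (3)</xref>. 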
To further analyze the behavior of <inline-formula><mml:math id="inf82"><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, it is useful to consider the generating function <inline-formula><mml:math id="inf83"><mml:mrow><mml:msub><mml:mi>ψ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:msub><mml:mo>∑</mml:mo><mml:mi>n</mml:mi></mml:msub><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:mi>ω</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mi>n</mml:mi></mml:msup></mml:mrow></mml:mstyle><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, which obeys<disp-formula id="equ4"><label>(4)</label><mml:math id="m4"><mml:mrow><mml:mfrac><mml:mrow><mml:mo>∂</mml:mo><mml:msub><mml:mi>ψ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mo>−</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>2</mml:mn><mml:mo>+</mml:mo><mml:mi>x</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mi>ψ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> 
</mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>u</mml:mi><mml:mo>〈</mml:mo><mml:mi>s</mml:mi><mml:mo>〉</mml:mo><mml:mo>−</mml:mo><mml:mi>v</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mfrac><mml:mrow><mml:mo>∂</mml:mo><mml:msub><mml:mi>ψ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:mfrac><mml:mo>+</mml:mo><mml:mfrac><mml:mrow><mml:mi>u</mml:mi><mml:mo>〈</mml:mo><mml:msup><mml:mi>s</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mo>〉</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:mfrac><mml:mfrac><mml:mrow><mml:msup><mml:mo>∂</mml:mo><mml:mn>2</mml:mn></mml:msup><mml:msub><mml:mi>ψ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:msup><mml:mi>x</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:mfrac><mml:mo>+</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>+</mml:mo><mml:mi>x</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:msubsup><mml:mi>ψ</mml:mi><mml:mi>ω</mml:mi><mml:mn>2</mml:mn></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula></p><p>Defining <inline-formula><mml:math id="inf84"><mml:mrow><mml:msub><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> 
</mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:msub><mml:mi>ψ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, the fitness diffusion constant <inline-formula><mml:math id="inf85"><mml:mrow><mml:mi>D</mml:mi><mml:mo>=</mml:mo><mml:mi>u</mml:mi><mml:mo>〈</mml:mo><mml:msup><mml:mi>s</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mo>〉</mml:mo><mml:mo>/</mml:mo><mml:mn>2</mml:mn></mml:mrow></mml:math></inline-formula>, and the variance in fitness <inline-formula><mml:math id="inf86"><mml:mrow><mml:msup><mml:mi>σ</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mo>=</mml:mo><mml:mi>v</mml:mi><mml:mo>−</mml:mo><mml:mi>u</mml:mi><mml:mo>〈</mml:mo><mml:mi>s</mml:mi><mml:mo>〉</mml:mo></mml:mrow></mml:math></inline-formula>, we have<disp-formula id="equ5"><label>(5)</label><mml:math id="m5"><mml:mrow><mml:mfrac><mml:mrow><mml:mo>∂</mml:mo><mml:msub><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mi>x</mml:mi><mml:msub><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>−</mml:mo><mml:msup><mml:mi>σ</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mfrac><mml:mrow><mml:mo>∂</mml:mo><mml:msub><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> 
</mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:mfrac><mml:mo>+</mml:mo><mml:mi>D</mml:mi><mml:mfrac><mml:mrow><mml:msup><mml:mo>∂</mml:mo><mml:mn>2</mml:mn></mml:msup><mml:msub><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:msup><mml:mi>x</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:mfrac><mml:mo>−</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>+</mml:mo><mml:mi>x</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:msubsup><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi><mml:mn>2</mml:mn></mml:msubsup><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>with initial condition <inline-formula><mml:math id="inf87"><mml:mrow><mml:msub><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mn>0</mml:mn></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>ω</mml:mi></mml:mrow></mml:math></inline-formula>. This equation for the generating function can be solved numerically or analytically in limiting cases. 
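</p><p>One such limiting case can be checked directly. As an illustration only, assume that fitness is held fixed along the lineage, that is <italic>σ</italic><sup>2</sup> = 0 and <italic>D</italic> = 0 (an assumption made for this sketch, not a regime analyzed in the text). <xref ref-type="disp-formula" rid="equ5">Equation (5)</xref> then reduces to a logistic ordinary differential equation with a closed-form solution. The sketch below, in plain Python with hypothetical function names, compares that closed form against a fourth-order Runge–Kutta integration of the same equation.</p><preformat preformat-type="code"><![CDATA[
```python
import math


def phi_closed_form(x, omega, t):
    """Closed-form solution of d(phi)/dt = x*phi - (1 + x)*phi**2 with phi(0) = omega.

    Valid for nonzero x; this is Equation (5) with sigma^2 = 0 and D = 0,
    a simplification assumed for this sketch.
    """
    growth = math.exp(x * t)
    return x * omega * growth / (x + (1.0 + x) * omega * (growth - 1.0))


def phi_rk4(x, omega, t, steps=10000):
    """Integrate the same equation with a classical Runge-Kutta scheme."""

    def rhs(phi):
        return x * phi - (1.0 + x) * phi * phi

    h = t / steps
    phi = omega
    for _ in range(steps):
        k1 = rhs(phi)
        k2 = rhs(phi + 0.5 * h * k1)
        k3 = rhs(phi + 0.5 * h * k2)
        k4 = rhs(phi + h * k3)
        phi += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return phi


# Example values chosen for illustration: fitness x = 0.1, sampling fraction 0.01.
x, omega = 0.1, 0.01
numeric = phi_rk4(x, omega, 50.0)
exact = phi_closed_form(x, omega, 50.0)
print(numeric, exact)
```
]]></preformat><p>In this simplified setting and for <italic>x</italic> &gt; 0, the solution saturates at <italic>x</italic>/(1 + <italic>x</italic>) at long times, independent of the sampling fraction <italic>ω</italic>. 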
To approximate the fitness distribution on a given tree, we will solve this equation numerically.</p><p>It is also useful to introduce the ‘reproductive value’ <inline-formula><mml:math id="inf88"><mml:mrow><mml:mi>R</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, defined as the expected number of offspring of a genotype with fitness <italic>x</italic> after <italic>t</italic> generations, <inline-formula><mml:math id="inf89"><mml:mrow><mml:mi>R</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mstyle displaystyle="true"><mml:mo>∑</mml:mo></mml:mstyle><mml:mi>n</mml:mi></mml:msub><mml:mrow><mml:mi>n</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>. From the definition of the generating function it follows that <inline-formula><mml:math id="inf90"><mml:mrow><mml:mi>R</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mo>∂</mml:mo><mml:mi>ω</mml:mi></mml:msub><mml:msub><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mo>|</mml:mo><mml:mrow><mml:mi>ω</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>. Differentiating <xref ref-type="disp-formula" rid="equ5">Equation (5)</xref> w.r.t. 
<italic>ω</italic> and noting that <inline-formula><mml:math id="inf91"><mml:mrow><mml:msub><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mo>|</mml:mo><mml:mrow><mml:mi>ω</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula> yields a linear equation for <inline-formula><mml:math id="inf92"><mml:mrow><mml:mi>R</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> (essentially <xref ref-type="disp-formula" rid="equ5">Equation (5)</xref> without the term <inline-formula><mml:math id="inf93"><mml:mrow><mml:msup><mml:mi>ϕ</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>) which can be readily integrated. 
The expected number of offspring of one individual after time <italic>t</italic> given it initially had fitness <italic>x</italic> is<disp-formula id="equ6"><label>(6)</label><mml:math id="m6"><mml:mrow><mml:mi>R</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mi>x</mml:mi><mml:mi>t</mml:mi><mml:mo>−</mml:mo><mml:mfrac><mml:mrow><mml:msup><mml:mi>σ</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:msup><mml:mi>t</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:mrow><mml:mn>2</mml:mn></mml:mfrac><mml:mo>+</mml:mo><mml:mfrac><mml:mrow><mml:mi>D</mml:mi><mml:msup><mml:mi>t</mml:mi><mml:mn>3</mml:mn></mml:msup></mml:mrow><mml:mn>3</mml:mn></mml:mfrac></mml:mrow></mml:msup></mml:mrow></mml:math></disp-formula></p><p>This approximation is only valid for times short compared to the coalescence time <inline-formula><mml:math id="inf94"><mml:mrow><mml:msub><mml:mi>T</mml:mi><mml:mi>c</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, but it offers important insight into the dynamics of lineages: Initially, the lineage grows into a clone with rate <italic>x</italic>. The second term in the exponent describes how this growth slows since the remainder of the population is adapting with rate <inline-formula><mml:math id="inf95"><mml:mrow><mml:msup><mml:mi>σ</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>. 
The last term accounts for the fact that the offspring we consider can themselves change in fitness through mutations, the action of which is captured by the fitness diffusion constant <italic>D</italic>.</p></sec><sec id="s4-3"><title>Lineage sampling probability</title><p>The generating function <inline-formula><mml:math id="inf96"><mml:mrow><mml:msub><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> derived above has the interpretation of the probability that a lineage is represented in a sample of size <italic>M</italic> from a population of size <italic>N</italic> with <inline-formula><mml:math id="inf97"><mml:mrow><mml:mi>ω</mml:mi><mml:mo>=</mml:mo><mml:mi>M</mml:mi><mml:mo>/</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:math></inline-formula>. From its definition, we have<disp-formula id="equ7"><label>(7)</label><mml:math id="m7"><mml:mrow><mml:msub><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:munderover><mml:mstyle displaystyle="true"><mml:mo>∑</mml:mo></mml:mstyle><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow><mml:mi>∞</mml:mi></mml:munderover><mml:mtext> </mml:mtext><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:mi>ω</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mi>n</mml:mi></mml:msup><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p><p>Each 
term <inline-formula><mml:math id="inf98"><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:mi>ω</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mi>n</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> is the probability that none of the <italic>n</italic> offspring are in the sample. By summing over the distribution of <italic>n</italic> and subtracting the sum from 1, one obtains the probability of at least one offspring being sampled. The generating function can be accurately approximated in regimes where <inline-formula><mml:math id="inf99"><mml:mrow><mml:msub><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is small and the non-linear term in <xref ref-type="disp-formula" rid="equ5">Equation (5)</xref> can be neglected, as well as the regime of large enough <italic>x</italic> where <italic>ϕ</italic> ‘saturates’: <inline-formula><mml:math id="inf100"><mml:mrow><mml:msub><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>≈</mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:math></inline-formula>, see (<xref ref-type="bibr" rid="bib21">Neher and Hallatschek, 2013</xref>). 
These two asymptotic solutions can be combined to yield the approximation<disp-formula id="equ8"><label>(8)</label><mml:math id="m8"><mml:mrow><mml:msub><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>≈</mml:mo><mml:mfrac><mml:mrow><mml:mi>ω</mml:mi><mml:mi>x</mml:mi><mml:mi>R</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>x</mml:mi><mml:mo>+</mml:mo><mml:mi>ω</mml:mi><mml:mo>[</mml:mo><mml:mi>R</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>−</mml:mo><mml:mn>1</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:mfrac></mml:mrow></mml:math></disp-formula></p><p>Note that this approximation satisfies the initial condition <inline-formula><mml:math id="inf101"><mml:mrow><mml:msub><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mn>0</mml:mn></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>ω</mml:mi></mml:mrow></mml:math></inline-formula>, correctly tends to <italic>x</italic> for <inline-formula><mml:math id="inf102"><mml:mrow><mml:mi>x</mml:mi><mml:mo>></mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula> at long times, and recovers the neutral behavior <inline-formula><mml:math id="inf103"><mml:mrow><mml:msub><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mtext> 
</mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>ω</mml:mi><mml:mo>/</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>+</mml:mo><mml:mi>ω</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> in the <inline-formula><mml:math id="inf104"><mml:mrow><mml:mi>x</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mi>σ</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mo>=</mml:mo><mml:mi>D</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula> limit.</p></sec><sec id="s4-4"><title>Branch propagator</title><p>Having calculated the lineage sampling probability, we are now in a position to derive equations governing the behavior of the branch propagator, that is, the probability of there being an individual with fitness <italic>x</italic> at time <inline-formula><mml:math id="inf105"><mml:mrow><mml:mi>t</mml:mi><mml:mo>′</mml:mo></mml:mrow></mml:math></inline-formula> (the child), given it descends from an ancestor with fitness <italic>y</italic> at time <italic>t</italic> and all sampled descendants of the ancestor are also descendants of the child. The latter condition amounts to the requirement that in a tree the link between the ancestor and the child does not branch. 
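Since the lineage sampling probability is the main input to what follows, a minimal numerical sketch of Equations (6) and (8) may be useful at this point. The function names, argument order, and parameter values are illustrative assumptions and are not taken from the original analysis.

```python
import math


def reproductive_value(x, t, sigma2, D):
    """Equation (6): expected number of offspring after time t for initial fitness x."""
    return math.exp(x * t - sigma2 * t**2 / 2.0 + D * t**3 / 3.0)


def sampling_probability(x, t, omega, sigma2, D):
    """Equation (8): approximate probability that a lineage is represented in the sample.

    Assumes x is nonzero; the neutral case is approached by letting x tend to zero.
    """
    R = reproductive_value(x, t, sigma2, D)
    return omega * x * R / (x + omega * (R - 1.0))


# Illustrative values only: sampling fraction 1%, fitness variance 1e-4, diffusion 1e-6
phi_start = sampling_probability(0.05, 0.0, 0.01, 1e-4, 1e-6)
phi_later = sampling_probability(0.05, 20.0, 0.01, 1e-4, 1e-6)
print(phi_start, phi_later)
```

Under these assumptions the function returns omega at t = 0 and approaches x for positive x once the reproductive value is large, in line with the limits stated for Equation (8). 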
Using a ‘1st-step’ equation similar to <xref ref-type="disp-formula" rid="equ2">Equation (2)</xref>, we have<disp-formula id="equ9"><label>(9)</label><mml:math id="m9"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>+</mml:mo><mml:msup><mml:mi>σ</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mtext>Δ</mml:mtext><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mtext>Δ</mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo></mml:mtd><mml:mtd><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>−</mml:mo><mml:mtext>Δ</mml:mtext><mml:mi>t</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>2</mml:mn><mml:mo>+</mml:mo><mml:mi>y</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mo>+</mml:mo><mml:mtext>Δ</mml:mtext><mml:mi>t</mml:mi><mml:mi>D</mml:mi><mml:mfrac><mml:mrow><mml:msup><mml:mo>∂</mml:mo><mml:mn>2</mml:mn></mml:msup><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> 
</mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:msup><mml:mi>y</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:mfrac></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mo>+</mml:mo><mml:mtext>Δ</mml:mtext><mml:mi>t</mml:mi><mml:mn>2</mml:mn><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>+</mml:mo><mml:mi>y</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>[</mml:mo><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:msub><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>]</mml:mo><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p><p>The last term describes a ‘birth’ event in the ancestral lineage with one of the branches surviving up to <inline-formula><mml:math id="inf106"><mml:mrow><mml:mi>t</mml:mi><mml:mo>′</mml:mo></mml:mrow></mml:math></inline-formula> (at which time its fitness is in the <inline-formula><mml:math id="inf107"><mml:mrow><mml:mo>[</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo>+</mml:mo><mml:mi>d</mml:mi><mml:mi>x</mml:mi><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> interval) while the other one is not sampled, which occurs with probability <inline-formula><mml:math id="inf108"><mml:mrow><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:msub><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> 
</mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> at a sampling density <italic>ω</italic>. The <inline-formula><mml:math id="inf109"><mml:mrow><mml:mi>y</mml:mi><mml:mo>→</mml:mo><mml:mi>y</mml:mi><mml:mo>+</mml:mo><mml:msup><mml:mi>σ</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mtext>Δ</mml:mtext><mml:mi>t</mml:mi></mml:mrow></mml:math></inline-formula> shift in the argument of the term on the left-hand-side parametrizes the translation of the mean fitness in time <inline-formula><mml:math id="inf110"><mml:mrow><mml:mtext>Δ</mml:mtext><mml:mi>t</mml:mi></mml:mrow></mml:math></inline-formula>. <xref ref-type="disp-formula" rid="equ9">Equation (9)</xref> reduces to the differential equation<disp-formula id="equ10"><label>(10)</label><mml:math id="m10"><mml:mrow><mml:msub><mml:mo>∂</mml:mo><mml:mi>t</mml:mi></mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo>[</mml:mo><mml:mi>y</mml:mi><mml:mo>−</mml:mo><mml:mn>2</mml:mn><mml:msub><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>]</mml:mo><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> 
</mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>−</mml:mo><mml:msup><mml:mi>σ</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:msub><mml:mo>∂</mml:mo><mml:mi>y</mml:mi></mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>D</mml:mi><mml:msubsup><mml:mo>∂</mml:mo><mml:mi>y</mml:mi><mml:mn>2</mml:mn></mml:msubsup><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>which is complemented with the initial condition <inline-formula><mml:math id="inf111"><mml:mrow><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>δ</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>−</mml:mo><mml:mi>y</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>. In deriving this equation, we have assumed that <inline-formula><mml:math id="inf112"><mml:mrow><mml:mi>y</mml:mi><mml:mo>≪</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:math></inline-formula>, which is a good assumption when <italic>σ</italic> (the standard deviation in fitness) is small. The fitness differences in a single generation are small in most populations, such that this assumption is not restrictive. 
Furthermore, violation of this assumption does not change the qualitative behavior of the <inline-formula><mml:math id="inf113"><mml:mrow><mml:mi>g</mml:mi><mml:mo>(</mml:mo><mml:mo>·</mml:mo><mml:mo>|</mml:mo><mml:mo>·</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. When inferring fitness on trees, we will generally solve this equation numerically. Some limits, however, can be addressed analytically as we will see below.</p><p>Numerical solutions of <inline-formula><mml:math id="inf114"><mml:mrow><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> are shown in <xref ref-type="fig" rid="fig6">Figure 6</xref>. For a fixed ancestor at <inline-formula><mml:math id="inf115"><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="inf116"><mml:mrow><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> is the density of offspring with fitness <italic>x</italic> at time <inline-formula><mml:math id="inf117"><mml:mrow><mml:mi>t</mml:mi><mml:mo>′</mml:mo></mml:mrow></mml:math></inline-formula> subject to the following condition: Only one individual from this group of offspring contributes to the sample at present (this is the condition that the lineage connecting <inline-formula><mml:math 
id="inf118"><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="inf119"><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> is unbranched). The propagator <inline-formula><mml:math id="inf120"><mml:mrow><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> broadens in <italic>x</italic> as <inline-formula><mml:math id="inf121"><mml:mrow><mml:mi>t</mml:mi><mml:mo>−</mml:mo><mml:mi>t</mml:mi><mml:mo>′</mml:mo></mml:mrow></mml:math></inline-formula> increases as shown in <xref ref-type="fig" rid="fig6">Figure 6A</xref> for a case of high (red, <inline-formula><mml:math id="inf122"><mml:mrow><mml:mi>y</mml:mi><mml:mo>></mml:mo><mml:mn>2</mml:mn></mml:mrow></mml:math></inline-formula>) and low (blue, <inline-formula><mml:math id="inf123"><mml:mrow><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula>) initial fitness. 
<xref ref-type="fig" rid="fig6">Figure 6B</xref> shows how the integral <inline-formula><mml:math id="inf124"><mml:mrow><mml:mrow><mml:mstyle displaystyle="true"><mml:mo>∫</mml:mo></mml:mstyle><mml:mi>d</mml:mi><mml:mi>x</mml:mi><mml:mi> </mml:mi><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> increases with <italic>t</italic> for <inline-formula><mml:math id="inf125"><mml:mrow><mml:mi>y</mml:mi><mml:mo>></mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula> but decreases for <inline-formula><mml:math id="inf126"><mml:mrow><mml:mi>y</mml:mi><mml:mo><</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula>. The integral of <inline-formula><mml:math id="inf127"><mml:mrow><mml:mrow><mml:mstyle displaystyle="true"><mml:mo>∫</mml:mo></mml:mstyle><mml:mi>d</mml:mi><mml:mi>x</mml:mi><mml:mi> </mml:mi><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> differs from the reproductive value <inline-formula><mml:math id="inf128"><mml:mrow><mml:mi>R</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>−</mml:mo><mml:mi>t</mml:mi><mml:mo>′</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, shown as dashed lines in <xref ref-type="fig" rid="fig6">Figure 6B</xref>, only in the additional sampling condition.<fig id="fig6" position="float"><object-id 
pub-id-type="doi">10.7554/eLife.03568.016</object-id><label>Figure 6.</label><caption><title>Numerical solution for the lineage propagator.</title><p>Panel <bold>A</bold> shows <inline-formula><mml:math id="inf129"><mml:mrow><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> as a function of <italic>x</italic> for different <inline-formula><mml:math id="inf130"><mml:mrow><mml:mi>t</mml:mi><mml:mo>′</mml:mo></mml:mrow></mml:math></inline-formula> at <inline-formula><mml:math id="inf131"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula> given the ancestor had Malthusian fitness <inline-formula><mml:math id="inf132"><mml:mrow><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula> (blue) or approximately <inline-formula><mml:math id="inf133"><mml:mrow><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mn>2</mml:mn><mml:mi>σ</mml:mi></mml:mrow></mml:math></inline-formula> (red). In both cases, the offspring tend to get less fit and the distribution broadens due to additional mutations. Saturated colors correspond to small <inline-formula><mml:math id="inf134"><mml:mrow><mml:mi>t</mml:mi><mml:mo>−</mml:mo><mml:mi>t</mml:mi><mml:mo>′</mml:mo></mml:mrow></mml:math></inline-formula>, light colors to large <inline-formula><mml:math id="inf135"><mml:mrow><mml:mi>t</mml:mi><mml:mo>−</mml:mo><mml:mi>t</mml:mi><mml:mo>′</mml:mo></mml:mrow></mml:math></inline-formula>. 
Panel <bold>B</bold> shows <inline-formula><mml:math id="inf136"><mml:mrow><mml:mrow><mml:mstyle displaystyle="true"><mml:mo>∫</mml:mo></mml:mstyle><mml:mi>d</mml:mi><mml:mi>x</mml:mi><mml:mi> </mml:mi><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> as a function of <inline-formula><mml:math id="inf137"><mml:mrow><mml:mi>t</mml:mi><mml:mo>−</mml:mo><mml:mi>t</mml:mi><mml:mo>′</mml:mo></mml:mrow></mml:math></inline-formula> for the high (red) and low (blue) fitness ancestor. The dashed lines show the approximation given in <xref ref-type="disp-formula" rid="equ6">Equation (6)</xref>. In the high fitness case, <xref ref-type="disp-formula" rid="equ6">Equation (6)</xref> overestimates <inline-formula><mml:math id="inf138"><mml:mrow><mml:mrow><mml:mstyle displaystyle="true"><mml:mo>∫</mml:mo></mml:mstyle><mml:mi>d</mml:mi><mml:mi>x</mml:mi><mml:mi> </mml:mi><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> since it does not account for the non-sampling contribution. 
Panel <bold>C</bold> shows <inline-formula><mml:math id="inf139"><mml:mrow><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> as a function of <italic>y</italic>, given the offspring is unfit (blue) or fit (red). Ancestors tend to be fit regardless of offspring fitness and both ancestral distributions converge to a common curve far back in time.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03568.016">http://dx.doi.org/10.7554/eLife.03568.016</ext-link></p></caption><graphic xlink:href="elife03568f006"/></fig></p><p>At fixed <inline-formula><mml:math id="inf140"><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="inf141"><mml:mrow><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> is peaked around <italic>x</italic> for small <inline-formula><mml:math id="inf142"><mml:mrow><mml:mi>t</mml:mi><mml:mo>−</mml:mo><mml:mi>t</mml:mi><mml:mo>′</mml:mo></mml:mrow></mml:math></inline-formula>; this peak moves to higher fitness as <inline-formula><mml:math id="inf143"><mml:mrow><mml:mi>t</mml:mi><mml:mo>−</mml:mo><mml:mi>t</mml:mi><mml:mo>′</mml:mo></mml:mrow></mml:math></inline-formula> increases and converges to a steady distribution far in the past. 
This is seen in <xref ref-type="fig" rid="fig6">Figure 6C</xref>, where <inline-formula><mml:math id="inf144"><mml:mrow><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> is plotted as a function of <italic>y</italic>. Far in the past, <inline-formula><mml:math id="inf145"><mml:mrow><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> has a well-defined maximum at <inline-formula><mml:math id="inf146"><mml:mrow><mml:mi>y</mml:mi><mml:mo>≈</mml:mo><mml:mn>3</mml:mn><mml:mi>σ</mml:mi></mml:mrow></mml:math></inline-formula>. This steady distribution is shaped by two opposing trends: Fit ancestors (large <italic>y</italic>) leave more offspring and are hence more likely to be sampled. Overly fit ancestors, on the other hand, would leave many individuals at time <inline-formula><mml:math id="inf147"><mml:mrow><mml:mi>t</mml:mi><mml:mo>′</mml:mo></mml:mrow></mml:math></inline-formula> that ultimately contribute to the sample, which is incompatible with the requirement that the branch remains unbranched. The width of the steady-state distribution is determined by the diffusion constant <italic>D</italic>.</p><p>As a special case, we will sometimes be interested in a <italic>terminal</italic> branch propagator, which takes the lineage all the way to the present generation, <inline-formula><mml:math id="inf148"><mml:mrow><mml:mi>t</mml:mi><mml:mo>′</mml:mo><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula>. 
Marginalizing over <italic>x</italic> and multiplying by the sampling probability <inline-formula><mml:math id="inf149"><mml:mrow><mml:mi>ω</mml:mi><mml:mo>=</mml:mo><mml:mi>M</mml:mi><mml:mo>/</mml:mo><mml:mi>N</mml:mi><mml:mo>≪</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:math></inline-formula> defines the probability that the <inline-formula><mml:math id="inf150"><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> ancestor is a direct progenitor of a sampled genome: <inline-formula><mml:math id="inf151"><mml:mrow><mml:mi>G</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>ω</mml:mi><mml:msup><mml:mstyle displaystyle="true"><mml:mo>∫</mml:mo></mml:mstyle><mml:mtext></mml:mtext></mml:msup><mml:mi>d</mml:mi><mml:mi>x</mml:mi><mml:mtext> </mml:mtext><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mn>0</mml:mn><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>. Interestingly, for positive <italic>y</italic>, one expects this probability to initially increase with increasing <italic>t</italic> because the reproductive value (i.e., the expected number of surviving offspring) of relatively fit individuals increases with time, so that their offspring constitute a larger fraction of the population and are therefore more likely to appear in the sample. 
At longer times, however, <inline-formula><mml:math id="inf152"><mml:mrow><mml:mi>G</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> is expected to start decreasing because it is increasingly unlikely that the lineage emanating from a highly fit ancestor far in the past remains unbranched (i.e., has only a single descendant in the sample).</p><p>For small times and moderate parental fitness <italic>y</italic>, the term enforcing non-branching in <xref ref-type="disp-formula" rid="equ10">Equation (10)</xref> can be neglected. In this case, the terminal branch propagator simplifies to<disp-formula id="equ11"><label>(11)</label><mml:math id="m11"><mml:mrow><mml:mi>G</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>≈</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mi>y</mml:mi><mml:mi>t</mml:mi><mml:mo>−</mml:mo><mml:mfrac><mml:mrow><mml:msup><mml:mi>σ</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:msup><mml:mi>t</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:mrow><mml:mn>2</mml:mn></mml:mfrac><mml:mo>+</mml:mo><mml:mfrac><mml:mrow><mml:mi>D</mml:mi><mml:msup><mml:mi>t</mml:mi><mml:mn>3</mml:mn></mml:msup></mml:mrow><mml:mn>3</mml:mn></mml:mfrac></mml:mrow></mml:msup></mml:mrow></mml:math></disp-formula>and is hence identical to the reproductive value given in <xref ref-type="disp-formula" rid="equ6">Equation (6)</xref>.</p></sec><sec id="s4-5"><title>Tree-based inference</title><p>Armed with branch propagators, we can now write down a joint probability of ancestral fitness on any given tree. 
Let <inline-formula><mml:math id="inf153"><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> denote the fitness of node <italic>i</italic> starting with <inline-formula><mml:math id="inf154"><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula> at the root of the tree, <inline-formula><mml:math id="inf155"><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>…</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mtext>int</mml:mtext></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> for internal nodes, and <inline-formula><mml:math id="inf156"><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mtext>int</mml:mtext></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>…</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mtext>int</mml:mtext></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>e</mml:mi><mml:mi>x</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> for external nodes. Furthermore, denote the children of node <italic>i</italic> by <inline-formula><mml:math id="inf157"><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, where <italic>j</italic> runs over the number of children. 
The joint probability distribution of all nodes in the tree is then given by<disp-formula id="equ12"><label>(12)</label><mml:math id="m12"><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi mathvariant="bold">x</mml:mi><mml:mo>|</mml:mo><mml:mi>T</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>Z</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>T</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:munderover><mml:mstyle displaystyle="true"><mml:mo>∏</mml:mo></mml:mstyle><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mtext>int</mml:mtext></mml:mrow></mml:msub></mml:mrow></mml:munderover><mml:munder><mml:mstyle displaystyle="true"><mml:mo>∏</mml:mo></mml:mstyle><mml:mi>j</mml:mi></mml:munder><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>where <inline-formula><mml:math id="inf158"><mml:mrow><mml:mi>Z</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>T</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> is a normalization factor, <inline-formula><mml:math 
id="inf159"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> is the fitness distribution in the population, and the second product runs over all children <italic>j</italic> of node <italic>i</italic>. In contrast to <xref ref-type="disp-formula" rid="equ1">Equation (1)</xref>, <xref ref-type="disp-formula" rid="equ12">Equation (12)</xref> allows for polytomies in the tree. In writing down <xref ref-type="disp-formula" rid="equ12">Equation (12)</xref>, we have made the approximation that the total population size is unconstrained and that different branches of the tree do not interact. In populations dominated by selection, this is a good approximation since coalescent properties depend only weakly on the population size.</p><p>This joint distribution lives in a space of too high a dimension to be practically useful. However, the tree structure makes it easy to marginalize the distribution. We begin by ‘integrating out’ the independent fitness variables of the leaves, then integrate over the fitness values of the parents of these leaves, and continue until we arrive at the root of the tree. 
This defines an iterative ‘message passing’ process (<xref ref-type="bibr" rid="bib19">Mézard and Montanari, 2009</xref>) in which the ‘message’ node <italic>i</italic> sends to its parent <inline-formula><mml:math id="inf160"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is calculated via<disp-formula id="equ13"><label>(13)</label><mml:math id="m13"><mml:mrow><mml:msub><mml:mi>m</mml:mi><mml:mrow><mml:mo>↑</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mo>∫</mml:mo></mml:mstyle><mml:msub><mml:mrow><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:mrow><mml:mi>i</mml:mi></mml:msub><mml:mtext> </mml:mtext><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:munder><mml:mstyle displaystyle="true"><mml:mo>∏</mml:mo></mml:mstyle><mml:mi>j</mml:mi></mml:munder><mml:msub><mml:mi>m</mml:mi><mml:mrow><mml:mo>↑</mml:mo><mml:msub><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>where the product is over all children <italic>j</italic> of node <italic>i</italic> (note that the times 
<inline-formula><mml:math id="inf161"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="inf162"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> are fixed properties of the tree). For terminal nodes <italic>i</italic> without children, <inline-formula><mml:math id="inf163"><mml:mrow><mml:msub><mml:mi>m</mml:mi><mml:mrow><mml:mo>↑</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> is simply the terminal branch propagator. Similarly, we calculate “messages” passed downstream to child <italic>j</italic> of node <italic>i</italic>:<disp-formula id="equ14"><label>(14)</label><mml:math id="m14"><mml:mrow><mml:msub><mml:mi>m</mml:mi><mml:mrow><mml:mo>↓</mml:mo><mml:msub><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mo>∫</mml:mo></mml:mstyle><mml:mi>d</mml:mi><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mtext> </mml:mtext><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mtext> 
</mml:mtext><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mi>m</mml:mi><mml:mrow><mml:mo>↓</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:munder><mml:mstyle displaystyle="true"><mml:mo>∏</mml:mo></mml:mstyle><mml:mrow><mml:mi>k</mml:mi><mml:mo>≠</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mi>m</mml:mi><mml:mrow><mml:mo>↑</mml:mo><mml:msub><mml:mi>i</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula></p><p>The integrand is the product of the downstream message from the parental node and the upstream messages from all children of node <italic>i</italic> other than child <italic>j</italic>. 
This product is further multiplied by the branch propagator to child <italic>j</italic> and integrated over the fitness of node <italic>i</italic>.</p><p>Having calculated the up and down messages for each branch, we can simply calculate the marginal distributions of fitness <inline-formula><mml:math id="inf164"><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> by multiplying all messages going into a node <italic>i</italic>.<disp-formula id="equ15"><label>(15)</label><mml:math id="m15"><mml:mrow><mml:mi>p</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:msub><mml:mi>Z</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mfrac><mml:msub><mml:mi>m</mml:mi><mml:mrow><mml:mo>↓</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:munder><mml:mstyle displaystyle="true"><mml:mo>∏</mml:mo></mml:mstyle><mml:mi>j</mml:mi></mml:munder><mml:msub><mml:mi>m</mml:mi><mml:mrow><mml:mo>↑</mml:mo><mml:msub><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>where <inline-formula><mml:math id="inf165"><mml:mrow><mml:msub><mml:mi>Z</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> assures normalization. Our inference uses the mean marginal fitness to rank internal and external nodes.</p><p>For a pre-terminal node, the ‘up-message’ (<xref ref-type="disp-formula" rid="equ13">Equation (13)</xref>) involves multiplying the terminal branch propagators of all its children. 
If the node is recent, we can use approximation <xref ref-type="disp-formula" rid="equ11">Equation (11)</xref> and obtain<disp-formula id="equ16"><label>(16)</label><mml:math id="m16"><mml:mrow><mml:msub><mml:mi>m</mml:mi><mml:mrow><mml:mo>↑</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>∼</mml:mo><mml:mstyle displaystyle="true"><mml:mo>∫</mml:mo></mml:mstyle><mml:msub><mml:mrow><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:mrow><mml:mi>i</mml:mi></mml:msub><mml:mtext> </mml:mtext><mml:mi>g</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:msub><mml:mi>T</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mi>o</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>where <inline-formula><mml:math id="inf166"><mml:mrow><mml:msub><mml:mi>T</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mi>o</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> is total tree length downstream of node <italic>i</italic>, which polarizes the fitness of node <italic>i</italic> towards the high fitness edge. For a given number of descendants, this total tree length is maximized by a star topology. 
This corresponds to recent findings that multiple mergers in genealogies are associated with rapid expansion of clones founded by exceptionally fit individuals (<xref ref-type="bibr" rid="bib3">Brunet et al., 2007</xref>; <xref ref-type="bibr" rid="bib8">Desai et al., 2013</xref>; <xref ref-type="bibr" rid="bib21">Neher and Hallatschek, 2013</xref>).</p></sec><sec id="s4-6"><title>Calculating the local branching index (LBI)</title><p>The LBI, defined as the integrated, exponentially discounted tree length surrounding a node, can be calculated with a message passing scheme very similar to the one used above to evaluate the fitness distributions. The corresponding ‘up’ message to the parent of node <italic>i</italic> is simply<disp-formula id="equ17"><label>(17)</label><mml:math id="m17"><mml:mrow><mml:msub><mml:mi>m</mml:mi><mml:mrow><mml:mo>↑</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>τ</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mo>−</mml:mo><mml:msub><mml:mi>b</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>/</mml:mo><mml:mi>τ</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mo>−</mml:mo><mml:msub><mml:mi>b</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>/</mml:mo><mml:mi>τ</mml:mi></mml:mrow></mml:msup><mml:munder><mml:mstyle displaystyle="true"><mml:mo>∑</mml:mo></mml:mstyle><mml:mi>j</mml:mi></mml:munder><mml:msub><mml:mi>m</mml:mi><mml:mrow><mml:mo>↑</mml:mo><mml:msub><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow></mml:math></disp-formula>where <inline-formula><mml:math id="inf167"><mml:mrow><mml:msub><mml:mi>b</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the length of the branch connecting node <italic>i</italic> to its parent and the sum runs over the children <inline-formula><mml:math 
id="inf168"><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> of node <italic>i</italic>. Similarly, the down message from a parent <italic>i</italic> to child <inline-formula><mml:math id="inf169"><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is<disp-formula id="equ18"><label>(18)</label><mml:math id="m18"><mml:mrow><mml:msub><mml:mi>m</mml:mi><mml:mrow><mml:mo>↓</mml:mo><mml:msub><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>τ</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mo>−</mml:mo><mml:msub><mml:mi>b</mml:mi><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>/</mml:mo><mml:mi>τ</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mo>−</mml:mo><mml:msub><mml:mi>b</mml:mi><mml:mrow><mml:msub><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>/</mml:mo><mml:mi>τ</mml:mi></mml:mrow></mml:msup><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msub><mml:mi>m</mml:mi><mml:mrow><mml:mo>↓</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:munder><mml:mstyle displaystyle="true"><mml:mo>∑</mml:mo></mml:mstyle><mml:mrow><mml:mi>k</mml:mi><mml:mo>≠</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mi>m</mml:mi><mml:mrow><mml:mo>↑</mml:mo><mml:msub><mml:mi>i</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula></p><p>After having calculated all up and down messages, the exponentially discounted tree length surrounding node <italic>i</italic> is given by<disp-formula id="equ19"><label>(19)</label><mml:math 
id="m19"><mml:mrow><mml:msub><mml:mi>λ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>τ</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mi>m</mml:mi><mml:mrow><mml:mo>↓</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:munder><mml:mstyle displaystyle="true"><mml:mo>∑</mml:mo></mml:mstyle><mml:mi>j</mml:mi></mml:munder><mml:mtext> </mml:mtext><mml:msub><mml:mi>m</mml:mi><mml:mrow><mml:mo>↑</mml:mo><mml:msub><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow></mml:math></disp-formula></p></sec><sec id="s4-7"><title>Implementation of the inference algorithm</title><p>The fitness inference algorithm is implemented in Python using the libraries SciPy and NumPy (<xref ref-type="bibr" rid="bib25">Oliphant, 2007</xref>). In brief, we have implemented one class, survival_gen_func, that integrates the fitness propagator on a discrete fitness grid. This class is used by the class fitness_inference to calculate the marginal distribution of fitness at each external and internal node of a given tree. The calculation of the marginals is done using a message passing approach (<xref ref-type="bibr" rid="bib19">Mézard and Montanari, 2009</xref>). This fitness inference class is then subclassed to accommodate influenza specific features. All code associated with this manuscript is available at <ext-link ext-link-type="uri" xlink:href="https://github.com/rneher/FitnessInference">https://github.com/rneher/FitnessInference</ext-link>.</p><p>To predict the sequence closest to the future population in a multiple sequence alignment, we build a maximum likelihood tree using fasttree (<xref ref-type="bibr" rid="bib27">Price et al., 2009</xref>) (the fasttree code was modified slightly to resolve short branches better). The reconstructed tree was passed to the fitness inference class. 
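</p><p>The recursion of <xref ref-type="disp-formula" rid="equ17">Equations (17)</xref> to <xref ref-type="disp-formula" rid="equ19">(19)</xref> for the local branching index is compact enough to sketch on its own. The following is a minimal illustration and not the code from the repository: the function name <monospace>local_branching_index</monospace>, the dictionary-based tree representation, and the choice of a vanishing down message at the root (nothing lies above it) are assumptions made for this example.</p>
<preformat>
```python
import math


def local_branching_index(children, branch_length, root, tau):
    """Sketch of Equations (17)-(19); all names here are hypothetical.

    children:      dict mapping every node to the list of its children
                   (an empty list for leaves)
    branch_length: dict mapping every non-root node to the length b_i of
                   the branch connecting it to its parent
    tau:           discounting scale, in the same units as the branch lengths
    """
    # Breadth-first order: every parent appears before its children.
    order = [root]
    for node in order:
        order.extend(children[node])

    # Upward pass, Equation (17): children are processed before parents.
    up = {}
    for node in reversed(order):
        if node == root:
            continue  # the root has no parent to send a message to
        decay = math.exp(-branch_length[node] / tau)
        up[node] = tau * (1.0 - decay) + decay * sum(up[c] for c in children[node])

    # Downward pass, Equation (18).
    # Assumption: the down message into the root is zero.
    down = {root: 0.0}
    for node in order:
        total_up = sum(up[c] for c in children[node])
        for child in children[node]:
            decay = math.exp(-branch_length[child] / tau)
            # total_up - up[child] is the sum over the siblings k != j
            down[child] = tau * (1.0 - decay) + decay * (down[node] + total_up - up[child])

    # Equation (19): discounted tree length surrounding each node.
    return {node: down[node] + sum(up[c] for c in children[node]) for node in order}
```
</preformat>
<p>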
Following fitness inference, internal or external nodes were ranked by their expected fitness and we report the top ranked node as our prediction.</p><p>The branch propagator depends on the fitness diffusion constant <italic>D</italic>, the standard deviation in fitness <italic>σ</italic>, and the sampling fraction <italic>ω</italic>. For the numerical implementation, we measure time in units of <inline-formula><mml:math id="inf170"><mml:mrow><mml:msup><mml:mi>σ</mml:mi><mml:mrow><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> and selection strength in units of <italic>σ</italic>; the dimensionless fitness diffusion constant is then <inline-formula><mml:math id="inf171"><mml:mrow><mml:mtext>Γ</mml:mtext><mml:mo>=</mml:mo><mml:mi>D</mml:mi><mml:msup><mml:mi>σ</mml:mi><mml:mrow><mml:mo>−</mml:mo><mml:mn>3</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>. The initial condition for the generating function is <inline-formula><mml:math id="inf172"><mml:mrow><mml:msub><mml:mi>ϕ</mml:mi><mml:mi>ω</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mtext> </mml:mtext><mml:mn>0</mml:mn></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>ω</mml:mi><mml:mo>/</mml:mo><mml:mi>σ</mml:mi></mml:mrow></mml:math></inline-formula> in these units.</p><p>In order to apply our algorithm to a tree reconstructed from sequences, we need to convert branch lengths into time in units of <inline-formula><mml:math id="inf173"><mml:mrow><mml:msup><mml:mi>σ</mml:mi><mml:mrow><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>. 
Given an alignment, we can calculate the average pairwise nucleotide distance <inline-formula><mml:math id="inf174"><mml:mrow><mml:mi>π</mml:mi><mml:mo>≈</mml:mo><mml:mn>2</mml:mn><mml:mi>μ</mml:mi><mml:mo>〈</mml:mo><mml:msub><mml:mi>T</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo>〉</mml:mo></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="inf175"><mml:mrow><mml:mo>〈</mml:mo><mml:msub><mml:mi>T</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo>〉</mml:mo></mml:mrow></mml:math></inline-formula> is the average pair coalescent time and <italic>μ</italic> is the per site mutation rate. For an adapting population in the SBD model, we have <inline-formula><mml:math id="inf176"><mml:mrow><mml:mo>〈</mml:mo><mml:msub><mml:mi>T</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo>〉</mml:mo><mml:mi>σ</mml:mi><mml:mo>≈</mml:mo><mml:msup><mml:mtext>Γ</mml:mtext><mml:mrow><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> (<xref ref-type="bibr" rid="bib21">Neher and Hallatschek, 2013</xref>). 
Given a choice for <inline-formula><mml:math id="inf177"><mml:mtext>Γ</mml:mtext></mml:math></inline-formula>, the conversion factor <italic>β</italic> from nucleotide distance to <inline-formula><mml:math id="inf178"><mml:mrow><mml:msup><mml:mi>σ</mml:mi><mml:mrow><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> units is determined by<disp-formula id="equ20"><label>(20)</label><mml:math id="m20"><mml:mrow><mml:mfrac><mml:mi>π</mml:mi><mml:mrow><mml:mn>2</mml:mn><mml:mi>β</mml:mi></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mtext>Γ</mml:mtext></mml:mfrac><mml:mtext> </mml:mtext><mml:mo>⇒</mml:mo><mml:mtext> </mml:mtext><mml:mi>β</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mtext>Γ</mml:mtext><mml:mi>π</mml:mi></mml:mrow><mml:mn>2</mml:mn></mml:mfrac><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p><p>In addition to estimating fitness from the tree, we also measure the frequency changes of clades over time. For influenza A/H3N2 virus data, we partition sequences into three intervals of equal length between May and February and calculate, for each internal node, the fraction of sequences in each interval that descend from it (using a pseudocount of 5). From these three frequency values, we estimate the expansion rate by fitting a line to the logarithm of the frequencies.</p></sec><sec id="s4-8"><title>Simulations</title><p>We use the population genetics library FFPopSim (<xref ref-type="bibr" rid="bib37">Zanini and Neher, 2012</xref>) to implement an individual based simulation with fixed fitness standard deviation <inline-formula><mml:math id="inf179"><mml:mrow><mml:mi>σ</mml:mi><mml:mo>=</mml:mo><mml:mn>0.03</mml:mn></mml:mrow></mml:math></inline-formula>. Mutations are introduced at random sites in random individuals with rate <italic>μ</italic>. 
We varied the total genomic mutation rate <inline-formula><mml:math id="inf180"><mml:mrow><mml:mi>u</mml:mi><mml:mo>=</mml:mo><mml:mi>L</mml:mi><mml:mi>μ</mml:mi></mml:mrow></mml:math></inline-formula> between 0.016 and 0.256, where the total number of simulated sites is <inline-formula><mml:math id="inf181"><mml:mrow><mml:mi>L</mml:mi><mml:mo>=</mml:mo><mml:mn>2000</mml:mn></mml:mrow></mml:math></inline-formula>. Mutations at all sites are by default deleterious, with effects drawn from an exponential distribution. To emulate a changing environment, we redraw the fitness effects of randomly chosen positions within the first 500 sites at a total rate of <inline-formula><mml:math id="inf182"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>A</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn>0.02</mml:mn><mml:mo>,</mml:mo><mml:mo>…</mml:mo><mml:mo>,</mml:mo><mml:mn>0.16</mml:mn></mml:mrow></mml:math></inline-formula> per generation. Beneficial effects are drawn from a gamma distribution with shape parameter 2 and the same scale as the deleterious mutations. Every 200 generations, a random sample of 200 sequences is written to file and later used to predict the sequence closest to the next sample. The simulation code is provided as <monospace>flusim.cpp</monospace> in the above-mentioned repository.</p></sec><sec id="s4-9"><title>Influenza data</title><p>All sequences of influenza A/H3N2 viruses from human hosts from 1968 to 2014 that cover the entire HA1 domain were downloaded from IRD and aligned using the alignment feature provided by IRD with default settings (<xref ref-type="bibr" rid="bib33">Squires et al., 2012</xref>). The alignment was inspected by eye and trimmed to the HA1 domain. A few obvious outliers, lab strains, and sequences with indels or more than 4 ambiguous nucleotides were removed manually. 
For each strain, the location information was converted to longitude and latitude at the country level and the strain was classified into rough geographic regions based on longitude and latitude. Only sequences with geographic information at the country level and date information with at least month accuracy were used. To avoid sampling bias, we subsampled the data to at most 100 sequences from each of North America and Asia and used repeated subsamples to assess the robustness of the predictions. In years where fewer than 100 sequences are available from one of the geographic regions, we repeatedly used 70% of the available data. Increasing the sample size has a negligible effect on prediction accuracy beyond a sample size of 100.</p></sec></sec></body><back><ack id="ack"><title>Acknowledgements</title><p>We are grateful to Michael Elowitz, Paul Rainey and Eric Siggia for critical reading of the manuscript.</p></ack><sec sec-type="additional-information"><title>Additional information</title><fn-group content-type="competing-interest"><title>Competing interests</title><fn fn-type="conflict" id="conf1"><p>RAN: Reviewing editor, <italic>eLife</italic>.</p></fn><fn fn-type="conflict" id="conf2"><p>The other authors declare that no competing interests exist.</p></fn></fn-group><fn-group content-type="author-contribution"><title>Author contributions</title><fn fn-type="con" id="con1"><p>RAN, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article</p></fn><fn fn-type="con" id="con2"><p>CAR, Conception and design, Analysis and interpretation of data, Drafting or revising the article</p></fn><fn fn-type="con" id="con3"><p>BIS, Conception and design, Analysis and interpretation of data, Drafting or revising the article</p></fn></fn-group></sec><ref-list><title>References</title><ref id="bib1"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Bedford</surname><given-names>T</given-names></name><name><surname>Suchard</surname><given-names>MA</given-names></name><name><surname>Lemey</surname><given-names>P</given-names></name><name><surname>Dudas</surname><given-names>G</given-names></name><name><surname>Gregory</surname><given-names>V</given-names></name><name><surname>Hay</surname><given-names>AJ</given-names></name><name><surname>McCauley</surname><given-names>JW</given-names></name><name><surname>Russell</surname><given-names>CA</given-names></name><name><surname>Smith</surname><given-names>DJ</given-names></name><name><surname>Rambaut</surname><given-names>A</given-names></name></person-group><year>2014</year><article-title>Integrating influenza antigenic dynamics with molecular evolution</article-title><source>eLife</source><volume>3</volume><fpage>e01914</fpage><pub-id pub-id-type="doi">10.7554/eLife.01914</pub-id></element-citation></ref><ref id="bib2"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bhatt</surname><given-names>S</given-names></name><name><surname>Holmes</surname><given-names>EC</given-names></name><name><surname>Pybus</surname><given-names>OG</given-names></name></person-group><year>2011</year><article-title>The genomic rate of molecular adaptation of the human influenza A virus</article-title><source>Molecular Biology and Evolution</source><volume>28</volume><fpage>2443</fpage><lpage>2451</lpage><pub-id pub-id-type="doi">10.1093/molbev/msr044</pub-id></element-citation></ref><ref id="bib3"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Brunet</surname><given-names>E</given-names></name><name><surname>Derrida</surname><given-names>B</given-names></name><name><surname>Mueller</surname><given-names>AH</given-names></name><name><surname>Munier</surname><given-names>S</given-names></name></person-group><year>2007</year><article-title>Effect of selection on ancestry: an exactly soluble case and its phenomenological generalization</article-title><source>Physical review E Statistical, nonlinear and soft matter physics</source><volume>76</volume><fpage>041104</fpage><pub-id pub-id-type="doi">10.1103/PhysRevE.76.041104</pub-id></element-citation></ref><ref id="bib4"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bush</surname><given-names>RM</given-names></name><name><surname>Bender</surname><given-names>CA</given-names></name><name><surname>Subbarao</surname><given-names>K</given-names></name><name><surname>Cox</surname><given-names>NJ</given-names></name><name><surname>Fitch</surname><given-names>WM</given-names></name></person-group><year>1999</year><article-title>Predicting the evolution of human influenza A</article-title><source>Science</source><volume>286</volume><fpage>1921</fpage><lpage>1925</lpage><pub-id pub-id-type="doi">10.1126/science.286.5446.1921</pub-id></element-citation></ref><ref id="bib5"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Cohen</surname><given-names>E</given-names></name><name><surname>Kessler</surname><given-names>DA</given-names></name><name><surname>Levine</surname><given-names>H</given-names></name></person-group><year>2005</year><article-title>Front propagation up a reaction rate gradient</article-title><source>Physical review. 
E, Statistical, nonlinear, and soft matter physics</source><volume>72</volume><fpage>066126</fpage><pub-id pub-id-type="doi">10.1103/PhysRevE.72.066126</pub-id></element-citation></ref><ref id="bib6"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Dayarian</surname><given-names>A</given-names></name><name><surname>Shraiman</surname><given-names>BI</given-names></name></person-group><year>2014</year><article-title>How to Infer Relative Fitness from a Sample of Genomic Sequences</article-title><source>Genetics</source><volume>113</volume><fpage>160986</fpage><pub-id pub-id-type="doi">10.1534/genetics.113.160986</pub-id></element-citation></ref><ref id="bib7"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Desai</surname><given-names>MM</given-names></name><name><surname>Fisher</surname><given-names>DS</given-names></name></person-group><year>2007</year><article-title>Beneficial mutation selection balance and the effect of linkage on positive selection</article-title><source>Genetics</source><volume>176</volume><fpage>1759</fpage><lpage>1798</lpage><pub-id pub-id-type="doi">10.1534/genetics.106.067678</pub-id></element-citation></ref><ref id="bib8"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Desai</surname><given-names>MM</given-names></name><name><surname>Walczak</surname><given-names>AM</given-names></name><name><surname>Fisher</surname><given-names>DS</given-names></name></person-group><year>2013</year><article-title>Genetic diversity and the structure of genealogies in rapidly adapting populations</article-title><source>Genetics</source><volume>193</volume><fpage>565</fpage><lpage>585</lpage><pub-id pub-id-type="doi">10.1534/genetics.112.147157</pub-id></element-citation></ref><ref id="bib9"><element-citation publication-type="book"><person-group 
person-group-type="author"><name><surname>Falconer</surname><given-names>DS</given-names></name><name><surname>Mackay</surname><given-names>TFC</given-names></name></person-group><year>1996</year><source>Introduction to quantitative genetics</source><publisher-name>Pearson</publisher-name></element-citation></ref><ref id="bib10"><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Felsenstein</surname><given-names>J</given-names></name></person-group><year>2003</year><source>Inferring Phylogenies</source><publisher-name>Sinauer associates</publisher-name><comment>ISBN 0878931775</comment></element-citation></ref><ref id="bib11"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gong</surname><given-names>LI</given-names></name><name><surname>Suchard</surname><given-names>MA</given-names></name><name><surname>Bloom</surname><given-names>JD</given-names></name></person-group><year>2013</year><article-title>Stability-mediated epistasis constrains the evolution of an influenza protein</article-title><source>eLife</source><volume>2</volume><fpage>e00631</fpage><pub-id pub-id-type="doi">10.7554/eLife.00631</pub-id></element-citation></ref><ref id="bib12"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Goyal</surname><given-names>S</given-names></name><name><surname>Balick</surname><given-names>DJ</given-names></name><name><surname>Jerison</surname><given-names>ER</given-names></name><name><surname>Neher</surname><given-names>RA</given-names></name><name><surname>Shraiman</surname><given-names>BI</given-names></name><name><surname>Desai</surname><given-names>MM</given-names></name></person-group><year>2012</year><article-title>Dynamic mutation-selection balance as an evolutionary attractor</article-title><source>Genetics</source><volume>191</volume><fpage>1309</fpage><lpage>1319</lpage><pub-id 
pub-id-type="doi">10.1534/genetics.112.141291</pub-id></element-citation></ref><ref id="bib13"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hallatschek</surname><given-names>O</given-names></name></person-group><year>2011</year><article-title>The noisy edge of traveling waves</article-title><source>Proceedings of the National Academy of Sciences of USA</source><volume>108</volume><fpage>1783</fpage><lpage>1787</lpage><pub-id pub-id-type="doi">10.1073/pnas.1013529108</pub-id></element-citation></ref><ref id="bib14"><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Hampson</surname><given-names>AW</given-names></name></person-group><year>2002</year><source>Influenza</source><person-group person-group-type="editor"><name><surname>Potter</surname><given-names>CW</given-names></name></person-group><publisher-loc>London</publisher-loc><publisher-name>Elsevier</publisher-name><fpage>49</fpage><lpage>85</lpage></element-citation></ref><ref id="bib15"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Koel</surname><given-names>BF</given-names></name><name><surname>Burke</surname><given-names>DF</given-names></name><name><surname>Bestebroer</surname><given-names>TM</given-names></name><name><surname>van der 
Vliet</surname><given-names>S</given-names></name><name><surname>Zondag</surname><given-names>GC</given-names></name><name><surname>Vervaet</surname><given-names>G</given-names></name><name><surname>Skepner</surname><given-names>E</given-names></name><name><surname>Lewis</surname><given-names>NS</given-names></name><name><surname>Spronken</surname><given-names>MI</given-names></name><name><surname>Russell</surname><given-names>CA</given-names></name><name><surname>Eropkin</surname><given-names>MY</given-names></name><name><surname>Hurt</surname><given-names>AC</given-names></name><name><surname>Barr</surname><given-names>IG</given-names></name><name><surname>de Jong</surname><given-names>JC</given-names></name><name><surname>Rimmelzwaan</surname><given-names>GF</given-names></name><name><surname>Osterhaus</surname><given-names>AD</given-names></name><name><surname>Fouchier</surname><given-names>RA</given-names></name><name><surname>Smith</surname><given-names>DJ</given-names></name></person-group><year>2013</year><article-title>Substitutions near the receptor binding site determine major antigenic change during influenza virus evolution</article-title><source>Science</source><volume>342</volume><fpage>976</fpage><lpage>979</lpage><pub-id pub-id-type="doi">10.1126/science.1244730</pub-id></element-citation></ref><ref id="bib16"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Koelle</surname><given-names>K</given-names></name><name><surname>Cobey</surname><given-names>S</given-names></name><name><surname>Grenfell</surname><given-names>B</given-names></name><name><surname>Pascual</surname><given-names>M</given-names></name></person-group><year>2006</year><article-title>Epochal evolution shapes the phylodynamics of interpandemic influenza A (H3N2) in humans</article-title><source>Science</source><volume>314</volume><fpage>1898</fpage><lpage>1903</lpage><pub-id 
pub-id-type="doi">10.1126/science.1132745</pub-id></element-citation></ref><ref id="bib17"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lemey</surname><given-names>P</given-names></name><name><surname>Rambaut</surname><given-names>A</given-names></name><name><surname>Bedford</surname><given-names>T</given-names></name><name><surname>Faria</surname><given-names>N</given-names></name><name><surname>Bielejec</surname><given-names>F</given-names></name><name><surname>Baele</surname><given-names>G</given-names></name><name><surname>Russell</surname><given-names>CA</given-names></name><name><surname>Smith</surname><given-names>DJ</given-names></name><name><surname>Pybus</surname><given-names>OG</given-names></name><name><surname>Brockmann</surname><given-names>D</given-names></name><name><surname>Suchard</surname><given-names>MA</given-names></name></person-group><year>2014</year><article-title>Unifying viral genetics and human transportation data to predict the global transmission dynamics of human influenza H3N2</article-title><source>PLOS Pathogens</source><volume>10</volume><fpage>e1003932</fpage><pub-id pub-id-type="doi">10.1371/journal.ppat.1003932</pub-id></element-citation></ref><ref id="bib18"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Łuksza</surname><given-names>M</given-names></name><name><surname>Lässig</surname><given-names>M</given-names></name></person-group><year>2014</year><article-title>A predictive fitness model for influenza</article-title><source>Nature</source><volume>507</volume><fpage>57</fpage><lpage>61</lpage><pub-id pub-id-type="doi">10.1038/nature13087</pub-id></element-citation></ref><ref id="bib19"><element-citation publication-type="book"><person-group 
person-group-type="author"><name><surname>Mézard</surname><given-names>M</given-names></name><name><surname>Montanari</surname><given-names>A</given-names></name></person-group><year>2009</year><source>Information, Physics, and Computation</source><publisher-name>Oxford University Press</publisher-name></element-citation></ref><ref id="bib20"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Neher</surname><given-names>RA</given-names></name></person-group><year>2013</year><article-title>Genetic Draft, Selective Interference, and Population Genetics of Rapid Adaptation</article-title><source>Annual Review of Ecology, Evolution, and Systematics</source><volume>44</volume><fpage>195</fpage><lpage>215</lpage><pub-id pub-id-type="doi">10.1146/annurev-ecolsys-110512-135920</pub-id></element-citation></ref><ref id="bib21"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Neher</surname><given-names>RA</given-names></name><name><surname>Hallatschek</surname><given-names>O</given-names></name></person-group><year>2013</year><article-title>Genealogies of rapidly adapting populations</article-title><source>Proceedings of the National Academy of Sciences of USA</source><volume>110</volume><fpage>437</fpage><lpage>442</lpage><pub-id pub-id-type="doi">10.1073/pnas.1213113110</pub-id></element-citation></ref><ref id="bib22"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Neher</surname><given-names>RA</given-names></name><name><surname>Shraiman</surname><given-names>BI</given-names></name></person-group><year>2011</year><article-title>Genetic draft and quasi-neutrality in large facultatively sexual populations</article-title><source>Genetics</source><volume>188</volume><fpage>975</fpage><lpage>996</lpage><pub-id pub-id-type="doi">10.1534/genetics.111.128876</pub-id></element-citation></ref><ref id="bib23"><element-citation 
publication-type="journal"><person-group person-group-type="author"><name><surname>Nelson</surname><given-names>MI</given-names></name><name><surname>Holmes</surname><given-names>EC</given-names></name></person-group><year>2007</year><article-title>The evolution of epidemic influenza</article-title><source>Nature Reviews Genetics</source><volume>8</volume><fpage>196</fpage><lpage>205</lpage><pub-id pub-id-type="doi">10.1038/nrg2053</pub-id></element-citation></ref><ref id="bib24"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Nimwegen</surname><given-names>EV</given-names></name><name><surname>Crutchfield</surname><given-names>JP</given-names></name><name><surname>Huynen</surname><given-names>M</given-names></name></person-group><year>1999</year><article-title>Neutral evolution of mutational robustness</article-title><source>Proceedings of the National Academy of Sciences of USA</source><volume>96</volume><fpage>9716</fpage><lpage>9720</lpage><pub-id pub-id-type="doi">10.1073/pnas.96.17.9716</pub-id></element-citation></ref><ref id="bib25"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Oliphant</surname><given-names>T</given-names></name></person-group><year>2007</year><article-title>Python for Scientific Computing</article-title><source>Computing in Science &amp; Engineering</source><volume>9</volume><fpage>10</fpage><lpage>20</lpage><pub-id pub-id-type="doi">10.1109/MCSE.2007.58</pub-id></element-citation></ref><ref id="bib26"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Plotkin</surname><given-names>JB</given-names></name><name><surname>Dushoff</surname><given-names>J</given-names></name><name><surname>Levin</surname><given-names>SA</given-names></name></person-group><year>2002</year><article-title>Hemagglutinin sequence clusters and the antigenic evolution of influenza A virus</article-title><source>Proceedings 
of the National Academy of Sciences of USA</source><volume>99</volume><fpage>6263</fpage><lpage>6268</lpage><pub-id pub-id-type="doi">10.1073/pnas.082110799</pub-id></element-citation></ref><ref id="bib27"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Price</surname><given-names>MN</given-names></name><name><surname>Dehal</surname><given-names>PS</given-names></name><name><surname>Arkin</surname><given-names>AP</given-names></name></person-group><year>2009</year><article-title>FastTree: computing large minimum evolution trees with profiles instead of a distance matrix</article-title><source>Molecular Biology and Evolution</source><volume>26</volume><fpage>1641</fpage><lpage>1650</lpage><pub-id pub-id-type="doi">10.1093/molbev/msp077</pub-id></element-citation></ref><ref id="bib28"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rouzine</surname><given-names>IM</given-names></name><name><surname>Coffin</surname><given-names>JM</given-names></name></person-group><year>2007</year><article-title>Highly fit ancestors of a partly sexual haploid population</article-title><source>Theoretical Population Biology</source><volume>71</volume><fpage>239</fpage><lpage>250</lpage><pub-id pub-id-type="doi">10.1016/j.tpb.2006.09.002</pub-id></element-citation></ref><ref id="bib29"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rouzine</surname><given-names>IM</given-names></name><name><surname>Wakeley</surname><given-names>J</given-names></name><name><surname>Coffin</surname><given-names>JM</given-names></name></person-group><year>2003</year><article-title>The solitary wave of asexual evolution</article-title><source>Proceedings of the National Academy of Sciences of USA</source><volume>100</volume><fpage>587</fpage><lpage>592</lpage><pub-id pub-id-type="doi">10.1073/pnas.242719299</pub-id></element-citation></ref><ref 
id="bib30"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Russell</surname><given-names>CA</given-names></name><name><surname>Jones</surname><given-names>TC</given-names></name><name><surname>Barr</surname><given-names>IG</given-names></name><name><surname>Cox</surname><given-names>NJ</given-names></name><name><surname>Garten</surname><given-names>RJ</given-names></name><name><surname>Gregory</surname><given-names>V</given-names></name><name><surname>Gust</surname><given-names>ID</given-names></name><name><surname>Hampson</surname><given-names>AW</given-names></name><name><surname>Hay</surname><given-names>AJ</given-names></name><name><surname>Hurt</surname><given-names>AC</given-names></name><name><surname>de Jong</surname><given-names>JC</given-names></name><name><surname>Kelso</surname><given-names>A</given-names></name><name><surname>Klimov</surname><given-names>AI</given-names></name><name><surname>Kageyama</surname><given-names>T</given-names></name><name><surname>Komadina</surname><given-names>N</given-names></name><name><surname>Lapedes</surname><given-names>AS</given-names></name><name><surname>Lin</surname><given-names>YP</given-names></name><name><surname>Mosterin</surname><given-names>A</given-names></name><name><surname>Obuchi</surname><given-names>M</given-names></name><name><surname>Odagiri</surname><given-names>T</given-names></name><name><surname>Osterhaus</surname><given-names>AD</given-names></name><name><surname>Rimmelzwaan</surname><given-names>GF</given-names></name><name><surname>Shaw</surname><given-names>MW</given-names></name><name><surname>Skepner</surname><given-names>E</given-names></name><name><surname>Stohr</surname><given-names>K</given-names></name><name><surname>Tashiro</surname><given-names>M</given-names></name><name><surname>Fouchier</surname><given-names>RA</given-names></name><name><surname>Smith</surname><given-names>DJ</given-names></name></person-group><year>2008</year>
<article-title>The global circulation of seasonal influenza A (H3N2) viruses</article-title><source>Science</source><volume>320</volume><fpage>340</fpage><pub-id pub-id-type="doi">10.1126/science.1154137</pub-id></element-citation></ref><ref id="bib31"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Shih</surname><given-names>AC</given-names></name><name><surname>Hsiao</surname><given-names>TC</given-names></name><name><surname>Ho</surname><given-names>MS</given-names></name><name><surname>Li</surname><given-names>WH</given-names></name></person-group><year>2007</year><article-title>Simultaneous amino acid substitutions at antigenic sites drive influenza A hemagglutinin evolution</article-title><source>Proceedings of the National Academy of Sciences of USA</source><volume>104</volume><fpage>6283</fpage><lpage>6288</lpage><pub-id pub-id-type="doi">10.1073/pnas.0701396104</pub-id></element-citation></ref><ref id="bib32"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Smith</surname><given-names>DJ</given-names></name><name><surname>Lapedes</surname><given-names>AS</given-names></name><name><surname>de Jong</surname><given-names>JC</given-names></name><name><surname>Bestebroer</surname><given-names>TM</given-names></name><name><surname>Rimmelzwaan</surname><given-names>GF</given-names></name><name><surname>Osterhaus</surname><given-names>AD</given-names></name><name><surname>Fouchier</surname><given-names>RA</given-names></name></person-group><year>2004</year><article-title>Mapping the antigenic and genetic evolution of influenza virus</article-title><source>Science</source><volume>305</volume><fpage>371</fpage><lpage>376</lpage><pub-id pub-id-type="doi">10.1126/science.1097211</pub-id></element-citation></ref><ref id="bib33"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Squires</surname><given-names>RB</given-names></name><name><surname>Noronha</surname><given-names>J</given-names></name><name><surname>Hunt</surname><given-names>V</given-names></name><name><surname>García-Sastre</surname><given-names>A</given-names></name><name><surname>Macken</surname><given-names>C</given-names></name><name><surname>Baumgarth</surname><given-names>N</given-names></name><name><surname>Suarez</surname><given-names>D</given-names></name><name><surname>Pickett</surname><given-names>BE</given-names></name><name><surname>Zhang</surname><given-names>Y</given-names></name><name><surname>Larsen</surname><given-names>CN</given-names></name><name><surname>Ramsey</surname><given-names>A</given-names></name><name><surname>Zhou</surname><given-names>L</given-names></name><name><surname>Zaremba</surname><given-names>S</given-names></name><name><surname>Kumar</surname><given-names>S</given-names></name><name><surname>Deitrich</surname><given-names>J</given-names></name><name><surname>Klem</surname><given-names>E</given-names></name><name><surname>Scheuermann</surname><given-names>RH</given-names></name></person-group><year>2012</year><article-title>Influenza research database: an integrated bioinformatics resource for influenza research and surveillance</article-title><source>Influenza and Other Respiratory Viruses</source><volume>6</volume><fpage>404</fpage><pub-id pub-id-type="doi">10.1111/j.1750-2659.2011.00331.x</pub-id></element-citation></ref><ref id="bib34"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Strelkowa</surname><given-names>N</given-names></name><name><surname>Lässig</surname><given-names>M</given-names></name></person-group><year>2012</year><article-title>Clonal interference in the evolution of influenza</article-title><source>Genetics</source><volume>192</volume><fpage>671</fpage><lpage>682</lpage><pub-id 
pub-id-type="doi">10.1534/genetics.112.143396</pub-id></element-citation></ref><ref id="bib35"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Tsimring</surname><given-names>L</given-names></name><name><surname>Levine</surname><given-names>H</given-names></name><name><surname>Kessler</surname><given-names>D</given-names></name></person-group><year>1996</year><article-title>RNA virus evolution via a fitness-space model</article-title><source>Physical Review Letters</source><volume>76</volume><fpage>4440</fpage><lpage>4443</lpage><pub-id pub-id-type="doi">10.1103/PhysRevLett.76.4440</pub-id></element-citation></ref><ref id="bib36"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wiley</surname><given-names>DC</given-names></name><name><surname>Wilson</surname><given-names>IA</given-names></name><name><surname>Skehel</surname><given-names>JJ</given-names></name></person-group><year>1981</year><article-title>Structural identification of the antibody-binding sites of Hong Kong influenza haemagglutinin and their involvement in antigenic variation</article-title><source>Nature</source><volume>289</volume><fpage>373</fpage><lpage>378</lpage><pub-id pub-id-type="doi">10.1038/289373a0</pub-id></element-citation></ref><ref id="bib37"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Zanini</surname><given-names>F</given-names></name><name><surname>Neher</surname><given-names>RA</given-names></name></person-group><year>2012</year><article-title>FFPopSim: an efficient forward simulation package for the evolution of large populations</article-title><source>Bioinformatics</source><volume>28</volume><fpage>3332</fpage><lpage>3333</lpage><pub-id pub-id-type="doi">10.1093/bioinformatics/bts633</pub-id></element-citation></ref></ref-list></back><sub-article article-type="article-commentary" id="SA1"><front-stub><article-id 
pub-id-type="doi">10.7554/eLife.03568.017</article-id><title-group><article-title>Decision letter</article-title></title-group><contrib-group content-type="section"><contrib contrib-type="editor"><name><surname>McVean</surname><given-names>Gil</given-names></name><role>Reviewing editor</role><aff><institution>Oxford University</institution>, <country>United Kingdom</country></aff></contrib></contrib-group></front-stub><body><boxed-text><p>eLife posts the editorial decision letter and author response on a selection of the published articles (subject to the approval of the authors). An edited version of the letter sent to the authors after peer review is shown, indicating the substantive concerns or comments; minor concerns are not usually shown. Reviewers have the opportunity to discuss the decision before the letter is sent (see <ext-link ext-link-type="uri" xlink:href="http://elifesciences.org/review-process">review process</ext-link>). Similarly, the author response typically shows only responses to the major concerns raised by the reviewers.</p></boxed-text><p>Thank you for sending your work entitled “Predicting evolution from the shape of genealogical trees” for consideration at <italic>eLife</italic>. Your article has been favorably evaluated by Chris Ponting (Senior editor) and 3 reviewers, one of whom is a member of our Board of Reviewing Editors.</p><p>The Reviewing editor and the other reviewers discussed their comments before we reached this decision, and the Reviewing editor has assembled the following comments to help you prepare a revised submission.</p><p>All reviewers agreed that inferring fitness from a phylogeny is an interesting and promising approach. 
The principal innovations were considered to be: (a) The likelihood function, which is an approximation to a birth-death process with variable (and evolving) birth rate; (b) the evaluation of the method through simulation; (c) the application to the influenza data sets, showing the information about fitness inherent in tree shape; (d) the comparison of the method to the recent work from Luksza and Lassig, which focused on key sites in key immunological proteins; and, (e) the elaboration of the method to include additional information about sites of known importance and longitudinal information within a year that can help spot growing clades.</p><p>Nevertheless, there was no clear consensus among the reviewers with regard to its suitability for publication and they would like to consider a revised version of the manuscript, which should not require much extra work but should adequately address the three comments and criticisms below.</p><p>1) The conclusion of many minor-effect mutations will need to be supported by better evidence or to be formulated differently. Whilst you claim that the results support the notion that a substantial fraction of adaptive evolution is through a quantitative (infinitesimal-style) response, it appears feasible that the primary driver of genealogical history/selection is indeed through HA/NM, but acts in a structured population. If so, and there is a semi-neutral accumulation of mutations along the branches, the genealogy becomes a proxy for the relative fitness of cryptic sub-populations within the species. The fact that the Luksza and Lassig results are so similar suggests that the two approaches take advantage of very similar information. The reviewers were also sceptical about the inclusion of Koel mutations into the prediction scheme. The sites of these mutations have been identified in a very recent publication through their importance for antigenic substitutions (<xref ref-type="bibr" rid="bib15">Koel et al., 2013</xref>). 
Hence, for most of the prediction period, they introduce posterior information into the prediction; this caveat should at least be noted. We ask you to consider including results for the model with temporal information (clade growth rates), but without the Koel term; this will quantify the contribution of that term to the full prediction. If the net contribution of the Koel term remains limited, the message of the paper might become stronger by making exactly that point. This is of relevance for influenza research because the current analysis focuses to a large extent on the identification of few large-effect antigenic changes. Finally, please comment further on the inference of small effects: it may be circular since it is an input assumption of your method.</p><p>2) Inadequate discussion of model assumptions and impact of data structure (temporal spread, spatial structure, sampling inhomogeneities). (a) The model contains a parameter lambda that scales the branch lengths of the coalescent tree. Is this a canonical scale parameter within the traveling wave theory or a heuristic extension? If so, it should be marked as such. (b) Similarly, is the form of the propagator for branches with Koel mutations, <xref ref-type="disp-formula" rid="equ2">eq. (2)</xref>, supported by the theory with heterogeneous effects or a heuristic? (c) The additional weighting of clade frequency changes (growth rates) rho_i via <xref ref-type="disp-formula" rid="equ3">eq. (3)</xref> appears to us foreign to the coalescent approach, which should predict this change rather than using it as a separate input. In other words, we would expect the frequency change term to be either redundant or to measure deviations from the coalescent model. The relationship between these two model components should be explained and quantified by a scatter plot showing coalescent-predicted (from the two-parameter model) vs. measured clade growth rates rho_i. 
(d) We noted that the best model for influenza, which includes the Koel reward and the clade growth term, uses four parameters; this should be made explicit in the main text. (e) Can one relate the fitness diffusion constant D to standard population-genetic parameters, such as the rate and effect distribution of beneficial mutations? Note that the mean coalescent time T2 and the speed of the fitness wave (via the average lifetimes of polymorphisms destined for fixation?) can at least roughly be estimated from the data. Are these estimates consistent with the model input parameters?</p><p>Furthermore: (f) What are the effects of sampling inhomogeneities on the tree, as they exist in the influenza case? If a given clade is oversampled, this may produce a spurious fitness signal in the authors' method. To what extent is the method robust to such fluctuations, and where are the limits of applicability? (g) What are the effects of the temporal spread of input data within a given year? Note that the influenza data are strains from a full year; they deviate from the model assumption of input at a given point in time. How strongly, i.e., how does this time interval compare to the other characteristic time scales of the problem? The effects of these inhomogeneities on model predictions can best be assessed by simulations of the kind already performed in this work and described in the Results section. (h) It may be worth stressing that the prediction scheme works for a limited time into the future. This time can probably be estimated from the model parameters (fitness diffusion constant...). It would be interesting to quote this prediction time for influenza.</p><p>3) Whilst the motivation of the paper is intuitive and the paper easy to understand for non-experts, this simplicity masked many assumptions of the model, which was considered problematic. 
For example, the Methods section should explain more fully the inference algorithm (Neher and Hallatschek), especially its critical assumptions. This is important because the predictions on simulated data can be rather bad or even misleading (see <xref ref-type="fig" rid="fig2">Figure 2B</xref>, low mutation rates), and it will be important to assess its applicability to other data. In addition, in deriving <xref ref-type="disp-formula" rid="equ10">equation 10</xref>, it seems that there is an assumption of small fitness effects (x approx 0), such that (1-x) phi^2 = phi^2. On the other hand, you state that for large enough x, phi = x. Then you plot fitnesses in the range (-4 sigma, 4 sigma) (<xref ref-type="fig" rid="fig4">Figure 4</xref>). Does this put a very stringent restriction on sigma^2? It is possible that there is a mathematically solid justification for this, but you should make it easier for the reader to understand. Finally, in the methods (and as discussed above under point 2), you also briefly comment on how your assumption of “non-sampling” affects fitness estimates. What does this mean for the sample sizes in the case of influenza? Is it necessary to use a small sampling fraction to obtain a reliable prediction? Also, would sampling bias have a large effect on the predictions? 
If a certain clade is over-represented in the sample, the algorithm would infer higher fitness for that clade, correct?</p></body></sub-article><sub-article article-type="reply" id="SA2"><front-stub><article-id pub-id-type="doi">10.7554/eLife.03568.018</article-id><title-group><article-title>Author response</article-title></title-group></front-stub><body><p>The main points raised during the review were (i) the suggestion that small effect mutations are important for influenza A/H3N2 adaptation, and (ii) model assumptions, parameter dependence, and lack of intuition for the inference algorithm.</p><p>To address the first point, we examined the dependence of predictive power on the number of mutations contributing to fitness variation and included this as a supplement to <xref ref-type="fig" rid="fig2">Figure 2</xref>. While predictive power increases when more mutations with smaller effects contribute, predictability is retained down to a relatively small number of mutations contributing to fitness differentials. Prediction capacity is retained because Selection-biased Diffusion, which describes fitness dynamics along lineages in the “infinitesimal” limit (of many small effect mutations), holds approximately even in the case when only a few mutations contribute.</p><p>Predictability alone is therefore not an unequivocal argument for many small effect mutations. Nevertheless, predictability requires that the population has persistent fitness variation distributed over a number of loci, in effect behaving closer to the infinitesimal case than to the case of periodic sweeps. We have adapted the manuscript to reflect this insight.</p><p>To address the concerns regarding model assumptions, parameter fitting, and the lack of intuitive insight into the inference algorithm, we now discuss in greater detail the analytic result that downstream tree length is the most important determinant of fitness of young nodes. 
We extended this argument to all nodes on the tree and introduced a local branching index (LBI) defined as the length of the tree surrounding a node in its neighborhood (implemented as exponential weighting). Local tree length increases with branching, which in turn is indicative of high fitness. This intuitive connection captures the essence of the fitness inference algorithm. Moreover, the LBI predicts the progenitor lineages almost as well as the full probabilistic fitness inference when applied to simulation data. We used simulation data to fix the only free parameter of the LBI – the size of the neighborhood – and applied it to influenza without any fitting parameters whatsoever. The predictions obtained this way are almost as good as those obtained previously with the full fitness inference. Since the latter required choosing at least 2 parameters, we feel that the LBI is superior in practice even though it misses one year where the fitness inference did well (1996: there is little data in this early year anyway).</p><p>To accommodate these improvements, we have included a section on the LBI and redid the influenza analysis using the LBI to rank isolates. All conclusions remain the same, but the nature of fitness inference and progenitor prediction have become much more transparent through the identification of the local tree length as the essence of fitness prediction. We have streamlined the manuscript and removed the discussion of the gamma parameter (no longer needed when using the simpler predictor) and no longer show the results including the Koel mutations (which didn't add much to the predictions anyway).</p><p>We included one additional year (2013) of influenza predictions, as data to evaluate this prediction have become available since our initial submission (both the full fitness inference and the LBI predict this year well). In addition, we have further streamlined and commented the code that has been deposited on GitHub. 
We feel that these revisions have greatly improved our manuscript. We provide a point by point response to all referee comments below.</p><p><italic>1) The conclusion of many minor-effect mutations will need to be supported by better evidence or to be formulated differently. Whilst you claim that the results support the notion that a substantial fraction of adaptive evolution is through a quantitative (infinitesimal-style) response, it appears feasible that the primary driver of genealogical history/selection is indeed through HA/NM, but acts in a structured population. If so, and there is a semi-neutral accumulation of mutations along the branches, the genealogy becomes a proxy for the relative fitness of cryptic sub-populations within the species</italic>.</p><p>In the previous version of the manuscript, we stated that predictability of influenza using a model assuming many small effect mutations suggests that influenza A/H3N2 evolution is in part dominated by such dynamics. We now quantify, using simulations, how predictability varies with the number of segregating mutations in the sample (<xref ref-type="fig" rid="fig2s1">Figure 2–figure supplement 1</xref>). Good predictions require several segregating mutations, but some predictability is retained down to low numbers. Hence, we cannot provide a reliable lower bound on the number of mutations contributing to fitness in any given year and we have reworded the relevant parts of the text to stress that our ability to make meaningful predictions stems from persistent variations in virus fitness.</p><p>If we understand the alternative scenario correctly, the structured population would merely reduce competition between different lineages on time scales shorter than one year (H3N2 viruses do not persist locally between epidemics). Without persistent heritable differences between strains, we do not see how it would be possible for our method to make meaningful predictions. 
Furthermore, the 2-fold enrichment of nonsynonymous mutations at 50 epitope positions on branches with high inferred Delta fitness suggests that our algorithm picks up a genetic signature.</p><p><italic>The fact that the Luksza and Lassig results are so similar suggests that the two approaches take advantage of very similar information</italic>.</p><p>Luksza and Laessig include an ad hoc predictor, meant to account for nonlinear or epistatic effects, that counts the number of synonymous mutations below an internal node. This component of their model accounts for a substantial fraction of the predictive power. Incidentally, the number of synonymous mutations is a proxy for the total branch length below a node, which is intimately connected to the inferred fitness of internal nodes. The addition of the local branching index makes this essential feature of fitness prediction explicit.</p><p><italic>The reviewers were also sceptical about the inclusion of Koel mutations into the prediction scheme. The sites of these mutations have been identified in a very recent publication through their importance for antigenic substitutions (</italic><xref ref-type="bibr" rid="bib15"><italic>Koel et al., 2013</italic></xref><italic>). Hence, for most of the prediction period, they introduce posterior information into the prediction; this caveat should at least be noted. We ask you to consider including results for the model with temporal information (clade growth rates), but without the Koel term; this will quantify the contribution of that term to the full prediction. If the net contribution of the Koel term remains limited, the message of the paper might become stronger by making exactly that point. This is of relevance for influenza research because the current analysis focuses to a large extent on the identification of few large-effect antigenic changes</italic>.</p><p>We agree that inclusion of the Koel mutations introduces after-the-fact information into our prediction. 
The predictive power of the Koel mutations is limited, and the improvement is restricted to a few years. In some years, predictions become worse because Koel mutations appear multiple times on the tree and are mostly false leads. We did not claim Koel mutations to be predictive, but merely sought to show how flu-specific knowledge can be combined with our general inference scheme. We now clearly state that within our framework the predictive power gained from looking at the occurrence of Koel mutations is not worth the extra parameter needed to include them (and have removed the predictions utilizing them). We still discuss the Koel mutations as potential large-effect mutations beyond the scope of our model.</p><p><italic>Finally, please comment further on the inference of small effects: it may be circular since it is an input assumption of your method</italic>.</p><p>As pointed out above, we now emphasize that persistent fitness variation is a prerequisite for prediction but that we cannot put a lower bound on the number of mutations contributing to fitness. We added a supplementary figure showing how predictive capacity depends on the genetic diversity in the sample (<xref ref-type="fig" rid="fig2s1">Figure 2–figure supplement 1</xref>).</p><p><italic>2) Inadequate discussion of model assumptions and impact of data structure (temporal spread, spatial structure, sampling inhomogeneities). (a) The model contains a parameter lambda that scales the branch lengths of the coalescent tree. Is this a canonical scale parameter within the traveling wave theory or a heuristic extension? If so, it should be marked as such</italic>.</p><p>The parameter gamma has become obsolete since we now use the simpler algorithm (the local branching index) to rank influenza sequences. Within the selection-biased diffusion model, the conversion of branch length to time is fixed by theory and gamma=1. 
The LBI has a single phenomenological parameter (setting the local scale), which we choose by comparing the LBI to the full inference algorithm on simulated data.</p><p><italic>(b) Similarly, is the form of the propagator for branches with Koel mutations,</italic> <xref ref-type="disp-formula" rid="equ2"><italic>eq. (2)</italic></xref><italic>, supported by the theory with heterogeneous effects or a heuristic</italic>?</p><p>The parameter for branches with Koel mutations was ad hoc. It could be made more principled, but this would involve an additional integration over the possible times at which this mutation could have arisen. Given that the value of the Koel mutations for prediction seems limited, we have removed this altogether.</p><p><italic>(c) The additional weighing of clade frequency changes (growth rates) rho_i via</italic> <xref ref-type="disp-formula" rid="equ3"><italic>eq. (3)</italic></xref> <italic>appears to us foreign to the coalescent approach, which should predict this change rather than using it as a separate input. In other words, we would expect the frequency change term to be either redundant or to measure deviations from the coalescent model. The relationship between these two model components should be explained and quantified by a scatter plot showing coalescent-predicted (from the two-parameter model) vs. measured clade growth rates rho_i</italic>.</p><p>The fitness inference or LBI can indeed be used to predict clade expansion into the next season. We included a figure supplement to <xref ref-type="fig" rid="fig4">Figure 4</xref> (formerly <xref ref-type="fig" rid="fig3">Figure 3</xref>) that shows how highly ranked clades tend to expand.</p><p>We have also included a comparison of our predictions with predictions based solely on growth rate and another naive predictor based on ladderization of the tree. 
Neither of these predictors comes close to the predictive power of our approach, likely because both of them are highly sensitive to sampling biases while our approach is much less so.</p><p><italic>(d) We noted that the best model for influenza, which includes the Koel reward and the clade growth term, uses four parameters; this should be made explicit in the main text</italic>.</p><p>We no longer use this model as the improvement of the predictions did not warrant the two extra parameters. We explicitly discuss the trade-off between improved predictions and additional parameters. Our simplified local branching index ranking has only a single parameter (the neighbourhood size), which we choose based on simulated data.</p><p><italic>(e) Can one relate the fitness diffusion constant D to standard population-genetic parameters, such as the rate and effect distribution of beneficial mutations? Note that the mean coalescent time T2 and the speed of the fitness wave (via the average lifetimes of polymorphisms destined for fixation?) can at least roughly be estimated from the data. Are these estimates consistent with the model input parameters</italic>?</p><p>Yes, the fitness diffusion constant has a straightforward interpretation in terms of mutation rates and fitness effects. D is half the product of the mutation rate and the second moment of the effect size of mutations. Since time is measured in units of 1/sigma, both the mutation rate and the effect size are also measured in units of sigma. We now explicitly state this. In units of sigma, Gamma = D/sigma^3 is proportional to 1/sqrt(log N), which varies only slowly.</p><p><italic>Furthermore: (f) What are the effects of sampling inhomogeneities on the tree, as they exist in the influenza case? If a given clade is oversampled, this may produce a spurious fitness signal in the authors' method. 
To what extent is the method robust to such fluctuations, and where are the limits of applicability</italic>?</p><p>Sampling can affect the fitness inference in exactly the direction suggested. But as we now discuss, our algorithm and the LBI are fairly insensitive to sampling biases. Since the algorithm senses the total length of subtrees, local oversampling and the addition of many similar sequences (with short branches) have little impact on our inferences. We tried to avoid sampling biases by using at most 100 sequences from Asia or North America – Asia being the primary source region for seasonal H3N2 viruses and North America being a sink region. More sophisticated corrections for sampling biases (e.g., by factoring in surveillance efforts in different countries) could be envisioned, but require additional data currently not available to us.</p><p><italic>(g) What are the effects of the temporal spread of input data within a given year? Note that the influenza data are strains from a full year; they deviate from the model assumption of input at a given point in time. How strongly? i.e., how does this time interval compare to the other characteristic time scales of the problem? The effects of these inhomogeneities on model predictions can best be assessed by simulations of the kind already performed in this work and described in the results Section</italic>.</p><p>We had already tested the effect of continuous sampling in simulation data. The results are now included as <xref ref-type="fig" rid="fig2s2">Figure 2–figure supplement 2</xref> and are very similar to data sampled from only one time point. Our influenza sequences come from a 10-month interval from May to February.</p><p><italic>(h) It may be worth stressing that the prediction scheme works for a limited time into the future. This time can probably be estimated from the model parameters (fitness diffusion constant ...). 
It would be interesting to quote this prediction time for influenza</italic>.</p><p>We now clearly state that the scope of our method is predicting a progenitor of the future, rather than predicting future evolution. The progenitor lineage is independent of the time horizon on which the prediction is evaluated. On very short time scales (a few months), predicting clade growth rather than progenitors might be more appropriate.</p><p><italic>3) Whilst the motivation of the paper is intuitive and the paper easy to understand for non-experts, this simplicity masked many assumptions of the model which was considered problematic. For example, the Methods section should explain more fully the inference algorithm (Neher and Hallatschek), especially its critical assumptions. This is important because the predictions on simulated data can be rather bad or even misleading (see</italic> <xref ref-type="fig" rid="fig2"><italic>figure 2B</italic></xref> <italic>low mutation rates), and it will be important to assess its applicability to other data</italic>.</p><p>We have extended the Methods section and included relevant details and assumptions. The lower quality of predictions at low mutation rates stems from the fact that fitness diversity in the population depends on few mutations. The resulting granularity of the fitness distribution (and the large effect size of mutations) makes prediction difficult. We have extended the discussion of the toy data results and included a figure supplement that explicitly shows the quality of fitness inference as a function of the genetic diversity in the sample. Furthermore, we point out the crucial assumptions when deriving the fitness inference algorithm.</p><p><italic>In addition, in deriving</italic> <xref ref-type="disp-formula" rid="equ10"><italic>equation 10</italic></xref><italic>, it seems that there is an assumption of small fitness effects (x approx 0), such that (1-x) phi^2 = phi^2. 
On the other hand, you state that for large enough x, phi = x. Then you plot fitnesses in the range (-4 sigma, 4 sigma) (</italic><xref ref-type="fig" rid="fig4"><italic>Figure 4</italic></xref><italic>). Does this put a very stringent restriction on sigma^2? It is possible that there is a mathematically solid justification for this, but you should make it easier for the reader to understand</italic>.</p><p>We do indeed assume small x, i.e., that fitness differences in one generation are much smaller than 1. Influenza lineage turnover happens on the scale of 1 to 2 years, suggesting that the fitness differential in one generation (a few days) is on the order of a few percent. Even if this assumption is not justified, the qualitative behavior of all equations remains unchanged. The high-fitness tail of phi changes from phi∼x to phi∼x/(1+x). We now point out the constraints on the strength of selection, i.e., sigma, more carefully.</p><p><italic>Finally, in the methods (and as discussed above under point 2.), you also briefly comment on how your assumption of “non-sampling” affects fitness estimates. What does this mean for the sample sizes in the case of influenza? Is it necessary to use a small sampling fraction to obtain a reliable prediction? Also, would sampling bias have a large effect on the predictions? If a certain clade is over-represented in the sample, the algorithm would infer higher fitness for that clade</italic>, <italic>correct?</italic></p><p>The results depend only very weakly (square root of a logarithm) on the sampling size entering the non-sampling factor. Increasing omega (the assumed sampling fraction) by a large factor pushes all fitness estimates down, but has little effect on the relative ranking. When chosen correctly, the fitness estimates of terminal nodes should scatter around 0, consistent with the population distribution. 
For the simulated data, we know how to set omega; for the influenza data, we now use the LBI to rank sequences, which does not depend on the sampling fraction. As discussed above, our algorithm is fairly insensitive to sampling bias, and we try to avoid sampling bias by downsampling the data to similar numbers of sequences from different geographic regions.</p></body></sub-article></article> |