<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.1d1 20130915//EN" "JATS-archivearticle1.dtd"><article article-type="research-article" dtd-version="1.1d1" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><front><journal-meta><journal-id journal-id-type="nlm-ta">elife</journal-id><journal-id journal-id-type="hwp">eLife</journal-id><journal-id journal-id-type="publisher-id">eLife</journal-id><journal-title-group><journal-title>eLife</journal-title></journal-title-group><issn publication-format="electronic">2050-084X</issn><publisher><publisher-name>eLife Sciences Publications, Ltd</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="publisher-id">00961</article-id><article-id pub-id-type="doi">10.7554/eLife.00961</article-id><article-categories><subj-group subj-group-type="display-channel"><subject>Research article</subject></subj-group><subj-group subj-group-type="heading"><subject>Plant biology</subject></subj-group></article-categories><title-group><article-title>Phenotypic landscape inference reveals multiple evolutionary paths to C<sub>4</sub> photosynthesis</article-title></title-group><contrib-group><contrib contrib-type="author" equal-contrib="yes" id="author-5662"><name><surname>Williams</surname><given-names>Ben P</given-names></name><xref ref-type="aff" rid="aff1"/><xref ref-type="fn" rid="equal-contrib">†</xref><xref ref-type="other" rid="par-1"/><xref ref-type="fn" rid="con1"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" equal-contrib="yes" id="author-5663"><name><surname>Johnston</surname><given-names>Iain G</given-names></name><xref ref-type="aff" rid="aff2"/><xref ref-type="fn" rid="equal-contrib">†</xref><xref ref-type="other" rid="par-1"/><xref ref-type="fn" rid="con2"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" 
id="author-5664"><name><surname>Covshoff</surname><given-names>Sarah</given-names></name><xref ref-type="aff" rid="aff1"/><xref ref-type="other" rid="par-2"/><xref ref-type="fn" rid="con3"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" corresp="yes" id="author-5389"><name><surname>Hibberd</surname><given-names>Julian M</given-names></name><xref ref-type="aff" rid="aff1"/><xref ref-type="corresp" rid="cor1">*</xref><xref ref-type="other" rid="par-1"/><xref ref-type="fn" rid="con4"/><xref ref-type="fn" rid="conf1"/></contrib><aff id="aff1"><institution content-type="dept">Department of Plant Sciences</institution>, <institution>University of Cambridge</institution>, <addr-line><named-content content-type="city">Cambridge</named-content></addr-line>, <country>United Kingdom</country></aff><aff id="aff2"><institution content-type="dept">Department of Mathematics</institution>, <institution>Imperial College London</institution>, <addr-line><named-content content-type="city">London</named-content></addr-line>, <country>United Kingdom</country></aff></contrib-group><contrib-group content-type="section"><contrib contrib-type="editor"><name><surname>Bergmann</surname><given-names>Dominique</given-names></name><role>Reviewing editor</role><aff><institution>Stanford University</institution>, <country>United States</country></aff></contrib></contrib-group><author-notes><corresp id="cor1"><label>*</label>For correspondence: <email>Julian.Hibberd@plantsci.cam.ac.uk</email></corresp><fn fn-type="con" id="equal-contrib"><label>†</label><p>These authors contributed equally to this work</p></fn></author-notes><pub-date date-type="pub" publication-format="electronic"><day>28</day><month>09</month><year>2013</year></pub-date><pub-date pub-type="collection"><year>2013</year></pub-date><volume>2</volume><elocation-id>e00961</elocation-id><history><date date-type="received"><day>20</day><month>05</month><year>2013</year></date><date 
date-type="accepted"><day>05</day><month>08</month><year>2013</year></date></history><permissions><copyright-statement>© 2013, Williams et al</copyright-statement><copyright-year>2013</copyright-year><copyright-holder>Williams et al</copyright-holder><license xlink:href="http://creativecommons.org/licenses/by/3.0/"><license-p>This article is distributed under the terms of the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/3.0/">Creative Commons Attribution License</ext-link>, which permits unrestricted use and redistribution provided that the original author and source are credited.</license-p></license></permissions><self-uri content-type="pdf" xlink:href="elife00961.pdf"/><related-article ext-link-type="doi" id="ra1" related-article-type="commentary" xlink:href="10.7554/eLife.01403"/><abstract><object-id pub-id-type="doi">10.7554/eLife.00961.001</object-id><p>C<sub>4</sub> photosynthesis has independently evolved from the ancestral C<sub>3</sub> pathway in at least 60 plant lineages, but, as with other complex traits, how it evolved is unclear. Here we show that the polyphyletic appearance of C<sub>4</sub> photosynthesis is associated with diverse and flexible evolutionary paths that group into four major trajectories. We conducted a meta-analysis of 18 lineages containing species that use C<sub>3</sub>, C<sub>4</sub>, or intermediate C<sub>3</sub>–C<sub>4</sub> forms of photosynthesis to parameterise a 16-dimensional phenotypic landscape. We then developed and experimentally verified a novel Bayesian approach based on a hidden Markov model that predicts how the C<sub>4</sub> phenotype evolved. The alternative evolutionary histories underlying the appearance of C<sub>4</sub> photosynthesis were determined by ancestral lineage and initial phenotypic alterations unrelated to photosynthesis. We conclude that the order of C<sub>4</sub> trait acquisition is flexible and driven by non-photosynthetic drivers. 
This flexibility will have facilitated the convergent evolution of this complex trait.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00961.001">http://dx.doi.org/10.7554/eLife.00961.001</ext-link></p></abstract><abstract abstract-type="executive-summary"><object-id pub-id-type="doi">10.7554/eLife.00961.002</object-id><title>eLife digest</title><p>Plants rely on carbon for their growth and survival: in a process called photosynthesis, they use energy from sunlight to convert carbon dioxide and water into carbohydrates and oxygen gas. The chemical reactions that make up photosynthesis are powered by a chain of enzymes, and plants must ensure that these enzymes—which are in the leaves of the plant—are supplied with enough carbon dioxide and water. Carbon dioxide from the atmosphere enters plants through pores in their leaves, but water must be carried up the plant from the roots.</p><p>The type of photosynthesis used by about 90% of flowering plant species—including tomatoes and rice—is called C<sub>3</sub> photosynthesis. The first step in this process begins with an enzyme called RuBisCO, which reacts with carbon dioxide and a substance called RuBP to form molecules that contain three carbon atoms (hence the name C<sub>3</sub> photosynthesis).</p><p>In a hot climate, however, a plant can lose a lot of water through the pores in its leaves: closing these pores allows the plant to retain water, but this also reduces the supply of carbon dioxide. Under these circumstances this causes problems because RuBisCO uses oxygen to break down RuBP, instead of creating sugars, when carbon dioxide is not readily available. To prevent this process, which wastes a lot of energy and resources, some plants—including maize, sugar cane and many other agricultural staples—have evolved an alternative process called C<sub>4</sub> photosynthesis. 
Although it is more complex than C<sub>3</sub> photosynthesis, and required many changes to be made to the structure of leaves, C<sub>4</sub> photosynthesis has evolved on more than 60 different occasions.</p><p>In C<sub>4</sub> plants, the mesophyll—the region that is associated with the capture of carbon dioxide by RuBisCO in C<sub>3</sub> plants—contains high levels of an alternative enzyme called PEPC that converts carbon dioxide molecules into an acid that contains four carbon atoms. To avoid carbon dioxide being captured by both enzymes, C<sub>4</sub> plants evolved to relocate RuBisCO from the mesophyll to a second set of cells in an airtight structure known as the bundle sheath. The four-carbon acids produced by PEPC diffuse to the cells in the bundle sheath, where they are broken down into carbon dioxide molecules, and photosynthesis then proceeds as normal. This process allows photosynthesis to continue when the level of carbon dioxide in the leaf is low because the plant has closed its pores to retain water.</p><p>Since C<sub>4</sub> plants grow faster than C<sub>3</sub> plants, and also require less water, plant biologists would like to introduce certain C<sub>4</sub> traits into C<sub>3</sub> crop plants. To help with this process, Williams, Johnston et al. have used computational methods to explore how C<sub>4</sub> photosynthesis evolved from ancestral C<sub>3</sub> plants. This involved investigating the prevalence of 16 traits that are common to C<sub>4</sub> plants in a total of 73 species that undergo C<sub>3</sub> or C<sub>4</sub> photosynthesis (including 37 species that possess characteristics of both C<sub>3</sub> and C<sub>4</sub>).</p><p>Williams, Johnston et al. then went on to produce a new mathematical model that represents evolutionary processes as pathways across a multi-dimensional “landscape”. 
The model shows that traits can be acquired in various orders, and that C<sub>4</sub> photosynthesis evolved through a number of independent pathways. Some traits that evolved early in the transitions to C<sub>4</sub> photosynthesis influenced how evolution proceeded, providing “foundations” upon which further changes evolved.</p><p>Interestingly, the structure of the leaf itself appeared to change before any of the photosynthetic enzymes changed. This led Williams, Johnston et al. to conclude that climate change—in particular, the declines in carbon dioxide levels that occurred in prehistoric times—was probably not responsible for the original evolution of C<sub>4</sub> photosynthesis. Nevertheless, these results could help with efforts to adapt important C<sub>3</sub> crop plants to on-going changes in our climate.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00961.002">http://dx.doi.org/10.7554/eLife.00961.002</ext-link></p></abstract><kwd-group kwd-group-type="author-keywords"><title>Author keywords</title><kwd>convergent evolution</kwd><kwd>C<sub>4</sub> photosynthesis</kwd><kwd>Bayesian model</kwd></kwd-group><kwd-group kwd-group-type="research-organism"><title>Research organism</title><kwd>Other</kwd></kwd-group><funding-group><award-group id="par-1"><funding-source><institution-wrap><institution>Biotechnology and Biological Sciences Research Council</institution></institution-wrap></funding-source><principal-award-recipient><name><surname>Williams</surname><given-names>Ben P</given-names></name><name><surname>Johnston</surname><given-names>Iain G</given-names></name><name><surname>Hibberd</surname><given-names>Julian M</given-names></name></principal-award-recipient></award-group><award-group id="par-2"><funding-source><institution-wrap><institution>International Rice Research 
Institute</institution></institution-wrap></funding-source><principal-award-recipient><name><surname>Covshoff</surname><given-names>Sarah</given-names></name></principal-award-recipient></award-group><funding-statement>The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.</funding-statement></funding-group><custom-meta-group><custom-meta><meta-name>elife-xml-version</meta-name><meta-value>2</meta-value></custom-meta><custom-meta specific-use="meta-only"><meta-name>Author impact statement</meta-name><meta-value>Computational modelling indicates that C<sub>4</sub> photosynthesis in distantly related plant species arose through a number of independent evolutionary paths.</meta-value></custom-meta></custom-meta-group></article-meta></front><body><sec id="s1" sec-type="intro"><title>Introduction</title><p>The convergent evolution of complex traits is surprisingly common, with examples including camera-like eyes of cephalopods, vertebrates, and cnidaria (<xref ref-type="bibr" rid="bib42">Kozmik et al., 2008</xref>), mimicry in invertebrates and vertebrates (<xref ref-type="bibr" rid="bib69">Santos et al., 2003</xref>; <xref ref-type="bibr" rid="bib85">Wilson et al., 2012</xref>) and the different photosynthetic machineries of plants (<xref ref-type="bibr" rid="bib66">Sage et al., 2011a</xref>). While the polyphyletic origin of simple traits (<xref ref-type="bibr" rid="bib30">Hill et al., 2006</xref>; <xref ref-type="bibr" rid="bib73">Steiner et al., 2009</xref>) is underpinned by flexibility in the underlying molecular mechanisms, the extent to which this applies to complex traits is less clear. C<sub>4</sub> photosynthesis is both highly complex, involving alterations to leaf anatomy, cellular ultrastructure, and photosynthetic metabolism, and also convergent, being found in at least 60 independent lineages of angiosperms (<xref ref-type="bibr" rid="bib66">Sage et al., 2011a</xref>). 
As the emergence of the entire C<sub>4</sub> phenotype cannot be comprehensively explored experimentally, C<sub>4</sub> photosynthesis is an ideal system for the mathematical modelling of complex trait evolution as transitions on an underlying phenotype landscape. Furthermore, understanding the evolutionary events that have generated C<sub>4</sub> photosynthesis on many independent occasions has the potential to inform approaches being undertaken to engineer C<sub>4</sub> photosynthesis into C<sub>3</sub> crop species (<xref ref-type="bibr" rid="bib29">Hibberd et al., 2008</xref>).</p><p>The C<sub>4</sub> pathway is estimated to have first evolved between 32 and 25 million years ago (<xref ref-type="bibr" rid="bib12">Christin et al., 2011b</xref>) in response to multiple ecological drivers, including decreasing atmospheric CO<sub>2</sub> concentration (<xref ref-type="bibr" rid="bib76">Vicentini et al., 2008</xref>). C<sub>4</sub> species have since radiated to represent the most productive crops and native vegetation on the planet because modifications to their leaves increase the efficiency of photosynthesis in the sub-tropics and tropics (<xref ref-type="bibr" rid="bib19">Edwards et al., 2010</xref>). In C<sub>4</sub> plants, photosynthetic efficiency is improved compared with C<sub>3</sub> species because significant alterations to leaf anatomy, cell biology and biochemistry lead to higher concentrations of CO<sub>2</sub> around the primary carboxylase RuBisCO (<xref ref-type="bibr" rid="bib71">Slack and Hatch, 1967</xref>; <xref ref-type="bibr" rid="bib47">Langdale, 2011</xref>). The morphology of C<sub>4</sub> leaves is typically modified into so-called Kranz anatomy that consists of repeating units of vein, bundle sheath (BS) and mesophyll (M) cells (<xref ref-type="bibr" rid="bib26">Hattersley, 1984</xref>; <xref ref-type="bibr" rid="bib47">Langdale, 2011</xref>) (<xref ref-type="fig" rid="fig1s1">Figure 1—figure supplement 1</xref>). 
Photosynthetic metabolism becomes modified and compartmentalised between the M and BS, with M cells lacking RuBisCO but instead containing high activities of the alternate carboxylase PEPC to generate C<sub>4</sub> acids. The diffusion of these acids followed by their decarboxylation in BS cells around RuBisCO increases CO<sub>2</sub> supply and therefore photosynthetic efficiency (<xref ref-type="bibr" rid="bib89">Zhu et al., 2008</xref>). C<sub>4</sub> acids are decarboxylated by at least one of three enzymes within BS cells: NADP- or NAD-dependent malic enzymes (NADP-ME or NAD-ME respectively), or phospho<italic>enol</italic>pyruvate carboxykinase (PCK) (<xref ref-type="bibr" rid="bib25">Hatch et al., 1975</xref>). Specific lineages of C<sub>4</sub> species have typically been classified into one of three sub-types, based on the activity of these decarboxylases, as well as anatomical and cellular traits that consistently correlate with each other (<xref ref-type="bibr" rid="bib20">Furbank, 2011</xref>).</p><p>The genetic mechanisms underlying the evolution of cell-specific gene expression associated with the separation of photosynthetic metabolism between M and BS cells involve both alterations to <italic>cis</italic>-elements and <italic>trans</italic>-acting factors (<xref ref-type="bibr" rid="bib3">Akyildiz et al., 2007</xref>; <xref ref-type="bibr" rid="bib6">Brown et al., 2011</xref>; <xref ref-type="bibr" rid="bib36">Kajala et al., 2012</xref>; <xref ref-type="bibr" rid="bib84">Williams et al., 2012</xref>). Phylogenetically independent lineages of C<sub>4</sub> plants have co-opted homologous mechanisms to generate cell specificity (<xref ref-type="bibr" rid="bib6">Brown et al., 2011</xref>) as well as the altered allosteric regulation of C<sub>4</sub> enzymes (<xref ref-type="bibr" rid="bib14">Christin et al., 2007</xref>) indicating that parallel evolution underpins at least part of the convergent C<sub>4</sub> syndrome. 
However, while a substantial amount of work has addressed the molecular alterations that generate the biochemical differences between C<sub>3</sub> and C<sub>4</sub> plants (<xref ref-type="bibr" rid="bib84">Williams et al., 2012</xref>) much less is known about the order and flexibility with which phenotypic traits important for C<sub>4</sub> photosynthesis are acquired (<xref ref-type="bibr" rid="bib67">Sage et al., 2012</xref>). Clues to this question exist in the form of C<sub>3</sub>–C<sub>4</sub> intermediates, species exhibiting characteristics of both C<sub>3</sub> and C<sub>4</sub> photosynthesis, such as the activity or localisation of C<sub>4</sub> cycle enzymes (<xref ref-type="bibr" rid="bib27">Hattersley and Stone, 1986</xref>), the possession of one or more anatomical or cellular adaptations associated with C<sub>4</sub> photosynthesis (<xref ref-type="bibr" rid="bib53">Moore et al., 1987</xref>), or combinations of both (e.g., <xref ref-type="bibr" rid="bib38">Kennedy et al., 1980</xref>; <xref ref-type="bibr" rid="bib41">Koteyeva et al., 2010</xref>). To address these unknown aspects of C<sub>4</sub> evolutionary history, we combined the concept of considering evolutionary paths as stochastic processes on complex adaptive landscapes (<xref ref-type="bibr" rid="bib88">Wright, 1932</xref>; <xref ref-type="bibr" rid="bib21">Gavrilets, 1997</xref>) with the analysis of extant C<sub>3</sub>–C<sub>4</sub> intermediate species to develop a predictive model of how the full C<sub>4</sub> phenotype evolved.</p></sec><sec id="s2" sec-type="results"><title>Results</title><sec id="s2-1"><title>A meta-analysis of photosynthetic phenotypes</title><p>To parameterise the phenotypic landscape underlying photosynthetic phenotypes, data were consolidated from 43 studies encompassing 18 C<sub>3</sub>, 18 C<sub>4</sub>, and 37 C<sub>3</sub>–C<sub>4</sub> intermediate species from 22 genera (<xref ref-type="table" rid="tbl1">Table 1</xref>). 
These C<sub>3</sub>–C<sub>4</sub> species are from 18 independent lineages likely representing 18 distinct evolutionary origins of C<sub>3</sub>–C<sub>4</sub> intermediacy (<xref ref-type="bibr" rid="bib66">Sage et al., 2011a</xref>) (<xref ref-type="fig" rid="fig1s2">Figure 1—figure supplement 2</xref>). These studies were used to quantify 16 biochemical, anatomical, and cellular characteristics associated with C<sub>4</sub> photosynthesis (<xref ref-type="supplementary-material" rid="SD1-data">Figure 1—source data 1</xref>). Principal components analysis (PCA) was performed to confirm the phenotypic intermediacy of the C<sub>3</sub>–C<sub>4</sub> species (<xref ref-type="fig" rid="fig1">Figure 1A</xref>). This result, the sister-group relationships of C<sub>3</sub>–C<sub>4</sub> species with congeneric C<sub>4</sub> clades (<xref ref-type="bibr" rid="bib51">McKown et al., 2005</xref>; <xref ref-type="bibr" rid="bib77">Vogan et al., 2007</xref>; <xref ref-type="bibr" rid="bib12">Christin et al., 2011a</xref>; <xref ref-type="bibr" rid="bib66">Sage et al., 2011a</xref>; <xref ref-type="bibr" rid="bib39">Khoshravesh et al., 2012</xref>) and the prevalence of extant C<sub>3</sub>–C<sub>4</sub> species in genera with the most recent origins of C<sub>4</sub> photosynthesis (<xref ref-type="bibr" rid="bib16">Christin et al., 2011b</xref>) all support the notion that C<sub>3</sub>–C<sub>4</sub> species represent phenotypic states through which transitions to C<sub>4</sub> photosynthesis could occur. The combined traits of C<sub>3</sub>–C<sub>4</sub> intermediate species therefore represent samples from across the space of phenotypes connecting C<sub>3</sub> to C<sub>4</sub> photosynthesis (<xref ref-type="fig" rid="fig1">Figure 1B</xref>). Within our meta-analysis data, C<sub>3</sub>–C<sub>4</sub> phenotypes were available for 33 eudicot and 4 monocot species. 
16 and 17 of these species have extant congeneric relatives performing NADP-ME or NAD-ME sub-type C<sub>4</sub> photosynthesis respectively. No C<sub>3</sub>–C<sub>4</sub> relatives of PCK sub-type C<sub>4</sub> species are known (<xref ref-type="bibr" rid="bib66">Sage et al., 2011a</xref>). Our meta-analysis therefore encompassed a variety of taxonomic lineages, as well as representing close relatives of known phenotypic variants performing C<sub>4</sub> photosynthesis.<table-wrap id="tbl1" position="float"><object-id pub-id-type="doi">10.7554/eLife.00961.003</object-id><label>Table 1.</label><caption><p>Summary of C<sub>3</sub>–C<sub>4</sub> lineages assessed</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00961.003">http://dx.doi.org/10.7554/eLife.00961.003</ext-link></p></caption><table frame="hsides" rules="groups"><thead><tr><th>Family</th><th>Species</th><th>References<xref ref-type="table-fn" rid="tblfn1">*</xref></th></tr></thead><tbody><tr><td rowspan="3">Amaranthaceae</td><td><italic>Alternanthera ficoides</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib58">Rajendrudu et al. (1986)</xref></td></tr><tr><td><italic>Alternanthera tenella</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib17">Devi and Raghavendra (1993)</xref></td></tr><tr><td><italic>Alternanthera pungens</italic> (C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib18">Devi et al. (1995)</xref></td></tr><tr><td rowspan="20">Asteraceae</td><td><italic>Flaveria cronquistii</italic> (C<sub>3</sub>)</td><td/></tr><tr><td><italic>Flaveria pringlei</italic> (C<sub>3</sub>)</td><td/></tr><tr><td><italic>Flaveria robusta</italic> (C<sub>3</sub>)</td><td/></tr><tr><td><italic>Flaveria angustifolia</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td/></tr><tr><td><italic>Flaveria anomala</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib44">Ku et al. 
(1983)</xref></td></tr><tr><td><italic>Flaveria chloraefolia</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib32">Holaday et al. (1984)</xref></td></tr><tr><td><italic>Flaveria floridana</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib2">Adams et al. (1986)</xref></td></tr><tr><td><italic>Flaveria linearis</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib7">Brown and Hattersley (1989)</xref></td></tr><tr><td><italic>Flaveria oppositifolia</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib45">Ku et al. (1991)</xref></td></tr><tr><td><italic>Flaveria ramosissima</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib64">Rosche et al. (1994)</xref></td></tr><tr><td><italic>Flaveria sonorensis</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib10">Casati et al. (1999)</xref></td></tr><tr><td><italic>Flaveria brownii</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib51">McKown et al. (2005)</xref></td></tr><tr><td><italic>Flaveria vaginata</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib50">McKown and Dengler (2007)</xref></td></tr><tr><td><italic>Flaveria pubescens</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib23">Gowik et al. (2011)</xref></td></tr><tr><td><italic>Flaveria australasica</italic> (C<sub>4</sub>)</td><td/></tr><tr><td><italic>Flaveria bidentis</italic> (C<sub>4</sub>)</td><td/></tr><tr><td><italic>Flaveria kochiana</italic> (C<sub>4</sub>)</td><td/></tr><tr><td><italic>Flaveria trinervia</italic> (C<sub>4</sub>)</td><td/></tr><tr><td><italic>Parthenium incanum</italic> (C<sub>3</sub>)</td><td><xref ref-type="bibr" rid="bib53">Moore et al. 
(1987)</xref></td></tr><tr><td><italic>Parthenium hysterophorus</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib17">Devi and Raghavendra (1993)</xref></td></tr><tr><td rowspan="5">Boraginaceae</td><td><italic>Heliotropium europaeum</italic> (C<sub>3</sub>)</td><td/></tr><tr><td><italic>Heliotropium calcicola</italic> (C<sub>3</sub>)</td><td><xref ref-type="bibr" rid="bib77">Vogan et al. (2007)</xref></td></tr><tr><td><italic>Heliotropium convolvulaceum</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib54">Muhaidat et al. (2011)</xref></td></tr><tr><td><italic>Heliotropium greggii</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td/></tr><tr><td><italic>Heliotropium polyphyllum</italic> (C<sub>4</sub>)</td><td/></tr><tr><td rowspan="7">Brassicaceae</td><td><italic>Moricandia foetida</italic> (C<sub>3</sub>)</td><td><xref ref-type="bibr" rid="bib33">Holaday et al. (1981)</xref></td></tr><tr><td><italic>Moricandia arvensis</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib61">Rawsthorne et al. (1988)</xref></td></tr><tr><td><italic>Moricandia spinosa</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib5">Beebe and Evert (1990)</xref></td></tr><tr><td><italic>Moricandia nitens</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib62">Rawsthorne et al. (1998)</xref></td></tr><tr><td><italic>Raphanus sativus</italic> (C<sub>3</sub>)</td><td><xref ref-type="bibr" rid="bib74">Ueno et al. (2003)</xref></td></tr><tr><td><italic>Diplotaxis muralis</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib75">Ueno et al. (2006)</xref></td></tr><tr><td><italic>Diplotaxis tenuifolia</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td/></tr><tr><td rowspan="3">Chenopodiaceae</td><td><italic>Salsola oreophila</italic> (C<sub>3</sub>)</td><td><xref ref-type="bibr" rid="bib56">P’yankov et al. 
(1997)</xref></td></tr><tr><td><italic>Salsola arbusculiformis</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib78">Voznesenskaya et al. (2001)</xref></td></tr><tr><td><italic>Salsola arbuscula</italic> (C<sub>4</sub>)</td><td/></tr><tr><td rowspan="3">Cleomaceae</td><td><italic>Cleome spinosa</italic> (C<sub>3</sub>)</td><td><xref ref-type="bibr" rid="bib79">Voznesenskaya et al. (2007)</xref></td></tr><tr><td><italic>Cleome paradoxa</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib41">Koteyeva et al. (2010)</xref></td></tr><tr><td><italic>Cleome gynandra</italic> (C<sub>4</sub>)</td><td/></tr><tr><td rowspan="3">Cyperaceae</td><td><italic>Eleocharis acuta</italic> (C<sub>3</sub>)</td><td><xref ref-type="bibr" rid="bib8">Bruhl and Perry (1995)</xref></td></tr><tr><td><italic>Eleocharis acicularis</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib37">Keeley (1999)</xref></td></tr><tr><td><italic>Eleocharis tetragona</italic> (C<sub>4</sub>)</td><td/></tr><tr><td rowspan="4">Euphorbiaceae</td><td><italic>Euphorbia angusta</italic> (C<sub>3</sub>)</td><td/></tr><tr><td><italic>Euphorbia acuta</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib68">Sage et al. (2011b)</xref></td></tr><tr><td><italic>Euphorbia lata</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td/></tr><tr><td><italic>Euphorbia mesembryanthemifolia</italic> (C<sub>4</sub>)</td><td/></tr><tr><td rowspan="5">Molluginaceae</td><td><italic>Mollugo tenella</italic> (C<sub>3</sub>)</td><td/></tr><tr><td><italic>Mollugo verticillata</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib70">Sayre et al. (1979)</xref></td></tr><tr><td><italic>Mollugo nudicaulis</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib38">Kennedy et al. 
(1980)</xref></td></tr><tr><td><italic>Mollugo pentaphylla</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib12">Christin et al. (2011a)</xref></td></tr><tr><td><italic>Mollugo cerviana</italic> (C<sub>4</sub>)</td><td/></tr><tr><td rowspan="15">Poaceae</td><td><italic>Avena sativa</italic> (C<sub>3</sub>)</td><td><xref ref-type="bibr" rid="bib71">Slack and Hatch (1967)</xref></td></tr><tr><td><italic>Neurachne tenuifolia</italic> (C<sub>3</sub>)</td><td><xref ref-type="bibr" rid="bib27">Hattersley and Stone (1986)</xref></td></tr><tr><td><italic>Neurachne minor</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib7">Brown and Hattersley (1989)</xref></td></tr><tr><td><italic>Neurachne munroi</italic> (C<sub>4</sub>)</td><td/></tr><tr><td><italic>Panicum bisulcatum</italic> (C<sub>3</sub>)</td><td><xref ref-type="bibr" rid="bib22">Goldstein et al. (1976)</xref></td></tr><tr><td><italic>Panicum hians</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib43">Ku et al. 
(1976)</xref></td></tr><tr><td><italic>Panicum milioides</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib46">Ku and Edwards (1978)</xref></td></tr><tr><td rowspan="4"><italic>Panicum miliaceum</italic> (C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib60">Rathnam and Chollet (1978)</xref></td></tr><tr><td><xref ref-type="bibr" rid="bib59">Rathnam and Chollet (1979)</xref></td></tr><tr><td><xref ref-type="bibr" rid="bib31">Holaday and Black (1981)</xref></td></tr><tr><td><xref ref-type="bibr" rid="bib26">Hattersley (1984)</xref></td></tr><tr><td><italic>Saccharum officinarum</italic> (C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib71">Slack and Hatch (1967)</xref></td></tr><tr><td><italic>Sorghum bicolor</italic> (C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib71">Slack and Hatch (1967)</xref></td></tr><tr><td><italic>Triticum aestivum</italic> (C<sub>3</sub>)</td><td><xref ref-type="bibr" rid="bib71">Slack and Hatch (1967)</xref></td></tr><tr><td><italic>Zea mays</italic> (C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib71">Slack and Hatch (1967)</xref></td></tr><tr><td rowspan="3">Portulacaceae</td><td><italic>Sesuvium portulacastrum</italic> (C<sub>3</sub>)</td><td/></tr><tr><td><italic>Portulaca cryptopetala</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td><xref ref-type="bibr" rid="bib80">Voznesenskaya et al. (2010)</xref></td></tr><tr><td><italic>Portulaca oleracea</italic> (C<sub>4</sub>)</td><td/></tr><tr><td rowspan="5">Scrophulariaceae</td><td><italic>Anticharis kaokoensis</italic> (C<sub>3</sub>)</td><td><xref ref-type="bibr" rid="bib39">Khoshravesh et al. 
(2012)</xref></td></tr><tr><td><italic>Anticharis ebracteata</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td/></tr><tr><td><italic>Anticharis imbricata</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td/></tr><tr><td><italic>Anticharis namibensis</italic> (C<sub>3</sub>–C<sub>4</sub>)</td><td/></tr><tr><td><italic>Anticharis glandulosa</italic> (C<sub>4</sub>)</td><td/></tr></tbody></table><table-wrap-foot><fn><p>The family, species, photosynthetic type and original study are listed. In total, 16 characteristics relating to C<sub>4</sub> photosynthesis were extracted from 43 studies encompassing 18 C<sub>3</sub>, 18 C<sub>4</sub>, and 37 C<sub>3</sub>–C<sub>4</sub> intermediate species.</p></fn><fn id="tblfn1"><label>*</label><p>References apply to all species within each genus.</p></fn></table-wrap-foot></table-wrap><fig-group><fig id="fig1" position="float"><object-id pub-id-type="doi">10.7554/eLife.00961.004</object-id><label>Figure 1.</label><caption><title>Evolutionary paths to C<sub>4</sub> phenotype space modelled from a meta-analysis of C<sub>3</sub>–C<sub>4</sub> phenotypes.</title><p>Principal component analysis (PCA) on data for the activity of five C<sub>4</sub> cycle enzymes confirms the intermediacy of C<sub>3</sub>–C<sub>4</sub> species between C<sub>3</sub> and C<sub>4</sub> phenotype spaces (<bold>A</bold>). Each C<sub>4</sub> trait was considered absent in C<sub>3</sub> species and present in C<sub>4</sub> species, with previously studied C<sub>3</sub>–C<sub>4</sub> intermediate species representing samples from across the phenotype space (<bold>B</bold>). With a dataset of 16 phenotypic traits, a 16-dimensional space was defined. (<bold>C</bold>) A 2D representation of 50 pathways across this space. 
The phenotypes of multiple C<sub>3</sub>–C<sub>4</sub> species were used to identify pathways compatible with individual species (e.g., <italic>Alternanthera ficoidea</italic> [red nodes] and <italic>Parthenium hysterophorus</italic> [blue nodes]), and pathways compatible with the phenotypes of multiple species (purple nodes).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00961.004">http://dx.doi.org/10.7554/eLife.00961.004</ext-link></p><p><supplementary-material id="SD1-data"><object-id pub-id-type="doi">10.7554/eLife.00961.005</object-id><label>Figure 1—source data 1.</label><caption><title>Binary scoring of C<sub>4</sub> traits present in C<sub>3</sub>–C<sub>4</sub> species.</title><p>The EM algorithm was used to assign binary scores for the presence or absence of 16 C<sub>4</sub> traits in 37 C<sub>3</sub>–C<sub>4</sub> intermediate species. 1 denotes the presence of a trait, 0 denotes absence. Blank cells denote traits that have not been defined.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00961.005">http://dx.doi.org/10.7554/eLife.00961.005</ext-link></p></caption><media mime-subtype="xlsx" mimetype="application" xlink:href="elife00961s001.xlsx"/></supplementary-material></p></caption><graphic xlink:href="elife00961f001"/></fig><fig id="fig1s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00961.006</object-id><label>Figure 1—figure supplement 1.</label><caption><title>A graphical representation of key phenotypic changes distinguishing C<sub>3</sub> and C<sub>4</sub> leaves.</title><p>Plants using C<sub>4</sub> photosynthesis possess a number of anatomical, cellular, and biochemical adaptations that distinguish them from C<sub>3</sub> ancestors. These include decreased vein spacing (<bold>A</bold>) and enlarged bundle sheath (BS) cells, which lie adjacent to veins (<bold>B</bold>). 
Together, these adaptations decrease the ratio of mesophyll (M) to BS cell volume. C<sub>4</sub> metabolism is generated by the increased abundance and M- or BS-specific expression of multiple enzymes (shown in purple), which are expressed in both M and BS cells of C<sub>3</sub> leaves. Abbreviations: ME–malic enzymes, RuBisCO–ribulose-1,5-bisphosphate carboxylase/oxygenase, PEPC–phospho<italic>enol</italic>pyruvate carboxylase, PPDK–pyruvate, orthophosphate dikinase.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00961.006">http://dx.doi.org/10.7554/eLife.00961.006</ext-link></p></caption><graphic xlink:href="elife00961fs001"/></fig><fig id="fig1s2" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00961.007</object-id><label>Figure 1—figure supplement 2.</label><caption><title>Phylogenetic distribution of C<sub>4</sub> and C<sub>3</sub>–C<sub>4</sub> lineages across the angiosperm phylogeny.</title><p>A phylogeny of angiosperm orders is shown, based on the classification by the Angiosperm Phylogeny Group. The phylogenetic distribution of known two-celled C<sub>4</sub> photosynthetic lineages is annotated, together with the distribution of the C<sub>3</sub>–C<sub>4</sub> lineages that we used in this study. The numbers of independent C<sub>3</sub>–C<sub>4</sub> or C<sub>4</sub> lineages present in each order are shown in parentheses.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00961.007">http://dx.doi.org/10.7554/eLife.00961.007</ext-link></p></caption><graphic xlink:href="elife00961fs002"/></fig><fig id="fig1s3" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00961.008</object-id><label>Figure 1—figure supplement 3.</label><caption><title>Clustering quantitative traits by EM algorithm and hierarchical clustering.</title><p>Quantitative variables were assigned binary scores using two data-clustering techniques. 
Each panel depicts the assignment of presence (red squares) and absence (blue triangles) scores by the EM algorithm. To the right of each panel are cladograms depicting the partitioning of the same values into clusters by hierarchical clustering. Red cladogram branches denote values partitioned into a different group from that assigned by EM. The variables depicted in each panel are PEPC activity (<bold>A</bold>), PPDK activity (<bold>B</bold>), C<sub>4</sub> acid decarboxylase activity (<bold>C</bold>), RuBisCO activity (<bold>D</bold>), MDH activity (<bold>E</bold>), vein spacing (<bold>F</bold>), number of BS chloroplasts (<bold>G</bold>), BS chloroplast size (<bold>H</bold>).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00961.008">http://dx.doi.org/10.7554/eLife.00961.008</ext-link></p></caption><graphic xlink:href="elife00961fs003"/></fig><fig id="fig1s4" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00961.009</object-id><label>Figure 1—figure supplement 4.</label><caption><title>Illustration of the principle by which evolutionary pathways emit intermediate signals.</title><p>In this illustration, the phenotype consists of three traits, yielding a simple (hyper)cubic transition network. Simulated trajectories on this network evolve according to the weights of network edges (<bold>A</bold>). Probabilities were calculated from the signals emitted by simulated trajectories at intermediate nodes (<bold>B</bold>). Ensembles of trajectories were simulated to obtain probabilities from these signals for every possible evolutionary transition (<bold>C</bold>).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00961.009">http://dx.doi.org/10.7554/eLife.00961.009</ext-link></p></caption><graphic xlink:href="elife00961fs004"/></fig></fig-group></p><p>We defined each C<sub>4</sub> trait as being either absent (0) or present (1). 
For quantitative traits the expectation-maximization (EM) algorithm and hierarchical clustering were used to impartially assign binary scores (<xref ref-type="fig" rid="fig1s3">Figure 1—figure supplement 3</xref>). This generated a 16-bit string for each of the species (<xref ref-type="supplementary-material" rid="SD1-data">Figure 1—source data 1</xref>), with a presence or absence score for each of the traits included in our meta-analysis. This defined a 16-dimensional phenotype space with 2<sup>16</sup> (65,536) nodes corresponding to all possible combinations of presence (1) and absence (0) scores for each characteristic.</p></sec><sec id="s2-2"><title>A novel Bayesian approach for predicting evolutionary trajectories</title><p>Many existing methods of inference for evolutionary trajectories rely on phylogenetic information or assumptions about the fitness landscape underlying evolutionary dynamics (<xref ref-type="bibr" rid="bib82">Weinreich et al., 2005</xref>; <xref ref-type="bibr" rid="bib49">Lobkovsky et al., 2011</xref>; <xref ref-type="bibr" rid="bib52">Mooers and Heard, 2013</xref>). In convergent evolution, these properties are not always known, as convergent lineages may be genetically distant and associated with poor phylogenetic reconstructions. In addition, the selective pressures experienced by each may be different and dynamic. We therefore consider the convergent evolution of C<sub>4</sub> fundamentally as the acquisition of the key phenotypic traits identified through our meta-analysis (<xref ref-type="fig" rid="fig1">Figure 1B</xref>). 
The process of acquisition of these traits can be pictured as a path on the 16-dimensional hypercube (<xref ref-type="fig" rid="fig1">Figure 1C</xref>), from the node labelled with all 0’s (the C<sub>3</sub> phenotype, with no C<sub>4</sub> characteristics) to the node labelled with all 1’s (the C<sub>4</sub> phenotype, with all C<sub>4</sub> characteristics).</p><p>The phenotypic landscape underlying the evolution of C<sub>4</sub> photosynthesis was then modelled as a transition network, with weighted edges describing the probability of transitions occurring between two phenotypic states (two nodes on the hypercube, <xref ref-type="fig" rid="fig1s4">Figure 1—figure supplement 4</xref>). Observed intermediate points were then used to constrain the structure of these phenotypic landscapes. To do this, we developed inferential machinery based on the framework of Hidden Markov Models (HMMs) (<xref ref-type="bibr" rid="bib57">Rabiner, 1989</xref>) (<xref ref-type="fig" rid="fig1s4">Figure 1—figure supplement 4</xref>) and simulated an ensemble of Markov chains on trial transition networks. Each of these chains represents a possible evolutionary pathway from C<sub>3</sub> to C<sub>4</sub>, and passes through several intermediate phenotypic states. The likelihood of observing intermediate states with characteristics compatible with the biologically observed data on C<sub>3</sub>–C<sub>4</sub> intermediates was recorded for the set of paths supported on each trial network. A Bayesian MCMC procedure was used to sample from the set of networks most compatible with the meta-analysis dataset, and thus most likely to represent the underlying dynamics of C<sub>4</sub> evolution. 
The order in which phenotypic characteristics were acquired was recorded for paths on each network compatible with the C<sub>3</sub>–C<sub>4</sub> species data, and posterior probability distributions (given uninformative priors) for the time-ordered acquisition of each C<sub>4</sub> trait were generated. For further information and mathematical details, see ‘Methods’.</p><p>To model the evolutionary paths generating C<sub>4</sub> without requiring additional dimensionality, we required that only one C<sub>4</sub> trait be acquired at a time and forbade the loss of acquired C<sub>4</sub> traits. To test if we were nevertheless able to detect traits acquired simultaneously in evolution, we tested our approach on artificial positive control datasets containing intermediate nodes representing a stepwise evolutionary sequence of events (<xref ref-type="fig" rid="fig2">Figure 2A</xref>) and an evolutionary pathway in which four traits are acquired simultaneously (<xref ref-type="fig" rid="fig2">Figure 2B</xref>). Our approach clearly assigned equal acquisition probabilities to traits whose timing was linked in the underlying dataset, even when 50% of the data was occluded (<xref ref-type="fig" rid="fig2">Figure 2B</xref>). These data are consistent with this approach detecting the simultaneous acquisition of traits in evolution, even though single-trait acquisitions are simulated.<fig-group><fig id="fig2" position="float"><object-id pub-id-type="doi">10.7554/eLife.00961.010</object-id><label>Figure 2.</label><caption><title>Verifying a novel Bayesian approach for predicting evolutionary trajectories.</title><p>(<bold>A</bold> and <bold>B</bold>) Datasets were obtained from an artificially constructed diagonal dynamic matrix (<bold>A</bold>), and a diagonal matrix with linked timing of locus acquisitions (<bold>B</bold>). 
The single, diagonal evolutionary trajectory was clearly replicated in both examples, over a time-scale of 16 individual steps, or four coarse-grained quartiles. We subjected these artificial datasets to our inferential machinery with fully characterised artificial species, and with 50% of data occluded in order to replicate the proportion of missing data from our C<sub>3</sub>–C<sub>4</sub> dataset. (<bold>C</bold>) When applied to our meta-analysis of C<sub>3</sub>–C<sub>4</sub> data, predictions were generated for every trait missing from the biological dataset. We tested this predictive machinery by generating 29 artificial datasets, each missing one data point, and comparing the presence/absence of the trait as predicted by our approach with the experimental data from the original study. (<bold>D</bold> and <bold>E</bold>) Quantitative real-time PCR (qPCR) was used to verify the predicted phenotypes of four C<sub>3</sub>–C<sub>4</sub> species. The abundance of <italic>RbcS</italic> (<bold>D</bold>) and <italic>MDH</italic> (<bold>E</bold>) transcripts was determined in six <italic>Flaveria</italic> species. 
White bars represent phenotypes already determined by other studies, grey bars those that were predicted by the model and asterisks denote intermediate species phenotypes correctly predicted by our approach (Error bars indicate SEM, N = 3).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00961.010">http://dx.doi.org/10.7554/eLife.00961.010</ext-link></p></caption><graphic xlink:href="elife00961f002"/></fig><fig id="fig2s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00961.011</object-id><label>Figure 2—figure supplement 1.</label><caption><title>Computational prediction of C<sub>3</sub>–C<sub>4</sub> intermediate phenotypes.</title><p>A probability for the presence of unobserved phenotypic characters was generated for every characteristic not yet studied in each of the C<sub>3</sub>–C<sub>4</sub> species included in this study. Red (upward triangles) predict a posterior mean probability of >0.75 for the presence of a C<sub>4</sub> trait; blue (downward triangles) predict a posterior mean probability of <0.25. Darker triangles represent probabilities whose standard deviations (SD) are lower than 0.25. 
Yellow blocks correspond to known data; no symbol is shown for traits whose posterior mean probability of presence is intermediate (0.25–0.75).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00961.011">http://dx.doi.org/10.7554/eLife.00961.011</ext-link></p></caption><graphic xlink:href="elife00961fs005"/></fig></fig-group></p></sec><sec id="s2-3"><title>Verifying prediction accuracy</title><p>The presence and absence of unknown phenotypes were predicted by recording all phenotypes encountered along a set of simulated evolutionary trajectories that were compatible with the data from a given species (<xref ref-type="fig" rid="fig1s4">Figure 1—figure supplement 4</xref>), and calculating the posterior distribution of the proportion of these phenotypes with the value 1 for the unknown trait. If the mean of this distribution was <25% or >75%, and that value fell outside one standard deviation of the mean, the missing trait was assigned a strong prediction of absence or presence. To comprehensively test the accuracy of our predictive machinery, we generated 29 occluded datasets, each consisting of the original full dataset with one randomly chosen data point removed. The predicted phenotype of each missing trait was then compared with the known phenotype published in the original study. Of the 29 occluded traits, 18 were strongly predicted to be present or absent, and the remaining 11 predictions were neutral. Of the 18 strongly predicted traits (i.e., <25% or >75% probability), 15 were correct, with only one false positive and two false negative predictions (<xref ref-type="fig" rid="fig2">Figure 2C</xref>). The approach therefore assigns neutral predictions much more frequently than false positive or false negative predictions, suggesting that its outputs are highly conservative, and thus unlikely to produce artefacts. 
Predictions were generated for phenotypes that have not yet been described in C<sub>3</sub>–C<sub>4</sub> species (<xref ref-type="fig" rid="fig2s1">Figure 2—figure supplement 1</xref>). Quantitative real-time PCR experimentally verified a subset of these, relating to abundance of C<sub>4</sub> enzymes not previously measured (<xref ref-type="fig" rid="fig2">Figure 2D–E</xref>). We also found that the model was able to successfully infer evolutionary dynamics in artificially constructed datasets (<xref ref-type="fig" rid="fig2">Figure 2A–B</xref>). Taken together, these prediction and verification studies illustrate that our approach robustly identifies key features of C<sub>4</sub> evolution.</p></sec><sec id="s2-4"><title>A high-resolution model for the evolutionary events generating C<sub>4</sub></title><p>The posterior probability distributions for the acquisition time of each phenotypic trait were combined to produce an objective, computationally generated blueprint for the order of evolutionary events generating C<sub>4</sub> photosynthesis (<xref ref-type="fig" rid="fig3">Figure 3</xref>). These results were consistent with previous work on subsets of C<sub>4</sub> lineages that proposed the BS-specificity of GDC occurs prior to the evolution of C<sub>4</sub> metabolism (<xref ref-type="bibr" rid="bib34">Hylton et al., 1988</xref>; <xref ref-type="bibr" rid="bib62">Rawsthorne et al., 1988</xref>; <xref ref-type="bibr" rid="bib18">Devi et al., 1995</xref>; <xref ref-type="bibr" rid="bib67">Sage et al., 2012</xref>), and loss of RuBisCO from M cells occurs late (<xref ref-type="bibr" rid="bib11">Cheng et al., 1988</xref>; <xref ref-type="bibr" rid="bib39">Khoshravesh et al., 2012</xref>), but also provided higher resolution insight into the order of events generating C<sub>4</sub> metabolism. 
Alterations to leaf anatomy as well as cell-specificity and increased abundance of multiple C<sub>4</sub> cycle enzymes were predicted to evolve prior to any alteration to the primary C<sub>3</sub> and C<sub>4</sub> photosynthetic enzymes RuBisCO and phospho<italic>enol</italic>pyruvate carboxylase (PEPC) (<xref ref-type="fig" rid="fig3">Figure 3</xref>).<fig-group><fig id="fig3" position="float"><object-id pub-id-type="doi">10.7554/eLife.00961.012</object-id><label>Figure 3.</label><caption><title>The mean ordering of phenotypic changes generating C<sub>4</sub> photosynthesis.</title><p>EM-clustered data from C<sub>3</sub>–C<sub>4</sub> intermediate species were used to generate posterior probability distributions for the timing of the acquisition of C<sub>4</sub> traits in sixteen evolutionary steps (<bold>A</bold>) or four quartiles (<bold>B</bold>). Circle diameter denotes the mean posterior probability of a trait being acquired at each step in C<sub>4</sub> evolution (the Bayes estimator for the acquisition probability). Halos denote the standard deviation of the posterior. The 16 traits are ordered from left to right by their probability of being acquired early to late in C<sub>4</sub> evolution. Abbreviations: bundle sheath (BS), glycine decarboxylase (GDC), chloroplasts (CPs), decarboxylase (Decarb.), pyruvate, orthophosphate dikinase (PPDK), malate dehydrogenase (MDH), phosphoenolpyruvate carboxylase (PEPC).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00961.012">http://dx.doi.org/10.7554/eLife.00961.012</ext-link></p></caption><graphic xlink:href="elife00961f003"/></fig><fig id="fig3s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00961.013</object-id><label>Figure 3—figure supplement 1.</label><caption><title>Results obtained using data clustered by hierarchical clustering.</title><p>Traits were also assigned presence/absence scores by hierarchical clustering. 
Analysis of data partitioned by hierarchical clustering predicted a similar sequence of evolutionary events to that shown in <xref ref-type="fig" rid="fig3">Figure 3</xref> (<bold>A</bold>). Direct comparison of posterior probabilities reveals a high degree of similarity between results from the data clustered by hierarchical clustering versus the EM algorithm (<bold>B</bold>). These results suggest our conclusions are not affected by the different methods of assigning binary scores to traits.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00961.013">http://dx.doi.org/10.7554/eLife.00961.013</ext-link></p></caption><graphic xlink:href="elife00961fs006"/></fig><fig id="fig3s2" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00961.014</object-id><label>Figure 3—figure supplement 2.</label><caption><title>Adding or removing traits does not affect the predicted order of evolutionary events.</title><p>Two independent pairs of traits were randomly selected and deleted from the analysis. In both cases, removing two traits did not affect the predicted timing of the remaining 14 traits in the analysis (<bold>A</bold> and <bold>B</bold>). Furthermore, including two additional traits associated with C<sub>4</sub> photosynthesis also did not alter the predicted timing of other traits (<bold>C</bold>). Together, these data suggest our results are robust to both the removal and addition of traits from the phenotype space. 
Abbreviations: bundle sheath (BS), glycine decarboxylase (GDC), chloroplasts (CPs), C<sub>4</sub> acid decarboxylase (Decarb.), mitochondria (MitoC), pyruvate, orthophosphate dikinase (PPDK), malate dehydrogenase (MDH), phosphoenolpyruvate carboxylase (PEPC).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00961.014">http://dx.doi.org/10.7554/eLife.00961.014</ext-link></p></caption><graphic xlink:href="elife00961fs007"/></fig><fig id="fig3s3" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00961.015</object-id><label>Figure 3—figure supplement 3.</label><caption><title>Probabilities of C<sub>4</sub> traits being acquired simultaneously.</title><p>The extent to which C<sub>4</sub> traits are linked in evolution was assessed by modelling C<sub>4</sub> evolution from a start phenotype with one trait already acquired. Linked traits would have a high probability of being acquired in the next event. Artificially acquired traits are listed on the x-axis and the probability of each additional C<sub>4</sub> trait being subsequently acquired (y-axis) is denoted in each pixel of the heat map. 
There is overall very low probability for multiple traits being linked in their acquisition in the evolution of C<sub>4</sub>.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00961.015">http://dx.doi.org/10.7554/eLife.00961.015</ext-link></p></caption><graphic xlink:href="elife00961fs008"/></fig></fig-group></p><p>There was also strong evidence for enlargement of BS cells as an early innovation in most C<sub>4</sub> lineages (<xref ref-type="fig" rid="fig3">Figure 3</xref>), consistent with the suggestion that this was an ancestral state within C<sub>3</sub> ancestors of C<sub>4</sub> grass lineages and that this contributed to the high number of C<sub>4</sub> origins within this family (<xref ref-type="bibr" rid="bib15">Christin et al., 2013</xref>; <xref ref-type="bibr" rid="bib24">Griffiths et al., 2013</xref>). The compartmentation of PEPC into M cells and its increased abundance compared with C<sub>3</sub> leaves was predicted to occur at similar times, but for all other C<sub>4</sub> enzymes the evolution of increased abundance and cellular compartmentation were clearly separated by the acquisition of other traits (<xref ref-type="fig" rid="fig3">Figure 3</xref>). This result is consistent with molecular analysis of genes encoding C<sub>4</sub> enzymes that indicates cell-specificity and increased expression are mediated by different <italic>cis</italic>-elements (<xref ref-type="bibr" rid="bib3">Akyildiz et al., 2007</xref>; <xref ref-type="bibr" rid="bib36">Kajala et al., 2012</xref>; <xref ref-type="bibr" rid="bib86">Wiludda et al., 2012</xref>).</p><p>Two approaches were taken to verify that these conclusions are robust and accurately reflect biological data. First, the analysis was repeated using scores for presence or absence of traits that were assigned by hierarchical clustering, as opposed to using the EM algorithm (<xref ref-type="fig" rid="fig3s1">Figure 3—figure supplement 1A</xref>). 
Although hierarchical clustering generated differences in the scoring of a small number of traits, the predicted evolutionary trajectories were not affected, producing highly similar results (<xref ref-type="fig" rid="fig3s1">Figure 3—figure supplement 1B</xref>). Second, we introduced structural changes to the phenotype space, by both adding and subtracting traits from the analysis (<xref ref-type="fig" rid="fig3s2">Figure 3—figure supplement 2</xref>). Removing two independent pairs of traits from the analysis did not affect the predicted timing of the remaining 14 traits (<xref ref-type="fig" rid="fig3s2">Figure 3—figure supplement 2A–B</xref>). However, increased standard deviations were observed in some cases (e.g., for the probabilities of acquiring enlarged BS cells, or decreased vein spacing), likely as a consequence of using fewer data. To test if the addition of data might also affect the results, we performed an analysis with two additional traits included (<xref ref-type="fig" rid="fig3s2">Figure 3—figure supplement 2C</xref>). We selected two traits that have been widely observed in C<sub>3</sub>–C<sub>4</sub> species: the centripetal positioning of mitochondria and the centrifugal or centripetal position of chloroplasts within BS cells (<xref ref-type="bibr" rid="bib67">Sage et al., 2012</xref>). Despite the widespread occurrence of these traits, their functional importance remains unclear (<xref ref-type="bibr" rid="bib67">Sage et al., 2012</xref>). Consistent with observations made from several genera, we predict that these cellular alterations are acquired early in the evolution of C<sub>4</sub> photosynthesis (<xref ref-type="bibr" rid="bib34">Hylton et al., 1988</xref>; <xref ref-type="bibr" rid="bib50">McKown and Dengler, 2007</xref>; <xref ref-type="bibr" rid="bib54">Muhaidat et al., 2011</xref>; <xref ref-type="bibr" rid="bib68">Sage et al., 2011b</xref>). 
Importantly, including these additional early traits in the analysis did not alter the predicted order of the original 16 traits. Together, these analyses did not alter our main conclusions, suggesting that they are robust.</p></sec><sec id="s2-5"><title>The order of C<sub>4</sub> trait evolution is flexible</title><p>In addition to the likely order of evolutionary events generating C<sub>4</sub> photosynthesis, the number of molecular alterations required is also unknown. We therefore aimed to test if multiple traits were predicted to evolve with linked timing, and therefore likely mediated by a single underlying mechanism. To achieve this, we performed a contingency analysis by considering trajectories across phenotype space beginning with a given initial acquisition step. In this analysis, the starting genome had one of the 16 traits acquired and the rest absent, and the contingency of the subsequent trajectory upon the initial step was recorded. This approach was designed to test if acquiring one C<sub>4</sub> trait increased the probability of subsequently acquiring other traits, thus detecting if the evolution of multiple traits is linked by underlying mechanisms. Inflexible linkage between multiple traits was detected in artificial positive control datasets (<xref ref-type="fig" rid="fig2">Figure 2B</xref>) but not in the C<sub>3</sub>–C<sub>4</sub> dataset (<xref ref-type="fig" rid="fig3s3">Figure 3—figure supplement 3</xref>). This result suggests that the order of C<sub>4</sub> trait acquisition is flexible. Multiple origins of C<sub>4</sub> may therefore have been facilitated by this flexibility in the evolutionary pathways connecting C<sub>3</sub> and C<sub>4</sub> phenotypes.</p></sec><sec id="s2-6"><title>C<sub>4</sub> evolved via multiple distinct evolutionary trajectories</title><p>Our Bayesian analysis strongly indicates that there are multiple evolutionary pathways by which C<sub>4</sub> traits are acquired by all lineages of C<sub>4</sub> plants. 
First, no single sequence of acquisitions was capable of producing intermediate phenotypes compatible with all observations (‘Methods’). Second, several traits such as compartmentation of GDC into BS and the increased number of chloroplasts in the BS clearly displayed bimodal probability distributions for their acquisition (<xref ref-type="fig" rid="fig3">Figure 3</xref>). This bimodality is indicative of multiple distinct pathways to C<sub>4</sub> photosynthesis that acquire traits at earlier or later times. To investigate factors underlying this bimodality, we inferred evolutionary pathways generating the C<sub>4</sub> leaf using data from monocot and eudicot lineages, or from lineages using NAD malic enzyme (NAD-ME) or NADP malic enzyme (NADP-ME) as their primary C<sub>4</sub> acid decarboxylase. PCA on the entire set of inferred transition networks for monocot and eudicot subsets revealed distinct separation (<xref ref-type="fig" rid="fig4">Figure 4A</xref>), suggesting that the topology of the evolutionary landscape surrounding C<sub>4</sub> is largely different for these two anciently diverged taxa. Performing this PCA including networks that were inferred from the full data set (with both lineages) confirmed that this separation is a robust result and involves posterior variation on a comparable scale to that of the full set of possible networks (<xref ref-type="fig" rid="fig4s1">Figure 4—figure supplement 1</xref>). Analysis of the posterior probabilities of the mean pathways representing either monocots or eudicots revealed that this separation is the result of differences in the timing of events generating both anatomical and biochemical traits (<xref ref-type="fig" rid="fig4">Figure 4C</xref>). 
We propose that the ancient divergence of the monocot and eudicot clades constrained the evolution of C<sub>4</sub> photosynthesis to broadly different evolutionary pathways in each.<fig-group><fig id="fig4" position="float"><object-id pub-id-type="doi">10.7554/eLife.00961.016</object-id><label>Figure 4.</label><caption><title>Differences in the evolutionary events generating different C<sub>4</sub> sub-types and distantly related taxa.</title><p>Principal component analysis (PCA) on the entire landscape of transition probabilities using only monocot and eudicot data (<bold>A</bold>) and data from NADP-ME and NAD-ME sub-type lineages (<bold>B</bold>) shows broad differences between the evolutionary pathways generating C<sub>4</sub> in each group. Monocots and eudicots differ in the predicted timing of events generating C<sub>4</sub> anatomy and biochemistry (<bold>C</bold>), whereas NADP-ME and NAD-ME lineages differ primarily in the evolution of decreased vein spacing and greater numbers of chloroplasts in BS cells (<bold>D</bold>).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00961.016">http://dx.doi.org/10.7554/eLife.00961.016</ext-link></p></caption><graphic xlink:href="elife00961f004"/></fig><fig id="fig4s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00961.017</object-id><label>Figure 4—figure supplement 1.</label><caption><title>Variation between lineages compared to variance of overall dataset.</title><p>PCA was performed on sampled transition networks from the sets compatible with the overall dataset and each of the two subsets corresponding to different lineages: overall/monocot/eudicot (<bold>A</bold>) and overall/NAD-ME/NADP-ME (<bold>B</bold>). 
In (<bold>A</bold>) the variation between monocot and eudicot lineages is observed to be preserved when the overall transition networks are included, and on a similar quantitative scale to the variation in the overall set, embedded mainly on the first principal axis. In (<bold>B</bold>) the variation is of a similar scale but less distinct, correlating more with the second principal axis.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00961.017">http://dx.doi.org/10.7554/eLife.00961.017</ext-link></p></caption><graphic xlink:href="elife00961fs009"/></fig></fig-group></p><p>There was more overlap between the landscapes generating NAD-ME and NADP-ME species (<xref ref-type="fig" rid="fig4">Figure 4B</xref>), likely reflecting the convergent origins of NAD-ME and NADP-ME sub-types (<xref ref-type="bibr" rid="bib20">Furbank, 2011</xref>; <xref ref-type="bibr" rid="bib66">Sage et al., 2011a</xref>). Despite the traditional definition of these lineages on the basis of biochemical differences, we detected differences in the timing of their anatomical evolution (<xref ref-type="fig" rid="fig4">Figure 4D</xref>). For example, in NAD-ME lineages, increased vein density was predicted to be acquired early in C<sub>4</sub> evolution, while in NADP-ME species this trait showed a broadly different trajectory (<xref ref-type="fig" rid="fig4">Figure 4D</xref>). The proliferation of chloroplasts in the BS was also acquired with different timings between the two sub-types. The alternative evolutionary pathways generating the NADP-ME and NAD-ME sub-types were therefore defined by differences in the timing of anatomical and cellular traits that are predicted to precede the majority of biochemical alterations (<xref ref-type="fig" rid="fig3">Figure 3</xref>, <xref ref-type="fig" rid="fig4">Figure 4D</xref>). 
We therefore conclude that these distinct sub-types evolved as a consequence of alternative evolutionary histories in response to non-photosynthetic pressures. Furthermore, we propose that early evolutionary events determined the downstream phenotypes of C<sub>4</sub> sub-types by restricting lineages to independent pathways across phenotype space.</p></sec></sec><sec id="s3" sec-type="discussion"><title>Discussion</title><sec id="s3-1"><title>A novel Bayesian technique for inferring stochastic trajectories</title><p>The adaptive landscape metaphor has provided a powerful conceptual framework within which evolutionary transitions can be modelled (<xref ref-type="bibr" rid="bib21">Gavrilets, 1997</xref>; <xref ref-type="bibr" rid="bib83">Whibley et al., 2006</xref>; <xref ref-type="bibr" rid="bib49">Lobkovsky et al., 2011</xref>). However, the majority of complex biological traits provide numerous challenges in utilising such an approach, including missing phenotypic data, incomplete phylogenetic information and in the case of convergent evolution, variable ancestral states. Here we report the development of a novel, predictive Bayesian approach that is able to infer likely evolutionary trajectories connecting phenotypes from sparsely sampled, highly stochastic data. With this model, we provided insights into the evolution of one of the most complex traits to have arisen in multiple lineages: C<sub>4</sub> photosynthesis. However, as our approach is not dependent on detailed phylogenetic inference, we propose that it could be used to model the evolution of other complex traits, such as those in the fossil record, which are also currently limited by the fragmented nature of data available (<xref ref-type="bibr" rid="bib40">Kidwell and Holland, 2002</xref>). Our approach is also not limited by the time-scale over which predicted trajectories occur. 
As a result, it may be useful in inferring pathways underlying stochastic processes occurring over much shorter timescales, such as disease or tumour progression, or the differentiation of cell types.</p></sec><sec id="s3-2"><title>C<sub>4</sub> evolution was initiated by non-photosynthetic drivers</title><p>A central hypothesis for the ecological drivers of C<sub>4</sub> evolution is that declining CO<sub>2</sub> concentration in the Oligocene decreased the rate of carboxylation by RuBisCO, creating a strong pressure to evolve alternative photosynthetic strategies (<xref ref-type="bibr" rid="bib13">Christin et al., 2008</xref>; <xref ref-type="bibr" rid="bib76">Vicentini et al., 2008</xref>). According to this hypothesis, alterations to the localisation and abundance of the primary carboxylases PEPC and RuBisCO would be expected to occur early in the evolutionary trajectories generating C<sub>4</sub>. In contrast, our results suggest that alterations to anatomy and cell biology preceded the majority of biochemical alterations, and that other enzymes of the C<sub>4</sub> pathway were recruited prior to PEPC and RuBisCO (<xref ref-type="fig" rid="fig3">Figure 3</xref>). These enzymes, such as PPDK and C<sub>4</sub> acid decarboxylases, function in processes not related to photosynthesis within leaves of C<sub>3</sub> plants (<xref ref-type="bibr" rid="bib4">Aubry et al., 2011</xref>), so the early changes to abundance and localisation of these enzymes within C<sub>4</sub> lineages may have been driven by non-photosynthetic pressures. A recent <italic>in silico</italic> study also predicts that changes to photorespiratory metabolism and GDC in BS cells evolved prior to the C<sub>4</sub> pathway (<xref ref-type="bibr" rid="bib28">Heckman et al., 2013</xref>). Our model predicts that BS-specificity of GDC was acquired early in C<sub>4</sub> evolution for the majority of lineages. 
However, we also note that the predicted timing of GDC BS-specificity is bimodal in our analysis (<xref ref-type="fig" rid="fig3">Figure 3</xref>), and not predicted to be acquired early in monocot lineages (<xref ref-type="fig" rid="fig4">Figure 4C</xref>). These results suggest that early acquisition of GDC BS-specificity is not a feature of C<sub>4</sub> evolution that occurred repeatedly in all lineages.</p><p>Recent evidence from physiological and ecological studies has identified a number of additional environmental pressures that may have driven the evolution and radiation of C<sub>4</sub> lineages, including high evaporative demands (<xref ref-type="bibr" rid="bib55">Osborne and Sack, 2012</xref>) and increased fire frequency (<xref ref-type="bibr" rid="bib19">Edwards et al., 2010</xref>). Increased BS volume and vein density have been proposed as likely adaptations to improve leaf hydraulics under drought (<xref ref-type="bibr" rid="bib55">Osborne and Sack, 2012</xref>; <xref ref-type="bibr" rid="bib24">Griffiths et al., 2013</xref>), but nothing is known about how early recruitment of GDC, PPDK, and C<sub>4</sub> acid decarboxylases (<xref ref-type="fig" rid="fig3">Figure 3</xref>) may relate to these pressures. A better understanding of the mechanisms underlying the recruitment of these enzymes (<xref ref-type="bibr" rid="bib6">Brown et al., 2011</xref>; <xref ref-type="bibr" rid="bib36">Kajala et al., 2012</xref>; <xref ref-type="bibr" rid="bib86">Wiludda et al., 2012</xref>) may help identify the key molecular events facilitating C<sub>4</sub> evolution.</p><p>Our data also suggest that modifications to leaf development drove the evolution of diverse C<sub>4</sub> sub-types. For example, we find that differences in the timing of events altering leaf vascular development and BS chloroplast division occur prior to the appearance of the alternative evolutionary pathways generating the NADP-ME and NAD-ME biochemical sub-types (<xref ref-type="fig" rid="fig4">Figure 4D</xref>). 
These traits are predicted to evolve prior to any alterations to the C<sub>4</sub> acid decarboxylase enzymes that traditionally define these sub-types (<xref ref-type="bibr" rid="bib20">Furbank, 2011</xref>). As a homologous mechanism has been shown to regulate the cell-specificity of gene expression in both NADP-ME and NAD-ME gene families in independent lineages (<xref ref-type="bibr" rid="bib6">Brown et al., 2011</xref>), it is unlikely that mechanisms underlying the recruitment of these enzymes drove the evolution of distinct sub-types. We therefore conclude that these different sub-types evolved as a consequence of alternative evolutionary histories in leaf development, rather than biochemical or photosynthetic pressures. This may explain why differences in the carboxylation efficiency or photosynthetic performance of different C<sub>4</sub> sub-types have never been detected (<xref ref-type="bibr" rid="bib20">Furbank, 2011</xref>), making the adaptive significance of different decarboxylation mechanisms difficult to explain. Instead, we propose that early evolutionary events determined the downstream phenotypes of C<sub>4</sub> sub-types by restricting lineages to independent pathways across phenotype space. The numerous differences in leaf development and cell biology between C<sub>4</sub> sub-types (<xref ref-type="bibr" rid="bib20">Furbank, 2011</xref>) may provide clues as to which developmental changes underlie subsequent differences in metabolic evolution.</p></sec><sec id="s3-3"><title>Convergent evolution was facilitated by flexibility in evolutionary trajectories</title><p>C<sub>4</sub> photosynthesis provides an excellent example of how independent lineages with a wide range of ancestral phenotypes can converge upon similar complex traits. 
Several studies on simpler traits have demonstrated that convergence upon a phenotype can be specified by diverse genotypes, and thus non-homologous molecular mechanisms in independent lineages (<xref ref-type="bibr" rid="bib87">Wittkopp et al., 2003</xref>; <xref ref-type="bibr" rid="bib30">Hill et al., 2006</xref>; <xref ref-type="bibr" rid="bib73">Steiner et al., 2009</xref>). Taken together, our data indicate that flexibility in the viable series of evolutionary events has also facilitated the convergence of this highly complex trait. First, we show that at least four distinct evolutionary trajectories underlie the evolution of C<sub>4</sub> lineages (<xref ref-type="fig" rid="fig4">Figure 4</xref>). Second, we find no evidence for inflexible linkage between the predicted timing of distinct C<sub>4</sub> traits (<xref ref-type="fig" rid="fig3s1">Figure 3—figure supplement 1</xref>). This diversity in viable pathways also helps explain why C<sub>4</sub> has been accessible to such a wide variety of species and not limited to a smaller subset of the angiosperm phylogeny. A recent model for the evolution of the biochemistry associated with the C<sub>4</sub> leaf also found that C<sub>4</sub> photosynthesis was accessible from any surrounding point of a fitness landscape (<xref ref-type="bibr" rid="bib28">Heckman et al., 2013</xref>). Our study of C<sub>4</sub> anatomy, biochemistry, and cell biology also suggests the C<sub>4</sub> phenotype is accessible from multiple trajectories. Encouragingly, the trajectories predicted by <xref ref-type="bibr" rid="bib28">Heckman et al. (2013)</xref> were found to pass through phenotypes of C<sub>3</sub>–C<sub>4</sub> species, despite the fact that these species were not used to parameterise the evolutionary landscape. 
As different mechanisms generate increased abundance and cell-specificity for the majority of C<sub>4</sub> enzymes in independent C<sub>4</sub> lineages (reviewed in <xref ref-type="bibr" rid="bib47">Langdale, 2011</xref>; <xref ref-type="bibr" rid="bib84">Williams et al., 2012</xref>), it is likely that mechanistic diversity underlies the multiple evolutionary pathways generating C<sub>4</sub> photosynthesis and may be a key factor in facilitating the convergent evolution of complex traits. This may benefit efforts to recapitulate the acquisition of C<sub>4</sub> photosynthesis through the genetic engineering of C<sub>3</sub> species (<xref ref-type="bibr" rid="bib29">Hibberd et al., 2008</xref>), expanding the molecular toolbox available to establish C<sub>4</sub> traits in distinct phenotypic backgrounds.</p></sec></sec><sec id="s4" sec-type="methods"><title>Methods</title><sec id="s4-1"><title>Biological data from C<sub>4</sub> intermediates</title><p>Data from eighteen C<sub>3</sub>, seventeen C<sub>4</sub>, and thirty-seven C<sub>3</sub>–C<sub>4</sub> species were consolidated from 43 studies that have examined the phenotypic characteristics of C<sub>3</sub>–C<sub>4</sub> species since their discovery in 1974 (<xref ref-type="table" rid="tbl1">Table 1</xref>). Values for sixteen of the most widely studied C<sub>4</sub> characteristics were recorded for each intermediate species, as well as congeneric C<sub>3</sub> and C<sub>4</sub> relatives where available. 
The majority of data on enzyme abundance and the number and size of bundle sheath (BS) chloroplasts were obtained from studies employing the same methodology and were thus cross-comparable (e.g., <xref ref-type="bibr" rid="bib22">Goldstein et al., 1976</xref>; <xref ref-type="bibr" rid="bib43">Ku et al., 1976</xref>; <xref ref-type="bibr" rid="bib46">Ku and Edwards, 1978</xref>; <xref ref-type="bibr" rid="bib70">Sayre et al., 1979</xref>; <xref ref-type="bibr" rid="bib33">Holaday et al., 1981</xref>; <xref ref-type="bibr" rid="bib31">Holaday and Black, 1981</xref>; <xref ref-type="bibr" rid="bib44">Ku et al., 1983</xref>; <xref ref-type="bibr" rid="bib2">Adams et al., 1986</xref>; <xref ref-type="bibr" rid="bib58">Rajendrudu et al., 1986</xref>; <xref ref-type="bibr" rid="bib45">Ku et al., 1991</xref>; <xref ref-type="bibr" rid="bib17">Devi and Raghavendra, 1993</xref>; <xref ref-type="bibr" rid="bib8">Bruhl and Perry, 1995</xref>; <xref ref-type="bibr" rid="bib10">Casati et al., 1999</xref>; <xref ref-type="bibr" rid="bib37">Keeley, 1999</xref>). These cross-comparable quantitative data were partitioned into presence/absence scores using two clustering techniques, the expectation-maximisation (EM) algorithm and hierarchical clustering (<xref ref-type="fig" rid="fig1s3">Figure 1—figure supplement 3</xref>). EM clustering was performed using a one-dimensional mixture model with two assigned components (i.e., presence and absence clusters), allowing for variable variance between the two components of the model, and variable population size between the two components. Hierarchical clustering was performed using a complete-linkage agglomerative approach, partitioning clusters by maximum distance according to a Euclidean distance metric. 
This approach identifies clusters with common variance, thus contrasting with the clusters of variable variance identifiable by EM.</p><p>For quantitative data not comparable with other studies (e.g., <xref ref-type="bibr" rid="bib46">Ku and Edwards, 1978</xref>; <xref ref-type="bibr" rid="bib60">Rathnam and Chollet, 1978</xref>; <xref ref-type="bibr" rid="bib59">Rathnam and Chollet, 1979</xref>; <xref ref-type="bibr" rid="bib32">Holaday et al., 1984</xref>; <xref ref-type="bibr" rid="bib7">Brown and Hattersley, 1989</xref>; <xref ref-type="bibr" rid="bib5">Beebe and Evert, 1990</xref>; <xref ref-type="bibr" rid="bib56">P’yankov et al., 1997</xref>; <xref ref-type="bibr" rid="bib23">Gowik et al., 2011</xref>), values obtained for intermediate species were scored as 1 if they were closer to the value for the C<sub>4</sub> control used in the original study, and as 0 if they were closer to the value for the C<sub>3</sub> control. For qualitative abundance data from immunoblots (e.g., <xref ref-type="bibr" rid="bib64">Rosche et al., 1994</xref>; <xref ref-type="bibr" rid="bib79">Voznesenskaya et al., 2007</xref>; <xref ref-type="bibr" rid="bib80">Voznesenskaya, 2010</xref>), relative band intensity was measured using ImageJ software (<xref ref-type="bibr" rid="bib1">Abramoff et al., 2004</xref>) and abundance was scored as 1 if the band intensity value was closer to the C<sub>4</sub> control, and as 0 if it was closer to the C<sub>3</sub> control. For qualitative cell-specificity data from immunolocalisations (e.g., <xref ref-type="bibr" rid="bib78">Voznesenskaya et al., 2001</xref>; <xref ref-type="bibr" rid="bib74">Ueno et al., 2003</xref>, <xref ref-type="bibr" rid="bib75">2006</xref>; <xref ref-type="bibr" rid="bib54">Muhaidat et al., 2011</xref>), a presence score was only assigned if the enzyme appeared completely absent from either mesophyll (M) or BS cells. 
We represent the phenotypic properties of each intermediate species as a string of <italic>L</italic> = 16 numbers (<xref ref-type="supplementary-material" rid="SD1-data">Figure 1—source data 1</xref>). We will refer to these strings as <italic>phenotype strings</italic> of L loci, with each locus describing data on the corresponding phenotypic trait. In a given locus, 0 denotes the absence of a C<sub>4</sub> trait, 1 denotes the presence of a C<sub>4</sub> trait, and 2 denotes missing data.</p></sec><sec id="s4-2"><title>Principal component analysis (PCA)</title><p>PCA was performed on five variables for C<sub>4</sub> cycle enzyme activity, with missing values estimated using the EM algorithm for PCA as described by <xref ref-type="bibr" rid="bib65">Roweis (1998)</xref>.</p></sec><sec id="s4-3"><title>Model transition networks</title><p>The fundamental element underlying our analysis is a transition network <italic>P</italic>, consisting of a directed graph with 2<sup><italic>L</italic></sup> = 65,536 nodes, and the weight of the edge <italic>P</italic><sub><italic>ij</italic></sub> denoting the probability of a transition occurring from node <italic>i</italic> to node <italic>j</italic>. Each node corresponds to a possible phenotype: we labeled nodes with labels <italic>l</italic><sub><italic>i</italic></sub> so that <italic>l</italic><sub><italic>i</italic></sub> was the binary representation of the phenotype at node <italic>i</italic>, and <italic>P</italic><sub><italic>ij</italic></sub> takes on the specific meaning of the probability of a transition from phenotype <italic>l</italic><sub><italic>i</italic></sub> to phenotype <italic>l</italic><sub><italic>j</italic></sub>. We made several restrictions on the structure of <italic>P</italic>. 
We allowed only transitions that change a given phenotype at one locus, so <italic>P</italic><sub><italic>ij</italic></sub> = 0 if <italic>H</italic> (<italic>l</italic><sub><italic>i</italic></sub>,<italic>l</italic><sub><italic>j</italic></sub>) ≠ 1, where <italic>H</italic> (<italic>b</italic><sub><italic>1</italic></sub>,<italic>b</italic><sub><italic>2</italic></sub>) is the Hamming distance between bitstrings <italic>b</italic><sub><italic>1</italic></sub> and <italic>b</italic><sub><italic>2</italic></sub>. Transitions that changed loci with value 1 to value 0 (steps back towards the C<sub>3</sub> state) were forbidden, so <italic>P</italic><sub><italic>ij</italic></sub> = 0 if <italic>H</italic> (<italic>l</italic><sub><italic>i</italic></sub>,<italic>l</italic><sub>0</sub>) > <italic>H</italic> (<italic>l</italic><sub><italic>j</italic></sub>,<italic>l</italic><sub>0</sub>), where <italic>l</italic><sub><italic>0</italic></sub> is the phenotypic string containing only zeroes. We assume that the possibility of events involving backwards steps, and multiple simultaneous trait acquisitions constitute second-order effects which will not strongly influence the inferred evolutionary dynamics.</p></sec><sec id="s4-4"><title>Evolutionary trajectories</title><p>Given the transition network <italic>P</italic>, we modelled the evolutionary trajectories that may give rise to C<sub>4</sub> photosynthesis through the picture of a discrete analogue to a Brownian bridge, that is as a stochastic process on <italic>P</italic> with constrained start and end positions (<xref ref-type="bibr" rid="bib63">Revuz and Yor, 1999</xref>). 
We enforced the start state of the process to be <inline-formula><mml:math id="inf1"><mml:mrow><mml:msub><mml:mi>l</mml:mi><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mn>3</mml:mn></mml:msub></mml:mrow></mml:msub><mml:mo>≡</mml:mo><mml:msub><mml:mi>l</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>…</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula> (the phenotype string of all zeroes) and the end state, through the imposed structure of <italic>P</italic>, to be <inline-formula><mml:math id="inf2"><mml:mrow><mml:msub><mml:mi>l</mml:mi><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mn>4</mml:mn></mml:msub></mml:mrow></mml:msub><mml:mo>≡</mml:mo><mml:msub><mml:mi>l</mml:mi><mml:mrow><mml:msup><mml:mn>2</mml:mn><mml:mi>L</mml:mi></mml:msup><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>…</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:math></inline-formula> (the string of all ones). The dynamics of the process between these points consisted of <italic>L</italic> steps, with a phenotypic trait being acquired at each step, and a step from node <italic>i</italic> to node <italic>j</italic> occurring with probability <italic>P</italic><sub><italic>ij</italic></sub>.</p></sec><sec id="s4-5"><title>Sampling intermediates</title><p>As many evolutionary trajectories may lead to the acquisition of the required phenotypic traits, we considered an ensemble of evolutionary trajectories for each transition network. 
Each member of this ensemble is started at <inline-formula><mml:math id="inf3"><mml:mrow><mml:msub><mml:mi>l</mml:mi><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mn>3</mml:mn></mml:msub></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> and allowed to step across the network according to probabilities <italic>P</italic>.</p><p>To compare the dynamics of a given transition network to the properties of observed biological intermediates, we pictured this ensemble of trajectories as a modification of a hidden Markov model (HMM [<xref ref-type="bibr" rid="bib57">Rabiner, 1989</xref>]). At each timestep in each individual trajectory, the process may with some probability emit a signal to the observer, with that signal being simply <italic>l</italic><sub><italic>i</italic></sub>, the label of the node at which the process currently resides. Over an ensemble of trajectories, a set of randomly emitted signals is thus built up (<xref ref-type="fig" rid="fig1s4">Figure 1—figure supplement 4</xref>).</p><p>We define a compatibility function between two strings as<disp-formula id="equ1"><label>(1)</label><mml:math id="m1"><mml:mrow><mml:mi>C</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi>t</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mover accent="true"><mml:mrow><mml:munder><mml:mstyle displaystyle="true"><mml:mo>∏</mml:mo></mml:mstyle><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:munder></mml:mrow><mml:mi>L</mml:mi></mml:mover></mml:mrow><mml:mi>c</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>s</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula><disp-formula id="equ2"><label>(2)</label><mml:math 
id="m2"><mml:mrow><mml:mi>c</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>s</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mtable><mml:mtr><mml:mtd><mml:mn>1</mml:mn></mml:mtd><mml:mtd><mml:mrow><mml:mtext>if</mml:mtext><mml:mo> </mml:mo><mml:msub><mml:mi>s</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo> </mml:mo><mml:mtext>or</mml:mtext><mml:mo> </mml:mo><mml:msub><mml:mi>s</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn>2</mml:mn><mml:mo> </mml:mo><mml:mtext>or</mml:mtext><mml:mo> </mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn>2</mml:mn><mml:mo>;</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>0</mml:mn></mml:mtd><mml:mtd><mml:mrow><mml:mtext>otherwise</mml:mtext><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula></p><p><italic>C</italic>(<italic>s</italic>,<italic>t</italic>) thus returns 1 if a signal comprising string <italic>s</italic> could be responsible for observation <italic>t</italic> once some of the loci within <italic>s</italic> have been obscured: signal <italic>s</italic> is compatible with observation <italic>t</italic>.</p></sec><sec id="s4-6"><title>Likelihood of observing biological data</title><p>We wish to compute the likelihood of observing biological data <italic>B</italic> given a transition network <italic>P</italic>. Under our model, this likelihood is calculated by considering the compatibility of randomly emitted signals from processes supported by <italic>P</italic> with the observed data <italic>B</italic>. 
We write<disp-formula id="equ3"><label>(3)</label><mml:math id="m3"><mml:mrow><mml:mi mathvariant="script">L</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>P</mml:mi><mml:mo>|</mml:mo><mml:mi>B</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:munder><mml:mstyle displaystyle="true"><mml:mo>∏</mml:mo></mml:mstyle><mml:mi>i</mml:mi></mml:munder><mml:munder><mml:mstyle displaystyle="true"><mml:mo>∑</mml:mo></mml:mstyle><mml:mrow><mml:mtext>chains</mml:mtext><mml:mo> </mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:munder><mml:munder><mml:mstyle displaystyle="true"><mml:mo>∑</mml:mo></mml:mstyle><mml:mrow><mml:mtext>signals</mml:mtext><mml:mo> </mml:mo><mml:mi>s</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mi mathvariant="normal">ℙ</mml:mi><mml:mrow><mml:mtext>chain</mml:mtext></mml:mrow></mml:msub><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>|</mml:mo><mml:mi>P</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:msub><mml:mi mathvariant="normal">ℙ</mml:mi><mml:mrow><mml:mtext>emission</mml:mtext></mml:mrow></mml:msub><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>s</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mi>C</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>B</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula></p><p>Here, <inline-formula><mml:math id="inf4"><mml:mrow><mml:msub><mml:mi mathvariant="normal">ℙ</mml:mi><mml:mrow><mml:mtext>chain</mml:mtext></mml:mrow></mml:msub><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>|</mml:mo><mml:mi>P</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula> is the probability of specific trajectory <italic>x</italic> arising on network <italic>P</italic>, 
<inline-formula><mml:math id="inf5"><mml:mrow><mml:msub><mml:mi mathvariant="normal">ℙ</mml:mi><mml:mrow><mml:mtext>emission</mml:mtext></mml:mrow></mml:msub><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>s</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula> is the probability that trajectory <italic>x</italic> emits signal <italic>s</italic>, and <italic>C</italic>(<italic>s</italic>,<italic>B</italic><sub><italic>i</italic></sub>) gives the compatibility of signal <italic>s</italic> with intermediate state <italic>B</italic><sub><italic>i</italic></sub>. The term within the product operator thus describes the probability that evolutionary dynamics on network <italic>P</italic> give rise to a signal that is compatible with species <italic>B</italic><sub><italic>i</italic></sub>, with the overall likelihood being the product of this probability over all observed species.</p></sec><sec id="s4-7"><title>Simulation</title><p>The uniform and random nature of signal emission means that <inline-formula><mml:math id="inf6"><mml:mrow><mml:msub><mml:mi mathvariant="normal">ℙ</mml:mi><mml:mrow><mml:mtext>emission</mml:mtext></mml:mrow></mml:msub><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>s</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula> is a constant if signal <italic>s</italic> can be emitted from trajectory <italic>x</italic>, and zero otherwise. 
Our simulation approach only produces signals which can be emitted from the trajectory under consideration, so <inline-formula><mml:math id="inf7"><mml:mrow><mml:msub><mml:mi mathvariant="normal">ℙ</mml:mi><mml:mrow><mml:mtext>emission</mml:mtext></mml:mrow></mml:msub><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>s</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula> will always take the same constant value (which depends on the probability of signal emissions). As we will be considering ratios of network likelihoods and will not be concerned with absolute likelihoods we will ignore this term henceforth. For each network <italic>P</italic> we simulate an ensemble of <italic>N</italic><sub>chain</sub> trajectories and, for each node encountered throughout this ensemble, we record compatibilities with each of the biologically observed intermediates. We sum these compatibilities over the ensemble, obtaining <inline-formula><mml:math id="inf8"><mml:mrow><mml:msub><mml:mo>∑</mml:mo><mml:mrow><mml:mtext>chains</mml:mtext><mml:mo> </mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi mathvariant="normal">ℙ</mml:mi><mml:mrow><mml:mtext>chain</mml:mtext></mml:mrow></mml:msub><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>|</mml:mo><mml:mi>P</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mi>C</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>B</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula>. 
A network that does not encounter any node compatible with a particular intermediate will thus be assigned zero likelihood; networks that encounter compatible nodes many times will be assigned high likelihoods.</p><p>For each transition network, we simulated <italic>N</italic><sub>chain</sub> = 2 × 10<sup>4</sup> individual trajectories running from C<sub>3</sub> to C<sub>4</sub>. This value was chosen after preliminary investigations to analyse the ability of trajectory ensembles to broadly sample available paths on networks.</p></sec><sec id="s4-8"><title>Bayesian MCMC over compatible networks</title><p>Given uninformative prior knowledge about the evolutionary dynamics leading to C<sub>4</sub> photosynthesis (specifically, our prior involves each possible transition from a given node being assigned equal probability), we aimed to build a posterior distribution over a suitable description of the evolutionary dynamics. We represented the dynamics supported on a network <italic>P</italic> through a matrix <italic>π</italic>, where <italic>π</italic><sub><italic>i,n</italic></sub> describes the probability that acquisition of trait <italic>i</italic> occurs at the <italic>n</italic>th step in an evolutionary trajectory. The values of matrix <italic>π</italic><sub><italic>i,n</italic></sub> were built up from sampling over the ensemble of trajectories simulated on <italic>P</italic>.</p><p>We used Bayesian MCMC to sample networks based on their associated likelihood values (<xref ref-type="bibr" rid="bib81">Wasserman, 2004</xref>). At each iteration, we perturbed the transition probability of the current network <italic>P</italic> a small amount (see below) to yield a new trial network <italic>P</italic>’. 
We calculated <inline-formula><mml:math id="inf9"><mml:mrow><mml:mi mathvariant="script">L</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>P</mml:mi><mml:mtext>'</mml:mtext><mml:mo>|</mml:mo><mml:mi>B</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula> and accepted <italic>P</italic>’ as the new network if <inline-formula><mml:math id="inf10"><mml:mrow><mml:mfrac><mml:mrow><mml:mi mathvariant="script">L</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>P</mml:mi><mml:mtext>'</mml:mtext><mml:mo>|</mml:mo><mml:mi>B</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>P</mml:mi><mml:mo>|</mml:mo><mml:mi>B</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mfrac><mml:mo>></mml:mo><mml:mi>r</mml:mi></mml:mrow></mml:math></inline-formula>, where <italic>r</italic> was drawn from <inline-formula><mml:math id="inf11"><mml:mrow><mml:mi mathvariant="script">U</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula>. 
For practical reasons we implemented this scheme using log-likelihoods.</p><p>The perturbations we applied to transition probabilities are Normally distributed in logarithmic space: for each edge <italic>w</italic><sub><italic>ij</italic></sub> we used <inline-formula><mml:math id="inf12"><mml:mrow><mml:mi>w</mml:mi><mml:msub><mml:mi>'</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mtext>exp</mml:mtext><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtext>ln</mml:mtext><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mi mathvariant="script">N</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msup><mml:mi>σ</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>. To show that this scheme obeys detailed balance, consider two states <inline-formula><mml:math id="inf13"><mml:mi>A</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="inf14"><mml:mrow><mml:mi>A</mml:mi><mml:mo>'</mml:mo></mml:mrow></mml:math></inline-formula>, for simplicity described by a one-dimensional scalar quantity. Consider the proposed move from <italic>A</italic> when Δ is the result of the random draw. 
This proposal is <inline-formula><mml:math id="inf15"><mml:mrow><mml:mi>A</mml:mi><mml:mo>→</mml:mo><mml:mi>A</mml:mi><mml:mo>'</mml:mo></mml:mrow></mml:math></inline-formula> if <inline-formula><mml:math id="inf16"><mml:mrow><mml:mi>A</mml:mi><mml:mo>'</mml:mo><mml:mo>=</mml:mo><mml:mtext>exp</mml:mtext><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtext>ln</mml:mtext><mml:mi>A</mml:mi><mml:mo>+</mml:mo><mml:mtext>Δ</mml:mtext></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>A</mml:mi><mml:msup><mml:mi>e</mml:mi><mml:mtext>Δ</mml:mtext></mml:msup></mml:mrow></mml:math></inline-formula>, implying that <inline-formula><mml:math id="inf17"><mml:mrow><mml:mi>A</mml:mi><mml:mo>=</mml:mo><mml:mi>A</mml:mi><mml:mtext>'</mml:mtext><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mo>−</mml:mo><mml:mtext>Δ</mml:mtext></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>. The probability of proposing move <inline-formula><mml:math id="inf18"><mml:mrow><mml:mi>A</mml:mi><mml:mo>→</mml:mo><mml:mi>A</mml:mi><mml:mo>'</mml:mo></mml:mrow></mml:math></inline-formula> is thus <inline-formula><mml:math id="inf19"><mml:mrow><mml:mi mathvariant="script">N</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>=</mml:mo><mml:mtext>Δ</mml:mtext><mml:mo>|</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msup><mml:mi>σ</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, and the probability of proposing the reverse move <inline-formula><mml:math id="inf20"><mml:mrow><mml:mi>A</mml:mi><mml:mo>'</mml:mo><mml:mo>→</mml:mo><mml:mi>A</mml:mi></mml:mrow></mml:math></inline-formula> is <inline-formula><mml:math id="inf21"><mml:mrow><mml:mi 
mathvariant="script">N</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>=</mml:mo><mml:mo>−</mml:mo><mml:mtext>Δ</mml:mtext><mml:mo>|</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msup><mml:mi>σ</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>. By the symmetry of the Normal distribution, these two probabilities are equal.</p><p>We started each MCMC run with a randomly initialised transition matrix. We allowed 2 × 10<sup>4</sup> burn-in steps then sampled over a further 2 × 10<sup>5</sup> steps. The value σ = 0.1 was chosen for the perturbation kernel. These values were chosen through an initial investigation to analyse the convergence of MCMC runs under different parameterisations. We performed 40 MCMC runs for each experiment and confirmed that the resulting posterior distributions had converged and yielded consistent results.</p></sec><sec id="s4-9"><title>Summary dynamics matrices</title><p>We report the posterior distributions <inline-formula><mml:math id="inf22"><mml:mrow><mml:mi mathvariant="normal">ℙ</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>π</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula> inferred from sampling compatible networks as above. 
In the coarse-grained time representation, we use <inline-formula><mml:math id="inf23"><mml:mrow><mml:msubsup><mml:mi>π</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi><mml:mi>'</mml:mi></mml:mrow><mml:mrow><mml:mi>C</mml:mi><mml:mi>G</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi mathvariant="normal">ℙ</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:msup><mml:mstyle displaystyle="true"><mml:mo>∑</mml:mo></mml:mstyle><mml:mtext></mml:mtext></mml:msup></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>+</mml:mo><mml:mn>4</mml:mn><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mi>'</mml:mi><mml:mo>−</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mrow><mml:mn>4</mml:mn><mml:mi>n</mml:mi><mml:mi>'</mml:mi></mml:mrow></mml:msubsup><mml:msub><mml:mi>π</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula>, summing over sets of ordinals of size 4.</p><p>We used the transition network <italic>P</italic>, rather than a more coarse-grained representation of the evolutionary dynamics (e.g., the summary matrices <italic>π</italic>), as the fundamental element within our simulations so as not to discard possible details that would be lost in a coarse-grained approach – for example, the presence of multiple distinct pathways, which may be averaged over in a summary matrix.</p></sec><sec id="s4-10"><title>Proofs of principle</title><p>To verify our approach, we constructed artificial data sets, consisting of sets of strings in which phenotypic traits were acquired in a single ordering. 
Specifically, <italic>π</italic><sub><italic>i</italic>,<italic>n</italic></sub> = <italic>δ</italic><sub><italic>i</italic>,<italic>n</italic></sub>, so the first step always resulted in acquiring the first trait, and so on. To test the approach in a pleiotropic setting, where multiple traits were acquired simultaneously, we also constructed data sets where traits were acquired at only four timesteps, each corresponding to the simultaneous acquisition of four traits. We subjected these data sets to our inferential machinery with all data intact, and with 50% of data points occluded, to determine the sensitivity and robustness of our approach (<xref ref-type="fig" rid="fig2">Figure 2A–B</xref>). The approach accurately determined the ordering of events in both the intact and occluded cases and assigned very similar posterior probability distributions to the ordering of those traits acquired simultaneously.</p></sec><sec id="s4-11"><title>Comparing the evolution of multiple C<sub>4</sub> sub-types</title><p>To compare the pathways generating C<sub>4</sub> in monocots and eudicots, and in NADP-ME and NAD-ME sub-type lineages, we performed inference on two data sets, <italic>B</italic><sub>1</sub> and <italic>B</italic><sub>2</sub>, each comprising phenotype measurements from one of the groups of interest. 
We reported the posteriors on the resulting summary dynamics <inline-formula><mml:math id="inf24"><mml:mrow><mml:mi mathvariant="normal">ℙ</mml:mi><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>π</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula> as before. For the principal components analysis (PCA), we sampled 10<sup>3</sup> summary dynamics matrices <italic>π</italic><sub><italic>i,n</italic></sub> from the inferred posterior distribution during the Bayesian MCMC procedure and performed PCA on these sampled matrices.</p></sec><sec id="s4-12"><title>Predictions</title><p>When a simulated chain encountered a phenotypic node compatible with a given biological intermediate, the values of traits corresponding to missing data in the biological data were recorded. These recorded values, collected across the sampled set of networks, allowed us to assign probabilities to the values of biologically unobserved traits, based on how often compatible dynamics encountered each of the corresponding phenotypic possibilities. For example, if 70% of paths on network <italic>P</italic> pass through point 101 and 30% pass through point 001, we infer a 70% probability that the missing trait in biological intermediate 201 takes the value 1. Predictions were presented if the inferred probability of a ‘1’ value was >75% (predicting a ‘1’) or <25% (predicting a ‘0’). 
If one of these inequalities held and the limiting value fell outside one standard deviation of the inferred probability (i.e., for mean <italic>μ</italic> and standard deviation <italic>σ</italic>, <italic>μ</italic> > 0.75 and <italic>μ</italic> − <italic>σ</italic> > 0.75 [predicting a ‘1’] or <italic>μ</italic> < 0.25 and <italic>μ</italic> + <italic>σ</italic> < 0.25 [predicting a ‘0’]), the prediction was presented as ‘strict’.</p></sec><sec id="s4-13"><title>Acquisition ordering and evidence against a single pathway</title><p>We used a dynamic programming approach to explore whether a deterministic sequence of events, with a trait <italic>T</italic><sub><italic>n</italic></sub> always being acquired at timestep <italic>n</italic> (<inline-formula><mml:math id="inf25"><mml:mrow><mml:msub><mml:mi>π</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>δ</mml:mi><mml:mrow><mml:msub><mml:mi>T</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>), was compatible with the biological data. Performing an exhaustive search over sequences of single transitions that were compatible with the observed data, we identified several such sequences that accounted for all but one trait acquisition, but no single sequence accounted for all the data.</p></sec><sec id="s4-14"><title>Contingent trait acquisition</title><p>To explore the possibility of multiple traits being acquired simultaneously, we tracked acquisition probabilities for later traits given that a certain trait was acquired first. 
This tracking was performed over all sampled compatible networks, building up ‘contingent’ acquisition tables <italic>γ</italic> with the (<italic>i</italic>, <italic>j</italic>)th element given by <inline-formula><mml:math id="inf26"><mml:mrow><mml:mtext>ℙ</mml:mtext><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>π</mml:mi><mml:mrow><mml:mi>j</mml:mi><mml:mo>,</mml:mo><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>π</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>≠</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:math></inline-formula>. If a pair of traits <italic>i</italic> and <italic>j</italic> were acquired simultaneously, we would expect both <italic>γ</italic><sub><italic>ij</italic></sub> and <italic>γ</italic><sub><italic>ji</italic></sub> to be higher than in the non-contingent case (as <italic>j</italic> should always appear to be acquired immediately after <italic>i</italic> and vice versa).</p></sec><sec id="s4-15"><title>Quantitative real-time PCR (qPCR)</title><p>RNA was extracted from mature leaves of six <italic>Flaveria</italic> species as part of the One Thousand Plants Consortium (<ext-link ext-link-type="uri" xlink:href="http://www.onekp.com">www.onekp.com</ext-link>), using the hot acid phenol protocol as described by <xref ref-type="bibr" rid="bib35">Johnson et al. (2012)</xref> (protocol no. 12). cDNA was synthesised from 0.5 µg RNA using Superscript II (Life Technologies, Glasgow, U.K.) following the manufacturer’s instructions. An oligo dT primer (Roche, Basel, Switzerland) was used to selectively transcribe polyadenylated transcripts. To each RNA sample, 1 fmol GUS transcript was added for use as an exogenous control or ‘RNA spike’, against which measured transcript abundance was normalised as described by <xref ref-type="bibr" rid="bib72">Smith et al. 
(2003)</xref>.</p><p>qPCR was performed as described by <xref ref-type="bibr" rid="bib9">Bustin (2000)</xref> using the DNA-binding marker SYBR Green (Sigma Aldrich, St. Louis, MO) according to the manufacturer’s instructions. Primers were designed using cDNA sequences for <italic>Flaveria</italic> species available at GenBank (<ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/genbank">http://www.ncbi.nlm.nih.gov/genbank</ext-link>) and synthesised by Life Technologies. Amplification was performed on a Rotor-Gene Q instrument (Qiagen, Hilden, Germany) with the following cycling parameters: 94°C for 2 min, followed by 40 cycles of 94°C for 20 s, 60°C for 30 s and 72°C for 30 s, followed by a 5 min incubation at 72°C. Relative transcript abundance was calculated as described by <xref ref-type="bibr" rid="bib48">Livak and Schmittgen (2001)</xref>.</p></sec></sec></body><back><ack id="ack"><title>Acknowledgements</title><p>We thank S Kelly, JA Langdale, H Griffiths, and N Jones for advice.</p></ack><sec sec-type="additional-information"><title>Additional information</title><fn-group content-type="competing-interest"><title>Competing interests</title><fn fn-type="conflict" id="conf1"><p>The authors declare that no competing interests exist.</p></fn></fn-group><fn-group content-type="author-contribution"><title>Author contributions</title><fn fn-type="con" id="con1"><p>BPW, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article</p></fn><fn fn-type="con" id="con2"><p>IGJ, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article</p></fn><fn fn-type="con" id="con3"><p>SC, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article</p></fn><fn fn-type="con" id="con4"><p>JMH, Conception and design, Analysis and interpretation of data, Drafting or revising the 
article</p></fn></fn-group></sec><ref-list><title>References</title><ref id="bib1"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Abramoff</surname><given-names>MD</given-names></name><name><surname>Magalhaes</surname><given-names>PJ</given-names></name><name><surname>Ram</surname><given-names>SJ</given-names></name></person-group><year>2004</year><article-title>Image processing with ImageJ</article-title><source>Biophotonics Int</source><volume>11</volume><fpage>36</fpage><lpage>42</lpage></element-citation></ref><ref id="bib2"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Adams</surname><given-names>CA</given-names></name><name><surname>Leung</surname><given-names>F</given-names></name><name><surname>Sun</surname><given-names>SSM</given-names></name></person-group><year>1986</year><article-title>Molecular properties of phosphoenolpyruvate carboxylase from C<sub>3</sub>, C<sub>3</sub>-C<sub>4</sub> intermediate, and C<sub>4</sub> <italic>Flaveria</italic> species</article-title><source>Planta</source><volume>167</volume><fpage>218</fpage><lpage>25</lpage><pub-id pub-id-type="doi">10.1007/BF00391418</pub-id></element-citation></ref><ref id="bib3"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Akyildiz</surname><given-names>M</given-names></name><name><surname>Gowik</surname><given-names>U</given-names></name><name><surname>Engelmann</surname><given-names>S</given-names></name><name><surname>Koczor</surname><given-names>M</given-names></name><name><surname>Streubel</surname><given-names>M</given-names></name><name><surname>Westhoff</surname><given-names>P</given-names></name></person-group><year>2007</year><article-title>Evolution and function of a <italic>cis</italic>-regulatory module for mesophyll-specific gene expression in the C<sub>4</sub> dicot <italic>Flaveria 
trinervia</italic></article-title><source>Plant Cell</source><volume>19</volume><fpage>3391</fpage><lpage>402</lpage><pub-id pub-id-type="doi">10.1105/tpc.107.053322</pub-id></element-citation></ref><ref id="bib4"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Aubry</surname><given-names>S</given-names></name><name><surname>Brown</surname><given-names>NJ</given-names></name><name><surname>Hibberd</surname><given-names>JM</given-names></name></person-group><year>2011</year><article-title>The role of proteins in C<sub>3</sub> plants prior to their recruitment into the C<sub>4</sub> pathway</article-title><source>J Exp Bot</source><volume>62</volume><fpage>3049</fpage><lpage>59</lpage><pub-id pub-id-type="doi">10.1093/jxb/err012</pub-id></element-citation></ref><ref id="bib5"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Beebe</surname><given-names>DU</given-names></name><name><surname>Evert</surname><given-names>RF</given-names></name></person-group><year>1990</year><article-title>The morphology and anatomy of the leaf of <italic>Moricandia arvensis</italic> (L.) DC. 
(Brassicaceae)</article-title><source>Bot Gaz</source><volume>151</volume><fpage>184</fpage><lpage>203</lpage><pub-id pub-id-type="doi">10.1086/337818</pub-id></element-citation></ref><ref id="bib6"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Brown</surname><given-names>NJ</given-names></name><name><surname>Newell</surname><given-names>CA</given-names></name><name><surname>Stanley</surname><given-names>S</given-names></name><name><surname>Chen</surname><given-names>JE</given-names></name><name><surname>Perrin</surname><given-names>AJ</given-names></name><name><surname>Kajala</surname><given-names>K</given-names></name><etal/></person-group><year>2011</year><article-title>Independent and parallel recruitment of preexisting mechanisms underlying C<sub>4</sub> photosynthesis</article-title><source>Science</source><volume>331</volume><fpage>1436</fpage><lpage>9</lpage><pub-id pub-id-type="doi">10.1126/science.1201248</pub-id></element-citation></ref><ref id="bib7"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Brown</surname><given-names>RH</given-names></name><name><surname>Hattersley</surname><given-names>PW</given-names></name></person-group><year>1989</year><article-title>Leaf anatomy of C<sub>3</sub>-C<sub>4</sub> species as related to evolution of C<sub>4</sub> photosynthesis</article-title><source>Plant Physiol</source><volume>91</volume><fpage>1543</fpage><lpage>50</lpage><pub-id pub-id-type="doi">10.1104/pp.91.4.1543</pub-id></element-citation></ref><ref id="bib8"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bruhl</surname><given-names>JJ</given-names></name><name><surname>Perry</surname><given-names>S</given-names></name></person-group><year>1995</year><article-title>Photosynthetic pathway-related ultrastructure of C<sub>3</sub>, C<sub>4</sub> and C<sub>3</sub>-like C<sub>3</sub>-C<sub>4</sub> 
intermediate sedges (Cyperaceae), with special reference to <italic>Eleocharis</italic></article-title><source>Aust J Plant Physiol</source><volume>22</volume><fpage>521</fpage><pub-id pub-id-type="doi">10.1071/PP9950521</pub-id></element-citation></ref><ref id="bib9"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bustin</surname><given-names>SA</given-names></name></person-group><year>2000</year><article-title>Absolute quantification of mRNA using real-time reverse transcription polymerase chain reaction assays</article-title><source>J Mol Endocrinol</source><volume>25</volume><fpage>169</fpage><lpage>93</lpage><pub-id pub-id-type="doi">10.1677/jme.0.0250169</pub-id></element-citation></ref><ref id="bib10"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Casati</surname><given-names>P</given-names></name><name><surname>Fresco</surname><given-names>AG</given-names></name><name><surname>Andreo</surname><given-names>CS</given-names></name><name><surname>Drincovich</surname><given-names>MF</given-names></name></person-group><year>1999</year><article-title>An intermediate form of NADP-malic enzyme from the C<sub>3</sub>-C<sub>4</sub> intermediate species <italic>Flaveria floridana</italic></article-title><source>Plant Science</source><volume>147</volume><fpage>101</fpage><lpage>9</lpage><pub-id pub-id-type="doi">10.1016/S0168-9452(99)00101-6</pub-id></element-citation></ref><ref id="bib11"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Cheng</surname><given-names>SH</given-names></name><name><surname>Moore</surname><given-names>BD</given-names></name><name><surname>Edwards</surname><given-names>GE</given-names></name><name><surname>Ku</surname><given-names>MS</given-names></name></person-group><year>1988</year><article-title>Photosynthesis in <italic>Flaveria brownii</italic>, a C<sub>4</sub>-like species: leaf 
anatomy, characteristics of CO<sub>2</sub> exchange, compartmentation of photosynthetic enzymes, and metabolism of CO<sub>2</sub></article-title><source>Plant Physiol</source><volume>87</volume><fpage>867</fpage><lpage>73</lpage><pub-id pub-id-type="doi">10.1104/pp.87.4.867</pub-id></element-citation></ref><ref id="bib15"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Christin</surname><given-names>P-A</given-names></name><name><surname>Osborne</surname><given-names>CP</given-names></name><name><surname>Chatelet</surname><given-names>DS</given-names></name><name><surname>Columbus</surname><given-names>JT</given-names></name><name><surname>Besnard</surname><given-names>G</given-names></name><name><surname>Hodkinson</surname><given-names>TR</given-names></name><etal/></person-group><year>2013</year><article-title>Anatomical enablers and the evolution of C<sub>4</sub> photosynthesis in grasses</article-title><source>Proc Natl Acad Sci USA</source><volume>110</volume><fpage>1381</fpage><lpage>6</lpage><pub-id pub-id-type="doi">10.1073/pnas.1216777110</pub-id></element-citation></ref><ref id="bib16"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Christin</surname><given-names>P-A</given-names></name><name><surname>Osborne</surname><given-names>CP</given-names></name><name><surname>Sage</surname><given-names>RF</given-names></name><name><surname>Arakaki</surname><given-names>M</given-names></name><name><surname>Edwards</surname><given-names>EJ</given-names></name></person-group><year>2011b</year><article-title>C<sub>4</sub> eudicots are not younger than C<sub>4</sub> monocots</article-title><source>J Exp Bot</source><volume>62</volume><fpage>3171</fpage><lpage>81</lpage><pub-id pub-id-type="doi">10.1093/jxb/err041</pub-id></element-citation></ref><ref id="bib12"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Christin</surname><given-names>P-A</given-names></name><name><surname>Sage</surname><given-names>TL</given-names></name><name><surname>Edwards</surname><given-names>EJ</given-names></name><name><surname>Ogburn</surname><given-names>RM</given-names></name><name><surname>Khoshravesh</surname><given-names>R</given-names></name><name><surname>Sage</surname><given-names>RF</given-names></name></person-group><year>2011a</year><article-title>Complex evolutionary transitions and the significance of C<sub>3</sub>-C<sub>4</sub> intermediate forms of photosynthesis in Molluginaceae</article-title><source>Evolution</source><volume>65</volume><fpage>643</fpage><lpage>60</lpage><pub-id pub-id-type="doi">10.1111/j.1558-5646.2010.01168.x</pub-id></element-citation></ref><ref id="bib13"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Christin</surname><given-names>P</given-names></name><name><surname>Besnard</surname><given-names>G</given-names></name><name><surname>Samaritani</surname><given-names>E</given-names></name><name><surname>Duvall</surname><given-names>M</given-names></name><name><surname>Hodkinson</surname><given-names>T</given-names></name><name><surname>Savolainen</surname><given-names>V</given-names></name><etal/></person-group><year>2008</year><article-title>Oligocene CO<sub>2</sub> decline promoted C<sub>4</sub> photosynthesis in grasses</article-title><source>Curr Biol</source><volume>18</volume><fpage>37</fpage><lpage>43</lpage><pub-id pub-id-type="doi">10.1016/j.cub.2007.11.058</pub-id></element-citation></ref><ref id="bib14"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Christin</surname><given-names>P</given-names></name><name><surname>Salamin</surname><given-names>N</given-names></name><name><surname>Savolainen</surname><given-names>V</given-names></name><name><surname>Duvall</surname><given-names>M</given-names></name><name><surname>Besnard</surname><given-names>G</given-names></name></person-group><year>2007</year><article-title>C<sub>4</sub> photosynthesis evolved in grasses via parallel adaptive genetic changes</article-title><source>Curr Biol</source><volume>17</volume><fpage>1241</fpage><lpage>7</lpage><pub-id pub-id-type="doi">10.1016/j.cub.2007.06.036</pub-id></element-citation></ref><ref id="bib17"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Devi</surname><given-names>MT</given-names></name><name><surname>Raghavendra</surname><given-names>AS</given-names></name></person-group><year>1993</year><article-title>Partial reduction in activities of photorespiratory enzymes in C<sub>3</sub>-C<sub>4</sub> intermediates of <italic>Alternanthera</italic> and <italic>Parthenium</italic></article-title><source>J Exp Bot</source><volume>44</volume><fpage>779</fpage><lpage>84</lpage><pub-id pub-id-type="doi">10.1093/jxb/44.4.779</pub-id></element-citation></ref><ref id="bib18"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Devi</surname><given-names>MT</given-names></name><name><surname>Rajagopalan</surname><given-names>AV</given-names></name><name><surname>Raghavendra</surname><given-names>AS</given-names></name></person-group><year>1995</year><article-title>Predominant localization of mitochondria enriched with glycine-decarboxylating enzymes in bundle sheath cells of <italic>Alternanthera tenella</italic>, a C<sub>3</sub>-C<sub>4</sub> intermediate species</article-title><source>Plant Cell Environ</source><volume>18</volume><fpage>589</fpage><lpage>94</lpage><pub-id 
pub-id-type="doi">10.1111/j.1365-3040.1995.tb00559.x</pub-id></element-citation></ref><ref id="bib19"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Edwards</surname><given-names>EJ</given-names></name><name><surname>Osborne</surname><given-names>CP</given-names></name><name><surname>Stromberg</surname><given-names>CAE</given-names></name><name><surname>Smith</surname><given-names>SA</given-names></name><collab>C<sub>4</sub> grasses consortium</collab></person-group><year>2010</year><article-title>The origins of C<sub>4</sub> grasslands: integrating evolutionary and ecosystem science</article-title><source>Science</source><volume>328</volume><fpage>587</fpage><lpage>91</lpage><pub-id pub-id-type="doi">10.1126/science.1177216</pub-id></element-citation></ref><ref id="bib20"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Furbank</surname><given-names>RT</given-names></name></person-group><year>2011</year><article-title>Evolution of the C<sub>4</sub> photosynthetic mechanism: are there really three C<sub>4</sub> acid decarboxylation types?</article-title><source>J Exp Bot</source><volume>62</volume><fpage>3103</fpage><lpage>8</lpage><pub-id pub-id-type="doi">10.1093/jxb/err080</pub-id></element-citation></ref><ref id="bib21"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gavrilets</surname><given-names>S</given-names></name></person-group><year>1997</year><article-title>Evolution and speciation on holey adaptive landscapes</article-title><source>Trends Ecol Evol</source><volume>12</volume><fpage>307</fpage><lpage>12</lpage><pub-id pub-id-type="doi">10.1016/S0169-5347(97)01098-7</pub-id></element-citation></ref><ref id="bib22"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Goldstein</surname><given-names>L</given-names></name><name><surname>Ray</surname><given-names>T</given-names></name><name><surname>Kestler</surname><given-names>D</given-names></name><name><surname>Mayne</surname><given-names>B</given-names></name><name><surname>Brown</surname><given-names>R</given-names></name><name><surname>Black</surname><given-names>C</given-names></name></person-group><year>1976</year><article-title>Biochemical characterization of panicum species which are intermediate between C<sub>3</sub> and C<sub>4</sub> photosynthesis plants</article-title><source>Plant Sci</source><volume>6</volume><fpage>85</fpage><lpage>90</lpage><pub-id pub-id-type="doi">10.1016/0304-4211(76)90140-1</pub-id></element-citation></ref><ref id="bib23"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gowik</surname><given-names>U</given-names></name><name><surname>Brautigam</surname><given-names>A</given-names></name><name><surname>Weber</surname><given-names>KL</given-names></name><name><surname>Weber</surname><given-names>APM</given-names></name><name><surname>Westhoff</surname><given-names>P</given-names></name></person-group><year>2011</year><article-title>Evolution of C<sub>4</sub> photosynthesis in the genus <italic>Flaveria</italic>: how many and which genes does it take to make C<sub>4</sub>?</article-title><source>Plant Cell</source><volume>23</volume><fpage>2087</fpage><lpage>105</lpage><pub-id pub-id-type="doi">10.1105/tpc.111.086264</pub-id></element-citation></ref><ref id="bib24"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Griffiths</surname><given-names>H</given-names></name><name><surname>Weller</surname><given-names>G</given-names></name><name><surname>Toy</surname><given-names>LFM</given-names></name><name><surname>Dennis</surname><given-names>RJ</given-names></name></person-group><year>2013</year><article-title>You’re so vein: bundle sheath physiology, phylogeny and evolution in C<sub>3</sub> and C<sub>4</sub> plants</article-title><source>Plant Cell Environ</source><volume>36</volume><fpage>249</fpage><lpage>61</lpage><pub-id pub-id-type="doi">10.1111/j.1365-3040.2012.02585.x</pub-id></element-citation></ref><ref id="bib25"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hatch</surname><given-names>MD</given-names></name><name><surname>Kagawa</surname><given-names>T</given-names></name><name><surname>Craig</surname><given-names>S</given-names></name></person-group><year>1975</year><article-title>Subdivision of C<sub>4</sub> pathway species based on differing C<sub>4</sub> acid decarboxylating systems and ultrastructural features</article-title><source>Aust J Plant Physiol</source><volume>2</volume><fpage>111</fpage><lpage>28</lpage><pub-id pub-id-type="doi">10.1071/PP9750111</pub-id></element-citation></ref><ref id="bib26"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hattersley</surname><given-names>PW</given-names></name></person-group><year>1984</year><article-title>Characterization of C<sub>4</sub> type leaf anatomy in grasses (Poaceae). 
Mesophyll: bundle sheath area ratios</article-title><source>Ann Bot</source><volume>53</volume><fpage>163</fpage><lpage>80</lpage></element-citation></ref><ref id="bib27"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hattersley</surname><given-names>PW</given-names></name><name><surname>Stone</surname><given-names>NE</given-names></name></person-group><year>1986</year><article-title>Photosynthetic enzyme activities in the C<sub>3</sub>-C<sub>4</sub> intermediate <italic>Neurachne minor</italic></article-title><source>Aust J Plant Physiol</source><volume>13</volume><fpage>399</fpage><pub-id pub-id-type="doi">10.1071/PP9860399</pub-id></element-citation></ref><ref id="bib28"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Heckman</surname><given-names>D</given-names></name><name><surname>Schulze</surname><given-names>S</given-names></name><name><surname>Denton</surname><given-names>A</given-names></name><name><surname>Gowik</surname><given-names>U</given-names></name><name><surname>Westhoff</surname><given-names>P</given-names></name><name><surname>Weber</surname><given-names>APM</given-names></name><etal/></person-group><year>2013</year><article-title>Predicting C<sub>4</sub> photosynthesis evolution: modular, individually adaptive steps on a Mount Fuji fitness landscape</article-title><source>Cell</source><volume>153</volume><fpage>1579</fpage><lpage>88</lpage><pub-id pub-id-type="doi">10.1016/j.cell.2013.04.058</pub-id></element-citation></ref><ref id="bib29"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hibberd</surname><given-names>JM</given-names></name><name><surname>Sheehy</surname><given-names>JE</given-names></name><name><surname>Langdale</surname><given-names>JA</given-names></name></person-group><year>2008</year><article-title>Using C<sub>4</sub> photosynthesis to increase the yield of 
rice-rationale and feasibility</article-title><source>Curr Opin Plant Biol</source><volume>11</volume><fpage>228</fpage><lpage>31</lpage><pub-id pub-id-type="doi">10.1016/j.pbi.2007.11.002</pub-id></element-citation></ref><ref id="bib30"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hill</surname><given-names>RC</given-names></name><name><surname>Egydio de Carvalho</surname><given-names>C</given-names></name><name><surname>Salogiannis</surname><given-names>J</given-names></name><name><surname>Schlager</surname><given-names>B</given-names></name><name><surname>Pilgrim</surname><given-names>D</given-names></name><name><surname>Haag</surname><given-names>ES</given-names></name></person-group><year>2006</year><article-title>Genetic flexibility in the convergent evolution of hermaphroditism in <italic>Caenorhabditis</italic> nematodes</article-title><source>Dev Cell</source><volume>10</volume><fpage>531</fpage><lpage>8</lpage><pub-id pub-id-type="doi">10.1016/j.devcel.2006.02.002</pub-id></element-citation></ref><ref id="bib31"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Holaday</surname><given-names>AS</given-names></name><name><surname>Black</surname><given-names>CC</given-names></name></person-group><year>1981</year><article-title>Comparative characterization of phosphoenolpyruvate carboxylase in C<sub>3</sub>, C<sub>4</sub>, and C<sub>3</sub>-C<sub>4</sub> intermediate <italic>Panicum</italic> species</article-title><source>Plant Physiol</source><volume>67</volume><fpage>330</fpage><lpage>4</lpage><pub-id pub-id-type="doi">10.1104/pp.67.2.330</pub-id></element-citation></ref><ref id="bib32"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Holaday</surname><given-names>AS</given-names></name><name><surname>Lee</surname><given-names>KW</given-names></name><name><surname>Chollet</surname><given-names>R</given-names></name></person-group><year>1984</year><article-title>C<sub>3</sub>-C<sub>4</sub> intermediate species in the genus <italic>Flaveria</italic>: leaf anatomy, ultrastructure, and the effect of O<sub>2</sub> on the CO<sub>2</sub> compensation concentration</article-title><source>Planta</source><volume>160</volume><fpage>25</fpage><lpage>32</lpage><pub-id pub-id-type="doi">10.1007/BF00392462</pub-id></element-citation></ref><ref id="bib33"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Holaday</surname><given-names>AS</given-names></name><name><surname>Shieh</surname><given-names>Y-J</given-names></name><name><surname>Lee</surname><given-names>KW</given-names></name><name><surname>Chollet</surname><given-names>R</given-names></name></person-group><year>1981</year><article-title>Anatomical, ultrastructural and enzymic studies of leaves of <italic>Moricandia arvensis</italic>, a C<sub>3</sub>-C<sub>4</sub> intermediate species</article-title><source>Biochim Biophys Acta</source><volume>637</volume><fpage>334</fpage><lpage>41</lpage><pub-id pub-id-type="doi">10.1016/0005-2728(81)90172-9</pub-id></element-citation></ref><ref id="bib34"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hylton</surname><given-names>CM</given-names></name><name><surname>Rawsthorne</surname><given-names>S</given-names></name><name><surname>Smith</surname><given-names>AM</given-names></name><name><surname>Jones</surname><given-names>DA</given-names></name><name><surname>Woolhouse</surname><given-names>HW</given-names></name></person-group><year>1988</year><article-title>Glycine decarboxylase is confined to the bundle-sheath cells of leaves of C<sub>3</sub>-C<sub>4</sub> intermediate 
species</article-title><source>Planta</source><volume>175</volume><fpage>452</fpage><lpage>9</lpage><pub-id pub-id-type="doi">10.1007/BF00393064</pub-id></element-citation></ref><ref id="bib35"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Johnson</surname><given-names>MTJ</given-names></name><name><surname>Carpenter</surname><given-names>EJ</given-names></name><name><surname>Tian</surname><given-names>Z</given-names></name><name><surname>Bruskiewich</surname><given-names>R</given-names></name><name><surname>Burris</surname><given-names>JN</given-names></name><name><surname>Carrigan</surname><given-names>CT</given-names></name><etal/></person-group><year>2012</year><article-title>Evaluating methods for isolating total RNA and predicting the success of sequencing phylogenetically diverse plant transcriptomes</article-title><source>PLOS ONE</source><volume>7</volume><fpage>e50226</fpage><pub-id pub-id-type="doi">10.1371/journal.pone.0050226</pub-id></element-citation></ref><ref id="bib36"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kajala</surname><given-names>K</given-names></name><name><surname>Brown</surname><given-names>NJ</given-names></name><name><surname>Williams</surname><given-names>BP</given-names></name><name><surname>Borrill</surname><given-names>P</given-names></name><name><surname>Taylor</surname><given-names>LE</given-names></name><name><surname>Hibberd</surname><given-names>JM</given-names></name></person-group><year>2012</year><article-title>Multiple <italic>Arabidopsis</italic> genes primed for recruitment into C<sub>4</sub> photosynthesis</article-title><source>Plant J</source><volume>69</volume><fpage>47</fpage><lpage>56</lpage><pub-id pub-id-type="doi">10.1111/j.1365-313X.2011.04769.x</pub-id></element-citation></ref><ref id="bib37"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Keeley</surname><given-names>JE</given-names></name></person-group><year>1999</year><article-title>Photosynthetic pathway diversity in a seasonal pool community</article-title><source>Funct Ecol</source><volume>13</volume><fpage>106</fpage><lpage>18</lpage><pub-id pub-id-type="doi">10.1046/j.1365-2435.1999.00294.x</pub-id></element-citation></ref><ref id="bib38"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kennedy</surname><given-names>RA</given-names></name><name><surname>Eastburn</surname><given-names>JL</given-names></name><name><surname>Jensen</surname><given-names>KG</given-names></name></person-group><year>1980</year><article-title>C<sub>3</sub>-C<sub>4</sub> Photosynthesis in the genus <italic>Mollugo</italic>: structure, physiology and evolution of intermediate characteristics</article-title><source>Am J Bot</source><volume>67</volume><fpage>1207</fpage><lpage>17</lpage><pub-id pub-id-type="doi">10.2307/2442363</pub-id></element-citation></ref><ref id="bib39"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Khoshravesh</surname><given-names>R</given-names></name><name><surname>Hossein</surname><given-names>A</given-names></name><name><surname>Sage</surname><given-names>TL</given-names></name><name><surname>Nordenstam</surname><given-names>B</given-names></name><name><surname>Sage</surname><given-names>RF</given-names></name></person-group><year>2012</year><article-title>Phylogeny and photosynthetic pathway distribution in <italic>Anticharis</italic> Endl. 
(Scrophulariaceae)</article-title><source>J Exp Bot</source><volume>63</volume><fpage>5645</fpage><lpage>58</lpage><pub-id pub-id-type="doi">10.1093/jxb/ers218</pub-id></element-citation></ref><ref id="bib40"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kidwell</surname><given-names>SM</given-names></name><name><surname>Holland</surname><given-names>SM</given-names></name></person-group><year>2002</year><article-title>The quality of the fossil record: implications for evolutionary analyses</article-title><source>Annu Rev Ecol Syst</source><volume>33</volume><fpage>561</fpage><lpage>88</lpage><pub-id pub-id-type="doi">10.1146/annurev.ecolsys.33.030602.152151</pub-id></element-citation></ref><ref id="bib41"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Koteyeva</surname><given-names>NK</given-names></name><name><surname>Voznesenskaya</surname><given-names>EV</given-names></name><name><surname>Roalson</surname><given-names>EH</given-names></name><name><surname>Edwards</surname><given-names>GE</given-names></name></person-group><year>2010</year><article-title>Diversity in forms of C<sub>4</sub> in the genus <italic>Cleome</italic> (Cleomaceae)</article-title><source>Ann Bot</source><volume>107</volume><fpage>269</fpage><lpage>83</lpage><pub-id pub-id-type="doi">10.1093/aob/mcq239</pub-id></element-citation></ref><ref id="bib42"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Kozmik</surname><given-names>Z</given-names></name><name><surname>Ruzickova</surname><given-names>J</given-names></name><name><surname>Jonasova</surname><given-names>K</given-names></name><name><surname>Matsumoto</surname><given-names>Y</given-names></name><name><surname>Vopalensky</surname><given-names>P</given-names></name><name><surname>Kozmikova</surname><given-names>I</given-names></name><etal/></person-group><year>2008</year><article-title>Assembly of the cnidarian camera-type eye from vertebrate-like components</article-title><source>Proc Natl Acad Sci USA</source><volume>105</volume><fpage>8989</fpage><lpage>93</lpage><pub-id pub-id-type="doi">10.1073/pnas.0800388105</pub-id></element-citation></ref><ref id="bib46"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ku</surname><given-names>MSB</given-names></name><name><surname>Edwards</surname><given-names>GE</given-names></name></person-group><year>1978</year><article-title>Photosynthetic efficiency of <italic>Panicum hians</italic> and <italic>Panicum milioides</italic> in relation to C<sub>3</sub> and C<sub>4</sub> plants</article-title><source>Plant Cell Physiol</source><volume>19</volume><fpage>665</fpage><lpage>75</lpage></element-citation></ref><ref id="bib43"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ku</surname><given-names>MSB</given-names></name><name><surname>Edwards</surname><given-names>GE</given-names></name><name><surname>Kanai</surname><given-names>R</given-names></name></person-group><year>1976</year><article-title>Distribution of enzymes related to C<sub>3</sub> and C<sub>4</sub> pathway of photosynthesis between mesophyll and bundle sheath cells of <italic>Panicum hians</italic> and <italic>Panicum milioides</italic></article-title><source>Plant Cell 
Physiol</source><volume>17</volume><fpage>615</fpage><lpage>20</lpage></element-citation></ref><ref id="bib44"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ku</surname><given-names>MSB</given-names></name><name><surname>Monson</surname><given-names>RK</given-names></name><name><surname>Littlejohn</surname><given-names>RO</given-names></name><name><surname>Nakamoto</surname><given-names>H</given-names></name><name><surname>Fisher</surname><given-names>DB</given-names></name><name><surname>Edwards</surname><given-names>GE</given-names></name></person-group><year>1983</year><article-title>Photosynthetic characteristics of C<sub>3</sub>-C<sub>4</sub> intermediate <italic>Flaveria</italic> species: I. leaf anatomy, photosynthetic responses to O<sub>2</sub> and CO<sub>2</sub>, and activities of key enzymes in the C<sub>3</sub> and C<sub>4</sub> pathways</article-title><source>Plant Physiol</source><volume>71</volume><fpage>944</fpage><lpage>8</lpage><pub-id pub-id-type="doi">10.1104/pp.71.4.944</pub-id></element-citation></ref><ref id="bib45"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ku</surname><given-names>MSB</given-names></name><name><surname>Wu</surname><given-names>J</given-names></name><name><surname>Dai</surname><given-names>Z</given-names></name><name><surname>Scott</surname><given-names>RA</given-names></name><name><surname>Chu</surname><given-names>C</given-names></name><name><surname>Edwards</surname><given-names>GE</given-names></name></person-group><year>1991</year><article-title>Photosynthetic and photorespiratory characteristics of <italic>Flaveria</italic> species</article-title><source>Plant Physiol</source><volume>96</volume><fpage>518</fpage><lpage>28</lpage><pub-id pub-id-type="doi">10.1104/pp.96.2.518</pub-id></element-citation></ref><ref id="bib47"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Langdale</surname><given-names>JA</given-names></name></person-group><year>2011</year><article-title>C<sub>4</sub> cycles: past, present, and future research on C<sub>4</sub> photosynthesis</article-title><source>Plant Cell</source><volume>23</volume><fpage>3879</fpage><lpage>92</lpage><pub-id pub-id-type="doi">10.1105/tpc.111.092098</pub-id></element-citation></ref><ref id="bib48"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Livak</surname><given-names>KJ</given-names></name><name><surname>Schmittgen</surname><given-names>TD</given-names></name></person-group><year>2001</year><article-title>Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta DeltaC(T)) method</article-title><source>Methods</source><volume>25</volume><fpage>402</fpage><lpage>8</lpage><pub-id pub-id-type="doi">10.1006/meth.2001.1262</pub-id></element-citation></ref><ref id="bib49"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lobkovsky</surname><given-names>AE</given-names></name><name><surname>Wolf</surname><given-names>YI</given-names></name><name><surname>Koonin</surname><given-names>EV</given-names></name></person-group><year>2011</year><article-title>Predictability of evolutionary trajectories in fitness landscapes</article-title><source>PLOS Comput Biol</source><volume>7</volume><fpage>e1002302</fpage><pub-id pub-id-type="doi">10.1371/journal.pcbi.1002302</pub-id></element-citation></ref><ref id="bib50"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>McKown</surname><given-names>AD</given-names></name><name><surname>Dengler</surname><given-names>NG</given-names></name></person-group><year>2007</year><article-title>Key innovations in the evolution of Kranz anatomy and C<sub>4</sub> vein pattern in <italic>Flaveria</italic> 
(Asteraceae)</article-title><source>Am J Bot</source><volume>94</volume><fpage>382</fpage><lpage>99</lpage><pub-id pub-id-type="doi">10.3732/ajb.94.3.382</pub-id></element-citation></ref><ref id="bib51"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>McKown</surname><given-names>AD</given-names></name><name><surname>Moncalvo</surname><given-names>J-M</given-names></name><name><surname>Dengler</surname><given-names>NG</given-names></name></person-group><year>2005</year><article-title>Phylogeny of <italic>Flaveria</italic> (Asteraceae) and inference of C<sub>4</sub> photosynthesis evolution</article-title><source>Am J Bot</source><volume>92</volume><fpage>1911</fpage><lpage>28</lpage><pub-id pub-id-type="doi">10.3732/ajb.92.11.1911</pub-id></element-citation></ref><ref id="bib52"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Mooers</surname><given-names>AO</given-names></name><name><surname>Heard</surname><given-names>SB</given-names></name></person-group><year>2013</year><article-title>Inferring evolutionary process from phylogenetic tree shape</article-title><source>Q Rev Biol</source><volume>72</volume><fpage>31</fpage><lpage>54</lpage><pub-id pub-id-type="doi">10.1086/419657</pub-id></element-citation></ref><ref id="bib53"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Moore</surname><given-names>BD</given-names></name><name><surname>Franceschi</surname><given-names>VR</given-names></name><name><surname>Cheng</surname><given-names>S-H</given-names></name><name><surname>Wu</surname><given-names>J</given-names></name><name><surname>Ku</surname><given-names>MSB</given-names></name></person-group><year>1987</year><article-title>Photosynthetic characteristics of the C<sub>3</sub>-C<sub>4</sub> intermediate <italic>Parthenium hysterophorus</italic></article-title><source>Plant 
Physiol</source><volume>85</volume><fpage>978</fpage><lpage>83</lpage><pub-id pub-id-type="doi">10.1104/pp.85.4.978</pub-id></element-citation></ref><ref id="bib54"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Muhaidat</surname><given-names>R</given-names></name><name><surname>Sage</surname><given-names>TL</given-names></name><name><surname>Frohlich</surname><given-names>MW</given-names></name><name><surname>Dengler</surname><given-names>NG</given-names></name><name><surname>Sage</surname><given-names>RF</given-names></name></person-group><year>2011</year><article-title>Characterization of C<sub>3</sub>-C<sub>4</sub> intermediate species in the genus <italic>Heliotropium</italic> L. (Boraginaceae): anatomy, ultrastructure and enzyme activity</article-title><source>Plant Cell Environ</source><volume>34</volume><fpage>1723</fpage><lpage>36</lpage><pub-id pub-id-type="doi">10.1111/j.1365-3040.2011.02367.x</pub-id></element-citation></ref><ref id="bib55"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Osborne</surname><given-names>CP</given-names></name><name><surname>Sack</surname><given-names>L</given-names></name></person-group><year>2012</year><article-title>Evolution of C<sub>4</sub> plants: a new hypothesis for an interaction of CO<sub>2</sub> and water relations mediated by plant hydraulics</article-title><source>Philos Trans R Soc Lond B Biol Sci</source><volume>367</volume><fpage>583</fpage><lpage>600</lpage><pub-id pub-id-type="doi">10.1098/rstb.2011.0261</pub-id></element-citation></ref><ref id="bib56"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>P’yankov</surname><given-names>VI</given-names></name><name><surname>Voznesenskaya</surname><given-names>EV</given-names></name><name><surname>Kondratschuk</surname><given-names>AV</given-names></name><name><surname>Black</surname><given-names>CC</given-names></name></person-group><year>1997</year><article-title>A comparative anatomical and biochemical analysis in <italic>Salsola</italic> (Chenopodiaceae) species with and without a Kranz type leaf anatomy: a possible reversion of C<sub>4</sub> to C<sub>3</sub> photosynthesis</article-title><source>Am J Bot</source><volume>84</volume><fpage>597</fpage></element-citation></ref><ref id="bib57"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rabiner</surname><given-names>L</given-names></name></person-group><year>1989</year><article-title>A tutorial on hidden Markov models and selected applications in speech recognition</article-title><source>Proc IEEE</source><volume>77</volume><fpage>257</fpage><lpage>86</lpage><pub-id pub-id-type="doi">10.1109/5.18626</pub-id></element-citation></ref><ref id="bib58"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rajendrudu</surname><given-names>G</given-names></name><name><surname>Prasad</surname><given-names>JSR</given-names></name><name><surname>Das</surname><given-names>VSR</given-names></name></person-group><year>1986</year><article-title>C<sub>3</sub>-C<sub>4</sub> intermediate species in <italic>Alternanthera</italic> (Amaranthaceae): leaf anatomy, CO<sub>2</sub> compensation point, net CO<sub>2</sub> exchange and activities of photosynthetic enzymes</article-title><source>Plant Physiol</source><volume>80</volume><fpage>409</fpage><lpage>14</lpage><pub-id pub-id-type="doi">10.1104/pp.80.2.409</pub-id></element-citation></ref><ref id="bib60"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Rathnam</surname><given-names>CKM</given-names></name><name><surname>Chollet</surname><given-names>R</given-names></name></person-group><year>1978</year><article-title>CO<sub>2</sub> donation by malate and aspartate reduces photorespiration in <italic>Panicum milioides</italic>, a C<sub>3</sub>-C<sub>4</sub> intermediate species</article-title><source>Biochem Biophys Res Commun</source><volume>85</volume><fpage>801</fpage><lpage>8</lpage><pub-id pub-id-type="doi">10.1016/0006-291X(78)91233-0</pub-id></element-citation></ref><ref id="bib59"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rathnam</surname><given-names>CKM</given-names></name><name><surname>Chollet</surname><given-names>R</given-names></name></person-group><year>1979</year><article-title>Photosynthetic carbon metabolism in <italic>Panicum milioides</italic>, a C<sub>3</sub>-C<sub>4</sub> intermediate species: evidence for a limited C<sub>4</sub> dicarboxylic acid pathway of photosynthesis</article-title><source>Biochim Biophys Acta</source><volume>548</volume><fpage>500</fpage><lpage>19</lpage><pub-id pub-id-type="doi">10.1016/0005-2728(79)90061-6</pub-id></element-citation></ref><ref id="bib61"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rawsthorne</surname><given-names>S</given-names></name><name><surname>Hylton</surname><given-names>CM</given-names></name><name><surname>Smith</surname><given-names>AM</given-names></name><name><surname>Woolhouse</surname><given-names>HW</given-names></name></person-group><year>1988</year><article-title>Photorespiratory metabolism and immunogold localization of photorespiratory enzymes in leaves of C<sub>3</sub> and C<sub>3</sub>-C<sub>4</sub> intermediate species of <italic>Moricandia</italic></article-title><source>Planta</source><volume>173</volume><fpage>298</fpage><lpage>308</lpage><pub-id 
pub-id-type="doi">10.1007/BF00401016</pub-id></element-citation></ref><ref id="bib62"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rawsthorne</surname><given-names>S</given-names></name><name><surname>Morgan</surname><given-names>CL</given-names></name><name><surname>O’Neill</surname><given-names>CM</given-names></name><name><surname>Hylton</surname><given-names>CM</given-names></name><name><surname>Jones</surname><given-names>DA</given-names></name><name><surname>Frean</surname><given-names>ML</given-names></name></person-group><year>1998</year><article-title>Cellular expression pattern of the glycine decarboxylase P protein in leaves of an intergeneric hybrid between the C<sub>3</sub>-C<sub>4</sub> intermediate species <italic>Moricandia nitens</italic> and the C<sub>3</sub> species <italic>Brassica napus</italic></article-title><source>Theor Appl Genet</source><volume>96</volume><fpage>922</fpage><lpage>7</lpage><pub-id pub-id-type="doi">10.1007/s001220050821</pub-id></element-citation></ref><ref id="bib63"><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Revuz</surname><given-names>D</given-names></name><name><surname>Yor</surname><given-names>M</given-names></name></person-group><year>1999</year><source>Continuous Martingales and Brownian Motion</source><publisher-name>Springer</publisher-name></element-citation></ref><ref id="bib64"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rosche</surname><given-names>E</given-names></name><name><surname>Streubel</surname><given-names>M</given-names></name><name><surname>Westhoff</surname><given-names>P</given-names></name></person-group><year>1994</year><article-title>Primary structure of the photosynthetic pyruvate orthophosphate dikinase of the C<sub>3</sub> plant <italic>Flaveria pringlei</italic> and expression analysis of pyruvate orthophosphate dikinase 
sequences in C<sub>3</sub>, C<sub>3</sub>-C<sub>4</sub> and C<sub>4</sub> <italic>Flaveria</italic> species</article-title><source>Plant Mol Biol</source><volume>26</volume><fpage>763</fpage><lpage>9</lpage><pub-id pub-id-type="doi">10.1007/BF00013761</pub-id></element-citation></ref><ref id="bib65"><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Roweis</surname><given-names>S</given-names></name></person-group><year>1998</year><article-title>EM algorithms for PCA and SPCA</article-title><source>Advances in Neural Information Processing Systems</source><publisher-loc>Cambridge, MA</publisher-loc><publisher-name>MIT Press</publisher-name><fpage>626</fpage><lpage>32</lpage></element-citation></ref><ref id="bib66"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sage</surname><given-names>RF</given-names></name><name><surname>Christin</surname><given-names>P-A</given-names></name><name><surname>Edwards</surname><given-names>EJ</given-names></name></person-group><year>2011a</year><article-title>The C<sub>4</sub> plant lineages of planet Earth</article-title><source>J Exp Bot</source><volume>62</volume><fpage>3155</fpage><lpage>69</lpage><pub-id pub-id-type="doi">10.1093/jxb/err048</pub-id></element-citation></ref><ref id="bib67"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sage</surname><given-names>RF</given-names></name><name><surname>Sage</surname><given-names>TL</given-names></name><name><surname>Kocacinar</surname><given-names>F</given-names></name></person-group><year>2012</year><article-title>Photorespiration and the evolution of C<sub>4</sub> photosynthesis</article-title><source>Annu Rev Plant Biol</source><volume>63</volume><fpage>19</fpage><lpage>47</lpage><pub-id pub-id-type="doi">10.1146/annurev-arplant-042811-105511</pub-id></element-citation></ref><ref id="bib68"><element-citation 
publication-type="journal"><person-group person-group-type="author"><name><surname>Sage</surname><given-names>TL</given-names></name><name><surname>Sage</surname><given-names>RF</given-names></name><name><surname>Vogan</surname><given-names>PJ</given-names></name><name><surname>Rahman</surname><given-names>B</given-names></name><name><surname>Johnson</surname><given-names>DC</given-names></name><name><surname>Oakley</surname><given-names>JC</given-names></name><etal/></person-group><year>2011b</year><article-title>The occurrence of C<sub>2</sub> photosynthesis in <italic>Euphorbia</italic> subgenus <italic>Chamaesyce</italic> (Euphorbiaceae)</article-title><source>J Exp Bot</source><volume>62</volume><fpage>3183</fpage><lpage>95</lpage><pub-id pub-id-type="doi">10.1093/jxb/err059</pub-id></element-citation></ref><ref id="bib69"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Santos</surname><given-names>JC</given-names></name><name><surname>Coloma</surname><given-names>LA</given-names></name><name><surname>Cannatella</surname><given-names>DC</given-names></name></person-group><year>2003</year><article-title>Multiple, recurring origins of aposematism and diet specialization in poison frogs</article-title><source>Proc Natl Acad Sci USA</source><volume>100</volume><fpage>12792</fpage><lpage>7</lpage><pub-id pub-id-type="doi">10.1073/pnas.2133521100</pub-id></element-citation></ref><ref id="bib70"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sayre</surname><given-names>RT</given-names></name><name><surname>Kennedy</surname><given-names>RA</given-names></name><name><surname>Pringnitz</surname><given-names>DJ</given-names></name></person-group><year>1979</year><article-title>Photosynthetic enzyme activities and localization in <italic>Mollugo verticillata</italic> populations differing in the levels of C<sub>3</sub> and C<sub>4</sub> cycle 
operation</article-title><source>Plant Physiol</source><volume>64</volume><fpage>293</fpage><lpage>9</lpage><pub-id pub-id-type="doi">10.1104/pp.64.2.293</pub-id></element-citation></ref><ref id="bib71"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Slack</surname><given-names>CR</given-names></name><name><surname>Hatch</surname><given-names>MD</given-names></name></person-group><year>1967</year><article-title>Comparative studies on the activity of carboxylases and other enzymes in relation to the new pathway of photosynthetic carbon dioxide fixation in tropical grasses</article-title><source>Biochem J</source><volume>103</volume><fpage>660</fpage><lpage>5</lpage><pub-id pub-id-type="doi">10.1042/bj1030660</pub-id></element-citation></ref><ref id="bib72"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Smith</surname><given-names>RD</given-names></name><name><surname>Brown</surname><given-names>B</given-names></name><name><surname>Ikonomi</surname><given-names>P</given-names></name><name><surname>Schechter</surname><given-names>AN</given-names></name></person-group><year>2003</year><article-title>Exogenous reference RNA for normalization of real-time quantitative PCR</article-title><source>BioTechniques</source><volume>34</volume><fpage>88</fpage><lpage>91</lpage><pub-id pub-id-type="doi">11200322</pub-id></element-citation></ref><ref id="bib73"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Steiner</surname><given-names>CC</given-names></name><name><surname>Römpler</surname><given-names>H</given-names></name><name><surname>Boettger</surname><given-names>LM</given-names></name><name><surname>Schöneberg</surname><given-names>T</given-names></name><name><surname>Hoekstra</surname><given-names>HE</given-names></name></person-group><year>2009</year><article-title>The genetic basis of phenotypic convergence in 
beach mice: similar pigment patterns but different genes</article-title><source>Mol Biol Evol</source><volume>26</volume><fpage>35</fpage><lpage>45</lpage><pub-id pub-id-type="doi">10.1093/molbev/msn218</pub-id></element-citation></ref><ref id="bib74"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ueno</surname><given-names>O</given-names></name><name><surname>Bang</surname><given-names>SW</given-names></name><name><surname>Wada</surname><given-names>Y</given-names></name><name><surname>Kondo</surname><given-names>A</given-names></name><name><surname>Ishihara</surname><given-names>K</given-names></name><name><surname>Kaneko</surname><given-names>Y</given-names></name><etal/></person-group><year>2003</year><article-title>Structural and biochemical dissection of photorespiration in hybrids differing in genome constitution between <italic>Diplotaxis tenuifolia</italic> (C<sub>3</sub>-C<sub>4</sub>) and Radish (C<sub>3</sub>)</article-title><source>Plant Physiol</source><volume>132</volume><fpage>1550</fpage><lpage>9</lpage><pub-id pub-id-type="doi">10.1104/pp.103.021329</pub-id></element-citation></ref><ref id="bib75"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ueno</surname><given-names>O</given-names></name><name><surname>Wada</surname><given-names>Y</given-names></name><name><surname>Wakai</surname><given-names>M</given-names></name><name><surname>Bang</surname><given-names>SW</given-names></name></person-group><year>2006</year><article-title>Evidence from photosynthetic characteristics for the hybrid origin of <italic>Diplotaxis muralis</italic> from a C<sub>3</sub>-C<sub>4</sub> intermediate and a C<sub>3</sub> species</article-title><source>Plant Biol</source><volume>8</volume><fpage>253</fpage><lpage>9</lpage><pub-id pub-id-type="doi">10.1055/s-2005-873050</pub-id></element-citation></ref><ref id="bib76"><element-citation 
publication-type="journal"><person-group person-group-type="author"><name><surname>Vicentini</surname><given-names>A</given-names></name><name><surname>Barber</surname><given-names>JC</given-names></name><name><surname>Aliscioni</surname><given-names>SS</given-names></name><name><surname>Giussani</surname><given-names>LM</given-names></name><name><surname>Kellogg</surname><given-names>EA</given-names></name></person-group><year>2008</year><article-title>The age of the grasses and clusters of origins of C<sub>4</sub> photosynthesis</article-title><source>Glob Change Biol</source><volume>14</volume><fpage>2963</fpage><lpage>77</lpage><pub-id pub-id-type="doi">10.1111/j.1365-2486.2008.01688.x</pub-id></element-citation></ref><ref id="bib77"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Vogan</surname><given-names>PJ</given-names></name><name><surname>Frohlich</surname><given-names>MW</given-names></name><name><surname>Sage</surname><given-names>RF</given-names></name></person-group><year>2007</year><article-title>The functional significance of C<sub>3</sub>-C<sub>4</sub> intermediate traits in <italic>Heliotropium</italic> L. 
(Boraginaceae): gas exchange perspectives</article-title><source>Plant Cell Environ</source><volume>30</volume><fpage>1337</fpage><lpage>45</lpage><pub-id pub-id-type="doi">10.1111/j.1365-3040.2007.01706.x</pub-id></element-citation></ref><ref id="bib78"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Voznesenskaya</surname><given-names>EV</given-names></name><name><surname>Artyusheva</surname><given-names>EG</given-names></name><name><surname>Franceschi</surname><given-names>VR</given-names></name><name><surname>Pyankov</surname><given-names>VI</given-names></name><name><surname>Kiirats</surname><given-names>O</given-names></name><name><surname>Ku</surname><given-names>M</given-names></name><etal/></person-group><year>2001</year><article-title><italic>Salsola arbusculiformis</italic>, a C<sub>3</sub>-C<sub>4</sub> Intermediate in Salsoleae (Chenopodiaceae)</article-title><source>Ann Bot</source><volume>88</volume><fpage>337</fpage><lpage>48</lpage><pub-id pub-id-type="doi">10.1006/anbo.2001.1457</pub-id></element-citation></ref><ref id="bib79"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Voznesenskaya</surname><given-names>EV</given-names></name><name><surname>Koteyeva</surname><given-names>NK</given-names></name><name><surname>Chuong</surname><given-names>SDX</given-names></name><name><surname>Ivanova</surname><given-names>AN</given-names></name><name><surname>Barroca</surname><given-names>J</given-names></name><name><surname>Craven</surname><given-names>LA</given-names></name><etal/></person-group><year>2007</year><article-title>Physiological, anatomical and biochemical characterisation of photosynthetic types in genus <italic>Cleome</italic> (Cleomaceae)</article-title><source>Funct Plant Biol</source><volume>34</volume><fpage>247</fpage><pub-id pub-id-type="doi">10.1071/FP06287</pub-id></element-citation></ref><ref id="bib80"><element-citation 
publication-type="journal"><person-group person-group-type="author"><name><surname>Voznesenskaya</surname><given-names>EV</given-names></name><name><surname>Koteyeva</surname><given-names>NK</given-names></name><name><surname>Edwards</surname><given-names>GE</given-names></name><name><surname>Ocampo</surname><given-names>G</given-names></name></person-group><year>2010</year><article-title>Revealing diversity in structural and biochemical forms of C<sub>4</sub> photosynthesis and a C<sub>3</sub>-C<sub>4</sub> intermediate in genus <italic>Portulaca</italic> L. (Portulacaceae)</article-title><source>J Exp Bot</source><volume>61</volume><fpage>3647</fpage><lpage>62</lpage><pub-id pub-id-type="doi">10.1093/jxb/erq178</pub-id></element-citation></ref><ref id="bib81"><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Wasserman</surname><given-names>L</given-names></name></person-group><year>2004</year><source>All of statistics: a concise course in statistical inference</source><publisher-name>Springer</publisher-name></element-citation></ref><ref id="bib82"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Weinreich</surname><given-names>DM</given-names></name><name><surname>Watson</surname><given-names>RA</given-names></name><name><surname>Chao</surname><given-names>L</given-names></name></person-group><year>2005</year><article-title>Sign epistasis and genetic constraint on evolutionary trajectories</article-title><source>Evolution</source><volume>59</volume><fpage>1165</fpage><lpage>74</lpage><pub-id pub-id-type="doi">10.1111/j.0014-3820.2005.tb01768.x</pub-id></element-citation></ref><ref id="bib83"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Whibley</surname><given-names>AC</given-names></name><name><surname>Langlade</surname><given-names>NB</given-names></name><name><surname>Andalo</surname><given-names>C</given-names></name><name><surname>Hanna</surname><given-names>AI</given-names></name><name><surname>Bangham</surname><given-names>A</given-names></name><name><surname>Thébaud</surname><given-names>C</given-names></name><etal/></person-group><year>2006</year><article-title>Evolutionary paths underlying flower color variation in <italic>Antirrhinum</italic></article-title><source>Science</source><volume>313</volume><fpage>963</fpage><lpage>6</lpage><pub-id pub-id-type="doi">10.1126/science.1129161</pub-id></element-citation></ref><ref id="bib84"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Williams</surname><given-names>BP</given-names></name><name><surname>Aubry</surname><given-names>S</given-names></name><name><surname>Hibberd</surname><given-names>JM</given-names></name></person-group><year>2012</year><article-title>Molecular evolution of genes recruited into C<sub>4</sub> photosynthesis</article-title><source>Trends Plant Sci</source><volume>17</volume><fpage>213</fpage><lpage>20</lpage><pub-id pub-id-type="doi">10.1016/j.tplants.2012.01.008</pub-id></element-citation></ref><ref id="bib85"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wilson</surname><given-names>JS</given-names></name><name><surname>Williams</surname><given-names>KA</given-names></name><name><surname>Forister</surname><given-names>ML</given-names></name><name><surname>von Dohlen</surname><given-names>CD</given-names></name><name><surname>Pitts</surname><given-names>JP</given-names></name></person-group><year>2012</year><article-title>Repeated evolution in overlapping mimicry rings among North American velvet ants</article-title><source>Nat 
Commun</source><volume>3</volume><fpage>1272</fpage><pub-id pub-id-type="doi">10.1038/ncomms2275</pub-id></element-citation></ref><ref id="bib86"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wiludda</surname><given-names>C</given-names></name><name><surname>Schulze</surname><given-names>S</given-names></name><name><surname>Gowik</surname><given-names>U</given-names></name><name><surname>Engelmann</surname><given-names>S</given-names></name><name><surname>Koczor</surname><given-names>M</given-names></name><name><surname>Streubel</surname><given-names>M</given-names></name><etal/></person-group><year>2012</year><article-title>Regulation of the photorespiratory GLDPA gene in C<sub>4</sub> <italic>Flaveria</italic>: an intricate interplay of transcriptional and post-transcriptional processes</article-title><source>Plant Cell</source><volume>24</volume><fpage>137</fpage><lpage>51</lpage><pub-id pub-id-type="doi">10.1105/tpc.111.093872</pub-id></element-citation></ref><ref id="bib87"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wittkopp</surname><given-names>PJ</given-names></name><name><surname>Williams</surname><given-names>BL</given-names></name><name><surname>Selegue</surname><given-names>JE</given-names></name><name><surname>Carroll</surname><given-names>SB</given-names></name></person-group><year>2003</year><article-title>Drosophila pigmentation evolution: divergent genotypes underlying convergent phenotypes</article-title><source>Proc Natl Acad Sci USA</source><volume>100</volume><fpage>1808</fpage><lpage>13</lpage><pub-id pub-id-type="doi">10.1073/pnas.0336368100</pub-id></element-citation></ref><ref id="bib88"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wright</surname><given-names>S</given-names></name></person-group><year>1932</year><article-title>The roles of mutation, inbreeding, 
crossbreeding and selection in evolution</article-title><source>Proc Sixth Int Congr Genet</source><volume>1</volume><fpage>356</fpage><lpage>66</lpage></element-citation></ref><ref id="bib89"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Zhu</surname><given-names>X-G</given-names></name><name><surname>Long</surname><given-names>SP</given-names></name><name><surname>Ort</surname><given-names>DR</given-names></name></person-group><year>2008</year><article-title>What is the maximum efficiency with which photosynthesis can convert solar energy into biomass?</article-title><source>Curr Opin Biotechnol</source><volume>19</volume><fpage>153</fpage><lpage>9</lpage><pub-id pub-id-type="doi">10.1016/j.copbio.2008.02.004</pub-id></element-citation></ref></ref-list></back><sub-article article-type="article-commentary" id="SA1"><front-stub><article-id pub-id-type="doi">10.7554/eLife.00961.018</article-id><title-group><article-title>Decision letter</article-title></title-group><contrib-group content-type="section"><contrib contrib-type="editor"><name><surname>Bergmann</surname><given-names>Dominique</given-names></name><role>Reviewing editor</role><aff><institution>Stanford University</institution>, <country>United States</country></aff></contrib></contrib-group></front-stub><body><boxed-text><p>eLife posts the editorial decision letter and author response on a selection of the published articles (subject to the approval of the authors). An edited version of the letter sent to the authors after peer review is shown, indicating the substantive concerns or comments; minor concerns are not usually shown. Reviewers have the opportunity to discuss the decision before the letter is sent (see <ext-link ext-link-type="uri" xlink:href="http://elife.elifesciences.org/review-process">review process</ext-link>). 
Similarly, the author response typically shows only responses to the major concerns raised by the reviewers.</p></boxed-text><p>Thank you for sending your work entitled “Phenotypic landscape inference reveals multiple evolutionary paths to C<sub>4</sub> photosynthesis” for consideration at eLife. Your article has been favorably evaluated by a Senior editor, a Reviewing editor, and 2 reviewers, one of whom, Patrick Warren, has agreed to reveal his identity.</p><p>The Reviewing editor and the two reviewers discussed their comments before we reached this decision, and the Reviewing editor has assembled the following comments to help you prepare a revised submission.</p><p>The manuscript by Williams et al. describes a systematic analysis of traits associated with intermediate C<sub>3</sub>-C<sub>4</sub> forms, and makes inferences on the likely evolutionary pathways from C<sub>3</sub> to C<sub>4</sub> photosynthesis. It is largely an in silico study, inferring the evolutionary dynamics within 16 traits that distinguish C<sub>3</sub> and C<sub>4</sub> plants. The 18 lineages fall into 4 classes within which the order of appearance of the traits is convergent. Furthermore, the model predicts that there is a strong preference for the order of appearance of traits, a feature the authors have validated by measuring some previously undetermined trait values (a proxy being the abundance of transcripts by qPCR).</p><p>Although both reviewers noted that the work was predicated on the critical assumption that present day intermediate forms are representative of the evolutionary pathways, they found the work important from several angles. It suggests that the initial phenotypic directions were unrelated to photosynthetic drivers, and only later co-opted for a common end point. This is a valuable insight, and it seems to exemplify what could be a generic mechanism to explain convergent evolution of complex traits. 
The work shows just how far one can get with the systematic analysis of fragmented phenotype data, with a cross-disciplinary approach, even so far as to make verifiable predictions for missing trait data.</p><p>While agreeing that overall the paper is ambitious, has an evolutionary message, is technically original and shows the predictive value of the approach, the reviewers raised several substantive concerns that should be addressed in the revision:</p><p>1) A strength of this work is that the conclusions emerge from an inference framework that is much more convincing than when traditional (qualitative and argumentative) approaches are used. Specifically, the methodology unveiled here is both quantitative and “objective”. However, the authors need to be more explicit about the level of “objectiveness” of their work. Indeed, they present a framework where choices had to be made and the reader cannot know whether attempts with other choices were less conclusive. To address this issue, the authors have to demonstrate that their conclusions are robust to choices made within their modeling framework. Two specific tests should be performed:</p><p>1A) First, use a structural change to your framework: instead of the 16 traits you have selected, remove say 2 of these traits, and if possible include 2 others.</p><p>1B) Second, you treat quantitative traits using EM and represent these by a binary value (presence vs absence). Sometimes the assignment will not be clear-cut; could you then use the other assignment (which is nearly as justified)? An even simpler approach would have been to simply put a threshold (common for all) bypassing any EM (you do this for some of your traits). Did you first try that but not succeed? It is okay to be honest about the potential weaknesses of the conclusions given that the data is limited.</p><p>2) To be clear to a broad audience, please be explicit about how you define a plant ‘lineage’ as used in your analysis. 
Your text describes that you have analysed data from 18 lineages. Reference is made to ‘taxonomic lineage’ but in Table 1 there are 12 families and 22 genera, neither of which tallies with the number 18. Also, please double check the species numbers: Table 1 appears to list 73 species. In the text you refer to 18 C<sub>3</sub>, 17 C<sub>4</sub>, and 37 C<sub>3</sub>-C<sub>4</sub> intermediates, which totals 72 species. So, is there an extra species listed in Table 1? Please check to make sure you (or the reviewers) didn't miscount. A graphical representation of the phylogeny for the species used in the analysis would be useful.</p></body></sub-article><sub-article article-type="reply" id="SA2"><front-stub><article-id pub-id-type="doi">10.7554/eLife.00961.019</article-id><title-group><article-title>Author response</article-title></title-group></front-stub><body><p><italic>1) A strength of this work is that the conclusions emerge from an inference framework that is much more convincing than when traditional (qualitative and argumentative) approaches are used. Specifically, the methodology unveiled here is both quantitative and “objective”. However, the authors need to be more explicit about the level of “objectiveness” of their work. Indeed, they present a framework where choices had to be made and the reader cannot know whether attempts with other choices were less conclusive. To address this issue, the authors have to demonstrate that their conclusions are robust to choices made within their modeling framework. Two specific tests should be performed</italic>:</p><p><italic>1A) First, use a structural change to your framework: instead of the 16 traits you have selected, remove say 2 of these traits, and if possible include 2 others</italic>.</p><p>We performed three additional analyses. In the first two, we removed two randomly selected independent pairs of traits (<xref ref-type="fig" rid="fig3s2">Figure 3—figure supplement 2A and B</xref>). 
Neither the predicted timing of the remaining 14 traits nor our main conclusions about C<sub>4</sub> evolution were affected. Thirdly, we repeated the analysis including data for two additional traits associated with C<sub>4</sub> evolution (<xref ref-type="fig" rid="fig3s3">Figure 3—figure supplement 3C</xref>). Including these traits in the analysis did not alter the predicted order of trait acquisition compared with the initial analysis, or the conclusions that we draw from these predictions.</p><p><italic>1B) Second, you treat quantitative traits using EM and represent these by a binary value (presence vs absence). Sometimes the assignment will not be clear-cut; could you then use the other assignment (which is nearly as justified)</italic>?</p><p>To test this, we repeated the analysis with presence and absence scores assigned by hierarchical clustering as opposed to EM. Hierarchical clustering generated alternative binary values to EM for all the traits where assignment was not clear-cut. Despite this, extremely similar predictions were generated and the conclusions we draw about C<sub>4</sub> evolution were the same. We include the results of this analysis in a new supplement (<xref ref-type="fig" rid="fig3s1">Figure 3—figure supplement 1</xref>). We present both the results obtained from hierarchically clustered data on their own and a comparison with posterior probabilities obtained using data clustered by EM. These results suggest that the data points whose assignment is not clear-cut do not strongly affect our conclusions.</p><p><italic>An even simpler approach would have been to simply put a threshold (common for all) bypassing any EM (you do this for some of your traits). Did you first try that but not succeed? It is okay to be honest about the potential weaknesses of the conclusions given that the data is limited</italic>.</p><p>We note that both the EM algorithm and hierarchical clustering assign thresholds based on the distribution of data available. 
We did not try assigning thresholds of our own definition to these quantitative traits at any stage, but rather preferred to use clustering by statistical methods such as EM so as to minimize bias in assigning presence/absence scores. We only assigned our own thresholds to data for which clustering by EM was not possible (i.e., when traits were measured qualitatively, or too few data points were available).</p><p>We have integrated these new supplements associated with points 1A and 1B above into the main article.</p><p><italic>2) To be clear to a broad audience, please be explicit about how you define a plant ‘lineage’ as used in your analysis. Your text describes that you have analysed data from 18 lineages. Reference is made to ‘taxonomic lineage’ but in Table 1 there are 12 families and 22 genera, neither of which tallies with the number 18. Also, please double check the species numbers: Table 1 appears to list 73 species. In the text you refer to 18 C</italic><sub><italic>3</italic></sub><italic>, 17 C</italic><sub><italic>4</italic></sub> <italic>and 37 C</italic><sub><italic>3</italic></sub><italic>-C</italic><sub><italic>4</italic></sub> <italic>intermediates, which totals 72 species. So, is there an extra species listed in Table 1? Please check to make sure you (or the reviewers) didn't miscount. A graphical representation of the phylogeny for the species used in the analysis would be useful</italic>.</p><p>We define the number of C<sub>3</sub>-C<sub>4</sub> lineages by evolutionarily independent origins of C<sub>3</sub>-C<sub>4</sub> intermediacy. Although our analysis included 15 genera possessing C<sub>3</sub>-C<sub>4</sub> species, within two of these genera there are multiple independent lineages of intermediates: in <italic>Flaveria</italic> and <italic>Mollugo</italic> there are three and two distinct clades of C<sub>3</sub>-C<sub>4</sub> species respectively. This totals 18 independent C<sub>3</sub>-C<sub>4</sub> lineages. 
We have annotated species in Table 1 as C<sub>3</sub>, C<sub>4</sub>, or C<sub>3</sub>-C<sub>4</sub> to provide further clarity. We included the following at the beginning of the Results section to better clarify the definition of intermediates:</p><p>“To parameterise the phenotypic landscape underlying photosynthetic phenotypes, data was consolidated from 43 studies encompassing 18 C<sub>3</sub>, 18 C<sub>4</sub>, and 37 C<sub>3</sub>-C<sub>4</sub> intermediate species from 22 genera (Table 1). These C<sub>3</sub>-C<sub>4</sub> species are from 18 independent lineages likely representing 18 distinct evolutionary origins of C<sub>3</sub>-C<sub>4</sub> intermediacy (<xref ref-type="bibr" rid="bib66">Sage et al. 2011a</xref>) (<xref ref-type="fig" rid="fig1s2">Figure 1—figure supplement 2</xref>).”</p><p>Regarding the number of species, the 73 species listed in Table 1 is complete. The numbers presented in the text were a count of these. A recount confirms that the analysis included 18 C<sub>3</sub>, 18 C<sub>4</sub>, and 37 C<sub>3</sub>-C<sub>4</sub> species. We are grateful for this anomaly being spotted.</p><p>We have included an angiosperm phylogeny with the distribution of independent C<sub>3</sub>-C<sub>4</sub> and C<sub>4</sub> lineages annotated onto it (<xref ref-type="fig" rid="fig1s2">Figure 1—figure supplement 2</xref>).</p></body></sub-article></article> |