<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.1d1 20130915//EN" "JATS-archivearticle1.dtd"><article article-type="research-article" dtd-version="1.1d1" xmlns:xlink="http://www.w3.org/1999/xlink"><front><journal-meta><journal-id journal-id-type="nlm-ta">elife</journal-id><journal-id journal-id-type="hwp">eLife</journal-id><journal-id journal-id-type="publisher-id">eLife</journal-id><journal-title-group><journal-title>eLife</journal-title></journal-title-group><issn publication-format="electronic">2050-084X</issn><publisher><publisher-name>eLife Sciences Publications, Ltd</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="publisher-id">03553</article-id><article-id pub-id-type="doi">10.7554/eLife.03553</article-id><article-categories><subj-group subj-group-type="display-channel"><subject>Research article</subject></subj-group><subj-group subj-group-type="heading"><subject>Genes and chromosomes</subject></subj-group><subj-group subj-group-type="heading"><subject>Immunology</subject></subj-group></article-categories><title-group><article-title>Active RNAP pre-initiation sites are highly mutated by cytidine deaminases in yeast, with AID targeting small RNA genes</article-title></title-group><contrib-group><contrib contrib-type="author" corresp="yes" id="author-3693"><name><surname>Taylor</surname><given-names>Benjamin JM</given-names></name><contrib-id contrib-id-type="orcid">http://orcid.org/0000-0001-6101-3786</contrib-id><xref ref-type="aff" rid="aff1"/><xref ref-type="corresp" rid="cor1">*</xref><xref ref-type="other" rid="par-1"/><xref ref-type="other" rid="par-2"/><xref ref-type="fn" rid="con1"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" id="author-3744"><name><surname>Wu</surname><given-names>Yee Ling</given-names></name><xref ref-type="aff" rid="aff1"/><xref ref-type="other" rid="par-1"/><xref ref-type="fn" 
rid="con2"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" corresp="yes" id="author-3745"><name><surname>Rada</surname><given-names>Cristina</given-names></name><contrib-id contrib-id-type="orcid">http://orcid.org/0000-0003-4898-5550</contrib-id><xref ref-type="aff" rid="aff1"/><xref ref-type="corresp" rid="cor2">*</xref><xref ref-type="other" rid="par-1"/><xref ref-type="fn" rid="con3"/><xref ref-type="fn" rid="conf1"/></contrib><aff id="aff1"><institution content-type="dept">Protein and Nucleic Acid Chemistry Division</institution>, <institution>Medical Research Council Laboratory of Molecular Biology</institution>, <addr-line><named-content content-type="city">Cambridge</named-content></addr-line>, <country>United Kingdom</country></aff></contrib-group><contrib-group content-type="section"><contrib contrib-type="editor"><name><surname>Proudfoot</surname><given-names>Nick J</given-names></name><role>Reviewing editor</role><aff><institution>University of Oxford</institution>, <country>United Kingdom</country></aff></contrib></contrib-group><author-notes><corresp id="cor1"><label>*</label>For correspondence: <email>btaylor@mrc-lmb.cam.ac.uk</email> (BJMT);</corresp><corresp id="cor2"><label>*</label>For correspondence: <email>car@mrc-lmb.cam.ac.uk</email> (CR)</corresp></author-notes><pub-date date-type="pub" publication-format="electronic"><day>19</day><month>09</month><year>2014</year></pub-date><pub-date pub-type="collection"><year>2014</year></pub-date><volume>3</volume><elocation-id>e03553</elocation-id><history><date date-type="received"><day>02</day><month>06</month><year>2014</year></date><date date-type="accepted"><day>17</day><month>09</month><year>2014</year></date></history><permissions><copyright-statement>© 2014, Taylor et al</copyright-statement><copyright-year>2014</copyright-year><copyright-holder>Taylor et al</copyright-holder><license xlink:href="http://creativecommons.org/licenses/by/4.0/"><license-p>This article is 
distributed under the terms of the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution License</ext-link>, which permits unrestricted use and redistribution provided that the original author and source are credited.</license-p></license></permissions><self-uri content-type="pdf" xlink:href="elife03553.pdf"/><abstract><object-id pub-id-type="doi">10.7554/eLife.03553.001</object-id><p>Cytidine deaminases are single stranded DNA mutators diversifying antibodies and restricting viral infection. Improper access to the genome leads to translocations and mutations in B cells and contributes to the mutation landscape in cancer, such as kataegis. It remains unclear how deaminases access double stranded genomes and whether off-target mutations favor certain loci, although transcription and opportunistic access during DNA repair are thought to play a role. In yeast, AID and the catalytic domain of APOBEC3G preferentially mutate transcriptionally active genes within narrow regions, 110 base pairs in width, fixed at RNA polymerase initiation sites. Unlike APOBEC3G, AID shows enhanced mutational preference for small RNA genes (tRNAs, snoRNAs and snRNAs) suggesting a putative role for RNA in its recruitment. We uncover the high affinity of the deaminases for the single stranded DNA exposed by initiating RNA polymerases (a DNA configuration reproduced at stalled polymerases) without a requirement for specific cofactors.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.001">http://dx.doi.org/10.7554/eLife.03553.001</ext-link></p></abstract><abstract abstract-type="executive-summary"><object-id pub-id-type="doi">10.7554/eLife.03553.002</object-id><title>eLife digest</title><p>In cells, genetic information is stored within molecules of DNA, which contain sequences of four ‘bases’ arranged in different orders. 
Replacing one of these bases with a different base results in a mutation, which can have a positive or negative influence on the cell.</p><p>Mammals use a group of enzymes called cytidine deaminases to help defend themselves against harmful invaders. These enzymes work by introducing mutations into the DNA of viruses, microbes and even the mammal itself. For example, an enzyme called APOBEC3G can mutate the DNA of viruses to prevent them spreading around the body. Another enzyme, called AID, can mutate the genes that make antibodies—proteins that attack the invading microbes—in order to make new varieties of antibodies. Unfortunately, the enzymes sometimes target other genes, which can lead to cancer and other diseases.</p><p>Cytidine deaminases can only access and mutate single strands of DNA, so most of the DNA in a cell is protected because it is in a two-stranded double helix. However, there are times when the two strands are separated, such as when a section of DNA is being repaired, or when it is being transcribed to produce a molecule of RNA, which is subsequently used to make a protein. It is not clear when cytidine deaminases are able to target single stranded DNA, and whether they need help from any other components.</p><p>Now, Taylor et al. have studied how these enzymes access single stranded DNA when artificially introduced into yeast. These experiments showed that AID and APOBEC3G can access single stranded DNA without the help of any extra components. The enzymes target genes that are being transcribed to make RNA, with the DNA at the start of the transcription site being the most prone to mutation.</p><p>In mammal cells, most genes are normally protected from the mutations introduced by cytidine deaminases, but this protection does not appear to work in many cancer cells. 
The next challenge will be to develop a better understanding of how this protection works, and to work out why it sometimes goes wrong.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.002">http://dx.doi.org/10.7554/eLife.03553.002</ext-link></p></abstract><kwd-group kwd-group-type="author-keywords"><title>Author keywords</title><kwd>cancer</kwd><kwd>AID/APOBECs</kwd><kwd>cytidine deamination</kwd><kwd>RNA polymerase</kwd><kwd>mutation</kwd><kwd>transcription initiation</kwd></kwd-group><kwd-group kwd-group-type="research-organism"><title>Research organism</title><kwd>human</kwd><kwd>S. cerevisiae</kwd></kwd-group><funding-group><award-group id="par-1"><funding-source><institution-wrap><institution-id institution-id-type="FundRef">http://dx.doi.org/10.13039/501100000265</institution-id><institution>Medical Research Council</institution></institution-wrap></funding-source><award-id>MC_U105178806</award-id><principal-award-recipient><name><surname>Taylor</surname><given-names>Benjamin JM</given-names></name><name><surname>Wu</surname><given-names>Yee Ling</given-names></name><name><surname>Rada</surname><given-names>Cristina</given-names></name></principal-award-recipient></award-group><award-group id="par-2"><funding-source><institution-wrap><institution-id institution-id-type="FundRef">http://dx.doi.org/10.13039/501100000265</institution-id><institution>Medical Research Council</institution></institution-wrap></funding-source><award-id>Centenary Award</award-id><principal-award-recipient><name><surname>Taylor</surname><given-names>Benjamin JM</given-names></name></principal-award-recipient></award-group><funding-statement>The funder had no role in study design, data collection and interpretation, or the decision to submit the work for publication.</funding-statement></funding-group><custom-meta-group><custom-meta><meta-name>elife-xml-version</meta-name><meta-value>2</meta-value></custom-meta><custom-meta 
specific-use="meta-only"><meta-name>Author impact statement</meta-name><meta-value>Transcribed promoters are highly susceptible to mutation by cytidine deaminases, implicating stable exposure of single stranded DNA structures, rather than cofactors, in localising mutation during tumourigenesis and antibody maturation.</meta-value></custom-meta></custom-meta-group></article-meta></front><body><sec id="s1" sec-type="intro"><title>Introduction</title><p>Cytidine deaminases are a family of polynucleotide mutators that modify cytosines into uracil in viral nucleic acids as part of the innate immune defences (<xref ref-type="bibr" rid="bib21">Harris and Liddament, 2004</xref>). Their success in restricting infection is reflected in the fact that the family has undergone a rapid expansion in primates and humans (<xref ref-type="bibr" rid="bib25">Jarmuz et al., 2002</xref>). The ancestral founder of the family, activation induced deaminase (AID), functions in the adaptive immune system to mutate antibody genes in B cells as a fast mechanism to promote diversity of the antibody response to match the rapid evolution of pathogens during infection. The evolutionary advantages of these strategies are counterbalanced by the risk of exposing the host genome to active mutagenesis, a frequent cause of oncogenic transformation in leukaemia and lymphomas of B cell origin.</p><p>All members of the AID/APOBEC family are selective in the sequence context of the deaminated cytosine, with the two preceding nucleotides identifying the signature of individual deaminases (<xref ref-type="bibr" rid="bib4">Beale et al., 2004</xref>). 
This mutation context signature has identified the human APOBEC3A and 3B proteins as the source of many of the somatic mutations accumulated by cancer genomes (<xref ref-type="bibr" rid="bib41">Nik-Zainal et al., 2012</xref>; <xref ref-type="bibr" rid="bib7">Burns et al., 2013</xref>; <xref ref-type="bibr" rid="bib48">Roberts et al., 2013</xref>, <xref ref-type="bibr" rid="bib53">Taylor et al., 2013</xref>). The combined mutational landscape observed in mammalian genomes is complicated by the contribution from multiple cellular processes in addition to enzymatic deamination, such as metabolic oxidation, methyl-CpG deamination and aging, thus elucidating the precise contribution of the APOBECs is far from straightforward (<xref ref-type="bibr" rid="bib2">Alexandrov et al., 2013</xref>; <xref ref-type="bibr" rid="bib35">Lawrence et al., 2013</xref>). However the peculiar clustering of same strand mutations at TpC dinucleotides observed in kataegic mutations in breast cancers constitutes a hallmark of the APOBEC3A and 3B deaminases that can be experimentally induced. Repair of double stranded DNA breaks can expose long patches of single stranded DNA with multiple deaminations leading to the mutation clusters observed in association with genomic rearrangements in breast cancer genomes (<xref ref-type="bibr" rid="bib41">Nik-Zainal et al., 2012</xref>; <xref ref-type="bibr" rid="bib49">Roberts et al., 2012</xref>; <xref ref-type="bibr" rid="bib53">Taylor et al., 2013</xref>).</p><p>Physiologically, the activity of such mutators is targeted to specific substrates and restricted from the rest of the genome to limit genomic instability. In the case of AID, expressed upon activation in only a fraction of B cells, by limiting access to the nuclear compartment and preferential recruitment to the immunoglobulin genes; in the case of APOBEC3G, expressed preferentially in lymphoid cells, by its localisation in the cytosol and binding to the viral genome and capsid. 
The mechanism that preferentially directs AID to the immunoglobulin genes is not fully understood, but active transcription has been repeatedly invoked as a requirement (reviewed in <xref ref-type="bibr" rid="bib51">Storb, 2014</xref>), and many of the proteins found to be associated with AID are also involved in transcription and mRNA processing (<xref ref-type="bibr" rid="bib44">Pavri et al., 2010</xref>; <xref ref-type="bibr" rid="bib3">Basu et al., 2011</xref>; <xref ref-type="bibr" rid="bib42">Okazaki et al., 2011</xref>; <xref ref-type="bibr" rid="bib59">Willmann et al., 2012</xref>). Access of AID to off-target loci is documented not only by the anecdotal occurrence of mutations in oncogenes and chromosomal break points bearing the signature of the deaminase (Bcl6, MYC [<xref ref-type="bibr" rid="bib43">Pasqualucci et al., 2001</xref>]) but also by AID dependent chromosome-break-capture and direct ChIP, where widespread off-target presence of AID is experimentally detected outside the immunoglobulin locus in mouse B cells (<xref ref-type="bibr" rid="bib11">Chiarle et al., 2011</xref>; <xref ref-type="bibr" rid="bib30">Klein et al., 2011</xref>; <xref ref-type="bibr" rid="bib62">Yamane et al., 2011</xref>).</p><p>In addition to the sporadic off-target mutations induced by AID in B cells, APOBEC3A and 3B are thought to be responsible for many of the non-clustered/non-kataegic mutations at TpC dinucleotides observed not only in breast cancers but in other tumour types where the kataegic signature is not obviously present (<xref ref-type="bibr" rid="bib32">Kuong and Loeb, 2013</xref>). As with sporadic AID mutations, the circumstances that promote or grant access of the APOBECs to single stranded DNA substrates of the host are not known. We have shown that overexpression of deaminases in yeast faithfully recapitulates the mutation signatures observed in mammalian genomes. 
Here we have attempted to identify genomic features that promote or are permissive for enzymatic deamination by footprinting mutator activity on multiple genomes. Our results indeed reveal a preferential targeting of the deaminases to defined regions of the genome that is not dependent on cofactors but is rather based on accessibility, with structural features of the DNA at the promoter of actively transcribed genes being the key determinant. We also uncover a potential mechanistic explanation for the targeting and off-target preferences of the antibody diversification mutator AID.</p></sec><sec id="s2" sec-type="results"><title>Results</title><sec id="s2-1"><title>AID and APOBEC3G extensively mutate the yeast genome</title><p>Overexpression of cytidine deaminases in yeast leads to the accumulation of genome wide mutations, which can be monitored by the number of cells resistant to the arginine analogue L-canavanine through inactivation of the arginine permease CAN1 gene (<xref ref-type="fig" rid="fig1">Figure 1A</xref>). We have previously shown that such overexpression leads to a uracil-DNA glycosylase (UNG) dependent enrichment of kataegic mutations through deamination of cytosines on single stranded DNA intermediates during the repair of double strand breaks (<xref ref-type="bibr" rid="bib53">Taylor et al., 2013</xref>). To assess the distribution of isolated mutations, we obtained a dataset largely devoid of kataegic mutations by expressing the deaminases in <italic>ung</italic>Δ cells. Overexpression of AID* (an AID hyperactive mutant [<xref ref-type="bibr" rid="bib57">Wang et al., 2009</xref>; <xref ref-type="bibr" rid="bib53">Taylor et al., 2013</xref>]) in haploid cells results in a highly elevated frequency of canavanine resistant colonies (164 × 10<sup>−6</sup>), but relatively few mutations, averaging 61 single nucleotide variations (SNVs) per genome (<xref ref-type="fig" rid="fig1">Figure 1A,B</xref>). 
Diploid cells can overcome this limit as they avoid the fitness costs caused by accumulated mutations (<xref ref-type="bibr" rid="bib58">Waters and Parry, 1973</xref>; <xref ref-type="bibr" rid="bib33">Lada et al., 2013</xref>). Our experimental setting confirms this effect; whereas the mutation frequency is reduced almost 40-fold due to the requirement to inactivate both CAN1 alleles, the genome wide SNV load increases over 10-fold, averaging 796 SNVs per genome for AID* and 592 SNVs for transformants expressing sA3G* (a hyperactive mutant of the catalytic domain of human APOBEC3G [<xref ref-type="bibr" rid="bib57">Wang et al., 2009</xref>]; <xref ref-type="fig" rid="fig1">Figure 1A,B</xref>). For comparison, a database of mutations at C•G pairs was generated using the alkylating agent ethyl methane sulfonate (EMS). Alkylation of guanosines promotes base pairing with thymine, thereby causing G > A transitions during replication. Overnight exposure of diploid cells to 0.2% EMS resulted in increased mutation frequency and SNV load per genome similar to that elicited by the deaminases (<xref ref-type="fig" rid="fig1">Figure 1</xref>).<fig id="fig1" position="float"><object-id pub-id-type="doi">10.7554/eLife.03553.003</object-id><label>Figure 1.</label><caption><title>Genome wide distribution and signature of unclustered deaminase induced mutations in <italic>ung1Δ</italic> diploid yeast.</title><p>(<bold>A</bold>) Mutation frequency (expressed as the number of canavanine resistant colonies per 10<sup>6</sup>) at the CAN1 locus in <italic>ung1Δ</italic> haploid yeast (data in part from <xref ref-type="bibr" rid="bib53">Taylor et al., 2013</xref>) and <italic>ung1Δ/ung1Δ</italic> diploid yeast transformants expressing AID/APOBEC proteins or upon treatment with 0.2% EMS. Red bars indicate the median mutation frequency (n = 12–126 colonies). 
(<bold>B</bold>) Genome wide SNV number in <italic>ung1Δ</italic> haploid and <italic>ung1Δ/Δ</italic> diploid yeast transformants expressing AID/APOBEC proteins or with EMS treatment. Red bars indicate the median mutation per genome (n = 25–50 independent clones). (<bold>C</bold>) Sequence context of mutations at G•C pairs in diploid yeast genomes (indicated as mutations at cytosines) exposed to AID*, sA3G* or EMS mutagenesis. The numbers indicate total mutations per dataset, with the height of colour bars proportional to the frequency of each base found in the vicinity of a mutation. (<bold>D</bold>) Distribution of mutations per diploid yeast chromosome expressed as the number of mutations per chromosome in each independent genome against the chromosome length. The bars represent the projected linear trend for mutations at C (in black) or G (in red).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.003">http://dx.doi.org/10.7554/eLife.03553.003</ext-link></p></caption><graphic xlink:href="elife03553f001"/></fig></p><p>When interrogating the mutations (99.8% of which occur at C:G pairs; A:T mutations were excluded from further analysis; all detected mutations are given in <xref ref-type="supplementary-material" rid="SD1-data">Supplementary file 1</xref>), the expected flanking sequence context of WR<italic>C</italic> was found for AID* and YC<italic>C</italic> for sA3G* (<xref ref-type="fig" rid="fig1">Figure 1C</xref>). In stark contrast, no consensus motif was observed in the EMS data, highlighting the random nature of this mutagenesis. 
In all three datasets SNVs appeared distributed throughout the genome, with all chromosomes displaying similar overall mutation that is strongly correlated with chromosome length, ruling out major biases in the targeting of mutations (Spearman's correlation coefficient for AID*: ρ > 0.65; for sA3G*: ρ > 0.55; for EMS: ρ > 0.68; <xref ref-type="fig" rid="fig1">Figure 1D</xref>).</p></sec><sec id="s2-2"><title>Deaminase induced mutations are highly enriched in a small fraction of the genome</title><p>Whilst mutations are equally distributed amongst chromosomes, they are not uniformly arranged along the chromosome. By combining the SNVs from independent transformants, regions can be observed in AID* and sA3G* genomes which show pronounced mutational peaks (<xref ref-type="fig" rid="fig2">Figure 2A</xref>). Only one such region of high mutation density is seen in the EMS treated clones, that of the CAN1 gene. The presence of multiple loci with high mutation density is therefore a deaminase specific process.<fig-group><fig id="fig2" position="float"><object-id pub-id-type="doi">10.7554/eLife.03553.004</object-id><label>Figure 2.</label><caption><title>Mutation enriched loci (MELs) identified by focussed deaminase-induced mutation.</title><p>(<bold>A</bold>) Radial histograms depict the density (Z-score) of pooled mutations for each dataset in 2 kb overlapping genomic segments along each chromosome. The CAN1 locus is highlighted in red. The peak highlighted in cyan is further enlarged in panels (<bold>B</bold>), (<bold>C</bold>) and (<bold>D</bold>). (<bold>B</bold>) Mutation densities along ChrII in AID* (red), sA3G* (black) and EMS (blue) treated genomes, expressed as the Z-score of mutation density per dataset (y-axis) along chromosome II (x-axis; 200 bp bin size). The region shadowed in cyan is magnified in (<bold>C</bold>). 
(<bold>C</bold>) Regions of high mutation density identify narrow mutation enriched loci (MELs), shown as green boxes for AID* and purple boxes for sA3G* in the bottom panel. Horizontal lines represent a single genome with each non-clonal mutation at C or G indicated by a dot (black or red respectively). Regions in Chr II and Chr X containing mutation enriched loci are shown at the same scale, with the genomic coordinates indicated. (<bold>D</bold>) Mutations in the pronounced MEL on ChrII (highlighted cyan in panels (<bold>A</bold>), (<bold>B</bold>) and (<bold>C</bold>)) shown in green for AID* and purple for sA3G*. Coordinates are indicated. (<bold>E</bold>) Overlap of detected MELs in AID*, sA3G* and EMS datasets. (<bold>F</bold>) Distribution of MEL widths with the median indicated for AID* and sA3G* mutated genomes. (<bold>G</bold>) Fraction of the total deaminase mutations in MELs (black boxes) relative to genomic coverage of MELs. (<bold>H</bold>) Distribution of distances between AID and A3G mutable motifs within MELs vs genome wide mutable motif distances.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.004">http://dx.doi.org/10.7554/eLife.03553.004</ext-link></p></caption><graphic xlink:href="elife03553f002"/></fig><fig id="fig2s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.03553.005</object-id><label>Figure 2—figure supplement 1.</label><caption><title>Overlap between Haploid and Diploid MELs.</title><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.005">http://dx.doi.org/10.7554/eLife.03553.005</ext-link></p></caption><graphic xlink:href="elife03553fs001"/></fig><fig id="fig2s2" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.03553.006</object-id><label>Figure 2—figure supplement 2.</label><caption><title>Strand bias in deaminase induced mutations calculated as fraction of mutations at C (+strand) or G (- 
strand) within each MEL.</title><p>(<bold>A</bold>) Strand distribution of mutations within AID* and sA3G* MEL regions. MELs comprising a single base were excluded. (<bold>B</bold>) Strand distribution of mutations within MEL regions in relation to the direction of transcription of the associated gene. (<bold>C</bold>) Strand distribution of WRC and YCC deaminase motifs within MEL regions and their flanking 50 base pairs.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.006">http://dx.doi.org/10.7554/eLife.03553.006</ext-link></p></caption><graphic xlink:href="elife03553fs002"/></fig></fig-group></p><p>A more detailed look at regions with high density of mutations reveals narrow peaks of accumulated mutation that are in many cases common to both deaminases (<xref ref-type="fig" rid="fig2">Figure 2B</xref>), with the most prominent peaks resulting from the proximity of several regions of densely targeted loci. These peaks represent high mutation densities within a bin size of 150 base pairs but surprisingly reflect the accumulation of mutations focussed to very narrow intervals within targeted loci (<xref ref-type="fig" rid="fig2">Figure 2C,D</xref>).</p><p>To further delineate mutation favoured loci, we defined regions of high mutation density by identifying overlapping 150 base pair fragments containing higher than expected mutation loads (minimum of six mutations per fragment, originating from three independent transformants). We identify 1227 and 568 such mutation-enriched loci (MELs) in the AID* and sA3G* treated genomes, in contrast to just 1 obtained for EMS treatment (overlapping the body of the CAN1 gene and hence due to canavanine selection). On average 35 such MELs would be expected for simulated datasets of equivalent mutation loads (<xref ref-type="fig" rid="fig2">Figure 2E</xref> and <xref ref-type="supplementary-material" rid="SD2-data">Supplementary file 2</xref>). 
MELs span remarkably narrow regions, with a window width averaging 110 bp for AID* and 71 bp for sA3G* (<xref ref-type="fig" rid="fig2">Figure 2F</xref>), and with almost 41% of all AID* and 22% of all sA3G* induced mutations localised to these regions (<xref ref-type="table" rid="tbl1">Table 1</xref> and <xref ref-type="supplementary-material" rid="SD2-data">Supplementary file 2</xref>). In total, 25,618 of the combined 72,196 AID* and sA3G* mutations occur in MELs, which account for just 1.5% of the genome (<xref ref-type="fig" rid="fig2">Figure 2G</xref>).<table-wrap id="tbl1" position="float"><object-id pub-id-type="doi">10.7554/eLife.03553.007</object-id><label>Table 1.</label><caption><p>Deaminase induced Mutation Enriched Loci (MEL) in yeast genomes</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.007">http://dx.doi.org/10.7554/eLife.03553.007</ext-link></p></caption><table frame="hsides" rules="groups"><thead><tr><th/><th colspan="3">Observed</th><th colspan="3">Simulated</th></tr><tr><th/><th>AID*</th><th>sA3G*</th><th>EMS</th><th>AID*</th><th>sA3G*</th><th>EMS</th></tr></thead><tbody><tr><td>MELs</td><td>1227</td><td>568</td><td>1</td><td>50</td><td>21</td><td>3</td></tr><tr><td>% MEL mutation</td><td>40.7</td><td>21.6</td><td>0.24</td><td>0.75</td><td>0.39</td><td>0.14</td></tr></tbody></table></table-wrap></p><p>Both AID and APOBEC3G target cytosines for deamination within a specific sequence context, leading to the mutation hotspots associated with antibody diversification and the recurrent mutations at CCC trinucleotides observed in HIV-1 genomes during the evolution of viral clades and which accumulate in viral genomes from infected individuals (<xref ref-type="bibr" rid="bib28">Kijak et al., 2008</xref>). 
We therefore analysed the distribution of AID and APOBEC3G preferred sequence context in the yeast genome and find that the densities of AID and APOBEC3G motifs (WRC and YCC respectively) show no enrichment within the highly targeted regions compared to the remaining genome (<xref ref-type="fig" rid="fig2">Figure 2H</xref>). Therefore, the accumulation of mutations in MELs is not a consequence of localised clustering of mutable motifs.</p><p>Reinforcing the notion that MELs are highly favoured targets for mutations, we find these areas are frequently mutated on both alleles: 48% of AID* genomes and 56% of sA3G* genomes have mutations within MELs occurring on both chromosome alleles, compared to just 2–3% predicted for random fragments of equivalent size and mutation loads. MELs also contain most of the homozygous mutations detected (82% of AID* and 78% of sA3G*). Targeting of both alleles in MELs suggests they represent highly mutable regions within the genome, with the deaminases returning repeatedly to the same sites (albeit on a second chromosome) to mutate.</p><p>Re-analysis of deaminase mutations we previously reported in haploid yeast (<xref ref-type="bibr" rid="bib53">Taylor et al., 2013</xref>) identified 39 MELs which overlap with hypermutated MELs in diploid yeast, thus the focusing of mutations to MELs is seemingly unaffected by ploidy, suggesting the skewing of mutations due to selective pressures, such as fitness, is negligible (<xref ref-type="fig" rid="fig2s1">Figure 2—figure supplement 1</xref>). Equally, we observe no significant strand bias in the hypermutated hotspots associated with AID* MELs suggesting that both strands are targeted in a similar fashion. 
A broader distribution of sA3G* strand bias more likely reflects the partial skewing in the presence of YCC motifs at MELs (<xref ref-type="fig" rid="fig2s2">Figure 2—figure supplement 2</xref>).</p><p>In conclusion, deaminases preferentially target narrow focussed regions throughout the genome independent of the sequence density of deaminase targets.</p></sec><sec id="s2-3"><title>MELs exclusively overlap gene promoters</title><p>There is a well-recognised relationship between AID induced mutations and transcription both in B cells at immunoglobulin genes, and for off-target loci, with mutations preferentially accumulating towards the promoter proximal region of the transcription unit (<xref ref-type="bibr" rid="bib43">Pasqualucci et al., 2001</xref>; <xref ref-type="bibr" rid="bib45">Rada and Milstein, 2001</xref>). The transcription link is interpreted as a mechanism that facilitates access of AID due to the generation of single stranded DNA intermediates (<xref ref-type="bibr" rid="bib9">Chaudhuri et al., 2003</xref>). We therefore wondered whether AID induced MELs would be found associated with transcription. 
Contrary to expectation, enrichment analysis reveals that both AID* and sA3G* MELs are depleted within the body of RNA polymerase II (RNAP II) transcribed mRNA genes and rather that the deaminase induced mutations are preferentially associated with promoter regions, with over 76% of deaminase targeted hotspots found at promoters, compared to just 24% for simulated fragments (<xref ref-type="fig" rid="fig3">Figure 3A</xref>).<fig-group><fig id="fig3" position="float"><object-id pub-id-type="doi">10.7554/eLife.03553.008</object-id><label>Figure 3.</label><caption><title>Deaminase mutation footprints are focussed to the pre-initiation complex region of active promoters.</title><p>(<bold>A</bold>) Proportion of promoters, gene bodies, intergenic regions and replication origins (ARS) harbouring a MEL (green) or not (grey) for AID* and sA3G* datasets vs the expected distribution (sim.AID*sA3G*) determined by Monte Carlo simulation of equivalent sized fragments for each MEL dataset distributed randomly across the genome. (<bold>B</bold>) Density of mutations in relation to their distance to the nearest transcription start site (TSS) of mRNA (RNAP II) transcripts compared to the density relative to transcription termination sites (TTS). Data includes all mutations in addition to MELs. (<bold>C</bold>) Deaminase mutations relative to the TATA or TATA-like element for each RNAP II promoter (<xref ref-type="bibr" rid="bib47">Rhee and Pugh, 2012</xref>) compared to the mutation distance distribution aligned to the transcription start site (TSS). (<bold>D</bold>) Proportion of AID* or sA3G* mutable motifs within RNAP II promoter regions, centred on the TATA-elements (<xref ref-type="bibr" rid="bib47">Rhee and Pugh, 2012</xref>). Total number of mutations for each dataset is shown at each position (black line). 
(<bold>E</bold>) Relative transcription rates (see methods) at RNAP II promoters targeted by MELs compared to relative transcription rates for all RNAP II genes in gal induced conditions (<xref ref-type="bibr" rid="bib15">García-Martínez et al., 2004</xref>). (<bold>F</bold>) Relative enrichment of RNAP II and RNAP II CTD phosphorylation (S2P, S5P and S7P) in promoters containing AID* (red) and sA3G* (black) MELs and all RNAP II promoters (grey) ranked according to transcriptional activity (<xref ref-type="bibr" rid="bib15">García-Martínez et al., 2004</xref>).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.008">http://dx.doi.org/10.7554/eLife.03553.008</ext-link></p></caption><graphic xlink:href="elife03553f003"/></fig><fig id="fig3s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.03553.009</object-id><label>Figure 3—figure supplement 1.</label><caption><title>Paucity of deaminase mutations at replication origins is not a consequence of absence of mutable motifs.</title><p>Proportion of AID* or sA3G* mutable motifs around replication origins (ARS), depicted as in <xref ref-type="fig" rid="fig3">Figure 3D</xref>. 
Total number of mutations for each dataset is shown at each position (black line, scale as in <xref ref-type="fig" rid="fig3">Figure 3D</xref>).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.009">http://dx.doi.org/10.7554/eLife.03553.009</ext-link></p></caption><graphic xlink:href="elife03553fs003"/></fig><fig id="fig3s2" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.03553.010</object-id><label>Figure 3—figure supplement 2.</label><caption><title>Density of mutations in relation to their distance to the nearest TATA box or TATA-like element.</title><p>Mutations are grouped according to the TAF1 enrichment status (data from <xref ref-type="bibr" rid="bib47">Rhee and Pugh, 2012</xref>) with the line colour depicting the mutator (AID*, red; sA3G*, black; EMS, blue). Data includes all mutations in addition to MELs.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.010">http://dx.doi.org/10.7554/eLife.03553.010</ext-link></p></caption><graphic xlink:href="elife03553fs004"/></fig><fig id="fig3s3" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.03553.011</object-id><label>Figure 3—figure supplement 3.</label><caption><title>Distribution of the deaminases on chromatin is unrelated to mutation preferences.</title><p>Enrichment of (<bold>A</bold>) deaminase, (<bold>B</bold>) serine 5 phosphorylated RNAPII and (<bold>C</bold>) Histone H3 at MEL promoters, unmutated promoters and intergenic regions. Enrichment is shown relative to input chromatin (<bold>B</bold> and <bold>C</bold>) or further normalised to control cell lines (<bold>A</bold>). 
Data from 2–3 independent experiments.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.011">http://dx.doi.org/10.7554/eLife.03553.011</ext-link></p></caption><graphic xlink:href="elife03553fs005"/></fig><fig id="fig3s4" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.03553.012</object-id><label>Figure 3—figure supplement 4.</label><caption><title>Transcription factor binding sites compared to MEL preferences.</title><p>(<bold>A</bold>) Frequency of each yeast transcription factor at individual promoters as described in (<xref ref-type="bibr" rid="bib56">Venters et al., 2011</xref>) (blue dots) compared with the frequency that the transcription factor appears in the promoter of genes containing AID* (red dots) and sA3G* (black dots) MELs. Factors are ordered according to number of binding sites in all promoters. Basal transcription factors are the most commonly associated with deaminase targeted promoters (labelled). (<bold>B</bold>) Transcription rates of genes grouped according to Spt16 promoter occupancy and presence of MELs. (<bold>C</bold>) List of transcription factors found to vary in occupancy at MEL targeted promoters vs their overall frequency at all yeast promoters (Venters dataset). Transcription factors showing ±10% variation which are present in at least 25% of MELs are listed.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.012">http://dx.doi.org/10.7554/eLife.03553.012</ext-link></p></caption><graphic xlink:href="elife03553fs006"/></fig></fig-group></p><p>Initiation of replication also transiently generates single stranded DNA at defined locations. However, there is no enrichment of mutated hotspots associated with replication origins (ARS) (<xref ref-type="fig" rid="fig3">Figure 3A</xref>). 
Although this could reflect the relative depletion of mutable motifs within the ARS core consensus sequence, we find similar densities of mutable cytosines within the broader sequence context encompassing the 200–300 base pair nucleosome depleted regions associated with functional origins (<xref ref-type="fig" rid="fig3s1">Figure 3—figure supplement 1</xref>), suggesting that single strand availability provided by melting of the DNA by the ORC complex might not be sufficient to efficiently target the deaminases.</p><p>Mutation enrichment at promoters is not restricted to hotspots identified within MELs, which exclude 73% of the total mutations due to the threshold applied in defining enriched loci. Aligning all mutations to mRNA transcription start sites (TSS) and termination sites (TTS) (<xref ref-type="bibr" rid="bib61">Xu et al., 2009</xref>) revealed a strong association of deaminase induced mutations with the TSS, with over 57% of AID* and 46% of sA3G* mutations occurring within the promoter region (defined as 500 base pairs upstream and 50 base pairs downstream of the TSS), compared to only 21% of EMS mutations (the expected frequency for randomly distributed mutations). Mutation accumulation is skewed upstream of the TSS (peak at -21 bp and -38 bp for AID* and sA3G* respectively; <xref ref-type="fig" rid="fig3">Figure 3B</xref>), corresponding to the nucleosome free region where the pre-initiation RNAP complex forms before scanning for the TSS (<xref ref-type="bibr" rid="bib47">Rhee and Pugh, 2012</xref>). 
Indeed, aligning SNVs to the TATA box/TATA-like element or TSS revealed that not only are the majority of promoter associated mutations occurring between these two features (<xref ref-type="fig" rid="fig3">Figure 3C</xref>), but there is also a paucity of mutations at the TATA-element, suggesting this region is protected by TBP/TFIID binding (this paucity is, at least for AID*, not due to an absence of mutable sequence motifs; <xref ref-type="fig" rid="fig3">Figure 3D</xref>). Intriguingly, the peak of AID* and sA3G* induced SNVs is centred 30 base pairs from the TATA-element, the region where TBP guides TFIIB to load RNAPII for the formation of the pre-initiation complex (PIC) (<xref ref-type="bibr" rid="bib47">Rhee and Pugh, 2012</xref>).</p><p>The deaminase mutated hotspots thus identify the position where promoter melting occurs before the scanning polymerase encounters the TSS, suggesting a mechanistic basis for the hypothesis that the deaminases access the promoter coincidentally with the assembly of the pre-initiation complex. Consistent with the notion that initiating polymerases create transient access for the deaminases rather than specifically loading the proteins, we detect robust association of RNAP II with the promoter region of deaminase targeted promoters in yeast but negligible enhancement in the association of either AID or sA3G with mutated promoters compared to unmutated or intergenic regions (<xref ref-type="fig" rid="fig3s3">Figure 3—figure supplement 3</xref>). 
Additionally, while there is a correlation between the mutated strand and the direction of transcription (<xref ref-type="fig" rid="fig2s2">Figure 2—figure supplement 2</xref>), MELs are predominantly composed of mutations occurring in both strands, suggesting the PIC makes both strands available during initiation.</p><p>Supporting the idea that the deaminases preferentially mutate promoters due to their ability to recognize the melted DNA associated with the transcription pre-initiation complex, we observe that MELs occur in genes with above average transcriptional activity (<xref ref-type="bibr" rid="bib15">García-Martínez et al., 2004</xref>) but targeting appears unrelated to any particular transcriptional program (<xref ref-type="fig" rid="fig3s4">Figure 3—figure supplement 4</xref>). Rather than simply transcription factor binding at the promoter, active initiation by RNAP II is important for MEL development (Wilcoxon test p &lt; 0.005 for all groups; <xref ref-type="fig" rid="fig3">Figure 3E</xref>). The transition of RNAP II from the pre-initiation complex to the elongation complex is associated with a shift in phosphorylation of the C-terminal domain (CTD) from serine 5/7 to serine 2 (<xref ref-type="bibr" rid="bib29">Kim et al., 2010</xref>). In agreement with the transcription rate analysis, deaminase MELs are associated with both high levels of RNAP II occupancy and CTD-S5P, which parallels the association with the most highly transcribed genes (<xref ref-type="fig" rid="fig3">Figure 3F</xref>). 
Indeed, the recurrent association of both AID* and sA3G* MELs with regions enriched for the basal transcription machinery and in particular Spt16, a chromatin chaperone associated with highly transcribed genes (<xref ref-type="bibr" rid="bib13">Formosa, 2013</xref>) (<xref ref-type="fig" rid="fig3s4">Figure 3—figure supplement 4</xref>), reinforces the idea that active transcription and potential pausing (at promoters highly dependent on the FACT/Spt16 complex) determine deaminase targeting.</p><p>In summary, cytidine deaminases mutate at specific loci throughout the yeast genome, predominantly within active gene promoter regions.</p></sec><sec id="s2-4"><title>AID targets promoter regions of small RNAP III genes</title><p>In B cells, AID is found in association with components of the transcription machinery such as SPT5 and SPT6, and RNAP II itself (<xref ref-type="bibr" rid="bib40">Nambu et al., 2003</xref>; <xref ref-type="bibr" rid="bib44">Pavri et al., 2010</xref>; <xref ref-type="bibr" rid="bib42">Okazaki et al., 2011</xref>); we therefore wondered whether the enrichment of mutations associated with promoters might be a feature restricted to RNAP II dependent genes. Analysis of mutations in highly transcribed non-RNAP II dependent transcripts, such as RNAP III dependent tRNA genes, astonishingly reveals an even more pronounced enrichment of targeted hotspots, with 78% of the genomic regions corresponding to tRNAs harbouring repeated mutations. While we find that both sA3G* and AID* MELs overlap with tRNAs, AID* MELs are disproportionately overrepresented, with 228 of 275 tRNA genes being highly targeted (<xref ref-type="fig" rid="fig4">Figure 4A</xref>). Furthermore, aligning mutations within 250 base pairs of the TSS of tRNA genes shows that all occur within the tRNA gene body, which is also the site of RNAP III initiation (<xref ref-type="fig" rid="fig4">Figure 4B</xref>). 
As in the case of mRNA promoters, the mutations in tRNAs are highly focussed to narrow hotspots that span the site where loading of the polymerase is thought to occur (<xref ref-type="fig" rid="fig4">Figure 4C</xref>).<fig-group><fig id="fig4" position="float"><object-id pub-id-type="doi">10.7554/eLife.03553.013</object-id><label>Figure 4.</label><caption><title>AID* and sA3G* target both RNAP II and RNAP III promoters.</title><p>(<bold>A</bold>) Number of tRNA genes harbouring (green) an AID* or sA3G* MEL compared with expected number from Monte Carlo simulations. (<bold>B</bold>) Density of mutations in relation to the transcription start site (TSS) of tRNA genes. Mutations within the 500 base pair interval centred at the TSS are included. (<bold>C</bold>) Mutation frequency in promoters of mRNA genes (within a window 500 bp upstream and 50 bp downstream of the TSS) compared to the frequency of mutations in the promoters of tRNA (550 bp window centred on the middle of the tRNA gene), snoRNAs and snRNA genes (550 bp window as for mRNA genes). mRNA genes are binned according to transcription rate as in <xref ref-type="fig" rid="fig3">Figure 3</xref>. Both RNAP II and III driven snoRNAs are included. 
(<bold>D</bold>) Example of MELs in ChrIV and ChrXV corresponding to tRNA tI(UAU)D and tG(CCC)O, depicted as in <xref ref-type="fig" rid="fig3">Figure 3</xref>.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.013">http://dx.doi.org/10.7554/eLife.03553.013</ext-link></p></caption><graphic xlink:href="elife03553f004"/></fig><fig id="fig4s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.03553.014</object-id><label>Figure 4—figure supplement 1.</label><caption><title>Median number of mutable motifs in promoter regions.</title><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.014">http://dx.doi.org/10.7554/eLife.03553.014</ext-link></p></caption><graphic xlink:href="elife03553fs007"/></fig><fig id="fig4s2" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.03553.015</object-id><label>Figure 4—figure supplement 2.</label><caption><title>Mutationally enriched loci are not a consequence of increased density of mutable motifs.</title><p>The number of deaminase motifs for each MEL vs the number of mutations within each MEL for AID* and sA3G* datasets.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.015">http://dx.doi.org/10.7554/eLife.03553.015</ext-link></p></caption><graphic xlink:href="elife03553fs008"/></fig><fig id="fig4s3" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.03553.016</object-id><label>Figure 4—figure supplement 3.</label><caption><title>Mutations in the rDNA locus are restricted to the replication fork block (RFB) site.</title><p>(<bold>A</bold>) Sequence context of low allelic frequency mutations detected in the rDNA locus, as depicted in <xref ref-type="fig" rid="fig1">Figure 1C</xref>. (<bold>B</bold>) Schematic of the rDNA repeat region. Panels show mutations detected in yeast transformants at the rDNA locus. 
Each line represents one clone with dots representing mutations (mutation at C, black; at G, red; at A, green). Clones with no detected mutations are not depicted.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.016">http://dx.doi.org/10.7554/eLife.03553.016</ext-link></p></caption><graphic xlink:href="elife03553fs009"/></fig><fig id="fig4s4" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.03553.017</object-id><label>Figure 4—figure supplement 4.</label><caption><title>Deaminase induced mutation distribution in relation to R-loop forming potential.</title><p>Tables showing the correlation between R-loop formation predicted by the QmRLFS-finder (<xref ref-type="bibr" rid="bib60">Wongsurawat et al., 2012</xref>) or SkewR package (<xref ref-type="bibr" rid="bib17">Ginno et al., 2012</xref>) and the presence of MELs.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.017">http://dx.doi.org/10.7554/eLife.03553.017</ext-link></p></caption><graphic xlink:href="elife03553fs010"/></fig></fig-group></p><p>The mutation frequency (normalised number of mutations per 550 base pairs) in AID* genomes within tRNA genes is higher than at mRNA gene promoters (p value &lt; 2e-16, Wilcoxon non-parametric test; <xref ref-type="fig" rid="fig4">Figure 4C</xref>) and much higher than that observed even in the subset of highly transcribed mRNA promoters. While the difference in mutation frequency between mRNA promoters and tRNAs is still statistically significant for sA3G* (p value &lt; 8e-10), this effect is less pronounced. Enhanced mutation is also observed in the promoters of snoRNA and snRNA genes, again particularly in the case of AID* genomes, whereas no statistically significant differences are observed between any of the promoter subsets for mutations driven by EMS. 
The enhanced mutation attributable to AID* is not likely a feature of RNAP III, since snoRNAs are even more targeted for mutation though all but snR52 are transcribed by RNAP II (<xref ref-type="bibr" rid="bib39">Moqtaderi and Struhl, 2004</xref>).</p><p>Targeting of tRNA, snRNA and snoRNA genes by the deaminases could be enhanced by the availability of hypermutable motifs, as there is on average one more YCC motif in the tRNA genes (1.5 more in the MEL region itself) targeted by sA3G* than in those tRNA genes not targeted by sA3G*. We see no such difference with AID* target motifs, which are present within tRNA, snRNA and snoRNA gene promoters at a similar frequency as in other promoters (average 52 to 63 motifs per 550 base pair promoter window, <xref ref-type="fig" rid="fig4s1">Figure 4—figure supplement 1</xref>). Overall, there is only weak correlation between the number of motifs within the 550 base pair promoter window and the number of mutations (Spearman's ρ = 0.02 and ρ = 0.2 for AID* and sA3G* respectively; <xref ref-type="fig" rid="fig4s2">Figure 4—figure supplement 2</xref>), confirming that motif availability is not the main determinant for targeting.</p><p>Mutations at rRNA genes were poorly mapped due to the repetitive nature of the region on Chr XII (150–200 copies of the 9.1 kb unit containing the 35S pre-RNA and the 5S RNA). By including repeatedly mapped reads across the rDNA locus, we could detect several hundred mutations at low allele frequency, all within the expected deaminase mutation context, giving confidence in their detection and location (<xref ref-type="fig" rid="fig4s3">Figure 4—figure supplement 3A</xref>). Mutations were restricted to the well defined ribosomal replication fork barrier (rRFB) located between the 5S and 35S transcriptional units. No enhanced mutation was detected at the promoter regions (which are transcribed in opposite directions by RNAP III and RNAP I respectively). 
However, mutations clustered at the rRFB site for both deaminases (<xref ref-type="fig" rid="fig4s3">Figure 4—figure supplement 3B</xref>), at a site where induced homologous recombination maintains the size of the ribosomal gene array. Although DNA double-strand breaks (DSB) have been detected at the site, it is likely that in vivo persistent breaks are rare in undamaged yeast (<xref ref-type="bibr" rid="bib14">Fritsch et al., 2010</xref>). Accordingly, we did not detect kataegic-like clusters in the region, but rather localised mutated hotspots. Thus it is possible that other mechanisms such as cryptic transcription (<xref ref-type="bibr" rid="bib22">Houseley et al., 2007</xref>) might expose the site to the action of the deaminases, rather than repair of double strand breaks. While AID overexpression in yeast deficient for components of the RNA processing machinery (THO) enhances genomic instability, particularly in highly transcribed GC-rich regions prone to R-loop formation (<xref ref-type="bibr" rid="bib18">Gómez-González and Aguilera, 2007</xref>), in wild type yeast this effect is only mild. Nonetheless, we observe a positive association of MELs with genes of predicted R-loop potential, although the paucity of these features across the genomes (between 59 and 78 sites) precludes any predictive dissociation between high density of mutation, R-loop potential and transcription rates (<xref ref-type="fig" rid="fig4s4">Figure 4—figure supplement 4</xref>).</p></sec><sec id="s2-5"><title>AID but not sA3G binds small RNAs</title><p>An alternative explanation for the enhanced targeting of small RNA promoters by AID* is that the RNAs themselves preferentially bind AID, thereby creating co-transcriptional enrichment of AID in the vicinity of their genes. 
Purified AID binds RNA, with its in vitro deamination activity enhanced by treatment with RNase A (<xref ref-type="bibr" rid="bib6">Bransteitter et al., 2003</xref>), whereas the non-catalytic domain of APOBEC3G is responsible for its ability to bind RNA and form high molecular weight ribonucleic–protein complexes (<xref ref-type="bibr" rid="bib24">Huthoff et al., 2009</xref>; <xref ref-type="bibr" rid="bib5">Bélanger et al., 2013</xref>). It is not known whether binding in both cases is specific for any particular RNA species, but based on our current observations we decided to test the ability of human AID and human APOBEC3G to bind in vitro transcribed tRNA as well as polyU RNA. Whereas both Flag-tagged overexpressed human AID and full length human APOBEC3G can be recovered from cell extracts by binding to biotin labelled RNAs, the catalytic domain of APOBEC3G (sA3G) cannot (<xref ref-type="fig" rid="fig5">Figure 5A</xref>). Furthermore, full length APOBEC3G is efficiently recovered from extracts by the extended linear polyU RNA, a reflection of its ability to oligomerise in an RNA dependent fashion, whereas AID recovery is not enhanced by its binding to linear polyU RNA. Binding of AID to tRNA species was also found for endogenous yeast tRNAs, suggesting that the modifications found in vivo (pseudouridylation and 2′-O-ribose methylation) do not affect the interaction. The single domain APOBEC3A protein shows no RNA binding ability except a limited amount to double stranded RNA, despite sharing preferential targeting to promoters with the rest of the deaminases (<xref ref-type="fig" rid="fig5s1">Figure 5—figure supplement 1A</xref>). Taken together, these data suggest a degree of specificity in the RNA binding preferences of the deaminases, with AID preference linked to structured rather than linear RNA (<xref ref-type="fig" rid="fig5">Figure 5B</xref>). 
Interestingly, the catalytic activity of AID is not required for the binding or the specificity, since similar binding was observed for the inactive mutant AID-E58A (<xref ref-type="fig" rid="fig5">Figure 5A</xref>).<fig-group><fig id="fig5" position="float"><object-id pub-id-type="doi">10.7554/eLife.03553.018</object-id><label>Figure 5.</label><caption><title>RNA binding by human AID and APOBEC3G.</title><p>(<bold>A</bold>) Left panel shows the in vitro transcribed pre-tI(UAU)D tRNA used for affinity purification. Right panel shows immunoblots for transiently overexpressed AID/APOBEC3G proteins following RNA-immunoprecipitation with pre-tRNA. (<bold>B</bold>) Affinity purification with tI(UAU)D probe, total yeast tRNA, homopolymeric single stranded (polyU) and double stranded (polyA:U) RNA. Left panel shows input proteins, right panel shows immunoblots for transiently overexpressed AID/APOBEC3 proteins following RNA-immunoprecipitation. Results representative of at least 3 independent experiments. (<bold>C</bold>) Deaminase induced mutations in the promoter region of the YBR194W locus. Top panels: accumulated mutations in the AID*, sA3G* and EMS whole genome datasets. Bottom panels: mutations detected in Sanger sequenced yeast clones unmodified or harbouring a chimeric YBR194W-snR6 locus. Each line represents one clone with dots representing mutations (at C, black; at G, red). 
Clones with no mutations are indicated.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.018">http://dx.doi.org/10.7554/eLife.03553.018</ext-link></p></caption><graphic xlink:href="elife03553f005"/></fig><fig id="fig5s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.03553.019</object-id><label>Figure 5—figure supplement 1.</label><caption><title>Promoter mutations are driven by APOBEC3A and 3B and are a feature of cancer genomes enriched for TC mutations.</title><p>(<bold>A</bold>) Mutation density relative to the TSS for APOBEC3A and APOBEC3B induced mutations from <italic>ung</italic>Δ haploid cells (data from <xref ref-type="bibr" rid="bib53">Taylor et al., 2013</xref>). The density at tRNA promoters is shown separately in red. (<bold>B</bold>) Mutations in breast cancer genome PD4120a and lung adenocarcinoma LUAD-S01345. Pie charts show the contribution of mutations at TC over mutations at the remaining dinucleotides and histograms show mutation density relative to all human TSS (Ensembl annotation).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.019">http://dx.doi.org/10.7554/eLife.03553.019</ext-link></p></caption><graphic xlink:href="elife03553fs011"/></fig><fig id="fig5s2" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.03553.020</object-id><label>Figure 5—figure supplement 2.</label><caption><title>Functional comparison of the YBR194W locus in modified yeast clones.</title><p>(<bold>A</bold>) Immunoprecipitation of chromatin associated RNAP II or (<bold>B</bold>) Histone H3 from unmodified or YBR194W-snR6 chimeric yeast. Black bars show enrichment relative to input in the unmodified strain with the modified strain in red. An unrelated locus, YJL105W is shown as control. Data from three independent experiments. (<bold>C</bold>) mRNA levels of YBR194W shown relative to ACF1. 
Levels at the TAF10 gene are shown as a control. Data from three independent experiments.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.020">http://dx.doi.org/10.7554/eLife.03553.020</ext-link></p></caption><graphic xlink:href="elife03553fs012"/></fig></fig-group></p><p>In order to test the role of the RNA binding properties of AID in modulating its targeting preferences, we introduced a chimeric snR6 RNA into the RNAP II driven YBR194W gene, which was identified in our dataset as a transcribed promoter that is poorly targeted by both deaminases (<xref ref-type="fig" rid="fig5">Figure 5C</xref> top panels). Initiation and transcription of the modified locus remained overall unaffected (<xref ref-type="fig" rid="fig5s2">Figure 5—figure supplement 2</xref>), while comparison of the YBR194W promoter region by Sanger sequencing revealed enhanced mutation focussed to the immediate vicinity of the TSS by AID* but not sA3G*. No such focussing of mutations was observed in the unmodified yeast overexpressing AID* (<xref ref-type="fig" rid="fig5">Figure 5C</xref>).</p><p>We conclude that the differential preference of AID* for tRNA, snRNAs and snoRNAs in yeast might reflect the ability of AID to preferentially associate with abundant small RNA species, in contrast to the catalytic domain of APOBEC3G (sA3G*) that possesses no RNA binding activity.</p><p>Targeting mutations to initiating promoters is not likely a function of the size of the deaminase, as could be inferred from the results described for both AID and the single domain fragment of APOBEC3G used in our study. Similar promoter associated recurrent mutations can be elicited not only by APOBEC3A (also a single domain deaminase) but also by the double domain APOBEC3B (<xref ref-type="fig" rid="fig5s1">Figure 5—figure supplement 1A</xref>). 
It is therefore not entirely unexpected to observe enrichment of mutations at TpC (versus other dinucleotides) in association with promoter regions in a breast cancer genome that has the highest incidence of APOBEC3 kataegic mutations (<xref ref-type="fig" rid="fig5s1">Figure 5—figure supplement 1B</xref>), suggesting that the deaminases could access single stranded DNA at initiating or paused RNAPs also in mammalian cells.</p></sec></sec><sec id="s3" sec-type="discussion"><title>Discussion</title><p>The involvement of AID and APOBEC3A and 3B in cancer suggests that enzymatic deamination of genomic targets is an infrequent but recurrent consequence of the presence of the deaminases in vertebrates. Despite subcellular compartmentalisation, specific targeting and restricted expression limiting AID off-target activity, some genomic regions other than the natural target, the immunoglobulin loci, are predisposed to mutation. BCL6, PIM1 and MYC are recurrent off-targets of AID mutation in B cell malignancies (<xref ref-type="bibr" rid="bib43">Pasqualucci et al., 2001</xref>); in the case of BCL6 it is estimated that AID induced mutations are also prevalent in non-transformed B cells at just 10<sup>3</sup>-fold lower frequency than at immunoglobulin genes (<xref ref-type="bibr" rid="bib37">Liu et al., 2008</xref>) and even in the absence of the mutator phenotypes attributable to malignant transformation, normal B cells frequently show AID induced translocations at the MYC locus (<xref ref-type="bibr" rid="bib50">Roschke et al., 1997</xref>; <xref ref-type="bibr" rid="bib8">Casellas et al., 2009</xref>). In cancer genomes, the association of APOBEC mutations with genomic rearrangements suggests that replication stress, persistent DNA lesions and incomplete repair expose single stranded DNA that becomes a substrate for deaminases leading to clustered mutations. 
It is unclear how APOBEC3A and 3B gain access to single stranded DNA leading to the singlet isolated mutations highly prevalent in mutated cancer genomes that bear the APOBEC signature (<xref ref-type="bibr" rid="bib53">Taylor et al., 2013</xref>). It is therefore important to understand the genomic context that facilitates off-target activity of the deaminases in the absence of explicit DNA damage.</p><p>Expression of AID and other APOBEC proteins in yeast faithfully recapitulates the signature of mutations observed in mammalian cells in a smaller genome with no background mutations due to unrelated processes, such as DNA repair (<xref ref-type="bibr" rid="bib33">Lada et al., 2013</xref>; <xref ref-type="bibr" rid="bib53">Taylor et al., 2013</xref>). In this study we demonstrate the non-random nature of the mutations induced by the deaminases, which is remarkably focussed to just 1.5% of the yeast genome but nonetheless overlaps more than half of the active promoters. AID is known to interact with components of the transcription machinery in mammalian cells (reviewed in <xref ref-type="bibr" rid="bib27">Kenter, 2012</xref>). However, the overlap between highly mutated promoters by both AID and APOBEC3G suggests that rather than conservation of protein–protein interactions of the deaminases with the transcription complex, properties of the promoter itself can determine targeting.</p><p>Enhanced targeting of RNAP III transcribed genes argues against active recruitment of the deaminases by conserved initiation factors, whereas the structural conservation of the DNA template conformation at the core pre-initiation complex of all polymerases (<xref ref-type="bibr" rid="bib55">Vannini and Cramer, 2012</xref>) supports the idea that the conformation of the DNA template is the common element in the recruitment of the deaminases. 
Indeed, the site of polymerase loading (within the body of the tRNA genes) rather than the TSS is the preferred target of deamination in the case of the RNAP III transcribed tRNAs, in contrast to the 5′ region of the RNAP III transcribed <italic>SNR52</italic> snoRNA, where the loading of the RNAP is fixed at the 5′ promoter region. Furthermore, the high density of mutations focussed to the small region between the TATA binding protein (TBP) site and the transcription start site (TSS) more precisely identifies the pre-initiation complex (PIC) as the target for the deaminases.</p><p>Budding yeast RNAP II promoters show characteristic and highly regulated nucleosome exclusion. This is partly due to sequence composition, with regions enriched for poly dA•dT nucleotides that confer rigidity to the DNA and are therefore thermodynamically less favourable to wrap around nucleosomes (<xref ref-type="bibr" rid="bib63">Yuan et al., 2005</xref>), and partly due to the regulated and precise positioning of the +1 nucleosome relative to the TSS that includes specific histone variants (H2A.Z and H3.3) that promote chromatin accessibility (reviewed in <xref ref-type="bibr" rid="bib26">Jiang and Pugh, 2009</xref>). Therefore it is highly significant that other nucleosome free regions, such as ARS, are not targeted by the deaminases, despite undergoing DNA melting during the initiation of replication. This reinforces our interpretation that intrinsic properties of active promoters, in particular the configuration associated with loading of the polymerase at the pre-initiation complex (open pre-initiation complex) (<xref ref-type="bibr" rid="bib19">Grünberg et al., 2012</xref>), are sufficient to generate persistent single stranded DNA accessible for deamination. Our data supports the presence of such open PICs in most yeast active promoters.</p><p>Neither the preferential targeting of promoters nor the narrow focus of the MELs is due to the preferential clustering of mutable motifs. 
Interestingly, the nature of the mutation hotspots within MELs (both at C and G) reveals that both strands of the melted DNA structure associated with active promoters are accessible. Furthermore, protection from mutation is evident at the TBP binding site while the peak of mutations ∼30 base pairs downstream identifies the site of RNAP loading and DNA melting mapped by permanganate footprinting (<xref ref-type="bibr" rid="bib16">Giardina and Lis, 1993</xref>) and high resolution ChIP (<xref ref-type="bibr" rid="bib47">Rhee and Pugh, 2012</xref>). Our deaminase footprinting data further confirms the persistent open configuration and single stranded nature of this region, potentially identifying open pre-initiation promoters.</p><p>Differences in the assembly of the PIC in TATA and TATA-like promoters do not seem to affect mutation susceptibility, although predictably, TATA box promoters show a more defined distance between the TBP protected footprint and the accessible melted DNA (<xref ref-type="fig" rid="fig3s2">Figure 3—figure supplement 2</xref>), indicating that it is the structure of the single stranded DNA rather than the assembly (SAGA or TFIID dependent) of the transcription initiation complex itself that determines targeting (<xref ref-type="bibr" rid="bib47">Rhee and Pugh, 2012</xref>).</p><p>Up to 75% of human promoters in different cell types are occupied by a pre-initiating form of RNAP II (<xref ref-type="bibr" rid="bib20">Guenther et al., 2007</xref>), whereas pausing and stalling are much more common in metazoan transcription than in <italic>Saccharomyces cerevisiae</italic>. Mammalian promoters are frequently regulated by proximal pausing, with most promoters pausing within 200 base pairs of the TSS (<xref ref-type="bibr" rid="bib1">Adelman and Lis, 2012</xref>). 
In the presence of a deaminase, initiating and/or paused sites would become accessible for mutation; it is thus intriguing to observe promoter proximal enrichment of mutations at TpC dinucleotides in PD4120a, a breast cancer genome with dramatic accumulation of kataegis that betrays its mutagenesis by APOBEC3B (<xref ref-type="bibr" rid="bib41">Nik-Zainal et al., 2012</xref>). Our data favours the idea that accessibility of single stranded DNA at RNAP II stalled sites suffices to recruit APOBECs or indeed AID. This model offers an explanation for the association of AID with mammalian SPT5, which functions in modulating the pausing of RNAP II during elongation as transcription stalls, and is consistent with the recurrent targeting by AID of the promoter proximal region of MYC (<xref ref-type="bibr" rid="bib12">Duquette et al., 2005</xref>), a well characterised promoter-proximal pausing regulated gene (<xref ref-type="bibr" rid="bib31">Krumm et al., 1992</xref>; <xref ref-type="bibr" rid="bib52">Strobl and Eick, 1992</xref>).</p><p>The correlation between high transcription rates and enhanced deaminase targeting reinforces the hypothesis that repeated loading of the pre-initiation complex leads to the persistence of a small region of melted DNA that is very efficiently targeted by the deaminases. Indeed, the enhanced targeting of tRNA, snoRNA and snRNA genes could reflect the high transcription rates of these essential RNAs given that RNAP I and III transcripts constitute almost 80% of the total nuclear gene expression in dividing cells (<xref ref-type="bibr" rid="bib54">Vannini, 2013</xref>). 
The unexpected finding that tRNAs are disproportionately targeted for mutation by AID compared with APOBEC3G, as are the promoters of other highly structured RNAs (snRNA or snoRNA), and the indication that this difference is not due to motif enrichment at those promoters, brings into focus the potential involvement of the RNA binding properties of the deaminases in promoting targeting. While APOBEC3G has been shown to bind not only HIV RNA but also cellular RNAs, including abundant 7S RNA (<xref ref-type="bibr" rid="bib24">Huthoff et al., 2009</xref>), this ability is dependent on the N-terminal domain. Mutation targeting of the RNAP initiation complex is not linked to the ability of the deaminases to bind RNA per se, as the catalytic C-terminal domain of APOBEC3G in this study is inert regarding RNA binding. Notably, our results show that AID binds structured RNAs in vitro (such as tRNAs), and preferentially targets tRNAs and other small RNA promoters for mutation in yeast, prompting the speculation that binding to abundant RNAs sequesters AID to subnuclear localities such as nucleolar areas, where small RNA genes also localise during transcription. Indeed, nucleolar localisation of overexpressed AID has been reported in mammalian cells, although its significance under physiological levels remains to be tested (<xref ref-type="bibr" rid="bib23">Hu et al., 2013</xref>). Alternatively, preferential recognition of particular RNA structures such as folded tRNAs could determine the recruitment of AID to genomic regions.</p><p>In conclusion, our study uncovers the remarkable preference of mammalian cytidine deaminases to mutate active promoters when expressed in yeast, a preference blind to the type of RNA polymerase (both RNAP II and III genes are targets) and not ascribable to sequence context or targeting by specific cofactors. 
The precise and narrow location of the recurrent mutations pinpoints the site where the RNAP pre-initiation complex is loaded, highlighting the conservation of the TBP (TATA binding protein) site and the formation of the pre-initiation complex, whereas exclusion of mutations from the TBP site confirms the poised nature of active yeast promoters.</p><p>These results suggest that initiating polymerases create a small but persistent accessible patch of single stranded DNA in vivo, which has high affinity for deaminases and where both strands are accessible for mutation. They also strongly support the notion that AID might directly bind to single stranded DNA at the pre-initiating or stalling RNAP sites without a requirement for specific cofactors and that its targeting is modulated by its ability to interact with structured RNA species.</p></sec><sec id="s4" sec-type="materials|methods"><title>Materials and methods</title><sec id="s4-1"><title>Yeast transformants</title><p>Yeast strain BY4743 <italic>ung</italic>Δ<italic>/ungΔ</italic> was generated by crossing BY4741 <italic>ungΔ</italic> (MATa; <italic>his3Δ1</italic>; <italic>leu2Δ0</italic>; <italic>met15Δ0</italic>; <italic>ura3Δ0</italic>) obtained from the Euroscarf deletion collection (Frankfurt, Germany) with the BY4742 <italic>ungΔ</italic> strain. BY4742 <italic>ungΔ</italic> was generated by removal of the <italic>UNG1</italic> open reading frame by homologous recombination in the parental BY4742 strain, using a PCR generated <italic>URA3</italic> cassette flanked by 57-bp 5′ and 51-bp 3′ homology arms that include adaptamers for post integration removal of the <italic>URA3</italic> selection cassette (<xref ref-type="bibr" rid="bib46">Reid et al., 2002</xref>). The YBR194W-snR6 chimeric strain was generated by inserting a URA3 cassette at the 5′ end of the YBR194W gene in BY4741 <italic>ungΔ</italic> cells. 
Homology arms and the snR6 gene were amplified from genomic DNA using the primers (1) 5′-CCTGCCACTTTCAAAAGGCG-3′ and 5′-CGAAGGGTTACTTCGCGAACTCCTGTCCCTATTACATATTCAACC-3′, (2) 5′-GGTTGAATATGTAATAGGGACAGGAGTTCGCGAAGTAACCCTTCG-3′ and 5′-GCCAGGCATGCTAATGGCAAAACGAAATAAATCTCTTTGTAAAAC-3′, (3) 5′-GTTTTACAAAGAGATTTATTTCGTTTTGCCATTAGCATGCCTGGC-3′ and 5′-TGGTGGTCATATGCTCGGTG-3′. A PCR fusion of all three fragments with the first and last primer was used to retarget the URA3 containing locus. 5-Fluoroorotic acid counter-selection was used to isolate targeted colonies that were then mated with BY4742 <italic>ungΔ</italic> to generate the final BY4743 <italic>ungΔ/ungΔ</italic> YBR194W-snR6/YBR194W strain. Correct integration of all targeting constructs was confirmed by PCR.</p><p>Yeast transformation and selection, genomic DNA extraction and mutation frequency calculation were performed as described previously (<xref ref-type="bibr" rid="bib53">Taylor et al., 2013</xref>). Control and AID* expression vectors were as described previously (<xref ref-type="bibr" rid="bib53">Taylor et al., 2013</xref>). 
The sA3G* vector was generated by PCR amplification of the C-terminal domain of A3G* fused with a 5′ SV40 nuclear localisation sequence and FLAG tag using primers 5′-GCAAGCTTGCCACCATGCCTAAAAAGAAGCGTAAAGTCGAGATTCTCAGACACTCG-3′ and 5′-CCAGAATCAGGAAAACGGAGCAGACTACAAGGACGATGACGACAAGTAGCTCGAGGC-3′ and ligating the resultant Hind III-Xho I fragment into the pRS426-GAL1pr-tADHpolyA vector described previously (<xref ref-type="bibr" rid="bib53">Taylor et al., 2013</xref>).</p><p>Ethyl methanesulfonate (EMS) mutagenesis was performed by culturing BY4743 <italic>ungΔ/ungΔ</italic> yeast overnight in YEPD with 0.2% EMS, after which cells were washed in 5% sodium thiosulfate and plated for viability and canavanine resistance as above.</p></sec><sec id="s4-2"><title>Sample preparation and DNA sequencing</title><p>DNA libraries were generated using the multiplexing Nextera DNA Sample Prep Kit (Illumina, Little Chesterford, UK) according to the manufacturer's instructions. The libraries were sequenced by BGI (Beijing, China). The de-multiplexed sequence reads were aligned to the reference yeast genome (SacCer_Apr2011/sacCer3) using BWA-MEM (<xref ref-type="bibr" rid="bib36">Li and Durbin, 2009</xref>). Optical duplicates were removed using Picard (<ext-link ext-link-type="uri" xlink:href="http://picard.sourceforge.net/">http://picard.sourceforge.net</ext-link>) and only uniquely mapped paired reads were retained. On average 43-fold sequence coverage was achieved for each yeast genome. 
Unprocessed sequence reads for this study have been deposited at the EMBL-EBI European Nucleotide Archive, study accession number PRJEB7456 (<ext-link ext-link-type="uri" xlink:href="http://www.ebi.ac.uk/ena/data/view/PRJEB7456">http://www.ebi.ac.uk/ena/data/view/PRJEB7456</ext-link>).</p></sec><sec id="s4-3"><title>Data analysis</title><sec id="s4-3-1"><title>Mutation calling</title><p>An in-house pipeline for mutation calling was used, in which GATK base quality score recalibration and indel realignment (<xref ref-type="bibr" rid="bib38">McKenna et al., 2010</xref>) were performed prior to somatic mutation calling by SomaticSniper (<xref ref-type="bibr" rid="bib34">Larson et al., 2012</xref>) using the parental BY4743 genome as the reference. High confidence single nucleotide variations (SNVs) were filtered using the following criteria: (1) SomaticSniper score >50, (2) allele frequency ≥0.3, (3) reference or sample read count ≥4, (4) average position as fraction on reads ≥0.1, (5) average distance to 3′ end ≥0.1, (6) average base quality ≥30, (7) average read length >50 bp.</p></sec><sec id="s4-3-2"><title>Mutation enriched loci (MEL) identification</title><p>Within each data set, mutations were pooled and the number of mutations within each 150 base pair window was counted. Based on the assumption of a random distribution of mutations amongst the fragments, a binomial distribution was determined using the following parameters: size equal to the average number of mutations per clone and probability equal to the average number of mutations per clone over the total number of mutable motifs. Mutable motifs were the total number of WRC, YCC, or C bases for AID*, sA3G* and EMS respectively. The 99<sup>th</sup> percentile was used as a threshold to identify significantly mutated windows and adjacent windows were merged. 
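As an illustration only, the window-thresholding step described above can be sketched in Python. This is not the Bioconductor code used for the analyses (the scripts are provided as Supplementary file 4), and it rests on several assumptions: the function names are hypothetical, positions are treated as 0-based coordinates on a single chromosome, a window is flagged when its mutation count strictly exceeds the binomial quantile, and the binomial size and probability are supplied by the caller rather than derived from the data as in the text.
<preformat>
```python
from math import comb


def binomial_quantile(size, prob, q=0.99):
    """Smallest k whose cumulative binomial probability reaches q."""
    cumulative = 0.0
    for k in range(size + 1):
        cumulative += comb(size, k) * prob**k * (1 - prob) ** (size - k)
        if cumulative >= q:
            return k
    return size


def count_per_window(positions, window=150):
    """Count mutations in fixed-width windows, keyed by window index."""
    counts = {}
    for pos in positions:
        index = pos // window
        counts[index] = counts.get(index, 0) + 1
    return counts


def enriched_windows(positions, size, prob, window=150, q=0.99):
    """Merged (start, end) spans of windows whose count exceeds the threshold."""
    threshold = binomial_quantile(size, prob, q)
    counts = count_per_window(positions, window)
    hits = sorted(i for i, n in counts.items() if n > threshold)
    spans = []
    for i in hits:
        start, end = i * window, (i + 1) * window
        if spans and spans[-1][1] == start:
            # Adjacent significant windows are merged into one span.
            spans[-1] = (spans[-1][0], end)
        else:
            spans.append((start, end))
    return spans


# Hypothetical toy input: three dense windows and one sparse window.
positions = [5, 20, 60, 90, 140, 150, 160, 170, 180, 290,
             760, 800, 1500, 1510, 1520, 1530, 1540]
print(enriched_windows(positions, size=10, prob=0.1))
# prints [(0, 300), (1500, 1650)]
```
</preformat>
In this toy example the 99th percentile of a binomial distribution with size 10 and probability 0.1 is 4, so only windows holding five or more mutations are retained, and the two adjacent windows at the start are merged into a single span. 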
To refine the span of each individual mutation enriched locus (MEL), unmutated residues and residues falling in the following categories were removed and the window size adjusted: bases that had a count below the 25<sup>th</sup> percentile of all the counts in the window; bases which had a mutation count below four standard deviations from the average for the window; and all bases with only a single detected mutation (where the median mutation count was above one). A final threshold was applied so that only regions with more than five mutations derived from at least four independent transformants were assigned as high confidence MELs. All MELs were manually assessed using a genome browser and are shown in <xref ref-type="supplementary-material" rid="SD3-data">Supplementary file 3</xref>.</p><p>The averaged fraction of overlapping regions for simulated MEL datasets was determined by 1000 cycles of bootstrap analysis using an equivalent number of randomised fragments of identical sizes for each dataset, distributed across the genome.</p></sec><sec id="s4-3-3"><title>Normalised mutation density</title><p>The normalised mutation density was calculated by dividing the mutation count for each residue by the total number of mutations for the dataset.</p></sec><sec id="s4-3-4"><title>RNAP enrichment</title><p>ChIP enrichment was determined by taking the sum of the ChIP enrichment scores (<xref ref-type="bibr" rid="bib29">Kim et al., 2010</xref>) for each promoter fragment (defined as 550 bp upstream and 50 bp downstream from the TSS [<xref ref-type="bibr" rid="bib47">Rhee and Pugh, 2012</xref>]). Promoters were then grouped according to the transcription rate (<xref ref-type="bibr" rid="bib15">García-Martínez et al., 2004</xref>) or whether they contained a MEL.</p></sec><sec id="s4-3-5"><title>Average mutation frequency for mRNA, tRNA, snoRNA and snRNA promoters</title><p>Promoter fragments for mRNA genes and transcription rate binning were performed as above. 
tRNA gene promoter fragments were defined as a 550 bp fragment centred on the middle of the tRNA gene. snoRNA and snRNA promoters were defined as 550 bp upstream and 50 bp downstream from the TSS defined in the <italic>Saccharomyces</italic> Genome Database (<xref ref-type="bibr" rid="bib10">Cherry et al., 2012</xref>). Intronic snoRNA genes were assigned the mRNA promoter and polycistronic snoRNA genes were assigned only one promoter. Mutation frequency was calculated by first randomly down-sampling the datasets to half the size of the EMS dataset, to allow equivalent numbers of mutations to be compared. The number of mutations occurring on each promoter was then calculated. The process was bootstrapped 1000 times to give a directly comparable average number of mutations for each promoter.</p></sec><sec id="s4-3-6"><title>rDNA mapping</title><p>To detect mutations at the repetitive rDNA locus a less stringent algorithm was used. De-multiplexed sequence reads were aligned as before and unmapped reads removed. Reads mapping to the rDNA region (chrXII:434839-508289) were extracted and used for mutation calling by SomaticSniper. Mutations with a SomaticSniper score above 50, a read depth of 10 in both the reference and the sample and no evidence of the mutated base in the reference genome were assigned.</p><p>All analyses were performed using Bioconductor. Scripts are included as <xref ref-type="supplementary-material" rid="SD4-data">Supplementary file 4</xref>.</p></sec></sec><sec id="s4-4"><title>Immunoprecipitation</title><sec id="s4-4-1"><title>RNA binding</title><p>The tI(UAU)D RNA probes were generated by in vitro transcription (MegaShortScript T7 Kit, Life Technologies, Paisley, UK) with or without biotin-UTP (Life Technologies), according to the manufacturer's instructions. Free nucleotides were removed using Oligo Clean & Concentrator columns (Zymo, Irvine, CA, USA). 
The tI(UAU)D template was generated by annealing the following oligos 5′-AATTTAATACGACTCACTATAGGGCTCGTGTAGCTCAGTGGTTAGAGCTTCGTGCTTATAACG-3′ and 5′-TGCTCGAGGTGGGGTTTGAACCCACGACGGTCGCGTTATAAGCACGAAGCTCTAACC-3′. The pre-tI(UAU)D template was generated by PCR amplification from yeast genomic DNA using the following primers 5′-AATTTAATACGACTCACTATAGGGCTCGTGTAGCTCAGTGGTTAGAGC-3′ and 5′-TGCTCGAGGTGGGGTTTGAACCCACGACGG-3′. Biotinylation of total yeast RNA (Life Technologies), polyuridylic acid, polyadenylic acid-polyuridylic acid (Sigma–Aldrich, Gillingham, UK) and the tI(UAU)D probe was performed using the RNA 3′ End Biotinylation Kit (Pierce) according to the manufacturer's instructions.</p><p>Biotinylated RNA probes (3.6 μg) were refolded by heating to 80°C for 5 min in folding buffer (25 mM Tris pH 7.6, 100 mM KCl, 1 mM EDTA); MgCl<sub>2</sub> was then added to a final concentration of 20 mM and the RNA was allowed to cool slowly to 10°C before being bound to magnetic beads (Pierce, Loughborough, UK) for 1 hr at 4°C. Unbound probe was removed by washing with RNA buffer (25 mM Tris pH 7.6, 50 mM KCl, 5 mM NaCl, 1.5 mM MgCl<sub>2</sub>, 35 mM Glycine, 10% glycerol) supplemented with 0.5% Triton X-100. The integrity of the RNA was monitored by denaturing gel electrophoresis and staining with toluidine blue.</p><p>Clarified whole cell extracts (in RNA buffer supplemented with 0.3% Triton X-100 and complete protease inhibitors [Roche, Burgess Hill, UK]) from HEK 293 cells expressing Flag-AID, catalytically inactive AID (E58A mutation), APOBEC3G-Flag and the SV40-NLS tagged catalytic C-terminal domain of APOBEC3G (sA3G)-Flag, were incubated for 1 hr at 4°C in the presence of bead bound biotinylated RNA probes. 
Unbound proteins were removed by washing the beads four times in RNA buffer supplemented with 0.5% Triton X-100 at 4°C and the bound protein was monitored by western blot using anti-Flag antibodies (M2-HRP, Sigma–Aldrich).</p></sec><sec id="s4-4-2"><title>Chromatin immunoprecipitation</title><p>Overnight 60 ml yeast cultures fixed in 1% formaldehyde for 20 min and quenched in 0.125 M glycine (final) were washed twice in cold PBS and resuspended in RIPAlo (150 mM NaCl, 10 mM Tris–HCl pH 7.5, 1 mM EDTA, 1% Triton X-100, 0.1% SDS, 0.1% Sodium Deoxycholate, 1× Complete protease inhibitors) prior to lysis using an MPI TissueLyser (10 cycles of 30 s on, 5 min off, 4000 rpm), sonication using a Bioruptor (14 cycles of 30 s on, 30 s off, high intensity) and centrifugation (10 min 15,000×g). Equal amounts of clarified chromatin were incubated overnight at 4°C with 3 μg anti-HA 16B12 (Covance, Maidenhead, UK), 2 μg anti-H3 ab1791 (Abcam, Cambridge, UK), 2 μg anti-RNAPII S5P ab5131 (Abcam). Purification followed on Protein-G dynabeads for 2 hr with extensive washes (twice in RIPAlo, twice in RIPAhi [RIPAlo but for 500 mM NaCl], once in RIPA-LiCl [RIPAlo but 250 mM LiCl replacing NaCl] and twice in TE) and overnight elution in 25 mM Tris–HCl, 1 mM EDTA, pH to 9.8, 50 μg/ml proteinase K at 65°C. Input DNA was extracted using Gentra Puregene (Qiagen, Manchester, UK) with qPCR performed using QuantiFast SYBR kit (Qiagen), all as per the manufacturer's instructions. 
Primers used are: YBR019C; 5′-ATCCAGCACCACCTGTAACC-3′ and 5′-AAACTTCTTTGCGTCCATCC-3′, YBR020W; 5′-ACCTGAGTTCAATTCTAGCGC-3′ and 5′-TCCGGTTTAGCATCATAAGCG-3′, YNL067W; 5′-AACCAAACTCTAGCCTCCAA-3′ and 5′-TGCTGACAGTAACACCTTCTGG-3′, YBL003C; 5′-TGTGCACTCTACCAACTGGG-3′ and 5′-ATGTCCGGTGGTAAAGGTGG-3′, YPL250C; 5′-AGAGAGTTGCTCCAGACCCT-3′ and 5′-GCATAAAGAAGCGGCTCTGC-3′, YEL009C; 5′-GGGGGAGAGTAACCTGTGTT-3′ and 5′-TTTCGGCTCGCTGTCTTACC-3′, YBR194W; 5′-TCTTCTTGCTCGGGGTTCTC-3′ and 5′-TGCTGAAGGCCTTTGCAAAG-3′, YPL189W; 5′-GCGAAGATTACGGCACTCGA-3′ and 5′-ACAGGTACGGGCTATCTGGA-3′, YLR183C; 5′-ACATCTGCCACGACACATCA-3′ and 5′-TGGTGGAGAGTACGGATCCA-3′, YJL105W; 5′-TTTCTTGCTCTTGGCGGCTA-3′ and 5′-AGTTAGGATCTGAGCCGGGT-3′, YPR007C; 5′-ACAGGTTCGAGCTTCATGGG-3′ and 5′-CGGGAATTTCATCCAGCGGA-3′, chrXV;367475-367594 5′-ACTTGGCACTTCTTCCTCAACA-3′ and 5′-TCGCAAAGTTGGCTAACCGT-3′, chrX;585916-586020 5′-ATGTCTCCCTGTTACCCGGT-3′ and 5′-ACAGGTGCTGTCACAAAACA-3′, chrIV;76875-76955 5′-GGCAGCACCGAGAATGTTTT-3′ and 5′-GCTGTTAGCATATTGGGGGT-3′.</p></sec></sec><sec id="s4-5"><title>Yeast transcript analysis</title><p>RNA from 1 ml overnight cultures purified with RNeasy Plus (Qiagen) was used to generate cDNA using oligo-dTs and the GoScript Kit (Promega, Southampton, UK) followed by qPCR employing QuantiFast SYBR (Qiagen), all as per the manufacturer's instructions. 
Primers used are TAF10; 5′-ATATTCCAGGATCAGGTCTTCCGTAGC-3′ and 5′-GTAGTCTTCTCATTCTGTTGATGTTGTTGTTG-3′, ACT1; 5′-CTTTCAACGTTCCAGCCTTC-3′ and 5′-CCAGCGTAAATTGGAACGAC-3′, YBR194W-snR6; 5′-CCTGCCACTTTCAAAAGGCG-3′ and 5′-CAGGGGAACTGCTGATCATCTCTG-3′, YBR194W; 5′-GGGTCGTGAAAAAGAGAACGG-3′ and 5′-ATGTGATGGTGCAGTGCCTC-3′.</p></sec><sec id="s4-6"><title>YBR194W promoter sequencing</title><p>The YBR194W promoter region was amplified using the following primers; 5′-ATTGTGGCAGTTCGGCTTTG-3′ and 5′-AGGTTTCCCAGTCTGGCTTG-3′ and Sanger sequenced using the latter.</p></sec></sec></body><back><ack id="ack"><title>Acknowledgements</title><p>We are grateful to David Rueda and Myron Goodman for sharing unpublished results and members of the Rada lab for helpful advice and discussions. The late Michael Neuberger instigated the initial stages of this work and remains in memory an inspiration. This work was supported by the Medical Research Council (MRC reference number MC_U105178806) and through an MRC Centennial Award to BJMT.</p></ack><sec sec-type="additional-information"><title>Additional information</title><fn-group content-type="competing-interest"><title>Competing interests</title><fn fn-type="conflict" id="conf1"><p>The authors declare that no competing interests exist.</p></fn></fn-group><fn-group content-type="author-contribution"><title>Author contributions</title><fn fn-type="con" id="con1"><p>BJMT, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article</p></fn><fn fn-type="con" id="con2"><p>YLW, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article</p></fn><fn fn-type="con" id="con3"><p>CR, Conception and design, Analysis and interpretation of data, Drafting or revising the article</p></fn></fn-group></sec><sec sec-type="supplementary-material"><title>Additional files</title><supplementary-material id="SD1-data"><object-id 
pub-id-type="doi">10.7554/eLife.03553.021</object-id><label>Supplementary file 1.</label><caption><p>Catalogue of yeast mutations.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.021">http://dx.doi.org/10.7554/eLife.03553.021</ext-link></p></caption><media mime-subtype="txt" mimetype="text" xlink:href="elife03553s001.txt"/></supplementary-material><supplementary-material id="SD2-data"><object-id pub-id-type="doi">10.7554/eLife.03553.022</object-id><label>Supplementary file 2.</label><caption><p>Coordinates of MELs.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.022">http://dx.doi.org/10.7554/eLife.03553.022</ext-link></p></caption><media mime-subtype="txt" mimetype="text" xlink:href="elife03553s002.txt"/></supplementary-material><supplementary-material id="SD3-data"><object-id pub-id-type="doi">10.7554/eLife.03553.023</object-id><label>Supplementary file 3.</label><caption><p>All mutationally enriched regions (MELs). Top panel indicates the position of each non-clonal mutation as a dot (at C, black; at G, red), with horizontal lines representing a single genome. Middle panel shows MELs (AID*, green; sA3G*, purple; EMS, grey). Bottom panel displays genomic features (including transcripts, replication origins, centromeres), coloured according to feature type, with arrows indicating the direction of transcription. The coordinates of the region are indicated. 
Regions are ranked according to the number of mutations present.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.023">http://dx.doi.org/10.7554/eLife.03553.023</ext-link></p></caption><media mime-subtype="pdf" mimetype="application" xlink:href="elife03553s003.pdf"/></supplementary-material><supplementary-material id="SD4-data"><object-id pub-id-type="doi">10.7554/eLife.03553.024</object-id><label>Supplementary file 4.</label><caption><p>Scripts used for data analyses.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.03553.024">http://dx.doi.org/10.7554/eLife.03553.024</ext-link></p></caption><media mime-subtype="zip" mimetype="application" xlink:href="elife03553s004.zip"/></supplementary-material><sec sec-type="datasets"><title>Major datasets</title><p>The following dataset was generated:</p><p><related-object content-type="generated-dataset" source-id="http://www.ebi.ac.uk/ena/data/view/PRJEB7456" source-id-type="uri" id="dataro1"><collab collab-type="author">Taylor BJM</collab>, <collab collab-type="author">Wu YL</collab>, <collab collab-type="author">Rada C</collab>, <year>2014</year><x>, </x><source>Data from: Active RNAP pre-initiation sites are highly mutated by cytidine deaminases in yeast, with AID targeting small RNA genes</source><x>, </x><object-id pub-id-type="art-access-id">PRJEB7456</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://www.ebi.ac.uk/ena/data/view/PRJEB7456">http://www.ebi.ac.uk/ena/data/view/PRJEB7456</ext-link><x>, </x><comment>Publicly available at EMBL-EBI European Nucleotide Archive (<ext-link ext-link-type="uri" xlink:href="http://www.ebi.ac.uk">http://www.ebi.ac.uk</ext-link>).</comment></related-object></p><p>The following previously published datasets were used:</p><p><related-object content-type="existing-dataset" source-id="http://dx.doi.org/10.1038/nature12477" source-id-type="uri" id="dataro2"><collab collab-type="author">Ludmil B 
Alexandrov</collab>, <collab collab-type="author">Serena Nik-Zainal</collab>, <collab collab-type="author">David C Wedge</collab>, <collab collab-type="author">Samuel A Aparicio</collab>, <collab collab-type="author">Sam Behjati</collab>, <collab collab-type="author">Andrew V Biankin</collab>, <collab collab-type="author">Graham R Bignell</collab>, <collab collab-type="author">Niccolò Bolli</collab>, <collab collab-type="author">Ake Borg</collab>, <collab collab-type="author">Anne-Lise Børresen-Dale</collab>, <collab collab-type="author">Sandrine Boyault</collab>, <collab collab-type="author">Birgit Burkhardt</collab>, <collab collab-type="author">Adam P Butler</collab>, <collab collab-type="author">Carlos Caldas</collab>, <collab collab-type="author">Helen R Davies</collab>, <collab collab-type="author">Christine Desmedt</collab>, <collab collab-type="author">Roland Eils</collab>, <collab collab-type="author">Jórunn Erla Eyfjörd</collab>, <collab collab-type="author">John A Foekens</collab>, <collab collab-type="author">Mel Greaves</collab>, <collab collab-type="author">Fumie Hosoda</collab>, <collab collab-type="author">Barbara Hutter</collab>, <collab collab-type="author">Tomislav Ilicic</collab>, <collab collab-type="author">Sandrine Imbeaud</collab>, <collab collab-type="author">Marcin Imielinski</collab>, <collab collab-type="author">Natalie Jäger</collab>, <collab collab-type="author">David T Jones</collab>, <collab collab-type="author">David Jones</collab>, <collab collab-type="author">Stian Knappskog</collab>, <collab collab-type="author">Marcel Kool</collab>, <collab collab-type="author">Sunil R Lakhani</collab>, <collab collab-type="author">Carlos López-Otín</collab>, <collab collab-type="author">Sancha Martin</collab>, <collab collab-type="author">Nikhil C Munshi</collab>, <collab collab-type="author">Hiromi Nakamura</collab>, <collab collab-type="author">Paul A Northcott</collab>, <collab collab-type="author">Marina Pajic</collab>, <collab 
collab-type="author">Elli Papaemmanuil</collab>, <collab collab-type="author">Angelo Paradiso</collab>, <collab collab-type="author">John V Pearson</collab>, <collab collab-type="author">Xose S Puente</collab>, <collab collab-type="author">Keiran Raine</collab>, <collab collab-type="author">Manasa Ramakrishna</collab>, <collab collab-type="author">Andrea L Richardson</collab>, <collab collab-type="author">Julia Richter</collab>, <collab collab-type="author">Philip Rosenstiel</collab>, <collab collab-type="author">Matthias Schlesner</collab>, <collab collab-type="author">Ton N Schumacher</collab>, <collab collab-type="author">Paul N Span</collab>, <collab collab-type="author">Jon W Teague</collab>, <collab collab-type="author">Yasushi Totoki</collab>, <collab collab-type="author">Andrew N Tutt</collab>, <collab collab-type="author">Rafael Valdés-Mas</collab>, <collab collab-type="author">Marit M van Buuren</collab>, <collab collab-type="author">Laura van 't Veer</collab>, <collab collab-type="author">Anne Vincent-Salomon</collab>, <collab collab-type="author">Nicola Waddell</collab>, <collab collab-type="author">Lucy R Yates</collab>, <collab>Australian Pancreatic Cancer Genome Initiative, ICGC Breast Cancer Consortium, ICGC MMML-Seq Consortium, ICGC PedBrain</collab>, <collab collab-type="author">Jessica Zucman-Rossi</collab>, <collab collab-type="author">P Andrew Futreal</collab>, <collab collab-type="author">Ultan McDermott</collab>, <collab collab-type="author">Peter Lichter</collab>, <collab collab-type="author">Matthew Meyerson</collab>, <collab collab-type="author">Sean M Grimmond</collab>, <collab collab-type="author">Reiner Siebert</collab>, <collab collab-type="author">Elías Campo</collab>, <collab collab-type="author">Tatsuhiro Shibata</collab>, <collab collab-type="author">Stefan M Pfister</collab>, <collab collab-type="author">Peter J Campbell</collab> and <collab collab-type="author">Michael R Stratton</collab>, <year>2013</year><x>, </x><source>Data 
from: <ext-link ext-link-type="uri" xlink:href="ftp://ftp.sanger.ac.uk/pub/cancer/AlexandrovEtAl">ftp://ftp.sanger.ac.uk/pub/cancer/AlexandrovEtAl</ext-link></source><x>, </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nature12477">10.1038/nature12477</ext-link><x>, </x><comment>Publicly available.</comment></related-object></p><p><related-object content-type="existing-dataset" source-id="http://dx.doi.org/10.1016/j.molcel.2004.06.004" source-id-type="uri" id="dataro3"><collab collab-type="author">García-Martínez J</collab>, <collab collab-type="author">Aranda A</collab>, <collab collab-type="author">Pérez-Ortín JE</collab>, <year>2004</year><x>, </x><source>Data from: <ext-link ext-link-type="uri" xlink:href="http://scsie.uv.es/chipsdna/chipsdna-e.html#datos">http://scsie.uv.es/chipsdna/chipsdna-e.html#datos</ext-link></source><x>, </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1016/j.molcel.2004.06.004">10.1016/j.molcel.2004.06.004</ext-link><x>, </x><comment>Publicly available.</comment></related-object></p><p><related-object content-type="existing-dataset" source-id="http://dx.doi.org/10.1038/nsmb.1913" source-id-type="uri" id="dataro4"><collab collab-type="author">Kim H</collab>, <collab collab-type="author">Erickson B</collab>, <collab collab-type="author">Luo W</collab>, <collab collab-type="author">Seward D</collab>, <collab collab-type="author">Graber JH</collab>, <collab collab-type="author">Pollock DD</collab>, <collab collab-type="author">Megee PC</collab>, <collab collab-type="author">Bentley DL</collab>, <year>2010</year><x>, </x><source>Data from: <ext-link ext-link-type="uri" xlink:href="http://downloads.yeastgenome.org/published_datasets/Kim_2010_PMID_20835241/">http://downloads.yeastgenome.org/published_datasets/Kim_2010_PMID_20835241/</ext-link></source><x>, </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nsmb.1913">10.1038/nsmb.1913</ext-link><x>, </x><comment>Publicly 
available.</comment></related-object></p><p><related-object content-type="existing-dataset" source-id="http://dx.doi.org/10.1038/nature10799" source-id-type="uri" id="dataro5"><collab collab-type="author">Rhee HS</collab>, <collab collab-type="author">Pugh BF</collab>, <year>2012</year><x>, </x><source>Data from: <ext-link ext-link-type="uri" xlink:href="http://downloads.yeastgenome.org/published_datasets/Rhee_2012_PMID_22258509/">http://downloads.yeastgenome.org/published_datasets/Rhee_2012_PMID_22258509/</ext-link></source><x>, </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nature10799">10.1038/nature10799</ext-link><x>, </x><comment>Publicly available.</comment></related-object></p><p><related-object content-type="existing-dataset" source-id="http://dx.doi.org/10.1038/nature07728" source-id-type="uri" id="dataro6"><collab collab-type="author">Xu Z</collab>, <collab collab-type="author">Wei W</collab>, <collab collab-type="author">Gagneur J</collab>, <collab collab-type="author">Perocchi F</collab>, <collab collab-type="author">Clauder-Münster S</collab>, <collab collab-type="author">Camblong J</collab>, <collab collab-type="author">Guffanti E</collab>, <collab collab-type="author">Stutz F</collab>, <collab collab-type="author">Huber W</collab>, <collab collab-type="author">Steinmetz LM</collab>, <year>2009</year><x>, </x><source>Data from: <ext-link ext-link-type="uri" xlink:href="http://downloads.yeastgenome.org/published_datasets/Xu_2009_PMID_19169243/">http://downloads.yeastgenome.org/published_datasets/Xu_2009_PMID_19169243/</ext-link></source><x>, </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nature07728">10.1038/nature07728</ext-link><x>, </x><comment>Publicly available.</comment></related-object></p><p><related-object content-type="existing-dataset" source-id="http://dx.doi.org/10.1016/j.molcel.2011.01.015" source-id-type="uri" id="dataro7"><collab collab-type="author">Venters BJ</collab>, <collab 
collab-type="author">Wachi S</collab>, <collab collab-type="author">Mavrich TN</collab>, <collab collab-type="author">Andersen BE</collab>, <collab collab-type="author">Jena P</collab>, <collab collab-type="author">Sinnamon AJ</collab>, <collab collab-type="author">Jain P</collab>, <collab collab-type="author">Rolleri NS</collab>, <collab collab-type="author">Jiang C</collab>, <collab collab-type="author">Hemeryck‐Walsh C</collab>, <collab collab-type="author">Pugh BF</collab>, <year>2011</year><x>, </x><source>Data from: <ext-link ext-link-type="uri" xlink:href="http://downloads.yeastgenome.org/published_datasets/Venters_2011_PMID_21329885/">http://downloads.yeastgenome.org/published_datasets/Venters_2011_PMID_21329885/</ext-link></source><x>, </x><ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1016/j.molcel.2011.01.015">10.1016/j.molcel.2011.01.015</ext-link><x>, </x><comment>Publicly available.</comment></related-object></p></sec></sec><ref-list><title>References</title><ref id="bib1"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Adelman</surname><given-names>K</given-names></name><name><surname>Lis</surname><given-names>JT</given-names></name></person-group><year>2012</year><article-title>Promoter-proximal pausing of RNA polymerase II: emerging roles in metazoans</article-title><source>Nature Reviews Genetics</source><volume>13</volume><fpage>720</fpage><lpage>731</lpage><pub-id pub-id-type="doi">10.1038/nrg3293</pub-id></element-citation></ref><ref id="bib2"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Alexandrov</surname><given-names>LB</given-names></name><name><surname>Nik-Zainal</surname><given-names>S</given-names></name><name><surname>Wedge</surname><given-names>DC</given-names></name><name><surname>Aparicio</surname><given-names>SA</given-names></name><name><surname>Behjati</surname><given-names>S</given-names></name><name><surname>Biankin</surname><given-names>AV</given-names></name><name><surname>Bignell</surname><given-names>GR</given-names></name><name><surname>Bolli</surname><given-names>N</given-names></name><name><surname>Borg</surname><given-names>A</given-names></name><name><surname>Børresen-Dale</surname><given-names>AL</given-names></name><name><surname>Boyault</surname><given-names>S</given-names></name><name><surname>Burkhardt</surname><given-names>B</given-names></name><name><surname>Butler</surname><given-names>AP</given-names></name><name><surname>Caldas</surname><given-names>C</given-names></name><name><surname>Davies</surname><given-names>HR</given-names></name><name><surname>Desmedt</surname><given-names>C</given-names></name><name><surname>Eils</surname><given-names>R</given-names></name><name><surname>Eyfjörd</surname><given-names>JE</given-names></name><name><surname>Foekens</surname><given-names>JA</given-names></name><name><surname>Greaves</surname><given-names>M</given-names></name><name><surname>Hosoda</surname><given-names>F</given-names></name><name><surname>Hutter</surname><given-names>B</given-names></name><name><surname>Ilicic</surname><given-names>T</given-names></name><name><surname>Imbeaud</surname><given-names>S</given-names></name><name><surname>Imielinski</surname><given-names>M</given-names></name><name><surname>Jäger</surname><given-names>N</given-names></name><name><surname>Jones</surname><given-names>DT</given-names></name><name><surname>Jones</surname><given-names>D</given-names></name><name><surname>Knappskog</surname><given-names>S</given-names></name><name><surname>Kool</
surname><given-names>M</given-names></name><name><surname>Lakhani</surname><given-names>SR</given-names></name><name><surname>López-Otín</surname><given-names>C</given-names></name><name><surname>Martin</surname><given-names>S</given-names></name><name><surname>Munshi</surname><given-names>NC</given-names></name><name><surname>Nakamura</surname><given-names>H</given-names></name><name><surname>Northcott</surname><given-names>PA</given-names></name><name><surname>Pajic</surname><given-names>M</given-names></name><name><surname>Papaemmanuil</surname><given-names>E</given-names></name><name><surname>Paradiso</surname><given-names>A</given-names></name><name><surname>Pearson</surname><given-names>JV</given-names></name><name><surname>Puente</surname><given-names>XS</given-names></name><name><surname>Raine</surname><given-names>K</given-names></name><name><surname>Ramakrishna</surname><given-names>M</given-names></name><name><surname>Richardson</surname><given-names>AL</given-names></name><name><surname>Richter</surname><given-names>J</given-names></name><name><surname>Rosenstiel</surname><given-names>P</given-names></name><name><surname>Schlesner</surname><given-names>M</given-names></name><name><surname>Schumacher</surname><given-names>TN</given-names></name><name><surname>Span</surname><given-names>PN</given-names></name><name><surname>Teague</surname><given-names>JW</given-names></name><name><surname>Totoki</surname><given-names>Y</given-names></name><name><surname>Tutt</surname><given-names>AN</given-names></name><name><surname>Valdés-Mas</surname><given-names>R</given-names></name><name><surname>van Buuren</surname><given-names>MM</given-names></name><name><surname>van 't Veer</surname><given-names>L</given-names></name><name><surname>Vincent-Salomon</surname><given-names>A</given-names></name><name><surname>Waddell</surname><given-names>N</given-names></name><name><surname>Yates</surname><given-names>LR</given-names></name>, <collab>Australian Pancreatic Cancer 
Genome Initiative</collab>, <collab>ICGC Breast Cancer Consortium</collab>, <collab>ICGC MMML-Seq Consortium</collab>, <collab>ICGC PedBrain</collab><name><surname>Zucman-Rossi</surname><given-names>J</given-names></name><name><surname>Futreal</surname><given-names>PA</given-names></name><name><surname>McDermott</surname><given-names>U</given-names></name><name><surname>Lichter</surname><given-names>P</given-names></name><name><surname>Meyerson</surname><given-names>M</given-names></name><name><surname>Grimmond</surname><given-names>SM</given-names></name><name><surname>Siebert</surname><given-names>R</given-names></name><name><surname>Campo</surname><given-names>E</given-names></name><name><surname>Shibata</surname><given-names>T</given-names></name><name><surname>Pfister</surname><given-names>SM</given-names></name><name><surname>Campbell</surname><given-names>PJ</given-names></name><name><surname>Stratton</surname><given-names>MR</given-names></name></person-group><year>2013</year><article-title>Signatures of mutational processes in human cancer</article-title><source>Nature</source><volume>500</volume><fpage>415</fpage><lpage>421</lpage><pub-id pub-id-type="doi">10.1038/nature12477</pub-id></element-citation></ref><ref id="bib3"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Basu</surname><given-names>U</given-names></name><name><surname>Meng</surname><given-names>FL</given-names></name><name><surname>Keim</surname><given-names>C</given-names></name><name><surname>Grinstein</surname><given-names>V</given-names></name><name><surname>Pefanis</surname><given-names>E</given-names></name><name><surname>Eccleston</surname><given-names>J</given-names></name><name><surname>Zhang</surname><given-names>T</given-names></name><name><surname>Myers</surname><given-names>D</given-names></name><name><surname>Wasserman</surname><given-names>CR</given-names></name><name><surname>Wesemann</surname><given-names>DR</given-names></name><name><surname>Januszyk</surname><given-names>K</given-names></name><name><surname>Gregory</surname><given-names>RI</given-names></name><name><surname>Deng</surname><given-names>H</given-names></name><name><surname>Lima</surname><given-names>CD</given-names></name><name><surname>Alt</surname><given-names>FW</given-names></name></person-group><year>2011</year><article-title>The RNA exosome targets the AID cytidine deaminase to both strands of transcribed duplex DNA substrates</article-title><source>Cell</source><volume>144</volume><fpage>353</fpage><lpage>363</lpage><pub-id pub-id-type="doi">10.1016/j.cell.2011.01.001</pub-id></element-citation></ref><ref id="bib4"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Beale</surname><given-names>RC</given-names></name><name><surname>Petersen-Mahrt</surname><given-names>SK</given-names></name><name><surname>Watt</surname><given-names>IN</given-names></name><name><surname>Harris</surname><given-names>RS</given-names></name><name><surname>Rada</surname><given-names>C</given-names></name><name><surname>Neuberger</surname><given-names>MS</given-names></name></person-group><year>2004</year><article-title>Comparison of the differential context-dependence of DNA deamination by APOBEC enzymes: 
correlation with mutation spectra in vivo</article-title><source>Journal of Molecular Biology</source><volume>337</volume><fpage>585</fpage><lpage>596</lpage><pub-id pub-id-type="doi">10.1016/j.jmb.2004.01.046</pub-id></element-citation></ref><ref id="bib5"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bélanger</surname><given-names>K</given-names></name><name><surname>Savoie</surname><given-names>M</given-names></name><name><surname>Rosales Gerpe</surname><given-names>MC</given-names></name><name><surname>Couture</surname><given-names>JF</given-names></name><name><surname>Langlois</surname><given-names>MA</given-names></name></person-group><year>2013</year><article-title>Binding of RNA by APOBEC3G controls deamination-independent restriction of retroviruses</article-title><source>Nucleic Acids Research</source><volume>41</volume><fpage>7438</fpage><lpage>7452</lpage><pub-id pub-id-type="doi">10.1093/nar/gkt527</pub-id></element-citation></ref><ref id="bib6"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bransteitter</surname><given-names>R</given-names></name><name><surname>Pham</surname><given-names>P</given-names></name><name><surname>Scharff</surname><given-names>MD</given-names></name><name><surname>Goodman</surname><given-names>MF</given-names></name></person-group><year>2003</year><article-title>Activation-induced cytidine deaminase deaminates deoxycytidine on single-stranded DNA but requires the action of RNase</article-title><source>Proceedings of the National Academy of Sciences of USA</source><volume>100</volume><fpage>4102</fpage><lpage>4107</lpage><pub-id pub-id-type="doi">10.1073/pnas.0730835100</pub-id></element-citation></ref><ref id="bib7"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Burns</surname><given-names>MB</given-names></name><name><surname>Temiz</surname><given-names>NA</given-names></name><name><surname>Harris</surname><given-names>RS</given-names></name></person-group><year>2013</year><article-title>Evidence for APOBEC3B mutagenesis in multiple human cancers</article-title><source>Nature Genetics</source><volume>45</volume><fpage>977</fpage><lpage>983</lpage><pub-id pub-id-type="doi">10.1038/ng.2701</pub-id></element-citation></ref><ref id="bib8"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Casellas</surname><given-names>R</given-names></name><name><surname>Yamane</surname><given-names>A</given-names></name><name><surname>Kovalchuk</surname><given-names>AL</given-names></name><name><surname>Potter</surname><given-names>M</given-names></name></person-group><year>2009</year><article-title>Restricting activation-induced cytidine deaminase tumorigenic activity in B lymphocytes</article-title><source>Immunology</source><volume>126</volume><fpage>316</fpage><lpage>328</lpage><pub-id pub-id-type="doi">10.1111/j.1365-2567.2008.03050.x</pub-id></element-citation></ref><ref id="bib9"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Chaudhuri</surname><given-names>J</given-names></name><name><surname>Tian</surname><given-names>M</given-names></name><name><surname>Khuong</surname><given-names>C</given-names></name><name><surname>Chua</surname><given-names>K</given-names></name><name><surname>Pinaud</surname><given-names>E</given-names></name><name><surname>Alt</surname><given-names>FW</given-names></name></person-group><year>2003</year><article-title>Transcription-targeted DNA deamination by the AID antibody diversification enzyme</article-title><source>Nature</source><volume>422</volume><fpage>726</fpage><lpage>730</lpage><pub-id 
pub-id-type="doi">10.1038/nature01574</pub-id></element-citation></ref><ref id="bib10"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Cherry</surname><given-names>JM</given-names></name><name><surname>Hong</surname><given-names>EL</given-names></name><name><surname>Amundsen</surname><given-names>C</given-names></name><name><surname>Balakrishnan</surname><given-names>R</given-names></name><name><surname>Binkley</surname><given-names>G</given-names></name><name><surname>Chan</surname><given-names>ET</given-names></name><name><surname>Christie</surname><given-names>KR</given-names></name><name><surname>Costanzo</surname><given-names>MC</given-names></name><name><surname>Dwight</surname><given-names>SS</given-names></name><name><surname>Engel</surname><given-names>SR</given-names></name><name><surname>Fisk</surname><given-names>DG</given-names></name><name><surname>Hirschman</surname><given-names>JE</given-names></name><name><surname>Hitz</surname><given-names>BC</given-names></name><name><surname>Karra</surname><given-names>K</given-names></name><name><surname>Krieger</surname><given-names>CJ</given-names></name><name><surname>Miyasato</surname><given-names>SR</given-names></name><name><surname>Nash</surname><given-names>RS</given-names></name><name><surname>Park</surname><given-names>J</given-names></name><name><surname>Skrzypek</surname><given-names>MS</given-names></name><name><surname>Simison</surname><given-names>M</given-names></name><name><surname>Weng</surname><given-names>S</given-names></name><name><surname>Wong</surname><given-names>ED</given-names></name></person-group><year>2012</year><article-title>Saccharomyces Genome Database: the genomics resource of budding yeast</article-title><source>Nucleic Acids Research</source><volume>40</volume><fpage>D700</fpage><lpage>D705</lpage><pub-id pub-id-type="doi">10.1093/nar/gkr1029</pub-id></element-citation></ref><ref id="bib11"><element-citation 
publication-type="journal"><person-group person-group-type="author"><name><surname>Chiarle</surname><given-names>R</given-names></name><name><surname>Zhang</surname><given-names>Y</given-names></name><name><surname>Frock</surname><given-names>RL</given-names></name><name><surname>Lewis</surname><given-names>SM</given-names></name><name><surname>Molinie</surname><given-names>B</given-names></name><name><surname>Ho</surname><given-names>YJ</given-names></name><name><surname>Myers</surname><given-names>DR</given-names></name><name><surname>Choi</surname><given-names>VW</given-names></name><name><surname>Compagno</surname><given-names>M</given-names></name><name><surname>Malkin</surname><given-names>DJ</given-names></name><name><surname>Neuberg</surname><given-names>D</given-names></name><name><surname>Monti</surname><given-names>S</given-names></name><name><surname>Giallourakis</surname><given-names>CC</given-names></name><name><surname>Gostissa</surname><given-names>M</given-names></name><name><surname>Alt</surname><given-names>FW</given-names></name></person-group><year>2011</year><article-title>Genome-wide translocation sequencing reveals mechanisms of chromosome breaks and rearrangements in B cells</article-title><source>Cell</source><volume>147</volume><fpage>107</fpage><lpage>119</lpage><pub-id pub-id-type="doi">10.1016/j.cell.2011.07.049</pub-id></element-citation></ref><ref id="bib12"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Duquette</surname><given-names>ML</given-names></name><name><surname>Pham</surname><given-names>P</given-names></name><name><surname>Goodman</surname><given-names>MF</given-names></name><name><surname>Maizels</surname><given-names>N</given-names></name></person-group><year>2005</year><article-title>AID binds to transcription-induced structures in c-MYC that map to regions associated with translocation and 
hypermutation</article-title><source>Oncogene</source><volume>24</volume><fpage>5791</fpage><lpage>5798</lpage><pub-id pub-id-type="doi">10.1038/sj.onc.1208746</pub-id></element-citation></ref><ref id="bib13"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Formosa</surname><given-names>T</given-names></name></person-group><year>2013</year><article-title>The role of FACT in making and breaking nucleosomes</article-title><source>Biochimica Et Biophysica Acta</source><volume>1819</volume><fpage>247</fpage><lpage>255</lpage><pub-id pub-id-type="doi">10.1016/j.bbagrm.2011.07.009</pub-id></element-citation></ref><ref id="bib14"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Fritsch</surname><given-names>O</given-names></name><name><surname>Burkhalter</surname><given-names>MD</given-names></name><name><surname>Kais</surname><given-names>S</given-names></name><name><surname>Sogo</surname><given-names>JM</given-names></name><name><surname>Schär</surname><given-names>P</given-names></name></person-group><year>2010</year><article-title>DNA ligase 4 stabilizes the ribosomal DNA array upon fork collapse at the replication fork barrier</article-title><source>DNA Repair</source><volume>9</volume><fpage>879</fpage><lpage>888</lpage><pub-id pub-id-type="doi">10.1016/j.dnarep.2010.05.003</pub-id></element-citation></ref><ref id="bib15"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>García-Martínez</surname><given-names>J</given-names></name><name><surname>Aranda</surname><given-names>A</given-names></name><name><surname>Pérez-Ortín</surname><given-names>JE</given-names></name></person-group><year>2004</year><article-title>Genomic run-on evaluates transcription rates for all yeast genes and identifies gene regulatory mechanisms</article-title><source>Molecular 
Cell</source><volume>15</volume><fpage>303</fpage><lpage>313</lpage><pub-id pub-id-type="doi">10.1016/j.molcel.2004.06.004</pub-id></element-citation></ref><ref id="bib16"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Giardina</surname><given-names>C</given-names></name><name><surname>Lis</surname><given-names>JT</given-names></name></person-group><year>1993</year><article-title>DNA melting on yeast RNA polymerase II promoters</article-title><source>Science</source><volume>261</volume><fpage>759</fpage><lpage>762</lpage><pub-id pub-id-type="doi">10.1126/science.8342041</pub-id></element-citation></ref><ref id="bib17"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ginno</surname><given-names>PA</given-names></name><name><surname>Lott</surname><given-names>PL</given-names></name><name><surname>Christensen</surname><given-names>HC</given-names></name><name><surname>Korf</surname><given-names>I</given-names></name><name><surname>Chédin</surname><given-names>F</given-names></name></person-group><year>2012</year><article-title>R-loop formation is a distinctive characteristic of unmethylated human CpG island promoters</article-title><source>Molecular Cell</source><volume>45</volume><fpage>814</fpage><lpage>825</lpage><pub-id pub-id-type="doi">10.1016/j.molcel.2012.01.017</pub-id></element-citation></ref><ref id="bib18"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gómez-González</surname><given-names>B</given-names></name><name><surname>Aguilera</surname><given-names>A</given-names></name></person-group><year>2007</year><article-title>Activation-induced cytidine deaminase action is strongly stimulated by mutations of the THO complex</article-title><source>Proceedings of the National Academy of Sciences of USA</source><volume>104</volume><fpage>8409</fpage><lpage>8414</lpage><pub-id 
pub-id-type="doi">10.1073/pnas.0702836104</pub-id></element-citation></ref><ref id="bib19"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Grünberg</surname><given-names>S</given-names></name><name><surname>Warfield</surname><given-names>L</given-names></name><name><surname>Hahn</surname><given-names>S</given-names></name></person-group><year>2012</year><article-title>Architecture of the RNA polymerase II preinitiation complex and mechanism of ATP-dependent promoter opening</article-title><source>Nature Structural &amp; Molecular Biology</source><volume>19</volume><fpage>788</fpage><lpage>796</lpage><pub-id pub-id-type="doi">10.1038/nsmb.2334</pub-id></element-citation></ref><ref id="bib20"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Guenther</surname><given-names>MG</given-names></name><name><surname>Levine</surname><given-names>SS</given-names></name><name><surname>Boyer</surname><given-names>LA</given-names></name><name><surname>Jaenisch</surname><given-names>R</given-names></name><name><surname>Young</surname><given-names>RA</given-names></name></person-group><year>2007</year><article-title>A chromatin landmark and transcription initiation at most promoters in human cells</article-title><source>Cell</source><volume>130</volume><fpage>77</fpage><lpage>88</lpage><pub-id pub-id-type="doi">10.1016/j.cell.2007.05.042</pub-id></element-citation></ref><ref id="bib21"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Harris</surname><given-names>RS</given-names></name><name><surname>Liddament</surname><given-names>MT</given-names></name></person-group><year>2004</year><article-title>Retroviral restriction by APOBEC proteins</article-title><source>Nature Reviews Immunology</source><volume>4</volume><fpage>868</fpage><lpage>877</lpage><pub-id pub-id-type="doi">10.1038/nri1489</pub-id></element-citation></ref><ref 
id="bib22"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Houseley</surname><given-names>J</given-names></name><name><surname>Kotovic</surname><given-names>K</given-names></name><name><surname>El Hage</surname><given-names>A</given-names></name><name><surname>Tollervey</surname><given-names>D</given-names></name></person-group><year>2007</year><article-title>Trf4 targets ncRNAs from telomeric and rDNA spacer regions and functions in rDNA copy number control</article-title><source>The EMBO Journal</source><volume>26</volume><fpage>4996</fpage><lpage>5006</lpage><pub-id pub-id-type="doi">10.1038/sj.emboj.7601921</pub-id></element-citation></ref><ref id="bib23"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hu</surname><given-names>Y</given-names></name><name><surname>Ericsson</surname><given-names>I</given-names></name><name><surname>Torseth</surname><given-names>K</given-names></name><name><surname>Methot</surname><given-names>SP</given-names></name><name><surname>Sundheim</surname><given-names>O</given-names></name><name><surname>Liabakk</surname><given-names>NB</given-names></name><name><surname>Slupphaug</surname><given-names>G</given-names></name><name><surname>Di Noia</surname><given-names>JM</given-names></name><name><surname>Krokan</surname><given-names>HE</given-names></name><name><surname>Kavli</surname><given-names>B</given-names></name></person-group><year>2013</year><article-title>A combined nuclear and nucleolar localization motif in activation-induced cytidine deaminase (AID) controls immunoglobulin class switching</article-title><source>Journal of Molecular Biology</source><volume>425</volume><fpage>424</fpage><lpage>443</lpage><pub-id pub-id-type="doi">10.1016/j.jmb.2012.11.026</pub-id></element-citation></ref><ref id="bib24"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Huthoff</surname><given-names>H</given-names></name><name><surname>Autore</surname><given-names>F</given-names></name><name><surname>Gallois-Montbrun</surname><given-names>S</given-names></name><name><surname>Fraternali</surname><given-names>F</given-names></name><name><surname>Malim</surname><given-names>MH</given-names></name></person-group><year>2009</year><article-title>RNA-dependent oligomerization of APOBEC3G is required for restriction of HIV-1</article-title><source>PLOS Pathogens</source><volume>5</volume><fpage>e1000330</fpage><pub-id pub-id-type="doi">10.1371/journal.ppat.1000330</pub-id></element-citation></ref><ref id="bib25"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Jarmuz</surname><given-names>A</given-names></name><name><surname>Chester</surname><given-names>A</given-names></name><name><surname>Bayliss</surname><given-names>J</given-names></name><name><surname>Gisbourne</surname><given-names>J</given-names></name><name><surname>Dunham</surname><given-names>I</given-names></name><name><surname>Scott</surname><given-names>J</given-names></name><name><surname>Navaratnam</surname><given-names>N</given-names></name></person-group><year>2002</year><article-title>An anthropoid-specific locus of orphan C to U RNA-editing enzymes on chromosome 22</article-title><source>Genomics</source><volume>79</volume><fpage>285</fpage><lpage>296</lpage><pub-id pub-id-type="doi">10.1006/geno.2002.6718</pub-id></element-citation></ref><ref id="bib26"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Jiang</surname><given-names>C</given-names></name><name><surname>Pugh</surname><given-names>BF</given-names></name></person-group><year>2009</year><article-title>Nucleosome positioning and gene regulation: advances through genomics</article-title><source>Nature Reviews 
Genetics</source><volume>10</volume><fpage>161</fpage><lpage>172</lpage><pub-id pub-id-type="doi">10.1038/nrg2522</pub-id></element-citation></ref><ref id="bib27"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kenter</surname><given-names>AL</given-names></name></person-group><year>2012</year><article-title>AID targeting is dependent on RNA polymerase II pausing</article-title><source>Seminars in Immunology</source><volume>24</volume><fpage>281</fpage><lpage>286</lpage><pub-id pub-id-type="doi">10.1016/j.smim.2012.06.001</pub-id></element-citation></ref><ref id="bib28"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kijak</surname><given-names>GH</given-names></name><name><surname>Janini</surname><given-names>LM</given-names></name><name><surname>Tovanabutra</surname><given-names>S</given-names></name><name><surname>Sanders-Buell</surname><given-names>E</given-names></name><name><surname>Arroyo</surname><given-names>MA</given-names></name><name><surname>Robb</surname><given-names>ML</given-names></name><name><surname>Michael</surname><given-names>NL</given-names></name><name><surname>Birx</surname><given-names>DL</given-names></name><name><surname>McCutchan</surname><given-names>FE</given-names></name></person-group><year>2008</year><article-title>Variable contexts and levels of hypermutation in HIV-1 proviral genomes recovered from primary peripheral blood mononuclear cells</article-title><source>Virology</source><volume>376</volume><fpage>101</fpage><lpage>111</lpage><pub-id pub-id-type="doi">10.1016/j.virol.2008.03.017</pub-id></element-citation></ref><ref id="bib29"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Kim</surname><given-names>H</given-names></name><name><surname>Erickson</surname><given-names>B</given-names></name><name><surname>Luo</surname><given-names>W</given-names></name><name><surname>Seward</surname><given-names>D</given-names></name><name><surname>Graber</surname><given-names>JH</given-names></name><name><surname>Pollock</surname><given-names>DD</given-names></name><name><surname>Megee</surname><given-names>PC</given-names></name><name><surname>Bentley</surname><given-names>DL</given-names></name></person-group><year>2010</year><article-title>Gene-specific RNA polymerase II phosphorylation and the CTD code</article-title><source>Nature Structural &amp; Molecular Biology</source><volume>17</volume><fpage>1279</fpage><lpage>1286</lpage><pub-id pub-id-type="doi">10.1038/nsmb.1913</pub-id></element-citation></ref><ref id="bib30"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Klein</surname><given-names>IA</given-names></name><name><surname>Resch</surname><given-names>W</given-names></name><name><surname>Jankovic</surname><given-names>M</given-names></name><name><surname>Oliveira</surname><given-names>T</given-names></name><name><surname>Yamane</surname><given-names>A</given-names></name><name><surname>Nakahashi</surname><given-names>H</given-names></name><name><surname>Di Virgilio</surname><given-names>M</given-names></name><name><surname>Bothmer</surname><given-names>A</given-names></name><name><surname>Nussenzweig</surname><given-names>A</given-names></name><name><surname>Robbiani</surname><given-names>DF</given-names></name><name><surname>Casellas</surname><given-names>R</given-names></name><name><surname>Nussenzweig</surname><given-names>MC</given-names></name></person-group><year>2011</year><article-title>Translocation-capture sequencing reveals the extent and nature of chromosomal rearrangements in B 
lymphocytes</article-title><source>Cell</source><volume>147</volume><fpage>95</fpage><lpage>106</lpage><pub-id pub-id-type="doi">10.1016/j.cell.2011.07.048</pub-id></element-citation></ref><ref id="bib31"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Krumm</surname><given-names>A</given-names></name><name><surname>Meulia</surname><given-names>T</given-names></name><name><surname>Brunvand</surname><given-names>M</given-names></name><name><surname>Groudine</surname><given-names>M</given-names></name></person-group><year>1992</year><article-title>The block to transcriptional elongation within the human c-myc gene is determined in the promoter-proximal region</article-title><source>Genes &amp; Development</source><volume>6</volume><fpage>2201</fpage><lpage>2213</lpage><pub-id pub-id-type="doi">10.1101/gad.6.11.2201</pub-id></element-citation></ref><ref id="bib32"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kuong</surname><given-names>KJ</given-names></name><name><surname>Loeb</surname><given-names>LA</given-names></name></person-group><year>2013</year><article-title>APOBEC3B mutagenesis in cancer</article-title><source>Nature Genetics</source><volume>45</volume><fpage>964</fpage><lpage>965</lpage><pub-id pub-id-type="doi">10.1038/ng.2736</pub-id></element-citation></ref><ref id="bib33"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Lada</surname><given-names>AG</given-names></name><name><surname>Stepchenkova</surname><given-names>EI</given-names></name><name><surname>Waisertreiger</surname><given-names>IS</given-names></name><name><surname>Noskov</surname><given-names>VN</given-names></name><name><surname>Dhar</surname><given-names>A</given-names></name><name><surname>Eudy</surname><given-names>JD</given-names></name><name><surname>Boissy</surname><given-names>RJ</given-names></name><name><surname>Hirano</surname><given-names>M</given-names></name><name><surname>Rogozin</surname><given-names>IB</given-names></name><name><surname>Pavlov</surname><given-names>YI</given-names></name></person-group><year>2013</year><article-title>Genome-wide mutation avalanches induced in diploid yeast cells by a base analog or an APOBEC deaminase</article-title><source>PLOS Genetics</source><volume>9</volume><fpage>e1003736</fpage><pub-id pub-id-type="doi">10.1371/journal.pgen.1003736</pub-id></element-citation></ref><ref id="bib34"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Larson</surname><given-names>DE</given-names></name><name><surname>Harris</surname><given-names>CC</given-names></name><name><surname>Chen</surname><given-names>K</given-names></name><name><surname>Koboldt</surname><given-names>DC</given-names></name><name><surname>Abbott</surname><given-names>TE</given-names></name><name><surname>Dooling</surname><given-names>DJ</given-names></name><name><surname>Ley</surname><given-names>TJ</given-names></name><name><surname>Mardis</surname><given-names>ER</given-names></name><name><surname>Wilson</surname><given-names>RK</given-names></name><name><surname>Ding</surname><given-names>L</given-names></name></person-group><year>2012</year><article-title>SomaticSniper: identification of somatic point mutations in whole genome sequencing 
data</article-title><source>Bioinformatics</source><volume>28</volume><fpage>311</fpage><lpage>317</lpage><pub-id pub-id-type="doi">10.1093/bioinformatics/btr665</pub-id></element-citation></ref><ref id="bib35"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lawrence</surname><given-names>MS</given-names></name><name><surname>Stojanov</surname><given-names>P</given-names></name><name><surname>Polak</surname><given-names>P</given-names></name><name><surname>Kryukov</surname><given-names>GV</given-names></name><name><surname>Cibulskis</surname><given-names>K</given-names></name><name><surname>Sivachenko</surname><given-names>A</given-names></name><name><surname>Carter</surname><given-names>SL</given-names></name><name><surname>Stewart</surname><given-names>C</given-names></name><name><surname>Mermel</surname><given-names>CH</given-names></name><name><surname>Roberts</surname><given-names>SA</given-names></name><name><surname>Kiezun</surname><given-names>A</given-names></name><name><surname>Hammerman</surname><given-names>PS</given-names></name><name><surname>McKenna</surname><given-names>A</given-names></name><name><surname>Drier</surname><given-names>Y</given-names></name><name><surname>Zou</surname><given-names>L</given-names></name><name><surname>Ramos</surname><given-names>AH</given-names></name><name><surname>Pugh</surname><given-names>TJ</given-names></name><name><surname>Stransky</surname><given-names>N</given-names></name><name><surname>Helman</surname><given-names>E</given-names></name><name><surname>Kim</surname><given-names>J</given-names></name><name><surname>Sougnez</surname><given-names>C</given-names></name><name><surname>Ambrogio</surname><given-names>L</given-names></name><name><surname>Nickerson</surname><given-names>E</given-names></name><name><surname>Shefler</surname><given-names>E</given-names></name><name><surname>Cortés</surname><given-names>ML</given-names></name><name><surname>Auclair</surn
ame><given-names>D</given-names></name><name><surname>Saksena</surname><given-names>G</given-names></name><name><surname>Voet</surname><given-names>D</given-names></name><name><surname>Noble</surname><given-names>M</given-names></name><name><surname>DiCara</surname><given-names>D</given-names></name><name><surname>Lin</surname><given-names>P</given-names></name><name><surname>Lichtenstein</surname><given-names>L</given-names></name><name><surname>Heiman</surname><given-names>DI</given-names></name><name><surname>Fennell</surname><given-names>T</given-names></name><name><surname>Imielinski</surname><given-names>M</given-names></name><name><surname>Hernandez</surname><given-names>B</given-names></name><name><surname>Hodis</surname><given-names>E</given-names></name><name><surname>Baca</surname><given-names>S</given-names></name><name><surname>Dulak</surname><given-names>AM</given-names></name><name><surname>Lohr</surname><given-names>J</given-names></name><name><surname>Landau</surname><given-names>DA</given-names></name><name><surname>Wu</surname><given-names>CJ</given-names></name><name><surname>Melendez-Zajgla</surname><given-names>J</given-names></name><name><surname>Hidalgo-Miranda</surname><given-names>A</given-names></name><name><surname>Koren</surname><given-names>A</given-names></name><name><surname>McCarroll</surname><given-names>SA</given-names></name><name><surname>Mora</surname><given-names>J</given-names></name><name><surname>Lee</surname><given-names>RS</given-names></name><name><surname>Crompton</surname><given-names>B</given-names></name><name><surname>Onofrio</surname><given-names>R</given-names></name><name><surname>Parkin</surname><given-names>M</given-names></name><name><surname>Winckler</surname><given-names>W</given-names></name><name><surname>Ardlie</surname><given-names>K</given-names></name><name><surname>Gabriel</surname><given-names>SB</given-names></name><name><surname>Roberts</surname><given-names>CW</given-names></name><name><surname>Bie
gel</surname><given-names>JA</given-names></name><name><surname>Stegmaier</surname><given-names>K</given-names></name><name><surname>Bass</surname><given-names>AJ</given-names></name><name><surname>Garraway</surname><given-names>LA</given-names></name><name><surname>Meyerson</surname><given-names>M</given-names></name><name><surname>Golub</surname><given-names>TR</given-names></name><name><surname>Gordenin</surname><given-names>DA</given-names></name><name><surname>Sunyaev</surname><given-names>S</given-names></name><name><surname>Lander</surname><given-names>ES</given-names></name><name><surname>Getz</surname><given-names>G</given-names></name></person-group><year>2013</year><article-title>Mutational heterogeneity in cancer and the search for new cancer-associated genes</article-title><source>Nature</source><volume>499</volume><fpage>214</fpage><lpage>218</lpage><pub-id pub-id-type="doi">10.1038/nature12213</pub-id></element-citation></ref><ref id="bib36"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Li</surname><given-names>H</given-names></name><name><surname>Durbin</surname><given-names>R</given-names></name></person-group><year>2009</year><article-title>Fast and accurate short read alignment with Burrows-Wheeler transform</article-title><source>Bioinformatics</source><volume>25</volume><fpage>1754</fpage><lpage>1760</lpage><pub-id pub-id-type="doi">10.1093/bioinformatics/btp324</pub-id></element-citation></ref><ref id="bib37"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Liu</surname><given-names>M</given-names></name><name><surname>Duke</surname><given-names>JL</given-names></name><name><surname>Richter</surname><given-names>DJ</given-names></name><name><surname>Vinuesa</surname><given-names>CG</given-names></name><name><surname>Goodnow</surname><given-names>CC</given-names></name><name><surname>Kleinstein</surname><given-names>SH</given-names></name><name><surname>Schatz</surname><given-names>DG</given-names></name></person-group><year>2008</year><article-title>Two levels of protection for the B cell genome during somatic hypermutation</article-title><source>Nature</source><volume>451</volume><fpage>841</fpage><lpage>845</lpage><pub-id pub-id-type="doi">10.1038/nature06547</pub-id></element-citation></ref><ref id="bib38"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>McKenna</surname><given-names>A</given-names></name><name><surname>Hanna</surname><given-names>M</given-names></name><name><surname>Banks</surname><given-names>E</given-names></name><name><surname>Sivachenko</surname><given-names>A</given-names></name><name><surname>Cibulskis</surname><given-names>K</given-names></name><name><surname>Kernytsky</surname><given-names>A</given-names></name><name><surname>Garimella</surname><given-names>K</given-names></name><name><surname>Altshuler</surname><given-names>D</given-names></name><name><surname>Gabriel</surname><given-names>S</given-names></name><name><surname>Daly</surname><given-names>M</given-names></name><name><surname>DePristo</surname><given-names>MA</given-names></name></person-group><year>2010</year><article-title>The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data</article-title><source>Genome Research</source><volume>20</volume><fpage>1297</fpage><lpage>1303</lpage><pub-id pub-id-type="doi">10.1101/gr.107524.110</pub-id></element-citation></ref><ref 
id="bib39"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Moqtaderi</surname><given-names>Z</given-names></name><name><surname>Struhl</surname><given-names>K</given-names></name></person-group><year>2004</year><article-title>Genome-wide occupancy profile of the RNA polymerase III machinery in <italic>Saccharomyces cerevisiae</italic> reveals loci with incomplete transcription complexes</article-title><source>Molecular and Cellular Biology</source><volume>24</volume><fpage>4118</fpage><lpage>4127</lpage><pub-id pub-id-type="doi">10.1128/MCB.24.10.4118-4127.2004</pub-id></element-citation></ref><ref id="bib40"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Nambu</surname><given-names>Y</given-names></name><name><surname>Sugai</surname><given-names>M</given-names></name><name><surname>Gonda</surname><given-names>H</given-names></name><name><surname>Lee</surname><given-names>CG</given-names></name><name><surname>Katakai</surname><given-names>T</given-names></name><name><surname>Agata</surname><given-names>Y</given-names></name><name><surname>Yokota</surname><given-names>Y</given-names></name><name><surname>Shimizu</surname><given-names>A</given-names></name></person-group><year>2003</year><article-title>Transcription-coupled events associating with immunoglobulin switch region chromatin</article-title><source>Science</source><volume>302</volume><fpage>2137</fpage><lpage>2140</lpage><pub-id pub-id-type="doi">10.1126/science.1092481</pub-id></element-citation></ref><ref id="bib41"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Nik-Zainal</surname><given-names>S</given-names></name><name><surname>Alexandrov</surname><given-names>LB</given-names></name><name><surname>Wedge</surname><given-names>DC</given-names></name><name><surname>Van 
Loo</surname><given-names>P</given-names></name><name><surname>Greenman</surname><given-names>CD</given-names></name><name><surname>Raine</surname><given-names>K</given-names></name><name><surname>Jones</surname><given-names>D</given-names></name><name><surname>Hinton</surname><given-names>J</given-names></name><name><surname>Marshall</surname><given-names>J</given-names></name><name><surname>Stebbings</surname><given-names>LA</given-names></name><name><surname>Menzies</surname><given-names>A</given-names></name><name><surname>Martin</surname><given-names>S</given-names></name><name><surname>Leung</surname><given-names>K</given-names></name><name><surname>Chen</surname><given-names>L</given-names></name><name><surname>Leroy</surname><given-names>C</given-names></name><name><surname>Ramakrishna</surname><given-names>M</given-names></name><name><surname>Rance</surname><given-names>R</given-names></name><name><surname>Lau</surname><given-names>KW</given-names></name><name><surname>Mudie</surname><given-names>LJ</given-names></name><name><surname>Varela</surname><given-names>I</given-names></name><name><surname>McBride</surname><given-names>DJ</given-names></name><name><surname>Bignell</surname><given-names>GR</given-names></name><name><surname>Cooke</surname><given-names>SL</given-names></name><name><surname>Shlien</surname><given-names>A</given-names></name><name><surname>Gamble</surname><given-names>J</given-names></name><name><surname>Whitmore</surname><given-names>I</given-names></name><name><surname>Maddison</surname><given-names>M</given-names></name><name><surname>Tarpey</surname><given-names>PS</given-names></name><name><surname>Davies</surname><given-names>HR</given-names></name><name><surname>Papaemmanuil</surname><given-names>E</given-names></name><name><surname>Stephens</surname><given-names>PJ</given-names></name><name><surname>McLaren</surname><given-names>S</given-names></name><name><surname>Butler</surname><given-names>AP</given-names></name><name><surn
ame>Teague</surname><given-names>JW</given-names></name><name><surname>Jönsson</surname><given-names>G</given-names></name><name><surname>Garber</surname><given-names>JE</given-names></name><name><surname>Silver</surname><given-names>D</given-names></name><name><surname>Miron</surname><given-names>P</given-names></name><name><surname>Fatima</surname><given-names>A</given-names></name><name><surname>Boyault</surname><given-names>S</given-names></name><name><surname>Langerød</surname><given-names>A</given-names></name><name><surname>Tutt</surname><given-names>A</given-names></name><name><surname>Martens</surname><given-names>JW</given-names></name><name><surname>Aparicio</surname><given-names>SA</given-names></name><name><surname>Borg</surname><given-names>Å</given-names></name><name><surname>Salomon</surname><given-names>AV</given-names></name><name><surname>Thomas</surname><given-names>G</given-names></name><name><surname>Børresen-Dale</surname><given-names>AL</given-names></name><name><surname>Richardson</surname><given-names>AL</given-names></name><name><surname>Neuberger</surname><given-names>MS</given-names></name><name><surname>Futreal</surname><given-names>PA</given-names></name><name><surname>Campbell</surname><given-names>PJ</given-names></name><name><surname>Stratton</surname><given-names>MR</given-names></name>, <collab>Breast Cancer Working Group of the International Cancer Genome Consortium</collab></person-group><year>2012</year><article-title>Mutational processes molding the genomes of 21 breast cancers</article-title><source>Cell</source><volume>149</volume><fpage>979</fpage><lpage>993</lpage><pub-id pub-id-type="doi">10.1016/j.cell.2012.04.024</pub-id></element-citation></ref><ref id="bib42"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Okazaki</surname><given-names>IM</given-names></name><name><surname>Okawa</surname><given-names>K</given-names></name><name><surname>Kobayashi</surname><given-names>M</given-names></name><name><surname>Yoshikawa</surname><given-names>K</given-names></name><name><surname>Kawamoto</surname><given-names>S</given-names></name><name><surname>Nagaoka</surname><given-names>H</given-names></name><name><surname>Shinkura</surname><given-names>R</given-names></name><name><surname>Kitawaki</surname><given-names>Y</given-names></name><name><surname>Taniguchi</surname><given-names>H</given-names></name><name><surname>Natsume</surname><given-names>T</given-names></name><name><surname>Iemura</surname><given-names>S</given-names></name><name><surname>Honjo</surname><given-names>T</given-names></name></person-group><year>2011</year><article-title>Histone chaperone Spt6 is required for class switch recombination but not somatic hypermutation</article-title><source>Proceedings of the National Academy of Sciences of USA</source><volume>108</volume><fpage>7920</fpage><lpage>7925</lpage><pub-id pub-id-type="doi">10.1073/pnas.1104423108</pub-id></element-citation></ref><ref id="bib43"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Pasqualucci</surname><given-names>L</given-names></name><name><surname>Neumeister</surname><given-names>P</given-names></name><name><surname>Goossens</surname><given-names>T</given-names></name><name><surname>Nanjangud</surname><given-names>G</given-names></name><name><surname>Chaganti</surname><given-names>RS</given-names></name><name><surname>Küppers</surname><given-names>R</given-names></name><name><surname>Dalla-Favera</surname><given-names>R</given-names></name></person-group><year>2001</year><article-title>Hypermutation of multiple proto-oncogenes in B-cell diffuse large-cell 
lymphomas</article-title><source>Nature</source><volume>412</volume><fpage>341</fpage><lpage>346</lpage><pub-id pub-id-type="doi">10.1038/35085588</pub-id></element-citation></ref><ref id="bib44"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Pavri</surname><given-names>R</given-names></name><name><surname>Gazumyan</surname><given-names>A</given-names></name><name><surname>Jankovic</surname><given-names>M</given-names></name><name><surname>Di Virgilio</surname><given-names>M</given-names></name><name><surname>Klein</surname><given-names>I</given-names></name><name><surname>Ansarah-Sobrinho</surname><given-names>C</given-names></name><name><surname>Resch</surname><given-names>W</given-names></name><name><surname>Yamane</surname><given-names>A</given-names></name><name><surname>Reina San-Martin</surname><given-names>B</given-names></name><name><surname>Barreto</surname><given-names>V</given-names></name><name><surname>Nieland</surname><given-names>TJ</given-names></name><name><surname>Root</surname><given-names>DE</given-names></name><name><surname>Casellas</surname><given-names>R</given-names></name><name><surname>Nussenzweig</surname><given-names>MC</given-names></name></person-group><year>2010</year><article-title>Activation-induced cytidine deaminase targets DNA at sites of RNA polymerase II stalling by interaction with Spt5</article-title><source>Cell</source><volume>143</volume><fpage>122</fpage><lpage>133</lpage><pub-id pub-id-type="doi">10.1016/j.cell.2010.09.017</pub-id></element-citation></ref><ref id="bib45"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rada</surname><given-names>C</given-names></name><name><surname>Milstein</surname><given-names>C</given-names></name></person-group><year>2001</year><article-title>The intrinsic hypermutability of antibody heavy and light chain genes decays exponentially</article-title><source>The EMBO 
Journal</source><volume>20</volume><fpage>4570</fpage><lpage>4576</lpage><pub-id pub-id-type="doi">10.1093/emboj/20.16.4570</pub-id></element-citation></ref><ref id="bib46"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Reid</surname><given-names>RJD</given-names></name><name><surname>Sunjevaric</surname><given-names>I</given-names></name><name><surname>Kedacche</surname><given-names>M</given-names></name><name><surname>Rothstein</surname><given-names>R</given-names></name></person-group><year>2002</year><article-title>Efficient PCR-based gene disruption in Saccharomyces strains using intergenic primers</article-title><source>Yeast</source><volume>19</volume><fpage>319</fpage><lpage>328</lpage><pub-id pub-id-type="doi">10.1002/yea.817</pub-id></element-citation></ref><ref id="bib47"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rhee</surname><given-names>HS</given-names></name><name><surname>Pugh</surname><given-names>BF</given-names></name></person-group><year>2012</year><article-title>Genome-wide structure and organization of eukaryotic pre-initiation complexes</article-title><source>Nature</source><volume>483</volume><fpage>295</fpage><lpage>301</lpage><pub-id pub-id-type="doi">10.1038/nature10799</pub-id></element-citation></ref><ref id="bib48"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Roberts</surname><given-names>SA</given-names></name><name><surname>Lawrence</surname><given-names>MS</given-names></name><name><surname>Klimczak</surname><given-names>LJ</given-names></name><name><surname>Grimm</surname><given-names>SA</given-names></name><name><surname>Fargo</surname><given-names>D</given-names></name><name><surname>Stojanov</surname><given-names>P</given-names></name><name><surname>Kiezun</surname><given-names>A</given-names></name><name><surname>Kryukov</surname><given-names>GV</given-names></name><name><surname>Carter</surname><given-names>SL</given-names></name><name><surname>Saksena</surname><given-names>G</given-names></name><name><surname>Harris</surname><given-names>S</given-names></name><name><surname>Shah</surname><given-names>RR</given-names></name><name><surname>Resnick</surname><given-names>MA</given-names></name><name><surname>Getz</surname><given-names>G</given-names></name><name><surname>Gordenin</surname><given-names>DA</given-names></name></person-group><year>2013</year><article-title>An APOBEC cytidine deaminase mutagenesis pattern is widespread in human cancers</article-title><source>Nature Genetics</source><volume>45</volume><fpage>970</fpage><lpage>976</lpage><pub-id pub-id-type="doi">10.1038/ng.2702</pub-id></element-citation></ref><ref id="bib49"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Roberts</surname><given-names>SA</given-names></name><name><surname>Sterling</surname><given-names>J</given-names></name><name><surname>Thompson</surname><given-names>C</given-names></name><name><surname>Harris</surname><given-names>S</given-names></name><name><surname>Mav</surname><given-names>D</given-names></name><name><surname>Shah</surname><given-names>R</given-names></name><name><surname>Klimczak</surname><given-names>LJ</given-names></name><name><surname>Kryukov</surname><given-names>GV</given-names></name><name><surname>Malc</surname><given-names>E</given-names></name><name><surname>Mieczkowski</surname><given-names>PA</given-names></name><name><surname>Resnick</surname><given-names>MA</given-names></name><name><surname>Gordenin</surname><given-names>DA</given-names></name></person-group><year>2012</year><article-title>Clustered mutations in yeast and in human cancers can arise from damaged long single-strand DNA regions</article-title><source>Molecular Cell</source><volume>46</volume><fpage>424</fpage><lpage>435</lpage><pub-id pub-id-type="doi">10.1016/j.molcel.2012.03.030</pub-id></element-citation></ref><ref id="bib50"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Roschke</surname><given-names>V</given-names></name><name><surname>Kopantzev</surname><given-names>E</given-names></name><name><surname>Dertzbaugh</surname><given-names>M</given-names></name><name><surname>Rudikoff</surname><given-names>S</given-names></name></person-group><year>1997</year><article-title>Chromosomal translocations deregulating c-myc are associated with normal immune responses</article-title><source>Oncogene</source><volume>14</volume><fpage>3011</fpage><lpage>3016</lpage><pub-id pub-id-type="doi">10.1038/sj.onc.1201156</pub-id></element-citation></ref><ref id="bib51"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Storb</surname><given-names>U</given-names></name></person-group><year>2014</year><article-title>Why does somatic hypermutation by AID require transcription of its target genes?</article-title><source>Advances in Immunology</source><volume>122</volume><fpage>253</fpage><lpage>277</lpage><pub-id pub-id-type="doi">10.1016/B978-0-12-800267-4.00007-9</pub-id></element-citation></ref><ref id="bib52"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Strobl</surname><given-names>LJ</given-names></name><name><surname>Eick</surname><given-names>D</given-names></name></person-group><year>1992</year><article-title>Hold back of RNA polymerase II at the transcription start site mediates down-regulation of c-myc in vivo</article-title><source>The EMBO Journal</source><volume>11</volume><fpage>3307</fpage><lpage>3314</lpage></element-citation></ref><ref id="bib53"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Taylor</surname><given-names>BJ</given-names></name><name><surname>Nik-Zainal</surname><given-names>S</given-names></name><name><surname>Wu</surname><given-names>YL</given-names></name><name><surname>Stebbings</surname><given-names>LA</given-names></name><name><surname>Raine</surname><given-names>K</given-names></name><name><surname>Campbell</surname><given-names>PJ</given-names></name><name><surname>Rada</surname><given-names>C</given-names></name><name><surname>Stratton</surname><given-names>MR</given-names></name><name><surname>Neuberger</surname><given-names>MS</given-names></name></person-group><year>2013</year><article-title>DNA deaminases induce break-associated mutation showers with implication of APOBEC3B and 3A in breast cancer kataegis</article-title><source>eLife</source><volume>2</volume><fpage>e00534</fpage><pub-id pub-id-type="doi">10.7554/eLife.00534</pub-id></element-citation></ref><ref 
id="bib54"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Vannini</surname><given-names>A</given-names></name></person-group><year>2013</year><article-title>A structural perspective on RNA polymerase I and RNA polymerase III transcription machineries</article-title><source>Biochimica et biophysica acta</source><volume>1829</volume><fpage>258</fpage><lpage>264</lpage><pub-id pub-id-type="doi">10.1016/j.bbagrm.2012.09.009</pub-id></element-citation></ref><ref id="bib55"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Vannini</surname><given-names>A</given-names></name><name><surname>Cramer</surname><given-names>P</given-names></name></person-group><year>2012</year><article-title>Conservation between the RNA polymerase I, II, and III transcription initiation machineries</article-title><source>Molecular Cell</source><volume>45</volume><fpage>439</fpage><lpage>446</lpage><pub-id pub-id-type="doi">10.1016/j.molcel.2012.01.023</pub-id></element-citation></ref><ref id="bib56"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Venters</surname><given-names>BJ</given-names></name><name><surname>Wachi</surname><given-names>S</given-names></name><name><surname>Mavrich</surname><given-names>TN</given-names></name><name><surname>Andersen</surname><given-names>BE</given-names></name><name><surname>Jena</surname><given-names>P</given-names></name><name><surname>Sinnamon</surname><given-names>AJ</given-names></name><name><surname>Jain</surname><given-names>P</given-names></name><name><surname>Rolleri</surname><given-names>NS</given-names></name><name><surname>Jiang</surname><given-names>C</given-names></name><name><surname>Hemeryck-Walsh</surname><given-names>C</given-names></name><name><surname>Pugh</surname><given-names>BF</given-names></name></person-group><year>2011</year><article-title>A comprehensive genomic binding map 
of gene and chromatin regulatory proteins in Saccharomyces</article-title><source>Molecular Cell</source><volume>41</volume><fpage>480</fpage><lpage>492</lpage><pub-id pub-id-type="doi">10.1016/j.molcel.2011.01.015</pub-id></element-citation></ref><ref id="bib57"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname><given-names>M</given-names></name><name><surname>Yang</surname><given-names>Z</given-names></name><name><surname>Rada</surname><given-names>C</given-names></name><name><surname>Neuberger</surname><given-names>MS</given-names></name></person-group><year>2009</year><article-title>AID upmutants isolated using a high-throughput screen highlight the immunity/cancer balance limiting DNA deaminase activity</article-title><source>Nature Structural &amp; Molecular Biology</source><volume>16</volume><fpage>769</fpage><lpage>776</lpage><pub-id pub-id-type="doi">10.1038/nsmb.1623</pub-id></element-citation></ref><ref id="bib58"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Waters</surname><given-names>R</given-names></name><name><surname>Parry</surname><given-names>JM</given-names></name></person-group><year>1973</year><article-title>The response to chemical mutagens of the individual haploid and homoallelic diploid UV-sensitive mutants of the rad 3 locus of <italic>Saccharomyces cerevisiae</italic></article-title><source>Molecular &amp; General Genetics</source><volume>124</volume><fpage>135</fpage><lpage>143</lpage><pub-id pub-id-type="doi">10.1007/BF00265146</pub-id></element-citation></ref><ref id="bib59"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Willmann</surname><given-names>KL</given-names></name><name><surname>Milosevic</surname><given-names>S</given-names></name><name><surname>Pauklin</surname><given-names>S</given-names></name><name><surname>Schmitz</surname><given-names>KM</given-names></name><name><surname>Rangam</surname><given-names>G</given-names></name><name><surname>Simon</surname><given-names>MT</given-names></name><name><surname>Maslen</surname><given-names>S</given-names></name><name><surname>Skehel</surname><given-names>M</given-names></name><name><surname>Robert</surname><given-names>I</given-names></name><name><surname>Heyer</surname><given-names>V</given-names></name><name><surname>Schiavo</surname><given-names>E</given-names></name><name><surname>Reina-San-Martin</surname><given-names>B</given-names></name><name><surname>Petersen-Mahrt</surname><given-names>SK</given-names></name></person-group><year>2012</year><article-title>A role for the RNA pol II-associated PAF complex in AID-induced immune diversification</article-title><source>Journal of Experimental Medicine</source><volume>209</volume><fpage>2099</fpage><lpage>2111</lpage><pub-id pub-id-type="doi">10.1084/jem.20112145</pub-id></element-citation></ref><ref id="bib60"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wongsurawat</surname><given-names>T</given-names></name><name><surname>Jenjaroenpun</surname><given-names>P</given-names></name><name><surname>Kwoh</surname><given-names>CK</given-names></name><name><surname>Kuznetsov</surname><given-names>V</given-names></name></person-group><year>2012</year><article-title>Quantitative model of R-loop forming structures reveals a novel level of RNA-DNA interactome complexity</article-title><source>Nucleic Acids Research</source><volume>40</volume><fpage>e16</fpage><pub-id pub-id-type="doi">10.1093/nar/gkr1075</pub-id></element-citation></ref><ref id="bib61"><element-citation 
publication-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname><given-names>Z</given-names></name><name><surname>Wei</surname><given-names>W</given-names></name><name><surname>Gagneur</surname><given-names>J</given-names></name><name><surname>Perocchi</surname><given-names>F</given-names></name><name><surname>Clauder-Münster</surname><given-names>S</given-names></name><name><surname>Camblong</surname><given-names>J</given-names></name><name><surname>Guffanti</surname><given-names>E</given-names></name><name><surname>Stutz</surname><given-names>F</given-names></name><name><surname>Huber</surname><given-names>W</given-names></name><name><surname>Steinmetz</surname><given-names>LM</given-names></name></person-group><year>2009</year><article-title>Bidirectional promoters generate pervasive transcription in yeast</article-title><source>Nature</source><volume>457</volume><fpage>1033</fpage><lpage>1037</lpage><pub-id pub-id-type="doi">10.1038/nature07728</pub-id></element-citation></ref><ref id="bib62"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Yamane</surname><given-names>A</given-names></name><name><surname>Resch</surname><given-names>W</given-names></name><name><surname>Kuo</surname><given-names>N</given-names></name><name><surname>Kuchen</surname><given-names>S</given-names></name><name><surname>Li</surname><given-names>Z</given-names></name><name><surname>Sun</surname><given-names>HW</given-names></name><name><surname>Robbiani</surname><given-names>DF</given-names></name><name><surname>McBride</surname><given-names>K</given-names></name><name><surname>Nussenzweig</surname><given-names>MC</given-names></name><name><surname>Casellas</surname><given-names>R</given-names></name></person-group><year>2011</year><article-title>Deep-sequencing identification of the genomic targets of the cytidine deaminase AID and its cofactor RPA in B lymphocytes</article-title><source>Nature 
Immunology</source><volume>12</volume><fpage>62</fpage><lpage>69</lpage><pub-id pub-id-type="doi">10.1038/ni.1964</pub-id></element-citation></ref><ref id="bib63"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Yuan</surname><given-names>GC</given-names></name><name><surname>Liu</surname><given-names>YJ</given-names></name><name><surname>Dion</surname><given-names>MF</given-names></name><name><surname>Slack</surname><given-names>MD</given-names></name><name><surname>Wu</surname><given-names>LF</given-names></name><name><surname>Altschuler</surname><given-names>SJ</given-names></name><name><surname>Rando</surname><given-names>OJ</given-names></name></person-group><year>2005</year><article-title>Genome-scale identification of nucleosome positions in <italic>S. cerevisiae</italic></article-title><source>Science</source><volume>309</volume><fpage>626</fpage><lpage>630</lpage><pub-id pub-id-type="doi">10.1126/science.1112178</pub-id></element-citation></ref></ref-list></back><sub-article article-type="article-commentary" id="SA1"><front-stub><article-id pub-id-type="doi">10.7554/eLife.03553.025</article-id><title-group><article-title>Decision letter</article-title></title-group><contrib-group content-type="section"><contrib contrib-type="editor"><name><surname>Proudfoot</surname><given-names>Nick J</given-names></name><role>Reviewing editor</role><aff><institution>University of Oxford</institution>, <country>United Kingdom</country></aff></contrib></contrib-group></front-stub><body><boxed-text><p>eLife posts the editorial decision letter and author response on a selection of the published articles (subject to the approval of the authors). An edited version of the letter sent to the authors after peer review is shown, indicating the substantive concerns or comments; minor concerns are not usually shown. 
Reviewers have the opportunity to discuss the decision before the letter is sent (see <ext-link ext-link-type="uri" xlink:href="http://elifesciences.org/review-process">review process</ext-link>). Similarly, the author response typically shows only responses to the major concerns raised by the reviewers.</p></boxed-text><p>Thank you for sending your work entitled “Active RNAP pre-initiation sites are highly mutated by cytidine deaminases in yeast, with AID targeting small RNAs genes” for consideration at <italic>eLife.</italic> Your article has been favorably evaluated by James Manley (Senior editor) and 2 reviewers, one of whom, Nick Proudfoot, is a member of our Board of Reviewing Editors.</p><p>The Reviewing editor and another expert reviewer discussed their comments before we reached this decision, and the Reviewing editor has assembled the following comments to help you prepare a revised submission.</p><p>Analysis of the in vivo effect of human deaminases, such as AID and APOBEC3G, in heterologous systems such as <italic>E. coli</italic> and yeast has been undertaken by different labs and has contributed greatly to understand their mechanisms of action. Despite the limitations that yeast may have as a system to make conclusions about the in vivo action of these deaminases in human cells, it is clear that this type of approach reveals important features of the genome accessibility for these enzymes known to act specifically on ssDNA. In yeast, the pattern of mutations induced by these deaminases has been reported by different laboratories. This manuscript of Taylor and Rada, from the former laboratory of M. Neuberger, extends such studies further. A detailed analysis of clustering of mutations along the genome of diploid yeast reveals that such clusters are preferentially located at promoters of highly expressed RNAPII-driven genes. This is the case for both AID and APOBEC3G-induced mutations, and in the case of AID this is also observed for RNAPIII genes. 
Taylor and Rada conclude that this is due to the ability of the small RNAs to bring AID to its site of action. To probe this hypothesis they show that AID binds in vitro to yeast tRNA and poly U. This work is interesting and novel by providing new information about the yeast genome accessibility to human deaminases that should help clarify its mechanism of action. It is consistent with the ability of these enzymes to act on ssDNA formed at active pre-initiation complexes.</p><p>1) We are not convinced that the authors provide significant evidence that this recruitment is RNA mediated. AID was discovered by the Honjo lab as an RNA modifying enzyme, and the ability of AID to interact in vitro with RNA is not that novel. More importantly, the data presented is insufficient to prove that RNA plays a direct role in recruiting AID to pre-initiation complexes in vivo. The difference between RNAPII and RNAPIII promoters (embedded in the body of the genes) could be due to a different structure of the pre-initiation complex, a difference in the opening and accessibility of the ssDNA formed at such promoters or to different factors that might influence the action of AID in vivo, not necessarily its recruitment. Despite this, it is clear that these proteins act at RNAPII promoters as deduced from the impressive enrichment of MELs at such promoters. Is the average length of the RNA generated in such promoters small? Do long RNA molecules bind AID in vitro?</p><p>2) It is important to distinguish loading and recruitment of the deaminases from their sites of action (as seen by their mutational analysis). This study can only deduce sites of action not sites of loading and recruitment. This later point should be addressed by performing ChIP analysis to show that the deaminases are loaded at the promoter by the promoter initiation complex. Also reporter systems could be employed to probe these issues in vivo by turning on or off transcription (using inducible promoters e.g. 
GAL) This is relevant given the reported role of transcription factors in AID recruitment in human cells that in principle would not correlate with mutational hotspots.</p><p>3) Are the authors sure that Canavinine is an Arginine permease transporter inhibitor or just an Arginine analogue that uses the same permease (encoded by the CAN1 gene). This later reason could explain why can1 mutations impede canavarine uptake leading to can resistance? Please comment.</p><p>4) Can the authors further explain why mutations within the MELs are much more likely to occur on both alleles of the diploid strain compared to equivalent random fragments? As the information on whether such mutations are coincident between both alleles is not clear, I wonder whether the authors can exclude the possibility that such high levels could be due to gene conversion type of events whether or not mediated by double-strand breaks.</p><p>5) It is important that 57% of AID and 46 % of sA3G mutations occur within the promoter region compared to only 21% of EMS mutations. However, considering that the authors found 1227 and 568 MELs in the AID and sA3G treated genomes, but only 1 for MMS, a simple analysis would reveal that the MMS mutations in the promoter region compared to the total frequency of MMS-induced MELs is extremely high. Could the authors clarify or discuss this point better, so that the reader does not get misled by a simple analysis, if that is the case?</p></body></sub-article><sub-article article-type="reply" id="SA2"><front-stub><article-id pub-id-type="doi">10.7554/eLife.03553.026</article-id><title-group><article-title>Author response</article-title></title-group></front-stub><body><p>We have reinforced our conclusion that in yeast, the main determinant for targeting of the deaminases is the accessibility of the single stranded substrate at preinitiation complex sites. 
We now include ChIP data on the chromatin association of AID and APOBEC3G (sA3G) and new bioinformatic analyses. We have also reinforced the evidence in support of a role of RNA binding by AID in modulating the preferred sites of mutation by the addition of new genetic data, ChIP data and expression analysis. We have made changes in the text to improve the clarity and incorporate the new data, which has also resulted in the addition of a new author.</p><p><italic>1) We are not convinced that the authors provide significant evidence that this recruitment is RNA mediated. AID was discovered by the Honjo lab as an RNA modifying enzyme, and the ability of AID to interact in vitro with RNA is not that novel. More importantly, the data presented is insufficient to prove that RNA plays a direct role in recruiting AID to pre-initiation complexes in vivo. The difference between RNAPII and RNAPIII promoters (embedded in the body of the genes) could be due to a different structure of the pre-initiation complex, a difference in the opening and accessibility of the ssDNA formed at such promoters or to different factors that might influence the action of AID in vivo, not necessarily its recruitment. Despite this, it is clear that these proteins act at RNAPII promoters as deduced from the impressive enrichment of MELs at such promoters</italic>.</p><p>Both sA3G and AID (small ∼27 kDa proteins) can access and mutate DNA at promoters; however, we noted that AID showed a marked preference for promoters of small structured RNAs (such as tRNAs or snRNAs). The two proteins are a perfect genetic control for each other, since the most salient difference between the two deaminases is their ability to bind RNA: the catalytic domain of APOBEC3G (sA3G) does not bind RNA, while AID seems to have a very good ability to bind RNA and in particular small structured RNAs such as tRNAs or snRNAs. 
We speculated that this RNA binding difference must be the main determinant for the different targeting preferences of AID.</p><p>In order to confirm this finding, we have now transplanted a small structured RNA to a locus that, although transcribed, was not a preferred target for either AID or sA3G. The results are included in a revised <xref ref-type="fig" rid="fig5">Figure 5</xref>, and show enhanced targeting of the promoter of the modified loci by AID but not sA3G. A similar number of new sequences confirm that the unmodified promoter remains a low frequency/untargeted locus for AID. Although at this moment we do not fully understand the mechanism for this recruitment, we think it confirms our general inference that AID targeting <italic>in vivo</italic> is affected by its ability to interact with RNA.</p><p><italic>2) It is important to distinguish loading and recruitment of the deaminases from their sites of action (as seen by their mutational analysis). This study can only deduce sites of action not sites of loading and recruitment. This later point should be addressed by performing ChIP analysis to show that the deaminases are loaded at the promoter by the promoter initiation complex</italic>.</p><p>We agree with the main concern of the reviewers in that we believe the activity of the deaminases does not necessarily correlate with their presence at chromatin; however, we feel it is unlikely that ChIP experiments will provide a definitive answer to dissecting the mechanism that brings the deaminase to a genomic region (loading) versus the mechanism that promotes active mutation. The current literature in mammalian cells regarding association of AID with chromatin (as measured by ChIP) is complicated. The most solid data shows association of AID at the Sµ switch repeat region [a region that is highly enriched for AID targets] which is regulated by phosphorylation (Vuong et al 2013). 
ChIP data only occasionally and haphazardly correlates the presence of AID with mutation at off-target sites (<xref ref-type="bibr" rid="bib62">Yamane et al 2011</xref>, Hogenbirk et al 2012, discussed in Vaidyanathan et al 2014), unless other factors such as persistent RNAP II stalling are also present.</p><p>We have performed chromatin immunoprecipitation for both AID and the sA3G proteins. Our results are now included in <xref ref-type="fig" rid="fig3s3">Figure 3–figure supplement 3</xref>, where they confirm the expected absence of specific association of the deaminases with the promoters. The signal for the deaminases is found at similar frequency within both mutated and unmutated promoters and at unmutated intergenic regions.</p><p>We interpret our ChIP data as evidence for the transient nature of the interaction between the deaminase and its substrate in yeast [probably reflecting a high k-off], unlike mammalian cells where additional interactions might stabilize the presence of the deaminases (such as specific targeting factors at immunoglobulin genes or interaction with mammalian Spt5 or RPA30).</p><p>We have argued in the past that mutation is a more reliable measure of the “past” association of the deaminases with their substrate. At this moment we do not have a reliable assay to measure unproductive association of the deaminases that would distinguish occupancy of a locus versus functional outcome. 
Single molecule analysis might be a possible way to compare recruitment and mutation, but we feel those experiments are beyond the scope of this manuscript.</p><p>We are careful in our manuscript to avoid making the claim that the promoter initiation complex “actively loads” the deaminases, since we rather interpret our findings as indicating that the conformation of the DNA at the promoter site is permissive for the “access and activity” of the deaminases.</p><p><italic>Also reporter systems could be employed to probe these issues in vivo by turning on or off transcription (using inducible promoters e.g. GAL) This is relevant given the reported role of transcription factors in AID recruitment in human cells that in principle would not correlate with mutational hotspots.</italic></p><p>We have analysed transcription factor binding data and find enrichment of members of the basal transcription machinery and associated chromatin remodelling factors such as Spt16. No specific transcription factor is particularly associated with MEL containing promoter regions. The results are included as a new <xref ref-type="fig" rid="fig3s4">Figure 3–figure supplement 4</xref>, highlighting the association of deaminase mutation with highly transcribed genes. We believe these analyses support the main conclusion that the structure of the substrate at transcription initiation sites is the main determinant for off-target mutation by the deaminases, rather than particular association with a family of transcription factors.</p><p><italic>Is the average length of the RNA generated in such promoters small? Do long RNA molecules bind AID in vitro?</italic></p><p>As indicated in the modified <xref ref-type="fig" rid="fig5">Figure 5</xref>, both the full double domain version of APOBEC3G and AID bind poly U RNA (0.3 – 2 kilobase pairs), although APOBEC3G is consistently better in our assays at binding “long” RNA whereas AID is better at binding short structured RNAs like the tRNAs. 
All deaminases tested are able to bind double stranded polyA:U RNA, including the single domain deaminase APOBEC3A.</p><p>We have repeated the immunoprecipitation experiments testing the RNA binding properties of the deaminases to make the result as robust and reproducible as possible and have modified <xref ref-type="fig" rid="fig5">Figure 5</xref> with updated blots that reflect at least 3 independent experiments.</p><p>Our data suggests that different deaminases vary in their preferences and capacity to bind single, double stranded or highly structured RNA. Our results agree with evidence from the literature that suggests APOBEC3G might multimerise on diverse messenger RNAs. Further studies will focus on the preferences of AID-RNA interactions in mammalian cells.</p><p>As to whether there is a gene length preference for deaminases, for AID* and sA3G* datasets, we find no particular bias for mutations to occur at short or long RNAP II mRNA genes (see <xref ref-type="fig" rid="fig6">Author response image 1</xref> below).<fig id="fig6" position="float"><label>Author response image 1.</label><graphic xlink:href="elife03553f006"/></fig></p><p><italic>3) Are the authors sure that Canavinine is an Arginine permease transporter inhibitor or just an Arginine analogue that uses the same permease (encoded by the CAN1 gene). This later reason could explain why can1 mutations impede canavarine uptake leading to can resistance? Please comment.</italic></p><p>We apologise for the mistake in our description of mutation of the CAN1 locus as a selection system and thank the reviewer for pointing it out. As the reviewers rightly suggest, we use resistance to L-canavanine (a toxic arginine analogue), which is imported into the cell by the yeast <italic>S. cerevisiae</italic> using the amino acid transporter encoded by the CAN1 gene. Mutations that impair the transport also render the cells resistant to the toxicity of the drug. 
We have now corrected the text.</p><p><italic>4) Can the authors further explain why mutations within the MELs are much more likely to occur on both alleles of the diploid strain compared to equivalent random fragments? As the information on whether such mutations are coincident between both alleles is not clear, I wonder whether the authors can exclude the possibility that such high levels could be due to gene conversion type of events whether or not mediated by double-strand breaks.</italic></p><p>We observe that in many cases the same DNA fragment is associated with a high density of mutations (MELs) in both alleles of the same gene (frequently but not always at the same nucleotide position). We argue the simplest explanation is repeated targeting of both alleles, either simultaneously or in successive rounds of mutation. Regions of equivalent size to MELs would only be expected to show biallelic mutations (homozygous or heterozygous) in 2-3% of the fragments (as predicted from a random distribution of the same number of mutations as in the observed datasets). We have modified the text in an attempt to make the point clearer while keeping the text concise.</p><p>As the reviewers mention, an alternative that we do not favour is that gene-conversion between the two alleles accounts for the high frequency of homozygous mutations in MELs. We repeatedly discuss the absence of kataegic mutations as an indication that DNA breaks and repair are very rarely found in association with the mutation preferences in our datasets. Our published data (<xref ref-type="bibr" rid="bib53">Taylor et al 2013</xref>) demonstrated that DNA breaks associated with deamination are frequent in Ung+ wild type cells and result in clustered kataegic mutations but are rare or absent in Ung- yeast. 
Given that repair by gene conversion would rely on a break of the DNA, the lack of kataegic mutations argues against a mechanism that would require breaks in association with MELs.</p><p>Since we favour repeated targeting by the deaminases as the explanation for the high frequency of bi-allelic mutation observed associated with MELs, we have not expanded on the gene-conversion argument in the text which we feel might misdirect the reader.</p><p><italic>5) It is important that 57% of AID and 46 % of sA3G mutations occur within the promoter region compared to only 21% of EMS mutations. However, considering that the authors found 1227 and 568 MELs in the AID and sA3G treated genomes, but only 1 for MMS, a simple analysis would reveal that the MMS mutations in the promoter region compared to the total frequency of MMS-induced MELs is extremely high. Could the authors clarify or discuss this point better, so that the reader does not get misled by a simple analysis, if that is the case?</italic></p><p>The compact nature of the yeast genome makes the proportion of sequences associated with “promoter” function much higher than in mammalian cells, thus up to 20% of the genome corresponds to promoters. Fully random mutation will therefore result in 20% of mutations occurring on promoters, which is the frequency seen in EMS treated samples. We have made a note in the text to clarify this.</p><p>Our analysis reveals that the deaminase induced mutations are skewed towards promoter regions much more than expected (57% and 46% for AID and sA3G). Even within the small fraction of the genome corresponding to promoters, the regions with high density of mutations (that we define as MELs) are confined to even smaller regions, which when added up account for just 1.5% of the whole genome. 
This is a remarkable skewing that clearly deviates from a random distribution, which our analyses attempt to convey.</p><p>However, as the reviewers rightly question, the single MEL identified in the EMS induced mutation set is not associated with a promoter, but with the CAN1 gene body and is the result of the selection that requires at least two mutations to inactivate both alleles at the CAN1 locus. We have adjusted the text to emphasise this point. Therefore, there are no EMS induced MELs which associate with promoter regions.</p></body></sub-article></article> |