<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.1d1 20130915//EN" "JATS-archivearticle1.dtd"><article article-type="research-article" dtd-version="1.1d1" xmlns:xlink="http://www.w3.org/1999/xlink"><front><journal-meta><journal-id journal-id-type="nlm-ta">elife</journal-id><journal-id journal-id-type="hwp">eLife</journal-id><journal-id journal-id-type="publisher-id">eLife</journal-id><journal-title-group><journal-title>eLife</journal-title></journal-title-group><issn publication-format="electronic">2050-084X</issn><publisher><publisher-name>eLife Sciences Publications, Ltd</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="publisher-id">00808</article-id><article-id pub-id-type="doi">10.7554/eLife.00808</article-id><article-categories><subj-group subj-group-type="display-channel"><subject>Research article</subject></subj-group><subj-group subj-group-type="heading"><subject>Genes and chromosomes</subject></subj-group></article-categories><title-group><article-title>Condensin controls recruitment of RNA polymerase II to achieve nematode X-chromosome dosage compensation</article-title></title-group><contrib-group><contrib contrib-type="author" equal-contrib="yes" id="author-4908"><name><surname>Kruesi</surname><given-names>William S</given-names></name><xref ref-type="aff" rid="aff1"/><xref ref-type="fn" rid="equal-contrib">†</xref><xref ref-type="other" rid="par-3"/><xref ref-type="fn" rid="con1"/><xref ref-type="fn" rid="conf1"/><xref ref-type="other" rid="dataro1"/></contrib><contrib contrib-type="author" equal-contrib="yes" id="author-4909"><name><surname>Core</surname><given-names>Leighton J</given-names></name><xref ref-type="aff" rid="aff2"/><xref ref-type="fn" rid="equal-contrib">†</xref><xref ref-type="fn" rid="con2"/><xref ref-type="fn" rid="conf1"/><xref ref-type="other" rid="dataro1"/></contrib><contrib contrib-type="author" 
id="author-4910"><name><surname>Waters</surname><given-names>Colin T</given-names></name><xref ref-type="aff" rid="aff2"/><xref ref-type="fn" rid="con3"/><xref ref-type="fn" rid="conf1"/><xref ref-type="other" rid="dataro1"/></contrib><contrib contrib-type="author" id="author-4911"><name><surname>Lis</surname><given-names>John T</given-names></name><xref ref-type="aff" rid="aff2"/><xref ref-type="other" rid="par-4"/><xref ref-type="other" rid="par-5"/><xref ref-type="fn" rid="con4"/><xref ref-type="fn" rid="conf1"/><xref ref-type="other" rid="dataro1"/></contrib><contrib contrib-type="author" corresp="yes" id="author-1871"><name><surname>Meyer</surname><given-names>Barbara J</given-names></name><xref ref-type="aff" rid="aff1"/><xref ref-type="corresp" rid="cor1">*</xref><xref ref-type="other" rid="par-1"/><xref ref-type="other" rid="par-2"/><xref ref-type="fn" rid="con5"/><xref ref-type="fn" rid="conf1"/><xref ref-type="other" rid="dataro1"/></contrib><aff id="aff1"><institution content-type="dept">Department of Molecular and Cell Biology</institution>, <institution>Howard Hughes Medical Institute, University of California, Berkeley</institution>, <addr-line><named-content content-type="city">Berkeley</named-content></addr-line>, <country>United States</country></aff><aff id="aff2"><institution content-type="dept">Department of Molecular Biology and Genetics</institution>, <institution>Cornell University</institution>, <addr-line><named-content content-type="city">Ithaca</named-content></addr-line>, <country>United States</country></aff></contrib-group><contrib-group content-type="section"><contrib contrib-type="editor"><name><surname>Proudfoot</surname><given-names>Nick</given-names></name><role>Reviewing editor</role><aff><institution>University of Oxford</institution>, <country>United Kingdom</country></aff></contrib></contrib-group><author-notes><corresp id="cor1"><label>*</label>For correspondence: <email>bjmeyer@berkeley.edu</email></corresp><fn fn-type="con" 
id="equal-contrib"><label>†</label><p>These authors contributed equally to this work</p></fn></author-notes><pub-date date-type="pub" publication-format="electronic"><day>18</day><month>06</month><year>2013</year></pub-date><pub-date pub-type="collection"><year>2013</year></pub-date><volume>2</volume><elocation-id>e00808</elocation-id><history><date date-type="received"><day>08</day><month>04</month><year>2013</year></date><date date-type="accepted"><day>09</day><month>05</month><year>2013</year></date></history><permissions><copyright-statement>© 2013, Kruesi et al</copyright-statement><copyright-year>2013</copyright-year><copyright-holder>Kruesi et al</copyright-holder><license xlink:href="http://creativecommons.org/licenses/by/3.0/"><license-p>This article is distributed under the terms of the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/3.0/">Creative Commons Attribution License</ext-link>, which permits unrestricted use and redistribution provided that the original author and source are credited.</license-p></license></permissions><self-uri content-type="pdf" xlink:href="elife00808.pdf"/><abstract><object-id pub-id-type="doi">10.7554/eLife.00808.001</object-id><p>The X-chromosome gene regulatory process called dosage compensation ensures that males (1X) and females (2X) express equal levels of X-chromosome transcripts. The mechanism in <italic>Caenorhabditis elegans</italic> has been elusive due to improperly annotated transcription start sites (TSSs). Here we define TSSs and the distribution of transcriptionally engaged RNA polymerase II (Pol II) genome-wide in wild-type and dosage-compensation-defective animals to dissect this regulatory mechanism. Our TSS-mapping strategy integrates GRO-seq, which tracks nascent transcription, with a new derivative of this method, called GRO-cap, which recovers nascent RNAs with 5′ caps prior to their removal by co-transcriptional processing. 
Our analyses reveal that promoter-proximal pausing is rare, unlike in other metazoans, and promoters are unexpectedly far upstream from the 5′ ends of mature mRNAs. We find that <italic>C</italic>. <italic>elegans</italic> equalizes X-chromosome expression between the sexes, to a level equivalent to autosomes, by reducing Pol II recruitment to promoters of hermaphrodite X-linked genes using a chromosome-restructuring condensin complex.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.001">http://dx.doi.org/10.7554/eLife.00808.001</ext-link></p></abstract><abstract abstract-type="executive-summary"><object-id pub-id-type="doi">10.7554/eLife.00808.002</object-id><title>eLife digest</title><p>In many species, including humans, females have two X chromosomes whereas males have only one. To ensure that females do not end up with a double dose of the proteins encoded by genes on the X chromosome, animals employ a strategy called dosage compensation to control the expression of X-linked genes.</p><p>The mechanisms underlying dosage compensation vary between species, but they typically involve a regulatory complex that binds to the X chromosomes of one sex to modify gene expression. In the nematode worm <italic>Caenorhabditis elegans</italic>—which consists of hermaphrodites (XX) and males (XO)—this regulatory complex, called the dosage compensation complex (DCC), binds to both X chromosomes of XX individuals, reducing gene expression from each by 50%. DCC shares many subunits with a protein complex called condensin, which regulates the structure of chromosomes to achieve proper chromosome segregation. However, it is unclear exactly how the DCC controls the expression of X-linked genes.</p><p>For a gene to be expressed, an enzyme called RNA polymerase II must bind to the gene’s promoter—a stretch of DNA upstream of the protein-coding part of the gene—so that it can begin transcribing the DNA into RNA. 
Promoters have been difficult to define in <italic>C. elegans</italic>, but Kruesi et al. devised a strategy to map transcription start sites, and hence promoters, throughout the worm genome. The strategy integrates the results of two methods: One measures the extent and orientation of each gene’s transcribed region, and the other locates the distinctive cap structures that mark the true 5′ ends of newly made RNAs.</p><p>Using this new promoter information, coupled with genome-wide measurements of the levels of newly synthesized transcripts from wild-type and dosage-compensation-defective animals, they showed that <italic>C. elegans</italic> achieves dosage compensation by reducing the recruitment of RNA polymerase II to the promoters of X-linked genes in XX individuals.</p><p>Kruesi et al. also identified a second regulatory mechanism that acts in both sexes to increase the level of transcription of genes on the X chromosome. This ensures that after dosage compensation, genes on the X chromosome are expressed at a similar level to those on the autosomes (all chromosomes other than X and Y).</p><p>As well as shedding light on the mechanism by which dosage compensation occurs in <italic>C. elegans</italic>, the study by Kruesi et al. provides a valuable data set on transcription start sites in the worm, and puts forward a general strategy that could be used to map these sites in other species.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.002">http://dx.doi.org/10.7554/eLife.00808.002</ext-link></p></abstract><kwd-group kwd-group-type="author-keywords"><title>Author keywords</title><kwd>dosage compensation</kwd><kwd>transcription</kwd><kwd>X-chromosome and autosome balance</kwd><kwd>transcription start site identification technology</kwd><kwd>X chromosome</kwd></kwd-group><kwd-group kwd-group-type="research-organism"><title>Research organism</title><kwd>C. 
elegans</kwd></kwd-group><funding-group><award-group id="par-1"><funding-source><institution-wrap><institution>National Institutes of Health</institution></institution-wrap></funding-source><award-id>R01GM30702</award-id><principal-award-recipient><name><surname>Meyer</surname><given-names>Barbara J</given-names></name></principal-award-recipient></award-group><award-group id="par-2"><funding-source><institution-wrap><institution>Howard Hughes Medical Institute</institution></institution-wrap></funding-source><principal-award-recipient><name><surname>Meyer</surname><given-names>Barbara J</given-names></name></principal-award-recipient></award-group><award-group id="par-3"><funding-source><institution-wrap><institution>National Institutes of Health</institution></institution-wrap></funding-source><award-id>T32GM07127</award-id><principal-award-recipient><name><surname>Kruesi</surname><given-names>William S</given-names></name></principal-award-recipient></award-group><award-group id="par-4"><funding-source><institution-wrap><institution>National Institutes of Health</institution></institution-wrap></funding-source><award-id>R01GM25232</award-id><principal-award-recipient><name><surname>Lis</surname><given-names>John T</given-names></name></principal-award-recipient></award-group><award-group id="par-5"><funding-source><institution-wrap><institution>National Institutes of Health</institution></institution-wrap></funding-source><award-id>HG004845</award-id><principal-award-recipient><name><surname>Lis</surname><given-names>John T</given-names></name></principal-award-recipient></award-group><funding-statement>The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.</funding-statement></funding-group><custom-meta-group><custom-meta specific-use="meta-only"><meta-name>Author impact statement</meta-name><meta-value><italic>C. 
elegans</italic> equalizes the expression of X-chromosome genes between the sexes by reducing the recruitment of RNA polymerase II to promoters of X-linked genes in hermaphrodites, using a chromosome-restructuring complex called condensin.</meta-value></custom-meta></custom-meta-group></article-meta></front><body><sec id="s1" sec-type="intro"><title>Introduction</title><p>The essential, X-chromosome-wide regulatory process called dosage compensation ensures that males (XO or XY) and females (XX), from worms to humans, express equivalent levels of X-chromosome products despite their unequal dose of X chromosomes (<xref ref-type="bibr" rid="bib25">Gelbart and Kuroda, 2009</xref>; <xref ref-type="bibr" rid="bib50">Meyer, 2010</xref>; <xref ref-type="bibr" rid="bib12">Conrad and Akhtar, 2012</xref>; <xref ref-type="bibr" rid="bib36">Jeon et al., 2012</xref>). The failure to dosage compensate is lethal. Dosage compensation strategies differ across species, but invariably a regulatory complex is targeted to the X chromosomes of one sex to modulate transcription along the entire X. The molecular mechanisms by which these complexes regulate gene expression remain elusive. Here we analyzed X-chromosome dosage compensation in the nematode <italic>Caenorhabditis elegans</italic> to determine the step of transcription controlled by its dosage compensation complex (DCC). The DCC binds to both X chromosomes of hermaphrodites to reduce transcription by half (<xref ref-type="bibr" rid="bib50">Meyer, 2010</xref>; <xref ref-type="bibr" rid="bib53">Pferdehirt et al., 2011</xref>). Sequence-specific DNA binding sites recruit the DCC to X and facilitate its spreading along X (<xref ref-type="bibr" rid="bib22">Ercan et al., 2009</xref>; <xref ref-type="bibr" rid="bib35">Jans et al., 2009</xref>; <xref ref-type="bibr" rid="bib53">Pferdehirt et al., 2011</xref>). 
The DCC shares subunits with condensin (<xref ref-type="bibr" rid="bib17">Csankovszki et al., 2009</xref>; <xref ref-type="bibr" rid="bib49">Mets and Meyer, 2009</xref>), a protein complex required for the compaction, resolution, and segregation of mitotic and meiotic chromosomes (<xref ref-type="bibr" rid="bib61">Wood et al., 2010</xref>), suggesting that DCC-dependent changes in chromosome structure facilitate transcription regulation. In principle, the DCC could control any step of transcription: recruitment of RNA polymerase II (Pol II) to the gene promoter, initiation of transcription, escape of Pol II from the promoter or proximal pause sites, elongation of RNA transcripts, or termination of transcription.</p><p>To understand the mechanism of <italic>C</italic>. <italic>elegans</italic> dosage compensation, we first developed a procedure to map the position, density, and orientation of transcriptionally engaged Pol II genome-wide in <italic>C</italic>. <italic>elegans</italic> and then devised a strategy to identify the transcription start sites (TSSs). Nascent RNA transcripts from approximately 70% of <italic>C</italic>. <italic>elegans</italic> genes undergo a rapid co-transcriptional processing event in which the 5′ end is replaced by a common 22-nucleotide leader RNA (SL1) through a trans-splicing mechanism (<xref ref-type="bibr" rid="bib8">Blumenthal, 2012</xref>). Because trans-splicing removes information about Pol II initiation from nascent RNAs, TSSs have been difficult to identify from accumulated mRNAs (<xref ref-type="bibr" rid="bib51">Morton and Blumenthal, 2011</xref>). The paucity of annotated promoters has made transcription regulation a challenge to study in <italic>C</italic>. <italic>elegans</italic>. 
By comparing the quantity, location, and direction of engaged Pol II from wild-type and dosage-compensation-defective embryos relative to TSSs, we determined the step of transcription controlled by the DCC.</p><p>Our work establishes a general strategy for TSS mapping in any organism and provides an invaluable TSS data set for dissecting <italic>C</italic>. <italic>elegans</italic> gene regulation. We show that <italic>C</italic>. <italic>elegans</italic> equalizes X-chromosome-wide gene expression between the sexes by reducing Pol II recruitment to the promoters of X-linked genes in XX embryos via a mechanism that utilizes a chromosome-restructuring complex. We also show that a separate regulatory mechanism functions in <italic>C</italic>. <italic>elegans</italic> to elevate the intrinsic level of transcription from the X chromosomes of both sexes, so that after dosage compensation, X chromosomes and the two sets of autosomes have equivalent expression.</p></sec><sec id="s2" sec-type="results"><title>Results</title><sec id="s2-1"><title>Genome-wide mapping of transcriptionally engaged Pol II and transcription start sites reveals promoters to be far upstream of mature mRNA 5′ ends</title><p>To map the distribution of transcriptionally engaged Pol II genome-wide, we performed global run-on sequencing (GRO-seq) experiments using nuclei from three stages of wild-type animals (embryos, starved L1 larvae, and L3 larvae) and dosage-compensation-defective embryos. In GRO-seq reactions, engaged polymerases are allowed to transcribe (run-on) short distances (100 nucleotides) and incorporate affinity tags into their nascent RNAs under conditions that prohibit new initiation (<xref ref-type="bibr" rid="bib16">Core et al., 2008</xref>). 
Tagged transcripts are affinity purified, amplified, sequenced, and aligned to the genome to map engaged Pol II (<xref ref-type="fig" rid="fig1">Figure 1A–E</xref>).<fig-group><fig id="fig1" position="float"><object-id pub-id-type="doi">10.7554/eLife.00808.003</object-id><label>Figure 1.</label><caption><title>Genome-wide annotation of <italic>Caenorhabditis</italic> <italic>elegans</italic> transcription start sites.</title><p>(<bold>A</bold>)–(<bold>E</bold>) Examples of newly annotated transcription start sites (TSSs) for protein-coding genes, non-coding RNA genes, and multigenic transcription units called operons identified using the combination of GRO-seq and GRO-cap. Red arrows demark the WormBase (WB) gene models. Dashed vertical lines show the WB gene starts. The GRO-seq signal is in reads per kilobase per million (RPKM). For protein-coding genes, the GRO-seq signal was averaged across 25 bp windows with a 25 bp step. The GRO-cap signal is in reads per million (RPM). TAP+ is the signal from capped mRNAs, and TAP− is the background. For (<bold>D</bold>) and (<bold>E</bold>), the GRO-cap signal is the TAP+ signal after subtracting the TAP− signal. (<bold>A</bold>) TSS for a trans-spliced gene. The TSS maps 981 bp upstream of the WB start, with a continuous intervening GRO-seq signal. (<bold>B</bold>) TSS for the polycistronic microRNA cluster <italic>mir-54-56</italic> maps 158 bp upstream of the primary transcript start. (<bold>C</bold>) TSS for a non-trans-spliced gene. The TSS from GRO-cap and GRO-seq aligns with the WB start site. (<bold>D</bold>) Identification of the operon TSS shows the operon includes an additional gene, <italic>tag-175</italic>. The TSS for the operon maps 781 bp upstream of <italic>tag-175</italic>. (<bold>E</bold>) TSSs for genes in operons that also use independent promoters, including the TSS for a snoRNA gene within the intron of a gene. 
<italic>vha-10</italic> mRNA is trans-spliced with an SL2 RNA, indicating processing from a polycistronic RNA and an SL1 RNA, indicating transcription from an independent promoter. (<bold>F</bold>) Heat maps show that TSSs vastly improve gene models. The GRO-seq signal from embryos was plotted, one gene per row, for each of 4246 genes relative to the WB start (left) or the new TSS (right). The genes were ordered with increasing distance between the TSS and WB start. The light line moving rightward in the right panel does not represent TSSs. It reflects reduced GRO-seq signal immediately downstream of the trans-splice acceptor site that has been commonly annotated as the WB start site. (<bold>G</bold>) Heat maps showing the GRO-cap signal from embryos that was plotted for each of 4246 genes relative to the WB start (left) or the new TSS (right). The genes were ordered with increasing distance between the TSS and WB start.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.003">http://dx.doi.org/10.7554/eLife.00808.003</ext-link></p><p><supplementary-material id="SD1-data"><object-id pub-id-type="doi">10.7554/eLife.00808.004</object-id><label>Figure 1—source data 1.</label><caption><title>Aligned reads for GRO-seq, GRO-cap, and ChIP-seq experiments.</title><p>The number of reads from each replicate of GRO-seq, GRO-cap, and ChIP-seq that uniquely aligned to the <italic>Caenorhabditis elegans</italic> genome are listed.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.004">http://dx.doi.org/10.7554/eLife.00808.004</ext-link></p></caption><media mime-subtype="xls" mimetype="application" xlink:href="elife00808s001.xls"/></supplementary-material></p><p><supplementary-material id="SD2-data"><object-id pub-id-type="doi">10.7554/eLife.00808.005</object-id><label>Figure 1—source data 2.</label><caption><p>Annotation of transcription start sites for protein-coding genes.</p><p><bold>DOI:</bold> <ext-link 
ext-link-type="doi" xlink:href="10.7554/eLife.00808.005">http://dx.doi.org/10.7554/eLife.00808.005</ext-link></p></caption><media mime-subtype="xls" mimetype="application" xlink:href="elife00808s002.xls"/></supplementary-material></p><p><supplementary-material id="SD3-data"><object-id pub-id-type="doi">10.7554/eLife.00808.006</object-id><label>Figure 1—source data 3.</label><caption><p>Annotation of transcription start sites for non-coding RNAs.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.006">http://dx.doi.org/10.7554/eLife.00808.006</ext-link></p></caption><media mime-subtype="xls" mimetype="application" xlink:href="elife00808s003.xls"/></supplementary-material></p></caption><graphic xlink:href="elife00808f001"/></fig><fig id="fig1s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.007</object-id><label>Figure 1—figure supplement 1.</label><caption><title>GRO-seq profiles are reproducible between replicates.</title><p>The GRO-seq profiles of a select X-chromosome genomic region from two biological replicates of control RNAi embryos and their average GRO-seq profile are shown along with the unique mappability of GRO-seq data in the region. Red arrows show the location and direction of transcription for each protein-coding gene in the region, which are <italic>dnj-7</italic>, <italic>C55B6.1</italic>, <italic>ZK867.2</italic>, <italic>spp-22</italic>, <italic>syd-9</italic>, <italic>F46H5.2</italic>, and <italic>K03A1.2</italic>, from left to right. Gene models are from the WormBase WS230 release. The level of GRO-seq signal is provided in RPKM (reads per kilobase per million). 
Throughout this manuscript, the average GRO-seq signal of two biological replicates for each developmental stage or condition is used.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.007">http://dx.doi.org/10.7554/eLife.00808.007</ext-link></p></caption><graphic xlink:href="elife00808fs001"/></fig><fig id="fig1s2" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.008</object-id><label>Figure 1—figure supplement 2.</label><caption><title>Genome-wide GRO-seq signal is highly correlated between replicates.</title><p>(<bold>A</bold>), (<bold>C</bold>)–(<bold>E</bold>) Scatter plots comparing GRO-seq signal between biological replicates. (<bold>B</bold>) Scatter plot comparing GRO-seq signal between averaged replicates of wild-type embryos vs control RNAi embryos. Average GRO-seq signal was calculated in 500 bp windows genome-wide. Pair-wise comparisons were performed between samples using windows with at least one read in both replicates. The average GRO-seq signal within the window is shown in RPKM (reads per kilobase per million). The red line represents a theoretical 1:1 fit. The statistical relationship between the replicates is indicated by the Spearman correlation coefficient ρ.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.008">http://dx.doi.org/10.7554/eLife.00808.008</ext-link></p></caption><graphic xlink:href="elife00808fs002"/></fig><fig id="fig1s3" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.009</object-id><label>Figure 1—figure supplement 3.</label><caption><title>GRO-seq signal within protein coding genes is highly correlated between replicates.</title><p>(<bold>A</bold>)–(<bold>E</bold>) Average GRO-seq expression within the gene bodies was calculated using gene models from the WormBase WS230 release. 
For all genes greater than 1.1 kb, the GRO-seq signal was totaled within the gene body, excluding the first and last 300 bp. The total GRO-seq signal was divided by the total number of uniquely mappable base pairs within the same region to generate the average expression. The average gene body expression level in RPKM (reads per kilobase per million) is plotted on the axes. The statistical relationship between the replicates is indicated by the Spearman correlation coefficient ρ.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.009">http://dx.doi.org/10.7554/eLife.00808.009</ext-link></p></caption><graphic xlink:href="elife00808fs003"/></fig><fig id="fig1s4" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.010</object-id><label>Figure 1—figure supplement 4.</label><caption><title>GRO-seq expression is correlated with gene expression from microarray and RNA-seq experiments.</title><p>(<bold>A</bold>) GRO-seq experiments have a higher dynamic range than microarray experiments. Scatter plots are shown of gene expression levels determined by GRO-seq vs microarray experiments from control RNAi embryos. Average GRO-seq expression was calculated as in <xref ref-type="fig" rid="fig1s3">Figure 1—figure supplement 3</xref>. Microarray data were obtained from <xref ref-type="bibr" rid="bib35">Jans et al. (2009)</xref>. The GRO-seq signal is shown as the log<sub>10</sub> of the average RPKM (reads per kilobase per million), and microarray data are shown as the log<sub>10</sub> of expression values. (<bold>B</bold>) GRO-seq and RNA-seq data are correlated. Scatter plots are shown of gene expression levels determined by GRO-seq vs RNA-seq experiments from starved L1s. Average GRO-seq expression was calculated as in <xref ref-type="fig" rid="fig1s3">Figure 1—figure supplement 3</xref>, and RNA-seq data were obtained from <xref ref-type="bibr" rid="bib48">Maxwell et al. (2012)</xref>. 
The GRO-seq signal is the log<sub>10</sub> of the average RPKM, and RNA-seq reads are the log<sub>10</sub> of FPKM (fragments per kilobase per million). The two samples show a Spearman correlation coefficient of 0.788.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.010">http://dx.doi.org/10.7554/eLife.00808.010</ext-link></p></caption><graphic xlink:href="elife00808fs004"/></fig><fig id="fig1s5" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.011</object-id><label>Figure 1—figure supplement 5.</label><caption><title>Genome-wide annotation of TSSs improves gene models.</title><p>(<bold>A</bold>)–(<bold>D</bold>) To gauge the improvement of our new transcription start site (TSS) calls on gene model accuracy, we plotted the average GRO-seq signal across a 2 kb window centered on the WormBase (WB) starts or TSSs for genes having TSSs identified in the same developmental stage. For example, (<bold>A</bold>) shows a plot of the average GRO-seq signal from 4246 genes of control RNAi embryos around the WB starts or our TSSs called from embryos. The GRO-seq signal is averaged at each bp and then averaged across 25 bp windows. Plotting the GRO-seq signal against real TSSs reduces the upstream signal due to incorrectly annotated gene starts, indicating a dramatic improvement in gene models. n represents the number of genes in each stage having a TSS identified in that stage. 
RPKM: reads per kilobase per million.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.011">http://dx.doi.org/10.7554/eLife.00808.011</ext-link></p></caption><graphic xlink:href="elife00808fs005"/></fig><fig id="fig1s6" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.012</object-id><label>Figure 1—figure supplement 6.</label><caption><title>GRO-cap signal is strong at newly annotated TSSs.</title><p>(<bold>A</bold>)–(<bold>D</bold>) Because the corrected GRO-cap signal (TAP+ signal after subtracting the TAP− signal) was used to annotate transcription start sites (TSSs), we assessed whether our TSS calls coincided with the spike of the GRO-cap signal, as would be expected. To do so, we averaged the corrected GRO-cap signal across a 2 kb window centered on the TSSs or WormBase (WB) starts for genes having TSS annotated in the same developmental stage. For example, (<bold>A</bold>) shows the GRO-cap signal from control RNAi embryos plotted for genes with a TSS call in wild-type embryos. Each plot shows increased GRO-cap signal at the TSSs compared to the WB starts, indicating a vast improvement in gene models. The GRO-cap signal was averaged over 25 bp windows. n represents the number of genes in each stage having a TSS identified in that stage. 
RPM: reads per million.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.012">http://dx.doi.org/10.7554/eLife.00808.012</ext-link></p></caption><graphic xlink:href="elife00808fs006"/></fig><fig id="fig1s7" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.013</object-id><label>Figure 1—figure supplement 7.</label><caption><title>Heat maps showing GRO-seq and GRO-cap signal relative to either WB starts or TSSs for developmental stages reveal improvements in gene models.</title><p>(<bold>A</bold>)–(<bold>C</bold>) The GRO-seq signal was plotted, one gene per row, for each gene of a specific developmental stage relative to the WormBase (WB) starts. The genes were ordered from top to bottom with increasing distance between the transcription start site (TSS) and WB start. The GRO-seq signal was averaged across 15 bp windows. Darker red indicates more transcription. (<bold>D</bold>) Heat maps showing the GRO-cap signal from <italic>sdc-2</italic> mutant embryos plotted against either WB starts (left) or TSSs (right) called in wild-type embryos for 4246 genes. The genes were ordered with increasing distance between the TSS and WB start. 
RPKM: reads per kilobase per million; RPM: reads per million.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.013">http://dx.doi.org/10.7554/eLife.00808.013</ext-link></p></caption><graphic xlink:href="elife00808fs007"/></fig><fig id="fig1s8" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.014</object-id><label>Figure 1—figure supplement 8.</label><caption><title>TSSs can be far upstream of the previously annotated WB starts.</title><p>(<bold>A</bold>)–(<bold>C</bold>) Shown are GRO-seq signals, GRO-cap signals (TAP+ and TAP− or TAP− subtracted from TAP+ [E–F]), ChIP-chip signals of phospho ser 2 Pol II (from <xref ref-type="bibr" rid="bib53">Pferdehirt et al., 2011</xref>), and ChIP-chip signals of hypo-phosphorylated Pol II (8WG16 antibody, modENCODE_3545) for genes whose transcription start sites (TSSs) are far upstream of the WormBase (WB) starts. (<bold>A</bold>) The TSS for <italic>ZK1073.1</italic> is 5051 bp upstream of the WB start and 5101 bp upstream of the trans-splice acceptor site. (<bold>B</bold>) The TSS for <italic>ztf-22</italic> is 8927 bp upstream of the WB start and 8917 bp upstream of the trans-splice acceptor site. (<bold>C</bold>) The TSS for operon CEOP4336 is 6693 bp upstream of both the WB start and the first trans-splice acceptor site. The combination of continuous Pol II signal in the upstream regions and the lack of 3′ UTRs or polyA signals (<xref ref-type="bibr" rid="bib47">Mangone et al., 2010</xref>) in the upstream regions implies that transcription within the outron is not from sources other than the designated TSS. These results strongly support the argument that the GRO-cap signal paired with the continuous GRO-seq signal from the WB start defines true TSSs. 
RPKM: reads per kilobase per million; RPM: reads per million.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.014">http://dx.doi.org/10.7554/eLife.00808.014</ext-link></p></caption><graphic xlink:href="elife00808fs008"/></fig><fig id="fig1s9" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.015</object-id><label>Figure 1—figure supplement 9.</label><caption><title>GRO-cap revealed that 21 U-RNAs have a TSS 2 bp upstream of the mature RNA.</title><p>GRO-cap readily identified the transcription start sites (TSSs) of 21 U-RNAs from L3 larvae. To map TSSs, we determined the highest GRO-cap signal (TAP+ minus TAP−) within 10 bp of the 5′ end of 9148 mature, non-overlapping 21 U-RNAs. Of these RNAs, 5260 (57.5%) had a putative TSS with a GRO-cap Z-score greater than 3 in the 10 bp interval (p&lt;0.01). The TSSs for 4783 (91%) of the 21 U-RNAs with a called TSS were precisely 2 bp upstream of the mature RNA, indicating that 21 U-RNAs receive a 5′ cap and are processed to the mature sequence by removing the two 5′-most nucleotides.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.015">http://dx.doi.org/10.7554/eLife.00808.015</ext-link></p></caption><graphic xlink:href="elife00808fs009"/></fig><fig id="fig1s10" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.016</object-id><label>Figure 1—figure supplement 10.</label><caption><title>Features of promoters and TSSs.</title><p>(<bold>A</bold>) A transcription start site (TSS) can be far downstream of the WormBase (WB) start. The TSS for <italic>MO3C11.3</italic> is 2510 bp downstream of the WB start in all developmental stages examined. The TSS was identified for <italic>ymel-1</italic>, a downstream gene in the operon known to have either SL1 or SL2 RNA leaders on its mRNA. (<bold>B</bold>) WB gene models can be based on inaccurately predicted transcript isoforms. 
<italic>rpt-4</italic>, the first gene in an operon, has two annotated RNA isoforms in WB. Isoform a has a small annotated exon followed by an intron of greater than 2 kb. However, the only TSS in the region identified by GRO-cap in all developmental stages assayed is just upstream of the SL1 splice acceptor site for isoform b, implying that isoform a is incorrect or expressed in a stage not analyzed. (<bold>C</bold>) Identification of the TSS for a gene within an operon. <italic>tag-30</italic> mRNA is trans-spliced with an SL2 RNA, indicating it is processed from a polycistronic message, and it is trans-spliced with an SL1 RNA, indicating it is also transcribed from an independent promoter. GRO-cap identified the internal TSS for <italic>tag-30</italic>. (<bold>D</bold>) Genes can have two or more different RNA isoforms that share the same 5′ end but different 3′ ends. GRO-seq identifies the 3′ accumulation of Pol II corresponding to both 3′ ends. GRO-seq and corrected GRO-cap (TAP+ signal after subtracting TAP− signal) signals are shown for <italic>unc-84</italic>. RPKM: reads per kilobase per million; RPM: reads per million.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.016">http://dx.doi.org/10.7554/eLife.00808.016</ext-link></p></caption><graphic xlink:href="elife00808fs010"/></fig><fig id="fig1s11" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.017</object-id><label>Figure 1—figure supplement 11.</label><caption><title>Distances between the TSS and WB starts or the trans-splicing acceptor site.</title><p>(<bold>A</bold>) and (<bold>B</bold>) For all genes with a transcription start site (TSS) called in wild-type embryos, the difference between the TSS and WormBase (WB) start or SL1 trans-splice acceptor site was grouped in bins of 25 bp and plotted as a histogram. 
Positive distances mean that the TSS is upstream of the annotated WB start or trans-splice acceptor site, while negative distances mean the TSS is downstream. The dotted red line demarks the position where the TSS calls are the same as the WB starts or trans-splicing acceptor site. (<bold>A</bold>) Plot of distance between TSS and WB start. The prevalence of distances near zero suggests that many WB start positions are correct and likely reflects non-trans-spliced genes. (<bold>B</bold>) Plot of outron length, the distance between the TSS and site of SL1 attachment to RNAs (<xref ref-type="bibr" rid="bib3">Allen et al., 2011</xref>). The SL1 trans-splice acceptor sites correspond to the site with the highest number of SL1 reads. Genes with multiple isoforms were eliminated from the analysis if the most 5′ part of the isoform differed. Many outrons are in the 50–500 bp range, consistent with previous estimations, but our data show that outron length is often significantly longer, up to 14 kb.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.017">http://dx.doi.org/10.7554/eLife.00808.017</ext-link></p></caption><graphic xlink:href="elife00808fs011"/></fig><fig id="fig1s12" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.018</object-id><label>Figure 1—figure supplement 12.</label><caption><title>Comparison of enhancers in <xref ref-type="bibr" rid="bib10">Chen et al. (2013)</xref> and our annotated TSSs.</title><p>(<bold>A</bold>)–(<bold>C</bold>) Comparison of our GRO-seq and GRO-cap profiles to enhancer regions from <xref ref-type="bibr" rid="bib10">Chen et al. (2013)</xref>. (<bold>A</bold>) <xref ref-type="bibr" rid="bib10">Chen et al. (2013)</xref> analyzed scRNA sequencing data from mixed-stage embryos to annotate transcription start sites (TSSs). 
They required that clusters of scRNA signal (labeled as ‘TICs’ and shown in red or blue) be within 200 bp of the WormBase (WB) start to be annotated as a new TSS. They then classified as enhancers (shown in yellow) the scRNA clusters that are not associated with a gene, have specific chromatin modifications not associated with promoters, and overlap with transcription factor binding sites. Shown is a panel from their Figure 6A (© Genome Research, Cold Spring Harbor Laboratory Press). Our GRO-seq and GRO-cap (TAP+ minus TAP−) data from mixed-stage embryos are shown for the same genomic region upstream of the <italic>tol-1</italic> gene as in Figure 6A. Our data and their data are precisely aligned. In this example, spikes of GRO-cap signal correspond with TICs, and several GRO-cap spikes correspond to their newly annotated enhancers. Continuous ChIP-chip signal for hypo-phosphorylated Pol II (8WG16 antibody, modENCODE_3545) and continuous GRO-seq signal occur from the most upstream enhancer to the WB start. The GRO-seq signal increases in intensity as it passes GRO-cap spikes and TICs, suggesting that each transcription initiation event contributes to the cumulative Pol II signal, which stops increasing in intensity once it reaches the WB start. In addition, no 3′ UTRs or polyA sites were found in this <italic>tol-1</italic> upstream region from the data sets of <xref ref-type="bibr" rid="bib34">Jan et al. (2011)</xref> and <xref ref-type="bibr" rid="bib47">Mangone et al. (2010)</xref>, implying no <italic>tol-1-</italic>independent polyadenylated transcription units in the upstream region. This analysis suggests that for this region some enhancers are likely to be TSSs that give rise to full-length transcripts. (<bold>B</bold>) The TSS we called for <italic>agmo-1</italic> from GRO-cap and GRO-seq data corresponds to the enhancer called by <xref ref-type="bibr" rid="bib10">Chen et al. (2013)</xref>. 
ChIP-chip data from modENCODE for hypo-phosphorylated Pol II antibody and the lack of 3′ UTRs support the TSS call. The distance between the TSS and both the WB start and the trans-splice acceptor site is 2534 bp. This example further supports the proposal that a proportion of the enhancers are outron TSSs. (<bold>C</bold>) The TSS we called for <italic>pat-2</italic> from GRO-cap and GRO-seq data corresponds to one of the two enhancers called by <xref ref-type="bibr" rid="bib10">Chen et al. (2013)</xref> upstream of <italic>pat-2</italic>. ChIP-chip data from modENCODE for hypo-phosphorylated Pol II antibody and the lack of 3′ UTRs support the TSS call. The distance between the TSS and the WB start is 2878 bp, while the distance from the TSS to the trans-splice acceptor site is 2875 bp. This example further supports the proposal that a proportion of the enhancers are outron TSSs. The ChIP-chip signal 3′ of the <italic>pat-2</italic> 3′ UTR is from the genes on the opposite strand. RPKM: reads per kilobase per million; RPM: reads per million.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.018">http://dx.doi.org/10.7554/eLife.00808.018</ext-link></p></caption><graphic xlink:href="elife00808fs012"/></fig></fig-group></p><p>The two GRO-seq biological replicates for each stage had high statistical correlation throughout the genome (Spearman correlation, ρ &gt; 0.94) and across gene bodies (Spearman correlation, ρ &gt; 0.98) (<xref ref-type="fig" rid="fig1s1 fig1s2 fig1s3">Figure 1—figure supplements 1–3</xref> and <xref ref-type="supplementary-material" rid="SD1-data">Figure 1—source data 1</xref>). Gene expression levels calculated from GRO-seq data correlated well with expression data from microarrays and RNA-seq experiments (<xref ref-type="fig" rid="fig1s4">Figure 1—figure supplement 4</xref>). 
For the majority of expressed genes, we found continuous GRO-seq signal upstream of the WormBase (WB)-annotated transcription starts (<xref ref-type="fig" rid="fig1">Figure 1A,B</xref>), suggesting that GRO-seq reactions contain nascent RNAs with true 5′ ends.</p><p>To map TSSs unambiguously, we performed a series of enzymatic selections on our GRO-seq run-on RNAs to capture only those RNAs with a 5′ cap, the 7-methyl guanosine residue added shortly after transcription initiation (<xref ref-type="bibr" rid="bib56">Rasmussen and Lis, 1993</xref>). This modified GRO-seq procedure, called GRO-cap (<xref ref-type="fig" rid="fig2">Figure 2</xref>), enabled us to map TSSs with nucleotide resolution by tracking nascent RNAs prior to trans-splicing. Use of nascent RNAs without size selection also reduces the background from RNAs that are capped post-transcriptionally and increases the probability of identifying TSSs from promoters with low Pol II occupancy.<fig id="fig2" position="float"><object-id pub-id-type="doi">10.7554/eLife.00808.019</object-id><label>Figure 2.</label><caption><title>GRO-cap strategy for identifying TSSs.</title><p>GRO-cap is a modified form of GRO-seq that utilizes the tagging and extensive purification of nascent RNAs from GRO-seq (<xref ref-type="bibr" rid="bib16">Core et al., 2008</xref>) and then employs redundant enzymatic steps to enrich for RNAs with 5′ caps. Of particular importance to this study, GRO-cap permits analysis of RNAs prior to their co-transcriptional processing, which replaces true transcription start sites (TSSs) with trans-spliced leader RNAs in <italic>Caenorhabditis elegans</italic>. GRO-seq run-ons have been tuned to extend the length of nascent RNAs by only 100 nucleotides on average, thus minimizing any possibility that independent transcription units might be artifactually linked. 
In GRO-cap, nuclei are isolated and RNA polymerases are allowed to transcribe briefly in a run-on reaction in the presence of Br-UTP, as in GRO-seq. RNA is isolated but the base-hydrolysis step of GRO-seq is omitted to increase the probability of capturing nascent RNA molecules with a 5′ 7-methyl-GTP cap. BrU-RNAs made during the run-on reaction are enriched by selection with anti-BrdU beads to ensure the identification of true TSSs from capped nascent RNAs rather than 5′ ends from RNAs that received post-transcriptional capping (<xref ref-type="bibr" rid="bib23">Fejes-Toth et al., 2009</xref>). A 3′ RNA adapter (red) is ligated to the RNAs, followed by another round of bead enrichment. Selection against 5′ mono-phosphate RNAs that do not represent capped RNAs (and any carry-through 5′ RNA adapters) is achieved by sequential enzymatic treatment with Terminator exonuclease to degrade 5′ mono-phosphate RNAs and then alkaline phosphatase to remove 5′ phosphates from 5′ mono-phosphate RNAs resistant to the exonuclease. Half of the nuclear run-on (NRO) RNA pool is treated with tobacco acid pyrophosphatase (TAP+) to remove the 5′ cap from the RNA, thereby exposing a 5′ mono-phosphate. The other half is left untreated (TAP−) to provide a control population of residual 5′ mono-phosphate RNAs that never had 5′ caps. The 5′ mono-phosphate RNAs are ligated to 5′ RNA adapters (blue). The TAP+ and TAP− samples are prepared for Illumina sequencing as in GRO-seq by reverse transcription of RNA into DNA and then amplification of DNA from 5′ and 3′ adapter regions. We note that transcripts &lt;500 bp are captured most efficiently on Illumina sequencing platforms. The enriched TSS regions are identified by mapping the 5′ ends of the sequence reads back to the genome and comparing the TAP+ and TAP− sites to eliminate false TSSs. 
Comparing the GRO-cap candidate TSSs to the 5′ ends of transcription units defined by GRO-seq permits reliable assignment of TSSs to transcription units.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.019">http://dx.doi.org/10.7554/eLife.00808.019</ext-link></p></caption><graphic xlink:href="elife00808f002"/></fig></p><p>For genes that lack trans-splicing, the site of maximum GRO-cap signal was coincident with both the 5′-most GRO-seq signal and the WB-annotated start (<xref ref-type="fig" rid="fig1">Figure 1C</xref>), confirming that GRO-cap and GRO-seq together permit high-confidence mapping of TSSs. For trans-spliced genes with no previously identified TSSs, strong GRO-cap and GRO-seq signals were found upstream of the WB-annotated starts, and TSS calls were supported by uninterrupted GRO-seq signal between the maximum GRO-cap signal and the WB start (<xref ref-type="fig" rid="fig1">Figure 1A</xref>).</p><p>In total, a TSS was identified for 31.7% (6353 genes) of all <italic>C</italic>. <italic>elegans</italic> protein-coding genes from at least one of the three developmental stages examined (<xref ref-type="supplementary-material" rid="SD2-data">Figure 1—source data 2</xref>). Of our TSS calls, 77% are for genes shown previously to be trans-spliced (<xref ref-type="supplementary-material" rid="SD2-data">Figure 1—source data 2</xref>) (<xref ref-type="bibr" rid="bib3">Allen et al., 2011</xref>). Plotting the average GRO-seq and GRO-cap signals from each developmental stage relative to the TSSs from the same stage revealed the vast improvement in gene models (<xref ref-type="fig" rid="fig1">Figure 1F,G</xref> and <xref ref-type="fig" rid="fig1s5 fig1s6 fig1s7">Figure 1—figure supplements 5, 6 and 7A–C</xref>). 
Independent GRO-cap reactions from the same developmental stage gave very similar results (<xref ref-type="fig" rid="fig1">Figure 1G</xref> and <xref ref-type="fig" rid="fig1s7">Figure 1—figure supplement 7D</xref>).</p><p>TSSs were also annotated for the majority of genes encoding short non-coding RNAs such as snoRNAs, 21 U-RNAs, and microRNAs, including TSSs for the five polycistronic microRNA clusters (<xref ref-type="fig" rid="fig1">Figure 1C,E</xref> and <xref ref-type="supplementary-material" rid="SD3-data">Figure 1—source data 3</xref>) (see ‘Materials and methods’). The TSSs for the 21 U-RNAs were 2 bp upstream from the mature RNA (<xref ref-type="fig" rid="fig1s9">Figure 1—figure supplement 9</xref>).</p><p>In addition, the primary TSSs and gene composition were determined for many multigenic transcription units called operons in which the polycistronic pre-mRNAs are processed to monocistronic mRNAs through 3′ end formation and trans-splicing using SL2 RNA (<xref ref-type="fig" rid="fig1">Figure 1D,E</xref>) (<xref ref-type="bibr" rid="bib8">Blumenthal, 2012</xref>). Identification of the TSS for one operon showed it to include a 5′ gene not previously ascribed to it (<xref ref-type="fig" rid="fig1">Figure 1D</xref>). For several operons, the TSS was far downstream (&gt;2 kb) of the WB start, a result we also found for genes not included in operons (<xref ref-type="fig" rid="fig1s10">Figure 1—figure supplement 10A,B</xref>). Within operons some genes have independent promoters to transcribe their pre-mRNAs (<xref ref-type="bibr" rid="bib3">Allen et al., 2011</xref>), and their TSSs were determined, including snoRNA genes within introns of internal genes (<xref ref-type="fig" rid="fig1">Figure 1E</xref> and <xref ref-type="fig" rid="fig1s10">Figure 1—figure supplement 10A,C</xref>). 
In general, we used conservative statistical criteria for calling TSSs, and additional TSSs can be identified by visual inspection of the data.</p><p>The new TSS calls revealed promoters to be further upstream from the 5′ ends of mature mRNAs than previously thought, as demonstrated by heat maps showing the GRO-seq signal or GRO-cap signal of re-annotated genes relative to WB starts or TSSs (<xref ref-type="fig" rid="fig1">Figure 1F,G</xref> and <xref ref-type="fig" rid="fig1s7">Figure 1—figure supplement 7A–D</xref>) and histograms showing distances between TSSs and WB starts or SL1 trans-splice acceptor sites (TSA) (<xref ref-type="fig" rid="fig1s11">Figure 1—figure supplement 11A,B</xref>). The TSS-to-TSA region, called the outron, was previously thought to be 50–500 bp long. We found instead that outrons can be as long as 14 kb and have a median of 260 bp and mean of 753 bp (<xref ref-type="fig" rid="fig1s11">Figure 1—figure supplement 11B</xref>). Fully 59% of outrons are longer than 200 bp, 21% are longer than 1 kb, and 2.3% are longer than 5 kb (e.g., <xref ref-type="fig" rid="fig1s8">Figure 1—figure supplement 8A–C</xref>).</p><p>Multiple lines of evidence indicate that the GRO-seq signal between newly called TSSs and previously identified TSAs reflects legitimate outrons rather than independent overlapping upstream transcripts. First, 3′ UTRs or polyA signals are rare in outrons of &gt;1 kb in length, indicating the engaged Pol II is not from independent polyadenylated transcripts. From 565 such outrons, only 1.4% had an identified 3′ UTR in the <xref ref-type="bibr" rid="bib34">Jan et al. (2011)</xref> study, and 0.7% had a 3′ UTR in the <xref ref-type="bibr" rid="bib47">Mangone et al. (2010)</xref> study. Furthermore, only 3.5% had a polyA site (<xref ref-type="bibr" rid="bib47">Mangone et al., 2010</xref>). 
Second, regions corresponding to long outrons have a continuous ChIP-chip signal from antibodies enriched for either the ser2 phosphorylated form of Pol II or the hypo-phosphorylated form of Pol II (<xref ref-type="fig" rid="fig1s8">Figure 1—figure supplement 8A–C</xref>). These results and the restricted run-on length of ∼100 nucleotides (<xref ref-type="bibr" rid="bib16">Core et al., 2008</xref>) indicate that the GRO-seq signal corresponds to bound Pol II in vivo and is not an artifact of the nuclear run-on (NRO) reactions in vitro extending beyond the 3′ ends defined in vivo (<xref ref-type="fig" rid="fig1s8">Figure 1—figure supplement 8A–C</xref> and <xref ref-type="fig" rid="fig3">Figure 3A</xref>). Third, a heat map of individual genes showing the GRO-cap signal relative to TSSs reveals that a dominant TSS contributes the vast majority of the GRO-cap signal (<xref ref-type="fig" rid="fig1">Figure 1G</xref>). Together these observations strongly support the argument that the GRO-cap signal paired with the continuous GRO-seq signal from WB starts defines true TSSs. Recently published studies of TSSs in <italic>C</italic>. <italic>elegans</italic> that used cutoffs for TSS calls of either 1 kb upstream (<xref ref-type="bibr" rid="bib28">Gu et al., 2012</xref>) or 200 bp upstream (<xref ref-type="bibr" rid="bib10">Chen et al., 2013</xref>) of WB starts identified some outron TSSs but could not identify TSSs for a large class of genes with longer outrons (see ‘Discussion’).<fig-group><fig id="fig3" position="float"><object-id pub-id-type="doi">10.7554/eLife.00808.020</object-id><label>Figure 3.</label><caption><title>Features of promoters and TSSs.</title><p>(<bold>A</bold>) Trans-spliced genes can have multiple transcription start sites (TSSs), suggesting that trans-splicing eliminates the pressure to have only one precise TSS per gene. 
Shown are GRO-seq and corrected GRO-cap (TAP+ signal after subtracting TAP− signal) signals with ChIP-seq data of hypo-phosphorylated Pol II (8WG16 antibody, modENCODE_2439) for the trans-spliced gene <italic>sca-1</italic> expressed in the L3 larval stage. The total GRO-seq signal becomes more intense as additional TSSs (from 5′ to 3′) contribute to the pool of engaged Pol II molecules that transcribe through the upstream regulatory region of <italic>sca-1</italic>. The combination of continuous Pol II signal in the upstream region and the lack of 3′ UTRs or polyA signals (<xref ref-type="bibr" rid="bib47">Mangone et al., 2010</xref>) strengthens the interpretation that the GRO-cap signal combined with the continuous GRO-seq signal identified true TSSs for <italic>sca-1</italic>. From left to right, the TSSs reside upstream of the WormBase (WB) gene model by 5728 bp, 4582 bp, 3044 bp, 1669 bp, and 159 bp. The ChIP-seq signal 3′ of the <italic>sca-1</italic> 3′ UTR is from the <italic>klp-7</italic> gene on the opposite strand. (<bold>B</bold>) A gene can use different primary TSSs in different developmental states. The primary TSS for <italic>tag-294</italic> in embryos and L3 larvae is 1529 bp upstream of the WB start, while the primary TSS in starved L1 larvae is 656 bp upstream. DNA sequences flanking newly annotated TSSs have evolutionarily conserved core promoter elements, including (<bold>C</bold>) TATA-box elements and (<bold>D</bold>) initiator elements (Inr). Of 4547 embryo genes with TSSs, 162 genes (3.6%) have a TATA element with a perfect match to the consensus 15–45 bp upstream of it, and 745 genes (16.4%) have an Inr with the adenine residing at the TSS (+1 bp). Consensus sequences for TATA elements and the Inr are above the graphs. 
RPKM: reads per kilobase per million; RPM: reads per million.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.020">http://dx.doi.org/10.7554/eLife.00808.020</ext-link></p></caption><graphic xlink:href="elife00808f003"/></fig><fig id="fig3s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.021</object-id><label>Figure 3—figure supplement 1.</label><caption><title>Evolutionarily conserved promoter elements.</title><p>(<bold>A</bold>) A TATA box element with one or no mismatch from the consensus (TATAWAWR) is highly enriched 15–45 bp upstream of transcription start sites (TSSs) for 391 of 4547 <italic>Caenorhabditis elegans</italic> genes. The consensus derived from the 578 elements in this region is above the histogram. (<bold>B</bold>) The core promoter region is highly conserved across nematode species. The UCSC Genome Browser uses phastCons (<ext-link ext-link-type="uri" xlink:href="http://compgen.bscb.cornell.edu/phast/">http://compgen.bscb.cornell.edu/phast/</ext-link>) to investigate DNA sequence conservation across seven Caenorhabditis species. Values range from 0 (no conservation) to 1 (highest conservation) for each base pair. We calculated the average DNA conservation in a 2 kb window surrounding the new TSSs (top) and WormBase (WB) starts (bottom) and found substantial conservation in each location, likely for different reasons. 
The conservation at TSSs likely reflects conservation of core promoter elements, and the conservation near WB starts likely reflects conservation at the junction between trans-splice acceptor site and first exon.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.021">http://dx.doi.org/10.7554/eLife.00808.021</ext-link></p></caption><graphic xlink:href="elife00808fs013"/></fig><fig id="fig3s2" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.022</object-id><label>Figure 3—figure supplement 2.</label><caption><title>Conserved core promoter elements in promoters of microRNA genes.</title><p>(<bold>A</bold>) A TATA box element with a perfect match to the consensus derived for genes encoding microRNAs was highly enriched 29–32 bp upstream of transcription start sites (TSSs) for 15 of 57 microRNA genes. The consensus is shown above the histogram. The distance between the TATA element and the TSS was calculated from the 3′-most base of the TATA element. (<bold>B</bold>) A TATA element with one or no mismatch from the consensus is highly enriched 15–45 bp upstream of TSSs for 24 of 57 microRNA genes. The consensus derived from the 31 elements in this region is above the histogram. (<bold>C</bold>) Inr elements are enriched in promoters of microRNA genes. Of 57 microRNA genes, 21 have an Inr with an adenine at the +1 position of the TSS. The consensus derived from the 38 Inr motifs in the region is above the histogram.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.022">http://dx.doi.org/10.7554/eLife.00808.022</ext-link></p></caption><graphic xlink:href="elife00808fs014"/></fig></fig-group></p></sec><sec id="s2-2"><title>Features of <italic>C. elegans</italic> promoters</title><p>Several other noteworthy features of promoters emerged from this comprehensive mapping of <italic>C</italic>. <italic>elegans</italic> transcription units and TSSs. 
(1) Many trans-spliced genes have multiple TSSs (<xref ref-type="fig" rid="fig3">Figure 3A</xref>), suggesting that trans-splicing has removed the selective pressure to form promoters with a single TSS. (2) Genes can use different TSSs across developmental stages, indicating developmental stage-specific regulation of transcription initiation (<xref ref-type="fig" rid="fig3">Figure 3B</xref>). (3) DNA sequences flanking the newly annotated TSSs have strong sequence conservation across nematode species (<xref ref-type="fig" rid="fig3s1">Figure 3—figure supplement 1B</xref>) and also have evolutionarily conserved core promoter elements, including the TATA-box (worm consensus TATAWAWR) (<xref ref-type="fig" rid="fig3">Figure 3C</xref> and <xref ref-type="fig" rid="fig3s1 fig3s2">Figure 3—figure supplements 1A and 2A,B</xref>) and the initiator element (Inr) (worm consensus YCAYTY) (<xref ref-type="fig" rid="fig3">Figure 3D</xref> and <xref ref-type="fig" rid="fig3s2">Figure 3—figure supplement 2C</xref>), both of which facilitate formation of the Pol II pre-initiation complex (<xref ref-type="bibr" rid="bib38">Juven-Gershon and Kadonaga, 2010</xref>).</p></sec><sec id="s2-3"><title>Features of <italic>C. elegans</italic> transcription: abundant 3′ Pol II accumulation and divergent transcription, but scarce promoter-proximal pausing</title><p>Three prominent features of transcription emerged. Accumulation of Pol II at 3′ ends of genes is abundant (<xref ref-type="fig" rid="fig4">Figure 4A–C</xref> and <xref ref-type="fig" rid="fig4s1">Figure 4—figure supplement 1A,B</xref>), as is divergent transcription from promoters lacking upstream divergent genes (<xref ref-type="fig" rid="fig4">Figure 4D–F</xref>). In contrast, promoter-proximal RNA Pol II pausing in <italic>C</italic>. 
<italic>elegans</italic> is rare under normal growth conditions, unlike in other metazoans, as shown later in ‘Results’.<fig-group><fig id="fig4" position="float"><object-id pub-id-type="doi">10.7554/eLife.00808.023</object-id><label>Figure 4.</label><caption><title>Features of <italic>Caenorhabditis</italic> <italic>elegans</italic> transcription: 3′ Pol II pausing and divergent transcription.</title><p>(<bold>A</bold>)–(<bold>C</bold>) Pol II 3′ accumulation is prevalent in worms. 3′ End pausing ratios were calculated by dividing the highest average GRO-seq signal at the 3′ end by the average GRO-seq signal in the gene body. (<bold>A</bold>) A histogram of the 3′ end pausing ratios shows 3′ accumulation of Pol II is more extensive in <italic>Caenorhabditis elegans</italic> than in Drosophila. The histogram compares 3′ accumulation for 3984 genes expressed in <italic>C</italic>. <italic>elegans</italic> embryos with 6107 genes expressed in Drosophila cell lines. (<bold>B</bold>) The GRO-seq signal surrounding the 3′ end (cleavage and polyadenylation site [CPS]) was averaged for genes in or not in operons. Genes at the beginning (n = 430) and middle (n = 276) of operons were more highly expressed and had higher 3′ accumulation than genes at the end (n = 474) of operons or not in operons (n = 5048). Genes plotted had to be greater than 3 kb in length. (<bold>C</bold>) 3′ End pausing ratios were calculated for all classes of genes in (<bold>B</bold>) and plotted as boxplots. For this analysis, genes had to be greater than 3 kb in length and the gene body RPKM (reads per kilobase per million) had to be ≥1. Genes found at the beginning (n = 415) and middle (n = 275) of operons had higher 3′ pausing than genes at the end (n = 467) of operons. Genes lacking a downstream gene had similar 3′ pausing ratios whether or not (n = 3670) they were in operons. The 3′ pausing ratios for genes in all classes were greater than for Drosophila genes (n = 3260). 
(<bold>D</bold>)–(<bold>F</bold>) Upstream divergent transcription is common at promoters of <italic>C</italic>. <italic>elegans</italic> genes. GRO-seq and GRO-cap profiles show transcription of a divergent gene pair (<bold>D</bold>) or divergent transcription from a promoter without an upstream divergent gene partner (<bold>E</bold>). Gene on plus strand (red gene and signal). Gene on minus strand (blue gene and signal). (<bold>F</bold>) Upstream divergent transcription from <italic>C</italic>. <italic>elegans</italic> promoters is intermediate between that in humans and Drosophila. Plot compares the log<sub>2</sub>(sense/antisense) transcription ratio of human and fly promoters to <italic>C</italic>. <italic>elegans</italic> promoters without divergent gene pairs. The median log<sub>2</sub> ratios are 0.3 for humans, 2.3 for <italic>C</italic>. <italic>elegans</italic>, and 5.0 for Drosophila. RPM: reads per million.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.023">http://dx.doi.org/10.7554/eLife.00808.023</ext-link></p><p><supplementary-material id="SD4-data"><object-id pub-id-type="doi">10.7554/eLife.00808.024</object-id><label>Figure 4—source data 1.</label><caption><title>Gene expression, and 5′ pausing and 3′ pausing data for protein-coding genes.</title><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.024">http://dx.doi.org/10.7554/eLife.00808.024</ext-link></p></caption><media mime-subtype="xls" mimetype="application" xlink:href="elife00808s004.xls"/></supplementary-material></p></caption><graphic xlink:href="elife00808f004"/></fig><fig id="fig4s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.025</object-id><label>Figure 4—figure supplement 1.</label><caption><title>The 3′ accumulation of RNA polymerase II.</title><p>(<bold>A</bold>) The 3′ accumulation of RNA Pol II is positively correlated with gene expression. 
The maximum average GRO-seq signal from control RNAi embryos was calculated across a 200 bp window at the 3′ end of each gene greater than 1.1 kb with an average expression of greater than 1 RPKM (reads per kilobase per million) in the gene body. This 3′ end GRO-seq average was compared with the average GRO-seq expression of the same genes in a scatter plot. The positive correlation of 3′ RNA Pol II accumulation and level of gene expression is reflected by the Spearman correlation of 0.776. (<bold>B</bold>) The 3′ pausing ratios are greater in <italic>Caenorhabditis elegans</italic> than in Drosophila. To calculate 3′ pausing ratios, the maximum average GRO-seq signal across a 200 bp window near the 3′ end was divided by the average GRO-seq signal in the gene body. The distribution of ratios from embryos, starved L1s, L3s, and Drosophila S2 cells was plotted. Although the distributions of all <italic>C</italic>. <italic>elegans</italic> states were significantly different from Drosophila S2 cells (Mann–Whitney U test, p&lt;0.0001 for all comparisons to S2 cells), the 3′ pausing ratios got progressively smaller for the later <italic>C</italic>. 
<italic>elegans</italic> developmental stages.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.025">http://dx.doi.org/10.7554/eLife.00808.025</ext-link></p></caption><graphic xlink:href="elife00808fs015"/></fig><fig id="fig4s2" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.026</object-id><label>Figure 4—figure supplement 2.</label><caption><title>RNA polymerase II accumulation at 3′ ends compared against trans-spliced genes, non-trans-spliced genes, and U-rich regions at 3′ ends.</title><p>(<bold>A</bold>) Shown are the 3′ pausing ratios for the first and middle genes in operons (n = 949), last genes in operons (n = 625), monocistronic genes with trans-splicing (n = 603), monocistronic genes without trans-splicing (n = 603), and Drosophila genes (n = 3942) plotted as boxplots. All genes in this analysis had to be ≥2 kb and have a gene body RPKM (reads per kilobase per million) ≥5. For comparison of monocistronic gene sets, each monocistronic gene with trans-splicing had a monocistronic gene without trans-splicing of equivalent expression level. The first and middle genes in operons had the highest 3′ pausing ratio. Monocistronic genes with trans-splicing had a slightly higher 3′ pausing ratio than last genes in operons. Monocistronic genes with trans-splicing had a higher 3′ pausing ratio than monocistronic genes lacking trans-splicing (Mann–Whitney U p&lt;10<sup>−10</sup>). The 3′ pausing ratios for genes in all classes were greater than for Drosophila genes. Monocistronic genes were classified as trans-spliced or not trans-spliced depending on whether an SL leader was identified in <xref ref-type="bibr" rid="bib3">Allen et al. (2011)</xref>. (<bold>B</bold>) and (<bold>C</bold>) Accumulation of Pol II GRO-seq signal at 3′ ends (cleavage and polyadenylation site [CPS]) does not overlap with U-rich regions. 
(<bold>B</bold>) Accumulation of Pol II at 3′ ends of the first and middle genes in operons (n = 706), last genes in operons (n = 474), and genes not in operons (n = 5048). (<bold>C</bold>) Plot of relative U-richness for the gene sets in (<bold>B</bold>). The proportion of genes with a U at each base pair was calculated and then averaged over 25 bp windows. The vertical green line in (<bold>B</bold>) is aligned with the peak of 3′ GRO-seq signal and is drawn at the same position in (<bold>C</bold>).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.026">http://dx.doi.org/10.7554/eLife.00808.026</ext-link></p></caption><graphic xlink:href="elife00808fs016"/></fig><fig id="fig4s3" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.027</object-id><label>Figure 4—figure supplement 3.</label><caption><title>Divergent transcription in <italic>Caenorhabditis elegans</italic>.</title><p>(<bold>A</bold>) and (<bold>B</bold>) Comparison of average GRO-seq signal around promoters with a divergent gene pair vs promoters not associated with a divergent gene partner. The genes analyzed had a log<sub>2</sub>(sense/antisense) of ≤1.5. The upstream divergent transcripts from a promoter without a divergent gene partner are shorter and less abundant than those from a promoter of a divergent gene pair. (<bold>C</bold>) Comparison of the GRO-cap signal around promoters of divergent gene pairs (blue) and promoters not associated with a divergent gene partner. Upstream divergent transcription (also referred to as antisense) begins at approximately the same distance from the transcription start site (TSS) of a gene whether or not a divergent gene partner is present. The GRO-cap signal was only evaluated in this analysis for genes having a log<sub>2</sub>(sense/antisense) ratio ≤1.5. Distance was calculated between the TSS and the maximum antisense GRO-cap signal within 500 bp. 
(<bold>D</bold>) and (<bold>E</bold>) Promoters having a TATA box matching the consensus or having a TATA with no or one mismatch preferentially transcribe in a single direction when no divergent gene is upstream. Comparison of the average GRO-seq signal from control RNAi embryos in sense and antisense directions for a 3 kb window surrounding the TSS. RPKM: reads per kilobase per million.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.027">http://dx.doi.org/10.7554/eLife.00808.027</ext-link></p></caption><graphic xlink:href="elife00808fs017"/></fig></fig-group></p><p>Pol II 3′ accumulation, likely caused by slow 3′ end formation and RNA processing (<xref ref-type="bibr" rid="bib27">Gromak et al., 2006</xref>), is positively correlated in <italic>C</italic>. <italic>elegans</italic> with the expression level of the gene (<xref ref-type="fig" rid="fig4s1">Figure 4—figure supplement 1A</xref> and <xref ref-type="supplementary-material" rid="SD4-data">Figure 4—source data 1</xref>) and is more extensive in <italic>C</italic>. <italic>elegans</italic> than in <italic>Drosophila</italic> (<xref ref-type="fig" rid="fig4">Figure 4A</xref> and <xref ref-type="fig" rid="fig4s1">Figure 4—figure supplement 1B</xref>). Multiple peaks of 3′ accumulation within a gene help identify genes having several isoforms with distinct 3′ ends (<xref ref-type="fig" rid="fig1s10">Figure 1—figure supplement 10D</xref>). The prevalence in <italic>C</italic>. <italic>elegans</italic> of polycistronic operons requiring extensive RNA processing to produce monocistronic mRNAs caused us to ask whether 3′ Pol II accumulation was positively correlated with the gene’s position in an operon and hence the level of RNA processing. 
First and middle genes in an operon require two forms of co-transcriptional RNA processing at the 3′ end to generate monocistronic mRNAs: polyadenylation of the upstream gene and trans-splicing of the downstream gene. The terminal gene requires only polyadenylation at the 3′ end. First and middle genes showed more 3′ accumulation than terminal genes or genes not in operons (<xref ref-type="fig" rid="fig4">Figure 4B,C</xref>). In other words, 3′ pausing is longer at genes with another gene to process just downstream. The 3′ Pol II accumulation at terminal genes and genes not in operons was equivalent, yet greater than in <italic>Drosophila</italic> genes. Lastly, 3′ accumulation was similar between terminal genes of operons and monocistronic genes undergoing trans-splicing, and both gene sets had greater 3′ accumulation than monocistronic genes lacking trans-splicing, which nonetheless had more 3′ accumulation than <italic>Drosophila</italic> genes (<xref ref-type="fig" rid="fig4s2">Figure 4—figure supplement 2A</xref>). Cues triggering trans-splicing, particularly operon-specific trans-splicing, appear to facilitate 3′ accumulation and perhaps predispose <italic>C</italic>. <italic>elegans</italic> Pol II to greater 3′ pausing genome-wide, whether or not a gene resides in an operon. Pol II accumulation at 3′ ends does not overlap with U-rich regions at 3′ ends; therefore, the high GRO-seq signal is not due to selective enrichment of U-rich RNAs (<xref ref-type="fig" rid="fig4s2">Figure 4—figure supplement 2B,C</xref>).</p><p>GRO-seq readily detects divergent transcription from promoters. In <italic>C</italic>. <italic>elegans</italic>, divergent transcripts are short and initiated 75–150 bp upstream from TSSs of promoters lacking upstream divergent genes (<xref ref-type="fig" rid="fig4">Figure 4E</xref> and <xref ref-type="fig" rid="fig4s3">Figure 4—figure supplement 3A–C</xref>). 
The frequency of upstream divergent transcription in <italic>C</italic>. <italic>elegans</italic> appears to be intermediate in degree between that in mammals, where it occurs at the majority of active promoters (<xref ref-type="bibr" rid="bib39">Kapranov et al., 2007</xref>; <xref ref-type="bibr" rid="bib16">Core et al., 2008</xref>; <xref ref-type="bibr" rid="bib58">Seila et al., 2008</xref>), and that in <italic>Drosophila</italic>, where it occurs only rarely (<xref ref-type="bibr" rid="bib52">Nechaev et al., 2010</xref>; <xref ref-type="bibr" rid="bib15">Core et al., 2012</xref>) (<xref ref-type="fig" rid="fig4">Figure 4F</xref>). For both mammals and worms, promoters with TATA elements rarely support divergent transcription (<xref ref-type="fig" rid="fig4s3">Figure 4—figure supplement 3D,E</xref>) (<xref ref-type="bibr" rid="bib15">Core et al., 2012</xref>). These results underscore fundamental similarities and differences in the architecture, evolution, and function of eukaryotic promoters.</p></sec><sec id="s2-4"><title>Genome-wide consequences of disrupting dosage compensation</title><p>Analysis of transcriptionally engaged Pol II in wild-type vs dosage-compensation-defective embryos provided a robust assessment of the genome-wide impact of disrupting dosage compensation. To assess dosage compensation, we disrupted <italic>sdc-2</italic> (<underline>s</underline>ex determination and <underline>d</underline>osage <underline>c</underline>ompensation), the central XX-specific factor that triggers assembly of all DCC components onto X and induces hermaphrodite sexual differentiation by repressing the autosomal male sex-determining gene <italic>her-1</italic> (<xref ref-type="fig" rid="fig5">Figure 5A</xref>) (<xref ref-type="bibr" rid="bib18">Dawes et al., 1999</xref>). 
Without <italic>sdc-2</italic>, DCC subunits fail to bind to X, and <italic>her-1</italic> is expressed, causing XX embryos to become severely masculinized and die from overexpression of X-chromosome genes (<xref ref-type="fig" rid="fig5">Figure 5A</xref>). The pivotal role of <italic>sdc-2</italic> in dosage compensation made its depletion the most effective way to disrupt dosage compensation, although depletion of any DCC condensin subunit causes similar XX lethality and elevation of X gene expression, as shown previously by our genome-wide measurements of gene expression (<xref ref-type="bibr" rid="bib35">Jans et al., 2009</xref>).<fig-group><fig id="fig5" position="float"><object-id pub-id-type="doi">10.7554/eLife.00808.028</object-id><label>Figure 5.</label><caption><title>GRO-seq analysis of dosage compensation.</title><p>(<bold>A</bold>) Genetic hierarchy for coordinate control of sex determination and dosage compensation. <italic>sdc-2</italic> is expressed solely in XX embryos and triggers the hermaphrodite fate. <italic>sdc-2</italic> acts together with <italic>sdc-1</italic> and <italic>sdc-3</italic>, both zinc finger proteins, to induce hermaphrodite sexual development by repressing transcription of the male sex-determining gene <italic>her-1</italic>. <italic>sdc-2</italic> acts together with <italic>sdc-3</italic> and <italic>dpy-30</italic>, also a member of the MLL/COMPASS gene activating complex, to load the DCC onto X and thereby turn dosage compensation on. <italic>sdc-2</italic> is the single gene required for all DCC components to assemble onto X. Without <italic>sdc-2</italic>, <italic>her-1</italic> is expressed, causing sexual transformation of XX embryos to the male fate, and the DCC fails to assemble onto X, causing severe dosage compensation disruption and the death of all XX embryos. The DCC contains not only the X loaders (red, orange) but also five homologs of the mitotic condensin complex (yellow, blue, green). 
The DCC binds to the X chromosomes of only XX animals to reduce transcription by half, thereby equalizing X-chromosome gene expression between males (XO) and hermaphrodites (XX). (<bold>B</bold>) GRO-seq shows that both RNA isoforms of <italic>her-1</italic> are elevated in <italic>sdc-2</italic> XX mutants. <italic>her-1</italic> is expressed at such a low level in XX embryos that the two transcription start sites (TSSs) are only evident with GRO-cap in the <italic>sdc-2</italic> mutants. The gene model (red arrow) incorporates 3′ end data from <xref ref-type="bibr" rid="bib34">Jan et al. (2011)</xref>. (<bold>C</bold>) and (<bold>D</bold>) The X-linked protein coding gene <italic>pdi-2</italic> is elevated in expression in <italic>sdc-2</italic> mutants as is the gene encoding the <italic>mir-62</italic> microRNA. For <italic>pdi-2</italic>, the elevation starts at the TSS and is evident throughout the gene. Red arrows show our re-annotated gene models. (<bold>E</bold>) The DCC subunit DPY-27 binds just upstream of the TSS. Comparison of the average DPY-27 ChIP-seq signal relative to WormBase (WB) starts and TSSs of X-linked genes. RPKM: reads per kilobase per million; RPM: reads per million.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.028">http://dx.doi.org/10.7554/eLife.00808.028</ext-link></p><p><supplementary-material id="SD5-data"><object-id pub-id-type="doi">10.7554/eLife.00808.029</object-id><label>Figure 5—source data 1.</label><caption><title>Genome-wide changes in gene expression caused by the disruption of dosage compensation.</title><p>Genes were separated into different gene sets based on their length, origin, and chromosome (X vs autosome) to compare GRO-seq gene expression between the <italic>sdc-2</italic> mutant and control RNAi embryos. 
Shown are the total number of genes, the median <italic>sdc-2</italic> mutant/control RNAi expression ratio, the average <italic>sdc-2</italic> mutant/control RNAi expression ratio together with the standard error of the mean, and the number of genes in each set that are more highly expressed in each condition. Across numerous protein-coding gene sets, the X chromosome is more highly expressed and the autosomes are slightly less expressed in the <italic>sdc-2</italic> mutant. Furthermore, when X-linked genes are significantly changed in expression, they are almost exclusively increased in expression. Each gene set is separated into two sets, one containing all genes and the other containing genes that are significantly changed in expression as determined by analysis with DESeq (p&lt;0.05) (<xref ref-type="bibr" rid="bib4">Anders and Huber, 2010</xref>). For the first two lists (labeled with ‘≥250 bp’), average GRO-seq gene expression was calculated from the beginning to the end of either the WormBase (WB) model or the newly annotated transcription start site (TSS) gene model. WB genes had to be expressed at greater than 1 RPKM (reads per kilobase per million), and have at least 250 uniquely mappable bases in both sets. For the next set (labeled with ‘WormBase WS230 Genes ≥1.1 kb’), average GRO-seq expression was calculated for genes greater than 1.1 kb, with the first and last 300 bp of the gene excluded. The level of expression had to be ≥1 RPKM for WB genes, and have at least 250 uniquely mappable bases for a gene to be included. For the final two sets of genes (labeled with ‘≥1.1 kb’ and ‘≥1.5 kb’), expression was calculated for genes of the indicated length that have a newly annotated TSS, with the first and last 300 bp of the gene excluded. A gene had to have at least 250 uniquely mappable bases for it to be included. RNA polymerase II transcribed microRNAs are controlled by dosage compensation, while RNA polymerase III transcribed tRNAs are not. 
Average GRO-seq gene expression from <italic>sdc-2</italic> mutant and control RNAi embryos was compared across ncRNAs. For microRNAs, expression values were calculated from the full length of the WB ‘primary transcript’ or re-annotated TSS gene models. For tRNAs, expression values were calculated from the beginning of the ‘mature transcript’ to 50 bp downstream of the stop. Because tRNAs are highly repetitive and transcription of highly transcribed tRNAs continues downstream of the stop, the extra 50 bp was included to increase the unique mappability of each tRNA. For a gene to be considered for analysis, it had to have at least 25 bp of uniquely mappable DNA, and to have an average expression of at least 1 RPKM in both control RNAi and <italic>sdc-2</italic> mutant embryos. The median and mean <italic>sdc-2</italic>/control expression levels show that X-linked microRNAs are more susceptible to dosage compensation than autosomal microRNAs. X-linked tRNAs are decreased slightly in expression in the <italic>sdc-2</italic> mutant, suggesting that their expression is not controlled by dosage compensation.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.029">http://dx.doi.org/10.7554/eLife.00808.029</ext-link></p></caption><media mime-subtype="xls" mimetype="application" xlink:href="elife00808s005.xls"/></supplementary-material></p></caption><graphic xlink:href="elife00808f005"/></fig><fig id="fig5s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.030</object-id><label>Figure 5—figure supplement 1.</label><caption><title>Western blot shows the reduction in SDC-2 protein levels in <italic>sdc-2(y93, RNAi)</italic> animals.</title><p>The <italic>sdc-2(y93)</italic> partial-loss-of-function mutant was treated with RNAi against <italic>sdc-2</italic> to reduce its gene activity. 
Extracts from wild-type and <italic>sdc-2</italic> mutant embryos were fractionated on an SDS-PAGE gel, transferred to a membrane, and probed with antibodies to SDC-2 and α-tubulin as a loading control. SDC-2 is less abundant in the RNAi-treated mutant than in wild-type embryos. The mutant SDC-2 protein has a lower molecular weight than the wild-type protein because the <italic>y93</italic> allele is an in-frame deletion.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.030">http://dx.doi.org/10.7554/eLife.00808.030</ext-link></p></caption><graphic xlink:href="elife00808fs018"/></fig><fig id="fig5s2" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.031</object-id><label>Figure 5—figure supplement 2.</label><caption><title>X-linked gene expression is selectively increased in <italic>sdc-2</italic> mutants.</title><p>(<bold>A</bold>)–(<bold>D</bold>) Scatter plots of the average GRO-seq signal (log<sub>2</sub> RPKM) from RNAi control and <italic>sdc-2</italic> mutant embryos from the bodies of genes on X or autosomes. The genes had to be ≥1.1 kb and had to have at least 250 uniquely mappable bp in the gene body. (<bold>A</bold>)–(<bold>B</bold>) Whether the group of genes on X contains the subset of genes with transcription start sites (TSSs) or the larger set of genes annotated in WormBase (WB), the genes are more highly expressed in the <italic>sdc-2</italic> mutants. (<bold>C</bold>)–(<bold>D</bold>) Whether the group of genes on autosomes contains the subset of genes with TSSs or the larger set of genes annotated in WB, the genes are fairly equivalently expressed in <italic>sdc-2</italic> mutant embryos and control RNAi embryos. 
RPKM: reads per kilobase per million.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.031">http://dx.doi.org/10.7554/eLife.00808.031</ext-link></p></caption><graphic xlink:href="elife00808fs019"/></fig><fig id="fig5s3" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.032</object-id><label>Figure 5—figure supplement 3.</label><caption><title>Occupancy of the DCC subunit DPY-27 in the promoter of a gene is correlated with the gene’s expression level but not its dosage compensation status.</title><p>The total DPY-27 ChIP-seq signal was calculated for the 500 bp window upstream of the transcription start sites (TSSs) for X-linked genes and compared with the expression level of the genes in control RNAi embryos (<bold>A</bold>) and the change in expression levels between <italic>sdc-2</italic> mutant and control RNAi embryos (<bold>B</bold>). (<bold>A</bold>) DPY-27 has greater occupancy in the promoters of the more highly expressed genes. The scatter plot compares the log<sub>2</sub> RPM of the DPY-27 ChIP-seq signal with the log<sub>2</sub>(RPKM) (reads per kilobase per million) of the GRO-seq signal from control RNAi embryos. The two are positively correlated (Spearman correlation coefficient of 0.428). (<bold>B</bold>) The level of DPY-27 occupancy in the promoter of the gene does not predict whether the expression of the gene will change in response to the disruption of dosage compensation. The scatter plot compares the log<sub>2</sub>(RPM) of the DPY-27 ChIP-seq signal with the log<sub>2</sub>(<italic>sdc-2</italic> mutant/control RNAi) expression difference. No correlation was found (Spearman correlation coefficient of −0.003). 
RPM: reads per million.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.032">http://dx.doi.org/10.7554/eLife.00808.032</ext-link></p></caption><graphic xlink:href="elife00808fs020"/></fig></fig-group></p><p>Severe disruption of <italic>sdc-2</italic> function was achieved by treating an <italic>sdc-2</italic> partial loss-of-function XX mutant with <italic>sdc-2</italic> RNAi [<italic>sdc-2(y93, RNAi)</italic>] (<xref ref-type="fig" rid="fig5s1">Figure 5—figure supplement 1</xref>). GRO-seq data showed that expression of <italic>her-1</italic> (<xref ref-type="fig" rid="fig5">Figure 5B</xref>) and of protein-coding genes on X (<xref ref-type="fig" rid="fig5">Figure 5C</xref>) was elevated in <italic>sdc-2</italic> mutants, while expression of genes on autosomes was mildly reduced (<xref ref-type="fig" rid="fig5s2">Figure 5—figure supplement 2A–D</xref>). <italic>her-1</italic> was de-repressed 12.7-fold, and the increase in GRO-seq signal was uniform from the promoter to the 3′ end, suggesting that SDC-2 controls sex determination by reducing Pol II recruitment or initiation at the <italic>her-1</italic> promoter in XX embryos (<xref ref-type="fig" rid="fig5">Figure 5B</xref>).</p><p>The median change of gene expression in <italic>sdc-2</italic> mutant vs control embryos was an increase of 1.63- to 1.67-fold for X-linked genes of all lengths and a slight reduction of 0.79- to 0.81-fold for autosome-linked genes (<xref ref-type="supplementary-material" rid="SD5-data">Figure 5—source data 1</xref> and <xref ref-type="supplementary-material" rid="SD4-data">Figure 4—source data 1</xref>). 
For the subset of X and autosomal genes whose expression was statistically different (p≤0.05) between mutant and control embryos, 99% of X-linked genes had elevated expression in the mutants, while 63–66% of autosomal genes had reduced expression (<xref ref-type="supplementary-material" rid="SD5-data">Figure 5—source data 1</xref> and <xref ref-type="supplementary-material" rid="SD4-data">Figure 4—source data 1</xref>). The reduced autosomal gene expression was robust to normalization procedures that counteract potential complications from increased X expression (see ‘Materials and methods’).</p><p>Our GRO-seq experiments also provided the first indication that dosage compensation controls the expression of small non-coding RNAs. We found most embryonically expressed X-linked microRNAs to be dosage compensated (<xref ref-type="fig" rid="fig5">Figure 5D</xref> and <xref ref-type="supplementary-material" rid="SD5-data">Figure 5—source data 1</xref>, and ‘Materials and methods’), while X-linked tRNAs were not, implying that RNA polymerase II is broadly sensitive to dosage compensation, and RNA polymerase III is insensitive (<xref ref-type="supplementary-material" rid="SD5-data">Figure 5—source data 1</xref>).</p><p>Previous studies showed the DCC binds to sequence-specific DNA recruitment sites on X and disperses to promoter regions of actively transcribed genes (<xref ref-type="bibr" rid="bib22">Ercan et al., 2009</xref>; <xref ref-type="bibr" rid="bib35">Jans et al., 2009</xref>; <xref ref-type="bibr" rid="bib53">Pferdehirt et al., 2011</xref>), but the lack of precise TSS calls had prevented an accurate alignment of DCC binding sites with promoters. Mapping of new TSSs relative to DCC binding sites called from our ChIP-seq experiments performed for this comparison showed the peak of DCC binding to be immediately upstream of the TSS (<xref ref-type="fig" rid="fig5">Figure 5E</xref>). 
Although the DCC binds to promoter regions, prior studies showed that DCC binding to a gene is neither necessary nor sufficient for the compensation of that gene, and not all genes are dosage compensated (<xref ref-type="bibr" rid="bib35">Jans et al., 2009</xref>). We assessed the generality of that conclusion by comparing DCC binding upstream of a TSS to the increase in gene expression caused by disrupting <italic>sdc-2</italic>. Our results confirmed and extended the original conclusion (<xref ref-type="fig" rid="fig5s3">Figure 5—figure supplement 3A,B</xref>). Thus, the DCC can act at a distance to control gene expression, and DCC binding intensity is not a proxy for the dosage compensation status of the gene.</p></sec><sec id="s2-5"><title>The <italic>C</italic>. <italic>elegans</italic> dosage compensation process does not regulate promoter-proximal pausing</title><p>To determine the step of transcription controlled by the DCC, we compared the distribution of the GRO-seq signal along X and autosomal genes between control and dosage-compensation-defective embryos. The change in Pol II distribution expected from disrupting dosage compensation differs according to the step of transcription affected by the DCC. If the DCC restricts Pol II recruitment, a uniform increase in engaged Pol II would be expected from the promoter to the 3′ end of genes of mutants. 
If the DCC reduces transcription by preventing the release of Pol II from promoter-proximal pausing, or by reducing transcription elongation at stages downstream of the pause site, an increase in the level of engaged Pol II would be expected in the gene body and 3′ end in mutants, with the increase beginning more promoter proximal for a mechanism that controls pausing.</p><p>Promoter-proximal pausing of transcriptionally engaged Pol II is a rate-limiting step of transcription in metazoans (<xref ref-type="bibr" rid="bib1">Adelman and Lis, 2012</xref>) observed at ∼40% of active genes in mammalian cells (<xref ref-type="bibr" rid="bib16">Core et al., 2008</xref>) and more than 60% of active genes in <italic>Drosophila</italic> (<xref ref-type="bibr" rid="bib15">Core et al., 2012</xref>). To determine whether control of 5′ pausing was a plausible mechanism for dosage compensation, we calculated 5′ pausing ratios (GRO-seq signal for 5′ end/gene body) for genes in each developmental stage using TSS calls for the respective stage (<xref ref-type="supplementary-material" rid="SD4-data">Figure 4—source data 1</xref>). In contrast to <italic>Drosophila</italic> and mammalian genomes, very few genes (0.38%, 15 of 3975) of wild-type <italic>C</italic>. <italic>elegans</italic> embryos showed evidence of 5′ pausing (<xref ref-type="supplementary-material" rid="SD4-data">Figure 4—source data 1</xref>, <xref ref-type="fig" rid="fig6">Figure 6A,B</xref>, and <xref ref-type="fig" rid="fig6s1">Figure 6—figure supplement 1A–D</xref>). The paused genes were not enriched on X (<xref ref-type="fig" rid="fig6">Figure 6C</xref>). 
Moreover, the number of 5′ paused genes was not decreased in <italic>sdc-2</italic> mutants (<xref ref-type="supplementary-material" rid="SD4-data">Figure 4—source data 1</xref> and <xref ref-type="fig" rid="fig6">Figure 6D,E</xref>).<fig-group><fig id="fig6" position="float"><object-id pub-id-type="doi">10.7554/eLife.00808.033</object-id><label>Figure 6.</label><caption><title>Promoter-proximal RNA Pol II pausing is rare in <italic>Caenorhabditis elegans</italic> and is not the target of dosage compensation.</title><p>(<bold>A</bold>) GRO-seq and GRO-cap signals show that a gene not paused in embryos becomes paused in L1 larvae deprived of food. (<bold>B</bold>) Comparison of average GRO-seq signal from embryos, starved L1 larvae, and L3 larvae within 2 kb of transcription start sites (TSSs) called in all three stages shows that promoter-proximal pausing in embryos and L3s is rare compared to that in starved L1 larvae. (<bold>C</bold>) Promoter-proximal pausing is not enriched on the X chromosome relative to autosomes in embryos. If dosage compensation prevented the release of Pol II from promoter-proximal pause sites, there should be higher levels of pausing on the X chromosome. (<bold>D</bold>) and (<bold>E</bold>) The level of promoter-proximal pausing is not decreased in <italic>sdc-2</italic> mutants compared to control embryos. If dosage compensation reduced gene expression by preventing the release of Pol II from promoter-proximal pause sites, the <italic>sdc-2</italic> mutant should exhibit lower levels of pausing. (<bold>D</bold>) Although X-linked genes have increased expression in <italic>sdc-2</italic> mutants, their level of pausing is not decreased. (<bold>E</bold>) The level of pausing displayed by autosomal genes is unchanged in <italic>sdc-2</italic> mutants. 
RPKM: reads per kilobase per million; RPM: reads per million.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.033">http://dx.doi.org/10.7554/eLife.00808.033</ext-link></p></caption><graphic xlink:href="elife00808f006"/></fig><fig id="fig6s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.034</object-id><label>Figure 6—figure supplement 1.</label><caption><title>The dosage compensation process does not control promoter-proximal pausing of Pol II.</title><p>(<bold>A</bold>)–(<bold>C</bold>) Shown are plots of the average GRO-seq signal from different developmental stages of wild-type embryos plotted across a 2 kb window centered on the transcription start sites (TSSs) identified in the listed developmental stage. The GRO-seq signal is averaged over 25 bp windows. n represents the number of genes in a stage having a TSS identified in that stage. For example, (<bold>A</bold>) shows the average GRO-seq signal for control RNAi embryos, starved L1s, and L3s plotted relative to the distance from the embryo-derived TSSs. Embryos and L3 larvae exhibit lower levels of promoter-proximal pausing than L1 larvae deprived of food, and (<bold>D</bold>) the RNAi process does not change the level of pausing. Promoter-proximal pausing is rare in embryos and therefore unlikely to be the target for the dosage compensation process.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.034">http://dx.doi.org/10.7554/eLife.00808.034</ext-link></p></caption><graphic xlink:href="elife00808fs021"/></fig></fig-group></p><p>Nonetheless, the GRO-seq assay is capable of detecting 5′ pausing in <italic>C</italic>. <italic>elegans</italic>, since genes not paused in embryos become paused in L1s deprived of food (<xref ref-type="fig" rid="fig6">Figure 6A</xref> and <xref ref-type="fig" rid="fig6s1">Figure 6—figure supplement 1A–C</xref>). 
In starved L1s, we found 7.7% of genes (166 of 2133) to exhibit 5′ pausing, and most were on autosomes (<xref ref-type="supplementary-material" rid="SD4-data">Figure 4—source data 1</xref>). A prior genome-wide analysis of Pol II binding in L1 animals cultured with and without food discovered an enrichment of Pol II only at some promoters of food-deprived larvae, consistent with 5′ proximal pausing (<xref ref-type="bibr" rid="bib7">Baugh et al., 2009</xref>). Our results confirm their proposal that Pol II pausing can be induced by food deprivation. These cumulative results make it highly implausible that 5′ pausing control is the mechanism of X-chromosome dosage compensation in embryos, and they are consistent with the lack of a <italic>C</italic>. <italic>elegans</italic> negative elongation factor complex (NELF), which contributes, in part, to pausing in other organisms (<xref ref-type="bibr" rid="bib65">Yamaguchi et al., 1999</xref>).</p></sec><sec id="s2-6"><title>The <italic>C</italic>. <italic>elegans</italic> dosage compensation process regulates the recruitment of Pol II to promoters</title><p>Analysis of GRO-seq data comparing transcriptionally engaged Pol II in control vs dosage-compensation-defective XX embryos revealed a uniform increase in the level of engaged Pol II across the entire length of X-linked genes, from TSSs to 3′ ends, in the <italic>sdc-2</italic> mutant. This conclusion was reached both by metagene analysis (<xref ref-type="fig" rid="fig7">Figure 7A</xref>) and by the analysis of individual genes exhibiting a range of overexpression in the mutant (<xref ref-type="fig" rid="fig5 fig7">Figures 5C and 7C</xref>, and <xref ref-type="fig" rid="fig7s1">Figure 7—figure supplement 1</xref>). 
Analysis of individual genes averts complications in interpretations that might arise from averaging of data.<fig-group><fig id="fig7" position="float"><object-id pub-id-type="doi">10.7554/eLife.00808.035</object-id><label>Figure 7.</label><caption><title>The DCC condensin complex reduces X-chromosome gene expression in XX embryos by restricting Pol II recruitment to promoters.</title><p>(<bold>A</bold>) Uniform increase in GRO-seq signal across the length of X-linked genes results from disrupting dosage compensation. Metagene analysis comparing the average GRO-seq signal from 665 X-linked genes ≥1.5 kb in control RNAi or <italic>sdc-2</italic> mutant embryos. Genes were scaled to the same length as follows: 5′ ends (−1 kb to +500 bp of the transcription start site [TSS]) and 3′ ends (500 bp upstream to 1 kb downstream of 3′ end) were not scaled, and the gene body was scaled to 2 kb. The signal was averaged at each base pair and then averaged across 25 bp windows. The GRO-seq signal is elevated approximately 1.6-fold across genes in <italic>sdc-2</italic> mutant versus control RNAi embryos (below). (<bold>B</bold>) The GRO-seq signal is decreased slightly across autosomal genes in <italic>sdc-2</italic> mutant versus control RNAi embryos. Metagene analysis of 2949 autosomal genes ≥1.5 kb performed as in (<bold>A</bold>). The ratio of the GRO-seq signal in mutant versus control embryos is about 0.9. (<bold>C</bold>) Heat map shows that the GRO-seq signal is increased along the length of each X-linked gene in <italic>sdc-2</italic> mutants. For each of 665 genes, the GRO-seq signal from mutant or control embryos was summed across 100 bp windows and the <italic>sdc-2</italic>/control ratio was calculated for each window. The log<sub>2</sub>(<italic>sdc-2</italic>/control) value was plotted across the scaled gene. 
(<bold>D</bold>) Heat map shows that the GRO-seq signal is moderately decreased along the length of individual autosomal genes in <italic>sdc-2</italic> mutants. For each of 2949 autosomal genes, the log<sub>2</sub>(<italic>sdc-2</italic>/control) value was plotted across the scaled gene as in (<bold>C</bold>). (<bold>E</bold>) Dosage compensation does not specifically affect Pol II elongation. An elongation density index was calculated for each gene greater than 2 kb in length that did not have another gene on the same strand within 1 kb of the TSS. After excluding the first and last 500 bp of the gene, the average signal across the last 75% of the remaining gene was divided by the average signal across the first 25% of the remaining gene. Ratios of the indices between the <italic>sdc-2</italic> mutant and control RNAi embryos are not significantly different for genes on the X compared to the autosomes. Error bars represent a 95% confidence interval for the mean indices. n = 481 (X); n = 1861 (autosomes). (<bold>F</bold>) Occupancy of hypo-phosphorylated Pol II at the promoters of X-linked genes is increased in dosage compensation mutants, showing greater Pol II recruitment. Comparison of normalized Pol II ChIP-chip signal from control RNAi or <italic>sdc-2</italic> mutant embryos relative to newly annotated TSSs of X-linked genes. (<bold>G</bold>) Sense and upstream divergent transcription are coordinately increased for X-linked genes in <italic>sdc-2</italic> mutants. Comparison of average sense or antisense GRO-seq signal from <italic>sdc-2</italic> mutant and control RNAi embryos for a 3 kb window surrounding TSSs for genes with no divergent gene partner. 
The GRO-seq signal was evaluated in this analysis only for genes having a log<sub>2</sub>(sense/antisense) ratio ≤1.5 in control RNAi embryos.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.035">http://dx.doi.org/10.7554/eLife.00808.035</ext-link></p></caption><graphic xlink:href="elife00808f007"/></fig><fig id="fig7s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.036</object-id><label>Figure 7—figure supplement 1.</label><caption><title>GRO-seq signal is increased along the length of individual X-linked genes when dosage compensation is disrupted.</title><p>The GRO-seq signal from <italic>sdc-2</italic> mutant vs control RNAi embryos is shown for a representative set of 27 X-linked genes. The first vertical line in each gene shows the location of the transcription start site (TSS), and the second vertical line shows the location of the 3′ end. The GRO-seq signal has been averaged across 100 bp windows. The ratio of <italic>sdc-2</italic> expression vs control expression and the gene name are shown above each plot. Previously, we found that not all genes on X are dosage compensated (<xref ref-type="bibr" rid="bib35">Jans et al., 2009</xref>), and these GRO-seq data are consistent with that finding. 
RPKM: reads per kilobase per million.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.036">http://dx.doi.org/10.7554/eLife.00808.036</ext-link></p></caption><graphic xlink:href="elife00808fs022"/></fig><fig id="fig7s2" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.037</object-id><label>Figure 7—figure supplement 2.</label><caption><title>Disruption of dosage compensation causes a uniform increase in GRO-seq signal across the length of X-linked genes in different quartiles of gene expression determined from control RNAi samples.</title><p>(<bold>A</bold>)–(<bold>D</bold>) Metagene analyses comparing the average GRO-seq signal from X-linked genes ≥1.5 kb in control RNAi and <italic>sdc-2</italic> mutant embryos. The 665 X-linked genes have been split into four quartiles of gene expression and plotted independently. Genes were scaled to the same length as follows: the 5′ end (1 kb upstream to 500 bp downstream of the TSS) and the 3′ end (500 bp upstream to 1 kb downstream of the termination site) were not scaled, and the remainder of the gene was scaled to a length of 2 kb. Signal was averaged across the genes of each group at each base pair and then averaged across 25 bp windows. The ratio of GRO-seq signal in mutant vs control embryos is plotted below each metagene analysis. 
RPKM: reads per kilobase per million; RPM: reads per million; TSS: transcription start site.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.037">http://dx.doi.org/10.7554/eLife.00808.037</ext-link></p></caption><graphic xlink:href="elife00808fs023"/></fig><fig id="fig7s3" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.038</object-id><label>Figure 7—figure supplement 3.</label><caption><title>Disruption of dosage compensation causes a uniform increase in GRO-seq signal across the length of X-linked genes of different size ranges.</title><p>(<bold>A</bold>)–(<bold>C</bold>) Shown are metagene analyses comparing average GRO-seq signal from X-linked genes ≥1.5 kb in control RNAi or <italic>sdc-2</italic> mutant embryos. The 665 X-linked genes have been split into three size ranges (1.5–3.0 kb, 3.0–6.0 kb, and &gt;6 kb) and plotted independently as in <xref ref-type="fig" rid="fig7">Figure 7A</xref>. RPKM: reads per kilobase per million.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.038">http://dx.doi.org/10.7554/eLife.00808.038</ext-link></p></caption><graphic xlink:href="elife00808fs024"/></fig><fig id="fig7s4" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.00808.039</object-id><label>Figure 7—figure supplement 4.</label><caption><title>The ratio of sense to antisense transcription is unaffected by dosage compensation.</title><p>(<bold>A</bold>) Sense and upstream divergent transcription are coordinately increased for X-linked genes in <italic>sdc-2</italic> mutants. Scatter plot comparing the log<sub>2</sub>(sense/antisense) ratios for <italic>sdc-2</italic> mutant and control RNAi embryos shows that the ratios are similar, indicating that sense and antisense transcription are coordinately increased in <italic>sdc-2</italic> mutants. 
In combination with GRO-seq and ChIP-chip data, this result supports the view that recruitment of Pol II is elevated in <italic>sdc-2</italic> mutants. The statistical relationship between the replicates is indicated by the Spearman correlation coefficient ρ. The red line depicts a 1:1 relationship between the ratios. (<bold>B</bold>) Occupancy of hypo-phosphorylated Pol II at the promoters of X-linked genes that do not have a divergent gene pair yet have antisense transcription is increased in dosage compensation mutants, showing greater Pol II recruitment. Comparison of normalized Pol II ChIP-chip signal from control RNAi or <italic>sdc-2</italic> mutant embryos relative to newly annotated transcription start sites (TSSs) of X-linked genes. The ChIP-chip signal was evaluated in this analysis only for genes having a log<sub>2</sub>(sense/antisense) ratio ≤1.5 in control RNAi embryos.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.039">http://dx.doi.org/10.7554/eLife.00808.039</ext-link></p></caption><graphic xlink:href="elife00808fs025"/></fig></fig-group></p><p>For the metagene analysis, the ratio of engaged polymerase in mutant vs control was consistently elevated by approximately 1.6-fold across scaled X-linked genes, from TSSs to 3′ ends (<xref ref-type="fig" rid="fig7">Figure 7A</xref>). This elevation in the GRO-seq signal was also readily apparent in metagene analyses of X-linked genes subdivided by quartiles of expression (<xref ref-type="fig" rid="fig7s2">Figure 7—figure supplement 2</xref>) or by gene length (<xref ref-type="fig" rid="fig7s3">Figure 7—figure supplement 3</xref>). 
In contrast, metagene analysis of autosomal genes from the same data sets showed only a slight decrease in engaged Pol II across their lengths (<xref ref-type="fig" rid="fig7">Figure 7B</xref>), consistent with the slight decrease in autosomal gene expression in mutants (<xref ref-type="bibr" rid="bib35">Jans et al., 2009</xref>).</p><p>Heat maps displaying the log<sub>2</sub> ratio of mutant to control GRO-seq signal across individual X-linked (<xref ref-type="fig" rid="fig7">Figure 7C</xref>) and autosomal (<xref ref-type="fig" rid="fig7">Figure 7D</xref>) genes confirmed the uniform elevation of transcriptionally engaged Pol II across X-linked genes and the reduction across individual autosomal genes of <italic>sdc-2</italic> mutants. Together, these results demonstrate that the dosage compensation mechanism reduces X-linked gene expression in <italic>C</italic>. <italic>elegans</italic> hermaphrodites by restricting either the recruitment or initiation of Pol II. This conclusion is strongly supported by the analysis of elongation density indices (average GRO-seq signal across the last 75% of genes/average signal across the first 25%) in mutant vs control embryos for genes on X and autosomes. The ratios of indices between control and mutant embryos were not significantly different for genes on X compared to autosomes, indicating that dosage compensation does not preferentially affect Pol II elongation (<xref ref-type="fig" rid="fig7">Figure 7E</xref>).</p><p>Two further lines of evidence show Pol II recruitment to be the predominant step of transcription targeted by the dosage compensation process. 
First, re-analysis of our previous genome-wide ChIP-chip analysis of Pol II binding (<xref ref-type="bibr" rid="bib53">Pferdehirt et al., 2011</xref>) relative to the new TSSs showed an approximately twofold increase in the occupancy of hypo-phosphorylated Pol II at the promoters of X-linked genes in <italic>sdc-2</italic> mutant vs control embryos (<xref ref-type="fig" rid="fig7">Figure 7F</xref>). DNA-bound hypo-phosphorylated Pol II at promoters is enriched for non-initiated Pol II. The increase in Pol II occupancy at promoters of mutants (measured by ChIP) was nearly equivalent to the increase in post-initiated Pol II (measured by GRO-seq), indicating that Pol II promoter recruitment is rate-limiting when dosage compensation is active.</p><p>Second, in <italic>sdc-2</italic> mutants, the levels of sense and upstream divergent transcription were elevated coordinately at promoters on X lacking upstream divergent genes (<xref ref-type="fig" rid="fig7">Figure 7G</xref> and <xref ref-type="fig" rid="fig7s4">Figure 7—figure supplement 4A</xref>). At those promoters, the level of hypo-phosphorylated Pol II was also elevated about twofold in mutant vs control embryos, as assayed by ChIP (<xref ref-type="fig" rid="fig7s4">Figure 7—figure supplement 4B</xref>). Since we found increased Pol II occupancy in mutants and observed virtually no promoter-proximal pausing in control embryos, escape from pausing into productive elongation, or any other form of post-initiation regulation, cannot be the cause of elevated transcription at the divergent promoters in <italic>sdc-2</italic> mutants. Furthermore, an increase in transcription initiation by bound polymerases cannot account for these results, but an increase in Pol II recruitment can. Our combined experiments demonstrate that <italic>C</italic>. 
<italic>elegans</italic> dosage compensation controls X-chromosome-wide gene expression predominantly by reducing Pol II recruitment to promoters of hermaphrodites.</p></sec><sec id="s2-7"><title><italic>C. elegans</italic> balances gene expression between X chromosomes and autosomes</title><p>In organisms that equalize X-chromosome gene expression between the sexes by a dosage compensation process, the question arises as to whether the compensated level of X-chromosome expression is equivalent to or half of the expression from the two sets of autosomes. The answer to this question has been controversial, although evidence has mounted in favor of a mechanism to balance total expression between X chromosomes and the two sets of autosomes based on measurements of accumulated transcripts (<xref ref-type="bibr" rid="bib63">Xiong et al., 2010</xref>; <xref ref-type="bibr" rid="bib20">Deng et al., 2011</xref>; <xref ref-type="bibr" rid="bib21">Disteche, 2012</xref>; <xref ref-type="bibr" rid="bib46">Lin et al., 2012</xref>; <xref ref-type="bibr" rid="bib19">Deng et al., 2013</xref>; <xref ref-type="bibr" rid="bib37">Jue et al., 2013</xref>). Our GRO-seq experiments have addressed this question in the most definitive way to date by measuring the levels of nascent transcripts prior to co-transcriptional processing. We show that in wild-type <italic>C</italic>. <italic>elegans</italic> embryos, X and autosomes have nearly equivalent levels of total gene expression, and that levels of transcribing Pol II are uniform across X and autosomal genes (<xref ref-type="fig" rid="fig8">Figure 8A</xref>). In dosage-compensation-defective mutants, the level of X expression and engaged Pol II exceeds that of autosomes by 1.7-fold, from the TSSs to the 3′ ends (<xref ref-type="fig" rid="fig8">Figure 8B</xref>). These results demonstrate the existence of a separate mechanism in <italic>C</italic>. 
<italic>elegans</italic> to elevate the intrinsic rate of transcription from the X chromosomes of both sexes, so that after dosage compensation, X chromosomes and the two sets of autosomes have equivalent expression. Our experiments provide evidence that it, like dosage compensation, works at the level of controlling Pol II recruitment.<fig id="fig8" position="float"><object-id pub-id-type="doi">10.7554/eLife.00808.040</object-id><label>Figure 8.</label><caption><title>Gene expression is balanced between X chromosomes and autosomes.</title><p>(<bold>A</bold>) <italic>Caenorhabditis elegans</italic> has a mechanism to equalize expression between X chromosomes and autosomes. Metagene analysis comparing the average GRO-seq signal from X-linked and autosome-linked genes of control RNAi embryos. The X to autosome expression ratio is 0.9. (<bold>B</bold>) In dosage-compensation-defective mutants, the level of X-chromosome expression exceeds that of autosomes by 1.7-fold. Metagene analysis comparing the average GRO-seq signal from X-linked and autosome-linked genes of <italic>sdc-2</italic> mutant embryos. RPKM: reads per kilobase per million.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.00808.040">http://dx.doi.org/10.7554/eLife.00808.040</ext-link></p></caption><graphic xlink:href="elife00808f008"/></fig></p></sec></sec><sec id="s3" sec-type="discussion"><title>Discussion</title><p>A critical step in dissecting the mechanism of dosage compensation in any organism is to determine the primary step of transcription controlled by the dosage compensation complex to balance X-chromosome gene expression between the sexes. A major obstacle towards attaining this goal for <italic>C</italic>. <italic>elegans</italic> had been the lack of properly annotated transcription start sites for the majority of its genes. 
We therefore initiated our studies by developing a general and efficient strategy to map TSSs for Pol II-regulated genes with nucleotide resolution. This method, called GRO-cap, utilized purified nascent transcripts from the NRO reactions developed for GRO-seq and employed redundant enzymatic steps to capture RNAs with 5′ caps. GRO-cap tracks RNAs prior to their co-transcriptional processing. This approach was critical for our analysis because co-transcriptional processing removes TSSs from the transcripts of most nematode genes, including protein-coding genes and small non-coding RNA genes.</p><p>We found TSSs for protein-coding genes to be unexpectedly far upstream from the 5′ ends of mature mRNAs. True TSSs were as far as 14 kb upstream of previously annotated 5′ ends (typically the trans-spliced acceptor sites), and 59% of TSSs were &gt;200 bp upstream and 21% &gt;1 kb upstream. GRO-cap was also effective at identifying TSSs for the majority of small processed RNAs, including microRNAs, tRNAs, snoRNAs, and 21 U-RNAs.</p><p>To determine the step of transcription controlled by the DCC, we compared the genome-wide distribution of transcriptionally engaged Pol II (from GRO-seq) relative to the newly defined TSSs (from GRO-cap) in wild-type vs dosage-compensation-defective XX embryos. We found 5′ promoter-proximal pausing to be rare, unlike in other metazoans, and not likely to be the target of dosage compensation. Instead, dosage compensation reduces recruitment of RNA Pol II to the promoters of hermaphrodite X-linked genes, thereby decreasing transcription from the two X chromosomes of hermaphrodites to the level of transcription from the single X chromosome of males. 
A separate regulatory mechanism elevates the intrinsic level of transcription from X chromosomes of both sexes so that after dosage compensation total expression from X chromosomes and autosomes is equivalent.</p><sec id="s3-1"><title>Advantages of GRO-cap</title><p>GRO-cap has advantages over TSS mapping strategies that use total accumulated RNA or short-capped (sc) RNA as the starting material. The use of nascent RNAs without base hydrolysis or size selection, as in GRO-cap, enriches the proportion of 5′ capped RNAs within the starting RNA population, reduces the level of false TSS calls from RNAs that are capped post-transcriptionally (<xref ref-type="bibr" rid="bib23">Fejes-Toth et al., 2009</xref>), and increases the probability of identifying TSSs from promoters that are transcribed at low levels. Most importantly, the GRO-cap strategy permits reliable assignment of TSSs by pairing TSS calls with uninterrupted GRO-seq signal for transcriptionally engaged Pol II between the GRO-cap TSS and the previously annotated 5′ end. Multiple lines of evidence showed that GRO-cap signal combined with continuous GRO-seq signal from WB starts defines true TSSs.</p><p>Two reports of nematode TSSs have recently been published. The first used two approaches to identify the transcription starts (<xref ref-type="bibr" rid="bib28">Gu et al., 2012</xref>). The first, called CapSeq, used total accumulated RNA as the starting material for the enzymatic enrichment of RNAs with 5′ caps and set the cutoff for TSS calls to be within 1 kb upstream and 100 bp downstream from previously annotated 5′ ends. As a consequence, CapSeq did not identify TSSs for a large class of protein-coding genes. TSSs of small processed non-coding RNAs were also difficult to identify by CapSeq. 
The second approach, called CIP-TAP, used scRNAs and was as effective as GRO-cap at identifying TSSs of small non-coding RNAs.</p><p>The second report also used scRNAs to map TSSs and set the cutoff for TSS calls to be no more than 200 bp upstream of the previously annotated 5′ ends, and therefore also did not define TSSs for a large class of genes (<xref ref-type="bibr" rid="bib10">Chen et al., 2013</xref>). The authors found numerous clusters of Pol II initiation upstream of their calls and classified many of these initiation events as enhancer-like chromatin signatures based on overlap with bound transcription factors. Based on comparison with our data (e.g., <xref ref-type="fig" rid="fig1s12">Figure 1—figure supplement 12A,B</xref>), we propose that a proportion of the upstream enhancer-like signatures are TSSs giving rise to full-length transcripts. Two clear examples of genes are shown in <xref ref-type="fig" rid="fig1s12">Figure 1—figure supplement 12B,C</xref>, where the single TSS (either 2534 bp or 2878 bp upstream of the WB start) called from GRO-cap and GRO-seq data was classified as an enhancer by <xref ref-type="bibr" rid="bib10">Chen et al. (2013)</xref>.</p></sec><sec id="s3-2"><title>Features of <italic>C</italic>. <italic>elegans</italic> transcription</title><p>Although 5′ promoter-proximal pausing in metazoans is a common rate-limiting step of transcription that is highly regulated to control gene expression (<xref ref-type="bibr" rid="bib1">Adelman and Lis, 2012</xref>), we found it to be rare in <italic>C</italic>. <italic>elegans</italic> under normal growth conditions. However, in L1 larvae deprived of food, we found 5′ pausing at 7.7% of genes with TSS calls. A prior study using ChIP-seq discovered the accumulation of Pol II at promoters of starved larvae and proposed that 5′ promoter-proximal pausing was responsible for Pol II accumulation (<xref ref-type="bibr" rid="bib7">Baugh et al., 2009</xref>). 
Our results confirm that interpretation. Two factors promote 5′ pausing in metazoans, negative elongation factor (NELF) and DRB sensitivity-inducing factor (DSIF), although the relative contribution of each is not fully understood (<xref ref-type="bibr" rid="bib1">Adelman and Lis, 2012</xref>; <xref ref-type="bibr" rid="bib64">Yamaguchi et al., 2013</xref>). <italic>C</italic>. <italic>elegans</italic> appears to lack NELF, suggesting either that DSIF is sufficient in the infrequent cases of pausing or that the core promoter complex (<xref ref-type="bibr" rid="bib41">Kwak et al., 2013</xref>) or another not-yet identified negative elongation factor participates.</p><p>The 3′ accumulation of Pol II has been documented previously in flies and humans using GRO-seq (<xref ref-type="bibr" rid="bib16">Core et al., 2008</xref>, <xref ref-type="bibr" rid="bib15">2012</xref>). We found 3′ Pol II accumulation to be more extensive in nematodes than flies. Pol II 3′ accumulation is likely caused by slow 3′ end formation and RNA processing (<xref ref-type="bibr" rid="bib27">Gromak et al., 2006</xref>). Because <italic>C</italic>. <italic>elegans</italic> has numerous polycistronic operons requiring extensive RNA processing to produce monocistronic mRNAs (<xref ref-type="bibr" rid="bib8">Blumenthal, 2012</xref>), we were able to discover a strong positive correlation between the amount of 3′ Pol II accumulation and the amount of RNA processing. Curiously, though, even genes that were not part of operons exhibited greater 3′ accumulation than genes in flies. This observation raises the question of whether RNA processing in <italic>C</italic>. <italic>elegans</italic> is generally slower than that in other organisms to accommodate extensive trans-splicing, or whether <italic>C</italic>. 
<italic>elegans</italic> accommodates operons by imposing additional regulation on Pol II to enhance its 3′ pausing at all genes to assess whether to halt transcription or continue elongation.</p></sec><sec id="s3-3"><title><italic>C. elegans</italic> dosage compensation controls RNA Pol II recruitment</title><p>Gene expression in metazoans is controlled by diverse regulatory mechanisms that function over widely different distances. Some mechanisms act locally on individual genes, while others such as dosage compensation function across large chromosomal territories or along entire chromosomes to regulate a large set of genes coordinately. In general, the step of transcription controlled by long-range mechanisms is not understood. In our study, multiple lines of evidence supported the conclusion that <italic>C</italic>. <italic>elegans</italic> dosage compensation regulates gene expression along X primarily by reducing the recruitment of RNA Pol II to the promoters of hermaphrodite X-linked genes.</p><p>First, regulation of 5′ promoter-proximal pausing cannot be the mechanism underlying <italic>C</italic>. <italic>elegans</italic> dosage compensation. If the DCC reduced X-chromosome gene expression by increasing 5′ promoter-proximal pausing, numerous genes on X would be paused in wild-type embryos, and disruption of dosage compensation would reduce the level of pausing. GRO-seq experiments showed instead that 5′ promoter-proximal pausing is rare in wild-type XX embryos. The few genes that exhibited 5′ pausing were not enriched on X chromosomes relative to autosomes, and the level of pausing was not decreased in dosage-compensation defective mutants.</p><p>Second, Pol II-mediated transcription elongation is not preferentially affected by the DCC. 
The ratios of elongation density indices (average GRO-seq signal in last 75% of a gene/average GRO-seq signal in first 25% of a gene) calculated for genes on X chromosomes and autosomes in control vs dosage-compensation-defective embryos were not significantly different between X-linked and autosomal genes, indicating that Pol II-mediated transcription elongation is not selectively changed on X chromosomes of mutants.</p><p>Third, the level of transcriptionally engaged Pol II assayed genome-wide by GRO-seq in control vs dosage-compensation-defective XX embryos revealed a uniform increase in engaged Pol II across the entire length of X-chromosome genes, but not autosomal genes, in mutants. Hence, the DCC controls X-chromosome gene expression in XX animals by reducing either the recruitment or initiation of Pol II. This conclusion was validated both by metagene analysis and by analysis of hundreds of individual X-linked genes showing different levels of de-repression in dosage-compensation-defective mutants.</p><p>Fourth, genome-wide quantification of Pol II occupancy by ChIP plotted relative to the new TSSs showed an increase in the hypo-phosphorylated form of Pol II at promoters of dosage-compensation-defective embryos vs control embryos that was equivalent to the increase in post-initiated Pol II measured by GRO-seq. Since DNA-bound hypo-phosphorylated Pol II at promoters is enriched for non-initiated Pol II, these results indicate that Pol II promoter recruitment is rate limiting when dosage compensation is active.</p><p>Our combined experiments reveal that the primary mechanism by which the <italic>C</italic>. <italic>elegans</italic> dosage compensation process reduces X-chromosome gene expression by half in XX embryos is to limit Pol II recruitment to promoters of X-linked genes. 
Our study does not eliminate the possibility of a minor repressive influence acting through another step of transcription or through a post-transcriptional mechanism such as RNA stability.</p></sec><sec id="s3-4"><title>DCC function</title><p>How might condensin reduce Pol II recruitment to the promoters of X-linked genes by approximately twofold? Our current and prior (<xref ref-type="bibr" rid="bib35">Jans et al., 2009</xref>) studies showed that DCC binding to the promoter of a gene is neither necessary nor sufficient to elicit repression of the gene. Hence, the DCC influences gene expression over long distance, likely by imposing changes in higher-order chromosome structure. Clues to such a DCC function were suggested originally by the simultaneous discovery of the DCC and condensin’s biochemical properties in vitro as an ATPase that alters DNA topology (<xref ref-type="bibr" rid="bib11">Chuang et al., 1994</xref>; <xref ref-type="bibr" rid="bib40">Kimura and Hirano, 1997</xref>; <xref ref-type="bibr" rid="bib29">Hagstrom et al., 2002</xref>; <xref ref-type="bibr" rid="bib31">Hirano, 2012</xref>; <xref ref-type="bibr" rid="bib54">Piazza et al., 2013</xref>) and its canonical roles in vivo of compacting and resolving mitotic and meiotic chromosomes for proper chromosome segregation (<xref ref-type="bibr" rid="bib32">Hirano and Mitchison, 1994</xref>; <xref ref-type="bibr" rid="bib29">Hagstrom et al., 2002</xref>; <xref ref-type="bibr" rid="bib9">Chan et al., 2004</xref>; <xref ref-type="bibr" rid="bib31">Hirano, 2012</xref>; <xref ref-type="bibr" rid="bib54">Piazza et al., 2013</xref>). 
However, mechanisms of DCC function are perhaps best deduced from its non-canonical roles in vivo of regulating interphase chromosome structure (<xref ref-type="bibr" rid="bib6">Bauer et al., 2012</xref>) and meiotic crossover recombination (<xref ref-type="bibr" rid="bib49">Mets and Meyer, 2009</xref>; <xref ref-type="bibr" rid="bib61">Wood et al., 2010</xref>; <xref ref-type="bibr" rid="bib5">Aragon et al., 2013</xref>). Condensin II in <italic>Drosophila</italic> induces axial compaction of interphase chromosomes, globally disrupts inter-chromosomal interactions, and promotes dispersal of peri-centric heterochromatin (<xref ref-type="bibr" rid="bib6">Bauer et al., 2012</xref>). These activities serve to compartmentalize the interphase nucleus into discrete chromosomal territories. Furthermore, nematodes carrying mutations that disrupt condensin I, which shares four subunits with the DCC, display elongated chromosomal axes during meiotic prophase and exhibit chromosome-wide changes in the distribution of double strand breaks and crossovers (<xref ref-type="bibr" rid="bib49">Mets and Meyer, 2009</xref>), providing strong evidence that DCC subunits control intra-chromosomal structure.</p><p>These cytological, biochemical, and genetic observations of condensin function suggest that chromosome structure might affect Pol II recruitment in several ways, which are not mutually exclusive. Chromosome compaction and associated topological changes in DNA could broadly affect promoter accessibility of Pol II and the regulatory factors that recruit or stabilize it, thereby reducing Pol II recruitment in a quantitatively similar manner at different sites. Compaction of interphase chromosomal territories could reduce the local concentration of bound transcription activators and hence bound Pol II. Changes in intra-chromosomal interactions could alter the relationships between distal regulatory regions and their target promoters, thereby limiting Pol II recruitment. 
These and other models are topics of future studies.</p></sec><sec id="s3-5"><title>Dosage compensation mechanisms differ between nematodes and flies</title><p>Dosage compensation strategies differ across species. Mammals inactivate one of the two female X chromosomes, flies double expression of the single male X chromosome, and nematodes halve expression of both hermaphrodite X chromosomes. A central question is whether the molecular mechanisms underlying these diverse forms of chromosome-wide transcriptional regulation are the same or different.</p><p>In <italic>Drosophila melanogaster</italic>, dosage compensation is achieved by the male-specific lethal (MSL) complex that binds along the single X chromosome of males to double transcription of X-linked genes (<xref ref-type="bibr" rid="bib25">Gelbart and Kuroda, 2009</xref>; <xref ref-type="bibr" rid="bib12">Conrad and Akhtar, 2012</xref>). The complex contains two long non-coding RNAs and five proteins, including the H4K16 histone acetyltransferase MOF (<xref ref-type="bibr" rid="bib30">Hilfiker et al., 1997</xref>; <xref ref-type="bibr" rid="bib13">Conrad et al., 2012a</xref>) and the H2BK34 ubiquitin ligase MSL2 (<xref ref-type="bibr" rid="bib62">Wu et al., 2011</xref>). The MSL complex was proposed to regulate gene expression by controlling transcription elongation (<xref ref-type="bibr" rid="bib59">Smith et al., 2001</xref>). This model was subsequently supported by genome-wide mapping of MSL proteins and MSL-dependent H4K16ac to the bodies of male X-linked genes, with a bias toward 3′ ends (<xref ref-type="bibr" rid="bib2">Alekseyenko et al., 2006</xref>; <xref ref-type="bibr" rid="bib26">Gilfillan et al., 2006</xref>; <xref ref-type="bibr" rid="bib45">Legube et al., 2006</xref>), and GRO-seq experiments in S2 cells with or without RNAi of the <italic>msl2</italic> gene (<xref ref-type="bibr" rid="bib42">Larschan et al., 2011</xref>). 
While a recent study analyzing genome-wide Pol II occupancy in males and females by ChIP-seq experiments suggested an alternative view that fly dosage compensation operates at the level of Pol II recruitment or initiation (<xref ref-type="bibr" rid="bib14">Conrad et al., 2012b</xref>), a mathematical computation error discovered in this study (<xref ref-type="bibr" rid="bib24">Ferrari et al., 2013</xref>; <xref ref-type="bibr" rid="bib60">Straub and Becker, 2013</xref>) rendered the results insufficient to distinguish between the competing models. The elongation model received recent further support from the discovery that dosage compensation is disrupted by impairing the function of SPT5, a transcription elongation factor that co-localizes with the MSL complex on male X chromosomes and interacts physically with the MSL complex (<xref ref-type="bibr" rid="bib55">Prabhakaran and Kelley, 2012</xref>). Thus, the weight of evidence strongly favors enhancement of transcription elongation as the primary mechanism of fly dosage compensation.</p><p>Not only does the overall dosage compensation strategy differ between worms and flies, the underlying molecular mechanism appears to differ as well. In worms, reduction in X-chromosome gene expression is primarily achieved by reducing recruitment of Pol II to promoters, while in flies, elevation in X-chromosome gene expression is primarily achieved by facilitating Pol II transcription elongation. Multiple solutions have evolved to coordinately control gene expression across an entire chromosome.</p></sec><sec id="s3-6"><title>Balancing gene expression between X chromosomes and autosomes</title><p>In many species, the evolution of sex chromosomes to be the primary determinants of sexual fate resulted in males having one X chromosome and females having two. 
Such chromosome sex-determining mechanisms had the potential to cause two problems in gene expression, an imbalance in X-chromosome gene expression between the sexes and an imbalance in gene expression between the male X chromosome and his two sets of autosomes (<xref ref-type="bibr" rid="bib21">Disteche, 2012</xref>). X-chromosome dosage compensation strategies such as the <italic>Drosophila</italic> strategy solved both problems by doubling transcription of the single male X chromosome. An alternative solution would be to co-evolve two mechanisms, one to increase expression of X chromosomes in both sexes, thereby preventing hypo X expression in males, and a second to decrease total expression from female X chromosomes to prevent hyper X expression relative to female autosomes and male X chromosomes. Nematodes and mammals evolved the second strategy for equalizing X-chromosome expression between the sexes. It has been controversial whether these organisms also evolved a strategy to elevate X expression in both sexes and thereby balance expression between X chromosomes and autosomes (<xref ref-type="bibr" rid="bib63">Xiong et al., 2010</xref>; <xref ref-type="bibr" rid="bib20">Deng et al., 2011</xref>; <xref ref-type="bibr" rid="bib21">Disteche, 2012</xref>; <xref ref-type="bibr" rid="bib46">Lin et al., 2012</xref>; <xref ref-type="bibr" rid="bib19">Deng et al., 2013</xref>; <xref ref-type="bibr" rid="bib37">Jue et al., 2013</xref>).</p><p>The bulk of evidence now favors the presence of a mechanism to up-regulate X-chromosome expression in males and females of both organisms (<xref ref-type="bibr" rid="bib20">Deng et al., 2011</xref>, <xref ref-type="bibr" rid="bib19">2013</xref>; <xref ref-type="bibr" rid="bib37">Jue et al., 2013</xref>), and our results provide the most compelling evidence to date for <italic>C</italic>. <italic>elegans.</italic> Our approach of examining nascent transcripts in mixed-stage embryos by GRO-seq conferred three advantages. 
First, it enabled us to quantify transcription specifically from somatic embryonic cells and thereby avert the complication of quantifying maternally contributed germline transcripts that contaminate mature embryo mRNA. Second, we could examine nascent transcription at a stage for which most germline-specific mechanisms of gene regulation would have been erased. Third, we could quantify transcriptionally engaged Pol II across the entire length of a gene, starting from the bona fide TSS, and assess the step of transcription controlled by the process. We found that a major part of the mechanism to increase X expression in <italic>C</italic>. <italic>elegans</italic> is to increase the level of Pol II recruitment to promoters, as was shown recently for mammals (<xref ref-type="bibr" rid="bib19">Deng et al., 2013</xref>). Together, these results reinforce the evolutionary importance of balancing gene expression across all chromosomes of a genome.</p></sec></sec><sec id="s4" sec-type="materials|methods"><title>Materials and methods</title><sec id="s4-1"><title>Nematode strains</title><p>The following nematode strains were used:<list list-type="bullet"><list-item><p>Wild-type strain N2</p></list-item><list-item><p><italic>sdc-2</italic> mutant, <italic>sdc-2(y93,RNAi)</italic></p></list-item><list-item><p>Control RNAi, N2(L4440 <italic>RNAi</italic>).</p></list-item></list></p></sec><sec id="s4-2"><title>Worm growth</title><p>Bacteria for use in feeding RNAi were prepared by growing HT115 bacteria bearing an RNAi plasmid (<italic>sdc-2</italic> or the L4440 negative control) overnight in TB and ampicillin, inducing for 2 hr with 1 mM IPTG, pelleting, and resuspending in 1 vol (wt/vol) of LB with 20% glycerol. For RNAi treatment, embryos were harvested from gravid hermaphrodites and allowed to hatch off for 24 hr. 
NG agar plates supplemented with 1 mM IPTG and 100 μg/ml carbenicillin were spotted with super-induced RNAi bacteria and allowed to induce further overnight at 25°C. Hatched off L1 larvae were spotted on the RNAi plates and grown at 20°C until gravid.</p><p>For wild-type samples, embryos were harvested from gravid hermaphrodites grown at 20°C on concentrated HB101. Embryos were allowed to hatch off and starve for 24 hr at 20°C. Worms were fed and grown to L3 stage under liquid culture (1 worm/μl; 10 mg/ml HB101) for 34 hr at 20°C.</p></sec><sec id="s4-3"><title>Isolation of nuclei</title><p>Animals were collected from whichever stage was desired. After washing twice with M9 buffer, animals were washed with cold nuclear isolation buffer (250 mM sucrose, 10 mM Tris-HCl (pH 7.9), 10 mM MgCl<sub>2</sub>, 1 mM EGTA, 0.25% NP-40, 1 mM DTT, protease inhibitors, 4 U/ml SUPERaseIn [AM2696; Ambion, Grand Island, NY]). Animals were resuspended in nuclear isolation buffer (embryos and starved L1 in 3 vol, L3 in 1 vol), and dripped into liquid nitrogen to freeze. Starved L1 and L3 samples were ground under liquid nitrogen by mortar and pestle. Larval samples, post-grinding, and embryo samples were dounced with a Kontes 2 ml glass dounce to release nuclei. Douncing and collection of nuclei was performed for up to six rounds as follows: dounce with 10X pestle A, 10X pestle B, 5 min centrifugation at 100×<italic>g</italic>, removal of nuclei-containing supernatant, and addition of an equal volume of nuclear isolation buffer to the pellet. Nuclear isolation was monitored each round to determine effectiveness and when it was complete. The pooled supernatant was centrifuged for 5 min at 1000×<italic>g</italic> to pellet nuclei. The nuclear pellet was washed with nuclear freezing buffer (40% glycerol, 50 mM Tris-HCl (pH 8.3), 0.1 mM EDTA, 5 mM MgCl<sub>2</sub>, 1 mM DTT, protease inhibitors, 4 U/ml SUPERaseIn). 
Approximately 1 × 10<sup>8</sup> nuclei were resuspended in 100 μl nuclear freezing buffer and stored at −80°C until GRO-seq reactions were performed.</p></sec><sec id="s4-4"><title>Preparation of GRO-seq libraries</title><sec id="s4-4-1"><title>NRO reaction</title><p>Nuclei (100 μl) were mixed with an equal volume of reaction buffer (10 mM Tris-HCl (pH 8.0), 5 mM MgCl<sub>2</sub>, 1 mM DTT, 300 mM KCl, 20 U of SUPERaseIn, 1% sarkosyl, 500 μM each of ATP, GTP, and Br-UTP, 2 μM CTP, and 0.33 μM α-<sup>32</sup>P-CTP [3000 Ci/mmol]). The reaction was allowed to proceed for 5 min at 30°C. The reaction was stopped by the addition of 2 ml (10× volume) of TRIzol (Invitrogen). The phases were separated by the addition of 400 μl of chloroform as per the manufacturer’s instructions. An additional acid-phenol extraction and then a chloroform extraction were carried out, followed by precipitation with 2.5 vol of ethanol. The pellet was washed in 75% ethanol before resuspending in 20 μl of DEPC-treated water. Base hydrolysis was performed by the addition of 5 μl 1 M NaOH, followed by incubation on ice for 30 min. The reaction was neutralized by the addition of 25 μl 1 M Tris-HCl (pH 6.8). The reaction was then run through a p-30 RNAse-free spin column (BioRad, Hercules, CA) according to the manufacturer’s instructions. The column flowthrough was brought to 100 μl with DEPC water and EDTA was added to a final concentration of 1 mM.</p></sec><sec id="s4-4-2"><title>Bead pre-wash</title><p>All buffers used in bead enrichment steps were kept on ice and were supplemented with 4 U/ml of SUPERaseIn. 
Anti-deoxyBrU beads (#sc-32323-ac; Santa Cruz Biotech, Santa Cruz, CA) were first washed three times with a pre-wash buffer: 0.25× SSPE, 500 mM NaCl, 1 mM EDTA, 0.05% Tween for 5 min; washed twice in binding buffer: 0.25× SSPE, 37.5 mM NaCl, 1 mM EDTA, 0.05% Tween for 5 min; blocked in bead blocking buffer: 0.25× SSPE, 1 mM EDTA, 0.05% Tween, 0.1% PVP, and 1 mg/ml ultrapure BSA (AM2618; Ambion) for 1 hr; followed by one wash in binding buffer for 5 min. The ratio of beads to volume did not exceed 1:8 for any wash or blocking step. The beads were resuspended in a 25% slurry (original concentration).</p></sec><sec id="s4-4-3"><title>Bead enrichment</title><p>NRO RNA was heat denatured at 70°C for 3 min and placed on ice for 2 min. Then, 350 μl of binding buffer and 50 μl of bead slurry were added to the RNA, and the samples were incubated for 30 min on a rotating stand (8 rpm). The beads were washed once in binding buffer; once in low salt wash buffer: 0.2× SSPE, 1 mM EDTA, 0.05% Tween; once in high salt wash buffer: 0.25× SSPE, 137.5 mM NaCl, 1 mM EDTA, 0.05% Tween; and twice in TET: 10 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.05% Tween. The NRO RNA was eluted three times (2× 125 μl, 1× 250 μl) with elution buffer: 20 mM DTT, 150 mM NaCl, 5 mM Tris-HCl (pH 7.5), 1 mM EDTA, and 0.1% SDS. The NRO RNA was then isolated by a standard extraction-precipitation method: one acid-phenol extraction, one chloroform extraction, addition of NaCl to 300 mM and 1 μl of glycoblue (AM9515; Ambion) to the aqueous phase, precipitation with 2.5 vol of cold ethanol, and a wash of the resulting pellet with 75% ethanol. The pellet was resuspended in DEPC water at volumes appropriate for the subsequent step.</p></sec><sec id="s4-4-4"><title>3′ End repair</title><p>NRO RNA was heated to 70°C for 3 min, followed by incubation on ice for 2 min. Then 3 μl 10× T4 PNK buffer, 1 μl SUPERaseIn, and 2 μl of T4 polynucleotide kinase (NEB, Ipswich, MA) were added, and the reaction was incubated for 30 min at 37°C. 
The reaction was stopped by bringing the volume to 100 μl with DEPC water and adding EDTA to a final concentration of 10 mM. RNA was heat denatured as above and subjected to two more rounds of bead binding/elution as described above.</p></sec><sec id="s4-4-5"><title>Poly-A tailing, reverse transcription, and amplification</title><p>NRO RNAs were then polyadenylated as described in <xref ref-type="bibr" rid="bib33">Ingolia (2010)</xref>. Poly-A tailed RNA (20 μl) was then reverse transcribed as follows: 2 pmol of reverse transcription primer (TruSeq_circ_RTP_pT: 5′-/5Phos/GATCGTCGGACTGTAGAACTCTGAAC/iSP18/CACTCA/iSP18/GCCTTGGCACCCGAGAATTCCATTTTTTTTTTTTTTTTTTTTVN-3′), 1 μl 12.5 mM dNTPs, and 3 μl of DEPC water were added, and the mixture was incubated at 75°C for 3 min. The primer (note the low concentration) was allowed to anneal at 42°C for 10 min. Then, 1 μl 0.1 mM DTT, 1 μl 1 M Tris-HCl (pH 8.3), 1 μl SUPERaseIn, and 1 μl SuperScript III reverse transcriptase (18080051; Invitrogen) were added and the reverse transcription was carried out at 48°C for 5 min and 54°C for 30 min. The reaction was stopped by incubation at 70°C for 10 min. The reaction was extracted once with buffer saturated phenol–chloroform (pH 8.0), once with chloroform, supplemented with 2 μl glycoblue and 300 mM NaCl, and then precipitated with 2.5 vol of ethanol.</p><p>The NRO cDNA library was then PAGE purified away from excess RT primer on an 8% denaturing PAGE gel as described (<xref ref-type="bibr" rid="bib33">Ingolia, 2010</xref>). The purified library was then circularized in an intra-molecular ligation reaction as follows: cDNA was resuspended in 15 μl of 10 mM Tris-HCl (pH 8.0) and mixed with 2 μl of 10× CircLigase buffer, 1 μl 1 mM ATP, 1 μl of 50 mM MnCl<sub>2</sub> and 1 μl of CircLigase (Epicentre, Madison, WI). The reaction was incubated at 60°C for 90 min, with a brief centrifugation every 15 min to bring down condensation. 
The reaction was heat inactivated at 80°C for 10 min, extracted once with buffer saturated phenol–chloroform (pH 8.0), once with chloroform, supplemented with 2 μl glycoblue and 300 mM NaCl, and then precipitated with 2.5 vol of ethanol.</p><p>The circularized NRO cDNA library was resuspended in 10 μl of water, amplified and PAGE purified as described (<xref ref-type="bibr" rid="bib16">Core et al., 2008</xref>), and quantified before submission for sequencing. Amplification was achieved using the TruSeq miRNA cloning oligo set (RP1: 5′-AATGATACGGCGACCACCGAGATCTACACGTTCAGAGTTCTACAGTCCGA-3′ [Illumina part # 15005505] and RPI-#: 5′-CAAGCAGAAGACGGCATACGAGATNNNNNNGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA-3′, where the ‘#’ represents one of the 48 Illumina six-base barcodes in the middle of the oligo, shown above as ‘NNNNNN’).</p></sec></sec><sec id="s4-5"><title>Preparation of GRO-cap libraries</title><p>All NRO reactions and bead enrichment steps for GRO-cap were carried out as described above, with the exception that the RNA for GRO-cap was not base hydrolyzed. After the first bead binding, the TruSeq RNA 3′ Adapter (RA3) (Illumina part # 15013207) was ligated to the 3′ end of the NRO RNA. First, 50 pmol of the 3′ adapter were mixed with NRO RNA and 2 μl 50% PEG 8000, brought to 14 μl with DEPC water, incubated at 70°C for 3 min and put on ice for 2 min. Then, 2 μl of 10× T4 RNA Ligase I buffer, 2 μl 10 mM ATP, 1 μl SUPERaseIn, and 1.5 μl T4 RNA Ligase I (M0204; NEB) were added, and the reaction was incubated at 22°C for 4–6 hr. The reaction was then brought to 100 μl with binding buffer and subjected to a second round of bead enrichment.</p><p>After the second bead enrichment, 5′ mono-phosphate RNAs were selected against in two successive steps. 
First, to selectively degrade RNAs with a 5′ mono-phosphate, NRO RNA was resuspended in 16.5 μl DEPC water, mixed with 0.5 μl SUPERaseIn, 2 μl 10× Terminator reaction buffer A, and 1 μl of Terminator 5′ phosphate-dependent exonuclease (TER51020; Epicentre), and incubated at 30°C for 1 hr. The reaction was extracted and precipitated using the standard method (above), and resuspended in 10 μl DEPC water. Second, 5′ mono-phosphate RNAs were dephosphorylated to prevent their participation in subsequent ligation reactions. For this, the RNA was mixed with 1 μl SUPERaseIn, 14.5 μl DEPC water, 10× Antarctic Phosphatase buffer, and 1.5 μl of Antarctic Phosphatase (M0289S; NEB), and incubated at 37°C for 30 min. The reaction was brought to 200 μl with 10 mM Tris-HCl (pH 7.5), 5 mM EDTA, and heat inactivated at 65°C for 5 min. The reaction was then extracted and precipitated using the standard extraction method and resuspended in 10 μl DEPC water.</p><p>The 5′ capped RNAs were then prepared for ligation and the final library preparation steps. The NRO RNAs were split in half and the experimental sample was treated with tobacco acid pyrophosphatase (TAP): 1 μl SUPERaseIn, 15 μl DEPC water, 3 μl 10× TAP buffer, and 1 μl TAP (T19500; Epicentre), with incubation at 37°C for 1 hr. The control reaction was treated identically except that TAP was omitted. The reaction was brought to 200 μl and then extracted and precipitated using the standard method. The TruSeq RNA 5′ Adapter (RA5) (Illumina part # 15013205) was ligated to the 5′ end of the NRO RNA. First, 50 pmol of the 5′ adapter were mixed with NRO RNA and 2 μl 50% PEG 8000, brought to 14 μl with DEPC water, incubated at 70°C for 3 min and put on ice for 2 min. Then, 2 μl of 10× T4 RNA Ligase I buffer, 2 μl 10 mM ATP, 1 μl SUPERaseIn, and 1.5 μl T4 RNA Ligase I (M0204; NEB) were added, and the reaction was incubated at 22°C for 4–6 hr. 
The reaction was then brought to 100 μl with binding buffer and subjected to a third round of bead enrichment. After the third enrichment, samples were reverse transcribed, amplified and PAGE purified as described (<xref ref-type="bibr" rid="bib16">Core et al., 2008</xref>), and quantified before submission for sequencing.</p></sec><sec id="s4-6"><title>Mapping of GRO-seq and GRO-cap reads</title><p>Libraries were sequenced with Illumina’s HiSeq 2000 platform. Reads were required to have passed the CASAVA 1.8 quality filtering to be considered further. To remove reads containing the RT-PCR adapter, we used cutadapt version 0.9.5 (<ext-link ext-link-type="uri" xlink:href="http://code.google.com/p/cutadapt/">http://code.google.com/p/cutadapt/</ext-link>) with the following command (cutadapt -a TGGAATTCTCGGGTGCCAAGG -a AAAAAAAAAAAAAAAAAAAA -z -O 15 -e .1 --minimum-length=30). The remaining reads were trimmed to 30 bp in length and aligned uniquely to the <italic>C</italic>. <italic>elegans</italic> WS230 genome using bowtie’s default settings (version 0.12.7), which permit two mismatches in the first 28 bp.</p><p>Because the 5’ base in each read most closely identifies the location of transcriptionally engaged RNA polymerase prior to the run-on, we created pile-ups using only the first base of each read. For GRO-seq, the pile-up of reads mapping to both strands was normalized by the number of millions of reads that mapped uniquely to the genome and was multiplied by 1000 to obtain RPKM (reads per kilobase per million). For GRO-cap, reads were normalized by the same method but were not multiplied by 1000, hence RPM (reads per million). 
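As a sketch of this normalization, in notation introduced here only for illustration (<italic>n<sub>i</sub></italic> is the number of reads whose first base maps to position <italic>i</italic>, <italic>N</italic> is the total number of reads that mapped uniquely to the genome, and <italic>S</italic><sup>seq</sup> and <italic>S</italic><sup>cap</sup> denote the normalized GRO-seq and GRO-cap signals), the two quantities can be written as:<disp-formula><tex-math>S^{\mathrm{seq}}_i = \frac{n_i}{N/10^{6}} \times 1000, \qquad S^{\mathrm{cap}}_i = \frac{n_i}{N/10^{6}}</tex-math></disp-formula>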
As expected, the GRO-cap signal is enriched at TSSs compared to the GRO-seq signal.</p></sec><sec id="s4-7"><title>Evaluation of GRO-seq normalization</title><p>Further analysis of our GRO-seq data using an alternative normalization strategy confirmed that X-linked genes are increased in expression, and autosomal genes are decreased in expression in the <italic>sdc-2</italic> mutant. This new analysis evaluated the original normalization strategy. Because expression of genes on the X chromosome is elevated in <italic>sdc-2</italic> mutants, it is difficult to determine whether the autosomes are changed in expression in the <italic>sdc-2</italic> mutant. The proportion of autosomal reads relative to total reads per experiment is lower in the <italic>sdc-2</italic> mutant (75.9%) than in the control RNAi (85.7%). To normalize the autosomal expression between the two conditions, we divided the average expression of all genes in the <italic>sdc-2</italic> mutant by the scaling factor of this difference (75.9%/85.7% = 0.885). We then investigated changes in the <italic>sdc-2</italic> mutant compared to the control for our most inclusive gene set: WB WS230 genes that are &gt;250 bp in length (see <xref ref-type="supplementary-material" rid="SD5-data">Figure 5—source data 1</xref>). With this alternative normalization, the median <italic>sdc-2</italic>/control gene expression ratios increase for both X (1.67 becomes 1.88) and autosomes (0.81 becomes 0.91). This procedure changed the proportion of X-linked genes that are more highly expressed in the mutant (genes increased in expression by 1.5-fold or greater) from 64.9% to 78.4%. The original normalization showed that fewer autosomal genes increase by 1.5-fold than decrease by 1.5-fold (5.2% compared to 26.7%, respectively), and the alternative normalization shows the same trend (8.4% compared to 16.5%). 
Therefore, the same conclusions about X and autosomal gene expression can be made with either normalization strategy.</p></sec><sec id="s4-8"><title>Creation of the unique mappability track</title><p>We computationally identified all 30mers in the WS230 genome. After passing these sequences through the cutadapt parameters outlined above, we aligned the remaining sequences uniquely using bowtie. We mapped the 5′ base to determine whether a read beginning at that base pair can be aligned uniquely.</p></sec><sec id="s4-9"><title>Gene lists and DNA coordinate conversion</title><p>Genome annotation files were downloaded from WB. Protein coding genes that were labeled as ‘Coding_transcript’ and ‘gene’ were extracted from the WS230 annotation file. Genes encoding microRNAs and snoRNAs that were labeled as ‘miRNA_primary_transcript’ and ‘snoRNA_mature_transcript’, respectively, were extracted from the WS225 annotation file. tRNAs predicted by ‘tRNAscan-SE-1.23’ were extracted from the WS230 annotation file. The WB remap and unmap tools were used to convert DNA coordinates between releases, thereby ensuring that all analyses matched the correct genome versions.</p></sec><sec id="s4-10"><title>Identification of TSSs</title><sec id="s4-10-1"><title>Protein coding genes</title><p>As the TSSs for <italic>C</italic>. <italic>elegans</italic> trans-spliced genes, which comprise 70% of all genes (<xref ref-type="bibr" rid="bib3">Allen et al., 2011</xref>), were unknown, it was essential to properly annotate TSSs for use in our later analyses. We calculated TSSs in wild-type embryos, starved L1s, L3s, and <italic>sdc-2</italic> mutant embryos independently. To do this, we utilized both our GRO-cap and GRO-seq data with the assumption that a true TSS would be supported by a continuous GRO-seq signal extending upstream from the WB start to the newly annotated TSS. 
To correct for background noise in the TAP+ GRO-cap signal, we subtracted the TAP− control GRO-cap signal from the TAP+ GRO-cap signal. The Z-score was calculated at each position on a chromosome using the standard deviation and mean calculated for that individual chromosome, ignoring the top 0.005% of scores as outliers. Removing these outliers was required so that the locations with the strongest GRO-seq signal did not markedly affect the standard deviation and Z-score. To prevent biasing TSS calls toward genes in which transcription began at a single base pair, we averaged the Z-score data in a 10 bp moving window with a 1 bp step.</p><p>Because the true locations of most TSSs are unknown, we searched for TSSs of protein coding genes in a window that began 250 bp downstream of the WB WS230 start and continued upstream until the beginning of the closest transcript (protein coding, microRNA, snoRNA, snRNA, tRNA, or rRNA) on the opposite strand or until 100 bp away from the end of the closest gene on the same strand. If the TSS of a gene was contained within the transcript of another gene or had another protein coding gene within it on the same strand, we did not consider it. This reduced our possible gene list from 20,516 genes to 20,008 genes. Within these genes, we searched for the highest 10 bp Z-score average in the upstream window and concurrently calculated GRO-seq signal across a 200 bp sliding window with a 1 bp step. If the average GRO-seq signal dropped below 1 RPKM, we did not proceed further upstream to identify a TSS. This step was necessary to eliminate TSS calls that were clearly not supported by GRO-seq data. 
After identifying the highest 10 bp Z-score average, we mapped the highest 1 bp Z-score within the 10 bp window to be called as the TSS position.</p><p>To determine both the correct 10 bp Z-score average to use as a cutoff for calling TSSs and the effectiveness of the script at calling TSSs, we visually inspected all TSS calls across the X chromosome and determined whether the GRO-cap and GRO-seq signal supported the call. For each stage, we selected a Z-score cutoff for which &gt;90% of TSS calls were verified by visual inspection (wild-type embryo &gt;4.5, <italic>sdc-2</italic> mutant embryo &gt;4.725, wild-type starved L1 &gt;4.764, wild-type L3 &gt;5.01). To confirm these cutoffs held true across the genome, a random set of 250 autosomal genes with a 10 bp Z-score average greater than the 90th percentile was investigated visually for the wild-type embryo data set. These genes had &gt;90% agreement for TSS calls, confirming that the same cutoffs could be used for genes on both the X and autosomes. For TSS calls that differed by more than 2 kb upstream from the WB start, we inspected each gene visually to eliminate any TSS calls due to such complications as unannotated transcripts. In addition, for all genes that had an overlapping gene on the opposite strand, we verified each TSS visually. If no TSS had been called, but one could be identified visually with ease, we annotated the TSS. In a few cases, visual inspection allowed us to find a TSS significantly downstream of the WB-annotated start. The criteria used to accept the downstream TSS included a strong GRO-cap signal, no GRO-seq signal upstream of the GRO-cap signal, and no contradicting GRO-cap signal in any worm stage.</p><p>In total, we identified a TSS in at least one developmental stage for 31.7% (6353 genes) of all <italic>C</italic>. <italic>elegans</italic> protein coding genes. Not all TSS calls could be made from each stage. 
We called TSSs for 4246 genes in wild-type embryos, 2443 genes in wild-type starved L1, 4513 genes in wild-type L3, and 2809 genes in DCC mutant embryos. Of these calls, 875 were unique to wild-type embryos, 290 to wild-type starved L1, 1039 to wild-type L3, and 149 to DCC mutant embryos.</p><p>For many analyses, GRO-seq signal and GRO-cap signal were only analyzed across genes for which a TSS had been identified in that stage (<xref ref-type="fig" rid="fig1">Figure 1F,G</xref> and <xref ref-type="fig" rid="fig1s5 fig1s7">Figure 1—figure supplements 5–7</xref>, and <xref ref-type="supplementary-material" rid="SD4-data">Figure 4—source data 1</xref>). For other analyses (<xref ref-type="fig" rid="fig6">Figure 6B</xref>), the genes had to have a TSS call from each of the three wild-type stages, and the distances between TSSs among the different stages could not exceed 100 bp. The TSS position was used from the stage that had the highest 10 bp Z-score. For dosage compensation comparisons, a total of 4547 TSS calls was used (4246 TSS calls from wild-type embryos and the 301 TSS calls from <italic>sdc-2</italic> mutant embryos not in the wild-type embryos).</p></sec><sec id="s4-10-2"><title>Non-coding RNAs</title><p>Upon obtaining GRO-seq and GRO-cap data, we found that the WB gene models (WS225 models with their coordinates converted to WS230) for many small RNAs, including microRNAs and snoRNAs, lack properly annotated TSSs. As the regulation of expression of many ncRNAs, particularly microRNAs, is essential for proper development, we set out to properly annotate TSSs for many small ncRNAs. We utilized the protein coding gene strategy outlined above to identify likely TSS positions for these ncRNAs. As relatively few of these small ncRNAs exist in the genome, we examined the predicted TSS of each ncRNA gene visually to confirm or deny the computational TSS or to call a TSS. 
Through these analyses, we annotated TSSs for 74/141 snoRNAs and 70/180 microRNAs, including the five polycistronic microRNA clusters (<italic>mir-35-41</italic>; <italic>mir-42-44</italic>; <italic>mir-229</italic> and <italic>mir-64-66</italic>; <italic>mir-54-56</italic> [<xref ref-type="fig" rid="fig1">Figure 1B</xref>]; and <italic>mir-73-74</italic>) (<xref ref-type="bibr" rid="bib43">Lau et al., 2001</xref>). We found that most microRNAs are dosage compensated. The compensated microRNAs are the following: <italic>mir-47</italic>, <italic>mir-49</italic>, <italic>mir-54-56</italic> (polycistronic), <italic>mir-62</italic>, <italic>mir-63</italic>, <italic>mir-73-74</italic> (polycistronic), <italic>mir-75</italic>, <italic>mir-81</italic>, <italic>mir-82</italic>, <italic>mir-239.1</italic>, <italic>mir-239.2</italic>, and <italic>mir-791</italic>.</p></sec></sec><sec id="s4-11"><title>GRO-seq gene body, and 5′ pausing and 3′ pausing calculations</title><sec id="s4-11-1"><title>GRO-seq gene expression for genes &gt;1.1 kb</title><p>The gene expression level for a particular gene was calculated from the average GRO-seq signal within the body of the gene. To prevent the complication of factoring any 5′ or 3′ pausing into our gene expression calculations, we excluded the first and last 300 bp of the gene from the calculations. Within the remaining region, we totaled the GRO-seq signal at each base pair that could be mapped uniquely, and divided this total signal by the number of uniquely mappable bases within the region. To ensure that we averaged GRO-seq signal over a sufficiently large number of bases, we required that genes be ≥1.1 kb in length and have at least 250 uniquely mappable bases within the gene body region included in the calculation. 
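As a sketch of this calculation, in notation introduced here only for illustration, let <italic>B<sub>g</sub></italic> be the positions of gene <italic>g</italic> that remain after excluding the first and last 300 bp, <italic>s<sub>i</sub></italic> the normalized GRO-seq signal at position <italic>i</italic>, and <italic>m<sub>i</sub></italic> an indicator equal to 1 if position <italic>i</italic> is uniquely mappable and 0 otherwise. The gene body expression level <italic>E<sub>g</sub></italic> and the accompanying mappability requirement can then be written as:<disp-formula><tex-math>E_g = \frac{\sum_{i \in B_g} m_i\, s_i}{\sum_{i \in B_g} m_i}, \qquad \sum_{i \in B_g} m_i \geq 250</tex-math></disp-formula>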
To compare expression from <italic>sdc-2</italic> mutant and control RNAi embryos, we calculated differential gene expression using the R package DESeq (<xref rid="bib4">Anders and Huber, 2010</xref>).</p></sec><sec id="s4-11-2"><title>5′ Pausing ratio</title><p>To investigate the level of 5′ pausing, we divided the average GRO-seq signal in the 5′ end by the average GRO-seq signal in the gene body for genes ≥1.1 kb. To determine the average 5′ end GRO-seq signal, we determined the total GRO-seq signal in the region between 50 bp upstream and 100 bp downstream of the newly annotated TSSs, and divided it by the total number of uniquely mappable bases in the 150 bp region. We then divided this average 5′ GRO-seq signal by the average gene body expression level as calculated above. To be considered paused, we required that genes have an average gene body expression of ≥1 RPKM, and have a 5′ pausing ratio ≥2. We calculated 5′ pausing ratios for genes in every developmental stage using the TSS calls in the respective stage.</p><p>As 5′ pausing is rare and the genome is dense, many genes called paused by the above criteria may not actually be paused if the signal is due to transcription from another source such as 3′ accumulation from an upstream gene or antisense transcription from a bidirectional promoter. To eliminate these false or ambiguous pausing calls, we visually inspected each gene with a 5′ pausing ratio ≥2 in any state to determine whether the level of GRO-seq signal at the 5′ end could be due to transcription from another source. Those that were ambiguous or were likely due to unrelated transcription were removed from our analyses. 
We found the following ratios of genes (paused/total) to exhibit 5′ pausing: 15 of 3975 genes of wild-type embryos, 14 of 3984 genes of RNAi control embryos, 32 of 3969 genes of <italic>sdc-2</italic> mutant embryos, 166 of 2133 genes of starved L1 larvae, and 78 of 3899 genes of L3 larvae.</p><p>RNA Pol II ChIP-seq has been used to assess 5′ accumulation of Pol II binding during L1 arrest (<xref ref-type="bibr" rid="bib7">Baugh et al., 2009</xref>). Though 5′ accumulation was reported, pausing of elongation was not demonstrated. There is little overlap between our set of genes paused during L1 arrest and those reported to have 5′ accumulation of Pol II. This discrepancy is due to the facts that the correct TSSs were not then known for many of the genes and that much of the Pol II accumulated at the 5′ end had not begun elongation and was therefore not detected by GRO-seq (R Baugh, personal communication).</p></sec><sec id="s4-11-3"><title>3′ Pausing ratio</title><p>To investigate the level of 3′ accumulation of Pol II, we calculated a 3′ pausing ratio by dividing the average GRO-seq signal in the 3′ end by the average GRO-seq signal in the gene body. The average GRO-seq signal for the 3′ end was calculated for 200 bp sliding windows with a 1 bp step from 250 bp upstream to 750 bp downstream of the WB WS230 stop. Within each window, the GRO-seq signal was summed and divided by the total number of uniquely mappable bases within the window. The highest average GRO-seq signal was divided by the average gene expression as calculated above to create the 3′ pausing ratio. 
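As a sketch in notation introduced here only for illustration, let <italic>W</italic> be the set of 200 bp windows in this interval, <italic>s<sub>i</sub></italic> the GRO-seq signal at position <italic>i</italic>, <italic>m<sub>i</sub></italic> an indicator equal to 1 at uniquely mappable positions and 0 elsewhere, and <italic>E<sub>g</sub></italic> the average gene body signal of gene <italic>g</italic>. Assuming that windows without any uniquely mappable base are skipped, the 3′ pausing ratio <italic>R<sub>g</sub></italic> is:<disp-formula><tex-math>R^{3\prime}_g = \frac{1}{E_g}\, \max_{w \in W} \frac{\sum_{i \in w} m_i\, s_i}{\sum_{i \in w} m_i}</tex-math></disp-formula>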
We then plotted a histogram of 3′ pausing ratios for all genes with an average gene body expression of ≥1 RPKM.</p></sec><sec id="s4-11-4"><title>GRO-seq gene expression for genes greater than 250 bp</title><p>To expand the set of genes in which we compared gene expression between <italic>sdc-2</italic> mutant and control RNAi embryos, we calculated gene expression across all genes &gt;250 bp (see <xref ref-type="supplementary-material" rid="SD5-data">Figure 5—source data 1</xref>). For this calculation, we totaled the GRO-seq signal at each base pair that could be mapped uniquely across the entire length of the gene model, and divided this total signal by the number of uniquely mappable bases within the region. To ensure that we averaged the GRO-seq signal over a sufficiently large number of bases, we required that genes have at least 250 uniquely mappable bases within the gene body.</p></sec><sec id="s4-11-5"><title>Non-coding RNA expression</title><p>To investigate the effect of a DCC mutation on non-coding RNA expression, we calculated the average GRO-seq signal across microRNAs and tRNAs. For microRNAs, we totaled the GRO-seq signal from the TSS to the end of the primary transcript, including all downstream genes in a polycistronic cluster. We then divided this by the total number of uniquely mappable bases within the window. For tRNAs, we totaled the GRO-seq signal from the start to 50 bp downstream of the stop. We then divided the total signal by the total number of uniquely mappable bases within the window. We included the 50 bp downstream of the stop in our calculations for two reasons: first, tRNAs are very similar to one another and have a low level of unique mappability; second, in expressed tRNAs, GRO-seq signal is evident downstream of the end of the mature transcript. 
We only included tRNAs with at least 25 uniquely mappable base pairs in our expression values.</p></sec></sec><sec id="s4-12"><title>Elongation density index</title><p>To determine whether dosage compensation specifically affects transcription elongation across the X chromosome, we determined an elongation density index for each gene with a newly annotated TSS. We calculated the average GRO-seq signal across the last 75% of the gene and divided it by the average GRO-seq signal across the first 25% of the gene, excluding the first and last 500 bp of the gene from this calculation. To ensure that we averaged the GRO-seq signal over a sufficiently large number of bases, we required that genes be ≥2 kb in length. To avoid outlier ratios that can result from a low number of reads, genes with an average RPKM &lt;1 in the first 500 bp, or the first 25% or last 75% of the remaining gene were excluded from the analysis. To reduce the possibility that the elongation density index was influenced by the 3′ accumulation of Pol II of an upstream gene, we required that genes lack another gene on the same strand within 1 kb of the TSS. We analyzed 481 X-linked genes and 1861 autosomal genes.</p></sec><sec id="s4-13"><title>Creating metagene profiles</title><p>To compare GRO-seq signal across genes, we scaled genes to be the same length, allowing us to average the GRO-seq signal across them. To avoid small genes that could affect the sensitivity of our analyses, we required that genes be ≥1.5 kb in length. These genes were scaled to the same length as follows: the 5′ end (1000 bp upstream to 500 bp downstream of the TSS) and the 3′ end (500 bp upstream to 1000 bp downstream of the WB stop site) were not scaled, and the remainder of the gene was scaled to a length of 2 kb. 
We predicted that leaving the ends of the gene unscaled might allow us to better identify any effects that occurred at the ends of genes.</p></sec><sec id="s4-14"><title>Calculation of average GRO-seq signal</title><p>To investigate GRO-seq trends surrounding the TSS or across scaled metagenes, we plotted the average GRO-seq signal across these regions of interest. To do so, we totaled the strand-specific GRO-seq signal for every gene in the gene list at each base pair in the region. We then divided the total GRO-seq signal at each base pair by the number of genes that are uniquely mappable at that base pair. We then took a 25 bp moving window average of this average GRO-seq signal.</p></sec><sec id="s4-15"><title>Heat maps</title><p>We used the <italic>Python</italic> package matplotlib to produce heat maps showing the GRO-seq expression of the TSSs of all genes, and the difference in expression between DCC mutant and control RNAi embryos. To show the GRO-seq signal at either the empirically determined TSS or WB start, the signal from each gene with a TSS call was plotted, one gene per row, and the GRO-seq signal was averaged across 15 bp windows. Genes were ordered from top to bottom with increasing distance between the WB start and the TSS called from GRO-cap. To show expression changes in the DCC mutant, we scaled each gene ≥1.5 kb to the same length as described above. We then totaled the GRO-seq signal from the DCC mutant embryos and separately from control RNAi embryos across 100 bp windows and calculated an average. The log<sub>2</sub> of the ratio was plotted for every 100 bp bin across the length of the metagene.</p></sec><sec id="s4-16"><title>Calculation of upstream divergent transcription</title><p>To calculate the relative level of upstream divergent transcription for each gene, we determined the maximum sense and antisense transcription in a 150 bp window within 500 bp of the TSS. 
Upstream divergent transcript data from human and <italic>Drosophila</italic> samples were obtained (<xref ref-type="bibr" rid="bib15">Core et al., 2012</xref>). We used R to plot kernel density estimations of the log<sub>2</sub> (sense/antisense) ratio for <italic>C</italic>. <italic>elegans</italic>, human, and <italic>Drosophila</italic>. To determine the distance from each TSS to the closest upstream divergent gene, we searched for the closest transcript (protein coding, non-coding RNA, tRNA, or rRNA) upstream and antisense to the TSS. Prior to the search, the 6353 protein coding genes with new TSS calls were re-annotated with the most upstream TSS identified. To determine the upstream distance between the TSS and the start of upstream divergent transcription, we searched 500 bp upstream of the TSS to identify the position with the highest level of GRO-cap TAP+ minus the TAP− signal.</p></sec><sec id="s4-17"><title>DCC ChIP-seq</title><p>Wild-type N2 animals were grown on NG agar plates with HB101 bacteria. Mixed-stage embryos were harvested from gravid hermaphrodites, and cross-linked with 2% formaldehyde for 30 min. Cross-linked embryos were resuspended in 3 ml of FA buffer (150 mM NaCl, 50 mM HEPES-KOH (pH 7.6), 1 mM EDTA, 1% Triton X-100, 0.1% sodium deoxycholate, 1 mM DTT, protease inhibitor cocktail) for every 1 g of embryos. This mixture was frozen in liquid nitrogen, then ground under liquid nitrogen by mortar and pestle. Chromatin was sheared with a Covaris S2 sonicator (20% duty factor, power level 8, 200 cycles per burst) for a total of 30 min processing time (60 s ON, 45 s OFF, 30 cycles).</p><p>To perform the ChIP reactions, extract containing approximately 2 mg of protein was incubated in a microfuge tube with 6.6 μg of anti-DPY-27 or random IgG antibodies overnight at 4°C. A 25 μl bed volume of protein A Sepharose beads was added to the ChIP for 2 hr. 
ChIPs were washed for 5 min at room temperature twice with FA buffer (150 mM NaCl), once with FA buffer (1 M NaCl), once with FA buffer (500 mM NaCl), once with TEL buffer (10 mM Tris-HCl (pH 8.0), 250 mM LiCl, 1% NP-40, 1% sodium deoxycholate, 1 mM EDTA), and twice with TE buffer. Protein and DNA were eluted twice with 1% SDS, 250 mM NaCl, 1 mM EDTA at 65°C for 15 min. After elution, sequencing libraries were prepared as published (<xref ref-type="bibr" rid="bib66">Zhong et al., 2010</xref>) with minor changes: sequencing adapters were as described (<xref ref-type="bibr" rid="bib44">Lefrancois et al., 2009</xref>) and adapters were ligated using the NEB Quick Ligation Kit (M2200).</p><p>Libraries were sequenced on the Illumina GA2 platform. After barcode removal, 32 bp reads were aligned uniquely to the <italic>C</italic>. <italic>elegans</italic> WS190 genome using bowtie. MACS (version 1.4) was used to call peaks and create pileups with DPY-27 ChIP as the treatment and random IgG ChIP as the control. To account for read depth, the ChIP signal was normalized to the total number of millions of reads that uniquely aligned to the genome. To correct for non-specific binding, the IgG signal was subtracted from the DPY-27 signal. The resulting ChIP-seq signal from two biological replicates was averaged at each base pair genome-wide.</p></sec><sec id="s4-18"><title>Re-analysis of RNA polymerase II ChIP-chip</title><p>Raw RNA Pol II ChIP-chip data from experiments using 8WG16 antibody (raised against the hypo-phosphorylated Pol II C-terminal domain) (<xref ref-type="bibr" rid="bib53">Pferdehirt et al., 2011</xref>) was obtained from GEO (accession numbers GSM634580 and GSM634582). The ChIP signal was normalized to the GC content of individual probes using MA2C (Song et al., 2007). 
The average ChIP-chip signal surrounding the TSS was calculated using the sitepro script within the CEAS package version 1.0.2 (<ext-link ext-link-type="uri" xlink:href="http://liulab.dfci.harvard.edu/CEAS">http://liulab.dfci.harvard.edu/CEAS</ext-link>).</p></sec><sec id="s4-19"><title>DNA motif searches</title><p>To determine whether <italic>C</italic>. <italic>elegans</italic> genes contain known core promoter motifs such as the TATA-box and Initiator element (Inr), we performed motif searches using MEME (<ext-link ext-link-type="uri" xlink:href="http://meme.ebi.edu.au/meme/intro.html">http://meme.ebi.edu.au/meme/intro.html</ext-link>). To identify a worm TATA-box, we searched for strand-specific motifs within a region 15–45 bp upstream of the TSS. To identify a worm Inr, we searched for strand-specific motifs within 10 bp of the TSS in either direction. These searches identified a TATA consensus of TATAWAWR, compared to TATAWAWR for yeast (<xref ref-type="bibr" rid="bib57">Rhee and Pugh, 2012</xref>), and an Inr consensus of YCAYTY, compared to YYANWYY in humans and TCAYTY in <italic>Drosophila</italic> (<xref ref-type="bibr" rid="bib38">Juven-Gershon and Kadonaga, 2010</xref>). To determine where these motifs lie, we calculated their distance from the TSS. For TATA, we calculated how far upstream the most 5′ base lies. For the Inr, we calculated how far the adenine lies from the TSS. In other organisms the adenine has been shown to be the +1 nucleotide in transcription; the location of the worm Inr relative to the TSS suggests that this is true in <italic>C</italic>. 
<italic>elegans</italic>.</p></sec></sec></body><back><ack id="ack"><title>Acknowledgements</title><p>We thank the Vincent J Coates Genomics Sequencing Laboratory for Illumina sequencing, D Stalford for assistance with figures, and T Cline for comments on the manuscript.</p></ack><sec sec-type="additional-information"><title>Additional information</title><fn-group content-type="competing-interest"><title>Competing interests</title><fn fn-type="conflict" id="conf1"><p>The authors declare that no competing interests exist.</p></fn></fn-group><fn-group content-type="author-contribution"><title>Author contributions</title><fn fn-type="con" id="con1"><p>WSK, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article</p></fn><fn fn-type="con" id="con2"><p>LJC, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article</p></fn><fn fn-type="con" id="con3"><p>CTW, Acquisition of data</p></fn><fn fn-type="con" id="con4"><p>JTL, Conception and design, Analysis and interpretation of data, Drafting or revising the article</p></fn><fn fn-type="con" id="con5"><p>BJM, Conception and design, Analysis and interpretation of data, Drafting or revising the article</p></fn></fn-group></sec><sec sec-type="supplementary-material"><title>Additional files</title><sec sec-type="datasets"><title>Major datasets</title><p>The following dataset was generated:</p><p><related-object content-type="generated-dataset" document-id="Dataset ID and/or url" document-id-type="dataset" document-type="data" id="dataro1"><name><surname>Kruesi</surname><given-names>WS</given-names></name>, <name><surname>Core</surname><given-names>LJ</given-names></name>, <name><surname>Waters</surname><given-names>CT</given-names></name>, <name><surname>Lis</surname><given-names>JT</given-names></name>, <name><surname>Meyer</surname><given-names>BJ</given-names></name>, <year>2013</year><x>, </x><source>Condensin 
controls recruitment of RNA polymerase II to achieve X-chromosome dosage compensation</source><x>, </x><object-id pub-id-type="art-access-id">GSE43087</object-id><x>; </x><ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE43087">http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE43087</ext-link><x>, </x><comment>In the public domain at GEO: <ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/geo/">http://www.ncbi.nlm.nih.gov/geo/</ext-link>.</comment></related-object></p></sec></sec><ref-list><title>References</title><ref id="bib1"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Adelman</surname><given-names>K</given-names></name><name><surname>Lis</surname><given-names>JT</given-names></name></person-group><year>2012</year><article-title>Promoter-proximal pausing of RNA polymerase II: emerging roles in metazoans</article-title><source>Nat Rev Genet</source><volume>13</volume><fpage>720</fpage><lpage>31</lpage><pub-id pub-id-type="doi">10.1038/nrg3293</pub-id></element-citation></ref><ref id="bib2"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Alekseyenko</surname><given-names>AA</given-names></name><name><surname>Larschan</surname><given-names>E</given-names></name><name><surname>Lai</surname><given-names>WR</given-names></name><name><surname>Park</surname><given-names>PJ</given-names></name><name><surname>Kuroda</surname><given-names>MI</given-names></name></person-group><year>2006</year><article-title>High-resolution ChIP-chip analysis reveals that the Drosophila MSL complex selectively identifies active genes on the male X chromosome</article-title><source>Genes Dev</source><volume>20</volume><fpage>848</fpage><lpage>57</lpage><pub-id pub-id-type="doi">10.1101/gad.1400206</pub-id></element-citation></ref><ref id="bib3"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Allen</surname><given-names>MA</given-names></name><name><surname>Hillier</surname><given-names>LW</given-names></name><name><surname>Waterston</surname><given-names>RH</given-names></name><name><surname>Blumenthal</surname><given-names>T</given-names></name></person-group><year>2011</year><article-title>A global analysis of <italic>C. elegans</italic> trans-splicing</article-title><source>Genome Res</source><volume>21</volume><fpage>255</fpage><lpage>64</lpage><pub-id pub-id-type="doi">10.1101/gr.113811.110</pub-id></element-citation></ref><ref id="bib4"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Anders</surname><given-names>S</given-names></name><name><surname>Huber</surname><given-names>W</given-names></name></person-group><year>2010</year><article-title>Differential expression analysis for sequence count data</article-title><source>Genome Biol</source><volume>11</volume><fpage>R106</fpage><pub-id pub-id-type="doi">10.1186/gb-2010-11-10-r106</pub-id></element-citation></ref><ref id="bib5"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Aragon</surname><given-names>L</given-names></name><name><surname>Martinez-Perez</surname><given-names>E</given-names></name><name><surname>Merkenschlager</surname><given-names>M</given-names></name></person-group><year>2013</year><article-title>Condensin, cohesin and the control of chromatin states</article-title><source>Curr Opin Genet Dev</source><volume>23</volume><fpage>204</fpage><lpage>11</lpage><pub-id pub-id-type="doi">10.1016/j.gde.2012.11.004</pub-id></element-citation></ref><ref id="bib6"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Bauer</surname><given-names>CR</given-names></name><name><surname>Hartl</surname><given-names>TA</given-names></name><name><surname>Bosco</surname><given-names>G</given-names></name></person-group><year>2012</year><article-title>Condensin II promotes the formation of chromosome territories by inducing axial compaction of polyploid interphase chromosomes</article-title><source>PLoS Genet</source><volume>8</volume><fpage>e1002873</fpage><pub-id pub-id-type="doi">10.1371/journal.pgen.1002873</pub-id></element-citation></ref><ref id="bib7"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Baugh</surname><given-names>LR</given-names></name><name><surname>Demodena</surname><given-names>J</given-names></name><name><surname>Sternberg</surname><given-names>PW</given-names></name></person-group><year>2009</year><article-title>RNA Pol II accumulates at promoters of growth genes during developmental arrest</article-title><source>Science</source><volume>324</volume><fpage>92</fpage><lpage>4</lpage><pub-id pub-id-type="doi">10.1126/science.1169628</pub-id></element-citation></ref><ref id="bib8"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Blumenthal</surname><given-names>T</given-names></name></person-group><year>2012</year><article-title>Trans-splicing and operons in <italic>C. 
elegans</italic></article-title><source>WormBook</source><fpage>1</fpage><lpage>11</lpage><pub-id pub-id-type="doi">10.1895/wormbook.1.5.2</pub-id></element-citation></ref><ref id="bib9"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Chan</surname><given-names>RC</given-names></name><name><surname>Severson</surname><given-names>AF</given-names></name><name><surname>Meyer</surname><given-names>BJ</given-names></name></person-group><year>2004</year><article-title>Condensin restructures chromosomes in preparation for meiotic divisions</article-title><source>J Cell Biol</source><volume>167</volume><fpage>613</fpage><lpage>25</lpage><pub-id pub-id-type="doi">10.1083/jcb.200408061</pub-id></element-citation></ref><ref id="bib10"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname><given-names>RA</given-names></name><name><surname>Down</surname><given-names>TA</given-names></name><name><surname>Stempor</surname><given-names>P</given-names></name><name><surname>Chen</surname><given-names>QB</given-names></name><name><surname>Egelhofer</surname><given-names>TA</given-names></name><name><surname>Hillier</surname><given-names>LW</given-names></name><etal/></person-group><year>2013</year><article-title>The landscape of RNA polymerase II transcription initiation in <italic>C. 
elegans</italic> reveals promoter and enhancer architectures</article-title><source>Genome Res</source><comment>advance online publication</comment><pub-id pub-id-type="doi">10.1101/gr.153668.112</pub-id></element-citation></ref><ref id="bib11"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Chuang</surname><given-names>PT</given-names></name><name><surname>Albertson</surname><given-names>DG</given-names></name><name><surname>Meyer</surname><given-names>BJ</given-names></name></person-group><year>1994</year><article-title>DPY-27: a chromosome condensation protein homolog that regulates <italic>C. elegans</italic> dosage compensation through association with the X chromosome</article-title><source>Cell</source><volume>79</volume><fpage>459</fpage><lpage>74</lpage><pub-id pub-id-type="doi">10.1016/0092-8674(94)90255-0</pub-id></element-citation></ref><ref id="bib12"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Conrad</surname><given-names>T</given-names></name><name><surname>Akhtar</surname><given-names>A</given-names></name></person-group><year>2012</year><article-title>Dosage compensation in <italic>Drosophila melanogaster</italic>: epigenetic fine-tuning of chromosome-wide transcription</article-title><source>Nat Rev Genet</source><volume>13</volume><fpage>123</fpage><lpage>34</lpage><pub-id pub-id-type="doi">10.1038/nrg3124</pub-id></element-citation></ref><ref id="bib13"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Conrad</surname><given-names>T</given-names></name><name><surname>Cavalli</surname><given-names>FM</given-names></name><name><surname>Holz</surname><given-names>H</given-names></name><name><surname>Hallacli</surname><given-names>E</given-names></name><name><surname>Kind</surname><given-names>J</given-names></name><name><surname>Ilik</surname><given-names>I</given-names></name><etal/></person-group><year>2012a</year><article-title>The MOF chromobarrel domain controls genome-wide H4K16 acetylation and spreading of the MSL complex</article-title><source>Dev Cell</source><volume>22</volume><fpage>610</fpage><lpage>24</lpage><pub-id pub-id-type="doi">10.1016/j.devcel.2011.12.016</pub-id></element-citation></ref><ref id="bib14"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Conrad</surname><given-names>T</given-names></name><name><surname>Cavalli</surname><given-names>FM</given-names></name><name><surname>Vaquerizas</surname><given-names>JM</given-names></name><name><surname>Luscombe</surname><given-names>NM</given-names></name><name><surname>Akhtar</surname><given-names>A</given-names></name></person-group><year>2012b</year><article-title>Drosophila dosage compensation involves enhanced Pol II recruitment to male X-linked promoters</article-title><source>Science</source><volume>337</volume><fpage>742</fpage><lpage>6</lpage><pub-id pub-id-type="doi">10.1126/science.1221428</pub-id></element-citation></ref><ref id="bib15"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Core</surname><given-names>LJ</given-names></name><name><surname>Waterfall</surname><given-names>JJ</given-names></name><name><surname>Gilchrist</surname><given-names>DA</given-names></name><name><surname>Fargo</surname><given-names>DC</given-names></name><name><surname>Kwak</surname><given-names>H</given-names></name><name><surname>Adelman</surname><given-names>K</given-names></name><etal/></person-group><year>2012</year><article-title>Defining the status of RNA polymerase at promoters</article-title><source>Cell Rep</source><volume>2</volume><fpage>1025</fpage><lpage>35</lpage><pub-id pub-id-type="doi">10.1016/j.celrep.2012.08.034</pub-id></element-citation></ref><ref id="bib16"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Core</surname><given-names>LJ</given-names></name><name><surname>Waterfall</surname><given-names>JJ</given-names></name><name><surname>Lis</surname><given-names>JT</given-names></name></person-group><year>2008</year><article-title>Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters</article-title><source>Science</source><volume>322</volume><fpage>1845</fpage><lpage>8</lpage><pub-id pub-id-type="doi">10.1126/science.1162228</pub-id></element-citation></ref><ref id="bib17"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Csankovszki</surname><given-names>G</given-names></name><name><surname>Collette</surname><given-names>K</given-names></name><name><surname>Spahl</surname><given-names>K</given-names></name><name><surname>Carey</surname><given-names>J</given-names></name><name><surname>Snyder</surname><given-names>M</given-names></name><name><surname>Petty</surname><given-names>E</given-names></name><etal/></person-group><year>2009</year><article-title>Three distinct condensin complexes control <italic>C. 
elegans</italic> chromosome dynamics</article-title><source>Curr Biol</source><volume>19</volume><fpage>9</fpage><lpage>19</lpage><pub-id pub-id-type="doi">10.1016/j.cub.2008.12.006</pub-id></element-citation></ref><ref id="bib18"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Dawes</surname><given-names>HE</given-names></name><name><surname>Berlin</surname><given-names>DS</given-names></name><name><surname>Lapidus</surname><given-names>DM</given-names></name><name><surname>Nusbaum</surname><given-names>C</given-names></name><name><surname>Davis</surname><given-names>TL</given-names></name><name><surname>Meyer</surname><given-names>BJ</given-names></name></person-group><year>1999</year><article-title>Dosage compensation proteins targeted to X chromosomes by a determinant of hermaphrodite fate</article-title><source>Science</source><volume>284</volume><fpage>1800</fpage><lpage>4</lpage><pub-id pub-id-type="doi">10.1126/science.284.5421.1800</pub-id></element-citation></ref><ref id="bib19"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Deng</surname><given-names>X</given-names></name><name><surname>Berletch</surname><given-names>JB</given-names></name><name><surname>Ma</surname><given-names>W</given-names></name><name><surname>Nguyen</surname><given-names>DK</given-names></name><name><surname>Hiatt</surname><given-names>JB</given-names></name><name><surname>Noble</surname><given-names>WS</given-names></name><etal/></person-group><year>2013</year><article-title>Mammalian X upregulation is associated with enhanced transcription initiation, RNA half-life, and MOF-mediated H4K16 acetylation</article-title><source>Dev Cell</source><volume>25</volume><fpage>55</fpage><lpage>68</lpage><pub-id pub-id-type="doi">10.1016/j.devcel.2013.01.028</pub-id></element-citation></ref><ref id="bib20"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Deng</surname><given-names>X</given-names></name><name><surname>Hiatt</surname><given-names>JB</given-names></name><name><surname>Nguyen</surname><given-names>DK</given-names></name><name><surname>Ercan</surname><given-names>S</given-names></name><name><surname>Sturgill</surname><given-names>D</given-names></name><name><surname>Hillier</surname><given-names>LW</given-names></name><etal/></person-group><year>2011</year><article-title>Evidence for compensatory upregulation of expressed X-linked genes in mammals, <italic>Caenorhabditis elegans</italic> and <italic>Drosophila melanogaster</italic></article-title><source>Nat Genet</source><volume>43</volume><fpage>1179</fpage><lpage>85</lpage><pub-id pub-id-type="doi">10.1038/ng.948</pub-id></element-citation></ref><ref id="bib21"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Disteche</surname><given-names>CM</given-names></name></person-group><year>2012</year><article-title>Dosage compensation of the sex chromosomes</article-title><source>Annu Rev Genet</source><volume>46</volume><fpage>537</fpage><lpage>60</lpage><pub-id pub-id-type="doi">10.1146/annurev-genet-110711-155454</pub-id></element-citation></ref><ref id="bib22"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ercan</surname><given-names>S</given-names></name><name><surname>Dick</surname><given-names>LL</given-names></name><name><surname>Lieb</surname><given-names>JD</given-names></name></person-group><year>2009</year><article-title>The <italic>C. 
elegans</italic> dosage compensation complex propagates dynamically and independently of X chromosome sequence</article-title><source>Curr Biol</source><volume>19</volume><fpage>1777</fpage><lpage>87</lpage><pub-id pub-id-type="doi">10.1016/j.cub.2009.09.047</pub-id></element-citation></ref><ref id="bib23"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Fejes-Toth</surname><given-names>F</given-names></name><name><surname>Sotirova</surname><given-names>V</given-names></name><name><surname>Sachidanandam</surname><given-names>R</given-names></name><name><surname>Assaf</surname><given-names>G</given-names></name><name><surname>Hannon</surname><given-names>GJ</given-names></name><name><surname>Kapranov</surname><given-names>P</given-names></name><etal/></person-group><year>2009</year><article-title>Post-transcriptional processing generates a diversity of 5’-modified long and short RNAs</article-title><source>Nature</source><volume>457</volume><fpage>1028</fpage><lpage>32</lpage><pub-id pub-id-type="doi">10.1038/nature07759</pub-id></element-citation></ref><ref id="bib24"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ferrari</surname><given-names>F</given-names></name><name><surname>Jung</surname><given-names>YL</given-names></name><name><surname>Kharchenko</surname><given-names>PV</given-names></name><name><surname>Plachetka</surname><given-names>A</given-names></name><name><surname>Alekseyenko</surname><given-names>AA</given-names></name><name><surname>Kuroda</surname><given-names>MI</given-names></name><etal/></person-group><year>2013</year><article-title>Comment on “Drosophila dosage compensation involves enhanced Pol II recruitment to male X-linked promoters”</article-title><source>Science</source><volume>340</volume><fpage>273</fpage><pub-id pub-id-type="doi">10.1126/science.1231815</pub-id></element-citation></ref><ref id="bib25"><element-citation 
publication-type="journal"><person-group person-group-type="author"><name><surname>Gelbart</surname><given-names>ME</given-names></name><name><surname>Kuroda</surname><given-names>MI</given-names></name></person-group><year>2009</year><article-title>Drosophila dosage compensation: a complex voyage to the X chromosome</article-title><source>Development</source><volume>136</volume><fpage>1399</fpage><lpage>410</lpage><pub-id pub-id-type="doi">10.1242/dev.029645</pub-id></element-citation></ref><ref id="bib26"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gilfillan</surname><given-names>GD</given-names></name><name><surname>Straub</surname><given-names>T</given-names></name><name><surname>de Wit</surname><given-names>E</given-names></name><name><surname>Greil</surname><given-names>F</given-names></name><name><surname>Lamm</surname><given-names>R</given-names></name><name><surname>van Steensel</surname><given-names>B</given-names></name><etal/></person-group><year>2006</year><article-title>Chromosome-wide gene-specific targeting of the Drosophila dosage compensation complex</article-title><source>Genes Dev</source><volume>20</volume><fpage>858</fpage><lpage>70</lpage><pub-id pub-id-type="doi">10.1101/gad.1399406</pub-id></element-citation></ref><ref id="bib27"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gromak</surname><given-names>N</given-names></name><name><surname>West</surname><given-names>S</given-names></name><name><surname>Proudfoot</surname><given-names>NJ</given-names></name></person-group><year>2006</year><article-title>Pause sites promote transcriptional termination of mammalian RNA polymerase II</article-title><source>Mol Cell Biol</source><volume>26</volume><fpage>3986</fpage><lpage>96</lpage><pub-id pub-id-type="doi">10.1128/mcb.26.10.3986-3996.2006</pub-id></element-citation></ref><ref id="bib28"><element-citation 
publication-type="journal"><person-group person-group-type="author"><name><surname>Gu</surname><given-names>W</given-names></name><name><surname>Lee</surname><given-names>HC</given-names></name><name><surname>Chaves</surname><given-names>D</given-names></name><name><surname>Youngman</surname><given-names>EM</given-names></name><name><surname>Pazour</surname><given-names>GJ</given-names></name><name><surname>Conte</surname><given-names>D</given-names><suffix>Jnr</suffix></name><etal/></person-group><year>2012</year><article-title>CapSeq and CIP-TAP identify Pol II start sites and reveal capped small RNAs as <italic>C. elegans</italic> piRNA precursors</article-title><source>Cell</source><volume>151</volume><fpage>1488</fpage><lpage>500</lpage><pub-id pub-id-type="doi">10.1016/j.cell.2012.11.023</pub-id></element-citation></ref><ref id="bib29"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hagstrom</surname><given-names>KA</given-names></name><name><surname>Holmes</surname><given-names>VF</given-names></name><name><surname>Cozzarelli</surname><given-names>NR</given-names></name><name><surname>Meyer</surname><given-names>BJ</given-names></name></person-group><year>2002</year><article-title><italic>C. 
elegans</italic> condensin promotes mitotic chromosome architecture, centromere organization, and sister chromatid segregation during mitosis and meiosis</article-title><source>Genes Dev</source><volume>16</volume><fpage>729</fpage><lpage>42</lpage><pub-id pub-id-type="doi">10.1101/gad.968302</pub-id></element-citation></ref><ref id="bib30"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hilfiker</surname><given-names>A</given-names></name><name><surname>Hilfiker-Kleiner</surname><given-names>D</given-names></name><name><surname>Pannuti</surname><given-names>A</given-names></name><name><surname>Lucchesi</surname><given-names>JC</given-names></name></person-group><year>1997</year><article-title>mof, a putative acetyl transferase gene related to the Tip60 and MOZ human genes and to the SAS genes of yeast, is required for dosage compensation in Drosophila</article-title><source>EMBO J</source><volume>16</volume><fpage>2054</fpage><lpage>60</lpage><pub-id pub-id-type="doi">10.1093/emboj/16.8.2054</pub-id></element-citation></ref><ref id="bib31"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hirano</surname><given-names>T</given-names></name></person-group><year>2012</year><article-title>Condensins: universal organizers of chromosomes with diverse functions</article-title><source>Genes Dev</source><volume>26</volume><fpage>1659</fpage><lpage>78</lpage><pub-id pub-id-type="doi">10.1101/gad.194746.112</pub-id></element-citation></ref><ref id="bib32"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hirano</surname><given-names>T</given-names></name><name><surname>Mitchison</surname><given-names>TJ</given-names></name></person-group><year>1994</year><article-title>A heterodimeric coiled-coil protein required for mitotic chromosome condensation in 
vitro</article-title><source>Cell</source><volume>79</volume><fpage>449</fpage><lpage>58</lpage><pub-id pub-id-type="doi">10.1016/0092-8674(94)90254-2</pub-id></element-citation></ref><ref id="bib33"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ingolia</surname><given-names>NT</given-names></name></person-group><year>2010</year><article-title>Genome-wide translational profiling by ribosome footprinting</article-title><source>Methods Enzymol</source><volume>470</volume><fpage>119</fpage><lpage>42</lpage><pub-id pub-id-type="doi">10.1016/s0076-6879(10)70006-9</pub-id></element-citation></ref><ref id="bib34"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Jan</surname><given-names>CH</given-names></name><name><surname>Friedman</surname><given-names>RC</given-names></name><name><surname>Ruby</surname><given-names>JG</given-names></name><name><surname>Bartel</surname><given-names>DP</given-names></name></person-group><year>2011</year><article-title>Formation, regulation and evolution of <italic>Caenorhabditis elegans</italic> 3’UTRs</article-title><source>Nature</source><volume>469</volume><fpage>97</fpage><lpage>101</lpage><pub-id pub-id-type="doi">10.1038/nature09616</pub-id></element-citation></ref><ref id="bib35"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Jans</surname><given-names>J</given-names></name><name><surname>Gladden</surname><given-names>JM</given-names></name><name><surname>Ralston</surname><given-names>EJ</given-names></name><name><surname>Pickle</surname><given-names>CS</given-names></name><name><surname>Michel</surname><given-names>AH</given-names></name><name><surname>Pferdehirt</surname><given-names>RR</given-names></name><etal/></person-group><year>2009</year><article-title>A condensin-like dosage compensation complex acts at a distance to control expression throughout the 
genome</article-title><source>Genes Dev</source><volume>23</volume><fpage>602</fpage><lpage>18</lpage><pub-id pub-id-type="doi">10.1101/gad.1751109</pub-id></element-citation></ref><ref id="bib36"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Jeon</surname><given-names>Y</given-names></name><name><surname>Sarma</surname><given-names>K</given-names></name><name><surname>Lee</surname><given-names>JT</given-names></name></person-group><year>2012</year><article-title>New and Xisting regulatory mechanisms of X chromosome inactivation</article-title><source>Curr Opin Genet Dev</source><volume>22</volume><fpage>62</fpage><lpage>71</lpage><pub-id pub-id-type="doi">10.1016/j.gde.2012.02.007</pub-id></element-citation></ref><ref id="bib37"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Jue</surname><given-names>NK</given-names></name><name><surname>Murphy</surname><given-names>MB</given-names></name><name><surname>Kasowitz</surname><given-names>SD</given-names></name><name><surname>Qureshi</surname><given-names>SM</given-names></name><name><surname>Obergfell</surname><given-names>CJ</given-names></name><name><surname>Elsisi</surname><given-names>S</given-names></name><etal/></person-group><year>2013</year><article-title>Determination of dosage compensation of the mammalian X chromosome by RNA-seq is dependent on analytical approach</article-title><source>BMC Genomics</source><volume>14</volume><fpage>150</fpage><pub-id pub-id-type="doi">10.1186/1471-2164-14-150</pub-id></element-citation></ref><ref id="bib38"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Juven-Gershon</surname><given-names>T</given-names></name><name><surname>Kadonaga</surname><given-names>JT</given-names></name></person-group><year>2010</year><article-title>Regulation of gene expression via the core promoter and the basal transcriptional 
machinery</article-title><source>Dev Biol</source><volume>339</volume><fpage>225</fpage><lpage>9</lpage><pub-id pub-id-type="doi">10.1016/j.ydbio.2009.08.009</pub-id></element-citation></ref><ref id="bib39"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kapranov</surname><given-names>P</given-names></name><name><surname>Cheng</surname><given-names>J</given-names></name><name><surname>Dike</surname><given-names>S</given-names></name><name><surname>Nix</surname><given-names>DA</given-names></name><name><surname>Duttagupta</surname><given-names>R</given-names></name><name><surname>Willingham</surname><given-names>AT</given-names></name><etal/></person-group><year>2007</year><article-title>RNA maps reveal new RNA classes and a possible function for pervasive transcription</article-title><source>Science</source><volume>316</volume><fpage>1484</fpage><lpage>8</lpage><pub-id pub-id-type="doi">10.1126/science.1138341</pub-id></element-citation></ref><ref id="bib40"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kimura</surname><given-names>K</given-names></name><name><surname>Hirano</surname><given-names>T</given-names></name></person-group><year>1997</year><article-title>ATP-dependent positive supercoiling of DNA by 13S condensin: a biochemical implication for chromosome condensation</article-title><source>Cell</source><volume>90</volume><fpage>625</fpage><lpage>34</lpage><pub-id pub-id-type="doi">10.1016/S0092-8674(00)80524-3</pub-id></element-citation></ref><ref id="bib41"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kwak</surname><given-names>H</given-names></name><name><surname>Fuda</surname><given-names>NJ</given-names></name><name><surname>Core</surname><given-names>LJ</given-names></name><name><surname>Lis</surname><given-names>JT</given-names></name></person-group><year>2013</year><article-title>Precise 
maps of RNA polymerase reveal how promoters direct initiation and pausing</article-title><source>Science</source><volume>339</volume><fpage>950</fpage><lpage>3</lpage><pub-id pub-id-type="doi">10.1126/science.1229386</pub-id></element-citation></ref><ref id="bib42"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Larschan</surname><given-names>E</given-names></name><name><surname>Bishop</surname><given-names>EP</given-names></name><name><surname>Kharchenko</surname><given-names>PV</given-names></name><name><surname>Core</surname><given-names>LJ</given-names></name><name><surname>Lis</surname><given-names>JT</given-names></name><name><surname>Park</surname><given-names>PJ</given-names></name><etal/></person-group><year>2011</year><article-title>X chromosome dosage compensation via enhanced transcriptional elongation in Drosophila</article-title><source>Nature</source><volume>471</volume><fpage>115</fpage><lpage>8</lpage><pub-id pub-id-type="doi">10.1038/nature09757</pub-id></element-citation></ref><ref id="bib43"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lau</surname><given-names>NC</given-names></name><name><surname>Lim</surname><given-names>LP</given-names></name><name><surname>Weinstein</surname><given-names>EG</given-names></name><name><surname>Bartel</surname><given-names>DP</given-names></name></person-group><year>2001</year><article-title>An abundant class of tiny RNAs with probable regulatory roles in <italic>Caenorhabditis elegans</italic></article-title><source>Science</source><volume>294</volume><fpage>858</fpage><lpage>62</lpage><pub-id pub-id-type="doi">10.1126/science.1065062</pub-id></element-citation></ref><ref id="bib44"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Lefrancois</surname><given-names>P</given-names></name><name><surname>Euskirchen</surname><given-names>GM</given-names></name><name><surname>Auerbach</surname><given-names>RK</given-names></name><name><surname>Rozowsky</surname><given-names>J</given-names></name><name><surname>Gibson</surname><given-names>T</given-names></name><name><surname>Yellman</surname><given-names>CM</given-names></name><etal/></person-group><year>2009</year><article-title>Efficient yeast ChIP-Seq using multiplex short-read DNA sequencing</article-title><source>BMC Genomics</source><volume>10</volume><fpage>37</fpage><pub-id pub-id-type="doi">10.1186/1471-2164-10-37</pub-id></element-citation></ref><ref id="bib45"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Legube</surname><given-names>G</given-names></name><name><surname>McWeeney</surname><given-names>SK</given-names></name><name><surname>Lercher</surname><given-names>MJ</given-names></name><name><surname>Akhtar</surname><given-names>A</given-names></name></person-group><year>2006</year><article-title>X-chromosome-wide profiling of MSL-1 distribution and dosage compensation in Drosophila</article-title><source>Genes Dev</source><volume>20</volume><fpage>871</fpage><lpage>83</lpage><pub-id pub-id-type="doi">10.1101/gad.377506</pub-id></element-citation></ref><ref id="bib46"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lin</surname><given-names>F</given-names></name><name><surname>Xing</surname><given-names>K</given-names></name><name><surname>Zhang</surname><given-names>J</given-names></name><name><surname>He</surname><given-names>X</given-names></name></person-group><year>2012</year><article-title>Expression reduction in mammalian X chromosome evolution refutes Ohno’s hypothesis of dosage compensation</article-title><source>Proc Natl Acad Sci 
USA</source><volume>109</volume><fpage>11752</fpage><lpage>7</lpage><pub-id pub-id-type="doi">10.1073/pnas.1201816109</pub-id></element-citation></ref><ref id="bib47"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Mangone</surname><given-names>M</given-names></name><name><surname>Manoharan</surname><given-names>AP</given-names></name><name><surname>Thierry-Mieg</surname><given-names>D</given-names></name><name><surname>Thierry-Mieg</surname><given-names>J</given-names></name><name><surname>Han</surname><given-names>T</given-names></name><name><surname>Mackowiak</surname><given-names>SD</given-names></name><etal/></person-group><year>2010</year><article-title>The landscape of <italic>C. elegans</italic> 3’UTRs</article-title><source>Science</source><volume>329</volume><fpage>432</fpage><lpage>5</lpage><pub-id pub-id-type="doi">10.1126/science.1191244</pub-id></element-citation></ref><ref id="bib48"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Maxwell</surname><given-names>CS</given-names></name><name><surname>Antoshechkin</surname><given-names>I</given-names></name><name><surname>Kurhanewicz</surname><given-names>N</given-names></name><name><surname>Belsky</surname><given-names>JA</given-names></name><name><surname>Baugh</surname><given-names>LR</given-names></name></person-group><year>2012</year><article-title>Nutritional control of mRNA isoform expression during developmental arrest and recovery in <italic>C. 
elegans</italic></article-title><source>Genome Res</source><volume>22</volume><fpage>1920</fpage><lpage>9</lpage><pub-id pub-id-type="doi">10.1101/gr.133587.111</pub-id></element-citation></ref><ref id="bib49"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Mets</surname><given-names>DG</given-names></name><name><surname>Meyer</surname><given-names>BJ</given-names></name></person-group><year>2009</year><article-title>Condensins regulate meiotic DNA break distribution, thus crossover frequency, by controlling chromosome structure</article-title><source>Cell</source><volume>139</volume><fpage>73</fpage><lpage>86</lpage><pub-id pub-id-type="doi">10.1016/j.cell.2009.07.035</pub-id></element-citation></ref><ref id="bib50"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Meyer</surname><given-names>BJ</given-names></name></person-group><year>2010</year><article-title>Targeting X chromosomes for repression</article-title><source>Curr Opin Genet Dev</source><volume>20</volume><fpage>179</fpage><lpage>89</lpage><pub-id pub-id-type="doi">10.1016/j.gde.2010.03.008</pub-id></element-citation></ref><ref id="bib51"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Morton</surname><given-names>JJ</given-names></name><name><surname>Blumenthal</surname><given-names>T</given-names></name></person-group><year>2011</year><article-title>Identification of transcription start sites of trans-spliced genes: uncovering unusual operon arrangements</article-title><source>RNA</source><volume>17</volume><fpage>327</fpage><lpage>37</lpage><pub-id pub-id-type="doi">10.1261/rna.2447111</pub-id></element-citation></ref><ref id="bib52"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Nechaev</surname><given-names>S</given-names></name><name><surname>Fargo</surname><given-names>DC</given-names></name><name><surname>dos Santos</surname><given-names>G</given-names></name><name><surname>Liu</surname><given-names>L</given-names></name><name><surname>Gao</surname><given-names>Y</given-names></name><name><surname>Adelman</surname><given-names>K</given-names></name></person-group><year>2010</year><article-title>Global analysis of short RNAs reveals widespread promoter-proximal stalling and arrest of Pol II in Drosophila</article-title><source>Science</source><volume>327</volume><fpage>335</fpage><lpage>8</lpage><pub-id pub-id-type="doi">10.1126/science.1181421</pub-id></element-citation></ref><ref id="bib53"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Pferdehirt</surname><given-names>RR</given-names></name><name><surname>Kruesi</surname><given-names>WS</given-names></name><name><surname>Meyer</surname><given-names>BJ</given-names></name></person-group><year>2011</year><article-title>An MLL/COMPASS subunit functions in the <italic>C. 
elegans</italic> dosage compensation complex to target X chromosomes for transcriptional regulation of gene expression</article-title><source>Genes Dev</source><volume>25</volume><fpage>499</fpage><lpage>515</lpage><pub-id pub-id-type="doi">10.1101/gad.2016011</pub-id></element-citation></ref><ref id="bib54"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Piazza</surname><given-names>I</given-names></name><name><surname>Haering</surname><given-names>CH</given-names></name><name><surname>Rutkowska</surname><given-names>A</given-names></name></person-group><year>2013</year><article-title>Condensin: crafting the chromosome landscape</article-title><source>Chromosoma</source><volume>122</volume><fpage>175</fpage><lpage>90</lpage><pub-id pub-id-type="doi">10.1007/s00412-013-0405-1</pub-id></element-citation></ref><ref id="bib55"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Prabhakaran</surname><given-names>M</given-names></name><name><surname>Kelley</surname><given-names>RL</given-names></name></person-group><year>2012</year><article-title>Mutations in the transcription elongation factor SPT5 disrupt a reporter for dosage compensation in Drosophila</article-title><source>PLOS GENET</source><volume>8</volume><fpage>e1003073</fpage><pub-id pub-id-type="doi">10.1371/journal.pgen.1003073</pub-id></element-citation></ref><ref id="bib56"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rasmussen</surname><given-names>EB</given-names></name><name><surname>Lis</surname><given-names>JT</given-names></name></person-group><year>1993</year><article-title>In vivo transcriptional pausing and cap formation on three Drosophila heat shock genes</article-title><source>Proc Natl Acad Sci USA</source><volume>90</volume><fpage>7923</fpage><lpage>7</lpage><pub-id 
pub-id-type="doi">10.1073/pnas.90.17.7923</pub-id></element-citation></ref><ref id="bib57"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rhee</surname><given-names>HS</given-names></name><name><surname>Pugh</surname><given-names>BF</given-names></name></person-group><year>2012</year><article-title>Genome-wide structure and organization of eukaryotic pre-initiation complexes</article-title><source>Nature</source><volume>483</volume><fpage>295</fpage><lpage>301</lpage><pub-id pub-id-type="doi">10.1038/nature10799</pub-id></element-citation></ref><ref id="bib58"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Seila</surname><given-names>AC</given-names></name><name><surname>Calabrese</surname><given-names>JM</given-names></name><name><surname>Levine</surname><given-names>SS</given-names></name><name><surname>Yeo</surname><given-names>GW</given-names></name><name><surname>Rahl</surname><given-names>PB</given-names></name><name><surname>Flynn</surname><given-names>RA</given-names></name><etal/></person-group><year>2008</year><article-title>Divergent transcription from active promoters</article-title><source>Science</source><volume>322</volume><fpage>1849</fpage><lpage>51</lpage><pub-id pub-id-type="doi">10.1126/science.1162253</pub-id></element-citation></ref><ref id="bib59"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Smith</surname><given-names>ER</given-names></name><name><surname>Allis</surname><given-names>CD</given-names></name><name><surname>Lucchesi</surname><given-names>JC</given-names></name></person-group><year>2001</year><article-title>Linking global histone acetylation to the transcription enhancement of X-chromosomal genes in Drosophila males</article-title><source>J Biol Chem</source><volume>276</volume><fpage>31483</fpage><lpage>6</lpage><pub-id 
pub-id-type="doi">10.1074/jbc.C100351200</pub-id></element-citation></ref><ref id="bib67"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Song</surname><given-names>JS</given-names></name><name><surname>Johnson</surname><given-names>WE</given-names></name><name><surname>Zhu</surname><given-names>X</given-names></name><name><surname>Zhang</surname><given-names>X</given-names></name><name><surname>Li</surname><given-names>W</given-names></name><name><surname>Manrai</surname><given-names>AK</given-names></name><etal/></person-group><year>2007</year><article-title>Model-based analysis of two-color arrays (MA2C)</article-title><source>Genome Biology</source><volume>8</volume><fpage>R178</fpage><pub-id pub-id-type="doi">10.1186/gb-2007-8-8-r178</pub-id></element-citation></ref><ref id="bib60"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Straub</surname><given-names>T</given-names></name><name><surname>Becker</surname><given-names>PB</given-names></name></person-group><year>2013</year><article-title>Comment on “Drosophila dosage compensation involves enhanced Pol II recruitment to male X-linked promoters”</article-title><source>Science</source><volume>340</volume><fpage>273</fpage><pub-id pub-id-type="doi">10.1126/science.1231895</pub-id></element-citation></ref><ref id="bib61"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wood</surname><given-names>AJ</given-names></name><name><surname>Severson</surname><given-names>AF</given-names></name><name><surname>Meyer</surname><given-names>BJ</given-names></name></person-group><year>2010</year><article-title>Condensin and cohesin complexity: the expanding repertoire of functions</article-title><source>Nat Rev Genet</source><volume>11</volume><fpage>391</fpage><lpage>404</lpage><pub-id pub-id-type="doi">10.1038/nrg2794</pub-id></element-citation></ref><ref 
id="bib62"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname><given-names>L</given-names></name><name><surname>Zee</surname><given-names>BM</given-names></name><name><surname>Wang</surname><given-names>Y</given-names></name><name><surname>Garcia</surname><given-names>BA</given-names></name><name><surname>Dou</surname><given-names>Y</given-names></name></person-group><year>2011</year><article-title>The RING finger protein MSL2 in the MOF complex is an E3 ubiquitin ligase for H2B K34 and is involved in crosstalk with H3 K4 and K79 methylation</article-title><source>Mol Cell</source><volume>43</volume><fpage>132</fpage><lpage>44</lpage><pub-id pub-id-type="doi">10.1016/j.molcel.2011.05.015</pub-id></element-citation></ref><ref id="bib63"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Xiong</surname><given-names>Y</given-names></name><name><surname>Chen</surname><given-names>X</given-names></name><name><surname>Chen</surname><given-names>Z</given-names></name><name><surname>Wang</surname><given-names>X</given-names></name><name><surname>Shi</surname><given-names>S</given-names></name><name><surname>Zhang</surname><given-names>J</given-names></name><etal/></person-group><year>2010</year><article-title>RNA sequencing shows no dosage compensation of the active X-chromosome</article-title><source>Nat Genet</source><volume>42</volume><fpage>1043</fpage><lpage>7</lpage><pub-id pub-id-type="doi">10.1038/ng.711</pub-id></element-citation></ref><ref id="bib64"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Yamaguchi</surname><given-names>Y</given-names></name><name><surname>Shibata</surname><given-names>H</given-names></name><name><surname>Handa</surname><given-names>H</given-names></name></person-group><year>2013</year><article-title>Transcription elongation factors DSIF and NELF: promoter-proximal pausing 
and beyond</article-title><source>Biochim Biophys Acta</source><volume>1829</volume><fpage>98</fpage><lpage>104</lpage><pub-id pub-id-type="doi">10.1016/j.bbagrm.2012.11.007</pub-id></element-citation></ref><ref id="bib65"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Yamaguchi</surname><given-names>Y</given-names></name><name><surname>Takagi</surname><given-names>T</given-names></name><name><surname>Wada</surname><given-names>T</given-names></name><name><surname>Yano</surname><given-names>K</given-names></name><name><surname>Furuya</surname><given-names>A</given-names></name><name><surname>Sugimoto</surname><given-names>S</given-names></name><etal/></person-group><year>1999</year><article-title>NELF, a multisubunit complex containing RD, cooperates with DSIF to repress RNA polymerase II elongation</article-title><source>Cell</source><volume>97</volume><fpage>41</fpage><lpage>51</lpage><pub-id pub-id-type="doi">10.1016/S0092-8674(00)80713-8</pub-id></element-citation></ref><ref id="bib66"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Zhong</surname><given-names>M</given-names></name><name><surname>Niu</surname><given-names>W</given-names></name><name><surname>Lu</surname><given-names>ZJ</given-names></name><name><surname>Sarov</surname><given-names>M</given-names></name><name><surname>Murray</surname><given-names>JI</given-names></name><name><surname>Janette</surname><given-names>J</given-names></name><etal/></person-group><year>2010</year><article-title>Genome-wide identification of binding sites defines distinct functions for <italic>Caenorhabditis elegans</italic> PHA-4/FOXA in development and environmental response</article-title><source>PLOS GENET</source><volume>6</volume><fpage>e1000848</fpage><pub-id pub-id-type="doi">10.1371/journal.pgen.1000848</pub-id></element-citation></ref></ref-list></back><sub-article article-type="article-commentary" 
id="SA1"><front-stub><article-id pub-id-type="doi">10.7554/eLife.00808.041</article-id><title-group><article-title>Decision letter</article-title></title-group><contrib-group content-type="section"><contrib contrib-type="editor"><name><surname>Proudfoot</surname><given-names>Nick</given-names></name><role>Reviewing editor</role><aff><institution>University of Oxford</institution>, <country>United Kingdom</country></aff></contrib></contrib-group></front-stub><body><boxed-text><p>eLife posts the editorial decision letter and author response on a selection of the published articles (subject to the approval of the authors). An edited version of the letter sent to the authors after peer review is shown, indicating the substantive concerns or comments; minor concerns are not usually shown. Reviewers have the opportunity to discuss the decision before the letter is sent (see <ext-link ext-link-type="uri" xlink:href="http://www.elifesciences.org/the-journal/review-process">review process</ext-link>). Similarly, the author response typically shows only responses to the major concerns raised by the reviewers.</p></boxed-text><p>Thank you for sending your work entitled “Condensin controls recruitment of RNA Polymerase II to achieve nematode X-chromosome dosage compensation” for consideration at <italic>eLife</italic>. Your article has been favorably evaluated by a Senior editor and 3 reviewers, one of whom is a member of our Board of Reviewing Editors.</p><p>The Reviewing editor and the other reviewers discussed their comments before we reached this decision, and the Reviewing editor has assembled the following comments to help you prepare a revised submission.</p><p>This study describes the development of a new technique for mapping nascent TSS: GRO-cap. This powerful new technique has been applied to two aspects of <italic>C. elegans</italic> transcriptional regulation: mapping the nascent TSS of outrons and transcriptional regulation of X-chromosome inactivation. 
Both of these applications are important and publishable. However, the first application needs further work/clarification.</p><p>1) Our main concern is the authors’ claim that a combination of GRO-seq and GRO-cap allows unequivocal assignment of new upstream TSS for <italic>C. elegans</italic> outrons (the 5’ end of the mature mRNA is formed by SL1 trans-splicing and so doesn’t define the gene TSS). For example, in the Discussion section, the authors state that they assign upstream TSS “by requiring that TSS calls be supported by uninterrupted GRO-seq signal for transcriptionally engaged Pol II between the GRO-cap TSS and the previously annotated 5’ end”. GRO-seq shows the presence of active polymerases, and hence an uninterrupted signal does not rule out the possibility of signal arising from multiple, partially overlapping transcripts. For example, if two tandem transcriptional units were located in close proximity, GRO-seq signal could appear continuous (especially as run-on can proceed after the TTS). This could explain the differences with the <xref ref-type="bibr" rid="bib10">Chen et al (2013)</xref> data, which propose that separate transcription units may exist upstream of outrons that could be enhancer-derived transcripts (eRNAs).</p><p>Since this is a key issue, we recommend further analysis on this point. 
To distinguish between these two possibilities, it would be necessary to analyse secondary promoters detected by GRO-cap and confirm that there are no 3’ ends at these positions (PMID: <ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/pubmed/20522740">20522740</ext-link> or <ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/pubmed/21085120">21085120</ext-link>) that would break these very long outrons into two independent units.</p><p>Specific comments related to point 1: A) In <xref ref-type="fig" rid="fig1">Figure 1G</xref>, one can observe a secondary TSS in the position of the annotated WB start, in addition to the one described by the authors. Specifically, in the right panel (where the reads are centered on the GRO-cap determined TSS), a clear white line moving towards the right is apparent. To be able to appreciate how important this secondary TSS (located in the WB defined position) is with respect to the new ones defined by the authors, it would be useful to make the same plot but using the GRO-cap data.</p><p>B) <xref ref-type="fig" rid="fig1s11">Figure 1–figure supplement 11</xref>: the fact that GRO-seq signal increases as it passes GRO-cap spikes does not prove that they are continuous transcripts with different 5’ UTRs. Partially overlapping tandem eRNAs with lower expression levels than the main transcript could produce the same pattern.</p><p>C) It seems to us that there is one limitation of the GRO-cap method that the authors did not discuss. Only elongating polymerases in the proximity of the TSS (e.g., &lt;500 bp?) will have a nascent RNA short enough to produce a sequencing library compatible with Illumina technology. 
Although that does not alter the discussed results in this case, this limitation should be stated for future users of the technique.</p><p>2) A way to further strengthen the argument that authentic TSS of outrons in many cases is distant from the mature mRNA 5’ end would be ChIP based analysis using Pol II CTD ser5 and ser2 specific antibodies. This could be performed on a few selected transcription units with amplicons covering the outron and 5’ half of the coding region.</p><p>3) Regarding the accumulation of Pol II at the 3’ end of genes. The authors suggest that extensive pausing at the 3’ end of genes may be linked to trans-splicing. In this context, it would be beneficial if the “not in operon genes” in the analysis presented in <xref ref-type="fig" rid="fig4">Figure 4B</xref> were further subdivided into monocistronic trans-spliced genes and non-trans-spliced genes. It would be interesting to see if non-trans-spliced genes also show Pol II accumulation at the 3’ end. In addition they should comment on the possibility that the high U content in the intergenic regions within operons (ur element that directs trans-splicing) and perhaps also at the 3’ end of genes could create a partial bias during GRO-seq and so skew the results for these regions.</p><p>4) The effect of <italic>sdc-2</italic> mutant on dosage compensation and consequent effects on transcription uncovers that dosage compensation also affects small non-coding RNAs (miRNAs). Are there more miRNAs affected by dosage compensation? What do they have in common? Do they regulate a group of genes with similar function?</p></body></sub-article><sub-article article-type="reply" id="SA2"><front-stub><article-id pub-id-type="doi">10.7554/eLife.00808.042</article-id><title-group><article-title>Author response</article-title></title-group></front-stub><body><p><italic>1) Our main concern is the authors’ claim that a combination of GRO-seq and GRO-cap allows unequivocal assignment of new upstream TSS for C. 
elegans outrons (5’ end of mature mRNA is formed by SL1 trans-splicing and so doesn’t define gene TSS). For example, in the Discussion section, the authors state that they assign upstream TSS “by requiring that TSS calls be supported by uninterrupted GRO-seq signal for transcriptionally engaged Pol II between the GRO-cap TSS and the previously annotated 5’ end”. GRO-seq shows the presence of active polymerases, and hence, uninterrupted signal does not rule out the possibility of signal arising from multiple, partially overlapping transcripts. For example, if two tandem transcriptional units were located in close proximity, GRO-seq signal could appear continuous (especially as run-on can proceed after the TTS). This could explain the differences with the <xref ref-type="bibr" rid="bib10">Chen et al (2013)</xref> data, which proposes that separate transcription units may exist upstream of outrons that could be enhancer derived transcripts (eRNAs)</italic>.</p><p><italic>Since this is a key issue we recommend further analysis on this point. To distinguish between these two possibilities, it would be necessary to analyse secondary promoters detected by GRO-cap and confirm that there are no 3’ ends at these positions (PMID: <ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/pubmed/20522740">20522740</ext-link> or <ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/pubmed/21085120">21085120</ext-link>) that would break these very long outrons into two independent units</italic>.</p><p><italic>Specific comments related to point 1: A) In <xref ref-type="fig" rid="fig1">Figure 1G</xref>, one can observe a secondary TSS in the position of the annotated WB start, in addition to the one described by the authors. Specifically, in the right panel (where the reads are centered on the GRO-cap determined TSS), a clear white line moving towards the right is apparent. 
To be able to appreciate how important this secondary TSS (located in the WB defined position) is with respect to the new ones defined by the authors, it would be useful to make the same plot but using the GRO-cap data</italic>.</p><p><italic>B) <xref ref-type="fig" rid="fig1s11">Figure 1–figure supplement 11</xref>: the fact that GRO-seq signal increases as it passes GRO-cap spikes does not prove that they are continuous transcripts with different 5’ UTRs. Partially overlapping tandem eRNA with lower expression level than the main transcript could produce the same pattern</italic>.</p><p><italic>C) It seems to us that there is one limitation for the GRO-cap method that the authors did not discuss. Only elongating polymerases in the proximity of the TSS (e.g., &lt;500bp?) will have a nascent RNA short enough to produce a sequencing library compatible with Illumina technology. Although that does not alter the discussed results in this case, this limitation should be stated for future users of the technique</italic>.</p><p>The main concern of the reviewers was our claim that the combination of GRO-cap and GRO-seq permits accurate assignment of new upstream TSSs for <italic>C. elegans</italic> outrons. The reviewers were specifically concerned with our procedure of “requiring that TSS calls be supported by uninterrupted GRO-seq signal for engaged Pol II between the GRO-cap TSS and the previously annotated 5’ end.” They felt that GRO-seq shows the presence of active polymerases and therefore uninterrupted signal does not rule out the possibility of signal arising from multiple, partially overlapping transcripts. The example given was that “if two tandem transcription units were located in close proximity, GRO-seq signal could appear continuous (especially as run-on can proceed after the TTS).” They felt that such overlapping transcripts could explain the difference between our work and that of <xref ref-type="bibr" rid="bib10">Chen et al. 
(2013)</xref>, which proposes that separate transcription units may exist upstream of outrons and could be enhancer derived transcripts. The reviewers asked for further analysis to distinguish between these possibilities. The first request was to analyze the outron regions for 3’ ends as defined in two separate studies, Jan et al., 2011 and <xref ref-type="bibr" rid="bib47">Mangone et al., 2010</xref>.</p><p>We performed this analysis and found 3’ UTRs and polyA signals to be rare in outrons of greater than 1 kb in length. From 565 such outrons, only 1.4% had an identified 3’ UTR in the Jan et al. (2011) study, and 0.7% had a 3’ UTR in the <xref ref-type="bibr" rid="bib47">Mangone et al. (2010)</xref> study. Furthermore, only 3.5% had a polyA site (<xref ref-type="bibr" rid="bib47">Mangone et al., 2010</xref>). This analysis strongly supports the view that the engaged Pol II signal is not from independent transcription units. Furthermore, in the long outrons, we failed to find the typical GRO-seq signature of <italic>C. elegans</italic> 3' ends: a spike of high Pol II accumulation at the 3’ cleavage and polyadenylation site.</p><p>Relevant to this first concern, the reviewers commented that in the heat maps of the old <xref ref-type="fig" rid="fig1">Figure 1G</xref> (new <xref ref-type="fig" rid="fig1">Figure 1F</xref>), they saw a clear white line moving towards the right and interpreted it as secondary TSSs. The reviewers asked that we plot GRO-cap signal relative to our called TSSs to see whether we could find secondary TSSs.</p><p>The idea of the plot was helpful and the plot itself was revealing. The heat map of individual genes displaying GRO-cap signal relative to TSSs (new <xref ref-type="fig" rid="fig1">Figure 1G</xref>) showed that a dominant TSS contributes the vast majority of the GRO-cap signal and thereby supports our claim about the correct identification of TSSs. In addition, we have an alternative interpretation of the light line. 
It reflects reduced GRO-seq signal caused by trans-splicing near the Wormbase start sites (the locations of the trans-splice acceptor sites).</p><p>When we first submitted our paper for review, the data from <xref ref-type="bibr" rid="bib10">Chen et al. (2013)</xref> were not available for comparison. For that reason we could only comment on their enhancers for the single gene shown in one of their figures (old <xref ref-type="fig" rid="fig1s11">Figure 1–figure supplement 11</xref>). We found strong overlap between the regions they called enhancers and our TSS calls. That gene was particularly complex for such a comparison because it appears by our analysis to have multiple, alternative TSSs. The interpretation of multiple TSSs rather than independent upstream transcription units has subsequently been supported by finding no evidence of 3’ UTRs or polyA sites from the data of Jan et al. (2011) and <xref ref-type="bibr" rid="bib47">Mangone et al. (2010)</xref>, and continuous ChIP signal for the hypo-phosphorylated form of Pol II (see new <xref ref-type="fig" rid="fig1s12">Figure 1–figure supplement 12A</xref>).</p><p>Since the Chen et al. data became available, we found other specific genes to compare. Particularly informative are genes for which Chen et al. call a single upstream enhancer and we called a single strong TSS at exactly the same position as the enhancer. An example is in the new <xref ref-type="fig" rid="fig1s12">Figure 1–figure supplement 12B</xref>. It shows a gene with a single called TSS (2534 bp upstream of WB start) that corresponds with the single called enhancer. Continuous ChIP signal of hyper-phosphorylated Pol II was found between the TSS and the WB start, and no 3’ UTRs or polyA signals were found. This and other similar examples strongly support the view that (1) some regions called enhancers by Chen et al. are actual TSSs and (2) GRO-cap signal paired with continuous GRO-seq signal from WB starts defines authentic TSSs. 
That said, many of the enhancers they mapped do not correspond to outron TSSs, and we are not dismissing the general enhancer mapping performed by Chen et al. A relevant example illustrating this point is in <xref ref-type="fig" rid="fig1s12">Figure 1–figure supplement 12C</xref>, which shows a strong TSS overlapping one of the two called enhancers. On balance, though, we do think the enhancer mapping of Chen et al. needs to be corrected in light of our TSS mapping. The revised text now reflects a more balanced assessment of the Chen et al. analysis as it relates to our TSS mapping based on the newly available data.</p><p><italic>2) A way to further strengthen the argument that authentic TSS of outrons in many cases is distant from the mature mRNA 5’ end would be ChIP based analysis using Pol II CTD ser5 and ser2 specific antibodies. This could be performed on a few selected transcription units with amplicons covering the outron and 5’ half of the coding region</italic>.</p><p>We have addressed this point by reanalyzing ChIP data from our prior studies (<xref ref-type="bibr" rid="bib53">Pferdehirt et al., 2011</xref>) and those of modENCODE. In general, we found that regions corresponding to long outrons have continuous ChIP-chip signal from antibodies enriched for either the ser2 phosphorylated form of Pol II or the hypo-phosphorylated form of Pol II (see new <xref ref-type="fig" rid="fig1s8">Figure 1–figure supplement 8A–C</xref> and <xref ref-type="fig" rid="fig3">Figure 3A</xref>). 
These results and the restricted run-on length of ∼100 nucleotides that is typical of GRO-seq reactions (<xref ref-type="bibr" rid="bib16">Core et al., 2008</xref>) indicate that GRO-seq signal corresponds to bound Pol II in vivo and is not an artifact of the nuclear run-on reactions in vitro extending beyond the 3’ ends defined in vivo. We now emphasize in the protocol (see <xref ref-type="fig" rid="fig2">Figure 2</xref>) that GRO-seq run-ons have been tuned to only extend the length of nascent RNAs by 100 bases on average, thus minimizing the concern that independent transcription units have been artifactually linked.</p><p>In summary, our paper now provides multiple lines of evidence showing that the GRO-seq signal between newly called TSSs and previously identified TASs represents legitimate outrons rather than independent overlapping upstream transcripts.</p><p>A final issue raised by reviewers about our GRO-cap method is the following:
“It seems to us that there is one limitation for the GRO-cap method that the authors did not discuss. Only elongating polymerases in the proximity of the TSS (e.g., &lt;500bp?) will have a nascent RNA short enough to produce a sequencing library compatible with Illumina technology. Although that does not alter the discussed results in this case, this limitation should be stated for future users of the technique.”</p><p>We responded to this request by adding the following comment to the legend of <xref ref-type="fig" rid="fig2">Figure 2</xref>, which describes the protocol in detail: “We note that transcripts &lt; 500 bp are captured most efficiently on Illumina sequencing platforms.”</p><p><italic>3) Regarding the accumulation of Pol II at the 3’ end of genes. The authors suggest that extensive pausing at the 3’ end of genes may be linked to trans-splicing. In this context, it would be beneficial if the “not in operon genes” in the analysis presented in <xref ref-type="fig" rid="fig4">Figure 4B</xref> were further subdivided into monocistronic trans-spliced genes and non-trans-spliced genes. It would be interesting to see if non-trans-spliced genes also show Pol II accumulation at the 3’ end. In addition they should comment on the possibility that the high U content in the intergenic regions within operons (ur element that directs trans-splicing) and perhaps also at the 3’ end of genes could create a partial bias during GRO-seq and so skew the results for these regions</italic>.</p><p>The third request made by the reviewers relates to the accumulation of Pol II at the 3' end of genes. In response to the reviewers, we analyzed 3’ Pol II accumulation in monocistronic genes with and without trans-splicing and presented the results in the text and in the new <xref ref-type="fig" rid="fig1s4">Figure 4–figure supplement 2A</xref>. We found that first and middle genes in operons had the highest 3’-pausing ratio. 
Monocistronic genes with trans-splicing had a slightly higher 3’-pausing ratio than terminal genes in operons. Monocistronic genes with trans-splicing had a higher 3’-pausing ratio than monocistronic genes lacking trans-splicing (Mann-Whitney U test, p &lt; 1e-10). The 3’-pausing ratios for genes in all classes were greater than for <italic>Drosophila</italic> genes. These results further support our proposal that pausing at 3’ ends is linked to trans-splicing.</p><p>Regarding the relationship between U content and 3’ end pausing, we found that Pol II accumulation at 3’ ends does not overlap with U-rich regions at 3’ ends. Therefore, the high GRO-seq signal is not due to selective enrichment of U-rich RNAs (<xref ref-type="fig" rid="fig1s2">Figure 4–figure supplement 2B,C</xref>).</p><p><italic>4) The effect of</italic> sdc-2 <italic>mutant on dosage compensation and consequent effects on transcription uncovers that dosage compensation also affects small non-coding RNAs (miRNAs). Are there more miRNAs affected by dosage compensation? What do they have in common? Do they regulate a group of genes with similar function</italic>?</p><p>The fourth request from the reviewers concerned the effect of the <italic>sdc-2</italic> dosage compensation mutation on the expression of X-linked microRNAs. We found most embryonically expressed X-linked microRNAs to be dosage compensated. The targets of these microRNAs have not been determined experimentally and the predicted targets do not represent groups of genes with similar functions.</p></body></sub-article></article>