Calculation of average occupancy and FDR for RNA pol II DamID datasets

polii.gene.call

An Rscript for calculating average RNA Pol II occupancy and FDR for RNA Pol II DamID datasets, based on the original algorithms developed by Tony Southall. Modifications to the original method are described in detail in the source code.

The script processes data files in gatc.bedgraph or gatc.gff format, such as those generated by damidseq_pipeline.

Citation

If you find this software useful, please cite:

Marshall OJ and Brand AH. (2015) damidseq_pipeline: an automated pipeline for processing DamID sequencing datasets. Bioinformatics. 31(20):3371-3. doi: 10.1093/bioinformatics/btv386. (pubmed; full text, open access)

Southall TD, Gold KS, Egger B, Davidson CM, Caygill EE, Marshall OJ, Brand AH. (2013) Cell-type-specific profiling of gene expression and chromatin binding without cell isolation: assaying RNA Pol II occupancy in neural stem cells. Dev Cell. 26(1):101-12. doi: 10.1016/j.devcel.2013.05.020 (pubmed; full text, open access)

Requirements

  1. R

  2. A GFF-formatted list of genes. A file for release 6 of the Drosophila genome (DmR6.genes.gff) is provided in the archive; most GFF annotation files should also work. Place this file in an accessible directory and pass its location with the --genes.file command-line switch:

    polii.gene.call --genes.file=/path/to/my-genes-annotation.gff

Installation

To install, copy the polii.gene.call executable into a directory on your PATH.

Usage

Run polii.gene.call as follows:

polii.gene.call [options] [list of gatc.bedgraph and/or gatc.gff files to process]

Each file will be processed separately, with the output being two files:

  1. [filename].genes.details.csv
A .csv table of all genes, together with the average occupancy and FDR for each gene.
  2. [filename].genes
A plain-text list of all genes that fall below the FDR threshold (default 0.01; change it with the --FDR= command-line switch). These genes may be considered to represent the significantly transcribed genes within a genome.
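
As a rough illustration of how the two output files relate, the shell sketch below re-filters a details table at a different FDR cut-off without rerunning the script. The column names `name` and `FDR`, the helper `filter_by_fdr`, and the sample rows are all assumptions made for this example; check the header of your own .genes.details.csv file and adjust the names before relying on it.

```shell
# Print the genes whose FDR is below a given threshold.
# ASSUMPTION: the table is comma-separated with a header row that contains
# columns called "name" and "FDR"; the real column names may differ.
filter_by_fdr() {
  awk -F',' -v thr="$2" '
    { gsub(/"/, "") }                  # strip quotes around fields
    NR == 1 {                          # locate the columns by header name
      for (i = 1; i <= NF; i++) {
        if ($i == "name") n = i
        if ($i == "FDR")  f = i
      }
      next
    }
    # skip non-numeric entries such as NA, then apply the cut-off
    n && f && $f ~ /^[0-9.eE+-]+$/ && ($f + 0) < (thr + 0) { print $n }
  ' "$1"
}

# Made-up demonstration data (not real results)
fdr_demo=$(mktemp)
cat > "$fdr_demo" <<'EOF'
"name","occupancy","FDR"
"geneA",1.82,0.001
"geneB",0.15,0.2
"geneC",0.97,0.03
"geneD",0.40,NA
EOF

filter_by_fdr "$fdr_demo" 0.05
```

The sketch splits on every comma, so it will not handle fields that themselves contain commas. For real thresholds, the --FDR= switch remains the supported route.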

For a list of all available command-line options, use:

polii.gene.call --help

Downstream data processing

Two different transcriptomes generated through this method may be compared using the polii.correlation.plot Rscript available from the damid_misc repository.

	Rscript polii.correlation.plot [file1.genes.details.csv] [file2.genes.details.csv]

The output is both a graphical plot of differentially expressed genes, and a table listing the difference in mean log2(occupancy).
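
To make the "difference in mean log2(occupancy)" column concrete, here is a minimal shell sketch that subtracts the per-gene occupancy in one details table from that in another. It is not the method used by polii.correlation.plot, only an illustration of the quantity. The column names `name` and `occupancy`, the helper `occupancy_diff`, the sign convention (second file minus first) and the sample rows are assumptions, as is the premise that the occupancy values are already on a log2 scale.

```shell
# Per-gene occupancy difference between two details tables (file2 - file1).
# ASSUMPTION: both files have header columns "name" and "occupancy", and
# the occupancy values are already log2 ratios.
occupancy_diff() {
  awk -F',' '
    { gsub(/"/, "") }                  # strip quotes around fields
    FNR == 1 {                         # re-read the header of each file
      n = 0; o = 0
      for (i = 1; i <= NF; i++) {
        if ($i == "name")      n = i
        if ($i == "occupancy") o = i
      }
      next
    }
    !(n && o) { next }                 # required columns not found
    NR == FNR { first[$n] = $o; next } # first file: remember each value
    ($n in first) { printf "%s,%.3f\n", $n, $o - first[$n] }
  ' "$1" "$2"
}

# Made-up demonstration data (not real results)
occ_a=$(mktemp)
occ_b=$(mktemp)
cat > "$occ_a" <<'EOF'
"name","occupancy","FDR"
"geneA",1.5,0.001
"geneB",0.25,0.2
EOF
cat > "$occ_b" <<'EOF'
"name","occupancy","FDR"
"geneA",0.5,0.04
"geneB",1.25,0.002
"geneC",2.0,0.001
EOF

occupancy_diff "$occ_a" "$occ_b"
```

Genes present in only one of the two files are skipped, so the output covers just the shared genes, listed in the order of the second file.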

Differences between RNA pol II DamID and RNAseq

Although both are methods for transcriptional profiling, please be aware that they measure different things and their results may differ. In particular, transcript abundance as assessed through RNAseq depends on transcript stability, whereas RNA pol II occupancy may provide a better indication of transcription levels.
