Data analysis scripts for Rendeiro et al., 2016 (doi:10.1038/ncomms11938)

zeehio/cll-chromatin

Chromatin accessibility maps of chronic lymphocytic leukemia identify subtype-specific epigenome signatures and transcription regulatory networks

André F. Rendeiro*, Christian Schmidl*, Jonathan C. Strefford*, Renata Walewska, Zadie Davis, Matthias Farlik, David Oscier, Christoph Bock. Chromatin accessibility maps of chronic lymphocytic leukemia identify subtype-specific epigenome signatures and transcription regulatory networks. Nat. Commun. 7:11938 doi: 10.1038/ncomms11938 (2016).

*Shared first authors

Paper: http://dx.doi.org/10.1038/ncomms11938

Website: cll-chromatin.computational-epigenetics.org

This repository contains scripts used in the analysis of the data in the paper.


Manuscript

The manuscript is written in Scholarly Markdown, so you need Scholdoc to render it as a PDF, reStructuredText, Word, or HTML document.

A rendered version is available here.

You can see the raw manuscript here along with the figures.

To render the pdf version of the manuscript, run:

make manuscript

This generally requires a full LaTeX installation.


Analysis

On the paper website you can find most of the output of the whole analysis.

Here are the steps needed to reproduce it (more than I'd like, I admit):

  1. Clone the repository: git clone git@github.com:epigen/cll-chromatin.git
  2. Install the software required for the analysis: make requirements or pip install -r requirements.txt

If you wish to reproduce the processing of the raw data (access has to be requested through EGA), run these steps:

  1. Apply for access to the raw data from EGA.
  2. Download the data locally.
  3. Prepare Looper configuration files similar to these that fit your local system.
  4. Run samples through the pipeline: make preprocessing or looper -c metadata/project_config_file.yaml
  5. Get external files (mostly genome annotations): make external_files or use the files on the paper website (external folder).
  6. Run the analysis: make analysis
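
The make targets named in the steps above can be chained from a small driver script. The following is only a sketch, not part of the repository: the function names `build_commands` and `run_steps` are hypothetical, and the only assumptions taken from this README are the target names (`requirements`, `preprocessing`, `external_files`, `analysis`). It defaults to a dry run that only prints the commands. The EGA download and the Looper configuration still have to be done by hand before `preprocessing` can work.

```python
import shlex
import subprocess

# Make targets named in this README, in the order the steps list them.
MAKE_TARGETS = ("requirements", "preprocessing", "external_files", "analysis")


def build_commands(targets=MAKE_TARGETS):
    """Return one `make <target>` argument list per target."""
    return [["make", target] for target in targets]


def run_steps(commands, dry_run=True):
    """Print each command and execute it only when dry_run is False.

    With check=True a failing step raises CalledProcessError, so the
    remaining steps are not attempted.
    """
    for cmd in commands:
        print("$ " + shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    # Dry run by default: nothing is executed here.
    run_steps(build_commands())
```

Run it from the repository root, and pass `dry_run=False` to `run_steps` once the raw data and configuration files are in place. To rerun a single stage, pass a shorter list, for example `build_commands(["analysis"])`.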

Additionally, processed data (bigWig and narrowPeak files, together with a gene expression matrix) are available from GEO under accession number GSE81274.
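
If you only want to inspect the processed peak calls, narrowPeak files can be read without any of the pipeline software. Below is a minimal sketch that assumes the files follow the standard ENCODE narrowPeak layout (tab-separated BED6+4). The `Peak` type, the `read_narrowpeak` function, and the two sample records are made up for illustration and are not taken from GSE81274.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class Peak(NamedTuple):
    chrom: str
    start: int           # 0-based start coordinate
    end: int             # end coordinate, exclusive
    name: str
    score: int
    strand: str
    signal_value: float
    p_value: float       # -log10 p-value, -1 if not assigned
    q_value: float       # -log10 q-value, -1 if not assigned
    summit: int          # summit offset from start, -1 if not called


def read_narrowpeak(handle: TextIO) -> Iterator[Peak]:
    """Yield one Peak per data line, skipping blank and header lines."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith(("#", "track", "browser")):
            continue
        yield Peak(
            row[0], int(row[1]), int(row[2]), row[3], int(row[4]), row[5],
            float(row[6]), float(row[7]), float(row[8]), int(row[9]),
        )


# Two invented records, used only to show the expected column layout.
SAMPLE = (
    "chr1\t100\t250\tpeak_1\t500\t.\t8.5\t12.3\t10.1\t75\n"
    "chr2\t1000\t1400\tpeak_2\t900\t.\t20.0\t30.5\t28.0\t180\n"
)

if __name__ == "__main__":
    for peak in read_narrowpeak(io.StringIO(SAMPLE)):
        print(peak.chrom, peak.start, peak.end, peak.end - peak.start)
```

For a real file, replace the `io.StringIO(SAMPLE)` handle with `open(path)` (or `gzip.open(path, "rt")` if the download is compressed).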

If you wish to reproduce the plots from the analysis you can, in principle:

  1. Run python src/analysis.py

Not all parts of the analysis can be run as is, though. The TF network inference is based on an R package (PIQ) whose runs are really hard to script in a system-independent way.
