Chromatin accessibility maps of chronic lymphocytic leukemia identify subtype-specific epigenome signatures and transcription regulatory networks
André F. Rendeiro*, Christian Schmidl*, Jonathan C. Strefford*, Renata Walewska, Zadie Davis, Matthias Farlik, David Oscier, Christoph Bock Chromatin accessibility maps of chronic lymphocytic leukemia identify subtype-specific epigenome signatures and transcription regulatory networks. Nat. Commun. 7:11938 doi: 10.1038/ncomms11938 (2016).
*Shared first authors
Paper: http://dx.doi.org/10.1038/ncomms11938
Website: cll-chromatin.computational-epigenetics.org
This repository contains scripts used in the analysis of the data in the paper.
The manuscript is written in scholarly markdown, so you need Scholdoc to render it into a PDF, RST, Word or HTML manuscript.
A rendered version is available here.
You can see the raw manuscript here along with the figures.
To render the pdf version of the manuscript, run:
make manuscript
This generally requires a full LaTeX installation.
On the paper website you can find most of the output of the whole analysis.
Here are a few steps needed to reproduce it (more than I'd want to, I admit):
- Clone the repository:
git clone git@github.com:epigen/cll-chromatin.git
- Install required software for the analysis:
make requirements
or
pip install -r requirements.txt
If you wish to reproduce the processing of the raw data (access has to be requested through EGA), run these steps:
- Apply for access to the raw data from EGA.
- Download the data locally.
- Prepare Looper configuration files similar to these that fit your local system.
- Run samples through the pipeline:
make preprocessing
or
looper -c metadata/project_config_file.yaml
- Get external files (genome annotations mostly):
make external_files
or use the files in the paper website (external folder).
- Run the analysis:
make analysis
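The make targets named in the steps above can also be chained from Python. The following is only a minimal sketch: it assumes you run it from the root of the cloned repository, that the raw data is already downloaded and that the Looper configuration has been adapted to your system. The `STEPS` tuple and the `run_steps` helper are illustrative names, not part of the repository.

```python
import subprocess

# Make targets named in this README, in the order the steps are listed.
STEPS = ("requirements", "preprocessing", "external_files", "analysis")


def build_commands(steps=STEPS):
    """Return one `make <target>` command (as an argument list) per step."""
    return [["make", step] for step in steps]


def run_steps(steps=STEPS, dry_run=True):
    """Print each command; execute it too unless dry_run is True."""
    planned = []
    for cmd in build_commands(steps):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing target.
            subprocess.run(cmd, check=True)
        planned.append(cmd)
    return planned


# Dry run: only shows what would be executed.
planned = run_steps(dry_run=True)
```

Calling `run_steps(dry_run=False)` would actually invoke make. Keeping the dry run as the default makes it easy to check the order before starting the long preprocessing step.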
Additionally, processed data (bigWig and narrowPeak files, together with a gene expression matrix) are available from GEO under accession number GSE81274.
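If you only want to inspect the narrowPeak files mentioned above, they can be read with the Python standard library. This is a minimal sketch, assuming the files follow the standard ENCODE narrowPeak layout (ten tab-separated columns). The `Peak` tuple, the `read_narrowpeak` helper and the two example records are made up for illustration and are not data from the paper.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class Peak(NamedTuple):
    chrom: str
    start: int           # 0-based start
    end: int             # exclusive end
    name: str
    score: int
    strand: str
    signal_value: float
    p_value: float       # -log10 p-value, -1 if not assigned
    q_value: float       # -log10 q-value, -1 if not assigned
    summit: int          # summit offset from start, -1 if not called


def read_narrowpeak(handle: TextIO) -> Iterator[Peak]:
    """Yield one Peak per data line, skipping blank and header lines."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith(("#", "track", "browser")):
            continue
        yield Peak(
            row[0], int(row[1]), int(row[2]), row[3], int(row[4]), row[5],
            float(row[6]), float(row[7]), float(row[8]), int(row[9]),
        )


# Hypothetical records, only to show the expected layout.
example = (
    "chr1\t100\t250\tpeak_1\t500\t.\t8.2\t12.1\t9.7\t60\n"
    "chr2\t1000\t1400\tpeak_2\t300\t.\t4.0\t6.3\t4.9\t180\n"
)
peaks = list(read_narrowpeak(io.StringIO(example)))
widths = [p.end - p.start for p in peaks]
```

For a gzip-compressed download, opening the file with `gzip.open(path, "rt")` and passing the handle to `read_narrowpeak` should work the same way.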
If you wish to reproduce the plots from the analysis you can, in principle:
- Run:
python src/analysis.py
Not all parts of the analysis can be run as is, though. The TF network inference is based on an R package (PIQ), whose runs are really hard to script in a system-independent way.