Data analysis scripts for Rendeiro et al., 2016 (doi:10.1038/ncomms11938)
manuscript
metadata
src
.gitignore
LICENSE.md
Makefile
README.md
requirements.txt


Chromatin accessibility maps of chronic lymphocytic leukemia identify subtype-specific epigenome signatures and transcription regulatory networks

André F. Rendeiro*, Christian Schmidl*, Jonathan C. Strefford*, Renata Walewska, Zadie Davis, Matthias Farlik, David Oscier, Christoph Bock. Chromatin accessibility maps of chronic lymphocytic leukemia identify subtype-specific epigenome signatures and transcription regulatory networks. Nat. Commun. 7:11938 doi: 10.1038/ncomms11938 (2016).

*Shared first authors

Paper: http://dx.doi.org/10.1038/ncomms11938

Website: cll-chromatin.computational-epigenetics.org

This repository contains scripts used in the analysis of the data in the paper.


Manuscript

The manuscript is written in Scholarly Markdown, so you need Scholdoc to render it as a PDF, reStructuredText, Word or HTML document.

A rendered version is available here.

You can see the raw manuscript here along with the figures.

To render the pdf version of the manuscript, run:

make manuscript

This generally requires a full LaTeX installation.


Analysis

On the paper website you can find most of the output of the whole analysis.

Here are a few steps needed to reproduce it (more than I'd want to, I admit):

  1. Clone the repository: git clone git@github.com:epigen/cll-chromatin.git
  2. Install required software for the analysis: make requirements or pip install -r requirements.txt

If you wish to reproduce the processing of the raw data (access has to be requested through EGA), run these steps:

  1. Apply for access to the raw data from EGA.
  2. Download the data locally.
  3. Prepare Looper configuration files similar to these that fit your local system.
  4. Run samples through the pipeline: make preprocessing or looper -c metadata/project_config_file.yaml
  5. Get external files (genome annotations mostly): make external_files or use the files on the paper website (external folder).
  6. Run the analysis: make analysis
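
The make targets named in the steps above can be chained from a small driver script. This is only a sketch: the target names and their order are taken from this README, the helper names (build_commands, run_steps) are made up for illustration, and it assumes you run it from the repository root after the raw data and Looper configuration are in place. By default it only prints what it would run.

```python
import subprocess

# Make targets mentioned in this README, in the order the steps list them.
STEPS = ["requirements", "preprocessing", "external_files", "analysis"]


def build_commands(steps=STEPS):
    """Return one `make <target>` argument list per step (hypothetical helper)."""
    return [["make", step] for step in steps]


def run_steps(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print("$ " + " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError, stopping at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: shows the commands without executing them.
    run_steps(build_commands(), dry_run=True)
```

Passing dry_run=False executes the targets one after another; since preprocessing depends on your local setup, it is probably safer to run that step by hand first.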

Additionally, processed data (bigWig and narrowPeak files together with a gene expression matrix) are available from GEO under accession number GSE81274.
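
If you only want to look at the processed peaks, a narrowPeak file can be read with the standard library alone. The sketch below assumes the standard ENCODE narrowPeak layout (tab-separated, BED6+4); the column labels and the read_narrowpeak helper are made up for illustration and are not part of this repository, and the example line is invented rather than taken from GSE81274.

```python
import csv
import io

# The ten narrowPeak columns (BED6+4); the labels here are illustrative names.
NARROWPEAK_COLUMNS = [
    "chrom", "start", "end", "name", "score",
    "strand", "signal_value", "p_value", "q_value", "summit_offset",
]


def read_narrowpeak(handle):
    """Yield one dict per peak from an open, tab-separated narrowPeak file."""
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and optional track/browser/comment header lines.
        if not row or row[0].startswith(("track", "browser", "#")):
            continue
        peak = dict(zip(NARROWPEAK_COLUMNS, row))
        for key in ("start", "end", "score", "summit_offset"):
            peak[key] = int(peak[key])
        for key in ("signal_value", "p_value", "q_value"):
            peak[key] = float(peak[key])
        yield peak


# Invented example line, only to show the expected shape of the input.
example = io.StringIO("chr1\t100\t600\tpeak_1\t500\t.\t12.5\t8.2\t6.1\t250\n")
peaks = list(read_narrowpeak(example))
print(peaks[0]["chrom"], peaks[0]["end"] - peaks[0]["start"])
```

For a compressed download, gzip.open(path, "rt") returns a text handle that can be passed to the same function.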

If you wish to reproduce the plots from the analysis you can, in principle:

  1. run python src/analysis.py

Not all parts of the analysis can be run as is, though. The TF network inference is based on an R package (PIQ), whose runs are really hard to script in a system-independent way.