# Generating Coverage Heatmaps

Preliminaries:

In [1]:
path = "/Users/edwinlock/Desktop/Divya/Thesis/annotation"

## ATAC Peak Heatmaps

First we generate the matrix using the **computeMatrix** command. Note: the script below computes the matrix for each cell type only if its output file does not already exist.

Docs: https://deeptools.readthedocs.io/en/develop/content/tools/computeMatrix.html

Important files and parameters:
--regionsFileName

File name or names, in BED or GTF format, containing the regions to plot. If multiple bed files are given, each one is considered a group that can be plotted separately. 

--scoreFileName

bigWig file(s) containing the scores to be plotted. Multiple files should be separated by spaces. BigWig files can be obtained by using the bamCoverage or bamCompare tools. 

--afterRegionStartLength, -a

Distance downstream of the reference point to include. In **reference-point** mode, which every computeMatrix call below uses, the reference point defaults to the start of each region (the TSS, if the regions are genes).

--beforeRegionStartLength, -b

Distance upstream of the reference point to include.


**NB:** We assume that all input files are in the directory $path and are named \<cell_type\>_atac.bed and \<cell_type\>_coverage.bw.
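
Because the loop below relies on this naming convention, it can help to confirm up front that every expected input is present. The following is a minimal sketch under that assumption; `missing_inputs` is a hypothetical helper, not part of deepTools.

```shell
# Hypothetical helper: print each expected input file that is missing.
# Usage: missing_inputs DIR CELL [CELL...]
missing_inputs() (
    dir=$1
    shift
    for cell in "$@"; do
        # File names follow the convention stated in the note above.
        for file in "${cell}_atac.bed" "${cell}_coverage.bw"; do
            [ -f "$dir/$file" ] || printf '%s\n' "$file"
        done
    done
)
```

Calling `missing_inputs "$1" x1 x2 xins` at the top of the next cell prints nothing when all six inputs are present.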

In [2]:
%%bash -s "$path"
cd "$1"
for CELL in x1 x2 xins
do
    if [ ! -f "$CELL"_atac_peak_matrix.tab.gz ]; then
        computeMatrix reference-point \
        --regionsFileName "$CELL"_atac.bed \
        --scoreFileName "$CELL"_coverage.bw \
        --afterRegionStartLength 2500 \
        --beforeRegionStartLength 2500 \
        -p 24 \
        -o "$CELL"_atac_peak_matrix.tab.gz
    fi;
done

Process is terminated.
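
The existence check has one pitfall: if **computeMatrix** is interrupted, as the "Process is terminated." message suggests may have happened here, a partially written output file could be left behind and would then be skipped on the next run. One way to rule this out is to write to a temporary name and rename only on success. This is a minimal sketch; `make_once` is a hypothetical helper, and it assumes the wrapped command takes its output path as its last argument.

```shell
# Hypothetical helper: build OUT only if it is missing, and never leave a
# partial file under the final name.
# Usage: make_once OUT COMMAND [ARGS...]   (the temporary path is appended last)
make_once() (
    out=$1
    shift
    if [ -f "$out" ]; then
        exit 0                      # already built, nothing to do
    fi
    tmp="$(dirname "$out")/partial.$(basename "$out")"
    if "$@" "$tmp"; then
        mv "$tmp" "$out"            # rename only after the command succeeded
    else
        status=$?
        rm -f "$tmp"                # discard the incomplete file
        exit "$status"
    fi
)
```

Inside the loop the call would become `make_once "$CELL"_atac_peak_matrix.tab.gz computeMatrix reference-point ... -p 24 -o`, with `-o` deliberately last so that the temporary path becomes its argument.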


Now we plot the heatmap using the deeptools **plotHeatmap** command.

Docs: https://deeptools.readthedocs.io/en/develop/content/tools/plotHeatmap.html

In [3]:
%%bash
cd ~/Desktop/Divya/Thesis/annotation/
for CELL in x1 x2 xins
do
    if [ -f "$CELL"_atac_peak_matrix.tab.gz ]; then
        TITLE="${CELL^} ATAC peaks"
        plotHeatmap -m "$CELL"_atac_peak_matrix.tab.gz \
        --plotType se \
        --heatmapHeight 28 \
        --heatmapWidth 5 \
        --samplesLabel "$TITLE" \
        --xAxisLabel "Distance (bp)" \
        --yAxisLabel "RPKM" \
        --legendLocation none \
        --whatToShow "plot, heatmap and colorbar" \
        --colorMap viridis \
        --missingDataColor 0 \
        --refPointLabel Peak \
        -o "$CELL"_atac_peaks.pdf
    fi;
done
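
The `${CELL^}` expansion used for the title capitalizes the first letter, but it requires bash 4 or later. The bash that ships with macOS is version 3.2, where the expansion fails with a "bad substitution" error. If that applies to your setup, a portable alternative can be sketched as follows (`capitalize` is a hypothetical helper name):

```shell
# Hypothetical helper: upper-case the first character of a string using
# only POSIX tools, so it also works in older shells.
capitalize() (
    first=$(printf '%s\n' "$1" | cut -c1 | tr '[:lower:]' '[:upper:]')
    rest=$(printf '%s\n' "$1" | cut -c2-)
    printf '%s%s\n' "$first" "$rest"
)
```

With this helper, the title line would read `TITLE="$(capitalize "$CELL") ATAC peaks"`.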

## ATAC TSS Heatmaps

Compute the matrix and plot the heatmaps as above, this time using TSS.bed as the regions file.

In [4]:
%%bash
cd ~/Desktop/Divya/Thesis/annotation/
for CELL in x1 x2 xins
do
    if [ ! -f "$CELL"_atac_tss_matrix.tab.gz ]; then
        computeMatrix reference-point \
        --regionsFileName TSS.bed \
        --scoreFileName "$CELL"_coverage.bw \
        --afterRegionStartLength 2500 \
        --beforeRegionStartLength 2500 \
        -p 24 \
        -o "$CELL"_atac_tss_matrix.tab.gz
    fi;
done

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/bin/computeMatrix", line 14, in <module>
    main(args)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/deeptools/computeMatrix.py", line 421, in main
    hm.computeMatrix(scores_file_list, args.regionsFileName, parameters, blackListFileName=args.blackListFileName, verbose=args.verbose, allArgs=args)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/deeptools/heatmapper.py", line 264, in computeMatrix
    verbose=verbose)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/deeptools/mapReduce.py", line 85, in mapReduce
    bed_interval_tree = GTF(bedFile, defaultGroup=defaultGroup, transcriptID=transcriptID, exonID=exonID, transcript_id_designator=transcript_id_designator, keepExons=keepExons)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/deeptoolsi

CalledProcessError: Command 'b'cd ~/Desktop/Divya/Thesis/annotation/\nfor CELL in x1 x2 xins\ndo\n    if [ ! -f "$CELL"_atac_tss_matrix.tab.gz ]; then\n        computeMatrix reference-point \\\n        --regionsFileName TSS.bed \\\n        --scoreFileName "$CELL"_coverage.bw \\\n        --afterRegionStartLength 2500 \\\n        --beforeRegionStartLength 2500 \\\n        -p 24 \\\n        -o "$CELL"_atac_tss_matrix.tab.gz\n    fi;\ndone\n'' returned non-zero exit status 1.
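
The traceback above is cut off, so the exact cause is not shown. The last complete frame is where deepTools starts parsing the regions file, which makes TSS.bed a reasonable first thing to inspect. Below is a rough sanity check, assuming a tab-separated BED file. `check_bed` is a hypothetical helper and catches only basic problems: fewer than three columns, non-numeric coordinates, or a start greater than the end.

```shell
# Hypothetical helper: report suspicious lines in a BED file.
# Exits with status 1 if any line is flagged, 0 otherwise.
check_bed() {
    awk -F '\t' '
        BEGIN { bad = 0 }
        # Skip comment and header lines.
        /^#/ || /^track/ || /^browser/ { next }
        NF < 3 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || $2 + 0 > $3 + 0 {
            printf "line %d: %s\n", NR, $0
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

Running `check_bed TSS.bed` prints each flagged line with its line number. A clean result does not prove that the file is valid for deepTools, only that these simple checks pass.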

In [None]:
%%bash
cd ~/Desktop/Divya/Thesis/annotation/
for CELL in x1 x2 xins
do
    if [ -f "$CELL"_atac_tss_matrix.tab.gz ]; then
        TITLE="${CELL^} TSS"
        plotHeatmap -m "$CELL"_atac_tss_matrix.tab.gz \
        --plotType se \
        --heatmapHeight 28 \
        --heatmapWidth 5 \
        --samplesLabel "$TITLE" \
        --xAxisLabel "Distance (bp)" \
        --yAxisLabel "RPKM" \
        --legendLocation none \
        --whatToShow "plot, heatmap and colorbar" \
        --colorMap Blues \
        --missingDataColor 0 \
        --refPointLabel TSS \
        -o "$CELL"_atac_tss.pdf
    fi;
done

## ChIP Peak Heatmaps

First we generate the matrix using the **computeMatrix** command. Note: the following script generates the matrices for both histone marks (H3K27ac and H3K4me1) only if they don't already exist.

In [None]:
%%bash
cd ~/Desktop/Divya/Thesis/annotation/
for REGION in k27ac k4me1
do
    if [ ! -f "$REGION"_chip_peak_matrix.tab.gz ]; then
        computeMatrix reference-point \
        --regionsFileName h3"$REGION"_summits.bed \
        --scoreFileName h3"$REGION"_meanlog2.bw \
        --afterRegionStartLength 2500 \
        --beforeRegionStartLength 2500 \
        -p 24 \
        -o "$REGION"_chip_peak_matrix.tab.gz
    fi;
done

Now we plot the heatmaps using the deeptools **plotHeatmap** command.

Docs: https://deeptools.readthedocs.io/en/develop/content/tools/plotHeatmap.html

In [None]:
%%bash
cd ~/Desktop/Divya/Thesis/annotation/
for REGION in k27ac k4me1
do
    if [ -f "$REGION"_chip_peak_matrix.tab.gz ]; then
        TITLE="H3${REGION^} ChIP peaks"
        plotHeatmap -m "$REGION"_chip_peak_matrix.tab.gz \
        --plotType se \
        --heatmapHeight 28 \
        --heatmapWidth 5 \
        --samplesLabel "$TITLE" \
        --xAxisLabel "Distance (bp)" \
        --yAxisLabel "log2-fold change" \
        --regionsLabel enhancers \
        --legendLocation none \
        --whatToShow "plot, heatmap and colorbar" \
        --colorMap viridis \
        --missingDataColor 1 \
        --refPointLabel Peak \
        -o "$REGION"_chip_peaks.pdf
    fi;
done



## ChIP TSS Heatmaps

Same procedure as above, but with a wider window of 5000 bp on either side of the TSS.

In [None]:
%%bash
cd ~/Desktop/Divya/Thesis/annotation/
for REGION in k27ac k4me1
do
    if [ ! -f "$REGION"_chip_tss_matrix.tab.gz ]; then
        computeMatrix reference-point \
        --regionsFileName TSS.bed \
        --scoreFileName h3"$REGION"_meanlog2.bw \
        --afterRegionStartLength 5000 \
        --beforeRegionStartLength 5000 \
        -p 24 \
        -o "$REGION"_chip_tss_matrix.tab.gz
    fi;
done
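
Doubling the flanks to 5000 bp also doubles the number of columns in the resulting matrix. As a rough check, the column count per sample is (upstream + downstream) / bin size. The sketch below assumes the default `--binSize` of 10 bp, since no bin size is set in these cells; `bins_per_region` is a hypothetical helper.

```shell
# Hypothetical helper: number of bins (matrix columns) per region and sample.
# $1 = upstream bp, $2 = downstream bp, $3 = bin size in bp (assumed default: 10)
bins_per_region() {
    echo $(( ($1 + $2) / ${3:-10} ))
}

bins_per_region 2500 2500   # window used for the peak heatmaps: prints 500
bins_per_region 5000 5000   # window used for the ChIP TSS heatmaps: prints 1000
```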

In [None]:
%%bash
cd ~/Desktop/Divya/Thesis/annotation/
for REGION in k27ac k4me1
do
    if [ -f "$REGION"_chip_tss_matrix.tab.gz ]; then
        TITLE="H3${REGION^} around TSS"
        plotHeatmap -m "$REGION"_chip_tss_matrix.tab.gz \
        --plotType se \
        --heatmapHeight 28 \
        --heatmapWidth 5 \
        --samplesLabel "$TITLE" \
        --xAxisLabel "Distance (bp)" \
        --yAxisLabel "log2-fold change" \
        --regionsLabel TSS \
        --legendLocation none \
        --whatToShow "plot, heatmap and colorbar" \
        --colorMap Blues \
        --missingDataColor 1 \
        --refPointLabel TSS \
        -o "$REGION"_chip_tss.pdf
    fi;
done

## Miscellaneous

Next we look at the K27ac signal around K4me1 peak summits. First we compute the matrix, and then we create the heatmap.

In [None]:
%%bash
# Work in the annotation directory, as in the cells above
cd ~/Desktop/Divya/Thesis/annotation/
# Compute the matrix file if it doesn't exist already
if [ ! -f chip_k27ac_around_k4me1_peak_matrix.tab.gz ]; then
    computeMatrix reference-point \
            --regionsFileName h3k4me1_summits.bed \
            --scoreFileName h3k27ac_meanlog2.bw \
            --afterRegionStartLength 2500 \
            --beforeRegionStartLength 2500 \
            -p 24 \
            -o chip_k27ac_around_k4me1_peak_matrix.tab.gz
fi;

plotHeatmap -m chip_k27ac_around_k4me1_peak_matrix.tab.gz \
        --plotType se \
        --heatmapHeight 28 \
        --heatmapWidth 5 \
        --samplesLabel "K27ac around K4me1" \
        --xAxisLabel "Distance (bp)" \
        --yAxisLabel "log2-fold change" \
        --regionsLabel enhancers \
        --legendLocation none \
        --whatToShow "plot, heatmap and colorbar" \
        --colorMap plasma \
        --missingDataColor 0 \
        --refPointLabel Peak \
        -o chip_k27ac_around_k4me1_peak.pdf

Conversely, we plot the K4me1 signal around K27ac peak summits.

In [None]:
%%bash

# Work in the annotation directory, as in the cells above
cd ~/Desktop/Divya/Thesis/annotation/
# Compute the matrix file if it doesn't exist already
if [ ! -f chip_k4me1_around_k27ac_peak_matrix.tab.gz ]; then
    computeMatrix reference-point \
            --regionsFileName h3k27ac_summits.bed \
            --scoreFileName h3k4me1_meanlog2.bw \
            --afterRegionStartLength 2500 \
            --beforeRegionStartLength 2500 \
            -p 24 \
            -o chip_k4me1_around_k27ac_peak_matrix.tab.gz
fi;

plotHeatmap -m chip_k4me1_around_k27ac_peak_matrix.tab.gz \
        --plotType se \
        --heatmapHeight 28 \
        --heatmapWidth 5 \
        --samplesLabel "K4me1 around K27ac" \
        --xAxisLabel "Distance (bp)" \
        --yAxisLabel "log2-fold change" \
        --regionsLabel enhancers \
        --legendLocation none \
        --whatToShow "plot, heatmap and colorbar" \
        --colorMap cividis \
        --missingDataColor 0 \
        --refPointLabel Peak \
        -o chip_k4me1_around_k27ac_peak.pdf
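
The two cells above differ only in which mark supplies the signal and which supplies the peak summits. The file names can therefore be derived rather than typed out, which avoids a mismatch between the name that is checked and the name that is written. Here is a small sketch; `cross_names` is a hypothetical helper, and the name patterns are copied from the cells above.

```shell
# Hypothetical helper: derive all file names for "SIGNAL around REGIONS".
# Usage: cross_names SIGNAL_MARK REGIONS_MARK
cross_names() {
    echo "regions=h3${2}_summits.bed"
    echo "scores=h3${1}_meanlog2.bw"
    echo "matrix=chip_${1}_around_${2}_peak_matrix.tab.gz"
    echo "plot=chip_${1}_around_${2}_peak.pdf"
}

cross_names k27ac k4me1   # K27ac signal around K4me1 summits
cross_names k4me1 k27ac   # K4me1 signal around K27ac summits
```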


## ATAC Intergenic and Intronic Peaks in ChIP Regions

In [None]:
%%bash
cd ~/Desktop/Divya/Thesis/enhancers/
plotHeatmap -m atacchipintergenicnew.tab.gz \
    -o atacchipintergenic.pdf \
    --plotType se \
    --heatmapHeight 28 \
    --heatmapWidth 5 \
    --samplesLabel "Intergenic regions" \
    --xAxisLabel "Distance (bp)" \
    --yAxisLabel "RPKM" \
    --legendLocation none \
    --whatToShow "plot, heatmap and colorbar" \
    --colorMap plasma \
    --missingDataColor 0 \
    --refPointLabel "Peak"


In [None]:
%%bash
cd ~/Desktop/Divya/Thesis/enhancers/
plotHeatmap -m atacchipintronicnew.tab.gz \
    --plotType se \
    --heatmapHeight 28 \
    --heatmapWidth 5 \
    --samplesLabel "Intronic regions" \
    --xAxisLabel "Distance (bp)" \
    --yAxisLabel "RPKM" \
    --legendLocation none \
    --whatToShow "plot, heatmap and colorbar" \
    --colorMap plasma \
    --missingDataColor 0 \
    --refPointLabel Peak \
    -o atacchipintronic.pdf
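
The last two cells plot matrices that were computed elsewhere (atacchipintergenicnew.tab.gz and atacchipintronicnew.tab.gz), so the **computeMatrix** settings behind them are not visible in this notebook. As far as we know, deepTools records those settings as a JSON header on the first line of the matrix file, prefixed with `@`. Assuming that layout, the header can be printed as follows (`matrix_header` is a hypothetical helper):

```shell
# Hypothetical helper: print the settings header of a gzipped matrix file.
# Assumes the first line is "@" followed by JSON.
matrix_header() {
    # Decompress to stdout, keep the first line, and drop the leading "@".
    gzip -dc "$1" | head -n 1 | sed 's/^@//'
}
```

For example, `matrix_header atacchipintergenicnew.tab.gz` should show the upstream and downstream distances used for that matrix, if the file follows the assumed layout.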