# 3.3 Calling differentially expressed peaks with DESeq2

### IMPORTANT: Please make sure that you are using the bash kernel to run this notebook. ###




In this tutorial, we will focus on calling differential peaks: 
![Analysis pipeline](part4.png)

## Missing R packages 

If a script in this section fails with an error saying that the `gplots` package is not installed, you can install it locally by running the **3.5 Install R packages** notebook.

## Running DESeq2

We run DESeq2 on five comparisons (which we call "contrasts"):
* Media
    * SCD vs SCE
* Salt
    * 1 vs 0, where 1 = salt used, 0 = no salt used
* Strain
    * WT vs cln3
    * WT vs whi5
    * WT vs whi5cln3
   

In [3]:
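# Create a directory to store the DESeq2 output.
# ANALYSIS_DIR, SRC_DIR and MASTER_DATA are assumed to have been set earlier in the tutorial.
DESEQ_DIR="${ANALYSIS_DIR}deseq/"
mkdir -p "$DESEQ_DIR"   # -p makes this a no-op if the directory already exists

# Arguments: filtered count table, batch/design file, output directory
Rscript "$SRC_DIR/runDESeqTrainingCamp.r" "$MASTER_DATA/counts.filtered.tab" "$MASTER_DATA/batches.deseq2.txt" "$DESEQ_DIR"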

Loading required package: S4Vectors
Loading required package: methods
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, cbind, colnames, do.call,
    duplicated, eval, evalq, Filter, Find, get, grep, grepl, intersect,
    is.unsorted, lapply, lengths, Map, mapply, match, mget, order,
    paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind,
    Reduce, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit


Attaching package: ‘S4Vectors’

This script generates five pairs of files, one pair per contrast:

* Media_SCD_vs_SCE.txt
* DEMedia_SCD_vs_SCE.txt.sigPeakNames


* Salt_1_vs_0.txt  
* DESalt_1_vs_0.txt.sigPeakNames  


* Strain_WT_vs_cln3.txt  
* DEStrain_WT_vs_cln3.txt.sigPeakNames


* Strain_WT_vs_whi5.txt  
* DEStrain_WT_vs_whi5.txt.sigPeakNames  


* Strain_WT_vs_whi5cln3.txt
* DEStrain_WT_vs_whi5cln3.txt.sigPeakNames


In each pair, the first file is the raw DESeq2 output for all peaks. We will not have time to discuss every column in this file, but feel free to read the DESeq2 manual and work through it. The second file, whose name starts with "DE" and ends in ".sigPeakNames", lists the ATAC-seq peaks that are differentially open for that contrast, one peak per line. The p-value cutoff that we use for differential openness is 0.05. You can examine the contents of these files with the following commands:

In [4]:
head -n20 $DESEQ_DIR/Media_SCD_vs_SCE.txt
#head -n20 $DESEQ_DIR/Salt_1_vs_0.txt
#head -n20 $DESEQ_DIR/Strain_WT_vs_cln3.txt
#head -n20 $DESEQ_DIR/Strain_WT_vs_whi5.txt
#head -n20 $DESEQ_DIR/Strain_WT_vs_whi5cln3.txt


baseMean	log2FoldChange	lfcSE	stat	pvalue	padj
chrI_0_156422	26084.7015972314	0.181049014703123	1.96988531736345	0.0919084035538903	0.926770814654233	0.999384634705468
chrI_156464_156851	44.9589928934705	0.501374031244778	1.33779989187538	0.374775057383159	0.707827765824978	0.999384634705468
chrI_157271_157456	19.9526780442076	0.645956988548164	1.31637284527805	0.490709749039012	0.623631750219979	0.999384634705468
chrI_157831_157984	9.07390983920557	1.06394540998462	1.22506164250447	0.868483162863159	0.385129885681319	0.971104364520846
chrI_158496_158893	72.6942074836576	-0.00515400304259715	1.38361186260075	-0.00372503530933109	0.997027858711757	0.999384634705468
chrI_159448_159806	42.7643467784961	0.126253276130358	1.38680152762531	0.0910391816098931	0.927461457885821	0.999384634705468
chrI_159903_160189	47.6945001293688	0.316446635561479	1.50658461983367	0.210042390845868	0.833634587815822	0.999384634705468
chrI_166115_166931	289.860262308754	0.96948968596705	1.5402926956197
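
The raw table can also be filtered directly from the shell. The sketch below assumes the layout shown above: the header has six names, but each data row has seven tab-separated fields because the peak ID comes first, so `padj` is field 7. Whether `runDESeqTrainingCamp.r` applies its 0.05 cutoff to `pvalue` or to `padj` is not stated here, so filtering on `padj` is an assumption, and `filter_by_padj` is a hypothetical helper name. The toy values are made up for illustration.

```shell
# Sketch: print peak ID, log2FoldChange and padj for rows whose padj is below a cutoff.
# filter_by_padj is a hypothetical helper, not part of runDESeqTrainingCamp.r.
filter_by_padj() {
    local table="$1" cutoff="${2:-0.05}"
    # NR > 1 skips the header line; rows whose padj is "NA" are dropped explicitly.
    awk -F'\t' -v cutoff="$cutoff" \
        'NR > 1 && $7 != "NA" && ($7 + 0) < cutoff { print $1 "\t" $3 "\t" $7 }' "$table"
}

# Toy table with the same column layout as the DESeq2 output (values are made up).
demo_table="$(mktemp)"
printf 'baseMean\tlog2FoldChange\tlfcSE\tstat\tpvalue\tpadj\n' > "$demo_table"
printf 'peak_a\t100\t2.0\t0.4\t5.0\t0.000001\t0.0001\n' >> "$demo_table"
printf 'peak_b\t50\t0.1\t0.5\t0.2\t0.84\t0.99\n' >> "$demo_table"
printf 'peak_c\t5\t1.0\t0.9\t1.1\tNA\tNA\n' >> "$demo_table"

demo_hits="$(filter_by_padj "$demo_table" 0.05)"
echo "$demo_hits"
rm -f "$demo_table"
```

On the real data, the same helper could be called as `filter_by_padj "$DESEQ_DIR/Media_SCD_vs_SCE.txt" 0.05`.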

In [6]:
head -n20 $DESEQ_DIR/DEMedia_SCD_vs_SCE.txt.sigPeakNames
#head -n20 $DESEQ_DIR/DESalt_1_vs_0.txt.sigPeakNames
#head -n20 $DESEQ_DIR/DEStrain_WT_vs_cln3.txt.sigPeakNames
#head -n20 $DESEQ_DIR/DEStrain_WT_vs_whi5.txt.sigPeakNames
#head -n20 $DESEQ_DIR/DEStrain_WT_vs_whi5cln3.txt.sigPeakNames

chrIV	1515148	1515306
chrIV	1516434	1516627
chrIX	421454	422802
chrIX	426990	427998
chrIX	430190	430883
chrIX	433561	433834
chrIX	435365	435534
chrV	561740	561981
chrVII	1062872	1063078
chrVII	1066835	1067137
chrX	657235	657746
chrX	667155	667367
chrX	704762	705655
chrXI	481814	481986
chrXI	490886	491632
chrXI	517827	517992
chrXI	558757	558930
chrXI	630070	631067
chrXII	0	455655
chrXV	1073696	1074118
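
Because each `.sigPeakNames` file holds one peak per line, counting lines gives the number of differentially open peaks per contrast. The following is a minimal sketch under that assumption; `count_sig_peaks` is a hypothetical helper, and the file-name pattern (`DE<contrast>.txt.sigPeakNames`) is taken from the commands above. The demo runs on a throwaway directory so that it does not depend on the tutorial data.

```shell
# Sketch: report the number of significant peaks for each of the five contrasts.
# count_sig_peaks is a hypothetical helper, not part of the tutorial scripts.
count_sig_peaks() {
    local dir="$1" contrast f
    for contrast in Media_SCD_vs_SCE Salt_1_vs_0 Strain_WT_vs_cln3 \
                    Strain_WT_vs_whi5 Strain_WT_vs_whi5cln3; do
        f="$dir/DE${contrast}.txt.sigPeakNames"
        if [ -f "$f" ]; then
            # One peak per line, so the line count is the number of significant peaks.
            printf '%s\t%s\n' "$contrast" "$(wc -l < "$f" | tr -d ' ')"
        else
            printf '%s\tmissing\n' "$contrast"
        fi
    done
}

# Demo: a temporary directory holding two peaks copied from the output above.
demo_dir="$(mktemp -d)"
printf 'chrIV\t1515148\t1515306\nchrIV\t1516434\t1516627\n' \
    > "$demo_dir/DEMedia_SCD_vs_SCE.txt.sigPeakNames"
demo_counts="$(count_sig_peaks "$demo_dir")"
echo "$demo_counts"
rm -rf "$demo_dir"
```

In the notebook, `count_sig_peaks "$DESEQ_DIR"` would summarize all five contrasts at once and flag any output file that was not written.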
