# Introduction

ChIP-Seq combines chromatin immunoprecipitation (ChIP) assays with high-throughput sequencing (Seq) and can be used to identify DNA binding sites for transcription factors and other proteins. The goal of this hands-on session is to work through the basic steps of a ChIP-Seq analysis, as well as some downstream analysis. Throughout this practical we will try to identify potential binding sites of the transcription factor PAX5 in human lymphoblastoid cells.


## Learning outcomes

By the end of this tutorial you can expect to be able to: 

  * generate an unspliced alignment by aligning raw sequencing data to the human genome using **[Bowtie2](http://bowtie-bio.sourceforge.net/bowtie2/index.shtml)**
  * manipulate the SAM output in order to visualise the alignment in **[IGV](http://software.broadinstitute.org/software/igv/)**
  * find immuno-enriched areas from the aligned reads using the peak caller **[MACS2](https://github.com/taoliu/MACS)**
  * perform functional annotation and motif analysis on the predicted binding regions


## Tutorial sections

This tutorial comprises the following sections:    

  1. [Introducing the tutorial dataset](dataset-intro.ipynb) 
  2. [Aligning the PAX5 sample to the genome](pax5-alignment.ipynb)
  3. [Manipulating SAM output](manipulate-sam.ipynb) 
  4. [Visualising alignments in IGV](alignment-visualisation.ipynb) 
  5. [Aligning the control sample to the genome](control-alignment.ipynb)
  6. [Identifying enriched areas using MACS](identifying-enriched-areas.ipynb)
  7. [File formats](file-formats.ipynb)
  8. [Inspecting genomic regions using bedtools](inspecting-genomic-regions.ipynb)
  9. [Motif analysis](motif-analysis.ipynb)
 
 
## Authors
This tutorial was converted into a Jupyter notebook by [Victoria Offord](https://github.com/vaofford) based on materials developed by Angela Goncalves, Myrto Kostadima, Steven Wilder and Maria Xenophontos.

## Prerequisites

This tutorial assumes that you have the following software or packages and their dependencies installed on your computer. The software or packages used in this tutorial may be updated from time to time, so we have also given the version that was used when writing the tutorial.

| Package               | Link for download/installation instructions                          | Version tested |
| :-------------------: | :------------------------------------------------------------------: |:-------------: |
| bedtools              | http://bedtools.readthedocs.io/en/latest/content/installation.html   | 2.26.0         |
| Bowtie2               | http://bowtie-bio.sourceforge.net/bowtie2                            | 2.3.4.1        |
| IGV                   | http://software.broadinstitute.org/software/igv                      | 2.7.2          |
| MACS2                 | https://github.com/taoliu/MACS                                       | 2.1.0.20150420 |
| meme                  | http://meme-suite.org/tools/meme                                     | 4.10.0         |
| samtools              | https://github.com/samtools/samtools                                 | 1.9            |
| tomtom                | http://web.mit.edu/meme_v4.11.4/share/doc/tomtom.html                | 4.10.0         |
| UCSC tools            | http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64                | NA             |
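Before starting, it can help to confirm that the command-line tools from the table above are on your `PATH`. The following is a minimal sketch, not part of the original course materials: the helper name `check_tools` is hypothetical, and the executable names are assumptions based on typical installations. IGV and the UCSC tools are left out because how they are launched and named varies between installations.

```shell
# Report whether each named command-line tool can be found on the PATH.
# check_tools is a hypothetical helper, not part of any of the packages above.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Return non-zero if at least one tool was not found.
    return "$missing"
}

# Executable names are assumed; adjust them to match your installation.
check_tools bedtools bowtie2 macs2 meme samtools tomtom \
    || echo "One or more tools are missing; see the table above for installation links."
```

This only checks that each executable is present, not which version it is. Most of these tools can report their own version, but the exact flag differs between them, so consult each tool's help output.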

## Where can I find the tutorial data?

You can find the data for this tutorial by typing the following command in a new terminal window.

    cd /home/manager/course_data/Module6_CHiPSeq

Now, let’s head to the first section of this tutorial, [Introducing the tutorial dataset](dataset-intro.ipynb).