Skip to content

pezoaso/problem-set-4

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

---
title: "Problem set 4"
author: "Your name here"
date: "`r Sys.Date()`"
---

# Overview

For this problem set you will need to analyze some ChIP-seq data to
identify a mystery factor X.

## Workflow

Create a `run.sh` file that runs the entire workflow (or as much as possible).

### Alignment

Align FASTQ data to the human genome with bowtie2. There are two files
in the `data/` directory:

```
data/factorx.chr1.fq.gz
data/hg19.chr1.fa.gz
```

First build a bowtie2 index with `bowtie2-build` and use `bowtie2` and `samtools` to align the reads to the index.

**The output of the alignment step is a sorted BAM file.**

### Create bedGraph

Create a bedGraph file from the sorted BAM files. Use the
`bedGraphToBigWig` utility and the `hg19.chrom.size` file in the `data/`
directory.

### Create a track in the UCSC browser

1. Create a branch in your forked repository called `gh-pages`:

```bash
$ git branch gh-pages
$ git push origin gh-pages
```

1. Go to the browser and add a "custom track" in the `hg19` genome build.
your trackline should look something like this (all on one line):

```
track type=bedGraph bigDataUrl="http://<username>.github.io/<repo name>/path/to/bw color=255,0,0 visiblity=full name='chip data' description='chip description'
```

### Peak calling

Call peaks from the bedGraph data using MACS2.

```bash
$ macs2 callpeak -t <BAM file>
```


### Generate motifs from the peak calls

1. Use these peak calls to collect FASTA sequences with `bedtools getfasta`.

1. Derive motifs from the FASTA sequences with `meme`.

```bash
# if you get an error about "max size" add -maxsize 1000000
$ meme <FASTA file> -nmotifs 1 -maxw 20 -minw 8 -dna 
```

1. Extract the motif from the `meme.txt` output and use TOMTOM to identify
the motif match. You can use the `meme-get-motif` to extract the first motif from the file:

```bash
meme-get-motif -id 1 < meme.txt
```

Copy the numeric matrix into the the search box on the tomtom site and report which motif it matches.

About

Problem Set 4

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages