## Downloads and Installs for CNV processing pipeline ##

### Through anaconda: ###

```bash
conda create -y -n cnv
source activate cnv
conda install -c bioconda python=2.7 cutadapt=1.16 bedtools=2.27.1 java-jdk
source deactivate 
source ~/.bashrc
```

### Manual download and installation: ###

Download the following tools into your software directory (`~/software`). There is no need to add them to your `PATH`, because the pipeline script we will write takes the full path to each tool:

**Installing Samtools**

```bash
mkdir -p ~/software && cd ~/software  # create the software directory if it does not exist yet
wget https://github.com/samtools/samtools/releases/download/1.8/samtools-1.8.tar.bz2
tar -vxf samtools-1.8.tar.bz2
cd samtools-1.8
./configure --disable-lzma
make
```

**Installing BWA**

```bash
cd ~/software  # the previous step ended inside samtools-1.8
wget https://downloads.sourceforge.net/project/bio-bwa/bwa-0.7.17.tar.bz2 --no-check-certificate
tar -xvf bwa-0.7.17.tar.bz2
cd bwa-0.7.17
make
```

**Download Picard tools**

```bash
cd ~/software  # the previous step ended inside bwa-0.7.17
wget https://github.com/broadinstitute/picard/releases/download/2.18.9/picard.jar
```
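Because the pipeline script takes full paths to these tools, it can help to confirm that each path exists before running anything. The following is a minimal sketch; `check_paths` is a hypothetical helper written for this guide, not part of any of the tools.

```bash
#!/bin/bash
# Sketch: confirm that each path you plan to hand to the pipeline exists.
# check_paths is a hypothetical helper, not part of samtools, bwa, or picard.
check_paths() {
    local missing=0 p
    for p in "$@"; do
        if [ -e "$p" ]; then
            echo "ok       $p"
        else
            echo "MISSING  $p" >&2
            missing=1
        fi
    done
    # Non-zero exit status if at least one path was missing
    return "$missing"
}
```

For example, assuming each download was run from `~/software`: `check_paths ~/software/samtools-1.8/samtools ~/software/bwa-0.7.17/bwa ~/software/picard.jar`.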

## Running processing pipeline for CNV calling ##

**Choosing fastq files**

Make a new directory in the projects folder of your scratch directory, and softlink the fastq files that you will be processing from the following shared directory:

    /oasis/tscc/scratch/cshl_2018/raw_data_dna_experiments/

Make sure to link both the R1 and the R2 file for each sample.
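The linking step can be sketched as a small helper. The function name `link_fastq_pairs`, its argument order, and the assumption that file names start with a sample prefix and contain `R1` or `R2` are all illustrative; check the actual file names in the shared directory and adjust the pattern.

```bash
#!/bin/bash
# Sketch: symlink the R1/R2 fastq files of one sample into a working directory.
# Assumes file names look like <sample>...R1... and <sample>...R2... (an assumption).
link_fastq_pairs() {
    local src_dir=$1    # shared raw-data directory (use an absolute path)
    local dest_dir=$2   # your own working directory; created if missing
    local sample=$3     # prefix shared by the R1 and R2 files

    mkdir -p "$dest_dir" || return 1

    local f linked=0
    for f in "$src_dir"/"$sample"*R[12]*; do
        [ -e "$f" ] || continue              # the glob matched nothing
        ln -sf "$f" "$dest_dir"/ && linked=$((linked + 1))
    done

    # Zero files or an odd number of files suggests a missing mate.
    if [ "$linked" -eq 0 ] || [ $((linked % 2)) -ne 0 ]; then
        echo "warning: linked $linked file(s) for '$sample'; expected R1/R2 pairs" >&2
        return 1
    fi
    echo "linked $linked file(s) into $dest_dir"
}
```

A call would look like `link_fastq_pairs /oasis/tscc/scratch/cshl_2018/raw_data_dna_experiments path/to/fastq_files sample_prefix`, where `sample_prefix` is a placeholder for a real sample name.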

**Making a shell script for running pipeline**

Write a .sh script to run the CNV pipeline. Use the following as a sample script:

```bash
#!/bin/bash

# bin directory of the cnv conda environment (bedtools and cutadapt are installed here)
T=~/anaconda2/envs/cnv/bin

# A comment cannot follow a line-continuation backslash, so the options are
# collected in an array, where each element can carry its own comment.
args=(
    -d path/to/fastq_files      # full path to the directory with the selected fastq files
    -p                          # indicates a paired-end experiment
    -r /oasis/tscc/scratch/cshl_2018/reference_files/hg19_fasta/hg19.fasta  # genome fasta file with index
    -bwa path/to/bwa-0.7.17     # path to bwa install
    -bed "${T}"                 # path to bedtools install
    -sam path/to/samtools-1.8   # path to samtools install
    -pic path/to/picard.jar     # path to picard tools
    -cut "${T}"                 # path to cutadapt
    -S 17                       # start adapter trimming
    -E 0                        # end adapter trimming
)

# Run the python script from the shared folder
python -u /oasis/tscc/scratch/cshl_2018/shared_scripts/CNV_pipeline_CSHL_v2.py "${args[@]}"
```

To run the pipeline, grab an interactive node (`nodes=1:ppn=2` should be fine), activate your `cnv` environment, and run the `.sh` script.
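Once you are on the interactive node, the last two steps can be wrapped in a small guard so the script never starts outside the `cnv` environment. This is a sketch only: `run_in_cnv_env` is a hypothetical helper, and it relies on conda setting the `CONDA_DEFAULT_ENV` variable when an environment is activated by name.

```bash
#!/bin/bash
# Sketch: run the pipeline script only when the cnv environment is active.
# run_in_cnv_env is a hypothetical helper, not part of the pipeline.
run_in_cnv_env() {
    local script=$1   # path to your .sh pipeline script

    # conda sets CONDA_DEFAULT_ENV to the name of the active environment
    if [ "${CONDA_DEFAULT_ENV:-}" != "cnv" ]; then
        echo "error: activate the cnv environment first (source activate cnv)" >&2
        return 1
    fi
    if [ ! -f "$script" ]; then
        echo "error: script not found: $script" >&2
        return 1
    fi
    bash "$script"
}
```

Typical use would be `source activate cnv` followed by `run_in_cnv_env run_cnv.sh`, where `run_cnv.sh` is a placeholder for whatever you named your script.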