# Pull sequencing reads into fastq.gz from SRA or local storage
### Find local fastq.gz files using the [Tony frontend](http://iona.wi.mit.edu/tonydev/dbSearch.pl)
## The example below pulls 1M reads for every supported type of seq data <br> Types of seq pulled:
* ChIP-Seq and Chromatin Accessibility (ATAC-Seq and DNase-Seq)
* ChIA-PET
* Hi-C
* HiChIP
* DNase Hi-C
* RNA-Seq
* GRO-Seq
* 4C
### Links to the pipelines for each type of data are below

In [1]:
#!/bin/bash
#Set directories for input/output
#Define data folders
out_dir="/tmp/"
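
If `out_dir` is changed from `/tmp/` to a project folder, it may help to confirm the folder exists and is writable before pulling any reads. A minimal sketch; `ensure_out_dir` is a hypothetical helper and not part of the pipelines.

```shell
#!/bin/bash
# Hypothetical helper (not part of /root/pipelines): create the output
# directory if it is missing and confirm it is writable.
ensure_out_dir() {
    mkdir -p "$1" || return 1
    [ -d "$1" ] && [ -w "$1" ]
}
```

It could be called as `ensure_out_dir "$out_dir" || echo "cannot write to $out_dir"` right after `out_dir` is set.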

# Don't forget!!!
Quality-check, trim, and filter your reads using [this pipeline](fq2preppedReads.ipynb) before running ANY of the pipelines.

In [15]:
#Chip-seq/Chromatin Accessibility
#CTCF mesc
/root/pipelines/SRA2fq SRR524848 $out_dir CTCF_mesc 1000000
#Input mesc
/root/pipelines/SRA2fq SRR524849 $out_dir input_mesc 1000000
#H3K27Ac mesc
/root/pipelines/SRA2fq SRR066766 $out_dir h3k27Ac_mesc 1000000
#ATAC mesc
/root/pipelines/SRA2fq SRR2927023 $out_dir ATAC_mesc 1000000

#CTCF hesc
/root/pipelines/SRA2fq SRR2056018 $out_dir CTCF_hesc 1000000
#Input hesc
/root/pipelines/SRA2fq SRR2056020 $out_dir input_hesc 1000000
#H3K27Ac hesc
/root/pipelines/SRA2fq SRR2056016 $out_dir h3k27Ac_hesc 1000000
#ATAC h7 hesc
/root/pipelines/SRA2fq SRR3689760 $out_dir ATAC_h7 1000000

See [this pipeline](fq2peaks.ipynb) to call peaks using HOMER and/or MACS2 and to determine nucleosome positioning using NucleoATAC.
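
Each call above requests 1,000,000 reads, so before moving on it can be worth confirming that a pulled file really holds that many. A minimal sketch that counts FASTQ records at four lines per read; `count_reads` is a hypothetical helper, and it assumes standard unwrapped FASTQ.

```shell
#!/bin/bash
# Hypothetical helper (not part of /root/pipelines): count the reads in a
# gzipped FASTQ file.
# Assumes standard 4-line FASTQ records (no wrapped sequence lines).
count_reads() {
    local lines
    lines=$(gzip -dc "$1" | wc -l)
    echo $(( lines / 4 ))
}
```

Usage would look like `count_reads /path/to/sample.fastq.gz`, where the path is a placeholder: this notebook does not state how SRA2fq names its output files, so check the output directory for the actual file names.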

In [2]:
#chia-PET
#mesc
/root/pipelines/SRA2fq SRR1296617 $out_dir ChiA_SMC1_mesc 1000000
#hesc
/root/pipelines/SRA2fq SRR2054933 $out_dir ChiA_SMC1_hesc 1000000

See [this pipeline](fq2ChIAInts.ipynb) to prep and align reads, call and normalize pairwise interactions using ChIA-PET2 and/or Origami, and dump them into cooler format.

In [18]:
#Hi-c
#mesc
/root/pipelines/SRA2fq SRR443883 $out_dir HiC_mesc 1000000
#hesc
/root/pipelines/SRA2fq SRR400260 $out_dir HiC_hesc 1000000

See [this pipeline](fq2HiCInts.ipynb) to align fastq reads, normalize the contacts, and dump them into cooler format using HiC-Pro.

In [19]:
#Hi-ChIp
#mesc
/root/pipelines/SRA2fq SRR3467183 $out_dir HiChip_mesc 1000000
#GM12878 (human lymphoblastoid line; the output name keeps the hesc label)
/root/pipelines/SRA2fq SRR3467176 $out_dir HiChip_hesc 1000000

See [this custom pipeline](fq2HiChIPInts.ipynb) to take fastq reads through HiC-Pro plus scripts that normalize and dump into cooler format.

In [20]:
#Dnase Hi-c
#mouse Patski cells
/root/pipelines/SRA2fq SRR2033066 $out_dir dnaseHiC_patski 1000000
#hesc
/root/pipelines/SRA2fq SRR1248175 $out_dir dnaseHiC_hesc 1000000

See [this pipeline](fq2DNAseHiCInts.ipynb) to align fastq reads, normalize the contacts, and dump them into cooler format using HiC-Pro.

In [None]:
#RNA-Seq
#mesc 4cell
/root/pipelines/SRA2fq SRR1840518 $out_dir rnaseq_mesc 1000000
#hesc mesoderm
/root/pipelines/SRA2fq SRR3439456 $out_dir rnaseq_hesc 1000000

See [this pipeline](fq2countsFPKM.ipynb) to align RNA-Seq reads and quantify/normalize expression values (FPKM) using RSEM.

In [None]:
#Gro-seq
#mesc
/root/pipelines/SRA2fq SRR935093 $out_dir groseq_mesc 1000000
#h1 hesc (our data!)
/root/pipelines/SRA2fq SRR574826 $out_dir groseq_hesc 1000000

See [this pipeline](fq2GroRPKM.ipynb) to find nascent transcripts using FStitch and miRNA promoters using mirSTP.

In [None]:
#4C
#mesc poised enhancers = viewpoints
/root/pipelines/SRA2fq SRR4451724 $out_dir 4c_poiEnh_mesc 1000000
#hesc MT2A
/root/pipelines/SRA2fq SRR1409666 $out_dir 4c_MT2A_hesc 1000000

See [this pipeline](fq24CInts.ipynb) to get a wiggle file from fastq reads using HiC-Pro and/or a custom pipeline.
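
Every cell in this notebook repeats the same call shape (accession, output directory, sample name, read count), so the pulls can also be driven from a short table. A minimal sketch, assuming SRA2fq takes its arguments in exactly the order used in this notebook; `pull_batch`, `SRA2FQ`, and `DRY_RUN` are hypothetical names introduced here for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of /root/pipelines): read
# "accession sample_name" pairs from stdin and call SRA2fq once per pair.
# Assumption: SRA2fq takes <accession> <out_dir> <sample_name> <n_reads>,
# matching the calls in this notebook.
SRA2FQ="${SRA2FQ:-/root/pipelines/SRA2fq}"

pull_batch() {
    local out_dir="$1" n_reads="$2" acc name
    while read -r acc name; do
        # Skip blank lines and comment lines
        case "$acc" in
            ''|'#'*) continue ;;
        esac
        if [ -n "${DRY_RUN:-}" ]; then
            # Dry run: print the command instead of executing it
            echo "$SRA2FQ $acc $out_dir $name $n_reads"
        else
            # Redirect stdin so SRA2fq cannot consume the remaining pairs
            "$SRA2FQ" "$acc" "$out_dir" "$name" "$n_reads" < /dev/null
        fi
    done
}
```

Setting `DRY_RUN=1` and piping pairs such as `SRR2056018 CTCF_hesc` into `pull_batch "$out_dir" 1000000` prints the commands that would run, which is a cheap way to check the table before downloading anything.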