# General Demultiplexing Pipeline

#### In a Terminal window start by...

- logging in to gardner (ssh -l t.sur.avangelatos gardner.cri.uchicago.edu)
- loading relevant modules

In [None]:
module load gcc/6.2.0
module load python/2.7.13

- changing the working directory

In [None]:
cd /group/gilbert-lab/Lutz/Cadaver/Alex

### 1) Validate Mapping File
Checks that the metadata mapping file is in the correct format. If there are any errors, a warning message is displayed, and the errors can be viewed in the .log file in the validate_mappingfile directory.

In [None]:
validate_mapping_file.py -m raw_data/mapfile_metadata.txt -o raw_data/validate_mappingfile

The output of this command is stored in the "validate_mappingfile" directory inside the raw_data directory.
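Before running the validator, a quick header pre-check can catch the most common formatting slip. The sketch below assumes the usual QIIME 1 layout (tab-separated, with #SampleID, BarcodeSequence and LinkerPrimerSequence as the first three columns and Description last); `check_map_header` is a hypothetical helper, not part of QIIME, and it does not replace validate_mapping_file.py.

```shell
# Hypothetical helper: pre-check the header line of a QIIME 1 mapping file.
# Assumes a tab-separated file whose first line is the header.
check_map_header() {
    awk -F '\t' '
        NR == 1 {
            sub(/\r$/, "")    # tolerate Windows line endings
            if ($1 != "#SampleID")            { print "column 1 should be #SampleID"; bad = 1 }
            if ($2 != "BarcodeSequence")      { print "column 2 should be BarcodeSequence"; bad = 1 }
            if ($3 != "LinkerPrimerSequence") { print "column 3 should be LinkerPrimerSequence"; bad = 1 }
            if ($NF != "Description")         { print "last column should be Description"; bad = 1 }
        }
        END {
            if (NR == 0) { print "mapping file is empty"; bad = 1 }
            if (!bad) print "header looks OK"
            exit (bad ? 1 : 0)
        }
    ' "$1"
}
```

For this project it could be called as `check_map_header raw_data/mapfile_metadata.txt`; a non-zero exit status means at least one header problem was printed.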


### 2) Demultiplex

> Joining Reads & Barcodes

In [None]:
mkdir raw_data/joined

scripts/ea-utils/bin/fastq-join raw_data/Undetermined_S0_L001_R1_001.fastq raw_data/Undetermined_S0_L001_R2_001.fastq -o raw_data/joined/out.%.fastq #> raw_data/joined/out.stats.txt
# Undetermined_S0_L001_R1_001.fastq holds the forward reads and Undetermined_S0_L001_R2_001.fastq holds the reverse reads

scripts/fastq-barcode.pl raw_data/Undetermined_S0_L001_I1_001.fastq raw_data/joined/out.join.fastq > raw_data/joined/out.barcodes.fastq
# Undetermined_S0_L001_I1_001.fastq holds the barcode (index) reads
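Because the demultiplexing step pairs each joined read with a barcode record, it can help to confirm that the two files produced above contain the same number of records. This is a minimal sketch that assumes unwrapped FASTQ (exactly four lines per record); `count_fastq_records` and `compare_fastq_counts` are hypothetical helpers, not part of ea-utils or QIIME.

```shell
# Hypothetical helper: count records in an unwrapped FASTQ file
# (four lines per record).
count_fastq_records() {
    awk 'END { print int(NR / 4) }' "$1"
}

# Hypothetical helper: print both counts and return non-zero if they differ.
compare_fastq_counts() {
    reads=$(count_fastq_records "$1")
    barcodes=$(count_fastq_records "$2")
    echo "reads: $reads  barcodes: $barcodes"
    [ "$reads" -eq "$barcodes" ]
}
```

For example: `compare_fastq_counts raw_data/joined/out.join.fastq raw_data/joined/out.barcodes.fastq`. Equal counts do not prove that the records are in the same order, so this is only a first sanity check.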

> Demultiplex Reads

In [None]:
mkdir raw_data/demux
split_libraries_fastq.py -i raw_data/joined/out.join.fastq -b raw_data/joined/out.barcodes.fastq -m raw_data/mapfile_metadata.txt -o raw_data/demux/cadaver_demux_seqs --barcode_type=12 --max_barcode_errors=0 --store_demultiplexed_fastq
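After demultiplexing, a per-sample read count is a quick way to spot samples that received few or no reads. The sketch below assumes that headers in seqs.fna have the form `>SampleID_N ...`, where N is a running number; `count_seqs_per_sample` is a hypothetical helper. The split_library_log.txt file written next to seqs.fna should report similar counts.

```shell
# Hypothetical helper: count sequences per sample in a demultiplexed FASTA.
# Assumes headers of the form ">SampleID_N ..." (N = running number).
count_seqs_per_sample() {
    awk '
        /^>/ {
            id = substr($1, 2)          # drop the leading ">"
            sub(/_[0-9]+$/, "", id)     # drop the trailing running number
            n[id]++
        }
        END { for (id in n) printf "%s\t%d\n", id, n[id] }
    ' "$1" | sort
}
```

For example: `count_seqs_per_sample raw_data/demux/cadaver_demux_seqs/seqs.fna`. The output is one line per sample, with the sample ID and the count separated by a tab.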

### 3) Identify sub-OTUs
Searching for Exact Sequence Variants (ESVs) using Deblur

#### Input file:
Demultiplexed FASTA file (e.g. filter_derep.fasta); in this case, the seqs.fna file produced by the demultiplexing step.

#### Output files:
- reference-hit.biom
- reference-hit.seqs.fa
- reference-non-hit.biom
- reference-non-hit.seqs.fa
- all.biom (contains both reference-hit.biom and reference-non-hit.biom)
- all.seqs.fa (contains both reference-hit.seqs.fa and reference-non-hit.seqs.fa)

Only the reference-hit outputs are used in the steps that follow.


Open a new terminal window and load the following modules (Python 3 and QIIME 2) to run Deblur...

In [None]:
module load gcc/6.2.0
module load python/3.5.3
module load qiime2

In [None]:
# Run Deblur
deblur workflow --seqs-fp raw_data/demux/cadaver_demux_seqs/seqs.fna --output-dir deblur_results -t 150
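The `-t 150` option sets the trim length: Deblur trims every read to that length and discards reads that are shorter. To see how many reads would be lost at a given trim length, something like the following could be run on the input FASTA first. It is a minimal sketch; `count_short_seqs` is a hypothetical helper, not part of Deblur.

```shell
# Hypothetical helper: report how many FASTA sequences are shorter than
# a minimum length. Handles sequences wrapped over several lines.
# Usage: count_short_seqs <fasta> <min_length>
count_short_seqs() {
    awk -v min="$2" '
        # Close out the previous record: count it if it is too short.
        function tally() { if (seen && len < min) short++ }
        /^>/ { tally(); seen = 1; len = 0; total++; next }
        { len += length($0) }
        END {
            tally()
            printf "%d of %d sequences shorter than %d nt\n", short, total, min
        }
    ' "$1"
}
```

For example: `count_short_seqs raw_data/demux/cadaver_demux_seqs/seqs.fna 150`. If a large share of the reads is shorter than the chosen value, a smaller trim length may be worth considering.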

Return to the first terminal window, or load these modules again...

In [None]:
module load gcc/6.2.0
module load python/2.7.13

### 4) Align Sequences (GreenGenes reference)

In [None]:
align_seqs.py -i deblur_results/reference-hit.seqs.fa -t /group/gilbert-lab/Lutz/Cadaver/Alex/gg_13_8_otus/rep_set_aligned/85_otus.pynast.fasta -o aligned

### 5) Make Phylogeny

In [None]:
mkdir final_biom_files

make_phylogeny.py -i aligned/reference-hit.seqs_aligned.fasta -o final_biom_files/rep_phylo.tre

### 6) Assign Taxonomy

In [None]:
assign_taxonomy.py -i deblur_results/reference-hit.seqs.fa -r gg_13_8_otus/rep_set/97_otus.fasta -t gg_13_8_otus/taxonomy/97_otu_taxonomy.txt -o deblur_results/taxon_assignment/

### 7) Add Metadata

In [None]:
biom add-metadata --sc-separated taxonomy --observation-header OTUID,taxonomy --observation-metadata-fp deblur_results/taxon_assignment/reference-hit.seqs_tax_assignments.txt -i deblur_results/reference-hit.biom -o final_biom_files/cadaver_deblur.biom
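Since the taxonomy is matched to the BIOM table by sequence ID, it can be worth checking that every sequence in reference-hit.seqs.fa has a row in the taxonomy assignments file. The sketch below assumes the assignments file is tab-separated with the sequence ID in the first column; `missing_taxonomy_ids` is a hypothetical helper, not part of QIIME or biom-format.

```shell
# Hypothetical helper: list FASTA sequence IDs that have no row in a
# tab-separated taxonomy assignments file (ID in the first column).
# Usage: missing_taxonomy_ids <fasta> <tax_assignments>
missing_taxonomy_ids() {
    awk -F '\t' '
        # First file on the command line: the taxonomy assignments.
        FILENAME == ARGV[1] { if ($1 != "") assigned[$1] = 1; next }
        # Second file: the FASTA; the ID is the header up to the first blank.
        /^>/ {
            id = substr($0, 2)
            sub(/[ \t].*$/, "", id)
            if (!(id in assigned)) print id
        }
    ' "$2" "$1"
}
```

For example: `missing_taxonomy_ids deblur_results/reference-hit.seqs.fa deblur_results/taxon_assignment/reference-hit.seqs_tax_assignments.txt`. Empty output means every sequence has an assignment row.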