# Parkinson's Mouse Tutorial - Import & Demux

Run this notebook in `qiime2-2021.11`.

We'll be working through the [pd-mouse tutorial](https://docs.qiime2.org/2021.11/tutorials/pd-mice/).

*Note: did you run `jupyter serverextension enable --py qiime2 --sys-prefix` before getting here?*

Also, see the [Jupyter Markdown documentation](https://jupyter.brynmawr.edu/services/public/dblank/Jupyter%20Notebook%20Users%20Manual.ipynb).

In [None]:
from os import getcwd, listdir, chdir, mkdir
import qiime2 as q2

In [None]:
getcwd()

In [None]:
listdir()

In [None]:
from os import makedirs
makedirs('../processed', exist_ok=True)  # no error if the directory already exists

In [None]:
chdir('../processed')
getcwd()

## Download and View Metadata

We'll use `wget` to download the metadata file, and then visualize it in one of three ways:
 - [QIIME 2 View Website](https://view.qiime2.org/)
 - [QIIME 2 CLI / Utilities](https://docs.qiime2.org/2021.11/tutorials/utilities/)
 - [QIIME 2 API](https://docs.qiime2.org/2021.11/interfaces/artifact-api/)
 
 *Note: If you are running this notebook on the HPC, you may need to copy and paste these commands into the "Grace Shell Access" under the "Clusters" menu of the Grace HPC Portal page. Make sure you are downloading the files into the appropriate directory. Alternatively, download the files to your computer and use Jupyter Lab to upload them.*

In [None]:
# Download Metadata
! wget \
    -O "metadata.tsv" \
    "https://data.qiime2.org/2021.11/tutorials/pd-mice/sample_metadata.tsv"

In [None]:
# Peek at the metadata
! qiime tools inspect-metadata metadata.tsv
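
For a quick look without the QIIME 2 tooling, the metadata file can also be read as a plain tab-separated file. The following is a minimal sketch, assuming the file has a header row whose first column holds the sample IDs and an optional `#q2:types` row that declares column types. The helper name `read_metadata` and the demo rows are made up for illustration and are not taken from the tutorial data.

```python
import csv
import io


def read_metadata(handle):
    """Split a QIIME 2-style metadata TSV into (column_types, rows).

    Hypothetical helper: handles only the header row, the optional
    '#q2:types' directive row, and '#' comment rows.
    """
    reader = csv.DictReader(handle, delimiter='\t')
    id_column = reader.fieldnames[0]  # the first column holds the sample IDs
    column_types, rows = {}, []
    for row in reader:
        first_cell = row[id_column]
        if first_cell == '#q2:types':
            # Directive row: maps each column to 'categorical' or 'numeric'
            column_types = {k: v for k, v in row.items() if k != id_column}
        elif first_cell.startswith('#'):
            continue  # treat any other '#' row as a comment
        else:
            rows.append(row)
    return column_types, rows


# Made-up demo content (not the tutorial's metadata)
demo_text = (
    'sample-id\tgenotype\tdays_post_transplant\n'
    '#q2:types\tcategorical\tnumeric\n'
    'sample-a\twild type\t7\n'
    'sample-b\tsusceptible\t14\n'
)
demo_types, demo_rows = read_metadata(io.StringIO(demo_text))
print(demo_types)
print(len(demo_rows), 'samples')
```

In the notebook, the same function could be applied to the downloaded file with `read_metadata(open('metadata.tsv', newline=''))`.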

**Make Metadata Visualization**

In [None]:
! qiime metadata tabulate \
  --m-input-file metadata.tsv \
  --o-visualization metadata.qzv

In [None]:
! qiime tools peek metadata.qzv
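
`.qza` and `.qzv` files are zip archives whose top-level directory is named after the archive's UUID and contains a `metadata.yaml` file. The sketch below reads that file with the standard library to recover roughly what `qiime tools peek` reports. It assumes the file holds simple `key: value` lines; `peek_archive` is a hypothetical helper, and the demo archive is built in memory with made-up contents.

```python
import io
import zipfile


def peek_archive(path_or_file):
    """Return the key/value pairs from an archive's top-level metadata.yaml."""
    with zipfile.ZipFile(path_or_file) as zf:
        # Match '<root-dir>/metadata.yaml', i.e. exactly one directory deep
        candidates = [name for name in zf.namelist()
                      if name.count('/') == 1 and name.endswith('/metadata.yaml')]
        if not candidates:
            raise ValueError('no top-level metadata.yaml found')
        text = zf.read(candidates[0]).decode('utf-8')
    info = {}
    for line in text.splitlines():
        # Assumes flat 'key: value' lines rather than nested YAML
        key, sep, value = line.partition(':')
        if sep:
            info[key.strip()] = value.strip()
    return info


# Tiny in-memory archive with made-up contents, for demonstration only
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('0000-demo-uuid/metadata.yaml',
                'uuid: 0000-demo-uuid\ntype: Visualization\nformat: null\n')
buf.seek(0)
demo_info = peek_archive(buf)
print(demo_info)
```

In the notebook, `peek_archive('metadata.qzv')` would be the equivalent call on the real file.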

In [None]:
# Visualize via API
q2.Visualization.load('metadata.qzv')

## Import data into QIIME 2

We will import:
 - [Manifest File](https://docs.qiime2.org/2021.11/tutorials/importing/#fastq-manifest-formats)
 - Demultiplexed Sequences (as opposed to multiplexed sequences)
 
See the [Importing Data Tutorial](https://docs.qiime2.org/2021.11/tutorials/importing/#importing-data) for more information.

In [None]:
# get manifest file
!wget \
  -O "manifest.tsv" \
  "https://data.qiime2.org/2021.11/tutorials/pd-mice/manifest"

In [None]:
# get demultiplexed sequences
!wget \
  -O "demultiplexed_seqs.zip" \
  "https://data.qiime2.org/2021.11/tutorials/pd-mice/demultiplexed_seqs.zip"

In [None]:
# unzip sequences
! unzip demultiplexed_seqs.zip

In [None]:
! head manifest.tsv
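
Before importing, it can help to confirm that every file named in the manifest exists. The sketch below assumes the single-end V2 manifest layout, with `sample-id` and `absolute-filepath` columns, and paths that may start with `$PWD`. `missing_manifest_files` is a hypothetical helper, and the demo uses a temporary directory with made-up file names.

```python
import csv
import os
import tempfile


def missing_manifest_files(manifest_path, base_dir=None):
    """List (sample-id, path) pairs whose FASTQ file cannot be found."""
    base_dir = base_dir or os.getcwd()
    missing = []
    with open(manifest_path, newline='') as handle:
        reader = csv.DictReader(handle, delimiter='\t')
        for row in reader:
            # Substitute $PWD explicitly rather than relying on the environment
            path = row['absolute-filepath'].replace('$PWD', base_dir)
            if not os.path.isfile(path):
                missing.append((row['sample-id'], path))
    return missing


# Demo with made-up files in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, 'present.fastq.gz'), 'w').close()
    demo_manifest = os.path.join(tmp, 'manifest.tsv')
    with open(demo_manifest, 'w') as handle:
        handle.write('sample-id\tabsolute-filepath\n'
                     's1\t$PWD/present.fastq.gz\n'
                     's2\t$PWD/absent.fastq.gz\n')
    demo_missing = missing_manifest_files(demo_manifest, base_dir=tmp)
print(demo_missing)
```

Run from the directory that holds the unzipped sequences, `missing_manifest_files('manifest.tsv')` should return an empty list if the download and unzip steps succeeded.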

**Import and Summarize Data**

In [None]:
! qiime tools import \
  --type "SampleData[SequencesWithQuality]" \
  --input-format SingleEndFastqManifestPhred33V2 \
  --input-path ./manifest.tsv \
  --output-path ./demux_seqs.qza

In [None]:
! qiime demux summarize \
  --i-data ./demux_seqs.qza \
  --o-visualization ./demux_seqs.qzv

In [None]:
q2.Visualization.load('demux_seqs.qzv')

## Denoising Sequence Data

 - DADA2 approach as outlined in the tutorial.
 - Alternate trimming w/ DADA2.
 - Using deblur w/ default trimming.

### Default

In [None]:
getcwd()

In [None]:
! qiime dada2 denoise-single \
    --i-demultiplexed-seqs ./demux_seqs.qza \
    --p-trunc-len 150 \
    --p-n-threads 8 \
    --o-table ./dada2_table.qza \
    --o-representative-sequences ./dada2_rep_set.qza \
    --o-denoising-stats ./dada2_stats.qza \
    --verbose

In [None]:
# summarize denoising stats
! qiime metadata tabulate \
    --m-input-file ./dada2_stats.qza  \
    --o-visualization ./dada2_stats.qzv

In [None]:
q2.Visualization.load('dada2_stats.qzv')

In [None]:
# summarize ESV table
! qiime feature-table summarize \
    --i-table ./dada2_table.qza \
    --m-sample-metadata-file ./metadata.tsv \
    --o-visualization ./dada2_table.qzv

In [None]:
q2.Visualization.load('dada2_table.qzv')

In [None]:
! qiime feature-table tabulate-seqs \
    --i-data ./dada2_rep_set.qza \
    --o-visualization ./dada2_rep_set.qzv

In [None]:
q2.Visualization.load('dada2_rep_set.qzv')

### Alternate Trimming w/ DADA2
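
To see what the positional parameters do to read length, the sketch below models only the trimming and truncation step: `--p-trunc-len` cuts each read at that position and discards reads that are shorter, while `--p-trim-left` removes bases from the 5' end. This is a simplified illustration, not DADA2's implementation (quality filtering and denoising are not modeled). `trim_and_truncate` is a hypothetical helper, and the reads are made up.

```python
def trim_and_truncate(read, trim_left=0, trunc_len=0):
    """Apply positional trimming only; return None if the read is discarded.

    Simplified model: trunc_len=0 means no truncation or length filtering.
    """
    if trunc_len:
        if len(read) < trunc_len:
            return None  # reads shorter than trunc_len are dropped
        read = read[:trunc_len]  # cut the 3' end at position trunc_len
    return read[trim_left:]  # remove trim_left bases from the 5' end


demo_read = 'ACGT' * 38  # made-up 152 nt read

# Settings from the default run: truncate at 150, no left trim
default_len = len(trim_and_truncate(demo_read, trunc_len=150))

# Settings from this subsection: trim 30 from the left, truncate at 130
alt_len = len(trim_and_truncate(demo_read, trim_left=30, trunc_len=130))

# A made-up 120 nt read is shorter than trunc_len=130, so it is discarded
short_result = trim_and_truncate('ACGT' * 30, trim_left=30, trunc_len=130)

print(default_len, alt_len, short_result)
```

Under this model, the alternate settings keep positions 31 through 130 of each read, so the retained sequences are 100 nt long rather than 150 nt.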

In [None]:
! qiime dada2 denoise-single \
    --i-demultiplexed-seqs ./demux_seqs.qza \
    --p-trim-left 30 \
    --p-trunc-len 130 \
    --o-table ./dada2_table_alt.qza \
    --o-representative-sequences ./dada2_rep_set_alt.qza \
    --o-denoising-stats ./dada2_stats_alt.qza \
    --verbose

In [None]:
# summarize denoising stats
! qiime metadata tabulate \
    --m-input-file ./dada2_stats_alt.qza  \
    --o-visualization ./dada2_stats_alt.qzv

In [None]:
q2.Visualization.load('dada2_stats_alt.qzv')

In [None]:
# summarize ESV table
! qiime feature-table summarize \
    --i-table ./dada2_table_alt.qza \
    --m-sample-metadata-file ./metadata.tsv \
    --o-visualization ./dada2_table_alt.qzv

In [None]:
q2.Visualization.load('dada2_table_alt.qzv')

### Deblur w/ Default Trimming

In [None]:
! qiime quality-filter q-score \
    --i-demux ./demux_seqs.qza \
    --o-filtered-sequences demux-seqs-deblur.qza \
    --o-filter-stats demux-deblur-stats.qza

In [None]:
# `denoise-16S` uses a Greengenes-based reference by default.
#    To use SILVA or another reference database, use
#    `qiime deblur denoise-other` instead.
#    SILVA files are available at: https://docs.qiime2.org/2021.11/data-resources/
! qiime deblur denoise-16S \
    --i-demultiplexed-seqs demux-seqs-deblur.qza \
    --p-trim-length 150 \
    --o-representative-sequences rep-seqs-deblur.qza \
    --o-table table-deblur.qza \
    --p-sample-stats \
    --o-stats deblur-stats.qza

In [None]:
! qiime metadata tabulate \
    --m-input-file demux-deblur-stats.qza \
    --o-visualization demux-deblur-stats.qzv

! qiime deblur visualize-stats \
    --i-deblur-stats deblur-stats.qza \
    --o-visualization deblur-stats.qzv

In [None]:
q2.Visualization.load('demux-deblur-stats.qzv')

In [None]:
q2.Visualization.load('deblur-stats.qzv')

In [None]:
! qiime feature-table summarize \
    --i-table table-deblur.qza \
    --o-visualization table-deblur.qzv \
    --m-sample-metadata-file metadata.tsv

! qiime feature-table tabulate-seqs \
    --i-data rep-seqs-deblur.qza \
    --o-visualization rep-seqs-deblur.qzv

In [None]:
q2.Visualization.load('table-deblur.qzv')

In [None]:
q2.Visualization.load('rep-seqs-deblur.qzv')