##### Steps for importing data:
- Download the SRA Toolkit.
- Prefetch the SRA files for patients P014, P020, P027, P044 and P061. Only patients with both pre- and post-treatment data were included; the amount of data also had to be limited because of processing power constraints. In a command shell: `prefetch SRR23801859 SRR23801959 SRR23801860 SRR23801941 SRR23801866 SRR23801999 SRR23801898 SRR23801944 SRR23801908 SRR23801935`
- Convert the same files to FASTQ with fastq-dump. In a command shell: `fastq-dump --outdir fastq --gzip --skip-technical --readids --read-filter pass --dumpbase --split-3 --clip`
- Import the data into JupyterHub.
- Save the data as a QZA artifact (the manifest file was written manually in Notepad).
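
The manifest from the last step could also be generated rather than typed by hand. The sketch below is one possible way to do that, not the method used in this notebook. It writes a tab-separated file with the three column headers that `PairedEndFastqManifestPhred33V2` expects. The function name `write_manifest`, the `fastq` directory and the `_pass_1.fastq.gz` / `_pass_2.fastq.gz` suffixes are assumptions; check them against the files that fastq-dump actually produced.

```python
import csv
from pathlib import Path


def write_manifest(fastq_dir, manifest_path,
                   fwd_suffix="_pass_1.fastq.gz",
                   rev_suffix="_pass_2.fastq.gz"):
    """Write a paired-end manifest (tab-separated) and return its rows.

    The suffixes are assumed file-name endings for forward and reverse
    reads; adjust them to match the real output of fastq-dump.
    """
    fastq_dir = Path(fastq_dir).resolve()  # the manifest needs absolute paths
    rows = []
    for fwd in sorted(fastq_dir.glob(f"*{fwd_suffix}")):
        # Everything before the suffix is used as the sample ID.
        sample_id = fwd.name[: -len(fwd_suffix)]
        rev = fastq_dir / f"{sample_id}{rev_suffix}"
        if not rev.exists():
            raise FileNotFoundError(f"missing reverse reads for {sample_id}")
        rows.append((sample_id, str(fwd), str(rev)))

    with open(manifest_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["sample-id",
                         "forward-absolute-filepath",
                         "reverse-absolute-filepath"])
        writer.writerows(rows)
    return rows


# Hypothetical usage, with paths that are assumptions:
# write_manifest("fastq", "../data/raw/manifest.csv")
```

Deriving the sample ID from the file name keeps the manifest in step with the files on disk, and the explicit check for the reverse file catches incomplete downloads before the import step.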

In [36]:
# importing all required packages & notebook extensions at the start of the notebook
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import qiime2 as q2
import scipy.stats as stats
from qiime2 import Artifact

# Suppress any warning messages generated by our code
import warnings
warnings.filterwarnings('ignore')

%matplotlib inline

In [8]:
raw_data_dir = "../data/raw"
data_dir = "../data/processed"
vis_dir  = "../results"

In [27]:
! qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path $raw_data_dir/manifest.csv \
  --input-format PairedEndFastqManifestPhred33V2 \
  --output-path $data_dir/demux-paired-end.qza

Imported ../data/raw/manifest.csv as PairedEndFastqManifestPhred33V2 to ../data/processed/demux-paired-end.qza
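
To confirm what the import wrote, the artifact can be inspected directly. A `.qza` file is a zip archive, and the sketch below reads its top-level `metadata.yaml` using only the standard library. The helper name `read_artifact_metadata` is hypothetical, and the archive layout (`<uuid>/metadata.yaml` holding simple `key: value` lines) is an assumption about how the artifact is laid out.

```python
import zipfile


def read_artifact_metadata(qza_path):
    """Return the top-level metadata.yaml of a .qza file as a dict of strings.

    Assumes the archive holds exactly one <uuid>/metadata.yaml made of
    simple "key: value" lines.
    """
    with zipfile.ZipFile(qza_path) as archive:
        # Keep only the metadata file directly under the top-level directory,
        # skipping any deeper metadata.yaml files (for example in provenance).
        candidates = [name for name in archive.namelist()
                      if name.count("/") == 1
                      and name.endswith("/metadata.yaml")]
        if len(candidates) != 1:
            raise ValueError("expected exactly one top-level metadata.yaml")
        text = archive.read(candidates[0]).decode("utf-8")

    metadata = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip().strip("'\"")
    return metadata


# Hypothetical usage:
# read_artifact_metadata("../data/processed/demux-paired-end.qza")
```

Within a QIIME 2 environment, `Artifact.load(path).type` is the more direct way to check the same thing; the zip-based version is only useful where `qiime2` cannot be imported.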

In [39]:
! qiime assembly assemble-megahit \
    --i-seqs  /data/w9_data/reads.qza \
    --p-presets meta-sensitive \
    --p-num-cpu-threads 3 \
    --o-contigs $data_dir/contigs.qza


Usage: qiime assembly assemble-megahit [OPTIONS]

  This method uses MEGAHIT to assemble provided paired- or single-end NGS
  reads into contigs.

Inputs:
  --i-seqs ARTIFACT SampleData[SequencesWithQuality |
    PairedEndSequencesWithQuality]
                          The paired- or single-end sequences to be
                          assembled.                                [required]
Parameters:
  --p-presets TEXT Choices('meta', 'meta-sensitive', 'meta-large',
    'disabled')           Override a group of parameters. See the megahit
                          documentation for details.                [optional]
  --p-min-count INTEGER   Minimum multiplicity for filtering (k_min+1)-mers.
    Range(1, None)                                                [default: 2]
  --p-k-list INTEGERS... Range(15, 255, inclusive_end=True)
               