#### Adam Klie<br>04/08/2020
## Process downloaded data into the necessary QIIME 2 artifacts
Use a Bash kernel.<br>
Turns the downloaded BIOM table into the QIIME 2 artifacts needed for analysis.<br>
Requires the BIOM table downloaded by the download_data.ipynb notebook.<br>
Requires the filtered-metadata table produced by the filter_data.ipynb notebook.

In [1]:
# Navigate to desired directory
cd /Users/adamklie/Desktop/rotations/knight_lab/projects/2020_04_06_PA_microbiome/data/test_subset
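
Before importing, it can help to confirm that both required inputs are present in the working directory. The following is a minimal sketch: `require_files` is a hypothetical helper (not part of QIIME 2), and the file names are the ones used in the import and filter cells of this notebook.

```shell
# Hypothetical helper: report any missing input files and return non-zero
# if at least one of them is absent.
require_files() {
    local missing=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "Missing input: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# File names taken from the import and filter cells in this notebook.
if require_files exercise_samples.biom filtered-metadata.tsv; then
    echo "All inputs found"
else
    echo "Run download_data.ipynb and filter_data.ipynb first"
fi
```

Failing early here avoids a less obvious error from `qiime tools import` further down.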

### Generate FeatureTable[Frequency] artifacts from BIOM table

In [3]:
# Generate FeatureTable[Frequency] artifact
qiime tools import \
    --input-path exercise_samples.biom \
    --type 'FeatureTable[Frequency]' \
    --input-format BIOMV210Format \
    --output-path ./table.qza

# Keep only the samples listed in filtered-metadata.tsv (output of filter_data.ipynb)
qiime feature-table filter-samples \
  --i-table table.qza \
  --m-metadata-file filtered-metadata.tsv \
  --o-filtered-table filtered-table.qza

Imported exercise_samples.biom as BIOMV210Format to ./table.qza
Saved FeatureTable[Frequency] to: filtered-table.qza


### Generate FeatureData[Sequence] artifacts from BIOM table

In [4]:
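# Generate FeatureData[Sequence] artifact
# Assumes the feature IDs are the sequences themselves: each ID is written
# as both the FASTA header and the sequence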
biom summarize-table \
    --observations -i exercise_samples.biom \
    | tail -n +16 | awk -F ':' '{print ">"$1"\n"$1}' \
    > rep_seqs.fna
    
qiime tools import \
    --input-path rep_seqs.fna \
    --output-path rep-seqs.qza \
    --type 'FeatureData[Sequence]'

# Keep only the sequences for features present in filtered-table.qza
qiime feature-table filter-seqs \
  --i-data rep-seqs.qza \
  --i-table filtered-table.qza \
  --o-filtered-data filtered-rep-seqs.qza

Imported rep_seqs.fna as DNASequencesDirectoryFormat to rep-seqs.qza
Saved FeatureData[Sequence] to: filtered-rep-seqs.qza
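
To see what the `tail | awk` step in the cell above does, here is a small sketch on toy input. The toy summary is an assumption: it mimics the shape implied by `tail -n +16` (15 header lines followed by `<feature id>: <count>` rows) rather than reproducing real `biom summarize-table` output, and the two sequences are made up.

```shell
# Toy stand-in for `biom summarize-table --observations` output:
# 15 placeholder header lines, then "<feature id>: <count>" rows.
make_toy_summary() {
    i=1
    while [ "$i" -le 15 ]; do
        echo "header line $i"
        i=$((i + 1))
    done
    echo "ACGTACGT: 12.0"
    echo "TTGACCAA: 3.0"
}

# Same pipeline as in the notebook: drop the header, then emit each
# feature ID twice, once as the FASTA header and once as the sequence.
fasta=$(make_toy_summary | tail -n +16 | awk -F ':' '{print ">"$1"\n"$1}')
echo "$fasta"
```

Each data row becomes a two-line FASTA record, so the record count in `rep_seqs.fna` should match the number of features in the BIOM table.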


### Generate a tree with fragment insertion

In [5]:
wget https://data.qiime2.org/2020.2/common/sepp-refs-gg-13-8.qza

--2020-04-23 10:12:55--  https://data.qiime2.org/2020.2/common/sepp-refs-gg-13-8.qza
Resolving data.qiime2.org (data.qiime2.org)... 52.35.38.247
Connecting to data.qiime2.org (data.qiime2.org)|52.35.38.247|:443... connected.
HTTP request sent, awaiting response... 302 FOUND
Location: https://s3-us-west-2.amazonaws.com/qiime2-data/2020.2/common/sepp-refs-gg-13-8.qza [following]
--2020-04-23 10:12:55--  https://s3-us-west-2.amazonaws.com/qiime2-data/2020.2/common/sepp-refs-gg-13-8.qza
Resolving s3-us-west-2.amazonaws.com (s3-us-west-2.amazonaws.com)... 52.218.253.8
Connecting to s3-us-west-2.amazonaws.com (s3-us-west-2.amazonaws.com)|52.218.253.8|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 50161069 (48M) [binary/octet-stream]
Saving to: ‘sepp-refs-gg-13-8.qza’


2020-04-23 10:12:59 (13.2 MB/s) - ‘sepp-refs-gg-13-8.qza’ saved [50161069/50161069]
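
Re-running the cell above downloads the 48M reference again each time. One possible guard, sketched below, skips the download when the file already exists with the expected size. `needs_download` is a hypothetical helper, and the byte count is the `Length` value reported in the wget log above; the actual `wget` call is left commented out so the sketch is a dry run.

```shell
# Hypothetical helper: succeed (exit 0) when the file still needs fetching,
# i.e. it is missing or its size differs from the expected byte count.
needs_download() {
    local file=$1 expected_bytes=$2
    [ -f "$file" ] || return 0
    local actual_bytes
    actual_bytes=$(wc -c < "$file" | tr -d ' ')
    [ "$actual_bytes" -ne "$expected_bytes" ]
}

ref=sepp-refs-gg-13-8.qza
# 50161069 is the Length reported in the wget log above.
if needs_download "$ref" 50161069; then
    echo "Would fetch: https://data.qiime2.org/2020.2/common/$ref"
    # wget "https://data.qiime2.org/2020.2/common/$ref"
else
    echo "$ref already present with the expected size"
fi
```

A size check only catches truncated downloads; it does not verify the file contents.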



In [6]:
qiime fragment-insertion sepp \
    --i-representative-sequences filtered-rep-seqs.qza \
    --i-reference-database sepp-refs-gg-13-8.qza \
    --p-threads 4 \
    --o-tree insertion-tree.qza \
    --o-placements insertion-placements.qza

Saved Phylogeny[Rooted] to: insertion-tree.qza
Saved Placements to: insertion-placements.qza


In [7]:
qiime fragment-insertion filter-features \
    --i-table filtered-table.qza \
    --i-tree insertion-tree.qza \
    --o-filtered-table insertion-filtered-table.qza \
    --o-removed-table insertion-removed-table.qza

Saved FeatureTable[Frequency] to: insertion-filtered-table.qza
Saved FeatureTable[Frequency] to: insertion-removed-table.qza


### Generate a de novo tree with MAFFT

In [8]:
qiime phylogeny align-to-tree-mafft-fasttree \
  --i-sequences filtered-rep-seqs.qza \
  --o-alignment aligned-rep-seqs.qza \
  --o-masked-alignment masked-aligned-rep-seqs.qza \
  --o-tree unrooted-tree.qza \
  --o-rooted-tree rooted-tree.qza

Saved FeatureData[AlignedSequence] to: aligned-rep-seqs.qza
Saved FeatureData[AlignedSequence] to: masked-aligned-rep-seqs.qza
Saved Phylogeny[Unrooted] to: unrooted-tree.qza
Saved Phylogeny[Rooted] to: rooted-tree.qza


In [9]:
qiime phylogeny filter-table \
    --i-table filtered-table.qza \
    --i-tree rooted-tree.qza \
    --o-filtered-table rooted-tree-table.qza

Saved FeatureTable[Frequency] to: rooted-tree-table.qza


## Generate classifier and assign taxonomy
Consider following the tutorial here: https://forum.qiime2.org/t/using-q2-clawback-to-assemble-taxonomic-weights/5859<br>
Class weights and reference data downloaded from: https://github.com/BenKaehler/readytowear/blob/master/inventory.tsv

In [None]:
#wget -O $data_dir'human-stool.qza' \
#https://github.com/BenKaehler/readytowear/raw/master/data/gg_13_8/515f-806r/human-stool.qza

In [None]:
#wget -O $data_dir'ref-seqs-v4.qza' \
#https://github.com/BenKaehler/readytowear/raw/master/data/gg_13_8/515f-806r/ref-seqs-v4.qza

In [None]:
#wget -O $data_dir'ref-tax.qza' \
#https://github.com/BenKaehler/readytowear/raw/master/data/gg_13_8/515f-806r/ref-tax.qza
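
The three commented-out downloads above share one base URL and rely on a `data_dir` variable that this notebook never sets. A minimal sketch of how they could be driven from a loop follows. It assumes `data_dir` is a directory path ending in `/` (the `./` default is a placeholder), `base_url` is a name introduced here for illustration, and the commands are only printed rather than executed.

```shell
# Assumption: data_dir is a directory path ending in "/", as the
# commented-out wget calls imply; "./" is a placeholder default.
data_dir=${data_dir:-./}
base_url=https://github.com/BenKaehler/readytowear/raw/master/data/gg_13_8/515f-806r

for artifact in human-stool.qza ref-seqs-v4.qza ref-tax.qza; do
    # Dry run: print each command instead of executing it.
    echo wget -O "${data_dir}${artifact}" "${base_url}/${artifact}"
done
```

Removing the `echo` turns the dry run into the actual downloads.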

In [2]:
qiime feature-classifier fit-classifier-naive-bayes \
    --i-reference-reads ref-seqs-v4.qza \
    --i-reference-taxonomy ref-tax.qza \
    --i-class-weight human-stool.qza \
    --o-classifier gg138_v4_human-stool_classifier.qza

In [3]:
qiime feature-classifier classify-sklearn \
    --i-reads filtered-rep-seqs.qza \
    --i-classifier gg138_v4_human-stool_classifier.qza \
    --o-classification bespoke-taxonomy.qza

Saved FeatureData[Taxonomy] to: bespoke-taxonomy.qza
