# QIIME 2 enables comprehensive end-to-end analysis of diverse microbiome data and comparative studies with publicly available data

This is a QIIME 2 Artifact CLI notebook that replicates the analyses in the QIIME 2 protocol.

**environment:** qiime2-2020.2

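Before running the cells below, it can help to confirm that the expected environment is active. The following is a minimal sketch, assuming the environment was created with conda (which sets `CONDA_DEFAULT_ENV` on activation) and is named exactly `qiime2-2020.2` as stated above; `check_env` is a hypothetical helper name, not part of QIIME 2.

```shell
# Hypothetical helper: compare the active conda environment with the expected one.
# It only prints a message and never aborts the notebook.
check_env() {
  expected="$1"
  current="${CONDA_DEFAULT_ENV:-none}"   # set by `conda activate`
  if [ "$current" = "$expected" ]; then
    echo "ok: $current"
  else
    echo "warning: active environment is '$current', expected '$expected'"
  fi
}

check_env "qiime2-2020.2"

# Show the installed QIIME 2 version only if the CLI is on PATH.
if command -v qiime >/dev/null 2>&1; then
  qiime --version
fi
```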
In [None]:
pip install --upgrade c-lasso

In [None]:
pip install c-lasso

pip install zarr

pip install plotly

In [6]:
cd ..

In [66]:
ls

ccovariates.qza			rep-seqs.qzv
classify-predictions.qza	sample-metadata-complete.tsv
classify-xtest.qza		table.qza
classify-xtraining.qza		table.qzv
classifytaxa.qza		taxonomy.qza
classifytaxa.qzv		taxonomy.qzv
features-test.qza		wcovariates.qza
filtered-table.qza		wtaxa.qza
genus_table.qza			xclr.qza
genus_table_clr.qza		xcovariates.qza
regress-predictions.qza		xtaxa.qza
regresstaxa.qza			xtest.qza
regresstaxa.qzv			xtraining.qza
rep-seqs.qza


In [67]:
cd GitHub/q2-classo/

bash: cd: GitHub/q2-classo/: No such file or directory



In [98]:
python setup.py install

pip install -e .

running install
running bdist_egg
running egg_info
writing q2_classo.egg-info/PKG-INFO
writing dependency_links to q2_classo.egg-info/dependency_links.txt
writing entry points to q2_classo.egg-info/entry_points.txt
writing top-level names to q2_classo.egg-info/top_level.txt
reading manifest file 'q2_classo.egg-info/SOURCES.txt'
writing manifest file 'q2_classo.egg-info/SOURCES.txt'
installing library code to build/bdist.macosx-10.9-x86_64/egg
running install_lib
running build_py
copying q2_classo/_func.py -> build/lib/q2_classo
copying q2_classo/plugin_setup.py -> build/lib/q2_classo
copying q2_classo/_tree.py -> build/lib/q2_classo
copying q2_classo/_summarize/_visualizer.py -> build/lib/q2_classo/_summarize
copying q2_classo/_summarize/assets/cv.html -> build/lib/q2_classo/_summarize/assets
copying q2_classo/_summarize/assets/lam-fixed.html -> build/lib/q2_classo/_summarize/assets
copying q2_classo/_summarize/assets/stabsel.html -> build/lib/q2_classo/_summarize/assets
creating build

In [122]:
qiime dev refresh-cache

QIIME is caching your current deployment for improved performance. This may take a few moments and should only happen once per deployment.

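After refreshing the cache, one way to confirm that the plugin was picked up is to ask the CLI for the plugin's help page. This is a small sketch: it assumes the plugin is exposed under the `classo` command name used in the cells of this notebook, and `classo_status` is just an illustrative variable.

```shell
# Report whether the q2-classo plugin is visible to the QIIME 2 CLI.
# Both checks are silent; only the final status line is printed.
if command -v qiime >/dev/null 2>&1 && qiime classo --help >/dev/null 2>&1; then
  classo_status="available"
else
  classo_status="missing"
fi
echo "q2-classo plugin: $classo_status"
```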

In [None]:
cd example/Data

## Filter features

In [69]:
qiime feature-table filter-features \
  --i-table table.qza \
  --p-min-samples 20 \
  --o-filtered-table filtered-table.qza

Saved FeatureTable[Frequency] to: filtered-table.qza

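To illustrate what `--p-min-samples` does, the sketch below applies the same rule to a toy tab-separated table: a feature is kept only if it is observed (count greater than zero) in at least the given number of samples. The toy data and the `filter_min_samples` helper are made up for illustration and do not use QIIME 2 itself.

```shell
# Toy feature table: one feature per row, one sample per column (tab-separated).
toy_table="$(mktemp)"
printf 'feature\ts1\ts2\ts3\n' >  "$toy_table"
printf 'f1\t5\t0\t2\n'         >> "$toy_table"
printf 'f2\t0\t0\t7\n'         >> "$toy_table"
printf 'f3\t1\t3\t4\n'         >> "$toy_table"

# Keep the header and every feature observed in at least `min` samples.
# Usage: filter_min_samples TABLE MIN
filter_min_samples() {
  awk -F '\t' -v min="$2" '
    NR == 1 { print; next }
    { n = 0; for (i = 2; i <= NF; i++) if ($i > 0) n++; if (n >= min) print }
  ' "$1"
}

# With a threshold of 2, f2 (seen in one sample only) is dropped.
filter_min_samples "$toy_table" 2
```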

## Log-contrast and taxa processing

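There are two ways to prepare the features. The 'easy way', shown in the next cell, is to collapse the table at the genus level (level 6) before the CLR transform, but this is not really what we want. The cell after it instead transforms the filtered table directly and attaches the taxonomy with `add-taxa`.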

In [None]:
qiime taxa collapse --i-table table.qza \
  --i-taxonomy taxonomy.qza \
  --p-level 6 \
  --o-collapsed-table genus_table.qza

qiime classo transform-features \
     --p-transformation clr \
     --p-coef 0.5 \
     --i-features genus_table.qza \
     --o-x genus_table_clr

In [70]:
qiime classo transform-features \
     --p-transformation clr \
     --p-coef 0.5 \
     --i-features filtered-table.qza \
     --o-x xclr

qiime classo add-taxa \
     --i-features xclr.qza \
     --i-taxa taxonomy.qza \
     --o-x xtaxa \
     --o-aweights wtaxa

Saved FeatureTable[Design] to: xclr.qza
Saved FeatureTable[Design] to: xtaxa.qza
Saved Weights to: wtaxa.qza

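As a rough illustration of the centered log-ratio (CLR) transform, the sketch below computes it for a single sample: each count is log-transformed and the mean of the logs is subtracted, so the transformed values sum to zero. It assumes that `--p-coef 0.5` is the value substituted for zero counts before taking logs; the plugin's exact handling may differ, and `clr_row` is a hypothetical helper.

```shell
# clr_row PSEUDO COUNT... : print the CLR of one sample, one value per line.
# Zero counts are replaced by PSEUDO before the natural log is taken (assumption).
clr_row() {
  pseudo="$1"; shift
  printf '%s\n' "$@" | awk -v pseudo="$pseudo" '
    { x = ($1 == 0) ? pseudo : $1; l[NR] = log(x); s += l[NR] }
    END { m = s / NR; for (i = 1; i <= NR; i++) printf "%.4f\n", l[i] - m }
  '
}

# Four features in one sample, one of them unobserved.
clr_row 0.5 10 0 4 1
```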

## Add covariates

In [71]:
qiime classo add-covariates \
    --i-features xtaxa.qza \
    --i-weights wtaxa.qza \
    --m-covariates-file sample-metadata-complete.tsv \
    --p-to-add host_sexual_orientation host_age host_body_mass_index ethnicity \
    --p-w-to-add 0.1 1. 1. 0.1 \
    --o-new-features xcovariates \
    --o-new-c ccovariates \
    --o-new-w wcovariates

Saved FeatureTable[Design] to: xcovariates.qza
Saved ConstraintMatrix to: ccovariates.qza
Saved Weights to: wcovariates.qza

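Since `add-covariates` reads the covariates by column name, it can be useful to check beforehand that those columns exist in the metadata file. The sketch below does this with standard shell tools; it assumes the metadata file is tab-separated with the column names on its first line, and `has_column` is a hypothetical helper.

```shell
# has_column FILE NAME : succeed if NAME is one of the tab-separated header fields.
has_column() {
  head -n 1 "$1" | tr '\t' '\n' | grep -qxF -- "$2"
}

# Report any covariate used above that is absent from the metadata header.
metadata="sample-metadata-complete.tsv"
if [ -f "$metadata" ]; then
  for col in host_sexual_orientation host_age host_body_mass_index ethnicity; do
    has_column "$metadata" "$col" || echo "missing column: $col"
  done
fi
```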

## Split table

Split the data into training and test sets:

In [72]:
qiime sample-classifier split-table \
	--i-table xcovariates.qza \
	--m-metadata-file sample-metadata-complete.tsv \
	--m-metadata-column sCD14  \
	--p-test-size 0.2 \
	--p-random-state 42 \
	--p-stratify False \
	--o-training-table regress-xtraining \
	--o-test-table regress-xtest

Saved FeatureTable[Design] to: regress-xtraining.qza
Saved FeatureTable[Design] to: regress-xtest.qza


In [73]:
qiime sample-classifier split-table \
	--i-table xcovariates.qza \
	--m-metadata-file sample-metadata-complete.tsv \
	--m-metadata-column HIV_serostatus  \
	--p-test-size 0.2 \
	--p-random-state 42 \
	--p-stratify False \
	--o-training-table classify-xtraining \
	--o-test-table classify-xtest

Saved FeatureTable[Design] to: classify-xtraining.qza
Saved FeatureTable[Design] to: classify-xtest.qza


## Regression task 

Apply classo to the training set to solve the linear regression problem:

In [88]:
qiime classo regress  \
    --i-features regress-xtraining.qza \
    --i-c ccovariates.qza \
    --i-weights wcovariates.qza \
    --m-y-file sample-metadata-complete.tsv \
    --m-y-column sCD14  \
    --p-concomitant \
    --p-stabsel \
    --p-cv \
    --p-path \
    --p-lamfixed \
    --p-stabsel-threshold 0.5 \
    --p-cv-seed 1 \
    --p-no-cv-one-se \
    --o-result regresstaxa

Saved CLASSOProblem to: regresstaxa.qza

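For orientation, the generic constrained sparse regression (log-contrast) problem behind this step can be written as below. This is a sketch of the standard formulation, not a statement of exactly what the plugin solves with these flags. The notation is an assumption introduced here: $X$ is the training design matrix (`regress-xtraining.qza`), $y$ the `sCD14` column, $C$ the constraint matrix (`ccovariates.qza`), and $\lambda$ the regularization parameter. The per-feature weights in `wcovariates.qza` and the effect of `--p-concomitant` are not shown.

```latex
\hat{\beta} \;=\; \arg\min_{\beta \in \mathbb{R}^{d}}
  \; \lVert X\beta - y \rVert_2^2 \;+\; \lambda \lVert \beta \rVert_1
\quad \text{subject to} \quad C\beta = 0
```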

## Classification task

In [126]:
qiime classo classify  \
    --i-features classify-xtraining.qza \
    --i-c ccovariates.qza \
    --i-weights wcovariates.qza \
    --m-y-file sample-metadata-complete.tsv \
    --m-y-column HIV_serostatus  \
    --p-huber \
    --p-stabsel \
    --p-cv \
    --p-path \
    --p-lamfixed \
    --p-stabsel-threshold 0.5 \
    --p-cv-seed 42 \
    --p-no-cv-one-se \
    --o-result classifytaxa

Saved CLASSOProblem to: classifytaxa.qza


## Prediction 

In [106]:
qiime classo predict \
    --i-features regress-xtest.qza \
    --i-problem regresstaxa.qza \
    --o-predictions regress-predictions.qza

Saved CLASSOProblem to: regress-predictions.qza


In [127]:
qiime classo predict \
    --i-features classify-xtest.qza \
    --i-problem classifytaxa.qza \
    --o-predictions classify-predictions.qza

Saved CLASSOProblem to: classify-predictions.qza

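To look at the prediction artifacts outside of QIIME 2, their payload can be exported with `qiime tools export`. The following is a sketch only: the output directory names are made up, the layout of the exported files for this artifact type is not documented here, and `export_artifact` is a hypothetical wrapper that skips missing inputs instead of failing.

```shell
# export_artifact ARTIFACT OUTDIR : export an artifact's data if the file and CLI exist.
export_artifact() {
  artifact="$1"; outdir="$2"
  if [ ! -f "$artifact" ]; then
    echo "skip: $artifact not found"; return 1
  fi
  if ! command -v qiime >/dev/null 2>&1; then
    echo "skip: qiime not on PATH"; return 1
  fi
  qiime tools export --input-path "$artifact" --output-path "$outdir"
}

# Output directory names below are illustrative.
export_artifact regress-predictions.qza exported-regress-predictions || true
export_artifact classify-predictions.qza exported-classify-predictions || true
```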

## Visualization

In [92]:
qiime classo summarize \
  --i-problem regresstaxa.qza \
  --i-taxa taxonomy.qza \
  --i-predictions regress-predictions.qza \
  --o-visualization regresstaxa.qzv

Saved Visualization to: regresstaxa.qzv


In [128]:
qiime classo summarize \
  --i-problem classifytaxa.qza \
  --i-taxa taxonomy.qza \
  --i-predictions classify-predictions.qza \
  --o-visualization classifytaxa.qzv \
  --verbose

Saved Visualization to: classifytaxa.qzv


In [94]:
qiime tools view regresstaxa.qzv

Press the 'q' key, Control-C, or Control-D to quit. This view may no longer be accessible or work correctly after quitting.

In [129]:
qiime tools view classifytaxa.qzv

Press the 'q' key, Control-C, or Control-D to quit. This view may no longer be accessible or work correctly after quitting.

Alternatively, drag and drop regresstaxa.qzv or classifytaxa.qzv onto https://view.qiime2.org.
The viewer also shows the provenance of the artifact, that is, the workflow that produced it.
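
The recorded workflow can also be inspected locally, because `.qza` and `.qzv` files are zip archives that include a `provenance` directory. The sketch below lists those entries; it assumes `unzip` is installed, and `list_provenance` is a hypothetical helper.

```shell
# list_provenance FILE : list the provenance entries stored inside a .qza/.qzv archive.
list_provenance() {
  [ -f "$1" ] || { echo "skip: $1 not found"; return 1; }
  command -v unzip >/dev/null 2>&1 || { echo "skip: unzip not installed"; return 1; }
  unzip -l "$1" | grep '/provenance/'
}

list_provenance regresstaxa.qzv || true
list_provenance classifytaxa.qzv || true
```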