# QIIME2: Terminal Run (Database Process)

<!--
**Can be run in both notebook**, in order to achieve shorter process time,
- If the Step 10 of [database process](Terminal_Run_DB_Process.ipynb) is not **complete**, **this step should be in [the main process](Terminal_Run_Main_Process.ipynb#8)**.
- If another step is run in [the main process](Terminal_Run_Main_Process.ipynb) (e.g. visualization) and the database process has already run successfully, **this step should be run in here**.
-->
---
Terminal (shell command-line) code for QIIME2

This pipeline consists of 3 files (3 processes):
1. [Main Process](Terminal_Run_Main_Process.ipynb)
2. [Database Process](Terminal_Run_DB_Process.ipynb)
3. [Common (Final) Process](Terminal_Run_Common_Process.ipynb)

**To reduce running time, [all processes](#) should be run in parallel.**
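
One way to run processes in parallel from a single shell is to start each one as a background job and wait for all of them. The sketch below is only an illustration and not part of the original pipeline: `run_parallel` is a hypothetical helper, and the commented `jupyter nbconvert --execute` calls assume the notebooks are executed non-interactively. Which processes can safely overlap depends on their inputs, so the example only starts the Main and Database processes together.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run every argument as its own shell command in the
# background, wait for all of them, and return non-zero if any command failed.
run_parallel() {
    local pids=() pid cmd status=0
    for cmd in "$@"; do
        bash -c "$cmd" &
        pids+=("$!")
    done
    for pid in "${pids[@]}"; do
        wait "$pid" || status=1
    done
    return "$status"
}

# Example (assumes Jupyter is installed and the notebooks are in the current directory):
# run_parallel \
#     "jupyter nbconvert --to notebook --execute Terminal_Run_Main_Process.ipynb" \
#     "jupyter nbconvert --to notebook --execute Terminal_Run_DB_Process.ipynb"
```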

- QIIME2 documentation: https://docs.qiime2.org/2019.7/tutorials/overview
- Official website: https://qiime2.org

## Steps of the 16S Analysis
<img src="step.png"/>

# 0. Set Variables
- The variables should be set in [all processes](#).
- To reduce running time, [all processes](#) should be run in parallel.

In [None]:
%env IN_SEQ_DB=/mnt/d/qiime2/Silva_132_release/SILVA_132_QIIME_release/rep_set/rep_set_16S_only/99/silva_132_99_16S.fna
%env IN_TAXONOMY=/mnt/d/qiime2/Silva_132_release/SILVA_132_QIIME_release/taxonomy/16S_only/99/consensus_taxonomy_7_levels.txt

%env PRIMER_FRW=CCTACGGGNGGCWGCAG
%env PRIMER_REV=GACTACHVGGGTATCTAATCC

%env CPU_CORE=32
%env CPU_THREAD=0
%env CPU_JOB=-1
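
Before starting the long-running steps, it can help to confirm that the primer values contain only IUPAC nucleotide codes, since stray characters (for example quote characters or whitespace) would end up in the primer passed to QIIME2. The following is a minimal sketch; `is_iupac_dna` is a hypothetical helper name, and the two primers are copied from the variables above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the argument is a non-empty string
# made up of IUPAC nucleotide codes (A, C, G, T plus the degenerate codes).
is_iupac_dna() {
    [[ "$1" =~ ^[ACGTRYSWKMBDHVN]+$ ]]
}

# Primers copied from the variables above; inside the notebook,
# "$PRIMER_FRW" and "$PRIMER_REV" could be checked instead.
for primer in "CCTACGGGNGGCWGCAG" "GACTACHVGGGTATCTAATCC"; do
    if is_iupac_dna "$primer"; then
        echo "OK: $primer"
    else
        echo "Invalid primer: $primer" >&2
    fi
done
```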

# 1 - 5. The Main Process
Run them in [Main Process](Terminal_Run_Main_Process.ipynb)

---

# 6. Import Database
- We use the SILVA database.
- Other databases are available at https://docs.qiime2.org/2019.4/data-resources
- Read more at https://docs.qiime2.org/2019.4/tutorials/feature-classifier

In [None]:
%%bash
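# Import the reference sequences (FASTA) as a QIIME2 artifact.
# Note: the output name says 97, but IN_SEQ_DB points to the 99% SILVA set.
qiime tools import \
--input-path "$IN_SEQ_DB" \
--output-path 06.1.raw_97.qza \
--type 'FeatureData[Sequence]'

# Import the matching taxonomy (tab-separated, no header line).
qiime tools import \
--type 'FeatureData[Taxonomy]' \
--input-format HeaderlessTSVTaxonomyFormat \
--input-path "$IN_TAXONOMY" \
--output-path 06.2.ref-taxonomy.qza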
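
The two imports are only useful together if the sequence IDs in the FASTA file also appear in the taxonomy file. As a rough pre-check on the raw input files, the sketch below lists FASTA IDs that have no taxonomy row. `missing_taxonomy_ids` is a hypothetical helper, and it assumes the taxonomy file is a headerless TSV whose first column is the sequence ID.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every FASTA record ID that has no row in the
# taxonomy file.
#   $1 = reference FASTA, $2 = headerless TSV (ID<TAB>taxonomy)
missing_taxonomy_ids() {
    awk -F'\t' '
        # First file given to awk (the taxonomy): remember each ID.
        FILENAME == ARGV[1] { seen[$1] = 1; next }
        # Second file (the FASTA): the ID is the header text after ">"
        # up to the first space or tab.
        /^>/ {
            id = substr($1, 2)
            sub(/ .*/, "", id)
            if (!(id in seen)) print id
        }
    ' "$2" "$1"
}

# Example with the variables of this notebook:
# missing_taxonomy_ids "$IN_SEQ_DB" "$IN_TAXONOMY" | head
```

An empty result suggests that every reference sequence has a taxonomy assignment; any printed ID is worth inspecting before training.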

# 7. Extract Reference Reads

In [None]:
%%bash
# Trim the reference sequences to the region amplified by the primer pair.
qiime feature-classifier extract-reads \
--i-sequences 06.1.raw_97.qza \
--p-f-primer "$PRIMER_FRW" \
--p-r-primer "$PRIMER_REV" \
--o-reads 06.3.ref-seqs.qza

# 8 - 9 and 11. The Common Process  <a name="8"/>

Run them in [Common Final Process](Terminal_Run_Common_Process.ipynb)

# 10. Train Naive Bayes Classifier
**Memory-intensive step** (about 16 GB of RAM for the SILVA database)

In [None]:
%%bash
# Train the classifier on the extracted reference reads and their taxonomy.
qiime feature-classifier fit-classifier-naive-bayes \
--i-reference-reads 06.3.ref-seqs.qza \
--i-reference-taxonomy 06.2.ref-taxonomy.qza \
--verbose \
--o-classifier 09.classifier.qza
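
Because training needs roughly 16 GB of RAM for SILVA, it may be worth checking the available memory before starting. The sketch below is an assumption-laden illustration: `available_gib` and `has_enough_memory` are hypothetical helpers, and they read the `MemAvailable` field of `/proc/meminfo`, which exists on Linux (including WSL) but not on every platform.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print available memory in whole GiB from a
# meminfo-style file (default: /proc/meminfo, values are in kB).
available_gib() {
    awk '/^MemAvailable:/ { print int($2 / 1024 / 1024) }' "${1:-/proc/meminfo}"
}

# Hypothetical helper: succeed if at least $1 GiB are available.
# An optional second argument names the meminfo-style file to read.
has_enough_memory() {
    local required_gib=$1 avail
    avail=$(available_gib "${2:-/proc/meminfo}" 2>/dev/null)
    [ -n "$avail" ] && [ "$avail" -ge "$required_gib" ]
}

# 16 is the approximate figure quoted above for the SILVA database.
if has_enough_memory 16; then
    echo "Enough memory appears to be available for training."
else
    echo "Warning: less than 16 GiB available, or memory could not be determined." >&2
fi
```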