# QIIME 2

A pipeline for processing Illumina data, following the ["Moving Pictures" tutorial for QIIME 2](https://docs.qiime2.org/2023.5/tutorials/moving-pictures/).

We use the same raw data, but re-format the metadata file. The metadata file must:
* be tab-separated (not comma-separated)
* contain a `SampleId` column with the names of the files (these become the sample names)
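
If your metadata starts out comma-separated, a conversion along these lines may help. This is only a minimal sketch using the Python standard library: the function name `csv_to_tsv` and the example values (`S1`, `wheat`, and so on) are made up for illustration, and moving `SampleId` to the first position is my assumption based on QIIME 2 treating the first column as the identifier column.

```python
import csv
import io


def csv_to_tsv(csv_text, id_column="SampleId"):
    """Convert comma-separated metadata text to tab-separated text.

    The identifier column is moved to the first position.
    Raises ValueError if the identifier column is missing.
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if not rows or id_column not in rows[0]:
        raise ValueError(f"metadata has no {id_column!r} column")
    idx = rows[0].index(id_column)
    # Put the identifier column first, keep the other columns in order.
    reordered = [[row[idx]] + row[:idx] + row[idx + 1:] for row in rows]
    out = io.StringIO()
    csv.writer(out, delimiter="\t", lineterminator="\n").writerows(reordered)
    return out.getvalue()


# Hypothetical two-sample metadata table
example = "MainPlant,SampleId\nwheat,S1\nrye,S2\n"
print(csv_to_tsv(example))
```

Write the returned text to the file you later pass to `--m-metadata-file` or `--m-sample-metadata-file`.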

Install QIIME 2 as a `conda` environment and activate it. The installation itself is described [here](https://docs.qiime2.org/2023.5/install/). If you are going to use Google Colab, [this gist](https://gist.github.com/cdiener/be2e4caded84dcdcba8b2b0c0e0771d3) might be useful.

## Import data

Import raw reads
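
Before importing, it can be worth checking that the files in `raw` follow the Casava 1.8 naming scheme that `CasavaOneEightSingleLanePerSampleDirFmt` expects (for example `L2S357_15_L001_R1_001.fastq.gz`). The sketch below is my own helper, not part of QIIME 2. The regular expression follows the pattern given in the QIIME 2 importing documentation as I understand it, and taking the sample ID as everything before the last four underscore-separated fields is an assumption about how the importer derives it.

```python
import re

# Assumed Casava 1.8 pattern: <sample>_<barcode>_L<lane>_R<read>_001.fastq.gz
CASAVA_NAME = re.compile(r".+_.+_L[0-9]{3}_R[12]_001\.fastq\.gz")


def split_casava_names(filenames):
    """Return (sorted sample IDs, list of names that do not match the pattern)."""
    sample_ids, bad_names = set(), []
    for name in filenames:
        if CASAVA_NAME.fullmatch(name):
            # Drop barcode, lane, read number and the "001.fastq.gz" suffix.
            sample_ids.add(name.rsplit("_", 4)[0])
        else:
            bad_names.append(name)
    return sorted(sample_ids), bad_names


# Hypothetical directory listing; in practice use os.listdir("raw")
names = [
    "L2S357_15_L001_R1_001.fastq.gz",
    "L2S357_15_L001_R2_001.fastq.gz",
    "notes.txt",
]
ids, bad = split_casava_names(names)
print("sample IDs:", ids)
print("unexpected files:", bad)
```

The sample IDs reported here should match the `SampleId` column of the metadata file.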

In [9]:
!qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path raw \
  --input-format CasavaOneEightSingleLanePerSampleDirFmt \
  --output-path demux-paired-end.qza

Imported raw as CasavaOneEightSingleLanePerSampleDirFmt to demux-paired-end.qza


Process the data with whichever method you prefer. Here I chose Deblur.

## Deblur

In [14]:
!qiime quality-filter q-score \
 --i-demux demux-paired-end.qza \
 --o-filtered-sequences demux-filtered.qza \
 --o-filter-stats demux-filter-stats.qza

Saved SampleData[SequencesWithQuality] to: demux-filtered.qza
Saved QualityFilterStats to: demux-filter-stats.qza


In [None]:
!qiime deblur denoise-16S \
  --i-demultiplexed-seqs demux-filtered.qza \
  --p-trim-length 120 \
  --o-representative-sequences rep-seqs-deblur.qza \
  --o-table table-deblur.qza \
  --p-sample-stats \
  --o-stats deblur-stats.qza
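
The `--p-trim-length 120` value matters: to my understanding, Deblur cuts every read to that length and discards reads that are shorter. A rough way to judge a candidate value is to ask what fraction of reads would survive. The sketch below assumes you already have a list of read lengths; the helper name and the numbers are hypothetical.

```python
def fraction_kept(read_lengths, trim_length):
    """Fraction of reads long enough to survive trimming to trim_length."""
    if not read_lengths:
        return 0.0
    kept = sum(1 for length in read_lengths if length >= trim_length)
    return kept / len(read_lengths)


# Hypothetical read lengths after quality filtering
lengths = [150, 150, 131, 120, 98]
for candidate in (100, 120, 150):
    print(candidate, fraction_kept(lengths, candidate))
```

A longer trim length keeps more sequence per read but drops more reads, so the choice is a trade-off.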

In [15]:
!qiime metadata tabulate \
  --m-input-file demux-filter-stats.qza \
  --o-visualization demux-filter-stats.qzv
!qiime deblur visualize-stats \
  --i-deblur-stats deblur-stats.qza \
  --o-visualization deblur-stats.qzv

Saved Visualization to: demux-filter-stats.qzv
Saved Visualization to: deblur-stats.qzv


Now you can view the contents of the `.qzv` files in [QIIME 2 View](https://view.qiime2.org/).
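
If you would rather look inside a `.qzv` locally, note that `.qza` and `.qzv` files are ordinary zip archives whose contents sit under a single top-level UUID directory. The following is a small sketch with the standard `zipfile` module; `list_artifact_files` is a name I made up, and the example only runs against `deblur-stats.qzv` if that file exists in the working directory.

```python
import os
import zipfile


def list_artifact_files(path):
    """List files inside a .qza/.qzv archive, without the leading UUID directory."""
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()
    # Members look like "<uuid>/metadata.yaml" or "<uuid>/data/index.html".
    return sorted(
        name.split("/", 1)[1]
        for name in names
        if "/" in name and not name.endswith("/")
    )


if os.path.exists("deblur-stats.qzv"):
    for member in list_artifact_files("deblur-stats.qzv"):
        print(member)
```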

## Add metadata

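Make sure your metadata is in the proper format: tab-separated, with a `SampleId` column. The file is named `metadata.csv` below, but its contents must still be tab-separated.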
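
A quick sanity check of the metadata text can catch the most common mistakes before QIIME 2 complains. This is a hedged sketch, not a replacement for QIIME 2's own validation: the function name, the messages and the example rows are hypothetical, and the checks (tab delimiter, `SampleId` first, unique IDs) are only the ones I consider most likely to matter.

```python
import csv
import io


def metadata_problems(tsv_text, id_column="SampleId"):
    """Return a list of human-readable problems found in metadata text."""
    rows = [r for r in csv.reader(io.StringIO(tsv_text), delimiter="\t") if r]
    if not rows:
        return ["file is empty"]
    problems = []
    header = rows[0]
    if len(header) == 1 and "," in header[0]:
        problems.append("header looks comma-separated, not tab-separated")
    if header[0] != id_column:
        problems.append(f"first column is {header[0]!r}, expected {id_column!r}")
    ids = [r[0] for r in rows[1:]]
    duplicates = sorted({i for i in ids if ids.count(i) > 1})
    if duplicates:
        problems.append(f"duplicate sample IDs: {duplicates}")
    return problems


# Hypothetical metadata with one duplicated sample ID
text = "SampleId\tMainPlant\nS1\twheat\nS1\trye\n"
for problem in metadata_problems(text):
    print(problem)
```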

In [24]:
!qiime feature-table summarize \
  --i-table table-deblur.qza \
  --o-visualization table.qzv \
  --m-sample-metadata-file metadata.csv
!qiime feature-table tabulate-seqs \
  --i-data rep-seqs-deblur.qza \
  --o-visualization rep-seqs.qzv

Saved Visualization to: table.qzv
Saved Visualization to: rep-seqs.qzv


## Add tree

Construct a phylogenetic tree from the representative sequences.

In [25]:
!qiime phylogeny align-to-tree-mafft-fasttree \
  --i-sequences rep-seqs-deblur.qza \
  --o-alignment aligned-rep-seqs.qza \
  --o-masked-alignment masked-aligned-rep-seqs.qza \
  --o-tree unrooted-tree.qza \
  --o-rooted-tree rooted-tree.qza

Saved FeatureData[AlignedSequence] to: aligned-rep-seqs.qza
Saved FeatureData[AlignedSequence] to: masked-aligned-rep-seqs.qza
Saved Phylogeny[Unrooted] to: unrooted-tree.qza
Saved Phylogeny[Rooted] to: rooted-tree.qza


# Diversity Analysis

Calculate alpha- and beta-diversity indices and test their significance between groups. Make sure the table is rarefied before alpha-diversity is computed; below, `--p-sampling-depth` sets the rarefaction depth.
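
To make clear what rarefaction does, here is a toy sketch in plain Python: each sample is subsampled without replacement to a fixed number of reads, and samples with fewer reads than that are dropped. The function name, the seed handling and the example counts are my own; QIIME 2's actual implementation may differ in detail.

```python
import random


def rarefy(counts, depth, seed=0):
    """Subsample a {feature: count} dict to exactly `depth` reads.

    Returns None when the sample has fewer than `depth` reads,
    which corresponds to the sample being dropped.
    """
    if sum(counts.values()) < depth:
        return None
    # One list entry per read, then draw `depth` reads without replacement.
    reads = [feature for feature, n in counts.items() for _ in range(n)]
    picked = random.Random(seed).sample(reads, depth)
    rarefied = {}
    for feature in picked:
        rarefied[feature] = rarefied.get(feature, 0) + 1
    return rarefied


# Hypothetical sample with 10 reads, rarefied to 6
sample = {"featureA": 5, "featureB": 3, "featureC": 2}
print(rarefy(sample, 6))
print(rarefy(sample, 8000))  # too shallow, so the sample would be dropped
```

This is why the sampling depth is a compromise: a higher depth keeps more reads per sample but excludes more samples.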

In [26]:
!qiime diversity core-metrics-phylogenetic \
  --i-phylogeny rooted-tree.qza \
  --i-table table-deblur.qza \
  --p-sampling-depth 8000 \
  --m-metadata-file metadata.csv \
  --output-dir core-metrics-results

Saved FeatureTable[Frequency] to: core-metrics-results/rarefied_table.qza
Saved SampleData[AlphaDiversity] % Properties('phylogenetic') to: core-metrics-results/faith_pd_vector.qza
Saved SampleData[AlphaDiversity] to: core-metrics-results/observed_otus_vector.qza
Saved SampleData[AlphaDiversity] to: core-metrics-results/shannon_vector.qza
Saved SampleData[AlphaDiversity] to: core-metrics-results/evenness_vector.qza
Saved DistanceMatrix % Properties('phylogenetic') to: core-metrics-results/unweighted_unifrac_distance_matrix.qza
Saved DistanceMatrix % Properties('phylogenetic') to: core-metrics-results/weighted_unifrac_distance_matrix.qza
Saved DistanceMatrix to: core-metrics-results/jaccard_distance_matrix.qza
Saved DistanceMatrix to: core-metrics-results/bray_curtis_distance_matrix.qza
Saved PCoAResults to: core-metrics-results/unweighted_unifrac_pcoa_results.qza
Saved PCoAResults to: core-me

Visualise alpha-diversity
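
For intuition about what the `shannon_vector.qza` and `evenness_vector.qza` values mean, the two indices can be computed by hand for a single sample. This sketch assumes Shannon entropy in base 2 and Pielou's evenness defined as Shannon entropy divided by the logarithm of the number of observed features; the function names and counts are hypothetical, and the pipeline's exact conventions may differ.

```python
import math


def shannon(counts):
    """Shannon entropy (base 2) of a list of feature counts."""
    total = sum(counts)
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)


def pielou_evenness(counts):
    """Shannon entropy divided by its maximum for the observed feature count."""
    observed = sum(1 for c in counts if c > 0)
    if observed < 2:
        return float("nan")  # evenness is undefined for a single feature
    return shannon(counts) / math.log2(observed)


# Hypothetical samples: perfectly even vs. dominated by one feature
print(shannon([25, 25, 25, 25]), pielou_evenness([25, 25, 25, 25]))
print(shannon([97, 1, 1, 1]), pielou_evenness([97, 1, 1, 1]))
```

An even community reaches an evenness of 1, while a community dominated by one feature scores close to 0.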

In [31]:
!qiime diversity alpha-rarefaction \
  --i-table table-deblur.qza \
  --i-phylogeny rooted-tree.qza \
  --p-max-depth 20000 \
  --m-metadata-file metadata.csv \
  --o-visualization alpha-rarefaction.qzv

Saved Visualization to: alpha-rarefaction.qzv


In [28]:
!qiime diversity alpha-group-significance \
  --i-alpha-diversity core-metrics-results/observed_otus_vector.qza \
  --m-metadata-file metadata.csv \
  --o-visualization core-metrics-results/observed_otus-significance.qzv

!qiime diversity alpha-group-significance \
  --i-alpha-diversity core-metrics-results/evenness_vector.qza \
  --m-metadata-file metadata.csv \
  --o-visualization core-metrics-results/evenness-group-significance.qzv

Saved Visualization to: core-metrics-results/observed_otus-significance.qzv
Saved Visualization to: core-metrics-results/evenness-group-significance.qzv


Beta-diversity significance
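
The distance matrix tested here holds pairwise Bray-Curtis dissimilarities between samples. As a reminder of what a single entry means, the sketch below computes the dissimilarity between two count vectors; the function name and the example counts are made up for illustration.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two equal-length count vectors."""
    if len(a) != len(b):
        raise ValueError("count vectors must have the same length")
    denominator = sum(a) + sum(b)
    if denominator == 0:
        return 0.0  # two empty samples are treated as identical here
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


# Hypothetical feature counts for three samples
sample_1 = [6, 7, 4]
sample_2 = [10, 0, 6]
sample_3 = [0, 0, 0, 5][:3]  # a sample sharing no reads with the others

print(bray_curtis(sample_1, sample_1))
print(bray_curtis(sample_1, sample_2))
print(bray_curtis(sample_1, sample_3))
```

Values range from 0 (identical composition) to 1 (no shared features).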

In [29]:
!qiime diversity beta-group-significance \
  --i-distance-matrix core-metrics-results/bray_curtis_distance_matrix.qza \
  --m-metadata-file metadata.csv \
  --m-metadata-column MainPlant \
  --o-visualization core-metrics-results/bray_curtis-body-site-significance.qzv \
  --p-pairwise

Saved Visualization to: core-metrics-results/bray_curtis-body-site-significance.qzv


## Report

Unfortunately, there is no way to render all of these results as nice figures in something like `rmarkdown`. So the report has to use another format: `.pdf`, `.docx` or `.pptx`, whichever you prefer.