# QIIME
> By Gati Aher  
> Dec 5, 2021  

**Dataset:** FCF Carbon Perturbation (Cellulose-Glucose-Malate)

**Goal:**
- Create isometric log-ratio (ILR) balances
- Perform differential abundance analysis

## Cellulose and Malate

## Import Data Files into QIIME Formats

## Data Files

* Sample Name x Meta Information
    * `sample_metadata.tsv`
* Feature Table (Sample x OTU HTS Counts)
    * `OTU_counts_feature_table.qza`
* Taxonomy
    * `taxonomy.qza`

In [57]:
# Convert OTU_counts.tsv (classic format table) to a BIOM table
!biom convert -i ../data/processed/malate_OTU_counts.tsv \
--table-type="OTU table" \
--to-json \
-o ../data/qiime/malate_OTU_counts_json_BIOM.biom

# Import OTU_counts BIOM table into QIIME2 FeatureTable[Frequency]
!qiime tools import --type 'FeatureTable[Frequency]' \
--input-path ../data/qiime/malate_OTU_counts_json_BIOM.biom \
--input-format BIOMV100Format \
--output-path ../data/qiime/malate_OTU_counts_feature_table.qza

# Import Taxonomy into QIIME2 FeatureData[Taxonomy]
!qiime tools import --type 'FeatureData[Taxonomy]' \
--input-path ../data/processed/malate_taxonomy.tsv \
--input-format HeaderlessTSVTaxonomyFormat \
--output-path ../data/qiime/malate_taxonomy.qza

Imported ../data/qiime/malate_OTU_counts_json_BIOM.biom as BIOMV100Format to ../data/qiime/malate_OTU_counts_feature_table.qza
Imported ../data/processed/malate_taxonomy.tsv as HeaderlessTSVTaxonomyFormat to ../data/qiime/malate_taxonomy.qza

## Filter out Lower Abundance OTUs
Many OTUs are spurious, arising from contamination, sequencing errors, or clustering errors, so we remove the low-abundance ones.
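
To make the filtering rule concrete, here is a minimal pure-Python sketch of what a minimum-total-frequency filter does. The toy table and OTU names are hypothetical, and this only illustrates the rule (keep a feature when its count summed over all samples reaches the threshold); it is not the plugin's implementation.

```python
# Hypothetical toy table: OTU id -> raw counts in samples S1, S2, S3.
counts = {
    "OTU_1": [120, 80, 95],
    "OTU_2": [3, 0, 1],
    "OTU_3": [40, 35, 30],
}

threshold = 100  # minimum total count across all samples

# Sum each OTU over all samples.
totals = {otu: sum(row) for otu, row in counts.items()}

# Keep only OTUs whose total count is at least the threshold.
filtered = {otu: row for otu, row in counts.items() if totals[otu] >= threshold}

print(totals)
print(sorted(filtered))
```

In this toy table, `OTU_2` has a total of 4 reads and is dropped, while the other two OTUs pass.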

In [58]:
# filter out OTUs with a total abundance (summed across all samples) of less than threshold
# (the output file name below assumes threshold = 100)
threshold = 100
!qiime feature-table filter-features \
    --i-table ../data/qiime/malate_OTU_counts_feature_table.qza \
    --o-filtered-table ../data/qiime/malate_OTU_counts_feature_table_filt100.qza \
    --p-min-frequency {threshold}

Saved FeatureTable[Frequency] to: ../data/qiime/malate_OTU_counts_feature_table_filt100.qza

## Construct Balances
* Option 1: Correlation-Clustering
* Option 2: Gradient-Clustering
* Option 3: Phylogenetic Analysis

### Option 1: Correlation-Clustering

In [61]:
!qiime gneiss correlation-clustering \
  --i-table ../data/qiime/malate_OTU_counts_feature_table_filt100.qza \
  --o-clustering ../data/qiime/malate_balance_hierarchy_filt100.qza

Saved Hierarchy to: ../data/qiime/malate_balance_hierarchy_filt100.qza

In [62]:
# Visualize Heatmap
!qiime gneiss dendrogram-heatmap \
  --i-table ../data/qiime/malate_OTU_counts_feature_table_filt100.qza \
  --i-tree ../data/qiime/malate_balance_hierarchy_filt100.qza \
  --m-metadata-file ../data/processed/malate_sample_metadata.tsv \
  --m-metadata-column series \
  --p-color-map seismic \
  --o-visualization ../terminal/malate_balance_heatmap_filt100.qzv


dendrogram-heatmap is deprecated and will be removed in a future version of this plugin.
Saved Visualization to: ../terminal/malate_balance_heatmap_filt100.qzv

### Option 2: Gradient-Clustering

*Warning:* When using gradient-clustering, you are creating a tree to best highlight compositional differences along the metadata category of your choice, and it is possible to get false positives. Use gradient-clustering with caution.

In [64]:
!qiime gneiss gradient-clustering \
  --i-table ../data/qiime/malate_OTU_counts_feature_table_filt100.qza \
  --m-gradient-file ../data/processed/malate_sample_metadata.tsv \
  --m-gradient-column gradient \
  --o-clustering ../data/qiime/malate_gradient_balance_hierarchy_filt100.qza

Saved Hierarchy to: ../data/qiime/malate_gradient_balance_hierarchy_filt100.qza

Before running the regression, we have to account for zero abundances. Due to the nature of zeros, we cannot be certain whether a zero arose from undersampling or from the complete absence of an OTU. To that end, we'll add a pseudocount of 1 as a rough stand-in for this uncertainty. We'll also want this for visualizing the heatmaps, since we'll be doing some log scaling.
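
As a small illustration of why the pseudocount matters for log scaling, the sketch below uses hypothetical counts for a single sample. It only shows that a raw zero breaks the logarithm and that adding 1 to every count makes the log ratio finite; it makes no claim about how the plugin stores the result.

```python
import math

# Hypothetical raw counts for three OTUs in one sample.
sample = [0, 9, 99]

# log(0) is undefined, so any log ratio involving the first OTU fails.
try:
    math.log(sample[0])
    raw_log_ok = True
except ValueError:
    raw_log_ok = False

# Add a pseudocount of 1 to every count.
pseudocount = 1
shifted = [c + pseudocount for c in sample]

# The log ratio of the third OTU to the first is now defined.
log_ratio = math.log(shifted[2] / shifted[0])

print(raw_log_ok)
print(shifted)
print(log_ratio)
```

Note that the pseudocount shifts every value, not only the zeros, so log ratios involving rare OTUs are the most affected.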

In [65]:
!qiime composition add-pseudocount \
    --i-table ../data/qiime/malate_OTU_counts_feature_table_filt100.qza \
    --p-pseudocount 1 \
    --o-composition-table ../data/qiime/malate_feature_table_compositions_filt100.biom.qza

Saved FeatureTable[Composition] to: ../data/qiime/malate_feature_table_compositions_filt100.biom.qza

In [66]:
# # Visualize Heatmap: 
# !qiime gneiss dendrogram-heatmap \
#   --i-table ../data/processed/malate_feature_table_compositions_filt100.biom.qza \
#   --i-tree ../data/processed/malate_gradient_balance_hierarchy_filt100.qza \
#   --m-metadata-file ../data/processed/malate_sample_metadata.tsv \
#   --m-metadata-column series \
#   --p-color-map seismic \
#   --o-visualization ../terminal/malate_gradient_balance_heatmap_filt100.qzv

# ^ Error: (1/1) Invalid value for '--i-table': 
# Expected an artifact of at least type FeatureTable[Frequency]. 
# An artifact of type FeatureTable[Composition] was provided.

# Visualize Heatmap
!qiime gneiss dendrogram-heatmap \
  --i-table ../data/qiime/malate_OTU_counts_feature_table_filt100.qza \
  --i-tree ../data/qiime/malate_gradient_balance_hierarchy_filt100.qza \
  --m-metadata-file ../data/processed/malate_sample_metadata.tsv \
  --m-metadata-column series \
  --p-color-map seismic \
  --o-visualization ../terminal/malate_gradient_balance_heatmap_filt100.qzv


dendrogram-heatmap is deprecated and will be removed in a future version of this plugin.
Saved Visualization to: ../terminal/malate_gradient_balance_heatmap_filt100.qzv

In [67]:
!qiime gneiss ilr-hierarchical \
    --i-table ../data/qiime/malate_feature_table_compositions_filt100.biom.qza \
    --i-tree ../data/qiime/malate_gradient_balance_hierarchy_filt100.qza \
    --o-balances ../data/qiime/malate_gradient_ilr_filt100.qza

Saved FeatureTable[Balance] to: ../data/qiime/malate_gradient_ilr_filt100.qza
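
For intuition about what a single balance represents, here is a minimal sketch of the standard ILR balance formula: the scaled log ratio of the geometric means of two groups of features, b = sqrt(r*s / (r + s)) * ln(g(numerator) / g(denominator)), where r and s are the group sizes. The abundances and the helper names `geometric_mean` and `balance` are hypothetical; this illustrates the formula for one split of a tree and is not the plugin's code.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def balance(numerator, denominator):
    """ILR balance between two groups of strictly positive abundances."""
    r, s = len(numerator), len(denominator)
    scale = math.sqrt(r * s / (r + s))
    return scale * math.log(geometric_mean(numerator) / geometric_mean(denominator))


# Hypothetical pseudocounted abundances on either side of one tree split.
num = [8, 2]  # geometric mean is 4
den = [1, 1]  # geometric mean is 1

b = balance(num, den)
print(b)
```

A positive balance means the numerator group is, on average, more abundant than the denominator group; swapping the two groups flips the sign. Because the formula takes logarithms, zeros must already have been replaced, which is why the pseudocounted table is the input to this step.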