# Tutorial: Functional annotation analysis with MOSHPIT

This notebook contains materials accompanying the Rigi Workshop 2025: **Microbiome Meets Metabolism**. The notebook and corresponding setup script were adapted from the [**Advanced Block Course: Computational Biology**](https://github.com/bokulich-lab/advanced-comp-bio-tutorial.git); all source code is licensed under the Apache License 2.0.

Save your own local copy of this notebook by using `File > Save a copy in Drive`. At some point you may be prompted to trust the notebook. We promise that it is safe 🤞

**Notes (optional):**

By default, the Google Colab notebook environment interprets every command as Python code. To run a bash command, prefix it with `!`. Any command shown with a leading `!` is therefore a bash command; to run it in your own terminal, omit the `!`. For example, `!wget` in the Colab notebook becomes plain `wget` in your terminal.

In this notebook we use the `!` prefix because we run all MOSHPIT commands using the [`q2cli`](https://github.com/qiime2/q2cli/) (QIIME 2 command-line interface). However, MOSHPIT/QIIME 2 also has a Python API. You can learn more about these and other QIIME 2 interfaces at https://qiime2.org/.

### Environment setup

MOSHPIT is usually installed by following the [official installation instructions](https://docs.qiime2.org/2024.10/install/). However, because we are using Google Colab and there are some caveats to using conda here, we will have to hack around the installation a little. But no worries, we provide a setup script below which does all this work for us. 😌 Let's start by pulling a local copy of the project repository down from GitHub.

From here, you can run the entire notebook by selecting `Runtime > Run all` from the menu in Google Colab. Some steps are time-consuming, and the whole notebook may take 30 to 60 minutes to finish, so start the run now; we will inspect the commands and results as we work through them as a class.

🛑 **ACTION** 🛑
<br>
*Run every cell in the notebook using the instructions above.*

In [None]:
! git clone https://github.com/fsb-edu/rigi-workshop.git materials

We will move into the `materials/` directory.

In [None]:
%cd materials

Now we are ready to set up our environment. This will take about 10 minutes.
<br>
**Note:** This setup is only relevant for Google Colaboratory and will not work on your local machine. To learn more about MOSHPIT installation please consult our [official tutorial](https://moshpit.readthedocs.io/en/latest/chapters/00_setup.html).

In [None]:
%run setup_moshpit

We need to alias the `mosh` command so that it runs inside the `moshpit-dev` environment. This is a workaround for the Google Colab environment.

In [None]:
alias mosh mamba run -n moshpit-dev -r /usr/local mosh

### Functional annotation with MOSHPIT

In [None]:
mosh annotate extract-annotations --i-ortholog-annotations ./data/eggnog_annotations.qza --p-annotation kegg_reaction --p-max-evalue 0.0001 --o-annotation-frequency ./data/eggnog_kegg_ko_freq.qza

Usage: mosh annotate extract-annotations [OPTIONS]

  This method extract a specific annotation from the table generated by EggNOG
  and calculates its frequencies across all MAGs.

Inputs:
  --i-ortholog-annotations ARTIFACT
    GenomeData[NOG]      Ortholog annotations.                      [required]
Parameters:
  --p-annotation TEXT Choices('cog', 'caz', 'kegg_ko', 'kegg_pathway',
    'kegg_reaction', 'kegg_module', 'brite')
                         Annotation to extract.                     [required]
  --p-max-evalue NUMBER
    Range(0, None)                                              [default: 1.0]
  --p-min-score NUMBER
    Range(0, None)                                              [default: 0.0]
Outputs:
  --o-annotation-frequency ARTIFACT FeatureTable[Frequency]
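
To build some intuition for what `extract-annotations` computes, here is a minimal pure-Python sketch of the idea described in the help text: keep the annotation rows that pass the e-value and score filters, then count each annotation per MAG. The row layout, the `"-"` placeholder for a missing annotation, and the function name `annotation_frequencies` are assumptions made for illustration only; this is not the plugin's actual implementation.

```python
from collections import Counter, defaultdict

# Hypothetical, simplified stand-in for EggNOG annotation rows:
# (MAG ID, e-value, score, comma-separated annotation IDs or "-" for none)
rows = [
    ("mag1", 1e-30, 250.0, "R00001,R00002"),
    ("mag1", 1e-10, 80.0, "R00001"),
    ("mag1", 0.5, 12.0, "R00003"),  # fails the e-value cut-off used below
    ("mag2", 1e-50, 400.0, "R00002"),
    ("mag2", 1e-20, 120.0, "-"),    # no annotation of this type
]


def annotation_frequencies(rows, max_evalue=1.0, min_score=0.0):
    """Count how often each annotation occurs in each MAG after filtering."""
    counts = defaultdict(Counter)
    for mag_id, evalue, score, annotations in rows:
        # Assumed filter semantics: keep hits with a small enough e-value
        # and a large enough score.
        if evalue > max_evalue or score < min_score:
            continue
        if annotations == "-":
            continue
        for annotation in annotations.split(","):
            counts[mag_id][annotation] += 1
    return {mag: dict(c) for mag, c in counts.items()}


freq = annotation_frequencies(rows, max_evalue=0.0001)
print(freq)
```

The result is a MAG-by-annotation count table, which is conceptually what the `FeatureTable[Frequency]` output of the command holds.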

In [None]:
mosh annotate multiply-tables --i-table1 ./data/mags_derep_ft.qza --i-table2 ./data/eggnog_kegg_ko_freq.qza --o-result-table ./data/kegg_ko_ft.qza
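
Conceptually, the `multiply-tables` step above combines a sample-by-MAG abundance table with a MAG-by-annotation count table into a sample-by-annotation table. The sketch below assumes this is a plain matrix product over the shared MAG IDs; the toy values, the dictionary layout, and the Python function name `multiply_tables` are hypothetical and only illustrate the arithmetic.

```python
# Hypothetical toy tables (all values are made up for illustration).
# table1: sample -> {MAG: abundance of that MAG in the sample}
mag_abundance = {
    "sample1": {"mag1": 10.0, "mag2": 0.0},
    "sample2": {"mag1": 2.0, "mag2": 5.0},
}
# table2: MAG -> {annotation: number of times it occurs in that MAG}
annotation_counts = {
    "mag1": {"K00001": 2, "K00002": 1},
    "mag2": {"K00002": 3},
}


def multiply_tables(table1, table2):
    """Matrix product of two sparse tables stored as nested dictionaries."""
    result = {}
    for sample, mags in table1.items():
        row = {}
        for mag, abundance in mags.items():
            # Each MAG contributes its annotation counts, weighted by abundance.
            for annotation, count in table2.get(mag, {}).items():
                row[annotation] = row.get(annotation, 0.0) + abundance * count
        result[sample] = row
    return result


annotation_abundance = multiply_tables(mag_abundance, annotation_counts)
print(annotation_abundance)
```

Under this assumption, an annotation's value in a sample is the sum, over all MAGs, of the MAG's abundance times the number of copies of that annotation in the MAG.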

In [None]:
mosh composition ancombc --i-table ./data/kegg_ko_ft.qza --m-metadata-file ./data/cocoa-metadata.tsv --p-formula stage --o-differentials ./data/kegg_ko_differentials.qza

In [None]:
mosh composition da-barplot --i-data ./data/kegg_ko_differentials.qza --p-significance-threshold 0.05 --p-effect-size-threshold 1.1 --o-visualization ./data/kegg_ko_differentials.qzv
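
The two thresholds passed to `da-barplot` decide which features appear in the plot. The sketch below assumes that the significance threshold is an upper bound on the adjusted p-value (q-value) and that the effect-size threshold is a lower bound on the absolute log-fold change. The example records and the function name `select_features` are hypothetical and do not come from the workshop data.

```python
# Hypothetical differential-abundance results:
# (feature ID, log-fold change, q-value)
results = [
    ("K00001", 2.3, 0.001),
    ("K00002", -1.5, 0.020),
    ("K00003", 0.4, 0.001),  # significant, but the effect is too small
    ("K00004", 3.0, 0.300),  # large effect, but not significant
]


def select_features(results, significance_threshold=0.05,
                    effect_size_threshold=1.1):
    """Keep features that are both significant and have a large enough effect."""
    return [
        feature
        for feature, lfc, q_value in results
        if q_value <= significance_threshold
        and abs(lfc) >= effect_size_threshold
    ]


selected = select_features(results)
print(selected)
```

With the thresholds used in the command (0.05 and 1.1), only features that pass both filters would be drawn, regardless of the direction of the change.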