# ANCOM-BC & LEfSe

*Run this notebook in `qiime2-2023.2`. Dokdo has been installed within this environment.*

In [1]:
from os import getcwd, listdir, chdir, mkdir
import pandas as pd
import qiime2 as q2

In [2]:
getcwd()

'/home/mrobeson/projects/pd_mouse_tutorial'

In [3]:
chdir('./processed')
getcwd()

'/home/mrobeson/projects/pd_mouse_tutorial/processed'

## ANCOM-BC

Continuing from notebook `04`, we'll use [ANCOM-BC](https://doi.org/10.1038/s41467-020-17041-7), the bias-corrected successor to ANCOM. See the [R tutorial](http://www.bioconductor.org/packages/release/bioc/vignettes/ANCOMBC/inst/doc/ANCOMBC.html) for more details.

In [4]:
# Keep only features with a total frequency of at least 50 that are observed in at least 4 samples.
! qiime feature-table filter-features \
    --i-table ./table-no-ecmu-hits.qza \
    --p-min-frequency 50 \
    --p-min-samples 4 \
    --o-filtered-table ./table-no-ecmu-hits-abund.qza

Saved FeatureTable[Frequency] to: ./table-no-ecmu-hits-abund.qza
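
To make the two filtering thresholds concrete, here is a minimal pandas sketch of the same idea on a toy count table. It is not how QIIME 2 implements `filter-features`; the helper name `filter_features`, the toy sample IDs, and the toy feature names are all made up for illustration.

```python
import pandas as pd


def filter_features(table: pd.DataFrame, min_frequency: int, min_samples: int) -> pd.DataFrame:
    """Keep feature columns that meet both the total-count and the prevalence threshold.

    Hypothetical helper for illustration only (rows = samples, columns = features).
    """
    total = table.sum(axis=0)             # summed count of each feature across samples
    prevalence = (table > 0).sum(axis=0)  # number of samples in which each feature occurs
    keep = (total >= min_frequency) & (prevalence >= min_samples)
    return table.loc[:, keep]


# Toy counts (made up): five samples, three features.
toy = pd.DataFrame(
    {
        "feat_a": [30, 25, 10, 5, 0],  # total 70, seen in 4 samples: kept
        "feat_b": [60, 0, 0, 0, 0],    # total 60, seen in 1 sample: dropped
        "feat_c": [1, 1, 1, 1, 1],     # total 5, seen in 5 samples: dropped
    },
    index=["s1", "s2", "s3", "s4", "s5"],
)

filtered = filter_features(toy, min_frequency=50, min_samples=4)
print(list(filtered.columns))
```

A feature has to pass both thresholds, so an abundant feature found in a single sample is removed just like a widespread but very rare one.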

In [5]:
# Run ANCOM-BC. Unlike ANCOM, it does not require adding a pseudocount first.
! qiime composition ancombc \
    --i-table ./table-no-ecmu-hits-abund.qza \
    --m-metadata-file ./metadata.tsv \
    --p-formula "donor" \
    --p-reference-levels donor::hc_1 \
    --o-differentials ancombc-differentials.qza

Saved FeatureData[DifferentialAbundance] to: ancombc-differentials.qza
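
The `--p-reference-levels donor::hc_1` argument names a metadata column and the level that every other level is compared against. As a rough illustration of what a reference level means in a regression formula, the sketch below builds treatment-coded indicator columns with pandas. The helper `treatment_coding` and the donor labels `toy_b` and `toy_c` are hypothetical; only `hc_1` comes from the command above, and this is not ANCOM-BC's own code.

```python
import pandas as pd


def treatment_coding(values, reference, prefix):
    """Return 0/1 indicator columns for every level except the reference.

    Hypothetical helper for illustration; the reference level becomes the baseline.
    """
    levels = sorted(set(values))
    if reference not in levels:
        raise ValueError(f"reference level {reference!r} not found in {levels}")
    # Put the reference first so that drop_first=True removes exactly that level.
    ordered = [reference] + [lv for lv in levels if lv != reference]
    column = pd.Series(pd.Categorical(values, categories=ordered))
    return pd.get_dummies(column, prefix=prefix, drop_first=True, dtype=int)


# Split the "column::level" pair used on the command line.
column_name, reference_level = "donor::hc_1".split("::")

# Toy donor labels: "toy_b" and "toy_c" are made up.
donors = ["hc_1", "hc_1", "toy_b", "toy_c", "toy_c"]
design = treatment_coding(donors, reference=reference_level, prefix=column_name)
print(design)
```

Samples from the reference donor have zeros in every indicator column, so each remaining column corresponds to one "level versus reference" comparison.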

In [6]:
! qiime composition tabulate \
    --i-data ancombc-differentials.qza \
    --o-visualization ancombc-differentials.qzv

Saved Visualization to: ancombc-differentials.qzv

In [6]:
# Write the bar plot to its own file so it does not overwrite the tabulated differentials.
! qiime composition da-barplot \
    --i-data ancombc-differentials.qza \
    --p-significance-threshold 0.05 \
    --o-visualization ancombc-da-barplot.qzv

Saved Visualization to: ancombc-da-barplot.qzv
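
The significance threshold keeps only features whose adjusted p-value falls below the cutoff. The sketch below mimics that selection on a toy table with pandas. The column names `lfc` and `q_val` are assumed to match the ANCOM-BC output, while the feature IDs, the numbers, and the helper `significant_features` are made up.

```python
import pandas as pd

# Toy differential-abundance results (made-up values).
toy = pd.DataFrame(
    {
        "feature": ["f1", "f2", "f3", "f4"],
        "lfc": [1.8, -0.4, -2.1, 0.9],       # log-fold change relative to the reference level
        "q_val": [0.001, 0.40, 0.02, 0.20],  # multiple-testing adjusted p-values
    }
).set_index("feature")


def significant_features(results: pd.DataFrame, threshold: float = 0.05) -> pd.DataFrame:
    """Keep rows with q_val below the threshold, ordered by log-fold change.

    Hypothetical helper for illustration only.
    """
    hits = results[results["q_val"] < threshold]
    return hits.sort_values("lfc")


sig = significant_features(toy, threshold=0.05)
print(sig)
```

Sorting by `lfc` puts depleted features first and enriched features last, which is roughly the ordering a diverging bar plot conveys.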

## LEfSe Preparation with Dokdo

We'll prepare our files with [dokdo](https://dokdo.readthedocs.io/en/latest/index.html) for use in [LEfSe-Galaxy](https://huttenhower.sph.harvard.edu/galaxy/). You can read the [LEfSe paper](https://doi.org/10.1186/gb-2011-12-6-r60) for more details. 

In [7]:
! dokdo prepare-lefse \
    -t ./table-no-ecmu-hits-abund.qza \
    -x ./taxonomy.qza \
    -m ./metadata.tsv \
    -c donor_status \
    -u genotype \
    -o lefse_table.tsv

In [8]:
! column -t lefse_table.tsv | head

donor_status                                                                                                                    Healthy                Healthy                Healthy                Healthy                Healthy                Healthy                Healthy                Healthy                Healthy                Healthy                Healthy                Healthy                Healthy                Healthy                Healthy                Healthy                Healthy                Healthy                Healthy                Healthy                PD                     Healthy                PD                     PD                   Healthy                PD                     PD                     Healthy                PD                     PD                     PD                     PD                     PD                     PD                     PD                     PD                     PD                     PD                     

## Run LEfSe on Galaxy

Upload `lefse_table.tsv` to [LEfSe-Galaxy](https://huttenhower.sph.harvard.edu/galaxy/), or try the newer version of their [Galaxy page](http://galaxy.biobakery.org/). If Galaxy does not recognize the file format, you may have to set `Datatype` to `tabular` to force it to import the file as a table.
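
Before uploading, it can help to confirm that the first row of the table carries the class labels and to see how many samples fall into each class. The sketch below does this for a small in-memory table laid out like the output above (metadata rows first, then one row per feature). The second metadata row's values and the taxon rows are made up, and `class_counts` is a hypothetical helper, not part of Dokdo or LEfSe.

```python
import io
from collections import Counter

# Toy LEfSe-style table: tab-separated, one column per sample.
# Only the "donor_status" labels mirror the notebook; everything else is made up.
toy_tsv = (
    "donor_status\tHealthy\tHealthy\tPD\tPD\n"
    "genotype\ttoy_x\ttoy_y\ttoy_x\ttoy_y\n"
    "k__Bacteria|p__Firmicutes\t0.6\t0.5\t0.3\t0.2\n"
    "k__Bacteria|p__Bacteroidetes\t0.4\t0.5\t0.7\t0.8\n"
)


def class_counts(handle, class_row=0):
    """Return the row label and per-class sample counts for the given metadata row."""
    for i, line in enumerate(handle):
        if i == class_row:
            label, *values = line.rstrip("\n").split("\t")
            return label, Counter(values)
    raise ValueError("class row not found")


label, counts = class_counts(io.StringIO(toy_tsv))
print(label, dict(counts))
```

To run the same check on the real file, pass `open("lefse_table.tsv")` instead of the `io.StringIO` object.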