# Visualizing songbird feature ranks with Qurro
In this example, we use data from the Red Sea metagenome dataset. These data were obtained from [the `data/redsea` folder of songbird's GitHub repository](https://github.com/biocore/songbird/tree/master/data/redsea), and are associated with the following paper:

Thompson, L. R., Williams, G. J., Haroon, M. F., Shibl, A., Larsen, P., Shorenstein, J., ... & Stingl, U. (2017). Metagenomic covariation along densely sampled environmental gradients in the Red Sea. _The ISME journal, 11_(1), 138.

The commands for running songbird and importing the Red Sea data are based on the example usage of this dataset in the songbird README file on its GitHub page.

## Requirements
This notebook relies on QIIME 2, songbird, and Qurro all being installed.
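
Before running the cells below, it can help to confirm that the three command-line programs used in this notebook (`qiime`, `songbird`, and `qurro`) are on the `PATH` of the environment the notebook runs in. The following is a minimal sketch of such a check; the helper name `find_missing_commands` is hypothetical, and finding a command says nothing about whether the installed version is compatible.

```python
import shutil

# Command names as they are invoked in the cells of this notebook.
REQUIRED_COMMANDS = ("qiime", "songbird", "qurro")


def find_missing_commands(commands=REQUIRED_COMMANDS):
    """Return the commands from `commands` that cannot be found on PATH."""
    return [command for command in commands if shutil.which(command) is None]


missing = find_missing_commands()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All required commands were found.")
```

If anything is reported as missing, activating the environment in which QIIME 2 was installed is usually the first thing to try.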

## 0. Setting up
In this section, we empty the output directory (creating it first if it doesn't exist). This lets us rerun the notebook without any tool complaining about overwriting existing files.

In [7]:
# Clear the output directory so we can write these files there
!rm -rf output/*
# Since git doesn't keep track of empty directories, create the output/ directory if it doesn't already exist
# (if it does already exist, -p ensures that an error won't be thrown)
!mkdir -p output

## 1. Using songbird and Qurro through QIIME 2
You can use songbird and Qurro inside or outside of QIIME 2. In this section, we'll use songbird and Qurro from within QIIME 2; in the next section, we'll use these tools outside of QIIME 2.

If you just installed songbird or Qurro, run `qiime dev refresh-cache` afterwards so that QIIME 2 can "find" these tools' QIIME 2 plugins.

### 1. A. Using songbird through QIIME 2
In order to use this dataset's BIOM table in QIIME 2, we need to import it as a `FeatureTable[Frequency]` QIIME 2 artifact.

In [8]:
!qiime tools import \
    --input-path input/redsea.biom \
    --output-path output/redsea.biom.qza \
    --type FeatureTable[Frequency]

Imported input/redsea.biom as BIOMV210DirFmt to output/redsea.biom.qza


Now, we can run songbird through QIIME 2 on our imported BIOM table. This produces three output files, but the main one we care about for Qurro is the `FeatureData[Differential]` artifact (which will be stored in `output/differentials.qza`). This artifact contains **feature rankings**: as songbird's documentation puts it, these correspond to "...the ordering of the coefficients within a covariate."

Please see [songbird's documentation](https://github.com/biocore/songbird/) for more information about how it works and how its output files are formatted.
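
To make the idea of "feature rankings" concrete, here is a minimal sketch that sorts features by their coefficient for one covariate. It assumes the differentials are available as a tab-separated file whose first column holds feature IDs and whose remaining columns hold one numeric coefficient per covariate. The function name, file path, and column name are hypothetical placeholders, and this is not how songbird or Qurro computes rankings internally.

```python
import csv


def rank_features(tsv_path, column):
    """Return (feature ID, coefficient) pairs sorted from lowest to highest.

    Assumes a tab-separated file with a header row, feature IDs in the
    first column, and numeric values in `column`.
    """
    with open(tsv_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        id_column = reader.fieldnames[0]  # assumption: first column is the ID
        pairs = [(row[id_column], float(row[column])) for row in reader]
    return sorted(pairs, key=lambda pair: pair[1])
```

For example, `rank_features("differentials.tsv", "Depth")[:10]` would give the ten features with the lowest coefficients for a covariate column named `Depth`, if the file has such a column.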

In [9]:
!qiime songbird multinomial \
    --i-table output/redsea.biom.qza \
    --m-metadata-file input/redsea_metadata.txt \
    --p-formula "Depth+Temperature+Salinity+Oxygen+Fluorescence+Nitrate" \
    --o-differentials output/differentials.qza \
    --o-regression-stats output/regression-stats.qza \
    --o-regression-biplot output/regression-biplot.qza

Saved FeatureData[Differential] to: output/differentials.qza
Saved SampleData[SongbirdStats] to: output/regression-stats.qza
Saved PCoAResults % Properties(['biplot']) to: output/regression-biplot.qza


### 1. B. Using Qurro through QIIME 2
Since we're working with songbird output, we use the `qiime qurro supervised-rank-plot` command.

**Note** that songbird must be installed on your system for the `qiime qurro supervised-rank-plot` command to be available; if songbird isn't installed, then running `qiime qurro --help` will only show the `unsupervised-rank-plot` command. (If you were able to run the `qiime songbird` command in the previous section, you should be fine.)

In [10]:
!qiime qurro supervised-rank-plot --help

Usage: qiime qurro supervised-rank-plot [OPTIONS]

  Generates an interactive visualization of songbird feature rankings in
  tandem with a visualization of the log ratios of selected features' sample
  abundances.

Options:
  --i-ranks ARTIFACT PATH FeatureData[Differential]
                                  A differentials file describing feature
                                  rankings produced by songbird.  [required]
  --i-table ARTIFACT PATH FeatureTable[Frequency]
                                  A BIOM table describing the abundances of
                                  the ranked features in samples.  [required]
  --m-sample-metadata-file MULTIPLE FILE
                                  Metadata file or artifact viewable as
                                  metadata. This option may be supplied
                                  multiple times to merge metadata.
                                  [required]
  --m-feature-metadata-file MULTIPLE FILE
         

In [11]:
!qiime qurro supervised-rank-plot \
    --i-ranks output/differentials.qza \
    --i-table output/redsea.biom.qza \
    --m-sample-metadata-file input/redsea_metadata.txt \
    --m-feature-metadata-file input/feature_metadata.txt \
    --o-visualization output/qurro_plot_q2.qzv

Saved Visualization to: output/qurro_plot_q2.qzv


That's it! Now, we've created a QZV file (describing a Qurro visualization) at `output/qurro_plot_q2.qzv`. You can view this visualization in one of the following ways:
  1. Upload the QZV file to [view.qiime2.org](https://view.qiime2.org).
  2. View the QZV file locally by running `qiime tools view output/qurro_plot_q2.qzv`.
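
QIIME 2 artifacts and visualizations are ZIP archives, so a QZV file can also be inspected with standard tools. The sketch below lists any `index.html` files inside a QZV and extracts the archive to a directory. The function names are hypothetical, and the exact layout inside the archive (a top-level identifier directory containing a `data/` folder) is an assumption that may differ between QIIME 2 versions.

```python
import zipfile


def find_index_pages(qzv_path):
    """List the archive members named index.html inside a .qzv (ZIP) file."""
    with zipfile.ZipFile(qzv_path) as archive:
        return [
            name
            for name in archive.namelist()
            if name.rsplit("/", 1)[-1] == "index.html"
        ]


def extract_visualization(qzv_path, destination):
    """Extract every file in the .qzv archive into `destination`."""
    with zipfile.ZipFile(qzv_path) as archive:
        archive.extractall(destination)
```

Calling `find_index_pages("output/qurro_plot_q2.qzv")` shows where the visualization's entry page sits in the archive. Extracting by hand bypasses QIIME 2's own tooling, so the two viewing options listed above remain the simpler route.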

## 2. Using songbird and Qurro as standalone tools
We don't need to use songbird and Qurro through QIIME 2; if you want, you can run these tools outside of QIIME 2. Although this means you don't have access to some of QIIME 2's functionality (e.g. provenance tracking, or artifact semantic types), the `differentials` you get should be roughly the same. (We say "roughly" because some of the machine learning methods used by songbird involve randomness.)
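
One way to quantify "roughly the same" is to compare the coefficients of the same features from two runs with a rank correlation. The sketch below is a plain-Python Spearman correlation (Pearson correlation of average ranks). It is an illustration of one possible check, not something songbird or Qurro provides, and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value ties with the first.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    mean_x = sum(rank_x) / len(rank_x)
    mean_y = sum(rank_y) / len(rank_y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    var_x = sum((a - mean_x) ** 2 for a in rank_x)
    var_y = sum((b - mean_y) ** 2 for b in rank_y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return covariance / (var_x * var_y) ** 0.5
```

Here `x` and `y` would be the coefficients for one covariate from the two runs, listed in the same feature order. A value close to 1 indicates that the two runs order the features almost identically.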

### 2. A. Using songbird as a standalone tool
Here we run songbird directly on the BIOM table, with no QIIME 2 import step. songbird writes its results to the directory given by `--summary-dir`, including the `differentials.tsv` file that we pass to Qurro below.

In [12]:
!songbird multinomial \
    --input-biom input/redsea.biom \
    --metadata-file input/redsea_metadata.txt \
    --formula "Depth+Temperature+Salinity+Oxygen+Fluorescence+Nitrate" \
    --summary-dir output/

2019-05-26 19:44:51.967655: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-05-26 19:44:51.988083: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 1992000000 Hz
2019-05-26 19:44:51.988729: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x5584c8791da0 executing computations on platform Host. Devices:
2019-05-26 19:44:51.988746: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
Instructions for updating:
Use tf.random.categorical instead.
Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
The TensorFlow Distributions library has moved to TensorFlow Probability (https://github.com/tensorflow/probability). You should update all references to use `tfp.distributions` instead of `tf.distributions`.
Instructions for updating:
The TensorFlow Distribution

### 2. B. Using Qurro as a standalone tool
When we used Qurro through QIIME 2, we had to specify the `supervised-rank-plot` command in order to let the Qurro QIIME 2 plugin know we were working with songbird outputs.

Now that we're running Qurro outside of QIIME 2, we don't need to specify this; Qurro can accept either songbird or DEICODE "ranks" as input. (An added benefit of this is that you don't need to have songbird installed in order to run Qurro on existing songbird output outside of QIIME 2.)

In [13]:
!qurro --help

Usage: qurro [OPTIONS]

  Generates a visualization of feature rankings and log ratios.

  The resulting visualization contains two plots. The first plot shows how
  features are ranked, and the second plot shows the log ratio of "selected"
  features' abundances within samples.

  The visualization is interactive, so which features are "selected" to
  construct log ratios -- as well as various other properties of the
  visualization -- can be changed by the user.

Options:
  -r, --ranks TEXT                Differentials output from songbird or
                                  Ordination output from DEICODE.  [required]
  -t, --table TEXT                A BIOM table describing the abundances of
                                  the ranked features in samples.  [required]
  -fm, --feature-metadata TEXT    Feature metadata file.
  -sm, --sample-metadata TEXT     Sample metadata file.  [required]
  -o, --output-dir TEXT           Location of output files.  [required]


In [15]:
!qurro \
    --ranks output/differentials.tsv \
    --table input/redsea.biom \
    --sample-metadata input/redsea_metadata.txt \
    --feature-metadata input/feature_metadata.txt \
    --output-dir output/qurro_plot_standalone/

Successfully generated a visualization in the folder output/qurro_plot_standalone/.


We just generated a Qurro visualization in the folder `output/qurro_plot_standalone/`. This visualization is analogous to the QZV file we generated above using QIIME 2. You can view this visualization by just opening up `output/qurro_plot_standalone/index.html` in a modern web browser.
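
For readers wondering what the second plot shows, the sketch below computes a log ratio for one sample from the summed abundances of two sets of "selected" features. This is an illustration only, not Qurro's own code: the function name is hypothetical, and the use of the natural log and the choice to return `None` when either sum is zero are assumptions.

```python
import math


def log_ratio(sample_counts, numerator_ids, denominator_ids):
    """Natural log of summed numerator counts over summed denominator counts.

    `sample_counts` maps feature ID to abundance for a single sample.
    Returns None when either sum is zero, where the log ratio is undefined.
    """
    top = sum(sample_counts.get(feature, 0) for feature in numerator_ids)
    bottom = sum(sample_counts.get(feature, 0) for feature in denominator_ids)
    if top <= 0 or bottom <= 0:
        return None
    return math.log(top / bottom)
```

A positive value means the numerator features are collectively more abundant than the denominator features in that sample, and a negative value means the opposite.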

That's it! If you have any more questions about using Qurro, feel free to contact us (see the Qurro README for contact information).