# Calculating Feature Log-Ratios Directly

Occasionally, we are interested only in the log-ratios between selected features and not in the feature ranks. In that case, it is useful to skip the step of running DEICODE/Songbird. Skipping it also lets us generate log-ratios programmatically (through the CLI or Python) for further visualization or analysis. We can do this using **Qarcoal**.

We will use the same dataset featured in the Qurro DEICODE tutorial (`deicode_example.ipynb`).

## Requirements

This notebook relies on QIIME 2, DEICODE, and Qurro all being installed. You should be in a QIIME 2 conda environment.

## 0. Setting Up

In this section, we replace the output directory with an empty directory. This just lets us run this notebook multiple times, without any tools complaining about overwriting files.

In [1]:
# Clear the output directory so we can write these files there
!rm -rf output/*
# Since git doesn't keep track of empty directories, create the output/ directory if it doesn't already exist
# (if it does already exist, -p ensures that an error won't be thrown)
!mkdir -p output

## 1. Using Qarcoal Through QIIME 2

Currently, Qarcoal can only be called through QIIME 2. However, we are working on a standalone version, so stay tuned.

In [2]:
!qiime tools import \
    --input-path ../DEICODE_sleep_apnea/input/qiita_10422_table.biom \
    --output-path output/qiita_10422_table.biom.qza \
    --type FeatureTable[Frequency]

Imported ../DEICODE_sleep_apnea/input/qiita_10422_table.biom as BIOMV210DirFmt to output/qiita_10422_table.biom.qza


Now we can run Qarcoal through QIIME 2 on our imported BIOM table. This produces one output: a table of samples with the log-ratios of the selected features. For demonstration, we will use `g__Allobaculum` as our numerator string and `g__Coprococcus` as our denominator string.

In [3]:
!qiime qurro qarcoal \
    --i-table output/qiita_10422_table.biom.qza \
    --m-taxonomy-file ../DEICODE_sleep_apnea/input/taxonomy.tsv \
    --p-num-string g__Allobaculum \
    --p-denom-string g__Coprococcus \
    --o-qarcoal-log-ratios output/allobaculum_coprococcus_log_ratios.qza

Saved SampleData[QarcoalLogRatios] to: output/allobaculum_coprococcus_log_ratios.qza


## 2. Verifying Qarcoal Output

We want to ensure that our results are the same as Qurro's. First, we load our newly calculated log-ratio table into Python.

In [4]:
import pandas as pd
from qiime2 import Artifact, Metadata

In [5]:
qarcoal_log_ratios = Artifact.load("output/allobaculum_coprococcus_log_ratios.qza")
qarcoal_log_ratios_df = qarcoal_log_ratios.view(pd.DataFrame)
qarcoal_log_ratios_df.head()

| | Num_Sum | Denom_Sum | log_ratio |
|---|---|---|---|
| 10422.18.F.8 | 19.0 | 162.0 | -2.143157 |
| 10422.26.F.11 | 14927.0 | 192.0 | 4.353432 |
| 10422.25.F.10 | 10871.0 | 248.0 | 3.780425 |
| 10422.18.F.9 | 16.0 | 68.0 | -1.446919 |
| 10422.19.F.12 | 26.0 | 1860.0 | -4.270235 |
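
The three columns above can be reproduced by hand on a toy table. The following is a minimal sketch, not Qarcoal's actual implementation. It assumes that features are selected by substring match on their taxonomy string, that counts are summed per sample, that samples with a zero numerator or denominator sum are dropped, and that the natural logarithm is used (the values above are consistent with this: ln(19/162) ≈ -2.143157). The toy data and the `toy_log_ratios` helper are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: rows are features, columns are samples.
table = pd.DataFrame(
    {"S1": [19, 100, 62, 5], "S2": [0, 3, 7, 1], "S3": [8, 2, 2, 9]},
    index=["F1", "F2", "F3", "F4"],
)
taxonomy = pd.Series(
    {
        "F1": "k__Bacteria; g__Allobaculum",
        "F2": "k__Bacteria; g__Coprococcus",
        "F3": "k__Bacteria; g__Coprococcus",
        "F4": "k__Bacteria; g__Lactobacillus",
    },
    name="Taxon",
)


def toy_log_ratios(table, taxonomy, num_string, denom_string):
    """Illustrative helper (an assumption, not Qarcoal's code)."""
    # Select feature IDs whose taxonomy string contains the search string.
    num_ids = taxonomy[taxonomy.str.contains(num_string, regex=False)].index
    denom_ids = taxonomy[taxonomy.str.contains(denom_string, regex=False)].index
    # Sum the selected features' counts within each sample.
    out = pd.DataFrame(
        {
            "Num_Sum": table.loc[num_ids].sum(axis=0),
            "Denom_Sum": table.loc[denom_ids].sum(axis=0),
        }
    )
    # The log-ratio is undefined when either sum is zero, so drop those samples.
    out = out[(out["Num_Sum"] > 0) & (out["Denom_Sum"] > 0)].copy()
    out["log_ratio"] = np.log(out["Num_Sum"] / out["Denom_Sum"])
    return out


toy_df = toy_log_ratios(table, taxonomy, "g__Allobaculum", "g__Coprococcus")
print(toy_df)
```

In this toy table, sample `S2` has no `g__Allobaculum` counts, so it does not appear in the result.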


### 2.A. Running DEICODE

Next, we need to run DEICODE, since Qurro takes its output as input. Here we use the Artifact API, but you can just as easily run the following through the command line. (Please see the DEICODE example notebook for details on using DEICODE.)

**NOTE**: By default, DEICODE filters your input feature table. We override this by setting both `min-feature-count` and `min-sample-count` to 0. If you would rather keep DEICODE's filtering, filter your feature table the same way and pass Qarcoal a QIIME 2 Metadata file containing the remaining sample IDs with the `--m-samples-to-use-file` flag.
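
One way to build such a sample ID file is sketched below on a toy table. This is only an illustration: the threshold, the toy data, and the file name are hypothetical, and the simple per-sample total filter is a stand-in for whatever filtering you actually apply, not a reproduction of DEICODE's filtering. It assumes a tab-separated file whose ID column is headed `sample-id`, one of the headers QIIME 2 recognizes.

```python
import os
import tempfile

import pandas as pd

# Hypothetical toy table: rows are features, columns are samples.
toy_table = pd.DataFrame(
    {"S1": [500, 20], "S2": [3, 1], "S3": [250, 300]},
    index=["F1", "F2"],
)

# Hypothetical threshold, chosen only for this example.
MIN_SAMPLE_TOTAL = 100

# Keep samples whose total count meets the threshold.
sample_totals = toy_table.sum(axis=0)
kept_ids = list(sample_totals[sample_totals >= MIN_SAMPLE_TOTAL].index)

# Write a one-column TSV: a header line followed by one sample ID per line.
out_path = os.path.join(tempfile.mkdtemp(), "samples_to_use.tsv")
with open(out_path, "w") as handle:
    handle.write("sample-id\n")
    for sample_id in kept_ids:
        handle.write(f"{sample_id}\n")

print(kept_ids)
```

A file like this could then be given to Qarcoal through `--m-samples-to-use-file`.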

In [6]:
from qiime2.plugins import deicode

table = Artifact.load("output/qiita_10422_table.biom.qza")

ordination, dist_matrix = deicode.actions.rpca(
    table = table,
    min_sample_count = 0,
    min_feature_count = 0)



### 2.B. Running Qurro

We can then input the ordination into Qurro, save the visualization, and compare our results.

In [7]:
from qiime2.plugins import qurro

metadata = Metadata.load("../DEICODE_sleep_apnea/input/qiita_10422_metadata.tsv")
taxonomy = Metadata.load("../DEICODE_sleep_apnea/input/taxonomy.tsv")

qurro_viz = qurro.actions.loading_plot(
    ranks = ordination,
    table = table,
    sample_metadata = metadata,
    feature_metadata = taxonomy)

qurro_viz.visualization.save("output/qurro_viz.qzv")


'output/qurro_viz.qzv'

Open the visualization [here](https://view.qiime2.org/) and type in `g__Allobaculum` in the numerator search bar and `g__Coprococcus` in the denominator search bar. Make sure you select the option to filter features from "Taxon" rather than "Feature ID."

![img](imgs/qurro_feature_search.png)

Click the "Export sample data" button and save the resulting `sample_plot_data.tsv` file to the `output/` directory.

### 2.C. Comparing the Output of Qurro and Qarcoal

We can now load the Qurro results and compare them with the Qarcoal results to make sure they match.

In [8]:
qurro_log_ratios_df = pd.read_csv("output/sample_plot_data.tsv", sep="\t", index_col=0)
qurro_log_ratios_df.head()

| Sample ID | Current_Log_Ratio | age | age.1 |
|---|---|---|---|
| 10422.21.F.3 | -6.08298 | 11.0 | 11.0 |
| 10422.28.F.9 | 4.656135 | 14.0 | 14.0 |
| 10422.24.F.10 | -4.964397 | 14.5 | 14.5 |
| 10422.21.F.10 | NaN | 14.5 | 14.5 |
| 10422.27.F.8 | 2.937259 | 13.5 | 13.5 |


We see that the Qurro results have at least one NaN. This just means that for this sample, the log-ratio could not be calculated due to 0s. We can drop these from our DataFrame.


In [9]:
qurro_log_ratios_df = qurro_log_ratios_df.dropna()
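
Note that `dropna()` with no arguments removes a row if *any* of its columns is missing, including metadata columns. If only the log-ratio column should decide which samples are kept, pandas' `subset` argument is an alternative. The sketch below shows the difference on hypothetical toy data (the values and sample names are made up for illustration).

```python
import numpy as np
import pandas as pd

# Hypothetical toy export: sample C has a valid log-ratio but a missing age.
toy = pd.DataFrame(
    {"Current_Log_Ratio": [-6.08, np.nan, 2.94], "age": [11.0, 14.5, np.nan]},
    index=["A", "B", "C"],
)

# Drops B (missing log-ratio) and also C (missing age).
dropped_any = toy.dropna()

# Drops only B, because only the log-ratio column is inspected.
dropped_ratio_only = toy.dropna(subset=["Current_Log_Ratio"])

print(list(dropped_any.index))
print(list(dropped_ratio_only.index))
```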

First, we can get a preliminary sense of how well the two methods coincide by looking at the number of samples present.

In [10]:
qurro_log_ratios_df.shape[0] == qarcoal_log_ratios_df.shape[0]

True

That's a good sign, but let's be more rigorous and make sure the samples are the same.

In [11]:
set(qurro_log_ratios_df.index) == set(qarcoal_log_ratios_df.index)

True
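
If this check ever returns `False`, it helps to see which sample IDs differ. A minimal sketch using `pd.Index.difference` on hypothetical IDs:

```python
import pandas as pd

# Hypothetical sample IDs standing in for the two DataFrames' indexes.
qurro_ids = pd.Index(["S1", "S2", "S3"])
qarcoal_ids = pd.Index(["S2", "S3", "S4"])

# IDs present in one result but not in the other.
only_in_qurro = qurro_ids.difference(qarcoal_ids)
only_in_qarcoal = qarcoal_ids.difference(qurro_ids)

print(list(only_in_qurro), list(only_in_qarcoal))
```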

Finally, let's make sure the log-ratios themselves are the same. Note that Qurro calculates log-ratios using JavaScript, while Qarcoal uses Python. As a result, the individual values may differ very slightly because of differences in how the logarithm function is implemented. We will use NumPy's `allclose` to check that the two are equal within a tolerance.

In [12]:
from numpy import allclose

qurro_values = qurro_log_ratios_df.sort_index()['Current_Log_Ratio'].to_numpy()
qarcoal_values = qarcoal_log_ratios_df.sort_index()['log_ratio'].to_numpy()

allclose(qurro_values, qarcoal_values)

True
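
As an alternative to sorting both tables and comparing raw arrays, the two columns can be aligned on sample ID first, which also makes the largest discrepancy easy to inspect. The sketch below uses hypothetical toy values, and the tolerances shown are simply `np.allclose`'s defaults written out explicitly.

```python
import numpy as np
import pandas as pd

# Hypothetical toy values; the two series are deliberately in different orders.
qurro_toy = pd.Series(
    {"S1": -2.143157, "S2": 4.353432, "S3": 3.780425}, name="qurro"
)
qarcoal_toy = pd.Series(
    {"S3": 3.7804250001, "S1": -2.1431570002, "S2": 4.353432}, name="qarcoal"
)

# Align the two series on sample ID, keeping only IDs present in both.
joined = pd.concat([qurro_toy, qarcoal_toy], axis=1, join="inner")

# Largest absolute difference between the paired values.
max_abs_diff = (joined["qurro"] - joined["qarcoal"]).abs().max()

# Same check as before, with the default tolerances spelled out.
matches = np.allclose(joined["qurro"], joined["qarcoal"], rtol=1e-5, atol=1e-8)

print(max_abs_diff, matches)
```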

Success! Our Qarcoal-generated log-ratios are equal to our Qurro-generated ones.

We hope you find Qarcoal useful. Please contact us if you have questions or suggestions about using Qarcoal.