Test different functions to get the core microbiota:

In [4]:
import os
import pandas as pd
from qiime2 import Visualization
import matplotlib.pyplot as plt
import numpy as np

import qiime2 as q2

%matplotlib inline
data_dir = 'CE'

##### Download metadata

In [14]:
! wget -nv -O $data_dir/food-metadata.tsv 'https://polybox.ethz.ch/index.php/s/nEd4l5CWGWGEtae/download'

2022-12-12 08:54:46 URL:https://polybox.ethz.ch/index.php/s/nEd4l5CWGWGEtae/download [42810/42810] -> "CE/food-metadata.tsv" [1]


Identify "core" features, which are features observed in a user-defined
  fraction of the samples. Since the core features are a function of the
  fraction of samples that the feature must be observed in to be considered
  core, this is computed over a range of fractions defined by the
  `min_fraction`, `max_fraction`, and `steps` parameters.
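
To make this definition concrete, here is a minimal sketch of the idea in plain Python. It is not QIIME 2's implementation; the toy table, the helper `core_features`, and the chosen fractions are hypothetical and only illustrate how the set of core features shrinks as the required fraction grows.

```python
def core_features(table, fraction):
    """Return the features observed (count > 0) in at least `fraction` of samples."""
    core = []
    for feature_id, counts in table.items():
        observed = sum(1 for c in counts if c > 0)
        if observed / len(counts) >= fraction:
            core.append(feature_id)
    return core


# Hypothetical toy table: feature ID -> counts in four samples.
toy_table = {
    "featA": [10, 3, 7, 1],  # observed in 4 of 4 samples
    "featB": [0, 5, 2, 9],   # observed in 3 of 4 samples
    "featC": [0, 0, 4, 0],   # observed in 1 of 4 samples
}

# Evaluate the core set at several fractions, as the command does over a range.
core_by_fraction = {f: core_features(toy_table, f) for f in (0.5, 0.75, 1.0)}
for fraction, features in core_by_fraction.items():
    print(fraction, features)
```

In this toy case `featB` is core at 0.5 and 0.75 but drops out at 1.0, which is why the result has to be reported per fraction rather than as a single list.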

#### Workflow
1) Try different parameters to find core features
2) Find core features of all cheeses in our feature table
3) Find core features of Swiss cheeses (in the categories rindtype = natural, rindtype = washed, or style = alpine)
4) Find core features of similar cheeses from neighboring countries.
5) Compare the results for Swiss cheeses with those for cheeses from neighboring countries.
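
The comparison in step 5 can be sketched with plain set operations. This is only an illustration: the function `compare_core` and the feature IDs are hypothetical, and in practice the two lists would come from the core-feature TSV files of the respective subsets.

```python
def compare_core(core_a, core_b):
    """Summarize the overlap between two lists of core feature IDs."""
    a, b = set(core_a), set(core_b)
    union = a | b
    return {
        "shared": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        # Jaccard index: shared features divided by all features seen in either set.
        "jaccard": len(a & b) / len(union) if union else 0.0,
    }


# Hypothetical feature IDs standing in for real core-feature lists.
swiss_core = ["feat1", "feat2", "feat3"]
neighbor_core = ["feat2", "feat3", "feat4"]

result = compare_core(swiss_core, neighbor_core)
print(result)
```

A Jaccard index close to 1 would indicate that the two groups share most of their core features, while a value close to 0 would indicate largely distinct core sets.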

### 1)

I tried different values for the parameters:

#### 1. Try

Used the function with the default values:

In [9]:
! qiime feature-table core-features \
--i-table $data_dir/dada2_table_align_filtered.qza \
--p-min-fraction 0.5 \
--o-visualization $data_dir/core_microbiota.qzv

Usage: [94mqiime feature-table core-features[0m [OPTIONS]

  Identify "core" features, which are features observed in a user-defined
  fraction of the samples. Since the core features are a function of the
  fraction of samples that the feature must be observed in to be considered
  core, this is computed over a range of fractions defined by the
  `min_fraction`, `max_fraction`, and `steps` parameters.

[1mInputs[0m:
  [94m[4m--i-table[0m ARTIFACT [32mFeatureTable[Frequency][0m
                       The feature table to use in core features
                       calculations.                                [35m[required][0m
[1mParameters[0m:
  [94m--p-min-fraction[0m PROPORTION [32mRange(0.0, 1.0, inclusive_start=False)[0m
                       The minimum fraction of samples that a feature must be
                       observed in for that feature to be considered a core
                       feature.                                 [35m[default: 0.5][0m
  [94

In [4]:
Visualization.load(f'{data_dir}/core_microbiota.qzv')

#### 2. Try

Used the function with a higher min-fraction:

In [14]:
! qiime feature-table core-features \
--i-table $data_dir/dada2_table_align_filtered.qza \
--p-min-fraction 0.8 \
--o-visualization $data_dir/core_microbiota_2.qzv

Saved Visualization to: CE/core_microbiota_2.qzv

In [15]:
Visualization.load(f'{data_dir}/core_microbiota_2.qzv')

#### 3. Try

Using a different steps value:

In [17]:
! qiime feature-table core-features \
--i-table $data_dir/dada2_table_align_filtered.qza \
--p-min-fraction 0.8 \
--p-steps 5 \
--o-visualization $data_dir/core_microbiota_3.qzv

Saved Visualization to: CE/core_microbiota_3.qzv

In [18]:
Visualization.load(f'{data_dir}/core_microbiota_3.qzv')

#### 4. Try

Using a different min-fraction:

In [22]:
! qiime feature-table core-features \
--i-table $data_dir/dada2_table_align_filtered.qza \
--p-min-fraction 0.7 \
--p-steps 10 \
--o-visualization $data_dir/core_microbiota_4.qzv

Saved Visualization to: CE/core_microbiota_4.qzv

In [2]:
Visualization.load(f'{data_dir}/core_microbiota_4.qzv')
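
To see which fractions the attempts above evaluate, the grid can be sketched in plain Python. This assumes the fractions are evenly spaced between `min_fraction` and `max_fraction`, and that `max_fraction` defaults to 1.0 and `steps` to 11; those defaults are not visible in the truncated help output above, so treat them as assumptions. The helper `fraction_grid` is hypothetical.

```python
def fraction_grid(min_fraction, max_fraction=1.0, steps=11):
    """Evenly spaced fractions from min_fraction to max_fraction, inclusive.

    The default values are assumptions about the CLI defaults.
    """
    if steps < 2:
        return [min_fraction]
    width = (max_fraction - min_fraction) / (steps - 1)
    # Round only to keep the printed values readable.
    return [round(min_fraction + i * width, 3) for i in range(steps)]


attempt_3 = fraction_grid(0.8, steps=5)   # --p-min-fraction 0.8 --p-steps 5
attempt_4 = fraction_grid(0.7, steps=10)  # --p-min-fraction 0.7 --p-steps 10
print(attempt_3)
print(attempt_4)
```

Under these assumptions, attempt 3 covers 0.8 to 1.0 in increments of 0.05, and attempt 4 covers 0.7 to 1.0 in ten steps, with 0.7 as the first grid point.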

### 2)

#### Download tsv file

A TSV file with the list of core features can be downloaded from the visualization above. I downloaded the file and put it on polybox. I set the threshold for the fraction of samples (the fraction of the total number of samples in which a feature must be observed to be considered "core") to 0.7.
Here we download this file from polybox:

In [11]:
! wget -nv -O $data_dir/core_microbiota_list_0.7.tsv 'https://polybox.ethz.ch/index.php/s/WRm86jdxvkxPOVa/download'

2022-12-12 08:53:38 URL:https://polybox.ethz.ch/index.php/s/WRm86jdxvkxPOVa/download [490/490] -> "CE/core_microbiota_list_0.7.tsv" [1]


These are the core features of all cheeses:

In [12]:
df_core_all = pd.read_csv(f'{data_dir}/core_microbiota_list_0.7.tsv', sep='\t')
df_core_all

Unnamed: 0,Feature ID,2%,9%,25%,50%,75%,91%,98%
0,f50c8ae2717bb99c926c4ab1f2a6135c,4.0,12.0,184.5,1897.5,8307.25,51957.23,88576.02
1,805c1b3ec3035abbb7b9f1f7f6157e12,0.0,13.0,98.5,741.0,6608.0,21851.37,102426.32
2,5899b66b70d688d5cd95df5fc7a26e3a,0.0,0.0,8.0,87.0,1019.25,6905.73,28623.58
3,369232e1ac9f9983056d09b9fe866df5,0.0,0.0,8.0,44.0,400.75,2877.72,12945.78
4,398e906d9ad1914eb268fda5c7453e09,0.0,3.0,6.0,32.0,1070.25,11938.72,47885.18
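
The `Unnamed: 0` column in the output above suggests that the TSV stores a row index in an unnamed first column. As a hedged sketch, passing `index_col=0` to `pd.read_csv` treats that column as the index instead. The snippet uses two rows and three of the percentile columns copied from the output above; the leading tab in the header is an assumption about how the downloaded file is laid out.

```python
import io

import pandas as pd

# Two rows copied from the output above. The header starting with a tab
# (an unnamed first column) is an assumption about the file layout.
tsv_text = (
    "\tFeature ID\t2%\t50%\t98%\n"
    "0\tf50c8ae2717bb99c926c4ab1f2a6135c\t4.0\t1897.5\t88576.02\n"
    "1\t805c1b3ec3035abbb7b9f1f7f6157e12\t0.0\t741.0\t102426.32\n"
)

# index_col=0 uses the unnamed first column as the row index.
df = pd.read_csv(io.StringIO(tsv_text), sep="\t", index_col=0)

# Feature IDs as a plain list, e.g. for comparing against another core set.
core_ids = df["Feature ID"].tolist()

# Rank core features by their median (50%) frequency.
ranked = df.sort_values("50%", ascending=False)
print(ranked)
```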


### 3) 

Do cheeses from Switzerland share this core microbiome with similar cheeses (e.g., same style/rind type) from neighboring countries?

Find core features of CH cheeses with natural rindtype:

Result: 33 core features

In [22]:
! qiime feature-table filter-samples \
--i-table $data_dir/dada2_table_align_filtered.qza \
--m-metadata-file $data_dir/food-metadata.tsv \
--p-where "[country]='Switzerland' AND [rindtype]='natural'" \
--o-filtered-table $data_dir/feature_table_CH_natural.qza

Saved FeatureTable[Frequency] to: CE/feature_table_CH_natural.qza
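
The `--p-where` clause above keeps only the samples whose metadata match both conditions. The same selection can be sketched in plain Python to check which sample IDs a clause would keep. The metadata rows below are hypothetical (sample IDs and some style values are invented); only the column names `country`, `rindtype`, and `style` come from the commands in this notebook.

```python
import csv
import io

# Hypothetical metadata in the same tab-separated layout as a metadata file.
metadata_tsv = (
    "sample-id\tcountry\trindtype\tstyle\n"
    "s1\tSwitzerland\tnatural\talpine\n"
    "s2\tSwitzerland\twashed\talpine\n"
    "s3\tFrance\tnatural\tbloomy\n"
    "s4\tSwitzerland\tnatural\tblue\n"
)

rows = list(csv.DictReader(io.StringIO(metadata_tsv), delimiter="\t"))

# Equivalent of: [country]='Switzerland' AND [rindtype]='natural'
ch_natural_ids = [
    row["sample-id"]
    for row in rows
    if row["country"] == "Switzerland" and row["rindtype"] == "natural"
]
print(ch_natural_ids)
```

Checking the number of retained samples this way is useful because a core-feature fraction such as 0.7 means something quite different for 5 samples than for 50.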

In [29]:
! qiime feature-table core-features \
--i-table $data_dir/feature_table_CH_natural.qza \
--p-min-fraction 0.7 \
--p-steps 10 \
--o-visualization $data_dir/core_microbiota_CH_natural.qzv

Saved Visualization to: CE/core_microbiota_CH_natural.qzv

In [30]:
Visualization.load(f'{data_dir}/core_microbiota_CH_natural.qzv')

Find core features of CH cheeses with washed rindtype:

In [31]:
! qiime feature-table filter-samples \
--i-table $data_dir/dada2_table_align_filtered.qza \
--m-metadata-file $data_dir/food-metadata.tsv \
--p-where "[country]='Switzerland' AND [rindtype]='washed'" \
--o-filtered-table $data_dir/feature_table_CH_washed.qza

Saved FeatureTable[Frequency] to: CE/feature_table_CH_washed.qza

In [32]:
! qiime feature-table core-features \
--i-table $data_dir/feature_table_CH_washed.qza \
--p-min-fraction 0.7 \
--p-steps 10 \
--o-visualization $data_dir/core_microbiota_CH_washed.qzv

Saved Visualization to: CE/core_microbiota_CH_washed.qzv

In [33]:
Visualization.load(f'{data_dir}/core_microbiota_CH_washed.qzv')

Find core features of CH cheeses with alpine style:

In [35]:
! qiime feature-table filter-samples \
--i-table $data_dir/dada2_table_align_filtered.qza \
--m-metadata-file $data_dir/food-metadata.tsv \
--p-where "[country]='Switzerland' AND [style]='alpine'" \
--o-filtered-table $data_dir/feature_table_CH_alpine.qza

Saved FeatureTable[Frequency] to: CE/feature_table_CH_alpine.qza

In [36]:
! qiime feature-table core-features \
--i-table $data_dir/feature_table_CH_alpine.qza \
--p-min-fraction 0.7 \
--p-steps 10 \
--o-visualization $data_dir/core_microbiota_CH_alpine.qzv

Saved Visualization to: CE/core_microbiota_CH_alpine.qzv

In [37]:
Visualization.load(f'{data_dir}/core_microbiota_CH_alpine.qzv')
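
The three filter/core-features pairs above differ only in the `--p-where` clause and the file names. As a sketch, the commands could be generated from a small table of subsets. The helper `core_feature_commands` and the `subsets` dictionary are hypothetical; the flags and file naming pattern are the ones used in the cells above. The snippet only builds and prints the commands, it does not run them.

```python
import shlex


def core_feature_commands(data_dir, label, where, min_fraction=0.7, steps=10):
    """Build the filter-samples and core-features commands for one subset."""
    filtered = f"{data_dir}/feature_table_{label}.qza"
    filter_cmd = [
        "qiime", "feature-table", "filter-samples",
        "--i-table", f"{data_dir}/dada2_table_align_filtered.qza",
        "--m-metadata-file", f"{data_dir}/food-metadata.tsv",
        "--p-where", where,
        "--o-filtered-table", filtered,
    ]
    core_cmd = [
        "qiime", "feature-table", "core-features",
        "--i-table", filtered,
        "--p-min-fraction", str(min_fraction),
        "--p-steps", str(steps),
        "--o-visualization", f"{data_dir}/core_microbiota_{label}.qzv",
    ]
    return filter_cmd, core_cmd


# Subset label -> metadata WHERE clause, as used in the cells above.
subsets = {
    "CH_natural": "[country]='Switzerland' AND [rindtype]='natural'",
    "CH_washed": "[country]='Switzerland' AND [rindtype]='washed'",
    "CH_alpine": "[country]='Switzerland' AND [style]='alpine'",
}

commands = {
    label: core_feature_commands("CE", label, where)
    for label, where in subsets.items()
}
for filter_cmd, core_cmd in commands.values():
    print(shlex.join(filter_cmd))
    print(shlex.join(core_cmd))
```

Each command list could then be passed to `subprocess.run(cmd, check=True)`, and further subsets (for example, the neighboring countries) would only need another entry in `subsets`.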

In [23]:
! qiime feature-table summarize \
--i-table $data_dir/feature_table_CH_natural.qza \
--o-visualization $data_dir/feature_table_CH_natural.qzv

Saved Visualization to: CE/feature_table_CH_natural.qzv

In [24]:
Visualization.load(f'{data_dir}/feature_table_CH_natural.qzv')

In [17]:
! qiime feature-table summarize --help

Usage: [94mqiime feature-table summarize[0m [OPTIONS]

  Generate visual and tabular summaries of a feature table.

[1mInputs[0m:
  [94m[4m--i-table[0m ARTIFACT [32mFeatureTable[Frequency | RelativeFrequency |[0m
    [32mPresenceAbsence][0m   The feature table to be summarized.          [35m[required][0m
[1mParameters[0m:
  [94m--m-sample-metadata-file[0m METADATA...
    (multiple          The sample metadata.
     arguments will    
     be merged)                                                     [35m[optional][0m
[1mOutputs[0m:
  [94m[4m--o-visualization[0m VISUALIZATION
                                                                    [35m[required][0m
[1mMiscellaneous[0m:
  [94m--output-dir[0m PATH    Output unspecified results to a directory
  [94m--verbose[0m / [94m--quiet[0m  Display verbose output to stdout and/or stderr during
                       execution of this action. Or silence output if
                       execution is success

In [2]:
! qiime feature-table filter-samples --help

Usage: [94mqiime feature-table filter-samples[0m [OPTIONS]

  Filter samples from table based on frequency and/or metadata. Any features
  with a frequency of zero after sample filtering will also be removed. See
  the filtering tutorial on https://docs.qiime2.org for additional details.

[1mInputs[0m:
  [94m[4m--i-table[0m ARTIFACT [32mFeatureTable[Frequency¹ | RelativeFrequency² |[0m
    [32mPresenceAbsence³ | Composition⁴][0m
                       The feature table from which samples should be
                       filtered.                                    [35m[required][0m
[1mParameters[0m:
  [94m--p-min-frequency[0m INTEGER
                       The minimum total frequency that a sample must have to
                       be retained.                               [35m[default: 0][0m
  [94m--p-max-frequency[0m INTEGER
                       The maximum total frequency that a sample can have to
                       be retained. If no value is provided t