Test different functions to get the core microbiota:

In [1]:
import os
import pandas as pd
from qiime2 import Visualization
import matplotlib.pyplot as plt
import numpy as np

import qiime2 as q2

%matplotlib inline
data_dir = 'CE'

##### Download metadata

In [14]:
! wget -nv -O $data_dir/food-metadata.tsv 'https://polybox.ethz.ch/index.php/s/nEd4l5CWGWGEtae/download'

2022-12-12 08:54:46 URL:https://polybox.ethz.ch/index.php/s/nEd4l5CWGWGEtae/download [42810/42810] -> "CE/food-metadata.tsv" [1]


The `core-features` action identifies "core" features, i.e. features observed in a user-defined fraction of the samples. Because the set of core features depends on the fraction of samples in which a feature must be observed, it is computed over a range of fractions defined by the `min_fraction`, `max_fraction`, and `steps` parameters.
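
To make this definition concrete, here is a minimal sketch that computes core features for a toy table in plain Python. It only illustrates the idea and is not QIIME 2's implementation: the function name `core_features_by_fraction`, the toy table, and the even spacing of the fractions between `min_fraction` and `max_fraction` are assumptions.

```python
def core_features_by_fraction(table, min_fraction=0.5, max_fraction=1.0, steps=11):
    """Return {fraction: sorted list of core feature IDs}.

    `table` maps each feature ID to its per-sample counts (one count per sample).
    A feature counts as observed in a sample when its count is greater than zero.
    """
    n_samples = len(next(iter(table.values())))
    # fraction of samples in which each feature is observed
    prevalence = {
        feature: sum(count > 0 for count in counts) / n_samples
        for feature, counts in table.items()
    }
    # evenly spaced fractions from min_fraction to max_fraction (assumed spacing)
    if steps == 1:
        fractions = [min_fraction]
    else:
        width = (max_fraction - min_fraction) / (steps - 1)
        fractions = [round(min_fraction + i * width, 3) for i in range(steps)]
    return {
        fraction: sorted(f for f, p in prevalence.items() if p >= fraction)
        for fraction in fractions
    }


# toy table: three features observed in 4, 3 and 1 of 4 samples
toy_table = {
    "feature_a": [10, 3, 7, 1],
    "feature_b": [0, 5, 2, 9],
    "feature_c": [0, 0, 4, 0],
}
result = core_features_by_fraction(toy_table, min_fraction=0.5, max_fraction=1.0, steps=3)
print(result)
```

Raising the required fraction can only shrink the core set, which is why the visualizations below report fewer features at higher thresholds.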

#### Workflow
1) Try different parameters to find core features
2) Find core features of all cheeses in our feature table
3) Find core features of Swiss cheeses (in the categories rindtype = natural, rindtype = washed, or style = alpine)
4) Find core features of similar neighboring country cheeses.
5) Compare results of Swiss to neighboring country cheeses.

### 1)

I tried different values for the parameters:

#### 1. Try

Run the command with the default settings (`--p-min-fraction 0.5` is the default value):

In [9]:
! qiime feature-table core-features \
--i-table $data_dir/dada2_table_align_filtered.qza \
--p-min-fraction 0.5 \
--o-visualization $data_dir/core_microbiota.qzv

Usage: qiime feature-table core-features [OPTIONS]

  Identify "core" features, which are features observed in a user-defined
  fraction of the samples. Since the core features are a function of the
  fraction of samples that the feature must be observed in to be considered
  core, this is computed over a range of fractions defined by the
  `min_fraction`, `max_fraction`, and `steps` parameters.

Inputs:
  --i-table ARTIFACT FeatureTable[Frequency]
                       The feature table to use in core features
                       calculations.                                [required]
Parameters:
  --p-min-fraction PROPORTION Range(0.0, 1.0, inclusive_start=False)
                       The minimum fraction of samples that a feature must be
                       observed in for that feature to be considered a core
                       feature.                                 [default: 0.5]

In [2]:
Visualization.load(f'{data_dir}/core_microbiota.qzv')

#### 2. Try

Run the command with a higher min-fraction (0.8):

In [14]:
! qiime feature-table core-features \
--i-table $data_dir/dada2_table_align_filtered.qza \
--p-min-fraction 0.8 \
--o-visualization $data_dir/core_microbiota_2.qzv

Saved Visualization to: CE/core_microbiota_2.qzv

In [3]:
Visualization.load(f'{data_dir}/core_microbiota_2.qzv')

#### 3. Try

Use a different number of steps (5):

In [17]:
! qiime feature-table core-features \
--i-table $data_dir/dada2_table_align_filtered.qza \
--p-min-fraction 0.8 \
--p-steps 5 \
--o-visualization $data_dir/core_microbiota_3.qzv

Saved Visualization to: CE/core_microbiota_3.qzv

In [4]:
Visualization.load(f'{data_dir}/core_microbiota_3.qzv')

#### 4. Try

Use a min-fraction of 0.7 with 10 steps:

In [22]:
! qiime feature-table core-features \
--i-table $data_dir/dada2_table_align_filtered.qza \
--p-min-fraction 0.7 \
--p-steps 10 \
--o-visualization $data_dir/core_microbiota_4.qzv

Saved Visualization to: CE/core_microbiota_4.qzv

In [5]:
Visualization.load(f'{data_dir}/core_microbiota_4.qzv')

### 2)

#### Download tsv file of core features of all cheeses

The TSV file with the list of core features can be downloaded from the visualization above. I downloaded the file and uploaded it to polybox. I set the threshold for the fraction of samples (the fraction of the total number of samples in which a feature must be observed to be considered "core") to 0.7.
Here we import this file from polybox:

In [11]:
! wget -nv -O $data_dir/core_microbiota_list_0.7.tsv 'https://polybox.ethz.ch/index.php/s/WRm86jdxvkxPOVa/download'

2022-12-12 08:53:38 URL:https://polybox.ethz.ch/index.php/s/WRm86jdxvkxPOVa/download [490/490] -> "CE/core_microbiota_list_0.7.tsv" [1]


These are the core features of all cheeses:

In [3]:
df_core_all = pd.read_csv(f'{data_dir}/core_microbiota_list_0.7.tsv', sep ='\t')
df_core_all.set_index('Feature ID', inplace = True)
df_core_all

Feature ID,2%,9%,25%,50%,75%,91%,98%
f50c8ae2717bb99c926c4ab1f2a6135c,4.0,12.0,184.5,1897.5,8307.25,51957.23,88576.02
805c1b3ec3035abbb7b9f1f7f6157e12,0.0,13.0,98.5,741.0,6608.0,21851.37,102426.32
5899b66b70d688d5cd95df5fc7a26e3a,0.0,0.0,8.0,87.0,1019.25,6905.73,28623.58
369232e1ac9f9983056d09b9fe866df5,0.0,0.0,8.0,44.0,400.75,2877.72,12945.78
398e906d9ad1914eb268fda5c7453e09,0.0,3.0,6.0,32.0,1070.25,11938.72,47885.18


Load the QIIME 2 taxonomy artifact as a pandas dataframe, then add its Taxon column to the core feature table:

In [4]:
taxa = q2.Artifact.load(f'{data_dir}/taxonomy_v4.qza')
taxa = taxa.view(pd.DataFrame)

In [5]:
core_all_taxa = df_core_all.join(taxa['Taxon'])
pd.set_option('max_colwidth', 150)
core_all_taxa

Feature ID,2%,9%,25%,50%,75%,91%,98%,Taxon
f50c8ae2717bb99c926c4ab1f2a6135c,4.0,12.0,184.5,1897.5,8307.25,51957.23,88576.02,k__Bacteria; p__Actinobacteria; c__Actinobacteria; o__Actinomycetales; f__Brevibacteriaceae; g__Brevibacterium
805c1b3ec3035abbb7b9f1f7f6157e12,0.0,13.0,98.5,741.0,6608.0,21851.37,102426.32,k__Bacteria; p__Firmicutes; c__Bacilli; o__Bacillales; f__Staphylococcaceae; g__Staphylococcus
5899b66b70d688d5cd95df5fc7a26e3a,0.0,0.0,8.0,87.0,1019.25,6905.73,28623.58,k__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; o__Pseudomonadales; f__Moraxellaceae; g__Psychrobacter
369232e1ac9f9983056d09b9fe866df5,0.0,0.0,8.0,44.0,400.75,2877.72,12945.78,k__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Streptococcaceae; g__Lactococcus; s__
398e906d9ad1914eb268fda5c7453e09,0.0,3.0,6.0,32.0,1070.25,11938.72,47885.18,k__Bacteria
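
The Taxon strings in this table stop at different ranks: one feature is only assigned to the kingdom level, and the Lactococcus feature carries an empty `s__` placeholder. A minimal sketch for extracting the most specific assigned rank, assuming the semicolon-separated `rank__name` format shown above; the helper name `deepest_assigned_rank` is hypothetical.

```python
def deepest_assigned_rank(taxon):
    """Return (rank_prefix, name) of the most specific non-empty rank, or None."""
    deepest = None
    for part in taxon.split(";"):
        prefix, _, name = part.strip().partition("__")
        if name:  # skip empty placeholders such as "s__"
            deepest = (prefix, name)
    return deepest


examples = [
    "k__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Streptococcaceae; g__Lactococcus; s__",
    "k__Bacteria",
]
labels = [deepest_assigned_rank(t) for t in examples]
print(labels)
```

In the notebook, the same helper could be applied to the `Taxon` column, for example with `core_all_taxa['Taxon'].map(deepest_assigned_rank)`.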


### 3) 

Do cheeses from Switzerland share this core microbiome with similar cheeses (e.g., same style/rind type) from neighboring countries?

##### Find core features of CH cheeses with natural rindtype:

Result: 33 core features

In [22]:
! qiime feature-table filter-samples \
--i-table $data_dir/dada2_table_align_filtered.qza \
--m-metadata-file $data_dir/food-metadata.tsv \
--p-where "[country]='Switzerland' AND [rindtype]='natural'" \
--o-filtered-table $data_dir/feature_table_CH_natural.qza

Saved FeatureTable[Frequency] to: CE/feature_table_CH_natural.qza

In [29]:
! qiime feature-table core-features \
--i-table $data_dir/feature_table_CH_natural.qza \
--p-min-fraction 0.7 \
--p-steps 10 \
--o-visualization $data_dir/core_microbiota_CH_natural.qzv

Saved Visualization to: CE/core_microbiota_CH_natural.qzv

In [19]:
Visualization.load(f'{data_dir}/core_microbiota_CH_natural.qzv')
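
Because core features are defined by a fraction of samples, the number of samples left after filtering matters: in a small subset, a feature reaches the 0.7 threshold with only a handful of cheeses. A minimal sketch for counting the samples that match a filter, assuming the metadata is a tab-separated file with `country` and `rindtype` columns (as the `--p-where` clause implies). The helper `count_matching` and the inline rows are made up for illustration.

```python
import csv
import io


def count_matching(metadata_file, **conditions):
    """Count metadata rows whose columns equal all of the given values."""
    reader = csv.DictReader(metadata_file, delimiter="\t")
    return sum(
        all(row.get(column) == value for column, value in conditions.items())
        for row in reader
    )


# made-up rows; in the notebook this could be open(f'{data_dir}/food-metadata.tsv')
toy_metadata = io.StringIO(
    "sample-id\tcountry\trindtype\n"
    "s1\tSwitzerland\tnatural\n"
    "s2\tSwitzerland\twashed\n"
    "s3\tFrance\tnatural\n"
    "s4\tSwitzerland\tnatural\n"
)
n_ch_natural = count_matching(toy_metadata, country="Switzerland", rindtype="natural")
print(n_ch_natural)
```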

##### Find core features of CH cheeses with washed rindtype:

In [31]:
! qiime feature-table filter-samples \
--i-table $data_dir/dada2_table_align_filtered.qza \
--m-metadata-file $data_dir/food-metadata.tsv \
--p-where "[country]='Switzerland' AND [rindtype]='washed'" \
--o-filtered-table $data_dir/feature_table_CH_washed.qza

Saved FeatureTable[Frequency] to: CE/feature_table_CH_washed.qza

In [32]:
! qiime feature-table core-features \
--i-table $data_dir/feature_table_CH_washed.qza \
--p-min-fraction 0.7 \
--p-steps 10 \
--o-visualization $data_dir/core_microbiota_CH_washed.qzv

Saved Visualization to: CE/core_microbiota_CH_washed.qzv

In [5]:
Visualization.load(f'{data_dir}/core_microbiota_CH_washed.qzv')

##### Find core features of CH cheeses with alpine style:

In [35]:
! qiime feature-table filter-samples \
--i-table $data_dir/dada2_table_align_filtered.qza \
--m-metadata-file $data_dir/food-metadata.tsv \
--p-where "[country]='Switzerland' AND [style]='alpine'" \
--o-filtered-table $data_dir/feature_table_CH_alpine.qza

Saved FeatureTable[Frequency] to: CE/feature_table_CH_alpine.qza

In [36]:
! qiime feature-table core-features \
--i-table $data_dir/feature_table_CH_alpine.qza \
--p-min-fraction 0.7 \
--p-steps 10 \
--o-visualization $data_dir/core_microbiota_CH_alpine.qzv

Saved Visualization to: CE/core_microbiota_CH_alpine.qzv

In [6]:
Visualization.load(f'{data_dir}/core_microbiota_CH_alpine.qzv')

### 4)

Filter the table to keep only cheeses from neighboring countries (our dataset contains no cheeses from Germany or Austria):

In [8]:
! qiime feature-table filter-samples \
--i-table $data_dir/dada2_table_align_filtered.qza \
--m-metadata-file $data_dir/food-metadata.tsv \
--p-where "[country]='France' OR [country]='Italy'" \
--o-filtered-table $data_dir/feature_table_neighbor.qza

Saved FeatureTable[Frequency] to: CE/feature_table_neighbor.qza

##### Find core features of neighboring cheeses with natural rindtype:

In [9]:
! qiime feature-table filter-samples \
--i-table $data_dir/feature_table_neighbor.qza \
--m-metadata-file $data_dir/food-metadata.tsv \
--p-where "[rindtype]='natural'" \
--o-filtered-table $data_dir/feature_table_neighbor_natural.qza

Saved FeatureTable[Frequency] to: CE/feature_table_neighbor_natural.qza

In [10]:
! qiime feature-table core-features \
--i-table $data_dir/feature_table_neighbor_natural.qza \
--p-min-fraction 0.7 \
--p-steps 10 \
--o-visualization $data_dir/core_microbiota_neighbor_natural.qzv

Saved Visualization to: CE/core_microbiota_neighbor_natural.qzv

In [11]:
Visualization.load(f'{data_dir}/core_microbiota_neighbor_natural.qzv')

##### Find core features of neighboring cheeses with washed rindtype:

In [12]:
! qiime feature-table filter-samples \
--i-table $data_dir/feature_table_neighbor.qza \
--m-metadata-file $data_dir/food-metadata.tsv \
--p-where "[rindtype]='washed'" \
--o-filtered-table $data_dir/feature_table_neighbor_washed.qza

Saved FeatureTable[Frequency] to: CE/feature_table_neighbor_washed.qza

In [13]:
! qiime feature-table core-features \
--i-table $data_dir/feature_table_neighbor_washed.qza \
--p-min-fraction 0.7 \
--p-steps 10 \
--o-visualization $data_dir/core_microbiota_neighbor_washed.qzv

Saved Visualization to: CE/core_microbiota_neighbor_washed.qzv

In [14]:
Visualization.load(f'{data_dir}/core_microbiota_neighbor_washed.qzv')

##### Find core features of neighboring cheeses with alpine style:

In [15]:
! qiime feature-table filter-samples \
--i-table $data_dir/feature_table_neighbor.qza \
--m-metadata-file $data_dir/food-metadata.tsv \
--p-where "[style]='alpine'" \
--o-filtered-table $data_dir/feature_table_neighbor_alpine.qza

Saved FeatureTable[Frequency] to: CE/feature_table_neighbor_alpine.qza

In [17]:
! qiime feature-table core-features \
--i-table $data_dir/feature_table_neighbor_alpine.qza \
--p-min-fraction 0.7 \
--p-steps 10 \
--o-visualization $data_dir/core_microbiota_neighbor_alpine.qzv

Saved Visualization to: CE/core_microbiota_neighbor_alpine.qzv

In [18]:
Visualization.load(f'{data_dir}/core_microbiota_neighbor_alpine.qzv')
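
The `filter-samples` calls in sections 3) and 4) differ only in their `--p-where` clause. As a sketch of how these clauses could be generated instead of typed by hand, the snippet below builds one combined clause per group and category. It assumes that `--p-where` accepts parenthesised conditions (it is documented as SQLite WHERE-clause syntax); the helper `where_clause` and the dictionary keys are hypothetical, and the notebook itself filters the neighboring cheeses in two steps rather than with one combined clause.

```python
def where_clause(countries, column, value):
    """Build a --p-where string: samples from `countries` with column == value."""
    country_part = " OR ".join(f"[country]='{c}'" for c in countries)
    return f"({country_part}) AND [{column}]='{value}'"


groups = {"CH": ["Switzerland"], "neighbor": ["France", "Italy"]}
categories = [("rindtype", "natural"), ("rindtype", "washed"), ("style", "alpine")]

clauses = {
    f"{group}_{value}": where_clause(countries, column, value)
    for group, countries in groups.items()
    for column, value in categories
}
for name, clause in clauses.items():
    print(name, "->", clause)
```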

### 5) Compare results of core features of CH cheeses with similar cheeses from neighboring countries

1) Add a taxonomy column to each core feature table
2) Extract the list of feature IDs from each table
3) Use Python's set intersection to find the feature IDs shared by both tables
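
The intersection shows which features are shared, but not how many are specific to each group. The following minimal sketch extends the comparison with set differences and the Jaccard index (shared features divided by all features in either set). These extra measures are an addition for illustration and not part of the original analysis; the helper name `compare_core_sets` and the feature IDs are made up.

```python
def compare_core_sets(ids_a, ids_b):
    """Summarise the overlap between two collections of feature IDs."""
    a, b = set(ids_a), set(ids_b)
    shared = a & b
    union = a | b
    return {
        "shared": shared,
        "only_a": a - b,
        "only_b": b - a,
        # Jaccard index: shared / union, 0.0 when both sets are empty
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# made-up feature IDs for illustration
summary = compare_core_sets(["f1", "f2", "f3"], ["f2", "f3", "f4", "f5"])
print(summary)
```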

##### Cheeses with natural rindtype

Download the TSV files with the core features (fraction of samples = 0.7):

In [20]:
! wget -nv -O $data_dir/core_microbiota_list_ch_natural.tsv 'https://polybox.ethz.ch/index.php/s/5ZVUmvDoy1VBTAx/download'

2022-12-12 10:56:55 URL:https://polybox.ethz.ch/index.php/s/5ZVUmvDoy1VBTAx/download [2688/2688] -> "CE/core_microbiota_list_ch_natural.tsv" [1]


In [21]:
! wget -nv -O $data_dir/core_microbiota_list_neighbor_natural.tsv 'https://polybox.ethz.ch/index.php/s/cAEL47rLr8ELoV5/download'

2022-12-12 10:57:38 URL:https://polybox.ethz.ch/index.php/s/cAEL47rLr8ELoV5/download [1254/1254] -> "CE/core_microbiota_list_neighbor_natural.tsv" [1]


Read tsv files into pandas dataframe and add column with taxon:

In [12]:
#core features from CH cheeses with natural rindtype
df_core_ch_nat = pd.read_csv(f'{data_dir}/core_microbiota_list_ch_natural.tsv', sep ='\t')
df_core_ch_nat.set_index('Feature ID', inplace = True)
core_ch_nat_taxa = df_core_ch_nat.join(taxa['Taxon'])
pd.set_option('max_colwidth', 150)
#core_ch_nat_taxa

In [11]:
#core features from neighboring countries with natural rindtype
df_core_nei_nat = pd.read_csv(f'{data_dir}/core_microbiota_list_neighbor_natural.tsv', sep ='\t')
df_core_nei_nat.set_index('Feature ID', inplace = True)
core_nei_nat_taxa = df_core_nei_nat.join(taxa['Taxon'])
pd.set_option('max_colwidth', 150)
#core_nei_nat_taxa

Compare values between the two dataframes created above:

In [8]:
#get list of Feature IDs from core features of CH and neighboring cheeses with natural rindtype and convert list into set
index_list_ch_nat = list(df_core_ch_nat.index.values)
set_ch_nat = set(index_list_ch_nat)
index_list_nei_nat = list(df_core_nei_nat.index.values)
set_nei_nat = set(index_list_nei_nat)

In [9]:
#get set of Feature IDs which are the same in both sets
set_core_nat = set_ch_nat.intersection(set_nei_nat)

print(set_core_nat)

{'d8805a58ee0553d4947a5697b758f581', 'f50c8ae2717bb99c926c4ab1f2a6135c', '5899b66b70d688d5cd95df5fc7a26e3a', '0e0c3a6a9489f3439329d12d76275100', '805c1b3ec3035abbb7b9f1f7f6157e12', '0f47f1d604a3c0c66dd7a771668df459', '2984a873cf9373de5425dd5b5b96c232', '56e99d7158115760f6283fb65ab29bd0', '398e906d9ad1914eb268fda5c7453e09', '369232e1ac9f9983056d09b9fe866df5'}


In [10]:
#build a dataframe indexed by the shared Feature IDs and add their taxonomy
core_nat = pd.DataFrame(index=pd.Index(list(set_core_nat), name='Feature ID'))
core_nat_taxa = core_nat.join(taxa['Taxon'])
core_nat_taxa

Feature ID,Taxon
d8805a58ee0553d4947a5697b758f581,k__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; o__Oceanospirillales; f__Halomonadaceae; g__Halomonas
f50c8ae2717bb99c926c4ab1f2a6135c,k__Bacteria; p__Actinobacteria; c__Actinobacteria; o__Actinomycetales; f__Brevibacteriaceae; g__Brevibacterium
5899b66b70d688d5cd95df5fc7a26e3a,k__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; o__Pseudomonadales; f__Moraxellaceae; g__Psychrobacter
0e0c3a6a9489f3439329d12d76275100,k__Bacteria; p__Actinobacteria; c__Actinobacteria; o__Actinomycetales; f__Micrococcaceae; g__Arthrobacter
805c1b3ec3035abbb7b9f1f7f6157e12,k__Bacteria; p__Firmicutes; c__Bacilli; o__Bacillales; f__Staphylococcaceae; g__Staphylococcus
0f47f1d604a3c0c66dd7a771668df459,k__Bacteria; p__Actinobacteria; c__Actinobacteria; o__Actinomycetales; f__Dermabacteraceae; g__Brachybacterium
2984a873cf9373de5425dd5b5b96c232,k__Bacteria; p__Actinobacteria; c__Actinobacteria; o__Actinomycetales; f__Dermabacteraceae; g__Brachybacterium; s__
56e99d7158115760f6283fb65ab29bd0,k__Bacteria
398e906d9ad1914eb268fda5c7453e09,k__Bacteria
369232e1ac9f9983056d09b9fe866df5,k__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Streptococcaceae; g__Lactococcus; s__
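
The same read, convert and intersect steps are repeated below for the washed-rind and alpine-style cheeses. As a sketch, they could be wrapped in a small helper. It assumes each exported TSV has a `Feature ID` header column, as the `set_index('Feature ID')` calls above imply; the function names are hypothetical.

```python
import csv


def read_feature_ids(path):
    """Return the set of values in the 'Feature ID' column of a TSV file."""
    with open(path, newline="") as handle:
        return {row["Feature ID"] for row in csv.DictReader(handle, delimiter="\t")}


def shared_core_features(path_ch, path_neighbor):
    """Feature IDs that appear in both exported core feature lists."""
    return read_feature_ids(path_ch) & read_feature_ids(path_neighbor)
```

In the notebook it could be called as `shared_core_features(f'{data_dir}/core_microbiota_list_ch_washed.tsv', f'{data_dir}/core_microbiota_list_neighbor_washed.tsv')`.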


##### Cheeses with washed rindtype

In [13]:
! wget -nv -O $data_dir/core_microbiota_list_ch_washed.tsv 'https://polybox.ethz.ch/index.php/s/M5WGsq8gReQGrQq/download'
! wget -nv -O $data_dir/core_microbiota_list_neighbor_washed.tsv 'https://polybox.ethz.ch/index.php/s/uO4l1YWYO91DkxH/download'

2022-12-13 16:22:44 URL:https://polybox.ethz.ch/index.php/s/M5WGsq8gReQGrQq/download [1816/1816] -> "CE/core_microbiota_list_ch_washed.tsv" [1]
2022-12-13 16:22:44 URL:https://polybox.ethz.ch/index.php/s/uO4l1YWYO91DkxH/download [1166/1166] -> "CE/core_microbiota_list_neighbor_washed.tsv" [1]


In [25]:
#core features from CH cheeses with washed rindtype
df_core_ch_was = pd.read_csv(f'{data_dir}/core_microbiota_list_ch_washed.tsv', sep ='\t')
df_core_ch_was.set_index('Feature ID', inplace = True)
core_ch_was_taxa = df_core_ch_was.join(taxa['Taxon'])
pd.set_option('max_colwidth', 150)
#core_ch_was_taxa

In [21]:
#core features from neighboring countries with washed rindtype
df_core_nei_was = pd.read_csv(f'{data_dir}/core_microbiota_list_neighbor_washed.tsv', sep ='\t')
df_core_nei_was.set_index('Feature ID', inplace = True)
core_nei_was_taxa = df_core_nei_was.join(taxa['Taxon'])
pd.set_option('max_colwidth', 150)
#core_nei_was_taxa

In [17]:
#get list of Feature IDs from core features of CH and neighboring cheeses with washed rindtype and convert list into set
index_list_ch_was = list(df_core_ch_was.index.values)
set_ch_was = set(index_list_ch_was)
index_list_nei_was = list(df_core_nei_was.index.values)
set_nei_was = set(index_list_nei_was)

In [18]:
#get set of Feature IDs which are the same in both sets
set_core_was = set_ch_was.intersection(set_nei_was)

print(set_core_was)

{'f50c8ae2717bb99c926c4ab1f2a6135c', '5899b66b70d688d5cd95df5fc7a26e3a', 'da95b61897d9c6cd8b79f052d26a7985', '805c1b3ec3035abbb7b9f1f7f6157e12', '4db7c06da0197e12d5dd8b3dc1418e50', '398e906d9ad1914eb268fda5c7453e09'}


In [19]:
#build a dataframe indexed by the shared Feature IDs and add their taxonomy
core_was = pd.DataFrame(index=pd.Index(list(set_core_was), name='Feature ID'))
core_was_taxa = core_was.join(taxa['Taxon'])
core_was_taxa

Feature ID,Taxon
f50c8ae2717bb99c926c4ab1f2a6135c,k__Bacteria; p__Actinobacteria; c__Actinobacteria; o__Actinomycetales; f__Brevibacteriaceae; g__Brevibacterium
5899b66b70d688d5cd95df5fc7a26e3a,k__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; o__Pseudomonadales; f__Moraxellaceae; g__Psychrobacter
da95b61897d9c6cd8b79f052d26a7985,k__Bacteria; p__Actinobacteria; c__Actinobacteria; o__Actinomycetales; f__Corynebacteriaceae; g__Corynebacterium; s__variabile
805c1b3ec3035abbb7b9f1f7f6157e12,k__Bacteria; p__Firmicutes; c__Bacilli; o__Bacillales; f__Staphylococcaceae; g__Staphylococcus
4db7c06da0197e12d5dd8b3dc1418e50,k__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; o__Oceanospirillales; f__Halomonadaceae; g__Halomonas; s__
398e906d9ad1914eb268fda5c7453e09,k__Bacteria


##### Cheeses with alpine style

In [22]:
! wget -nv -O $data_dir/core_microbiota_list_ch_alpine.tsv 'https://polybox.ethz.ch/index.php/s/f8vVurBBWM740hB/download'
! wget -nv -O $data_dir/core_microbiota_list_neighbor_alpine.tsv 'https://polybox.ethz.ch/index.php/s/k4Yy6aCgH2G2gkT/download'

2022-12-13 16:29:41 URL:https://polybox.ethz.ch/index.php/s/f8vVurBBWM740hB/download [1917/1917] -> "CE/core_microbiota_list_ch_alpine.tsv" [1]
2022-12-13 16:29:42 URL:https://polybox.ethz.ch/index.php/s/k4Yy6aCgH2G2gkT/download [1849/1849] -> "CE/core_microbiota_list_neighbor_alpine.tsv" [1]


In [27]:
#core features from CH cheeses in alpine style
df_core_ch_alp = pd.read_csv(f'{data_dir}/core_microbiota_list_ch_alpine.tsv', sep ='\t')
df_core_ch_alp.set_index('Feature ID', inplace = True)
core_ch_alp_taxa = df_core_ch_alp.join(taxa['Taxon'])
pd.set_option('max_colwidth', 150)
#core_ch_alp_taxa

In [29]:
#core features from neighboring countries in alpine style
df_core_nei_alp = pd.read_csv(f'{data_dir}/core_microbiota_list_neighbor_alpine.tsv', sep ='\t')
df_core_nei_alp.set_index('Feature ID', inplace = True)
core_nei_alp_taxa = df_core_nei_alp.join(taxa['Taxon'])
pd.set_option('max_colwidth', 150)
#core_nei_alp_taxa

In [30]:
#get list of Feature IDs from core features of CH and neighboring cheeses in alpine style and convert list into set
index_list_ch_alp = list(df_core_ch_alp.index.values)
set_ch_alp = set(index_list_ch_alp)
index_list_nei_alp = list(df_core_nei_alp.index.values)
set_nei_alp = set(index_list_nei_alp)

In [31]:
#get set of Feature IDs which are the same in both sets
set_core_alp = set_ch_alp.intersection(set_nei_alp)

print(set_core_alp)

{'f50c8ae2717bb99c926c4ab1f2a6135c', '13abd204fa63efb19248b7c271448d5a', '5899b66b70d688d5cd95df5fc7a26e3a', '805c1b3ec3035abbb7b9f1f7f6157e12', 'da95b61897d9c6cd8b79f052d26a7985', 'fc51328a0e0452be580de099a5b5791a', 'd847672aeae8e53a505ead86563586e4', '398e906d9ad1914eb268fda5c7453e09', '9e9ac50434879829e4bce8eeb1bc4f9c', 'c3e308088f68e1cabfd16c37f5a2307b', '4db7c06da0197e12d5dd8b3dc1418e50', '016557b68d4a86357fc47eab6f903d3f'}
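
With all three shared sets available, they can be intersected once more to see which features are core in both Swiss and neighboring cheeses in every category. A minimal sketch, with the feature IDs copied from the three printed sets in this section so that it runs on its own; in the notebook, `set_core_nat`, `set_core_was` and `set_core_alp` could be used directly. The name `core_everywhere` is hypothetical.

```python
# shared core features per category, copied from the printed sets above
set_core_nat = {
    'd8805a58ee0553d4947a5697b758f581', 'f50c8ae2717bb99c926c4ab1f2a6135c',
    '5899b66b70d688d5cd95df5fc7a26e3a', '0e0c3a6a9489f3439329d12d76275100',
    '805c1b3ec3035abbb7b9f1f7f6157e12', '0f47f1d604a3c0c66dd7a771668df459',
    '2984a873cf9373de5425dd5b5b96c232', '56e99d7158115760f6283fb65ab29bd0',
    '398e906d9ad1914eb268fda5c7453e09', '369232e1ac9f9983056d09b9fe866df5',
}
set_core_was = {
    'f50c8ae2717bb99c926c4ab1f2a6135c', '5899b66b70d688d5cd95df5fc7a26e3a',
    'da95b61897d9c6cd8b79f052d26a7985', '805c1b3ec3035abbb7b9f1f7f6157e12',
    '4db7c06da0197e12d5dd8b3dc1418e50', '398e906d9ad1914eb268fda5c7453e09',
}
set_core_alp = {
    'f50c8ae2717bb99c926c4ab1f2a6135c', '13abd204fa63efb19248b7c271448d5a',
    '5899b66b70d688d5cd95df5fc7a26e3a', '805c1b3ec3035abbb7b9f1f7f6157e12',
    'da95b61897d9c6cd8b79f052d26a7985', 'fc51328a0e0452be580de099a5b5791a',
    'd847672aeae8e53a505ead86563586e4', '398e906d9ad1914eb268fda5c7453e09',
    '9e9ac50434879829e4bce8eeb1bc4f9c', 'c3e308088f68e1cabfd16c37f5a2307b',
    '4db7c06da0197e12d5dd8b3dc1418e50', '016557b68d4a86357fc47eab6f903d3f',
}

# features shared by Swiss and neighboring cheeses in all three categories
core_everywhere = set_core_nat & set_core_was & set_core_alp
print(sorted(core_everywhere))
```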


In [32]:
#build a dataframe indexed by the shared Feature IDs and add their taxonomy
core_alp = pd.DataFrame(index=pd.Index(list(set_core_alp), name='Feature ID'))
core_alp_taxa = core_alp.join(taxa['Taxon'])
core_alp_taxa

Feature ID,Taxon
f50c8ae2717bb99c926c4ab1f2a6135c,k__Bacteria; p__Actinobacteria; c__Actinobacteria; o__Actinomycetales; f__Brevibacteriaceae; g__Brevibacterium
13abd204fa63efb19248b7c271448d5a,k__Bacteria; p__Actinobacteria; c__Actinobacteria; o__Actinomycetales; f__Dermabacteraceae; g__Brachybacterium; s__
5899b66b70d688d5cd95df5fc7a26e3a,k__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; o__Pseudomonadales; f__Moraxellaceae; g__Psychrobacter
805c1b3ec3035abbb7b9f1f7f6157e12,k__Bacteria; p__Firmicutes; c__Bacilli; o__Bacillales; f__Staphylococcaceae; g__Staphylococcus
da95b61897d9c6cd8b79f052d26a7985,k__Bacteria; p__Actinobacteria; c__Actinobacteria; o__Actinomycetales; f__Corynebacteriaceae; g__Corynebacterium; s__variabile
fc51328a0e0452be580de099a5b5791a,k__Bacteria; p__Actinobacteria; c__Actinobacteria; o__Actinomycetales; f__Corynebacteriaceae; g__Corynebacterium; s__stationis
d847672aeae8e53a505ead86563586e4,k__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Aerococcaceae; g__Facklamia; s__
398e906d9ad1914eb268fda5c7453e09,k__Bacteria
9e9ac50434879829e4bce8eeb1bc4f9c,k__Bacteria; p__Actinobacteria; c__Actinobacteria; o__Actinomycetales; f__Brevibacteriaceae; g__Brevibacterium; s__
c3e308088f68e1cabfd16c37f5a2307b,k__Bacteria; p__Actinobacteria; c__Actinobacteria; o__Actinomycetales; f__Yaniellaceae; g__Yaniella; s__
