# CARA-Metabolomics LC-MS/MS Annotation analysis - HILIC pos

**Author**: Louis-Felix Nothias, Feb 2021

**Objective**: 
- Explore the annotations with feature-based molecular networking (FBMN)
- Check whether SIRIUS annotations are consistent with spectral library matches at the molecular formula (MF) and chemical class levels. The same check can also be run against analogue library matches.

**Additional ideas**

In [1]:
import pandas as pd     
import numpy as np
import altair as alt
from get_stats_annotation import *
from check_annotation import *
from visualize_annotation import *
pd.set_option('mode.chained_assignment', None)

### Prepare input annotation files

In [2]:
# Import the FBMN feature metadata table (tab-separated, feature ID as index)
FBMN = pd.read_csv('input/FBMN/HILICpos_feature_metadata.tsv',  sep='\t', index_col=0, header=0, low_memory=False)

### Let's look at the annotation metadata per tool

The annotation metadata columns originating from a given tool share the same prefix:

**GNPS tools**:

- **Molecular networking** (column prefix: `'GNPS_'`).
- **Spectral library search** (column prefix: `'GNPS_LIB_'`).
- **Spectral library search in analogue mode** (column prefix: `'GNPS_LIBA_'`).
- **Passatutto FDR controlled spectral lib match** (column prefix: `'PASSA_FDR_'`).

**SIRIUS tools**:

- **ZODIAC**: Molecular formula annotation (column prefix: `'SIR_MF_Zod_'`).
- **CSI:FingerID**: Putative structure annotation (column prefix: `'CSI_'`).
- **CANOPUS**: Putative chemical class annotation (column prefix: `'CAN_'`).
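
Because each tool writes its columns under its own prefix, the columns of a table can be grouped per tool in one pass. The sketch below is a minimal illustration on a toy table, not part of the imported helper modules: the function name `count_columns_by_prefix` and the column `GNPS_componentindex` are hypothetical. Since `'GNPS_'` is also the start of `'GNPS_LIB_'` and `'GNPS_LIBA_'`, each column is assigned to the longest prefix that matches.

```python
import pandas as pd

# Prefixes listed above, one per annotation tool.
PREFIXES = ["GNPS_", "GNPS_LIB_", "GNPS_LIBA_", "PASSA_FDR_",
            "SIR_MF_Zod_", "CSI_", "CAN_"]


def count_columns_by_prefix(table, prefixes=PREFIXES):
    """Count columns per prefix, assigning each column to its longest matching prefix."""
    ordered = sorted(prefixes, key=len, reverse=True)
    counts = {prefix: 0 for prefix in prefixes}
    for col in table.columns:
        for prefix in ordered:
            if str(col).startswith(prefix):
                counts[prefix] += 1
                break  # stop at the longest match
    return counts


# Toy table: only the column names matter here.
# 'GNPS_componentindex' is a hypothetical column name used for illustration.
toy = pd.DataFrame(columns=[
    "GNPS_LIB_MQScore", "GNPS_LIBA_class", "GNPS_componentindex",
    "SIR_MF_Zod_ZodiacScore", "CAN_class",
])
prefix_counts = count_columns_by_prefix(toy)
print(prefix_counts)
```

On the real table, `count_columns_by_prefix(FBMN)` would give a quick overview of how many metadata columns each tool contributed.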


#### We can limit to a subset of features

In [3]:
features_of_interest_p_value = pd.read_csv('Input/FBMN/statistical_analysis_results/HILIC_pos.p_value.tsv',  sep='\t', index_col=0, header=0)

print('Features to filter: '+ str(features_of_interest_p_value.shape))
print('Features in the table before: '+ str(FBMN.shape[0]))
FBMN  = FBMN[FBMN.index.isin(features_of_interest_p_value.index)]
print('Features in the table after p value filter: '+ str(FBMN.shape[0]))

Features to filter: (20156, 14)
Features in the table before: 28762
Features in the table after p value filter: 20156
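
The filter above keeps only the rows of `FBMN` whose index (the feature ID) also appears in the index of the p-value table. A minimal sketch of the same pattern on toy data follows; the feature IDs, the `p_value` column name, and the variable names are made up for illustration. It also checks whether any ID in the filter table is absent from the feature table, which here would explain a mismatch between the number of features to filter and the number retained.

```python
import pandas as pd

# Toy feature table indexed by feature ID (values are illustrative only).
features = pd.DataFrame(
    {"GNPS_LIB_MQScore": [0.9, None, 0.8, None]},
    index=pd.Index([101, 102, 103, 104], name="#featureID"),
)

# Toy statistics table; 'p_value' is a hypothetical column name.
tested = pd.DataFrame(
    {"p_value": [0.01, 0.20]},
    index=pd.Index([102, 104], name="#featureID"),
)

# Keep only the features that are present in the statistics table.
kept = features[features.index.isin(tested.index)]

# IDs requested by the filter but missing from the feature table.
missing = tested.index.difference(features.index)

print(f"Features before: {features.shape[0]}")
print(f"Features after filter: {kept.shape[0]}")
print(f"Filter IDs not found: {len(missing)}")
```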


In [4]:
# Print the metadata columns of one annotation tool, selected by column prefix.
def show_metadata_tools(table, metadata_prefix):
    # str() guards against non-string column labels
    metadata = [str(x) for x in table.columns if str(x).startswith(metadata_prefix)]
    print(metadata)

In [5]:
show_metadata_tools(FBMN,'GNPS_LIB_')

['GNPS_LIB_SpectrumID', 'GNPS_LIB_Compound_Name', 'GNPS_LIB_Ion_Source', 'GNPS_LIB_Instrument', 'GNPS_LIB_Compound_Source', 'GNPS_LIB_PI', 'GNPS_LIB_Data_Collector', 'GNPS_LIB_Adduct', 'GNPS_LIB_Precursor_MZ', 'GNPS_LIB_ExactMass', 'GNPS_LIB_Charge', 'GNPS_LIB_CAS_Number', 'GNPS_LIB_Pubmed_ID', 'GNPS_LIB_Smiles', 'GNPS_LIB_INCHI', 'GNPS_LIB_INCHI_AUX', 'GNPS_LIB_Library_Class', 'GNPS_LIB_IonMode', 'GNPS_LIB_UpdateWorkflowName', 'GNPS_LIB_LibraryQualityString', 'GNPS_LIB_SpectrumFile', 'GNPS_LIB_LibraryName', 'GNPS_LIB_MQScore', 'GNPS_LIB_Organism', 'GNPS_LIB_TIC_Query', 'GNPS_LIB_RT_Query', 'GNPS_LIB_MZErrorPPM', 'GNPS_LIB_SharedPeaks', 'GNPS_LIB_MassDiff', 'GNPS_LIB_LibMZ', 'GNPS_LIB_SpecMZ', 'GNPS_LIB_SpecCharge', 'GNPS_LIB_FileScanUniqueID', 'GNPS_LIB_NumberHits', 'GNPS_LIB_tags', 'GNPS_LIB_MoleculeExplorerDatasets', 'GNPS_LIB_MoleculeExplorerFiles', 'GNPS_LIB_InChIKey', 'GNPS_LIB_InChIKey-Planar', 'GNPS_LIB_superclass', 'GNPS_LIB_class', 'GNPS_LIB_subclass']


# Feature-Based Molecular Networking
## General annotation statistics

In [6]:
zodiac_score_thresh=0.7
ionisation_mode='pos'
ppm_error=25

In [7]:
get_stats_annotation(FBMN, zodiac_score_thresh, ionisation_mode, ppm_error)

Features = 20156
 
==== GNPS =====
In networks = 3231
Number of networks = 665
Valid library annotations = 699
Library annotations in analogue mode= 1674
PASSATUTTO FDR-controlled library annotations = 413
PASSATUTTO FDR-controlled library annotations at 20% FDR = 356
PASSATUTTO FDR-controlled library annotations at 10% FDR = 249
 
==== SIRIUS =====
Features with SIRIUS annotation = 4637
SIRIUS ZODIAC MF with ZodiacScore > 0.7 = 3319
CSIFingerID annotations = 2153
CANOPUS annotations = 3140
CANOPUS annotations at the subclass level= 3060
CANOPUS annotations at the subclass level= 2648
CANOPUS annotations at the level 5 = 2020
 
==== General annotation statistics =====
Number of features = 20156
Annotated features = 4857
Annotated features or in network = 5641
Single nodes = 16925
Single nodes and unnnannotated = 14632
 


### View all annotations

In [8]:
get_stats_annotation.final_table_rel
get_stats_annotation.final_table

Unnamed: 0,Annotation tool,Count
0,Features,20156
1,GNPS - in networks,3231
2,GNPS - lib. match,699
3,GNPS - lib. match analogue,1674
4,PASSATUTTO FDR 20%,356
5,PASSATUTTO FDR 10%,249
6,SIRIUS - Annotated features,4637
7,SIRIUS - MF with ZodScore >0.7,3319
8,SIRIUS - structure,2153
9,SIRIUS - chemical class,3140
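
To put the absolute counts in perspective, each count can be expressed as a fraction of the total number of features. The sketch below recomputes such fractions from the counts shown in the table above. It assumes a simple division by the `Features` row; whether `get_stats_annotation.final_table_rel` is computed exactly this way is not shown in this notebook.

```python
import pandas as pd

# Counts copied from the table above.
counts = pd.Series(
    {
        "Features": 20156,
        "GNPS - in networks": 3231,
        "GNPS - lib. match": 699,
        "GNPS - lib. match analogue": 1674,
        "PASSATUTTO FDR 20%": 356,
        "PASSATUTTO FDR 10%": 249,
        "SIRIUS - Annotated features": 4637,
        "SIRIUS - MF with ZodScore >0.7": 3319,
        "SIRIUS - structure": 2153,
        "SIRIUS - chemical class": 3140,
    },
    name="Count",
)

# Fraction of all features covered by each annotation type (assumed definition).
relative = (counts / counts["Features"]).round(3)
print(relative.to_string())
```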


In [9]:
make_barchart(get_stats_annotation.final_table) # Absolute count
make_barchart_rel(get_stats_annotation.final_table_rel)

### View GNPS annotations

In [10]:
make_barchart(get_stats_annotation.table_gnps) # Absolute count
make_barchart_rel(get_stats_annotation.table_gnps_rel)

### View SIRIUS annotations

In [11]:
make_barchart(get_stats_annotation.table_sirius) # Absolute count
make_barchart_rel(get_stats_annotation.table_sirius_rel)

# Look at GNPS/SIRIUS annotation consistency

In [12]:
check_matching_annotations(FBMN, zodiac_score_thresh, ionisation_mode='pos', library_mode = 'reg', canopus_level= 'spec', 
                           cosine=0.7, shared_peaks=6, ppm_error=25)

=== Looking at match between GNPS library in REGULAR mode and SIRIUS annotation ===
Usable GNPS/SIRIUS annotations = 134
Usable GNPS/SIRIUS annot. w. ZodiacScore > 0.7 = 134
Check with CANOPUS SPECIFIC classification levels
 
MF match = 113
MF match score = 111
 
Classified pairs considered = 120
Superclass annotation pairs = 110
Superclass match all = 110, 0.92%
Class annotation pairs = 119
Class match = 108, 0.91%
Subclass annotation pairs = 113
Subclass match all = 103, 0.91%


### Molecular formula annotation consistency between GNPS/SIRIUS

In [13]:
check_matching_annotations.table_matching

Unnamed: 0,Matching level,Count,Relative
0,Usable MF pairs,134,1.07
1,Usable MF pairs w. ZodiacScore>0.7,125,1.0
2,Matching molecular formula,113,0.9
3,Matching molecular w. ZodiacScore>0.7,111,0.89
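
The `Relative` column in the table above is consistent with each count being divided by the number of usable pairs that pass the ZODIAC score threshold (125), which is why the first row exceeds 1. The sketch below reproduces that arithmetic from the counts in the table. The helper name `relative_to_reference` is hypothetical, and the normalization is inferred from the numbers rather than taken from the source of `check_matching_annotations`.

```python
import pandas as pd

# Counts copied from the table above.
mf_counts = pd.Series(
    {
        "Usable MF pairs": 134,
        "Usable MF pairs w. ZodiacScore>0.7": 125,
        "Matching molecular formula": 113,
        "Matching molecular w. ZodiacScore>0.7": 111,
    },
    name="Count",
)


def relative_to_reference(counts, reference_label):
    """Divide every count by the count of the reference row, rounded to 2 decimals."""
    return (counts / counts[reference_label]).round(2)


mf_relative = relative_to_reference(mf_counts, "Usable MF pairs w. ZodiacScore>0.7")
print(pd.DataFrame({"Count": mf_counts, "Relative": mf_relative}))
```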


In [14]:
make_barchart_match(check_matching_annotations.table_matching) ### This one for absolute values
make_barchart_match_rel(check_matching_annotations.table_matching)

### Chemical class annotation consistency between GNPS-SIRIUS/CANOPUS

In [15]:
check_matching_annotations.table_class_matching

Unnamed: 0,Matching level,Count,Relative
0,Available pairs,128,1.07
1,Classified pairs w. ZodiacScore>0.7,120,1.0
2,Matching superclass,110,0.92
3,Matching class,108,0.9
4,Matching subclass,103,0.86


In [16]:
check_matching_annotations.table_class_matching
make_barchart_match(check_matching_annotations.table_class_matching) ### This one for absolute values
make_barchart_match_rel(check_matching_annotations.table_class_matching)

# Additional, detailed views (optional)
## Distribution of correct/incorrect annotations

### View Molecular Formula (only for REGULAR library search)

In [17]:
dist_plot(check_matching_annotations.MF_pairs,'MF_match', zodiac_score_thresh)

### View classification results

In [18]:
# Superclass level
#dist_plot(check_matching_annotations.superclass_match_all_total,'Match_GNPSsuperclass-SIRIUS',zodiac_score_thresh)
dist_plot(check_matching_annotations.class_match_all_total,'Match_GNPSclass-SIRIUS',zodiac_score_thresh)
#dist_plot(check_matching_annotations.subclass_match_all_total,'Match_GNPSsubclass-SIRIUS',zodiac_score_thresh)

### View details of incorrect MF annotations (only for REGULAR library search)

In [19]:
# This is used to display the entire table in the notebook
from IPython.display import display, HTML
show_non_matching_MF = check_matching_annotations.MF_no_match[['SIR_MF_Zod_ZodiacScore','GNPS_LIB_INCHI_MF','SIR_MF_Zod_molecularFormula',
                                                               'GNPS_LIB_Adduct','SIR_MF_Zod_adduct',
                                                               'GNPS_LIB_superclass','CAN_superclass',
                                                               'GNPS_LIB_class','CAN_class',
                                                               'GNPS_LIB_MQScore', 'GNPS_LIB_MZErrorPPM', 'GNPS_LIB_SharedPeaks', 'GNPS_LIB_SpecCharge','GNPS_LIB_SpecMZ']]
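# Sort by precursor m/z, then ZODIAC score (both descending); reassign instead of sorting a slice in place
show_non_matching_MF = show_non_matching_MF.sort_values(['GNPS_LIB_SpecMZ','SIR_MF_Zod_ZodiacScore'], ascending=[False, False])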

display(HTML(show_non_matching_MF.to_html()))

#featureID,SIR_MF_Zod_ZodiacScore,GNPS_LIB_INCHI_MF,SIR_MF_Zod_molecularFormula,GNPS_LIB_Adduct,SIR_MF_Zod_adduct,GNPS_LIB_superclass,CAN_superclass,GNPS_LIB_class,CAN_class,GNPS_LIB_MQScore,GNPS_LIB_MZErrorPPM,GNPS_LIB_SharedPeaks,GNPS_LIB_SpecCharge,GNPS_LIB_SpecMZ
13455,0.998,C39H68O5,C37H68O4,M+H-H2O,[M + Na]+,Lipids and lipid-like molecules,Lipids and lipid-like molecules,Fatty Acyls,Fatty Acyls,0.751713,4.07238,18.0,1.0,599.501
5332,1.0,C30H50O,C25H50N2O3,M+H,[M + H]+,Lipids and lipid-like molecules,Lipids and lipid-like molecules,Prenol lipids,Prenol lipids,0.729634,19.1363,11.0,1.0,427.385
6868,0.508,C20H21N3O4,C15H23N5O7,M+H,[M - H2O + H]+,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,0.947242,1.57495,7.0,1.0,368.16
1322,1.0,C6H13N3O3,C12H23N5O6,2M+H,[M + H3N + H]+,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,0.967737,2.51997,7.0,1.0,351.198
1831,1.0,C6H14N4O2,C12H25N7O4,2M+H,[M + H3N + H]+,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,0.986186,2.79633,7.0,1.0,349.23
12724,0.987,C17H19N5O3,C15H21N5O3,M+H,[M + Na]+,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,0.931251,7.84889,7.0,1.0,342.153
4509,0.752,C16H20N4O4,C13H18FN3O5,M+H,[M + H3N + H]+,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,0.859401,0.641211,8.0,1.0,333.156
12581,0.433,C16H20N4O4,C13H18FN3O5,M+H,[M + H3N + H]+,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,0.859401,7.23652,8.0,1.0,333.153
11691,0.558,C20H21FN2O,C17H17BN2O3,M+H,[M + H3N + H]+,Benzenoids,Benzenoids,Benzene and substituted derivatives,,0.849318,6.47571,7.0,1.0,325.169
2224,0.553,C19H38O4,C17H36N3O3,M+H-H2O,[M - H2O + H]+,Lipids and lipid-like molecules,Lipids and lipid-like molecules,Glycerolipids,Fatty Acyls,0.772679,4.48109,9.0,1.0,313.272
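
Several of the mismatches above pair a different molecular formula with a different adduct, which suggests that both candidates can explain the same precursor m/z. The sketch below checks this for feature 13455 (GNPS: C39H68O5 as M+H-H2O; SIRIUS: C37H68O4 as [M + Na]+) by computing the theoretical m/z of each candidate. It is a minimal illustration: the helper names are hypothetical, only C, H, N and O are supported, and the element masses are standard monoisotopic values entered by hand.

```python
import re

# Monoisotopic masses (Da) of the elements needed for this example.
MONO = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.99491461956}
ELECTRON = 0.00054857991
PROTON = MONO["H"] - ELECTRON             # H+ (hydrogen atom minus one electron)
WATER = 2 * MONO["H"] + MONO["O"]         # H2O
SODIUM_ION = 22.9897692809 - ELECTRON     # Na+ (sodium atom minus one electron)


def monoisotopic_mass(formula):
    """Sum element masses for a simple formula such as 'C39H68O5'."""
    return sum(MONO[element] * int(count or 1)
               for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula))


def ppm(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


observed_mz = 599.501  # GNPS_LIB_SpecMZ of feature 13455

# GNPS library match: C39H68O5 observed as [M+H-H2O]+
gnps_mz = monoisotopic_mass("C39H68O5") + PROTON - WATER
# SIRIUS/ZODIAC annotation: C37H68O4 observed as [M+Na]+
sirius_mz = monoisotopic_mass("C37H68O4") + SODIUM_ION

print(f"GNPS   candidate m/z = {gnps_mz:.4f} ({ppm(observed_mz, gnps_mz):+.1f} ppm)")
print(f"SIRIUS candidate m/z = {sirius_mz:.4f} ({ppm(observed_mz, sirius_mz):+.1f} ppm)")
```

Both candidates fall within the 25 ppm tolerance used in this notebook, so the precursor mass alone cannot decide between them; the fragmentation evidence and the adduct assignment have to be inspected.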


### View details of correct MF annotations (only for REGULAR library search)

In [20]:
# This is used to display the entire table in the notebook
from IPython.display import display, HTML
show_matching_MF = check_matching_annotations.MF_match[['SIR_MF_Zod_ZodiacScore','GNPS_LIB_INCHI_MF','SIR_MF_Zod_molecularFormula',
                                                               'GNPS_LIB_Adduct','SIR_MF_Zod_adduct',
                                                               'GNPS_LIB_superclass','CAN_superclass',
                                                               'GNPS_LIB_class','CAN_class',
                                                               'GNPS_LIB_MQScore', 'GNPS_LIB_MZErrorPPM', 'GNPS_LIB_SharedPeaks', 'GNPS_LIB_SpecCharge','GNPS_LIB_SpecMZ']]
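# Sort by precursor m/z, then ZODIAC score (both descending); reassign instead of sorting a slice in place
show_matching_MF = show_matching_MF.sort_values(['GNPS_LIB_SpecMZ','SIR_MF_Zod_ZodiacScore'], ascending=[False, False])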

display(HTML(show_matching_MF.to_html()))

#featureID,SIR_MF_Zod_ZodiacScore,GNPS_LIB_INCHI_MF,SIR_MF_Zod_molecularFormula,GNPS_LIB_Adduct,SIR_MF_Zod_adduct,GNPS_LIB_superclass,CAN_superclass,GNPS_LIB_class,CAN_class,GNPS_LIB_MQScore,GNPS_LIB_MZErrorPPM,GNPS_LIB_SharedPeaks,GNPS_LIB_SpecCharge,GNPS_LIB_SpecMZ
14884,0.955,C39H68O5,C39H68O5,M+H-H2O,[M - H2O + H]+,Lipids and lipid-like molecules,Lipids and lipid-like molecules,Fatty Acyls,Glycerolipids,0.703515,3.86876,14.0,1.0,599.501
8040,1.0,C22H30Cl2N10,C22H30Cl2N10,M+H,[M + H]+,Benzenoids,Benzenoids,Benzene and substituted derivatives,Benzene and substituted derivatives,0.909024,3.80556,13.0,1.0,505.209
18727,0.915,C21H20O10,C21H20O10,M+H,[M + H]+,Phenylpropanoids and polyketides,Phenylpropanoids and polyketides,Flavonoids,Isoflavonoids,0.919401,2.32521,9.0,1.0,433.112
27617,0.999,C16H30N6O6,C16H30N6O6,M+H,[M + H]+,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,0.728373,12.1849,13.0,1.0,403.225
10967,0.72,C21H22O8,C21H22O8,M+H,[M + H]+,Phenylpropanoids and polyketides,Phenylpropanoids and polyketides,Flavonoids,Flavonoids,0.920958,4.0121,8.0,1.0,403.137
1487,0.999,C21H39NO4,C21H39NO4,M+H,[M + H]+,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,0.901871,4.0383,11.0,1.0,370.294
16892,0.94,C21H30O8S,C21H30O5,M+H-O3S,[M + H]+,Lipids and lipid-like molecules,Lipids and lipid-like molecules,Steroids and steroid derivatives,Steroids and steroid derivatives,0.84673,2.26855,15.0,1.0,363.216
9052,1.0,C17H24N6O3,C17H24N6O3,M+H,[M + H]+,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,0.852654,5.57633,11.0,1.0,361.196
1769,1.0,C21H38O4,C21H38O4,M+H,[M + H]+,Lipids and lipid-like molecules,Lipids and lipid-like molecules,Fatty Acyls,Fatty Acyls,0.822617,3.34995,11.0,1.0,355.283
9163,0.981,C20H21N3O3,C20H21N3O3,M+H,[M + H]+,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,0.881147,2.51305,8.0,1.0,352.165


### View details of incorrect/correct class annotations (valid only for REGULAR library search)

In [21]:
from IPython.display import display, HTML
show_matching_class = check_matching_annotations.class_match_all_total[[
                                                               'GNPS_LIB_superclass','CAN_superclass',
                                                               'GNPS_LIB_class','CAN_class',
                                                               'GNPS_LIB_subclass','CAN_subclass',
                                                               'SIR_MF_Zod_ZodiacScore','SIR_MF_Zod_molecularFormula',
                                                               'GNPS_LIB_Adduct','SIR_MF_Zod_adduct',
                                                               'GNPS_LIB_MQScore', 'GNPS_LIB_MZErrorPPM', 'GNPS_LIB_SharedPeaks',
                                                               'GNPS_LIB_SpecMZ','Match_GNPSsuperclass-SIRIUS',
                                                               'Match_GNPSclass-SIRIUS','Match_GNPSsubclass-SIRIUS']]
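# Sort matches first (superclass, class, subclass), then by precursor m/z and ZODIAC score, all descending
show_matching_class = show_matching_class.sort_values(['Match_GNPSsuperclass-SIRIUS','Match_GNPSclass-SIRIUS','Match_GNPSsubclass-SIRIUS',
                                                       'GNPS_LIB_SpecMZ','SIR_MF_Zod_ZodiacScore'], ascending=[False, False, False, False, False])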

display(HTML(show_matching_class.to_html()))

#featureID,GNPS_LIB_superclass,CAN_superclass,GNPS_LIB_class,CAN_class,GNPS_LIB_subclass,CAN_subclass,SIR_MF_Zod_ZodiacScore,SIR_MF_Zod_molecularFormula,GNPS_LIB_Adduct,SIR_MF_Zod_adduct,GNPS_LIB_MQScore,GNPS_LIB_MZErrorPPM,GNPS_LIB_SharedPeaks,GNPS_LIB_SpecMZ,Match_GNPSsuperclass-SIRIUS,Match_GNPSclass-SIRIUS,Match_GNPSsubclass-SIRIUS
13455,Lipids and lipid-like molecules,Lipids and lipid-like molecules,Fatty Acyls,Fatty Acyls,Lineolic acids and derivatives,Lineolic acids and derivatives,0.998,C37H68O4,M+H-H2O,[M + Na]+,0.751713,4.07238,18.0,599.501,yes,yes,yes
8040,Benzenoids,Benzenoids,Benzene and substituted derivatives,Benzene and substituted derivatives,Halobenzenes,Halobenzenes,1.0,C22H30Cl2N10,M+H,[M + H]+,0.909024,3.80556,13.0,505.209,yes,yes,yes
27617,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,"Amino acids, peptides, and analogues","Amino acids, peptides, and analogues",0.999,C16H30N6O6,M+H,[M + H]+,0.728373,12.1849,13.0,403.225,yes,yes,yes
10967,Phenylpropanoids and polyketides,Phenylpropanoids and polyketides,Flavonoids,Flavonoids,O-methylated flavonoids,O-methylated flavonoids,0.72,C21H22O8,M+H,[M + H]+,0.920958,4.0121,8.0,403.137,yes,yes,yes
1487,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,"Amino acids, peptides, and analogues","Amino acids, peptides, and analogues",0.999,C21H39NO4,M+H,[M + H]+,0.901871,4.0383,11.0,370.294,yes,yes,yes
9052,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,"Amino acids, peptides, and analogues","Amino acids, peptides, and analogues",1.0,C17H24N6O3,M+H,[M + H]+,0.852654,5.57633,11.0,361.196,yes,yes,yes
1769,Lipids and lipid-like molecules,Lipids and lipid-like molecules,Fatty Acyls,Fatty Acyls,Lineolic acids and derivatives,Lineolic acids and derivatives,1.0,C21H38O4,M+H,[M + H]+,0.822617,3.34995,11.0,355.283,yes,yes,yes
9163,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,"Amino acids, peptides, and analogues","Amino acids, peptides, and analogues",0.981,C20H21N3O3,M+H,[M + H]+,0.881147,2.51305,8.0,352.165,yes,yes,yes
1322,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,"Amino acids, peptides, and analogues","Amino acids, peptides, and analogues",1.0,C12H23N5O6,2M+H,[M + H3N + H]+,0.967737,2.51997,7.0,351.198,yes,yes,yes
1831,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,"Amino acids, peptides, and analogues","Amino acids, peptides, and analogues",1.0,C12H25N7O4,2M+H,[M + H3N + H]+,0.986186,2.79633,7.0,349.23,yes,yes,yes
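
The `Match_GNPS...-SIRIUS` columns flag, per classification level, whether the class of the GNPS library hit agrees with the CANOPUS prediction. The sketch below shows one way such a flag could be derived at the class level, using three feature rows taken from the non-matching MF table earlier in this notebook. It assumes a case-insensitive string comparison and uses a hypothetical `unclassified` label for pairs where one side is missing; the actual logic inside `check_matching_annotations` may differ.

```python
import numpy as np
import pandas as pd

# Class annotations of three features copied from the MF mismatch table.
pairs = pd.DataFrame(
    {
        "GNPS_LIB_class": ["Fatty Acyls", "Glycerolipids",
                           "Benzene and substituted derivatives"],
        "CAN_class": ["Fatty Acyls", "Fatty Acyls", None],
    },
    index=pd.Index([13455, 2224, 11691], name="#featureID"),
)

gnps = pairs["GNPS_LIB_class"]
canopus = pairs["CAN_class"]

# A pair is comparable only when both tools provide a class.
both = gnps.notna() & canopus.notna()
same = gnps.fillna("").str.strip().str.lower() == canopus.fillna("").str.strip().str.lower()

# 'unclassified' is a hypothetical label for pairs with a missing class.
status = np.where(~both, "unclassified", np.where(same, "yes", "no"))
pairs["Match_GNPSclass-SIRIUS"] = status

n_pairs = int(both.sum())
n_match = int((both & same).sum())
print(pairs)
print(f"Class annotation pairs = {n_pairs}, class match = {n_match}")
```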


### View details of correct/incorrect class annotations (valid only for ANALOGUE library search)

In [22]:
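# View details of correct/incorrect class annotations (ANALOGUE library search)
# Note: this view is only meaningful after check_matching_annotations has been run in analogue library mode;
# the run in cell In [12] used library_mode='reg'.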
from IPython.display import display, HTML
show_matching_class = check_matching_annotations.class_match_all_total[[
                                                               'GNPS_LIBA_superclass','CAN_superclass',
                                                               'GNPS_LIBA_class','CAN_class',
                                                               'GNPS_LIBA_subclass','CAN_subclass',
                                                               'SIR_MF_Zod_ZodiacScore','SIR_MF_Zod_molecularFormula',
                                                               'GNPS_LIB_Adduct','SIR_MF_Zod_adduct',
                                                               'GNPS_LIB_MQScore', 'GNPS_LIB_MZErrorPPM', 'GNPS_LIB_SharedPeaks',
                                                               'GNPS_LIB_SpecMZ','Match_GNPSsuperclass-SIRIUS',
                                                               'Match_GNPSclass-SIRIUS','Match_GNPSsubclass-SIRIUS']]
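# Sort matches first (superclass, class, subclass), then by precursor m/z and ZODIAC score, all descending
show_matching_class = show_matching_class.sort_values(['Match_GNPSsuperclass-SIRIUS','Match_GNPSclass-SIRIUS','Match_GNPSsubclass-SIRIUS',
                                                       'GNPS_LIB_SpecMZ','SIR_MF_Zod_ZodiacScore'], ascending=[False, False, False, False, False])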

display(HTML(show_matching_class.to_html()))

#featureID,GNPS_LIBA_superclass,CAN_superclass,GNPS_LIBA_class,CAN_class,GNPS_LIBA_subclass,CAN_subclass,SIR_MF_Zod_ZodiacScore,SIR_MF_Zod_molecularFormula,GNPS_LIB_Adduct,SIR_MF_Zod_adduct,GNPS_LIB_MQScore,GNPS_LIB_MZErrorPPM,GNPS_LIB_SharedPeaks,GNPS_LIB_SpecMZ,Match_GNPSsuperclass-SIRIUS,Match_GNPSclass-SIRIUS,Match_GNPSsubclass-SIRIUS
13455,Lipids and lipid-like molecules,Lipids and lipid-like molecules,Fatty Acyls,Fatty Acyls,Lineolic acids and derivatives,Lineolic acids and derivatives,0.998,C37H68O4,M+H-H2O,[M + Na]+,0.751713,4.07238,18.0,599.501,yes,yes,yes
8040,Benzenoids,Benzenoids,Benzene and substituted derivatives,Benzene and substituted derivatives,Halobenzenes,Halobenzenes,1.0,C22H30Cl2N10,M+H,[M + H]+,0.909024,3.80556,13.0,505.209,yes,yes,yes
27617,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,"Amino acids, peptides, and analogues","Amino acids, peptides, and analogues",0.999,C16H30N6O6,M+H,[M + H]+,0.728373,12.1849,13.0,403.225,yes,yes,yes
10967,Phenylpropanoids and polyketides,Phenylpropanoids and polyketides,Flavonoids,Flavonoids,O-methylated flavonoids,O-methylated flavonoids,0.72,C21H22O8,M+H,[M + H]+,0.920958,4.0121,8.0,403.137,yes,yes,yes
1487,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,"Amino acids, peptides, and analogues","Amino acids, peptides, and analogues",0.999,C21H39NO4,M+H,[M + H]+,0.901871,4.0383,11.0,370.294,yes,yes,yes
9052,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,"Amino acids, peptides, and analogues","Amino acids, peptides, and analogues",1.0,C17H24N6O3,M+H,[M + H]+,0.852654,5.57633,11.0,361.196,yes,yes,yes
1769,Lipids and lipid-like molecules,Lipids and lipid-like molecules,Fatty Acyls,Fatty Acyls,Lineolic acids and derivatives,Lineolic acids and derivatives,1.0,C21H38O4,M+H,[M + H]+,0.822617,3.34995,11.0,355.283,yes,yes,yes
9163,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,"Amino acids, peptides, and analogues","Amino acids, peptides, and analogues",0.981,C20H21N3O3,M+H,[M + H]+,0.881147,2.51305,8.0,352.165,yes,yes,yes
1322,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,"Amino acids, peptides, and analogues","Amino acids, peptides, and analogues",1.0,C12H23N5O6,2M+H,[M + H3N + H]+,0.967737,2.51997,7.0,351.198,yes,yes,yes
1831,Organic acids and derivatives,Organic acids and derivatives,Carboxylic acids and derivatives,Carboxylic acids and derivatives,"Amino acids, peptides, and analogues","Amino acids, peptides, and analogues",1.0,C12H25N7O4,2M+H,[M + H3N + H]+,0.986186,2.79633,7.0,349.23,yes,yes,yes
