# Virtually metabolize GNPS annotations and prepare for Network Annotation Propagation or SIRIUS

Made by Louis-Felix Nothias (UC San Diego), louisfelix.nothias@gmail.com. Started in 2018 and improved in May 2021.

This notebook downloads the spectral annotation results of a classical or feature-based molecular networking job from GNPS [[http://gnps.ucsd.edu](http://gnps.ucsd.edu)] and generates virtual metabolites with either SyGMa or BioTransformer. The resulting candidates can be used for [Network Annotation Propagation](https://ccms-ucsd.github.io/GNPSDocumentation/nap/) on GNPS or with [SIRIUS](https://boecker-lab.github.io/docs.sirius.github.io/install/).

> Start by running the cell below to import the libraries.

In [2]:
import sys
sys.path.append('gnps_postprocessing/lib')
sys.path.append('src')
from gnps_download_results import *
from consolidate_structures import *
from gnps_results_postprocess import *
from prepare_virtual_metabolization import *
from run_virtual_metabolization import *

## Mandatory - Download annotation from the GNPS job
 
> Replace the job ID in the cell below (`job_id`, first line) with the ID of your GNPS molecular networking job. Both classical molecular networking and feature-based molecular networking (FBMN) jobs are supported.

You can try the classical MN job from this paper, https://pubs.acs.org/doi/10.1021/acs.analchem.8b05854, with the ID `'bbee697a63b1400ea585410fafc95723'`.

Another test job, for feature-based molecular networking (FBMN), is `'e78a8c8f429a46fcb24f3b34d69aff25'`.
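
The job ID is the token that follows `task=` in the GNPS job URL. As a minimal sketch, the helper below rebuilds the status and download links in the form printed by the download cell. The function name `gnps_job_urls` is hypothetical and not part of this notebook's library, the 32-character hexadecimal check is an assumption based on the two example IDs, and the `view` value is the one shown for the classical job.

```python
import re

GNPS_BASE = "https://gnps.ucsd.edu/ProteoSAFe"

def gnps_job_urls(job_id):
    """Return the status and download URLs for a GNPS job ID (hypothetical helper)."""
    # Assumption: job IDs are 32 lowercase hexadecimal characters, like the examples.
    if not re.fullmatch(r"[0-9a-f]{32}", job_id):
        raise ValueError(f"Unexpected GNPS job ID: {job_id!r}")
    return {
        "status": f"{GNPS_BASE}/status.jsp?task={job_id}",
        "download": f"{GNPS_BASE}/DownloadResult?task={job_id}&view=view_all_annotations_DB",
    }

urls = gnps_job_urls("bbee697a63b1400ea585410fafc95723")
print(urls["status"])
print(urls["download"])
```

Checking the ID before downloading catches the common mistake of pasting the whole URL instead of the ID.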

In [3]:
job_id = 'bbee697a63b1400ea585410fafc95723'

gnps_download_results(job_id, output_folder ='all_annotations')

This is the GNPS job link: https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=bbee697a63b1400ea585410fafc95723
Downloading the following content: https://gnps.ucsd.edu/ProteoSAFe/DownloadResult?task=bbee697a63b1400ea585410fafc95723&view=view_all_annotations_DB


  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 29.5M    0 29.5M    0     0  2860k      0 --:--:--  0:00:10 --:--:-- 3299k


GNPS job results were succesfully downloaded as: all_annotations.zip
GNPS job results were succesfully extracted into the folder: all_annotations
   CLASSICAL MOLECULAR NETWORKING job detected
      199 spectral library annotations in the job.
      9643 nodes in the network (including single nodes)


In [4]:
gnps_download_results.df_annotations.head(2)

Unnamed: 0,#Scan#,Adduct,CAS_Number,Charge,Compound_Name,Compound_Source,Data_Collector,ExactMass,FileScanUniqueID,INCHI,...,RT_Query,SharedPeaks,Smiles,SpecCharge,SpecMZ,SpectrumFile,SpectrumID,TIC_Query,UpdateWorkflowName,tags
0,100631,[M+H]+,,1,MoNA:594132 Octocrylene,isolated,MoNA,0.0,spectra/specs_ms.pklbin100631,InChI=1S/C24H27NO2/c1-3-5-12-19(4-2)18-27-24(2...,...,73.567,5,,0,362.208,spectra/specs_ms.pklbin,CCMSLIB00000566191,1910.37,UPDATE-SINGLE-ANNOTATED-BRONZE,
1,100637,[M+H]+,,1,MoNA:594132 Octocrylene,isolated,MoNA,0.0,spectra/specs_ms.pklbin100637,InChI=1S/C24H27NO2/c1-3-5-12-19(4-2)18-27-24(2...,...,433.325,7,,0,362.212,spectra/specs_ms.pklbin,CCMSLIB00000566191,31304.2,UPDATE-SINGLE-ANNOTATED-BRONZE,


## Mandatory - Consolidate structure identifiers

> Run the cell below to have a complete set of Smiles and InChI for the annotations.

**IMPORTANT: Note that only spectral annotations that have a valid InChI or SMILES identifier will be considered downstream. If the annotations you are interested in do not have an identifier in the library, go back to the GNPS library entry, update the entry by adding an identifier, and rerun your GNPS job.**
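
The rule above can be sketched without any chemistry toolkit: prefer the SMILES, fall back to the InChI, and drop the annotation when neither is present. This is only an illustration of the selection logic, not the notebook's `consolidate_and_convert_structures`, which also parses each identifier into a mol object. The helper names and the list of missing-value placeholders are assumptions, and the records are toy data (ethanol and methanol).

```python
# Assumption: these placeholders mean "no identifier" in the annotation table.
MISSING = {"", "n/a", "na", "nan"}

def is_present(value):
    """True when a field holds something other than a missing-value placeholder."""
    return value is not None and str(value).strip().lower() not in MISSING

def pick_identifier(record, smiles_key="Smiles", inchi_key="INCHI"):
    """Return (kind, value) for the first usable identifier, or None if there is none."""
    if is_present(record.get(smiles_key)):
        return "smiles", str(record[smiles_key]).strip()
    if is_present(record.get(inchi_key)):
        return "inchi", str(record[inchi_key]).strip()
    return None

# Toy records for illustration only.
records = [
    {"Compound_Name": "toy A", "Smiles": "CCO", "INCHI": "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"},
    {"Compound_Name": "toy B", "Smiles": "N/A", "INCHI": "InChI=1S/CH4O/c1-2/h2H,1H3"},
    {"Compound_Name": "toy C", "Smiles": " ", "INCHI": None},
]

picked = {rec["Compound_Name"]: pick_identifier(rec) for rec in records}
discarded = [name for name, choice in picked.items() if choice is None]
print(picked)
print("Discarded:", discarded)
```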

In [5]:
df_annotations_consolidated  = consolidate_and_convert_structures(gnps_download_results.df_annotations, prefix='', 
                                                                  smiles='Smiles', inchi='INCHI')

 ==== Consolidating structures from SMILES and/or InChI ====
Both SMILES and InChI were inputted
Converting SMILES to mol object
  Salt(s) deleted in       : CC(C)N=C(N)N=C(N)Nc1ccc(Cl)cc1.Cl
  Remaining residue        : CC(C)N=C(N)N=C(N)Nc1ccc(Cl)cc1
  Salt(s) deleted in       : C12(=CC=C(C=C1COC2(CCCN(C)C)C3=CC=C(C=C3)F)C#N).OC(=O)C(=O)O
  Remaining residue        : CN(C)CCCC1(c2ccc(F)cc2)OCc2cc(C#N)ccc21
Succesfully converted to mol object: 110
Exception to the parsing: 0
Not available: 90
Converting INCHI to mol object
Succesfully converted to mol object: 104
Exception to the parsing: 0
Not available: 96
Consolidating the lists
Total mol object from the list 1 = 110
Mol object consolidated from list 2 = 12
Consolidated structures = 122
Converting mol objects to SMILES iso
Converting mol objects to SMILES
Converting mol objects to InChI
Converting mol objects to InChIKey
End


In [6]:
# We keep only annotations with a structure identifier

df_annotations = get_info_gnps_annotations(df_annotations_consolidated, 
                          inchi_column='Consol_InChI', 
                          smiles_column = 'Consol_SMILES', 
                          smiles_planar_column='Consol_SMILES_iso')

200 annotations detected
that corresponds to 61 unique stereostructures
that corresponds to 58 unique planar structures
78 annotations dont have a structure identifier and will be discarded from downstream processing, unless you do the following:
You can either update the GNPS library and rerun the GNPS job. Or you can provide a structure identifier in the dedicated cell below
These are the compounds without structure identifiers:
'(+)-Catechin',
'(.+/-.)-8-Hydroxy-5Z,9E,11Z,14Z,17Z-eicosapentaenoic acid',
'2-(Cyclohexylamino)ethanesulfonic acid',
'3,4'-Dimethoxy-2-hydroxychalcone',
'4-Hydroxy-4'-methyldiphenylamine',
'B10A30 Faulkner legacy library looks like sterol or lipid needs to be verified',
'Benzalkonium chloride (C12)',
'Betulinic acid',
'Betulinic acid methyl ester',
'Conjugated linoleic Acid (10E,12Z)',
'Dioctyl phthalate',
'Lyso-PC(16:0)',
'Procyanidin B2',
'Spectral Match to (+)-Catechin from NIST14',
'Spectral Match to (-)-Epicatechin from NIST14',
'Spectral Match to 1-He

### [Advanced optional feature - Recommended to ignore] - Filter annotations based on compound name

If you want to apply this filter, convert the cell type from raw to code. Otherwise, skip the following cells.

#### Optional - Display compound name

#### Optional - Select compound name to keep

Replace the compound names in the list `compound_name_to_keep`
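
As a rough sketch of what such a list might look like and how a name filter could behave, the snippet below works on plain dictionaries. The matching rule (exact match, ignoring case and surrounding spaces) is an assumption, `filter_by_name` is a hypothetical helper, and the notebook's own `df_annotations_filtering` may match names differently.

```python
# Example names in the style of the Compound_Name column; edit to suit your job.
compound_name_to_keep = ["Escitalopram Oxalate", "massbank: dextromethorphan"]

def filter_by_name(records, names, key="Compound_Name"):
    """Keep records whose name equals one of `names`, ignoring case and outer spaces."""
    wanted = {name.strip().lower() for name in names}
    return [rec for rec in records if (rec.get(key) or "").strip().lower() in wanted]

# Toy annotation records for illustration only.
annotations = [
    {"Compound_Name": "Escitalopram Oxalate"},
    {"Compound_Name": "Massbank: Dextromethorphan "},
    {"Compound_Name": "Dioctyl phthalate"},
    {"Compound_Name": None},
]

kept = filter_by_name(annotations, compound_name_to_keep)
print([rec["Compound_Name"] for rec in kept])
```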


### [Advanced optional feature - Recommended to ignore]  - Filter annotations based on tags

If you want to apply this filter, convert the cell type from raw to code.

#### Optional - Display tags-annotations

#### Optional - Select tags to keep

Specify the tags in the list `tags_to_keep`

## Mandatory - Apply filter (if any were set)
If you haven't selected a filter, run this cell anyway.

In [7]:
# Check whether the optional filter lists were defined, and filter accordingly.

try: compound_name_to_keep
except NameError: compound_name_to_keep = None

try: tags_to_keep
except NameError: tags_to_keep = None

if compound_name_to_keep is None and tags_to_keep is None:
    df_annotations_filtered = df_annotations
    print('No Compound_Name or Tags filter were used')

elif compound_name_to_keep and tags_to_keep is None:
    df_annotations_filtered = df_annotations_filtering(df_annotations, compound_name=compound_name_to_keep)
    print('Compound name filtering applied')

elif compound_name_to_keep is None and tags_to_keep:
    df_annotations_filtered = df_annotations_filtering(df_annotations, tags=tags_to_keep)
    print('Tag(s) filtering applied')

elif compound_name_to_keep and tags_to_keep:
    df_annotations_filtered = df_annotations_filtering(df_annotations, compound_name=compound_name_to_keep, tags=tags_to_keep)
    print('Compound name and tags filtering applied')

else:
    # Reached when a filter list exists but is empty; stop here rather than fail on the next line.
    raise ValueError('compound_name_to_keep and tags_to_keep must be non-empty lists or left undefined')
    
print('Number of annotations after filtering = '+str(df_annotations_filtered.shape[0]))

No Compound_Name or Tags filter were used
Number of annotations after filtering = 122


## Mandatory - Choose between planar or stereochemical SMILES

### [RECOMMENDED] Use the planar SMILES for virtual metabolization (no stereochemistry specified)

Run the cell below to use planar structures, and ignore the cell after it. This is recommended because planar structures reflect the level of confidence that computational mass spectrometry annotation can achieve, and they limit the number of candidates to compute.
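
One way to see why planar structures reduce the number of candidates: stereoisomers share the first 14-letter block of the standard InChIKey, which encodes connectivity but not stereochemistry. The sketch below groups keys by that block. This is a different technique from the one the notebook uses (it deduplicates on planar SMILES), and the helper names are hypothetical. The first and third keys appear in this notebook's tables; the second is an assumed stereo-free key for the same skeleton, built from the same first block and the standard no-stereo second block.

```python
def connectivity_block(inchikey):
    """Return the first InChIKey block (14 letters), which ignores stereochemistry."""
    block = inchikey.split("-")[0]
    if len(block) != 14:
        raise ValueError(f"Not a standard InChIKey: {inchikey!r}")
    return block

def unique_planar(inchikeys):
    """Keep the first key seen for each connectivity block, preserving input order."""
    seen = {}
    for key in inchikeys:
        seen.setdefault(connectivity_block(key), key)
    return list(seen.values())

keys = [
    "PFTAWBLQPZVEMU-ZFWWWQNUSA-N",  # stereo-defined entry from the table
    "PFTAWBLQPZVEMU-UHFFFAOYSA-N",  # assumed stereo-free key, same skeleton
    "IEDVJHCEMCRBQM-UHFFFAOYSA-N",  # a different compound
]

planar = unique_planar(keys)
print(len(keys), "keys ->", len(planar), "unique planar structures")
```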

In [8]:
use_planar_structure = True  # set to False to keep stereochemistry
prepare_for_virtual_metabolization(df_annotations_filtered,
                                    smiles_column = 'Consol_SMILES', 
                                    smiles_planar_column='Consol_SMILES_iso',
                                    drop_duplicated_structure = True, 
                                    use_planar_structure=use_planar_structure)

Number of spectral library annotations = 122
Number of spectral annotations with planar SMILES/InChI = 122
Number of unique planar SMILES considered = 57


Unnamed: 0,#Scan#,Adduct,CAS_Number,Charge,Compound_Name,Compound_Source,Data_Collector,ExactMass,FileScanUniqueID,INCHI,...,SpecMZ,SpectrumFile,SpectrumID,TIC_Query,UpdateWorkflowName,tags,Consol_SMILES_iso,Consol_SMILES,Consol_InChIKey,Consol_InChI
29,134880,M-H2O+H,,1,"NCGC00347704-02_C24H32O7_2H-Oxireno[1,10a]phen...",isolated,lfnothias,432.215,spectra/specs_ms.pklbin134880,InChI=1S/C24H32O7/c1-11-19-14(30-21(11)27)9-16...,...,415.212,spectra/specs_ms.pklbin,CCMSLIB00000853048,249889.0,UPDATE-SINGLE-ANNOTATED-GOLD,,CC(=O)OC1CC(OC(C)=O)C2(C)C(CCC34OC3C3=C(C)C(=O...,CC(=O)O[C@H]1C[C@H](OC(C)=O)[C@@]2(C)[C@@H]3C[...,YOELDOOOBJSHSZ-SRFZOMHBSA-N,InChI=1S/C24H32O7/c1-11-19-14(30-21(11)27)9-16...
176,67543,[M+H],219861-08-2,1,Escitalopram Oxalate,Commercial,Garg_Neha,324.164,spectra/specs_ms.pklbin67543,,...,325.171,spectra/specs_ms.pklbin,CCMSLIB00000078645,21118.8,UPDATE-SINGLE-ANNOTATED-GOLD,,CN(C)CCCC1(c2ccc(F)cc2)OCc2cc(C#N)ccc21,CN(C)CCCC1(c2ccc(F)cc2)OCc2cc(C#N)ccc21,WSEQXVZVJXJVFP-UHFFFAOYSA-N,InChI=1S/C20H21FN2O/c1-23(2)11-3-10-20(17-5-7-...
35,136809,M+H,,1,"NCGC00385223-01!(2R,3S)-7-[(2S,3R,4R)-3,4-dihy...",isolated,lfnothias,422.121,spectra/specs_ms.pklbin136809,InChI=1S/C20H22O10/c21-7-20(27)8-28-19(18(20)2...,...,423.128,spectra/specs_ms.pklbin,CCMSLIB00000848636,63965.8,UPDATE-SINGLE-ANNOTATED-GOLD,,OCC1(O)COC(Oc2cc(O)c3c(c2)OC(c2ccc(O)c(O)c2)C(...,OC[C@@]1(O)CO[C@@H](Oc2cc(O)c3c(c2)O[C@H](c2cc...,OZEIRJMOKXRLCR-AXDKOMKPSA-N,InChI=1S/C20H22O10/c21-7-20(27)8-28-19(18(20)2...
36,136810,M+H,,1,"NCGC00347390-02!(2R,3S)-7-[(2S,3R,4R,5S)-3,4-d...",isolated,lfnothias,422.121,spectra/specs_ms.pklbin136810,InChI=1S/C20H22O10/c21-7-16-17(26)18(27)20(30-...,...,423.128,spectra/specs_ms.pklbin,CCMSLIB00000847500,246707.0,UPDATE-SINGLE-ANNOTATED-GOLD,,OCC1OC(Oc2cc(O)c3c(c2)OC(c2ccc(O)c(O)c2)C(O)C3...,OC[C@@H]1O[C@@H](Oc2cc(O)c3c(c2)O[C@H](c2ccc(O...,JRAAEKBJXQXXBZ-DAJYORATSA-N,InChI=1S/C20H22O10/c21-7-16-17(26)18(27)20(30-...
161,45408,[M+H]+,738-70-5,1,"Massbank:EA019905 Trimethoprim|2,4-Diamino-5-(...",Isolated,Massbank,0.0,spectra/specs_ms.pklbin45408,InChI=1S/C14H18N4O3/c1-19-10-5-8(6-11(20-2)12(...,...,291.145,spectra/specs_ms.pklbin,CCMSLIB00000207444,37062.6,UPDATE-SINGLE-ANNOTATED-BRONZE,,COc1cc(Cc2cnc(N)nc2N)cc(OC)c1OC,COc1cc(Cc2cnc(N)nc2N)cc(OC)c1OC,IEDVJHCEMCRBQM-UHFFFAOYSA-N,InChI=1S/C14H18N4O3/c1-19-10-5-8(6-11(20-2)12(...
122,34949,[M+H]+,77-93-0,1,Massbank:UF416901 Triethyl citrate|triethyl 2-...,Isolated,Massbank,0.0,spectra/specs_ms.pklbin34949,"InChI=1S/C12H20O7/c1-4-17-9(13)7-12(16,11(15)1...",...,277.128,spectra/specs_ms.pklbin,CCMSLIB00000223815,11458.5,UPDATE-SINGLE-ANNOTATED-BRONZE,,CCOC(=O)CC(O)(CC(=O)OCC)C(=O)OCC,CCOC(=O)CC(O)(CC(=O)OCC)C(=O)OCC,DOOTYTYQINUNNV-UHFFFAOYSA-N,"InChI=1S/C12H20O7/c1-4-17-9(13)7-12(16,11(15)1..."
152,43911,[M+H]+,35323-91-2,1,Massbank:PR100263 (+)-Epicatechin|EpCt-pl|ent-...,Isolated,Massbank,0.0,spectra/specs_ms.pklbin43911,InChI=1S/C15H14O6/c16-8-4-11(18)9-6-13(20)15(2...,...,291.086,spectra/specs_ms.pklbin,CCMSLIB00000223090,1920.17,UPDATE-SINGLE-ANNOTATED-BRONZE,,Oc1cc(O)c2c(c1)OC(c1ccc(O)c(O)c1)C(O)C2,Oc1cc(O)c2c(c1)O[C@@H](c1ccc(O)c(O)c1)[C@@H](O)C2,PFTAWBLQPZVEMU-ZFWWWQNUSA-N,InChI=1S/C15H14O6/c16-8-4-11(18)9-6-13(20)15(2...
139,43219,M+H,495-91-0,1,"NCGC00094872-09_C17H19NO3_(2E,4E)-5-(1,3-Benzo...",isolated,lfnothias,285.137,spectra/specs_ms.pklbin43219,InChI=1S/C17H19NO3/c19-17(18-10-4-1-5-11-18)7-...,...,286.144,spectra/specs_ms.pklbin,CCMSLIB00000850601,22402.5,UPDATE-SINGLE-ANNOTATED-GOLD,,O=C(C=CC=Cc1ccc2c(c1)OCO2)N1CCCCC1,O=C(/C=C/C=C/c1ccc2c(c1)OCO2)N1CCCCC1,MXXWOMGUGJBKIW-YPCIICBESA-N,InChI=1S/C17H19NO3/c19-17(18-10-4-1-5-11-18)7-...
25,126417,M+H,,0,Sorbitane Monopalmitate - Polysorbate 40 in-so...,Lysate,AMelnik,0.0,spectra/specs_ms.pklbin126417,,...,402.358,spectra/specs_ms.pklbin,CCMSLIB00000577482,209069.0,UPDATE-SINGLE-ANNOTATED-BRONZE,Nonionic Surfactant[Source Environment],CCCCCCCCCCCCCCCC(=O)OCC(O)C1OCC(O)C1O,CCCCCCCCCCCCCCCC(=O)OCC(O)C1OCC(O)C1O,IYFATESGLOUGBX-UHFFFAOYSA-N,InChI=1S/C22H42O6/c1-2-3-4-5-6-7-8-9-10-11-12-...
120,33632,[M+H]+,,1,Massbank: Dextromethorphan,Isolated,Massbank,0.0,spectra/specs_ms.pklbin33632,InChI=1S/C18H25NO/c1-19-10-9-18-8-4-3-5-15(18)...,...,272.201,spectra/specs_ms.pklbin,CCMSLIB00000206093,1963.7,UPDATE-SINGLE-ANNOTATED-BRONZE,,COc1ccc2c(c1)C13CCCCC1C(C2)N(C)CC3,COc1ccc2c(c1)C13CCCCC1C(C2)N(C)CC3,MKXZASYAUGDDCJ-UHFFFAOYSA-N,InChI=1S/C18H25NO/c1-19-10-9-18-8-4-3-5-15(18)...


## Optional - Manually add candidate structures

Convert the cell from markdown to raw if you want to use it.

This cell appends structures to the virtual metabolization batch.

Manually append pairs of compound names and SMILES; the order must match in both lists.
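
A minimal sketch of that pairing rule, using plain lists: the helper refuses to append when the two lists differ in length, which is the usual way names and SMILES drift out of order. `append_candidates` is a hypothetical helper, not part of this notebook's library, and the two appended compounds (caffeine and paracetamol) are illustrative.

```python
def append_candidates(names, smiles, extra_names, extra_smiles):
    """Return new name and SMILES lists with the extra pairs appended (inputs untouched)."""
    if len(extra_names) != len(extra_smiles):
        raise ValueError("Each compound name needs exactly one SMILES, in the same order.")
    return list(names) + list(extra_names), list(smiles) + list(extra_smiles)

# Existing batch (one entry taken from the table above).
list_compound_name = ["Massbank: Dextromethorphan"]
list_smiles = ["COc1ccc2c(c1)C13CCCCC1C(C2)N(C)CC3"]

# Illustrative additions: position i in one list pairs with position i in the other.
new_names = ["Caffeine", "Paracetamol"]
new_smiles = ["CN1C=NC2=C1C(=O)N(C(=O)N2C)C", "CC(=O)Nc1ccc(O)cc1"]

all_names, all_smiles = append_candidates(list_compound_name, list_smiles, new_names, new_smiles)
for name, smi in zip(all_names, all_smiles):
    print(name, smi, sep="\t")
```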



# Mandatory - Choose between SyGMa (A) or BioTransformer (B) for virtual metabolization

#### A - SyGMa specifically generates human phase 1 and/or phase 2 biotransformations. 
It generally takes a couple of minutes to compute. More information is available in the paper (https://doi.org/10.1002/cmdc.200700312).

#### B - BioTransformer generates biotransformations in mammals, their gut microbiota, and the soil/aquatic microbiota. 
It takes more time to compute. More information is available in the paper ([https://doi.org/10.1186/s13321-018-0324-5](https://doi.org/10.1186/s13321-018-0324-5)).

# A - Virtual metabolization with SyGMa

SyGMa is a Python library for the Systematic Generation of potential Metabolites. See [SyGMa: combining expert knowledge and empirical scoring in the prediction of metabolites](https://doi.org/10.1002/cmdc.200700312) and [https://github.com/3D-e-Chem/sygma](https://github.com/3D-e-Chem/sygma).

Please cite their work:
Ridder, L., & Wagener, M. (2008) [SyGMa: combining expert knowledge and empirical scoring in the prediction of metabolites](https://doi.org/10.1002/cmdc.200700312). ChemMedChem, 3(5), 821-832.


### IMPORTANT -> Change the parameters below as needed
> Define the ruleset and the number of phase 1/2 reaction cycles to apply in the SyGMa scenario. For example, set `phase_1_cycle = 2` for two phase 1 cycles. Using a value > 1 will be slow.

> Define the maximum number of SyGMa candidates outputted (consider the number of reaction cycles). Suggested value `top_sygma_candidates = 15`

> Run SyGMa.
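
To make the role of `top_sygma_candidates` concrete, here is a minimal sketch of keeping only the highest-scoring candidates for one parent compound. The helper `keep_top_candidates` is hypothetical and the SMILES and scores are made up for illustration; real scores come from SyGMa, and the notebook's own ranking code may differ.

```python
def keep_top_candidates(candidates, top_n):
    """Return the `top_n` (smiles, score) pairs with the highest score, best first."""
    if top_n < 1:
        raise ValueError("top_n must be at least 1")
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]

# Made-up candidates and scores, for illustration only.
candidates = [("CCO", 0.12), ("CC=O", 0.40), ("CC(=O)O", 0.31), ("OCCO", 0.05)]

top = keep_top_candidates(candidates, top_n=2)
print(top)
```

Because more reaction cycles produce more candidates per parent, a fixed cut-off keeps the output size predictable: at most `top_n` candidates per input compound.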

In [9]:
# Define the number of metabolization cycles (1-3). More than 1 cycle can be slow.
phase_1_cycle = 1
phase_2_cycle = 1
          
#Top metabolites predicted by SyGMa to output (ranked by highest score)
top_sygma_candidates = 10

### Run the cell below to run SyGMa (fast!)

No need to change the content of the cell below.

In [10]:
run_sygma_batch(prepare_for_virtual_metabolization.list_smiles, prepare_for_virtual_metabolization.list_compound_name, 
                phase_1_cycle, phase_2_cycle, top_sygma_candidates, 'results_vm-NAP_SyGMa.tsv')

=== Starting SyGMa computation ===
Number of compounds = 57
Batch_size = 13
If you are running many compounds or cycles, and maxing out RAM memory available, you can decrease the batch size. Otherwise the value can be increased for faster computation.
Please wait
Batch 1/5 completed with 130 metabolites
Batch 2/5 completed with 130 metabolites
Batch 3/5 completed with 130 metabolites
Batch 4/5 completed with 130 metabolites
Batch 5/5 completed with 50 metabolites
Number of SyGMA candidates = 570
Number of unique SyGMA candidates = 549
===== COMPLETED =====
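
The batch counts in the log above follow from simple chunking: 57 compounds in batches of 13 give four full batches and a final batch of 5, and with 10 candidates kept per compound each full batch yields 130 metabolites. A minimal sketch of that arithmetic follows; `make_batches` is a hypothetical helper, and the batching inside `run_sygma_batch` may be implemented differently.

```python
def make_batches(items, batch_size):
    """Split `items` into consecutive lists of at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

compounds = list(range(57))      # stand-ins for the 57 unique planar SMILES
top_sygma_candidates = 10        # candidates kept per compound

batches = make_batches(compounds, 13)
metabolites_per_batch = [len(batch) * top_sygma_candidates for batch in batches]

print(len(batches))                  # 5
print(metabolites_per_batch)         # [130, 130, 130, 130, 50]
print(sum(metabolites_per_batch))    # 570
```

A smaller batch size lowers peak memory use at the cost of more batches.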


When completed, download the full SyGMa results in the left side panel->
['results_vm-NAP_SyGMa.tsv'](./results_vm-NAP_SyGMa.tsv).

## Export the SyGMa results for NAP
See the NAP documentation for the [custom structure database format](https://ccms-ucsd.github.io/GNPSDocumentation/nap/#structure-database) and for how to run NAP on GNPS ([https://ccms-ucsd.github.io/GNPSDocumentation/nap/](https://ccms-ucsd.github.io/GNPSDocumentation/nap/)).

In [11]:
export_for_NAP('results_vm-NAP_SyGMa.tsv')

Number of metabolites = 570
Number of unique metabolites considered = 407


View/Download the results for NAP in the left side panel->
['results_vm-NAP_SyGMa_NAP.tsv'](./results_vm-NAP_SyGMa_NAP.tsv).

To download: go to File/Download or right-click on the file in the left panel.

## Export the SyGMa results for SIRIUS

See the documentation to generate the SIRIUS [custom database here](https://boecker-lab.github.io/docs.sirius.github.io/cli-standalone/#custom-database-tool).

In [12]:
export_for_SIRIUS('results_vm-NAP_SyGMa.tsv')

Number of metabolites = 570
Number of unique metabolites considered = 402


Download the results for SIRIUS in the left side panel->
['results_vm-NAP_SyGMa_SIRIUS.tsv'](./results_vm-NAP_SyGMa_SIRIUS.tsv).

# B - Virtual metabolization with BioTransformer (slow!)

BioTransformer is a software tool that predicts small molecule metabolism in mammals, their gut microbiota, as well as the soil/aquatic microbiota. BioTransformer also assists scientists in metabolite identification, based on the metabolism prediction. More information from the paper [[https://doi.org/10.1186/s13321-018-0324-5](https://doi.org/10.1186/s13321-018-0324-5)] and [[https://bitbucket.org/djoumbou/biotransformerjar/src/master/](https://bitbucket.org/djoumbou/biotransformerjar/src/master/)].

### Citation

Djoumbou-Feunang, Y., Fiamoncini, J., Gil-de-la-Fuente, A. et al. [BioTransformer: a comprehensive computational tool for small molecule metabolism prediction and metabolite identification.](https://doi.org/10.1186/s13321-018-0324-5) J Cheminform 11, 2 (2019).

### Install BioTransformer [only needs to be run once]
It requires `curl` and `java`.
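
Before running the installation, it can help to confirm that both tools are on the `PATH`. The sketch below uses the standard-library `shutil.which`; the helper name `missing_tools` is hypothetical.

```python
import shutil

def missing_tools(tools=("curl", "java")):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

missing = missing_tools()
if missing:
    print("Install before continuing: " + ", ".join(missing))
else:
    print("curl and java were found on the PATH")
```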

In [18]:
!java -version
!rm -r biotransformer.zip biotransformer/
!curl https://bitbucket.org/djoumbou/biotransformerjar/get/f47aa4e3c0da.zip -o biotransformer.zip
!unzip -q -d biotransformer biotransformer.zip
!cp -r biotransformer/djoumbou-biotransformerjar-f47aa4e3c0da/. .
!rm -r biotransformer.zip biotransformer/

openjdk version "1.8.0_112"
OpenJDK Runtime Environment (Zulu 8.19.0.1-linux64) (build 1.8.0_112-b16)
OpenJDK 64-Bit Server VM (Zulu 8.19.0.1-linux64) (build 25.112-b16, mixed mode)
rm: cannot remove 'biotransformer.zip': No such file or directory
rm: cannot remove 'biotransformer/': No such file or directory
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 68.9M  100 68.9M    0     0  19.1M      0  0:00:03  0:00:03 --:--:-- 19.1M


#### Specify the parameters of BioTransformer

`type_of_biotransformation` (BioTransformer option `-b,--btType <BioTransformer Option>`): the type of biotransformer. Options are EC-based (`ecbased`), CYP450 (`cyp450`), Phase II (`phaseII`), human gut microbial (`hgut`), human super transformer* (`superbio` or `allHuman`), and environmental microbial** (`envimicro`).

(* ) While the `superbio` option runs a set number of transformation steps in a pre-defined order (e.g. deconjugation first, then Oxidation/reduction, etc.), the `allHuman` option predicts all possible metabolites from any applicable reaction (Oxidation, reduction, (de-)conjugation) at each step.

(** ) For the environmental microbial biodegradation, all reactions (aerobic and anaerobic) are reported, and not only the aerobic biotransformations (as per default in the EAWAG BBD/PPS system).
    
`number_of_steps` (BioTransformer option `-s,--nsteps <Number of steps>`): the number of steps for the prediction. This option can be set by the user for the EC-based, CYP450, Phase II, and Environmental microbial biotransformers. The default value is `1`.
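
Since a BioTransformer run is slow, a quick check of the two parameters before launching can save time. The sketch below encodes only what the descriptions above state: the seven option names, and the four types for which the number of steps is user-settable. `check_biotransformer_params` is a hypothetical helper; rejecting more than one step for the other types is an assumption, as BioTransformer itself may simply ignore the value.

```python
BIOTRANSFORMER_TYPES = {
    "ecbased", "cyp450", "phaseII", "hgut", "superbio", "allHuman", "envimicro",
}
# Types for which the description above says the number of steps can be set.
TYPES_WITH_STEPS = {"ecbased", "cyp450", "phaseII", "envimicro"}

def check_biotransformer_params(bt_type, n_steps=1):
    """Validate the two notebook parameters and return them unchanged."""
    if bt_type not in BIOTRANSFORMER_TYPES:
        raise ValueError(f"Unknown biotransformation type: {bt_type!r}")
    if not isinstance(n_steps, int) or n_steps < 1:
        raise ValueError("number_of_steps must be a positive integer")
    # Assumption: treat a non-default step count as an error for the other types.
    if n_steps != 1 and bt_type not in TYPES_WITH_STEPS:
        raise ValueError(f"number_of_steps is not user-settable for {bt_type!r}")
    return bt_type, n_steps

type_of_biotransformation, number_of_steps = check_biotransformer_params("hgut", 1)
print(type_of_biotransformation, number_of_steps)
```

Option names are case-sensitive here (`phaseII`, `allHuman`), matching the spelling given above.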

In [None]:
type_of_biotransformation = 'hgut'
number_of_steps = 1

run_biotransformer(prepare_for_virtual_metabolization.list_smiles,prepare_for_virtual_metabolization.list_compound_name,
                   type_of_biotransformation, number_of_steps, 'results_vm-NAP_BioTransformer.tsv')
print(' ====> Biotransformer computation is finally completed !!! ')

Download the full BioTransformer results in the left side panel->
['results_vm-NAP_BioTransformer.tsv'](./results_vm-NAP_BioTransformer.tsv).

## Export the BioTransformer results for NAP

See the NAP documentation for the [custom structure database format](https://ccms-ucsd.github.io/GNPSDocumentation/nap/#structure-database) and for how to run NAP on GNPS ([https://ccms-ucsd.github.io/GNPSDocumentation/nap/](https://ccms-ucsd.github.io/GNPSDocumentation/nap/)).

In [None]:
export_for_NAP('results_vm-NAP_BioTransformer.tsv')

Download the BioTransformer results for NAP in the left side panel->
['results_vm-NAP_BioTransformer_NAP.tsv'](./results_vm-NAP_BioTransformer_NAP.tsv).

## Export the BioTransformer results for SIRIUS

See the documentation to generate the SIRIUS [custom database here](https://boecker-lab.github.io/docs.sirius.github.io/cli-standalone/#custom-database-tool).

In [None]:
export_for_SIRIUS('results_vm-NAP_BioTransformer.tsv')

Download the BioTransformer results for SIRIUS in the left side panel->
['results_vm-NAP_BioTransformer_SIRIUS.tsv'](./results_vm-NAP_BioTransformer_SIRIUS.tsv).