# Using MetaNetMap

MetaNetMap aims to map metabolites between metabolomic data and metabolic networks.

There are several challenges to this task:

- **ID homogenization in metabolic networks:**  
  Automatic reconstruction of metabolic networks using different tools often assigns different IDs to the same metabolites. This inconsistency makes it difficult to cross-compare or transfer data across networks.

- **Metabolomic data complexities:**  
  Due to the difficulty of annotating metabolomic profiles, identifications are often partial, incomplete, and inconsistently represented. For example, enantiomers are frequently not precisely specified because they are generally indistinguishable by standard LC/MS methods.

Successfully bridging metabolomic data and metabolic networks is complex but highly valuable, both for species-specific studies and community-level analyses.

MetaNetMap enables this bridging process. The tool primarily allows the construction of a knowledge base, the ``datatable_conversion`` file.

The ``datatable_conversion`` file acts as a bridge between the metabolomics data and the metabolic networks.  
It combines all structured information extracted from the MetaCyc ``compounds.dat`` file, along with any additional identifiers or metadata provided by the user through the ``datatable_complementary`` file.  

This unified table serves as a comprehensive knowledge base that allows the tool to search across all known identifiers for a given metabolite and match them between the input data and the metabolic networks.  

By leveraging both the MetaCyc database and user-provided enhancements, the ``datatable_conversion`` enables robust and flexible mapping across diverse data sources.

> <picture>
>   <source media="(prefers-color-scheme: light)" srcset="https://raw.githubusercontent.com/Mqxx/GitHub-Markdown/main/blockquotes/badge/light-theme/info.svg">
>   <img alt="Info" src="https://raw.githubusercontent.com/Mqxx/GitHub-Markdown/main/blockquotes/badge/dark-theme/info.svg">
> </picture><br>
> MetaCyc database information related to the ontology of metabolites and pathways is not included in the test option.


This notebook regenerates the mapping on the toy data using the MetaCyc datatable. For more details about the MetaNetX conversion datatable, please refer to the [documentation](https://metanetmap.readthedocs.io/en/latest/index.html).

To sum up, it covers the use of MetaNetMap, as well as the results of our mapping.


#### Table of contents:
- [0. Requirements : Jupyter](#requirements)
- [I. Data requirements](#data)
- [II. General use of the MetaNetMap](#general)



***


## 0. Requirements : Jupyter <a id="requirements"></a>

/!\ To use this notebook, you must activate a dedicated environment and select it as the Jupyter kernel.
If you have not done this yet, follow the steps below.


### Conda environment 


- **Step 1** : Create a conda environment <a name="Conda"></a>

```
conda create -n <name> python=3.11
conda activate <name>
```

- **Step 2** : Install metanetmap in the environment

Install with pip:

```sh
pip install metanetmap
```

Or from source:

```sh
git clone git@gitlab.inria.fr:mistic/metanetmap.git
cd metanetmap
pip install -r requirements.txt
pip install -r requirements_dev.txt
pip install .
```


- **Step 3** : Install Jupyter in the environment (still activated):

```sh
conda install jupyter
conda install -n base nb_conda_kernels
```

Your Jupyter notebook can now run on the kernel you select, in our case this environment. It will not work if the kernel is not selected.


_____________________________________________________


# I. Data requirements <a id="data"></a>

## Input data: <a id="inputs"></a>
### Summary of Input Files for Database Building Mode

| File/Directory          | Description                                                 |
|-------------------------|-------------------------------------------------------------|
| `metacyc_compounds`       | Tabular file provided by the MetaCyc database                |
| `chem_xref`       | Tabular file provided by the MetaNetX database                |
| `datatable_complementary` | Tabular file provided by the user (see details below)       |
| `output -o`               | Output directory for mapping results and logs               |

> <picture>
>   <source media="(prefers-color-scheme: light)" srcset="https://raw.githubusercontent.com/Mqxx/GitHub-Markdown/main/blockquotes/badge/light-theme/info.svg">
>   <img alt="Info" src="https://raw.githubusercontent.com/Mqxx/GitHub-Markdown/main/blockquotes/badge/dark-theme/info.svg">
> </picture><br>   
> The output structure is shown as 'input' in the Summary of Input Files for Mapping Modes just below.


### Overview structure
<h3 style="color: green;">metacyc_compounds:</h3>

><picture>
>   <source media="(prefers-color-scheme: light)" srcset="https://raw.githubusercontent.com/Mqxx/GitHub-Markdown/main/blockquotes/badge/light-theme/info.svg">
>   <img alt="Info" src="https://raw.githubusercontent.com/Mqxx/GitHub-Markdown/main/blockquotes/badge/dark-theme/info.svg">
> </picture><br>   
> A licence for MetaCyc is required to obtain this file; you must provide your own copy of `compounds.dat`.

Compound entries come from a MetaCyc flat file (`.dat` extension).
Each entry is structured as key-value pairs, where each line represents a specific property or annotation of the compound.

Some keys, such as `CHEMICAL-FORMULA`, `SYNONYMS`, or `DBLINKS`, may occur multiple times. Values can contain nested content, quotes, or formatting (e.g. HTML tags in names).

Some key characteristics (non-exhaustive):

| **Field**                | **Description**                                                                                        |
|--------------------------|--------------------------------------------------------------------------------------------------------|
| `UNIQUE-ID`              | Primary identifier of the compound in the MetaCyc database.                                            |
| `TYPES`                  | Declares the type of entity — typically `Compound`, but can also be other biological entities.         |
| `COMMON-NAME`            | Human-readable compound name. May contain HTML formatting.                                             |
| `CHEMICAL-FORMULA`       | Chemical composition split across multiple lines, each specifying an element and its count.            |
| `DBLINKS`                | Cross-references to external databases such as BiGG, ChEBI, HMDB, KEGG, PubChem, etc. Multiple lines.  |
| `INCHI`                  | Standard InChI string describing the molecular structure.                                              |
| `INCHI-KEY`              | Hashed InChI identifier (short, fixed-length string) used for quick comparison of chemical structures. |
| `INSTANCE-NAME-TEMPLATE` | A template indicating how this compound ID is generated or structured (e.g., starts with `CPD-`).      |
| `LOGP`                   | Octanol–water partition coefficient (logP), representing hydrophobicity.                               |
| `MOLECULAR-WEIGHT`       | Average molecular weight based on atomic composition.                                                  |
| `MONOISOTOPIC-MW`        | Exact mass using the most abundant isotope for each element.                                           |
| `NON-STANDARD-INCHI`     | Alternative or non-standard InChI representation.                                                      |
| `POLAR-SURFACE-AREA`     | Topological polar surface area (TPSA) of the molecule.                                                 |
| `SMILES`                 | Simplified Molecular Input Line Entry System (SMILES) string representing the structure.               |
| `SYNONYMS`               | Alternate or common names for the compound. Can appear on multiple lines.                              |

____________________________________

<h3 style="color: green;">datatable_complementary:</h3> 
Tabular file provided by the user

| UNIQUE-ID        | ADD-COMPLEMENT                      | BIGG | SEED |
|------------------|-------------------------------------|------|------|
| CPD-7100         | (2S)-2-isopropyl-3-oxosuccinic acid |      |      |
| DI-H-OROTATE     | (S)-dihydroorotic acid              |      |      |
| SHIKIMATE-5P     | 3-phosphoshikimic acid              |      |      |
| DIAMINONONANOATE | 7,8-diaminononanoate                | dann |      |

> <picture>
>   <source media="(prefers-color-scheme: light)" srcset="https://raw.githubusercontent.com/Mqxx/GitHub-Markdown/main/blockquotes/badge/light-theme/info.svg">
>   <img alt="Info" src="https://raw.githubusercontent.com/Mqxx/GitHub-Markdown/main/blockquotes/badge/dark-theme/info.svg">
> </picture><br>   
> The `datatable_complementary` is a tabular file provided by the user. It allows users to add their own custom identifiers in order to improve matching with their metabolomics data.


**Requirements and structure:**

- The **first column must be** a ``UNIQUE-ID`` that links to the MetaCyc database.
- All **following columns are free** and may contain any identifiers or names. Their column names will be automatically included in the main conversion datatable.
- The file must be in tabular format (e.g., TSV), with headers.


<h4 style="color: red;">Important notes:</h4> 

- If you have a metabolite **without a matching ``UNIQUE-ID`` in MetaCyc**, you may assign it a **custom or fictional ID** in the first column.
- This fictional ``UNIQUE-ID`` will still be included in the conversion table, and **will be used if a match is found based on the name or identifier you provided.**
- Be sure to keep track of any custom or fictional IDs you create, so you can filter or manage them later if needed.
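
The requirements above can be checked before running the tool. The following is a minimal sketch using pandas, with rows taken from the example table; `check_complementary` is a hypothetical helper written for this notebook, not a MetaNetMap function, and the two checks it performs are only the ones stated in this section.

```python
import io

import pandas as pd

# Rows taken from the example table above; empty cells stay empty.
complementary = pd.DataFrame(
    {
        "UNIQUE-ID": ["CPD-7100", "DIAMINONONANOATE"],
        "ADD-COMPLEMENT": [
            "(2S)-2-isopropyl-3-oxosuccinic acid",
            "7,8-diaminononanoate",
        ],
        "BIGG": ["", "dann"],
        "SEED": ["", ""],
    }
)


def check_complementary(df):
    """Return a list of problems found in a complementary table (hypothetical helper)."""
    problems = []
    if df.columns[0] != "UNIQUE-ID":
        problems.append("first column must be UNIQUE-ID")
    if (df.iloc[:, 0].astype(str).str.strip() == "").any():
        problems.append("every row needs an ID in the first column (real or custom)")
    return problems


problems = check_complementary(complementary)

# Serialise as TSV with a header row; in practice, pass a file path instead of a buffer.
buffer = io.StringIO()
complementary.to_csv(buffer, sep="\t", index=False)
tsv_text = buffer.getvalue()
print(problems)
print(tsv_text)
```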


___________________

### Summary of Input Files for Mapping Modes  <a id="input-mode"></a>

| File/Directory        | Description                                                 |
|-----------------------|-------------------------------------------------------------|
| `MetaNetMap output`   | Output directory for mapping results and logs               |
| `metabolic_networks`  | Path to the directory with `.sbml` and/or `.xml` files      |
| `metabolomics_data`   | Tabulated file (see note below for details)                 |
| `datatable_conversion`| Tabulated file, first column is the `UNIQUE-ID` in MetaCyc  |


> <picture>
>   <source media="(prefers-color-scheme: light)" srcset="https://raw.githubusercontent.com/Mqxx/GitHub-Markdown/main/blockquotes/badge/light-theme/info.svg">
>   <img alt="Info" src="https://raw.githubusercontent.com/Mqxx/GitHub-Markdown/main/blockquotes/badge/dark-theme/info.svg">
> </picture><br> 
> For `metabolomics_data`:  
> Column names must follow a specific naming convention to be properly processed by the tool during the mapping step.  
> 
> Recognized column names include:
>
> `UNIQUE-ID`, `CHEBI`, `COMMON-NAME`, `ABBREV-NAME`, `SYNONYMS`, `ADD-COMPLEMENT`, `MOLECULAR-WEIGHT`, `MONOISOTOPIC-MW`, `SEED`, `BIGG`, `HMDB`, `METANETX`, `LIGAND-CPD`, `REFMET`, `PUBCHEM`, `CAS`, `INCHI-KEY`, `SMILES`


### Overview structure

<h3 style="color: green;">metabolomics_data:</h3>

  > **Note:**  
  > For **metabolomics_data**:  
  > Each line describes one metabolite.  
  > Column names must follow a specific naming convention in order to be properly processed by the tool during the mapping step.  
  >
  > The following column names are recognized:  
  >
  > `UNIQUE-ID`, `CHEBI`, `COMMON-NAME`, `ABBREV-NAME`, `SYNONYMS`,  
  > `ADD-COMPLEMENT`, `MOLECULAR-WEIGHT`, `MONOISOTOPIC-MW`, `SEED`,  
  > `BIGG`, `HMDB`, `METANETX`, `LIGAND-CPD`, `REFMET`, `PUBCHEM`,  
  > `CAS`, `INCHI-KEY`, `SMILES`

| UNIQUE-ID  | CHEBI       | COMMON-NAME           | M/Z       | INCHI-KEY                               | 
|------------|-------------|----------------------|-----------|----------------------------------------|
|            | CHEBI:4167  |                      | 179       |                                        |
|            |             | L-methionine         | 150       |                                        |        
| CPD-17381  |             | roquefortine C       | 389.185   |                                        |        
|            |             |                      |           | InChIKey=CGBYBGVMDAPUIH-ARJAWSKDSA-L  |
| CPD-25370  | 84783       |                      | 701.58056 |                                        |
|            | CHEBI:16708 | Adenine              |           |                                        |
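
Before mapping, it can help to see which columns of a metabolomics file fall inside the recognized list. The sketch below is a plain-Python check using the header of the example table above; `split_columns` is a hypothetical helper, and how MetaNetMap itself treats columns outside the list (such as `M/Z` here) is not specified in this section.

```python
# Column names recognized during the mapping step (copied from the list above).
RECOGNIZED_COLUMNS = {
    "UNIQUE-ID", "CHEBI", "COMMON-NAME", "ABBREV-NAME", "SYNONYMS",
    "ADD-COMPLEMENT", "MOLECULAR-WEIGHT", "MONOISOTOPIC-MW", "SEED",
    "BIGG", "HMDB", "METANETX", "LIGAND-CPD", "REFMET", "PUBCHEM",
    "CAS", "INCHI-KEY", "SMILES",
}


def split_columns(columns):
    """Separate recognized column names from the others, keeping their order."""
    recognized = [c for c in columns if c in RECOGNIZED_COLUMNS]
    others = [c for c in columns if c not in RECOGNIZED_COLUMNS]
    return recognized, others


# Header of the example metabolomics table above.
header = ["UNIQUE-ID", "CHEBI", "COMMON-NAME", "M/Z", "INCHI-KEY"]
recognized, others = split_columns(header)
print("Recognized:", recognized)
print("Not in the recognized list:", others)
```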

---

<h3 style="color: green;">Metabolic networks: </h3>

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core"
      level="3" version="1">
  <model id="example_model" name="Example Metabolic Model">
    <!-- Compartments -->
    <listOfCompartments>
      <compartment id="cytosol" name="Cytosol" constant="true"/>
    </listOfCompartments>

    <listOfSpecies>
      <species id="glucose_c" name="Glucose" compartment="cytosol" initialAmount="1.0" hasOnlySubstanceUnits="false" boundaryCondition="false" constant="false">
        <annotation>
          <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                   xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
            <rdf:Description rdf:about="#glucose_c">
              <bqbiol:is>
                <rdf:Bag>
                  <rdf:li rdf:resource="http://identifiers.org/chebi/CHEBI:17234"/>
                  <rdf:li rdf:resource="http://identifiers.org/inchikey/WQZGKKKJIJFFOK-GASJEMHNSA-N"/>
                </rdf:Bag>
              </bqbiol:is>
            </rdf:Description>
          </rdf:RDF>
        </annotation>
      </species>
    </listOfSpecies>

  </model>
</sbml>
```

For **metabolic network data**, we typically extract the ID and name of each species, as well as any metadata available in its annotation (for example, ChEBI or InChIKey cross-references).

| Element              | Description                                                                   |
|----------------------|-------------------------------------------------------------------------------|
| `species`            | Defines a metabolite within a compartment                                     |
| `annotation`         | Contains **metadata** in RDF format, including standardized cross-references  |
| `chebi` / `inchikey` | Links to standardized identifiers for interoperability                        |
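
To make the extraction concrete, here is a minimal sketch that reads the species ID, name, and annotation cross-references from a trimmed copy of the SBML example above. It uses only the standard library and is not MetaNetMap's own parser; `species_annotations` is a hypothetical helper, and the namespace URIs assume SBML Level 3 Version 1 with the usual RDF and BioModels qualifier namespaces.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the SBML example above, with both annotation namespaces declared.
SBML = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="example_model">
    <listOfSpecies>
      <species id="glucose_c" name="Glucose" compartment="cytosol">
        <annotation>
          <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                   xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
            <rdf:Description rdf:about="#glucose_c">
              <bqbiol:is><rdf:Bag>
                <rdf:li rdf:resource="http://identifiers.org/chebi/CHEBI:17234"/>
                <rdf:li rdf:resource="http://identifiers.org/inchikey/WQZGKKKJIJFFOK-GASJEMHNSA-N"/>
              </rdf:Bag></bqbiol:is>
            </rdf:Description>
          </rdf:RDF>
        </annotation>
      </species>
    </listOfSpecies>
  </model>
</sbml>"""

NS = {
    "sbml": "http://www.sbml.org/sbml/level3/version1/core",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}
RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"


def species_annotations(sbml_text):
    """Return {species id: {"name": ..., "xrefs": {database: [identifiers]}}}."""
    root = ET.fromstring(sbml_text)
    result = {}
    for species in root.iterfind(".//sbml:species", NS):
        xrefs = {}
        for li in species.iterfind(".//rdf:li", NS):
            # e.g. http://identifiers.org/chebi/CHEBI:17234 -> ("chebi", "CHEBI:17234")
            database, identifier = li.get(RESOURCE).rstrip("/").split("/")[-2:]
            xrefs.setdefault(database, []).append(identifier)
        result[species.get("id")] = {"name": species.get("name"), "xrefs": xrefs}
    return result


annotations = species_annotations(SBML)
print(annotations)
```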

---

<h3 style="color: green;">Datatable_conversion: </h3>

The ``datatable_conversion`` file acts as a bridge between the metabolomic data and the metabolic networks.
It combines all structured information extracted from the MetaCyc ``compounds.dat`` file, along with any additional identifiers or metadata provided by the user through the ``datatable_complementary`` file.
This unified table serves as a comprehensive knowledge base that allows the tool to search across all known identifiers for a given metabolite, and match them between the input metabolomic data and the metabolic networks.
By leveraging both the MetaCyc database and user-provided knowledge, the ``datatable_conversion`` enables robust and flexible mapping across diverse data sources.


| UNIQUE-ID    | CHEBI  | COMMON-NAME           | ABBREV-NAME | SYNONYMS                                            | ADD-COMPLEMENT | MOLECULAR-WEIGHT | MONOISOTOPIC-MW  | SEED | BIGG   |
|--------------|--------|-----------------------|-------------|-----------------------------------------------------|----------------|------------------|------------------|------|--------|
| CPD-17257    | 30828  | trans-vaccenate       |             | ["trans-vaccenic acid","(E)-11-octadecenoic acid"]  |                | 281.457          | 282.2558803356   |      |        |
| CPD-24978    | 50258  | alpha-L-allofuranose  |             |                                                     |                | 180.157          | 180.0633881178   |      |        |
| CPD-25014    | 147718 | alpha-D-talofuranoses |             |                                                     |                | 180.157          | 180.0633881178   |      |        |
| CPD-25010    | 153460 | alpha-D-mannofuranose |             |                                                     |                | 180.157          | 180.0633881178   |      |        |
| Glucopyranose| 4167   | D-glucopyranose       |             | ["6-(hydroxymethyl)tetrahydropyran-2,3,4,5-tetraol"]|                | 180.157          | 180.0633881178   |      | glc__D |


Apart from any columns added through the `datatable_complementary` file, the general columns are listed below:

| Column Name        | Description                                                                                                                                        |
|--------------------|----------------------------------------------------------------------------------------------------------------------------------------------------|
| `UNIQUE-ID`        | The unique identifier for the compound, typically from the MetaCyc database (e.g., `CPD-17257`).                                                   |
| `CHEBI`            | The corresponding ChEBI identifier (if available), used for chemical standardization and interoperability.                                         |
| `COMMON-NAME`      | The common name of the metabolite as found in MetaCyc or other databases.                                                                          |
| `ABBREV-NAME`      | Abbreviated name for the metabolite, if defined. Often used in metabolic modeling tools (e.g., COBRA models).                                      |
| `SYNONYMS`         | A list of alternative names for the metabolite. These may include IUPAC names, trivial names, and other variants used in the literature/databases. |
| `ADD-COMPLEMENT`   | Reserved for additional manually added metadata or complement terms, if applicable.                                                                |
| `MOLECULAR-WEIGHT` | The molecular weight (nominal or average) of the metabolite.                                                                                       |
| `MONOISOTOPIC-MW`  | The monoisotopic molecular weight — i.e., the exact mass based on the most abundant isotope of each element.                                       |
| `SEED`             | Identifier from the SEED database, if available.                                                                                                   |
| `BIGG`             | Identifier from the BiGG Models database, if available. Typically used in genome-scale metabolic models.                                           |



---

# Output Data

### Summary of Output File for Database Building Mode

| File/Directory         | Description                                                 |
|------------------------|-------------------------------------------------------------|
| `datatable_conversion` | Tabulated file, first column is the `UNIQUE-ID` in MetaCyc  |
| `logs`                 | Directory providing more detailed information               |

___

> <picture>
>   <source media="(prefers-color-scheme: light)" srcset="https://raw.githubusercontent.com/Mqxx/GitHub-Markdown/main/blockquotes/badge/light-theme/info.svg">
>   <img alt="Info" src="https://raw.githubusercontent.com/Mqxx/GitHub-Markdown/main/blockquotes/badge/dark-theme/info.svg">
> </picture><br>
> The `datatable_conversion` file acts as a bridge between the metabolomics data and the metabolic networks.  
> It combines structured data from the MetaCyc `compounds.dat` file with user-provided identifiers or metadata from `datatable_complementary`.  
> 
> This unified table enables comprehensive and flexible mapping across diverse data sources.  
> 
> The `logs` directory contains detailed processing information, helpful for debugging, auditing, and understanding how the tool performed the mapping.

---

### Summary of Output File for Mapping Modes

| File/Directory         | Description                                                 |
|------------------------|-------------------------------------------------------------|
| `mapping_results`      | Tabulated file with match/unmatch results                   |
| `logs`                 | Directory providing more detailed information               |


<h3 style="color: green;">mapping_results: </h3>

#### Output File Format

- In **community mode**, the file is named: `community_mapping_results_YYYY-MM-DD_HH:MM:SS.tsv`
- In **classic mode**, the file is named: `mapping_results_YYYY-MM-DD_HH:MM:SS.tsv`
- If **partial match** is activated, the filename will include `partial_match` to indicate this.
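
When an output directory accumulates several runs, the timestamp in the file name can be used to pick the most recent result. The sketch below parses names that follow the scheme listed above; `parse_result_name` and `latest_result` are hypothetical helpers, and since the exact position of the `partial_match` tag is not specified here, the pattern simply tolerates extra text around the `mapping_results` stem.

```python
import re
from datetime import datetime

# Based on the naming scheme above; extra text before or after the
# "mapping_results" stem (e.g. a partial-match tag) is tolerated.
RESULT_NAME = re.compile(
    r"(?P<community>community_)?mapping_results.*?"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}:\d{2}:\d{2})\.tsv$"
)


def parse_result_name(filename):
    """Return the mode and timestamp encoded in a result file name, or None."""
    match = RESULT_NAME.search(filename)
    if match is None:
        return None
    return {
        "mode": "community" if match.group("community") else "classic",
        "timestamp": datetime.strptime(match.group("stamp"), "%Y-%m-%d_%H:%M:%S"),
    }


def latest_result(filenames):
    """Return the file name with the most recent timestamp, or None."""
    dated = []
    for name in filenames:
        info = parse_result_name(name)
        if info is not None:
            dated.append((info["timestamp"], name))
    return max(dated)[1] if dated else None


names = [
    "mapping_results_2025-09-13_12:45:23.tsv",
    "mapping_results_2025-09-13_12:50:00.tsv",
    "notes.txt",
]
print(parse_result_name(names[0]))
print(latest_result(names))
```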

#### Output file Content and Column Structure

1. **Metabolite Matches**  
   Lists metabolite names that matched. If multiple matches are found for a single input, they are joined with `_AND_`.

2. **MetaCyc UNIQUE-ID Match (from `datatable_conversion`)**  
   Shows matches found using MetaCyc `UNIQUE-ID`.  
   Multiple matches are separated by `_AND_` and flagged as uncertain. These are also shown in the **Partial Match** column.

3. **Input File Match (metabolomics data)**  
   - In **classic mode**: Shows the identifier from the input file that matched with the SBML model.
   - In **community mode**: Shows a list like `[data1, data4]` indicating specific files with matches.  
     More details are available in the logs.

4. **Partial Match**  
   Contains uncertain or ambiguous matches:
   - Duplicates (same metabolite matched multiple entries)
   - Post-processing matches, including:
     - CHEBI ontology expansion
     - INCHIKEY simplification
     - Enantiomer removal  
   These require manual review and are logged in detail.

5. **Other Columns**  
   Correspond to identifiers or metadata from the metabolomics data.  
   Each cell shows `YES` if a match was found for that ID in the data.
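
The column structure above lends itself to a quick summary with pandas. The sketch below works on a tiny made-up table (the values are placeholders, not real results) that reuses the `Metabolites` and `Match via ...` column names of the output file; it counts `YES` cells per identifier type and flags rows whose metabolite field contains several candidates joined with `_AND_`.

```python
import pandas as pd

# Tiny made-up results table; only the column names come from the real output.
results = pd.DataFrame(
    {
        "Metabolites": ["CPD-17381", "4167", "metabolite_a_AND_metabolite_b"],
        "Match via UNIQUE-ID": ["YES", "", ""],
        "Match via CHEBI": ["", "YES", "YES"],
    }
)

# Number of candidates per row: multiple matches are joined with "_AND_".
results["n_candidates"] = results["Metabolites"].str.split("_AND_").str.len()

# Count the YES cells in every "Match via ..." column.
match_columns = [c for c in results.columns if c.startswith("Match via ")]
yes_counts = (results[match_columns] == "YES").sum()

# Rows with more than one candidate need manual review.
ambiguous = results[results["n_candidates"] > 1]

print(yes_counts)
print(ambiguous["Metabolites"].tolist())
```

On a real file, load the table with `pd.read_csv(path, sep="\t")` first; empty cells then become `NaN`, which the `== "YES"` comparison treats as no match.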


<h3 style="color: green;">log: </h3>

Provides more information about each step and the corresponding results.

```

    -----------------------------------------
                MAPPING METABOLITES 
    ----------------------------------------- 

    ------ Main package version ------
    numpy version: 2.3.2
    pandas version: 2.3.2
    cobra version: 0.29.1
    libchebipy version: 1.0.10

    Command run:
    Actual command run (from sys.argv): python /home/cmuller/miniconda3/envs/test2/bin/metanetmap -t -c -p

    #---------------------------#
          Test COMMUNITY   
    #---------------------------#

    <1> Matching test between metabolites derived from metabolomic data on all metadata in the metabolic network
    <2> Matching test between metabolites derived from metabolomic data on all metadata in the database conversion

    ++ Match step for "CPD-17381":
    -- "CPD-17381" is present in database with the UNIQUE-ID "CPD-17381" and matches via "UNIQUE-ID"

    ++ Match step for "CPD-25370":
    -- "CPD-25370" is present directly in "toys1" metabolic network with the ID "CPD-25370" via "UNIQUE-ID"
    -- "CPD-25370" is present in database with the UNIQUE-ID "CPD-25370" and matches via "UNIQUE-ID"

    ++ Match step for "C9H16NO5":
    -- "C9H16NO5" is present directly in "toys3" metabolic network with the ID "pnto__R" via "UNIQUE-ID"
    -- ""C9H16NO5"" has a partial match. We have a formula as identifier for this metabolite: "C9H16NO5"

    ++ Match step for "4167":
    -- "4167" is present directly in "toys3" metabolic network with the ID "glc__D" via "CHEBI"
    .....

    --"NO" is present directly in metabolic network with the corresponding ID "NITRIC-OXIDE" via the match ID "nitric-oxide"


    ......

    ----------------------------------------------
    ---------------MATCH STEP 2-------------------
    ----------------------------------------------
    
    <3> Matching test on metabolites that matched only on the database conversion data against all metadata from the metabolic network
    
    --"Glycocholic acid" is present directly in metabolic network with the corresponding ID "GLYCOCHOLIC_ACID" via the match ID "glycocholic_acid"
    --"gamma-Tocopherol" is present directly in metabolic network with the corresponding ID "GAMA-TOCOPHEROL" via the match ID "gama-tocopherol"
    
    .......


    -------------------- SUMMARY REPORT --------------------


    Recap of Matches:
      + Matched metabolites: 103
      + Unmatched metabolites: 43740
      + Partial matches: 15
    
     Match Details:
      -- Full match (database + SBML): 103
      -- Partial match + metabolic info: 10
      -- Match only in SBML: 0
    
     Unmatch Details:
      -- Full unmatch (no match in DB or SBML): 43514
      -- Match in DB but not in SBML: 226
      -- Partial match in DB only: 5
    
    --------------------------------------------------------
    
    
    --- Total runtime 1478.55 seconds ---
     --- MAPPING COMPLETED
```

__________________

# II. General use of the MetaNetMap 

#### a. Build datatable_conversion

*/!\ Warning:* This step requires the MetaCyc database or the MetaNetX database. MetaCyc information related to the ontology of metabolites and pathways is not included in this test.
You must provide your own ``compounds.dat`` file from MetaCyc, or use the MetaNetX files ``chem_xref.tsv`` and ``chem_prop.tsv``, which are openly available.

The ``datatable_conversion`` file acts as a bridge between the metabolomics data and the metabolic networks.  
It combines all structured information extracted from the MetaCyc ``compounds.dat`` file, or from the MetaNetX ``chem_xref.tsv`` and ``chem_prop.tsv`` files, along with any additional identifiers or metadata provided by the user through the ``datatable_complementary`` file.

This unified table serves as a comprehensive knowledge base that allows the tool to search across all known identifiers for a given metabolite and match them between the input data and the metabolic networks.

By leveraging the MetaCyc or MetaNetX database together with user-provided enhancements, the ``datatable_conversion`` enables robust and flexible mapping across diverse data sources.
<a id="general"></a>

In [None]:
```python
''' Run database building mode '''

# For MetaCyc
# Commented out because we don't have the MetaCyc compounds.dat file,
# so it can't be run directly with MetaCyc data.
#!metanetmap build_db --db metacyc -f  <your_path_to_metacyc_compounds.dat>  --compfiles ../src/metanetmap/build_datatable_conversion/datatable_complementary_file  --out_db <output>

# For MetaNetX: can be time-consuming because it downloads the input files
#!metanetmap build_db --db metanetx -f  <your_path_to_metanetx_chem_prop>  <your_path_to_metanetx_chem_xref> --compfiles ../src/metanetmap/build_datatable_conversion/datatable_complementary --out_db <output>
!metanetmap build_db --db metanetx --out_db build/
```


```
-----------------------------------------
            BUILD CONVERTION TABLE 
----------------------------------------- 


Command run:
Actual command run (from sys.argv): python /home/cmuller/miniconda3/envs/metanetmap/bin/metanetmap build_db --db metanetx --out_db build/
Downloading chem_prop.tsv from https://www.metanetx.org/cgi-bin/mnxget/mnxref/chem_prop.tsv...
chem_prop.tsv successfully downloaded to build/metanetx_db/chem_prop.tsv
File exists: build/metanetx_db/chem_prop.tsv
Downloading chem_xref.tsv from https://www.metanetx.org/cgi-bin/mnxget/mnxref/chem_xref.tsv...
chem_xref.tsv successfully downloaded to build/metanetx_db/chem_xref.tsv
File exists: build/metanetx_db/chem_xref.tsv

---> Run construction of the datatable for MetaNetX
/!\  No complementary file added
Final merged table generated: build/metanetx_conversion_datatable.tsv

--- Total runtime 93.09 seconds ---
 ---> Construction of the database completed
```


In [4]:
```python
''' Show Build mode (display first 10 rows) '''

import os
import pandas as pd

# 1. Path to the conversion datatable produced by build_db
file_path = 'build/metanetx_conversion_datatable.tsv'

# 2. Check that the file exists before reading it
if os.path.isfile(file_path):
    print(f"Found file: {file_path}")

    # 3. Read the tab-separated file and display only the first 10 rows
    #    (display() is provided by the Jupyter/IPython session)
    df = pd.read_csv(file_path, sep='\t')
    display(df.head(10))
else:
    print(f"File not found: {file_path}")

# 4. Clean up the build directory created by the previous cell
!rm -r build/
```


```
Found file: build/metanetx_conversion_datatable.tsv


Unnamed: 0,UNIQUE-ID,CHEBI,COMMON_NAME,ABBREV_NAME,SYNONYMS,ADD-COMPLEMENT,MOLECULAR_WEIGHT,SEED,BIGG,HMDB,METACYC,REFMET,PUBCHEM,VMH,CAS,INCHI,NON-STANDARD-INCHI,INCHI_KEY,SMILES
0,BIOMASS,,BIOMASS,,,,,cpd11416,,,,,,,,,,,
1,MNXM01,,PMF,,,,1.00794,,,,,,,,,InChI=1S/p+1,,GPRLSGONYQIRFK-UHFFFAOYSA-N,[H+]
2,MNXM02,chebi:13365,OH(-),,,,17.007,cpd15275,oh1,HMDB0001039,OH,,,oh1,,InChI=1S/H2O/h1H2/p-1,,XLYOFNOQVPJJNP-UHFFFAOYSA-M,[H][O-]
3,MNXM02,chebi:13419,OH(-),,,,17.007,cpd15275,oh1,HMDB0001039,OH,,,oh1,,InChI=1S/H2O/h1H2/p-1,,XLYOFNOQVPJJNP-UHFFFAOYSA-M,[H][O-]
4,MNXM02,chebi:16234,OH(-),,,,17.007,cpd15275,oh1,HMDB0001039,OH,,,oh1,,InChI=1S/H2O/h1H2/p-1,,XLYOFNOQVPJJNP-UHFFFAOYSA-M,[H][O-]
5,MNXM02,chebi:29356,OH(-),,,,17.007,cpd15275,oh1,HMDB0001039,OH,,,oh1,,InChI=1S/H2O/h1H2/p-1,,XLYOFNOQVPJJNP-UHFFFAOYSA-M,[H][O-]
6,MNXM02,chebi:44641,OH(-),,,,17.007,cpd15275,oh1,HMDB0001039,OH,,,oh1,,InChI=1S/H2O/h1H2/p-1,,XLYOFNOQVPJJNP-UHFFFAOYSA-M,[H][O-]
7,MNXM02,chebi:5594,OH(-),,,,17.007,cpd15275,oh1,HMDB0001039,OH,,,oh1,,InChI=1S/H2O/h1H2/p-1,,XLYOFNOQVPJJNP-UHFFFAOYSA-M,[H][O-]
8,MNXM03,chebi:29412,H3O(+),,,,19.023,,,,OXONIUM,,,,,InChI=1S/H2O/h1H2/p+1,,XLYOFNOQVPJJNP-UHFFFAOYSA-O,[H][O+]([H])[H]
9,MNXM1,chebi:10744,H(+),,,,1.00794,cpd00067,h,HMDB0059597,PROTON,,,HC02115,,InChI=1S/p+1,,GPRLSGONYQIRFK-UHFFFAOYSA-N,[H+]
```


#### b. Run mapping mode

You can run MetaNetMap in two different modes (classic and community), each with an optional partial match step.

Reminder about **partial match**:  
The partial match is optional, as it can be time-consuming. It is a post-processing step applied to metabolites or IDs that were not successfully mapped during the initial run. These unmatched entries are re-evaluated using specific strategies that increase the chances of finding a match (e.g., via CHEBI, INCHIKEY, or enantiomer simplification).

After this processing step, the entire mapping pipeline is re-executed, taking the modifications into account.

**The following treatments are applied:**

- **CHEBI** *(only if a CHEBI column exists in the metabolomics data)*:  
  For each row containing a CHEBI ID, the Python library `libChEBIpy` is used to retrieve the full CHEBI ontology of the metabolite. These related terms are then remapped against the target databases.

- **INCHIKEY**:  
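  An INCHIKEY is structured as `XXXXXXXXXXXXXX-YYYYYYYYFV-P`. The first block (`X`, 14 characters) represents the core molecular structure (connectivity). The second block encodes stereochemistry and isotopes (`Y`), followed by a standard/non-standard flag (`F`) and a version character (`V`); the last character (`P`) indicates protonation. We extract only the first block to increase the chances of a match during the second mapping phase.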

- **Enantiomers**:  
  Stereochemistry indicators (L, D, R, S) are removed from both the metabolomics data and the databases. This improves matching rates, since stereochemical information is often missing in metabolomics datasets.
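
Two of the treatments above can be sketched in a few lines. The first function keeps only the first block of an INCHIKEY, as described; the second strips leading stereo descriptors from a name. Both are hypothetical helpers: in particular, the regular expression is an illustrative assumption that only handles leading prefixes such as `L-`, `D-`, `(S)-` or `(2S)-`, and is not MetaNetMap's actual rule.

```python
import re


def inchikey_skeleton(inchikey):
    """Return the first (connectivity) block of an InChIKey."""
    key = inchikey.strip()
    if key.startswith("InChIKey="):
        key = key[len("InChIKey="):]
    return key.split("-")[0]


# Assumed pattern: one or more leading prefixes like "L-", "D-", "(R)-", "(2S)-", "(2R,3S)-".
_STEREO_PREFIX = re.compile(r"^(?:\((?:\d*[RS],?)+\)-|[LD]-)+")


def strip_stereo_prefix(name):
    """Remove leading stereo descriptors from a metabolite name (illustrative only)."""
    return _STEREO_PREFIX.sub("", name.strip())


print(inchikey_skeleton("WQZGKKKJIJFFOK-GASJEMHNSA-N"))
print(strip_stereo_prefix("L-methionine"))
print(strip_stereo_prefix("(2S)-2-isopropyl-3-oxosuccinic acid"))
```

Because both operations discard information, any match obtained this way is a candidate rather than a confirmed identification, which is why such matches are reported in the partial match column.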


In [48]:
```python
''' Run Classic mode
    The classic mode allows you to input a single metabolomics data file and a directory containing multiple metabolic networks.
'''

!metanetmap classic -s ../src/metanetmap/toys_tests_data/toys/sbml/ -a ../src/metanetmap/toys_tests_data/toys/maf/toys2.tsv -d ../src/metanetmap/toys_tests_data/conversion_datatable_toys.tsv -o mapping/classic/
```



```
-----------------------------------------
            MAPPING METABOLITES 
----------------------------------------- 

------> Main package version <------
numpy version: 2.3.3
pandas version: 2.3.2
cobra version: 0.29.1
libchebipy version: 1.0.10

Command run:
Actual command run (from sys.argv): python /home/cmuller/miniconda3/envs/metanetmap/bin/metanetmap classic -s ../src/metanetmap/toys_tests_data/toys/sbml/ -a ../src/metanetmap/toys_tests_data/toys/maf/toys2.tsv -d ../src/metanetmap/toys_tests_data/conversion_datatable_toys.tsv -o mapping/classic/



---->    MODE CLASSIC    <----       

Load metabolomics data user path: ../src/metanetmap/toys_tests_data/toys/maf/toys2.tsv and metabolic networks user path: ../src/metanetmap/toys_tests_data/toys/sbml/


---------------PATHS-------------------
List metabolomics data paths: ['../src/metanetmap/toys_tests_data/toys/maf/toys2.tsv']
List metabolic networks paths: ['../src/metanetmap/toys_tests_data/toys/sbml/toys1.sbml', '../src/metan
```

In [49]:
''' Show Classic mode 

'''
import os
import pandas as pd

# 1. Set the directory path
directory = 'mapping/classic/'

# 2. List all files in the directory
files = [f for f in os.listdir(directory) if os.path.isfile(os.path.join(directory, f))]

# 3. Check if exactly one file exists
if len(files) == 1:
    file_path = os.path.join(directory, files[0])
    print(f"Found file: {file_path}")
    
    # 4. Read the file (auto-detect TSV)
    if file_path.endswith('.tsv'):
        df = pd.read_csv(file_path, sep='\t')
    else:
        print("Unsupported file format.")
        df = None
    
    # 5. Display the table if loaded
    if df is not None:
        display(df)
else:
    print(f"Expected 1 file, found {len(files)}: {files}")


Found file: mapping/classic/mapping_results_2025-09-13_12:45:23.tsv


Unnamed: 0,Metabolites,Match in database,Match in metabolic networks,Partial match,Match via UNIQUE-ID,Match via CHEBI,Match via COMMON-NAME,Match via ABBREV-NAME,Match via SYNONYMS,Match via ADD-COMPLEMENT,...,Match via BIGG,Match via HMDB,Match via METANETX,Match via LIGAND-CPD,Match via REFMET,Match via PUBCHEM,Match via CAS,Match via INCHI-KEY,Match via SMILES,Match via FORMULA
0,methionine,MET,['met__L'],,,,YES,,,,...,,,,,,,,,,
1,8-O-methylfusarubin alcohol,CPD-18186,['CPD-18186'],,,,YES,,,,...,,,,,,,,,,
2,beta-D-galactosyl-(1->3)-N-acetyl-beta-D-gluco...,Gal-13-GlcN-R _AND_ beta-Gal-13-beta-GlcNac-R,['Gal-13-GlcN-R'],Gal-13-GlcN-R _AND_ beta-Gal-13-beta-GlcNac-R,,,YES,,,,...,,,,,,,,,,
3,orotic acid,OROTATE,['orot'],,,,,,YES,,...,,,,,,,,,,
4,Carbamyl-phosphate,CARBAMOYL-P,['Carbamyl-phosphate'],,,,,,YES,,...,,,,,,,,,,
5,pantothenic acid,PANTOTHENATE,['pnto__R'],,,,,,,YES,...,,,,,,,,,,
6,CCCC(=O)OCC(/C=C=C1([C@](C)(O)C[C@@H](OC(=O)C)...,CPD-26454,['CPD-26454'],,,,,,,,...,,,,,,,,,YES,
7,C(S)[C@@H](C(NCCC(=O)[O-])=O)NC(=O)CC[C@@H](C(...,,,,,,,,,,...,,,,,,,,,,
8,InChIKey=FSCYHDCTAZZSKN-DJNGBR,,,,,,,,,,...,,,,,,,,,,
9,thionine,,,,,,,,,,...,,,,,,,,,,
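
The display cell above expects exactly one file in the output directory. Because result files are timestamped, re-running the mapping without cleaning up leaves several of them. As a hedged alternative, the sketch below loads the most recently modified `.tsv` file instead; `load_latest_result` is a hypothetical helper, not part of MetaNetMap.

```python
from pathlib import Path
from typing import Optional

import pandas as pd


def load_latest_result(directory: str) -> Optional[pd.DataFrame]:
    """Load the most recently modified .tsv file in `directory`, if any."""
    tsv_files = sorted(Path(directory).glob("*.tsv"), key=lambda p: p.stat().st_mtime)
    if not tsv_files:
        return None
    # index_col=0 reuses the unnamed first column of the file as the row index.
    return pd.read_csv(tsv_files[-1], sep="\t", index_col=0)
```

In the notebook, `display(load_latest_result('mapping/classic/'))` could then replace the single-file check.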


## Output File Content and Column Structure

| **Column Name**                  | **Description**                                                                                                                                                       |
|----------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `Metabolites`                    | Name of the input metabolite (from the experimental data). May be a name, SMILES, InChIKey, or identifier. If multiple matches are found, they are joined with "\_AND\_".                                                       |
| `Match in database`             | Main match found in the reference database (e.g., MetaCyc). May be a MetaCyc ID like `CPD-XXXX` or a named entity. Multiple matches are joined with "\_AND\_" and flagged in **Partial Match**.                                                   |
| `Match in metabolic networks`    | List of metabolite matches in the metabolic network (SBML model). Typically uses short IDs like `met__L`. Returned as a list: `['met__L']`. In community mode, the list indicates each SBML model where the metabolite is present; the matched names are reported in the log.  |
| `Partial match`                  | Shows ambiguous or post-processed matches, e.g.: <br> - Duplicates <br> - CHEBI ontology expansion <br> - INCHIKEY simplification <br> - Enantiomer removal         |
| `Match via UNIQUE-ID`           | Indicates whether a match was found using the MetaCyc `UNIQUE-ID` from the `datatable_conversion`. Displays `YES` if matched.                                        |
| `Match via CHEBI`               | Match based on **ChEBI** identifier. Displays `YES` if a ChEBI ID in the data matched the network.                                                                   |
| `Match via COMMON-NAME`         | Match based on common (non-abbreviated) name of the metabolite. E.g., `"methionine"`.                                                                                |
| `Match via ABBREV-NAME`         | Match based on abbreviated names, often from SBML or COBRA models. E.g., `"met__L"`, `"pnto__R"`.                                                                    |
| `Match via SYNONYMS`            | Match using any of the listed synonyms for the metabolite. Useful when matching trivial or alternate names.                                                          |
| `Match via ADD-COMPLEMENT`      | Match using manually added complementary fields (from `ADD-COMPLEMENT` column in your input data).                                                                   |
| `Match via BIGG`                | Match using **BiGG Models** identifiers. Typically abbreviated and used in genome-scale models.                                                                      |
| `Match via HMDB`                | Match via **Human Metabolome Database (HMDB)** identifiers.                                                                                                           |
| `Match via METANETX`            | Match via **MetaNetX** IDs, used for cross-database integration.                                                                                                     |
| `Match via LIGAND-CPD`          | Match via identifiers from **KEGG Ligand** or other ligand-based databases.                                                                                          |
| `Match via REFMET`              | Match via **RefMet**, a reference nomenclature system for metabolomics.                                                                                              |
| `Match via PUBCHEM`             | Match via **PubChem Compound IDs (CIDs)**.                                                                                                                            |
| `Match via CAS`                 | Match using **CAS numbers** (Chemical Abstracts Service).                                                                                                             |
| `Match via INCHI-KEY`           | Match based on the **InChIKey**, a hashed version of the InChI chemical identifier.                                                                                   |
| `Match via SMILES`              | Match via the **SMILES** string (Simplified Molecular Input Line Entry System) representing the molecular structure.                                                  |
| `Match via FORMULA`             | Match based on **molecular formula**, e.g., `C6H12O6`.                                                                                                                |
| `Input File Match`              | - **Classic mode**: Matched identifier from the original input file. <br> - **Community mode**: List of input files where the match was found, e.g., `[data1, data3]`. |
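
The column conventions above (list-like strings, `_AND_` separators, `YES` flags) can be decoded with pandas. The following is a minimal sketch under stated assumptions: the two sample rows are copied from the classic-mode output and reduced to a few columns, and the helper names `parse_list` and `split_and` as well as the derived columns are hypothetical.

```python
import ast
import io

import pandas as pd

# Two rows taken from the classic-mode result, reduced to five columns.
sample = io.StringIO(
    "Metabolites\tMatch in database\tMatch in metabolic networks\t"
    "Partial match\tMatch via COMMON-NAME\n"
    "methionine\tMET\t['met__L']\t\tYES\n"
    "thionine\t\t\t\t\n"
)
df = pd.read_csv(sample, sep="\t")


def parse_list(cell):
    """Turn a cell such as "['met__L']" into a list; empty cells give []."""
    return ast.literal_eval(cell) if isinstance(cell, str) and cell.strip() else []


def split_and(cell):
    """Split a cell on the ' _AND_ ' separator; empty cells give []."""
    return cell.split(" _AND_ ") if isinstance(cell, str) and cell else []


df["network_ids"] = df["Match in metabolic networks"].apply(parse_list)
df["database_ids"] = df["Match in database"].apply(split_and)

# Collect, for each row, the names of the "Match via" columns flagged YES.
via_cols = [c for c in df.columns if c.startswith("Match via ")]
df["matched_via"] = pd.Series(
    [[c[len("Match via "):] for c in via_cols if row[c] == "YES"]
     for _, row in df.iterrows()],
    index=df.index,
)

# Metabolites with no match in any metabolic network.
unmatched = df.loc[df["network_ids"].map(len) == 0, "Metabolites"].tolist()
print(df[["Metabolites", "database_ids", "network_ids", "matched_via"]])
print("Unmatched:", unmatched)
```

On a real result file, replace `sample` with the path of the `.tsv` produced by the run; the same logic applies to the full set of `Match via` columns.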



In [50]:
''' Run Classic mode with partial match
    The classic mode allows you to input a single metabolomics data file and a directory containing multiple metabolic networks.

'''

!metanetmap classic -p -s ../src/metanetmap/toys_tests_data/toys/sbml/ -a ../src/metanetmap/toys_tests_data/toys/maf/toys2.tsv -d ../src/metanetmap/toys_tests_data/conversion_datatable_toys.tsv -o mapping/classic_partial/



-----------------------------------------
            MAPPING METABOLITES 
----------------------------------------- 

------> Main package version <------
numpy version: 2.3.3
pandas version: 2.3.2
cobra version: 0.29.1
libchebipy version: 1.0.10

Command run:
Actual command run (from sys.argv): python /home/cmuller/miniconda3/envs/metanetmap/bin/metanetmap classic -p -s ../src/metanetmap/toys_tests_data/toys/sbml/ -a ../src/metanetmap/toys_tests_data/toys/maf/toys2.tsv -d ../src/metanetmap/toys_tests_data/conversion_datatable_toys.tsv -o mapping/classic_partial/



---->    MODE CLASSIC    <----       

Load metabolomics data user path: ../src/metanetmap/toys_tests_data/toys/maf/toys2.tsv and metabolic networks user path: ../src/metanetmap/toys_tests_data/toys/sbml/


---------------PATHS-------------------
List metabolomics data paths: ['../src/metanetmap/toys_tests_data/toys/maf/toys2.tsv']
List metabolic networks paths: ['../src/metanetmap/toys_tests_data/toys/sbml/toys1.sbml', '.

In [51]:
''' Show Classic mode partial match

'''
import os
import pandas as pd

# 1. Set the directory path
directory = 'mapping/classic_partial/'

# 2. List all files in the directory
files = [f for f in os.listdir(directory) if os.path.isfile(os.path.join(directory, f))]

# 3. Check if exactly one file exists
if len(files) == 1:
    file_path = os.path.join(directory, files[0])
    print(f"Found file: {file_path}")
    
    # 4. Read the file (auto-detect TSV)
    if file_path.endswith('.tsv'):
        df = pd.read_csv(file_path, sep='\t')
    else:
        print("Unsupported file format.")
        df = None
    
    # 5. Display the table if loaded
    if df is not None:
        display(df)
else:
    print(f"Expected 1 file, found {len(files)}: {files}")


Found file: mapping/classic_partial/mapping_results_partial_match_2025-09-13_12:45:26.tsv


Unnamed: 0,Metabolites,Match in database,Match in metabolic networks,Partial match,Match via UNIQUE-ID,Match via CHEBI,Match via COMMON-NAME,Match via ABBREV-NAME,Match via SYNONYMS,Match via ADD-COMPLEMENT,...,Match via BIGG,Match via HMDB,Match via METANETX,Match via LIGAND-CPD,Match via REFMET,Match via PUBCHEM,Match via CAS,Match via INCHI-KEY,Match via SMILES,Match via FORMULA
0,methionine,MET,['met__L'],,,,YES,,,,...,,,,,,,,,,
1,8-O-methylfusarubin alcohol,CPD-18186,['CPD-18186'],,,,YES,,,,...,,,,,,,,,,
2,beta-D-galactosyl-(1->3)-N-acetyl-beta-D-gluco...,Gal-13-GlcN-R _AND_ beta-Gal-13-beta-GlcNac-R,['Gal-13-GlcN-R'],Gal-13-GlcN-R _AND_ beta-Gal-13-beta-GlcNac-R,,,YES,,,,...,,,,,,,,,,
3,orotic acid,OROTATE,['orot'],,,,,,YES,,...,,,,,,,,,,
4,Carbamyl-phosphate,CARBAMOYL-P,['Carbamyl-phosphate'],,,,,,YES,,...,,,,,,,,,,
5,pantothenic acid,PANTOTHENATE,['pnto__R'],,,,,,,YES,...,,,,,,,,,,
6,CCCC(=O)OCC(/C=C=C1([C@](C)(O)C[C@@H](OC(=O)C)...,CPD-26454,['CPD-26454'],,,,,,,,...,,,,,,,,,YES,
7,C(S)[C@@H](C(NCCC(=O)[O-])=O)NC(=O)CC[C@@H](C(...,,,,,,,,,,...,,,,,,,,,,
8,InChIKey=FSCYHDCTAZZSKN-DJNGBR,,,,,,,,,,...,,,,,,,,,,
9,thionine,,,,,,,,,,...,,,,,,,,,,


In [52]:
''' Run Community mode
   The "community" mode allows you to input a directory containing multiple metabolomics data files, as well as a directory containing multiple metabolic networks.


'''

!metanetmap community -s ../src/metanetmap/toys_tests_data/toys/sbml/ -a ../src/metanetmap/toys_tests_data/toys/maf/ -d ../src/metanetmap/toys_tests_data/conversion_datatable_toys.tsv -o mapping/community/



-----------------------------------------
            MAPPING METABOLITES 
----------------------------------------- 

------> Main package version <------
numpy version: 2.3.3
pandas version: 2.3.2
cobra version: 0.29.1
libchebipy version: 1.0.10

Command run:
Actual command run (from sys.argv): python /home/cmuller/miniconda3/envs/metanetmap/bin/metanetmap community -s ../src/metanetmap/toys_tests_data/toys/sbml/ -a ../src/metanetmap/toys_tests_data/toys/maf/ -d ../src/metanetmap/toys_tests_data/conversion_datatable_toys.tsv -o mapping/community/


---->    MODE COMMUNITY    <----
Load metabolomics data user path: ../src/metanetmap/toys_tests_data/toys/maf/ and metabolic networks user path: ../src/metanetmap/toys_tests_data/toys/sbml/

---------------PATHS-------------------
List metabolomics data paths: ['../src/metanetmap/toys_tests_data/toys/maf/toys1.tsv', '../src/metanetmap/toys_tests_data/toys/maf/toys3.tsv', '../src/metanetmap/toys_tests_data/toys/maf/toys2.tsv']
List metaboli

In [53]:
''' Show Community mode 

'''
import os
import pandas as pd

# 1. Set the directory path
directory = 'mapping/community/'

# 2. List all files in the directory
files = [f for f in os.listdir(directory) if os.path.isfile(os.path.join(directory, f))]

# 3. Check if exactly one file exists
if len(files) == 1:
    file_path = os.path.join(directory, files[0])
    print(f"Found file: {file_path}")
    
    # 4. Read the file (auto-detect TSV)
    if file_path.endswith('.tsv'):
        df = pd.read_csv(file_path, sep='\t')
    else:
        print("Unsupported file format.")
        df = None
    
    # 5. Display the table if loaded
    if df is not None:
        display(df)
else:
    print(f"Expected 1 file, found {len(files)}: {files}")


Found file: mapping/community/community_mapping_results_2025-09-13_12:45:29.tsv


Unnamed: 0,Metabolites,Match in database,Match in metabolic networks,Partial match,Match via UNIQUE-ID,Match via CHEBI,Match via COMMON-NAME,Match via ABBREV-NAME,Match via SYNONYMS,Match via ADD-COMPLEMENT,...,Match via BIGG,Match via HMDB,Match via METANETX,Match via LIGAND-CPD,Match via REFMET,Match via PUBCHEM,Match via CAS,Match via INCHI-KEY,Match via SMILES,Match via FORMULA
0,CPD-17381 _AND_ roquefortine C,CPD-17381,,,YES,,YES,,,,...,,,,,,,,,,
1,84783 _AND_ CPD-25370,CPD-25370,['toys1'],,YES,YES,,,,,...,,,,,,,,,,
2,C9H16NO5,,['toys3'],C9H16NO5,YES,,,,,,...,,,,,,,,,,
3,4167,Glucopyranose,"['toys1', 'toys3']",,,YES,,,,,...,,,,,,,,,,
4,L-methionine _AND_ methionine,MET,"['toys1', 'toys2', 'toys3']",,,,YES,,,,...,,,,,,,,,,
5,16708 _AND_ Adenine,ADENINE,"['toys1', 'toys3']",,,,YES,,,,...,,,,,,,,,,
6,Beta-D-galactosyl-(1->3)-N-acetyl-beta-D-gluco...,Gal-13-GlcN-R _AND_ beta-Gal-13-beta-GlcNac-R,"['toys1', 'toys2']",Gal-13-GlcN-R _AND_ beta-Gal-13-beta-GlcNac-R,,,YES,,,,...,,,,,,,,,,
7,8-O-methylfusarubin alcohol,CPD-18186,['toys2'],,,,YES,,,,...,,,,,,,,,,
8,orotic acid,OROTATE,"['toys2', 'toys3']",,,,,,YES,,...,,,,,,,,,,
9,Carbamyl-phosphate,CARBAMOYL-P,['toys2'],,,,,,YES,,...,,,,,,,,,,
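
In community mode, the `Match in metabolic networks` column lists the networks in which each metabolite was found. As an illustrative sketch (the three rows are copied from the output above; the variable names are hypothetical), this column can be expanded into a metabolite-by-network presence table:

```python
import ast

import pandas as pd

# Three rows taken from the community-mode result above.
rows = pd.DataFrame({
    "Match in database": ["Glucopyranose", "MET", "CARBAMOYL-P"],
    "Match in metabolic networks": [
        "['toys1', 'toys3']",
        "['toys1', 'toys2', 'toys3']",
        "['toys2']",
    ],
})

# Parse the list-like strings, then produce one row per (metabolite, network).
networks = rows["Match in metabolic networks"].apply(ast.literal_eval)
long_form = rows.assign(network=networks).explode("network", ignore_index=True)

# Cross-tabulate into a boolean presence/absence table.
presence = pd.crosstab(long_form["Match in database"], long_form["network"]).astype(bool)
print(presence)
```

Summing the columns of `presence` gives the number of mapped metabolites per network, which is a quick way to compare coverage across a community.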


In [54]:
''' Run Community mode with partial match
    The "community" mode allows you to input a directory containing multiple metabolomics data files, as well as a directory containing multiple metabolic networks.

'''

!metanetmap community -p -s ../src/metanetmap/toys_tests_data/toys/sbml/ -a ../src/metanetmap/toys_tests_data/toys/maf/ -d ../src/metanetmap/toys_tests_data/conversion_datatable_toys.tsv -o mapping/community_partial/



-----------------------------------------
            MAPPING METABOLITES 
----------------------------------------- 

------> Main package version <------
numpy version: 2.3.3
pandas version: 2.3.2
cobra version: 0.29.1
libchebipy version: 1.0.10

Command run:
Actual command run (from sys.argv): python /home/cmuller/miniconda3/envs/metanetmap/bin/metanetmap community -p -s ../src/metanetmap/toys_tests_data/toys/sbml/ -a ../src/metanetmap/toys_tests_data/toys/maf/ -d ../src/metanetmap/toys_tests_data/conversion_datatable_toys.tsv -o mapping/community_partial/


---->    MODE COMMUNITY    <----
Load metabolomics data user path: ../src/metanetmap/toys_tests_data/toys/maf/ and metabolic networks user path: ../src/metanetmap/toys_tests_data/toys/sbml/

---------------PATHS-------------------
List metabolomics data paths: ['../src/metanetmap/toys_tests_data/toys/maf/toys1.tsv', '../src/metanetmap/toys_tests_data/toys/maf/toys3.tsv', '../src/metanetmap/toys_tests_data/toys/maf/toys2.tsv']
Li

In [55]:
''' Show Community mode with partial match 

'''
import os
import pandas as pd

# 1. Set the directory path
directory = 'mapping/community_partial/'

# 2. List all files in the directory
files = [f for f in os.listdir(directory) if os.path.isfile(os.path.join(directory, f))]

# 3. Check if exactly one file exists
if len(files) == 1:
    file_path = os.path.join(directory, files[0])
    print(f"Found file: {file_path}")
    
    # 4. Read the file (auto-detect TSV)
    if file_path.endswith('.tsv'):
        df = pd.read_csv(file_path, sep='\t')
    else:
        print("Unsupported file format.")
        df = None
    
    # 5. Display the table if loaded
    if df is not None:
        display(df)
else:
    print(f"Expected 1 file, found {len(files)}: {files}")


Found file: mapping/community_partial/community_mapping_results_partial_match_2025-09-13_12:45:32.tsv


Unnamed: 0,Metabolites,Match in database,Match in metabolic networks,Partial match,Match via UNIQUE-ID,Match via CHEBI,Match via COMMON-NAME,Match via ABBREV-NAME,Match via SYNONYMS,Match via ADD-COMPLEMENT,...,Match via BIGG,Match via HMDB,Match via METANETX,Match via LIGAND-CPD,Match via REFMET,Match via PUBCHEM,Match via CAS,Match via INCHI-KEY,Match via SMILES,Match via FORMULA
0,CPD-17381 _AND_ roquefortine C,CPD-17381,,,YES,,YES,,,,...,,,,,,,,,,
1,84783 _AND_ CPD-25370,CPD-25370,['toys1'],,YES,YES,,,,,...,,,,,,,,,,
2,C9H16NO5,,['toys3'],C9H16NO5,YES,,,,,,...,,,,,,,,,,
3,4167,Glucopyranose,"['toys1', 'toys3']",,,YES,,,,,...,,,,,,,,,,
4,L-methionine _AND_ methionine,MET,"['toys1', 'toys2', 'toys3']",,,,YES,,,,...,,,,,,,,,,
5,16708 _AND_ Adenine,ADENINE,"['toys1', 'toys3']",,,,YES,,,,...,,,,,,,,,,
6,Beta-D-galactosyl-(1->3)-N-acetyl-beta-D-gluco...,Gal-13-GlcN-R _AND_ beta-Gal-13-beta-GlcNac-R,"['toys1', 'toys2']",Gal-13-GlcN-R _AND_ beta-Gal-13-beta-GlcNac-R,,,YES,,,,...,,,,,,,,,,
7,8-O-methylfusarubin alcohol,CPD-18186,['toys2'],,,,YES,,,,...,,,,,,,,,,
8,orotic acid,OROTATE,"['toys2', 'toys3']",,,,,,YES,,...,,,,,,,,,,
9,Carbamyl-phosphate,CARBAMOYL-P,['toys2'],,,,,,YES,,...,,,,,,,,,,


## Log example

In [56]:
import os

# 1. Set the directory path
directory = 'mapping/classic_partial/logs'

# 2. List all files in the directory
files = [f for f in os.listdir(directory) if os.path.isfile(os.path.join(directory, f))]

# 3. Check if exactly one file exists
if len(files) == 1:
    file_path = os.path.join(directory, files[0])
    print(f"Found log file: {file_path}\n")

    # 4. Read and print the content of the log file
    with open(file_path, 'r', encoding='utf-8') as f:
        content = f.read()
        print(content)
else:
    print(f"Expected 1 file, found {len(files)}: {files}")

Found log file: mapping/classic_partial/logs/mapping_2025-09-13_12:45:26.log

-----------------------------------------
            MAPPING METABOLITES 
----------------------------------------- 

------> Main package version <------
numpy version: 2.3.3
pandas version: 2.3.2
cobra version: 0.29.1
libchebipy version: 1.0.10

Command run:
Actual command run (from sys.argv): python /home/cmuller/miniconda3/envs/metanetmap/bin/metanetmap classic -p -s ../src/metanetmap/toys_tests_data/toys/sbml/ -a ../src/metanetmap/toys_tests_data/toys/maf/toys2.tsv -d ../src/metanetmap/toys_tests_data/conversion_datatable_toys.tsv -o mapping/classic_partial/



---->    MODE CLASSIC    <----       

Load metabolomics data user path: ../src/metanetmap/toys_tests_data/toys/maf/toys2.tsv and metabolic networks user path: ../src/metanetmap/toys_tests_data/toys/sbml/


---------------PATHS-------------------
List metabolomics data paths: ['../src/metanetmap/toys_tests_data/toys/maf/toys2.tsv']
List metabolic

## Remove generated data
Remove the `mapping/` output directory generated by this notebook.

In [57]:
''' Remove the generated output
    Deletes the mapping/ directory created by the runs above, including results and logs.

'''

!rm -r mapping/