<a href="https://colab.research.google.com/github/guillaume-souede/immunogit/blob/main/GetLogicalModelAndMetadata.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [4]:
import os
import zipfile
import requests
import json
from bioservices import BioModels

Creating directory /root/.config/bioservices 


In [5]:
s = BioModels()

INFO:bioservices.BioModels:Initialising BioModels service (REST)


Creating directory /root/.cache/bioservices 
Welcome to Bioservices
It looks like you do not have a configuration file.
We are creating one with default values in /root/.config/bioservices/bioservices.cfg .
Done


In [6]:
def get_all_models(query, page_size=10):
    """
    Retrieve every model matching the query, paging through the search results.
    """
    models = []
    offset = 0

    while True:
        try:
            # Request one page; numResults keeps the page size in step with the offset.
            search_results = s.search(query, offset=offset, numResults=page_size)

            # An empty or missing page means every result has been collected.
            if 'models' not in search_results or not search_results['models']:
                break

            models.extend(search_results['models'])
            offset += page_size
            print(f"Page {offset // page_size} downloaded, {len(search_results['models'])} models retrieved.")

        except Exception as e:
            print(f"Error while retrieving models: {e}")
            break

    return models
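
The offset-based paging in `get_all_models` can be tried offline. The sketch below is a minimal illustration and does not call BioModels: `fake_search` is a hypothetical stand-in for `s.search` that serves slices of an in-memory list, and `collect_all` is an assumed helper that mirrors the loop above.

```python
def fake_search(items, offset, num_results):
    """Hypothetical stand-in for a paged search endpoint (not the BioModels API)."""
    return {"models": items[offset:offset + num_results]}


def collect_all(items, page_size=10):
    """Collect every item by requesting pages until an empty page comes back."""
    collected = []
    page_sizes = []
    offset = 0
    while True:
        page = fake_search(items, offset, page_size)["models"]
        if not page:
            break
        collected.extend(page)
        page_sizes.append(len(page))
        offset += page_size
    return collected, page_sizes


# 47 made-up identifiers: four full pages and one partial page.
all_items = [f"ITEM{i:03d}" for i in range(47)]
collected, page_sizes = collect_all(all_items)
print(page_sizes)
```

The loop stops on the first empty page, so a final partial page costs one extra request.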

In [21]:
def download_model_file(model_id, directory):
    """
    Download the model archive and extract its SBML file into `directory`.

    Returns the path of the extracted file, or None on failure.
    """
    # Path of the temporary ZIP archive
    zip_path = os.path.join(directory, f"{model_id}.zip")

    try:
        # Download the ZIP archive
        s.get_model_download(model_id, output_filename=zip_path)

        with zipfile.ZipFile(zip_path, 'r') as zip_ref:
            names = zip_ref.namelist()

            # Prefer .xml files, skipping the URN variant and the archive manifest
            xml_files = [
                f for f in names
                if f.lower().endswith('.xml')
                and not f.lower().endswith(('_urn.xml', 'manifest.xml'))
            ]

            # Fall back to .sbml files
            if not xml_files:
                xml_files = [f for f in names if f.lower().endswith('.sbml')]

            if not xml_files:
                print(f"No SBML/XML file found in the ZIP for {model_id}")
                return None

            # Extract only the first match, flattened into `directory`, so the
            # returned path is valid even when the file sits in a subfolder of the ZIP
            xml_filename = xml_files[0]
            extracted_path = os.path.join(directory, os.path.basename(xml_filename))
            with zip_ref.open(xml_filename) as src, open(extracted_path, 'wb') as dst:
                dst.write(src.read())

        print(f"SBML file extracted: {extracted_path}")
        return extracted_path

    except Exception as e:
        print(f"Error while downloading model {model_id}: {e}")
        return None

    finally:
        # Delete the downloaded ZIP archive, whether or not extraction succeeded
        if os.path.exists(zip_path):
            os.remove(zip_path)
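
The file-selection rule inside `download_model_file` can be checked without downloading anything. The sketch below isolates it in a hypothetical helper, `pick_sbml_member`, which is not part of the notebook; the archive listings are made up for illustration.

```python
def pick_sbml_member(names):
    """Return the first archive member that looks like an SBML file, or None."""
    excluded = ("_urn.xml", "manifest.xml")
    # First choice: .xml files that are neither the URN variant nor the manifest.
    xml_files = [
        n for n in names
        if n.lower().endswith(".xml") and not n.lower().endswith(excluded)
    ]
    if xml_files:
        return xml_files[0]
    # Fallback: files with an .sbml extension (case-insensitive).
    sbml_files = [n for n in names if n.lower().endswith(".sbml")]
    return sbml_files[0] if sbml_files else None


# Hypothetical archive listings, invented for this example.
print(pick_sbml_member(["manifest.xml", "MODEL_X_urn.xml", "MODEL_X_url.xml"]))
print(pick_sbml_member(["manifest.xml", "network.SBML", "notes.txt"]))
print(pick_sbml_member(["manifest.xml", "readme.txt"]))
```

Keeping the rule in a pure function makes it straightforward to test against archive layouts that turn out to behave unexpectedly.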


In [22]:
def download_model_with_metadata(model_data, base_directory):
    """
    Download a model and its metadata, then save them into a ZIP file.

    Args:
        model_data (dict): A search-result entry describing the model; must contain 'id'.
        base_directory (str): Path to the root directory where the model should be saved.

    Returns:
        None
    """
    try:
        model_id = model_data['id']

        def contains_keyword(data, keyword):
            """
            Recursively search for a keyword in all string values of a nested structure.

            Args:
                data (any): The data to search (can be dict, list, str).
                keyword (str): The keyword to search for (case-insensitive).

            Returns:
                bool: True if the keyword is found, False otherwise.
            """
            if isinstance(data, dict):
                return any(contains_keyword(v, keyword) for v in data.values())
            elif isinstance(data, list):
                return any(contains_keyword(item, keyword) for item in data)
            elif isinstance(data, str):
                return keyword.lower() in data.lower()
            return False


        # Retrieve full metadata
        try:
            full_metadata = s.get_model(model_id)
        except Exception as e:
            print(f"Error retrieving full metadata for {model_id}: {e}")
            return

        # Determine destination directory based on content;
        # models matching neither keyword are skipped
        if contains_keyword(full_metadata, "immun"):
            category = "immun"
        elif contains_keyword(full_metadata, "T cell"):
            category = "T-cell"
        else:
            return

        # Curated BioModels entries carry identifiers starting with "BIOMD"
        curation = "Curated_models" if "BIOM" in model_id else "No_Curated_models"
        directory = os.path.join(base_directory, category, curation)

        os.makedirs(directory, exist_ok=True)

        # Download SBML file
        model_path = download_model_file(model_id, directory)
        if model_path is None:
            return

        # Save metadata as JSON
        metadata_filename = f"{model_id}_metadata.json"
        metadata_path = os.path.join(directory, metadata_filename)
        with open(metadata_path, 'w', encoding='utf-8') as f:
            json.dump(full_metadata, f, ensure_ascii=False, indent=4)

        # Create ZIP file containing model and metadata
        zip_filename = os.path.join(directory, f"{model_id}.zip")
        with zipfile.ZipFile(zip_filename, 'w') as zipf:
            zipf.write(model_path, os.path.basename(model_path))
            zipf.write(metadata_path, os.path.basename(metadata_path))

        # Remove temporary files
        os.remove(model_path)
        os.remove(metadata_path)

        print(f"Model {model_id} and its metadata saved to {zip_filename}")

    except Exception as e:
        # Use .get() so a missing 'id' cannot raise a second error inside the handler
        print(f"An error occurred while processing model {model_data.get('id', '<unknown>')}: {e}")
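
The classification step in `download_model_with_metadata` can be exercised on its own. The sketch below lifts the recursive keyword search to module level and adds a hypothetical helper, `destination_for`, that returns the target directory or `None`. The metadata dictionaries are invented for illustration and do not reproduce the actual BioModels metadata layout.

```python
import os


def contains_keyword(data, keyword):
    """Case-insensitive search for `keyword` in every string of a nested structure."""
    if isinstance(data, dict):
        return any(contains_keyword(v, keyword) for v in data.values())
    if isinstance(data, list):
        return any(contains_keyword(item, keyword) for item in data)
    if isinstance(data, str):
        return keyword.lower() in data.lower()
    return False


def destination_for(model_id, metadata, base_directory):
    """Hypothetical helper: map a model to its output directory, or None to skip it."""
    if contains_keyword(metadata, "immun"):
        category = "immun"
    elif contains_keyword(metadata, "T cell"):
        category = "T-cell"
    else:
        return None
    curation = "Curated_models" if "BIOM" in model_id else "No_Curated_models"
    return os.path.join(base_directory, category, curation)


# Made-up metadata records, not real BioModels responses.
t_cell_meta = {"name": "Example", "publication": {"title": "Regulation of T cell activation"}}
immune_meta = {"name": "Example", "tags": ["Innate immunity", 5]}
other_meta = {"name": "Example", "tags": ["glycolysis"]}

print(destination_for("MODEL0000000001", t_cell_meta, "out"))
print(destination_for("BIOMD0000000001", immune_meta, "out"))
print(destination_for("MODEL0000000002", other_meta, "out"))
```

Because the match is a plain substring test, "T cell" also matches text such as "yeast cell", so a stricter pattern may be worth considering if such false positives matter.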

In [23]:
def main():
    # Updated query
    query = (
        'logical AND modelformat:"SBML"'
    )

    # Root output directory
    base_directory = "downloaded_models_logical"
    os.makedirs(base_directory, exist_ok=True)

    # Retrieve all matching models
    models = get_all_models(query)

    # Download each model and sort it into its category
    for model_data in models:
        download_model_with_metadata(model_data, base_directory)

In [24]:
if __name__ == "__main__":
    main()

Page 1 downloaded, 10 models retrieved.
Page 2 downloaded, 10 models retrieved.
Page 3 downloaded, 10 models retrieved.
Page 4 downloaded, 10 models retrieved.
Page 5 downloaded, 7 models retrieved.


INFO:bioservices.BioModels:Saving file MODEL2002170001.zip


SBML file extracted: downloaded_models_logical/T-cell/No_Curated_models/Cacace_TdevModel_2nov2020.sbml
Model MODEL2002170001 and its metadata saved to downloaded_models_logical/T-cell/No_Curated_models/MODEL2002170001.zip


INFO:bioservices.BioModels:Saving file MODEL1506260000.zip


SBML file extracted: downloaded_models_logical/T-cell/No_Curated_models/MODEL1506260000_url.xml
Model MODEL1506260000 and its metadata saved to downloaded_models_logical/T-cell/No_Curated_models/MODEL1506260000.zip


INFO:bioservices.BioModels:Saving file MODEL2304070002.zip


SBML file extracted: downloaded_models_logical/immun/No_Curated_models/intracellular_model.sbml
Model MODEL2304070002 and its metadata saved to downloaded_models_logical/immun/No_Curated_models/MODEL2304070002.zip


INFO:bioservices.BioModels:Saving file MODEL2004040001.zip


SBML file extracted: downloaded_models_logical/T-cell/No_Curated_models/Selvaggio_etal_2020.sbml
Model MODEL2004040001 and its metadata saved to downloaded_models_logical/T-cell/No_Curated_models/MODEL2004040001.zip


INFO:bioservices.BioModels:Saving file MODEL1903260003.zip


SBML file extracted: downloaded_models_logical/T-cell/No_Curated_models/RodriguezJorge_Merged_TCR_TLR5_Signalling_BooleanModel_15Jul2018.sbml
Model MODEL1903260003 and its metadata saved to downloaded_models_logical/T-cell/No_Curated_models/MODEL1903260003.zip


INFO:bioservices.BioModels:Saving file MODEL1903260001.zip


SBML file extracted: downloaded_models_logical/T-cell/No_Curated_models/RodriguezJorge_TCR_Signalling_BooleanModel_17Jul2018.sbml
Model MODEL1903260001 and its metadata saved to downloaded_models_logical/T-cell/No_Curated_models/MODEL1903260001.zip


INFO:bioservices.BioModels:Saving file MODEL1903260002.zip


SBML file extracted: downloaded_models_logical/T-cell/No_Curated_models/RodriguezJorge_TLR5_Signalling_BooleanModel_17Jul2018.xml
Model MODEL1903260002 and its metadata saved to downloaded_models_logical/T-cell/No_Curated_models/MODEL1903260002.zip


INFO:bioservices.BioModels:Saving file MODEL2101150001.zip


SBML file extracted: downloaded_models_logical/immun/No_Curated_models/Corral_ThIL17diff_15jan2021.sbml
Model MODEL2101150001 and its metadata saved to downloaded_models_logical/immun/No_Curated_models/MODEL2101150001.zip


INFO:bioservices.BioModels:Saving file MODEL2308300001.zip


SBML file extracted: downloaded_models_logical/immun/No_Curated_models/PsoriaSys.sbml
Model MODEL2308300001 and its metadata saved to downloaded_models_logical/immun/No_Curated_models/MODEL2308300001.zip


INFO:bioservices.BioModels:Saving file MODEL1910020002.zip


SBML file extracted: downloaded_models_logical/T-cell/No_Curated_models/Afenya2018 - peripheral blodd dynamics in the disease state.xml
Model MODEL1910020002 and its metadata saved to downloaded_models_logical/T-cell/No_Curated_models/MODEL1910020002.zip


INFO:bioservices.BioModels:Saving file MODEL1904150001.zip


SBML file extracted: downloaded_models_logical/immun/No_Curated_models/Hannig(geb Scheidel)2016 - In Silico Knockout Studies of Xenophagic Capturing of Salmonella, Petri Nets.xml
Model MODEL1904150001 and its metadata saved to downloaded_models_logical/immun/No_Curated_models/MODEL1904150001.zip


INFO:bioservices.BioModels:Saving file BIOMD0000000220.zip


SBML file extracted: downloaded_models_logical/immun/Curated_models/BIOMD0000000220_url.xml
Model BIOMD0000000220 and its metadata saved to downloaded_models_logical/immun/Curated_models/BIOMD0000000220.zip


INFO:bioservices.BioModels:Saving file MODEL1606020000.zip


SBML file extracted: downloaded_models_logical/immun/No_Curated_models/MODEL1606020000_url.xml
Model MODEL1606020000 and its metadata saved to downloaded_models_logical/immun/No_Curated_models/MODEL1606020000.zip


INFO:bioservices.BioModels:Saving file MODEL2007020001.zip


SBML file extracted: downloaded_models_logical/T-cell/No_Curated_models/Muenzner2019_Yeast_Cell_Cycle_Control_Network.sbml
Model MODEL2007020001 and its metadata saved to downloaded_models_logical/T-cell/No_Curated_models/MODEL2007020001.zip


INFO:bioservices.BioModels:Saving file BIOMD0000000592.zip


SBML file extracted: downloaded_models_logical/immun/Curated_models/BIOMD0000000592_url.xml
Model BIOMD0000000592 and its metadata saved to downloaded_models_logical/immun/Curated_models/BIOMD0000000592.zip


INFO:bioservices.BioModels:Saving file BIOMD0000000593.zip


SBML file extracted: downloaded_models_logical/immun/Curated_models/BIOMD0000000593_url.xml
Model BIOMD0000000593 and its metadata saved to downloaded_models_logical/immun/Curated_models/BIOMD0000000593.zip
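
Each archive written by `download_model_with_metadata` holds the SBML file next to `<model_id>_metadata.json`. The sketch below shows one possible way to read the metadata back without unpacking the archive to disk. `read_metadata` is a hypothetical helper, and the archive is built in memory with invented contents so the example runs offline.

```python
import io
import json
import zipfile


def read_metadata(zip_source, model_id):
    """Load `<model_id>_metadata.json` from a ZIP given as a path or file-like object."""
    with zipfile.ZipFile(zip_source, "r") as zf:
        with zf.open(f"{model_id}_metadata.json") as fh:
            return json.load(fh)


# Build a stand-in archive in memory; names and contents are made up for this example.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("EXAMPLE_MODEL.xml", "<sbml/>")
    zf.writestr("EXAMPLE_MODEL_metadata.json", json.dumps({"name": "Example model"}))
buffer.seek(0)

metadata = read_metadata(buffer, "EXAMPLE_MODEL")
print(metadata["name"])
```

With a real run, `zip_source` would be a path such as one of the `.zip` files listed in the output above.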
