# OntoGraph Usage - Jupyter Notebook

[//]: # (------------------------------------------    DO NOT MODIFY THIS    ------------------------------------------)
<style type="text/css">
.tg  {border-collapse:collapse;
      border-spacing:0;
     }
.tg td{border-color:black;
       border-style:solid;
       border-width:1px;
       font-family:Arial, sans-serif;
       font-size:14px;
       overflow:hidden;
       padding:10px 5px;
       word-break:normal;
      }
.tg th{border-color:black;
       border-style:solid;
       border-width:1px;
       font-family:Arial, sans-serif;
       font-size:14px;
       font-weight:normal;
       overflow:hidden;
       padding:10px 5px;
       word-break:normal;
      }
.tg .tg-fymr{border-color:inherit;
             font-weight:bold;
             text-align:left;
             vertical-align:top
            }
.tg .tg-0pky{border-color:inherit;
             text-align:left;
             vertical-align:top
            }
[//]: # (--------------------------------------------------------------------------------------------------------------)
[//]: # (-------------------------------------    FILL THIS OUT WITH YOUR DATA    -------------------------------------)
[//]: # (--------------------------------------------------------------------------------------------------------------)
</style>
<table class="tg">
    <tbody>
      <tr>
        <td class="tg-fymr" style="font-weight: bold">Title:</td>
        <td class="tg-0pky">OntoGraph Usage - Jupyter Notebook </td>
      </tr>
      <tr>
        <td class="tg-fymr" style="font-weight: bold">Authors:</td>
        <td class="tg-0pky">
            <a href="https://github.com/ecarrenolozano" target="_blank" rel="noopener noreferrer">Edwin Carreño</a>,
            <a href="" target="_blank" rel="noopener noreferrer">Denes Türei</a>
        </td>
      </tr>
      <tr>
        <td class="tg-fymr" style="font-weight: bold">Affiliations:</td>
        <td class="tg-0pky">
            <a href="https://www.ssc.uni-heidelberg.de/en" target="_blank" rel="noopener noreferrer">Scientific Software Center</a>,
            <a href="https://saezlab.org/" target="_blank" rel="noopener noreferrer">Saez-Rodriguez Group</a>
        </td>
      </tr>
      <tr>
        <td class="tg-fymr" style="font-weight: bold">Date Created:</td>
        <td class="tg-0pky">30.10.2024</td>
      </tr>
      <tr>
        <td class="tg-fymr" style="font-weight: bold">Description:</td>
        <td class="tg-0pky">This notebook is a sandbox to explore, test, and demonstrate the core functionalities of the <code>OntoGraph</code> package, helping refine the user experience and gather early feedback before writing the official documentation.</td>
      </tr>
    </tbody>
</table>

[//]: # (--------------------------------------------------------------------------------------------------------------)

## Overview

This notebook serves as a **sandbox environment** to explore and test the functionalities of the `OntoGraph` Python package. It is designed to:  

- Evaluate the **user experience** during development.  
- Demonstrate common use cases and workflows to potential users.  
- Collect feedback from collaborators before finalizing the official documentation.
- Help define new use cases based on what users actually need.

Feel free to execute, modify, and extend the examples in this notebook to better understand how the package can be used in real scenarios.

## How to Use This Notebook

1. Run cells sequentially from top to bottom.
2. If using an environment manager (e.g., `conda`, `venv`), activate it before running.
3. To reset, click `Kernel -> Restart & Run All`.

## Importing Libraries

In [None]:
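# 1. Standard library imports
import sys
from pathlib import Path

# Make the local checkout importable. Adjust this path to your own clone,
# or remove the line if `ontograph` is installed in the active environment.
sys.path.append("/home/ecarreno/SSC-Projects/b_REPOSITORIES/ontograph")

# 2. Related third party imports
# 3. Local application/library specific imports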
from ontograph.downloader import PoochDownloaderAdapter
from ontograph.ontology_query import OntologyQueries
from ontograph.ontology_loader import ProntoLoaderAdapter
from ontograph.ontology_registry import OBORegistryAdapter

## Defining a `cache` folder

The cache folder stores the downloaded ontologies and supporting files, such as the ontology registry metadata from the OBO Foundry.

In [None]:
cache_dir = Path("../data/out")
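Any writable directory can serve as the cache. The sketch below assumes the folder may not exist yet and that the package does not necessarily create it for you (this is not confirmed here). The helper name `ensure_cache_dir` is hypothetical, and a temporary directory is used so that the example leaves no files behind.

```python
import tempfile
from pathlib import Path


def ensure_cache_dir(path):
    """Create the directory (and any missing parents) and return it as a Path."""
    cache = Path(path)
    cache.mkdir(parents=True, exist_ok=True)
    return cache


# Demonstration in a throwaway location instead of "../data/out".
with tempfile.TemporaryDirectory() as tmp:
    demo_cache = ensure_cache_dir(Path(tmp) / "data" / "out")
    demo_cache_exists = demo_cache.is_dir()
    print(demo_cache_exists)
```

In the notebook itself, the same helper could be called once on `cache_dir` before the registry is created.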

## Step 1. Create a registry with all the ontologies

We create a registry from the metadata published by the OBO Foundry.

In [None]:
onto_registry = OBORegistryAdapter(cache_dir=cache_dir)

In [None]:
onto_registry.cache_dir

### Load the registry (downloaded automatically if it is not in the cache)

In [None]:
onto_registry.load_registry()

### Print the registry schema

In [None]:
onto_registry.print_registry_schema_tree()
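To give an idea of what a "schema tree" of nested metadata looks like, here is a minimal sketch that walks a nested dictionary and lists its keys with their value types. It is not the package's implementation; `schema_tree` and the field names in `toy_registry` are assumptions made for illustration.

```python
def schema_tree(data, indent=0):
    """Return lines describing the keys of a nested dict/list structure."""
    lines = []
    pad = "  " * indent
    if isinstance(data, dict):
        for key, value in data.items():
            lines.append(f"{pad}{key}: {type(value).__name__}")
            lines.extend(schema_tree(value, indent + 1))
    elif isinstance(data, list) and data:
        # Describe only the first item, assuming list items share a layout.
        lines.extend(schema_tree(data[0], indent))
    return lines


# Toy record loosely modelled on a registry entry (field names are assumptions).
toy_registry = {
    "ontologies": [
        {"id": "go", "title": "Gene Ontology", "products": [{"id": "go.obo"}]}
    ]
}

tree_lines = schema_tree(toy_registry)
print("\n".join(tree_lines))
```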

### List of available ontologies

In [None]:
onto_registry.list_available_ontologies()

### Print the download link for a valid ontology (e.g., 'chebi')

In [None]:
# Look up the download link for the given ontology in the given format

print("Link: {}".format(onto_registry.get_download_url("chebi", "obo")))

### Print available formats for a valid ontology

In [None]:
print(onto_registry.get_available_formats(ontology_id="chebi"))
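The two lookups above can be pictured as queries against a mapping from ontology id and format to a download link. The sketch below uses a toy dictionary with placeholder `example.org` URLs; the layout, the function names, and the behaviour for unknown ids are all assumptions, not a description of how `OBORegistryAdapter` works internally.

```python
# Toy stand-in for registry metadata; layout and URLs are placeholders.
TOY_REGISTRY = {
    "chebi": {
        "obo": "https://example.org/chebi.obo",
        "owl": "https://example.org/chebi.owl",
    },
    "go": {"obo": "https://example.org/go.obo"},
}


def available_formats(registry, ontology_id):
    """Return the sorted formats known for an ontology, or [] if unknown."""
    return sorted(registry.get(ontology_id, {}))


def download_url(registry, ontology_id, fmt):
    """Return the URL for an ontology/format pair, or None if not listed."""
    return registry.get(ontology_id, {}).get(fmt)


print(available_formats(TOY_REGISTRY, "chebi"))
print(download_url(TOY_REGISTRY, "chebi", "obo"))
print(download_url(TOY_REGISTRY, "go", "owl"))  # format not listed for "go"
```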

## Step 2. Download specific ontologies

### Create a downloader object

In [None]:
downloader = PoochDownloaderAdapter(
    cache_dir=cache_dir,
    registry=onto_registry,
)

### Download multiple ontologies at once

In [None]:
# Describe each ontology and its format as a dictionary, collected in a list
resources = [
    {"name_id": "chebi", "format": "owl"},
    {"name_id": "go", "format": "obo"},
    {"name_id": "ado", "format": "owl"},
]

# Fetch the ontologies; files already in the cache are not downloaded again.
batch_results = downloader.fetch_batch(resources)

# Print all the paths where your ontologies are located
print("Ontologies in cache:")
for ontology_name, ontology_path in batch_results.items():
    print(f"\t{ontology_name}: {ontology_path}")
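The caching behaviour mentioned in the cell above can be sketched without any network access. The example below assumes that a cached file is named `<name_id>.<format>` and that a download is triggered only when that file is missing. Both points, along with the names `fetch_batch_sketch` and `fake_download`, are illustrative assumptions rather than the actual logic of `PoochDownloaderAdapter`.

```python
import tempfile
from pathlib import Path


def fetch_batch_sketch(resources, cache_dir, download):
    """Return {name_id: path}, calling download() only for files not yet cached."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    results = {}
    for resource in resources:
        target = cache_dir / f"{resource['name_id']}.{resource['format']}"
        if not target.exists():
            download(resource, target)  # only reached on a cache miss
        results[resource["name_id"]] = target
    return results


calls = []


def fake_download(resource, target):
    """Stand-in for a real download: writes a placeholder file, no network."""
    calls.append(resource["name_id"])
    target.write_text("placeholder")


resources_demo = [
    {"name_id": "go", "format": "obo"},
    {"name_id": "ado", "format": "owl"},
]

with tempfile.TemporaryDirectory() as tmp:
    first = fetch_batch_sketch(resources_demo, tmp, fake_download)
    second = fetch_batch_sketch(resources_demo, tmp, fake_download)
    cached_names = sorted(path.name for path in second.values())

print(calls)  # each ontology is "downloaded" once; the second call hits the cache
```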

## Step 3. Load ontologies

### Create a loader for the ontology

Behind the scenes the loader uses `pronto`, but another backend could be swapped in without the end user noticing.

In [None]:
ontology_loader = ProntoLoaderAdapter(cache_dir=cache_dir)

### Load the `go` ontology in `obo` format

In [None]:
name_id_go = "go"
format_go = "obo"

gene_ontology = ontology_loader.load_from_registry(name_id=name_id_go, format=format_go)

print(f"Loaded ontology: {name_id_go}.{format_go}")
print(f"Number of terms: {len(gene_ontology.terms())}")

## Step 4. Query the ontology

In [None]:
# term_id = "GO:0008150" # biological_process
# term_id = "GO:0160266"  # anestrus phase
term_id = "GO:0070360"  # symbiont-mediated actin polymerization-dependent cell-to-cell migration in host

In [None]:
queries = OntologyQueries(gene_ontology)

# Print term relations
print(f"Term: {term_id}")
print(f"  Ancestors   : {queries.ancestors(term_id)}")
print(f"  Descendants : {queries.descendants(term_id)}")
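By the usual convention, ancestors and descendants follow the hierarchy transitively, whereas parents and children cover only one level. The toy graph below illustrates the difference with made-up term ids. It is a plain-Python sketch and not how `OntologyQueries` is implemented; whether the real methods include the queried term itself is not shown here.

```python
# Toy "is_a" graph: each term maps to its direct parents (ids are made up).
PARENTS = {
    "T:root": [],
    "T:process": ["T:root"],
    "T:migration": ["T:process"],
    "T:leaf": ["T:migration", "T:process"],
}


def parents(term_id):
    """Direct parents only (one step up)."""
    return set(PARENTS[term_id])


def ancestors(term_id):
    """All terms reachable by repeatedly following parent links."""
    seen = set()
    stack = list(PARENTS[term_id])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


print(sorted(parents("T:leaf")))
print(sorted(ancestors("T:leaf")))
```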

## Step 5. Client

### Import the library

In [None]:
import ontograph as op

### Interact with the registry

In [None]:
# Load the OntoGraph catalog of ontologies
onto_registry = op.registry("../data/out")

In [None]:
# To force a fresh download of the registry, use:
onto_registry.load_registry(force_download=True)

In [None]:
# You can export the whole catalog as a Python dictionary. The original OBO Foundry registry is distributed as a YAML file.
onto_registry_dict = onto_registry.registry_as_dict()

In [None]:
# You can list the id and description of every ontology in the catalog
onto_registry.list_available_ontologies()

In [None]:
# You can print the current schema, i.e. the fields the catalog may store for each ontology.
onto_registry.print_registry_schema_tree()

In [None]:
# You can retrieve the metadata of an ontology and store it in a variable. Pass `show_metadata=True` to print it as well.
go_metadata = onto_registry.get_ontology_metadata("go", show_metadata=True)

In [None]:
# The catalog can list several formats for the same ontology. You can check which formats are available for a specific ontology.
onto_registry.get_available_formats("chebi")

### Download ontologies

In [None]:
# Instantiate a downloader object
downloader = PoochDownloaderAdapter(cache_dir=cache_dir, registry=onto_registry)

In [None]:
# List of resources to download
ontologies_to_download = [
    {"name_id": "go", "format": "obo"},
    {"name_id": "chebi", "format": "obo"},
]

# Download ontologies
paths_downloads = downloader.fetch_batch(resources=ontologies_to_download)

for name_id, path in paths_downloads.items():
    print(f"{name_id}:\t{path}")

### Load Ontologies

In [None]:
# Instantiate a loader object
loader = ProntoLoaderAdapter(cache_dir=cache_dir)

In [None]:
# Load the "go" ontology previously downloaded
go = loader.load_from_registry(name_id="go", format="obo")

In [None]:
# Load the "chebi" ontology previously downloaded
# chebi = loader.load_from_registry(name_id="chebi", format="owl")

go_version1 = op.load(path="/home/egcarren/go_version1.obo")

go_version2 = op.load(name_id="go", format="owl")

go_version3 = op.load(name_id="go")

In [None]:
path = "../data/out/go_version1.obo"

Path(path).exists()


In [None]:
go_version1 = op.load(path="../data/out/go_version1.obo")

In [None]:
go_version2 = op.load(name_id="go", format="owl", cache_dir=cache_dir)

In [None]:
queries = OntologyQueries(go_version2)

In [None]:
queries.children(term_id="GO:0008150")

## References and Further Reading

- [OBO Foundry](https://obofoundry.org/)
- [`pronto` documentation](https://pronto.readthedocs.io/en/stable/)
- [`pooch` documentation](https://www.fatiando.org/pooch/latest/)