# OntoGraph Usage - Jupyter Notebook

[//]: # (------------------------------------------    DO NOT MODIFY THIS    ------------------------------------------)
<style type="text/css">
.tg  {border-collapse:collapse;
      border-spacing:0;
     }
.tg td{border-color:black;
       border-style:solid;
       border-width:1px;
       font-family:Arial, sans-serif;
       font-size:14px;
       overflow:hidden;
       padding:10px 5px;
       word-break:normal;
      }
.tg th{border-color:black;
       border-style:solid;
       border-width:1px;
       font-family:Arial, sans-serif;
       font-size:14px;
       font-weight:normal;
       overflow:hidden;
       padding:10px 5px;
       word-break:normal;
      }
.tg .tg-fymr{border-color:inherit;
             font-weight:bold;
             text-align:left;
             vertical-align:top
            }
.tg .tg-0pky{border-color:inherit;
             text-align:left;
             vertical-align:top
            }
[//]: # (--------------------------------------------------------------------------------------------------------------)
[//]: # (-------------------------------------    FILL THIS OUT WITH YOUR DATA    -------------------------------------)
[//]: # (--------------------------------------------------------------------------------------------------------------)
</style>
<table class="tg">
    <tbody>
      <tr>
        <td class="tg-fymr" style="font-weight: bold">Title:</td>
        <td class="tg-0pky">OntoGraph Usage - Jupyter Notebook </td>
      </tr>
      <tr>
        <td class="tg-fymr" style="font-weight: bold">Authors:</td>
        <td class="tg-0pky">
            <a href="https://github.com/ecarrenolozano" target="_blank" rel="noopener noreferrer">Edwin Carreño</a>,
            <a href="" target="_blank" rel="noopener noreferrer">Denes Türei</a>
        </td>
      </tr>
      <tr>
        <td class="tg-fymr" style="font-weight: bold">Affiliations:</td>
        <td class="tg-0pky">
            <a href="https://www.ssc.uni-heidelberg.de/en" target="_blank" rel="noopener noreferrer">Scientific Software Center</a>,
            <a href="https://saezlab.org/" target="_blank" rel="noopener noreferrer">Saez-Rodriguez Group</a>
        </td>
      </tr>
      <tr>
        <td class="tg-fymr" style="font-weight: bold">Date Created:</td>
        <td class="tg-0pky">30.10.2024</td>
      </tr>
      <tr>
        <td class="tg-fymr" style="font-weight: bold">Description:</td>
        <td class="tg-0pky">This notebook is a sandbox to explore, test, and demonstrate the core functionalities of the <code>OntoGraph</code> package, helping refine the user experience and gather early feedback before writing the official documentation.</td>
      </tr>
    </tbody>
</table>

[//]: # (--------------------------------------------------------------------------------------------------------------)

## Overview

This notebook serves as a **sandbox environment** to explore and test the functionalities of the `OntoGraph` Python package. Its goals are to:  

- Evaluate the **user experience** during development.  
- Demonstrate common use cases and workflows for potential users.  
- Gather feedback from collaborators before finalizing the official documentation.
- Support the creation of use cases based on user requirements.

Feel free to execute, modify, and extend the examples in this notebook to gain a deeper understanding of how the package can be applied in real-world scenarios.

## How to Use This Notebook

1. Run cells sequentially from top to bottom.
2. If using an environment manager (e.g., `conda`, `venv`), activate it before running.
3. To reset, click `Kernel -> Restart & Run All`.

## Setup

In [None]:
!uv pip install -e ../

## Importing Libraries

In [None]:
# 1. Standard library imports
import sys

from pathlib import Path

# 2. Related third party imports
# 3. Local application/library specific imports
from ontograph.client import (
    ClientCatalog,
    ClientOntology,
)

## Client Catalog (optional, but useful)

Users can retrieve the catalog of ontologies provided by the OBO Foundry. This catalog contains names, descriptions, and other useful metadata for all supported ontologies in one place.

To facilitate access, we provide a dedicated client for interacting with the catalog. It automatically downloads a `.yaml` file containing the catalog information. If the file already exists in your local cache, it will be loaded from there automatically.

In [None]:
# Define the cache path. If you do not provide one, the client creates the folder .cache/ontograph automatically.
cache = Path('../data/out')

In [None]:
# Create a client for the catalog
client_catalog = ClientCatalog(cache_dir=cache)

### Load the OBO Foundry catalog

In [None]:
client_catalog.load_catalog(
    force_download=True
)  # force_download=True forces a fresh download of the catalog.
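
The cache-or-download behaviour described above can be pictured with a small standalone sketch. It does not use OntoGraph's internals: `load_with_cache`, `fake_fetch`, and the file name `catalog.yaml` are hypothetical names chosen for illustration, and the network download is replaced by a stub so the example runs offline.

```python
import tempfile
from pathlib import Path
from typing import Callable


def load_with_cache(
    path: Path, fetch: Callable[[], str], force_download: bool = False
) -> str:
    """Return the cached text if present, otherwise fetch it and store it."""
    if path.exists() and not force_download:
        return path.read_text(encoding='utf-8')
    text = fetch()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding='utf-8')
    return text


calls = []  # records how many times the (fake) download runs


def fake_fetch() -> str:
    """Stand-in for a real download (hypothetical content)."""
    calls.append(1)
    return 'ontologies: []\n'


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'cache' / 'catalog.yaml'
    first = load_with_cache(target, fake_fetch)  # downloads
    second = load_with_cache(target, fake_fetch)  # served from cache
    third = load_with_cache(target, fake_fetch, force_download=True)  # downloads again

print(len(calls))  # 2
```

Passing the download step as a callable keeps the caching logic testable without network access.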

### Retrieve the catalog as a Python Dictionary

In [None]:
client_catalog.catalog_as_dict()

### List available ontologies

**OntoGraph** offers two options:

1. Retrieve a list of dictionaries containing the `ontology_id` and `description` of each ontology.
2. Print the ontologies currently in the OBO Foundry directly as a formatted table.

In [None]:
# Option 1. Return the list of ontologies (ontology_id and description)
obo_foundry_ontologies = client_catalog.list_available_ontologies()

In [None]:
# Option 2. Print the list of available ontologies to the console.
client_catalog.print_available_ontologies()

### Get metadata of a specific ontology

For example, suppose we want to retrieve the metadata of the ChEBI ontology (Chemical Entities of Biological Interest).
In this case, we only need its `ontology_id`, which is `chebi`.

In [None]:
# Store the metadata; set show_metadata=True to also print it to the console.
chebi_metadata = client_catalog.get_ontology_metadata(
    ontology_id='chebi',
    show_metadata=True,
)

### Print catalog schema

You have access to the metadata, but you may not fully understand all the returned fields.
To clarify their structure and meaning, simply print the schema of the catalog.

The schema is dynamically generated, so if the OBO Foundry updates the format, you will automatically receive the latest version.

In [None]:
client_catalog.print_catalog_schema_tree()

### Get available formats for a given ontology

Are you interested in knowing which formats (e.g., `.obo`, `.owl`, etc.) are available for a given ontology?
You can retrieve this information with a single command!

In [None]:
client_catalog.get_available_formats(ontology_id='go')

### Get the link to download an ontology based on its `ontology_id` and `format`

In [None]:
client_catalog.get_download_url(ontology_id='go', format='json')

## Client Ontology

We provide a dummy ontology that allows users to navigate and understand the purpose of each method offered by the client. This ontology is named `dummy_ontology.obo`.

```mermaid
graph TB
    
    Z((Z)) --> A((A))
    Z((Z)) --> B((B))
    Z((Z)) --> C((C))

    A((A)) --> D((D))
    B((B)) --> H((H))
    B((B)) --> I((I))
    C((C)) --> J((J))

    D((D)) --> E((E))
    D((D)) --> F((F))
    D((D)) --> G((G))
    H((H)) --> K((K))
    I((I)) --> L((L))
    J((J)) --> M((M))

    E((E)) --> N((N))
    F((F)) --> O((O))
    F((F)) --> Y((Y))
    G((G)) --> K1((K1))
    G((G)) --> K2((K2))
    K((K)) --> Q((Q))
    K((K)) --> G1((G))
    M((M)) --> S((S))

    G1((G)) --> K11((K1))
    G1((G)) --> K21((K2))
    S((S)) --> T((T))

    T((T)) --> U((U))
    
    U((U)) --> V((V))
    U((U)) --> W((W))
    
    W((W)) --> Y1((Y))
```
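
To compare query results against the diagram programmatically, the same graph can be written as plain Python data. This is a sketch independent of OntoGraph. It assumes that the nodes drawn twice in the diagram (`G`, `K1`, `K2`, `Y`) are single terms with more than one parent, and that every arrow points from parent to child. The `EDGES` string and variable names are illustrative.

```python
from collections import defaultdict

# Edges of the dummy ontology, written parent>child (assumed from the diagram).
EDGES = (
    'Z>A Z>B Z>C A>D B>H B>I C>J D>E D>F D>G H>K I>L J>M E>N '
    'F>O F>Y G>K1 G>K2 K>Q K>G M>S S>T T>U U>V U>W W>Y'
)

children = defaultdict(set)
parents = defaultdict(set)
for edge in EDGES.split():
    parent, child = edge.split('>')
    children[parent].add(child)
    parents[child].add(parent)

nodes = set(children) | set(parents)
roots = sorted(n for n in nodes if n not in parents)  # no incoming edge
leaves = sorted(n for n in nodes if n not in children)  # no outgoing edge

print(len(nodes))  # 25
print(roots)  # ['Z']
print(leaves)  # ['K1', 'K2', 'L', 'N', 'O', 'Q', 'V', 'Y']
```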

In [None]:
# Create a client for the ontology
client_dummy_ontology = ClientOntology(cache_dir=cache)

### Load a given ontology

For this example, we are going to load the ontology from a file. However, you can also load an ontology from the OBO Foundry catalog or by providing a link to another website.

In [None]:
# Load ontology from file
_ = client_dummy_ontology.load(
    file_path_ontology='../tests/resources/dummy_ontology.obo'
)

### Navigation queries

#### Get ancestors of a given node

**Hint**: Compare the result of each function with the graph displayed in this notebook.

In [None]:
# Retrieve all the ancestors of the node "K"
client_dummy_ontology.get_ancestors(term_id='K')
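
As a cross-check against the diagram, the ancestor set can be computed by hand with a short traversal. This is a standalone sketch, not OntoGraph's implementation. The edge list is transcribed from the diagram under the assumption that nodes drawn twice are the same term, and the helper `ancestors` is a hypothetical name.

```python
from collections import defaultdict

# Edges of the dummy ontology, written parent>child (assumed from the diagram).
EDGES = (
    'Z>A Z>B Z>C A>D B>H B>I C>J D>E D>F D>G H>K I>L J>M E>N '
    'F>O F>Y G>K1 G>K2 K>Q K>G M>S S>T T>U U>V U>W W>Y'
)

parents = defaultdict(set)
for edge in EDGES.split():
    parent, child = edge.split('>')
    parents[child].add(parent)


def ancestors(term):
    """Collect every node reachable by repeatedly following parent links."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


k_ancestors = sorted(ancestors('K'))
print(k_ancestors)  # ['B', 'H', 'Z']
```

The order and container type returned by the client may differ; only the set of terms is meant to be compared.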

#### Get ancestors and the distance of each ancestor to the given node

In [None]:
# Retrieve the ancestors with distance
ancestors = client_dummy_ontology.get_ancestors_with_distance(
    term_id='K',
    include_self=True,
)

# Print for better visualization, compare with the graph.
for node, distance in ancestors:
    print(f'Node: {node}\tDistance: {distance}')

#### Get descendants

In [None]:
client_dummy_ontology.get_descendants(term_id='K')

#### Get descendants with distance

In this case the distances are **positive numbers**, because we move forward from the given term toward its descendants.

In [None]:
descendants = client_dummy_ontology.get_descendants_with_distance(
    term_id='U', include_self=True
)

for node, distance in descendants:
    print(f'Node: {node}\tDistance: {distance}')
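
The distances can be reproduced by hand with a breadth-first search that counts edges from the starting term. The sketch below is independent of OntoGraph. The edge list is an assumed transcription of the diagram, and `descendants_with_distance` here is a local helper written for illustration, not the client method.

```python
from collections import defaultdict, deque

# Edges of the dummy ontology, written parent>child (assumed from the diagram).
EDGES = (
    'Z>A Z>B Z>C A>D B>H B>I C>J D>E D>F D>G H>K I>L J>M E>N '
    'F>O F>Y G>K1 G>K2 K>Q K>G M>S S>T T>U U>V U>W W>Y'
)

children = defaultdict(set)
for edge in EDGES.split():
    parent, child = edge.split('>')
    children[parent].add(child)


def descendants_with_distance(term, include_self=False):
    """Breadth-first search downward, recording the shortest edge count."""
    distances = {term: 0}
    queue = deque([term])
    while queue:
        node = queue.popleft()
        for child in sorted(children.get(node, ())):
            if child not in distances:
                distances[child] = distances[node] + 1
                queue.append(child)
    if not include_self:
        del distances[term]
    return sorted(distances.items(), key=lambda item: (item[1], item[0]))


result = descendants_with_distance('U', include_self=True)
print(result)  # [('U', 0), ('V', 1), ('W', 1), ('Y', 2)]
```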

#### Get children

That is, retrieve the immediate descendants (those at distance 1).

In [None]:
client_dummy_ontology.get_children(term_id='U')

#### Get parents

In [None]:
client_dummy_ontology.get_parents(term_id='K')

#### Get siblings

In [None]:
client_dummy_ontology.get_siblings(term_id='E')
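
Siblings are usually understood as the other children of a term's parents. The following standalone sketch applies that definition to the dummy graph. The edge list is an assumed transcription of the diagram, and the helper `siblings` is hypothetical, so the client may handle edge cases (such as terms with several parents) differently.

```python
from collections import defaultdict

# Edges of the dummy ontology, written parent>child (assumed from the diagram).
EDGES = (
    'Z>A Z>B Z>C A>D B>H B>I C>J D>E D>F D>G H>K I>L J>M E>N '
    'F>O F>Y G>K1 G>K2 K>Q K>G M>S S>T T>U U>V U>W W>Y'
)

children = defaultdict(set)
parents = defaultdict(set)
for edge in EDGES.split():
    parent, child = edge.split('>')
    children[parent].add(child)
    parents[child].add(parent)


def siblings(term):
    """Union of the children of every parent of `term`, excluding `term`."""
    result = set()
    for parent in parents.get(term, ()):
        result |= children[parent]
    result.discard(term)
    return result


e_siblings = sorted(siblings('E'))
print(e_siblings)  # ['F', 'G']
```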

#### Get term

When we call `get_term()`, we retrieve an object of the class `Term`, defined in the library that OntoGraph uses behind the scenes. A `Term` object contains a lot of information about a specific term, such as its id, metadata, and annotations. For this simple example, it contains nothing more than the `id`.

If you load a more specialized ontology such as `go` (Gene Ontology), you can see how powerful the `Term` object is!

In [None]:
my_term = client_dummy_ontology.get_term(term_id='A')

In [None]:
my_term.id

In [None]:
my_term.obsolete

In [None]:
my_term.is_leaf()

#### Get the ontology's root

In this example there is only one root, the term `"Z"`. The result is a list of roots, because some ontologies have more than one (e.g., the Gene Ontology has three roots).

In [None]:
client_dummy_ontology.get_root()

### Relationship queries

#### Get common ancestors (all)

In [None]:
client_dummy_ontology.get_common_ancestors(node_ids=['K', 'L'])

#### Get the lowest (closest) common ancestors

In [None]:
client_dummy_ontology.get_lowest_common_ancestors(node_ids=['K', 'L'])
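
One common way to define these two queries is: intersect the ancestor sets of all given nodes, then keep only the common ancestors that are not themselves above another common ancestor. The sketch below applies that definition to the dummy graph. It is not OntoGraph's code: the edge list is an assumed transcription of the diagram, and the helper names are illustrative.

```python
from collections import defaultdict

# Edges of the dummy ontology, written parent>child (assumed from the diagram).
EDGES = (
    'Z>A Z>B Z>C A>D B>H B>I C>J D>E D>F D>G H>K I>L J>M E>N '
    'F>O F>Y G>K1 G>K2 K>Q K>G M>S S>T T>U U>V U>W W>Y'
)

parents = defaultdict(set)
for edge in EDGES.split():
    parent, child = edge.split('>')
    parents[child].add(parent)


def ancestors(term):
    """Every node reachable by following parent links (excluding `term`)."""
    seen, stack = set(), [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def common_ancestors(terms):
    """Ancestors shared by all the given terms."""
    return set.intersection(*(ancestors(term) for term in terms))


def lowest_common_ancestors(terms):
    """Common ancestors that are not above any other common ancestor."""
    common = common_ancestors(terms)
    return {c for c in common if not any(c in ancestors(o) for o in common)}


common = sorted(common_ancestors(['K', 'L']))
lowest = sorted(lowest_common_ancestors(['K', 'L']))
print(common)  # ['B', 'Z']
print(lowest)  # ['B']
```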

#### Evaluate if one node is an ancestor of another node

In [None]:
client_dummy_ontology.is_ancestor(ancestor_node='A', descendant_node='N')

#### Evaluate if a node is a descendant of another node

In [None]:
client_dummy_ontology.is_descendant(descendant_node='A', ancestor_node='N')

#### Evaluate if a node is a sibling of another node

In [None]:
client_dummy_ontology.is_sibling(node_a='F', node_b='G')

### Introspection

#### Get the distance between a node and the root (or roots)

In [None]:
client_dummy_ontology.get_distance_from_root(term_id='V')

#### Get the path between two nodes (if they are related as ancestor-descendant or descendant-ancestor)

In [None]:
client_dummy_ontology.get_path_between(node_a='Q', node_b='B')
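
A path between an ancestor and a descendant can be found by climbing parent links from one node until the other is reached, trying both directions. The sketch below does this with a breadth-first search on the dummy graph. It is a standalone illustration. The edge list is assumed from the diagram, `path_up` and `path_between` are hypothetical helpers, and the ordering of the path returned by the client may differ.

```python
from collections import defaultdict, deque

# Edges of the dummy ontology, written parent>child (assumed from the diagram).
EDGES = (
    'Z>A Z>B Z>C A>D B>H B>I C>J D>E D>F D>G H>K I>L J>M E>N '
    'F>O F>Y G>K1 G>K2 K>Q K>G M>S S>T T>U U>V U>W W>Y'
)

parents = defaultdict(set)
for edge in EDGES.split():
    parent, child = edge.split('>')
    parents[child].add(parent)


def path_up(start, target):
    """Shortest path from `start` up to `target` via parent links, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back to the start
                path.append(node)
                node = previous[node]
            return path[::-1]
        for parent in sorted(parents.get(node, ())):
            if parent not in previous:
                previous[parent] = node
                queue.append(parent)
    return None


def path_between(node_a, node_b):
    """Path from node_a to node_b if one is an ancestor of the other."""
    upward = path_up(node_a, node_b)
    if upward is not None:
        return upward
    downward = path_up(node_b, node_a)
    return downward[::-1] if downward is not None else None


path = path_between('Q', 'B')
print(path)  # ['Q', 'K', 'H', 'B']
```

For two unrelated nodes, such as `Q` and `L`, this sketch returns `None`.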

#### Get all the direct trajectories to a specific node

In [None]:
trajectories = client_dummy_ontology.get_trajectories_from_root(term_id='Y')

#### Print all the trajectories in the terminal

In [None]:
client_dummy_ontology.print_term_trajectories_tree(trajectories)
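
Because `Y` appears under two different branches in the diagram, it should have more than one trajectory from the root. The standalone sketch below enumerates every root-to-term path by recursing over parent links. The edge list is an assumed transcription of the diagram and `trajectories_from_root` is a local illustrative helper, so the structure returned by the client may look different.

```python
from collections import defaultdict

# Edges of the dummy ontology, written parent>child (assumed from the diagram).
EDGES = (
    'Z>A Z>B Z>C A>D B>H B>I C>J D>E D>F D>G H>K I>L J>M E>N '
    'F>O F>Y G>K1 G>K2 K>Q K>G M>S S>T T>U U>V U>W W>Y'
)

parents = defaultdict(set)
for edge in EDGES.split():
    parent, child = edge.split('>')
    parents[child].add(parent)


def trajectories_from_root(term):
    """List every path that starts at a root and ends at `term`."""
    term_parents = sorted(parents.get(term, ()))
    if not term_parents:  # a root: the path is just the term itself
        return [[term]]
    return [
        path + [term]
        for parent in term_parents
        for path in trajectories_from_root(parent)
    ]


paths = trajectories_from_root('Y')
for path in paths:
    print(' -> '.join(path))
# Z -> A -> D -> F -> Y
# Z -> C -> J -> M -> S -> T -> U -> W -> Y
```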

## References and Further Reading

- [OntoGraph Repo](https://github.com/saezlab/ontograph/)
- [OBO Foundry](https://obofoundry.org/)
- [`pronto` documentation](https://pronto.readthedocs.io/en/stable/)
- [`pooch` documentation](https://www.fatiando.org/pooch/latest/)