# Hpo-toolkit tutorial

This notebook shows how to install `hpo-toolkit` and use it to work with the Human Phenotype Ontology (HPO).

## Installation

The toolkit is available on [PyPI](https://pypi.org/project/hpo-toolkit), so it can be installed with `pip install hpo-toolkit`.
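After installing, it can be handy to confirm from within the notebook that the package is visible to the running interpreter. The following is a minimal sketch that uses only the standard library; the helper name `installed_version` is made up for this example and is not part of `hpo-toolkit`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# "hpo-toolkit" is the distribution name on PyPI; the import name is `hpotk`.
print(installed_version("hpo-toolkit"))
```

If this prints `None`, the package is most likely installed into a different environment than the one the notebook kernel uses.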

## Load HPO

`hpo-toolkit` supports reading ontologies in [Obographs](https://github.com/geneontology/obographs) JSON format.

We can download and open the latest HPO from *https://raw.githubusercontent.com/obophenotype/human-phenotype-ontology/master/hp.json*

In [1]:
import hpotk
from hpotk.ontology import Ontology
from hpotk.ontology.load.obographs import load_ontology

# to peek under the hood
import logging
hpotk.util.setup_logging(logging.DEBUG)

hpo: Ontology = load_ontology('https://raw.githubusercontent.com/obophenotype/human-phenotype-ontology/master/hp.json')

2022-12-28 13:07:54,423 hpotk.util           DEBUG : Opening https://raw.githubusercontent.com/obophenotype/human-phenotype-ontology/master/hp.json
2022-12-28 13:07:54,423 hpotk.util           DEBUG : Using default encoding 'utf-8'
2022-12-28 13:07:54,426 hpotk.util           DEBUG : Looks like a URL: https://raw.githubusercontent.com/obophenotype/human-phenotype-ontology/master/hp.json
2022-12-28 13:07:54,427 hpotk.util           DEBUG : Downloading with timeout=30s
2022-12-28 13:07:54,501 hpotk.util           DEBUG : Looks like un-compressed data
2022-12-28 13:07:55,178 hpotk.ontology.load.obographs._load DEBUG : Extracting ontology terms
2022-12-28 13:07:55,179 hpotk.ontology.io.obographs DEBUG : Missing node type in {'id': 'http://purl.obolibrary.org/obo/GO_0000016', 'lbl': 'lactase activity'}
2022-12-28 13:07:55,180 hpotk.ontology.io.obographs DEBUG : Missing node type in {'id': 'http://purl.obolibrary.org/obo/GO_0003857', 'lbl': '3-hydroxyacyl-CoA dehydrogenase activity'}
...

The code downloads the JSON file and reads it into an `Ontology` object.
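To make the "Missing node type" messages in the log easier to interpret, here is a simplified sketch of the Obographs layout: a top-level `graphs` list whose entries hold `nodes` (with `id`, `lbl`, and usually `type`) and `edges`. The document below is hand-written for illustration, and `extract_hp_terms` is a hypothetical helper. It is not how `hpo-toolkit` parses the file; it only shows why a node without a `type` key, such as the GO node from the log, might be skipped.

```python
import json

# A tiny hand-written document in the Obographs layout. Only the IDs and labels
# that already appear in this notebook are used; everything else is omitted.
OBOGRAPHS_JSON = """
{
  "graphs": [
    {
      "nodes": [
        {"id": "http://purl.obolibrary.org/obo/HP_0000001", "lbl": "All", "type": "CLASS"},
        {"id": "http://purl.obolibrary.org/obo/HP_0001166", "lbl": "Arachnodactyly", "type": "CLASS"},
        {"id": "http://purl.obolibrary.org/obo/GO_0000016", "lbl": "lactase activity"}
      ],
      "edges": []
    }
  ]
}
"""

PURL_PREFIX = "http://purl.obolibrary.org/obo/"


def extract_hp_terms(document: dict) -> dict:
    """Map CURIE -> label for HP class nodes, skipping nodes without a type."""
    terms = {}
    for graph in document.get("graphs", []):
        for node in graph.get("nodes", []):
            if node.get("type") != "CLASS":
                continue  # e.g. the GO node above has no "type" key
            node_id = node["id"]
            if not node_id.startswith(PURL_PREFIX + "HP_"):
                continue  # keep only nodes from the HP namespace
            # Turn ".../obo/HP_0001166" into the CURIE "HP:0001166".
            curie = node_id[len(PURL_PREFIX):].replace("_", ":", 1)
            terms[curie] = node.get("lbl")
    return terms


terms = extract_hp_terms(json.loads(OBOGRAPHS_JSON))
print(terms)
```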

## `Ontology`

`hpo-toolkit` provides a container for ontology data. You can iterate over the HPO terms via `hpo.terms`:

In [2]:
for term in hpo.terms:
    print(term)
    break

Term(identifier=TermId(prefix="HP", id="0000001"), name="All", definition=None, comment=Root of all terms in the Human Phenotype Ontology., is_obsolete=False, alt_term_ids="[]")


You can get a term by calling the `get_term` method:

In [3]:
arachnodactyly = hpo.get_term("HP:0001166")
arachnodactyly

Term(identifier=TermId(prefix="HP", id="0001166"), name="Arachnodactyly", definition=Abnormally long and slender fingers ("spider fingers")., comment=None, is_obsolete=False, alt_term_ids="[TermId(prefix="HP", id="0001505")]")

Each term has:
- `identifier` - a `hpotk.model.TermId` corresponding to the term's CURIE
- `name` - the term's name (e.g. *"Arachnodactyly"*)
- `alt_term_ids` - alternative term IDs, i.e. the IDs of obsolete terms that have been replaced by this term
- `is_obsolete` - obsoletion status

In [4]:
arachnodactyly.identifier.value

'HP:0001166'
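The relationship between a CURIE such as `'HP:0001166'` and the `prefix`/`id` pair shown in the term's representation can be illustrated with a small stand-in class. `SimpleTermId` below is hypothetical and written only for this example; it is not the `TermId` class from `hpo-toolkit`, whose actual behavior may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleTermId:
    """Hypothetical stand-in for a term identifier; not the hpotk class."""
    prefix: str
    id: str

    @classmethod
    def from_curie(cls, curie: str) -> "SimpleTermId":
        # Split on the first colon: "HP:0001166" -> ("HP", ":", "0001166").
        prefix, sep, local_id = curie.partition(":")
        if not sep or not prefix or not local_id:
            raise ValueError(f"Not a CURIE: {curie!r}")
        return cls(prefix, local_id)

    @property
    def value(self) -> str:
        # Re-assemble the CURIE from its two parts.
        return f"{self.prefix}:{self.id}"


term_id = SimpleTermId.from_curie("HP:0001166")
print(term_id.prefix, term_id.id, term_id.value)
```

Keeping the local ID as a string preserves the leading zeros, which would be lost if it were stored as an integer.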

In [5]:
print(f'ID: {term.identifier.value}')
print(f'Name: {term.name}')
print(f'Alt ids: {term.alt_term_ids}')
print(f'Is obsolete: {term.is_obsolete}')

ID: HP:0000001
Name: All
Alt ids: []
Is obsolete: False
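Because alternative IDs point from replaced terms to their current term, one possible use is an index that resolves any known ID to the primary one. The sketch below uses a hypothetical `SimpleTerm` dataclass with plain string IDs in place of the toolkit's own classes, and a made-up helper `build_primary_id_index`; the two sample records reuse the values printed earlier in this notebook.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class SimpleTerm:
    """Hypothetical stand-in for a term record; not the hpotk class."""
    identifier: str
    name: str
    alt_term_ids: List[str] = field(default_factory=list)
    is_obsolete: bool = False


def build_primary_id_index(terms: Iterable[SimpleTerm]) -> Dict[str, str]:
    """Map every known ID (primary or alternative) to the current primary ID."""
    index: Dict[str, str] = {}
    for term in terms:
        index[term.identifier] = term.identifier
        for alt_id in term.alt_term_ids:
            index[alt_id] = term.identifier
    return index


terms = [
    SimpleTerm("HP:0000001", "All"),
    SimpleTerm("HP:0001166", "Arachnodactyly", alt_term_ids=["HP:0001505"]),
]
index = build_primary_id_index(terms)
print(index["HP:0001505"])
```

With such an index, data annotated with the older ID `HP:0001505` would resolve to the same term as data annotated with `HP:0001166`.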
