# twa OGM tutorial

In this tutorial, we demonstrate the basic usage of the object graph mapper (OGM) of the `twa` package.

We will go through a few minimal examples with Blazegraph as our triple store and demonstrate how to interact with it through a Python-based SPARQL client.

You may skip any of the preparation steps that you have already completed.

Let's get started!

## Preparation 1: Install necessary packages
Run this cell to install the necessary Python packages if they are not already installed.

In [None]:
!pip install twa docker

## Preparation 2: Set up Docker and SPARQL Client

### Start Blazegraph Docker Container
First, spin up a Blazegraph container to serve as our triple store.


In [2]:
import docker
# Connect to Docker using the default socket or the configuration in your environment:
client = docker.from_env()

# Run Blazegraph container
# It returns a Container object that we will need later for stopping it
blazegraph = client.containers.run(
    'ghcr.io/cambridge-cares/blazegraph:1.1.0',
    ports={'8080/tcp': 27149}, # this binds the internal port 8080/tcp to the external port 27149
    detach=True # this runs the container in the background
)
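Because the container is started in the background, the triple store may need a few seconds before it accepts requests. The following is a minimal sketch of a readiness check using only the standard library. The helper names `http_probe` and `wait_until_ready` are hypothetical and are not part of `twa` or the Docker SDK; the idea is simply to poll until the server answers or a timeout expires.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the server answers at all, False if the connection fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # An HTTP error status still means the server is up and responding
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, reset, or timed out: not ready yet
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    timeout: float = 60.0,
    interval: float = 1.0,
) -> bool:
    """Call `probe` repeatedly until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

With the port mapping used in this tutorial, one could call `wait_until_ready(lambda: http_probe('http://localhost:27149/blazegraph/namespace/kb/sparql'))` before creating the SPARQL client, and treat a `False` result as a sign that the container did not start.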

### Define and initialize SPARQL Client

Connect to the Blazegraph instance using the SPARQL client.

In [3]:
from twa.kg_operations import PySparqlClient

sparql_endpoint = 'http://localhost:27149/blazegraph/namespace/kb/sparql'
sparql_client = PySparqlClient(sparql_endpoint, sparql_endpoint)

Info: Initializing JPSGateway with resName=JpsBaseLib, jarPath=None


In [4]:
sparql_client.perform_update('delete where { ?s ?p ?o }') # Clear the knowledge graph

## Preparation 3: Importing Libraries

Import the necessary modules. Note that for a Python script, it is important to include `from __future__ import annotations` at the beginning of the file, so that a class can refer to itself in its own annotations (e.g. `Has[I]` inside class `I`).


In [5]:
from __future__ import annotations
from twa.data_model.base_ontology import BaseOntology, BaseClass, ObjectProperty, DatatypeProperty, TransitiveProperty

from typing import Optional

## Example 1: Recursive pull and push

### Create an ExampleOntology class
This class represents the ontology used in this example.


In [6]:
from twa.data_model.base_ontology import KnowledgeGraph
KnowledgeGraph.clear_object_lookup()

class ExampleOntology(BaseOntology):
    base_url = 'https://dummy.example/kg/'
    namespace = 'example'
    owl_versionInfo = '0.0.1'
    rdfs_comment = 'An example ontology'

ExampleOntology.set_dev_mode()

In [7]:
Has = ObjectProperty.create_from_base('Has', ExampleOntology)

class I(BaseClass):
    rdfs_isDefinedBy = ExampleOntology
    has: Optional[Has[I]] = set()

class I1(I):
    pass

class I2(I):
    pass

class I3(I):
    pass

class I4(I):
    pass

class I5(I):
    pass

class I6(I):
    pass

In [8]:
# Export the ontology to the triple store
ExampleOntology.export_to_triple_store(sparql_client)

In [9]:
# Upload example triples
import rdflib
sparql_client.upload_graph(
    rdflib.Graph().parse(data="""
        @prefix : <https://dummy.example/kg/example/> .
        :i1 a :I1.
        :i2 a :I2.
        :i3 a :I3.
        :i1 :has :i2, :i3.
        """
    )
)
# Initially (t = 0), the knowledge graph contains the following triples:
#  :i1 a :I1.
#  :i2 a :I2.
#  :i3 a :I3.
#  :i1 :has :i2, :i3.

In [10]:
# At t = 1, pull the instance :i1 from the graph, retrieving all connected instances recursively:
i1 = I1.pull_from_kg(
    iris='https://dummy.example/kg/example/i1',
    sparql_client=sparql_client,
    recursive_depth=-1, # -1 enables full recursion, retrieving all transitive connections
)[0] # Results are stored in a list as multiple instances of the same class can be pulled at once if a list of IRIs is passed in for the field `iris`

In [11]:
print(i1.triples())

@prefix ns1: <https://dummy.example/kg/example/> .

ns1:i1 a ns1:I1 ;
    ns1:has ns1:i2,
        ns1:i3 .




In [12]:
# At t = 2, an external update is made directly to the knowledge graph via SPARQL:
# """INSERT DATA { :i4 a :I4 . :i2 :has :i4 . :i3 :has :i4 . }"""
# At this point, the graph contains new triples that are not yet reflected in the Python environment.
sparql_client.perform_update(
    """
    PREFIX : <https://dummy.example/kg/example/>
    INSERT DATA {
        :i4 a :I4 .
        :i2 :has :i4 .
        :i3 :has :i4 .
    }
    """
)

In [13]:
# Still at t = 2, modify the local Python object by removing the connection between `i1` and `i2`:
i1.has.remove('https://dummy.example/kg/example/i2')  # This change is local and not yet propagated to the knowledge graph

In [14]:
# At t = 3, push local changes to the graph while first pulling the latest remote state:
g_r, g_a = i1.push_to_kg(
    sparql_client,
    recursive_depth=-1,  # Ensures synchronisation across all transitive links
    pull_first=True  # Ensures that any external modifications are pulled before applying local changes
)

In [15]:
# The `g_r` and `g_a` variables contain the triples removed from and added to the graph, respectively.
print('Removed triples:')
print(g_r.serialize(format='turtle'))
print('---------------------------------')
print('Added triples:')
print(g_a.serialize(format='turtle'))

Removed triples:
@prefix ns1: <https://dummy.example/kg/example/> .

ns1:i1 ns1:has ns1:i2 .


---------------------------------
Added triples:




In [16]:
# At t = 4, another external update is made to the graph:
# """INSERT DATA { :i5 a :I5 . :i1 :has :i5 . }"""
# The graph now contains additional data that is not yet present in Python.
sparql_client.perform_update(
    """
    PREFIX : <https://dummy.example/kg/example/>
    INSERT DATA {
        :i5 a :I5 .
        :i1 :has :i5 .
    }
    """
)

In [17]:
# At t = 5, create a new instance `i6` locally and establish a relationship with `i1`:
i6 = I6()  # A unique IRI is automatically assigned to `i6`
i1.has.add(i6)  # This change remains local until explicitly pushed to the graph

In [18]:
# At t = 6, push the local updates while first pulling any remote changes:
g_r, g_a = i1.push_to_kg(
    sparql_client,
    recursive_depth=-1,  # Ensures full synchronisation
    pull_first=True  # External modifications (e.g., `i5`) will be pulled before pushing local changes (e.g., `i6`)
)

In [19]:
# After this operation, the Python environment will contain `i5`, which was previously only in the graph, and the relationship involving `i6` will be pushed to the graph, ensuring full consistency.
assert 'https://dummy.example/kg/example/i5' in i1.has
i1.has

{https://dummy.example/kg/example/I6_e9338ed0-f705-4109-a126-688b16c942df,
 https://dummy.example/kg/example/i3,
 https://dummy.example/kg/example/i5}

In [20]:
# We can check `g_r` and `g_a` for the exact triples modified when pushing to the remote graph
print('Removed triples:')
print(g_r.serialize(format='turtle'))
print('---------------------------------')
print('Added triples:')
print(g_a.serialize(format='turtle'))

Removed triples:


---------------------------------
Added triples:
@prefix ns1: <https://dummy.example/kg/example/> .

ns1:i1 ns1:has ns1:I6_e9338ed0-f705-4109-a126-688b16c942df .

ns1:I6_e9338ed0-f705-4109-a126-688b16c942df a ns1:I,
        ns1:I6 .
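The two push operations above can be read as a three-way merge between the state cached at the last pull, the current local state, and the current remote state. The following is a conceptual sketch of that reasoning on plain Python sets, restricted to the objects of `i1 :has`. It is not the `twa` implementation, and the function name `merge_pull_first` is hypothetical.

```python
def merge_pull_first(cached: set, local: set, remote: set):
    """Three-way merge of one property's values.

    cached: values seen at the last pull
    local:  values currently held by the Python object
    remote: values currently in the knowledge graph
    Returns (merged, to_remove, to_add), where the last two are the
    changes that would have to be written to the graph.
    """
    # Local edits are whatever differs from the last-known remote state
    local_added = local - cached
    local_removed = cached - local
    # Start from the fresh remote state, then replay the local edits
    merged = (remote | local_added) - local_removed
    to_remove = remote - merged
    to_add = merged - remote
    return merged, to_remove, to_add


# t = 3: `i2` was removed locally; the remote values of `i1 :has` are unchanged
merged_t3, removed_t3, added_t3 = merge_pull_first(
    cached={'i2', 'i3'}, local={'i3'}, remote={'i2', 'i3'}
)

# t = 6: `i5` was added remotely, `i6` was added locally
merged_t6, removed_t6, added_t6 = merge_pull_first(
    cached={'i3'}, local={'i3', 'i6'}, remote={'i3', 'i5'}
)
```

In this sketch, `removed_t3` holds only `i2` and `added_t6` holds only `i6`, which mirrors the `g_r` and `g_a` graphs printed above, while `merged_t6` contains `i3`, `i5` and `i6`, matching the final content of `i1.has`.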




## Example 2: Multi-Inheritance Instances

In [21]:
from __future__ import annotations
from twa.data_model.base_ontology import BaseOntology, BaseClass, ObjectProperty, DatatypeProperty, TransitiveProperty
from twa.data_model.iris import TWA_BASE_URL
from twa.data_model.base_ontology import KnowledgeGraph


KnowledgeGraph.clear_object_lookup()

class YourOntology(BaseOntology):
    # The fields below can be set to provide metadata for your ontology
    base_url = TWA_BASE_URL
    namespace = 'yourontology'
    owl_versionInfo = '0.0.1'
    rdfs_comment = 'Your ontology'

DataB = DatatypeProperty.create_from_base('DataB', YourOntology)
DataC = DatatypeProperty.create_from_base('DataC', YourOntology)

# Class hierarchy defining multiple inheritance:
class T(BaseClass):
    rdfs_isDefinedBy = YourOntology

class A(T):
    rdfs_isDefinedBy = YourOntology

class B(A): # Leaf subclass of `A`
    data_b: DataB[str]

class C(T): # Independent branch from `T`
    data_c: DataC[int]

class D(C): # Leaf subclass of `C`
    pass

In [22]:
from twa.kg_operations import PySparqlClient
sparql_endpoint = 'http://localhost:27149/blazegraph/namespace/kb/sparql'
sparql_client = PySparqlClient(sparql_endpoint, sparql_endpoint)

In [23]:
sparql_client.perform_update('delete where { ?s ?p ?o }') # Clear the knowledge graph

In [24]:
# Upload the ontology to the triple store
YourOntology.export_to_triple_store(sparql_client)

In [25]:
# The knowledge graph initially contains the following triples for node <i>:
#     <i> a <T>, <A>, <B>, <C> .
#     <i> :dataC 5 .
# i.e. with full IRIs:
#     <https://iri_i> a <https://www.theworldavatar.com/kg/yourontology/T>, <https://www.theworldavatar.com/kg/yourontology/A>, <https://www.theworldavatar.com/kg/yourontology/B>, <https://www.theworldavatar.com/kg/yourontology/C> .
#     <https://iri_i> <https://www.theworldavatar.com/kg/yourontology/dataC> 5 .
# The instance is labelled with multiple types, which must be resolved to a single Python class when it is pulled.
import rdflib
sparql_client.upload_graph(
    rdflib.Graph().parse(data="""
        @prefix : <https://www.theworldavatar.com/kg/yourontology/> .
        <https://iri_i> a :T, :A, :B, :C .
        <https://iri_i> :dataC 5 .
        """
    )
)

In [26]:
# Pull node <i> with class `A`:
i = A.pull_from_kg(
    'https://iri_i',
    sparql_client,
    recursive_depth=-1,
)[0]

In [27]:
# Since `B` is the most specific subclass within the pulled hierarchy, the OGM resolves the instance as `B`:
assert type(i) is B
assert not i.data_b # No triple exists for `data_b`, so the attribute is empty

In [28]:
i.data_b

set()

In [29]:
# Add a property relevant to `B` and push changes back to the graph:
i.data_b.add("my_str")
g_r, g_a = i.push_to_kg(sparql_client, -1)
# The following triple is added to the knowledge graph:
#     <i> :dataB "my_str" .

In [30]:
# To view the changes made to the graph, we can check `g_r` and `g_a`:
print('Removed triples:')
print(g_r.serialize(format='turtle'))
print('---------------------------------')
print('Added triples:')
print(g_a.serialize(format='turtle'))

Removed triples:


---------------------------------
Added triples:
@prefix ns1: <https://www.theworldavatar.com/kg/yourontology/> .

<https://iri_i> ns1:dataB "my_str" .




In [31]:
# Now pull node <i> using class `C` instead:
i = C.pull_from_kg(
    'https://iri_i',
    sparql_client,
    recursive_depth=-1,
)[0]

In [32]:
# The instance is now instantiated as `C`, and properties specific to `C` are retrieved:
assert type(i) is C
assert i.data_c == {5} # Retrieved from the knowledge graph

# If we create a new instance and push it to the graph:
new_i = B(data_b="new_str")
g_r, g_a = new_i.push_to_kg(sparql_client, -1)
# The following triples are added:
#     <new_i> a <B>, <A>, <T> .
#     <new_i> :dataB "new_str" .

In [33]:
# To view the changes made to the graph, we can check `g_r` and `g_a`:
print('Removed triples:')
print(g_r.serialize(format='turtle'))
print('---------------------------------')
print('Added triples:')
print(g_a.serialize(format='turtle'))

Removed triples:


---------------------------------
Added triples:
@prefix ns1: <https://www.theworldavatar.com/kg/yourontology/> .

ns1:B_85eec40d-d478-4739-8ae1-4ca9b648e9ab a ns1:A,
        ns1:B,
        ns1:T ;
    ns1:dataB "new_str" .
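The behaviour seen in this example, where pulling with `A` yields a `B` and pulling with `C` yields a `C`, can be pictured as choosing the most specific `rdf:type` that is still a subclass of the class used for the pull. The following is a simplified sketch on plain Python classes. It is an assumption about the general idea, not the `twa` implementation; `CLASS_BY_NAME` and `resolve_class` are hypothetical names, and ambiguous cases (two equally specific candidates) are not modelled.

```python
class T: pass
class A(T): pass
class B(A): pass
class C(T): pass
class D(C): pass

# Hypothetical lookup from a type name to a Python class
CLASS_BY_NAME = {cls.__name__: cls for cls in (T, A, B, C, D)}


def resolve_class(pulled_cls: type, type_names: list) -> type:
    """Pick the deepest class among `type_names` that is a subclass of `pulled_cls`."""
    candidates = [
        CLASS_BY_NAME[name]
        for name in type_names
        if name in CLASS_BY_NAME and issubclass(CLASS_BY_NAME[name], pulled_cls)
    ]
    if not candidates:
        raise ValueError(f'No type in {type_names} is compatible with {pulled_cls.__name__}')
    # A longer method resolution order means a deeper position in the hierarchy
    return max(candidates, key=lambda cls: len(cls.__mro__))


node_types = ['T', 'A', 'B', 'C']  # the types attached to <i> in the graph
resolved_from_a = resolve_class(A, node_types)
resolved_from_c = resolve_class(C, node_types)
```

Here `resolved_from_a` is `B` and `resolved_from_c` is `C`, matching the two `assert type(i) is ...` checks in the cells above. Pulling with `D` would find no compatible type in this sketch and raise a `ValueError`.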




## Example 3: Retrieval of Transitive Property

In [34]:
from __future__ import annotations
from twa.data_model.base_ontology import BaseOntology, BaseClass, ObjectProperty, DatatypeProperty, TransitiveProperty
from typing import Optional
from twa.data_model.base_ontology import KnowledgeGraph


KnowledgeGraph.clear_object_lookup()

class MyOntology(BaseOntology):
    base_url = 'https://example.org/ontology/'
    namespace = 'myontology'
    owl_versionInfo = '0.0.1'
    rdfs_comment = 'My ontology'

# We can set the ontology to development mode for testing purposes
MyOntology.set_dev_mode()

# Define transitive relationships in the knowledge graph using an OGM class structure:
Part_of = TransitiveProperty.create_from_base('Part_of', MyOntology)

# Define class hierarchies:
class Experiment(BaseClass):
    rdfs_isDefinedBy = MyOntology

class ReactionSetup(BaseClass):
    rdfs_isDefinedBy = MyOntology
    part_of: Optional[Part_of[Experiment]] = set() # Defines a transitive relationship

class Equipment(BaseClass):
    rdfs_isDefinedBy = MyOntology
    part_of: Optional[Part_of[ReactionSetup]] = set() # Equipment can be part of a reaction setup

class EquipmentPart(BaseClass):
    rdfs_isDefinedBy = MyOntology
    part_of: Optional[Part_of[Equipment]] = set() # Equipment part can be part of an equipment

In [35]:
from twa.kg_operations import PySparqlClient
sparql_endpoint = 'http://localhost:27149/blazegraph/namespace/kb/sparql'
sparql_client = PySparqlClient(sparql_endpoint, sparql_endpoint)
sparql_client.perform_update('delete where { ?s ?p ?o }') # Clear the knowledge graph

In [36]:
# Upload the following triples to the knowledge graph:
#     :beaker_A a :Equipment .
#     :clamp_B a :EquipmentPart .
#     :stand_C a :Equipment .
#     :reaction_setup_X a :ReactionSetup .
#     :experiment_Y a :Experiment .
#     :beaker_A :part_of :reaction_setup_X .
#     :reaction_setup_X :part_of :experiment_Y .
#     :clamp_B :part_of :stand_C .
#     :stand_C :part_of :reaction_setup_X .
import rdflib
sparql_client.upload_graph(
    rdflib.Graph().parse(data="""
        @prefix : <https://example.org/ontology/myontology/> .
        :beaker_A a :Equipment .
        :clamp_B a :EquipmentPart .
        :stand_C a :Equipment .
        :reaction_setup_X a :ReactionSetup .
        :experiment_Y a :Experiment .
        :beaker_A :part_of :reaction_setup_X .
        :reaction_setup_X :part_of :experiment_Y .
        :clamp_B :part_of :stand_C .
        :stand_C :part_of :reaction_setup_X .
        """
    )
)

In [37]:
# Pull node <beaker_A> using class `Equipment`
beaker_A = Equipment.pull_from_kg(
    'https://example.org/ontology/myontology/beaker_A',
    sparql_client,
    recursive_depth=-1,  # Fully traverse transitive properties
)[0]

In [38]:
# If we access `beaker_A.part_of` in the normal way, we only get the direct parent
assert beaker_A.part_of == {'https://example.org/ontology/myontology/reaction_setup_X'}

# If we query the transitive property for `beaker_A`, we get all ancestors
assert Part_of.obtain_transitive_objects(beaker_A) == {
    'https://example.org/ontology/myontology/reaction_setup_X',
    'https://example.org/ontology/myontology/experiment_Y'
}

In [39]:
# Similarly, pull node <clamp_B> and verify the inferred hierarchy
clamp_B = EquipmentPart.pull_from_kg(
    'https://example.org/ontology/myontology/clamp_B',
    sparql_client,
    recursive_depth=-1,
)[0]

In [40]:
# If we access the transitive property in the normal way, we only see `clamp_B` as part of `stand_C`
assert clamp_B.part_of == {'https://example.org/ontology/myontology/stand_C'}
# If we query the transitive property for `clamp_B`, we get all ancestors
assert Part_of.obtain_transitive_objects(clamp_B) == {
    'https://example.org/ontology/myontology/stand_C',
    'https://example.org/ontology/myontology/reaction_setup_X',
    'https://example.org/ontology/myontology/experiment_Y'
}
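The difference between the direct `part_of` values and the transitive ones can be illustrated with a plain graph traversal. The sketch below computes the transitive closure over a dictionary of direct links that mirrors the triples uploaded in this example. It only illustrates the concept and does not describe how `Part_of.obtain_transitive_objects` works internally; `transitive_objects` and `direct_part_of` are hypothetical names.

```python
def transitive_objects(start: str, direct: dict) -> set:
    """Collect every node reachable from `start` by following direct links."""
    seen = set()
    stack = list(direct.get(start, ()))
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # already visited, which also guards against cycles
        seen.add(node)
        stack.extend(direct.get(node, ()))
    return seen


# Direct `part_of` links, using short names instead of full IRIs
direct_part_of = {
    'beaker_A': {'reaction_setup_X'},
    'reaction_setup_X': {'experiment_Y'},
    'clamp_B': {'stand_C'},
    'stand_C': {'reaction_setup_X'},
}

beaker_ancestors = transitive_objects('beaker_A', direct_part_of)
clamp_ancestors = transitive_objects('clamp_B', direct_part_of)
```

`beaker_ancestors` contains `reaction_setup_X` and `experiment_Y`, and `clamp_ancestors` additionally contains `stand_C`, which corresponds to the sets asserted in the cells above.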

In [41]:
# Create a new piece of equipment, link it to the reaction setup, and push it to the knowledge graph:
new_equipment = Equipment()
new_equipment.part_of.add('https://example.org/ontology/myontology/reaction_setup_X')
g_r, g_a = new_equipment.push_to_kg(sparql_client, -1)

# The following triples will be added:
#     :new_equipment a :Equipment .
#     :new_equipment :part_of :reaction_setup_X .

In [42]:
# To view the changes made to the graph, we can check `g_r` and `g_a`:
print('Removed triples:')
print(g_r.serialize(format='turtle'))
print('---------------------------------')
print('Added triples:')
print(g_a.serialize(format='turtle'))

Removed triples:


---------------------------------
Added triples:
@prefix ns1: <https://example.org/ontology/myontology/> .

ns1:Equipment_0f12f5c5-b326-44ad-a594-b859a17d744f a ns1:Equipment ;
    ns1:part_of ns1:reaction_setup_X .




## Example 4: Load Nested JSON

In [43]:
# Here we reuse the ontology and classes defined in Example 1
# The concepts below are redefined so that you can run this example independently
from __future__ import annotations
from twa.data_model.base_ontology import BaseOntology, BaseClass, ObjectProperty, DatatypeProperty, TransitiveProperty
from typing import Optional
from twa.data_model.base_ontology import KnowledgeGraph


KnowledgeGraph.clear_object_lookup()

class ExampleOntology(BaseOntology):
    base_url = 'https://dummy.example/kg/'
    namespace = 'example'
    owl_versionInfo = '0.0.1'
    rdfs_comment = 'An example ontology'

ExampleOntology.set_dev_mode()

Has = ObjectProperty.create_from_base('Has', ExampleOntology)

class I(BaseClass):
    rdfs_isDefinedBy = ExampleOntology
    has: Optional[Has[I]] = set()

class I1(I):
    pass

class I2(I):
    pass

class I3(I):
    pass

In [44]:
# Here we also define a lean Pydantic model to represent the same data structure
from pydantic import BaseModel, Field
class IModel(BaseModel):
    instance_iri: str
    has: Optional[list[IModel]] = Field(default_factory=list)

class I1Model(IModel):
    pass

class I2Model(IModel):
    pass

class I3Model(IModel):
    pass

In [45]:
# Here we define the nested JSON to be loaded
import json
json_data = {
  'instance_iri': 'https://dummy.example/kg/example/I1_6e3af415',
  'has': [
    {
      'instance_iri': 'https://dummy.example/kg/example/I2_c3a0238f',
      'has': [
        {
          'instance_iri': 'https://dummy.example/kg/example/I3_7b53f45c',
          'has': []
        }
      ]
    },
    {
      'instance_iri': 'https://dummy.example/kg/example/I3_7b53f45c',
      'has': []
    }
  ]
}

In [46]:
# Load the JSON data into the OGM and Pydantic models
ogm_loaded_i1 = I1.model_validate_json(json.dumps(json_data))
pydantic_loaded_i1 = I1Model.model_validate_json(json.dumps(json_data))

In [47]:
# Now we can check the loaded data
assert ogm_loaded_i1.has == {
    'https://dummy.example/kg/example/I2_c3a0238f',
    'https://dummy.example/kg/example/I3_7b53f45c'
}
ogm_loaded_i2 = KnowledgeGraph.get_object_from_lookup('https://dummy.example/kg/example/I2_c3a0238f')
assert ogm_loaded_i2.has == {
    'https://dummy.example/kg/example/I3_7b53f45c'
}
ogm_loaded_i3 = KnowledgeGraph.get_object_from_lookup('https://dummy.example/kg/example/I3_7b53f45c')
assert ogm_loaded_i3.has == set()
# NOTE: importantly, the `i3` referenced by `i2` must be the very same object in memory as the `i3` retrieved from the lookup
assert id(ogm_loaded_i3) == id(list(ogm_loaded_i2.has)[0])

In [48]:
# We can also check the triples
for o in KnowledgeGraph.construct_object_lookup().values():
    print(o.triples())


<https://dummy.example/kg/example/I3_7b53f45c> a <https://dummy.example/kg/example/I> .


@prefix ns1: <https://dummy.example/kg/example/> .

ns1:I2_c3a0238f a ns1:I ;
    ns1:has ns1:I3_7b53f45c .


@prefix ns1: <https://dummy.example/kg/example/> .

ns1:I1_6e3af415 a ns1:I1 ;
    ns1:has ns1:I2_c3a0238f,
        ns1:I3_7b53f45c .




In [49]:
# For data loaded in the Pydantic model, we will see that even though two objects share the same IRI, they are different objects in memory
pydantic_loaded_i2 = pydantic_loaded_i1.has[0]
pydantic_loaded_i3 = pydantic_loaded_i1.has[1]
print(pydantic_loaded_i2.has[0].instance_iri)
print(pydantic_loaded_i3.instance_iri)
assert pydantic_loaded_i2.has[0].instance_iri == pydantic_loaded_i3.instance_iri

https://dummy.example/kg/example/I3_7b53f45c
https://dummy.example/kg/example/I3_7b53f45c


In [50]:
# But they are not the same object in memory
print(id(pydantic_loaded_i2.has[0]), id(pydantic_loaded_i3))
assert id(pydantic_loaded_i2.has[0]) != id(pydantic_loaded_i3)

1831727524160 1831728404176
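The contrast above comes down to whether objects are registered by IRI while loading. The following is a minimal sketch of such a registry (often called an identity map) in plain Python. It is only an illustration of the idea and makes no claim about how `KnowledgeGraph` stores its object lookup; `Node`, `load_nested` and `registry` are hypothetical names.

```python
class Node:
    """Hypothetical stand-in for an OGM object: an IRI plus outgoing links."""

    def __init__(self, iri: str) -> None:
        self.iri = iri
        self.has = []


def load_nested(data: dict, registry: dict) -> Node:
    """Build nodes from nested dicts, reusing the node already registered for an IRI."""
    iri = data['instance_iri']
    node = registry.get(iri)
    if node is None:
        node = Node(iri)
        registry[iri] = node
    for child_data in data.get('has', []):
        child = load_nested(child_data, registry)
        if child not in node.has:
            node.has.append(child)
    return node


nested = {
    'instance_iri': 'https://dummy.example/kg/example/I1_6e3af415',
    'has': [
        {
            'instance_iri': 'https://dummy.example/kg/example/I2_c3a0238f',
            'has': [
                {'instance_iri': 'https://dummy.example/kg/example/I3_7b53f45c', 'has': []},
            ],
        },
        {'instance_iri': 'https://dummy.example/kg/example/I3_7b53f45c', 'has': []},
    ],
}

registry = {}
root = load_nested(nested, registry)
i2, i3 = root.has
```

Because every IRI is looked up in `registry` before a new `Node` is created, `i2.has[0]` and `i3` are the same object, and the registry ends up with exactly three entries even though the JSON mentions the third IRI twice.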


## Example 5: Find Ontological Concepts/Relationships in Python

In [51]:
# Here we reuse the ontology and classes defined in Example 1
# The concepts below are redefined so that you can run this example independently
from __future__ import annotations
from twa.data_model.base_ontology import BaseOntology, BaseClass, ObjectProperty, DatatypeProperty, TransitiveProperty
from typing import Optional
from twa.data_model.base_ontology import KnowledgeGraph


KnowledgeGraph.clear_object_lookup()

class ExampleOntology(BaseOntology):
    base_url = 'https://dummy.example/kg/'
    namespace = 'example'
    owl_versionInfo = '0.0.1'
    rdfs_comment = 'An example ontology'

ExampleOntology.set_dev_mode()

Has = ObjectProperty.create_from_base('Has', ExampleOntology)

class I(BaseClass):
    rdfs_isDefinedBy = ExampleOntology
    has: Optional[Has[I]] = set()

class I1(I):
    pass

In [52]:
# Assume that we have already defined the ontology and classes as in Example 1
# To get the IRI of a class:
I1.rdf_type

'https://dummy.example/kg/example/I1'

In [53]:
# To get the Python class from the IRI, we can use the `class_lookup` dictionary
KnowledgeGraph.class_lookup['https://dummy.example/kg/example/I1']

__main__.I1

In [54]:
KnowledgeGraph.class_lookup['https://dummy.example/kg/example/This_Does_Not_Exist'] # This will raise a KeyError

KeyError: 'https://dummy.example/kg/example/This_Does_Not_Exist'
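If an unknown IRI should not interrupt execution, a dictionary lookup can be made tolerant with `dict.get`. The sketch below uses a plain dictionary as a stand-in for `class_lookup`, under the assumption (taken from the comment above) that it behaves like an ordinary `dict`; `find_class` and `example_lookup` are hypothetical names.

```python
from typing import Optional


class I1:
    """Stand-in for the OGM class of the same name."""


# Stand-in for the lookup dictionary, keyed by class IRI
example_lookup = {'https://dummy.example/kg/example/I1': I1}


def find_class(iri: str, lookup: dict) -> Optional[type]:
    """Return the class registered for `iri`, or None if the IRI is unknown."""
    return lookup.get(iri)


known = find_class('https://dummy.example/kg/example/I1', example_lookup)
unknown = find_class('https://dummy.example/kg/example/This_Does_Not_Exist', example_lookup)
```

Here `known` is the `I1` class and `unknown` is `None`, so the caller can decide how to handle a missing concept instead of catching a `KeyError`.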

## Final clean up
### Stop Blazegraph Docker Container

In [55]:
blazegraph.stop()

To also remove the Blazegraph container:

In [56]:
blazegraph.remove()

## Author
Jiaru Bai (jb2197@cantab.ac.uk)

## Citation

If you find this tool useful, please consider citing the following preprint:

```bibtex
@article{bai2025twa,
  title={{twa: The World Avatar Python package for dynamic knowledge graphs and its application in reticular chemistry}},
  author={Bai, Jiaru and Rihm, Simon D and Kondinski, Aleksandar and Saluz, Fabio and Deng, Xinhong and Brownbridge, George and Mosbach, Sebastian and Akroyd, Jethro and Kraft, Markus},
  year={2025},
  note={Preprint at \url{https://como.ceb.cam.ac.uk/preprints/335/}}
}
```