In [1]:
import requests
from rdflib import Graph
from rdflib.namespace import Namespace, OWL, RDF, RDFS

from ir_onto_demos_utilities import (
    show_instances, show_owl_classes, show_owl_obj_props, 
    sparql_update, sparql_select, insert_graph, delete_graph
)

## TODOs

- Use only RDFlib Namespace, no tuple
- Take out the functions

## Explore the Information Retrieval Ontology

Before anything else, let us briefly explore what the IR ontology contains.
The steps are:
- Load the IR ontology
- Query for the OWL classes
- Query for the OWL object properties

In [2]:
# Load the IR ontology into an RDFlib Graph
ir_onto_graph = Graph()
ir_onto_graph.parse("../information-retrieval-ontology.ttl")
ir_onto_ns = Namespace("http://www.msesboue.org/o/ir-ontology#")
ir_onto_graph.bind(prefix="ir-onto", namespace=ir_onto_ns) # setup a namespace for nicer human readable display

In [3]:
query_namespaces = { # prefix -> namespace mapping used to resolve the prefixes in the SPARQL queries
    "owl": OWL,
    "rdf": RDF,
    "rdfs": RDFS,
    "ir-onto": ir_onto_ns
}
query_classes = """
    SELECT ?p
    WHERE {
        ?p rdf:type owl:Class .
    }
"""
# Apply the query to the IR ontology graph and iterate through results
# (initNs declares the prefixes; initBindings would pre-bind query variables instead)
for r in ir_onto_graph.query(query_classes, initNs=query_namespaces):
    print(r["p"].n3(ir_onto_graph.namespace_manager))

ir-onto:IncompatibleSearch
ir-onto:Search
ir-onto:CandidateDocument
ir-onto:Category
ir-onto:Classification
ir-onto:Document
ir-onto:EnabledCategory
ir-onto:IncompatibleDocument
ir-onto:SearchContext
ir-onto:SelectedCategory


In [4]:
query_obj_props = """
    SELECT ?p
    WHERE {
        ?p rdf:type owl:ObjectProperty .
    }
"""

# Apply the query to the graph and iterate through results
# (initNs declares the prefixes; initBindings would pre-bind query variables instead)
for r in ir_onto_graph.query(query_obj_props, initNs=query_namespaces):
    print(r["p"].n3(ir_onto_graph.namespace_manager))

ir-onto:hasSearchCategory
ir-onto:categorizedBy
ir-onto:categorizes
ir-onto:enablesCategory
ir-onto:hasSubcategory
ir-onto:hasSupercategory
ir-onto:isMemberOf


## Information Retrieval Ontology usage examples

Let us now see some examples of what we can do with the IR ontology.

For these demos we will use a triple store running on a server. 
Another tutorial will explore the same examples using in-memory tools only. 
These demos require OWL inference at query time. We chose the Ontotext GraphDB triple store (v10.4.0): <https://graphdb.ontotext.com/documentation/10.0/index.html>. Hence, some pieces of code might be specific to the GraphDB APIs. We will flag those wherever possible. Once the Docker container is running, you can browse the Web API documentation at <http://localhost:7200/webapi>.
However, GraphDB implements the RDF4J REST API specification (<https://rdf4j.org/documentation/reference/rest-api/>), so we will rely on it as much as possible.

We use the Docker GraphDB instance without any license (the GraphDB Free version). Hence, you will need to have Docker installed and running on your computer (See Docker installation procedure: <https://docs.docker.com/get-docker/>). Let's start from there.

1. Download the GraphDB image: `docker pull ontotext/graphdb:10.4.0`
   - Ontotext documentation pointers:
     - <https://github.com/Ontotext-AD/graphdb-docker>
     - <https://hub.docker.com/r/ontotext/graphdb/>
2. Run the image: `docker run -p 127.0.0.1:7200:7200 --name graphdb-ir-onto -t ontotext/graphdb:10.4.0`
3. From here on, everything is done in code.

In [5]:
# Setting up the repository we will work with

# Load the repository configuration (WARNING: specific to GraphDB)
repo_config_graph = Graph()
repo_config_graph.parse("./data/ir-onto-demo-graphdb-config.ttl")
repo_config_ttl_string = repo_config_graph.serialize(format="turtle")

headers = {
    "Accept": "application/json",
}
data = {
    "config": ("config.ttl", repo_config_ttl_string)
}

# Uses the GraphDB REST API (WARNING: specific to GraphDB)
r = requests.post("http://localhost:7200/rest/repositories", headers=headers, files=data)
print(r.status_code)
print(r.text)

201



In [6]:
# Check that the repository is created
# (a GET request has no body, so we ask for a response format with Accept rather than setting Content-type)
r = requests.get('http://localhost:7200/repositories', headers={"Accept": "text/csv"})
print(r.status_code)
print(r.text)

200
uri,id,title,readable,writable
http://localhost:7200/repositories/ir-onto-demo,ir-onto-demo,,true,true



### Some utilities

In [7]:
# Define the server info for future interactions
DB_IP = "localhost"
DB_PORT = "7200"
DB_URL = f"http://{DB_IP}:{DB_PORT}"
REPOSITORY_ID = "ir-onto-demo"

# Build an RDFlib NamespaceManager to ease the display of SPARQL results
g = Graph()

ir_onto_ns = Namespace("http://www.msesboue.org/o/ir-ontology#")
g.bind(prefix="ir-onto", namespace=ir_onto_ns)

pizza_onto_ns = Namespace("http://www.co-ode.org/ontologies/pizza/pizza.owl#")
g.bind(prefix="pizza-onto", namespace=pizza_onto_ns)

pizza_onto_demo_ns = Namespace("http://www.msesboue.org/o/ir-onto-pizza-demo#")
g.bind(prefix="pizza-taxos", namespace=pizza_onto_demo_ns)

demo_ns_manager = g.namespace_manager
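
The helpers imported from `ir_onto_demos_utilities` are not shown in this notebook. As a rough idea of what a helper such as `sparql_select` might do on top of the RDF4J REST API, here is a minimal sketch. The function names `build_select_request` and `rows_from_sparql_json` are hypothetical and are not part of the utilities module; the sketch only assumes that a repository accepts form-encoded SPARQL queries at `/repositories/{repo_id}` and can answer in the standard SPARQL JSON results format.

```python
from typing import Dict, List
from urllib.parse import urlencode


def build_select_request(db_url: str, repo_id: str, query: str) -> Dict[str, object]:
    """Describe the HTTP request for a SPARQL SELECT (hypothetical helper)."""
    return {
        "method": "POST",
        "url": f"{db_url}/repositories/{repo_id}",
        "headers": {
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        # The query is sent as a form-encoded "query" parameter
        "data": urlencode({"query": query}),
    }


def rows_from_sparql_json(payload: dict) -> List[Dict[str, str]]:
    """Flatten a SPARQL JSON result into one {variable: value} dict per row."""
    return [
        {var: term["value"] for var, term in binding.items()}
        for binding in payload["results"]["bindings"]
    ]


request = build_select_request(
    "http://localhost:7200",
    "ir-onto-demo",
    "SELECT ?c WHERE { ?c a <http://www.w3.org/2002/07/owl#Class> }",
)
print(request["url"])

# A hand-written response in the SPARQL JSON results format, for illustration only
sample_response = {
    "head": {"vars": ["c"]},
    "results": {
        "bindings": [
            {"c": {"type": "uri", "value": "http://www.msesboue.org/o/ir-ontology#Search"}}
        ]
    },
}
print(rows_from_sparql_json(sample_response))
```

With the `requests` import from the first cell, the described request could then be sent with `requests.post(request["url"], headers=request["headers"], data=request["data"])` and the response decoded with `r.json()`.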

In [8]:
# Let us now load the IR ontology

# We want to add the graph while keeping what is already in the repository
# (with the RDF4J REST API, a PUT on the statements endpoint replaces the existing data).
# Hence we use a SPARQL INSERT query
insert_graph(graph=ir_onto_graph, db_url=DB_URL, repo_id=REPOSITORY_ID)

OK
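
The body of `insert_graph` is not shown here. One plausible way to turn a graph into the SPARQL INSERT query mentioned above is to serialize it to N-Triples (for example with `graph.serialize(format="nt")`) and wrap the statements in an `INSERT DATA` block, since N-Triples statements are also valid SPARQL triple patterns. The sketch below makes that assumption; `build_insert_data_update` is a hypothetical name, not the helper's actual implementation.

```python
def build_insert_data_update(ntriples: str) -> str:
    """Wrap N-Triples statements in a SPARQL INSERT DATA update (hypothetical helper)."""
    lines = (line.strip() for line in ntriples.splitlines())
    # Drop blank lines and N-Triples comments, keep one statement per line
    statements = [line for line in lines if line and not line.startswith("#")]
    body = "\n".join(f"    {statement}" for statement in statements)
    return f"INSERT DATA {{\n{body}\n}}"


# One statement from the mapping used later in this notebook, written as N-Triples
sample_nt = """
# Pizzas are the documents of our search engine
<http://www.msesboue.org/o/ir-onto-pizza-demo#Pizza> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://www.msesboue.org/o/ir-ontology#Document> .
"""
print(build_insert_data_update(sample_nt))
```

Under the RDF4J REST API, such an update string would be posted as the `update` form parameter to `/repositories/{repo_id}/statements`.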


In [9]:
show_owl_classes(ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID)

ir-onto:IncompatibleDocument
ir-onto:Document
ir-onto:IncompatibleSearch
ir-onto:Search
ir-onto:SearchContext
ir-onto:SelectedCategory
ir-onto:Category
ir-onto:Classification
ir-onto:EnabledCategory
ir-onto:CandidateDocument


In [10]:
show_owl_obj_props(ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID)

ir-onto:hasSearchCategory
ir-onto:categorizedBy
ir-onto:categorizes
ir-onto:hasSubcategory
ir-onto:hasSupercategory
ir-onto:enablesCategory
ir-onto:isMemberOf


## Insert some example data

For our examples we will use a version of the well-known Pizza ontology. We will use the one from this repository: <https://github.com/owlcs/pizza-ontology/>

In [11]:
pizza_onto_graph = Graph()
pizza_onto_graph.parse("./data/pizza.ttl")
pizza_onto_ns = Namespace("http://www.co-ode.org/ontologies/pizza/pizza.owl#")
pizza_onto_graph.bind(prefix="pizza-onto", namespace=pizza_onto_ns)

In [12]:
pizza_taxos_graph = Graph()
pizza_taxos_graph.parse("./data/pizza-taxonomies.ttl")
pizza_onto_demo_ns = Namespace("http://www.msesboue.org/o/ir-onto-pizza-demo#")
pizza_taxos_graph.bind(prefix="pizza-taxos", namespace=pizza_onto_demo_ns)

In [13]:
print("Classes")
show_owl_classes(ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID)

print()
print("Object Properties")
show_owl_obj_props(ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID)

Classes
ir-onto:IncompatibleDocument
ir-onto:Document
ir-onto:IncompatibleSearch
ir-onto:Search
ir-onto:SearchContext
ir-onto:SelectedCategory
ir-onto:Category
ir-onto:Classification
ir-onto:EnabledCategory
ir-onto:CandidateDocument

Object Properties
ir-onto:hasSearchCategory
ir-onto:categorizedBy
ir-onto:categorizes
ir-onto:hasSubcategory
ir-onto:hasSupercategory
ir-onto:enablesCategory
ir-onto:isMemberOf


In [14]:
# Let's add the pizza taxonomies to our DB
insert_graph(graph=pizza_taxos_graph, db_url=DB_URL, repo_id=REPOSITORY_ID)

OK


In [15]:
print("Classes")
show_owl_classes(ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID)

print()
print("Object Properties")
show_owl_obj_props(ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID)

Classes
ir-onto:IncompatibleDocument
ir-onto:Document
ir-onto:IncompatibleSearch
ir-onto:Search
ir-onto:SearchContext
ir-onto:SelectedCategory
ir-onto:Category
ir-onto:Classification
ir-onto:EnabledCategory
ir-onto:CandidateDocument
pizza-taxos:Pizza
pizza-taxos:PizzaBase
pizza-taxos:Country
pizza-taxos:Spiciness
pizza-taxos:PizzaTopping
pizza-taxos:PizzaKind

Object Properties
ir-onto:hasSearchCategory
ir-onto:categorizedBy
ir-onto:categorizes
ir-onto:hasSubcategory
ir-onto:hasSupercategory
ir-onto:enablesCategory
ir-onto:isMemberOf
pizza-taxos:hasCountryOfOrigin
pizza-taxos:hasPizzaKind
pizza-taxos:hasTopping
pizza-taxos:makesIt
pizza-taxos:hasBase
pizza-taxos:hasIngredient
pizza-taxos:hasSpiciness


At the moment our triple store contains both the IR ontology graph and the pizza taxonomies graph. However, they are not yet linked.

To link both graphs (our domain and data graphs) in a meaningful manner, we need to add some triples that define:

- What are the categories in our pizza taxonomies graph?
- What are the documents in our pizza taxonomies graph?
- Which relations (i.e., object properties) should enable other categories when their subject category is selected?
- Which relations are used to categorise our documents?

These triples form the mapping between the data graph and the domain graph, and they enable reasoning over our data. 

In [16]:
mapping_graph = Graph()

# In our example we categorise pizzas by country, base, kind, topping and spiciness
mapping_graph.add((pizza_onto_demo_ns.Country, RDFS.subClassOf, ir_onto_ns.Category))
mapping_graph.add((pizza_onto_demo_ns.PizzaBase, RDFS.subClassOf, ir_onto_ns.Category))
mapping_graph.add((pizza_onto_demo_ns.PizzaKind, RDFS.subClassOf, ir_onto_ns.Category))
mapping_graph.add((pizza_onto_demo_ns.PizzaTopping, RDFS.subClassOf, ir_onto_ns.Category))
mapping_graph.add((pizza_onto_demo_ns.Spiciness, RDFS.subClassOf, ir_onto_ns.Category))

# In our example the search engine goal is to find pizzas
mapping_graph.add((pizza_onto_demo_ns.Pizza, RDFS.subClassOf, ir_onto_ns.Document))

# In our example when a category is selected we want to have the subcategories enabled
mapping_graph.add((ir_onto_ns.hasSubcategory, RDFS.subPropertyOf, ir_onto_ns.enablesCategory))
mapping_graph.add((ir_onto_ns.hasSubcategory, RDF.type, OWL.TransitiveProperty)) # otherwise, only the first subcategory level is enabled

# Ex: if a pizza is categorised by a meat topping the category meaty pizza should be enabled
mapping_graph.add((pizza_onto_demo_ns.makesIt, RDFS.subPropertyOf, ir_onto_ns.enablesCategory))

# In our example the relations in our data graph categorizing the pizzas are has topping, has ingredient, ...
mapping_graph.add((pizza_onto_demo_ns.hasTopping, RDFS.subPropertyOf, ir_onto_ns.categorizedBy))
mapping_graph.add((pizza_onto_demo_ns.hasIngredient, RDFS.subPropertyOf, ir_onto_ns.categorizedBy))
mapping_graph.add((pizza_onto_demo_ns.hasCountryOfOrigin, RDFS.subPropertyOf, ir_onto_ns.categorizedBy))
mapping_graph.add((pizza_onto_demo_ns.hasPizzaKind, RDFS.subPropertyOf, ir_onto_ns.categorizedBy))
mapping_graph.add((pizza_onto_demo_ns.hasSpiciness, RDFS.subPropertyOf, ir_onto_ns.categorizedBy))

<Graph identifier=Nd73085de629445c3bf3cc6c04d702ad4 (<class 'rdflib.graph.Graph'>)>

In [17]:
# Let's add the mapping graph
insert_graph(mapping_graph, db_url=DB_URL, repo_id=REPOSITORY_ID)

OK
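
To build some intuition for what the mapping triples buy us, the sketch below applies two RDFS entailment rules by hand in plain Python: instances of a subclass are also instances of the superclass, and a statement made with a subproperty also holds for the superproperty. This is only an illustration. The triple store applies a much richer rule set, and the three data triples about `_hamTopping` and `_parmense` are assumptions made for the example rather than statements copied from the taxonomy file.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
SUBPROPERTY_OF = "rdfs:subPropertyOf"


def rdfs_closure(triples):
    """Apply the subclass and subproperty rules until no new triple appears."""
    inferred = set(triples)
    while True:
        sub_classes = {(s, o) for s, p, o in inferred if p == SUBCLASS_OF}
        sub_props = {(s, o) for s, p, o in inferred if p == SUBPROPERTY_OF}
        new_triples = set()
        for s, p, o in inferred:
            if p == RDF_TYPE:
                # x rdf:type C, C rdfs:subClassOf D  =>  x rdf:type D
                new_triples |= {(s, RDF_TYPE, sup) for sub, sup in sub_classes if sub == o}
            # x p y, p rdfs:subPropertyOf q  =>  x q y
            new_triples |= {(s, sup, o) for sub, sup in sub_props if sub == p}
        if new_triples <= inferred:
            return inferred
        inferred |= new_triples


triples = {
    # Mapping triples, as added in the cell above
    ("pizza-taxos:PizzaTopping", SUBCLASS_OF, "ir-onto:Category"),
    ("pizza-taxos:Pizza", SUBCLASS_OF, "ir-onto:Document"),
    ("pizza-taxos:hasTopping", SUBPROPERTY_OF, "ir-onto:categorizedBy"),
    # Illustrative data triples (assumed, not taken from the taxonomy file)
    ("pizza-taxos:_hamTopping", RDF_TYPE, "pizza-taxos:PizzaTopping"),
    ("pizza-taxos:_parmense", RDF_TYPE, "pizza-taxos:Pizza"),
    ("pizza-taxos:_parmense", "pizza-taxos:hasTopping", "pizza-taxos:_hamTopping"),
}

for triple in sorted(rdfs_closure(triples) - triples):
    print(triple)
```

The three printed triples state that the topping is a category, that the pizza is a document, and that the pizza is categorised by the topping, which is the kind of information the IR ontology classes rely on.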


In [18]:
show_instances(class_uri=str(ir_onto_ns.Category), ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID, limit=10)

Instances of http://www.msesboue.org/o/ir-ontology#Category
pizza-taxos:_america
pizza-taxos:_cheesyPizza
pizza-taxos:_meatyPizza
pizza-taxos:_mozzarellaTopping
pizza-taxos:_peperoniSausageTopping
pizza-taxos:_tomatoTopping
pizza-taxos:_spicyPizza
pizza-taxos:_hotGreenPepperTopping
pizza-taxos:_jalapenoPepperTopping
pizza-taxos:_nonVegetarianPizza
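
`show_instances` also comes from the utilities module, so the query it sends is not visible here. A minimal sketch of a query that would produce a listing like the one above could look as follows; `build_instances_query` is a hypothetical name. Note that none of the listed individuals is declared as an `ir-onto:Category` directly: they can only show up if the store returns inferred statements, which is assumed here.

```python
def build_instances_query(class_uri: str, limit: int = 10) -> str:
    """Build a SELECT query listing instances of a class (hypothetical helper)."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT DISTINCT ?instance\n"
        "WHERE {\n"
        f"    ?instance a <{class_uri}> .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


print(build_instances_query("http://www.msesboue.org/o/ir-ontology#Category", limit=10))
```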



In [19]:
show_instances(class_uri=str(ir_onto_ns.Document), limit=10, ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID)

Instances of http://www.msesboue.org/o/ir-ontology#Document
pizza-taxos:_american
pizza-taxos:_americanHot
pizza-taxos:_cajun
pizza-taxos:_capricciosa
pizza-taxos:_caprina
pizza-taxos:_fiorentina
pizza-taxos:_fourSeasons
pizza-taxos:_frutiDiMare
pizza-taxos:_giardiniera
pizza-taxos:_laReine



## Our first user search

In [20]:
def make_search_the_context(user_search_uri: str) -> None:
    query = f"""
        PREFIX ir-onto: <http://www.msesboue.org/o/ir-ontology#>
        INSERT DATA {{
            <{user_search_uri}> a ir-onto:SearchContext .
        }}
    """

    sparql_update(sparql_query=query, db_url=DB_URL, repo_id=REPOSITORY_ID)

def remove_search_as_context(user_search_uri: str) -> None:
    query = f"""
        PREFIX ir-onto: <http://www.msesboue.org/o/ir-ontology#>
        DELETE DATA {{
            <{user_search_uri}> a ir-onto:SearchContext .
        }}
    """

    sparql_update(sparql_query=query, db_url=DB_URL, repo_id=REPOSITORY_ID)

In [21]:
meatyToppingSearch_graph = Graph()
meatyToppingSearch_graph.add((pizza_onto_demo_ns._meatyToppingSearch, RDF.type, ir_onto_ns.Search))
meatyToppingSearch_graph.add((pizza_onto_demo_ns._meatyToppingSearch, ir_onto_ns.hasSearchCategory, pizza_onto_demo_ns._meatTopping))

onionMushroomToppingSearch_graph = Graph()
onionMushroomToppingSearch_graph.add((pizza_onto_demo_ns._onionMushroomToppingSearch, RDF.type, ir_onto_ns.Search))
onionMushroomToppingSearch_graph.add((pizza_onto_demo_ns._onionMushroomToppingSearch, ir_onto_ns.hasSearchCategory, pizza_onto_demo_ns._onionTopping))
onionMushroomToppingSearch_graph.add((pizza_onto_demo_ns._onionMushroomToppingSearch, ir_onto_ns.hasSearchCategory, pizza_onto_demo_ns._mushroomTopping))

hamToppingSearch_graph = Graph()
hamToppingSearch_graph.add((pizza_onto_demo_ns._hamToppingSearch, RDF.type, ir_onto_ns.Search))
hamToppingSearch_graph.add((pizza_onto_demo_ns._hamToppingSearch, ir_onto_ns.hasSearchCategory, pizza_onto_demo_ns._hamTopping))

<Graph identifier=Ne54bec132dc54ee2abefbbfb3bf9c6c1 (<class 'rdflib.graph.Graph'>)>

In [22]:
insert_graph(graph=meatyToppingSearch_graph, db_url=DB_URL, repo_id=REPOSITORY_ID)
insert_graph(graph=onionMushroomToppingSearch_graph, db_url=DB_URL, repo_id=REPOSITORY_ID)
insert_graph(graph=hamToppingSearch_graph, db_url=DB_URL, repo_id=REPOSITORY_ID)

OK
OK
OK


In [24]:
show_instances(class_uri=str(ir_onto_ns.SelectedCategory), ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID, limit=10)
show_instances(class_uri=str(ir_onto_ns.EnabledCategory), ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID, limit=10)
show_instances(class_uri=str(ir_onto_ns.CandidateDocument), ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID, limit=10)

Instances of http://www.msesboue.org/o/ir-ontology#SelectedCategory

Instances of http://www.msesboue.org/o/ir-ontology#EnabledCategory

Instances of http://www.msesboue.org/o/ir-ontology#CandidateDocument



In [25]:
# with pizza-taxos:_meatyToppingSearch as context:
# we should have among the enabled categories: _chickenTopping, _parmaHamTopping?, _meatyPizza, _nonVegetarianPizza
# we should not have among the enabled categories: _nutTopping, _vegetarianPizza
make_search_the_context(user_search_uri=str(pizza_onto_demo_ns._meatyToppingSearch))

show_instances(class_uri=str(ir_onto_ns.SelectedCategory), ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID, limit=10)
show_instances(class_uri=str(ir_onto_ns.EnabledCategory), ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID, limit=10)
show_instances(class_uri=str(ir_onto_ns.CandidateDocument), ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID, limit=10)

remove_search_as_context(user_search_uri=str(pizza_onto_demo_ns._meatyToppingSearch))

OK
Instances of http://www.msesboue.org/o/ir-ontology#SelectedCategory
pizza-taxos:_meatTopping

Instances of http://www.msesboue.org/o/ir-ontology#EnabledCategory
pizza-taxos:_meatyPizza
pizza-taxos:_peperoniSausageTopping
pizza-taxos:_nonVegetarianPizza
pizza-taxos:_hamTopping
pizza-taxos:_parmaHamTopping
pizza-taxos:_chickenTopping
pizza-taxos:_hotSpicedBeefTopping

Instances of http://www.msesboue.org/o/ir-ontology#CandidateDocument
pizza-taxos:_american
pizza-taxos:_americanHot
pizza-taxos:_cajun
pizza-taxos:_capricciosa
pizza-taxos:_fourSeasons
pizza-taxos:_frutiDiMare
pizza-taxos:_laReine
pizza-taxos:_napoletana
pizza-taxos:_parmense
pizza-taxos:_polloAdAstra

OK


In [26]:
make_search_the_context(user_search_uri=str(pizza_onto_demo_ns._onionMushroomToppingSearch))

show_instances(class_uri=str(ir_onto_ns.SelectedCategory), ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID, limit=10)
show_instances(class_uri=str(ir_onto_ns.EnabledCategory), ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID, limit=10)
show_instances(class_uri=str(ir_onto_ns.CandidateDocument), ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID, limit=10)

remove_search_as_context(user_search_uri=str(pizza_onto_demo_ns._onionMushroomToppingSearch))

OK
Instances of http://www.msesboue.org/o/ir-ontology#SelectedCategory
pizza-taxos:_onionTopping
pizza-taxos:_mushroomTopping

Instances of http://www.msesboue.org/o/ir-ontology#EnabledCategory
pizza-taxos:_redOnionTopping

Instances of http://www.msesboue.org/o/ir-ontology#CandidateDocument
pizza-taxos:_cajun
pizza-taxos:_fourSeasons
pizza-taxos:_giardiniera
pizza-taxos:_laReine
pizza-taxos:_mushroom
pizza-taxos:_polloAdAstra
pizza-taxos:_sloppyGiuseppe
pizza-taxos:_veneziana

OK


In [27]:
make_search_the_context(user_search_uri=str(pizza_onto_demo_ns._hamToppingSearch))

show_instances(class_uri=str(ir_onto_ns.SelectedCategory), ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID, limit=10)
show_instances(class_uri=str(ir_onto_ns.EnabledCategory), ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID, limit=10)
show_instances(class_uri=str(ir_onto_ns.CandidateDocument), ns_manager=demo_ns_manager, db_url=DB_URL, repo_id=REPOSITORY_ID, limit=10)

remove_search_as_context(user_search_uri=str(pizza_onto_demo_ns._hamToppingSearch))

OK
Instances of http://www.msesboue.org/o/ir-ontology#SelectedCategory
pizza-taxos:_hamTopping

Instances of http://www.msesboue.org/o/ir-ontology#EnabledCategory
pizza-taxos:_parmaHamTopping

Instances of http://www.msesboue.org/o/ir-ontology#CandidateDocument
pizza-taxos:_capricciosa
pizza-taxos:_laReine
pizza-taxos:_parmense
pizza-taxos:_siciliana

OK
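
The three search cells above share the same pattern: declare a search as the context, inspect the inferred classes, then remove the context. If an inspection step fails, the context is never removed and may leak into the next search. A context manager is one way to guard against that. The sketch below is a suggestion rather than part of the original utilities; `search_context` is a hypothetical name, and the activate and deactivate steps are passed in as callables so that the example runs without a triple store.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def search_context(
    user_search_uri: str,
    activate: Callable[[str], None],
    deactivate: Callable[[str], None],
) -> Iterator[str]:
    """Mark a search as the current context and always unmark it afterwards."""
    activate(user_search_uri)
    try:
        yield user_search_uri
    finally:
        # Runs even if the body of the "with" block raises
        deactivate(user_search_uri)


# Stand-in callables that only record what happened, so no server is needed
calls = []
with search_context(
    "http://www.msesboue.org/o/ir-onto-pizza-demo#_hamToppingSearch",
    activate=lambda uri: calls.append(("activate", uri)),
    deactivate=lambda uri: calls.append(("deactivate", uri)),
):
    calls.append(("inspect", "inferred classes"))

print([step for step, _ in calls])
```

In this notebook, `make_search_the_context` and `remove_search_as_context` could be passed as `activate` and `deactivate`, with the `show_instances` calls placed inside the `with` block.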
