# 2.2 Creating an ontology
In this short tutorial, we will show you how to use Python code with RDFLib to create an OWL ontology. This OWL ontology represents terminological knowledge, known as the TBox.  

## 1. Setup
We will use the Python package  
> rdflib

Other packages exist as well. It is also possible to work with RDF and OWL in other programming languages:
- C#: [DotNetRdf](https://dotnetrdf.org)
- Java: [Jena](https://jena.apache.org)
- JavaScript: [rdf.js](https://rdf.js.org)
- C: [Redland librdf](https://librdf.org)

## 2. Creating a graph
We import `rdflib` and create an initial graph. This creates an empty graph `g` in memory; nothing is stored on disk or in a database yet. We will then fill this graph with triples and eventually save it to a file.

In [1]:
from rdflib import Graph, Literal, BNode, Namespace, RDF, RDFS, OWL, URIRef
import os
g = Graph()

We will create the ontology below, shown here as a UML class diagram.

<img src="figures/simpleOntology_UML.png" width="600" />

Make sure you understand this class diagram before proceeding. It shows 10 classes in total. Two important classes are `Element` and `Zone`, which both have subclasses. The `Zone` class has one child class, namely `Space`. The `Element` class has three child classes, namely `Aggregate`, `BuildingElement`, and `BuildingEquipment`. The last two have subclasses of their own: `BuildingElement` has the subclass `Wall`, and `BuildingEquipment` has the subclasses `Sensor` and `AirHandlingUnit`. The `Aggregate` class has an aggregation relation (1 to many) with the `Part` class.

There is also a normal one-to-many relation between `Zone` and `Element`, with roles `hasElement` and `hasLocation`. Finally, one datatype property is available (attribute) with name `name` and datatype `string`.

## 3. Namespace and prefix
First, let's give our ontology a name, e.g. `https://example.org/myFirstOntology`. To do this, we add a triple statement (`s p o`) to the graph that declares this IRI to be an ontology: `<https://example.org/myFirstOntology> rdf:type owl:Ontology`. Note that `rdflib` keeps the triples of this graph in your local memory. A graph is a set of triples, so if a redundant triple is added, it is stored only once.

In [2]:
s = URIRef("https://example.org/myFirstOntology")
p = RDF.type
o = OWL.Ontology
g.add((s, p, o))

<Graph identifier=N589ace6204fb40a99c548f0cf76e00c9 (<class 'rdflib.graph.Graph'>)>

Let's see what this looks like when writing out this statement, including all namespaces:

In [3]:
for s, p, o in g:
    print((s, p, o))

(rdflib.term.URIRef('https://example.org/myFirstOntology'), rdflib.term.URIRef('http://www.w3.org/1999/02/22-rdf-syntax-ns#type'), rdflib.term.URIRef('http://www.w3.org/2002/07/owl#Ontology'))


At the moment, we have a graph that declares an ontology with the IRI `https://example.org/myFirstOntology`. Other than that, the ontology is empty. To make the ontology easier to work with, we define a namespace for its terms (the ontology IRI followed by `#`) and bind a prefix to it. We bind the `owl` prefix as well as the `mfi` prefix. The `mfi` prefix is a self-chosen name, and it can be anything you prefer. Most commonly, a prefix has three characters, e.g. `bot`, `ifc`, `bpo`, `fog`, `beo`, `mep`, etc.

In [4]:
g.bind("owl", OWL)

NS = Namespace("https://example.org/myFirstOntology#")
g.bind("mfi", NS)

## 4. Writing to file
The graph that is created in memory can be written to a file at any time. You can choose between multiple serialisation formats.

In [5]:
# Make sure the target folder exists before writing into it
os.makedirs("output", exist_ok=True)
g.serialize(destination="output/myFirstOntology.ttl", format="turtle")
print("Created output/myFirstOntology.ttl in folder:")
print(str(os.getcwd()))

Created output/myFirstOntology.ttl in folder:
/Users/stefan/Repositories/FireBIM/SSolDAC2024/handson-querying-and-interaction


Have a look at the outcome in the serialised file, and see what you created.

## 5. Add your first OWL class
Let's add the concept (`owl:Class`) *Element* as depicted above in our target example. This is our first class. The `owl:Thing` class is available by default. How many classes do we eventually need to create to complete our example?

In [6]:
s = NS["Element"]
p = RDF.type
o = OWL.Class
g.add((s, p, o))

<Graph identifier=N589ace6204fb40a99c548f0cf76e00c9 (<class 'rdflib.graph.Graph'>)>

Examine the result:

In [11]:
g.serialize(destination="output/myFirstOntology.ttl", format="turtle")
print("Created output/myFirstOntology.ttl in folder:")
print(str(os.getcwd()))

Created output/myFirstOntology.ttl in folder:
/Users/stefan/Repositories/FireBIM/SSolDAC2024/handson-querying-and-interaction


## 6. Add other classes and properties
After this first class, we need to create the remaining nine classes, as well as three object properties and seven `rdfs:subClassOf` relationships (inheritance). Can you locate each of these object properties and inheritance relationships in the UML class diagram?

In [8]:
lConcepts = [   
    "Aggregate",
    "Part",
    "BuildingElement",
    "Wall",
    "BuildingEquipment",
    "Sensor",
    "AirHandlingUnit",
    "Space",
    "Zone" 
]

for concept in lConcepts:
    s = NS[concept]
    p = RDF.type
    o = OWL.Class
    g.add((s, p, o))

Now we can add the inheritance relationships for all parent-child couples (`rdfs:subClassOf`):

In [10]:
s = NS["Aggregate"]
p = RDFS.subClassOf
o = NS["Element"]
g.add((s, p, o))

s = NS["BuildingElement"]
p = RDFS.subClassOf
o = NS["Element"]
g.add((s, p, o))

s = NS["BuildingEquipment"]
p = RDFS.subClassOf
o = NS["Element"]
g.add((s, p, o))

g.add((NS["Wall"], RDFS.subClassOf, NS["BuildingElement"]))
g.add((NS["Sensor"], RDFS.subClassOf, NS["BuildingEquipment"]))
g.add((NS["AirHandlingUnit"], RDFS.subClassOf, NS["BuildingEquipment"]))
g.add((NS["Space"], RDFS.subClassOf, NS["Zone"]))

<Graph identifier=N589ace6204fb40a99c548f0cf76e00c9 (<class 'rdflib.graph.Graph'>)>

Now we can add the object properties (owl:ObjectProperty). These need to be added for all relations (except inheritance relations) in the original UML Class Diagram. This means:
- `hasPart`
- `hasElement`
- `hasLocation`

In [12]:
s = NS["hasPart"]
p = RDF.type
o = OWL.ObjectProperty
g.add((s, p, o))

s1 = NS["hasElement"]
p = RDF.type
o = OWL.ObjectProperty
g.add((s1, p, o))

s2 = NS["hasLocation"]
p = RDF.type
o = OWL.ObjectProperty
g.add((s2, p, o))

<Graph identifier=N589ace6204fb40a99c548f0cf76e00c9 (<class 'rdflib.graph.Graph'>)>

Finally, we write the result to file again:

In [13]:
g.serialize(destination="output/myFirstOntology.ttl", format="turtle")
print("Created output/myFirstOntology.ttl in folder:")
print(str(os.getcwd()))

Created output/myFirstOntology.ttl in folder:
/Users/stefan/Repositories/FireBIM/SSolDAC2024/handson-querying-and-interaction


## 7. Data properties and domain-range expressions
The ontology created above is a basic ontology. It can be further extended with data properties, domain and range statements, and more. With the following statements, you can add domain and range expressions. These are not restrictions: they do not forbid any use of a property, but instead allow inferences as specified in https://www.w3.org/TR/owl2-direct-semantics/#Object_Property_Expression_Axioms.

In [14]:
g.add((NS["hasPart"], RDFS.domain, NS["Aggregate"]))
g.add((NS["hasPart"], RDFS.range, NS["Part"]))

g.add((NS["hasElement"], RDFS.domain, NS["Zone"]))
g.add((NS["hasElement"], RDFS.range, NS["Element"]))

g.add((NS["hasLocation"], RDFS.domain, NS["Element"]))
g.add((NS["hasLocation"], RDFS.range, NS["Zone"]))

<Graph identifier=N589ace6204fb40a99c548f0cf76e00c9 (<class 'rdflib.graph.Graph'>)>

The properties `hasElement` and `hasLocation` are inverses of each other, and this can be declared in `rdflib` as follows.

In [15]:
g.add((NS["hasElement"], OWL.inverseOf, NS["hasLocation"]))

<Graph identifier=N589ace6204fb40a99c548f0cf76e00c9 (<class 'rdflib.graph.Graph'>)>

This creates the following ontology.

In [16]:
g.serialize(destination="output/myFirstOntology.ttl", format="turtle")
print("Created output/myFirstOntology.ttl in folder:")
print(str(os.getcwd()))

Created output/myFirstOntology.ttl in folder:
/Users/stefan/Repositories/FireBIM/SSolDAC2024/handson-querying-and-interaction


In addition to domain and range expressions, the ontology can be further extended with data properties. Data properties, as opposed to object properties, refer to literal values of `XSD` datatypes (`integer`, `string`, `boolean`, etc.). In the code snippet below, `name` is first declared as a generic property (`rdf:Property`), after which `xsd:string` is added as a range expression.

In [17]:
from rdflib import XSD

g.add((NS["name"], RDF.type, RDF.Property))
g.add((NS["name"], RDFS.range, XSD.string))

<Graph identifier=N589ace6204fb40a99c548f0cf76e00c9 (<class 'rdflib.graph.Graph'>)>

We don't add a domain expression to this data property, only the `xsd:string` range. This means that the property can be used on any resource without leading to inferences about the type of that resource.

After this last step, we have the following ontology.

In [18]:
g.serialize(destination="output/myFirstOntology.ttl", format="turtle")
print("Created output/myFirstOntology.ttl in folder:")
print(str(os.getcwd()))

Created output/myFirstOntology.ttl in folder:
/Users/stefan/Repositories/FireBIM/SSolDAC2024/handson-querying-and-interaction
