In [None]:
!pip install rdflib
!pip install owlrl
#!pip install Owlready2



In [None]:
from rdflib import Graph, Literal, URIRef, BNode
from rdflib.term import Identifier
from rdflib.collection import Collection
from rdflib.namespace import RDF, RDFS, SKOS, XSD, OWL
import rdflib.plugins.sparql.update
import owlrl.RDFSClosure


# Data Resources


In [None]:
Egg_Ontology_BFO_CCO_URL="https://raw.githubusercontent.com/materialdigital/ontology-playground/main/example1-data/Egg_playground-BFO-CCO.ttl" 
Egg_Ontology_EMMO_URL="https://raw.githubusercontent.com/materialdigital/ontology-playground/main/example1-data/Egg_playground-EMMO.ttl"
Egg_Ontology_PMDCO_URL="https://raw.githubusercontent.com/materialdigital/ontology-playground/main/example1-data/Egg_playground-PMDCO.ttl"
Egg_Ontology_PROVO_URL="https://raw.githubusercontent.com/materialdigital/ontology-playground/main/example1-data/Egg_playground-PROVO.ttl"


In [None]:
g = Graph()
g.parse(Egg_Ontology_BFO_CCO_URL, format='ttl')
g.parse(Egg_Ontology_EMMO_URL, format='ttl')
g.parse(Egg_Ontology_PMDCO_URL, format='ttl')
g.parse(Egg_Ontology_PROVO_URL, format='ttl')
print(len(g))

491


In [None]:
for prefix, namespace in g.namespaces():
  print(prefix, namespace)

rdf http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs http://www.w3.org/2000/01/rdf-schema#
xsd http://www.w3.org/2001/XMLSchema#
xml http://www.w3.org/XML/1998/namespace
dc http://purl.org/dc/elements/1.1/
cco http://www.ontologyrepository.com/CommonCoreOntologies/
obo http://purl.obolibrary.org/obo/
ccobfo_egg http://example.org/ccobfo#
emmo_egg http://example.org/emmo#
owl http://www.w3.org/2002/07/owl#
prov http://www.w3.org/ns/prov#
pmdco http://material-digital.de/pmdco#
wikibase http://wikiba.se/ontology#
pmdao_egg http://example.org/pmd#
provo_egg http://example.org/provo#
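
The queries in this notebook use prefixed names such as `provo_egg:Egg` without any `PREFIX` declaration. This appears to work because rdflib falls back to the prefixes bound on the graph, i.e. the ones listed above. As a minimal sketch of what that expansion amounts to, the plain-Python helper below (the name `expand` and the reduced prefix table are illustrative, not part of rdflib) turns a prefixed name into a full IRI:

```python
# Subset of the prefix table, copied from the namespace listing above.
PREFIXES = {
    "prov": "http://www.w3.org/ns/prov#",
    "cco": "http://www.ontologyrepository.com/CommonCoreOntologies/",
    "obo": "http://purl.obolibrary.org/obo/",
    "provo_egg": "http://example.org/provo#",
}


def expand(curie, prefixes=PREFIXES):
    """Expand a prefixed name such as 'prov:Entity' into a full IRI.

    Hypothetical helper for illustration only; rdflib does this internally.
    """
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {curie!r}")
    return prefixes[prefix] + local


print(expand("provo_egg:Egg"))
print(expand("obo:BFO_0000030"))
```

If a prefix is missing from the graph (for example because an ontology file did not declare it), the corresponding query needs the full IRI in angle brackets, as done for the EMMO class further down.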


Example query for instances of Egg in the PROV-O example:


In [None]:
query = """
SELECT ?s 
WHERE { 
    ?s a provo_egg:Egg
}
"""
qres = g.query(query)
for row in qres:
    print(f"{row.s} is an instance of Egg class defined in PROV ontology")

http://example.org/provo#egg1 is an instance of Egg class defined in PROV ontology


We now run the reasoner so that instances of subclasses also appear in the result.

In [None]:
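print(len(g))  # number of triples before reasoning
# Combined RDFS + OWL 2 RL closure; the three positional flags are axioms, daxioms and rdfs.
rdfs = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)
rdfs.closure()                # derive the entailed triples
rdfs.flush_stored_triples()   # write any pending triples into g
print(len(g))  # number of triples after reasoning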

491
2307


The same query now also returns the **boiled egg** and **easter egg** as instances of Egg in the PROV-O example.

In [None]:
qres = g.query(query)
for row in qres:
    print(f"{row.s} is an instance of Egg class defined in PROV ontology")

http://example.org/provo#egg1 is an instance of Egg class defined in PROV ontology
http://example.org/provo#boiledEgg1 is an instance of Egg class defined in PROV ontology
http://example.org/provo#easterEgg1 is an instance of Egg class defined in PROV ontology
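
To make the effect of the closure concrete, here is a toy sketch in plain Python (no rdflib or owlrl). It applies only the two RDFS rules that matter here: subclass transitivity and type propagation along `rdfs:subClassOf`. The class names `ex:BoiledEgg` and `ex:EasterEgg` and their hierarchy are assumptions made for the illustration; the real example files may model this differently, and owlrl applies many more rules.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def rdfs_subclass_closure(triples):
    """Apply rdfs9 (type propagation) and rdfs11 (subclass transitivity)
    repeatedly until no new triple is derived."""
    closed = set(triples)
    while True:
        derived = set()
        for s, p, o in closed:
            if p not in (RDF_TYPE, SUBCLASS_OF):
                continue
            for sub, p2, sup in closed:
                # (s p o) and (o subClassOf sup) entail (s p sup)
                if p2 == SUBCLASS_OF and sub == o:
                    derived.add((s, p, sup))
        if derived <= closed:
            return closed
        closed |= derived


def instances_of(triples, cls):
    """Mimic 'SELECT ?s WHERE { ?s a cls }' on a set of triples."""
    return sorted(s for s, p, o in triples if p == RDF_TYPE and o == cls)


# Hypothetical toy data, loosely modelled on the PROV-O egg example.
toy = {
    ("ex:egg1", RDF_TYPE, "ex:Egg"),
    ("ex:boiledEgg1", RDF_TYPE, "ex:BoiledEgg"),
    ("ex:easterEgg1", RDF_TYPE, "ex:EasterEgg"),
    ("ex:BoiledEgg", SUBCLASS_OF, "ex:Egg"),
    ("ex:EasterEgg", SUBCLASS_OF, "ex:BoiledEgg"),
}

before = instances_of(toy, "ex:Egg")
after = instances_of(rdfs_subclass_closure(toy), "ex:Egg")
print(before)
print(after)
```

Before the closure only the directly typed instance matches; afterwards all three do, which mirrors the change in the query result above.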


Let's now query for **prov:Entity**, which comprises the eggs as well as other domain items.

In [None]:
query = """

SELECT ?s 
WHERE { 
    ?s a prov:Entity
}
"""
qres = g.query(query)
for row in qres:
    print(f"{row.s}")

http://example.org/pmd#2e7012fc14764843a96b6e231fb67c53
http://example.org/pmd#515c392524f74d14bfd4e24c64141095
http://example.org/provo#egg1
http://example.org/provo#cookingParams1
http://example.org/provo#easterEgg1
http://example.org/pmd#egg1
http://example.org/pmd#boiledEgg1
http://example.org/provo#boiledEgg1
http://example.org/pmd#easterEgg1


Let's now query for BFO objects (obo:BFO_0000030, prefix obo). Only TapWater is returned (see the Concept Board).


In [None]:
query = """

SELECT ?s 
WHERE {   
    ?s a obo:BFO_0000030
}
"""
qres = g.query(query)
for row in qres:
    print(f"{row.s}")

http://example.org/ccobfo#TapWater


Let's query for cco:Artifact. (TapWater is not expected in the result because it is typed only as obo:BFO_0000030.)

In [None]:
query = """

SELECT ?s 
WHERE { 
    ?s a cco:Artifact
}
"""
qres = g.query(query)
for row in qres:
    print(f"{row.s}")

No artifacts were found, because the data does not include the complete class hierarchy Egg -> 'Portion Of Food' -> Artifact -> bfo:object.

To complete the hierarchy, we have to add the CCO Artifact Ontology and the BFO ontology, and then run the reasoner again.


In [None]:
BFO_URL="http://purl.obolibrary.org/obo/bfo/2020/bfo.owl"
g.parse(BFO_URL, format='xml')
CCO_URL="https://raw.githubusercontent.com/CommonCoreOntology/CommonCoreOntologies/master/ArtifactOntology.ttl"
g.parse(CCO_URL, format='ttl')


<Graph identifier=N61cee2854cb040f69b648a8b74ead1d3 (<class 'rdflib.graph.Graph'>)>

In [None]:
print(len(g))
rdfs = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)
rdfs.closure()
rdfs.flush_stored_triples()
print(len(g))

7077
31528


In [None]:
qres = g.query(query)
for row in qres:
    print(f"{row.s}")

http://example.org/ccobfo#informationBearingEntity1
http://example.org/ccobfo#informationBearingEntity3
http://example.org/ccobfo#portionOfWater1
http://example.org/ccobfo#portionOfWater2
http://example.org/ccobfo#Egg1
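
Why the first cco:Artifact query came back empty can be reproduced on a toy subclass chain. The sketch below is plain Python; the class identifiers are stand-ins for the hierarchy Egg -> 'Portion Of Food' -> Artifact -> bfo:object named above, and which of these links the example data already contains is an assumption.

```python
def superclasses(cls, subclass_of):
    """Return every class reachable from `cls` by following subclass links."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Assumption: the example data only links Egg to 'Portion Of Food'.
data_only = {
    "ccobfo_egg:Egg": ["cco:PortionOfFood"],
}

# After loading the Artifact Ontology and BFO, the rest of the chain is there.
with_ontologies = {
    **data_only,
    "cco:PortionOfFood": ["cco:Artifact"],
    "cco:Artifact": ["obo:BFO_0000030"],
}

print("cco:Artifact" in superclasses("ccobfo_egg:Egg", data_only))
print("cco:Artifact" in superclasses("ccobfo_egg:Egg", with_ontologies))
```

The reasoner can only follow links that are present in the graph, so the upper part of the hierarchy has to be loaded before the closure is computed.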


To wrap up what we have so far: 
There are two examples for modeling the cooking process.
We can query for eggs in each of the examples. With reasoning active, we also collect instances of subclasses of the Egg classes. 

We now want to query for eggs in both examples with a single query. 
This does not work yet, because there is no mapping between the classes.

To include a mapping, we can load the PMDCO2 ontology (the file `pmdco_bfo2-mapping.ttl`). 

In [None]:
Mapping_URL="https://raw.githubusercontent.com/materialdigital/ontology-playground/main/example1-data/pmdco_bfo2-mapping.ttl"
g.parse(Mapping_URL, format='ttl')

<Graph identifier=N61cee2854cb040f69b648a8b74ead1d3 (<class 'rdflib.graph.Graph'>)>

Now let's run the reasoner again (this takes one or two minutes).

In [None]:
print(len(g))
rdfs = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)
rdfs.closure()
rdfs.flush_stored_triples()
print(len(g))

31689
42833


We now query again for **prov:Entity**. The result now also contains entities from the CCO example, although these were not explicitly modelled as PROV-O entities.

In [None]:
query = """
SELECT ?s 
WHERE { 
    ?s a prov:Entity
}
"""
qres = g.query(query)
for row in qres:
    print(f"{row.s}")

http://example.org/pmd#2e7012fc14764843a96b6e231fb67c53
http://example.org/pmd#515c392524f74d14bfd4e24c64141095
http://example.org/provo#egg1
http://example.org/provo#cookingParams1
http://example.org/provo#easterEgg1
http://example.org/pmd#egg1
http://example.org/pmd#boiledEgg1
http://example.org/provo#boiledEgg1
http://example.org/pmd#easterEgg1
http://example.org/ccobfo#informationBearingEntity1
http://example.org/ccobfo#informationBearingEntity3
http://example.org/ccobfo#Egg1
http://example.org/ccobfo#portionOfWater1
http://example.org/ccobfo#portionOfWater2
http://example.org/ccobfo#TapWater


Because there is a mapping between cco:Artifact and prov:Entity, we can substitute prov:Entity with cco:Artifact in the query and receive the same result. The PMDCO entities are found because pmdco:Object is a subclass of prov:Entity.


In [None]:
query = """
SELECT ?s 
WHERE { 
    ?s a cco:Artifact 
}
"""
qres = g.query(query)
for row in qres:
    print(f"{row.s}")

http://example.org/ccobfo#informationBearingEntity1
http://example.org/ccobfo#informationBearingEntity3
http://example.org/ccobfo#portionOfWater1
http://example.org/ccobfo#portionOfWater2
http://example.org/ccobfo#Egg1
http://example.org/pmd#easterEgg1
http://example.org/pmd#egg1
http://example.org/provo#egg1
http://example.org/provo#cookingParams1
http://example.org/provo#easterEgg1
http://example.org/provo#boiledEgg1
http://example.org/pmd#boiledEgg1
http://example.org/pmd#2e7012fc14764843a96b6e231fb67c53
http://example.org/pmd#515c392524f74d14bfd4e24c64141095
http://example.org/ccobfo#TapWater
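
A toy sketch of why a class mapping makes the two queries interchangeable. It assumes the mapping is expressed as `owl:equivalentClass`, which is treated as `rdfs:subClassOf` in both directions; the actual mapping file may use a different construct, and the two instances are only a small excerpt of the results above.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
EQUIVALENT = "owl:equivalentClass"


def propagate_types(triples):
    """Rewrite equivalences as mutual subclass links, then propagate
    rdf:type along subclass links until nothing new is derived."""
    closed = set(triples)
    for s, p, o in triples:
        if p == EQUIVALENT:
            closed |= {(s, SUBCLASS, o), (o, SUBCLASS, s)}
    while True:
        derived = {
            (x, TYPE, sup)
            for x, p, cls in closed if p == TYPE
            for sub, p2, sup in closed if p2 == SUBCLASS and sub == cls
        }
        if derived <= closed:
            return closed
        closed |= derived


def instances_of(triples, cls):
    return sorted(s for s, p, o in triples if p == TYPE and o == cls)


toy = {
    ("ccobfo#Egg1", TYPE, "cco:Artifact"),
    ("provo#egg1", TYPE, "prov:Entity"),
    ("cco:Artifact", EQUIVALENT, "prov:Entity"),  # assumed form of the mapping
}

closed = propagate_types(toy)
print(instances_of(toy, "prov:Entity"))
print(instances_of(closed, "prov:Entity"))
print(instances_of(closed, "cco:Artifact"))
```

With the equivalence in place, both classes end up with the same instance set, which is the behaviour seen in the two query results above.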


So far, this seems to work well. 
Let's now work on the competency questions.

----------------------

# Competency questions:
https://docs.google.com/document/d/186rP8P1GpUQ59VUKMyq9c7GaVdp6BYcWQSVbRQRy0SE/edit#heading=h.2xbuv3z0nlkf

## (1) Which materials are present in the data?

In [None]:
# todo

## (2) Which properties do the materials (or one particular material) have?
    - e.g. what mass do the existing samples have?
    - e.g. what size are the boiled eggs involved?



In [None]:
# todo

##  (3) Which processes are present in the data?

Expectation: queries for prov:Activity, bfo:process (obo:BFO_0000015), cco:Act, and emmo:Process (<http://emmo.info/emmo#EMMO_43e9a05d_98af_41b4_92f6_00f79a09bfce>) lead to the same results.

In [None]:
query = """
SELECT ?s 
WHERE { 
    ?s a prov:Activity
}
"""
qres = g.query(query)
for row in qres:
    print(f"{row.s}")

http://example.org/provo#meCookingAnEgg1
http://example.org/provo#mePaintingAnEgg1
http://example.org/pmd#meCookingAnEgg1
http://example.org/pmd#mePaintingAnEgg1
http://example.org/emmo#EggBoilingProcess1
http://example.org/emmo#Egg1


In [None]:
query = """
SELECT ?s 
WHERE { 
    ?s a obo:BFO_0000015
}
"""
qres = g.query(query)
for row in qres:
    print(f"{row.s}")

http://example.org/emmo#Egg1
http://example.org/provo#mePaintingAnEgg1
http://example.org/emmo#EggBoilingProcess1
http://example.org/pmd#meCookingAnEgg1
http://example.org/pmd#mePaintingAnEgg1
http://example.org/provo#meCookingAnEgg1


In [None]:
query = """
SELECT ?s 
WHERE { 
    ?s a cco:Act
}
"""
qres = g.query(query)
for row in qres:
    print(f"{row.s}")

In [None]:
query = """
SELECT ?s 
WHERE { 
    ?s a <http://emmo.info/emmo#EMMO_43e9a05d_98af_41b4_92f6_00f79a09bfce> 
}
"""
qres = g.query(query)
for row in qres:
    print(f"{row.s}")

http://example.org/emmo#EggBoilingProcess1
http://example.org/emmo#Egg1
http://example.org/provo#meCookingAnEgg1
http://example.org/provo#mePaintingAnEgg1
http://example.org/pmd#mePaintingAnEgg1
http://example.org/pmd#meCookingAnEgg1
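
Since the rows come back in a different order for each query, comparing the listings by eye is error-prone. The sketch below compares them as sets. The instance names are copied from the four outputs above with the `http://example.org/` base stripped; `emmo:Process` is used as a short label for the long EMMO IRI.

```python
# Result listings pasted from the four query outputs above.
results = {
    "prov:Activity": """
        provo#meCookingAnEgg1 provo#mePaintingAnEgg1
        pmd#meCookingAnEgg1 pmd#mePaintingAnEgg1
        emmo#EggBoilingProcess1 emmo#Egg1
    """,
    "obo:BFO_0000015": """
        emmo#Egg1 provo#mePaintingAnEgg1 emmo#EggBoilingProcess1
        pmd#meCookingAnEgg1 pmd#mePaintingAnEgg1 provo#meCookingAnEgg1
    """,
    "cco:Act": "",
    "emmo:Process": """
        emmo#EggBoilingProcess1 emmo#Egg1
        provo#meCookingAnEgg1 provo#mePaintingAnEgg1
        pmd#mePaintingAnEgg1 pmd#meCookingAnEgg1
    """,
}

# Order does not matter for the comparison, so turn each listing into a set.
sets = {name: frozenset(listing.split()) for name, listing in results.items()}

reference = sets["prov:Activity"]
agreeing = sorted(name for name, found in sets.items() if found == reference)
differing = sorted(name for name, found in sets.items() if found != reference)

print("same result as prov:Activity:", agreeing)
print("different result:", differing)
```

Inside the notebook the same comparison could be made directly on the query results, for example by collecting `row.s` of each result into a set.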


We can see that the mapping works well for all classes except cco:Act, which returns nothing.

To make cco:Act work, we need the CCO Event Ontology 
(and we must not forget to run the reasoner again).

In [None]:
CCO_Event_URL="https://raw.githubusercontent.com/CommonCoreOntology/CommonCoreOntologies/master/EventOntology.ttl"
g.parse(CCO_Event_URL, format='ttl')

<Graph identifier=N61cee2854cb040f69b648a8b74ead1d3 (<class 'rdflib.graph.Graph'>)>

In [None]:
print(len(g))
rdfs = owlrl.CombinedClosure.RDFS_OWLRL_Semantics(g, False, False, False)
rdfs.closure()
rdfs.flush_stored_triples()
print(len(g))

44990
59285


In [None]:
query = """
SELECT ?s 
WHERE { 
    ?s a cco:Act
}
"""
qres = g.query(query)
for row in qres:
    print(f"{row.s}")

http://example.org/ccobfo#HeatUpWater
http://example.org/ccobfo#RetrieveEgg
http://example.org/ccobfo#PutPotOnHotPlate
http://example.org/ccobfo#CookEgg
http://example.org/ccobfo#PutEggInWater
http://example.org/ccobfo#PotFill
http://example.org/ccobfo#ActOfBoilingAnEgg
http://example.org/ccobfo#CookWater
http://example.org/ccobfo#QuenchEgg
http://example.org/ccobfo#actOfPaintingAnEgg1


We can see that cco:Act returns only CCO-related instances. That is because there is no direct mapping between cco:Act and the other ontologies: 
cco:Act is a subclass of bfo:process, and only bfo:process is mapped to the other ontologies.

That means cco:Act does not seem to be the right candidate for an "overarching" query.

We should discuss this in detail.
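
As a possible starting point for that discussion, the toy sketch below shows why a subclass link behaves differently from a mapping. It assumes that cco:Act is only a subclass of bfo:process, and that bfo:process and prov:Activity are mapped in both directions; the two instances and their asserted types are simplified stand-ins, not taken from the data files.

```python
# "Every instance of the key class is also an instance of each listed class."
IMPLIES = {
    "cco:Act": {"obo:BFO_0000015"},        # subclass link: one direction only
    "obo:BFO_0000015": {"prov:Activity"},  # assumed mapping: both directions
    "prov:Activity": {"obo:BFO_0000015"},
}

# Simplified asserted types (assumption for the illustration).
ASSERTED_TYPE = {
    "ccobfo#CookEgg": "cco:Act",
    "provo#meCookingAnEgg1": "prov:Activity",
}


def inferred_types(cls):
    """All classes an instance of `cls` belongs to, including `cls` itself."""
    seen, stack = {cls}, [cls]
    while stack:
        for target in IMPLIES.get(stack.pop(), ()):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def instances_of(cls):
    return sorted(
        instance
        for instance, asserted in ASSERTED_TYPE.items()
        if cls in inferred_types(asserted)
    )


print(instances_of("cco:Act"))
print(instances_of("prov:Activity"))
```

Under these assumptions a query for prov:Activity (or bfo:process) picks up the CCO acts as well, while a query for cco:Act never reaches instances from the other examples, because nothing implies membership in cco:Act.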

## (4) Which process steps (sub-processes) are there?

In [None]:
# todo

etc. 