# Knowledge Representation on the Web -- RDFS tutorial

Here we'll learn the basics of RDFS (RDF Schema) and how to perform basic RDFS reasoning with rdflib (documentation [here](https://rdflib.readthedocs.io/en/stable/)) and owlrl (documentation [here](https://owl-rl.readthedocs.io/en/latest/)).

## Imports

owlrl is a library implementing basic RDFS and OWL reasoning on top of rdflib. We'll install and import its relevant symbols.

In [6]:
import sys
!{sys.executable} -m pip install rdflib owlrl

from rdflib import Graph, RDFS, RDF, URIRef, Namespace, Literal
from owlrl import DeductiveClosure, RDFS_Semantics



## Loading RDFS graphs

Your file `yourRDF.ttl` already contains a basic Knowledge Graph in RDF with some RDFS semantics.

First, we are going to add some more RDFS semantics and inspect the graph as-is; this graph, before any inference is applied, is also called the "asserted graph".

**Exercise 1** 
1. add additional triples using RDFS semantics: have a look [here](https://www.w3.org/TR/rdf-schema/), and use `rdfs:domain`, `rdfs:range`, `rdfs:subPropertyOf`, `rdfs:subClassOf`, and `rdfs:Class` to say more about the instances in your graph
2. load yourRDF graph
3. print the classes in your graph
4. print the properties of a specific class in yourRDF graph
5. print all instances in yourRDF graph (all resources that have a type)
6. explain what constitutes a vocabulary in RDF

In [38]:
g = Graph()

#example namespace
EX = Namespace("https://example.org/")

# Add triples using store's add method.
g.add( (EX.whale, RDF.type, EX.Mammalia) )
# In this example the identifiers have human-readable names, but they could be arbitrary strings; rdfs:label makes them human-interpretable.
g.add( (EX.whale, RDFS.label, Literal('whale')) )
g.add( (EX.crocodile, RDF.type, EX.Amphibia) )
g.add( (EX.Amphibia, RDFS.subClassOf, EX.Animalia) )
g.add( (EX.whale, EX.eats, EX.crocodile) )

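# State the format explicitly: older rdflib versions default to RDF/XML rather than Turtle
g.serialize(destination="../data/yourRDF.ttl", format="turtle")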


print("All classes:")
# Every object of an rdf:type triple is used as a class
for cls in g.objects(None, RDF.type):
    print(cls)

# Collect the predicates used on the instances of ex:Mammalia
mammalia_properties = []
for instance in g.subjects(RDF.type, EX.Mammalia):
    for predicate in g.predicates(instance, None):
        mammalia_properties.append(predicate)

print("All properties of the Mammalia class:", mammalia_properties)

print("All instances:")
# Every subject of an rdf:type triple is an instance
for instance in g.subjects(RDF.type, None):
    print(instance)

All classes:
https://example.org/Mammalia
https://example.org/Amphibia
All properties of the Mammalia class: [rdflib.term.URIRef('http://www.w3.org/1999/02/22-rdf-syntax-ns#type'), rdflib.term.URIRef('http://www.w3.org/2000/01/rdf-schema#label'), rdflib.term.URIRef('https://example.org/eats')]
All instances:
https://example.org/whale
https://example.org/crocodile


6. explain what constitutes a vocabulary in RDF: 
["On the Semantic Web, vocabularies define the concepts and relationships (also referred to as “terms”) used to describe and represent an area of concern."](https://www.w3.org/standards/semanticweb/ontology)
In our case, the example vocabulary describes classes and subclasses in the natural domain and how they relate to one another (for example, through `ex:eats`).

## RDFS inferencing

The inference engine in owlrl is triggered by `DeductiveClosure`, which computes the closure of the graph. This requires us to specify the semantic regime under which the inference is performed (i.e. which set of rules, such as those of RDFS or OWL, the reasoner uses to produce derivations). For RDFS semantics we pass `RDFS_Semantics` as the parameter. See extra options [here](https://owl-rl.readthedocs.io/en/latest/stubs/owlrl.html#module-owlrl).


**Exercise 2**
1. expand the graph through RDFS semantics inference
2. print how many triples the new graph has
3. print out the triples in your new graph and inspect them. 

In [39]:
e = Graph()
e.parse("../data/yourRDF.ttl")

DeductiveClosure(RDFS_Semantics).expand(e)
print("RDFS closure of the graph has {} triples, the initial graph had {}".format(len(e), len(g)))

for s,p,o in sorted(e.triples((None,None,None))):
    if o != RDFS.Resource:
        print(s,p,o)

RDFS closure of the graph has 22 triples, the initial graph had 5
http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/2000/01/rdf-schema#subPropertyOf http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://www.w3.org/2000/01/rdf-schema#label http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
http://www.w3.org/2000/01/rdf-schema#label http://www.w3.org/2000/01/rdf-schema#subPropertyOf http://www.w3.org/2000/01/rdf-schema#label
http://www.w3.org/2000/01/rdf-schema#subClassOf http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
http://www.w3.org/2000/01/rdf-schema#subClassOf http://www.w3.org/2000/01/rdf-schema#subPropertyOf http://www.w3.org/2000/01/rdf-schema#subClassOf
http://www.w3.org/2000/01/rdf-schema#subPropertyOf http://www.w3

## The explicit (asserted) graph vs the implicit (derived) graph, and RDF entailment

Asserted triples are those that are explicitly stated in the graph, while derived (or inferred) triples are not stated explicitly but follow from the asserted triples under the semantics of RDFS.

**Exercise 3**

1. Write code here that generates a graph containing **only the RDFS derived triples** of yourRDF Knowledge Graph, not the asserted ones. See a clue on rdflib graph algebra [here](https://rdflib.readthedocs.io/en/stable/merging.html)
2. Have a look at the inferred graph. Based on the RDFS semantics, explain for each triple the rule that was used to generate it.
3. Explain the concept of RDF entailment and the types of entailment RDFS can produce
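
To make the idea of a rule producing a triple more concrete, here is a minimal sketch of four RDFS entailment patterns (rdfs2 for domain, rdfs3 for range, rdfs7 for subPropertyOf, rdfs9 for subClassOf) written by hand over plain Python tuples. This is not how owlrl implements them; the prefixed strings such as `"ex:eats"` stand in for IRIs, and the `rdfs:domain` statement is an assumed addition to the example data.

```python
TYPE, DOMAIN, RANGE = "rdf:type", "rdfs:domain", "rdfs:range"
SUBCLASS, SUBPROP = "rdfs:subClassOf", "rdfs:subPropertyOf"

def rdfs_step(triples):
    """Apply each of the four rules once and return the triples they produce."""
    new = set()
    for s, p, o in triples:
        for s2, p2, o2 in triples:
            if p2 == DOMAIN and s2 == p:       # rdfs2: p has domain o2
                new.add((s, TYPE, o2))
            if p2 == RANGE and s2 == p:        # rdfs3: p has range o2
                new.add((o, TYPE, o2))
            if p2 == SUBPROP and s2 == p:      # rdfs7: p is a subproperty of o2
                new.add((s, o2, o))
            if p == TYPE and p2 == SUBCLASS and s2 == o:  # rdfs9: o is a subclass of o2
                new.add((s, TYPE, o2))
    return new

def closure(asserted):
    """Repeat the rules until no new triple appears (a fixed point)."""
    graph = set(asserted)
    while True:
        derived = rdfs_step(graph) - graph
        if not derived:
            return graph
        graph |= derived

asserted = {
    ("ex:whale", TYPE, "ex:Mammalia"),
    ("ex:crocodile", TYPE, "ex:Amphibia"),
    ("ex:Amphibia", SUBCLASS, "ex:Animalia"),
    ("ex:whale", "ex:eats", "ex:crocodile"),
    ("ex:eats", DOMAIN, "ex:Animalia"),  # assumed, not in the original graph
}

derived = closure(asserted) - asserted
for triple in sorted(derived):
    print(triple)
```

The sketch leaves out the remaining RDFS patterns, for example rdfs6, which makes every property a subproperty of itself, and the typing of every resource as `rdfs:Resource`; a full reasoner such as owlrl applies those as well.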


In [44]:
# Graph difference: the triples in the closure e that are not in the asserted graph g
d = e - g

print("RDFS inference generated additional {} triples".format(len(d)))

for s,p,o in sorted(d.triples((None,None,None))):
    print(s,p,o)

RDFS inference generated additional 17 triples
http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/2000/01/rdf-schema#subPropertyOf http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://www.w3.org/2000/01/rdf-schema#label http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
http://www.w3.org/2000/01/rdf-schema#label http://www.w3.org/2000/01/rdf-schema#subPropertyOf http://www.w3.org/2000/01/rdf-schema#label
http://www.w3.org/2000/01/rdf-schema#subClassOf http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
http://www.w3.org/2000/01/rdf-schema#subClassOf http://www.w3.org/2000/01/rdf-schema#subPropertyOf http://www.w3.org/2000/01/rdf-schema#subClassOf
http://www.w3.org/2000/01/rdf-schema#subPropertyOf http://www.w3.org/1999/02/22-rdf

## Assignment part 2: your own web application


**Exercise 4**
1. load ingredients.rdf and recipes.rdf in one graph. The graph contains types of individuals and types of relationships between them. Print all the classes and properties in the combined graph with the namespace `ind` and the `wtm` namespace/vocabulary (`http://purl.org/heals/food/`). 

2. extend the `ind` vocabulary (`http://purl.org/heals/ingredient/`) by creating a hierarchy of ingredients, and make the superclasses human readable by giving them labels (**hint: `http://purl.org/heals/ingredient/CoconutMilk rdfs:subClassOf http://purl.org/heals/ingredient/PlantMilk`**)
3. do the same for the `wtm` vocabulary: add a hierarchy of recipes as well as a hierarchy of properties (**hint: `http://purl.org/heals/food/hasCookingTemperature rdfs:subPropertyOf ...`**)
4. print the entailed triples as we did in the previous exercise
5. give three examples of how RDFS semantics could aid the chefs in your restaurant
    
6. which properties and classes could you add to the `wtm` and `ind` vocabularies to further describe your recipe and ingredient knowledge graph, aiding the chefs in your restaurant?  
