# Preamble
Before starting this part of the tutorial, you need to execute the KG generation steps with Morph-KGC below. This is necessary because this part of the tutorial builds on the data generated in those steps.



In [None]:
!pip install morph-kgc

In [None]:
import morph_kgc
config = """
             [GTFS-Madrid-Bench]
             mappings: https://raw.githubusercontent.com/oeg-upm/morph-kgc/main/examples/tutorial/mapping.gtfs.ttl
         """
morph_kgc.materialize(config).serialize("knowledge-graph.ttl", format="turtle")

# Inferring implicit knowledge from the KG
So far, we have focussed on integrating non-RDF data into an RDF knowledge graph using RML. In this part of the tutorial, we will demonstrate the use of ontologies for reasoning. We will cover two reasoning tasks: inferring implicit information from the graph and assessing whether the knowledge graph contains a contradiction.

## Inferring implicit information
We start by inferring implicit information from explicit information. The knowledge graph we have constructed in this tutorial uses the [gtfs](http://vocab.gtfs.org/terms#) vocabulary. This vocabulary contains very few axioms, but enough to demonstrate our first reasoning task. In this ontology, `gtfs:Agency` is declared as a subclass of `foaf:Agent`, meaning that every instance of `gtfs:Agency` is also an instance of `foaf:Agent`.
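
To make the meaning of the subclass axiom concrete, here is a minimal sketch in plain Python, independent of RDFLib. Triples are modelled as tuples of shorthand strings, and `ex:agency1` is a hypothetical instance, not a resource from the tutorial data. The function applies a single rule (known as `cax-sco` in the OWL 2 RL specification): if `x` is an instance of class `A` and `A` is a subclass of `B`, then `x` is also an instance of `B`.

```python
# Triples are plain (subject, predicate, object) tuples of shorthand strings.
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

explicit = {
    ("ex:agency1", TYPE, "gtfs:Agency"),      # data (hypothetical instance)
    ("gtfs:Agency", SUBCLASS, "foaf:Agent"),  # axiom from the ontology
}

def apply_subclass_rule(triples):
    """Return the new type triples produced by one pass of the subclass rule."""
    triples = set(triples)
    inferred = set()
    for s, p, cls in triples:
        if p != TYPE:
            continue
        for sub, q, sup in triples:
            if q == SUBCLASS and sub == cls:
                inferred.add((s, TYPE, sup))
    return inferred - triples

inferred = apply_subclass_rule(explicit)
print(inferred)  # {('ex:agency1', 'rdf:type', 'foaf:Agent')}
```

The inferred triple is not part of `explicit`; it only becomes available once the rule has been applied, which is what the reasoner will do for us later in this section.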

We will first load the graph we generated in the previous step into a new graph object stored in the variable `data`.

In [None]:
import rdflib

In [None]:
# Creating a new graph for the data we have previously generated
data = rdflib.Graph()
data.parse('knowledge-graph.ttl')

print(f"There are {len(data)} triples in the graph.")

We will first count the number of instances of each type using a simple query. You will notice that, in our graph, there is one instance of `gtfs:Agency` but no instances of `foaf:Agent`. This is because a SPARQL query only retrieves information that is explicitly present in the graph available to it. (*)

In [None]:
qtypes = """
         SELECT ?type (COUNT(*) AS ?count1) WHERE {
            [] a ?type .
         } GROUP BY ?type ORDER BY ?count1
      """

from io import BytesIO
import pandas as pd
df1 = pd.read_csv(BytesIO(data.query(qtypes).serialize(format='csv')), index_col=0)
df1

In [None]:
turtle_output = data.query("""
    CONSTRUCT { ?s ?p ?o } WHERE {
        ?s a <http://vocab.gtfs.org/terms#Agency> .
        ?s ?p ?o .
    }
""").serialize(format='turtle').decode("utf-8")

print(turtle_output)

If we were to consult the `gtfs` ontology, we would find, among other axioms, the aforementioned axiom stating that `gtfs:Agency` is a subclass of `foaf:Agent`.

```turtle
gtfs:Agency a rdfs:Class ;
            status:term_status "stable"@en ;
            status:term_status "estable"@es ;
            rdfs:subClassOf foaf:Agent ;
            rdfs:label "Agency"@en ;
            rdfs:label "Empresa"@es ;
            rdfs:seeAlso <https://developers.google.com/transit/gtfs/reference#agency_fields> ;
            rdfs:comment "An agency operates a certain schedule based transport mode"@en ;
            rdfs:comment "Una empresa opera un cierto modo de transporte de manera programada"@es .
```
                
We will now create a new graph and add the triples of the `gtfs` ontology and our generated graph. We need to do this so that a reasoner can access both the data and the axioms.

In [None]:
data_bis = rdflib.Graph()

data_bis.parse('http://vocab.gtfs.org/terms')
print(f"There are {len(data_bis)} triples in the gtfs ontology.")

data_bis.parse('knowledge-graph.ttl', format='turtle')
print(f"After adding the data, there are now {len(data_bis)} triples in the graph.")

We then use RDFLib's [OWL-RL](https://owl-rl.readthedocs.io/en/latest/) implementation to reason over these axioms. The reasoner modifies the graph in place: the inferred triples are added to the graph object we pass to it.

While the various OWL2 profiles (different ontology language "flavors" with varying levels of expressivity and computational complexity) are beyond the scope of this tutorial, it suffices to know that OWL2 RL (where RL stands for Rule Language) is a decidable subset of First-Order Logic whose reasoning can be implemented with rule languages.
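
To give a flavour of what rule-based reasoning looks like, the following plain-Python sketch applies two rules from the OWL 2 RL rule set (subclass transitivity, `scm-sco`, and type propagation, `cax-sco`) repeatedly until no new triples appear. This is only an illustration under simplifying assumptions: triples are tuples of shorthand strings, the classes `ex:A`, `ex:B`, `ex:C` are hypothetical, and an actual OWL-RL reasoner implements many more rules than these two.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

def rule_subclass_transitivity(triples):
    # (A subClassOf B), (B subClassOf C) => (A subClassOf C)
    return {
        (a, SUBCLASS, c)
        for a, p1, b in triples if p1 == SUBCLASS
        for b2, p2, c in triples if p2 == SUBCLASS and b2 == b
    }

def rule_type_propagation(triples):
    # (x type A), (A subClassOf B) => (x type B)
    return {
        (x, TYPE, b)
        for x, p1, a in triples if p1 == TYPE
        for a2, p2, b in triples if p2 == SUBCLASS and a2 == a
    }

RULES = [rule_subclass_transitivity, rule_type_propagation]

def closure(triples):
    """Apply all rules repeatedly until a pass produces no new triple."""
    closed = set(triples)
    while True:
        new = set()
        for rule in RULES:
            new |= rule(closed)
        new -= closed
        if not new:
            return closed
        closed |= new

graph = {
    ("ex:x", TYPE, "ex:A"),
    ("ex:A", SUBCLASS, "ex:B"),
    ("ex:B", SUBCLASS, "ex:C"),
}
closed = closure(graph)
for triple in sorted(closed - graph):
    print(triple)
```

The loop terminates because the rules only combine terms that already occur in the graph, so the set of derivable triples is finite.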

In [None]:
!pip3 install owlrl

In [None]:
import owlrl
owlrl.DeductiveClosure(owlrl.OWLRL_Semantics).expand(data_bis)
print(f"After reasoning, there are now {len(data_bis)} triples in the graph.")

An OWL-RL reasoner already allows one to infer additional triples based on the data, but these are limited to the "obvious" facts, such as there being two boolean values. We are primarily interested in the facts we can infer via the ontology. Now let's rerun the query and see whether additional triples have been inferred. 

In [None]:
turtle_output = data_bis.query("""
    CONSTRUCT { ?s ?p ?o } WHERE {
        ?s a <http://vocab.gtfs.org/terms#Agency> .
        ?s ?p ?o .
    }
""").serialize(format='turtle').decode("utf-8")

print(turtle_output)

Indeed, we now see that the resource `<http://transport.linkeddata.es/madrid/agency/00000000000000000001>` is both a `gtfs:Agency` and a `foaf:Agent`.

You can launch a second query on this "augmented" graph, create a new data frame, and join this data frame with the first to observe that much more information has been inferred. While our examples are limited to subclass inferences, OWL-RL supports many more axioms. You can execute the following statements at your leisure.

In [None]:
# qtypes = """
#          SELECT ?type (COUNT(*) AS ?count2) WHERE {
#             [] a ?type .
#          } GROUP BY ?type ORDER BY ?count2
#       """

# df2 = pd.read_csv(BytesIO(data_bis.query(qtypes).serialize(format='csv')), index_col=0)
# pd.merge(df1, df2, on='type', how='outer')

## Looking for contradictions
The OWL2 ontology languages have been designed to support several reasoning tasks. One of these reasoning tasks is to assess whether an ontology is satisfiable (i.e., it does not contain contradictions or, in other words, "it can happen"). Other reasoning tasks include subsumption checking (Is class A a subclass of class B?) and class satisfiability checking (Can a class have instances?).
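
As a rough illustration of subsumption checking, the sketch below answers "Is class A a subclass of class B?" by following asserted `rdfs:subClassOf` links only. This is a deliberate simplification: a real reasoner also takes other kinds of axioms into account. The class `ex:BusAgency` is hypothetical and does not exist in the `gtfs` ontology.

```python
from collections import deque

SUBCLASS = "rdfs:subClassOf"

def is_subclass_of(triples, sub, sup):
    """Return True if `sup` is reachable from `sub` via asserted subclass links."""
    parents = {}
    for s, p, o in triples:
        if p == SUBCLASS:
            parents.setdefault(s, set()).add(o)
    seen, queue = {sub}, deque([sub])
    while queue:
        current = queue.popleft()
        if current == sup:
            return True
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return False

axioms = {
    ("ex:BusAgency", SUBCLASS, "gtfs:Agency"),  # hypothetical axiom
    ("gtfs:Agency", SUBCLASS, "foaf:Agent"),    # axiom from the gtfs ontology
}
print(is_subclass_of(axioms, "ex:BusAgency", "foaf:Agent"))  # True
print(is_subclass_of(axioms, "foaf:Agent", "gtfs:Agency"))   # False
```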

One may reasonably assume that these checks have already been conducted on the ontology itself (i.e., without instances) to rule out any "obvious" contradictions. Although they may be computationally expensive, these reasoning tasks can also provide quality assurance for your data: they allow you to check whether the data you have provided leads to contradictions. This is what we will exemplify now.

In [None]:
import owlrl
from owlrl.Namespaces import ERRNS, T

for error in data_bis.objects(predicate=ERRNS.error):
    print(error)

There are no errors, as our data contained none. The `gtfs` ontology is not highly axiomatized either. For this tutorial, we will now demonstrate satisfiability checking in two steps. We will first introduce an axiom stating that no resource can be both a `foaf:Agent` and a `gtfs:Feed`, which makes sense. Then we will add a resource that does not comply with that axiom.
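
Conceptually, detecting a violation of a disjointness axiom amounts to finding an individual that is typed with both classes (rule `cax-dw` in the OWL 2 RL specification). The plain-Python sketch below illustrates the idea on tuples of shorthand strings. It is not how `owlrl` is implemented, and `ex:foo` and `ex:bar` are hypothetical resources.

```python
TYPE = "rdf:type"
DISJOINT = "owl:disjointWith"

def find_disjointness_violations(triples):
    """Return (individual, class1, class2) for individuals typed with two disjoint classes."""
    # Collect the asserted types of every individual.
    types = {}
    for s, p, o in triples:
        if p == TYPE:
            types.setdefault(s, set()).add(o)
    # Check every disjointness axiom against every individual.
    violations = []
    for c1, p, c2 in triples:
        if p != DISJOINT:
            continue
        for individual, classes in types.items():
            if c1 in classes and c2 in classes:
                violations.append((individual, c1, c2))
    return sorted(violations)

triples = {
    ("foaf:Agent", DISJOINT, "gtfs:Feed"),
    ("ex:foo", TYPE, "foaf:Agent"),
    ("ex:foo", TYPE, "gtfs:Feed"),
    ("ex:bar", TYPE, "foaf:Agent"),
}
violations = find_disjointness_violations(triples)
print(violations)  # [('ex:foo', 'foaf:Agent', 'gtfs:Feed')]
```

Only `ex:foo` is reported, since `ex:bar` belongs to just one of the two disjoint classes.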

In [None]:
# Adding the axiom
data_bis.parse(data="""
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix gtfs: <http://vocab.gtfs.org/terms#> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    
    foaf:Agent owl:disjointWith gtfs:Feed .
    
    <http://www.example.com/foo> a foaf:Agent .
    <http://www.example.com/foo> a gtfs:Feed .
""")

# re-launching the reasoner
owlrl.DeductiveClosure(owlrl.OWLRL_Semantics).expand(data_bis)

# check for problems
for error in data_bis.objects(predicate=ERRNS.error):
    print(error)
    
# DISCLAIMER: if you see any complaints w.r.t. the data, this is because
# inconsistencies were introduced with the synthetic data provided for this tutorial

(*) While there are extensions of SPARQL supporting so-called [Entailment Regimes](https://www.w3.org/TR/sparql11-entailment/), these are out of this tutorial's scope.