# Knowledge and Data: Practical Assignment 3 
## RDF Data, RDFS knowledge and inferencing 

YOUR NAME: Filippos Kontolemis

YOUR VUNetID: fko230

*(If you do not provide your name and VUNetID we will not accept your submission).* 

### Learning objectives

At the end of this exercise you should be able to:

1. Access local and external data via SPARQL, both from within a Python programming environment and stand-alone with a GUI such as [YASGUI](https://yasgui.triply.cc/), and in this way integrate data from different sources  
2. Model your own first knowledge base, in this case an RDF Schema knowledge graph
3. Implement inference rules 

Follow this Notebook step-by-step. 

Of course, you can do the exercises in any programming editor of your liking. 
But you do not have to. Feel free to simply write code in the Notebook. When 
everything is filled in and works, save the Notebook and submit it 
as a Jupyter Notebook, i.e. with an ipynb extension. Please name the 
Notebook studentID+Assignment3.ipynb.  


We will not evaluate the programming style of your solutions. We do, however, check whether your solutions suggest an understanding and whether they yield the correct output.

Note that all notebooks will automatically be checked for plagiarism: while similar answers can be expected, it is not allowed to directly copy the solutions from fellow students or TAs, or from the examples discussed during the lectures. Similarly, sharing your solutions with your peers is not allowed.

**IMPORTANT: Submit this notebook after finishing the assignment. It is not necessary to submit the created turtle files**

Before you start, you need to:

- **Install the *rdflib* Python package:** *pip install rdflib* (should already be installed from the previous assignment)
- **Install the *SPARQLWrapper* Python package:** *pip install SPARQLWrapper*
- **Install the free edition of the GraphDB Triplestore:** please follow this short [GraphDB tutorial](https://github.com/ucds-vu/knowledge-data-vu/blob/master/Tutorials/Preliminaries/tutorial-GraphDB.md). 

Then, add the file example-from-slides.ttl to a newly created database called, say, assignment-3. 

**Note that you need an active internet connection to run the code in this notebook. If, for some external reason (e.g. internet and/or system issues), you cannot access the SPARQL endpoint, then report this to a TA as soon as possible!**

In [51]:
# install library
%pip install SPARQLWrapper

Note: you may need to restart the kernel to use updated packages.


## Task 1: (35 points) Integrate Local and External Data

You can integrate SPARQL queries into your Python code by using the *RDFLib* and *SPARQLWrapper* libraries. 

The following code accesses the DBpedia knowledge graph through its SPARQL endpoint and returns the result of a SPARQL query requesting all the labels asserted for Amsterdam (test it!)  

In [52]:
# This code only works if you are online.
# If, for some reason, you cannot get this to work, then please contact a TA

from rdflib import Graph, RDF, RDFS, Namespace, Literal, URIRef
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?cityName
    WHERE { 
        <http://dbpedia.org/resource/Amsterdam> rdfs:label ?cityName 
    }
""")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
for result in results["results"]["bindings"]:
    print(result["cityName"]["value"])  

Amsterdam
أمستردام
Amsterdam
Amsterdam
Amsterdam
Άμστερνταμ
Amsterdamo
Ámsterdam
Amsterdam
Amstardam
Amsterdam
Amsterdam
Amsterdam
암스테르담
アムステルダム
Amsterdam
Amsterdam
Амстердам
Amesterdão
Amsterdam
Амстердам
阿姆斯特丹


For your convenience, we already wrote the following functions that might be useful to complete this task. 
In addition, we have loaded and printed the 'example-from-slides.ttl' dataset.

In [53]:
from rdflib import Graph, RDF, Namespace, Literal, URIRef
from SPARQLWrapper import SPARQLWrapper, JSON


# Loads the data from a certain file given as input in Turtle syntax into the Graph g  
# -------------------------
def load_graph(graph, filename):
    with open(filename, 'r') as f:
        graph.parse(f, format='turtle')
        

# Prints a certain graph given as input in Turtle syntax
# If your output shows a byte string (i.e., b'...'), you must add '.decode()' to the print statement:
#    print(myGraph.serialize(format='turtle').decode())
# -------------------------
def serialize_graph(myGraph):
     print(myGraph.serialize(format='turtle'))
        

# Saves the Graph g in Turtle syntax to a certain file given as input
# -------------------------
def save_graph(myGraph, filename):
    # serialize() writes the file itself, so no separate open() is needed
    myGraph.serialize(destination=filename, format='turtle')
        
    
# Changes the namespace of a certain URI given as input to a DBpedia URI 
# Example: transformToDBR("http://example.com/kad2020/Amsterdam") returns "http://dbpedia.org/resource/Amsterdam"
# -------------------------
def transformToDBR(uri):
    if isinstance(uri, Literal):
        # changes the literal to uppercase so that the object with the same name refers to an object and not the string
        return uri.upper()
    components = g.namespace_manager.compute_qname(uri)
    return "http://dbpedia.org/resource/%s"%(components[2])

# -------------------------

g = Graph()
load_graph(g, 'example-from-slides.ttl')
serialize_graph(g)


# Don't forget to run this cell before continuing the task.


@prefix ex: <http://example.com/kad/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ex:Netherlands a ex:Country ;
    ex:contains ex:Ijsselmeer ;
    ex:containsCity ex:Rotterdam ;
    ex:hasCapital ex:Amsterdam ;
    ex:hasName "The Netherlands" ;
    ex:neighbours ex:Belgium .

ex:hasCapital rdfs:range ex:Capital ;
    rdfs:subPropertyOf ex:containsCity .

ex:neighbours rdfs:subPropertyOf ex:closeBy .

ex:Amsterdam a ex:Capital ;
    ex:closeBy ex:Germany .

ex:Belgium a ex:Country .

ex:EuropeanCountry rdfs:subClassOf ex:Country .

ex:Germany a ex:EuropeanCountry ;
    ex:hasCapital ex:Berlin .

ex:closeBy rdfs:domain ex:Location ;
    rdfs:range ex:Location .

ex:containsCity rdfs:domain ex:Country ;
    rdfs:range ex:City ;
    rdfs:subPropertyOf ex:contains .

ex:Capital rdfs:subClassOf ex:City .

ex:City rdfs:subClassOf ex:Location .

ex:Country rdfs:subClassOf ex:Location .




### A: Write a SPARQL query that finds all the cities in the dataset

As you cannot directly use class City, you will have to find those cities in the dataset (example-from-slides.ttl) using implicit information that can be deduced from the domain and ranges of the relations (e.g. things in a hasCapital relation are capitals and a capital is a city, etc.).

Save all the cities returned from the SPARQL query into the empty set "cities". 
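
The reasoning described above can also be sketched without SPARQL. The following is a minimal, dependency-free illustration that uses plain tuples instead of an rdflib graph. The tuples are shortened copies of a few triples from example-from-slides.ttl, and the helper names `superclasses` and `find_cities` are invented for this sketch.

```python
# A few triples from example-from-slides.ttl, written as plain tuples.
TRIPLES = {
    ("ex:Netherlands", "ex:containsCity", "ex:Rotterdam"),
    ("ex:Netherlands", "ex:hasCapital", "ex:Amsterdam"),
    ("ex:Netherlands", "ex:neighbours", "ex:Belgium"),
    ("ex:Germany", "ex:hasCapital", "ex:Berlin"),
    ("ex:hasCapital", "rdfs:range", "ex:Capital"),
    ("ex:containsCity", "rdfs:range", "ex:City"),
    ("ex:closeBy", "rdfs:range", "ex:Location"),
    ("ex:Capital", "rdfs:subClassOf", "ex:City"),
    ("ex:City", "rdfs:subClassOf", "ex:Location"),
}


def superclasses(cls, triples):
    """Return cls together with every class reachable via rdfs:subClassOf."""
    seen = {cls}
    todo = [cls]
    while todo:
        current = todo.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in seen:
                seen.add(o)
                todo.append(o)
    return seen


def find_cities(triples):
    """Collect objects of every property whose range is ex:City or a subclass of it."""
    city_props = {
        s for s, p, o in triples
        if p == "rdfs:range" and "ex:City" in superclasses(o, triples)
    }
    return {o for s, p, o in triples if p in city_props}


print(sorted(find_cities(TRIPLES)))
```

Here `ex:hasCapital` qualifies because its range `ex:Capital` is a subclass of `ex:City`, while `ex:closeBy` does not, since `ex:Location` is a superclass rather than a subclass of `ex:City`.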

In [54]:
cities = set()


from rdflib.plugins.sparql import prepareQuery

q = prepareQuery("""
    PREFIX ex: <http://example.com/kad/>
    SELECT DISTINCT ?city WHERE {
        { ?country ex:containsCity ?city . }
        UNION
        { ?country ex:hasCapital ?city . }
    }
""")

for row in g.query(q):
    cities.add(row.city)


for city in cities:
    print(city)

http://example.com/kad/Rotterdam
http://example.com/kad/Berlin
http://example.com/kad/Amsterdam


### B: For each city, find from DBpedia its longitude & latitude, and its number of inhabitants (if available)

Don't forget to adapt the namespace of the cities in your dataset when querying DBpedia, using the above function *transformToDBR(uri)*. Also note that namespaces should never use the *https* protocol.

The empty graph h should only contain the triples extracted from DBpedia, but added to the URIs with the 'ex' namespace. 
An example of a triple in h is the following triple: 
       
       ex:Amsterdam dbo:populationTotal "872680"^^xsd:nonNegativeInteger .
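
One way to picture this step: the endpoint returns rows of bindings keyed by variable name, and each value is re-attached to the subject in the `ex` namespace. The sketch below is only an illustration under stated assumptions. `SAMPLE` is a hand-written stand-in in the SPARQL JSON results layout (its values are placeholders, not DBpedia data), the variable names `lat`, `long` and `pop` are arbitrary, and `dbr_to_ex` and `bindings_to_triples` are hypothetical helpers that produce plain tuples rather than rdflib triples.

```python
EX = "http://example.com/kad/"
DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"

# Hand-written sample in the SPARQL JSON results layout; values are placeholders.
SAMPLE = {
    "results": {
        "bindings": [
            {
                "lat": {"type": "literal", "value": "1.5"},
                "pop": {"type": "literal", "value": "100"},
            },
        ]
    }
}

# Which predicate each query variable should be stored under (an assumption of this sketch).
PREDICATES = {
    "lat": GEO + "lat",
    "long": GEO + "long",
    "pop": DBO + "populationTotal",
}


def dbr_to_ex(uri):
    """Swap the DBpedia resource namespace for the local ex namespace."""
    if not uri.startswith(DBR):
        raise ValueError(f"not a DBpedia resource URI: {uri}")
    return EX + uri[len(DBR):]


def bindings_to_triples(dbpedia_uri, results):
    """Turn result rows into (subject, predicate, value) tuples on the ex subject."""
    subject = dbr_to_ex(dbpedia_uri)
    triples = set()
    for row in results["results"]["bindings"]:
        for var, predicate in PREDICATES.items():
            if var in row:  # unbound variables are omitted from a row
                triples.add((subject, predicate, row[var]["value"]))
    return triples


for triple in sorted(bindings_to_triples(DBR + "Amsterdam", SAMPLE)):
    print(triple)
```

Checking `var in row` matters when a value is optional: a row without a population still contributes its coordinates instead of being dropped.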

In [55]:
from rdflib import Graph, Namespace, URIRef, Literal
from rdflib.namespace import XSD, RDF
from SPARQLWrapper import SPARQLWrapper, JSON

ex = Namespace("http://example.com/kad/")
dbo = Namespace("http://dbpedia.org/ontology/")

h = Graph()
h.bind('ex', ex)
h.bind('dbo', dbo)
h.bind('xsd', XSD)

for city in cities:
    dbpedia_uri = transformToDBR(city)
    sparql.setQuery(f"""
        PREFIX dbo: <http://dbpedia.org/ontology/>
        PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
        SELECT ?lat_geo ?long_geo ?pop WHERE {{
             OPTIONAL {{ <{dbpedia_uri}> geo:lat ?lat_geo . }}
             OPTIONAL {{ <{dbpedia_uri}> geo:long ?long_geo . }}
             OPTIONAL {{ <{dbpedia_uri}> dbo:populationTotal ?pop . }}
        }}
    """)
    
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()
    for result in results["results"]["bindings"]:
        lat = result.get("lat_geo")
        lon = result.get("long_geo")
        pop = result.get("pop")
        
        # Rebuild the subject in the ex namespace from the city's local name
        subject = URIRef(ex + str(city).split('/')[-1].split('#')[-1])
        if lat and lon:
            h.add((subject, dbo.latitude, Literal(lat["value"], datatype=XSD.float)))
            h.add((subject, dbo.longitude, Literal(lon["value"], datatype=XSD.float)))
        if pop:
            h.add((subject, dbo.populationTotal, Literal(pop["value"], datatype=XSD.nonNegativeInteger)))

serialize_graph(h)

@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix ex: <http://example.com/kad/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:Amsterdam dbo:latitude "52.36666488647461"^^xsd:float ;
    dbo:longitude "4.900000095367432"^^xsd:float ;
    dbo:populationTotal "907976"^^xsd:nonNegativeInteger .

ex:Berlin dbo:latitude "52.52000045776367"^^xsd:float ;
    dbo:longitude "13.40499973297119"^^xsd:float ;
    dbo:populationTotal "3677472"^^xsd:nonNegativeInteger .

ex:Rotterdam dbo:latitude "51.91666793823242"^^xsd:float ;
    dbo:longitude "4.5"^^xsd:float ;
    dbo:populationTotal "651157"^^xsd:nonNegativeInteger .




### C: Save your results

- Merge the triples from example-from-slides.ttl with the information extracted from DBpedia. See the [documentation](https://rdflib.readthedocs.io/en/stable/merging.html) on how to accomplish this.
- Save all these triples into a new file 'extended-example.ttl'. **It is not necessary to submit this file**
- Print all triples in Turtle Syntax.


In [56]:

merged_graph = g + h
merged_graph.serialize('extended-example.ttl', format='turtle')


<Graph identifier=Na482cb17566f46eba205c85ae48695fb (<class 'rdflib.graph.Graph'>)>

## Task 2: (25 points)  Implement Basic Inferencing Rules 

In the lecture we showed that the RDFS inference rules can be used to infer new knowledge. For example, infer class membership based on _rdfs:domain_ or infer relationships between subjects and objects based on _rdfs:subPropertyOf_. 

Create rules to infer class membership based on the RDF Schema language features 
*	For example: infer that an instance belongs to a class because of domain and range restrictions
*	For example: infer that an instance belongs to a (super)class because it also belongs to a subclass

We implemented the __rdfs2__ rule. You should implement the 5 remaining rules listed below:  

*     (rdfs2) If G contains the triples (aaa rdfs:domain xxx.) and (uuu aaa yyy.) then infer the triple (uuu rdf:type xxx.)
*     (rdfs3) If G contains the triples (aaa rdfs:range xxx.) and (uuu aaa vvv.) then infer the triple (vvv rdf:type xxx.)
*     (rdfs5) If G contains the triples (uuu rdfs:subPropertyOf vvv.) and (vvv rdfs:subPropertyOf xxx.) then infer the triple (uuu rdfs:subPropertyOf xxx.)
*     (rdfs7) If G contains the triples (aaa rdfs:subPropertyOf bbb.) and (uuu aaa yyy.) then infer the triple (uuu bbb yyy.)
*     (rdfs9) If G contains the triples (uuu rdfs:subClassOf xxx.) and (vvv rdf:type uuu.) then infer the triple (vvv rdf:type xxx.) (This rule was not mentioned in the lecture, but it is a very important one.)
*     (rdfs11) If G contains the triples (uuu rdfs:subClassOf vvv.) and (vvv rdfs:subClassOf xxx.) then infer the triple (uuu rdfs:subClassOf xxx.)


Run your rule reasoner on your knowledge graph. If you have implemented everything correctly, you should find exactly 17 inferences.
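
As a dependency-free illustration of how such rules operate, the sketch below applies rdfs2, rdfs3 and rdfs7 once to a set of tuple triples. It is not the reasoner asked for in this task: it uses plain tuples instead of an rdflib graph, the function name `apply_once` is invented here, and it returns only triples that are not already asserted, so its counts need not match the number expected above.

```python
DOMAIN, RANGE = "rdfs:domain", "rdfs:range"
SUBPROP, TYPE = "rdfs:subPropertyOf", "rdf:type"


def apply_once(triples):
    """Apply rdfs2, rdfs3 and rdfs7 a single time and return the new triples."""
    inferred = set()
    for a, p, x in triples:              # candidate schema triple (a p x)
        for u, q, y in triples:          # candidate data triple (u q y)
            if q != a:                   # the data triple must use property a
                continue
            if p == DOMAIN:              # rdfs2: the subject is an instance of x
                inferred.add((u, TYPE, x))
            elif p == RANGE:             # rdfs3: the object is an instance of x
                inferred.add((y, TYPE, x))
            elif p == SUBPROP:           # rdfs7: the triple also holds for x
                inferred.add((u, x, y))
    return inferred - triples


FACTS = {
    ("ex:hasCapital", SUBPROP, "ex:containsCity"),
    ("ex:containsCity", DOMAIN, "ex:Country"),
    ("ex:containsCity", RANGE, "ex:City"),
    ("ex:Netherlands", "ex:containsCity", "ex:Rotterdam"),
    ("ex:Germany", "ex:hasCapital", "ex:Berlin"),
}

for triple in sorted(apply_once(FACTS)):
    print(triple)
```

Returning a set means that a triple derivable in several ways is reported once, which is a design choice of this sketch rather than a requirement of the rules.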

In [57]:
def myRDFSreasoner(myGraph):
    inferredTriples = 0
    for sbj, prd, obj in myGraph:

        # --- rdfs2 ---
        if (prd.eq(URIRef("http://www.w3.org/2000/01/rdf-schema#domain"))):
            generator = myGraph.subject_objects(URIRef(sbj))
            for s, o in generator:
                inferredTriples += 1
                print("(rdfs 2) ", s, "rdf:type", obj)
        
        
        # --- rdfs3 ---
        if (prd.eq(URIRef("http://www.w3.org/2000/01/rdf-schema#range"))):
            generator = myGraph.subject_objects(URIRef(sbj))
            for s, o in generator:
                inferredTriples += 1
                print("(rdfs 3) ", o, "rdf:type", obj)

        
        
        # --- rdfs5 ---
        if (prd.eq(URIRef("http://www.w3.org/2000/01/rdf-schema#subPropertyOf"))):
            generator = myGraph.predicate_objects(URIRef(obj))
            for p, o in generator:
                if p.eq(URIRef("http://www.w3.org/2000/01/rdf-schema#subPropertyOf")):
                    inferredTriples += 1
                    print("(rdfs 5) ", sbj, "rdfs:subPropertyOf", o)

        # --- rdfs7 ---
        if (prd.eq(URIRef("http://www.w3.org/2000/01/rdf-schema#subPropertyOf"))):
            generator = myGraph.subject_objects(URIRef(sbj))
            for s, o in generator:
                inferredTriples += 1
                print("(rdfs 7) ", s, obj, o)

         
        
        # --- rdfs9 ---
        if (prd.eq(URIRef("http://www.w3.org/2000/01/rdf-schema#subClassOf"))):
            generator = myGraph.subject_objects(URIRef("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"))
            for s, o in generator:
                if o.eq(URIRef(sbj)):
                    inferredTriples += 1
                    print("(rdfs 9) ", s, "rdf:type", obj)
        

        
        # --- rdfs11 ---
        if (prd.eq(URIRef("http://www.w3.org/2000/01/rdf-schema#subClassOf"))):
            generator = myGraph.predicate_objects(URIRef(obj))
            for p, o in generator:
                if p.eq(URIRef("http://www.w3.org/2000/01/rdf-schema#subClassOf")):
                    inferredTriples += 1
                    print("(rdfs 11) ", sbj, "rdfs:subClassOf", o)
        

        
        
    print("---------------------------------")
    print("Number of inferred triples:", inferredTriples)
    print("---------------------------------")
    
myRDFSreasoner(g)  # test your reasoner



(rdfs 2)  http://example.com/kad/Amsterdam rdf:type http://example.com/kad/Location
(rdfs 3)  http://example.com/kad/Amsterdam rdf:type http://example.com/kad/Capital
(rdfs 3)  http://example.com/kad/Berlin rdf:type http://example.com/kad/Capital
(rdfs 9)  http://example.com/kad/Netherlands rdf:type http://example.com/kad/Location
(rdfs 9)  http://example.com/kad/Belgium rdf:type http://example.com/kad/Location
(rdfs 9)  http://example.com/kad/Amsterdam rdf:type http://example.com/kad/City
(rdfs 11)  http://example.com/kad/Capital rdfs:subClassOf http://example.com/kad/Location
(rdfs 7)  http://example.com/kad/Netherlands http://example.com/kad/contains http://example.com/kad/Rotterdam
(rdfs 5)  http://example.com/kad/hasCapital rdfs:subPropertyOf http://example.com/kad/contains
(rdfs 7)  http://example.com/kad/Netherlands http://example.com/kad/containsCity http://example.com/kad/Amsterdam
(rdfs 7)  http://example.com/kad/Germany http://example.com/kad/containsCity http://example.com/

## Task 3: (20 points) Build your very own RDFS knowledge graph. 


Define a small RDF Schema vocabulary in Turtle. You can choose your own domain (e.g. movies, geography, sports), as long as it hasn't been used as an example during the lectures. The following rules must be respected:
*	The schema should define at least 4 classes, 4 properties, and 4 instances.
*   The properties should be used to relate the instances (i.e., object-type relations)
*	The instances should be members of at least one of the 4 defined classes
*	All resources should have an rdfs:label attribute in a suitable language.

You should use (at least) the following language features of RDF and RDFS:
* 	rdf:type (or 'a')
* 	rdfs:subClassOf
* 	rdfs:subPropertyOf
* 	rdfs:domain and rdfs:range
*	rdfs:label

Be sure to define the 'rdf:' and 'rdfs:' namespace prefixes for RDF and RDF Schema in your file (perhaps have a look at http://prefix.cc)

To create your vocabulary, you should add the axioms directly (programmatically) to your Knowledge Graph, as you did last week. 

Play around with the inference rules you have created in the previous task to make sure that you added some implicit knowledge, that becomes "visible" via inferencing (this will be useful for the next task). 
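
Some implicit knowledge only appears when rules are applied repeatedly, because one inference can enable another. The sketch below illustrates this for rdfs9 and rdfs11 by iterating until no new triple is produced. It uses plain tuples rather than an rdflib graph, and the toy vocabulary (`ex:Sparrow`, `ex:Bird`, `ex:Animal`, `ex:tweety`) as well as the function names `step` and `closure` are made up for the example.

```python
SUBCLASS, TYPE = "rdfs:subClassOf", "rdf:type"


def step(triples):
    """Apply rdfs9 and rdfs11 once and return only the new triples."""
    new = set()
    for u, p, x in triples:
        if p != SUBCLASS:                    # need (u rdfs:subClassOf x)
            continue
        for v, q, w in triples:
            if q == TYPE and w == u:         # rdfs9: v is a u, hence also an x
                new.add((v, TYPE, x))
            if q == SUBCLASS and v == x:     # rdfs11: u under x and x under w
                new.add((u, SUBCLASS, w))
    return new - triples


def closure(triples):
    """Repeat step() until a fixpoint is reached."""
    known = set(triples)
    while True:
        new = step(known)
        if not new:
            return known
        known |= new


FACTS = {
    ("ex:Sparrow", SUBCLASS, "ex:Bird"),
    ("ex:Bird", SUBCLASS, "ex:Animal"),
    ("ex:tweety", TYPE, "ex:Sparrow"),
}

print("after one pass:", sorted(step(FACTS)))
print("at fixpoint:   ", sorted(closure(FACTS) - FACTS))
```

In this toy graph a single pass does not yet yield `ex:tweety rdf:type ex:Animal`; that triple depends on results of the first pass and only shows up in the second.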

Finally:
- Add the knowledge you created into the RDFlib graph data structure *myRDFSgraph*
- Print *myRDFSgraph* in Turtle so that we can check your "design"
- Save *myRDFSgraph* into a new file 'myRDFSgraph.ttl' (it is not necessary to submit this file)

In [58]:
myRDFSgraph = Graph()
EX = Namespace("http://example.org/football/")
myRDFSgraph.bind("ex", EX)
myRDFSgraph.bind("rdf", RDF)
myRDFSgraph.bind("rdfs", RDFS)

myRDFSgraph.add((EX.FootballCompetition, RDF.type, RDFS.Class))
myRDFSgraph.add((EX.PremierLeague, RDF.type, RDFS.Class))
myRDFSgraph.add((EX.PremierLeague, RDFS.subClassOf, EX.FootballCompetition))
myRDFSgraph.add((EX.FootballClub, RDF.type, RDFS.Class))
myRDFSgraph.add((EX.Player, RDF.type, RDFS.Class))
myRDFSgraph.add((EX.Stadium, RDF.type, RDFS.Class))
myRDFSgraph.add((EX.Manager, RDF.type, RDFS.Class))



myRDFSgraph.add((EX.hasPlayer, RDF.type, RDF.Property))
myRDFSgraph.add((EX.hasPlayer, RDFS.domain, EX.FootballClub))
myRDFSgraph.add((EX.hasPlayer, RDFS.range, EX.Player))

myRDFSgraph.add((EX.hasManager, RDF.type, RDF.Property))
myRDFSgraph.add((EX.hasManager, RDFS.domain, EX.FootballClub))
myRDFSgraph.add((EX.hasManager, RDFS.range, EX.Manager))

myRDFSgraph.add((EX.playsAt, RDF.type, RDF.Property))
myRDFSgraph.add((EX.playsAt, RDFS.domain, EX.FootballClub))
myRDFSgraph.add((EX.playsAt, RDFS.range, EX.Stadium))

myRDFSgraph.add((EX.coaches, RDF.type, RDF.Property))
myRDFSgraph.add((EX.coaches, RDFS.domain, EX.Manager))
myRDFSgraph.add((EX.coaches, RDFS.range, EX.Player))

myRDFSgraph.add((EX.participatesIn, RDF.type, RDF.Property))
myRDFSgraph.add((EX.participatesIn, RDFS.domain, EX.FootballClub))
myRDFSgraph.add((EX.participatesIn, RDFS.range, EX.PremierLeague))

myRDFSgraph.add((EX.Arsenal, EX.participatesIn, EX.PremierLeague))
myRDFSgraph.add((EX.LiverpoolFC, EX.participatesIn, EX.PremierLeague))


myRDFSgraph.add((EX.Arsenal, RDF.type, EX.FootballClub))
myRDFSgraph.add((EX.Arsenal, RDFS.label, Literal("Arsenal Football Club")))
myRDFSgraph.add((EX.LiverpoolFC, RDF.type, EX.FootballClub))
myRDFSgraph.add((EX.LiverpoolFC, RDFS.label, Literal("Liverpool Football Club")))

myRDFSgraph.add((EX.Saka, RDF.type, EX.Player))
myRDFSgraph.add((EX.Saka, RDFS.label, Literal("Bukayo Saka")))
myRDFSgraph.add((EX.MohamedSalah, RDF.type, EX.Player))
myRDFSgraph.add((EX.MohamedSalah, RDFS.label, Literal("Mohamed Salah")))

myRDFSgraph.add((EX.Arteta, RDF.type, EX.Manager))
myRDFSgraph.add((EX.Arteta, RDFS.label, Literal("Mikel Arteta")))
myRDFSgraph.add((EX.Slott, RDF.type, EX.Manager))
myRDFSgraph.add((EX.Slott, RDFS.label, Literal("Arne Slot")))

myRDFSgraph.add((EX.EmiratesStadium, RDF.type, EX.Stadium))
myRDFSgraph.add((EX.EmiratesStadium, RDFS.label, Literal("Emirates Stadium")))
myRDFSgraph.add((EX.Anfield, RDF.type, EX.Stadium))
myRDFSgraph.add((EX.Anfield, RDFS.label, Literal("Anfield")))



myRDFSgraph.add((EX.Arsenal, EX.hasManager, EX.Arteta))
myRDFSgraph.add((EX.Arsenal, EX.playsAt, EX.EmiratesStadium))
myRDFSgraph.add((EX.Arteta, EX.coaches, EX.Saka))
myRDFSgraph.add((EX.LiverpoolFC, EX.hasManager, EX.Slott))
myRDFSgraph.add((EX.LiverpoolFC, EX.playsAt, EX.Anfield))
myRDFSgraph.add((EX.Slott, EX.coaches, EX.MohamedSalah))

myRDFSgraph.add((EX.YouthAcademy, RDF.type, RDFS.Class))
myRDFSgraph.add((EX.YouthAcademy, RDFS.subClassOf, EX.FootballClub))
myRDFSgraph.add((EX.YouthAcademy, RDFS.label, Literal("Youth Academy")))
myRDFSgraph.add((EX.ArsenalYouthAcademy, RDF.type, EX.YouthAcademy))
myRDFSgraph.add((EX.ArsenalYouthAcademy, RDFS.label, Literal("Arsenal Youth Academy")))
myRDFSgraph.add((EX.LiverpoolYouthAcademy, RDF.type, EX.YouthAcademy))
myRDFSgraph.add((EX.LiverpoolYouthAcademy, RDFS.label, Literal("Liverpool Youth Academy")))


myRDFSgraph.add((EX.hasCaptain, RDF.type, RDF.Property))
myRDFSgraph.add((EX.hasCaptain, RDFS.subPropertyOf, EX.hasPlayer))
myRDFSgraph.add((EX.hasCaptain, RDFS.domain, EX.FootballClub))
myRDFSgraph.add((EX.hasCaptain, RDFS.range, EX.Player))
myRDFSgraph.add((EX.LiverpoolFC, EX.hasCaptain, EX.MohamedSalah))
myRDFSgraph.add((EX.Arsenal, EX.hasCaptain, EX.Saka))

print("Now let's check what we can infer from your knowledge graph...")
print("The more rules you cover, the better!")
myRDFSreasoner(myRDFSgraph)
save_graph(myRDFSgraph, "myRDFSgraph.ttl")



Now let's check what we can infer from your knowledge graph...
The more rules you cover, the better!
(rdfs 3)  http://example.org/football/Saka rdf:type http://example.org/football/Player
(rdfs 3)  http://example.org/football/MohamedSalah rdf:type http://example.org/football/Player
(rdfs 2)  http://example.org/football/Arsenal rdf:type http://example.org/football/FootballClub
(rdfs 2)  http://example.org/football/LiverpoolFC rdf:type http://example.org/football/FootballClub
(rdfs 2)  http://example.org/football/LiverpoolFC rdf:type http://example.org/football/FootballClub
(rdfs 2)  http://example.org/football/Arsenal rdf:type http://example.org/football/FootballClub
(rdfs 7)  http://example.org/football/LiverpoolFC http://example.org/football/hasPlayer http://example.org/football/MohamedSalah
(rdfs 7)  http://example.org/football/Arsenal http://example.org/football/hasPlayer http://example.org/football/Saka
(rdfs 2)  http://example.org/football/Arsenal rdf:type http://example.org/footb

## Task 4 (20 points) Compare local inferences with GraphDB results

Upload *myRDFSgraph.ttl* to GraphDB (check [the GraphDB tutorial](https://github.com/ucds-vu/knowledge-data-vu/blob/master/Tutorials/Preliminaries/tutorial-GraphDB.md) before starting to work with GraphDB).

Formulate two different SPARQL queries, and write Python code that executes these queries over your GraphDB SPARQL endpoint (see the example in Task 1).

**Each SPARQL query should return a _different type_ of inferred knowledge** (at least one triple that was not explicitly asserted in the graph).

Specify below, next to your query (using a comment '# ...'), which RDFS rule the GraphDB reasoner uses to infer this answer (rdfs2, rdfs3, rdfs5, rdfs7, rdfs9, rdfs11). 
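
To confirm that an answer is inferred rather than asserted, one option is to compare the endpoint's answers with the triples that were explicitly added to the local graph. The sketch below shows the idea with plain tuples. `only_inferred` is a hypothetical helper, and both inputs are hand-written stand-ins: `ASSERTED` copies a few triples from the Task 3 graph, and `ANSWERS` is what a reasoning endpoint might plausibly return for the pattern `?x rdf:type ex:FootballClub` (an assumption, not actual GraphDB output).

```python
# A few triples that Task 3 adds explicitly to myRDFSgraph.
ASSERTED = {
    ("ex:Arsenal", "rdf:type", "ex:FootballClub"),
    ("ex:LiverpoolFC", "rdf:type", "ex:FootballClub"),
    ("ex:YouthAcademy", "rdfs:subClassOf", "ex:FootballClub"),
    ("ex:ArsenalYouthAcademy", "rdf:type", "ex:YouthAcademy"),
    ("ex:LiverpoolYouthAcademy", "rdf:type", "ex:YouthAcademy"),
}

# Assumed answers for "?x rdf:type ex:FootballClub" with RDFS reasoning enabled.
ANSWERS = [
    ("ex:Arsenal", "rdf:type", "ex:FootballClub"),
    ("ex:LiverpoolFC", "rdf:type", "ex:FootballClub"),
    ("ex:ArsenalYouthAcademy", "rdf:type", "ex:FootballClub"),
    ("ex:LiverpoolYouthAcademy", "rdf:type", "ex:FootballClub"),
]


def only_inferred(answers, asserted):
    """Return the answers that are not among the explicitly asserted triples."""
    return sorted(set(answers) - set(asserted))


for triple in only_inferred(ANSWERS, ASSERTED):
    print(triple)
```

Only the two academy triples remain, which is the kind of answer rdfs9 produces from an `rdfs:subClassOf` statement.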

In [59]:
# Get your GraphDB repository URL (setup -> repositories -> repository url) and assign it to the variable 'myEndpoint' below. 
# It should be similar to this: 

myEndpoint = "http://127.0.0.1:7200/repositories/KnowledgeAndData"  # KnowledgeAndData is the name of the repository
sparql = SPARQLWrapper(myEndpoint)

In [60]:
# Query 1 - Specify which RDFS rule you are testing:

# See the example in Task 1 on how to query remote SPARQL endpoints

# rdfs9: ex:ArsenalYouthAcademy rdf:type ex:YouthAcademy and ex:YouthAcademy rdfs:subClassOf ex:FootballClub,
# so ex:ArsenalYouthAcademy rdf:type ex:FootballClub is inferred.


sparql.setQuery("""
PREFIX ex: <http://example.org/football/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?academy
WHERE {
  ?academy rdf:type ex:FootballClub .
  ?academy rdf:type ex:YouthAcademy .
}
""")

sparql.setReturnFormat(JSON)
results = sparql.query().convert()
for result in results["results"]["bindings"]:
    print(result["academy"]["value"])
   



http://example.org/football/ArsenalYouthAcademy
http://example.org/football/LiverpoolYouthAcademy


In [61]:
# Query 2 - Specify which RDFS rule you are testing:

# See the example in Task 1 on how to query remote SPARQL endpoints
# rdfs7: ex:hasCaptain rdfs:subPropertyOf ex:hasPlayer and ex:Arsenal ex:hasCaptain ex:Saka,
# so ex:Arsenal ex:hasPlayer ex:Saka is inferred.

sparql.setQuery("""
PREFIX ex: <http://example.org/football/>
SELECT ?club ?player WHERE {
    ?club ex:hasPlayer ?player .
}
""")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
for result in results["results"]["bindings"]:
    print(result["club"]["value"])
    print(result["player"]["value"])
   


http://example.org/football/Arsenal
http://example.org/football/Saka
http://example.org/football/LiverpoolFC
http://example.org/football/MohamedSalah


## Submitting the assignment

Please submit this notebook (.ipynb) once you're finished with the assignment. It is not necessary to submit the created turtle files.