Course Instructor: Bernd Neumayr, JKU

# UE04 - SPARQL Update and RDF Datasets

Complete the **8 tasks (1 point per task)** in the `3. SPARQL Update` sheet of `SemAI.jar` first and then transfer them to this notebook.

For each task include:
- A headline including the task number
- The task description 
- Your solution in executable form: your solutions for `SemAI.jar` will make use of the default graph. In this notebook, you have to transform your solutions according to the workaround exemplified in `V04_SPARQL_Update.ipynb`.
- After executing the update request, print a serialization of the dataset in TriG format.

**Task 9 (2 points)**  is to develop a nice visualization of RDF datasets using `visualize_graph_pyvis` from UE02 as a starting point. The requirements are as follows:
- Each named graph must be represented as an independent graph. This means, for example, that :Jane in :JanesGraph is a different node than :Jane in :BillsGraph. There are no edges between nodes in different graphs.
- It is not strictly necessary to draw a box around each named graph, as seen in the slides. The different named graphs should simply be visually distinguishable and not overlap.
- If not all nodes within a named graph are connected, make sure in the visualization that the named graph still forms a coherent visual unit in some way.
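
One way to approach the first and third requirement is to give every node an identifier that includes the name of its graph, and to attach all nodes of a graph to a single hub node. The following is a minimal, library-free sketch of that idea, not a finished solution. The function name `scoped_nodes_and_edges`, the quad tuples, and the `@` and `hub@` naming are assumptions made for illustration and are not part of `visualize_graph_pyvis`.

```python
def scoped_nodes_and_edges(quads):
    """Turn (subject, predicate, object, graph) tuples into graph-scoped nodes and edges."""
    nodes = {}   # node id -> display label
    edges = []   # (source id, target id, edge label)
    for s, p, o, g in quads:
        hub = f"hub@{g}"
        if hub not in nodes:
            nodes[hub] = g                     # one hub node per named graph
        for term in (s, o):
            node_id = f"{term}@{g}"            # the same term in another graph gets another id
            if node_id not in nodes:
                nodes[node_id] = term
                edges.append((hub, node_id, ""))   # unlabeled edge keeps the graph together
        edges.append((f"{s}@{g}", f"{o}@{g}", p))
    return nodes, edges


quads = [
    (":Jane", ":knows", ":Bill", ":JanesGraph"),
    (":Bill", ":knows", ":Jane", ":BillsGraph"),
    (":Tom", ":likes", ":Ann", ":BillsGraph"),   # not connected to :Bill or :Jane
]
nodes, edges = scoped_nodes_and_edges(quads)
```

Because every edge connects two identifiers with the same graph suffix, no edge can cross graph boundaries, and the hub edges keep `:Tom` and `:Ann` visually attached to `:BillsGraph`. The resulting lists could then be handed to whatever drawing code is used.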

## Preparations

In [1]:
# Install required packages (comment out the next line to avoid re-installing on every re-run)
!pip install -q rdflib


### Imports and Functions 

We re-use the `sparql_select` function, which returns the result of a SELECT query as a pandas DataFrame.

In [2]:
# Imports
import pandas as pd
import rdflib
from rdflib import Graph, Literal, RDF, URIRef, BNode, Namespace


# Convenient Functions
def sparql_select(graph,query,use_prefixes=True):
  results = graph.query(query)          # execute the query against the graph, resulting in a rdflib.plugins.sparql.processor.SPARQLResult
  rows = []                             # a list of dictionaries, as intermediate format to construct the pandas DataFrame
  for result in results:                # iterate over the result set of the query, a result is an instance of rdflib.query.ResultRow
    row = {}                            #     create a dictionary to hold a single row of the result
    for var in results.vars:            #     iterate over the variables of the SPARQLResult to add a dictionary entry for each variable
      if (isinstance(result[var],URIRef) and use_prefixes):
        row[var] = result[var].n3(graph.namespace_manager)   # use namespace prefixes to shorten URIs
      else:
        row[var] = result[var]                  
    rows.append(row)                    #     add the dictionary (row) to the list 
  # return a pandas DataFrame constructed from the list of dictionaries, with the variables from the result set as columns
  return pd.DataFrame(rows,columns=results.vars)


## Task 1

You start with an empty dataset. Insert statements into the default graph stating that :Peter is the author of :G1 and :Mary is the author of :G2.

In [3]:
dataset1 = """
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .
"""

task1 = """
INSERT DATA {
:G1 :author :Mary .
:G2 :author :Peter .
}
"""

g = rdflib.Graph().parse(format="turtle",data=dataset1)
g.update(task1)

print(g.serialize(format="turtle"))

@prefix : <http://example.org/> .

:G1 :author :Mary .

:G2 :author :Peter .




## Task 2

Write { :Mary :knows :Peter, :John, :Mary. } into the named graph :G1 and { :Peter :knows :Mary. :John :knows :Mary. } into the named graph :G2.

In [4]:
dataset2 = """
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .
@prefix xrdf: <urn:x-rdflib>  .

xrdf:default {
:G1  :author  :Mary .
:G2  :author  :Peter .
}
"""

task2 = """
Insert Data {
GRAPH :G1 {
:Mary :knows :Peter, :John, :Mary .
}
GRAPH :G2 {
:Peter :knows :Mary .
:John :knows :Mary .
}
}
"""

ds = rdflib.Dataset()
ds.parse(format="trig", data=dataset2)
ds.update(task2)

print(ds.serialize(format="trig"))

@prefix : <http://example.org/> .
@prefix ns1: <urn:> .

:G1 {
    :Mary :knows :John,
            :Mary,
            :Peter .
}

ns1:x-rdflibdefault {
    :G1 :author :Mary .

    :G2 :author :Peter .
}

:G2 {
    :John :knows :Mary .

    :Peter :knows :Mary .
}




## Task 3

Using INSERT-WHERE, query the :knows relationships from :G2 and insert their inverse :knownBy relationships into the default graph. Your update request must not contain: [Mary, Peter, John]

In [5]:
dataset3 = """
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .
@prefix xrdf: <urn:x-rdflib>  .

xrdf:default {
  :G1  :author  :Mary .
  :G2  :author  :Peter .
}
:G1 {
   :Mary :knows :Mary , :John , :Peter .
}
:G2 {
   :John  :knows  :Mary .
   :Peter :knows  :Mary .
}
"""

task3 = """
INSERT { GRAPH xrdf:default { ?p2 :knownBy ?p1 } }
WHERE {
GRAPH :G2
{
?p1 :knows ?p2
}
} 
"""

ds = rdflib.Dataset()
ds.parse(format="trig", data=dataset3)
ds.update(task3)

print(ds.serialize(format="trig"))

@prefix : <http://example.org/> .
@prefix ns1: <urn:> .

:G1 {
    :Mary :knows :John,
            :Mary,
            :Peter .
}

ns1:x-rdflibdefault {
    :G1 :author :Mary .

    :G2 :author :Peter .

    :Mary :knownBy :John,
            :Peter .
}

:G2 {
    :John :knows :Mary .

    :Peter :knows :Mary .
}
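
The effect of the INSERT-WHERE request can be traced by hand: the WHERE part produces one binding per :knows statement in :G2, and the INSERT part instantiates the template once per binding. The following is a minimal, library-free sketch of that two-step evaluation. The tuple representation and the name `"default"` for the default graph are assumptions made for illustration; this is not how rdflib stores data.

```python
# Quads as (subject, predicate, object, graph); "default" stands for the default graph.
dataset = {
    (":John", ":knows", ":Mary", ":G2"),
    (":Peter", ":knows", ":Mary", ":G2"),
    (":Mary", ":knows", ":John", ":G1"),
}

# WHERE: match ?p1 :knows ?p2 inside :G2 only
bindings = [(s, o) for s, p, o, g in dataset if g == ":G2" and p == ":knows"]

# INSERT: instantiate the template ?p2 :knownBy ?p1 in the default graph
dataset |= {(p2, ":knownBy", p1, "default") for p1, p2 in bindings}

inverse = sorted(q for q in dataset if q[1] == ":knownBy")
print(inverse)
```

The statement from :G1 is ignored because the pattern is restricted to :G2, which matches the output above: only :John and :Peter appear as objects of :knownBy.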




## Task 4

Using DELETE-WHERE, delete all :knownBy relationships from the default graph. Your update request must not contain: [Mary, Peter, John]

In [6]:
dataset4 = """
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .
@prefix xrdf: <urn:x-rdflib>  .

xrdf:default {
:G1  :author  :Mary .
:G2  :author  :Peter .
:Mary  :knownBy  :Peter , :John .
}
:G1 {
    :Mary  :knows  :Mary , :John , :Peter .
}
:G2 {
    :John  :knows  :Mary .
    :Peter :knows  :Mary .
}
"""

task4 = """
DELETE { GRAPH xrdf:default { ?x :knownBy ?y } }
WHERE {
GRAPH xrdf:default {
?x :knownBy ?y
}
}
"""

ds = rdflib.Dataset()
ds.parse(format="trig", data=dataset4)
ds.update(task4)

print(ds.serialize(format="trig"))

@prefix : <http://example.org/> .
@prefix ns1: <urn:> .

:G1 {
    :Mary :knows :John,
            :Mary,
            :Peter .
}

ns1:x-rdflibdefault {
    :G1 :author :Mary .

    :G2 :author :Peter .
}

:G2 {
    :John :knows :Mary .

    :Peter :knows :Mary .
}




## Task 5

Using INSERT-WHERE, determine for each named graph its number of statements with the property :knows and write these counts into the default graph. Your update request must not contain: [G1, G2]

In [7]:
dataset5 = """
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .
@prefix xrdf: <urn:x-rdflib>  .

xrdf:default {
:G1  :author  :Mary .
:G2  :author  :Peter .
}
:G1 {
   :Mary :knows :Mary , :John , :Peter .
}
:G2 {
   :John :knows :Mary .
   :Peter  :knows  :Mary .
}
"""

task5 = """
INSERT { GRAPH xrdf:default { ?g :knowsCount ?xcount } }
WHERE {
  SELECT ?g (COUNT(?x) AS ?xcount)
  WHERE {
    GRAPH ?g { ?x :knows ?y }
  }
  GROUP BY ?g
}
"""

ds = rdflib.Dataset()
ds.parse(format="trig", data=dataset5)
ds.update(task5)

print(ds.serialize(format="trig"))

@prefix : <http://example.org/> .
@prefix ns1: <urn:> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

:G2 {
    :John :knows :Mary .

    :Peter :knows :Mary .
}

:G1 {
    :Mary :knows :John,
            :Mary,
            :Peter .
}

ns1:x-rdflibdefault {
    :G1 :author :Mary ;
        :knowsCount 3 .

    :G2 :author :Peter ;
        :knowsCount 2 .
}




## Task 6

Using INSERT-WHERE, determine the number of named graphs and write it into the default graph. Your update request must not contain: [2]

In [8]:
# Workaround for rdflib: in SemAI.jar the request runs correctly against the default graph without the FILTER.
# Here, FILTER( contains(str(?g), "G") ) is needed because a further graph besides :G1 and :G2
# (presumably the workaround graph xrdf:default, which is itself a named graph) would otherwise be counted as well.
dataset6 = """
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .
@prefix xrdf: <urn:x-rdflib:default>  .

xrdf:default {
:G1 :author :Mary ;
    :knowsCount  3 .
:G2 :author :Peter ;
    :knowsCount  2 .
}
:G1 {
   :Mary  :knows  :Mary , :John , :Peter . }
:G2 {
   :John  :knows  :Mary .
   :Peter  :knows  :Mary . }
"""

task6 = """
INSERT { GRAPH xrdf:default { :ds :graphCount ?graphcount } }
WHERE {
Select (count(?g) as ?graphcount)
WHERE {
GRAPH ?g {}
FILTER( contains(str(?g), "G") )
}
}
"""

query6 = """
Select (count(?g) as ?graphcount)
WHERE {
GRAPH ?g {}
FILTER( contains(str(?g), "G") )
}
"""

ds = rdflib.Dataset()
ds.parse(format="trig", data=dataset6)
ds.update(task6)

print(ds.serialize(format="trig"))

# df = sparql_select(ds, query6)
# df

@prefix : <http://example.org/> .
@prefix ns1: <urn:x-rdflib:> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

:G2 {
    :John :knows :Mary .

    :Peter :knows :Mary .
}

:G1 {
    :Mary :knows :John,
            :Mary,
            :Peter .
}

ns1:defaultdefault {
    :G1 :author :Mary ;
        :knowsCount 3 .

    :G2 :author :Peter ;
        :knowsCount 2 .

    :ds :graphCount 2 .
}
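
Why the FILTER changes the count can be illustrated without rdflib. The sketch below assumes that the additional graph is the workaround graph, whose IRI appears as `ns1:defaultdefault` (that is, `urn:x-rdflib:defaultdefault`) in the output above. This is one plausible reading of the behavior and has not been verified against rdflib internals.

```python
# Hypothetical model: the names of all graphs that GRAPH ?g {} can bind ?g to.
named_graphs = [
    "http://example.org/G1",
    "http://example.org/G2",
    "urn:x-rdflib:defaultdefault",   # the workaround graph from the output above
]

unfiltered = len(named_graphs)
filtered = len([g for g in named_graphs if "G" in g])   # mirrors contains(str(?g), "G")
print(unfiltered, filtered)
```

Under this assumption the unfiltered count is 3 and the filtered count is 2. Note that the filter relies on the graph names containing an upper-case "G", so it is specific to this dataset.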




## Task 7

Using DELETE-INSERT-WHERE, move all metadata about named graphs (i.e. statements that have a named graph as subject) into the corresponding named graph. Your update request must not contain: [G1, G2]

In [9]:
dataset7 = """
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .
@prefix xrdf: <urn:x-rdflib>  .

xrdf:default {
:G1 :author :Mary ; :knowsCount  3 .
        :ds :graphCount  2 .
:G2 :author :Peter; :knowsCount  2 .
}
:G1 { :Mary   :knows  :Mary , :John , :Peter .}
:G2 { :John   :knows  :Mary . :Peter  :knows  :Mary .}
"""

task7 = """
DELETE { GRAPH xrdf:default { ?g ?prop ?val } }
INSERT { GRAPH ?g { ?g ?prop ?val } }
WHERE {
  SELECT ?g ?prop ?val
  WHERE {
    GRAPH ?g {}
    GRAPH xrdf:default { ?g ?prop ?val }
  }
}
"""

ds = rdflib.Dataset()
ds.parse(format="trig", data=dataset7)
ds.update(task7)

print(ds.serialize(format="trig"))

@prefix : <http://example.org/> .
@prefix ns1: <urn:> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

:G2 {
    :G2 :author :Peter ;
        :knowsCount 2 .

    :John :knows :Mary .

    :Peter :knows :Mary .
}

:G1 {
    :G1 :author :Mary ;
        :knowsCount 3 .

    :Mary :knows :John,
            :Mary,
            :Peter .
}

ns1:x-rdflibdefault {
    :ds :graphCount 2 .
}
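
In a DELETE-INSERT-WHERE request, the WHERE pattern is evaluated once against the dataset as it was before the update; the deletions and insertions are then applied using those bindings. The following library-free sketch mirrors that order for the move operation. The tuple representation, the reduced example data, and the name `"default"` are assumptions made for illustration.

```python
# Quads as (subject, predicate, object, graph); "default" stands for the default graph.
dataset = {
    (":G1", ":author", ":Mary", "default"),
    (":G1", ":knowsCount", 3, "default"),
    (":ds", ":graphCount", 2, "default"),
    (":Mary", ":knows", ":John", ":G1"),
}
named_graphs = {g for _, _, _, g in dataset if g != "default"}

# 1. WHERE: evaluate the pattern once, against the dataset before any change
bindings = [(s, p, o) for s, p, o, g in dataset if g == "default" and s in named_graphs]

# 2. DELETE: remove the matched statements from the default graph
dataset -= {(s, p, o, "default") for s, p, o in bindings}

# 3. INSERT: add them to the graph named by their subject
dataset |= {(s, p, o, s) for s, p, o in bindings}
```

The statement about `:ds` stays in the default graph because `:ds` is not the name of a graph, which matches the output above.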




## Task 8

Write into each named graph a statement saying that the author of the respective named graph knows :Susi, and update the knowsCount with the same update request. Your update request must not contain: [Peter, Mary]

In [10]:
dataset8 = """
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .
@prefix xrdf: <urn:x-rdflib>  .

xrdf:default {
:ds  :graphCount  2 .
}
:G1 { :Mary  :knows  :Mary , :John , :Peter .
		:G1  :author  :Mary ; :knowsCount  3 .}
:G2 { :John   :knows  :Mary . :G2 :author :Peter ; :knowsCount  2 .
		:Peter  :knows  :Mary .}
"""

task8 = """
DELETE { GRAPH ?g { ?g :knowsCount ?c } }
INSERT { GRAPH ?g { ?aut :knows :Susi } }
WHERE {
  SELECT ?g ?aut ?c
  WHERE {
    GRAPH ?g { ?g :author ?aut . ?g :knowsCount ?c }
  }
};
INSERT { GRAPH ?g2 { ?g2 :knowsCount ?xcount } }
WHERE {
  SELECT ?g2 (COUNT(?x) AS ?xcount)
  WHERE {
    GRAPH ?g2 { ?x :knows ?y }
  }
  GROUP BY ?g2
}
"""

ds = rdflib.Dataset()
ds.parse(format="trig", data=dataset8)
ds.update(task8)

print(ds.serialize(format="trig"))

@prefix : <http://example.org/> .
@prefix ns1: <urn:> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

:G2 {
    :G2 :author :Peter ;
        :knowsCount 3 .

    :John :knows :Mary .

    :Peter :knows :Mary,
            :Susi .
}

:G1 {
    :G1 :author :Mary ;
        :knowsCount 4 .

    :Mary :knows :John,
            :Mary,
            :Peter,
            :Susi .
}

ns1:x-rdflibdefault {
    :ds :graphCount 2 .
}
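
The request for this task consists of two operations separated by `;`. Operations in one request are executed in order, so the second operation already sees the :Susi statements added by the first. The following library-free sketch mirrors that sequence; the tuple representation is an assumption made for illustration, and the old knowsCount statements are left out for brevity.

```python
from collections import Counter

# Quads as (subject, predicate, object, graph), taken from the dataset of this task.
dataset = {
    (":G1", ":author", ":Mary", ":G1"),
    (":Mary", ":knows", ":Mary", ":G1"),
    (":Mary", ":knows", ":John", ":G1"),
    (":Mary", ":knows", ":Peter", ":G1"),
    (":G2", ":author", ":Peter", ":G2"),
    (":John", ":knows", ":Mary", ":G2"),
    (":Peter", ":knows", ":Mary", ":G2"),
}

# Operation 1: the author of each graph gets a :knows :Susi statement in that graph
authors = [(o, g) for s, p, o, g in dataset if p == ":author" and s == g]
dataset |= {(author, ":knows", ":Susi", g) for author, g in authors}

# Operation 2: runs afterwards, so the count already includes the new statements
knows_count = Counter(g for s, p, o, g in dataset if p == ":knows")
print(dict(knows_count))
```

The counts come out as 4 for :G1 and 3 for :G2, in line with the TriG output above.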


