Course Instructor: Bernd Neumayr, JKU

# UE04 - SPARQL Update and RDF Datasets

Complete the **8 tasks (1 point per task)** in the `3. SPARQL Update` sheet of `SemAI.jar` first and then transfer them to this notebook.

For each task include:
- A headline including the task number
- The task description 
- Your solution in executable form: your solutions for SemAI.jar make use of the default graph. In this notebook you have to transform them according to the workaround exemplified in V04_SPARQL_Update.ipynb
- After executing the update request, print a serialization of the dataset in TriG format.

**Task 9 (2 points)**  is to develop a nice visualization of RDF datasets using `visualize_graph_pyvis` from UE02 as a starting point. The requirements are as follows:
- Each named graph must be represented as an independent graph. This means, for example, that :Jane in :JanesGraph is a different node than :Jane in :BillsGraph. There are no edges between nodes in different graphs.
- It is not strictly necessary to draw a box around each named graph, as seen in the slides. The different named graphs should simply be visually distinguishable and not overlap.
- If not all nodes within a named graph are connected, make sure in the visualization that the named graph still forms a coherent visual unit in some way.
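
One way to approach the first and third requirement is to derive node identifiers from the pair (graph, term) and to attach every node of a graph to one anchor node for that graph. The following is a minimal sketch in plain Python, not the UE02 function itself: `build_graph_layout`, the `::` identifier scheme, the `__anchor__` node and the sample quads are assumptions made for illustration. In the notebook, the quads could come from iterating `ds.quads()`, and the resulting nodes and edges could then be handed to pyvis.

```python
from collections import defaultdict

def build_graph_layout(quads):
    """Group quads by named graph and give every node a graph-specific id."""
    layout = defaultdict(lambda: {"nodes": set(), "edges": []})
    for s, p, o, g in quads:
        # prefixing with the graph name makes :Jane in two graphs two distinct nodes
        src = f"{g}::{s}"
        dst = f"{g}::{o}"
        layout[g]["nodes"].update((src, dst))
        layout[g]["edges"].append((src, dst, p))
    for g, part in layout.items():
        anchor = f"{g}::__anchor__"   # hypothetical helper node, one per named graph
        # anchor edges keep unconnected nodes of the same graph in one visual cluster
        part["anchor_edges"] = [(anchor, n) for n in sorted(part["nodes"])]
        part["nodes"].add(anchor)
    return dict(layout)

# sample quads as (subject, predicate, object, graph), for illustration only
quads = [
    (":Jane", ":knows", ":Bill", ":JanesGraph"),
    (":Jane", ":knows", ":Bill", ":BillsGraph"),
    (":Tom", ":knows", ":Ann", ":BillsGraph"),
]
layout = build_graph_layout(quads)
for g, part in layout.items():
    print(g, sorted(part["nodes"]))
```

Because every edge is built from two identifiers carrying the same graph prefix, no edge can connect nodes of different graphs. The anchor edges could be drawn hidden or in a light color so that they only influence the layout.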

In [None]:
# Install required packages
!pip install -q rdflib     # comment out this line to avoid reinstalling on every re-run

# Imports
import pandas as pd
import rdflib
from rdflib import Graph, Literal, RDF, URIRef, BNode, Namespace


# Convenience functions
def sparql_select(graph, query, use_prefixes=True):
  """Run a SPARQL SELECT query against `graph` and return the result as a pandas DataFrame."""
  results = graph.query(query)          # execute the query, resulting in a rdflib.plugins.sparql.processor.SPARQLResult
  rows = []                             # a list of dictionaries, as intermediate format to construct the pandas DataFrame
  for result in results:                # each result is an instance of rdflib.query.ResultRow
    row = {}                            #     dictionary holding a single row of the result
    for var in results.vars:            #     one dictionary entry per variable of the SPARQLResult
      if isinstance(result[var], URIRef) and use_prefixes:
        row[var] = result[var].n3(graph.namespace_manager)   # use namespace prefixes to shorten URIs
      else:
        row[var] = result[var]
    rows.append(row)                    #     add the dictionary (row) to the list
  # the variables of the result set become the columns of the DataFrame
  return pd.DataFrame(rows, columns=results.vars)


### Task 1
You start with an empty dataset. Insert statements into the default graph stating that :Peter is the author of :G1 and :Mary is the author of :G2.

In [None]:
ds = rdflib.Dataset()

solution1Jar = """
INSERT DATA
{
 :G1 :author :Peter .
 :G2 :author :Mary .
}
"""

# Workaround: the named graph :main stands in for the default graph
ds.parse(format="trig", data="""
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix :    <http://example.org/>

:main {
 :G1 :author :Peter .
 :G2 :author :Mary .
}
""")
print(ds.serialize(format="trig"))

### Task 2
Write { :Mary :knows :Peter, :John, :Mary. } into the named graph :G1 and { :Peter :knows :Mary. :John :knows :Mary. } into the named graph :G2.



In [None]:
solution2Jar="""
INSERT DATA {
GRAPH :G1 {
    :Mary :knows :Mary , :John , :Peter .
  }
GRAPH :G2 {
    :Peter :knows :Mary. :John :knows :Mary.
  }
}
"""

ds.parse(format="trig", data="""
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix :    <http://example.org/>

GRAPH :G1 {
    :Mary :knows :Mary , :John , :Peter .
  }
GRAPH :G2 {
    :Peter :knows :Mary. :John :knows :Mary.
  }

""")
print(ds.serialize(format="trig"))

### Task 3
Using INSERT-WHERE, query the :knows relationships in :G2 and insert their inverse :knownBy relationships into the default graph. 
Your update request must not contain: [Mary, Peter, John]

In [None]:
solution3Jar="""
INSERT
{
  ?obj :knownBy ?subj .
}
WHERE
{
  GRAPH :G2 {
    ?subj :knows ?obj .
  }
}
"""

ds.update("""
INSERT
{
  GRAPH :main {
    ?obj :knownBy ?subj .
  }
}
WHERE
{
  GRAPH :G2 {
    ?subj :knows ?obj .
  }
}
""")

print(ds.serialize(format="trig"))

### Task 4
Using DELETE-WHERE, delete all :knownBy relationships from the default graph. 
Your update request must not contain: [Mary, Peter, John]

In [None]:
solution4Jar="""
DELETE{
  ?subj :knownBy ?obj
}
WHERE{
  ?subj :knownBy ?obj
}
"""

ds.update("""
DELETE{
  GRAPH :main{
    ?subj :knownBy ?obj
  }
}
WHERE{
  GRAPH :main{
    ?subj :knownBy ?obj
  }
}
""")

print(ds.serialize(format="trig"))

### Task 5
Using INSERT-WHERE, determine for each named graph the number of its statements with the property :knows and write these counts into the default graph. 
Your update request must not contain: [G1, G2]

In [None]:
solution5Jar="""
INSERT {
  ?g :knowsCount ?nr
}
WHERE {
  SELECT ?g (COUNT(?obj) AS ?nr)
  WHERE {
    GRAPH ?g {
      ?subj :knows ?obj .
    }
  }
  GROUP BY ?g
}
"""

ds.update("""
INSERT{
  GRAPH :main{
    ?g :knowsCount ?nr
  }
}
WHERE{
  SELECT ?g (COUNT(?obj) AS ?nr) 
   WHERE { 
     GRAPH ?g {
       ?subj :knows ?obj.
     }
   }
  GROUP BY ?g
}
""")

print(ds.serialize(format="trig"))
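
As a cross-check for the serialization printed above, the grouping performed by the subquery can be mirrored in plain Python. This is only an illustrative sketch: the tuples restate the `:knows` statements written in Task 2, and the variable names `quads` and `knows_count` are made up for this example.

```python
from collections import Counter

# (graph, subject, predicate, object) tuples restating the Task 2 data
quads = [
    (":G1", ":Mary", ":knows", ":Mary"),
    (":G1", ":Mary", ":knows", ":John"),
    (":G1", ":Mary", ":knows", ":Peter"),
    (":G2", ":Peter", ":knows", ":Mary"),
    (":G2", ":John", ":knows", ":Mary"),
]

# corresponds to GROUP BY ?g with COUNT(?obj), restricted to the :knows property
knows_count = Counter(g for g, s, p, o in quads if p == ":knows")
print(dict(knows_count))   # {':G1': 3, ':G2': 2}
```

The `:knowsCount` values in the TriG output should match these numbers.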

### Task 6
Using INSERT-WHERE, determine the number of named graphs and write it into the default graph. 
Your update request must not contain: [2]

In [None]:
solution6Jar="""
INSERT {
  :ds :graphCount ?count
}
WHERE {
  SELECT (COUNT(DISTINCT ?graph) AS ?count)
  WHERE {
    GRAPH ?graph {
      ?subj ?pred ?obj
    }
  }
}
"""

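# In this notebook :main stands in for the default graph, so it is excluded from the count
ds.update("""
INSERT{
  GRAPH :main{
    :ds :graphCount ?count
  }
}
WHERE{
  SELECT (COUNT(DISTINCT ?graph) AS ?count)
   WHERE {
    GRAPH ?graph {
      ?subj ?pred ?obj
    }
    FILTER(?graph != :main)
  } 
}
""")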

print(ds.serialize(format="trig"))

### Task 7
Using DELETE-INSERT-WHERE, move all metadata about named graphs (that is, statements that have a named graph as their subject) into the corresponding named graph. 
Your update request must not contain: [G1, G2]

In [None]:
solution7Jar="""
DELETE {
  ?graph ?pred ?obj .
}
INSERT {
  GRAPH ?graph {
    ?graph ?pred ?obj .
  }
}
WHERE {
  SELECT ?graph ?pred ?obj
  WHERE {
    ?graph ?pred ?obj .
    GRAPH ?graph {}
  }
}
"""

ds.update("""
DELETE{ 
  GRAPH :main{
    ?graph ?pred ?obj .
  } 
}
INSERT {
  GRAPH ?graph {
    ?graph ?pred ?obj .
  }
}
WHERE {
   SELECT ?graph ?pred ?obj 
    WHERE {
     GRAPH ?graph {}
     GRAPH :main{
      ?graph ?pred ?obj .
     } 
    }
}
""")

print(ds.serialize(format="trig"))
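
The effect of a DELETE-INSERT-WHERE request can be pictured as three steps: evaluate the WHERE part once, then remove the matched statements, then add the new ones. The sketch below imitates this for the metadata move in plain Python. The dictionary-based `dataset` model, the key `"default"` and the abbreviated sample statements are assumptions for illustration and do not reflect how rdflib stores a dataset.

```python
# dataset modelled as {graph name: set of (s, p, o)}; "default" stands for the default graph
dataset = {
    "default": {(":G1", ":knowsCount", 3), (":G2", ":knowsCount", 2), (":ds", ":graphCount", 2)},
    ":G1": {(":Mary", ":knows", ":John")},
    ":G2": {(":Peter", ":knows", ":Mary")},
}

# WHERE: default-graph statements whose subject is the name of an existing named graph
matches = [t for t in dataset["default"] if t[0] in dataset and t[0] != "default"]

# DELETE and INSERT are applied only after the WHERE part has been fully evaluated
for s, p, o in matches:
    dataset["default"].discard((s, p, o))   # DELETE from the default graph
    dataset[s].add((s, p, o))               # INSERT into the graph named by the subject

print(dataset["default"])
```

The statement about `:ds` stays in the default graph because `:ds` does not name a graph in the dataset.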

### Task 8
Write into each named graph a statement saying that the author of the respective named graph knows :Susi, and update the knowsCount in the same update request. 
Your update request must not contain: [Peter, Mary]

In [None]:
solution8Jar="""
DELETE {
  GRAPH ?graph {
    ?graph :knowsCount ?oldCount .
  }
}
INSERT {
  GRAPH ?graph {
    ?a :knows :Susi .
    ?graph :knowsCount ?newCount .
  }
}
WHERE {
  GRAPH ?graph {
    ?graph :author ?a ;
           :knowsCount ?oldCount .
    BIND(?oldCount+1 AS ?newCount)
  }
}
"""

ds.update("""
DELETE{ 
   GRAPH ?graph{
     ?graph :knowsCount ?oldCount . 
   }
}
INSERT{
   GRAPH ?graph{
    ?a :knows :Susi .
    ?graph :knowsCount ?newCount .
   }
}
WHERE {
   GRAPH ?graph {
    ?graph :author ?a ;
    :knowsCount ?oldCount .
    BIND(?oldCount+1 AS ?newCount)
   }
}
""")

print(ds.serialize(format="trig"))