Course Instructor: Bernd Neumayr, JKU

# UE04 - SPARQL Update and RDF Datasets

Complete the **8 tasks (1 point per task)** in the `3. SPARQL Update` sheet of `SemAI.jar` first and then transfer them to this notebook.

For each task include:
- A headline including the task number
- The task description 
- Your solution in executable form: your solutions for SemAI.jar will make use of the default graph. In this notebook you have to transform your solutions according to the workaround exemplified in V04_SPARQL_Update.ipynb
- After executing the update request, print a serialization of the dataset in TriG format.

**Task 9 (2 points)**  is to develop a nice visualization of RDF datasets using `visualize_graph_pyvis` from UE02 as a starting point. The requirements are as follows:
- Each named graph must be represented as an independent graph. This means, for example, that :Jane in :JanesGraph is a different node than :Jane in :BillsGraph. There are no edges between nodes in different graphs.
- It is not strictly necessary to draw a box around each named graph, as seen in the slides. The different named graphs should simply be visually distinguishable and not overlap.
- If not all nodes within a named graph are connected, make sure in the visualization that the named graph still forms a coherent visual unit in some way.
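One possible way to meet the three requirements above is sketched below. It is not the course's reference solution. The sketch assumes the dataset is available as (graph, subject, predicate, object) tuples of plain strings; `Quad`, `build_components` and the `#hub` naming are hypothetical helpers introduced only for illustration. Each node id is prefixed with its graph name, so `:Mary` in `:G1` and `:Mary` in `:G2` become different nodes, and a per-graph hub node keeps disconnected parts of one named graph together.

```python
from collections import namedtuple

# Hypothetical stand-in for the quads of an RDF dataset (plain strings).
Quad = namedtuple("Quad", "graph subject predicate obj")

def build_components(quads):
    """Return (nodes, edges) in which every node id is scoped to its graph."""
    nodes = {}   # node id -> attribute dict
    edges = []   # (source id, target id, attribute dict)
    for q in quads:
        hub = f"{q.graph}#hub"
        if hub not in nodes:
            # One hub per named graph, labelled with the graph name.
            nodes[hub] = {"label": q.graph, "group": q.graph, "shape": "box"}
        for term in (q.subject, q.obj):
            node_id = f"{q.graph}|{term}"
            if node_id not in nodes:
                nodes[node_id] = {"label": term, "group": q.graph}
                # Helper edge tying the node to its hub, so that the graph
                # stays one visual unit even if its triples are disconnected.
                edges.append((hub, node_id, {"label": "", "dashes": True}))
        edges.append((f"{q.graph}|{q.subject}", f"{q.graph}|{q.obj}",
                      {"label": q.predicate}))
    return nodes, edges

quads = [
    Quad(":G1", ":Mary", ":knows", ":John"),
    Quad(":G2", ":John", ":knows", ":Mary"),
    Quad(":G2", ":Peter", ":knows", ":Susi"),
]
nodes, edges = build_components(quads)
print(sorted(nodes))
```

The resulting dictionaries could then be passed to a pyvis `Network` through `add_node(node_id, **attributes)` and `add_edge(source, target, **attributes)`. The `group` attribute is meant to give each named graph its own color, which should be checked against the pyvis version in use.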

# Copied code from V04 notebook

### Imports and Functions 

We are re-using the sparql_select function. 

In [None]:
# Install required packages
# !pip install -q rdflib     
# !pip install -q rdflib pydot networkx pyvis  # comment to avoid re-install with every re-run


In [None]:
# Imports
import pandas as pd
import rdflib
from rdflib import Graph, Literal, RDF, URIRef, BNode, Namespace, Dataset
from rdflib.namespace import FOAF , XSD , RDFS 
from html import escape
import networkx as nx
from pyvis.network import Network
from IPython.display import display, HTML, Image
import os


# Convenient Functions
def sparql_select(graph,query,use_prefixes=True):
  results = graph.query(query)          # execute the query against the graph, resulting in a rdflib.plugins.sparql.processor.SPARQLResult
  rows = []                             # a list of dictionaries, as intermediate format to construct the pandas DataFrame
  for result in results:                # iterate over the result set of the query, a result is an instance of rdflib.query.ResultRow
    row = {}                            #     create a dictionary to hold a single row of the result
    for var in results.vars:            #     iterate over the variables of the SPARQLResult to add a dictionary entry for each variable
      if (isinstance(result[var],URIRef) and use_prefixes):
        row[var] = result[var].n3(graph.namespace_manager)   # use namespace prefixes to shorten URIs
      else:
        row[var] = result[var]                  
    rows.append(row)                    #     add the dictionary (row) to the list 
  return pd.DataFrame(rows,columns=results.vars)        
                                        # return a pandas DataFrame constructed from the list of dictionaries, with the variables from the result set as columns      


# Task 1

You start with an empty dataset. Insert statements into the default graph saying that :Peter is the author of :G1 and :Mary is the author of :G2.


```
Expected Dataset:
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .

:G1  :author  :Mary .
:G2  :author  :Peter .
```



### Create rdflib Dataset

In [None]:
ds = rdflib.Dataset()

ds.parse(format="trig", data="""
  @prefix :    <http://example.org/> .
  @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .
  @prefix xrdf: <urn:x-rdflib:> . 

  GRAPH xrdf:default {}
""")

print(ds.serialize(format="trig"))

### Actual Request of Task 1

In [None]:
ds.update("""
INSERT DATA { 
  GRAPH xrdf:default {
	  :G1  :author  :Mary .
	  :G2  :author  :Peter .
  }
}
""")

print(ds.serialize(format="trig"))

# Task 2 

Write { :Mary :knows :Peter, :John, :Mary. } into the named graph :G1 and { :Peter :knows :Mary. :John :knows :Mary. } into the named graph :G2.


```
Expected Dataset:
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .

:G1  :author  :Mary .
:G2  :author  :Peter .
:G1 {
   :Mary :knows :Mary , :John , :Peter .
}
:G2 {
   :John  :knows  :Mary .
   :Peter :knows  :Mary .
}
```

In [None]:
ds.update("""
INSERT DATA {
	GRAPH :G1 {
		:Mary :knows :Peter, :John, :Mary .
	}

	GRAPH :G2 {
		:Peter :knows :Mary. :John :knows :Mary .
	}
}
""")

print(ds.serialize(format="trig"))

# Task 3

Using INSERT-WHERE, query the :knows relationships from :G2 and insert their inverse :knownBy relationships into the default graph.

Your update request must not contain: [Mary, Peter, John]



```
Expected Dataset:
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .

:G1  :author  :Mary .
:G2  :author  :Peter .
:Mary  :knownBy  :Peter , :John .
:G1 {
    :Mary  :knows  :Mary , :John , :Peter .
}
:G2 {
    :John  :knows  :Mary .
    :Peter :knows  :Mary .
}
```



In [None]:
ds.update("""
INSERT {
	GRAPH xrdf:default {
    ?p2 :knownBy ?p1 .
  }
}
WHERE {
	GRAPH :G2 {
		?p1 :knows ?p2 .
	}
}
""")

print(ds.serialize(format="trig"))

# Task 4

Using DELETE-WHERE, delete all :knownBy relationships from the default graph.

Your update request must not contain: [Mary, Peter, John]



```
Expected Dataset:
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .

:G1  :author  :Mary .
:G2  :author  :Peter .
:G1 {
   :Mary :knows :Mary , :John , :Peter .
}
:G2 {
   :John :knows :Mary .
   :Peter  :knows  :Mary .
}
```



In [None]:
ds.update("""
DELETE WHERE{
	GRAPH xrdf:default {
    ?p2 :knownBy ?p1 .
  }
}
""")

print(ds.serialize(format="trig"))

# Task 5

Using INSERT-WHERE, determine for each named graph its number of statements with the property :knows and write these counts into the default graph.

Your update request must not contain: [G1, G2]



```
Expected Dataset:
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .

:G1 :author :Mary ;
    :knowsCount  3 .
:G2 :author :Peter ;
    :knowsCount  2 .
:G1 {
   :Mary  :knows  :Mary , :John , :Peter . }
:G2 {
   :John  :knows  :Mary .
   :Peter  :knows  :Mary . 
}
```



In [None]:
ds.update("""
INSERT {
	GRAPH xrdf:default {
    ?g :knowsCount ?amount .
  }
}
WHERE {
		SELECT ?g (COUNT(?px) AS ?amount)
		WHERE {
			GRAPH ?g {
          ?p1 :knows ?px .
			} 
		} 
		GROUP BY ?g
}
""")

print(ds.serialize(format="trig"))

# Task 6 

Using INSERT-WHERE, determine the number of named graphs and write it into the default graph.

Your update request must not contain: [2]



```
Expected Dataset:
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .

:G1 :author :Mary ; :knowsCount  3 .
:ds :graphCount  2 .
:G2 :author :Peter; :knowsCount  2 .
:G1 { 
	:Mary   :knows  :Mary , :John , :Peter .
}
:G2 { 
	:John   :knows  :Mary . 
	:Peter  :knows  :Mary .
}
```



In this example, "graphCount" evaluates to 3 because the default graph is counted as well.

In [None]:
ds.update("""
INSERT {
	GRAPH xrdf:default {
    :ds :graphCount ?count
  }
}
WHERE {
	SELECT (COUNT (?g) AS ?count)
	WHERE {
		GRAPH ?g {}
	}
}
""")

print(ds.serialize(format="trig"))

# Task 7 

Using DELETE-INSERT-WHERE, move all metadata about named graphs (i.e. statements that have a named graph as subject) into the corresponding named graph.

Your update request must not contain: [G1, G2]



```
Expected Dataset:
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .

:ds  :graphCount  2 .
:G1 { 
	:Mary  :knows  :Mary , :John , :Peter .
	:G1  :author  :Mary ; :knowsCount  3 .
}
:G2 { 
	:John   :knows  :Mary . 
	:G2 :author :Peter ; :knowsCount  2 .
	:Peter  :knows  :Mary .
}
```



In [None]:
ds.update("""
DELETE {
	GRAPH xrdf:default {
    ?g ?p ?o .
  }
}
INSERT {
	GRAPH ?g {
		?g ?p ?o .
	}
}
WHERE{
	GRAPH xrdf:default {
    ?g ?p ?o .
  }
	GRAPH ?g { }
}
""")

print(ds.serialize(format="trig"))

# Task 8 

Write into each named graph a statement saying that the author of the respective named graph knows :Susi, and update the knowsCount with the same update request.

Your update request must not contain: [Peter, Mary]



```
Expected Dataset:
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .

:ds     :graphCount  2 .
:G1 { 
	:Mary   :knows  :Susi , :Mary , :John , :Peter .
	:G1     :author      :Mary ; 	:knowsCount  4 .
}
:G2 {
	:John   :knows  :Mary .
	:G2     :author      :Peter ; :knowsCount  3 .
	:Peter  :knows  :Susi , :Mary .
}
```

## Hint:

Two solutions were developed here because the task description was ambiguous.

The task does not say that the update statement must not contain "+1", and it only says "update the knowsCount", so I would assume this is acceptable (there is also an example of this in the slides, and there is no example in the slides where an updated graph has to be re-read and aggregated again).

However, since I was not sure whether this is really acceptable, or whether the point of the task was to re-read the updated graph (and thus to pack two statements into one request), I also wrote that version. It is much longer and contains "DELETE-INSERT-WHERE + INSERT-WHERE" in one request, separated by a semicolon, instead of only "DELETE-INSERT-WHERE".
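The difference between the two strategies can be illustrated with a small sketch in plain Python. It is a simplified model rather than rdflib code: a named graph is assumed to be a `set` of string triples, and the function names are hypothetical. Because a graph is a set, inserting a triple that already exists changes nothing, so "+1" only matches a fresh count if the author did not know :Susi before.

```python
def add_susi_increment(graph, author, stored_count):
    """Strategy 1: insert the triple and add 1 to the stored count."""
    graph.add((author, ":knows", ":Susi"))
    return stored_count + 1

def add_susi_recount(graph, author):
    """Strategy 2: insert the triple, then count the :knows triples again."""
    graph.add((author, ":knows", ":Susi"))
    return sum(1 for (_, p, _) in graph if p == ":knows")

# :G2 as in the task (only the :knows triples are modelled here).
g2 = {(":John", ":knows", ":Mary"), (":Peter", ":knows", ":Mary")}
print(add_susi_increment(set(g2), ":Peter", 2))   # 3
print(add_susi_recount(set(g2), ":Peter"))        # 3

# Hypothetical variant in which :Peter already knows :Susi.
g2_known = g2 | {(":Peter", ":knows", ":Susi")}
print(add_susi_increment(set(g2_known), ":Peter", 3))  # 4, one too many
print(add_susi_recount(set(g2_known), ":Peter"))       # 3
```

For the data in this exercise both strategies give the same result. The recount is simply more robust if the request is run again or the data changes.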

## Version with +1


In [None]:
ds.update("""
DELETE {
	GRAPH ?g {
		?g :knowsCount ?knowsCount_old .
	}
}
INSERT {
	GRAPH ?g {
		?p :knows :Susi .
		?g :knowsCount ?knowsCount_new .
	}
}
WHERE {
	GRAPH ?g {
		?p1 :knows ?px .
		?g :author ?p .
		?g :knowsCount ?knowsCount_old
		BIND (?knowsCount_old + 1 AS ?knowsCount_new)
	}	
}
""")

print(ds.serialize(format="trig"))

## Version with 2 statements in 1 request

In [None]:
#### all previous steps once more

In [None]:
ds2 = rdflib.Dataset()

ds2.parse(format="trig", data="""
  @prefix :    <http://example.org/> .
  @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .
  @prefix xrdf: <urn:x-rdflib:> . 

  GRAPH xrdf:default {
      :ds :graphCount 3 .
  }

  :G2 {
      :G2 :author :Peter ;
          :knowsCount 2 .

      :John :knows :Mary .

      :Peter :knows :Mary .
  }

  :G1 {
      :G1 :author :Mary ;
          :knowsCount 3 .

      :Mary :knows :John,
              :Mary,
              :Peter .
  }
""")

print(ds2.serialize(format="trig"))



In [None]:
ds2.update("""
DELETE {
	GRAPH ?g {
		?g :knowsCount ?knowsCount_old .
	}
}
INSERT {
	GRAPH ?g {
		?p :knows :Susi .
	}
}
WHERE {
	GRAPH ?g {
		?p1 :knows ?px .
		?g :author ?p .
		?g :knowsCount ?knowsCount_old .

	}
};

INSERT { 
	GRAPH ?g {
		?g :knowsCount ?knowsCount_new .
	}	
}
WHERE {
	SELECT ?g (COUNT (?px) AS ?knowsCount_new)
	WHERE {
		GRAPH ?g {
			?p1 :knows ?px .
		}
	}
	GROUP BY ?g
}
""")

print(ds2.serialize(format="trig"))

# Task 9 

Here, I took the improved code from the UE02_solution notebook.

In [None]:
# Acknowledgment: Some parts of this solution have been taken from UE02_SuZa

def visualize_graph_pyvis(g, base=None):

    def get_label(rdfterm):
      label = rdfterm.n3(g.namespace_manager)
      if base: 
        label = label.replace(base,"")
      label = label[:12] + '...'+  label[-12:] if len(label)> 25 else label
      return label
    
    nx_graph = nx.MultiDiGraph()                  # Create the NetworkX graph

    for s, p, o in g:
      nx_graph.add_edge(s, o, label=p)

    # Create a PyVis network graph
    pyvis_graph = Network(directed=True, notebook=True, cdn_resources='in_line',bgcolor="#EEEEEE")
    pyvis_graph.from_nx(nx_graph)
    pyvis_graph.set_edge_smooth('dynamic')

    # Customize the node appearance
    for node in pyvis_graph.nodes:
        node["shape"] = "ellipse"
        node["color"] = {"border": "black", "background": "white", "highlight": {"border": "black", "background": "#eeeeee"}}
        node["size"] = 10
        node["font"] = {"size": 10}
        node["label"] = get_label(node["id"])
        if(isinstance(node["id"],Literal)):
          node["shape"] = "box"
        if(isinstance(node["id"],BNode)):
          node["label"] = ""

    # Customize the edge appearance
    for edge in pyvis_graph.edges:
        edge["width"] = 0.5
        edge["font"] = {"size": 8, "align": "middle"}
        edge["arrows"] = "to"
        edge["label"] = get_label(edge["label"])

    display(HTML(pyvis_graph.generate_html(local=True,notebook=True)))    

# visualize_graph_pyvis(g_primer,base=base_primer)

Temporary test code

In [None]:
gdef = Graph()
gdef.parse(format="turtle",data="""
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .

:ds :graphCount 3 .
""")

print("default")
visualize_graph_pyvis(gdef)

In [None]:
g1 = Graph()
g1.parse(format="turtle",data="""
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .

:G1 :author :Mary ;
    :knowsCount 4 .

:Mary :knows :John,
        :Mary,
        :Peter,
        :Susi .
""")

print("G1")
visualize_graph_pyvis(g1)

In [None]:
g2 = Graph()
g2.parse(format="turtle",data="""
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .

:G2 :author :Peter ;
    :knowsCount 3 .

:John :knows :Mary .

:Peter :knows :Mary,
        :Susi .
""")

print("G2")
visualize_graph_pyvis(g2)

Another attempt: here, unfortunately, the individual graphs are only displayed for brief moments instead of simply being listed one below the other. Three windows are still printed, though.
I gave up after several hours of trying.
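A possible explanation, stated here as an assumption and not verified against the pyvis source, is that every generated HTML document uses the same element id for its drawing area, so several documents embedded in one notebook page compete for the same container. A common workaround is to isolate each document in its own `<iframe>` through the `srcdoc` attribute. The helper `as_iframe` below is hypothetical and uses only the standard library; the `page` string merely stands in for the output of `generate_html`.

```python
from html import escape

def as_iframe(html_doc, height=400):
    """Wrap a complete HTML document in an isolated iframe via srcdoc."""
    return (
        f'<iframe srcdoc="{escape(html_doc, quote=True)}" '
        f'width="100%" height="{height}" '
        f'style="border:1px solid #ccc"></iframe>'
    )

# Stand-in for the HTML that a pyvis network would generate.
page = '<html><body><div id="mynetwork"></div></body></html>'
snippet = as_iframe(page, height=300)
print(snippet)
```

In the notebook, each graph could then be shown with something like `display(HTML(as_iframe(gg1.generate_html(notebook=True))))`, one call per named graph, so that the three visualizations appear one below the other.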

In [None]:
def visualize_graph_pyvis2(g, base=None):

    def get_label(rdfterm):
      label = rdfterm.n3(g.namespace_manager)
      if base: 
        label = label.replace(base,"")
      label = label[:12] + '...'+  label[-12:] if len(label)> 25 else label
      return label
    
    nx_graph = nx.MultiDiGraph()                  # Create the NetworkX graph

    for s, p, o in g:
      nx_graph.add_edge(s, o, label=p)

    # Create a PyVis network graph
    pyvis_graph = Network(directed=True, notebook=True, cdn_resources='in_line',bgcolor="#EEEEEE")
    pyvis_graph.from_nx(nx_graph)
    pyvis_graph.set_edge_smooth('dynamic')

    # Customize the node appearance
    for node in pyvis_graph.nodes:
        node["shape"] = "ellipse"
        node["color"] = {"border": "black", "background": "white", "highlight": {"border": "black", "background": "#eeeeee"}}
        node["size"] = 10
        node["font"] = {"size": 10}
        node["label"] = get_label(node["id"])
        if(isinstance(node["id"],Literal)):
          node["shape"] = "box"
        if(isinstance(node["id"],BNode)):
          node["label"] = ""

    # Customize the edge appearance
    for edge in pyvis_graph.edges:
        edge["width"] = 0.5
        edge["font"] = {"size": 8, "align": "middle"}
        edge["arrows"] = "to"
        edge["label"] = get_label(edge["label"])

    # display(HTML(pyvis_graph.generate_html(local=True,notebook=True)))    
    return pyvis_graph



In [None]:
gdef = Graph()
gdef.parse(format="turtle",data="""
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .

:ds :graphCount 3 .
""")

g1 = Graph()
g1.parse(format="turtle",data="""
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .

:G1 :author :Mary ;
    :knowsCount 4 .

:Mary :knows :John,
        :Mary,
        :Peter,
        :Susi .
""")

g2 = Graph()
g2.parse(format="turtle",data="""
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  .

:G2 :author :Peter ;
    :knowsCount 3 .

:John :knows :Mary .

:Peter :knows :Mary,
        :Susi .
""")

gdf = visualize_graph_pyvis2(gdef)
gg1 = visualize_graph_pyvis2(g1)
gg2 = visualize_graph_pyvis2(g2)

# display() returns None, so wrapping it in print() only prints "None"
print("default")
display(HTML(gdf.generate_html(local=True,notebook=True)))
print("G1")
display(HTML(gg1.generate_html(local=True,notebook=True)))
print("G2")
display(HTML(gg2.generate_html(local=True,notebook=True)))