# SPARQL: The RDF Query Language

In this exercise we look at SPARQL, the query language for RDF, in practice. We use a small RDF graph describing part of the Pink Floyd discography and run several queries against it. At the end, you will write your own RDF and a few SPARQL queries for it.

In [None]:
!pip install rdflib

In [7]:
import pandas as pd
from io import BytesIO, StringIO
from rdflib import Graph
from rdflib.plugins.sparql.results.csvresults import CSVResultSerializer
from IPython.display import display

rdf = """
@prefix ex: <http://example.org#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

[] a ex:Album ;
   ex:title "The Dark Side of the Moon"^^xsd:string ;
   ex:label "Harvest, EMI"@en ;
   ex:released [ 
     ex:day "16"^^xsd:int ;
     ex:month "03"^^xsd:int ;
     ex:year "1973"^^xsd:int 
   ] .
   
[] a ex:Album ;
   ex:title "The Wall" ;
   ex:label "Harvest, EMI" ;
   ex:released [ 
     ex:day 30 ;
     ex:month "11"^^xsd:string ;
     ex:year "1979"^^xsd:int 
   ] .

[] a ex:Single ;
   ex:title "What God Wants, Part 1"^^xsd:string ;
   ex:author [
     ex:firstname "Roger" ;
     ex:lastname "Waters"
   ] ;
   ex:released [ 
     ex:year "1992"^^xsd:int 
   ] .
"""

g = Graph()

r = g.parse(data=rdf, format='turtle')

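def query(q):
    """Run the SPARQL query `q` against the global graph `g` and display the result as a DataFrame."""
    # Serialize the result to CSV in memory, then let pandas parse it.
    serializer = CSVResultSerializer(g.query(q))
    output = BytesIO()
    serializer.serialize(output)
    display(pd.read_csv(StringIO(output.getvalue().decode())))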

Now run the following queries and answer the questions.

In [8]:
query("""
SELECT ?a ?b WHERE { 
  ?a ex:title ?b
}
""")

Unnamed: 0,a,b
0,ub3bL23C1,"What God Wants, Part 1"
1,ub3bL5C1,The Dark Side of the Moon
2,ub3bL14C1,The Wall


In [9]:
query("""
SELECT ?work ?title WHERE { 
  ?work ex:title ?title
}
""")

Unnamed: 0,work,title
0,ub3bL23C1,"What God Wants, Part 1"
1,ub3bL5C1,The Dark Side of the Moon
2,ub3bL14C1,The Wall


In [12]:
query("""
SELECT ?title WHERE { 
  [] ex:title ?title
}
""")

# What is the difference from the previous example? Answer: 

Unnamed: 0,title
0,"What God Wants, Part 1"
1,The Dark Side of the Moon
2,The Wall


In [19]:
query("""
SELECT ?title WHERE { 
 ?work rdf:type ex:Album .
 ?work ex:title ?title
}
""")

# Why are there only two results? Answer: 

Unnamed: 0,title
0,The Dark Side of the Moon
1,The Wall


In [15]:
query("""
SELECT ?s ?p ?o WHERE { 
  ?s ?p ?o
}
""")

# What does this query return? Answer: 

Unnamed: 0,s,p,o
0,ub3bL23C1,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://example.org#Single
1,ub3bL5C1,http://example.org#title,The Dark Side of the Moon
2,ub3bL14C1,http://example.org#title,The Wall
3,ub3bL14C1,http://example.org#released,ub3bL17C16
4,ub3bL8C16,http://example.org#year,1973
5,ub3bL23C1,http://example.org#author,ub3bL25C14
6,ub3bL23C1,http://example.org#released,ub3bL29C16
7,ub3bL23C1,http://example.org#title,"What God Wants, Part 1"
8,ub3bL8C16,http://example.org#day,16
9,ub3bL14C1,http://example.org#label,"Harvest, EMI"


In [22]:
query("""
SELECT ?title ?year WHERE { 
  [] rdf:type ex:Album ; 
     ex:title ?title ;
     ex:released [ ex:year ?year ]
  FILTER (?year > 1973)
}
""")

Unnamed: 0,title,year
0,The Wall,1979


In [23]:
query("""
SELECT ?title ?year WHERE { 
  {
    [] rdf:type ex:Album ; 
       ex:title ?title ;
       ex:released [ ex:year ?year ]
    FILTER (?year > 1973)
  }
  UNION
  {
    [] rdf:type ex:Single ; 
       ex:title ?title ;
       ex:released [ ex:year ?year ]
    FILTER (?year <= 2000 )
  }
}
""")

Unnamed: 0,title,year
0,The Wall,1979
1,"What God Wants, Part 1",1992


In [26]:
query("""
SELECT ?title ?label WHERE { 
    ?work ex:title ?title .
    OPTIONAL { ?work ex:label ?label }
}
""")

# Why is the label of "What God Wants, Part 1" NaN? Answer: 

Unnamed: 0,title,label
0,"What God Wants, Part 1",
1,The Dark Side of the Moon,"Harvest, EMI"
2,The Wall,"Harvest, EMI"


In [27]:
query("""
SELECT ?title WHERE { 
  [] rdf:type ex:Album ;
     ex:title ?title ;
     ex:label ?label
  FILTER (LANG(?label) = "en")
}
""")

Unnamed: 0,title
0,The Dark Side of the Moon


In [28]:
query("""
SELECT ?title WHERE { 
  [] rdf:type ex:Album ;
     ex:title ?title ;
     ex:released [ ex:day ?day ]
  FILTER (?day > 15)
}
""")

Unnamed: 0,title
0,The Dark Side of the Moon
1,The Wall


In [29]:
query("""
SELECT ?title WHERE { 
  [] rdf:type ex:Album ;
     ex:title ?title ;
     ex:released [ ex:month ?month ]
  FILTER (DATATYPE(?month) = xsd:string)
}
""")

# Why does this query return only "The Wall"? Answer: 

Unnamed: 0,title
0,The Wall


Now write your own RDF and evaluate a few SPARQL queries against it.

In [None]:
rdf = """
... My RDF ...
"""

g = Graph()

r = g.parse(data=rdf, format='turtle')

In [None]:
query("""
... My SPARQL query ...
""")