# The Semantic Web Lab – SPARQL

## Session objectives:

- Use of SPARQL to query remote semantic endpoints

- Practice of the SPARQL language

- Merging of RDF data from multiple sources

- Storage of RDF data

- Visualisation of SPARQL queries

## 1.  Introduction

In this lab we will use Python to develop simple programs that access, query, and manipulate RDF semantic data.
A quick Python reference is available here:

- Basic tutorial: http://docs.python.org/tutorial/ 

- Instruction videos: http://www.python.org/doc/av/

A Python program is run by invoking `python` followed by the name of the `.py` file. In this lab we will mostly work interactively in the Python shell.

The ITL machines are already prepared to run this lab sheet without further setup. If you want to work on your own computer, you will need to install Python and the following additional Python libraries:

- sparql-wrapper http://sparql-wrapper.sourceforge.net/ 

- rdflib http://code.google.com/p/rdflib/

- rdfextras http://code.google.com/p/rdfextras/ 

- pyparsing http://pyparsing.wikispaces.com/

- networkx http://networkx.lanl.gov/

All the required source code is provided in this sheet.

## 2. SPARQL Querying

In this lab, we will query a remote endpoint at DBpedia, a community-created data source that aims to extract linked open data from Wikipedia. To write the queries we will use Python's triple-quoted strings:
```
'''Python way
of
defining
strings
'''
```
This lets us define multi-line strings without any escape characters.
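
As a small illustration of this point, the sketch below stores a SPARQL query in a triple-quoted string and compares it with the same text written using explicit `\n` escapes. The variable names `query` and `escaped` are placeholders chosen for this example.

```python
# A triple-quoted string keeps its line breaks exactly as typed,
# so a SPARQL query can be laid out over several lines.
query = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label
WHERE { <http://dbpedia.org/resource/Asturias> rdfs:label ?label }
"""

# The same text written as ordinary strings needs explicit \n escapes.
escaped = (
    "\nPREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
    "SELECT ?label\n"
    "WHERE { <http://dbpedia.org/resource/Asturias> rdfs:label ?label }\n"
)

# Both spellings produce the same string.
print(query == escaped)
```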

We will use an additional library, SPARQLWrapper, to perform the queries. Using this library, execute the following queries to find out various things about the autonomous community of Asturias (Principality of Asturias, http://en.wikipedia.org/wiki/Asturias) in Spain. The queries increase in complexity.

### 1) Simple SELECT Query:

simple SPARQL query example (from the SPARQLWrapper web page):

In [1]:
from SPARQLWrapper import SPARQLWrapper, JSON

wrap the dbpedia SPARQL end-point:

In [2]:
endpoint = SPARQLWrapper("http://dbpedia.org/sparql")

set the query string:

In [3]:
endpoint.setQuery("""
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpr: <http://dbpedia.org/resource/>
SELECT ?label
WHERE { dbpr:Asturias rdfs:label ?label }
""")

select the return format (e.g. XML, JSON etc...):

In [4]:
endpoint.setReturnFormat(JSON)

execute the query and convert into Python objects:

Note: The JSON returned by the SPARQL endpoint is converted to nested Python dictionaries, so additional parsing is not required.

In [5]:
results = endpoint.query().convert()

interpret the results:

In [6]:
for res in results["results"]["bindings"]:
    print(res['label']['value'])

Asturias
منطقة أستورياس
Asturien
Asturias
Asturies
Asturie
アストゥリアス州
Asturië (regio)
Asturia
Astúrias
Астурия
阿斯图里亚斯
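
To make the note about nested dictionaries concrete, here is a minimal sketch that parses a hand-written response in the SPARQL JSON results layout with the standard `json` module. The two bindings in `sample` are illustrative only, not a real DBpedia response; with SPARQLWrapper, `convert()` hands back the same kind of structure for a SELECT query.

```python
import json

# Hand-written sample in the SPARQL JSON results layout (illustrative data,
# not an actual DBpedia response).
sample = """
{
  "head": {"vars": ["label"]},
  "results": {
    "bindings": [
      {"label": {"type": "literal", "xml:lang": "en", "value": "Asturias"}},
      {"label": {"type": "literal", "xml:lang": "de", "value": "Asturien"}}
    ]
  }
}
"""

results = json.loads(sample)

# "head" lists the variables named in the SELECT clause.
variables = results["head"]["vars"]

# Each binding maps a variable name to a dictionary describing its value.
labels = [row["label"]["value"] for row in results["results"]["bindings"]]

print(variables)
print(labels)
```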


### 2) SELECT Query with FILTER constraints: 

This query keeps only the English and Spanish labels and filters out all the others.

In [7]:
from SPARQLWrapper import SPARQLWrapper, JSON
endpoint = SPARQLWrapper("http://dbpedia.org/sparql")

endpoint.setQuery("""
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpr: <http://dbpedia.org/resource/>
SELECT ?label
WHERE { dbpr:Asturias rdfs:label ?label .
FILTER( LANG(?label)="es" || LANG(?label)="en") . }
""")

endpoint.setReturnFormat(JSON)
results = endpoint.query().convert()
for res in results["results"]["bindings"]:
    print(res['label']['value'])

Asturias
Asturias
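
The two identical lines are the English and the Spanish label, which happen to be spelled the same. The sketch below does the same selection on the client side and prints the language tag next to each label, using the `xml:lang` key that the SPARQL JSON results format attaches to language-tagged literals. The `bindings` list is hand-written sample data and `keep_languages` is a hypothetical helper, not part of SPARQLWrapper.

```python
def keep_languages(bindings, var, languages):
    """Return (language, value) pairs for `var` whose language tag is wanted."""
    kept = []
    for row in bindings:
        term = row[var]
        lang = term.get("xml:lang")  # None for literals without a language tag
        if lang in languages:
            kept.append((lang, term["value"]))
    return kept

# Hand-written sample bindings (illustrative, not a real DBpedia response).
bindings = [
    {"label": {"type": "literal", "xml:lang": "en", "value": "Asturias"}},
    {"label": {"type": "literal", "xml:lang": "de", "value": "Asturien"}},
    {"label": {"type": "literal", "xml:lang": "es", "value": "Asturias"}},
]

selected = keep_languages(bindings, "label", ("es", "en"))
for lang, value in selected:
    print("%s: %s" % (lang, value))
```

Filtering inside the query, as in the cell above, is usually the better choice because fewer results have to travel over the network; the client-side version is mainly useful when the results have already been fetched.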


### 3) CONSTRUCT Query, obtain results in RDF/XML, and process them into an RDFLib graph

In [8]:
from SPARQLWrapper import SPARQLWrapper, JSON, XML
import rdflib
from rdflib.graph import Graph
from rdflib.namespace import Namespace
from rdflib import plugin

endpoint = SPARQLWrapper("http://dbpedia.org/sparql")

#### PART 1: Build a local RDF graph using a Construct query over DBpedia

create a construct query template.

NOTE: To keep it simple, the CONSTRUCT template is an exact copy of the WHERE pattern, so the local graph contains exactly the triples that the query matches.

In [None]:
construct_query = '''
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
CONSTRUCT {
    <%(uri)s> a ?type .
    <%(uri)s> ?property ?value .
    <%(uri)s> rdfs:label ?label .
    ?value rdfs:label ?vlabel .
    ?property rdfs:label ?plabel . }
WHERE {
    <%(uri)s> a ?type.
    <%(uri)s> ?property ?value .
    <%(uri)s> rdfs:label ?label .
    ?value rdfs:label ?vlabel .
    ?property rdfs:label ?plabel . }'''

set query and substitute an arbitrary URI into the template:

In [None]:
endpoint.setQuery(construct_query%{'uri':"http://dbpedia.org/resource/Asturias"})
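
The substitution relies on Python's dictionary-based `%` formatting: every `%(uri)s` in the template is replaced by the value stored under the key `'uri'`. A minimal sketch with a shortened template follows; `template` and `build_query` are hypothetical names used only in this example.

```python
# Shortened stand-in for the CONSTRUCT template; %(uri)s marks each slot.
template = '''
CONSTRUCT { <%(uri)s> ?property ?value . }
WHERE     { <%(uri)s> ?property ?value . }'''

def build_query(uri):
    """Fill every %(uri)s slot in the template with the given URI."""
    return template % {'uri': uri}

filled = build_query("http://dbpedia.org/resource/Asturias")
print(filled)
```

Because `%` is the formatting operator here, a literal `%` anywhere else in the template would have to be written as `%%`.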

set the return format to XML:

In [None]:
endpoint.setReturnFormat(XML)

execute the query:

In [None]:
graph = endpoint.query().convert()

list all triples to verify the RDF graph:

In [None]:
for s, p, o in graph:
    print("Subject => " + s)
    print("Predicate => " + p)
    print("Object => " + o)

#### PART 2: Query the resulting RDF graph

register the SPARQL plugin for rdflib:

In [None]:
plugin.register(
 'sparql', rdflib.query.Processor,
 'rdfextras.sparql.processor', 'Processor')
plugin.register(
 'sparql', rdflib.query.Result,
 'rdfextras.sparql.query', 'SPARQLQueryResult')

language tag: it is set to Spanish ("es") here; change it to "en" to see the results in English instead:

In [None]:
Language = "es"

query the model constructed by the previous CONSTRUCT query:

In [None]:
query = graph.query("""
PREFIX dbpo: <http://dbpedia.org/ontology/>
PREFIX dbpp: <http://dbpedia.org/property/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?location ?party ?leader
WHERE {
    <http://dbpedia.org/resource/Asturias> rdfs:label ?location .
    <http://dbpedia.org/resource/Asturias> dbpp:leaderName ?leaderResource .
    ?leaderResource rdfs:label ?leader .
    <http://dbpedia.org/resource/Asturias> dbpo:leaderParty ?partyResource .
    ?partyResource rdfs:label ?party .
FILTER( LANG(?location)="%(lang)s" && LANG(?party)="%(lang)s" && LANG(?leader)="%(lang)s") . }""" %{'lang':Language})

print the results:

In [None]:
print("RESULTS:")
for row in query.result:
    print("The leading party of %s is %s, whose leader is %s." % row)
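
The format string is filled positionally: `% row` needs a tuple with exactly one item per `%s`, in the order of the variables in the SELECT clause (`?location ?party ?leader`). The sketch below imitates one result row with placeholder strings rather than real query output, and shows a named alternative that does not depend on position.

```python
# Placeholder row in SELECT order: (?location, ?party, ?leader).
# These are stand-in strings, not values returned by DBpedia.
row = ("LOCATION", "PARTY", "LEADER")

# Positional formatting: the first %s takes row[0], and so on.
sentence = "The leading party of %s is %s, whose leader is %s." % row
print(sentence)

# Naming the fields avoids depending on their position in the tuple.
location, party, leader = row
named = ("The leading party of %(loc)s is %(party)s, "
         "whose leader is %(leader)s.") % {
    "loc": location, "party": party, "leader": leader}
print(named == sentence)
```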

#### PART 3: Using the interactive DBpedia end-point

Access the interactive DBpedia end-point at: http://dbpedia.org/isparql/

Select the Advanced tab and type the following query into the window:
```   
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpo: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?location ?leader ?party
FROM <http://dbpedia.org>
WHERE {
dbpedia:Asturias rdfs:label ?location ;
dbpo:leaderName ?leaderResource .
?leaderResource rdfs:label ?leader .
dbpedia:Asturias dbpo:leaderParty ?partyResource .
?partyResource rdfs:label ?party .
}
```
Select QBE (query by example) and click the “Get from Advanced” icon. Run the query by clicking the play icon.