There are multiple ways to access a SPARQL endpoint.

The standard is meant to integrate well with all the other Web technologies, which means we can use our favorite web clients.

However, we will start with a library that abstracts away these details: SPARQLWrapper.

# SPARQLWrapper

In [2]:
from SPARQLWrapper import SPARQLWrapper


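If this import fails with `ModuleNotFoundError: No module named 'SPARQLWrapper'`, the package is not installed in the current environment. Install it first (for example, with `pip install sparqlwrapper`) and run the cell again.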

## Select queries

Let's start with a simple query to DBpedia:

In [86]:
dbpedia = SPARQLWrapper("http://dbpedia.org/sparql")
dbpedia.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?label
    WHERE { <http://dbpedia.org/resource/Asturias> rdfs:label ?label }
""")
response = dbpedia.query()

In [87]:
print(response)

<SPARQLWrapper.Wrapper.QueryResult object at 0x00007F066C459A90>
{"requestedFormat" : 'xml',
"response (a file-like object, as return by the urllib2.urlopen library call)" : {
	"url" : "https://dbpedia.org/sparql?query=%0A++++PREFIX+rdfs%3A+%3Chttp%3A//www.w3.org/2000/01/rdf-schema%23%3E%0A++++SELECT+%3Flabel%0A++++WHERE+%7B+%3Chttp%3A//dbpedia.org/resource/Asturias%3E+rdfs%3Alabel+%3Flabel+%7D%0A&format=xml&output=xml&results=xml",
	"code" : "200",
	"headers" : Date: Mon, 22 Nov 2021 12:56:24 GMT
Content-Type: application/sparql-results+xml; charset=UTF-8
Content-Length: 2579
Connection: close
Server: Virtuoso/08.03.3322 (Linux) x86_64-generic-linux-glibc25  VDB
Content-disposition: filename=sparql_2021-11-22_12-35-25Z.xml
Expires: Mon, 29 Nov 2021 12:56:24 GMT
Cache-Control: max-age=604800
Access-Control-Allow-Credentials: true
Access-Control-Allow-Methods: GET, POST, OPTIONS
Access-Control-Allow-Headers: Depth,DNT,X-CustomHeader,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Sin

This is the raw response. We need to convert it to a document:

In [34]:
document = response.convert()
print(document)

<xml.dom.minidom.Document object at 0x7f0655202700>


Once we've gotten the document, we can get the XML content:

In [35]:
document.toxml()

'<?xml version="1.0" ?><sparql xmlns="http://www.w3.org/2005/sparql-results#" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/sw/DataAccess/rf1/result2.xsd">\n <head>\n  <variable name="label"/>\n </head>\n <results distinct="false" ordered="true">\n  <result>\n   <binding name="label"><literal xml:lang="en">Asturias</literal></binding>\n  </result>\n  <result>\n   <binding name="label"><literal xml:lang="ar">أشتورية</literal></binding>\n  </result>\n  <result>\n   <binding name="label"><literal xml:lang="ca">Astúries</literal></binding>\n  </result>\n  <result>\n   <binding name="label"><literal xml:lang="cs">Asturie</literal></binding>\n  </result>\n  <result>\n   <binding name="label"><literal xml:lang="el">Αστούριες</literal></binding>\n  </result>\n  <result>\n   <binding name="label"><literal xml:lang="de">Asturien</literal></binding>\n  </result>\n  <result>\n   <binding name="label"><literal xml:lang="eo">Asturio</literal></bindi

Normally, we would merge some of these steps into a single line:

In [41]:
content = dbpedia.query().convert().toxml()
print(content[:100])

<?xml version="1.0" ?><sparql xmlns="http://www.w3.org/2005/sparql-results#" xmlns:xsi="http://www.w
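The XML shown above follows the SPARQL Query Results XML Format, so the labels can also be pulled out with the standard library. The following is a minimal sketch, not part of SPARQLWrapper: `SAMPLE_XML` is a hand-copied excerpt of the response above rather than a live result, and `labels_from_xml` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace used by the SPARQL Query Results XML Format
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}
# Clark-notation name of the xml:lang attribute
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hand-copied excerpt of the response shown above (not a live result)
SAMPLE_XML = """<?xml version="1.0" ?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
 <head>
  <variable name="label"/>
 </head>
 <results distinct="false" ordered="true">
  <result>
   <binding name="label"><literal xml:lang="en">Asturias</literal></binding>
  </result>
  <result>
   <binding name="label"><literal xml:lang="ca">Astúries</literal></binding>
  </result>
 </results>
</sparql>"""


def labels_from_xml(xml_text):
    """Return (language, value) pairs for every ?label literal."""
    root = ET.fromstring(xml_text)
    pairs = []
    for literal in root.iterfind(".//sr:binding[@name='label']/sr:literal", NS):
        pairs.append((literal.get(XML_LANG), literal.text))
    return pairs


print(labels_from_xml(SAMPLE_XML))
```

With a live query, the string returned by `toxml()` could be passed to the same helper.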


## Using different formats

By default, SPARQLWrapper asks the endpoint to reply with an XML file. We can change that with `setReturnFormat`: 

In [88]:
from SPARQLWrapper import JSON
dbpedia.setReturnFormat(JSON)
results = dbpedia.query().convert()

In [89]:
results

{'head': {'link': [], 'vars': ['label']},
 'results': {'distinct': False,
  'ordered': True,
  'bindings': [{'label': {'type': 'literal',
     'xml:lang': 'en',
     'value': 'Asturias'}},
   {'label': {'type': 'literal', 'xml:lang': 'ar', 'value': 'أشتورية'}},
   {'label': {'type': 'literal', 'xml:lang': 'ca', 'value': 'Astúries'}},
   {'label': {'type': 'literal', 'xml:lang': 'cs', 'value': 'Asturie'}},
   {'label': {'type': 'literal', 'xml:lang': 'el', 'value': 'Αστούριες'}},
   {'label': {'type': 'literal', 'xml:lang': 'de', 'value': 'Asturien'}},
   {'label': {'type': 'literal', 'xml:lang': 'eo', 'value': 'Asturio'}},
   {'label': {'type': 'literal',
     'xml:lang': 'eu',
     'value': 'Asturiasko Printzerria'}},
   {'label': {'type': 'literal', 'xml:lang': 'es', 'value': 'Asturias'}},
   {'label': {'type': 'literal', 'xml:lang': 'fr', 'value': 'Asturies'}},
   {'label': {'type': 'literal', 'xml:lang': 'ga', 'value': 'Asturias'}},
   {'label': {'type': 'literal', 'xml:lang': 'in'
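Since the JSON result is a plain Python dictionary, individual values are reachable with ordinary indexing. As a small sketch, assuming the structure shown above: `sample_results` is a hand-copied excerpt of that output (not a live result) and `labels_by_language` is a hypothetical helper.

```python
# Hand-copied excerpt of the JSON results shown above (not a live result)
sample_results = {
    "head": {"link": [], "vars": ["label"]},
    "results": {
        "distinct": False,
        "ordered": True,
        "bindings": [
            {"label": {"type": "literal", "xml:lang": "en", "value": "Asturias"}},
            {"label": {"type": "literal", "xml:lang": "de", "value": "Asturien"}},
            {"label": {"type": "literal", "xml:lang": "fr", "value": "Asturies"}},
        ],
    },
}


def labels_by_language(results, variable="label"):
    """Map each language tag to the value bound to `variable`."""
    labels = {}
    for binding in results["results"]["bindings"]:
        term = binding.get(variable)
        if term is None:  # a variable may be unbound in some rows
            continue
        # Literals without a language tag are stored under the empty string
        labels[term.get("xml:lang", "")] = term["value"]
    return labels


labels = labels_by_language(sample_results)
print(labels["de"])
```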

## Describe queries

The previous queries could not be directly converted to an RDF graph because they use `SELECT`, which returns tabular results (rows of variable bindings) rather than triples.

With other types of queries (e.g., `DESCRIBE`), we can generate a graph right from the response:

In [45]:
from SPARQLWrapper import XML

dbpedia.setQuery('DESCRIBE <http://dbpedia.org/resource/Asturias>')
dbpedia.setReturnFormat(XML)
results = dbpedia.query().convert()

In [46]:
g = results.toPython()

In [47]:
g.print()

@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix dbp: <http://dbpedia.org/property/> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix georss: <http://www.georss.org/georss/> .
@prefix gold: <http://purl.org/linguistics/gold/> .
@prefix ns12: <http://dbpedia.org/ontology/PopulatedPlace/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://dbpedia.org/resource/.as> dbo:wikiPageWikiLink <http://dbpedia.org/resource/Asturias> .

<http://dbpedia.org/resource/12_Treasures_of_Spain> dbo:wikiPageWikiLink <http://dbpedia.org/resource/Asturias> .

<http://dbpedia.org/resource/13th_Senate_of_Spain> dbo:wikiPageWikiLink <http://dbpedia.org/resource/Asturias> .

<http://dbpedia.

## Construct queries

Similarly, we can do the same with `CONSTRUCT` queries, which specify how new triples should be constructed from the results of the query:

In [48]:
dbpedia.setQuery('''
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX schema: <http://schema.org/>
    CONSTRUCT {
      ?lang a schema:Language ;
      schema:alternateName ?iso6391Code . 
    }
    WHERE {
      ?lang a dbo:Language ;
      dbo:iso6391Code ?iso6391Code .
      FILTER (STRLEN(?iso6391Code)=2) # to filter out non-valid values
    }
    LIMIT 10
''')
results = dbpedia.query().convert()

In [49]:
g = results.toPython()

In [50]:
g.print()

@prefix schema: <http://schema.org/> .

<http://dbpedia.org/resource/Arabic> a schema:Language ;
    schema:alternateName "ar" .

<http://dbpedia.org/resource/Aragonese_language> a schema:Language ;
    schema:alternateName "an" .

<http://dbpedia.org/resource/Avar_language> a schema:Language ;
    schema:alternateName "av" .

<http://dbpedia.org/resource/Avestan> a schema:Language ;
    schema:alternateName "ae" .

<http://dbpedia.org/resource/Aymara_language> a schema:Language ;
    schema:alternateName "ay" .

<http://dbpedia.org/resource/Azerbaijani_language> a schema:Language ;
    schema:alternateName "az" .

<http://dbpedia.org/resource/Dutch_language> a schema:Language ;
    schema:alternateName "nl" .

<http://dbpedia.org/resource/Kannada> a schema:Language ;
    schema:alternateName "kn" .

<http://dbpedia.org/resource/Kanuri_language> a schema:Language ;
    schema:alternateName "kr" .

<http://dbpedia.org/resource/Uruguayan_Spanish> a schema:Language ;
    schema:alternateN

# Using Requests

`requests` is a very popular HTTP client library that focuses on usability and idiomatic code.

We can use `requests` to access a SPARQL endpoint directly:

In [52]:
SELECT_QUERY = """
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?label
    WHERE { <http://dbpedia.org/resource/Asturias> rdfs:label ?label }
"""

In [54]:
import requests

In [116]:
# The query is sent as a URL parameter, so a GET request is the natural fit
response = requests.get('http://dbpedia.org/sparql', params={'query': SELECT_QUERY})

By default, we get an XML file:

In [118]:
response.text

'<sparql xmlns="http://www.w3.org/2005/sparql-results#" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/sw/DataAccess/rf1/result2.xsd">\n <head>\n  <variable name="label"/>\n </head>\n <results distinct="false" ordered="true">\n  <result>\n   <binding name="label"><literal xml:lang="en">Asturias</literal></binding>\n  </result>\n  <result>\n   <binding name="label"><literal xml:lang="ar">أشتورية</literal></binding>\n  </result>\n  <result>\n   <binding name="label"><literal xml:lang="ca">Astúries</literal></binding>\n  </result>\n  <result>\n   <binding name="label"><literal xml:lang="cs">Asturie</literal></binding>\n  </result>\n  <result>\n   <binding name="label"><literal xml:lang="el">Αστούριες</literal></binding>\n  </result>\n  <result>\n   <binding name="label"><literal xml:lang="de">Asturien</literal></binding>\n  </result>\n  <result>\n   <binding name="label"><literal xml:lang="eo">Asturio</literal></binding>\n  </result>\n  <r

We can ask for a different file format with the `Accept` header. For instance, let's ask for a JSON file:

In [131]:
response = requests.get('http://dbpedia.org/sparql', params={'query': SELECT_QUERY}, headers={'Accept': 'application/sparql-results+json'})

In [132]:
response.json()

{'head': {'link': [], 'vars': ['label']},
 'results': {'distinct': False,
  'ordered': True,
  'bindings': [{'label': {'type': 'literal',
     'xml:lang': 'en',
     'value': 'Asturias'}},
   {'label': {'type': 'literal', 'xml:lang': 'ar', 'value': 'أشتورية'}},
   {'label': {'type': 'literal', 'xml:lang': 'ca', 'value': 'Astúries'}},
   {'label': {'type': 'literal', 'xml:lang': 'cs', 'value': 'Asturie'}},
   {'label': {'type': 'literal', 'xml:lang': 'el', 'value': 'Αστούριες'}},
   {'label': {'type': 'literal', 'xml:lang': 'de', 'value': 'Asturien'}},
   {'label': {'type': 'literal', 'xml:lang': 'eo', 'value': 'Asturio'}},
   {'label': {'type': 'literal',
     'xml:lang': 'eu',
     'value': 'Asturiasko Printzerria'}},
   {'label': {'type': 'literal', 'xml:lang': 'es', 'value': 'Asturias'}},
   {'label': {'type': 'literal', 'xml:lang': 'fr', 'value': 'Asturies'}},
   {'label': {'type': 'literal', 'xml:lang': 'ga', 'value': 'Asturias'}},
   {'label': {'type': 'literal', 'xml:lang': 'in'

# Using the standard library

Using requests is very intuitive and straightforward. However, we can get the same functionality with Python's standard library and a little bit more code:

In [144]:
import json
import sys
from pprint import pprint
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import Request, urlopen

In [145]:
# The query is sent in the URL, so there is no body and no content-type header
r = Request('http://dbpedia.org/sparql?' + urlencode({'query': SELECT_QUERY}),
            headers={'accept': 'application/sparql-results+json'})
try:
    # urlopen raises HTTPError for 4xx/5xx responses
    with urlopen(r) as res:
        data = res.read().decode('utf-8')
    pprint(json.loads(data))
except HTTPError as err:
    print('Got: ', err.code, err.read().decode('utf-8'), file=sys.stderr)

{'head': {'link': [], 'vars': ['label']},
 'results': {'bindings': [{'label': {'type': 'literal',
                                     'value': 'Asturias',
                                     'xml:lang': 'en'}},
                          {'label': {'type': 'literal',
                                     'value': 'أشتورية',
                                     'xml:lang': 'ar'}},
                          {'label': {'type': 'literal',
                                     'value': 'Astúries',
                                     'xml:lang': 'ca'}},
                          {'label': {'type': 'literal',
                                     'value': 'Asturie',
                                     'xml:lang': 'cs'}},
                          {'label': {'type': 'literal',
                                     'value': 'Αστούριες',
                                     'xml:lang': 'el'}},
                          {'label': {'type': 'literal',
                                     'value': 'A
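To see what `urllib` would send without touching the network, the request objects can be inspected directly. The sketch below builds the same kind of request twice, once as a GET with the query in the URL and once as a POST with the query in the body. The query string and variable names are illustrative assumptions, not taken from the cells above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

ENDPOINT = "http://dbpedia.org/sparql"
QUERY = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1"  # illustrative query
ACCEPT = {"Accept": "application/sparql-results+json"}

# GET: the query travels in the URL, so no request body is needed
get_request = Request(ENDPOINT + "?" + urlencode({"query": QUERY}), headers=ACCEPT)

# POST: the same form-encoded string travels in the body instead
post_request = Request(
    ENDPOINT,
    data=urlencode({"query": QUERY}).encode("utf-8"),
    headers={**ACCEPT, "Content-Type": "application/x-www-form-urlencoded"},
)

# Without an explicit method, urllib picks GET when there is no body, POST otherwise
print(get_request.get_method())
print(post_request.get_method())

# Decoding the URL gives back the original query text
decoded = parse_qs(urlsplit(get_request.full_url).query)["query"][0]
print(decoded == QUERY)
```

Nothing is sent until one of these objects is passed to `urlopen`.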

We would normally wrap this code in a function, and add some extra parameters, like so:

In [153]:
def send_query(query, endpoint='https://dbpedia.org/sparql'):
    """Send a SPARQL query via POST and return the parsed JSON results."""
    # Media types we are willing to accept from the endpoint
    FORMATS = ",".join(["application/sparql-results+json",
                        "text/javascript",
                        "application/json"])

    # The query goes form-encoded in the request body
    body = urlencode({'query': query}).encode('utf-8')

    r = Request(endpoint,
                data=body,
                headers={'content-type': 'application/x-www-form-urlencoded',
                         'accept': FORMATS},
                method='POST')
    # urlopen raises HTTPError for 4xx/5xx responses
    res = urlopen(r)
    data = res.read().decode('utf-8')
    if res.getcode() == 200:
        try:
            return json.loads(data)
        except Exception:
            # The body was not valid JSON: show it before re-raising
            print('Got: ', data, file=sys.stderr)
            raise
    raise Exception('Error getting results: {}'.format(data))

Now we can use this new function:

In [154]:
send_query('SELECT ?s ?p ?o WHERE {?s ?p ?o} LIMIT 10')

{'head': {'link': [], 'vars': ['s', 'p', 'o']},
 'results': {'distinct': False,
  'ordered': True,
  'bindings': [{'s': {'type': 'uri',
     'value': 'http://www.openlinksw.com/virtrdf-data-formats#default-iid'},
    'p': {'type': 'uri',
     'value': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'},
    'o': {'type': 'uri',
     'value': 'http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat'}},
   {'s': {'type': 'uri',
     'value': 'http://www.openlinksw.com/virtrdf-data-formats#default-iid-nullable'},
    'p': {'type': 'uri',
     'value': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'},
    'o': {'type': 'uri',
     'value': 'http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat'}},
   {'s': {'type': 'uri',
     'value': 'http://www.openlinksw.com/virtrdf-data-formats#default-iid-blank'},
    'p': {'type': 'uri',
     'value': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'},
    'o': {'type': 'uri',
     'value': 'http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
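The nested dictionaries returned by the query are verbose for everyday use. One possible way to flatten them into plain rows is sketched below. `bindings_to_rows` is a hypothetical helper, and `sample` is a hand-copied copy of the first binding shown above, not a live result.

```python
# Hand-copied first binding from the output above (not a live result)
sample = {
    "head": {"link": [], "vars": ["s", "p", "o"]},
    "results": {
        "bindings": [
            {
                "s": {"type": "uri",
                      "value": "http://www.openlinksw.com/virtrdf-data-formats#default-iid"},
                "p": {"type": "uri",
                      "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"},
                "o": {"type": "uri",
                      "value": "http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat"},
            }
        ]
    },
}


def bindings_to_rows(results):
    """Return (variables, rows); each row is a tuple of plain values."""
    variables = results["head"]["vars"]
    rows = [
        # Unbound variables are absent from a binding, so fall back to None
        tuple(binding[var]["value"] if var in binding else None for var in variables)
        for binding in results["results"]["bindings"]
    ]
    return variables, rows


variables, rows = bindings_to_rows(sample)
for row in rows:
    print(dict(zip(variables, row)))
```

The same helper could be applied to the return value of `send_query`, since both share the SPARQL JSON results structure.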