# Query RDF with the SPARQL query language


* SPARQL (pronounced sparkle) stands for: **SPARQL Protocol And RDF Query Language**
* SPARQL 1.0: W3C Recommendation since January 15th 2008
* SPARQL 1.1: W3C Recommendation since March 21st 2013
* Query language to query instances in RDF documents

**Reference specifications: https://www.w3.org/TR/sparql11-query/**


> w3.org materials are standards and recommendations accepted by the World Wide Web Consortium (W3C, the organization that defines Web standards)

# SPARQL endpoint 🔗

* Databases are usually not built to be publicly available; they live close to their applications
* A triplestore can be fully queried through a publicly available SPARQL endpoint URL
* Some solutions enable user management, but SPARQL endpoints are natively built to be open and give the same access to all their users

We will use the **DBpedia SPARQL endpoint**: 

>**https://dbpedia.org/sparql**

[DBpedia](https://wiki.dbpedia.org/) is a project to represent (parts of) Wikipedia as RDF. It has been used as a playground for the Semantic Web for years. The data is not controlled or curated, which leads to poor data quality (don't be surprised to find weird things).

You can use a nicer query editor that can query any public SPARQL endpoint: 

> **https://yasgui.triply.cc**
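
Behind its URL, a SPARQL endpoint is a plain HTTP service: under the SPARQL 1.1 Protocol, a query can be sent as the `query` parameter of a GET request. The sketch below uses only the Python standard library to build such a request for the DBpedia endpoint, without sending it. The helper name `build_sparql_request` is hypothetical, and asking for JSON results assumes the endpoint supports that media type.

```python
import urllib.parse
import urllib.request

ENDPOINT = "https://dbpedia.org/sparql"


def build_sparql_request(endpoint, query):
    """Build (but do not send) a SPARQL Protocol GET request."""
    # The query text travels URL-encoded in the `query` parameter.
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    # Ask for the standard JSON serialization of SELECT results (assumed supported).
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


request = build_sparql_request(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 3")
print(request.full_url)

# To actually run the query (requires network access):
# import json
# with urllib.request.urlopen(request) as response:
#     results = json.load(response)
```

Query editors such as YASGUI and the SPARQL kernel used in this notebook do the same kind of HTTP exchange for you.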

# Install the SPARQL kernel

This notebook uses the SPARQL Kernel to define and **execute SPARQL queries in the notebook** codeblocks.
To **install the SPARQL Kernel** in your JupyterLab installation:

```shell
pip install sparqlkernel --user
jupyter sparqlkernel install --user
```

To start running SPARQL queries in this notebook, we need to define the **SPARQL kernel parameters**:
* 🔗 **URL of the SPARQL endpoint to query**
* 🌐 Language of preferred labels
* 📜 Log level

In [36]:
%endpoint http://dbpedia.org/sparql

# This is optional, it would increase the log level
%log debug

# Uncomment the next line to return labels in English and avoid duplicates
# %lang en

# SPARQL query components

Variables to resolve are defined using `?` (e.g. `?my_variable`)

```sparql
# prefix declarations: for abbreviating URIs
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# result clause: what information to return from the query
SELECT *
# dataset definition (optional): which RDF graph(s) are being queried
# (<http://dbpedia.org> is assumed here to be the DBpedia default graph)
FROM <http://dbpedia.org>
# query pattern: specifying what to query for in the underlying dataset
WHERE {
    ?s ?p ?o .
}
# query modifiers: slicing, ordering, and rearranging query results
ORDER BY ?s
LIMIT 10
```

# Run a SPARQL query

Let's get **[all triples in DBpedia](http://yasgui.triply.cc/#query=SELECT%20*%0AWHERE%20%7B%0A%20%20%20%20%3Fsubject%20%3Fpredicate%20%3Fobject%20.%0A%7D&endpoint=https%3A%2F%2Fdbpedia.org%2Fsparql&requestMethod=POST&tabTitle=Query&headers=%7B%7D&contentTypeConstruct=text%2Fturtle%2C*%2F*%3Bq%3D0.9&contentTypeSelect=application%2Fsparql-results%2Bjson%2C*%2F*%3Bq%3D0.9&outputFormat=table)**:

In [2]:
SELECT *
WHERE {
    ?subject ?predicate ?object .
}

subject,predicate,object
http://www.openlinksw.com/virtrdf-data-formats#default-iid,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
http://www.openlinksw.com/virtrdf-data-formats#default-iid-nullable,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
http://www.openlinksw.com/virtrdf-data-formats#default-iid-nonblank,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
http://www.openlinksw.com/virtrdf-data-formats#default-iid-nonblank-nullable,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
http://www.openlinksw.com/virtrdf-data-formats#default,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
http://www.openlinksw.com/virtrdf-data-formats#default-nullable,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
http://www.openlinksw.com/virtrdf-data-formats#sql-varchar,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
http://www.openlinksw.com/virtrdf-data-formats#sql-varchar-nullable,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
http://www.openlinksw.com/virtrdf-data-formats#sql-varchar-dt,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
http://www.openlinksw.com/virtrdf-data-formats#sql-varchar-dt-nullable,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat


We can see that **DBpedia limits results to 10,000 by default** (and the SPARQL kernel shows only 20 for readability).

But the returned triples are not really interesting; we will need to filter our results.

# Get all books

Get **[all the books in DBpedia 📚](https://yasgui.triply.cc/#query=PREFIX%20rdf%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%3E%0APREFIX%20dbo%3A%20%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2F%3E%0ASELECT%20*%0AWHERE%20%7B%0A%20%20%3Fbook%20rdf%3Atype%20dbo%3ABook%20.%0A%7D%0A&endpoint=https%3A%2F%2Fdbpedia.org%2Fsparql&requestMethod=POST&tabTitle=Query&headers=%7B%7D&contentTypeConstruct=text%2Fturtle%2C*%2F*%3Bq%3D0.9&contentTypeSelect=application%2Fsparql-results%2Bjson%2C*%2F*%3Bq%3D0.9&outputFormat=table)**

In [3]:
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT *
WHERE {
    ?book rdf:type dbo:Book .
}

book
http://dbpedia.org/resource/Havana_Storm
http://dbpedia.org/resource/The_Awful_German_Language
http://dbpedia.org/resource/Urmonotheismus
http://dbpedia.org/resource/Modern_C_Design
http://dbpedia.org/resource/1066_and_All_That
http://dbpedia.org/resource/2010:_Odyssey_Two
http://dbpedia.org/resource/401(k)
http://dbpedia.org/resource/A_Crown_of_Swords
http://dbpedia.org/resource/A_Dictionary_of_the_English_Language
http://dbpedia.org/resource/Adultery


Here a prefix is defined for the DBpedia ontology:

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>
```

The `rdf:` prefix is predefined on the DBpedia endpoint (other endpoints may require an explicit `PREFIX` declaration), and `rdf:type` can be shortened to `a`:

```sparql
SELECT * WHERE {
  ?book a dbo:Book .
}
```

Run against the following 2 statements, this query returns only `<http://book1>`:

```turtle
<http://book1> rdf:type <http://dbpedia.org/ontology/Book> .
<http://country1> rdf:type <http://dbpedia.org/ontology/Country> .
```

# Get property of class

Get [**the author of each returned book** 🖋️](http://yasgui.triply.cc/#query=PREFIX%20dbo%3A%20%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2F%3E%0ASELECT%20*%0AWHERE%20%7B%0A%20%20%20%20%3Fbook%20a%20dbo%3ABook%20.%0A%20%20%20%20%3Fbook%20dbo%3Aauthor%20%3Fauthor%20.%0A%7D&endpoint=https%3A%2F%2Fdbpedia.org%2Fsparql&requestMethod=POST&tabTitle=Query&headers=%7B%7D&contentTypeConstruct=text%2Fturtle%2C*%2F*%3Bq%3D0.9&contentTypeSelect=application%2Fsparql-results%2Bjson%2C*%2F*%3Bq%3D0.9&outputFormat=table) (when an author is defined):

In [37]:
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT *
WHERE {
    ?book a dbo:Book .
    ?book dbo:author ?author .
}

book,author
http://dbpedia.org/resource/1066_and_All_That,http://dbpedia.org/resource/R._J._Yeatman
http://dbpedia.org/resource/1066_and_All_That,http://dbpedia.org/resource/W._C._Sellar
http://dbpedia.org/resource/2010:_Odyssey_Two,http://dbpedia.org/resource/Arthur_C._Clarke
http://dbpedia.org/resource/A_Crown_of_Swords,http://dbpedia.org/resource/Robert_Jordan
http://dbpedia.org/resource/A_Dictionary_of_the_English_Language,http://dbpedia.org/resource/Samuel_Johnson
http://dbpedia.org/resource/Alice's_Adventures_in_Wonderland,http://dbpedia.org/resource/Lewis_Carroll
http://dbpedia.org/resource/Anne_of_Green_Gables,http://dbpedia.org/resource/Lucy_Maud_Montgomery
http://dbpedia.org/resource/Around_the_World_in_Eighty_Days,http://dbpedia.org/resource/Jules_Verne
http://dbpedia.org/resource/Between_Planets,http://dbpedia.org/resource/Robert_A._Heinlein
http://dbpedia.org/resource/Beyond_This_Horizon,http://dbpedia.org/resource/Robert_A._Heinlein


A Turtle-like syntax can also be used to make the query more readable (`;` reuses the same subject for the next predicate):

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT *
WHERE {
    ?book a dbo:Book ; 
        dbo:author ?author .
}
```

Consider a graph with the following 4 statements:

```turtle
<http://book1> rdf:type <http://dbpedia.org/ontology/Book> .
<http://book1> dbo:author <http://author1> .
<http://book2> rdf:type <http://dbpedia.org/ontology/Book> .
<http://book2> dbo:contributor <http://author2> .
```

The previous query will return only **one row of results with `<http://book1>` and `<http://author1>`**, because `<http://book2>` has no `dbo:author` statement.
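
To see why only one row comes back, here is a toy evaluator in Python for the 4 statements above. It is a minimal sketch of how several triple patterns sharing a variable are joined, not how a real triplestore is implemented. The function names `match_triple` and `evaluate` are hypothetical, and prefixed names are kept as plain strings.

```python
# Toy graph: the 4 statements above as (subject, predicate, object) tuples.
# Prefixed names are kept as plain strings for readability.
GRAPH = [
    ("http://book1", "rdf:type", "dbo:Book"),
    ("http://book1", "dbo:author", "http://author1"),
    ("http://book2", "rdf:type", "dbo:Book"),
    ("http://book2", "dbo:contributor", "http://author2"),
]


def match_triple(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`.

    Terms starting with "?" are variables; any other term must be equal.
    Returns the extended binding, or None when the triple does not match.
    """
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable that is already bound must keep its value (the join).
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def evaluate(patterns, graph):
    """Return every binding that satisfies all triple patterns together."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in graph:
                extended = match_triple(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


rows = evaluate(
    [("?book", "rdf:type", "dbo:Book"), ("?book", "dbo:author", "?author")],
    GRAPH,
)
print(rows)
```

The first pattern binds `?book` to both books, but the second pattern can only be satisfied for the book that has a `dbo:author` statement, so the other binding is dropped.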

# Restriction of result set

* `LIMIT`: maximal number of results (table rows)
* `OFFSET`: position of the first delivered result
* `SELECT DISTINCT`: removal of duplicate table rows

> `LIMIT` and `OFFSET` usually make sense with `ORDER BY`!
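
A common use of `LIMIT` and `OFFSET` is paging through a large result set. The sketch below generates one query per page in Python; the helper `paged_queries` and the page size are hypothetical choices. `ORDER BY` is included in every page because, without a stable order, nothing guarantees that consecutive pages do not overlap or skip rows.

```python
def paged_queries(base_query, order_by, page_size, page_count):
    """Yield one query per page, with ORDER BY, LIMIT and OFFSET appended."""
    for page in range(page_count):
        offset = page * page_size
        yield (
            f"{base_query}\n"
            f"ORDER BY {order_by}\n"
            f"LIMIT {page_size}\n"
            f"OFFSET {offset}"
        )


BASE_QUERY = """PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?book
WHERE {
    ?book a dbo:Book .
}"""

pages = list(paged_queries(BASE_QUERY, "?book", page_size=10, page_count=3))
print(pages[1])  # the query for the second page
```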


# Retrieve specific variables and limit

You can specify the variables you want returned, and `LIMIT` the number of results.

This query will return 10 authors of books, with duplicates if an author wrote multiple books 👥

In [5]:
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?author
WHERE {
    ?book a dbo:Book ;
        dbo:author ?author .
} 
LIMIT 10

author
http://dbpedia.org/resource/R._J._Yeatman
http://dbpedia.org/resource/W._C._Sellar
http://dbpedia.org/resource/Arthur_C._Clarke
http://dbpedia.org/resource/Robert_Jordan
http://dbpedia.org/resource/Samuel_Johnson
http://dbpedia.org/resource/Lewis_Carroll
http://dbpedia.org/resource/Lucy_Maud_Montgomery
http://dbpedia.org/resource/Jules_Verne
http://dbpedia.org/resource/Robert_A._Heinlein
http://dbpedia.org/resource/Robert_A._Heinlein


# Retrieve distinct results

You can also ask for `DISTINCT` results only (duplicate rows are not returned)

> Use the `*` if you want all variables

This query will return 10 authors of books **without duplicates**, even if they wrote multiple books 👤

In [6]:
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?author
WHERE {
    ?book a dbo:Book ;
        dbo:author ?author .
} LIMIT 10

author
http://dbpedia.org/resource/R._J._Yeatman
http://dbpedia.org/resource/W._C._Sellar
http://dbpedia.org/resource/Arthur_C._Clarke
http://dbpedia.org/resource/Robert_Jordan
http://dbpedia.org/resource/Samuel_Johnson
http://dbpedia.org/resource/Lewis_Carroll
http://dbpedia.org/resource/Lucy_Maud_Montgomery
http://dbpedia.org/resource/Jules_Verne
http://dbpedia.org/resource/Robert_A._Heinlein
http://dbpedia.org/resource/Greg_Bear
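
`DISTINCT` is applied before `LIMIT`, which is why the tenth row changes between the two queries. The Python sketch below mimics that order on a toy list modeled on the two outputs above. The assumption that the endpoint produced the rows in exactly this order is ours, and the helper `select` is hypothetical.

```python
def select(rows, distinct=False, limit=None):
    """Mimic SPARQL applying DISTINCT first and LIMIT afterwards."""
    if distinct:
        # dict.fromkeys keeps the first occurrence of each row, in order.
        rows = list(dict.fromkeys(rows))
    return rows[:limit]


# Toy solution sequence (assumed order), with one author appearing twice.
authors = [
    "R._J._Yeatman", "W._C._Sellar", "Arthur_C._Clarke", "Robert_Jordan",
    "Samuel_Johnson", "Lewis_Carroll", "Lucy_Maud_Montgomery", "Jules_Verne",
    "Robert_A._Heinlein", "Robert_A._Heinlein", "Greg_Bear",
]

with_duplicates = select(authors, limit=10)
without_duplicates = select(authors, distinct=True, limit=10)

print(with_duplicates[-1])
print(without_duplicates[-1])
```

Removing the duplicate frees a slot within the limit, so one more author fits in the 10 rows.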


# FILTER the results

## Comparison operators: <, =, >, <=, >=, !=
* Comparison of data literals according to natural order
* Support for numerical data types, xsd:dateTime, xsd:string (alphabetic ordering), xsd:boolean (true > false)
* For other types and other RDF elements, only = and != are available
* Comparison of literals of incompatible types (e.g. xsd:string and xsd:integer) is not allowed; they must be converted first

## Arithmetic operators: +, -, *, /
* Support for numerical data types
* Used to combine values in filter conditions 
* E.g. `FILTER(?weight / (?size * ?size) >= 25)`


# Filter by number of pages

This query will return 10 authors who wrote a book with more than 500 pages 📖

In [34]:
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?author
WHERE {
    ?book a dbo:Book ;
        dbo:author ?author ;
        dbo:numberOfPages ?pages .
    FILTER (?pages > 500)
} LIMIT 10

author
http://dbpedia.org/resource/Robert_Jordan
http://dbpedia.org/resource/Tom_Clancy
http://dbpedia.org/resource/Douglas_Hofstadter
http://dbpedia.org/resource/Richard_Llewellyn
http://dbpedia.org/resource/Charles_Darwin
http://dbpedia.org/resource/Henry_Martyn_Robert
http://dbpedia.org/resource/Iain_Banks
http://dbpedia.org/resource/Victor_Hugo
http://dbpedia.org/resource/Robert_Anton_Wilson
http://dbpedia.org/resource/Robert_Shea


# Perform an arithmetic operation

Calculating the GDP per capita of countries from `dbp:gdpNominal` and `dbo:populationTotal` is **impossible due to different datatypes** 🚫

In [33]:
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?country ?gdpValue (datatype(?gdpValue) AS ?gdpType) ?population (datatype(?population) AS ?populationType) (?gdpValue / ?population AS ?gdpPerCapita)
WHERE {
    ?country dbp:gdpNominal ?gdpValue ;
             dbo:populationTotal ?population .
} LIMIT 10

country,gdpValue,gdpType,population,populationType,gdpPerCapita
http://dbpedia.org/resource/Arab_League,3.526E12,http://dbpedia.org/datatype/usDollar,423000000,http://www.w3.org/2001/XMLSchema#nonNegativeInteger,0
http://dbpedia.org/resource/Syria,5.9957E10,http://dbpedia.org/datatype/usDollar,17064854,http://www.w3.org/2001/XMLSchema#nonNegativeInteger,0
http://dbpedia.org/resource/Economic_Cooperation_Organization,US $1.9 trillion,http://www.w3.org/1999/02/22-rdf-syntax-ns#langString,416046863,http://www.w3.org/2001/XMLSchema#nonNegativeInteger,0
http://dbpedia.org/resource/Egypt,3.30765E11,http://dbpedia.org/datatype/usDollar,85783,http://www.w3.org/2001/XMLSchema#nonNegativeInteger,0
http://dbpedia.org/resource/United_States,1.8558E13,http://dbpedia.org/datatype/usDollar,324720797,http://www.w3.org/2001/XMLSchema#nonNegativeInteger,0
http://dbpedia.org/resource/Afghanistan,1.9654E10,http://dbpedia.org/datatype/usDollar,32564342,http://www.w3.org/2001/XMLSchema#nonNegativeInteger,0
http://dbpedia.org/resource/Albania,1.2204E10,http://dbpedia.org/datatype/usDollar,2886026,http://www.w3.org/2001/XMLSchema#nonNegativeInteger,0
http://dbpedia.org/resource/Algeria,1.8171E11,http://dbpedia.org/datatype/usDollar,40400000,http://www.w3.org/2001/XMLSchema#nonNegativeInteger,0
http://dbpedia.org/resource/Andorra,4.51E9,http://dbpedia.org/datatype/usDollar,85470,http://www.w3.org/2001/XMLSchema#nonNegativeInteger,0
http://dbpedia.org/resource/Antigua_and_Barbuda,1.332E9,http://dbpedia.org/datatype/usDollar,91295,http://www.w3.org/2001/XMLSchema#nonNegativeInteger,0


# Cast a variable to a specific datatype

Casting is especially useful when **comparing 2 variables or performing an arithmetic operation on them**.

Here we divide a value in `usDollar` by a `nonNegativeInteger`, casting both to `xsd:integer`, to calculate the GDP per capita of each country 💶

In [27]:
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT ?country ?gdpValue ?population (xsd:integer(?gdpValue) / xsd:integer(?population) AS ?gdpPerCapita)
WHERE {
    ?country dbp:gdpNominal ?gdpValue ;
             dbo:populationTotal ?population .
} LIMIT 10

country,gdpValue,population,gdpPerCapita
http://dbpedia.org/resource/Arab_League,3.526E12,423000000,8335.0
http://dbpedia.org/resource/Syria,5.9957E10,17064854,3513.0
http://dbpedia.org/resource/Economic_Cooperation_Organization,US $1.9 trillion,416046863,
http://dbpedia.org/resource/Egypt,3.30765E11,85783,3855833.0
http://dbpedia.org/resource/United_States,1.8558E13,324720797,57150.0
http://dbpedia.org/resource/Afghanistan,1.9654E10,32564342,603.0
http://dbpedia.org/resource/Albania,1.2204E10,2886026,4228.0
http://dbpedia.org/resource/Algeria,1.8171E11,40400000,4497.0
http://dbpedia.org/resource/Andorra,4.51E9,85470,52767.0
http://dbpedia.org/resource/Antigua_and_Barbuda,1.332E9,91295,14590.0


# Filter books on name content

* Cast the variable `?name` to a plain string using the `str()` function, to avoid datatype incompatibilities
* Use the **function** `contains(bigger_string,smaller_string)` to only keep books with the string "grand" in their label 🏷️

In [8]:
PREFIX dbo:<http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?name ?author 
WHERE {
    ?book a dbo:Book ;
        dbo:author ?author ;
        rdfs:label ?name . 
    FILTER (contains(str(?name),"grand"))
} LIMIT 10

name,author
Il grande divorzio,http://dbpedia.org/resource/C._S._Lewis
La grande caccia,http://dbpedia.org/resource/Robert_Jordan
La grande fuga dell'Ottobre Rosso,http://dbpedia.org/resource/Tom_Clancy
La grande storia del tempo,http://dbpedia.org/resource/Leonard_Mlodinow
La grande storia del tempo,http://dbpedia.org/resource/Stephen_Hawking
Il grande ritratto,http://dbpedia.org/resource/Dino_Buzzati
Il piccolo grande uomo (romanzo),http://dbpedia.org/resource/Thomas_Berger_(novelist)
Il grande mare dei Sargassi,http://dbpedia.org/resource/Jean_Rhys
La Conspiration au grand jour,http://dbpedia.org/resource/H._G._Wells
Considérations sur les causes de la grandeur des Romains et de leur décadence,http://dbpedia.org/resource/Montesquieu


# Using conditions in FILTER

Filter conditions can be linked with boolean operators: `&&`, `||`, `!`

> Keep books with more than 500 pages whose name does not contain "grand"

```sparql
FILTER((?pages > 500) && !contains(?name,"grand"))
```

> Keep books that have more than 500 pages or whose name contains "grand"

```sparql
FILTER((?pages > 500) || contains(?name,"grand"))
```

The same conditions can be expressed through graph patterns:
* **Conjunction** corresponds to specifications of **several filters**.
* **Disjunction** corresponds to application of filters in **alternative patterns** (use `UNION`).
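
The two correspondences can be checked on a toy list in Python. The book data and helper names below are invented for illustration. One detail worth noting: `UNION` concatenates the results of both patterns, so a row matching both conditions appears twice unless `DISTINCT` is used.

```python
# Invented toy data: each dict stands for one result row.
BOOKS = [
    {"name": "Le grand livre", "pages": 800},
    {"name": "Short story", "pages": 40},
    {"name": "Long novel", "pages": 900},
    {"name": "Un grand soir", "pages": 120},
]


def many_pages(book):
    return book["pages"] > 500


def has_grand(book):
    return "grand" in book["name"]


# Conjunction: one FILTER with && behaves like two FILTERs applied in a row.
one_filter = [b for b in BOOKS if many_pages(b) and has_grand(b)]
two_filters = [b for b in [b for b in BOOKS if many_pages(b)] if has_grand(b)]

# Disjunction: one FILTER with || versus the UNION of two filtered patterns.
or_filter = [b for b in BOOKS if many_pages(b) or has_grand(b)]
union = [b for b in BOOKS if many_pages(b)] + [b for b in BOOKS if has_grand(b)]

print(len(or_filter), len(union))
```

In this toy data, "Le grand livre" satisfies both conditions, so it is listed once by the `||` filter and twice by the union.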


# Return results from different patterns

Use `UNION` to return the book's name whether it is defined with `rdfs:label` or `dbp:name`:

In [38]:
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX dbo:<http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?book ?name
WHERE {
    {
        ?book a dbo:Book ;
            rdfs:label ?name .
        FILTER (langMatches(lang(?name),"en"))
    } UNION {
        ?book a dbo:Book ;
            dbp:name ?name .
    }
    FILTER (contains(str(?name), "grand"))
}

book,name
http://dbpedia.org/resource/Alphabetum_grandonico-malabaricum_sive_samscrudonicum,Alphabetum grandonico-malabaricum sive samscrudonicum
http://dbpedia.org/resource/Les_grands_cimetières_sous_la_lune,Les grands cimetières sous la lune
http://dbpedia.org/resource/L'Art_d'être_grand-père,L'Art d'être grand-père
http://dbpedia.org/resource/La_Cour_des_grands,La Cour des grands
http://dbpedia.org/resource/Plame_affair_grand_jury_investigation,Plame affair grand jury investigation
http://dbpedia.org/resource/Servitude_et_grandeur_militaires,Servitude et grandeur militaires
http://dbpedia.org/resource/My_Grandfather_Would_Have_Shot_Me,"Jennifer Teege , My grandfather would have shot me : a Black woman discovers her family's Nazi past"


# Without UNION

With `rdfs:label` only, we get 6 results

> Check how many results for `dbp:name`

In [39]:
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX dbo:<http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?book ?name
WHERE {
    ?book a dbo:Book ;
        rdfs:label ?name .
    FILTER (langMatches(lang(?name),"en"))
    FILTER (contains(str(?name), "grand"))
}

book,name
http://dbpedia.org/resource/Alphabetum_grandonico-malabaricum_sive_samscrudonicum,Alphabetum grandonico-malabaricum sive samscrudonicum
http://dbpedia.org/resource/Les_grands_cimetières_sous_la_lune,Les grands cimetières sous la lune
http://dbpedia.org/resource/L'Art_d'être_grand-père,L'Art d'être grand-père
http://dbpedia.org/resource/La_Cour_des_grands,La Cour des grands
http://dbpedia.org/resource/Plame_affair_grand_jury_investigation,Plame affair grand jury investigation
http://dbpedia.org/resource/Servitude_et_grandeur_militaires,Servitude et grandeur militaires


# Get the language of a literal

We can use the `lang()` function in the `SELECT` clause to return the language of the book label 💬

> Give it a name using the keyword `AS`, wrapping the whole expression in parentheses

In [9]:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:<http://dbpedia.org/ontology/>
SELECT ?name ?author (lang(?name) AS ?lang)
WHERE {
    ?book a dbo:Book ;
        dbo:author ?author ;
        rdfs:label ?name . 
} LIMIT 10

name,author,lang
1066 and All That,http://dbpedia.org/resource/R._J._Yeatman,en
1066 and All That,http://dbpedia.org/resource/R._J._Yeatman,es
1066 and All That,http://dbpedia.org/resource/R._J._Yeatman,fr
1066 and All That,http://dbpedia.org/resource/W._C._Sellar,en
1066 and All That,http://dbpedia.org/resource/W._C._Sellar,es
1066 and All That,http://dbpedia.org/resource/W._C._Sellar,fr
2010: Odyssey Two,http://dbpedia.org/resource/Arthur_C._Clarke,en
"Odyssee 2010 – Das Jahr, in dem wir Kontakt aufnehmen",http://dbpedia.org/resource/Arthur_C._Clarke,de
2010: Odisea dos,http://dbpedia.org/resource/Arthur_C._Clarke,es
2010 : Odyssée deux,http://dbpedia.org/resource/Arthur_C._Clarke,fr


# Return only books with title in Dutch

Use the `langMatches()` function to filter books with a label in Dutch 🇳🇱

In [35]:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:<http://dbpedia.org/ontology/>
SELECT ?name ?author
WHERE {
    ?book a dbo:Book ;
        dbo:author ?author ;
        rdfs:label ?name . 
    FILTER (langMatches(lang(?name),"nl"))
} LIMIT 10

name,author
2010: Odyssey Two,http://dbpedia.org/resource/Arthur_C._Clarke
Een Kroon van Zwaarden,http://dbpedia.org/resource/Robert_Jordan
A Dictionary of the English Language,http://dbpedia.org/resource/Samuel_Johnson
Alice's Adventures in Wonderland,http://dbpedia.org/resource/Lewis_Carroll
Anne van het Groene Huis,http://dbpedia.org/resource/Lucy_Maud_Montgomery
De reis om de wereld in tachtig dagen,http://dbpedia.org/resource/Jules_Verne
Tussen de planeten,http://dbpedia.org/resource/Robert_A._Heinlein
Brave New World (roman),http://dbpedia.org/resource/Aldous_Huxley
Catch-22 (roman),http://dbpedia.org/resource/Joseph_Heller
Sjakie en de chocoladefabriek,http://dbpedia.org/resource/Roald_Dahl
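
`langMatches()` is more than string equality: it performs the basic language-range filtering of RFC 4647, so a range such as `"nl"` also accepts regional tags like `nl-BE`, and `"*"` accepts any literal that has a language tag. The Python function below is a small model of that rule, written only to illustrate the behaviour; the name `lang_matches` is hypothetical.

```python
def lang_matches(tag, lang_range):
    """Model of basic language-range matching (case-insensitive)."""
    if lang_range == "*":
        # "*" matches every literal that carries a language tag.
        return tag != ""
    tag, lang_range = tag.lower(), lang_range.lower()
    # Either an exact match, or the range followed by a "-" subtag.
    return tag == lang_range or tag.startswith(lang_range + "-")


for tag in ["nl", "nl-BE", "en", ""]:
    print(repr(tag), lang_matches(tag, "nl"), lang_matches(tag, "*"))
```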


# Sorting results

We can **[`ORDER BY` the books by number of pages](https://yasgui.triply.cc/#query=PREFIX%20dbo%3A%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2F%3E%0APREFIX%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0ASELECT%20%3Fname%20%3Fauthor%20%3Fpages%20%3Fbook%0AWHERE%20%7B%0A%20%20%20%20%3Fbook%20a%20dbo%3ABook%20%3B%0A%20%20%20%20%20%20%20%20dbo%3Aauthor%20%3Fauthor%20%3B%0A%20%20%20%20%20%20%20%20rdfs%3Alabel%20%3Fname%20%3B%0A%20%20%20%20%20%20%20%20dbo%3AnumberOfPages%20%3Fpages%20.%0A%7D%20%0AORDER%20BY%20%3Fpages%0ALIMIT%2010&endpoint=https%3A%2F%2Fdbpedia.org%2Fsparql&requestMethod=POST&tabTitle=Query&headers=%7B%7D&contentTypeConstruct=text%2Fturtle%2C*%2F*%3Bq%3D0.9&contentTypeSelect=application%2Fsparql-results%2Bjson%2C*%2F*%3Bq%3D0.9&outputFormat=table)**, in ascending order by default 📤

In [16]:
PREFIX dbo:<http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?name ?author ?pages
WHERE {
    ?book a dbo:Book ;
        dbo:author ?author ;
        rdfs:label ?name ;
        dbo:numberOfPages ?pages .
} 
ORDER BY ?pages
LIMIT 10

name,author,pages
"Liberia; or, Mr. Peyton's Experiments",http://dbpedia.org/resource/Sarah_Josepha_Hale,0
"The Cabin and Parlor; or, Slaves and Masters",http://dbpedia.org/resource/Charles_Jacobs_Peterson,0
"The Lofty and the Lowly, or Good in All and None All Good",http://dbpedia.org/resource/Maria_Jane_McIntosh,0
Geological Evidences of the Antiquity of Man,http://dbpedia.org/resource/Charles_Lyell,0
Geological Evidences of the Antiquity of Man,http://dbpedia.org/resource/Charles_Lyell,0
The Vampyre,http://dbpedia.org/resource/John_William_Polidori,0
Der Vampyr (Polidori),http://dbpedia.org/resource/John_William_Polidori,0
El vampiro (1819),http://dbpedia.org/resource/John_William_Polidori,0
Le Vampire (nouvelle),http://dbpedia.org/resource/John_William_Polidori,0
Il vampiro (racconto),http://dbpedia.org/resource/John_William_Polidori,0


# Order results descending

Order results from books with the highest number of pages to the lowest 📥

In [12]:
PREFIX dbo:<http://dbpedia.org/ontology/>
PREFIX dbp:<http://dbpedia.org/property/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?name ?author ?pages
WHERE {
    ?book a dbo:Book ;
        dbo:author ?author ;
        rdfs:label ?name ;
        dbo:numberOfPages ?pages .
    FILTER (langMatches(lang(?name),"en"))
} 
ORDER BY DESC(?pages)
LIMIT 10

name,author,pages
The Tolkien Reader,http://dbpedia.org/resource/J._R._R._Tolkien,241127964
"L. Ron Hubbard, Messiah or Madman?",http://dbpedia.org/resource/Ronald_DeWolf,1992460
"L. Ron Hubbard, Messiah or Madman?",http://dbpedia.org/resource/Bent_Corydon,1992460
"L. Ron Hubbard, Messiah or Madman?",http://dbpedia.org/resource/Ronald_DeWolf,1987402
"L. Ron Hubbard, Messiah or Madman?",http://dbpedia.org/resource/Bent_Corydon,1987402
The Merlin Conspiracy,http://dbpedia.org/resource/Diana_Wynne_Jones,473468
Political Liberalism,http://dbpedia.org/resource/John_Rawls,435576
Halo: The Flood,http://dbpedia.org/resource/William_C._Dietz,352432
Halo: The Fall of Reach,http://dbpedia.org/resource/Eric_Nylund,352416
Unsung: A History of Women in American Music,http://dbpedia.org/resource/Christine_Ammer,317382


# Count aggregated results

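Count the number of books for each author 📚 Standard SPARQL requires a `GROUP BY` on every non-aggregated variable in the `SELECT`; the DBpedia endpoint tolerates its absence, but other endpoints may reject the query.

In [13]:
PREFIX dbo:<http://dbpedia.org/ontology/>
SELECT ?author (count(?book) as ?book_count)
WHERE {
    ?book a dbo:Book ;
        dbo:author ?author .
}
GROUP BY ?author
LIMIT 10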

author,book_count
http://dbpedia.org/resource/Daniel_Carter_Beard,1
http://dbpedia.org/resource/Raymond_Smullyan,1
http://dbpedia.org/resource/William_Donaldson,1
http://dbpedia.org/resource/Joe_Conason,1
http://dbpedia.org/resource/John_Lennon,3
http://dbpedia.org/resource/Jürgen_Habermas,8
http://dbpedia.org/resource/Robert_Nathan,1
http://dbpedia.org/resource/J._P._Martin,6
http://dbpedia.org/resource/Anthony_Everitt,1
http://dbpedia.org/resource/Arthur_R.G._Solmssen,1


# Count depends on how rows are grouped

Here we also select the book, so each group is a single book and author pair, and the count is 1 for each row 📘

In [14]:
PREFIX dbo:<http://dbpedia.org/ontology/>
SELECT ?book ?author (count(?book) as ?book_count)
WHERE {
    ?book a dbo:Book ;
        dbo:author ?author .
}
GROUP BY ?book ?author
LIMIT 10

book,author,book_count
http://dbpedia.org/resource/The_Aunt's_Story,http://dbpedia.org/resource/Patrick_White,1
http://dbpedia.org/resource/The_Circus_of_Dr._Lao_and_Other_Improbable_Stories,http://dbpedia.org/resource/Ray_Bradbury,1
http://dbpedia.org/resource/The_Four_False_Weapons,http://dbpedia.org/resource/John_Dickson_Carr,1
http://dbpedia.org/resource/The_Smack_Man,http://dbpedia.org/resource/Nelson_DeMille,1
http://dbpedia.org/resource/The_Surprising_Archaea,http://dbpedia.org/resource/John_L._Howland,1
http://dbpedia.org/resource/Trapped_in_the_USSR,http://dbpedia.org/resource/J._J._Fortune,1
http://dbpedia.org/resource/A_Gent_from_Bear_Creek,http://dbpedia.org/resource/Robert_E._Howard,1
http://dbpedia.org/resource/A_Golden_Anniversary_Bibliography_of_Edgar_Rice_Burroughs,http://dbpedia.org/resource/Henry_Hardy_Heins,1
http://dbpedia.org/resource/Act_of_Providence,http://dbpedia.org/resource/Joseph_Payne_Brennan,1
http://dbpedia.org/resource/Atlas_of_the_British_Flora,http://dbpedia.org/resource/Franklyn_Perring,1
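
The difference between the two counts comes down to which variables define a group. The Python sketch below reproduces both cases with `collections.Counter` on invented (book, author) rows; it only illustrates the grouping logic and is not what the endpoint executes.

```python
from collections import Counter

# Invented toy rows: (book, author) pairs as returned by the graph pattern.
ROWS = [
    ("book1", "author1"),
    ("book2", "author1"),
    ("book3", "author2"),
]

# Grouping by author only: rows sharing an author are counted together.
books_per_author = Counter(author for _book, author in ROWS)

# Grouping by book and author: every pair forms its own group of size 1.
books_per_pair = Counter(ROWS)

print(books_per_author)
print(books_per_pair)
```

Adding a variable to the `SELECT` makes the groups finer, which is why the count drops to 1 as soon as `?book` is selected too.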


# SPARQL query breakdown 🧬

<img src="sparql_query_breakdown.png">

# Search on DBpedia 🔎

Use **[https://yasgui.triply.cc](https://yasgui.triply.cc)** to write and run SPARQL queries on DBpedia

> Find DBpedia classes and relations: search on Google, e.g.: "**[dbpedia capital](https://www.google.com/search?&q=dbpedia+capital)**"

* The capital ([dbo:capital](http://dbpedia.org/ontology/capital)) of the country in which authors of books are born ([dbo:birthPlace](http://dbpedia.org/ontology/birthPlace)), limit to 10

* All books with a name in English starting with "http", ignoring case

* Calculate the GDP per capita of countries from `dbp:gdpNominal` and `dbo:populationTotal` (GDP/population) and compare it to the existing DBpedia property `dbp:gdpNominalPerCapita`

> **Search for functions in the specifications: https://www.w3.org/TR/sparql11-query**


# Public SPARQL endpoints 🔗

* Wikidata, facts powering Wikipedia infoboxes: https://query.wikidata.org/sparql
* Bio2RDF, linked data for the life sciences: https://bio2rdf.org/sparql
* Disgenet, gene-disease association: http://rdf.disgenet.org/sparql
* PathwayCommons, resource for biological pathways analysis: http://rdf.pathwaycommons.org/sparql

# Going further

* Wikidata SPARQL queries around the SARS-CoV-2 virus and pandemic: https://egonw.github.io/SARS-CoV-2-Queries
* Use [prefix.cc](http://prefix.cc/) to resolve mysterious prefixes.