# 2. Querying RDF - SPARQL

This module applies simple SPARQL queries to simple data.

Some pre-built Python functions are used to lodge the query and present neat results:

* `query()` - the [kurra]() toolkit's general-purpose query function, which works with inline RDF data or databases
* `table_print()` - a function that lets Jupyter Notebooks render a SPARQL query result neatly as Markdown

---

## 2.1. Running a basic query

The Turtle data from Notebook 1 was:

```turtle
PREFIX people: <https://linked.data.gov.au/dataset/people/>
PREFIX schema: <https://schema.org/>

people:nick
    a
        schema:Person ,
        schema:Patient ;
    schema:name "Nick" ;
    schema:age 42 ;
    schema:parent people:george ;
.

people:george
    a schema:Person ;
    schema:name "George" ;
    schema:age 70 ;
.
```

Here there are two people, `people:nick` and `people:george`. To find all the people with age greater than 50 (just George), we can query the data like this:

```sparql
PREFIX people: <https://linked.data.gov.au/dataset/people/>
PREFIX schema: <https://schema.org/>

SELECT ?p
WHERE {
    ?p
        a schema:Person ;
        schema:age ?age ;
    .

    FILTER (?age > 50)
}
```

This part matches a "subgraph":

```
    ?p
        a schema:Person ;
        schema:age ?age ;
    .
```

where `?p` & `?age` are variables and `a`, `schema:Person` & `schema:age` are all fixed values.

This part filters all the matched subgraphs:

```
FILTER (?age > 50)
```
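To make the two steps concrete, here is a minimal pure-Python sketch of "match a subgraph, then filter". It is only an illustration: the tuple-based `triples` list and the `match_people_with_age()` helper are hypothetical stand-ins for this notebook's data and are not how a real SPARQL engine or kurra is implemented.

```python
# Illustrative only: a tiny imitation of SPARQL pattern matching.
# Prefixed names are kept as plain strings; "a" stands for rdf:type.
triples = [
    ("people:nick", "a", "schema:Person"),
    ("people:nick", "a", "schema:Patient"),
    ("people:nick", "schema:name", "Nick"),
    ("people:nick", "schema:age", 42),
    ("people:nick", "schema:parent", "people:george"),
    ("people:george", "a", "schema:Person"),
    ("people:george", "schema:name", "George"),
    ("people:george", "schema:age", 70),
]


def match_people_with_age(data):
    """Return one {"p": ..., "age": ...} binding per matched subgraph."""
    bindings = []
    for s, p, o in data:
        # first triple pattern: ?p a schema:Person
        if p == "a" and o == "schema:Person":
            # second triple pattern: ?p schema:age ?age (same ?p)
            for s2, p2, o2 in data:
                if s2 == s and p2 == "schema:age":
                    bindings.append({"p": s, "age": o2})
    return bindings


# step 1: match the subgraph pattern against the data
matched = match_people_with_age(triples)

# step 2: FILTER (?age > 50) keeps only the bindings that pass the test
filtered = [b for b in matched if b["age"] > 50]

print(matched)
print(filtered)
```

Both Nick and George match the pattern, since each is a `schema:Person` with a `schema:age`; only George's binding survives the filter.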

Let's actually run this, also selecting each person's name:

```python
# importing some things we need
from IPython.display import display, Markdown
from kurra.sparql import query
from kurra.utils import render_sparql_result

# a pretty table printing function
def table_print(r):
    display(Markdown(render_sparql_result(r)))


# our data, in Turtle format
rdf_data = """
PREFIX people: <https://linked.data.gov.au/dataset/people/>
PREFIX schema: <https://schema.org/>

people:nick
    a
        schema:Person ,
        schema:Patient ;
    schema:name "Nick" ;
    schema:age 42 ;
    schema:parent people:george ;
.

people:george
    a schema:Person ;
    schema:name "George" ;
    schema:age 70 ;
.
"""

# our SPARQL query
q = """
PREFIX people: <https://linked.data.gov.au/dataset/people/>
PREFIX schema: <https://schema.org/>

SELECT ?p ?name
WHERE {
    ?p
        a schema:Person ;
        schema:name ?name ;
        schema:age ?age ;
    .

    FILTER (?age > 50)
}
"""
```

```python
# run the query on the data
r = query(rdf_data, q)
table_print(r)
```

If the data contained two people older than 50 and we wanted their names and ages:

```python
rdf_data2 = """
PREFIX ex: <http://example.com/>
PREFIX people: <https://linked.data.gov.au/dataset/people/>
PREFIX schema: <https://schema.org/>

people:nick
    a
        schema:Person ,
        schema:Patient ;
    schema:name "Nick" ;
    schema:age 42 ;
    schema:parent people:george ;
.

people:george
    a schema:Person ;
    schema:name "George" ;
    schema:age 70 ;
    schema:gender ex:male ;
    schema:spouse people:cathy ;  # NEW
.

people:cathy
    a schema:Person ;
    schema:name "Cathy" ;
    schema:gender ex:female ;
    schema:age 68 ;
    schema:spouse people:george ; # symmetrical
.
"""

q2 = """
PREFIX people: <https://linked.data.gov.au/dataset/people/>
PREFIX schema: <https://schema.org/>

SELECT ?name ?age
WHERE {
    ?p
        a schema:Person ;
        schema:name ?name ;
        schema:age ?age ;
    .

    FILTER (?age > 50)
}
"""

r = query(rdf_data2, q2)
table_print(r)
```

> **&#9432;** Have a think about how the query is matching elements in the RDF graph and what could be done to reduce the results from 2 to 1 by either FILTERing on additional criteria or matching on additional nodes.

## 2.2. Running a query on a DB

This section runs a query for 10 Concepts within the _Seabed geomorphology - Part 1 Morphology_ vocabulary on the KurrawongAI demo server.

SPARQL consists not only of a query language, but also:

* **update extension** - how to perform write queries and manipulate whole datasets
* **service description** - how to describe the capabilities of a SPARQL DB
* **protocol** - defining how queries are to be sent to remote servers
* **several results formats**

Each SPARQL specification document links to these related documents; see <https://www.w3.org/TR/sparql12-query/#related>

Having a defined _protocol_ allows us to know how to interact with any DB claiming to conform to SPARQL.

We do this via HTTP (web) requests.
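As a rough sketch of what such a request looks like, the SPARQL Protocol's "query via GET" operation sends the query text in a URL-encoded `query` parameter, and an `Accept` header can ask for a particular results format. The endpoint URL and the `build_get_request()` helper below are hypothetical and only for illustration; the request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint URL, used only for illustration
ENDPOINT = "https://example.org/sparql"

sparql = "SELECT * WHERE { ?s ?p ?o } LIMIT 1"


def build_get_request(endpoint, query_text):
    """Build (but do not send) a SPARQL Protocol 'query via GET' request."""
    # the query text travels in the URL-encoded "query" parameter
    url = endpoint + "?" + urlencode({"query": query_text})
    # ask the server for the SPARQL JSON results format
    return Request(url, headers={"Accept": "application/sparql-results+json"})


req = build_get_request(ENDPOINT, sparql)
print(req.get_method())
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen()` would actually send it. Because the protocol is standardised, the same request shape should work against any conforming endpoint.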

Let's just get on and perform a SPARQL query using a publicly available "SPARQL Endpoint":

```python
# importing some Python things we need
from IPython.display import display, Markdown
from kurra.sparql import query
from kurra.utils import render_sparql_result

# a pretty table printing function
def table_print(r):
    display(Markdown(render_sparql_result(r)))

# a simple query to list terms in a vocabulary - Seabed geomorphology - Part 1 Morphology
q = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?c ?pl
WHERE {
  ?c
    a skos:Concept ;
    skos:inScheme <https://pid.geoscience.gov.au/def/voc/ga/SeabedGeomorphologyMorphology> ;
    skos:prefLabel ?pl ;
  .
}
ORDER BY ?pl
LIMIT 10
"""

# run the query against the DB
r = query("https://prez.niceforest-128e6d31.australiaeast.azurecontainerapps.io/sparql", q)

# pretty-print the result
table_print(r)
```

| c | pl |
| --- | --- |
| [apron](https://pid.geoscience.gov.au/def/voc/ga/SeabedGeomorphologyMorphology/apron) | Apron |
| [bank](https://pid.geoscience.gov.au/def/voc/ga/SeabedGeomorphologyMorphology/bank) | Bank |
| [block](https://pid.geoscience.gov.au/def/voc/ga/SeabedGeomorphologyMorphology/block) | Block |
| [break_in_slope](https://pid.geoscience.gov.au/def/voc/ga/SeabedGeomorphologyMorphology/break_in_slope) | Break in Slope |
| [canyon](https://pid.geoscience.gov.au/def/voc/ga/SeabedGeomorphologyMorphology/canyon) | Canyon |
| [centreline](https://pid.geoscience.gov.au/def/voc/ga/SeabedGeomorphologyMorphology/centreline) | Centreline |
| [channel](https://pid.geoscience.gov.au/def/voc/ga/SeabedGeomorphologyMorphology/channel) | Channel |
| [cone](https://pid.geoscience.gov.au/def/voc/ga/SeabedGeomorphologyMorphology/cone) | Cone |
| [crest](https://pid.geoscience.gov.au/def/voc/ga/SeabedGeomorphologyMorphology/crest) | Crest |
| [crevice](https://pid.geoscience.gov.au/def/voc/ga/SeabedGeomorphologyMorphology/crevice) | Crevice |
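One of the standard SPARQL results formats is SPARQL Query Results JSON, which lists the variable names under `head.vars` and one object per row under `results.bindings`. The sketch below shows how a result in that format could be turned into a Markdown table. The sample data and the `to_markdown()` helper are hypothetical, and this is not necessarily how kurra's `render_sparql_result()` works.

```python
import json

# A hand-written sample in the SPARQL JSON results format (not real server output)
sample = json.loads("""
{
  "head": {"vars": ["c", "pl"]},
  "results": {"bindings": [
    {"c": {"type": "uri", "value": "https://example.org/voc/apron"},
     "pl": {"type": "literal", "value": "Apron"}},
    {"c": {"type": "uri", "value": "https://example.org/voc/bank"},
     "pl": {"type": "literal", "value": "Bank"}}
  ]}
}
""")


def to_markdown(result):
    """Render a SPARQL JSON SELECT result as a Markdown table string."""
    cols = result["head"]["vars"]
    lines = [
        "| " + " | ".join(cols) + " |",
        "| " + " | ".join("---" for _ in cols) + " |",
    ]
    for binding in result["results"]["bindings"]:
        # a variable may be unbound in a row, so fall back to an empty cell
        cells = [binding.get(c, {}).get("value", "") for c in cols]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


print(to_markdown(sample))
```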


## 2.3. A SPARQL Endpoint UI

Now we will try this directly on the DB UI and talk through that interface: 

* <https://demo.dev.kurrawong.ai/sparql>