# 2. Querying RDF - SPARQL

## 2.1. Running a basic query

The Turtle data from Notebook 1 was:

```turtle
PREFIX people: <https://linked.data.gov.au/dataset/people/>
PREFIX schema: <https://schema.org/>

people:nick
    a
        schema:Person ,
        schema:Patient ;
    schema:name "Nick" ;
    schema:age 42 ;
    schema:parent people:george ;
.

people:george
    a schema:Person ;
    schema:name "George" ;
    schema:age 70 ;
.
```

Here there are two people, `people:nick` and `people:george`. To find all the people with age greater than 50 (just George), we can query the data like this:

```sparql
PREFIX people: <https://linked.data.gov.au/dataset/people/>
PREFIX schema: <https://schema.org/>

SELECT ?p
WHERE {
    ?p
        a schema:Person ;
        schema:age ?age ;
    .

    FILTER (?age > 50)
}
```

This part matches a "subgraph":

```sparql
    ?p
        a schema:Person ;
        schema:age ?age ;
    .
```

where `?p` and `?age` are variables and `a`, `schema:Person` and `schema:age` are all fixed values.

This part filters all the matched subgraphs:

```FILTER (?age > 50)```
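To make the two steps concrete, here is a minimal plain-Python sketch of "match a subgraph, then filter". It is only an illustration: the tuple-based `triples` list and the helper names `match_people_with_age` and `filter_older_than` are made up for this example, and this is not how a SPARQL engine is actually implemented.

```python
# The triples from the Turtle data above, written as (subject, predicate, object).
# "a" is Turtle/SPARQL shorthand for rdf:type.
triples = [
    ("people:nick", "a", "schema:Person"),
    ("people:nick", "a", "schema:Patient"),
    ("people:nick", "schema:name", "Nick"),
    ("people:nick", "schema:age", 42),
    ("people:nick", "schema:parent", "people:george"),
    ("people:george", "a", "schema:Person"),
    ("people:george", "schema:name", "George"),
    ("people:george", "schema:age", 70),
]


def match_people_with_age(triples):
    """Step 1: bind ?p and ?age for every schema:Person that has a schema:age."""
    persons = {s for s, p, o in triples if p == "a" and o == "schema:Person"}
    return [
        {"p": s, "age": o}
        for s, p, o in triples
        if p == "schema:age" and s in persons
    ]


def filter_older_than(bindings, min_age):
    """Step 2: keep only the bindings that pass FILTER (?age > min_age)."""
    return [b for b in bindings if b["age"] > min_age]


matches = match_people_with_age(triples)  # one binding each for Nick and George
results = [b["p"] for b in filter_older_than(matches, 50)]
print(results)  # ['people:george']
```

Both people match the pattern, but only George survives the filter, which is the same result the SPARQL query should give.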

Now let's run the SPARQL query for real, this time also selecting each person's name:


In [1]:
# importing some things we need
from IPython.display import display, Markdown
from kurra.sparql import query
from kurra.utils import render_sparql_result

# a pretty table printing function
def table_print(r):
    display(Markdown(render_sparql_result(r)))


# our data, in Turtle format
rdf_data = """
PREFIX people: <https://linked.data.gov.au/dataset/people/>
PREFIX schema: <https://schema.org/>

people:nick
    a
        schema:Person ,
        schema:Patient ;
    schema:name "Nick" ;
    schema:age 42 ;
    schema:parent people:george ;
.

people:george
    a schema:Person ;
    schema:name "George" ;
    schema:age 70 ;
.
"""

# our SPARQL query
q = """
PREFIX people: <https://linked.data.gov.au/dataset/people/>
PREFIX schema: <https://schema.org/>

SELECT ?p ?name
WHERE {
    ?p
        a schema:Person ;
        schema:name ?name ;
        schema:age ?age ;
    .

    FILTER (?age > 50)
}
"""

In [2]:
# run the query on the data
r = query(rdf_data, q)
table_print(r)

| p | name |
| --- | --- |
| [george](https://linked.data.gov.au/dataset/people/george) | George |


Now suppose the data contains two people older than 50 and we want their names and ages:

In [3]:
rdf_data2 = """
PREFIX people: <https://linked.data.gov.au/dataset/people/>
PREFIX schema: <https://schema.org/>

people:nick
    a
        schema:Person ,
        schema:Patient ;
    schema:name "Nick" ;
    schema:age 42 ;
    schema:parent people:george ;
.

people:george
    a schema:Person ;
    schema:name "George" ;
    schema:age 70 ;
    schema:spouse people:cathy ;  # NEW
.

people:cathy
    a schema:Person ;
    schema:name "Cathy" ;
    schema:age 68 ;
    schema:spouse people:george ; # symmetrical
.
"""

q2 = """
PREFIX people: <https://linked.data.gov.au/dataset/people/>
PREFIX schema: <https://schema.org/>

SELECT ?name ?age
WHERE {
    ?p
        a schema:Person ;
        schema:name ?name ;
        schema:age ?age ;
    .

    FILTER (?age > 50)
}
"""

r = query(rdf_data2, q2)
table_print(r)

| name | age |
| --- | --- |
| George | 70 |
| Cathy | 68 |


## 2.2. Running a query on a DB

Run a query for any 3 Addresses from the QALI training server. Without an `ORDER BY`, `LIMIT 3` returns whichever 3 results the server produces first, not a true random sample:

In [6]:
q = """
PREFIX addr: <https://linked.data.gov.au/def/addr/>

SELECT ?a
WHERE {
  GRAPH ?g {
    ?a a addr:Address ;
  }
}
LIMIT 3
"""

r = query("https://training.cam.kurrawong.ai/qali/", q)
table_print(r)

| a |
| --- |
| [65cb1e52-fc1d-5dee-a2d2-ea7882d12c7e](https://linked.data.gov.au/dataset/qld-addr/address/65cb1e52-fc1d-5dee-a2d2-ea7882d12c7e) |
| [2fd46078-88c0-5f30-b43e-d2908d9445b6](https://linked.data.gov.au/dataset/qld-addr/address/2fd46078-88c0-5f30-b43e-d2908d9445b6) |
| [7036d80a-ecb6-5ec9-86f0-5e5135934e04](https://linked.data.gov.au/dataset/qld-addr/address/7036d80a-ecb6-5ec9-86f0-5e5135934e04) |
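For readers curious what querying a remote SPARQL server can look like at the HTTP level, here is a hedged standard-library sketch. It builds a SPARQL 1.1 Protocol `GET` request and flattens a response in the standard SPARQL JSON results format. The helper names `build_sparql_request` and `bindings_to_rows` are invented for this example. Whether the training server accepts this exact request at this URL is an assumption, and nothing here describes how kurra's `query` function works internally. The request is built but not sent, and the `sample` response is hand-written from one row of the output above.

```python
import json
import urllib.parse
import urllib.request


def build_sparql_request(endpoint, sparql):
    """Build (but do not send) a SPARQL 1.1 Protocol GET request."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": sparql})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def bindings_to_rows(results_json):
    """Flatten a SPARQL JSON results document into a list of {variable: value} dicts."""
    doc = json.loads(results_json)
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in doc["results"]["bindings"]
    ]


q = """
PREFIX addr: <https://linked.data.gov.au/def/addr/>

SELECT ?a
WHERE {
  GRAPH ?g {
    ?a a addr:Address ;
  }
}
LIMIT 3
"""

# Assumption: the server speaks the SPARQL 1.1 Protocol at this URL.
request = build_sparql_request("https://training.cam.kurrawong.ai/qali/", q)

# Sending the request needs network access, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     rows = bindings_to_rows(response.read().decode("utf-8"))

# Hand-written sample in the standard JSON results format (not a captured response).
sample = json.dumps({
    "head": {"vars": ["a"]},
    "results": {"bindings": [
        {"a": {
            "type": "uri",
            "value": "https://linked.data.gov.au/dataset/qld-addr/address/"
                     "65cb1e52-fc1d-5dee-a2d2-ea7882d12c7e",
        }}
    ]},
})
rows = bindings_to_rows(sample)
print(rows[0]["a"])
```

Each row is a plain dictionary keyed by variable name, so `rows[0]["a"]` is the IRI of the first Address.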


## 2.3. A SPARQL Endpoint UI

Now we will run this query directly in the DB's web UI and talk through that interface:

* <https://training.cam.kurrawong.ai/>