# Linked Data and Music Encodings
## SPARQL Exercises

This Jupyter Notebook is inspired by Bob DuCharme's description of the approach in his blog entry: 
- http://www.bobdc.com/blog/jupytersparql/ 

and the corresponding notebook 

- https://github.com/bobdc/misc/blob/master/JupyterSPARQL/Jupyter%20and%20SPARQL%20and%20Dort%20or%20Dordrecht.ipynb

It uses the `sparqlkernel` released by Paulo Villegas: https://github.com/paulovn/sparql-kernel.

## 1. Setup

Use some `magics` from the `sparqlkernel` to set up the endpoint, the requested result format, and the output display.
The `magics` don't have to be in a separate cell from the actual query, but setting things up separately is easier.

Possible music-related endpoints (alphabetically listed):


**DoReMus**
- Endpoint: [http://data.doremus.org/sparql]
- Website:  [https://www.doremus.org/] 


**JazzCats**
- Endpoint: [http://cdhr-linkeddata.anu.edu.au/jazzcats-sparql/sparql]
- Website:  [http://jazzcats.cdhr.anu.edu.au/]


**MIDI Linked Data**
- Endpoint: [http://virtuoso-midi.amp.ops.labs.vu.nl/sparql]
- Website:  [https://midi-ld.github.io/]


**MusicOWL**
- Endpoint: [http://linkeddata.uni-muenster.de:7200]
- Website:  [http://linkeddata.uni-muenster.de/musicportal/]


**RISM**
- Endpoint: [https://data.rism.info/sparql]
- Website:  [https://opac.rism.info/]


**SLICKMEM**
- Endpoint: [https://blazegraph.linkedmusic.org/blazegraph/namespace/SLICKMEM/sparql] 
- Website:  [https://slickmem.linkedmusic.org]


In [64]:
# Set the SPARQL endpoint
%endpoint http://cdhr-linkeddata.anu.edu.au/jazzcats-sparql/sparql

# Set the data format requested from the SPARQL endpoint (%format JSON | XML | N3 | any | default)
%format any

# Set the output rendering shape (%display raw | table [withtypes] | diagram [svg|png] [withliterals])
%display table

A short test to check whether the selected endpoint is running and the connection to it works.

**IMPORTANT: Use the `LIMIT` modifier in all of the following queries to avoid stressing the endpoint too much.**

In [20]:
SELECT * WHERE {
   ?s ?p ?o . 
 }
LIMIT 5

s,p,o
http://www.openlinksw.com/virtrdf-data-formats#default-iid,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
http://www.openlinksw.com/virtrdf-data-formats#default-iid-nullable,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
http://www.openlinksw.com/virtrdf-data-formats#default-iid-nonblank,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
http://www.openlinksw.com/virtrdf-data-formats#default-iid-nonblank-nullable,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
http://www.openlinksw.com/virtrdf-data-formats#default,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
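
Outside the notebook, the same connectivity test could be prepared in plain Python. The sketch below assumes that the endpoint accepts SPARQL 1.1 Protocol `GET` requests with a `query` parameter and can return JSON results. `build_request` and `run_query` are hypothetical helper names and not part of the `sparqlkernel`. Only the request is built here, since `run_query` needs network access.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the setup cell above.
ENDPOINT = "http://cdhr-linkeddata.anu.edu.au/jazzcats-sparql/sparql"

# The test query from above, always with a LIMIT.
TEST_QUERY = "SELECT * WHERE { ?s ?p ?o . } LIMIT 5"


def build_request(endpoint, query):
    """Build a GET request that carries the query in the `query` parameter.

    Hypothetical helper; assumes the endpoint can serve JSON results.
    """
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(endpoint, query, timeout=30):
    """Send the query and return the list of result bindings (needs network)."""
    request = build_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]


request = build_request(ENDPOINT, TEST_QUERY)
print(request.full_url)
```

Calling `run_query(ENDPOINT, TEST_QUERY)` should then return one dictionary per result row, provided the endpoint is reachable.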


## 2. SPARQL Queries

### 2.1 Find all Concepts/Classes of the dataset

#### Explanation:

The specification of the SPARQL 1.1 Query Language, along with general information about it, can be found here: https://www.w3.org/TR/sparql11-query/

`SELECT`    = one of SPARQL's query forms; returns all, or a subset of, the variables bound in a query pattern match.

`DISTINCT`  = ensure solutions in the sequence are unique, i.e. duplicates are merged into one result

`*`         = shortcut to return all bound variables (`?xyz`) as results, here the results bound in `?Concept`

`WHERE {…}` = so-called WHERE-clause, basic graph pattern to match against the data graph

`[]`        = blank node, see [4.1.4 Syntax for Blank Nodes](https://www.w3.org/TR/sparql11-query/#QSynBlankNodes)

`a`         = `rdf:type`

`?Concept`  = arbitrary variable name for any URI that matches the given pattern in the object position of a triple

`LIMIT`     = modifier to limit the result number (important!)

In [21]:
SELECT DISTINCT * 
WHERE {                 

    [] a ?Concept .     
    
}
LIMIT 50

Concept
http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
http://www.openlinksw.com/schemas/virtrdf#QuadStorage
http://www.openlinksw.com/schemas/virtrdf#array-of-QuadMap
http://www.openlinksw.com/schemas/virtrdf#QuadMap
http://www.openlinksw.com/schemas/virtrdf#array-of-QuadMapFormat
http://www.openlinksw.com/schemas/virtrdf#QuadMapValue
http://www.openlinksw.com/schemas/virtrdf#array-of-QuadMapATable
http://www.openlinksw.com/schemas/virtrdf#array-of-QuadMapColumn
http://www.openlinksw.com/schemas/virtrdf#QuadMapColumn
http://www.openlinksw.com/schemas/virtrdf#QuadMapFText
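
To make the pattern `[] a ?Concept` more tangible, here is a minimal sketch in plain Python that mimics it on a handful of toy triples. The triples and the helper name `distinct_concepts` are made up for illustration and are not taken from the endpoint; a real SPARQL engine evaluates the pattern very differently internally.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Toy triples (subject, predicate, object), invented for this example.
TRIPLES = [
    ("ex:abbey", RDF_TYPE, "foaf:Person"),
    ("ex:ahmad", RDF_TYPE, "foaf:Person"),
    ("ex:ahmad", "foaf:name", "Ahmad Jamal"),
    ("ex:rec1", RDF_TYPE, "mo:Recording"),
]


def distinct_concepts(triples, limit):
    """Mimic `SELECT DISTINCT * WHERE { [] a ?Concept } LIMIT n`."""
    concepts = []
    for _subject, predicate, obj in triples:
        # `[]` matches any subject, `a` restricts the predicate to rdf:type.
        if predicate == RDF_TYPE and obj not in concepts:  # DISTINCT
            concepts.append(obj)
    return concepts[:limit]  # LIMIT


concepts = distinct_concepts(TRIPLES, limit=50)
print(concepts)  # ['foaf:Person', 'mo:Recording']
```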


### 2.1.1 Get the number of these concepts/classes in the dataset

#### Explanation
`(COUNT(*) AS ?n)` = count the solutions in each group and bind the number to the variable `?n`

`GROUP BY`         = group the results by the given value/variable

`ORDER BY DESC`    = sort the results in descending order by the given value/variable (leave out `DESC` or use `ASC` for ascending order)

In [22]:
SELECT DISTINCT ?Concept (COUNT(*) AS ?n) 
WHERE {

    [] a ?Concept .

}
GROUP BY ?Concept
ORDER BY DESC(?n)
LIMIT 50

Concept,n
http://xmlns.com/foaf/0.1/Person,9348
http://jazzcats.cdhr.anu.edu.au/data/vocab/melody,2885
http://jazzcats.cdhr.anu.edu.au/data/vocab/beats,2342
http://jazzcats.cdhr.anu.edu.au/data/vocab/sections,839
http://purl.org/ontology/mo/Recording,691
http://purl.org/ontology/mo/Performance,231
http://purl.org/ontology/mo/Record,231
http://purl.org/ontology/mo/Sound,231
http://www.w3.org/2006/time#DateTimeInterval,221
http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat,130


### 2.1.2 Get the number of these concepts/classes in the dataset plus a sample of every class

#### Explanation

see also: https://twitter.com/kidehen/status/1129143936839294976 

`(SAMPLE(?s) AS ?sample)` = retrieve one arbitrary value of the variable `?s` from each group (not necessarily a random one) and bind it to the variable `?sample`

`COUNT(1)`                = here equivalent to `COUNT(*)`

`?s`                      = arbitrary variable name for any URI that matches the given pattern in the subject position of a triple

`FILTER(isIri(?s))`       = keep only those results in which the variable `?s` is bound to an IRI (this excludes blank nodes)

In [24]:
SELECT (SAMPLE(?s) AS ?sample) (COUNT(1) AS ?n) (?Concept AS ?entityType) 
WHERE {

    ?s a ?Concept .
    FILTER(isIri(?s)) .
}
GROUP BY ?Concept
ORDER BY DESC(?n)
LIMIT 50

sample,n,entityType
http://dbpedia.org/resource/Abbey_Lincoln,9348,http://xmlns.com/foaf/0.1/Person
http://jazzcats.cdhr.anu.edu.au/data/melody/11912,2885,http://jazzcats.cdhr.anu.edu.au/data/vocab/melody
http://jazzcats.cdhr.anu.edu.au/data/beats/19650,2342,http://jazzcats.cdhr.anu.edu.au/data/vocab/beats
http://jazzcats.cdhr.anu.edu.au/data/sections/21428,839,http://jazzcats.cdhr.anu.edu.au/data/vocab/sections
http://jazzcats.cdhr.anu.edu.au/data/recording/02c86ae8-a6a1-461c-b970-176ecbb7ef38,691,http://purl.org/ontology/mo/Recording
http://jazzcats.cdhr.anu.edu.au/data/sound/47b12e9d-8b23-4461-8c5e-5e088ec8e408,231,http://purl.org/ontology/mo/Sound
http://jazzcats.cdhr.anu.edu.au/data/performance/47b12e9d-8b23-4461-8c5e-5e088ec8e408,231,http://purl.org/ontology/mo/Performance
http://jazzcats.cdhr.anu.edu.au/data/record//47b12e9d-8b23-4461-8c5e-5e088ec8e408,231,http://purl.org/ontology/mo/Record
http://jazzcats.cdhr.anu.edu.au/1/01/93,221,http://www.w3.org/2006/time#DateTimeInterval
http://www.openlinksw.com/virtrdf-data-formats#default-iid,130,http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
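
How `SAMPLE`, `COUNT` and `FILTER(isIri(?s))` interact can be sketched in plain Python as follows. The toy triples, the helper name `sample_and_count`, and the convention that blank nodes are written with a `_:` prefix are assumptions for this illustration. The sketch always picks the first subject it sees as the sample, whereas a SPARQL engine is free to return any member of the group.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Toy triples, invented for this example; "_:b0" stands for a blank node.
TRIPLES = [
    ("ex:abbey", RDF_TYPE, "foaf:Person"),
    ("_:b0", RDF_TYPE, "foaf:Person"),
    ("ex:ahmad", RDF_TYPE, "foaf:Person"),
    ("ex:rec1", RDF_TYPE, "mo:Recording"),
]


def sample_and_count(triples):
    """Return (sample, n, entityType) rows, sorted by n in descending order."""
    groups = {}
    for subject, predicate, obj in triples:
        if predicate != RDF_TYPE:
            continue
        if subject.startswith("_:"):  # FILTER(isIri(?s)): skip blank nodes
            continue
        sample, n = groups.get(obj, (subject, 0))  # first subject seen = sample
        groups[obj] = (sample, n + 1)
    rows = [(sample, n, concept) for concept, (sample, n) in groups.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


rows = sample_and_count(TRIPLES)
print(rows)  # [('ex:abbey', 2, 'foaf:Person'), ('ex:rec1', 1, 'mo:Recording')]
```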


### 2.2 Find all persons of the dataset

Given there are any persons in the dataset at all, make sure that you use the correct entity type when you search for persons. For an overview of the entity types that are used in the dataset see 2.1.2.

#### Explanation:

`PREFIX foaf: <http://xmlns.com/foaf/0.1/>` = shortcut prefix for the [FOAF](http://xmlns.com/foaf/spec/) vocabulary and namespace

`SELECT DISTINCT ?s` = return all unique values bound in `?s` as results

`foaf:Person`        = concept of Person from the [FOAF](http://xmlns.com/foaf/spec/) vocabulary (replace with the corresponding Person concept according to your dataset)

In [28]:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT DISTINCT ?s 
WHERE {
    
    ?s a foaf:Person .
} 
LIMIT 50

s
http://dbpedia.org/resource/Abbey_Lincoln
http://dbpedia.org/resource/Abdullah_Ibrahim
http://dbpedia.org/resource/Adele_Girard
http://dbpedia.org/resource/Ahmad_Jamal
http://dbpedia.org/resource/Al_Casey_(jazz_guitarist)
http://dbpedia.org/resource/Al_Cohn
http://dbpedia.org/resource/Al_Foster
http://dbpedia.org/resource/Al_Grey
http://dbpedia.org/resource/Al_Haig
http://dbpedia.org/resource/Al_Hall_(musician)


### 2.2.1 Count all persons in the dataset

Given there are any persons in the dataset at all, make sure that you use the correct entity type when you search for persons. For an overview of the entity types that are used in the dataset see 2.1.2.

#### Explanation:

see 2.1.1

`BIND(foaf:Person as ?o )` = bind `foaf:Person` to the variable `?o`

In [32]:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT  ?o (COUNT(?s) AS ?sCount)
WHERE {
    
        ?s a foaf:Person .
        BIND(foaf:Person as ?o )
}
GROUP BY ?o
ORDER BY DESC (?sCount)
LIMIT 50

o,sCount
http://xmlns.com/foaf/0.1/Person,9348


### 2.2.2 Find all properties assigned to the Person concept (as subject) in the dataset

Given there are any persons in the dataset at all, make sure that you use the correct entity type when you search for persons. For an overview of the entity types that are used in the dataset see 2.1.2.

#### Explanation:

`?p` = arbitrary variable name for any URI that matches the given pattern in the predicate position of a triple

In [33]:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT DISTINCT ?p
WHERE {
    
        ?s a foaf:Person .
        ?s ?p ?o .

}
ORDER BY ?p
LIMIT 50

p
http://dbpedia.org/ontology/birthDate
http://dbpedia.org/ontology/birthPlace
http://dbpedia.org/ontology/deathDate
http://dbpedia.org/ontology/deathPlace
http://dbpedia.org/ontology/thumbnail
http://linkedjazz.org/ontology/bandLeaderOf
http://linkedjazz.org/ontology/bandmember
http://linkedjazz.org/ontology/inBandTogether
http://linkedjazz.org/ontology/playedTogether
http://linkedjazz.org/ontology/touredWith


### 2.2.3 How often are the properties assigned to the Person concept (as subject) used in the dataset

Given there are any persons in the dataset at all, make sure that you use the correct entity type when you search for persons. For an overview of the entity types that are used in the dataset see 2.1.2.

#### Explanation:

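`?p` = arbitrary variable name for any URI that matches the given pattern in the predicate position of a triple (see 2.2.2)

`(COUNT(?p) AS ?pCount)` = count how often each property `?p` occurs within its group and bind the number to the variable `?pCount`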

In [35]:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?p (COUNT(?p) AS ?pCount) 
## add `(SAMPLE(?o) AS ?sample)` to get a sample of objects
WHERE
{
      ?s a foaf:Person .
      ?s ?p ?o .
}
GROUP BY ?p
ORDER BY DESC (?pCount)
LIMIT 50


p,pCount
http://xmlns.com/foaf/0.1/name,9739
http://www.w3.org/1999/02/22-rdf-syntax-ns#type,9480
http://xmlns.com/foaf/0.1/givenName,8444
http://xmlns.com/foaf/0.1/surname,8444
http://dbpedia.org/ontology/birthDate,7156
http://dbpedia.org/ontology/birthPlace,5988
http://purl.org/vocab/relationship/knowsOf,3400
http://dbpedia.org/ontology/deathDate,3070
http://purl.org/dc/elements/1.1/description,2767
http://dbpedia.org/ontology/deathPlace,2164
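
The two triple patterns in this query are joined on `?s`: the first selects the persons, the second collects every triple that has one of these persons as subject. A minimal Python sketch of that join on toy data follows; the triples and the helper name `property_usage` are invented for illustration. It also shows one possible reason why `rdf:type` can be counted more often than there are persons: a person may carry more than one type.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Toy triples, invented for this example.
TRIPLES = [
    ("ex:abbey", RDF_TYPE, "foaf:Person"),
    ("ex:abbey", RDF_TYPE, "ex:Singer"),
    ("ex:abbey", "foaf:name", "Abbey Lincoln"),
    ("ex:ahmad", RDF_TYPE, "foaf:Person"),
    ("ex:ahmad", "foaf:name", "Ahmad Jamal"),
    ("ex:rec1", RDF_TYPE, "mo:Recording"),
    ("ex:rec1", "dc:title", "Body and Soul"),
]


def property_usage(triples, concept, limit):
    """Count properties used on subjects that are typed with `concept`."""
    # Pattern 1: ?s a <concept>
    subjects = {s for s, p, o in triples if p == RDF_TYPE and o == concept}
    # Pattern 2: ?s ?p ?o, joined with pattern 1 on ?s, grouped by ?p
    counts = Counter(p for s, p, _o in triples if s in subjects)
    return counts.most_common(limit)  # ORDER BY DESC(?pCount) LIMIT n


usage = property_usage(TRIPLES, "foaf:Person", limit=50)
print(usage)
```

In this toy data, `rdf:type` is counted three times for two persons, because `ex:abbey` has a second type.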


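Test: the property count from 2.2.3, applied to `mo:Performance` instead of `foaf:Person`, with a sample object for each property.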

In [53]:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX mo: <http://purl.org/ontology/mo/>

SELECT ?p (COUNT(?p) AS ?pCount) (SAMPLE(?o) AS ?sample)
WHERE
{
      ?s a mo:Performance .
      ?s ?p ?o .
}
GROUP BY ?p
ORDER BY DESC (?pCount)
LIMIT 50

p,pCount,sample
http://purl.org/ontology/mo/performer,646,http://jazzcats.cdhr.anu.edu.au/data/person/George_Cables
http://purl.org/ontology/mo/instrument,410,http://jazzcats.cdhr.anu.edu.au/data/instrument/alto_saxophone
http://purl.org/ontology/mo/performance_of,231,http://jazzcats.cdhr.anu.edu.au/data/work/Body_and_Soul
http://purl.org/ontology/mo/produced_sound,231,http://jazzcats.cdhr.anu.edu.au/data/sound/47b12e9d-8b23-4461-8c5e-5e088ec8e408
http://www.w3.org/1999/02/22-rdf-syntax-ns#type,231,http://purl.org/ontology/mo/Performance


In [63]:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX mo: <http://purl.org/ontology/mo/>

SELECT DISTINCT ?instrumentLabel (COUNT(?instrument) AS ?n)
WHERE {

    ?s a foaf:Person ; 
       mo:performed ?performance .
    ?performance  mo:performance_of ?work ;
                  mo:instrument ?instrument .
    ?instrument rdfs:label ?instrumentLabel .
    
    
}
GROUP BY ?instrumentLabel
ORDER BY DESC (?n)
LIMIT 50


instrumentLabel,n
tenor saxophone,990
piano,897
drums,528
bass,525
vocals,525
alto saxophone,357
trumpet,282
big band,150
guitar,135
trombone,84
