# Jupyter and SPARQL and Dort or Dordrecht

*Bob DuCharme*

![Dort or Dordrecht painting](https://upload.wikimedia.org/wikipedia/commons/thumb/0/08/DortorDordrecht.jpg/300px-DortorDordrecht.jpg)

This Jupyter notebook is a demo of the kinds of cool things you can do with the [sparql-kernel](https://github.com/paulovn/sparql-kernel) Jupyter module. As the module's readme file tells you, you can switch between different target endpoints within the same notebook. You can also specify, for each query, the format in which you want to retrieve the data.

To play with this, we'll retrieve data from two different SPARQL endpoints about J. M. W. Turner's painting "Dort or Dordrecht: The Dort packet-boat from Rotterdam becalmed". 

Our first query asks the SPARQL endpoint at the Yale Center for British Art which resources have the phrase "Dort or Dordrecht" in their labels. We start with a special Jupyter instruction known as a "magic" to identify the first endpoint where we'll send some queries. The sparql-kernel module offers several options for specifying the return format; the safest bet seems to be putting `any` as the default. There is also a choice of display formats for retrieved data, and while the defaults usually work fine, I got in the habit of setting explicit values (in the case below, `table`) to reduce the chance of a mess if I'm skipping around among the notebook's cells and running them out of order.

In [1]:
%endpoint http://collection.britishart.yale.edu/openrdf-sesame/repositories/ycba
%format any
%display table
# The "magics" don't have to be in a separate cell from the actual query. 

In [2]:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> 
SELECT * WHERE {
   ?s rdfs:label ?label . 
   FILTER CONTAINS(?label,"Dort or Dordrecht")
}

s,label
http://collection.britishart.yale.edu/id/object/34,Dort or Dordrecht: The Dort Packet-Boat from Rotterdam Becalmed
http://collection.britishart.yale.edu/id/object/34/title/1,Dort or Dordrecht: The Dort Packet-Boat from Rotterdam Becalmed
http://collection.britishart.yale.edu/id/object/34/title/2,Dort or Dordrecht: The Dort Packet-Boat from Rotterdam Becalmed
http://collection.britishart.yale.edu/id/bibliography/844,"[Yale University Press Advertisement ] Dort or Dordrecht:, The Dort Packed-boat from Rotterdam Becalmed, Art Journal (CAA), 37, no. 1, Autumn 1977, p. 81, Available online in JSTOR , hardcoopy also available : N81 A887 + OVERSIZE (HAAS)"
http://collection.britishart.yale.edu/id/bibliography/990,"Dort or Dordrecht: The Dort Packet-boat from Rotterdam Becalmed, Magazine Antiques, vol. 95, February 1969, p. 210, NK1125 A3 OVERSIZE (HAAS)"
http://collection.britishart.yale.edu/id/bibliography/993,"Dort or Dordrecht: The Dort Packet-boat from Rotterdam Becalmed, Magazine Antiques, vol. 111, May 1977, p. 864, NK1125 A3 (HAAS)"
http://collection.britishart.yale.edu/id/bibliography/994,"Dort or Dordrecht: The Dort Packet-boat from Rotterdam Becalmed, ART News, vol. 76, November 1977, p. 57, N1 A6 OVERSIZE (HAAS)"
http://collection.britishart.yale.edu/id/object/34,Dort or Dordrecht: The Dort packet-boat from Rotterdam becalmed
http://collection.britishart.yale.edu/id/object/34/title/1,Dort or Dordrecht: The Dort packet-boat from Rotterdam becalmed
http://collection.britishart.yale.edu/id/bibliography/844,"[Yale University Press Advertisement ] Dort or Dordrecht:, The Dort Packed-boat from Rotterdam Becalmed, Art Journal (CAA), 37, no. 1, Autumn 1977, p. 81, Available online in JSTOR , hardcoopy also available : N81 A887 + OVERSIZE (HAAS)"


It looks like we found the resource that has the label we're looking for: `http://collection.britishart.yale.edu/id/object/34`. 
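
Outside the notebook, the endpoint and return format chosen with the magics correspond roughly to the URL and the `Accept` header of an HTTP request. The following is a minimal Python sketch, assuming the endpoint accepts queries sent with GET and a `query` parameter as the SPARQL 1.1 Protocol describes. It is not how sparql-kernel itself is implemented, and the function name `build_select_request` is made up for illustration. It only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint and query copied from the cells above.
ENDPOINT = "http://collection.britishart.yale.edu/openrdf-sesame/repositories/ycba"
QUERY = """PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
   ?s rdfs:label ?label .
   FILTER CONTAINS(?label, "Dort or Dordrecht")
}"""


def build_select_request(endpoint, query, accept="application/sparql-results+json"):
    """Build (but do not send) a SPARQL Protocol GET request.

    Hypothetical helper: the default Accept value asks for JSON results,
    which is an assumption about what the endpoint can return.
    """
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": accept})


request = build_select_request(ENDPOINT, QUERY)
# To actually run the query: urllib.request.urlopen(request).read()
```

Passing a different `accept` value is the rough equivalent of changing the return format in the notebook.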

Next we'll retrieve all triples about that resource. Inside of a Jupyter Notebook, the results of a CONSTRUCT query in a table don't look much different from the results of a `SELECT ?s ?p ?o` query. If we set display to RAW, we'll get the actual triples and we can copy them and paste them into another application.

In [32]:
%endpoint http://collection.britishart.yale.edu/openrdf-sesame/repositories/ycba 
%display RAW

(The query has a lot of results. If you scroll below the cell with these results you'll see that we've only just gotten started in our exploration of data about the painting.)

In [33]:
CONSTRUCT {<http://collection.britishart.yale.edu/id/object/34> ?p ?o }
WHERE {<http://collection.britishart.yale.edu/id/object/34> ?p ?o}


<http://collection.britishart.yale.edu/id/object/34> <http://erlangen-crm.org/current/P62_depicts> <http://vocab.getty.edu/aat/300117546> .
<http://collection.britishart.yale.edu/id/object/34> <http://erlangen-crm.org/current/P70i_is_documented_in> <http://collection.britishart.yale.edu/id/bibliography/763> .
<http://collection.britishart.yale.edu/id/object/34> <http://collection.britishmuseum.org/id/ontology/PX_display_wrap> "Bibliograpic reference ::  Christie Advertisement, On Friday July 2, 1937  Important         Pictures by Old Masters, The Times (London), Tuesday, June 22, 1937, p. 29, Available         Online : Times Digital Archive  , Also available on  microfilm : An T482 (SML)" .
<http://collection.britishart.yale.edu/id/object/34> <http://collection.britishmuseum.org/id/ontology/PX_display_wrap> "Bibliograpic reference ::  Mark Hallett, The Great Spectacle: 250 years of the Royal         Academy's Summer Exhibition, Royal Academy of Arts, [London], p. 69, fig. 44, N5054    

Let's set up the notebook to send the next query to Wikidata:

In [34]:
%endpoint https://query.wikidata.org/sparql
%format any
%display table

Now we'll get the data about the painting from Wikidata. I didn't send queries to find out the painting's URI, like I did with the museum endpoint above; I just went to the painting's [Wikipedia page](https://en.wikipedia.org/wiki/Dort_or_Dordrecht:_The_Dort_packet-boat_from_Rotterdam_becalmed), picked "Wikidata Item" in the lower left, and added a prefix of `wd:` before the resulting URL's local name. Then I could use that to retrieve all of Wikidata's triples about the painting:

In [35]:
%display raw
CONSTRUCT { wd:Q22329821 ?p ?o } 
WHERE { wd:Q22329821 ?p ?o } 

<http://www.wikidata.org/entity/Q22329821> <http://www.w3.org/2000/01/rdf-schema#label> "Dort or Dordrecht: The Dort packet-boat from Rotterdam becalmed"@en .
<http://www.wikidata.org/entity/Q22329821> <http://schema.org/description> "pintura de J. M. W. Turner"@ast .
<http://www.wikidata.org/entity/Q22329821> <http://schema.org/description> "schilderij van Joseph Mallord William Turner"@nl .
<http://www.wikidata.org/entity/Q22329821> <http://schema.org/description> "pintura de Joseph Mallord William Turner"@gl .
<http://www.wikidata.org/entity/Q22329821> <http://www.wikidata.org/prop/direct/P186> <http://www.wikidata.org/entity/Q4259259> .
<http://www.wikidata.org/entity/Q22329821> <http://www.wikidata.org/prop/P4738> <http://www.wikidata.org/entity/statement/Q22329821-D2E35D46-6C8B-488B-B761-76FEB4BAD745> .
<http://www.wikidata.org/entity/Q22329821> <http://schema.org/description> "gem\u00E4lde von William Turner"@de .
<http://www.wikidata.org/entity/Q22329821> <http://www.wikidata.o
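
The step of turning the Wikidata item URL into a prefixed name such as `wd:Q22329821` can be sketched in a few lines of Python. This assumes the item URL ends with the Q identifier (for example `https://www.wikidata.org/wiki/Q22329821`); the function name `wikidata_prefixed_name` is hypothetical.

```python
from urllib.parse import urlparse


def wikidata_prefixed_name(item_url, prefix="wd"):
    """Turn a Wikidata item URL into a prefixed name like wd:Q22329821.

    Assumes the last path segment of the URL is the item's Q identifier.
    """
    local_name = urlparse(item_url).path.rstrip("/").rsplit("/", 1)[-1]
    if not (local_name.startswith("Q") and local_name[1:].isdigit()):
        raise ValueError(f"not a Wikidata item URL: {item_url!r}")
    return f"{prefix}:{local_name}"


painting = wikidata_prefixed_name("https://www.wikidata.org/wiki/Q22329821")
print(painting)  # wd:Q22329821
```

The same function also accepts the entity form of the URI that appears in the triples above, `http://www.wikidata.org/entity/Q22329821`.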

I can also connect to a locally running instance of Fuseki. The following is usually my first query with any endpoint, just to make sure I'm connecting to it properly. I did it with the endpoints above as well, but removed the cells after each successful test. That's the beauty of Jupyter: keep or refine the cells you want, delete the ones you don't.

In [36]:
%endpoint http://localhost:3030/myDataset/sparql
%format any
%display table

In [37]:
SELECT * WHERE {?s ?p ?o} LIMIT 5

s,p,o
http://learningsparql.com/ns/addressbook#richard,http://learningsparql.com/ns/addressbook#homeTel,(229) 276-5135
http://learningsparql.com/ns/addressbook#richard,http://learningsparql.com/ns/addressbook#email,richard49@hotmail.com
http://learningsparql.com/ns/addressbook#cindy,http://learningsparql.com/ns/addressbook#homeTel,(245) 646-5488
http://learningsparql.com/ns/addressbook#cindy,http://learningsparql.com/ns/addressbook#email,cindym@gmail.com
http://learningsparql.com/ns/addressbook#craig,http://learningsparql.com/ns/addressbook#homeTel,(194) 966-1505


Next I'll ask my local copy of Fuseki to use the SERVICE keyword to retrieve triples about the Turner painting from both the Yale Center for British Art and from Wikidata. This is the real fun of having multiple endpoints out there with data about related subjects—linking it all up.  

In [39]:
%display raw
PREFIX wd: <http://www.wikidata.org/entity/>

CONSTRUCT { ?paintingURI ?p ?o }
WHERE
{
  {
    SERVICE <http://collection.britishart.yale.edu/openrdf-sesame/repositories/ycba>
    {
      BIND (<http://collection.britishart.yale.edu/id/object/34> AS ?paintingURI)
      ?paintingURI ?p ?o .
    }
  }
  UNION
  {
    SERVICE <https://query.wikidata.org/sparql>
    {
      BIND (wd:Q22329821 AS ?paintingURI)
      ?paintingURI ?p ?o .
    }
  }
}

<http://collection.britishart.yale.edu/id/object/34> <http://collection.britishmuseum.org/id/ontology/PX_display_wrap> "Bibliograpic reference ::  Prof. David Bindman, The History of British art, Yale Center         for British Art, New Haven, Conn., 2008, pp. 208-09 (v.2), fig. 134, N6761 H57 2008         + Oversize (YCBA)" .
<http://collection.britishart.yale.edu/id/object/34> <http://erlangen-crm.org/current/P138i_has_representation> <http://deliver.odai.yale.edu/content/id/7b53a98f-4e91-469f-a925-5aad74fcd6b0/format/3> .
<http://www.wikidata.org/entity/Q22329821> <http://www.wikidata.org/prop/P276> <http://www.wikidata.org/entity/statement/Q22329821-E4A08816-9594-4054-BE77-923FF96CBB66> .
<http://collection.britishart.yale.edu/id/object/34> <http://erlangen-crm.org/current/P43_has_dimension> <http://collection.britishart.yale.edu/id/object/34/measurement/1/height> .
<http://collection.britishart.yale.edu/id/object/34> <http://collection.britishmuseum.org/id/ontology/PX_display_wrap
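
The federated query above follows a regular pattern: one `SERVICE` block per endpoint, joined with `UNION`. As a rough sketch of that pattern, the Python below generates the same shape of query from a list of endpoint and resource pairs. The function name `federated_construct` is made up for illustration, and full URIs are used instead of the `wd:` prefix so that no `PREFIX` declaration is needed.

```python
def federated_construct(sources):
    """Build a CONSTRUCT query with one SERVICE block per (endpoint, resource) pair."""
    blocks = []
    for endpoint, resource in sources:
        blocks.append(
            "  {\n"
            f"    SERVICE <{endpoint}> {{\n"
            f"      BIND (<{resource}> AS ?paintingURI)\n"
            "      ?paintingURI ?p ?o .\n"
            "    }\n"
            "  }"
        )
    # Join the per-endpoint blocks with UNION, as in the hand-written query.
    body = "\n  UNION\n".join(blocks)
    return "CONSTRUCT { ?paintingURI ?p ?o }\nWHERE {\n" + body + "\n}"


# Endpoints and resource URIs taken from the cells above.
sources = [
    ("http://collection.britishart.yale.edu/openrdf-sesame/repositories/ycba",
     "http://collection.britishart.yale.edu/id/object/34"),
    ("https://query.wikidata.org/sparql",
     "http://www.wikidata.org/entity/Q22329821"),
]
query = federated_construct(sources)
print(query)
```

Adding a third endpoint with data about the painting would then only mean adding another pair to `sources`.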

And there we have it: lots of data about the painting from two different endpoints in a format that can be pasted into a text editor and then read into a triplestore.
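
Before loading pasted results into a triplestore, it can be handy to sanity-check them. The following is a minimal sketch, assuming the pasted text is in the one-triple-per-line style shown above with IRI subjects and predicates. The regular expression is deliberately simplified (no blank nodes, for example) and is not a full N-Triples parser; the name `parse_triple_lines` is hypothetical. Lines that do not match, such as one cut off mid-URI, are skipped.

```python
import re

# Simplified pattern: <subject> <predicate> object .
# The object is either an <IRI> or a quoted literal with an optional
# language tag or datatype. Blank nodes are not handled.
TRIPLE = re.compile(
    r'^<([^>]*)>\s+<([^>]*)>\s+'
    r'(<[^>]*>|"(?:[^"\\]|\\.)*"(?:@[A-Za-z0-9-]+|\^\^<[^>]*>)?)\s*\.\s*$'
)


def parse_triple_lines(text):
    """Return (subject, predicate, object) tuples for lines that match.

    Subject and predicate come back without angle brackets; the object
    keeps its brackets or quotes so IRIs and literals stay distinguishable.
    """
    triples = []
    for line in text.splitlines():
        match = TRIPLE.match(line.strip())
        if match:
            triples.append(match.groups())
    return triples


# Two complete lines and one cut-off line, copied from the output above.
sample = '''<http://www.wikidata.org/entity/Q22329821> <http://www.w3.org/2000/01/rdf-schema#label> "Dort or Dordrecht: The Dort packet-boat from Rotterdam becalmed"@en .
<http://www.wikidata.org/entity/Q22329821> <http://www.wikidata.org/prop/direct/P186> <http://www.wikidata.org/entity/Q4259259> .
<http://www.wikidata.org/entity/Q22329821> <http://www.wikidata.o'''

triples = parse_triple_lines(sample)
print(len(triples))  # 2
```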