Building a knowledge graph of artists and artworks of the Irish Museum of Modern Art (IMMA)

SPARQL-Anything/showcase-imma

Construct a knowledge graph of artists and artworks of the IMMA museum website

This showcase demonstrates the use of SPARQL Anything for constructing a Knowledge Graph from data encoded in HTML pages.

In what follows, fx refers to the following command line:

java -jar sparql-anything-<version>.jar
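One way to make fx usable exactly as typed in the commands of this README is a small shell function. This is a sketch rather than part of the showcase: the SPARQL_ANYTHING_JAR variable and its default file name are assumptions, so point it at the jar you actually downloaded.

```shell
# Path to the SPARQL Anything jar (assumed name; replace with your download).
SPARQL_ANYTHING_JAR="${SPARQL_ANYTHING_JAR:-sparql-anything.jar}"

# Forward all arguments to the jar, so that `fx -q ...` behaves like
# `java -jar <jar> -q ...`.
fx() {
  java -jar "$SPARQL_ANYTHING_JAR" "$@"
}
```

Placing these lines in the current shell session (or in a shell startup file) is enough; nothing is executed until fx is called.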

Knowledge graph construction pipeline

Step 1: list artists from the catalogue

This query extracts the list of artists from the Web page and builds an XML result set with the variables ?artistNickname and ?artistUrl. The next query uses this SPARQL result set file to iterate over the artists' pages.

Title: Step 1: list artists from the catalogue
Query: queries/imma-artists.sparql
Input: https://imma.ie/artists/
Output: imma-artists.xml
Type: SELECT
Options: html.selector=#az-group
Formats: HTML
Level: Novice

Run the example as follows:

fx -q queries/imma-artists.sparql -o imma-artists.xml -f xml
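To give an idea of what the result set file contains, the sketch below writes a one-row sample and counts its rows. It assumes the file follows the W3C SPARQL Query Results XML Format. The sample file name is hypothetical, the row reuses the artist that appears later in this README, and binding artistUrl as a literal (rather than a URI) is an assumption.

```shell
# Illustrative sample only: a one-row result set with the two variables
# produced by Step 1. Not the real output of the query.
cat > sample-artists.xml <<'EOF'
<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="artistNickname"/>
    <variable name="artistUrl"/>
  </head>
  <results>
    <result>
      <binding name="artistNickname"><literal>lambert-gene</literal></binding>
      <binding name="artistUrl"><literal>https://imma.ie/artists/gene-lambert/</literal></binding>
    </result>
  </results>
</sparql>
EOF

# Each <result> element is one row, that is, one artist for the next step.
grep -c '<result>' sample-artists.xml
```

Running the same grep on imma-artists.xml gives a rough count of the artists found, provided each result element starts on its own line.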

Step 2: iterate over artists' web pages and create a JSON-LD for each one of them

In this step we use a parametrized query that reads an artist's web page and extracts relevant metadata. The query is executed once for each row of the SPARQL result set file produced in the previous step. Each execution generates one JSON-LD file, named after the artist nickname (one of the values provided by the result set). Crucially, the JSON-LD files produced include links to the web pages of the related artworks.

Title: Step 2: iterate over artists' web pages and create a JSON-LD for each one of them
Query: queries/imma-artist.sparql
Input: imma-artists.xml, ?_artistUrl
Output: artists/*.jsonld
Type: CONSTRUCT
Options:
Formats: HTML
Level: Novice

Run the example as follows:

fx -q queries/imma-artist.sparql -i imma-artists.xml -p "artists/?artistNickname.jsonld" -f json
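To make the file naming concrete, the snippet below imitates how the output pattern given with -p resolves for a single row. It uses plain text substitution and is not how SPARQL Anything implements it; resolve_pattern is a hypothetical helper, and the nickname is the example value used later in this README.

```shell
# Replace the placeholder ?<variable> in a pattern with a value.
# $1 = pattern, $2 = variable name, $3 = value.
# Sketch only: values containing '|' or '&' would confuse sed.
resolve_pattern() {
  printf '%s\n' "$1" | sed "s|?$2|$3|g"
}

resolve_pattern 'artists/?artistNickname.jsonld' artistNickname lambert-gene
# prints: artists/lambert-gene.jsonld
```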

Step 3: Generate the list of artworks

Next, we extract the list of artworks' Web pages from the artists' JSON-LD files. This is straightforward: the command-line option -l loads the files into an in-memory dataset, which we can then query directly.

Title: Step 3: Generate the list of artworks
Query: queries/imma-artworks.sparql
Input: artists/
Output: imma-artworks.xml
Type: SELECT
Options: -l
Formats:
Level: Novice

Run the example as follows:

fx -q queries/imma-artworks.sparql -l artists/ -o imma-artworks.xml -f xml
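Since this step reads whatever Step 2 wrote into artists/, a quick check that the folder actually contains JSON-LD files can help explain an unexpectedly empty result. A minimal sketch; count_jsonld is a hypothetical helper, not a SPARQL Anything feature.

```shell
# Count regular files ending in .jsonld under the folder given as $1.
count_jsonld() {
  find "$1" -type f -name '*.jsonld' | wc -l | tr -d ' '
}

if [ -d artists ] && [ "$(count_jsonld artists)" -gt 0 ]; then
  echo "artists/ contains JSON-LD files"
else
  echo "artists/ has no .jsonld files yet; run Step 2 first"
fi
```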

Step 4: iterate over artworks' web pages and create a JSON-LD for each one of them

Next, we extract data from each artwork's Web page and build one JSON-LD file per artwork. Create the folder 'artworks' first.

Title: Step 4: iterate over artworks' web pages and create a JSON-LD for each one of them
Query: queries/imma-artwork.sparql
Input: imma-artworks.xml, ?_artworkUrl
Output: artworks/*.jsonld
Type: CONSTRUCT
Options:
Formats:
Level: Novice

Run the example as follows:

fx -q queries/imma-artwork.sparql -i imma-artworks.xml -p "artworks/?artworkNickname.jsonld" -f json
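The four steps can also be chained in a single script. The sketch below only wraps the commands listed in this README. run_pipeline is a hypothetical name, fx is assumed to be available as a command, and creating the artists folder up front is a precaution rather than a documented requirement.

```shell
# Run the four pipeline steps in order, stopping at the first failure.
# Assumes the working directory is the repository root.
run_pipeline() {
  # Output folders: 'artworks' must exist beforehand; 'artists' is
  # created as well as a precaution (assumption).
  mkdir -p artists artworks || return 1

  # Step 1: list artists.
  fx -q queries/imma-artists.sparql -o imma-artists.xml -f xml || return 1
  # Step 2: one JSON-LD file per artist.
  fx -q queries/imma-artist.sparql -i imma-artists.xml \
     -p "artists/?artistNickname.jsonld" -f json || return 1
  # Step 3: list artworks from the artists' JSON-LD files.
  fx -q queries/imma-artworks.sparql -l artists/ \
     -o imma-artworks.xml -f xml || return 1
  # Step 4: one JSON-LD file per artwork.
  fx -q queries/imma-artwork.sparql -i imma-artworks.xml \
     -p "artworks/?artworkNickname.jsonld" -f json || return 1
}
```

Call run_pipeline from the repository root once fx is defined. Each step depends on the output of the previous one, which is why the function stops as soon as a command fails.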

Finally, we can load the files into our favourite triple store.

Extract single artists / artworks

The following commands process a single artist or artwork. They also showcase the CLI option -v, which passes parameter values to the query.

Extract data from a specific artist Web page:

fx -q queries/imma-artist.sparql -v artistNickname=lambert-gene -v artistUrl=https://imma.ie/artists/gene-lambert/ -p "artists/?artistNickname.jsonld" -f json

Extract data from a specific artwork Web page:

fx -q queries/imma-artwork.sparql -v artworkNickname=naturaleza-desde-la-ventana -v artworkUrl=https://imma.ie/collection/naturaleza-desde-la-ventana/ -p "artworks/?artworkNickname.jsonld" -f json
fx -q queries/imma-artwork.sparql -v artworkNickname=berry-dress -v artworkUrl=https://imma.ie/collection/berry-dress/ -p "artworks/?artworkNickname.jsonld" -f json
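When extracting several individual artworks, the repeated -v arguments can be wrapped in a helper. This is a sketch: extract_artwork is a hypothetical name, fx is assumed to be available as a command, and the helper assumes the nickname equals the last path segment of the collection URL. That holds for the two artwork examples here but not for the artist example, where the nickname (lambert-gene) differs from the URL segment (gene-lambert).

```shell
# Extract one artwork given its slug, e.g. `extract_artwork berry-dress`.
# Assumption: nickname == last path segment of the collection URL.
extract_artwork() {
  slug="$1"
  fx -q queries/imma-artwork.sparql \
     -v "artworkNickname=$slug" \
     -v "artworkUrl=https://imma.ie/collection/$slug/" \
     -p "artworks/?artworkNickname.jsonld" -f json
}
```

With this in place, `extract_artwork berry-dress` issues the same command as the second example.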
