## Artists from the Republic and Spanish Civil War

This example is based on the book *Artistas de la República*, which documents artists associated with the Republic and the Spanish Civil War. The SPARQL `VALUES` clause lets the query list the Wikidata identifiers of a selection of the artists described in the book.

This example shows how to use <a href="https://www.w3.org/TR/sparql11-query/">SPARQL</a> as a query language in Linked Open Data repositories.

### First, we initialise the SPARQLWrapper service with the SPARQL endpoint

In [30]:
from SPARQLWrapper import SPARQLWrapper

sparql = SPARQLWrapper("https://query.wikidata.org/sparql")

### Then we define our CONSTRUCT query to extract the metadata

In [60]:
sparql.setQuery("""
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>

CONSTRUCT {
    ?artist wdt:P31 ?type .
    ?artist wdt:P19 ?placeBirth . 
    ?placeBirth rdfs:label ?placeBirthLabel .
    ?artist wdt:P20 ?placeDeath . 
    ?placeDeath rdfs:label ?placeDeathLabel .
    ?artist wdt:P569 ?dateBirth .
    ?artist wdt:P570 ?dateDeath .
    ?artist wdt:P135 ?movement .
    ?artist rdfs:label ?artistLabel .
    ?artist wdt:P18 ?image .
    ?artist wdt:P21 ?sex .
} WHERE { 
    VALUES ?artist {
        wd:Q5593 wd:Q152384 wd:Q2447692 wd:Q235275 wd:Q5660510 
        wd:Q118936 wd:Q979226 wd:Q134644 wd:Q921933 wd:Q5994858 
        wd:Q1042706 wd:Q3398317 wd:Q51545 wd:Q467712 wd:Q77347 wd:Q236161}
    ?artist wdt:P31 ?type .
    ?artist wdt:P19 ?placeBirth . 
    ?placeBirth rdfs:label ?placeBirthLabel . FILTER (lang(?placeBirthLabel) = 'es') .
    ?artist wdt:P20 ?placeDeath . 
    ?placeDeath rdfs:label ?placeDeathLabel . FILTER (lang(?placeDeathLabel) = 'es') .
    ?artist wdt:P569 ?dateBirth .
    ?artist wdt:P570 ?dateDeath .
    OPTIONAL {?artist wdt:P135 ?movement} .
    ?artist rdfs:label ?artistLabel . FILTER (lang(?artistLabel) = 'es') .
    OPTIONAL {?artist wdt:P18 ?image} .
    OPTIONAL {?artist wdt:P21 ?sex} .
}
"""
)
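The `VALUES` block in the query above is a plain list of `wd:` identifiers, so it could be generated from a Python list instead of being typed by hand. The sketch below assumes a hypothetical helper named `build_values_clause`, which is not part of the notebook or of SPARQLWrapper, and uses only the first three identifiers from the query.

```python
import re

# Hypothetical helper (not part of the original notebook).
def build_values_clause(variable, qids):
    """Return a SPARQL VALUES block binding ?variable to Wikidata items."""
    for qid in qids:
        # Accept only identifiers of the form Q<number> to avoid
        # producing a malformed query.
        if not re.fullmatch(r"Q[1-9][0-9]*", qid):
            raise ValueError(f"Not a Wikidata item identifier: {qid!r}")
    terms = " ".join(f"wd:{qid}" for qid in qids)
    return f"VALUES ?{variable} {{ {terms} }}"

# First three identifiers from the query above.
artist_ids = ["Q5593", "Q152384", "Q2447692"]
values_clause = build_values_clause("artist", artist_ids)
print(values_clause)
```

The resulting string can be interpolated into the `WHERE` block of the query before it is passed to `sparql.setQuery`.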

### Next, we serialise the result

In [63]:
results = sparql.queryAndConvert()
print(results.serialize())

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix wd: <http://www.wikidata.org/entity/> .
@prefix wdt: <http://www.wikidata.org/prop/direct/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

wd:Q1042706 rdfs:label "Carlos Arniches"@es ;
    wdt:P18 <http://commons.wikimedia.org/wiki/Special:FilePath/Carlos%20Arniches%202.jpg> ;
    wdt:P19 wd:Q11959 ;
    wdt:P20 wd:Q2807 ;
    wdt:P21 wd:Q6581097 ;
    wdt:P31 wd:Q5 ;
    wdt:P569 "1866-10-11T00:00:00+00:00"^^xsd:dateTime,
        "1866-11-11T00:00:00+00:00"^^xsd:dateTime ;
    wdt:P570 "1943-04-16T00:00:00+00:00"^^xsd:dateTime .

wd:Q118936 rdfs:label "Rafael Alberti"@es ;
    wdt:P18 <http://commons.wikimedia.org/wiki/Special:FilePath/Rafael%20Alberti%201978-10-01.jpg> ;
    wdt:P19 wd:Q203040 ;
    wdt:P20 wd:Q15682,
        wd:Q203040 ;
    wdt:P21 wd:Q6581097 ;
    wdt:P31 wd:Q5 ;
    wdt:P569 "1902-12-16T00:00:00+00:00"^^xsd:dateTime ;
    wdt:P570 "1999-10-28T00:00:00+00:00"^^xsd:dateTime .

wd:Q134644 rdfs:l

### We can save the results to a file

In [66]:
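# Assumes the output/ directory already exists.
# An explicit encoding keeps accented labels intact on every platform.
with open("output/artists.ttl", "w", encoding="utf-8") as text_file:
    text_file.write(results.serialize())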
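`open` raises `FileNotFoundError` if the `output/` directory does not exist yet. One possible way around this, sketched with the standard library only, is to create the parent directory before writing. The helper name `save_text` is hypothetical, and the example writes to a temporary directory rather than `output/` so that it can be run anywhere.

```python
import tempfile
from pathlib import Path

# Hypothetical helper (not part of the original notebook).
def save_text(path, text):
    """Write text to path as UTF-8, creating missing parent directories."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path

# A single statement in the style of the serialised output.
text = 'wd:Q118936 rdfs:label "Rafael Alberti"@es .\n'

with tempfile.TemporaryDirectory() as tmp:
    # The "output" sub-directory does not exist beforehand.
    saved = save_text(Path(tmp) / "output" / "artists.ttl", text)
    round_trip = saved.read_text(encoding="utf-8")
    print(saved.name, len(round_trip))
```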

### We can also provide metadata about the extracted dataset using ontologies and controlled vocabularies

In [69]:
from rdflib import Graph, URIRef, Literal, Namespace
from rdflib.namespace import FOAF, RDF, DCTERMS, VOID, DC, SKOS, OWL
import datetime

In [71]:
domain = 'https://example.org/'

g = Graph()
g.bind("foaf", FOAF)
g.bind("rdf", RDF)
g.bind("dcterms", DCTERMS)
g.bind("dc", DC)
g.bind("void", VOID)
g.bind("skos", SKOS)
g.bind("owl", OWL)

schema = Namespace("https://schema.org/")
g.bind("schema", schema)

viaf = Namespace("https://viaf.org/viaf/")
g.bind("viaf", viaf)

wd = Namespace("http://www.wikidata.org/entity/")
g.bind("wd", wd)

In [73]:
dataset = URIRef(domain + "dataset/artists")

g.add((dataset, RDF.type, schema.Dataset))
g.add((dataset, schema.url, URIRef("https://www.cultura.gob.es/en/cultura/areas/archivos/mc/centros/cida/4-difusion-cooperacion/4-1-guias-de-lectura/guia-exilio-espanol-1939-archivos-estatales.html")))
g.add((dataset, schema.description, Literal("This example is based on the book Artistas de la República which documents relevant artists related to the Republic and Spanish Civil War.")))
g.add((dataset, schema.name, Literal("Artists from the Republic and Spanish Civil War")))
g.add((dataset, DC.title, Literal("Artists from the Republic and Spanish Civil War")))
g.add((dataset, schema.license, URIRef('https://creativecommons.org/publicdomain/zero/1.0/')))

# Record the creation date as an ISO 8601 string (YYYY-MM-DD)
today = datetime.date.today()
g.add((dataset, schema.dateCreated, Literal(today.isoformat())))

<Graph identifier=N6f8dd9f43bb443a2b47b6983b335f114 (<class 'rdflib.graph.Graph'>)>

Let's store the generated metadata.

In [76]:
g.serialize(destination="output/metadata-artists.ttl") 

<Graph identifier=N6f8dd9f43bb443a2b47b6983b335f114 (<class 'rdflib.graph.Graph'>)>

### Finally, we can analyse the extracted data

In [79]:
input_file = "output/artists.ttl"
g = Graph().parse(input_file)

Let's check the number of distinct properties.

In [82]:
print('##### Number of properties:')

# Query the data in g using SPARQL
q = """
    SELECT (count(distinct ?prop) as ?properties)
    WHERE {
        ?s ?prop ?o .
    }
"""

# Apply the query to the graph and iterate through results
for r in g.query(q):
    print(r["properties"])

##### Number of properties:
9


We can also check the total number of triples

In [85]:
print('##### Number of triples:')
    
# Query the data in g using SPARQL
q = """
    SELECT (COUNT(*) as ?triples) 
    WHERE { ?s ?p ?o } 
"""

# Apply the query to the graph and iterate through results
for r in g.query(q):
    print(r["triples"])

##### Number of triples:
146
