To investigate and extract all the photographers represented in the Zeri Photo Archive, we started with some exploratory queries. We saw that photographs are linked to their creators through a two-step path that passes through an intermediate creation event: ?photo crm:P94i_was_created_by ?creation . ?creation crm:P14_carried_out_by ?photographer .
The definition of "Photographer" in this context is therefore "an entity that carried out a creation process which created the resource". This was our first successful query:

In [None]:
my_SPARQL_query = """
PREFIX crm: <http://www.cidoc-crm.org/cidoc-crm/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?photographer 
WHERE { 
  	?x rdf:type <http://www.essepuntato.it/2014/03/fentry/Photograph> ; 
    crm:P94i_was_created_by ?creation .
    ?creation crm:P14_carried_out_by ?photographer .
 }
"""


Once we had found the photographers, we wanted to count the contributions each of them made to the Zeri Archive. We saw that the property <http://purl.org/spar/pro/holdsRoleInTime> was repeated for each photo they created. The query below therefore counts the matching rows for each photographer label.

In [2]:
# Import all the libraries we need at once:
import csv
import json
import pprint
import ssl
from json import decoder

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd  # used below to read photographers.csv
import rdflib
import requests
from rdflib import Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS, RDFS, XSD
from SPARQLWrapper import SPARQLWrapper, DIGEST, GET, JSON, POST

In [None]:
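# Skip HTTPS certificate verification for this session
# (a notebook workaround; not recommended outside exploratory work)
ssl._create_default_https_context = ssl._create_unverified_context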

# get the endpoint API
fototeca_endpoint = "http://data.fondazionezeri.unibo.it/sparql"

# prepare the query: number of photographs per photographer label
my_SPARQL_query = """
PREFIX crm: <http://www.cidoc-crm.org/cidoc-crm/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?photographer_label (COUNT(?x) AS ?cnt)
WHERE { 
  	?x rdf:type <http://www.essepuntato.it/2014/03/fentry/Photograph> ; 
    crm:P94i_was_created_by ?creation .
    ?creation crm:P14_carried_out_by ?photographer .
    ?photographer rdfs:label ?photographer_label
 }
GROUP BY ?photographer_label 
ORDER BY DESC(?cnt) ?photographer_label
"""

# set the endpoint 
sparql_ft = SPARQLWrapper(fototeca_endpoint)
# set the query
sparql_ft.setQuery(my_SPARQL_query)
# set the returned format
sparql_ft.setReturnFormat(JSON)
# get the results
results = sparql_ft.query().convert()

with open('photographers.csv', mode='w', newline='', encoding='utf-8') as my_file:
    my_writer = csv.writer(my_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_ALL)
    # write the column names
    my_writer.writerow(['photographer', 'contribution count'])
    for result in results["results"]["bindings"]:
        my_writer.writerow([result['photographer_label']['value'], result['cnt']['value'].strip()])

Looking at the data, we see that most of the photographs in the Zeri Archive are anonymous, and that the four entities following "Anonimo" are apparently not individual people, as we first thought. We now need to find each photographer in another knowledge base to get more information about them.

In [4]:
data = pd.read_csv("photographers.csv")
data.head()

Unnamed: 0,photographer,contribution count
0,Anonimo,13355
1,Brogi,2213
2,Istituto Centrale per il Catalogo e la Documen...,1539
3,"Alinari, Fratelli",1532
4,Anderson,1192
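
As a quick check on how strongly anonymous photographs dominate, the share of each label can be computed from the counts. The sketch below is only illustrative: it hard-codes the five rows displayed above instead of reading the full photographers.csv, so the percentages are relative to those five rows only, and the helper name `contribution_shares` is our own.

```python
def contribution_shares(rows):
    """Return {label: fraction of the total count} for (label, count) pairs."""
    total = sum(count for _, count in rows)
    return {label: count / total for label, count in rows}


# The five rows displayed by data.head() (not the full table).
top_rows = [
    ("Anonimo", 13355),
    ("Brogi", 2213),
    ("Istituto Centrale per il Catalogo e la Documen...", 1539),
    ("Alinari, Fratelli", 1532),
    ("Anderson", 1192),
]

shares = contribution_shares(top_rows)
for label, share in shares.items():
    print(f"{label}: {share:.1%}")
```

To apply the same calculation to the whole archive, the pairs could be built from the two columns of photographers.csv.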


We decided to look for more information on Wikidata, so the first step was to find the Wikidata ID (this time working with the .json format) of each entity identifiable as a photographer according to the contextual definition of the Zeri Archive. We did this via the Wikidata API.
The function saves every API response, since many searches yielded more than one result: the response contains a key ['search-continue'], which lets the function keep appending new findings to the results list.
In practice, if there is more than one "James Anderson", all of them are saved to the outcomes, since we cannot know whether the first "James Anderson" is the one we are looking for.
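
The continuation logic described above can be sketched as follows. This is not the notebook's own function: the names `fetch_page` and `search_all_candidates`, the page limit, and the User-Agent string are assumptions made for illustration, and the standard library is used instead of requests to keep the example self-contained. It relies on the `wbsearchentities` action of the Wikidata API, whose response carries a `search-continue` offset while more results remain.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def fetch_page(name, offset, limit=50):
    """Send one wbsearchentities request and return the decoded JSON."""
    params = urllib.parse.urlencode({
        "action": "wbsearchentities",
        "search": name,
        "language": "en",
        "type": "item",
        "format": "json",
        "limit": limit,
        "continue": offset,  # offset of the first result to return
    })
    request = urllib.request.Request(
        WIKIDATA_API + "?" + params,
        # Hypothetical User-Agent; replace with your own contact details.
        headers={"User-Agent": "zeri-photographers-example/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def search_all_candidates(name, fetch=fetch_page, max_pages=20):
    """Collect the IDs of every search hit, following 'search-continue'."""
    candidates = []
    offset = 0
    for _ in range(max_pages):  # hard cap so a bad response cannot loop forever
        page = fetch(name, offset)
        candidates.extend(hit["id"] for hit in page.get("search", []))
        if "search-continue" not in page:
            break  # no further pages
        offset = page["search-continue"]
    return candidates
```

Calling `search_all_candidates("James Anderson")` would return the IDs of all matching items rather than only the first one. The `fetch` parameter exists so that the paging logic can be exercised without network access.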

Let's prepare the URIs to be queried on Wikidata:

In [None]:
uris = ' '.join(suit_for_SPARQL_dinner(hand_your_id_please('queryResults.json' ,"photographer_label")))
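
The two helpers called above are not shown in this section. As a rough illustration of the formatting step only, the sketch below turns a list of Wikidata IDs into the space-separated, angle-bracketed IRIs that a SPARQL VALUES clause expects. The function name `to_values_block` and the sample IDs are hypothetical, and the real helpers may work differently.

```python
WIKIDATA_ENTITY = "http://www.wikidata.org/entity/"


def to_values_block(qids):
    """Format Wikidata IDs as '<IRI> <IRI> ...', dropping duplicates."""
    seen = set()
    iris = []
    for qid in qids:
        if qid in seen:
            continue  # the same candidate can be returned by several searches
        seen.add(qid)
        iris.append(f"<{WIKIDATA_ENTITY}{qid}>")
    return " ".join(iris)


# Hypothetical input: IDs collected from the Wikidata API search.
example_uris = to_values_block(["Q15140103", "Q4770013", "Q15140103"])
print(example_uris)
```

Removing duplicates before building the clause keeps the query short, which matters when it is sent with GET and the whole query travels in the URL.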

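This new query filters the candidates returned by the Wikidata matching: it keeps only the entities whose occupation (P106) is photographer (Q33231) and retrieves their citizenship, place of birth and work location where available.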

In [13]:
#deprecated 
sparql = SPARQLWrapper("https://query.wikidata.org/bigdata/namespace/wdq/sparql")
sparql.setMethod(GET)

my_SPARQL_query = """
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?photographer ?label ?citizenships ?placeOfBirth ?worklocation
WHERE {
    VALUES ?photographer { """ + uris + """ }
    # occupation (P106): photographer (Q33231); instance of (P31): Q672070
    ?photographer wdt:P106 wd:Q33231 ;
                  wdt:P31 wd:Q672070 ;
                  rdfs:label ?label .
    FILTER(LANG(?label) = "en")
    # one OPTIONAL per property, so a missing value does not discard the others
    OPTIONAL { ?photographer wdt:P27 ?citizenships . }
    OPTIONAL { ?photographer wdt:P19 ?placeOfBirth . }
    OPTIONAL { ?photographer wdt:P937 ?worklocation . }
}
"""
sparql.setQuery(my_SPARQL_query)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
print(results)

columns = ("photographer", "label", "citizenships", "placeOfBirth", "worklocation")
for result in results["results"]["bindings"]:
    # OPTIONAL variables may be unbound, so fall back to an empty string
    print(*(result.get(column, {}).get("value", "") for column in columns))

['<http://www.wikidata.org/entity/Q15140103>', '<http://www.wikidata.org/entity/Q4770013>', '<http://www.wikidata.org/entity/Q1324702>', '<http://www.wikidata.org/entity/Q12253600>', '<http://www.wikidata.org/entity/Q3354174>', '<http://www.wikidata.org/entity/Q2409629>', '<http://www.wikidata.org/entity/Q179174>', '<http://www.wikidata.org/entity/Q3618225>', '<http://www.wikidata.org/entity/Q3618226>', '<http://www.wikidata.org/entity/Q60185968>', '<http://www.wikidata.org/entity/Q100228480>', '<http://www.wikidata.org/entity/Q102358462>', '<http://www.wikidata.org/entity/Q102358464>', '<http://www.wikidata.org/entity/Q102358466>', '<http://www.wikidata.org/entity/Q95842771>', '<http://www.wikidata.org/entity/Q102358459>', '<http://www.wikidata.org/entity/Q102358465>', '<http://www.wikidata.org/entity/Q22264387>', '<http://www.wikidata.org/entity/Q20007448>', '<http://www.wikidata.org/entity/Q21088952>', '<http://www.wikidata.org/entity/Q32827685>', '<http://www.wikidata.org/entity/Q4