# ARTchives: a data driven historiography of art history - Jupyter Notebook

This Jupyter Notebook is integral part of the research carried out by Lucia Giagnolini for her master's thesis in Knowledge organization and Cultural Heritage, Digital Humanities and Digital Knowledge international degree a. y. 2020/2021.

The focus of the thesis is the recently published ARTchives project, which can be accessed at: http://artchives.fondazionezeri.unibo.it. ARTchives is the first international web-based platform for a census of art historians' archives. It is an integrated system that collects archival descriptions of notable art historians' collections and opens up unexplored research paths through the implementation of semantic web technologies. Being a nascent project, there is considerable margin for improvement in several aspects. At the moment, one of the most compelling needs is to better exploit the potential of ARTchives and, in particular, of Semantic Web technologies underlying the system. A way to achieve this goal is to enhance communication and visual aspects by introducing new data visualizations to the ones aready published in the dedicated section.
The work presented in this Notebook aims at providing further data visualization proposals, not yet published in the application but in the view of an actual implementation in the next releases of ARTchives. The starting point for the development of these visualizations were four fundamental research questions:
1. RQ1. What have been the places of education and activity of all the art historians recorded in ARTchives?
2. RQ2. What have been the places of education and activity of a particular art historian recorded in ARTchives?
3. RQ3. What were the relations of art historians with other experts of their times (other scholars, art collectors, connoisseurs etc.)?
4. RQ4. Which artists and personalities have been studied by art historians recorded in ARTchives?


## 1. The distribution of art historians’ places of education and activity: visualizing RQ1.

In [1]:
#! pip install rdflib
import rdflib
from rdflib import Namespace , Literal , URIRef
from rdflib.namespace import RDF , RDFS

# create an empty Graph
g = rdflib.ConjunctiveGraph()

# parse a local RDF file by specifying the format
result = g.parse("Desktop/dhdk_epds/resources/artchives.nq", format='nquads') #Desktop/dhdk_epds/resources/

# bind the uncommon namespaces
wd = Namespace("http://www.wikidata.org/entity/") # remember that a prefix matches a URI until the last slash (or hashtag #)
wdt = Namespace("http://www.wikidata.org/prop/direct/")
art = Namespace("https://w3id.org/artchives/")
rdfs = Namespace ("http://www.w3.org/2000/01/")

# Get the list of art historians in our graph "g"
arthistorians_list = set()
arthistorians_names = list()

# iterate over the triples in the graph
for s,p,o in g.triples(( None, wdt.P170, None)):
    for subj, prop, obj in g.triples((o, RDFS.label, None )):# people "o" are the creator "wdt.P170" of a collection "s"
        if "wikidata.org/entity/" in str(o):           # look for the substring to filter wikidata entities only
            arthistorians_list.add('<' + str(o) + '>')
            if obj.strip() not in arthistorians_names:
                arthistorians_names.append(obj.strip())
                arthistorians_names.append('<' + str(o) + '>')  # remember to transform them in strings! 
    
print(arthistorians_list)
print(arthistorians_names)


{'<http://www.wikidata.org/entity/Q88907>', '<http://www.wikidata.org/entity/Q1629748>', '<http://www.wikidata.org/entity/Q995470>', '<http://www.wikidata.org/entity/Q3057287>', '<http://www.wikidata.org/entity/Q55453618>', '<http://www.wikidata.org/entity/Q60185>', '<http://www.wikidata.org/entity/Q1296486>', '<http://www.wikidata.org/entity/Q41616785>', '<http://www.wikidata.org/entity/Q19997512>', '<http://www.wikidata.org/entity/Q1271052>', '<http://www.wikidata.org/entity/Q85761254>', '<http://www.wikidata.org/entity/Q1089074>', '<http://www.wikidata.org/entity/Q6700132>', '<http://www.wikidata.org/entity/Q537874>', '<http://www.wikidata.org/entity/Q2824734>', '<http://www.wikidata.org/entity/Q18935222>', '<http://www.wikidata.org/entity/Q90407>', '<http://www.wikidata.org/entity/Q457739>', '<http://www.wikidata.org/entity/Q1641821>', '<http://www.wikidata.org/entity/Q1715096>', '<http://www.wikidata.org/entity/Q3051533>', '<http://www.wikidata.org/entity/Q1373290>', '<http://www.

In [3]:
#!pip install SPARQLWrapper
from SPARQLWrapper import SPARQLWrapper, JSON
import ssl
ssl._create_default_https_context = ssl._create_unverified_context

# get the endpoint API
wikidata_endpoint = "https://query.wikidata.org/bigdata/namespace/wdq/sparql"

# prepare the values to be queried
historians = ' '.join(arthistorians_list) # <uri1> <uri2> <uri3> ... <uriN>
#print(historians)
# prepare the query: for each historian in ARTchives check in wikidata if there are work or education places.
formationplace_query = """ 
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT DISTINCT ?historian ?workplace ?workplace_label ?coordinates1 ?eduplace ?eduplace_label ?coordinates2 (group_concat(?type ; separator="; ") as ?place_type) (group_concat(?type_label ; separator="; ") as ?label)
WHERE {
    VALUES ?historian {"""+historians+"""} . 
    optional {?historian wdt:P108 ?workplace . 
    ?workplace rdfs:label ?workplace_label .
    FILTER (langMatches(lang(?workplace_label), "EN")) 
    ?workplace wdt:P625 ?coordinates1; wdt:P31 ?type . 
    ?type rdfs:label ?type_label . 
    FILTER (langMatches(lang(?type_label), "EN"))}
    optional {?historian wdt:P69 ?eduplace . 
    ?eduplace rdfs:label ?eduplace_label .
    FILTER (langMatches(lang(?eduplace_label), "EN")) 
    ?eduplace wdt:P625 ?coordinates2; wdt:P31 ?type . 
    ?type rdfs:label ?type_label . 
    FILTER (langMatches(lang(?type_label), "EN")) }
    } 
GROUP BY ?historian ?workplace ?workplace_label ?coordinates1 ?eduplace ?eduplace_label ?coordinates2 ?place_type ?label
    
"""

# set the endpoint 
sparql_wd = SPARQLWrapper(wikidata_endpoint)
# set the query
sparql_wd.setQuery(formationplace_query)
# set the returned format
sparql_wd.setReturnFormat(JSON)
# get the results
results = sparql_wd.query().convert()

# manipulate the result
for result in results["results"]["bindings"]:
    #print(result)
    historian_uri = result["historian"]["value"]
    #print("historian:", historian_uri)
    if "workplace" in result: 
        workplace = result["workplace"]["value"]
        if "workplace_label" in result and "coordinates1" in result: 
            workplace_label = result["workplace_label"]["value"]
            work_coord = result["coordinates1"]["value"][6:-1].split(" ")
            #print(work_coord)
            #print("work:", workplace, workplace_label)

                    # only if uri, label and coords are found we add them to the graph
            g.add(( URIRef(historian_uri) , URIRef(wdt.P108) , URIRef(workplace) ))
            g.add(( URIRef(workplace) , RDFS.label , Literal(workplace_label) ))
            g.add(( URIRef(workplace) , URIRef(wdt.P625) , Literal(work_coord) ))
            if "place_type" in result: 
                type_label = result["label"]["value"].split(" ")[0]
                #print(type_label)
                place_type = result["place_type"]["value"].split(";")[0]
                #print("WORK ", workplace_label, type_label, place_type)
                if type_label != "":
                    g.add(( URIRef(workplace) , URIRef(wdt.P31) , URIRef(place_type) ))
                    g.add(( URIRef(place_type) , RDFS.label  , Literal(type_label) ))
                    
            
    if "eduplace" in result: 
        eduplace = result["eduplace"]["value"]
        #print(eduplace)
        if "eduplace_label" in result and "coordinates2" in result: 
            eduplace_label = result["eduplace_label"]["value"]
            eduplace_coord = result["coordinates2"]["value"][6:-1].split(" ")
            #print("education:", eduplace, eduplace_label)
                    # only if both uri and label are found we add them to the graph
            g.add(( URIRef(historian_uri) , URIRef(wdt.P69) , URIRef(eduplace) ))
            g.add(( URIRef(eduplace) , RDFS.label , Literal(eduplace_label) ))
            g.add(( URIRef(eduplace) , URIRef(wdt.P625) , Literal(eduplace_coord) ))
            if "place_type" in result: 
                type_label = result["label"]["value"].split(" ")[0]
                place_type = result["place_type"]["value"].split(";")[0]
                #print("EDUCATION ", eduplace_label, type_label, place_type)
                if type_label != "":
                    g.add(( URIRef(eduplace) , URIRef(wdt.P31) , URIRef(place_type) ))
                    g.add(( URIRef(place_type) , RDFS.label  , Literal(type_label) ))
            

In [4]:
g.serialize(destination='Desktop/dhdk_epds/resources/artchives2.nq', format='nquads') #Desktop/dhdk_epds/resources/
result = g.parse("Desktop/dhdk_epds/resources/artchives2.nq", format='nquads')



In [5]:
loc_list = ['country', 'city', 'village', 'capital', 'state', 'region', 'municipality', 'county', 'frazione', 'comune', 'city-state', 'enclave']
firstdict = {}
final = {}
for hist, prop, place in g.triples((None, wdt.P69, None)):
    for eduplace, hasname, eduplacename in g.triples((place, RDFS.label, None)):
        #print(eduplacename)
        for edup, hascoordinates, coord in g.triples((eduplace, wdt.P625, None)):
            for edupl, istype, placetype in g.triples((edup, wdt.P31, None)):
                for plctype, named, typelabel in g.triples((placetype, RDFS.label, None)):
                    #print(plctype)
                    for historian, p, name in g.triples(( hist, RDFS.label, None)):
                        if "wikidata.org/entity/" in str(historian):
                            type_label = typelabel
                            check =  any(item in loc_list for item in type_label)
                            if check:
                                loc = "geoloc"
                            else:
                                loc = 'institution'
                            key = (eduplacename.strip(), "<" + place.strip() +">", float(coord.split(" ")[1][1:-2][0:6]), float(coord.split(" ")[0][2:-2][0:6]), loc)
                            value = tuple(["<" + hist.strip() +">", name.strip()])
                            if key not in firstdict.keys():
                                firstdict[key] = set([value])
                            else:
                                firstdict[key].update([value])


                    
for hist, prop, place in g.triples((None, wdt.P108, None)):
    for place, hasname, workplacename in g.triples((place, RDFS.label, None)):
        for place, hasCoordinates, coord in g.triples((place, wdt.P625, None)):
            for edupl, istype, placetype in g.triples((place, wdt.P31, None)):
                for plctype, named, typelabel in g.triples((placetype, RDFS.label, None)):
                    for historian, p, name in g.triples(( hist, RDFS.label, None)):   
                        if "wikidata.org/entity/" in str(historian):
                            type_label = typelabel
                            check =  any(item in loc_list for item in type_label)
                            if check:
                                loc = "geoloc"
                            else:
                                loc = 'institution'
                            key = (workplacename.strip(), "<" + place.strip() +">", float(coord.split(" ")[1][1:-2][0:6]), float(coord.split(" ")[0][2:-2][0:6]), loc)
                            value = tuple(["<" + hist.strip() +">", name.strip()])
                            if key not in firstdict.keys():
                                firstdict[key] = set([value])
                            else:
                                firstdict[key].update([value])


for k, v in firstdict.items(): #creation of a dictionary that has as key a tuple with the info for a place and as values a list of tuples of art historians connected to that place.
    for el in v:
        if k not in final.keys():
            final[k] = [el]
        else:
            final[k].append(el)
            
for k, v in final.items():
    print(k,v)




('Sapienza University of Rome', '<http://www.wikidata.org/entity/Q209344>', 41.903, 12.515, 'institution') [('<http://www.wikidata.org/entity/Q2824734>', 'Adolfo Venturi'), ('<http://www.wikidata.org/entity/Q1089074>', 'Federico Zeri')]
('Humboldt University of Berlin', '<http://www.wikidata.org/entity/Q152087>', 52.518, 13.393, 'institution') [('<http://www.wikidata.org/entity/Q90407>', 'Richard Krautheimer')]
('Harvard University', '<http://www.wikidata.org/entity/Q13371>', 42.374, -71.11, 'institution') [('<http://www.wikidata.org/entity/Q88907>', 'Ernst Kitzinger'), ('<http://www.wikidata.org/entity/Q19997512>', 'Everett Fahy'), ('<http://www.wikidata.org/entity/Q457739>', 'Leo Steinberg')]
('Ludwig Maximilian University of Munich', '<http://www.wikidata.org/entity/Q55044>', 48.15, 11.58, 'institution') [('<http://www.wikidata.org/entity/Q88907>', 'Ernst Kitzinger')]
('Vassar College', '<http://www.wikidata.org/entity/Q2093794>', 41.686, -73.89, 'institution') [('<http://www.wikida

In [6]:
related = {} #dictionary to collect the main subjects of an art historian -p921- present in ARTchives data 
for s,p,o in g.triples(( None, wdt.P170, None)):   
    if "wikidata.org/entity/" in str(o):           
        for hist, prop, obj in g.triples((o, wdt.P921, None)):     
            for subj, pr, name in g.triples(( hist, RDFS.label, None)):   
                key = tuple(['<' + str(hist) + '>', name.strip()])
                value = '<' + str(obj) + '>'
                if key not in related.keys(): #the dict has as keys the historians and as values the relative list of main subjects.
                    related[key] = [value]
                else:
                    related[key].append(value)
    
for k, v in related.items():
    print(k, v)



('<http://www.wikidata.org/entity/Q1715096>', 'Ulrich Middeldorf') ['<http://www.wikidata.org/entity/Q7787>', '<http://www.wikidata.org/entity/Q7787>', '<http://www.wikidata.org/entity/Q5597>', '<http://www.wikidata.org/entity/Q5597>', '<http://www.wikidata.org/entity/Q183>', '<http://www.wikidata.org/entity/Q183>', '<http://www.wikidata.org/entity/Q131252>', '<http://www.wikidata.org/entity/Q131252>', '<http://www.wikidata.org/entity/Q317053>', '<http://www.wikidata.org/entity/Q317053>', '<http://www.wikidata.org/entity/Q152087>', '<http://www.wikidata.org/entity/Q152087>', '<http://www.wikidata.org/entity/Q70804>', '<http://www.wikidata.org/entity/Q70804>', '<http://www.wikidata.org/entity/Q2044>', '<http://www.wikidata.org/entity/Q2044>', '<http://www.wikidata.org/entity/Q1361667>', '<http://www.wikidata.org/entity/Q1361667>', '<http://www.wikidata.org/entity/Q38>', '<http://www.wikidata.org/entity/Q38>', '<http://www.wikidata.org/entity/Q2143255>', '<http://www.wikidata.org/entity/

In [7]:
#import json

related_dict= {}


for k, v in related.items():  #for each art historian I extract those main subjects which are defined as places in wikidatata 
    relatedlist = ' '.join(v)
    

    query_results = """
        PREFIX wdt: <http://www.wikidata.org/prop/direct/>
        SELECT DISTINCT ?place ?place_label ?coord (group_concat(?type_label ; separator="; ") as ?label) 
        WHERE {
            VALUES ?place {"""+relatedlist+"""} . 
            ?place rdfs:label ?place_label . 
            FILTER (langMatches(lang(?place_label), "EN")) . ?place wdt:P625 ?coord; wdt:P31 ?type . ?type rdfs:label ?type_label . FILTER (langMatches(lang(?type_label), "EN"))
            } 
            group by ?place ?place_label ?coord ?label
        """

    # set the endpoint 
    sparql_wd = SPARQLWrapper(wikidata_endpoint)
    # set the query
    sparql_wd.setQuery(query_results)
    # set the returned format
    sparql_wd.setReturnFormat(JSON)
    # get the results
    results = sparql_wd.query().convert()
    
    #with open('q1.json', 'w') as f:
        #json.dump(results, f)

    for result in results["results"]["bindings"]:
        #print(v, result)
        place = '<' + str(result["place"]["value"]) + '>'
        place_name = result["place_label"]["value"]
        coord = result["coord"]["value"][6:-1].split(" ")
        type_label = result["label"]["value"].split("; ")[0]
        type_label_list = type_label.split(" ")
        check =  any(item in loc_list for item in type_label_list)
        if check:
            loc = "geoloc"
        else:
            loc = 'institution'
        key = tuple([place_name, place, float(coord[1][0:6]), float(coord[0][0:6]), loc])
        for value in v:
            if value == str(key[1]):
                #print(k[1], place_name)
                if key not in related_dict.keys():
                    related_dict[key] = set([k])
                else: 
                    related_dict[key].update([k])
                

                




            
for k, v in final.items():
    for el in v:
        if k not in related_dict.keys():
            related_dict[k] = set([el])
        else:
            related_dict[k].update([el])
    
    
for k, v in related_dict.items():
    print(k, v)





('Florence', '<http://www.wikidata.org/entity/Q2044>', 43.771, 11.254, 'geoloc') {('<http://www.wikidata.org/entity/Q995470>', 'Kornél Fabriczy'), ('<http://www.wikidata.org/entity/Q19997512>', 'Everett Fahy'), ('<http://www.wikidata.org/entity/Q18935222>', 'Werner Cohn'), ('<http://www.wikidata.org/entity/Q61913691>', 'Luisa Vertova'), ('<http://www.wikidata.org/entity/Q60185>', 'Aby Warburg'), ('<http://www.wikidata.org/entity/Q3057287>', 'Ernst Steinmann'), ('<http://www.wikidata.org/entity/Q55453618>', 'Stefano Tumidei'), ('<http://www.wikidata.org/entity/Q1715096>', 'Ulrich Middeldorf')}
('University of Chicago', '<http://www.wikidata.org/entity/Q131252>', 41.789, -87.59, 'institution') {('<http://www.wikidata.org/entity/Q1715096>', 'Ulrich Middeldorf')}
('Ludwig Maximilian University of Munich', '<http://www.wikidata.org/entity/Q55044>', 48.15, 11.58, 'institution') {('<http://www.wikidata.org/entity/Q85761254>', 'Julian Kliemann'), ('<http://www.wikidata.org/entity/Q1629748>', '

In [9]:
#If runned in Binder, click on File > Trust Notebook to properly visualize maps
#!pip install ipywidgets
#!pip install ipyleaflet
from ipywidgets import HTML

from ipyleaflet import Map, Marker, Popup, LayersControl, AwesomeIcon

center = (41.080684, -30.683374)


m = Map(center=center, zoom=3, close_popup_on_click=False)



for k,v in related_dict.items():
    names = []
    for value in v:
        if value[0] not in names:
            names.append("<a href='http://artchives.fondazionezeri.unibo.it/historian-" + value[0][32:-1] + "'>" + value[1] + "</a>" + ". ")
    namelist = " ".join(names)
    #print(namelist)
    if 'geoloc' in k[4]:
        icon2 = AwesomeIcon(
        name = "map-marker",
        marker_color='blue',
        icon_color='white',
        spin=False
        )
        marker = Marker(icon = icon2, location=(k[2], k[3]))
        m.add_layer(marker)
        #message = HTML()
        #marker.popup = message
        #message.description = ""
    #message.value = "<b>" + k[0] + "</b>" + "<br>"  + namelist
    else:
        icon2 = AwesomeIcon(
        name = "bank",
        marker_color='green',
        icon_color='white',
        spin=False
            )
        marker = Marker(icon = icon2, location=(k[2], k[3]))
        m.add_layer(marker)
    message = HTML()
    marker.popup = message
    message.description = ""
    message.value = "<b>" + k[0] + "</b>" + "<br>" + namelist
    
        


m

Map(center=[41.080684, -30.683374], close_popup_on_click=False, controls=(ZoomControl(options=['position', 'zo…

## 2. The distribution of selected art historians’ places of education and activity: visualizing RQ2.

In [10]:
hist_dict = {} #dict that has as keys art historians and as values the list of related places. 
hist_list = []
for k, v in related_dict.items():
    for value in v:
        if value not in hist_list:
            hist_list.append(value)
    for el in hist_list:
        if el in v:
            if el not in hist_dict.keys():
                hist_dict[el] = [k]
            else:
                hist_dict[el].append(k)
        

for k, v in hist_dict.items():            
    print(k, v)

('<http://www.wikidata.org/entity/Q995470>', 'Kornél Fabriczy') [('Florence', '<http://www.wikidata.org/entity/Q2044>', 43.771, 11.254, 'geoloc'), ('Italy', '<http://www.wikidata.org/entity/Q38>', 42.5, 12.5, 'geoloc'), ('London', '<http://www.wikidata.org/entity/Q84>', 51.507, -0.127, 'geoloc'), ('Paris', '<http://www.wikidata.org/entity/Q90>', 48.856, 2.3513, 'geoloc'), ('Stuttgart', '<http://www.wikidata.org/entity/Q1022>', 48.776, 9.1775, 'geoloc')]
('<http://www.wikidata.org/entity/Q19997512>', 'Everett Fahy') [('Florence', '<http://www.wikidata.org/entity/Q2044>', 43.771, 11.254, 'geoloc'), ('Italy', '<http://www.wikidata.org/entity/Q38>', 42.5, 12.5, 'geoloc'), ('New York University', '<http://www.wikidata.org/entity/Q49210>', 40.73, -73.99, 'institution'), ('Harvard University Center for Italian Renaissance Studies', '<http://www.wikidata.org/entity/Q3558578>', 43.794, 11.315, 'institution'), ('Harvard University', '<http://www.wikidata.org/entity/Q13371>', 42.374, -71.11, 'ins

In [53]:
from ipyleaflet import LayerGroup
import random 
center = (41.080684, -30.683374)

colors = ['red', 'darkred', 'lightred', 'orange', 'beige', 'green', 'darkgreen', 'lightgreen', 'blue', 'darkblue', 'lightblue', 'purple', 'darkpurple', 'pink', 'cadetblue', 'gray', 'lightgray', 'black']


m = Map(center=center, zoom=3, close_popup_on_click=False)
control = LayersControl(position='topright')
m.add_control(control)

for k,v in hist_dict.items():
    layer_group = LayerGroup(layers=(), name=k[1]) #creating different layers for different historians
    m.add_layer(layer_group)
    #print(k, v)
    icon2 = AwesomeIcon(
        name = "map-marker",
        marker_color= random.choice(colors),
        icon_color='white',
        spin=False
        )
    for value in v:
        marker = Marker(icon = icon2, location=(value[2], value[3]))
        message = HTML()
        marker.popup = message
        message.description = ""
        message.value ="<b>" + "<a href='http://artchives.fondazionezeri.unibo.it/historian-" + k[0][32:-1] +"'>" + k[1] + "</a>" + "</b>" + "<br>"  + value[0]
        layer_group.add_layer(marker)





m

Map(center=[41.080684, -30.683374], close_popup_on_click=False, controls=(ZoomControl(options=['position', 'zo…

## 3. The distribution of art historians’ relations with experts of their times: visualizing RQ3.

In [12]:
people = []
for k, v in related.items():   
    relatedlist = ' '.join(v)
    #print(relatedlist)
    query_res = """
        PREFIX wdt: <http://www.wikidata.org/prop/direct/>
        PREFIX wd: <http://www.wikidata.org/entity/>
        SELECT DISTINCT ?person ?person_label ?occupation ?deathdate
        WHERE {
            VALUES ?person {"""+relatedlist+"""} . 
            ?person wdt:P31 wd:Q5; rdfs:label ?person_label . 
            VALUES ?occupation {wd:Q1792450 wd:Q201788 wd:Q36180 wd:Q4164507 wd:Q1126160 wd:Q10732476 wd:Q1622272 wd:Q22132694} .  
            ?person wdt:P106 ?occupation ; wdt:P570 ?deathdate.
            FILTER (langMatches(lang(?person_label), "EN") ) 
            } 

        """

    # set the endpoint 
    sparql_wd = SPARQLWrapper(wikidata_endpoint)
    # set the query
    sparql_wd.setQuery(query_res)
    # set the returned format
    sparql_wd.setReturnFormat(JSON)
    # get the results
    results = sparql_wd.query().convert()

    for result in results["results"]["bindings"]:
        #print(result)
        person = '<' + str(result["person"]["value"]) + '>'
        person_name = str(result["person_label"]["value"])
        death_date = int(result["deathdate"]["value"][0:4])
        key = tuple([person_name, person])
        if death_date > 1850:
            for value in v:
                if value == str(key[1]):
                    tupla = tuple([k[1], k[0], person_name, person, 2])
                    if tupla not in people:
                        people.append(tupla)

            
print(people)


[('Ulrich Middeldorf', '<http://www.wikidata.org/entity/Q1715096>', 'Adolph Goldschmidt', '<http://www.wikidata.org/entity/Q70804>', 2), ('Ulrich Middeldorf', '<http://www.wikidata.org/entity/Q1715096>', 'Heinrich Wölfflin', '<http://www.wikidata.org/entity/Q123466>', 2), ('Ulrich Middeldorf', '<http://www.wikidata.org/entity/Q1715096>', 'Roberto Longhi', '<http://www.wikidata.org/entity/Q1361667>', 2), ('Otto Lehmann-Brockhaus', '<http://www.wikidata.org/entity/Q1641821>', 'Wolfgang Lotz', '<http://www.wikidata.org/entity/Q701741>', 2), ('Ernst Steinmann', '<http://www.wikidata.org/entity/Q3057287>', 'Rudolf Wittkower', '<http://www.wikidata.org/entity/Q215999>', 2), ('Ernst Steinmann', '<http://www.wikidata.org/entity/Q3057287>', 'Henriette Hertz', '<http://www.wikidata.org/entity/Q103498>', 2), ('Ernst Steinmann', '<http://www.wikidata.org/entity/Q3057287>', 'Anton Heinrich Springer', '<http://www.wikidata.org/entity/Q64637>', 2), ('Ernst Steinmann', '<http://www.wikidata.org/entity

In [14]:
coll_related = {}
for coll,p,hist in g.triples(( None, wdt.P170, None)):
    for historian, proper, hist_name in g.triples((hist, RDFS.label, None)):     # people "o" are the creator "wdt.P170" of a collection "s"
        for collection, pr, coll_name in g.triples(( coll, RDFS.label, None)):   
            for col, prop, content in g.triples((coll, wdt.P921, None)):     
                for cont, pro, content_label in g.triples(( content, RDFS.label, None)): 
                    if "wikidata.org/entity/" in str(content):    
                        key = tuple([hist_name.strip(), '<' + str(hist) + '>'])
                        value = '<' + str(content) + '>'
                        if key not in coll_related.keys():
                            coll_related[key] = set([value])
                        else:
                            coll_related[key].add(value)
    




collection_related = {}
for coll,p,hist in g.triples(( None, wdt.P170, None)):
    for historian, proper, hist_name in g.triples((hist, RDFS.label, None)):     
        for collection, pr, coll_name in g.triples(( coll, RDFS.label, None)):   
            for col, prop, content in g.triples((coll, art.hasSubjectPeople, None)):     
                for cont, pro, content_label in g.triples(( content, RDFS.label, None)): 
                    if "wikidata.org/entity/" in str(content):    
                        key = tuple([hist_name.strip(), '<' + str(hist) + '>'])
                        value = '<' + str(content) + '>'
                        if key not in collection_related.keys():
                            collection_related[key] = set([value])
                        else:
                            collection_related[key].add(value)
 



In [16]:
for k, v in collection_related.items(): 
    collection_list = ' '.join(v)
    query_res = """
        PREFIX wdt: <http://www.wikidata.org/prop/direct/>
        PREFIX wd: <http://www.wikidata.org/entity/>
        PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
        SELECT DISTINCT ?person ?person_label ?occupation ?deathdate
        WHERE {
            VALUES ?person {"""+collection_list+"""} . 
            ?person wdt:P31 wd:Q5; rdfs:label ?person_label 
            FILTER (langMatches(lang(?person_label), "EN")) . 
            }  

        """

    # set the endpoint 
    sparql_wd = SPARQLWrapper(wikidata_endpoint)
    # set the query
    sparql_wd.setQuery(query_res)
    # set the returned format
    sparql_wd.setReturnFormat(JSON)
    # get the results
    results = sparql_wd.query().convert()

    for result in results["results"]["bindings"]:
        #print(result)
        person = '<' + str(result["person"]["value"]) + '>'
        person_name = str(result["person_label"]["value"])
        key = tuple([person_name, person])
        for value in v:
            if value == str(key[1]):
                tupla = tuple([k[0], k[1], person_name, person, 3])
                if tupla not in people:
                    print(tupla)
                    people.append(tupla)
                    
                        

('Ernst Steinmann', '<http://www.wikidata.org/entity/Q3057287>', 'Luca Beltrami', '<http://www.wikidata.org/entity/Q937875>', 3)
('Ernst Steinmann', '<http://www.wikidata.org/entity/Q3057287>', 'Henriette Hertz', '<http://www.wikidata.org/entity/Q103498>', 3)
('Ernst Steinmann', '<http://www.wikidata.org/entity/Q3057287>', 'Paul Fridolin Kehr', '<http://www.wikidata.org/entity/Q108748>', 3)
('Ernst Steinmann', '<http://www.wikidata.org/entity/Q3057287>', 'Richard Krautheimer', '<http://www.wikidata.org/entity/Q90407>', 3)
('Ernst Steinmann', '<http://www.wikidata.org/entity/Q3057287>', 'Anton Heinrich Springer', '<http://www.wikidata.org/entity/Q64637>', 3)
('Ernst Steinmann', '<http://www.wikidata.org/entity/Q3057287>', 'Otto Lehmann-Brockhaus', '<http://www.wikidata.org/entity/Q1641821>', 3)
('Ernst Steinmann', '<http://www.wikidata.org/entity/Q3057287>', 'Domenico Anderson', '<http://www.wikidata.org/entity/Q3034711>', 3)
('Ernst Steinmann', '<http://www.wikidata.org/entity/Q3057287

('Ernst Kitzinger', '<http://www.wikidata.org/entity/Q88907>', 'William D. Wixom', '<http://www.wikidata.org/entity/Q19754137>', 3)
('Ernst Kitzinger', '<http://www.wikidata.org/entity/Q88907>', 'Ernst Gombrich', '<http://www.wikidata.org/entity/Q78504>', 3)
('Ernst Kitzinger', '<http://www.wikidata.org/entity/Q88907>', 'Kurt Weitzmann', '<http://www.wikidata.org/entity/Q70362>', 3)
('Ernst Kitzinger', '<http://www.wikidata.org/entity/Q88907>', 'Erwin Panofsky', '<http://www.wikidata.org/entity/Q57656>', 3)
('Kornél Fabriczy', '<http://www.wikidata.org/entity/Q995470>', 'Karl Frey', '<http://www.wikidata.org/entity/Q5743139>', 3)
('Kornél Fabriczy', '<http://www.wikidata.org/entity/Q995470>', 'Ernst Steinmann', '<http://www.wikidata.org/entity/Q3057287>', 3)
('Stefano Tumidei', '<http://www.wikidata.org/entity/Q55453618>', 'Andrea Emiliani', '<http://www.wikidata.org/entity/Q3615689>', 3)
('Stefano Tumidei', '<http://www.wikidata.org/entity/Q55453618>', 'Massimo Ferretti', '<http://www

In [19]:
for k, v in coll_related.items(): 
    coll_list = ' '.join(v)
    query_res = """
        PREFIX wdt: <http://www.wikidata.org/prop/direct/>
        PREFIX wd: <http://www.wikidata.org/entity/>
        PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
        SELECT DISTINCT ?person ?person_label ?occupation ?deathdate
        WHERE {
            VALUES ?person {"""+coll_list+"""} . 
            ?person wdt:P31 wd:Q5; rdfs:label ?person_label 
            optional {?person wdt:P570 ?deathdate. FILTER (?deathdate > "1850-01-01"^^xsd:dateTime)} .
            VALUES ?occupation {wd:Q1792450 wd:Q201788 wd:Q36180 wd:Q4164507 wd:Q1126160 wd:Q10732476 wd:Q1622272 wd:Q22132694}. 
            ?person wdt:P106 ?occupation .
            FILTER (langMatches(lang(?person_label), "EN")) . 
            }  

        """

    # set the endpoint 
    sparql_wd = SPARQLWrapper(wikidata_endpoint)
    # set the query
    sparql_wd.setQuery(query_res)
    # set the returned format
    sparql_wd.setReturnFormat(JSON)
    # get the results
    results = sparql_wd.query().convert()

    for result in results["results"]["bindings"]:
        #print(result)
        person = '<' + str(result["person"]["value"]) + '>'
        person_name = str(result["person_label"]["value"])
        key = tuple([person_name, person])
        if "deathdate" in result: 
            death_date = int(result["deathdate"]["value"][0:4])
            if death_date > 1850 or "deathdate" not in result:
                for value in v:
                    if value == str(key[1]):
                        tupla = tuple([k[0], k[1], person_name, person, 3])
                        if tupla not in people and k[0] != person_name:
                            print(tupla)
                            people.append(tupla)

#print(people)

('Ernst Kitzinger', '<http://www.wikidata.org/entity/Q88907>', 'Meyer Schapiro', '<http://www.wikidata.org/entity/Q533444>', 3)
('Ernst Kitzinger', '<http://www.wikidata.org/entity/Q88907>', 'Philip Grierson', '<http://www.wikidata.org/entity/Q3901455>', 3)
('Leo Steinberg', '<http://www.wikidata.org/entity/Q457739>', 'Rudolf Wittkower', '<http://www.wikidata.org/entity/Q215999>', 3)


In [20]:
wdpeople_query = """ 
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT DISTINCT ?historian ?student ?student_label ?master ?master_label ?influencer ?influencer_label ?signper ?signper_label ?birthdate ?deathdate 
WHERE {
        VALUES ?historian {"""+historians+"""} . 
        optional {?historian wdt:P802 ?student . 
        ?student rdfs:label ?student_label .
        FILTER (langMatches(lang(?student_label), "EN")). 
        }
        optional {?historian wdt:P1066 ?master . 
        ?master rdfs:label ?master_label .
        FILTER (langMatches(lang(?master_label), "EN")). 
        }
        optional {?historian wdt:P737 ?influencer . 
        ?influencer rdfs:label ?influencer_label .
        FILTER (langMatches(lang(?influencer_label), "EN")).
        } 
        optional {?historian wdt:P3342 ?signper . 
        ?signper rdfs:label ?signper_label .
        FILTER (langMatches(lang(?signper_label), "EN")) . 
        }
    } 
"""

sparql_wd = SPARQLWrapper(wikidata_endpoint)
    # set the query
sparql_wd.setQuery(wdpeople_query)
    # set the returned format
sparql_wd.setReturnFormat(JSON)
    # get the results
results = sparql_wd.query().convert()

for result in results["results"]["bindings"]:
    #print(result)
    historian = result["historian"]["value"]
    if "influencer" in result: 
        influencer = result["influencer"]["value"]
        #print(influencer)
        if "influencer_label" in result: 
            influencer_label = result["influencer_label"]["value"]
            #print("influencer:", influencer, influencer_label)

                    # only if both uri and label are found we add them to 
            g.add(( URIRef(historian) , URIRef(wdt.P737) , URIRef(influencer) ))
            g.add(( URIRef(influencer) , RDFS.label , Literal(influencer_label) ))
            
           
    if "student" in result: 
        student = result["student"]["value"]
        print(historian)
        if "student_label" in result: 
            student_label = result["student_label"]["value"]
            print("historian:", historian, "student:", student, student_label)

                        # only if both uri and label are found we add them to 
            g.add(( URIRef(historian) , URIRef(wdt.P802) , URIRef(student) ))
            g.add(( URIRef(student) , RDFS.label , Literal(student_label) )) 
           
    
    if "master" in result: 
        master = result["master"]["value"]
        if "master_label" in result: 
            master_label = result["master_label"]["value"]
            #print("historian:", historian, "master:", master_label)

                        # only if both uri and label are found we add them to 
            g.add(( URIRef(historian) , URIRef(wdt.P1066) , URIRef(master) ))
            g.add(( URIRef(master) , RDFS.label , Literal(master_label) ))
            
                
    
    if "signper" in result: 
        signper = result["signper"]["value"]
        if "signper_label" in result: 
            signper_label = result["signper_label"]["value"]
            #print("signper:", signper, signper_label)

                        # only if both uri and label are found we add them to 
            g.add(( URIRef(historian) , URIRef(wdt.P3342) , URIRef(signper) ))
            g.add(( URIRef(signper) , RDFS.label , Literal(signper_label) ))
           
                

http://www.wikidata.org/entity/Q2824734
historian: http://www.wikidata.org/entity/Q2824734 student: http://www.wikidata.org/entity/Q3659000 Carlo Anti
http://www.wikidata.org/entity/Q2824734
historian: http://www.wikidata.org/entity/Q2824734 student: http://www.wikidata.org/entity/Q19301518 Géza de Fràncovich
http://www.wikidata.org/entity/Q2824734
historian: http://www.wikidata.org/entity/Q2824734 student: http://www.wikidata.org/entity/Q253864 Pietro Toesca
http://www.wikidata.org/entity/Q2824734
historian: http://www.wikidata.org/entity/Q2824734 student: http://www.wikidata.org/entity/Q253864 Pietro Toesca
http://www.wikidata.org/entity/Q2824734
historian: http://www.wikidata.org/entity/Q2824734 student: http://www.wikidata.org/entity/Q253864 Pietro Toesca


In [21]:
g.serialize(destination='Desktop/dhdk_epds/resources/artchives3.nq', format='nquads') #Desktop/dhdk_epds/resources/
result = g.parse("Desktop/dhdk_epds/resources/artchives3.nq", format='nquads')



In [22]:
for hist, prop, influencer in g.triples((None, wdt.P737, None)):
    #print(hist)
    for influencer, hasname, influencername in g.triples((influencer, RDFS.label, None)):
        for historian, p, name in g.triples(( hist, RDFS.label, None)):
            if "wikidata.org/entity/" in str(hist):
                tuplawd = tuple([name.strip(), "<" + hist.strip() +">", influencername.strip(), "<" + influencer.strip() +">", 1])
                if tuplawd not in people:
                    people.append(tuplawd)
                
for hist, prop, student in g.triples((None, wdt.P802, None)):
    #print(hist)
    for historian, p, name in g.triples(( hist, RDFS.label, None)):
        for student, hasname, studentname in g.triples((student, RDFS.label, None)):
            if "wikidata.org/entity/" in str(hist):  
                tuplawd = tuple([name.strip(), "<" + hist.strip() +">", studentname.strip(), "<" + student.strip() +">", 1])
                if tuplawd not in people:
                    people.append(tuplawd)

for hist, prop, master in g.triples((None, wdt.P1066, None)):
    #print(hist)
    for historian, p, name in g.triples(( hist, RDFS.label, None)):
        for master, hasname, mastername in g.triples((master, RDFS.label, None)):
            if "wikidata.org/entity/" in str(hist):  
                tuplawd = tuple([name.strip(), "<" + hist.strip() +">", mastername.strip(), "<" + master.strip() +">", 1])
                if tuplawd not in people:
                    people.append(tuplawd)
                
                
for hist, prop, signper in g.triples((None, wdt.P3342, None)):
    #print(hist)
    for historian, p, name in g.triples(( hist, RDFS.label, None)):
        for signper, hasname, signpername in g.triples((signper, RDFS.label, None)):
            if "wikidata.org/entity/" in str(hist):  
                tuplawd = tuple([name.strip(), "<" + hist.strip() +">", signpername.strip(), "<" + signper.strip() +">", 1])
                if tuplawd not in people:
                    people.append(tuplawd)
                
print(people)

[('Ulrich Middeldorf', '<http://www.wikidata.org/entity/Q1715096>', 'Adolph Goldschmidt', '<http://www.wikidata.org/entity/Q70804>', 2), ('Ulrich Middeldorf', '<http://www.wikidata.org/entity/Q1715096>', 'Heinrich Wölfflin', '<http://www.wikidata.org/entity/Q123466>', 2), ('Ulrich Middeldorf', '<http://www.wikidata.org/entity/Q1715096>', 'Roberto Longhi', '<http://www.wikidata.org/entity/Q1361667>', 2), ('Otto Lehmann-Brockhaus', '<http://www.wikidata.org/entity/Q1641821>', 'Wolfgang Lotz', '<http://www.wikidata.org/entity/Q701741>', 2), ('Ernst Steinmann', '<http://www.wikidata.org/entity/Q3057287>', 'Rudolf Wittkower', '<http://www.wikidata.org/entity/Q215999>', 2), ('Ernst Steinmann', '<http://www.wikidata.org/entity/Q3057287>', 'Henriette Hertz', '<http://www.wikidata.org/entity/Q103498>', 2), ('Ernst Steinmann', '<http://www.wikidata.org/entity/Q3057287>', 'Anton Heinrich Springer', '<http://www.wikidata.org/entity/Q64637>', 2), ('Ernst Steinmann', '<http://www.wikidata.org/entity

In [23]:
import csv
with open('people.csv', mode='w') as my_file:
    my_writer = csv.writer(my_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_ALL)
    # write the column names
    my_writer.writerow(['art_hist', 'art_hist_uri','person', 'person_uri', 'weight'])
    
    # access the list of tuples of the query results
    for res in people:
        # write in the csv
        my_writer.writerow([res[0], res[1], res[2], res[3], res[4]])

In [24]:
#! pip install pandas
import pandas as pd
# parse the csv into a dataframe
df = pd.read_csv("people.csv")
# print the first 5 rows
df.head()

Unnamed: 0,art_hist,art_hist_uri,person,person_uri,weight
0,Ulrich Middeldorf,<http://www.wikidata.org/entity/Q1715096>,Adolph Goldschmidt,<http://www.wikidata.org/entity/Q70804>,2
1,Ulrich Middeldorf,<http://www.wikidata.org/entity/Q1715096>,Heinrich Wölfflin,<http://www.wikidata.org/entity/Q123466>,2
2,Ulrich Middeldorf,<http://www.wikidata.org/entity/Q1715096>,Roberto Longhi,<http://www.wikidata.org/entity/Q1361667>,2
3,Otto Lehmann-Brockhaus,<http://www.wikidata.org/entity/Q1641821>,Wolfgang Lotz,<http://www.wikidata.org/entity/Q701741>,2
4,Ernst Steinmann,<http://www.wikidata.org/entity/Q3057287>,Rudolf Wittkower,<http://www.wikidata.org/entity/Q215999>,2


In [25]:
#!pip install pyvis

from pyvis import network as net



people_net = net.Network(height="750px", width="100%", bgcolor="white", font_color="#23f5ad", notebook="True", heading="The distribution of art historians' relations with experts of their times")

# set the physics layout of the network
people_net.barnes_hut()
people_data = pd.read_csv("people.csv")

sources = people_data['art_hist']
targets = people_data['person']
weights = people_data['weight']
uri = people_data['art_hist_uri']

edge_data = zip(sources, targets, weights)

for e in edge_data:
    #print(e)
    src = e[0]
    dst = e[1]
    w = e[2]


    people_net.add_node(src, src, title=src, color= "#1cae81", shape='dot')
    people_net.add_node(dst, dst,  title=dst, color= "#1cae81", shape='dot')
    if w == 1:
        people_net.add_edge(src, dst, value=w, color="orange")
    elif w == 2:
        people_net.add_edge(src, dst, value=w, color="grey")
    elif w == 3:
        people_net.add_edge(src, dst, value=w, color="#1cae81")

neighbor_map = people_net.get_adj_list()

# add neighbor data to node hover data
for node in people_net.nodes:
    #print(node)
    node["title"] = "<b>" + node["title"] + "</b>"+ " relations:<br>" + "<br>".join(neighbor_map[node["id"]])
    node["size"] = 70 + (len(neighbor_map[node["id"]])*10)  #len(neighbor_map[node["id"]]) #70 + (len(neighbor_map[node["id"]])*10) 
    node["label"] = node["id"] 
    node["borderWidthSelected"] = 5 
    if node["label"] in arthistorians_names:
        node["color"] = "#23f5ad"
        uripos = arthistorians_names.index(node["label"])+1
        uri = arthistorians_names[uripos]
        #print(node["label"], uri)
        #node["title"] = "<br>" + "<b>" + "<a href='http://artchives.fondazionezeri.unibo.it/historian-" + uri[32:-1] + "'>" + node["label"] + "</a>" +  "</b>" +  "<i>" + "  relations:<br>" + "</i>"+ "<hr>" + "<br>".join(neighbor_map[node["id"]]) 
        node["title"] = "<br>" + "<b>" + node["label"] + "</b>" +  "<i>" + "  relations:<br>" + "</i>"+ "<hr>"+ "<br>".join(neighbor_map[node["id"]]) + "<br>" 
    else: 
        node["title"] = "<br>" + "<b>" + node["label"] + "</b>" +  "<i>" + "  relations:<br>" + "</i>"+ "<hr>"+ "<br>".join(neighbor_map[node["id"]]) + "<br>" 
        
print(len(people_net.edges))

people_net.show("people.html")


153


## 4. The distribution of art historians’ relations with their subjects of study: visualizing RQ4.

In [28]:

query_periods = g.query(
    """PREFIX wdt: <http://www.wikidata.org/prop/direct/>
    SELECT ?hist_label ?hist ?period (SAMPLE(?label) AS ?period_label) 
    WHERE {?coll <https://w3id.org/artchives/hasSubjectPeriod> ?period ; rdfs:label ?coll_label 
    . ?coll wdt:P170 ?hist . ?hist rdfs:label ?hist_label .
    ?period rdfs:label ?label . 
    }
    GROUP BY ?period ?label ?hist ?hist_label 
    ORDER BY ?period""")

periods = set()
period_dict = {}
for result in query_periods:
    #print(result)
    hist = tuple([result[0].strip(), '<' + str(result[1])+ '>'])
    period = result[2].strip()
    #print(period)
    if hist not in period_dict.keys():
        period_dict[hist] = set([period])
    else:
        period_dict[hist].add(period)
    

for k,v in period_dict.items():
    print(k, v)
    for value in v:
        periods.add('<' + str(value) + '>') 


('Otto Lehmann-Brockhaus', '<http://www.wikidata.org/entity/Q1641821>') {'http://www.wikidata.org/entity/Q12554'}
('Werner Cohn', '<http://www.wikidata.org/entity/Q18935222>') {'http://www.wikidata.org/entity/Q4692', 'http://www.wikidata.org/entity/Q12554'}
('Federico Zeri', '<http://www.wikidata.org/entity/Q1089074>') {'http://www.wikidata.org/entity/Q7017', 'http://www.wikidata.org/entity/Q7018', 'http://www.wikidata.org/entity/Q7016', 'http://www.wikidata.org/entity/Q37853', 'http://www.wikidata.org/entity/Q131808', 'http://www.wikidata.org/entity/Q4692', 'http://www.wikidata.org/entity/Q12554'}
('Ernst Kitzinger', '<http://www.wikidata.org/entity/Q88907>') {'http://www.wikidata.org/entity/Q12554', 'http://www.wikidata.org/entity/Q429811', 'http://www.wikidata.org/entity/Q211884'}
('Kurt Badt', '<http://www.wikidata.org/entity/Q1629748>') {'http://www.wikidata.org/entity/Q12554', 'http://www.wikidata.org/entity/Q37853', 'http://www.wikidata.org/entity/Q131808', 'http://www.wikidata.

In [29]:
period_list = ' '.join(periods)
period_res = """
        PREFIX wdt: <http://www.wikidata.org/prop/direct/>
        PREFIX wd: <http://www.wikidata.org/entity/>
        SELECT DISTINCT ?period ?period_label ?startdate ?enddate
        WHERE {
            VALUES ?period {"""+period_list+"""} . 
            ?period wdt:P580 ?startdate ; wdt:P582 ?enddate; rdfs:label ?period_label .
            FILTER (langMatches(lang(?period_label), "EN"))
            
            } 
        """
sparql_wd = SPARQLWrapper(wikidata_endpoint)
    # set the query
sparql_wd.setQuery(period_res)
    # set the returned format
sparql_wd.setReturnFormat(JSON)
    # get the results
results = sparql_wd.query().convert()

dates_dict = {}
for result in results["results"]["bindings"]:
    period = '<' + str(result["period"]["value"]) + '>'
    period_label = result["period_label"]["value"]
    start = int(result["startdate"]["value"][0:4])
    end = int(result["enddate"]["value"][0:4])
    key = tuple([period, period_label.lower()])
    years = tuple([start, end])
    if key not in dates_dict.keys():
        dates_dict[key] = years

for k, v in dates_dict.items():
    print(k,v)


('<http://www.wikidata.org/entity/Q4692>', 'renaissance') (1400, 1650)
('<http://www.wikidata.org/entity/Q6955>', '19th century') (1801, 1900)
('<http://www.wikidata.org/entity/Q7015>', '18th century') (1701, 1800)
('<http://www.wikidata.org/entity/Q7016>', '17th century') (1601, 1700)
('<http://www.wikidata.org/entity/Q7018>', '15th century') (1401, 1500)
('<http://www.wikidata.org/entity/Q7017>', '16th century') (1501, 1600)
('<http://www.wikidata.org/entity/Q7034>', '14th century') (1301, 1400)
('<http://www.wikidata.org/entity/Q8683>', 'cold war') (1947, 1991)
('<http://www.wikidata.org/entity/Q12554>', 'middle ages') (476, 1500)
('<http://www.wikidata.org/entity/Q14378>', 'neoclassicism') (1760, 1830)
('<http://www.wikidata.org/entity/Q37068>', 'romanticism') (1800, 1900)
('<http://www.wikidata.org/entity/Q37853>', 'baroque') (1590, 1750)


In [30]:
related_artists = {}
for coll,p,hist in g.triples(( None, wdt.P170, None)):
    for historian, proper, hist_name in g.triples((hist, RDFS.label, None)):     
        for collection, pr, coll_name in g.triples(( coll, RDFS.label, None)):   
            for col, prop, content in g.triples((coll, art.hasSubjectArtist, None)):     
                for cont, pro, content_label in g.triples(( content, RDFS.label, None)): 
                    if "wikidata.org/entity/" in str(content):    
                        key = tuple([hist_name.strip(), '<' + str(hist) + '>'])
                        value = '<' + str(content) + '>'
                        if key not in related_artists.keys():
                            related_artists[key] = set([value])
                        else:
                            related_artists[key].add(value)
                            
for k, v in related_artists.items():
    print(k,v)

('Luigi Salerno', '<http://www.wikidata.org/entity/Q6700132>') {'<http://www.wikidata.org/entity/Q3769699>', '<http://www.wikidata.org/entity/Q447730>', '<http://www.wikidata.org/entity/Q359421>', '<http://www.wikidata.org/entity/Q334262>', '<http://www.wikidata.org/entity/Q42207>', '<http://www.wikidata.org/entity/Q601644>'}
('Ulrich Middeldorf', '<http://www.wikidata.org/entity/Q1715096>') {'<http://www.wikidata.org/entity/Q1404472>'}
('Ernst Steinmann', '<http://www.wikidata.org/entity/Q3057287>') {'<http://www.wikidata.org/entity/Q5592>', '<http://www.wikidata.org/entity/Q5669>', '<http://www.wikidata.org/entity/Q692>'}
('Werner Cohn', '<http://www.wikidata.org/entity/Q18935222>') {'<http://www.wikidata.org/entity/Q208125>', '<http://www.wikidata.org/entity/Q8537>', '<http://www.wikidata.org/entity/Q205148>', '<http://www.wikidata.org/entity/Q9358>', '<http://www.wikidata.org/entity/Q168496>', '<http://www.wikidata.org/entity/Q37030>', '<http://www.wikidata.org/entity/Q48319>', '<h

In [31]:
artist_dict = {}
for k, v in related_artists.items(): 
    art_list = ' '.join(v)
    query_art = """
        PREFIX wdt: <http://www.wikidata.org/prop/direct/>
        PREFIX wd: <http://www.wikidata.org/entity/>
        SELECT DISTINCT ?person ?person_label ?occupation ?birthdate ?deathdate
        WHERE {
            VALUES ?person {"""+art_list+"""} . 
            ?person wdt:P31 wd:Q5; rdfs:label ?person_label .
            ?person wdt:P569 ?birthdate ; wdt:P570 ?deathdate.
            FILTER (langMatches(lang(?person_label), "EN")) . 
            } 

        """

    # set the endpoint 
    sparql_wd = SPARQLWrapper(wikidata_endpoint)
    # set the query
    sparql_wd.setQuery(query_art)
    # set the returned format
    sparql_wd.setReturnFormat(JSON)
    # get the results
    results = sparql_wd.query().convert()

    for result in results["results"]["bindings"]:
        #print(result)
        artist = '<' + str(result["person"]["value"]) + '>'
        artist_name = str(result["person_label"]["value"]) 
        death = str(result["deathdate"]["value"]) 
        birth = str(result["birthdate"]["value"])
        key = tuple([artist, artist_name, int(birth[0:4]), int(death[0:4])])
        for value in v:
            if value == artist:
                if key not in artist_dict.keys():      
                    artist_dict[key] = set([k[0]])
                else:
                    artist_dict[key].add(k[0])

for k,v in artist_dict.items():
    print(k, v)

('<http://www.wikidata.org/entity/Q3769699>', 'Giulio Mancini', 1559, 1630) {'Luigi Salerno'}
('<http://www.wikidata.org/entity/Q447730>', 'Giovanni Lanfranco', 1582, 1647) {'Luigi Salerno'}
('<http://www.wikidata.org/entity/Q3769699>', 'Giulio Mancini', 1558, 1630) {'Luigi Salerno'}
('<http://www.wikidata.org/entity/Q601644>', 'Filippo Napoletano', 1589, 1629) {'Luigi Salerno'}
('<http://www.wikidata.org/entity/Q42207>', 'Caravaggio', 1571, 1610) {'Federico Zeri', 'Luigi Salerno'}
('<http://www.wikidata.org/entity/Q359421>', 'Salvator Rosa', 1615, 1673) {'Luigi Salerno'}
('<http://www.wikidata.org/entity/Q334262>', 'Guercino', 1591, 1666) {'Luigi Salerno'}
('<http://www.wikidata.org/entity/Q692>', 'William Shakespeare', 1564, 1616) {'Ernst Steinmann'}
('<http://www.wikidata.org/entity/Q5669>', 'Sandro Botticelli', 1445, 1510) {'Everett Fahy', 'Ernst Steinmann', 'John Pope-Hennessy'}
('<http://www.wikidata.org/entity/Q5592>', 'Michelangelo', 1475, 1564) {'Kornél Fabriczy', 'Ernst Stein

In [32]:
check_dict = {}
for k, v in coll_related.items(): 
    coll_list = ' '.join(v)
    query_res = """
        PREFIX wdt: <http://www.wikidata.org/prop/direct/>
        PREFIX wd: <http://www.wikidata.org/entity/>
        SELECT DISTINCT ?person ?person_label ?occupation ?birthdate ?deathdate
        WHERE {
            VALUES ?person {"""+coll_list+"""} . 
            ?person wdt:P31 wd:Q5; rdfs:label ?person_label .
            VALUES ?occupation {wd:Q1281618 wd:Q42973 wd:Q483501 wd:Q1028181 wd:Q329439} . 
            ?person wdt:P106 ?occupation ; wdt:P569 ?birthdate ; wdt:P570 ?deathdate.
            FILTER (langMatches(lang(?person_label), "EN")) . 
            } 

        """

    # set the endpoint 
    sparql_wd = SPARQLWrapper(wikidata_endpoint)
    # set the query
    sparql_wd.setQuery(query_res)
    # set the returned format
    sparql_wd.setReturnFormat(JSON)
    # get the results
    results = sparql_wd.query().convert()

    for result in results["results"]["bindings"]:
        #print(result)
        artist = '<' + str(result["person"]["value"]) + '>'
        artist_name = str(result["person_label"]["value"]) 
        death = str(result["deathdate"]["value"]) 
        birth = str(result["birthdate"]["value"])
        key = tuple([artist, artist_name, int(birth[0:4]), int(death[0:4])])
        
        for value in v:
            #print(value)
            if value == artist:
                if key not in check_dict.keys():      
                    check_dict[key] = set([k[0]])
                else:
                    check_dict[key].add(k[0])
               
                
for k, v in check_dict.items():
    if k not in artist_dict.items():
        print(k)

('<http://www.wikidata.org/entity/Q42207>', 'Caravaggio', 1571, 1610)
('<http://www.wikidata.org/entity/Q601644>', 'Filippo Napoletano', 1589, 1629)
('<http://www.wikidata.org/entity/Q447730>', 'Giovanni Lanfranco', 1582, 1647)
('<http://www.wikidata.org/entity/Q359421>', 'Salvator Rosa', 1615, 1673)
('<http://www.wikidata.org/entity/Q334262>', 'Guercino', 1591, 1666)
('<http://www.wikidata.org/entity/Q5592>', 'Michelangelo', 1475, 1564)
('<http://www.wikidata.org/entity/Q5669>', 'Sandro Botticelli', 1445, 1510)
('<http://www.wikidata.org/entity/Q937875>', 'Luca Beltrami', 1854, 1933)
('<http://www.wikidata.org/entity/Q5580>', 'Albrecht Dürer', 1471, 1528)
('<http://www.wikidata.org/entity/Q168496>', 'Carlo Carrà', 1881, 1966)
('<http://www.wikidata.org/entity/Q208125>', 'Lorenzo Ghiberti', 1378, 1455)
('<http://www.wikidata.org/entity/Q8537>', 'Bernardino Luini', 1475, 1532)
('<http://www.wikidata.org/entity/Q48319>', 'Hans Holbein', 1497, 1543)
('<http://www.wikidata.org/entity/Q2051

In [33]:
artist_rel = list()
for k, v in artist_dict.items():
    for value in v:
        tupla = tuple([k[1], value, 1])
        if tupla not in artist_rel:
            artist_rel.append(tupla)
        for el in v:
            if value != el:
                tupla = tuple([el, value, 1])
                tuplabis = tuple([value, el, 1])
                if tuplabis not in artist_rel:
                    artist_rel.append(tupla)
    
print(len(artist_rel))

135


In [34]:
from collections import Counter

count_dict = Counter(artist_rel)
artist_rel_final = []

for k, v in count_dict.items():
    if v != 1:
        x = list(k)
        x[2] = v
        k = tuple(x)
    artist_rel_final.append(k)
    
print(artist_rel_final)

[('Giulio Mancini', 'Luigi Salerno', 1), ('Giovanni Lanfranco', 'Luigi Salerno', 1), ('Filippo Napoletano', 'Luigi Salerno', 1), ('Caravaggio', 'Federico Zeri', 1), ('Luigi Salerno', 'Federico Zeri', 1), ('Caravaggio', 'Luigi Salerno', 1), ('Salvator Rosa', 'Luigi Salerno', 1), ('Guercino', 'Luigi Salerno', 1), ('William Shakespeare', 'Ernst Steinmann', 1), ('Sandro Botticelli', 'Everett Fahy', 1), ('Ernst Steinmann', 'Everett Fahy', 1), ('John Pope-Hennessy', 'Everett Fahy', 1), ('Sandro Botticelli', 'Ernst Steinmann', 1), ('John Pope-Hennessy', 'Ernst Steinmann', 2), ('Sandro Botticelli', 'John Pope-Hennessy', 1), ('Michelangelo', 'Kornél Fabriczy', 1), ('Ernst Steinmann', 'Kornél Fabriczy', 1), ('John Pope-Hennessy', 'Kornél Fabriczy', 1), ('Michelangelo', 'Ernst Steinmann', 1), ('Michelangelo', 'John Pope-Hennessy', 1), ('Albrecht Dürer', 'Werner Cohn', 1), ('Bernardino Luini', 'Werner Cohn', 1), ('Friedrich Nietzsche', 'Werner Cohn', 1), ('Thomas Mann', 'Werner Cohn', 1), ('Carlo 

In [49]:
for k, v in coll_related.items():
    #print(k, v)
    for key, value in dates_dict.items():
        #print(key, value)
        for artist, hist in artist_dict.items():
            #print(artist, hist)
            if key[0] in v and artist[0] in v:
                if (value[0] <= artist[2] and value[1] >= artist[3]) or (value[0] >= artist[2] and value[1] <= artist[3]) or (value[0] >= artist[2] and (value[1] + 15) >= artist[3] and artist[2] >= value[1]) or (value[0] <= artist[2] and (value[1] - 15) >= artist[2] and value[1] <= artist[3]):
                    tupla = tuple([k[0], artist[1], 2])
                    print(tupla)
                    tupla_remove1 = tuple([k[0], artist[1], 1])
                    tupla_remove2 = tuple([artist[1], k[0], 1])
                    if tupla_remove1 in artist_rel_final: 
                        artist_rel_final.remove(tupla_remove1)
                    elif tupla_remove2 in artist_rel_final: 
                        artist_rel_final.remove(tupla_remove2)
                    artist_rel_final.append(tupla)
                        
#print(artist_rel_final)                 

('Luigi Salerno', 'Salvator Rosa', 2)
('Luigi Salerno', 'Salvator Rosa', 2)
('Luigi Salerno', 'Guercino', 2)
('Ernst Steinmann', 'William Shakespeare', 2)
('Ernst Steinmann', 'Sandro Botticelli', 2)
('Ernst Steinmann', 'Michelangelo', 2)
('Werner Cohn', 'Albrecht Dürer', 2)
('Werner Cohn', 'Bernardino Luini', 2)
('Werner Cohn', 'Filippo Lippi', 2)
('Werner Cohn', 'Hans Holbein', 2)
('Werner Cohn', 'Albrecht Dürer', 2)
('Werner Cohn', 'Bernardino Luini', 2)
('Werner Cohn', 'Filippo Lippi', 2)
('Werner Cohn', 'Lorenzo Ghiberti', 2)
('Gustav Ludwig', 'Vittore Carpaccio', 2)
('Gustav Ludwig', 'Giovanni Bellini', 2)
('Julian Kliemann', 'David Bailly', 2)
('Julian Kliemann', 'Federico Zuccari', 2)
('Julian Kliemann', 'Giorgio Vasari', 2)
('Julian Kliemann', 'David Bailly', 2)
('Julian Kliemann', 'Federico Zuccari', 2)
('Julian Kliemann', 'Giorgio Vasari', 2)
('Julius S. Held', 'Rembrandt', 2)
('Julius S. Held', 'Peter Paul Rubens', 2)
('Julius S. Held', 'Pieter Brueghel the Elder', 2)
('Juli

In [50]:
with open('artists.csv', mode='w') as my_file:
    my_writer = csv.writer(my_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_ALL)
    # write the column names
    my_writer.writerow(['subj1','subj2','weight'])
    
    # access the list of tuples of the query results
    for res in artist_rel_final:
        # write in the csv
        my_writer.writerow([res[0], res[1], res[2]])

In [51]:
# parse the csv into a dataframe
df = pd.read_csv("artists.csv")
# print the first 5 rows
df.head()

Unnamed: 0,subj1,subj2,weight
0,Giulio Mancini,Luigi Salerno,1
1,Giovanni Lanfranco,Luigi Salerno,1
2,Filippo Napoletano,Luigi Salerno,1
3,Luigi Salerno,Federico Zeri,1
4,Caravaggio,Luigi Salerno,1


In [52]:
people_net = net.Network(height="750px", width="100%", bgcolor="white", font_color="#1cae81", notebook="True", heading="The distribution of art historians' relations with their subjects of study")

# set the physics layout of the network
people_net.barnes_hut()
people_data = pd.read_csv("artists.csv")

sources = people_data['subj1']
targets = people_data['subj2']
weights = people_data['weight']


edge_data = zip(sources, targets, weights)

for e in edge_data:
    #print(e)
    src = e[0]
    dst = e[1]
    w = e[2]


    people_net.add_node(src, src, title=src, color= "#1cae81")
    people_net.add_node(dst, dst, title=dst, color= "#1cae81")
    if w == 1:
        people_net.add_edge(src, dst, value=w, color="grey")
    elif w == 2:
        people_net.add_edge(src, dst, value=w, color="#1cae81")
    elif w == 3:
        people_net.add_edge(src, dst, value=w, color="black")
    
        

neighbor_map = people_net.get_adj_list()

# add neighbor data to node hover data
for node in people_net.nodes:
    #print(node)
    
    node["size"] = 80 + (len(neighbor_map[node["id"]])*10)   
    node["borderWidthSelected"] = 5
    node["label"] = node["id"] 
    historians = set()
    artists = set()
    for el in (neighbor_map[node["id"]]):
        if el in arthistorians_names:
            historians.add(el)
        else: 
            artists.add(el)
    if node["label"] in arthistorians_names:
        node["color"] = "#23f5ad"
        uripos = arthistorians_names.index(node["label"])+1
        uri = arthistorians_names[uripos]
        #print(node["label"], uri)
        if len(historians) > 0:
            node["title"] = "<br>" + "<b>" + node["label"] +  "</b>" +  "<i>" + "  relations:<br>" + "</i>"+ "<hr>" + "<b>" + "Subjects of study" + "</b>" + "<br>" + "<br>".join(artists) + "<br>" + "<hr>"+ "<b>" + "Art historians" + "</b>" + "<br>" + "<br>".join(historians)  
        else:
            node["title"] = "<br>" + "<b>" + node["label"] +   "</b>" +  "<i>" + "  relations:<br>" + "</i>"+ "<hr>" + "<b>" + "Subjects of study" + "</b>" + "<br>" + "<br>".join(artists)
    
        #+ "<a href='http://artchives.fondazionezeri.unibo.it/historian-" + uri[32:-1] + "'>" + node["label"] + "</a>" +
    else: 
        node["title"] = "<br>" + "<b>" + node["label"] + "</b>" +  "<i>" + "  relations:<br>" + "</i>"+ "<hr>" + "<b>" + "Art historians" + "</b>" + "<br>" + "<br>".join(historians) + "<br>" 
   
        #print(historians)
        #print(artists)
people_net.show("people.html")