# Answering RQ1_1: Counting serendipitous connections in Wikidata

## Resources needed for this test

IICONGRAPHwikidata, either [downloaded from Zenodo](https://zenodo.org/records/10294589) or produced by following the guidelines in the [docs](https://github.com/br0ast/iicongraph/tree/main/docs) part of this repository.

It needs to be loaded into a GraphDB or Blazegraph triplestore.

Additional libraries can be installed via the **requirements.txt** file in this repository.

## Rationale behind this connection selection

First, we extract the HyperReal symbols matched in Wikidata that are linked to a symbolic meaning **shared by more than one symbol** (for example, *rose* is a symbol, and *love*, *friendship*, and *marriage* could be its symbolic meanings). This matters because we are trying to connect artworks through their symbolic meanings rather than through the symbols themselves: connecting through symbols would be the same as connecting artworks because they depict the same thing. **Symbolic meanings associated with only one symbol are excluded, because connecting artworks via them is equivalent to connecting them via their depictions.** For example, if "resurrection" is the symbolic meaning of **only** the term "apple", every painting connected through "resurrection" also depicts an apple. There would then be no distinction between connecting the paintings because they depict an apple and connecting them because they symbolize resurrection, so this connection does not count for the analysis.

Once we have this list of symbols, we precompute how many symbolic meanings each pair of symbols shares.

Then, we extract from IICONGRAPHwiki all the paintings that depict at least one of the selected symbols, together with their depictions.

Then, we compare all the extracted paintings in pairs and count the symbolic connections they share. Using the precomputed counts, we check how many symbolic meanings each pair of their depictions shares, excluding from the count the connections shared by the **same depiction**.

### Example scenario

Painting A depicts a *dog* and an *apple*, and Painting B depicts a *cat* and an *apple*.

* *dog* is linked to the symbolic meanings *fidelity*, *envy*, and *family*.
* *cat* is linked to the symbolic meanings *envy*, *loneliness*, *family*, and *health*.
* *apple* is linked to the symbolic meanings *original sin* and *health*.

By comparing *dog* and *cat* we get two connections (via *envy* and via *family*). By comparing *dog* and *apple* we get zero connections. By comparing *apple* and *cat* we get one connection (via *health*). We do not compare *apple* with *apple* because they are the same depiction, so that pair contributes 0 connections.

In total, we have 3 symbolic connections between Painting A and Painting B.
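The scenario above can be checked with a few lines of Python. This is a minimal sketch: the function name `count_connections` and the two dictionaries are illustrative assumptions built from the example, not code from the notebook.

```python
from itertools import product

# Symbol -> symbolic meanings, copied from the example scenario.
meanings = {
    "dog": {"fidelity", "envy", "family"},
    "cat": {"envy", "loneliness", "family", "health"},
    "apple": {"original sin", "health"},
}

# Painting -> depicted symbols.
paintings = {
    "A": {"dog", "apple"},
    "B": {"cat", "apple"},
}


def count_connections(depictions_1, depictions_2, meanings):
    """Sum the shared meanings over all pairs of different depictions."""
    total = 0
    for dep_1, dep_2 in product(depictions_1, depictions_2):
        if dep_1 == dep_2:
            continue  # same depiction: not counted as a symbolic connection
        total += len(meanings[dep_1] & meanings[dep_2])
    return total


print(count_connections(paintings["A"], paintings["B"], meanings))  # 3
```

The count is symmetric, so swapping the two paintings gives the same result.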






# Extracting symbols and symbolic meanings (if the symbolic meaning is symbolized by more than 1 symbol)

In [3]:
from pymantic import sparql
# Only works if IICONGRAPHwikidata is loaded in Blazegraph or GraphDB;
# adjust the endpoint URL to match your own triplestore and repository name.
server = sparql.SPARQLServer('http://localhost:7200/repositories/IICONGRAPHwiki')

In [7]:
query = '''
prefix crm:<http://www.cidoc-crm.org/cidoc-crm/> 
prefix dul:<http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#> 
prefix hyr:<https://w3id.org/simulation/data/> 
prefix icon:<https://w3id.org/icon/ontology/> 
prefix prov:<http://www.w3.org/ns/prov#> 
prefix rdfs:<http://www.w3.org/2000/01/rdf-schema#> 
prefix sim:<https://w3id.org/simulation/ontology/> 
prefix wd:<http://www.wikidata.org/entity/> 
prefix wicon:<http://www.example.org/wikicon/>

#select (AVG(?tot) as ?average) where 
#{ 
  select distinct ?simulacra ?rc WHERE { ?ico icon:aboutWorkOfArt ?painting;
                                          icon:recognizedImage ?img .
                                    ?img icon:hasSymbol ?symbol .
                                    ?symbol sim:hasSimulacrum ?simulacra;
                                    sim:hasRealityCounterpart|sim:easedRealityCounterpart|
                                               sim:healedRealityCounterpart|sim:restoredRealityCounterpart|
                                               sim:preventedRealityCounterpart
                                               |sim:elicitedRealityCounterpart ?rc . {
                                    select ?rc (count(distinct ?simu) as ?tot) where { ?simu sim:hasRealityCounterpart|sim:easedRealityCounterpart|
                                               sim:healedRealityCounterpart|sim:restoredRealityCounterpart|
                                               sim:preventedRealityCounterpart
                                               |sim:elicitedRealityCounterpart ?rc } group by ?rc having (?tot > 1)
                                            } }'''
res = server.query(query)

In [8]:
simu_ser = []
for el in res["results"]["bindings"]:
    simu_ser.append(el["simulacra"]["value"])

In [10]:
# Map each symbol (simulacrum) to the set of symbolic meanings (reality counterparts) it is linked to.
simurc_ser = dict()
for el in res["results"]["bindings"]:
    if el["simulacra"]["value"] not in simurc_ser:
        simurc_ser[el["simulacra"]["value"]] = set()
    simurc_ser[el["simulacra"]["value"]].add(el["rc"]["value"])
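The loops above rely on the shape of the SPARQL JSON results format, in which each row of `results.bindings` maps a variable name to an object with a `value` field. The sketch below reproduces the grouping on a hand-written response, assuming the result has already been parsed into a Python dict as in the cells above. The `example.org` URIs and the names `mock_res` and `group_meanings` are placeholders for illustration, not data or code from IICONGRAPH.

```python
from collections import defaultdict

# Hand-written stand-in for a parsed SPARQL JSON response.
# The example.org URIs are hypothetical placeholders.
mock_res = {
    "head": {"vars": ["simulacra", "rc"]},
    "results": {
        "bindings": [
            {"simulacra": {"type": "uri", "value": "http://example.org/dog"},
             "rc": {"type": "uri", "value": "http://example.org/envy"}},
            {"simulacra": {"type": "uri", "value": "http://example.org/dog"},
             "rc": {"type": "uri", "value": "http://example.org/family"}},
            {"simulacra": {"type": "uri", "value": "http://example.org/cat"},
             "rc": {"type": "uri", "value": "http://example.org/envy"}},
        ]
    },
}


def group_meanings(res):
    """Group symbolic meanings by symbol, one set per symbol."""
    grouped = defaultdict(set)
    for row in res["results"]["bindings"]:
        grouped[row["simulacra"]["value"]].add(row["rc"]["value"])
    return dict(grouped)


simurc_demo = group_meanings(mock_res)
print(sorted(simurc_demo["http://example.org/dog"]))
```

Using `defaultdict(set)` removes the explicit membership check, and the set silently drops duplicate rows.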

# Precomputing the shared symbolic meanings between symbols

In [20]:
from itertools import combinations,permutations

In [21]:
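from tqdm import tqdm
# For every unordered pair of symbols, store how many symbolic meanings they share.
# Each pair is stored once, keyed in the order produced by combinations().
simu_simu_conn = dict()
for comb in tqdm(list(combinations(list(simurc_ser.keys()), 2))):
    total = len(simurc_ser[comb[0]].intersection(simurc_ser[comb[1]]))
    simu_simu_conn[comb] = total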

100%|█████████████████████████████████████████████████████████████████████| 485605/485605 [00:00<00:00, 1576818.59it/s]
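Because `combinations` yields each unordered pair once, `simu_simu_conn` contains the key `(a, b)` but not `(b, a)`. If the dictionary is queried outside the counting loop, a small helper can make the lookup independent of argument order. The sketch below uses toy data, and the helper name `shared_meanings` is an assumption for illustration, not part of the notebook.

```python
from itertools import combinations

# Toy stand-in for simurc_ser (symbol -> symbolic meanings).
toy_meanings = {
    "dog": {"envy", "family"},
    "cat": {"envy", "family", "health"},
    "apple": {"health"},
}

# Same construction as the notebook cell: one key per unordered pair.
toy_conn = {
    pair: len(toy_meanings[pair[0]] & toy_meanings[pair[1]])
    for pair in combinations(toy_meanings, 2)
}


def shared_meanings(conn, sym_1, sym_2):
    """Return the shared-meaning count regardless of key order (0 if absent)."""
    if (sym_1, sym_2) in conn:
        return conn[(sym_1, sym_2)]
    return conn.get((sym_2, sym_1), 0)


print(("cat", "dog") in toy_conn)               # False: only ("dog", "cat") is stored
print(shared_meanings(toy_conn, "cat", "dog"))  # 2
```

A pair made of the same symbol twice is never a key, so the helper returns 0 for it, which matches the rule that identical depictions do not count.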


# Extracting Paintings and their depictions (if the depictions are included in one of the symbols extracted above)

In [4]:
query = '''
prefix crm:<http://www.cidoc-crm.org/cidoc-crm/> 
prefix dul:<http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#> 
prefix hyr:<https://w3id.org/simulation/data/> 
prefix icon:<https://w3id.org/icon/ontology/> 
prefix prov:<http://www.w3.org/ns/prov#> 
prefix rdfs:<http://www.w3.org/2000/01/rdf-schema#> 
prefix sim:<https://w3id.org/simulation/ontology/> 
prefix wd:<http://www.wikidata.org/entity/>  
  select distinct ?painting ?simulacra WHERE { ?ico icon:aboutWorkOfArt ?painting;
                                          icon:recognizedImage ?img .
                                    ?img icon:hasSymbol ?symbol .
                                    ?symbol sim:hasSimulacrum ?simulacra;
                                    sim:hasRealityCounterpart|sim:easedRealityCounterpart|
                                               sim:healedRealityCounterpart|sim:restoredRealityCounterpart|
                                               sim:preventedRealityCounterpart
                                               |sim:elicitedRealityCounterpart ?rc ;
                                    sim:hasContext ?context . {
                                    select ?rc (count(distinct ?simu) as ?tot) where { ?simu sim:hasRealityCounterpart|sim:easedRealityCounterpart|
                                               sim:healedRealityCounterpart|sim:restoredRealityCounterpart|
                                               sim:preventedRealityCounterpart
                                               |sim:elicitedRealityCounterpart ?rc } group by ?rc having (?tot > 1)
                                            } }'''
res = server.query(query)

In [5]:
connect_p = dict()
for el in res["results"]["bindings"]:
    if el["painting"]["value"] not in connect_p:
        connect_p[el["painting"]["value"]] = set()
    connect_p[el["painting"]["value"]].add(el["simulacra"]["value"])

In [6]:
len(connect_p)

75119

# Counting the connections between the extracted paintings

**Important:** this step took around 2 hours to complete.

In [None]:
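connections = 0
p_ncp = dict()
p_nc = dict()
p_ncperp = dict()
# Every ordered pair of distinct paintings is visited, so each unordered pair is seen twice.
# simu_simu_conn stores each symbol pair under one key order only, so a given pair of
# depictions matches in exactly one of the two visits and is counted once.
# Identical depictions never form a key, so they add no connections.
for painting1 in tqdm(connect_p):
    for painting2 in connect_p:
        if painting1 != painting2:
            for dep in connect_p[painting1]:
                for dep2 in connect_p[painting2]:
                    if (dep, dep2) in simu_simu_conn:
                        connections += simu_simu_conn[dep, dep2]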

In [None]:
print(connections)
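The nested loop above visits every ordered pair of paintings. One possible way to roughly halve the number of painting pairs visited is to go through each unordered pair once with `combinations` and look up both key orders. The sketch below compares the two traversals on toy data only. It has not been benchmarked on the full graph, and all names and data in it are illustrative assumptions.

```python
from itertools import combinations

toy_meanings = {
    "dog": {"fidelity", "envy", "family"},
    "cat": {"envy", "loneliness", "family", "health"},
    "apple": {"original sin", "health"},
}
toy_conn = {
    pair: len(toy_meanings[pair[0]] & toy_meanings[pair[1]])
    for pair in combinations(toy_meanings, 2)
}
toy_paintings = {
    "A": {"dog", "apple"},
    "B": {"cat", "apple"},
    "C": {"dog", "cat"},
}


def count_ordered(paintings, conn):
    """Same traversal as the notebook cell: all ordered painting pairs."""
    total = 0
    for p1 in paintings:
        for p2 in paintings:
            if p1 != p2:
                for d1 in paintings[p1]:
                    for d2 in paintings[p2]:
                        if (d1, d2) in conn:
                            total += conn[d1, d2]
    return total


def count_unordered(paintings, conn):
    """Visit each unordered painting pair once and try both key orders."""
    total = 0
    for p1, p2 in combinations(paintings, 2):
        for d1 in paintings[p1]:
            for d2 in paintings[p2]:
                # At most one key order exists; identical symbols have neither.
                total += conn.get((d1, d2), conn.get((d2, d1), 0))
    return total


assert count_ordered(toy_paintings, toy_conn) == count_unordered(toy_paintings, toy_conn)
print(count_unordered(toy_paintings, toy_conn))  # 9
```

On this toy data each of the three painting pairs contributes 3 connections, and both traversals agree on the total.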