# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload starts with a usually vague, open-ended question and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning the human-readable name of a property or a class in Wikidata.

In [1]:
## SETUP used later
import sys
import os
import json
import pandas as pd
sys.path.insert(1, '../../../../src/')
import gt_modules.evaluation as evaluation
from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-history1-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString,verbose = True):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://gracevirtuoso.dei.unipd.it/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        results = sparql.query()
        json_results = results.convert()
        if len(json_results['results']['bindings'])==0:
            print("Empty")
            return []
        array = []
        for bindings in json_results['results']['bindings']:
            app =  [ (var, value['value'])  for var, value in bindings.items() ] 
            if verbose:
                print( app)
            array.append(app)
        if verbose:
            print(len(array))
        return array

    except Exception as e:
        print("The operation failed", e)
        # return an empty list so that callers can still iterate over the result
        return []
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://gracevirtuoso.dei.unipd.it/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
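
Each row returned by `run_query` is a list of `(variable, value)` tuples, so later cells read results by position (for example `r[0][1]`). The following is a minimal sketch of a helper that turns such rows into dictionaries keyed by variable name. The name `rows_to_dicts` is hypothetical (it is not part of this notebook or of `gt_modules`), and the sample rows only mimic the shape of the `run_query` output.

```python
def rows_to_dicts(rows):
    """Convert run_query-style rows (lists of (variable, value) tuples) to dicts.

    A None or empty input yields an empty list.
    """
    return [dict(row) for row in (rows or [])]


# Sample rows in the shape produced by run_query (not fetched from the endpoint here)
sample = [
    [("obj", "http://www.wikidata.org/entity/Q1048"), ("name", "Julius Caesar")],
    [("obj", "http://www.wikidata.org/entity/Q529867"), ("name", "Titus Lartius")],
]

records = rows_to_dicts(sample)
print(records[0]["name"])
```

Looking values up by variable name keeps the code independent of the order in which the endpoint serializes the bindings.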

# History Workflow Series ("Ancient Rome explorative search") 

Consider the following exploratory information need:

> The goal of this workflow is to study Ancient Rome and some of the main events connected to that historical period, for instance the emperors, killings, wars and so on.

## Useful URIs for the current workflow
The following data are given:

| IRI         | Description        | Role      |
| ----------- | ------------------ | --------- |
| `wdt:P1647` | subproperty        | predicate |
| `wdt:P31`   | instance of        | predicate |
| `wdt:P106`  | occupation         | predicate |
| `wdt:P279`  | subclass           | predicate |
| `wdt:P27`   | nationality        | predicate |
| `wdt:P3342` | significant person | predicate |
| `wd:Q5`     | human              | node      |
| `wd:Q1048`  | Julius Caesar      | node      |





Also consider that

```
wd:Q1048 ?p ?obj .
```

is the BGP to retrieve all **properties of Julius Caesar**.

Please consider that when you return a resource, you should return both the IRI and the label of the resource. In particular, when a task requires you to identify a BGP, the result set must always be a list of IRI-label pairs.
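
The IRI-label requirement can be checked mechanically before a result is stored. The sketch below assumes rows in the shape returned by `run_query` (a list of `(variable, value)` tuples, with the IRI first and the label second). The function name `is_iri_label_pair` is hypothetical and is not part of the evaluation module.

```python
def is_iri_label_pair(row):
    """Return True when a row holds exactly an http(s) IRI followed by a non-empty label."""
    if len(row) != 2:
        return False
    (_, iri), (_, label) = row
    return iri.startswith(("http://", "https://")) and bool(label.strip())


# A well-formed row and a row that lacks the label
good_row = [("obj", "http://www.wikidata.org/entity/Q1048"), ("name", "Julius Caesar")]
bad_row = [("obj", "http://www.wikidata.org/entity/Q1048")]

print(is_iri_label_pair(good_row), is_iri_label_pair(bad_row))
```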

The workload should:


1. Identify the BGP for Ancient Rome

2. Find all the Roman Dictators (return the IRI and the name of the dictator)

3. Identify the BGP for Roman Empire and Roman Republic

4. Return the IRI and the name of all Roman emperors killed by homicide (see "manner of death") and, for each killed emperor, return also the IRI and name of the killer, if present in Wikidata.

5. Return the Roman emperors who killed another Roman emperor (the result set must be a list of 4-tuples with killer IRI and label, and killed IRI and label).

6. Find the BGP for war

7. Find the wars in which the Roman Empire participated (note that "war" and "battle" are not the same thing); return the IRI and name of each war.

8. Consider all the wars in which the Roman Empire participated. For each war, return the number of related battles and the opponent (a war is composed of one or more battles). The result set must be a list of 5-tuples: war IRI and label, opponent IRI and label, and number of battles.


In [2]:
## startup the evaluation
# setup the file and create the empty json
ipname = "h1.ipynb"
pt = os.getcwd()+os.sep+ipname
evaluation.setup(pt)

The index of this workflow is: 1_5


## Task 1

In [3]:
# looking for the properties of  Julius Caesar
queryString = """
SELECT DISTINCT ?p ?name
WHERE { 
    wd:Q1048 ?p ?obj.
    ?p sc:name ?name.
    FILTER(!isLiteral(?obj)).
} 
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P102'), ('name', 'member of political party')]
[('p', 'http://www.wikidata.org/prop/direct/P103'), ('name', 'native language')]
[('p', 'http://www.wikidata.org/prop/direct/P1035'), ('name', 'honorific suffix')]
[('p', 'http://www.wikidata.org/prop/direct/P1038'), ('name', 'relative')]
[('p', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P119'), ('name', 'place of burial')]
[('p', 'http://www.wikidata.org/prop/direct/P1196'), ('name', 'manner of death')]
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('name', 'described by source')]
[('p', 'http://www.wikidata.org/prop/direct/P1412'), ('name', 'languages spoken, written or signed')]
[('p', 'http://www.wikidata.org/prop/direct/P1424'), ('name', "topic's main template")]
[('p', 'http://www.wikidata.org/prop/direct/P1441'), ('name', 'present in work')]
[('p', 'http://www.wikidata.org/prop/direct/P157'), ('name', 'kil

In [4]:
# look for the nationality
queryString = """
SELECT DISTINCT ?obj ?name
WHERE { 
    wd:Q1048 wdt:P27 ?obj.
    ?obj sc:name ?name.
} 
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q1747689'), ('name', 'Ancient Rome')]
1


In [5]:
### insert the result of TASK 1 in the file
og_uri = "http://www.wikidata.org/entity/Q1747689"
og_name = "Ancient Rome"
obj = {"uri":og_uri,"name":og_name}
evaluation.add_result(evaluation.get_index_workflow(pt),"1", evaluation.TYPE_SINGLE ,"uri", [obj] ,"all")

The index of this workflow is: 1_5
The path is /locale/data/jupyter/prando/notebook/2022/results/workflow1_5.json
JSON object updated


`wd:Q1747689` -> Ancient Rome

## Task 2

In [7]:
# look for position held
queryString = """
SELECT DISTINCT ?obj ?name
WHERE { 
    wd:Q1048 wdt:P39 ?obj.
    ?obj sc:name ?name.
}
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q1190123'), ('name', 'Roman governor')]
[('obj', 'http://www.wikidata.org/entity/Q20056508'), ('name', 'Ancient Roman senator')]
[('obj', 'http://www.wikidata.org/entity/Q212943'), ('name', 'pontifex maximus')]
[('obj', 'http://www.wikidata.org/entity/Q40779'), ('name', 'Roman consul')]
[('obj', 'http://www.wikidata.org/entity/Q952844'), ('name', 'pontifex')]
[('obj', 'http://www.wikidata.org/entity/Q5413489'), ('name', 'curule aedile')]
[('obj', 'http://www.wikidata.org/entity/Q172907'), ('name', 'praetor')]
[('obj', 'http://www.wikidata.org/entity/Q189430'), ('name', 'quaestor')]
[('obj', 'http://www.wikidata.org/entity/Q1426927'), ('name', 'Flamen Dialis')]
[('obj', 'http://www.wikidata.org/entity/Q236885'), ('name', 'Roman dictator')]
10


In [9]:
# find all the Roman dictators
queryString = """
SELECT DISTINCT ?obj ?name
WHERE { 
    ?obj wdt:P39 wd:Q236885.
    ?obj sc:name ?name.
}
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q1048'), ('name', 'Julius Caesar')]
[('obj', 'http://www.wikidata.org/entity/Q529867'), ('name', 'Titus Lartius')]
[('obj', 'http://www.wikidata.org/entity/Q334940'), ('name', 'Aulus Cornelius Cossus')]
[('obj', 'http://www.wikidata.org/entity/Q359810'), ('name', 'Lucius Caecilius Metellus')]
[('obj', 'http://www.wikidata.org/entity/Q929498'), ('name', 'Quintus Caecilius Metellus')]
[('obj', 'http://www.wikidata.org/entity/Q155312'), ('name', 'Gaius Claudius Centho')]
[('obj', 'http://www.wikidata.org/entity/Q297783'), ('name', 'Appius Claudius Caecus')]
[('obj', 'http://www.wikidata.org/entity/Q703448'), ('name', 'Marcus Livius Salinator')]
[('obj', 'http://www.wikidata.org/entity/Q519186'), ('name', 'Quintus Hortensius')]
[('obj', 'http://www.wikidata.org/entity/Q1223522'), ('name', 'Caius Poetelius Libo Visolus')]
[('obj', 'http://www.wikidata.org/entity/Q682087'), ('name', 'Gaius Poetelius Libo Visolus')]
[('obj', 'http://www.wikidat

In [10]:
obj = [{"uri":r[0][1],"name":r[1][1]} for r in x]
evaluation.add_result(evaluation.get_index_workflow(pt),"2", evaluation.TYPE_SET ,"uri", obj)

The index of this workflow is: 1_5
The path is /locale/data/jupyter/prando/notebook/2022/results/workflow1_5.json
JSON object updated


## Task 3

In [11]:
# find the roman empire and roman republic
queryString = """
SELECT DISTINCT ?obj ?o
WHERE { 
    wd:Q1747689 ?prop ?obj.
    ?obj sc:name ?o.
    FILTER(REGEX(?o,"[Rr]oman")).
} 
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q5652564'), ('o', 'Category:Ancient Romans')]
[('obj', 'http://www.wikidata.org/entity/Q2277'), ('o', 'Roman Empire')]
[('obj', 'http://www.wikidata.org/entity/Q17167'), ('o', 'Roman Republic')]
[('obj', 'http://www.wikidata.org/entity/Q201038'), ('o', 'Roman Kingdom')]
4


In [12]:
obj = [{"uri":"http://www.wikidata.org/entity/Q2277","name":"Roman Empire"},{"uri":"http://www.wikidata.org/entity/Q17167","name":"Roman Republic"},]
evaluation.add_result(evaluation.get_index_workflow(pt),"3", evaluation.TYPE_SET ,"uri", obj)

The index of this workflow is: 1_5
The path is /locale/data/jupyter/prando/notebook/2022/results/workflow1_5.json
JSON object updated


## Task 4

In [13]:
# properties of Roman Empire
queryString = """
SELECT DISTINCT ?prop ?name
WHERE { 
    wd:Q2277 ?prop ?obj.
    ?prop sc:name ?name.
}
"""

print("Results")
x=run_query(queryString)

Results
[('prop', 'http://www.wikidata.org/prop/direct/P1151'), ('name', "topic's main Wikimedia portal")]
[('prop', 'http://www.wikidata.org/prop/direct/P122'), ('name', 'basic form of government')]
[('prop', 'http://www.wikidata.org/prop/direct/P1245'), ('name', 'OmegaWiki Defined Meaning')]
[('prop', 'http://www.wikidata.org/prop/direct/P1343'), ('name', 'described by source')]
[('prop', 'http://www.wikidata.org/prop/direct/P1344'), ('name', 'participant in')]
[('prop', 'http://www.wikidata.org/prop/direct/P1365'), ('name', 'replaces')]
[('prop', 'http://www.wikidata.org/prop/direct/P1366'), ('name', 'replaced by')]
[('prop', 'http://www.wikidata.org/prop/direct/P140'), ('name', 'religion')]
[('prop', 'http://www.wikidata.org/prop/direct/P1417'), ('name', 'Encyclopædia Britannica Online ID')]
[('prop', 'http://www.wikidata.org/prop/direct/P1464'), ('name', 'category for people born here')]
[('prop', 'http://www.wikidata.org/prop/direct/P1465'), ('name', 'category for people who died

In [14]:
# use the property office held by head of state
queryString = """
SELECT DISTINCT ?obj ?name
WHERE { 
    wd:Q2277 wdt:P1906 ?obj.
    ?obj sc:name ?name.
}
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q842606'), ('name', 'Roman emperor')]
1


In [15]:
# find how people are connected with Roman emperor
queryString = """
SELECT DISTINCT ?prop ?name
WHERE { 
    ?obj ?prop wd:Q842606;
            wdt:P31 wd:Q5.
    ?prop sc:name ?name.
}
"""

print("Results")
x=run_query(queryString)

Results
[('prop', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('prop', 'http://www.wikidata.org/prop/direct/P39'), ('name', 'position held')]
[('prop', 'http://www.wikidata.org/prop/direct/P97'), ('name', 'noble title')]
3


In [17]:
# return the properties of roman emperors
queryString = """
SELECT DISTINCT ?prop ?name
WHERE { 
    ?obj wdt:P39 wd:Q842606;
        ?prop ?o.
    ?prop sc:name ?name.
    FILTER(!isLiteral(?o)).
}
ORDER BY ?name
"""

print("Results")
x=run_query(queryString)

Results
[('prop', 'http://www.wikidata.org/prop/direct/P2366'), ('name', 'Roman agnomen')]
[('prop', 'http://www.wikidata.org/prop/direct/P2365'), ('name', 'Roman cognomen')]
[('prop', 'http://www.wikidata.org/prop/direct/P2359'), ('name', 'Roman nomen gentilicium')]
[('prop', 'http://www.wikidata.org/prop/direct/P2358'), ('name', 'Roman praenomen')]
[('prop', 'http://www.wikidata.org/prop/direct/P945'), ('name', 'allegiance')]
[('prop', 'http://www.wikidata.org/prop/direct/P166'), ('name', 'award received')]
[('prop', 'http://www.wikidata.org/prop/direct/P411'), ('name', 'canonization status')]
[('prop', 'http://www.wikidata.org/prop/direct/P509'), ('name', 'cause of death')]
[('prop', 'http://www.wikidata.org/prop/direct/P40'), ('name', 'child')]
[('prop', 'http://www.wikidata.org/prop/direct/P607'), ('name', 'conflict')]
[('prop', 'http://www.wikidata.org/prop/direct/P7763'), ('name', 'copyright status as a creator')]
[('prop', 'http://www.wikidata.org/prop/direct/P27'), ('name', 'c

In [18]:
# find examples of manner of death
queryString = """
SELECT DISTINCT ?o ?name
WHERE { 
    ?obj wdt:P39 wd:Q842606;
        wdt:P1196 ?o.
    ?o sc:name ?name.
}
ORDER BY ?name
"""

print("Results")
x=run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q171558'), ('name', 'accident')]
[('o', 'http://www.wikidata.org/entity/Q8454'), ('name', 'capital punishment')]
[('o', 'http://www.wikidata.org/entity/Q18663901'), ('name', 'death in battle')]
[('o', 'http://www.wikidata.org/entity/Q2746270'), ('name', 'forced suicide')]
[('o', 'http://www.wikidata.org/entity/Q149086'), ('name', 'homicide')]
[('o', 'http://www.wikidata.org/entity/Q3739104'), ('name', 'natural causes')]
[('o', 'http://www.wikidata.org/entity/Q10737'), ('name', 'suicide')]
7


In [20]:
# return the emperors killed by homicide and using optional return also the killer, if it exists
queryString = """
SELECT DISTINCT ?emperor ?name ?killer ?kName
WHERE { 
    ?emperor wdt:P39 wd:Q842606;
            wdt:P1196 wd:Q149086.
    OPTIONAL{
        ?emperor wdt:P157 ?killer.
        ?killer sc:name ?kName.
    }
    
    ?emperor sc:name ?name.
}
"""

print("Results")
x=run_query(queryString)

Results
[('emperor', 'http://www.wikidata.org/entity/Q103860'), ('name', 'Julius Nepos'), ('killer', 'http://www.wikidata.org/entity/Q1228819'), ('kName', 'Ovida')]
[('emperor', 'http://www.wikidata.org/entity/Q189108'), ('name', 'Gratian'), ('killer', 'http://www.wikidata.org/entity/Q2846204'), ('kName', 'Andragathius')]
[('emperor', 'http://www.wikidata.org/entity/Q1434'), ('name', 'Commodus'), ('killer', 'http://www.wikidata.org/entity/Q980693'), ('kName', 'Narcissus')]
[('emperor', 'http://www.wikidata.org/entity/Q183089'), ('name', 'Geta'), ('killer', 'http://www.wikidata.org/entity/Q1446'), ('kName', 'Caracalla')]
[('emperor', 'http://www.wikidata.org/entity/Q1411'), ('name', 'Claudius')]
[('emperor', 'http://www.wikidata.org/entity/Q1423'), ('name', 'Domitian')]
[('emperor', 'http://www.wikidata.org/entity/Q185538'), ('name', 'Constans')]
[('emperor', 'http://www.wikidata.org/entity/Q46734'), ('name', 'Constantine II')]
[('emperor', 'http://www.wikidata.org/entity/Q437604'), ('n

In [21]:
# add the elements to the JSON gt
objs = []
for i in x:
    f_uri = i[0][1]
    f_name = i[1][1]
    obj = {}
    obj["refers_to"] = f_uri
    obj["refers_to_name"] = f_name
    if len(i) == 2:
        # the killer is not known
        c_uri = None
        c_name = None
    else:
        c_uri = i[2][1]
        c_name = i[3][1]
    obj["check"] = ["uri","name"]
    obj["uri"]= c_uri
    obj["name"]= c_name
    objs.append(obj)

evaluation.add_result(evaluation.get_index_workflow(pt),"4", evaluation.TYPE_REFERRED ,["uri","name"], objs)

The index of this workflow is: 1_5
The path is /locale/data/jupyter/prando/notebook/2022/results/workflow1_5.json
JSON object updated
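
The cell above decides whether a killer is known by checking the row length. A slightly more defensive variant, sketched below, looks the bindings up by variable name, so a missing `?killer` simply yields `None`. The function name `build_referred` is hypothetical. The sample rows are copied from the output of the query above, and the call to `evaluation.add_result` is left out so that the sketch runs on its own.

```python
def build_referred(rows):
    """Build the Task 4 objects from run_query-style rows using variable names."""
    objs = []
    for row in rows:
        bindings = dict(row)
        objs.append({
            "refers_to": bindings["emperor"],
            "refers_to_name": bindings["name"],
            "check": ["uri", "name"],
            # OPTIONAL variables are absent from the row when unbound
            "uri": bindings.get("killer"),
            "name": bindings.get("kName"),
        })
    return objs


sample = [
    [("emperor", "http://www.wikidata.org/entity/Q183089"), ("name", "Geta"),
     ("killer", "http://www.wikidata.org/entity/Q1446"), ("kName", "Caracalla")],
    [("emperor", "http://www.wikidata.org/entity/Q1411"), ("name", "Claudius")],
]

objs_by_name = build_referred(sample)
print(objs_by_name[0]["name"], objs_by_name[1]["name"])
```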


## Task 5

In [23]:
# find roman emperors killed by another roman emperor
queryString = """
SELECT DISTINCT ?killer ?kName
WHERE { 
    ?emperor wdt:P39 wd:Q842606.
    ?emperor wdt:P157 ?killer.
    ?killer wdt:P39 wd:Q842606.
    ?killer sc:name ?kName.
}
"""

print("Results")
x=run_query(queryString)

Results
[('killer', 'http://www.wikidata.org/entity/Q8413'), ('kName', 'Constantine the Great')]
[('killer', 'http://www.wikidata.org/entity/Q1446'), ('kName', 'Caracalla')]
2


In [24]:
obj = [{"uri":r[0][1],"name":r[1][1]} for r in x]
evaluation.add_result(evaluation.get_index_workflow(pt),"5", evaluation.TYPE_SET ,"uri", obj)

The index of this workflow is: 1_5
The path is /locale/data/jupyter/prando/notebook/2022/results/workflow1_5.json
JSON object updated
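
Task 5 asks for 4-tuples (killer IRI and label, killed IRI and label), whereas the query above projects only the killer. The sketch below shows one possible way to also project the killed emperor and to turn the rows into 4-tuples. The query string has not been run here, the names `KILLER_PAIRS_QUERY` and `to_four_tuples` are hypothetical, and the single sample row reuses the Geta/Caracalla pair already returned in Task 4.

```python
# Assumed variant of the Task 5 query: same patterns, plus the killed emperor's IRI and name
KILLER_PAIRS_QUERY = """
SELECT DISTINCT ?killer ?kName ?emperor ?name
WHERE {
    ?emperor wdt:P39 wd:Q842606;
             wdt:P157 ?killer;
             sc:name ?name.
    ?killer wdt:P39 wd:Q842606;
            sc:name ?kName.
}
"""


def to_four_tuples(rows):
    """Map run_query-style rows to (killer IRI, killer label, killed IRI, killed label)."""
    tuples = []
    for row in rows:
        b = dict(row)
        tuples.append((b["killer"], b["kName"], b["emperor"], b["name"]))
    return tuples


sample = [
    [("killer", "http://www.wikidata.org/entity/Q1446"), ("kName", "Caracalla"),
     ("emperor", "http://www.wikidata.org/entity/Q183089"), ("name", "Geta")],
]

pairs = to_four_tuples(sample)
print(pairs[0])
```

In the notebook, the rows would come from `run_query(KILLER_PAIRS_QUERY)` instead of the hand-written sample.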


## Task 6

In [25]:
# from the first query, use the predicate conflict P607
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
    wd:Q1048 wdt:P607 ?obj.
    ?obj sc:name ?name
}
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q202161'), ('name', 'Gallic War')]
[('obj', 'http://www.wikidata.org/entity/Q1238338'), ('name', "Caesar's Civil War")]
2


In [26]:
# look at the instance of
queryString = """
SELECT DISTINCT ?inst ?name
WHERE {
    wd:Q1048 wdt:P607 ?obj.
    ?obj wdt:P31 ?inst.
    ?inst sc:name ?name
}
"""

print("Results")
x=run_query(queryString)

Results
[('inst', 'http://www.wikidata.org/entity/Q198'), ('name', 'war')]
[('inst', 'http://www.wikidata.org/entity/Q8465'), ('name', 'civil war')]
2


In [27]:
### insert the result of TASK 6 in the file
og_uri = "http://www.wikidata.org/entity/Q198"
og_name = "war"
obj = {"uri":og_uri,"name":og_name}
evaluation.add_result(evaluation.get_index_workflow(pt),"6", evaluation.TYPE_SINGLE ,"uri", [obj] ,"all")

The index of this workflow is: 1_5
The path is /locale/data/jupyter/prando/notebook/2022/results/workflow1_5.json
JSON object updated


`wd:Q198` -> war

## Task 7

In [57]:
# find the wars
queryString = """
SELECT DISTINCT ?obj ?name 
WHERE {
    ?obj wdt:P710 wd:Q2277 .
    ?obj wdt:P31 wd:Q198.
    ?obj sc:name ?name.
}
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q6254'), ('name', 'Kitos War')]
[('obj', 'http://www.wikidata.org/entity/Q500409'), ('name', 'Roman–Persian Wars')]
[('obj', 'http://www.wikidata.org/entity/Q202578'), ('name', 'First Jewish-Roman War')]
[('obj', 'http://www.wikidata.org/entity/Q1533828'), ('name', 'Roman–Parthian War of 58–63')]
[('obj', 'http://www.wikidata.org/entity/Q3651751'), ('name', "Trajan's Parthian campaign")]
[('obj', 'http://www.wikidata.org/entity/Q933963'), ('name', 'Marcomannic Wars')]
[('obj', 'http://www.wikidata.org/entity/Q3651748'), ('name', 'Parthian war of Caracalla')]
[('obj', 'http://www.wikidata.org/entity/Q1790268'), ('name', 'Roman–Parthian War of 161–166')]
[('obj', 'http://www.wikidata.org/entity/Q84850006'), ('name', 'Roman–Parthian War of 195–198')]
9


## Task 8

In [58]:
# look at the properties of these wars
queryString = """
SELECT DISTINCT ?prop ?name
WHERE {
    ?obj wdt:P710 wd:Q2277 .
    ?obj ?prop ?x.
    ?prop sc:name ?name
}
"""

print("Results")
x=run_query(queryString)

Results
[('prop', 'http://www.wikidata.org/prop/direct/P1296'), ('name', 'Gran Enciclopèdia Catalana ID')]
[('prop', 'http://www.wikidata.org/prop/direct/P131'), ('name', 'located in the administrative territorial entity')]
[('prop', 'http://www.wikidata.org/prop/direct/P1343'), ('name', 'described by source')]
[('prop', 'http://www.wikidata.org/prop/direct/P1417'), ('name', 'Encyclopædia Britannica Online ID')]
[('prop', 'http://www.wikidata.org/prop/direct/P1438'), ('name', 'Jewish Encyclopedia ID (Russian)')]
[('prop', 'http://www.wikidata.org/prop/direct/P1542'), ('name', 'has effect')]
[('prop', 'http://www.wikidata.org/prop/direct/P17'), ('name', 'country')]
[('prop', 'http://www.wikidata.org/prop/direct/P18'), ('name', 'image')]
[('prop', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from')]
[('prop', 'http://www.wikidata.org/prop/direct/P1936'), ('name', 'Digital Atlas of the Roman Empire ID')]
[('prop', 'http://www.wikidata.org/prop/direct/P2163'), ('name',

In [2]:
# count the parts (wdt:P527) of each war
queryString = """
SELECT ?obj ?name (COUNT(*) AS ?c)
WHERE {
    ?obj wdt:P710 wd:Q2277 .
    ?obj wdt:P31 wd:Q198.
    ?obj wdt:P527 ?x.
    ?obj sc:name ?name.
}
GROUP BY ?obj ?name
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q500409'), ('name', 'Roman–Persian Wars'), ('c', '2')]
1
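
The query above counts the parts of each war but does not yet return the opponent, and it drops wars that have no `wdt:P527` statement. A possible next step is sketched below under explicit assumptions: the opponent is taken to be any other participant (`wdt:P710`) of the war, and every part (`wdt:P527`) is counted as a battle without checking its class. Neither assumption has been verified against the endpoint, and the query has not been run. The names `WAR_BATTLES_QUERY` and `to_five_tuples` are hypothetical, and the opponent in the sample row is a placeholder, not a real result.

```python
# Assumed query: opponent = another participant, battles = parts of the war
WAR_BATTLES_QUERY = """
SELECT ?war ?name ?opp ?oppName (COUNT(DISTINCT ?part) AS ?battles)
WHERE {
    ?war wdt:P710 wd:Q2277 ;
         wdt:P31 wd:Q198 ;
         sc:name ?name ;
         wdt:P710 ?opp .
    ?opp sc:name ?oppName .
    FILTER(?opp != wd:Q2277)
    OPTIONAL { ?war wdt:P527 ?part . }
}
GROUP BY ?war ?name ?opp ?oppName
"""


def to_five_tuples(rows):
    """Map rows to (war IRI, war label, opponent IRI, opponent label, number of battles)."""
    tuples = []
    for row in rows:
        b = dict(row)
        tuples.append((b["war"], b["name"], b["opp"], b["oppName"], int(b["battles"])))
    return tuples


placeholder_row = [
    ("war", "http://www.wikidata.org/entity/Q500409"),
    ("name", "Roman–Persian Wars"),
    ("opp", "http://example.org/opponent"),  # placeholder IRI, not a real result
    ("oppName", "placeholder opponent"),     # placeholder label, not a real result
    ("battles", "2"),
]

print(to_five_tuples([placeholder_row]))
```

Before relying on this query, the class of the parts should be explored with a dedicated SPARQL query, as the general instructions require, so that only actual battles are counted.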
