# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload starts with a usually vague, open-ended question and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.
    
    

In [2]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-NOTEBOOK_CODE_HERE-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try:
        results = sparql.query()
        json_results = results.convert()
        bindings_list = json_results['results']['bindings']
        if len(bindings_list) == 0:
            print("Empty")
            return 0

        # print one list of (variable, value) pairs per result row
        for bindings in bindings_list:
            print([(var, value['value']) for var, value in bindings.items()])

        return len(bindings_list)

    except Exception as e:
        print("The operation failed", e)

# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
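
The two helpers above print their results directly. As a minimal sketch (not part of the original setup), the function below flattens a SPARQL JSON response of the shape returned by `results.convert()` into plain Python dictionaries, which makes the rows easier to post-process. The names `bindings_to_rows` and `sample_response` are hypothetical, and the sample is hand-written to mirror two rows printed later in this notebook.

```python
def bindings_to_rows(json_results):
    """Flatten a SPARQL SELECT JSON response into a list of {variable: value} dicts."""
    flat_rows = []
    for bindings in json_results["results"]["bindings"]:
        flat_rows.append({var: cell["value"] for var, cell in bindings.items()})
    return flat_rows


# Hand-written sample in the SPARQL JSON results layout (illustrative only)
sample_response = {
    "head": {"vars": ["p", "pname"]},
    "results": {
        "bindings": [
            {
                "p": {"type": "uri", "value": "http://www.wikidata.org/prop/direct/P30"},
                "pname": {"type": "literal", "value": "continent"},
            },
            {
                "p": {"type": "uri", "value": "http://www.wikidata.org/prop/direct/P361"},
                "pname": {"type": "literal", "value": "part of"},
            },
        ]
    },
}

flat = bindings_to_rows(sample_response)
for row in flat:
    print(row["p"], row["pname"])
```

Returning rows instead of printing them would let later cells filter or count results in Python without re-running the query.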


# GEO Workflow Series ("archaeological sites") 

Consider the following exploratory information need:

> Search for archaeological sites in the world, across countries, continents, and in reference to their culture

## Useful URIs for the current workflow


The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P279`    | subclass      | predicate |
| `wdt:P17`     | country       | predicate |
| `wd:Q38`      | Italy  | node      |
| `wd:Q641556`  | Verona Arena  | node      |
| `wd:Q1747689` | Ancient Rome  | node |
| `wd:Q46`      | Europe        | node |
| `wd:Q173527`  | Knossos       | node |
| `wd:Q839954`  | archaeological site | node |


Also consider

```
?p wdt:P17 wd:Q38  . 
?p wdt:P31 wd:Q839954  . 
```

is the BGP to retrieve all **archaeological sites in Italy**.

## Workload Goals

1. Identify the BGP that connects an archaeological site to the country, the continent, and the culture

2. Identify the BGP to retrieve other types of an archaeological site, e.g., human settlement or theatre

3. Is there any relevant numerical attribute that describes these sites, e.g., number of visitors?

4. Analyze the number of archaeological sites per type and country
 
   4.1 Which country has the most archaeological sites? Which country has the most human settlements?
   
   4.2 Which countries have Ancient Rome sites, which other "archaeological cultures" are described?
   
   4.3 Which country has the most diverse set of civilizations or cultures across its sites?
   
   4.4 If you are interested in visiting some sites, which country would you pick? Based on what criteria?


In [8]:
# start your workflow here

In [18]:
queryString = """
SELECT ?p ?name
WHERE { 
?p wdt:P17 wd:Q38  . 
?p wdt:P31 wd:Q839954  . 

?p <http://schema.org/name> ?name .

}
LIMIT 30
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/entity/Q810069'), ('name', 'Basilica Sempronia')]
[('p', 'http://www.wikidata.org/entity/Q3874144'), ('name', 'Necropolis of Catania')]
[('p', 'http://www.wikidata.org/entity/Q17992699'), ('name', "Sant'Ansano, Spoleto")]
[('p', 'http://www.wikidata.org/entity/Q16537540'), ('name', 'House of the Relief of Telephus')]
[('p', 'http://www.wikidata.org/entity/Q104787278'), ('name', 'Tomb of Aulus Hirtius')]
[('p', 'http://www.wikidata.org/entity/Q1453042'), ('name', 'Nuragic complex of Sa Sedda e Sos Carros')]
[('p', 'http://www.wikidata.org/entity/Q3616683'), ('name', 'Roman Amphitheatre of Benevento')]
[('p', 'http://www.wikidata.org/entity/Q85429625'), ('name', 'Accu Sa Pira Nuraghe')]
[('p', 'http://www.wikidata.org/entity/Q3801615'), ('name', 'Hypogeum of via Livenza')]
[('p', 'http://www.wikidata.org/entity/Q3874153'), ('name', 'Necropolis of Pedras Serradas')]
[('p', 'http://www.wikidata.org/entity/Q3879885'), ('name', 'nuraghe  Lugherras')]
[

30

# Workload 1

In [None]:
# 1. Identify the BGP that connects an archaeological site to the country, the continent, and the culture

In [None]:
# I start by exploring an archaeological site given in the table: wd:Q641556 (Verona Arena)

In [15]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

?s ?p wd:Q641556 .
?p <http://schema.org/name> ?pname .
} 

"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('pname', 'different from')]
[('p', 'http://www.wikidata.org/prop/direct/P2283'), ('pname', 'uses')]
[('p', 'http://www.wikidata.org/prop/direct/P276'), ('pname', 'location')]
[('p', 'http://www.wikidata.org/prop/direct/P301'), ('pname', "category's main topic")]
[('p', 'http://www.wikidata.org/prop/direct/P69'), ('pname', 'educated at')]
[('p', 'http://www.wikidata.org/prop/direct/P915'), ('pname', 'filming location')]
[('p', 'http://www.wikidata.org/prop/direct/P971'), ('pname', 'category combines topics')]
[('p', 'http://www.wikidata.org/prop/direct/P483'), ('pname', 'recorded at studio or venue')]
[('p', 'http://www.wikidata.org/prop/direct/P1855'), ('pname', 'Wikidata property example')]


9

In [16]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

wd:Q641556 ?p  ?o.
?p <http://schema.org/name> ?pname .
} 

"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P5633'), ('pname', 'Amphi-Theatrum ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1004'), ('pname', 'MusicBrainz place ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1017'), ('pname', 'Vatican Library ID (former scheme)')]
[('p', 'http://www.wikidata.org/prop/direct/P131'), ('pname', 'located in the administrative territorial entity')]
[('p', 'http://www.wikidata.org/prop/direct/P1481'), ('pname', 'vici.org ID')]
[('p', 'http://www.wikidata.org/prop/direct/P149'), ('pname', 'architectural style')]
[('p', 'http://www.wikidata.org/prop/direct/P1566'), ('pname', 'GeoNames ID')]
[('p', 'http://www.wikidata.org/prop/direct/P17'), ('pname', 'country')]
[('p', 'http://www.wikidata.org/prop/direct/P18'), ('pname', 'image')]
[('p', 'http://www.wikidata.org/prop/direct/P186'), ('pname', 'made from material')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('pname', 'different from')]
[('p', 'http://www.wikidata.org/prop/direct/P19

39

In [None]:
# Here I get:
# 'culture' --> wdt:P2596
# 'country' --> wdt:P17

In [None]:
# Now I use wd:Q46 (Europe) to find the IRI of the continent class.

In [20]:
queryString = """
SELECT DISTINCT ?o ?oname
WHERE { 

wd:Q46 wdt:P31  ?o.   # Europe --> instance of --> ?o
?o <http://schema.org/name> ?oname .
} 

"""

print("Results")
run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q5107'), ('oname', 'continent')]
[('o', 'http://www.wikidata.org/entity/Q2418896'), ('oname', 'part of the world')]


2

In [24]:
# GET:
# 'continent' --> wd:Q5107

In [None]:
# Explore all continents

In [22]:
queryString = """
SELECT DISTINCT ?c ?cname
WHERE { 

?c wdt:P31 wd:Q5107 .   # ?c --> instance of --> continent
?c <http://schema.org/name> ?cname .
} 

"""

print("Results")
run_query(queryString)

Results
[('c', 'http://www.wikidata.org/entity/Q15'), ('cname', 'Africa')]
[('c', 'http://www.wikidata.org/entity/Q18'), ('cname', 'South America')]
[('c', 'http://www.wikidata.org/entity/Q46'), ('cname', 'Europe')]
[('c', 'http://www.wikidata.org/entity/Q48'), ('cname', 'Asia')]
[('c', 'http://www.wikidata.org/entity/Q49'), ('cname', 'North America')]
[('c', 'http://www.wikidata.org/entity/Q51'), ('cname', 'Antarctica')]
[('c', 'http://www.wikidata.org/entity/Q538'), ('cname', 'Insular Oceania')]
[('c', 'http://www.wikidata.org/entity/Q828'), ('cname', 'Americas')]
[('c', 'http://www.wikidata.org/entity/Q3960'), ('cname', 'Australia')]
[('c', 'http://www.wikidata.org/entity/Q2872203'), ('cname', 'mainland Australia')]
[('c', 'http://www.wikidata.org/entity/Q27527'), ('cname', 'Afro-Eurasia')]
[('c', 'http://www.wikidata.org/entity/Q7856408'), ('cname', 'Turtle Island')]
[('c', 'http://www.wikidata.org/entity/Q150408'), ('cname', 'Zealandia')]


13

In [None]:
# Now I use wd:Q38 (Italy) to find the property that connects a country to a continent

In [26]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

wd:Q38 ?p wd:Q46 .       # Italy --> ?p --> Europe
?p <http://schema.org/name> ?pname .
} 

"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P30'), ('pname', 'continent')]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('pname', 'part of')]


2

In [None]:
# GET property: wdt:P30 --> 'continent'

In [None]:
# Use all the properties found above to answer the first question:
# Identify the BGP that connects an archaeological site to the country, the continent, and the culture

In [30]:
queryString = """
SELECT DISTINCT ?asname ?cuname ?coname ?cname
WHERE { 

?as wdt:P31 wd:Q839954  ;
    wdt:P2596 ?cu ;           # ?as --> archaeological site /// ?cu --> culture
    wdt:P17 ?co  .
?co wdt:P30 ?c .               # ?co --> country
?c  wdt:P31 wd:Q5107 .         # ?c --> continent



?as <http://schema.org/name> ?asname .
?cu <http://schema.org/name> ?cuname .
?co <http://schema.org/name> ?coname .
?c <http://schema.org/name> ?cname .
} 

LIMIT 30
"""

print("Results")
run_query(queryString)

Results
[('asname', 'Seleucia'), ('cuname', 'Greater Iran'), ('coname', 'Sasanian Empire'), ('cname', 'Africa')]
[('asname', 'Taq-i Kisra'), ('cuname', 'Greater Iran'), ('coname', 'Sasanian Empire'), ('cname', 'Africa')]
[('asname', 'Seleucia'), ('cuname', 'Hellenistic civilization'), ('coname', 'Sasanian Empire'), ('cname', 'Africa')]
[('asname', 'Ghirza'), ('cuname', 'Ancient Rome'), ('coname', 'Libya'), ('cname', 'Africa')]
[('asname', 'Chellah'), ('cuname', 'Ancient Rome'), ('coname', 'Morocco'), ('cname', 'Africa')]
[('asname', 'Chellah'), ('cuname', 'Marinid Dynasty'), ('coname', 'Morocco'), ('cname', 'Africa')]
[('asname', 'Chellah'), ('cuname', 'Berbers'), ('coname', 'Morocco'), ('cname', 'Africa')]
[('asname', 'Volubilis'), ('cuname', 'Ancient Rome'), ('coname', 'Morocco'), ('cname', 'Africa')]
[('asname', 'Volubilis'), ('cuname', 'Berbers'), ('coname', 'Morocco'), ('cname', 'Africa')]
[('asname', 'Volubilis'), ('cuname', 'Punics'), ('coname', 'Morocco'), ('cname', 'Africa')]


30

In [34]:
# I notice that each archaeological site can have several cultures, so I group them per site

In [31]:
queryString = """
SELECT DISTINCT ?asname ?coname ?cname (GROUP_CONCAT(?cuname ; separator = ", ") AS ?cultures)
WHERE { 

?as wdt:P31 wd:Q839954  ;
    wdt:P2596 ?cu .           # ?as --> archaeological site /// ?cu --> culture
?as    wdt:P17 ?co  .
?co wdt:P30 ?c .               # ?co --> country
?c  wdt:P31 wd:Q5107 .         # ?c --> continent



?as <http://schema.org/name> ?asname .
?cu <http://schema.org/name> ?cuname .
?co <http://schema.org/name> ?coname .
?c <http://schema.org/name> ?cname .
}
GROUP BY ?asname ?coname ?cname
LIMIT 20

"""

print("Results")
run_query(queryString)

Results
[('asname', 'Kalabsha'), ('coname', 'Egypt'), ('cname', 'Africa'), ('cultures', 'Ancient Egypt')]
[('asname', 'Antioch'), ('coname', 'Turkey'), ('cname', 'Asia'), ('cultures', 'Seleucids')]
[('asname', 'Palenqué'), ('coname', 'Mexico'), ('cname', 'North America'), ('cultures', 'Maya civilization')]
[('asname', 'Punic wall of Cartagena'), ('coname', 'Spain'), ('cname', 'Europe'), ('cultures', 'Ancient Carthage')]
[('asname', 'Chalasmenos'), ('coname', 'Greece'), ('cname', 'Europe'), ('cultures', 'Minoan civilization')]
[('asname', 'Necropolis of Populonia'), ('coname', 'Italy'), ('cname', 'Europe'), ('cultures', 'Etruscans')]
[('asname', 'Monte Albán'), ('coname', 'Mexico'), ('cname', 'North America'), ('cultures', 'Zapotec civilization')]
[('asname', 'Temple of Bellona'), ('coname', 'Italy'), ('cname', 'Europe'), ('cultures', 'Ancient Rome')]
[('asname', 'nuraghe  Nieddu'), ('coname', 'Italy'), ('cname', 'Europe'), ('cultures', 'nuragic civilization')]
[('asname', 'Qift'), ('co

20
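
The grouping performed by `GROUP_CONCAT` can also be reproduced on the client side. The following is a minimal sketch, assuming the (site, culture) pairs have already been collected in Python; the pairs are copied from the ungrouped query output earlier in this workload, and `group_cultures` is a hypothetical helper name.

```python
from collections import defaultdict

# (site, culture) pairs copied from the ungrouped query output above
site_culture_pairs = [
    ("Chellah", "Ancient Rome"),
    ("Chellah", "Marinid Dynasty"),
    ("Chellah", "Berbers"),
    ("Volubilis", "Ancient Rome"),
    ("Volubilis", "Berbers"),
    ("Volubilis", "Punics"),
]


def group_cultures(pairs, separator=", "):
    """Join the cultures of each site, keeping first-seen order and dropping duplicates."""
    grouped = defaultdict(list)
    for site, culture in pairs:
        if culture not in grouped[site]:
            grouped[site].append(culture)
    return {site: separator.join(cultures) for site, cultures in grouped.items()}


cultures_by_site = group_cultures(site_culture_pairs)
for site, cultures in cultures_by_site.items():
    print(site, "->", cultures)
```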

In [None]:
#
#

# Workload 2

In [None]:
# 2. Identify the BGP to retrieve other types of an archaeological site, e.g., human settlement or theatre

In [28]:
queryString = """
SELECT DISTINCT ?o ?oname
WHERE { 


?as wdt:P31 wd:Q839954  ;          # for each archaeological site, retrieve every class it is an instance of
    wdt:P31 ?o .

?as <http://schema.org/name> ?name .
?o <http://schema.org/name> ?oname .


} 

LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q14601169'), ('oname', 'human fortified settlement')]
[('o', 'http://www.wikidata.org/entity/Q325017'), ('oname', 'causewayed enclosure')]
[('o', 'http://www.wikidata.org/entity/Q5069563'), ('oname', 'chambered long barrow')]
[('o', 'http://www.wikidata.org/entity/Q30504603'), ('oname', 'Maya site in Mexico')]
[('o', 'http://www.wikidata.org/entity/Q6581615'), ('oname', 'thermae')]
[('o', 'http://www.wikidata.org/entity/Q95463156'), ('oname', 'Ancient Roman dam')]
[('o', 'http://www.wikidata.org/entity/Q1128906'), ('oname', 'medina quarter')]
[('o', 'http://www.wikidata.org/entity/Q19979289'), ('oname', 'birth house')]
[('o', 'http://www.wikidata.org/entity/Q29702995'), ('oname', 'chalkotheke')]
[('o', 'http://www.wikidata.org/entity/Q1193753'), ('oname', 'stone row')]
[('o', 'http://www.wikidata.org/entity/Q2006795'), ('oname', 'abschnittsmotte')]
[('o', 'http://www.wikidata.org/entity/Q28936794'), ('oname', 'provincial park in Saskatchew

50

In [None]:
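# The count below returns 48607. COUNT(?o) counts every matching (site, type) solution,
# so this is not the number of different types; that would require COUNT(DISTINCT ?o)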

In [6]:
queryString = """
SELECT (COUNT(?o) AS ?num_type)
WHERE { 


?as wdt:P31 wd:Q839954  ; 
    wdt:P31 ?o .

?as <http://schema.org/name> ?name .
?o <http://schema.org/name> ?oname .

} 

LIMIT 30
"""

print("Results")
run_query(queryString)

Results
[('num_type', '48607')]


1
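
In SPARQL, `COUNT(?o)` adds one unit for every solution in which `?o` is bound, whereas `COUNT(DISTINCT ?o)` counts different values only. The sketch below illustrates the difference in plain Python; the pairs are hypothetical placeholders, not real query output.

```python
# Hypothetical (site, type) pairs; the identifiers are placeholders, not real query output
site_type_pairs = [
    ("site_A", "archaeological site"),
    ("site_A", "human settlement"),
    ("site_B", "archaeological site"),
    ("site_B", "thermae"),
    ("site_C", "archaeological site"),
]

# Analogue of COUNT(?o): one unit per matching solution
num_solutions = len(site_type_pairs)

# Analogue of COUNT(DISTINCT ?o): one unit per different type
num_distinct_types = len({site_type for _, site_type in site_type_pairs})

print(num_solutions, num_distinct_types)
```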

In [None]:
#
#

# Workload 3

In [None]:
# 3. Is there any relevant numerical attribute that describes these sites, e.g., number of visitors?

In [7]:
# I look at one particular site to see all its attributes

In [3]:
queryString = """
SELECT DISTINCT ?p ?pname ?o ?oname
WHERE { 

wd:Q641556 ?p  ?o.
?p <http://schema.org/name> ?pname .

OPTIONAL{
?o <http://schema.org/name> ?oname .
}

} 
ORDER BY ASC (?o)
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P646'), ('pname', 'Freebase ID'), ('o', '/m/0343fy')]
[('p', 'http://www.wikidata.org/prop/direct/P213'), ('pname', 'ISNI'), ('o', '0000 0001 2165 9603')]
[('p', 'http://www.wikidata.org/prop/direct/P949'), ('pname', 'National Library of Israel ID'), ('o', '000595277')]
[('p', 'http://www.wikidata.org/prop/direct/P269'), ('pname', 'IdRef ID'), ('o', '034525114')]
[('p', 'http://www.wikidata.org/prop/direct/P1004'), ('pname', 'MusicBrainz place ID'), ('o', '05522e79-052a-4542-9be4-3b7715e3f2b2')]
[('p', 'http://www.wikidata.org/prop/direct/P2186'), ('pname', 'Wiki Loves Monuments ID'), ('o', '05L7810006')]
[('p', 'http://www.wikidata.org/prop/direct/P227'), ('pname', 'GND ID'), ('o', '1024966-7')]
[('p', 'http://www.wikidata.org/prop/direct/P5383'), ('pname', 'archINFORM project ID'), ('o', '12401')]
[('p', 'http://www.wikidata.org/prop/direct/P268'), ('pname', 'Bibliothèque nationale de France ID'), ('o', '12527265j')]
[('p', 'http://

48

In [None]:
# For this archaeological site, I don't find any relevant numerical attribute

In [29]:
# I try looking at other kinds of sites: theater, museum

In [9]:
queryString = """
SELECT DISTINCT ?o ?oname
WHERE { 


?as wdt:P31 wd:Q839954  ;          # for each archaeological site, retrieve every class it is an instance of
    wdt:P31 ?o .

?as <http://schema.org/name> ?name .
?o <http://schema.org/name> ?oname .

FILTER(REGEX(?oname, "theater") || REGEX(?oname, "museum"))
} 

LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q3329412'), ('oname', 'archaeological museum')]
[('o', 'http://www.wikidata.org/entity/Q27699276'), ('oname', 'museum collection')]
[('o', 'http://www.wikidata.org/entity/Q33506'), ('oname', 'museum')]
[('o', 'http://www.wikidata.org/entity/Q94701740'), ('oname', 'didactic museum')]
[('o', 'http://www.wikidata.org/entity/Q17431399'), ('oname', 'national museum')]
[('o', 'http://www.wikidata.org/entity/Q2087181'), ('oname', 'historic house museum')]
[('o', 'http://www.wikidata.org/entity/Q718893'), ('oname', 'theater')]
[('o', 'http://www.wikidata.org/entity/Q756102'), ('oname', 'open-air museum')]
[('o', 'http://www.wikidata.org/entity/Q636819'), ('oname', 'archaeological open-air museum')]
[('o', 'http://www.wikidata.org/entity/Q2516357'), ('oname', 'transport museum')]
[('o', 'http://www.wikidata.org/entity/Q207694'), ('oname', 'art museum')]
[('o', 'http://www.wikidata.org/entity/Q24699794'), ('oname', 'museum building')]
[('o', 'http:/

15

In [None]:
# GET: 'theater' --> wd:Q24354
#       'museum' --> wd:Q33506

In [5]:
queryString = """
SELECT DISTINCT ?p ?pname ?o ?oname
WHERE { 

?s ?p  ?o.
?s wdt:P31 ?site .
?p <http://schema.org/name> ?pname .

OPTIONAL{
?o <http://schema.org/name> ?oname .
}

FILTER(?site = wd:Q24354 || ?site = wd:Q33506)
} 
ORDER BY ASC (?o)
LIMIT 30
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P4272'), ('pname', 'DPLA subject ID'), ('o', '"Arts and Industries Building (Washington, D.C.)"')]
[('p', 'http://www.wikidata.org/prop/direct/P4272'), ('pname', 'DPLA subject ID'), ('o', '"Franklin Court (Philadelphia, Pa.)"')]
[('p', 'http://www.wikidata.org/prop/direct/P4272'), ('pname', 'DPLA subject ID'), ('o', '"Independence National Historical Park (Philadelphia, Pa.)"')]
[('p', 'http://www.wikidata.org/prop/direct/P373'), ('pname', 'Commons category'), ('o', '"Licurici" Theater of Chișinău')]
[('p', 'http://www.wikidata.org/prop/direct/P373'), ('pname', 'Commons category'), ('o', '"Mihai Eminescu" National Theater of Chișinău')]
[('p', 'http://www.wikidata.org/prop/direct/P5361'), ('pname', 'BNB person ID'), ('o', '%C3%96sterreichischeNationalbibliothek')]
[('p', 'http://www.wikidata.org/prop/direct/P373'), ('pname', 'Commons category'), ('o', "'62 Center for Theatre and Dance")]
[('p', 'http://www.wikidata.org/prop/direct/P37

30

In [None]:
#
#

# Workload 4

In [6]:
# 4. Analyze the number of archaeological sites per type and country

## Workload 4.1

In [None]:
# 4.1 Which country has the most archaeological sites? Which country has the most human settlements?

In [7]:
# To find which country has the most archaeological sites, I start by listing all the countries with their sites

In [10]:
queryString = """
SELECT DISTINCT ?c ?cname ?as ?asname 
WHERE { 

?as wdt:P17 ?c  . 
?as wdt:P31 wd:Q839954  .

?as <http://schema.org/name> ?asname .
?c <http://schema.org/name> ?cname .

}
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('c', 'http://www.wikidata.org/entity/Q145'), ('cname', 'United Kingdom'), ('as', 'http://www.wikidata.org/entity/Q17647106'), ('asname', 'Two enclosures on Winterbourne Down')]
[('c', 'http://www.wikidata.org/entity/Q145'), ('cname', 'United Kingdom'), ('as', 'http://www.wikidata.org/entity/Q17648184'), ('asname', "Site of St Ethelred's Church and adjoining area to south")]
[('c', 'http://www.wikidata.org/entity/Q145'), ('cname', 'United Kingdom'), ('as', 'http://www.wikidata.org/entity/Q17648556'), ('asname', 'Tumulus in the Warren')]
[('c', 'http://www.wikidata.org/entity/Q145'), ('cname', 'United Kingdom'), ('as', 'http://www.wikidata.org/entity/Q17649098'), ('asname', 'Medieval dyke: part of deer park boundary south of Cow Green')]
[('c', 'http://www.wikidata.org/entity/Q145'), ('cname', 'United Kingdom'), ('as', 'http://www.wikidata.org/entity/Q17650159'), ('asname', 'The Mount bell barrow 110m south west of Mount Farm')]
[('c', 'http://www.wikidata.org/entity/Q145'), ('

20

In [11]:
# Now I count the archaeological sites for each country and find the country that has the most

In [16]:
queryString = """
SELECT DISTINCT ?c ?cname (count(?as) as ?num_arc_site) 
WHERE { 

?as wdt:P17 ?c  . 
?as wdt:P31 wd:Q839954  .

?as <http://schema.org/name> ?asname .
?c <http://schema.org/name> ?cname .

}
GROUP BY ?c ?cname
ORDER BY DESC (?num_arc_site)
LIMIT 1
"""

print("Results")
run_query(queryString)

Results
[('c', 'http://www.wikidata.org/entity/Q145'), ('cname', 'United Kingdom'), ('num_arc_site', '6785')]


1

In [None]:
# Now I look for the IRI of human settlement

In [17]:
queryString = """
SELECT DISTINCT ?o ?oname
WHERE { 


?as wdt:P31 wd:Q839954  ;          # for each archaeological site, retrieve every class it is an instance of
    wdt:P31 ?o .

?as <http://schema.org/name> ?name .
?o <http://schema.org/name> ?oname .

FILTER REGEX(?oname, "human sett")
} 

LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q486972'), ('oname', 'human settlement')]


1

In [None]:
# Now I use Q486972 (human settlement) and find the country that has the most human settlements

In [21]:
queryString = """
SELECT DISTINCT ?c ?cname (count( DISTINCT ?hs) as ?num_human_settlement) 
WHERE { 

?hs wdt:P17 ?c  .                    # ?hs --> human settlement
?hs wdt:P31 wd:Q486972  .

?hs <http://schema.org/name> ?hsname .
?c <http://schema.org/name> ?cname .

}
GROUP BY ?c ?cname
ORDER BY DESC (?num_human_settlement)
LIMIT 1
"""

print("Results")
run_query(queryString)

Results
[('c', 'http://www.wikidata.org/entity/Q38'), ('cname', 'Italy'), ('num_human_settlement', '29476')]


1
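
The two ranking queries in this workload differ only in the class IRI. As a sketch under stated assumptions, the hypothetical helper `build_top_country_query` below generates such a query with an explicit `GROUP BY`. It counts distinct items and does not require the items themselves to have a name, so its counts may differ slightly from the ones printed above.

```python
def build_top_country_query(class_iri, limit=1):
    """Return a SPARQL query ranking countries by their number of instances of class_iri."""
    return f"""
SELECT ?c ?cname (COUNT(DISTINCT ?item) AS ?num_items)
WHERE {{
    ?item wdt:P31 {class_iri} ;     # instance of the requested class
          wdt:P17 ?c .              # country of the item
    ?c <http://schema.org/name> ?cname .
}}
GROUP BY ?c ?cname
ORDER BY DESC(?num_items)
LIMIT {int(limit)}
"""


sites_query = build_top_country_query("wd:Q839954")                 # archaeological site
settlements_query = build_top_country_query("wd:Q486972", limit=3)  # human settlement
print(settlements_query)
```

Either string could then be passed to `run_query` from the setup cell, which prepends the `wd:` and `wdt:` prefixes.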

In [None]:
#
#

## Workload 4.2

In [6]:
# 4.2 Which countries have Ancient Rome sites, which other "archaeological cultures" are described?

In [7]:
# I find the IRI for Ancient Rome culture

In [9]:
queryString = """
SELECT DISTINCT ?cu ?cuname
WHERE { 

?as wdt:P31 wd:Q839954  ;
    wdt:P2596 ?cu ;           # ?as --> archaeological site /// ?cu --> culture
    wdt:P17 ?co  .
?co wdt:P30 ?c .               # ?co --> country
?c  wdt:P31 wd:Q5107 .         # ?c --> continent



?as <http://schema.org/name> ?asname .
?cu <http://schema.org/name> ?cuname .
?co <http://schema.org/name> ?coname .
?c <http://schema.org/name> ?cname .

FILTER REGEX(?cuname, "Ancient Ro")
} 

LIMIT 30
"""

print("Results")
run_query(queryString)

Results
[('cu', 'http://www.wikidata.org/entity/Q1747689'), ('cuname', 'Ancient Rome')]


1

In [None]:
# GET: 'Ancient Rome' --> wd:Q1747689

In [23]:
# I look for Ancient Rome sites

In [12]:
queryString = """
SELECT DISTINCT ?as ?asname ?cuname ?coname 
WHERE { 

?as wdt:P31 wd:Q839954  ;
    wdt:P2596  wd:Q1747689;           # ?as --> archaeological site /// ?cu --> culture
    wdt:P17 ?co  .
?co wdt:P30 ?c .               # ?co --> country
?c  wdt:P31 wd:Q5107 .         # ?c --> continent



?as <http://schema.org/name> ?asname .
wd:Q1747689 <http://schema.org/name> ?cuname .
?co <http://schema.org/name> ?coname .
?c <http://schema.org/name> ?cname .

#FILTER REGEX(?cuname, "Ancient Ro")
} 

LIMIT 30
"""

print("Results")
run_query(queryString)

Results
[('as', 'http://www.wikidata.org/entity/Q17652208'), ('asname', 'Excavation at Largo Perosi - Via Giulia (Rome)'), ('cuname', 'Ancient Rome'), ('coname', 'Italy')]
[('as', 'http://www.wikidata.org/entity/Q21208533'), ('asname', 'Temple of Fortuna Respiciens'), ('cuname', 'Ancient Rome'), ('coname', 'Italy')]
[('as', 'http://www.wikidata.org/entity/Q908472'), ('asname', 'Domus Augustana'), ('cuname', 'Ancient Rome'), ('coname', 'Italy')]
[('as', 'http://www.wikidata.org/entity/Q22907580'), ('asname', 'Palace of Domitian'), ('cuname', 'Ancient Rome'), ('coname', 'Italy')]
[('as', 'http://www.wikidata.org/entity/Q3558925'), ('asname', 'Chiragan Roman villa'), ('cuname', 'Ancient Rome'), ('coname', 'France')]
[('as', 'http://www.wikidata.org/entity/Q26220411'), ('asname', 'Temple of Hercules'), ('cuname', 'Ancient Rome'), ('coname', 'Italy')]
[('as', 'http://www.wikidata.org/entity/Q179656'), ('asname', 'Arena of Nîmes'), ('cuname', 'Ancient Rome'), ('coname', 'France')]
[('as', 'h

30

In [None]:
# To get only the countries, each with its number of sites of Ancient Rome culture

In [13]:
queryString = """
SELECT DISTINCT ?coname ?cuname (count(?as) as ?num_ancient_rome_site)
WHERE { 

?as wdt:P31 wd:Q839954  ;
    wdt:P2596 wd:Q1747689 ;           # ?as --> archaeological site /// ?cu --> culture
    wdt:P17 ?co  .
?co wdt:P30 ?c .               # ?co --> country
?c  wdt:P31 wd:Q5107 .         # ?c --> continent



?as <http://schema.org/name> ?asname .
wd:Q1747689 <http://schema.org/name> ?cuname .
?co <http://schema.org/name> ?coname .
?c <http://schema.org/name> ?cname .

FILTER REGEX(?cuname, "Ancient Ro")
}
GROUP BY ?coname ?cuname
LIMIT 30
"""

print("Results")
run_query(queryString)

Results
[('coname', 'Algeria'), ('cuname', 'Ancient Rome'), ('num_ancient_rome_site', '2')]
[('coname', 'Austria'), ('cuname', 'Ancient Rome'), ('num_ancient_rome_site', '2')]
[('coname', 'Greece'), ('cuname', 'Ancient Rome'), ('num_ancient_rome_site', '3')]
[('coname', 'Serbia'), ('cuname', 'Ancient Rome'), ('num_ancient_rome_site', '2')]
[('coname', 'Ancient Rome'), ('cuname', 'Ancient Rome'), ('num_ancient_rome_site', '15')]
[('coname', 'France'), ('cuname', 'Ancient Rome'), ('num_ancient_rome_site', '13')]
[('coname', 'Libya'), ('cuname', 'Ancient Rome'), ('num_ancient_rome_site', '1')]
[('coname', 'United Kingdom'), ('cuname', 'Ancient Rome'), ('num_ancient_rome_site', '16')]
[('coname', 'Egypt'), ('cuname', 'Ancient Rome'), ('num_ancient_rome_site', '6')]
[('coname', 'Italy'), ('cuname', 'Ancient Rome'), ('num_ancient_rome_site', '324')]
[('coname', 'Syria'), ('cuname', 'Ancient Rome'), ('num_ancient_rome_site', '2')]
[('coname', 'Croatia'), ('cuname', 'Ancient Rome'), ('num_anci

24

In [14]:
# Which other "archaeological cultures" are described?
# To answer this question, I note that every site found above with Ancient Rome culture
# can also have other cultures linked to it.

In [15]:
# Reasoning this way, I can find all the cultures of the sites where 'Ancient Rome' is also present

In [29]:
queryString = """
SELECT DISTINCT ?asname ?coname (GROUP_CONCAT( DISTINCT ?cuname ; separator = ", ") AS ?cultures)
WHERE { 

?as  wdt:P31 wd:Q839954  ;
     wdt:P2596 wd:Q1747689 ;
     wdt:P2596 ?cu .           # ?as --> archaeological site /// ?cu --> culture
?as  wdt:P17 ?co  .
?co  wdt:P30 ?c .               # ?co --> country
?c   wdt:P31 wd:Q5107 .         # ?c --> continent



?as <http://schema.org/name> ?asname .
?cu <http://schema.org/name> ?cuname .
?co <http://schema.org/name> ?coname .
?c <http://schema.org/name> ?cname .


}
GROUP BY ?asname ?coname
ORDER BY DESC (?cultures)
LIMIT 10


"""

print("Results")
run_query(queryString)

Results
[('asname', 'Arch of Constantine'), ('coname', 'Italy'), ('cultures', 'Ancient Rome, culture of ancient Rome')]
[('asname', 'Arch of Titus'), ('coname', 'Italy'), ('cultures', 'Ancient Rome, culture of ancient Rome')]
[('asname', "Trajan's Column"), ('coname', 'Italy'), ('cultures', 'Ancient Rome, culture of ancient Rome')]
[('asname', 'Dura-Europos'), ('coname', 'Syria'), ('cultures', 'Ancient Rome, Seleucid Empire')]
[('asname', 'Colonia Claudia Ara Agrippinensium'), ('coname', 'Germany'), ('cultures', 'Ancient Rome, Germanic peoples')]
[('asname', 'Miróbriga'), ('coname', 'Spain'), ('cultures', 'Ancient Rome, Celtiberians')]
[('asname', 'Volubilis'), ('coname', 'Morocco'), ('cultures', 'Ancient Rome, Berbers, Punics')]
[('asname', 'Chellah'), ('coname', 'Morocco'), ('cultures', 'Ancient Rome, Berbers, Marinid Dynasty')]
[('asname', 'Charax'), ('coname', 'Turkey'), ('cultures', 'Ancient Rome')]
[('asname', 'Arch of Septimius Severus'), ('coname', 'Italy'), ('cultures', 'Ancie

10
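
To list the other cultures in one place, the concatenated `?cultures` values can be split again on the client side. This is a minimal sketch: the strings are copied from the visible part of the truncated output above, `other_cultures` is a hypothetical helper, and splitting on `", "` assumes that no culture name itself contains that separator.

```python
# Distinct values of the ?cultures column, copied from the visible part of the output above
cultures_column = [
    "Ancient Rome, culture of ancient Rome",
    "Ancient Rome, Seleucid Empire",
    "Ancient Rome, Germanic peoples",
    "Ancient Rome, Celtiberians",
    "Ancient Rome, Berbers, Punics",
    "Ancient Rome, Berbers, Marinid Dynasty",
    "Ancient Rome",
]


def other_cultures(column, reference="Ancient Rome", separator=", "):
    """Return the sorted list of cultures that co-occur with `reference`."""
    found = set()
    for cell in column:
        found.update(name for name in cell.split(separator) if name != reference)
    return sorted(found)


co_occurring = other_cultures(cultures_column)
print(co_occurring)
```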

In [30]:
#
#

## Workload 4.3

In [35]:
# 4.3 Which country has the most diverse set of civilizations or cultures across its sites?

In [11]:
# I take all archaeological sites linked to their countries and find their cultures,
# count them, and return the 3 countries with the most diverse sets of cultures.
# Italy has the most diverse set of cultures.
# All IRIs were obtained from the queries above.

In [12]:
queryString = """
SELECT DISTINCT ?c ?cname (count( DISTINCT ?cu) as ?num_cultures) 
WHERE { 

?as  wdt:P2596 ?cu ;              # ?cu --> cultures
     wdt:P17 ?c  .                # ?c --> country
?as  wdt:P31 wd:Q839954  .

?as <http://schema.org/name> ?asname .
?c <http://schema.org/name> ?cname .

}
GROUP BY ?c ?cname
ORDER BY DESC (?num_cultures)
LIMIT 3
"""

print("Results")
run_query(queryString)

Results
[('c', 'http://www.wikidata.org/entity/Q38'), ('cname', 'Italy'), ('num_cultures', '58')]
[('c', 'http://www.wikidata.org/entity/Q30'), ('cname', 'United States of America'), ('num_cultures', '46')]
[('c', 'http://www.wikidata.org/entity/Q796'), ('cname', 'Iraq'), ('num_cultures', '15')]


3
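
The `COUNT(DISTINCT ?cu)` aggregation can be checked by hand on a small sample. The sketch below applies the same logic in Python to the ten (site, culture, country) rows visible in the first join query of Workload 1. The helper name `count_distinct_cultures` is hypothetical, and the ranking holds for this sample only, not for the whole graph.

```python
from collections import defaultdict

# (site, culture, country) rows copied from the visible output of the Workload 1 join query
site_rows = [
    ("Seleucia", "Greater Iran", "Sasanian Empire"),
    ("Taq-i Kisra", "Greater Iran", "Sasanian Empire"),
    ("Seleucia", "Hellenistic civilization", "Sasanian Empire"),
    ("Ghirza", "Ancient Rome", "Libya"),
    ("Chellah", "Ancient Rome", "Morocco"),
    ("Chellah", "Marinid Dynasty", "Morocco"),
    ("Chellah", "Berbers", "Morocco"),
    ("Volubilis", "Ancient Rome", "Morocco"),
    ("Volubilis", "Berbers", "Morocco"),
    ("Volubilis", "Punics", "Morocco"),
]


def count_distinct_cultures(rows):
    """Rank countries by the number of different cultures attached to their sites."""
    cultures_by_country = defaultdict(set)
    for _site, culture, country in rows:
        cultures_by_country[country].add(culture)
    counts = {country: len(cultures) for country, cultures in cultures_by_country.items()}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


ranking = count_distinct_cultures(site_rows)
for country, num_cultures in ranking:
    print(country, num_cultures)
```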

In [None]:
#
#

## Workload 4.4

In [None]:
# 4.4 If you are interested in visiting some sites, which country would you pick? Based on what criteria?

In [6]:
# Maybe I would choose a country based on its number of archaeological sites, museums, churches, and theaters.

In [None]:
# Another criterion could be the culture of the sites, because knowing their history makes a visit more interesting.