# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload starts with a usually vague, open-ended question and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook
3. Do not delete useless queries. Keep everything that is syntactically valid

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.
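
As a minimal sketch, the BGP above can be wrapped in a full `select` query for a single IRI. The helper `name_lookup_query` is hypothetical (it is not part of the notebook setup) and only builds the query string; the result would still have to be passed to a function that contacts the endpoint.

```python
# Hypothetical helper: builds a query that looks up the human-readable
# name of one property, class, or node.
def name_lookup_query(iri: str) -> str:
    """Return a SPARQL query asking for the schema.org name of `iri`.

    `iri` is a prefixed name such as 'wdt:P31' (the wd:/wdt: prefixes are
    assumed to be declared by whoever runs the query).
    """
    return f"""
select ?name where {{
    {iri} <http://schema.org/name> ?name .
}}
"""

q = name_lookup_query("wdt:P31")
print(q)
```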
    
    

In [1]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-e7b806b93a-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    #sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try:
        results = sparql.query()
        json_results = results.convert()
        if len(json_results['results']['bindings']) == 0:
            print("Empty")
            return 0

        for bindings in json_results['results']['bindings']:
            print([(var, value['value']) for var, value in bindings.items()])

        return len(json_results['results']['bindings'])

    except Exception as e:
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    #sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
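
The two helpers above only print the bindings or return the raw response. As a minimal sketch, assuming the endpoint answers in the standard SPARQL JSON results layout (the same `results`/`bindings` keys that `run_query` reads, plus a `boolean` key for ASK queries), the response can be flattened into plain Python values. `bindings_to_rows`, `ask_to_bool`, and the sample payloads are hypothetical and not part of the original setup.

```python
def bindings_to_rows(json_results):
    """Turn a SELECT response into a list of {variable: value} dictionaries."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in json_results["results"]["bindings"]
    ]

def ask_to_bool(json_results):
    """Extract the truth value of an ASK response."""
    return bool(json_results["boolean"])

# Hand-written sample payloads (assumption: they mimic what the endpoint returns)
sample_select = {
    "head": {"vars": ["property", "name"]},
    "results": {"bindings": [
        {"property": {"type": "uri",
                      "value": "http://www.wikidata.org/prop/direct/P30"},
         "name": {"type": "literal", "value": "continent"}},
    ]},
}
sample_ask = {"head": {}, "boolean": True}

rows = bindings_to_rows(sample_select)
print(rows)
print(ask_to_bool(sample_ask))
```

Keeping the rows as dictionaries makes it possible to post-process results in Python (sorting, deduplicating) instead of only reading the printed output.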


# GEO Workflow Series ("archaeological sites") 

Consider the following exploratory information need:

> Search for archaeological sites around the world, across countries and continents, and with reference to their culture

## Useful URIs for the current workflow


The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P279`    | subclass      | predicate |
| `wdt:P17`     | country       | predicate |
| `wd:Q38`      | Italy  | node      |
| `wd:Q641556`  | Verona Arena  | node      |
| `wd:Q1747689` | Ancient Rome  | node |
| `wd:Q46`      | Europe        | node |
| `wd:Q173527`  | Knossos       | node |
| `wd:Q839954`  | archaeological site | node |


Also consider

```
?p wdt:P17 wd:Q38  . 
?p wdt:P31 wd:Q839954  . 
```

is the BGP to retrieve all **archaeological sites in Italy**.

## Workload Goals

1. Identify the BGPs that connect an archaeological site to its country, its continent, and its culture

2. Identify the BGP to retrieve other types of an archaeological site, e.g., human settlement or theatre

3. Is there any relevant numerical attribute that describes these sites, e.g., number of visitors?

4. Analyze the number of archaeological sites per type and country

   4.1 Which country has the most archaeological sites? Which country has the most human settlements?

   4.2 Which countries have Ancient Rome sites? Which other "archaeological cultures" are described?
   
   4.3 Which country has the most diverse set of civilizations or cultures across its sites?
   
   4.4 If you are interested in visiting some sites, which country would you pick? Based on what criteria?


In [3]:
#let's find the properties that relate an archaeological site to its continent and its culture
q = f"""
select distinct ?property ?name where{{
?site wdt:P31 wd:Q839954 .
?site ?property ?object .
?property <http://schema.org/name> ?name .
filter regex(?name, '.*(culture|continent).*', 'i')
}}
limit 20
"""
run_query(q)

[('property', 'http://www.wikidata.org/prop/direct/P2596'), ('name', 'culture')]
[('property', 'http://www.wikidata.org/prop/direct/P3569'), ('name', 'Cultureel Woordenboek ID')]
[('property', 'http://www.wikidata.org/prop/direct/P30'), ('name', 'continent')]
[('property', 'http://www.wikidata.org/prop/direct/P8698'), ('name', "Turkey's Culture Portal ID")]
[('property', 'http://www.wikidata.org/prop/direct/P4702'), ('name', 'Google Arts & Culture partner ID')]
[('property', 'http://www.wikidata.org/prop/direct/P9013'), ('name', 'Encyclopedia of Saami Culture ID')]


6

In [4]:
culture = 'wdt:P2596'
continent = 'wdt:P30'

#Point 1. Identify the BGP that connects an archaeological site to the country, the continent, the culture

#first of all let's find an archaeological site
q = f"""
select ?site ?name where{{
?site wdt:P31 wd:Q839954 .
?site <http://schema.org/name> ?name .
?site wdt:P17 wd:Q38 .
}}
limit 10
"""
run_query(q)

[('site', 'http://www.wikidata.org/entity/Q1046434'), ('name', 'Casa Romuli')]
[('site', 'http://www.wikidata.org/entity/Q17624209'), ('name', 'Catacomba di Santa Savinilla')]
[('site', 'http://www.wikidata.org/entity/Q17631557'), ('name', 'Gymnasium')]
[('site', 'http://www.wikidata.org/entity/Q17637300'), ('name', 'Nuraghe Sirai')]
[('site', 'http://www.wikidata.org/entity/Q17651475'), ('name', 'Porta Venere')]
[('site', 'http://www.wikidata.org/entity/Q17652208'), ('name', 'Excavation at Largo Perosi - Via Giulia (Rome)')]
[('site', 'http://www.wikidata.org/entity/Q21208533'), ('name', 'Temple of Fortuna Respiciens')]
[('site', 'http://www.wikidata.org/entity/Q908472'), ('name', 'Domus Augustana')]
[('site', 'http://www.wikidata.org/entity/Q1665908'), ('name', 'Interamna Lirenas')]
[('site', 'http://www.wikidata.org/entity/Q22907580'), ('name', 'Palace of Domitian')]


10

In [5]:
Gymnasium = 'wd:Q17631557'

#let's answer point 1
q = f"""
select ?siteName ?countryName ?continent ?continentName where{{
wd:Q17631557 wdt:P17 ?country ;
    wdt:P30 ?continent ;
    <http://schema.org/name> ?siteName .
    
?continent <http://schema.org/name> ?continentName .
?country <http://schema.org/name> ?countryName .
}}
"""
run_query(q)

Empty


0

In [6]:
#what property is Gymnasium missing?

q = f"""
select ?p ?name where{{
wd:Q17631557 ?p ?o .
?p <http://schema.org/name> ?name
}}
"""
run_query(q)

[('p', 'http://www.wikidata.org/prop/direct/P131'), ('name', 'located in the administrative territorial entity')]
[('p', 'http://www.wikidata.org/prop/direct/P17'), ('name', 'country')]
[('p', 'http://www.wikidata.org/prop/direct/P18'), ('name', 'image')]
[('p', 'http://www.wikidata.org/prop/direct/P2596'), ('name', 'culture')]
[('p', 'http://www.wikidata.org/prop/direct/P2671'), ('name', 'Google Knowledge Graph ID')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P373'), ('name', 'Commons category')]


7

In [7]:
#is a country connected to its continent via the "continent" property?
q = f"""
select ?continent ?name where{{
wd:Q38 wdt:P30 ?continent .
?continent <http://schema.org/name> ?name
}}
"""
run_query(q)

[('continent', 'http://www.wikidata.org/entity/Q46'), ('name', 'Europe')]


1

In [8]:
#the site has no continent property, but its country does:
#let's reach the continent through the country, add the culture, and answer point 1
q = f"""
select ?siteName ?countryName ?continent ?continentName ?culture ?cultureName where{{
wd:Q17631557 wdt:P17 ?country ;
    <http://schema.org/name> ?siteName ;
    wdt:P2596 ?culture .
    
?country <http://schema.org/name> ?countryName ;
    wdt:P30 ?continent .
    
?continent <http://schema.org/name> ?continentName .
?culture <http://schema.org/name> ?cultureName
}}
"""
run_query(q)

[('siteName', 'Gymnasium'), ('countryName', 'Italy'), ('continent', 'http://www.wikidata.org/entity/Q46'), ('continentName', 'Europe'), ('culture', 'http://www.wikidata.org/entity/Q2277'), ('cultureName', 'Roman Empire')]


1
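
The query above hard-codes one site. A minimal sketch of the same BGP for an arbitrary site follows. `site_context_query` is a hypothetical helper, and wrapping the culture pattern in `optional` is an assumption added here (so that a site without `wdt:P2596` would still return its country and continent); the original query requires the culture.

```python
# Hypothetical helper: site -> country -> continent, plus the culture if present.
def site_context_query(site: str) -> str:
    """Build the point-1 query for `site`, given as a prefixed name like 'wd:Q17631557'."""
    return f"""
select ?siteName ?countryName ?continentName ?cultureName where {{
    {site} wdt:P17 ?country ;
        <http://schema.org/name> ?siteName .

    # continent reached through the country (the site inspected above has no wdt:P30)
    ?country <http://schema.org/name> ?countryName ;
        wdt:P30 ?continent .
    ?continent <http://schema.org/name> ?continentName .

    # assumption: keep sites that have no culture statement
    optional {{
        {site} wdt:P2596 ?culture .
        ?culture <http://schema.org/name> ?cultureName .
    }}
}}
"""

q_site = site_context_query("wd:Q17631557")
print(q_site)
```

The resulting string has the same shape as the other queries in this notebook, so it could be handed to `run_query` in a cell where that function is defined.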

In [9]:
#what are the top 10 cultures for archaeological sites in Italy?
q = f"""
select ?cultureName (count(*) as ?n) where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P2596 ?culture ;
    wdt:P17 wd:Q38 .
?culture <http://schema.org/name> ?cultureName
}}
group by ?cultureName
order by desc(?n)
limit 10
"""
run_query(q)

[('cultureName', 'Ancient Rome'), ('n', '378')]
[('cultureName', 'nuragic civilization'), ('n', '74')]
[('cultureName', 'Etruscans'), ('n', '70')]
[('cultureName', 'Ancient Greece'), ('n', '23')]
[('cultureName', 'Samnites'), ('n', '15')]
[('cultureName', 'Sicels'), ('n', '9')]
[('cultureName', 'Roman Empire'), ('n', '7')]
[('cultureName', 'Latins'), ('n', '6')]
[('cultureName', 'Sicani'), ('n', '5')]
[('cultureName', 'Greeks'), ('n', '4')]


10

In [10]:
#is archaeological site a subclass of something?
q = f"""
select distinct ?class ?className where{{
wd:Q839954 wdt:P279 ?class .
?class <http://schema.org/name> ?className .
}}
"""
run_query(q)

[('class', 'http://www.wikidata.org/entity/Q1291195'), ('className', 'find spot')]
[('class', 'http://www.wikidata.org/entity/Q15642541'), ('className', 'human-geographic territorial entity')]


2

In [11]:
#let's keep going

tmp = 'wd:Q15642541'
q = f"""
select distinct ?class ?className where{{
wd:Q15642541 wdt:P279 ?class .
?class <http://schema.org/name> ?className .
}}
"""
run_query(q)

[('class', 'http://www.wikidata.org/entity/Q1496967'), ('className', 'territorial entity')]
[('class', 'http://www.wikidata.org/entity/Q16686448'), ('className', 'artificial entity')]


2

In [12]:
tmp1 = 'wd:Q1496967'
q = f"""
select distinct ?class ?className where{{
wd:Q1496967 wdt:P279 ?class .
?class <http://schema.org/name> ?className .
}}
"""
run_query(q)

[('class', 'http://www.wikidata.org/entity/Q618123'), ('className', 'geographical feature')]


1

In [13]:
tmp = 'wd:Q618123'
q = f"""
select distinct ?class ?className where{{
wd:Q618123 wdt:P279 ?class .
?class <http://schema.org/name> ?className .
}}
"""
run_query(q)

[('class', 'http://www.wikidata.org/entity/Q2221906'), ('className', 'geographic location')]
[('class', 'http://www.wikidata.org/entity/Q27096213'), ('className', 'geographic entity')]


2

In [14]:
#the classes are getting more and more general, and listing all the subclasses of
#geographic location or geographic entity would probably return too many results.

#among the given nodes we already have a theatre and a city: let's check the classes of the theatre (Verona Arena)

q = f"""
select ?class ?name where{{
wd:Q641556 wdt:P31 ?class .
?class <http://schema.org/name> ?name
}}
"""
run_query(q)

[('class', 'http://www.wikidata.org/entity/Q14752696'), ('name', 'ancient Roman structure')]
[('class', 'http://www.wikidata.org/entity/Q7362268'), ('name', 'Roman amphitheatre')]
[('class', 'http://www.wikidata.org/entity/Q24354'), ('name', 'theatre')]
[('class', 'http://www.wikidata.org/entity/Q570116'), ('name', 'tourist attraction')]
[('class', 'http://www.wikidata.org/entity/Q839954'), ('name', 'archaeological site')]
[('class', 'http://www.wikidata.org/entity/Q153562'), ('name', 'opera house')]


6

In [15]:
Theatre = 'wd:Q24354'
Attraction = 'wd:Q570116'
OperaHouse = 'wd:Q153562'
#let's see the class of "ancient roman structure"
tmp = 'wd:Q14752696'
q = f"""
select distinct ?class ?className where{{
wd:Q14752696 wdt:P279 ?class .
?class <http://schema.org/name> ?className .
}}
"""
run_query(q)

[('class', 'http://www.wikidata.org/entity/Q811979'), ('className', 'architectural structure')]
[('class', 'http://www.wikidata.org/entity/Q21752084'), ('className', 'Roman archaeological site')]
[('class', 'http://www.wikidata.org/entity/Q41176'), ('className', 'building')]


3

In [16]:
#Roman archaeological site is most likely a subclass of archaeological site, for now let's check something else

#does Verona Arena have a culture associated?
q = f"""
select ?culture ?name where{{
wd:Q641556 wdt:P2596 ?culture .
?culture <http://schema.org/name> ?name
}}
"""
run_query(q)

[('culture', 'http://www.wikidata.org/entity/Q1747689'), ('name', 'Ancient Rome')]


1

In [17]:
#let's try and find the class related to culture
tmp = 'wd:Q1747689'
q = f"""
select ?class ?name where{{
    wd:Q1747689 wdt:P31 ?class .
    ?class <http://schema.org/name> ?name
}}
"""
run_query(q)

[('class', 'http://www.wikidata.org/entity/Q3024240'), ('name', 'historical country')]


1

In [18]:
#let's find the class of all entities that refer to a certain culture
q = f"""
select distinct ?class ?name where{{
?entity wdt:P2596 ?culture .
?entity wdt:P31 ?class .
?class <http://schema.org/name> ?name .
}}
limit 100
"""
run_query(q)

[('class', 'http://www.wikidata.org/entity/Q1243603'), ('name', '14 regions of Augustan Rome')]
[('class', 'http://www.wikidata.org/entity/Q1489259'), ('name', 'superpower')]
[('class', 'http://www.wikidata.org/entity/Q731966'), ('name', 'nymphaeum')]
[('class', 'http://www.wikidata.org/entity/Q19757'), ('name', 'Roman theatre')]
[('class', 'http://www.wikidata.org/entity/Q1063239'), ('name', 'polity')]
[('class', 'http://www.wikidata.org/entity/Q3822604'), ('name', 'roman mausoleum')]
[('class', 'http://www.wikidata.org/entity/Q11801536'), ('name', 'winged altarpiece')]
[('class', 'http://www.wikidata.org/entity/Q30504603'), ('name', 'Maya site in Mexico')]
[('class', 'http://www.wikidata.org/entity/Q586552'), ('name', 'Viking ring fortress')]
[('class', 'http://www.wikidata.org/entity/Q6581615'), ('name', 'thermae')]
[('class', 'http://www.wikidata.org/entity/Q29702995'), ('name', 'chalkotheke')]
[('class', 'http://www.wikidata.org/entity/Q656720'), ('name', 'workshop')]
[('class', '

100

In [19]:
#what if we look for all the classes of entities that are also archeological sites?
q = f"""
select distinct ?class ?name where{{
?entity wdt:P31 wd:Q839954 .
?entity wdt:P31 ?class .
?class <http://schema.org/name> ?name .
}}
limit 50
"""
run_query(q)

[('class', 'http://www.wikidata.org/entity/Q14947481'), ('name', 'wierde')]
[('class', 'http://www.wikidata.org/entity/Q505963'), ('name', 'artificial dwelling hill')]
[('class', 'http://www.wikidata.org/entity/Q14601169'), ('name', 'human fortified settlement')]
[('class', 'http://www.wikidata.org/entity/Q325017'), ('name', 'causewayed enclosure')]
[('class', 'http://www.wikidata.org/entity/Q5069563'), ('name', 'chambered long barrow')]
[('class', 'http://www.wikidata.org/entity/Q30504603'), ('name', 'Maya site in Mexico')]
[('class', 'http://www.wikidata.org/entity/Q6581615'), ('name', 'thermae')]
[('class', 'http://www.wikidata.org/entity/Q95463156'), ('name', 'Ancient Roman dam')]
[('class', 'http://www.wikidata.org/entity/Q1128906'), ('name', 'medina quarter')]
[('class', 'http://www.wikidata.org/entity/Q19979289'), ('name', 'birth house')]
[('class', 'http://www.wikidata.org/entity/Q29702995'), ('name', 'chalkotheke')]
[('class', 'http://www.wikidata.org/entity/Q1193753'), ('name

50

In [20]:
#what if we look for all the classes of entities that are also archeological sites and tourist attractions?
q = f"""
select distinct ?class ?name where{{
?entity wdt:P31 wd:Q839954 .
?entity wdt:P31 wd:Q570116 .
?entity wdt:P31 ?class .
?class <http://schema.org/name> ?name .
}}
limit 50
"""
run_query(q)

[('class', 'http://www.wikidata.org/entity/Q19757'), ('name', 'Roman theatre')]
[('class', 'http://www.wikidata.org/entity/Q2593777'), ('name', 'grave field')]
[('class', 'http://www.wikidata.org/entity/Q13724255'), ('name', 'rock-cut architecture')]
[('class', 'http://www.wikidata.org/entity/Q27916659'), ('name', 'historical location')]
[('class', 'http://www.wikidata.org/entity/Q1785071'), ('name', 'fort')]
[('class', 'http://www.wikidata.org/entity/Q4363262'), ('name', 'smooth-sided pyramid')]
[('class', 'http://www.wikidata.org/entity/Q14752696'), ('name', 'ancient Roman structure')]
[('class', 'http://www.wikidata.org/entity/Q2434238'), ('name', 'heritage')]
[('class', 'http://www.wikidata.org/entity/Q88205'), ('name', 'castrum')]
[('class', 'http://www.wikidata.org/entity/Q1051606'), ('name', 'Daibutsu')]
[('class', 'http://www.wikidata.org/entity/Q200141'), ('name', 'necropolis')]
[('class', 'http://www.wikidata.org/entity/Q3397526'), ('name', 'stone bridge')]
[('class', 'http:/

50

In [21]:
#let's try with archeological site and culture, see if the classes that we find are relevant for point 2
q = f"""
select distinct ?class ?name where{{
?entity wdt:P31 wd:Q839954 .
?entity wdt:P2596 ?culture .
?entity wdt:P31 ?class .
?class <http://schema.org/name> ?name .
}}
limit 50
"""
run_query(q)

[('class', 'http://www.wikidata.org/entity/Q1243603'), ('name', '14 regions of Augustan Rome')]
[('class', 'http://www.wikidata.org/entity/Q731966'), ('name', 'nymphaeum')]
[('class', 'http://www.wikidata.org/entity/Q19757'), ('name', 'Roman theatre')]
[('class', 'http://www.wikidata.org/entity/Q3822604'), ('name', 'roman mausoleum')]
[('class', 'http://www.wikidata.org/entity/Q30504603'), ('name', 'Maya site in Mexico')]
[('class', 'http://www.wikidata.org/entity/Q6581615'), ('name', 'thermae')]
[('class', 'http://www.wikidata.org/entity/Q29702995'), ('name', 'chalkotheke')]
[('class', 'http://www.wikidata.org/entity/Q660398'), ('name', 'cryptoporticus')]
[('class', 'http://www.wikidata.org/entity/Q1193753'), ('name', 'stone row')]
[('class', 'http://www.wikidata.org/entity/Q20010053'), ('name', 'Etruscan grave')]
[('class', 'http://www.wikidata.org/entity/Q1510380'), ('name', 'grove')]
[('class', 'http://www.wikidata.org/entity/Q173387'), ('name', 'grave')]
[('class', 'http://www.wik

50

In [22]:
#let's see instead the superclasses of the classes that we are retrieving
q = f"""
select distinct ?super ?name where{{
?entity wdt:P31 wd:Q839954 .
?entity wdt:P2596 ?culture .
?entity wdt:P31 ?class .
?class wdt:P279 ?super .
?super <http://schema.org/name> ?name
}}
limit 50
"""
run_query(q)

[('super', 'http://www.wikidata.org/entity/Q14601169'), ('name', 'human fortified settlement')]
[('super', 'http://www.wikidata.org/entity/Q867393'), ('name', 'amenity')]
[('super', 'http://www.wikidata.org/entity/Q66661745'), ('name', 'group of artificial physical objects')]
[('super', 'http://www.wikidata.org/entity/Q1307276'), ('name', 'single-family detached home')]
[('super', 'http://www.wikidata.org/entity/Q185600'), ('name', 'portico')]
[('super', 'http://www.wikidata.org/entity/Q7302601'), ('name', 'recognition')]
[('super', 'http://www.wikidata.org/entity/Q173387'), ('name', 'grave')]
[('super', 'http://www.wikidata.org/entity/Q1785071'), ('name', 'fort')]
[('super', 'http://www.wikidata.org/entity/Q2065736'), ('name', 'cultural property')]
[('super', 'http://www.wikidata.org/entity/Q4421'), ('name', 'forest')]
[('super', 'http://www.wikidata.org/entity/Q1497375'), ('name', 'architectural ensemble')]
[('super', 'http://www.wikidata.org/entity/Q3476533'), ('name', 'monumental s

50

In [23]:
#so to answer point 2 i would do a query of this type
q = f"""
select distinct ?super ?name where{{
?entity wdt:P31 wd:Q839954 .
#only want tourist attractions
?entity wdt:P31 wd:Q570116 .
?entity wdt:P2596 ?culture .
?entity wdt:P31 ?class .
?class wdt:P279 ?super .
?super <http://schema.org/name> ?name
}}
limit 50
"""
run_query(q)

[('super', 'http://www.wikidata.org/entity/Q3476533'), ('name', 'monumental sculpture')]
[('super', 'http://www.wikidata.org/entity/Q1030034'), ('name', 'GLAM')]
[('super', 'http://www.wikidata.org/entity/Q14752696'), ('name', 'ancient Roman structure')]
[('super', 'http://www.wikidata.org/entity/Q2434238'), ('name', 'heritage')]
[('super', 'http://www.wikidata.org/entity/Q3152824'), ('name', 'cultural institution')]
[('super', 'http://www.wikidata.org/entity/Q5003624'), ('name', 'memorial')]
[('super', 'http://www.wikidata.org/entity/Q6023295'), ('name', 'funerary structure')]
[('super', 'http://www.wikidata.org/entity/Q1076486'), ('name', 'sports venue')]
[('super', 'http://www.wikidata.org/entity/Q13226383'), ('name', 'facility')]
[('super', 'http://www.wikidata.org/entity/Q1779653'), ('name', 'colossal statue')]
[('super', 'http://www.wikidata.org/entity/Q294440'), ('name', 'public space')]
[('super', 'http://www.wikidata.org/entity/Q975783'), ('name', 'estate in land')]
[('super',

50

In [24]:
#let's see the top 50 most common types

#get all of the types
q1 = f"""
select ?super where{{
?entity wdt:P31 wd:Q839954 .
#only want tourist attractions
?entity wdt:P31 wd:Q570116 .
?entity wdt:P2596 ?culture .
?entity wdt:P31 ?class .
?class wdt:P279 ?super .
}}
"""

q = f"""
select ?super ?name (count(*) as ?n) where{{
?super <http://schema.org/name> ?name .

    {{
        select ?super where{{
            ?entity wdt:P31 wd:Q839954 .
            #only want tourist attractions
            ?entity wdt:P31 wd:Q570116 .
            ?entity wdt:P2596 ?culture .
            ?entity wdt:P31 ?class .
            ?class wdt:P279 ?super .
        }}
    }}

}}
group by ?super ?name
order by desc(?n)
limit 50
"""

run_query(q)

[('super', 'http://www.wikidata.org/entity/Q15642541'), ('name', 'human-geographic territorial entity'), ('n', '18')]
[('super', 'http://www.wikidata.org/entity/Q1291195'), ('name', 'find spot'), ('n', '16')]
[('super', 'http://www.wikidata.org/entity/Q960648'), ('name', 'point of interest'), ('n', '16')]
[('super', 'http://www.wikidata.org/entity/Q811979'), ('name', 'architectural structure'), ('n', '6')]
[('super', 'http://www.wikidata.org/entity/Q2974842'), ('name', 'lost city'), ('n', '4')]
[('super', 'http://www.wikidata.org/entity/Q41176'), ('name', 'building'), ('n', '3')]
[('super', 'http://www.wikidata.org/entity/Q12280'), ('name', 'bridge'), ('n', '3')]
[('super', 'http://www.wikidata.org/entity/Q14752696'), ('name', 'ancient Roman structure'), ('n', '3')]
[('super', 'http://www.wikidata.org/entity/Q57660343'), ('name', 'performing arts building'), ('n', '2')]
[('super', 'http://www.wikidata.org/entity/Q83620'), ('name', 'thoroughfare'), ('n', '2')]
[('super', 'http://www.wik

50

In [25]:
#that last query looks wrong

#let's see the top 50 most common types

#removing some constraints 

#get all of the types
q = f"""
select ?super ?name (count(?super) as ?n) where{{
?entity wdt:P31 wd:Q839954 .
?entity wdt:P31 ?class .
?class wdt:P279 ?super .
?super <http://schema.org/name> ?name .
}}
group by ?super ?name
order by desc(?n)
limit 50
"""

run_query(q)

[('super', 'http://www.wikidata.org/entity/Q15642541'), ('name', 'human-geographic territorial entity'), ('n', '47137')]
[('super', 'http://www.wikidata.org/entity/Q1291195'), ('name', 'find spot'), ('n', '46886')]
[('super', 'http://www.wikidata.org/entity/Q839954'), ('name', 'archaeological site'), ('n', '4631')]
[('super', 'http://www.wikidata.org/entity/Q8513'), ('name', 'database'), ('n', '2005')]
[('super', 'http://www.wikidata.org/entity/Q811979'), ('name', 'architectural structure'), ('n', '1379')]
[('super', 'http://www.wikidata.org/entity/Q57821'), ('name', 'fortification'), ('n', '1305')]
[('super', 'http://www.wikidata.org/entity/Q177634'), ('name', 'community'), ('n', '1120')]
[('super', 'http://www.wikidata.org/entity/Q27096235'), ('name', 'artificial geographic entity'), ('n', '918')]
[('super', 'http://www.wikidata.org/entity/Q41176'), ('name', 'building'), ('n', '896')]
[('super', 'http://www.wikidata.org/entity/Q82794'), ('name', 'geographic region'), ('n', '882')]
[(

50

In [26]:
#let's try in another way, without superclasses
q = f"""
select ?class ?name (count(?class) as ?n) where{{
?entity wdt:P31 wd:Q839954 .
?entity wdt:P31 ?class .
?class <http://schema.org/name> ?name .
}}
group by ?class ?name
order by desc(?n)
limit 50
"""

run_query(q)

[('class', 'http://www.wikidata.org/entity/Q839954'), ('name', 'archaeological site'), ('n', '46862')]
[('class', 'http://www.wikidata.org/entity/Q102202066'), ('name', 'Samian Ware Discovery Site'), ('n', '3879')]
[('class', 'http://www.wikidata.org/entity/Q105440620'), ('name', 'BADACOR'), ('n', '2005')]
[('class', 'http://www.wikidata.org/entity/Q23413'), ('name', 'castle'), ('n', '903')]
[('class', 'http://www.wikidata.org/entity/Q486972'), ('name', 'human settlement'), ('n', '857')]
[('class', 'http://www.wikidata.org/entity/Q14752696'), ('name', 'ancient Roman structure'), ('n', '490')]
[('class', 'http://www.wikidata.org/entity/Q34023'), ('name', 'tumulus'), ('n', '421')]
[('class', 'http://www.wikidata.org/entity/Q1081138'), ('name', 'historic site'), ('n', '404')]
[('class', 'http://www.wikidata.org/entity/Q15661340'), ('name', 'ancient city'), ('n', '396')]
[('class', 'http://www.wikidata.org/entity/Q35509'), ('name', 'cave'), ('n', '315')]
[('class', 'http://www.wikidata.org

50

In [27]:
#Point 3. let's retrieve numerical properties for archeological sites
q = f"""
select distinct ?property ?name where{{
?site wdt:P31 wd:Q839954 .
?site ?property ?o .
?property <http://schema.org/name> ?name .
filter(isNumeric(?o))
}}
limit 100
"""
run_query(q)

[('property', 'http://www.wikidata.org/prop/direct/P1082'), ('name', 'population')]
[('property', 'http://www.wikidata.org/prop/direct/P2043'), ('name', 'length')]
[('property', 'http://www.wikidata.org/prop/direct/P2044'), ('name', 'elevation above sea level')]
[('property', 'http://www.wikidata.org/prop/direct/P2046'), ('name', 'area')]
[('property', 'http://www.wikidata.org/prop/direct/P2048'), ('name', 'height')]
[('property', 'http://www.wikidata.org/prop/direct/P2049'), ('name', 'width')]
[('property', 'http://www.wikidata.org/prop/direct/P2109'), ('name', 'installed capacity')]
[('property', 'http://www.wikidata.org/prop/direct/P4511'), ('name', 'vertical depth')]
[('property', 'http://www.wikidata.org/prop/direct/P8687'), ('name', 'social media followers')]
[('property', 'http://www.wikidata.org/prop/direct/P1174'), ('name', 'visitors per year')]
[('property', 'http://www.wikidata.org/prop/direct/P2130'), ('name', 'cost')]
[('property', 'http://www.wikidata.org/prop/direct/P153

41

In [28]:
visitors = 'wdt:P1174'
elevation = 'wdt:P2044'
population = 'wdt:P1082'
fee = 'wdt:P2555'
cost = 'wdt:P2130'

#let's rank the top 10 most visited sites in italy
q = f"""
select ?site ?name ?visitors where{{
?site wdt:P31 wd:Q839954 ;
    <http://schema.org/name> ?name ;
    wdt:P1174 ?visitors .
}}
order by desc(?visitors)
limit 10
"""
run_query(q)

[('site', 'http://www.wikidata.org/entity/Q10285'), ('name', 'Colosseum'), ('visitors', '7400000')]
[('site', 'http://www.wikidata.org/entity/Q1190282'), ('name', 'Independence National Historical Park'), ('visitors', '3572770')]
[('site', 'http://www.wikidata.org/entity/Q62378'), ('name', 'Tower of London'), ('visitors', '2741126')]
[('site', 'http://www.wikidata.org/entity/Q173527'), ('name', 'Knossos'), ('visitors', '855663')]
[('site', 'http://www.wikidata.org/entity/Q173527'), ('name', 'Knossos'), ('visitors', '634710')]
[('site', 'http://www.wikidata.org/entity/Q173527'), ('name', 'Knossos'), ('visitors', '611455')]
[('site', 'http://www.wikidata.org/entity/Q237128'), ('name', 'Mesa Verde National Park'), ('visitors', '557248')]
[('site', 'http://www.wikidata.org/entity/Q132564'), ('name', 'Mycenae'), ('visitors', '495993')]
[('site', 'http://www.wikidata.org/entity/Q421068'), ('name', 'Acropolis of Lindos'), ('visitors', '489137')]
[('site', 'http://www.wikidata.org/entity/Q2335

10
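
Knossos appears three times in the ranking because it carries more than one `visitors per year` value (presumably one per reporting year; this is an assumption, since the query does not retrieve the year). As a minimal sketch, the rows can be collapsed in Python by keeping the highest value per site. The helper `max_visitors_per_site` is hypothetical; the rows are copied from the complete lines of the output above.

```python
# Rows copied from the output above (the values are printed as strings)
rows = [
    ("Colosseum", "7400000"),
    ("Independence National Historical Park", "3572770"),
    ("Tower of London", "2741126"),
    ("Knossos", "855663"),
    ("Knossos", "634710"),
    ("Knossos", "611455"),
    ("Mesa Verde National Park", "557248"),
]

def max_visitors_per_site(rows):
    """Keep the largest visitor count per site and sort in descending order."""
    best = {}
    for name, visitors in rows:
        value = int(visitors)  # convert before comparing, strings sort lexicographically
        if value > best.get(name, -1):
            best[name] = value
    return sorted(best.items(), key=lambda item: item[1], reverse=True)

ranking = max_visitors_per_site(rows)
for name, visitors in ranking:
    print(name, visitors)
```

The same effect could be obtained inside SPARQL with `max(?visitors)` and `group by ?site ?name`; the Python version is shown only because it can be checked without the endpoint.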

In [29]:
#i accidentally forgot to add italy, but colosseum is still the most visited each year
q = f"""
select ?site ?name ?visitors where{{
?site wdt:P31 wd:Q839954 ;
    <http://schema.org/name> ?name ;
    wdt:P1174 ?visitors ;
    wdt:P17 wd:Q38 .
}}
order by desc(?visitors)
limit 10
"""
run_query(q)

[('site', 'http://www.wikidata.org/entity/Q10285'), ('name', 'Colosseum'), ('visitors', '7400000')]


1

In [30]:
#we still only get the colosseum...

#let's see the most visited site for each continent
q1 = f"""
select distinct ?continent (max(?visitors) as ?max) where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P17 ?country ;
    <http://schema.org/name> ?name ;
    wdt:P1174 ?visitors .
    
?country wdt:P30 ?continent .
}}
group by ?continent
"""

q = f"""
select distinct ?continentName ?name ?visitors where{{
?site wdt:P31 wd:Q839954 ;
    <http://schema.org/name> ?name ;
    wdt:P1174 ?visitors ;
    wdt:P17 ?country .

?country wdt:P30 ?continent .
?continent <http://schema.org/name> ?continentName
filter (?visitors = ?max) .
    
    {{
        select distinct ?continent (max(?visitors) as ?max) where{{
        ?site wdt:P31 wd:Q839954 ;
            wdt:P17 ?country ;
            <http://schema.org/name> ?name ;
            wdt:P1174 ?visitors .

        ?country wdt:P30 ?continent .
        }}
        group by ?continent
    }}
    
}}
"""
run_query(q)

[('continentName', 'Europe'), ('name', 'Colosseum'), ('visitors', '7400000')]
[('continentName', 'North America'), ('name', 'Independence National Historical Park'), ('visitors', '3572770')]
[('continentName', 'Insular Oceania'), ('name', 'Independence National Historical Park'), ('visitors', '3572770')]


3

In [31]:
#this is suspicious, let's see just the max
q1 = f"""
select distinct ?continent (max(?visitors) as ?max) where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P17 ?country ;
    <http://schema.org/name> ?name ;
    wdt:P1174 ?visitors .
    
?country wdt:P30 ?continent .
}}
group by ?continent
"""
run_query(q1)

[('continent', 'http://www.wikidata.org/entity/Q538'), ('max', '3572770')]
[('continent', 'http://www.wikidata.org/entity/Q49'), ('max', '3572770')]
[('continent', 'http://www.wikidata.org/entity/Q46'), ('max', '7400000')]


3
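
North America and Insular Oceania report the same maximum, which suggests that the country of that site is linked to more than one continent. As a minimal sketch, the hypothetical helper below builds a query listing the continents reached from a site through its country; `wd:Q1190282` is the IRI of Independence National Historical Park taken from the earlier output. Whether this is the actual cause remains an assumption until the query is run.

```python
# Hypothetical diagnostic helper: which continents does the country of a site belong to?
def continents_of_site_country_query(site: str) -> str:
    """Build a query listing country and continent names for `site` (a prefixed name)."""
    return f"""
select distinct ?countryName ?continentName where {{
    {site} wdt:P17 ?country .
    ?country <http://schema.org/name> ?countryName ;
        wdt:P30 ?continent .
    ?continent <http://schema.org/name> ?continentName .
}}
"""

q_check = continents_of_site_country_query("wd:Q1190282")
print(q_check)
```

If the query returns more than one continent for the same country, the per-continent maximum counts that country's sites once per continent, which would explain the duplicated row.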

In [33]:
#let's rank archeological sites by elevation
q = f"""
select ?siteName ?elevation where{{
?site wdt:P31 wd:Q839954 ;
    <http://schema.org/name> ?siteName ;
    wdt:P2044 ?elevation .
    
}}
order by desc(?elevation)
limit 10
"""
run_query(q)

[('siteName', 'Far View Sites Complex'), ('elevation', '7690')]
[('siteName', 'Far View House'), ('elevation', '7680')]
[('siteName', 'Obsidian Cliff'), ('elevation', '7401')]
[('siteName', 'Tusayan Ruins'), ('elevation', '7201')]
[('siteName', 'Badger House Community'), ('elevation', '7126')]
[('siteName', 'Bandelier National Monument'), ('elevation', '7123')]
[('siteName', 'Cedar Tree Tower'), ('elevation', '7113')]
[('siteName', 'Sun Point Pueblo'), ('elevation', '6788')]
[('siteName', 'Lowry Pueblo'), ('elevation', '6726')]
[('siteName', 'Casa Rinconada'), ('elevation', '6148')]


10

In [34]:
#let's rank cultures by the number of visitors the respective sites get each year

q = f"""
select ?cultureName (sum(?visitors) as ?total) where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P1174 ?visitors ;
    wdt:P2596 ?culture .
?culture <http://schema.org/name> ?cultureName .
}}
group by ?cultureName
order by desc(?total)
limit 10
"""
run_query(q)

[('cultureName', 'Ancient Rome'), ('total', '7400000')]
[('cultureName', 'Ancient Greece'), ('total', '2499715')]
[('cultureName', 'Minoan civilization'), ('total', '2408619')]
[('cultureName', 'Mycenaean Greece'), ('total', '1649962')]
[('cultureName', 'Byzantine Empire'), ('total', '385781')]
[('cultureName', 'Taíno people'), ('total', '80000')]


6

In [35]:
#Point 4.1 which country has the most archaeological sites? which country has the most human settlements?
q = f"""
select ?countryName (count(?site) as ?n) where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P17 ?country .
?country <http://schema.org/name> ?countryName .
}}
group by ?countryName
order by desc(?n)
limit 20
"""
run_query(q)

[('countryName', 'United Kingdom'), ('n', '7026')]
[('countryName', 'Spain'), ('n', '6168')]
[('countryName', 'North Macedonia'), ('n', '4329')]
[('countryName', 'Germany'), ('n', '2825')]
[('countryName', 'Moldova'), ('n', '2674')]
[('countryName', 'United States of America'), ('n', '2159')]
[('countryName', 'Argentina'), ('n', '2073')]
[('countryName', 'Italy'), ('n', '1869')]
[('countryName', 'Switzerland'), ('n', '984')]
[('countryName', 'Netherlands'), ('n', '980')]
[('countryName', 'France'), ('n', '790')]
[('countryName', 'Greece'), ('n', '771')]
[('countryName', 'Slovenia'), ('n', '767')]
[('countryName', 'Turkey'), ('n', '633')]
[('countryName', 'Russia'), ('n', '558')]
[('countryName', 'Japan'), ('n', '455')]
[('countryName', 'Peru'), ('n', '453')]
[('countryName', 'Egypt'), ('n', '410')]
[('countryName', 'Kosovo'), ('n', '382')]
[('countryName', "People's Republic of China"), ('n', '379')]


20

In [36]:
#the United Kingdom has the most archaeological sites

HumanSettlement = 'wd:Q486972'

q = f"""
select ?countryName (count(?site) as ?n) where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P31 wd:Q486972 ;
    wdt:P17 ?country .
    
?country <http://schema.org/name> ?countryName
}}
group by ?countryName
order by desc(?n)
limit 20
"""
run_query(q)

[('countryName', 'Spain'), ('n', '263')]
[('countryName', 'Turkey'), ('n', '104')]
[('countryName', 'Libya'), ('n', '71')]
[('countryName', 'Lebanon'), ('n', '50')]
[('countryName', 'Egypt'), ('n', '49')]
[('countryName', 'Greece'), ('n', '32')]
[('countryName', 'Iraq'), ('n', '28')]
[('countryName', 'Italy'), ('n', '23')]
[('countryName', 'United States of America'), ('n', '18')]
[('countryName', 'Russia'), ('n', '18')]
[('countryName', 'United Kingdom'), ('n', '13')]
[('countryName', 'Israel'), ('n', '13')]
[('countryName', 'Mexico'), ('n', '11')]
[('countryName', 'Guatemala'), ('n', '10')]
[('countryName', 'Germany'), ('n', '10')]
[('countryName', 'Peru'), ('n', '9')]
[('countryName', 'India'), ('n', '8')]
[('countryName', 'Morocco'), ('n', '6')]
[('countryName', 'Belize'), ('n', '6')]
[('countryName', 'Switzerland'), ('n', '6')]


20

In [37]:
#Spain has the most archaeological sites that are also human settlements

#4.2 Which countries have Ancient Rome sites? Which other archaeological cultures are described?

#We have the node Ancient Rome, let's see its classes
q = f"""
select ?class ?name where{{
wd:Q1747689 wdt:P31 ?class .
?class <http://schema.org/name> ?name
}}
"""
run_query(q)

[('class', 'http://www.wikidata.org/entity/Q3024240'), ('name', 'historical country')]


1

In [38]:
#for now let's answer the first part
q = f"""
select distinct ?country ?name where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P2596 wd:Q1747689 ;
    wdt:P17 ?country .
    
?country <http://schema.org/name> ?name
}}
"""
run_query(q)

[('country', 'http://www.wikidata.org/entity/Q185103'), ('name', 'Roman Britain')]
[('country', 'http://www.wikidata.org/entity/Q1016'), ('name', 'Libya')]
[('country', 'http://www.wikidata.org/entity/Q1028'), ('name', 'Morocco')]
[('country', 'http://www.wikidata.org/entity/Q142'), ('name', 'France')]
[('country', 'http://www.wikidata.org/entity/Q145'), ('name', 'United Kingdom')]
[('country', 'http://www.wikidata.org/entity/Q1747689'), ('name', 'Ancient Rome')]
[('country', 'http://www.wikidata.org/entity/Q183'), ('name', 'Germany')]
[('country', 'http://www.wikidata.org/entity/Q218'), ('name', 'Romania')]
[('country', 'http://www.wikidata.org/entity/Q219'), ('name', 'Bulgaria')]
[('country', 'http://www.wikidata.org/entity/Q222'), ('name', 'Albania')]
[('country', 'http://www.wikidata.org/entity/Q224'), ('name', 'Croatia')]
[('country', 'http://www.wikidata.org/entity/Q230'), ('name', 'Georgia')]
[('country', 'http://www.wikidata.org/entity/Q233'), ('name', 'Malta')]
[('country', 'h

28

In [39]:
#a few of these countries look dissolved, let's see if a dissolution time is available as a property
tmp = 'wd:Q185103'
q = f"""
select distinct ?property ?name where{{
wd:Q185103 ?property ?p .
?property <http://schema.org/name> ?name .
filter (isNumeric(?p))
}}
"""
run_query(q)

Empty


0
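The empty result is expected: `isNumeric` is true only for numeric literals (such as `xsd:integer` or `xsd:decimal`), and a dissolution time would be a date/time literal, which it rejects. A rough Python analogue of that check on bindings in the SPARQL JSON results format. The helper name `is_numeric_binding` and both sample bindings are hypothetical.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Subset of the XSD datatypes that SPARQL treats as numeric
# (derived integer types are left out to keep the sketch short).
NUMERIC_TYPES = {XSD + t for t in ("integer", "decimal", "float", "double")}

def is_numeric_binding(binding):
    """Rough Python analogue of SPARQL isNumeric() for one JSON binding."""
    is_literal = binding.get("type") in ("literal", "typed-literal")
    return is_literal and binding.get("datatype") in NUMERIC_TYPES

# Made-up bindings, shaped like the entries of results['bindings'].
date_value = {"type": "literal", "datatype": XSD + "dateTime",
              "value": "2000-01-01T00:00:00Z"}
int_value = {"type": "literal", "datatype": XSD + "integer", "value": "42"}
```

Listing every property without the filter sidesteps the datatype question entirely.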

In [40]:
tmp = 'wd:Q185103'
q = f"""
select distinct ?property ?name where{{
wd:Q185103 ?property ?p .
?property <http://schema.org/name> ?name .
}}
"""
run_query(q)

[('property', 'http://www.wikidata.org/prop/direct/P9505'), ('name', 'Gardens of the Roman Empire ID')]
[('property', 'http://www.wikidata.org/prop/direct/P131'), ('name', 'located in the administrative territorial entity')]
[('property', 'http://www.wikidata.org/prop/direct/P1365'), ('name', 'replaces')]
[('property', 'http://www.wikidata.org/prop/direct/P1366'), ('name', 'replaced by')]
[('property', 'http://www.wikidata.org/prop/direct/P1464'), ('name', 'category for people born here')]
[('property', 'http://www.wikidata.org/prop/direct/P156'), ('name', 'followed by')]
[('property', 'http://www.wikidata.org/prop/direct/P1566'), ('name', 'GeoNames ID')]
[('property', 'http://www.wikidata.org/prop/direct/P1584'), ('name', 'Pleiades ID')]
[('property', 'http://www.wikidata.org/prop/direct/P17'), ('name', 'country')]
[('property', 'http://www.wikidata.org/prop/direct/P1792'), ('name', 'category of associated people')]
[('property', 'http://www.wikidata.org/prop/direct/P18'), ('name', 'i

38

In [41]:
dissolution = 'wdt:P576'
#let's repeat the query
q = f"""
select distinct ?country ?name where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P2596 wd:Q1747689 ;
    wdt:P17 ?country .
    

filter not exists{{
        ?country wdt:P576 ?o
    }}
    
?country <http://schema.org/name> ?name .

}}
"""
run_query(q)

[('country', 'http://www.wikidata.org/entity/Q1016'), ('name', 'Libya')]
[('country', 'http://www.wikidata.org/entity/Q1028'), ('name', 'Morocco')]
[('country', 'http://www.wikidata.org/entity/Q142'), ('name', 'France')]
[('country', 'http://www.wikidata.org/entity/Q145'), ('name', 'United Kingdom')]
[('country', 'http://www.wikidata.org/entity/Q183'), ('name', 'Germany')]
[('country', 'http://www.wikidata.org/entity/Q218'), ('name', 'Romania')]
[('country', 'http://www.wikidata.org/entity/Q219'), ('name', 'Bulgaria')]
[('country', 'http://www.wikidata.org/entity/Q222'), ('name', 'Albania')]
[('country', 'http://www.wikidata.org/entity/Q224'), ('name', 'Croatia')]
[('country', 'http://www.wikidata.org/entity/Q230'), ('name', 'Georgia')]
[('country', 'http://www.wikidata.org/entity/Q233'), ('name', 'Malta')]
[('country', 'http://www.wikidata.org/entity/Q262'), ('name', 'Algeria')]
[('country', 'http://www.wikidata.org/entity/Q29'), ('name', 'Spain')]
[('country', 'http://www.wikidata.or

26

In [42]:
#4.2 which other archaeological cultures are described?

#let's list all archaeological cultures in italy
q = f"""
select distinct ?culture ?name where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P2596 ?culture ;
    wdt:P17 wd:Q38 .
    
?culture <http://schema.org/name> ?name .
}}
"""
run_query(q)

[('culture', 'http://www.wikidata.org/entity/Q500272'), ('name', 'Samnites')]
[('culture', 'http://www.wikidata.org/entity/Q220379'), ('name', 'Adriatic Veneti')]
[('culture', 'http://www.wikidata.org/entity/Q17161'), ('name', 'Etruscans')]
[('culture', 'http://www.wikidata.org/entity/Q777462'), ('name', 'Aurunci')]
[('culture', 'http://www.wikidata.org/entity/Q11772'), ('name', 'Ancient Greece')]
[('culture', 'http://www.wikidata.org/entity/Q1200427'), ('name', 'culture of ancient Rome')]
[('culture', 'http://www.wikidata.org/entity/Q1747689'), ('name', 'Ancient Rome')]
[('culture', 'http://www.wikidata.org/entity/Q2277'), ('name', 'Roman Empire')]
[('culture', 'http://www.wikidata.org/entity/Q4948'), ('name', 'Republic of Venice')]
[('culture', 'http://www.wikidata.org/entity/Q40203'), ('name', 'paleolithic')]
[('culture', 'http://www.wikidata.org/entity/Q841576'), ('name', 'Iapyges')]
[('culture', 'http://www.wikidata.org/entity/Q14618893'), ('name', 'history of the Roman Empire fro

61

In [43]:
#let's do that for Egypt
Egypt = 'wd:Q79'

q = f"""
select distinct ?culture ?name where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P2596 ?culture ;
    wdt:P17 wd:Q79 .
    
?culture <http://schema.org/name> ?name .
}}
"""
run_query(q)

[('culture', 'http://www.wikidata.org/entity/Q220276'), ('name', 'Third Dynasty of Egypt')]
[('culture', 'http://www.wikidata.org/entity/Q634708'), ('name', 'culture of ancient Egypt')]
[('culture', 'http://www.wikidata.org/entity/Q11768'), ('name', 'Ancient Egypt')]
[('culture', 'http://www.wikidata.org/entity/Q1747689'), ('name', 'Ancient Rome')]
[('culture', 'http://www.wikidata.org/entity/Q428995'), ('name', 'Hellenistic period')]
[('culture', 'http://www.wikidata.org/entity/Q191324'), ('name', 'Middle Kingdom of Egypt')]
[('culture', 'http://www.wikidata.org/entity/Q1546344'), ('name', 'History of Ptolemaic Egypt')]
[('culture', 'http://www.wikidata.org/entity/Q3185305'), ('name', 'Kingdom of Israel')]
[('culture', 'http://www.wikidata.org/entity/Q177819'), ('name', 'Old Kingdom of Egypt')]


9

In [44]:
#in which countries are Ancient Rome sites predominant?


q1 = f"""
select ?country ?culture (count(?culture) as ?n) where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P17 ?country ;
    wdt:P2596 ?culture .
}}
group by ?country ?culture
"""

q2 = f"""
select ?country ?culture (max(?n) as ?max) where{{
    
    {{
        select ?country ?culture (count(?culture) as ?n) where{{
        ?site wdt:P31 wd:Q839954 ;
            wdt:P17 ?country ;
            wdt:P2596 ?culture .
        }}
        group by ?country ?culture
    }}
}}
group by ?country ?culture
"""

q = f"""
select ?country ?countryName where{{
?country <http://schema.org/name> ?countryName .
    {{
        select ?country ?culture (max(?n) as ?max) where{{
    
        {{
            select ?country ?culture (count(?culture) as ?n) where{{
                ?site wdt:P31 wd:Q839954 ;
                    wdt:P17 ?country ;
                    wdt:P2596 ?culture .
                }}
                group by ?country ?culture
            }}
        }}
        group by ?country ?culture
    }}
filter(?culture = wd:Q1747689)

}}
"""
run_query(q)

[('country', 'http://www.wikidata.org/entity/Q185103'), ('countryName', 'Roman Britain')]
[('country', 'http://www.wikidata.org/entity/Q1016'), ('countryName', 'Libya')]
[('country', 'http://www.wikidata.org/entity/Q1028'), ('countryName', 'Morocco')]
[('country', 'http://www.wikidata.org/entity/Q142'), ('countryName', 'France')]
[('country', 'http://www.wikidata.org/entity/Q145'), ('countryName', 'United Kingdom')]
[('country', 'http://www.wikidata.org/entity/Q1747689'), ('countryName', 'Ancient Rome')]
[('country', 'http://www.wikidata.org/entity/Q183'), ('countryName', 'Germany')]
[('country', 'http://www.wikidata.org/entity/Q218'), ('countryName', 'Romania')]
[('country', 'http://www.wikidata.org/entity/Q219'), ('countryName', 'Bulgaria')]
[('country', 'http://www.wikidata.org/entity/Q222'), ('countryName', 'Albania')]
[('country', 'http://www.wikidata.org/entity/Q224'), ('countryName', 'Croatia')]
[('country', 'http://www.wikidata.org/entity/Q230'), ('countryName', 'Georgia')]
[('

28
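Because the outer select groups by both `?country` and `?culture`, `max(?n)` is computed within a single pair, and the filter keeps every country with at least one Ancient Rome site (the same 28 rows as the earlier query). One way to get the predominant culture per country is to post-process the per-pair counts that `q1` returns. A sketch under that assumption: the helper name `predominant_culture` is hypothetical, the Germany counts are the ones from the check in this notebook, and the Egypt row for Ancient Rome is made up.

```python
def predominant_culture(counts):
    """Map each country to the culture(s) with the highest site count."""
    best = {}
    for country, culture, n in counts:
        if country not in best or n > best[country][0]:
            best[country] = (n, [culture])
        elif n == best[country][0]:
            best[country][1].append(culture)  # keep ties
    return {country: cultures for country, (_n, cultures) in best.items()}

counts = [
    ("Germany", "Ancient Rome", 11),
    ("Germany", "La Tène culture", 9),
    ("Germany", "Hallstatt culture", 8),
    ("Egypt", "Ancient Egypt", 125),
    ("Egypt", "Ancient Rome", 3),  # made-up count for illustration
]

top = predominant_culture(counts)
# Countries where Ancient Rome is the (possibly tied) top culture.
roman = sorted(c for c, cultures in top.items() if "Ancient Rome" in cultures)
```

In SPARQL the equivalent would compute the maximum per `?country` only and then join it back against the per-pair counts.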

In [45]:
#we should also have filtered out dissolved countries, but there are not too many of them


#let's do a random check of this query with Germany

Germany = 'wd:Q183'
q = f"""
select ?culture ?name (count(*) as ?n) where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P17 wd:Q183 ;
    wdt:P2596 ?culture .
    
?culture <http://schema.org/name> ?name .
}}
group by ?culture ?name
order by desc(?n)
"""
run_query(q)

[('culture', 'http://www.wikidata.org/entity/Q1747689'), ('name', 'Ancient Rome'), ('n', '11')]
[('culture', 'http://www.wikidata.org/entity/Q208247'), ('name', 'La Tène culture'), ('n', '9')]
[('culture', 'http://www.wikidata.org/entity/Q202165'), ('name', 'Hallstatt culture'), ('n', '8')]
[('culture', 'http://www.wikidata.org/entity/Q35966'), ('name', 'Ancient Celts'), ('n', '3')]
[('culture', 'http://www.wikidata.org/entity/Q12567'), ('name', 'Vikings'), ('n', '3')]
[('culture', 'http://www.wikidata.org/entity/Q22633'), ('name', 'Germanic peoples'), ('n', '2')]
[('culture', 'http://www.wikidata.org/entity/Q14618893'), ('name', 'history of the Roman Empire from 27 BCE to 395 AD'), ('n', '1')]
[('culture', 'http://www.wikidata.org/entity/Q223998'), ('name', 'Urnfield culture'), ('n', '1')]
[('culture', 'http://www.wikidata.org/entity/Q12554'), ('name', 'Middle Ages'), ('n', '1')]
[('culture', 'http://www.wikidata.org/entity/Q806348'), ('name', 'Linear Pottery culture'), ('n', '1')]
[(

14

In [46]:
#4.3 Which country has the most diverse set of civilizations or cultures across its sites?
q1 = f"""
select distinct ?country ?culture where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P17 ?country ;
    wdt:P2596 ?culture .
    
}}

"""

q = f"""
select ?countryName (count(?culture) as ?n) where{{

?country <http://schema.org/name> ?countryName

        {{
            select distinct ?country ?culture where{{
            ?site wdt:P31 wd:Q839954 ;
                wdt:P17 ?country ;
                wdt:P2596 ?culture .

            }}
        }}

}}
group by ?countryName
order by desc(?n)
limit 1
"""
run_query(q)

[('countryName', 'Italy'), ('n', '61')]


1

In [47]:
#let's do top 10
q1 = f"""
select distinct ?country ?culture where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P17 ?country ;
    wdt:P2596 ?culture .
    
}}

"""

q = f"""
select ?countryName (count(?culture) as ?n) where{{

?country <http://schema.org/name> ?countryName

        {{
            select distinct ?country ?culture where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P17 ?country ;
    wdt:P2596 ?culture .
    
}}
        }}

}}
group by ?countryName
order by desc(?n)
limit 10
"""
run_query(q)

[('countryName', 'Italy'), ('n', '61')]
[('countryName', 'United States of America'), ('n', '46')]
[('countryName', 'Spain'), ('n', '16')]
[('countryName', 'Israel'), ('n', '16')]
[('countryName', 'Iraq'), ('n', '15')]
[('countryName', 'Turkey'), ('n', '15')]
[('countryName', 'Syria'), ('n', '15')]
[('countryName', 'Greece'), ('n', '14')]
[('countryName', 'Germany'), ('n', '14')]
[('countryName', 'Peru'), ('n', '13')]


10
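The count depends on the inner `select distinct`, which makes each culture count once per country no matter how many sites share it. The same logic in Python, as a sketch: the helper name `diversity` and the sample pairs are hypothetical.

```python
from collections import defaultdict

def diversity(pairs):
    """Count distinct cultures per country, most diverse first."""
    cultures = defaultdict(set)
    for country, culture in pairs:
        cultures[country].add(culture)  # a set drops repeated pairs
    ranked = ((country, len(found)) for country, found in cultures.items())
    return sorted(ranked, key=lambda item: (-item[1], item[0]))

# One (country, culture) pair per site, so pairs can repeat.
pairs = [
    ("Italy", "Etruscans"),
    ("Italy", "Etruscans"),
    ("Italy", "Samnites"),
    ("Italy", "Ancient Rome"),
    ("Greece", "Ancient Greece"),
    ("Greece", "Minoan civilization"),
]

ranking = diversity(pairs)
```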

In [49]:
#4.4 Let's explore archaeological sites that are also tourist attractions
#first of all let's figure out what the 2 properties 'fee' and 'cost' that we found earlier mean

q = f"""
select ?siteName ?fee where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P31 wd:Q570116 ;
    wdt:P17 wd:Q38 ;
    wdt:P2555 ?fee ;
    <http://schema.org/name> ?siteName .
}}
order by desc(?fee)
limit 20
"""
run_query(q)

Empty


0

In [51]:
#let's try with cost
q = f"""
select ?siteName ?cost where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P31 wd:Q570116 ;
    wdt:P17 wd:Q38 ;
    wdt:P2130 ?cost ;
    <http://schema.org/name> ?siteName .
}}
order by desc(?cost)
limit 20
"""
run_query(q)

Empty


0

In [52]:
#let's retry without the Attraction class
q = f"""
select ?siteName ?fee where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P17 wd:Q38 ;
    wdt:P2555 ?fee ;
    <http://schema.org/name> ?siteName .
}}
order by desc(?fee)
limit 20
"""
run_query(q)

Empty


0

In [54]:
q = f"""
select ?siteName ?cost where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P17 wd:Q38 ;
    wdt:P2130 ?cost ;
    <http://schema.org/name> ?siteName .
}}
order by desc(?cost)
limit 20
"""
run_query(q)

Empty


0

In [55]:
#let's try and ignore the country
q = f"""
select ?siteName ?fee where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P2555 ?fee ;
    <http://schema.org/name> ?siteName .
}}
order by desc(?fee)
limit 20
"""
run_query(q)

[('siteName', 'Borobudur'), ('fee', '350000')]
[('siteName', 'Borobudur'), ('fee', '30000')]


2

In [56]:
q = f"""
select ?siteName ?cost where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P2130 ?cost ;
    <http://schema.org/name> ?siteName .
}}
order by desc(?cost)
limit 20
"""
run_query(q)

[('siteName', 'Franklin Court'), ('cost', '4750000')]
[('siteName', 'Orford Castle'), ('cost', '1413')]


2

In [57]:
#I hoped that either cost or fee would indicate the entry price of a site; unfortunately too few sites have
#these properties, so we can't use them to find the country with the most affordable sites to visit

followers = 'wdt:P8687'
maxCap = 'wdt:P1083'

#let's see if archaeological sites have social media followers

q = f"""
select ?siteName ?n where{{
?site wdt:P31 wd:Q839954 ;
    <http://schema.org/name> ?siteName ;
    wdt:P8687 ?n
}}
order by desc(?n)
limit 50
"""
run_query(q)

[('siteName', 'Tower of London'), ('n', '61777')]
[('siteName', 'Edinburgh Castle'), ('n', '37034')]
[('siteName', 'Stonehenge'), ('n', '32651')]
[('siteName', 'Independence National Historical Park'), ('n', '8870')]
[('siteName', 'Dover Castle'), ('n', '7859')]
[('siteName', 'Must Farm Bronze Age settlement'), ('n', '6790')]
[('siteName', 'Hoghton Tower'), ('n', '6406')]
[('siteName', 'Ocmulgee Mounds National Historical Park'), ('n', '5596')]
[('siteName', 'Hopewell Culture National Historical Park'), ('n', '5389')]
[('siteName', 'The Castle, Newcastle'), ('n', '3910')]
[('siteName', 'Pembroke Castle'), ('n', '1389')]


11

In [59]:
#archaeological sites with many social media followers are probably also well organized for tourists, so this list is useful

#during this period of time, it's important to follow safety regulations, so the higher the capacity of a site, the better
q = f"""
select ?name ?cap where{{
?site wdt:P31 wd:Q839954 ;
    <http://schema.org/name> ?name ;
    wdt:P1083 ?cap .
}}
order by desc(?cap)
limit 50
"""
run_query(q)

[('name', 'Colosseum'), ('cap', '65000')]
[('name', 'Odeon of Herodes Atticus'), ('cap', '5000')]
[('name', 'SS Portland'), ('cap', '800')]


3

In [60]:
#if we had information about capacity for more sites, we could combine it with the number of visitors each year
#and pick archaeological sites that each year don't see too many visitors
q = f"""
select ?name ?visitors where{{
?site wdt:P31 wd:Q839954 ;
    <http://schema.org/name> ?name ;
    wdt:P1174 ?visitors .
}}
order by desc(?visitors)
limit 50
"""
run_query(q)

[('name', 'Colosseum'), ('visitors', '7400000')]
[('name', 'Independence National Historical Park'), ('visitors', '3572770')]
[('name', 'Tower of London'), ('visitors', '2741126')]
[('name', 'Knossos'), ('visitors', '855663')]
[('name', 'Knossos'), ('visitors', '634710')]
[('name', 'Knossos'), ('visitors', '611455')]
[('name', 'Mesa Verde National Park'), ('visitors', '557248')]
[('name', 'Mycenae'), ('visitors', '495993')]
[('name', 'Acropolis of Lindos'), ('visitors', '489137')]
[('name', 'Epidaurus'), ('visitors', '469060')]
[('name', 'Acropolis of Lindos'), ('visitors', '441012')]
[('name', 'Acropolis of Lindos'), ('visitors', '421070')]
[('name', 'Mycenae'), ('visitors', '412397')]
[('name', 'Delphi'), ('visitors', '405947')]
[('name', 'Mycenae'), ('visitors', '398718')]
[('name', 'Epidaurus'), ('visitors', '375437')]
[('name', 'Delphi'), ('visitors', '345804')]
[('name', 'Mycenae'), ('visitors', '342854')]
[('name', 'Temple of Poseidon at cape Sounion'), ('visitors', '306665')]
[

50

In [61]:
#let's keep only one visitor figure (the maximum) for each site
q = f"""
select ?name (max(?visitors) as ?max) where{{
?site wdt:P31 wd:Q839954 ;
    <http://schema.org/name> ?name ;
    wdt:P1174 ?visitors .
}}
group by ?name
order by desc(?max)
limit 50
"""
run_query(q)

[('name', 'Colosseum'), ('max', '7400000')]
[('name', 'Independence National Historical Park'), ('max', '3572770')]
[('name', 'Tower of London'), ('max', '2741126')]
[('name', 'Knossos'), ('max', '855663')]
[('name', 'Mesa Verde National Park'), ('max', '557248')]
[('name', 'Mycenae'), ('max', '495993')]
[('name', 'Acropolis of Lindos'), ('max', '489137')]
[('name', 'Epidaurus'), ('max', '469060')]
[('name', 'Delphi'), ('max', '405947')]
[('name', 'Temple of Poseidon at cape Sounion'), ('max', '306665')]
[('name', 'Aigai'), ('max', '222503')]
[('name', 'Wupatki National Monument'), ('max', '216165')]
[('name', 'Ancient Corinth'), ('max', '196944')]
[('name', 'Corfe Castle'), ('max', '190000')]
[('name', 'Mystras'), ('max', '145939')]
[('name', 'Phaistos'), ('max', '134538')]
[('name', 'Tibes Indigenous Ceremonial Center'), ('max', '80000')]
[('name', 'Paléosite de Saint-Césaire'), ('max', '68000')]
[('name', 'Orford Castle'), ('max', '38818')]
[('name', 'Amphoralis'), ('max', '13080')]

22
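The idea of combining capacity with yearly visitors can be sketched as a join between the two result lists. Only the Colosseum appears in both lists shown in this notebook, so this is an illustration rather than a usable ranking. The helper name `daily_load` and the 365-day divisor are assumptions, and a real join would match on the `?site` IRI rather than on the name.

```python
# Figures copied from the capacity and visitor outputs in this notebook.
capacity = {"Colosseum": 65000, "Odeon of Herodes Atticus": 5000, "SS Portland": 800}
visitors = {"Colosseum": 7400000, "Tower of London": 2741126, "Knossos": 855663}

def daily_load(capacity, visitors, days_open=365):
    """Average daily visitors divided by capacity, for sites present in both maps."""
    shared = capacity.keys() & visitors.keys()
    return {site: visitors[site] / days_open / capacity[site] for site in shared}

load = daily_load(capacity, visitors)
for site, ratio in sorted(load.items(), key=lambda item: item[1]):
    print(site, round(ratio, 2))
```

A lower ratio would suggest a site that is less crowded on an average day, under the assumption that it is open all year.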

In [70]:
#let's find the "best" country for each culture, meaning the country with the most sites for that culture

#count number of sites for each culture in each country
q1 = f"""
select ?country ?culture (count(?site) as ?n) where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P17 ?country ;
    wdt:P2596 ?culture .
}}
group by ?country ?culture
"""

q2 = f"""
select ?country2 ?culture2 (count(?site) as ?n2) where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P17 ?country2 ;
    wdt:P2596 ?culture2 .
}}
group by ?country2 ?culture2
"""

#select max number of sites in a certain country for a culture
q3 = f"""
select ?culture2 (max(?n2) as ?max) where{{
    
    {{
        select ?country2 ?culture2 (count(?site) as ?n2) where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P17 ?country2 ;
    wdt:P2596 ?culture2 .
}}
group by ?country2 ?culture2
    }}
    
}}
group by ?culture2
"""

#put it all together
q = f"""
select ?countryName ?cultureName ?max where{{

?country <http://schema.org/name> ?countryName .
?culture <http://schema.org/name> ?cultureName .

    {{
        select ?country ?culture (count(?site) as ?n) where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P17 ?country ;
    wdt:P2596 ?culture .
}}
group by ?country ?culture
    }}
    
    {{
        select ?culture2 (max(?n2) as ?max) where{{
    
    {{
        select ?country2 ?culture2 (count(?site) as ?n2) where{{
?site wdt:P31 wd:Q839954 ;
    wdt:P17 ?country2 ;
    wdt:P2596 ?culture2 .
}}
group by ?country2 ?culture2
    }}
    
}}
group by ?culture2
    }}

filter(?culture = ?culture2 && ?n = ?max)

}}
order by desc(?max) asc(?cultureName)
limit 50
"""
run_query(q)

[('countryName', 'Italy'), ('cultureName', 'Ancient Rome'), ('max', '378')]
[('countryName', 'Egypt'), ('cultureName', 'Ancient Egypt'), ('max', '125')]
[('countryName', 'Greece'), ('cultureName', 'Ancient Greece'), ('max', '94')]
[('countryName', 'Italy'), ('cultureName', 'nuragic civilization'), ('max', '74')]
[('countryName', 'Italy'), ('cultureName', 'Etruscans'), ('max', '70')]
[('countryName', 'Mexico'), ('cultureName', 'Maya civilization'), ('max', '43')]
[('countryName', 'United States of America'), ('cultureName', 'Ancient Pueblo Peoples'), ('max', '38')]
[('countryName', 'Greece'), ('cultureName', 'Minoan civilization'), ('max', '27')]
[('countryName', 'Italy'), ('cultureName', 'Samnites'), ('max', '15')]
[('countryName', 'Greece'), ('cultureName', 'Mycenaean Greece'), ('max', '14')]
[('countryName', 'Iraq'), ('cultureName', 'Sumer'), ('max', '12')]
[('countryName', 'Spain'), ('cultureName', 'history of the Roman Empire from 27 BCE to 395 AD'), ('max', '12')]
[('countryName',

50
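The final query repeats the same counting subquery twice with renamed variables. A sketch of how the pieces could be assembled from one template instead of being pasted by hand. The helper names `count_query`, `as_subquery` and `max_per_culture_query` are hypothetical, and the strings still need the notebook's prefixes before they can run.

```python
import textwrap

def count_query(country="?country", culture="?culture", n="?n"):
    """Sites per (country, culture); variable names are parameters so the
    template can be instantiated twice without clashing."""
    return (
        f"select {country} {culture} (count(?site) as {n}) where {{\n"
        f"  ?site wdt:P31 wd:Q839954 ;\n"
        f"        wdt:P17 {country} ;\n"
        f"        wdt:P2596 {culture} .\n"
        f"}}\n"
        f"group by {country} {culture}"
    )

def as_subquery(query):
    """Wrap a select query in braces so it can be nested in a where clause."""
    return "{\n" + textwrap.indent(query, "    ") + "\n}"

def max_per_culture_query():
    """Highest per-country site count for each culture (the q3 step)."""
    inner = as_subquery(count_query("?country2", "?culture2", "?n2"))
    return (
        "select ?culture2 (max(?n2) as ?max) where {\n"
        + textwrap.indent(inner, "    ")
        + "\n}\ngroup by ?culture2"
    )

q3_text = max_per_culture_query()
print(q3_text)
```

Building the text with plain strings also avoids doubling every brace, which is only needed inside f-strings.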