# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain knowledge graph (KG): [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload starts with a usually vague, open-ended question, and does not assume the person issuing the workload has a clear understanding of the data contained in the target database or its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p sc:name ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.
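
As a minimal sketch of how this BGP can be used, the helper below builds a query string that looks up the label of a single IRI. The name `name_lookup_query` and its `limit` parameter are hypothetical and not part of the notebook; the resulting string is meant to be passed to the `run_query` function from the setup cell.

```python
def name_lookup_query(iri: str, limit: int = 10) -> str:
    """Build a SELECT query returning the sc:name label(s) of one IRI.

    Hypothetical helper for illustration; `iri` is a prefixed name such as
    "wdt:P17" that relies on the prefixes declared in the setup cell.
    """
    return (
        "SELECT DISTINCT ?name\n"
        "WHERE {\n"
        f"  {iri} sc:name ?name .\n"
        "}\n"
        f"LIMIT {limit}"
    )


query = name_lookup_query("wdt:P17")
print(query)
```

Calling `run_query(query)` in the notebook would then print the label bound to `?name`, assuming the endpoint is reachable.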
    
    

In [1]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-90a4aeb93a-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
       results = sparql.query()
       json_results = results.convert()
       if len(json_results['results']['bindings'])==0:
          print("Empty")
          return 0
    
       for bindings in json_results['results']['bindings']:
          print( [ (var, value['value'])  for var, value in bindings.items() ] )

       return len(json_results['results']['bindings'])

    except Exception as e :
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
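
The printing loop in `run_query` works on the standard SPARQL JSON results layout (`results` → `bindings` → variable → `value`). As a sketch, the same flattening can be kept in a separate function so that the rows can be reused instead of only printed. The function name `bindings_to_rows` and the sample dictionary are assumptions made for illustration; the sample is hand-written and is not endpoint output.

```python
def bindings_to_rows(json_results: dict) -> list:
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    bindings = json_results.get("results", {}).get("bindings", [])
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in bindings
    ]


# Hand-written sample in the SPARQL JSON results layout (not endpoint output).
sample = {
    "head": {"vars": ["site", "country"]},
    "results": {
        "bindings": [
            {
                "site": {"type": "literal", "value": "Veii"},
                "country": {"type": "literal", "value": "Italy"},
            },
            # An unbound OPTIONAL variable is simply absent from the binding.
            {"site": {"type": "literal", "value": "Keros"}},
        ]
    },
}

rows = bindings_to_rows(sample)
print(rows)
```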


# GEO Workflow Series ("archaeological sites") 

Consider the following exploratory information need:

> Search for archaeological sites in the world, across countries and continents, and in reference to their culture

## Useful URIs for the current workflow


The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P279`    | subclass      | predicate |
| `wdt:P17`     | country       | predicate |
| `wd:Q38`      | Italy  | node      |
| `wd:Q641556`  | Verona Arena  | node      |
| `wd:Q1747689` | Ancient Rome  | node |
| `wd:Q46`      | Europe        | node |
| `wd:Q173527`  | Knossos       | node |
| `wd:Q839954`  | archaeological site | node |


Also consider

```
?p wdt:P17 wd:Q38  . 
?p wdt:P31 wd:Q839954  . 
```

is the BGP to retrieve all **archaeological sites in Italy**.

## Workload Goals

1. Identify the BGP that connects an archaeological site to the country, the continent, and the culture 

2. Identify the BGP to retrieve other types of an archaeological site, e.g., human settlement or theatre

3. Is there any relevant numerical attribute that describes these sites, e.g., number of visitors?

4. Analyze the number of archaeological sites per type and country
 
   4.1 Which country has more archaeological sites? Which country has the most  human settlements?
   
   4.2 Which countries have Ancient Rome sites, which other "archaeological cultures" are described?
   
   4.3 Which country has the most diverse set of civilizations or cultures across its sites?
   
   4.4 If you are interested in visiting some sites, which country would you pick? Based on what criteria?


In [2]:
# start your workflow here

In [3]:
queryString = """
SELECT COUNT( ?p)
WHERE { 
?p wdt:P17 wd:Q38  . 
?p wdt:P31 wd:Q839954  . 

} 
GROUP BY ?cult  ?arch
"""

print("Results")
run_query(queryString)

Results
[('callret-0', '1869')]


1

## Task 1
Identify the BGP that connects an archaeological site to the country, the continent, and the culture

Use a REGEX filter to discover properties related to country, culture, and continent.

In [4]:
queryString = """
SELECT DISTINCT ?pr ?prop
WHERE { 
?a wdt:P31 wd:Q839954;
    ?pr ?o.
?pr sc:name ?prop .
FILTER(REGEX(?prop,\"coun.*\") || REGEX(?prop,\"cont.*\") || REGEX(?prop,\"cult.*\") || REGEX(?prop,\"civi.*\") )
}
"""

print("Results")
run_query(queryString)

Results
[('pr', 'http://www.wikidata.org/prop/direct/P633'), ('prop', 'Quebec cultural heritage directory ID')]
[('pr', 'http://www.wikidata.org/prop/direct/P2596'), ('prop', 'culture')]
[('pr', 'http://www.wikidata.org/prop/direct/P17'), ('prop', 'country')]
[('pr', 'http://www.wikidata.org/prop/direct/P7959'), ('prop', 'historic county')]
[('pr', 'http://www.wikidata.org/prop/direct/P1586'), ('prop', 'Catalan object of cultural interest ID')]
[('pr', 'http://www.wikidata.org/prop/direct/P30'), ('prop', 'continent')]
[('pr', 'http://www.wikidata.org/prop/direct/P205'), ('prop', 'basin country')]
[('pr', 'http://www.wikidata.org/prop/direct/P4245'), ('prop', 'cultural heritage monument in Serbia ID')]
[('pr', 'http://www.wikidata.org/prop/direct/P193'), ('prop', 'main building contractor')]
[('pr', 'http://www.wikidata.org/prop/direct/P495'), ('prop', 'country of origin')]
[('pr', 'http://www.wikidata.org/prop/direct/P5694'), ('prop', 'Moroccan cultural heritage ID')]
[('pr', 'http://w

21

The relations of interest are:
- `P17` country
- `P30` continent
- `P2596` culture
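
The output above also contains labels such as "historic county" and "main building contractor". The sketch below reproduces the filter in Python to show why: the patterns are unanchored, so they match anywhere inside a label. Python's `re.search` is used here as a stand-in for SPARQL `REGEX`, which behaves the same way for these simple patterns. The stricter pattern is only one possible alternative, not something the notebook uses.

```python
import re

# Property labels copied from the (truncated) output above.
labels = [
    "Quebec cultural heritage directory ID",
    "culture",
    "country",
    "historic county",
    "continent",
    "basin country",
    "main building contractor",
    "country of origin",
]

# Same patterns as in the SPARQL FILTER; they match anywhere in the label.
loose = re.compile(r"coun.*|cont.*|cult.*|civi.*")
# Stricter alternative (an assumption about what is wanted): whole-label match.
strict = re.compile(r"^(country|continent|culture)$")

loose_hits = [label for label in labels if loose.search(label)]
strict_hits = [label for label in labels if strict.search(label)]

print(loose_hits)
print(strict_hits)
```

The loose patterns are still the better choice for exploration, since the exact labels are not known in advance. The strict form only becomes useful once the three labels have been seen.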

Use an OPTIONAL match to list all archaeological sites, including those without these properties.

In [5]:
queryString = """
SELECT ?site ?country ?continent ?culture
WHERE { 
?arc wdt:P31 wd:Q839954.
OPTIONAL{?arc    wdt:P17 ?a;
    wdt:P30 ?b;
    wdt:P2596 ?c.
    
?a sc:name ?country .
?b sc:name ?continent .
?c sc:name ?culture .}
?arc sc:name ?site .
}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('site', 'Olduvai Gorge'), ('country', 'Tanzania'), ('continent', 'Africa'), ('culture', 'prehistory')]
[('site', 'Veii'), ('country', 'Italy'), ('continent', 'Europe'), ('culture', 'Etruscans')]
[('site', 'Keros')]
[('site', 'Mesimvria')]
[('site', 'Ancient Theatre of Ohrid')]
[('site', 'Lasaea')]
[('site', 'Tomb of Liu Sheng')]
[('site', 'Schloss')]
[('site', 'Burgstall Windeck')]
[('site', 'sanctuary of Yria')]
[('site', 'Burgstall Lichteburg')]
[('site', 'Xixiaowang site')]
[('site', 'Riverside Plantation Tabby Ruins')]
[('site', 'Daughters of Jacob Bridge')]
[('site', 'Arch of Hadrian')]
[('site', 'Birthplace of Patrick Henry')]
[('site', 'Aghabullogue Bridge')]
[('site', 'Clontead More House')]
[('site', 'Lands End Road Tabby Ruins')]
[('site', 'Magic Mountain Site')]
[('site', 'Hatun Uchku')]
[('site', 'Upper Ridge Site')]
[('site', 'Santee Indian Mound and Fort Watson')]
[('site', 'Seven Islands Archeological and Historic District')]
[('site', 'Roman estate')]
[('site'

100

Require a match on all three properties.

In [6]:
queryString = """
SELECT ?site ?country ?continent ?culture
WHERE { 
?arc wdt:P31 wd:Q839954;
    wdt:P17 ?a;
    wdt:P30 ?b;
    wdt:P2596 ?c.
    
?a sc:name ?country .
?b sc:name ?continent .
?c sc:name ?culture .
?arc sc:name ?site .
}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('site', 'Veii'), ('country', 'Italy'), ('continent', 'Europe'), ('culture', 'Etruscans')]
[('site', 'Olduvai Gorge'), ('country', 'Tanzania'), ('continent', 'Africa'), ('culture', 'prehistory')]
[('site', 'Mari'), ('country', 'Syria'), ('continent', 'Asia'), ('culture', 'Semitic people')]
[('site', 'Indian Grove (Mosca, Colorado)'), ('country', 'United States of America'), ('continent', 'North America'), ('culture', 'Ute people')]
[('site', 'Puerco Ruin and Petroglyphs'), ('country', 'United States of America'), ('continent', 'North America'), ('culture', 'Ancient Pueblo Peoples')]
[('site', 'Wukoki Pueblo'), ('country', 'United States of America'), ('continent', 'North America'), ('culture', 'Ancient Pueblo Peoples')]
[('site', 'Far View House'), ('country', 'United States of America'), ('continent', 'North America'), ('culture', 'Ancient Pueblo Peoples')]
[('site', 'Badger House Community'), ('country', 'United States of America'), ('continent', 'North America'), ('culture'

52

Since only 52 results have all three pieces of information, slightly different queries count how many archaeological sites have no country, no continent, or no culture.

In [7]:
# count all archaeological sites
queryString = """
SELECT COUNT(*) AS ?all_site
WHERE { 
?arc wdt:P31 wd:Q839954.
}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('all_site', '46862')]


1

In [8]:
# count the archaeological sites that have no country

queryString = """
SELECT COUNT(*) AS ?no_country
WHERE { 
?arc wdt:P31 wd:Q839954.
FILTER NOT EXISTS{?arc wdt:P17 ?a.}
}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('no_country', '5016')]


1

In [9]:
# count the archaeological sites that have no continent

queryString = """
SELECT COUNT(*) AS ?no_continent
WHERE { 
?arc wdt:P31 wd:Q839954.
FILTER NOT EXISTS{?arc wdt:P30 ?a.}
#    wdt:P30 ?b;
 #   wdt:P2596 ?c.
}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('no_continent', '46638')]


1

In [10]:
# count the archaeological sites that have no culture

queryString = """
SELECT COUNT(*)
WHERE { 
?arc wdt:P31 wd:Q839954.
FILTER NOT EXISTS{?arc wdt:P2596 ?a.}
}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('callret-0', '45184')]


1

Since almost no site has a direct continent, I can look for the continent through the country instead.

In [11]:
queryString = """
SELECT count(?o) AS ?cont
WHERE { 
?arc wdt:P31 wd:Q839954;
    wdt:P17 ?a.
?a wdt:P30 ?o.

}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('cont', '48111')]


1

Nice result. Repeat the query to find how many archaeological sites have no continent, even through their country.

In [12]:
queryString = """
SELECT count(?arc) AS ?cont
WHERE { 
?arc wdt:P31 wd:Q839954.
 

FILTER NOT EXISTS{
    ?arc wdt:P17 ?a.
    ?a wdt:P30 ?o.
}
}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('cont', '5019')]


1

This lets me associate a continent with an archaeological site even when the continent is not stated directly on the site.
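
To make the comparison concrete, the sketch below turns the counts printed above into coverage figures. All numbers are copied from the query outputs; only the variable names are introduced here. The direct `P30` link covers well under 1% of the sites, while going through the country covers roughly 89%.

```python
# Counts copied from the query outputs above.
all_sites = 46862
missing = {
    "country (P17)": 5016,
    "continent (P30, direct)": 46638,
    "culture (P2596)": 45184,
    "continent (via country)": 5019,
}

# Sites that do have the information, and their share of all sites.
present = {name: all_sites - count for name, count in missing.items()}
coverage = {name: count / all_sites for name, count in present.items()}

for name in missing:
    print(f"{name}: {present[name]} sites, {coverage[name]:.1%} coverage")
```

The earlier result of 48111 for `count(?o)` is larger than the total number of sites. A likely reason, stated here as an assumption, is that the query counts one row per site, country, and continent combination, so a site or country with several values is counted more than once.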

## Task 2
Identify the BGP to retrieve other types of an archaeological site, e.g., human settlement or theatre

Look at the other types (`P31` values) of entities that are instances of archaeological site or of one of its subclasses.

In [13]:
queryString = """
SELECT DISTINCT ?ist ?ista
WHERE { 
?arc wdt:P31/wdt:P279* wd:Q839954;
    wdt:P31 ?ist.

?ist sc:name ?ista .
}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('ist', 'http://www.wikidata.org/entity/Q11486287'), ('ista', 'former Buddhist temple')]
[('ist', 'http://www.wikidata.org/entity/Q1891165'), ('ista', 'mansio')]
[('ist', 'http://www.wikidata.org/entity/Q8157'), ('ista', 'dolphin')]
[('ist', 'http://www.wikidata.org/entity/Q1016853'), ('ista', 'burnt mound')]
[('ist', 'http://www.wikidata.org/entity/Q107413232'), ('ista', 'moated site')]
[('ist', 'http://www.wikidata.org/entity/Q14601169'), ('ista', 'human fortified settlement')]
[('ist', 'http://www.wikidata.org/entity/Q31028835'), ('ista', 'level terrain fort')]
[('ist', 'http://www.wikidata.org/entity/Q325017'), ('ista', 'causewayed enclosure')]
[('ist', 'http://www.wikidata.org/entity/Q4868359'), ('ista', 'bastle house')]
[('ist', 'http://www.wikidata.org/entity/Q48794661'), ('ista', 'Zona Arqueológica')]
[('ist', 'http://www.wikidata.org/entity/Q181623'), ('ista', 'warehouse')]
[('ist', 'http://www.wikidata.org/entity/Q39420219'), ('ista', 'archaeological monument in Meck

100

Use a regex to find types related to theatres or human settlements.

In [14]:
queryString = """
SELECT DISTINCT ?ist ?ista
WHERE { 
?arc wdt:P31/wdt:P279* wd:Q839954;
    wdt:P31 ?ist.

?ist sc:name ?ista .
filter(regex(?ista,\".*atr.*\") || regex(?ista,\".*human.*\"))
}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('ist', 'http://www.wikidata.org/entity/Q14601169'), ('ista', 'human fortified settlement')]
[('ist', 'http://www.wikidata.org/entity/Q811430'), ('ista', 'human-made geographic feature')]
[('ist', 'http://www.wikidata.org/entity/Q486972'), ('ista', 'human settlement')]
[('ist', 'http://www.wikidata.org/entity/Q19757'), ('ista', 'Roman theatre')]
[('ist', 'http://www.wikidata.org/entity/Q11183017'), ('ista', 'open-air theatre')]
[('ist', 'http://www.wikidata.org/entity/Q7362268'), ('ista', 'Roman amphitheatre')]
[('ist', 'http://www.wikidata.org/entity/Q24354'), ('ista', 'theatre')]
[('ist', 'http://www.wikidata.org/entity/Q2860319'), ('ista', 'Greek theatre')]
[('ist', 'http://www.wikidata.org/entity/Q2844395'), ('ista', 'Avenches amphitheatre')]
[('ist', 'http://www.wikidata.org/entity/Q11635'), ('ista', 'theatre')]
[('ist', 'http://www.wikidata.org/entity/Q54831'), ('ista', 'amphitheatre')]
[('ist', 'http://www.wikidata.org/entity/Q24933318'), ('ista', 'Gallo-Roman amphithea

16
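
The property path `wdt:P31/wdt:P279*` means "one `P31` step followed by zero or more `P279` steps". The sketch below emulates it on a tiny in-memory graph to show why unrelated-looking types appear in the results: every matching entity contributes all of its `P31` values. The graph is toy data invented for illustration and is not taken from the endpoint.

```python
# Toy data, invented for illustration; it is not taken from the endpoint.
SUBCLASS_OF = {  # class -> direct superclasses (wdt:P279)
    "Roman theatre": {"theatre"},
    "hillfort": {"archaeological site"},
}
INSTANCE_OF = {  # entity -> classes (wdt:P31)
    "site A": {"archaeological site", "human settlement"},
    "site B": {"hillfort"},
    "site C": {"Roman theatre"},
}


def superclasses(cls):
    """Return cls plus every class reachable via zero or more P279 steps."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def matches_path(entity, target):
    """Emulate `?entity wdt:P31/wdt:P279* target`."""
    return any(target in superclasses(c) for c in INSTANCE_OF[entity])


sites = sorted(e for e in INSTANCE_OF if matches_path(e, "archaeological site"))
# All P31 values of the matching entities, as in `?arc wdt:P31 ?ist`.
other_types = sorted({c for e in sites for c in INSTANCE_OF[e]})

print(sites)
print(other_types)
```

In this toy graph "site C" does not match, because its only type is not a subclass of archaeological site, while "site A" brings "human settlement" into the list of types.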

## Task 3
Is there any relevant numerical attribute that describes these sites, e.g., number of visitors?

With an `isNumeric` filter I can retrieve all numeric attributes.

In [15]:
queryString = """
SELECT DISTINCT ?pr ?prop
WHERE { 
?arc wdt:P31/wdt:P279* wd:Q839954;
    ?pr ?o.

?pr sc:name ?prop .
filter(isNumeric(?o))
}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('pr', 'http://www.wikidata.org/prop/direct/P1103'), ('prop', 'number of platform tracks')]
[('pr', 'http://www.wikidata.org/prop/direct/P2043'), ('prop', 'length')]
[('pr', 'http://www.wikidata.org/prop/direct/P2046'), ('prop', 'area')]
[('pr', 'http://www.wikidata.org/prop/direct/P3872'), ('prop', 'patronage')]
[('pr', 'http://www.wikidata.org/prop/direct/P1314'), ('prop', 'number of spans')]
[('pr', 'http://www.wikidata.org/prop/direct/P2787'), ('prop', 'longest span')]
[('pr', 'http://www.wikidata.org/prop/direct/P1101'), ('prop', 'floors above ground')]
[('pr', 'http://www.wikidata.org/prop/direct/P2044'), ('prop', 'elevation above sea level')]
[('pr', 'http://www.wikidata.org/prop/direct/P2048'), ('prop', 'height')]
[('pr', 'http://www.wikidata.org/prop/direct/P2225'), ('prop', 'discharge')]
[('pr', 'http://www.wikidata.org/prop/direct/P2234'), ('prop', 'volume as quantity')]
[('pr', 'http://www.wikidata.org/prop/direct/P1128'), ('prop', 'employees')]
[('pr', 'http://www

55

`P1174` gives the number of visitors per year. Looking at it in detail:

In [16]:
queryString = """
SELECT DISTINCT ?name ?o
WHERE { 
?arc wdt:P31/wdt:P279* wd:Q839954;
    wdt:P1174 ?o.

?arc sc:name ?name .

}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('name', 'Pantheon'), ('o', '30000000')]
[('name', 'Roman amphitheatre of Saintes'), ('o', '50000')]
[('name', 'musée gallo-romain de Claracq'), ('o', '2883')]
[('name', 'Prague Castle'), ('o', '512800')]
[('name', 'Prague Castle'), ('o', '2587500')]
[('name', 'Colosseum'), ('o', '7400000')]
[('name', 'Orford Castle'), ('o', '31276')]
[('name', 'Orford Castle'), ('o', '38818')]
[('name', 'Orford Castle'), ('o', '29410')]
[('name', 'Orford Castle'), ('o', '35552')]
[('name', 'Orford Castle'), ('o', '29022')]
[('name', 'Orford Castle'), ('o', '33803')]
[('name', 'Orford Castle'), ('o', '37049')]
[('name', 'Orford Castle'), ('o', '29974')]
[('name', 'Orford Castle'), ('o', '6728')]
[('name', 'Orford Castle'), ('o', '31047')]
[('name', 'Orford Castle'), ('o', '34866')]
[('name', 'Orford Castle'), ('o', '29062')]
[('name', 'Orford Castle'), ('o', '37890')]
[('name', 'Orford Castle'), ('o', '32890')]
[('name', 'Orford Castle'), ('o', '29745')]
[('name', 'Orford Castle'), ('o', '3562

90

Since I see more than one value per entity, I assume they refer to different years, so I sum them per site.

In [17]:
queryString = """
SELECT DISTINCT ?name sum(?o) AS ?visitors
WHERE { 
?arc wdt:P31/wdt:P279* wd:Q839954;
    wdt:P1174 ?o.


?arc sc:name ?name .
}
GROUP BY ?arc ?name
ORDER BY DESC (?visitors)
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('name', 'Pantheon'), ('visitors', '60000000')]
[('name', 'Colosseum'), ('visitors', '14800000')]
[('name', 'Acropolis of Athens'), ('visitors', '7679402')]
[('name', 'Independence National Historical Park'), ('visitors', '3572770')]
[('name', 'Catacombs of Paris'), ('visitors', '3116733')]
[('name', 'Prague Castle'), ('visitors', '3100300')]
[('name', 'Tower of London'), ('visitors', '2741126')]
[('name', 'Knossos'), ('visitors', '2101828')]
[('name', 'Mycenae'), ('visitors', '1649962')]
[('name', 'Olympia'), ('visitors', '1388914')]
[('name', 'Acropolis of Lindos'), ('visitors', '1351219')]
[('name', 'Delphi'), ('visitors', '1042456')]
[('name', 'Epidaurus'), ('visitors', '844497')]
[('name', 'Temple of Poseidon at cape Sounion'), ('visitors', '789174')]
[('name', 'Aigai'), ('visitors', '612762')]
[('name', 'Mesa Verde National Park'), ('visitors', '557248')]
[('name', 'Ancient Corinth'), ('visitors', '542087')]
[('name', 'Orford Castle'), ('visitors', '502752')]
[('name', '

28
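
The totals above deserve a second look: Pantheon and Colosseum each show a single value in the previous output (30000000 and 7400000), yet their sums are exactly twice that. One possible explanation, which is an assumption and not verified against the endpoint, is that the property path matches the same site through more than one type, so the same value enters the sum twice. The sketch below shows the effect and a simple deduplication on made-up rows that reuse those two values.

```python
from collections import defaultdict

# Values come from the earlier output; the duplicated rows are an assumption
# about what the query produces before aggregation.
rows = [
    ("Pantheon", 30000000),
    ("Pantheon", 30000000),  # same value, reached through a second type path
    ("Colosseum", 7400000),
    ("Colosseum", 7400000),
]


def total_per_site(pairs):
    """Sum the visitor values per site name."""
    totals = defaultdict(int)
    for name, visitors in pairs:
        totals[name] += visitors
    return dict(totals)


raw_totals = total_per_site(rows)
# Deduplicate identical (site, value) pairs before summing.
deduplicated_totals = total_per_site(set(rows))

print(raw_totals)
print(deduplicated_totals)
```

Deduplicating on the pair has its own limitation: two different years with exactly the same visitor count would be merged into one.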

## Task 4
Analyze the number of archaeological sites per type and country

4.1 Which country has more archaeological sites? Which country has the most human settlements?

4.2 Which countries have Ancient Rome sites, which other "archaeological cultures" are described?

4.3 Which country has the most diverse set of civilizations or cultures across its sites?

4.4 If you are interested in visiting some sites, which country would you pick? Based on what criteria?

<h3> Task 4.1 </h3> Which country has more archaeological sites? Which country has the most human settlements?

In [18]:
queryString = """
SELECT ?country count(DISTINCT ?arc) AS ?arc_sites
WHERE { 
?arc wdt:P31/wdt:P279* wd:Q839954;
    wdt:P17 ?o.

?o sc:name ?country .
}
GROUP BY ?o ?country
ORDER BY DESC (?arc_sites)
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('country', 'Sweden'), ('arc_sites', '80827')]
[('country', 'Germany'), ('arc_sites', '52063')]
[('country', 'United Kingdom'), ('arc_sites', '12151')]
[('country', 'Spain'), ('arc_sites', '9843')]
[('country', 'Denmark'), ('arc_sites', '7034')]
[('country', 'North Macedonia'), ('arc_sites', '4335')]
[('country', 'Finland'), ('arc_sites', '2930')]
[('country', 'Moldova'), ('arc_sites', '2674')]
[('country', 'Italy'), ('arc_sites', '2379')]
[('country', 'United States of America'), ('arc_sites', '2181')]


10

In [19]:
#select the maximum 
queryString = """
SELECT ?country ?arc_sites
WHERE
{
    {
        SELECT max(?arc_sites) AS ?max_c
        WHERE
        {
            SELECT ?country count(DISTINCT ?arc) AS ?arc_sites
            WHERE { 
            ?arc wdt:P31/wdt:P279* wd:Q839954;
                wdt:P17 ?o.

            ?o sc:name ?country .
            }
            GROUP BY ?o ?country
        }
    }
    {
        SELECT ?country count(DISTINCT ?arc) AS ?arc_sites
        WHERE { 
        ?arc wdt:P31/wdt:P279* wd:Q839954;
            wdt:P17 ?o.

        ?o sc:name ?country .
        }
        GROUP BY ?o ?country
    }
    FILTER(?arc_sites=?max_c)
}
"""

print("Results")
run_query(queryString)

Results
[('country', 'Sweden'), ('arc_sites', '80827')]


1
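
The nested query above computes the maximum in one subquery and then keeps the groups whose count equals it. Unlike `ORDER BY DESC ... LIMIT 1`, this would return every country tied for first place. The sketch below shows the same two-step logic in Python on the top-10 counts copied from the previous output.

```python
# Counts copied from the top-10 output of the previous cell.
site_counts = {
    "Sweden": 80827,
    "Germany": 52063,
    "United Kingdom": 12151,
    "Spain": 9843,
    "Denmark": 7034,
    "North Macedonia": 4335,
    "Finland": 2930,
    "Moldova": 2674,
    "Italy": 2379,
    "United States of America": 2181,
}

# Step 1: the inner "SELECT max(...)" subquery.
max_count = max(site_counts.values())
# Step 2: the outer FILTER keeps every group that reaches the maximum.
winners = [country for country, n in site_counts.items() if n == max_count]

print(winners, max_count)
```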

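From one of the Task 2 queries, I found that `Q486972` is the class human settlement. Note that the query below counts all human settlements per country, not only those that are also archaeological sites.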

In [21]:
queryString = """
SELECT ?country count(DISTINCT ?arc) AS ?arc_sites
WHERE { 
?arc wdt:P31/wdt:P279* wd:Q486972;
    wdt:P17 ?o.

?o sc:name ?country .
}
GROUP BY ?o ?country
ORDER BY DESC (?arc_sites)
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('country', "People's Republic of China"), ('arc_sites', '704499')]
[('country', 'Russia'), ('arc_sites', '203492')]
[('country', 'Mexico'), ('arc_sites', '175618')]
[('country', 'India'), ('arc_sites', '175391')]
[('country', 'Indonesia'), ('arc_sites', '116463')]
[('country', 'Germany'), ('arc_sites', '100927')]
[('country', 'United States of America'), ('arc_sites', '97388')]
[('country', 'Poland'), ('arc_sites', '69283')]
[('country', 'Italy'), ('arc_sites', '66558')]
[('country', 'Iran'), ('arc_sites', '63776')]


10

In [22]:
#select the maximum
queryString = """
SELECT ?country ?human_sites
WHERE
{
    {
        SELECT max(?human_sites) AS ?max_c
        WHERE
        {
            SELECT ?country count(DISTINCT ?arc) AS ?human_sites
            WHERE { 
            ?arc wdt:P31/wdt:P279* wd:Q486972;
                wdt:P17 ?o.

            ?o sc:name ?country .
            }
            GROUP BY ?o ?country
        }
    }
    {
        SELECT ?country count(DISTINCT ?arc) AS ?human_sites
        WHERE { 
        ?arc wdt:P31/wdt:P279* wd:Q486972;
            wdt:P17 ?o.

        ?o sc:name ?country .
        }
        GROUP BY ?o ?country
    }
    FILTER(?human_sites=?max_c)
}
"""

print("Results")
run_query(queryString)

Results
[('country', "People's Republic of China"), ('human_sites', '704499')]


1

<h3> Task 4.2 </h3>
Which countries have Ancient Rome sites, which other "archaeological cultures" are described?

In [23]:
# look for properties that link a site to Ancient Rome (wd:Q1747689)
queryString = """
SELECT DISTINCT ?pr ?prop
WHERE { 
?arc wdt:P31/wdt:P279* wd:Q486972;
    ?pr wd:Q1747689.

?arc sc:name ?arc_sites .
?pr sc:name ?prop .
}
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('pr', 'http://www.wikidata.org/prop/direct/P17'), ('prop', 'country')]
[('pr', 'http://www.wikidata.org/prop/direct/P1376'), ('prop', 'capital of')]
[('pr', 'http://www.wikidata.org/prop/direct/P149'), ('prop', 'architectural style')]
[('pr', 'http://www.wikidata.org/prop/direct/P2596'), ('prop', 'culture')]
[('pr', 'http://www.wikidata.org/prop/direct/P2348'), ('prop', 'time period')]
[('pr', 'http://www.wikidata.org/prop/direct/P361'), ('prop', 'part of')]


6

In [24]:
#find the country
queryString = """
SELECT DISTINCT ?country
WHERE { 
?arc wdt:P31/wdt:P279* wd:Q486972;
    wdt:P2596 wd:Q1747689;
    wdt:P17 ?c.

?arc sc:name ?arc_sites .
?c sc:name ?country .
}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('country', 'United Kingdom')]
[('country', 'Ancient Rome')]
[('country', 'Germany')]
[('country', 'Romania')]
[('country', 'Spain')]
[('country', 'Italy')]
[('country', 'Libya')]
[('country', 'France')]
[('country', 'Turkey')]
[('country', 'Morocco')]
[('country', 'Austria')]
[('country', 'Jordan')]
[('country', 'Bulgaria')]
[('country', 'Syria')]
[('country', 'Portugal')]
[('country', 'Egypt')]
[('country', 'Israel')]
[('country', 'Byzantine Empire')]
[('country', 'Greece')]


19

To keep only proper countries (excluding, for example, the Byzantine Empire and Ancient Rome), I need a class to filter on.

In [25]:
# look at the classes of the objects that P17 ('country') links to these sites
queryString = """
SELECT DISTINCT ?ist ?ista
WHERE { 
?arc wdt:P31/wdt:P279* wd:Q486972;
    wdt:P2596 wd:Q1747689;
    wdt:P17 ?c.
?c wdt:P31 ?ist.
?ist sc:name ?ista .
}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('ist', 'http://www.wikidata.org/entity/Q112099'), ('ista', 'island nation')]
[('ist', 'http://www.wikidata.org/entity/Q20181813'), ('ista', 'colonial power')]
[('ist', 'http://www.wikidata.org/entity/Q3024240'), ('ista', 'historical country')]
[('ist', 'http://www.wikidata.org/entity/Q3624078'), ('ista', 'sovereign state')]
[('ist', 'http://www.wikidata.org/entity/Q4209223'), ('ista', 'Rechtsstaat')]
[('ist', 'http://www.wikidata.org/entity/Q43702'), ('ista', 'federation')]
[('ist', 'http://www.wikidata.org/entity/Q619610'), ('ista', 'social state')]
[('ist', 'http://www.wikidata.org/entity/Q6256'), ('ista', 'country')]
[('ist', 'http://www.wikidata.org/entity/Q63791824'), ('ista', 'country bordering the Baltic Sea')]
[('ist', 'http://www.wikidata.org/entity/Q7270'), ('ista', 'republic')]
[('ist', 'http://www.wikidata.org/entity/Q202686'), ('ista', 'Commonwealth realm')]
[('ist', 'http://www.wikidata.org/entity/Q51576574'), ('ista', 'Mediterranean country')]
[('ist', 'http://

24

I found that `Q6256` is the class of countries.

In [26]:
queryString = """
SELECT DISTINCT ?country
WHERE { 
?arc wdt:P31/wdt:P279* wd:Q486972;
    wdt:P2596 wd:Q1747689;
    wdt:P17 ?c.
?c wdt:P31 wd:Q6256.
?arc sc:name ?arc_sites .
?c sc:name ?country .
}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('country', 'United Kingdom')]
[('country', 'Romania')]
[('country', 'Italy')]
[('country', 'Libya')]
[('country', 'France')]
[('country', 'Turkey')]
[('country', 'Morocco')]
[('country', 'Austria')]
[('country', 'Jordan')]
[('country', 'Syria')]
[('country', 'Bulgaria')]
[('country', 'Egypt')]
[('country', 'Portugal')]
[('country', 'Israel')]
[('country', 'Greece')]


15
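
Comparing the two result lists shows exactly what the class filter removed. The sketch below uses the country names copied from the two outputs. Besides the two historical states, Germany and Spain also disappear, presumably because they are not typed directly as `Q6256` in this dataset (an assumption, not checked with a query).

```python
# Country names copied from the output without the class filter.
before_filter = [
    "United Kingdom", "Ancient Rome", "Germany", "Romania", "Spain", "Italy",
    "Libya", "France", "Turkey", "Morocco", "Austria", "Jordan", "Bulgaria",
    "Syria", "Portugal", "Egypt", "Israel", "Byzantine Empire", "Greece",
]
# Country names copied from the output with `?c wdt:P31 wd:Q6256`.
after_filter = [
    "United Kingdom", "Romania", "Italy", "Libya", "France", "Turkey",
    "Morocco", "Austria", "Jordan", "Syria", "Bulgaria", "Egypt", "Portugal",
    "Israel", "Greece",
]

dropped = sorted(set(before_filter) - set(after_filter))
print(dropped)
```

If Germany and Spain should be kept, one option would be to inspect their `P31` values with a further query before choosing the class to filter on.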

Find the other archaeological cultures.

In [27]:
queryString = """
SELECT DISTINCT ?o ?cult
WHERE { 
?arc wdt:P31/wdt:P279* wd:Q486972;
    wdt:P2596 ?o.
?o sc:name ?cult .
}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q202165'), ('cult', 'Hallstatt culture')]
[('o', 'http://www.wikidata.org/entity/Q223998'), ('cult', 'Urnfield culture')]
[('o', 'http://www.wikidata.org/entity/Q611017'), ('cult', 'Lusatian culture')]
[('o', 'http://www.wikidata.org/entity/Q206020'), ('cult', 'Circipania')]
[('o', 'http://www.wikidata.org/entity/Q83891'), ('cult', 'Sasanian Empire')]
[('o', 'http://www.wikidata.org/entity/Q35966'), ('cult', 'Ancient Celts')]
[('o', 'http://www.wikidata.org/entity/Q208247'), ('cult', 'La Tène culture')]
[('o', 'http://www.wikidata.org/entity/Q1036791'), ('cult', 'Hirpini')]
[('o', 'http://www.wikidata.org/entity/Q1140588'), ('cult', 'Russian Americans')]
[('o', 'http://www.wikidata.org/entity/Q536129'), ('cult', 'Tlingit people')]
[('o', 'http://www.wikidata.org/entity/Q173082'), ('cult', 'Dacia')]
[('o', 'http://www.wikidata.org/entity/Q640605'), ('cult', 'Yamato people')]
[('o', 'http://www.wikidata.org/entity/Q1747689'), ('cult', 'Ancie

100

<h3> Task 4.3 </h3>Which country has the most diverse set of civilizations or cultures across its sites?

With the property `P2596` I relate each site to its cultures. Then I also bind the country and count the distinct cultures per country.

In [28]:
queryString = """
SELECT ?country count(DISTINCT ?o) AS ?cultures
WHERE { 
?arc wdt:P31/wdt:P279* wd:Q486972;
    wdt:P17 ?c;
    wdt:P2596 ?o.
?o sc:name ?cult .
?c sc:name ?country .
}
GROUP BY ?c ?country
ORDER BY DESC (?cultures)
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('country', 'Italy'), ('cultures', '27')]
[('country', 'United States of America'), ('cultures', '20')]
[('country', 'Czech Republic'), ('cultures', '16')]
[('country', 'Spain'), ('cultures', '12')]
[('country', 'Turkey'), ('cultures', '12')]
[('country', 'Israel'), ('cultures', '11')]
[('country', 'Greece'), ('cultures', '11')]
[('country', 'Germany'), ('cultures', '9')]
[('country', 'Syria'), ('cultures', '8')]
[('country', 'Iraq'), ('cultures', '8')]


10
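
`count(DISTINCT ?o)` grouped by country counts each culture once per country, no matter how many sites share it. The sketch below shows the same idea in Python on a small sample: the (country, culture) pairs are copied from the complete rows of the Task 1 output, one pair per site, so the numbers illustrate the mechanism and are not the result of the query above.

```python
from collections import defaultdict

# (country, culture) pairs copied from complete rows of the Task 1 output.
pairs = [
    ("Italy", "Etruscans"),
    ("Tanzania", "prehistory"),
    ("Syria", "Semitic people"),
    ("United States of America", "Ute people"),
    ("United States of America", "Ancient Pueblo Peoples"),
    ("United States of America", "Ancient Pueblo Peoples"),
    ("United States of America", "Ancient Pueblo Peoples"),
]

# GROUP BY country, collecting cultures in a set so repeats count once.
cultures_by_country = defaultdict(set)
for country, culture in pairs:
    cultures_by_country[country].add(culture)

distinct_counts = {c: len(s) for c, s in cultures_by_country.items()}
# ORDER BY DESC(count), with the country name as a tie-breaker.
ranking = sorted(distinct_counts.items(), key=lambda kv: (-kv[1], kv[0]))

print(ranking)
```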

In [29]:
#find the maximum
queryString = """
SELECT ?country ?cultures
WHERE
{
    {
        SELECT max(?cultures) AS ?max_c
        WHERE
        {
            SELECT ?country count(DISTINCT ?o) AS ?cultures
            WHERE { 
            ?arc wdt:P31/wdt:P279* wd:Q486972;
                wdt:P17 ?c;
                wdt:P2596 ?o.
            ?o sc:name ?cult .
            ?c sc:name ?country .
            }
            GROUP BY ?c ?country
        }
    }
    {
        SELECT ?country count(DISTINCT ?o) AS ?cultures
        WHERE { 
        ?arc wdt:P31/wdt:P279* wd:Q486972;
            wdt:P17 ?c;
            wdt:P2596 ?o.
        ?o sc:name ?cult .
        ?c sc:name ?country .
        }
        GROUP BY ?c ?country
    }
    FILTER(?cultures=?max_c)
}
"""

print("Results")
run_query(queryString)

Results
[('country', 'Italy'), ('cultures', '27')]


1

<h3> Task 4.4 </h3>If you are interested in visiting some sites, which country would you pick? Based on what criteria?

Since I like football and used to play it, I would like to see the biggest stadium in the world. So I start by looking for all stadiums and then for a numeric property that gives a stadium's capacity.

In [30]:
#find the class of stadium
queryString = """
SELECT DISTINCT ?ist ?ista
WHERE { 
?st wdt:P31/wdt:P279* wd:Q839954;
    wdt:P31 ?ist.

?ist sc:name ?ista .
filter(regex(?ista,\".*adium.*\") )
}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('ist', 'http://www.wikidata.org/entity/Q483110'), ('ista', 'stadium')]


1

In [31]:
#look for numeric properties

queryString = """
SELECT DISTINCT ?pr ?prop
WHERE { 
?st wdt:P31/wdt:P279* wd:Q483110;
    ?pr ?o.

?pr sc:name ?prop .
filter(isNumeric(?o) )
}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('pr', 'http://www.wikidata.org/prop/direct/P1083'), ('prop', 'maximum capacity')]
[('pr', 'http://www.wikidata.org/prop/direct/P8687'), ('prop', 'social media followers')]
[('pr', 'http://www.wikidata.org/prop/direct/P1101'), ('prop', 'floors above ground')]
[('pr', 'http://www.wikidata.org/prop/direct/P2044'), ('prop', 'elevation above sea level')]
[('pr', 'http://www.wikidata.org/prop/direct/P2046'), ('prop', 'area')]
[('pr', 'http://www.wikidata.org/prop/direct/P2048'), ('prop', 'height')]
[('pr', 'http://www.wikidata.org/prop/direct/P2130'), ('prop', 'cost')]
[('pr', 'http://www.wikidata.org/prop/direct/P1301'), ('prop', 'number of elevators')]
[('pr', 'http://www.wikidata.org/prop/direct/P2043'), ('prop', 'length')]
[('pr', 'http://www.wikidata.org/prop/direct/P2049'), ('prop', 'width')]
[('pr', 'http://www.wikidata.org/prop/direct/P2067'), ('prop', 'mass')]
[('pr', 'http://www.wikidata.org/prop/direct/P2234'), ('prop', 'volume as quantity')]
[('pr', 'http://www.wikidata

20

Nice. `P1083` gives me the capacity of a stadium. Find the stadiums with the largest capacity.

In [32]:
queryString = """
SELECT DISTINCT ?stadium ?country ?o
WHERE { 
?st wdt:P31/wdt:P279* wd:Q483110;
    wdt:P1083 ?o;
    wdt:P17 ?c.

?st sc:name ?stadium .
?c sc:name ?country .
}
ORDER BY DESC(?o)
LIMIT 5
"""

print("Results")
run_query(queryString)

Results
[('stadium', 'Deutsches Stadion'), ('country', 'Germany'), ('o', '400000')]
[('stadium', 'Rungrado 1st of May Stadium'), ('country', 'North Korea'), ('o', '150000')]
[('stadium', 'Parc des Sports Aguiléra'), ('country', 'France'), ('o', '134000')]
[('stadium', 'Estádio do Morumbi'), ('country', 'Brazil'), ('o', '120000')]
[('stadium', 'Estadio Azteca'), ('country', 'Mexico'), ('o', '115000')]


5

So I will go to Germany to visit the Deutsches Stadion!