# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload starts with a usually vague, open-ended question, and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook
3. You do not delete useless queries. Keep everything that is syntactically valid

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.
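
As a minimal sketch of how this BGP can be used, the helper below builds a query that looks up the names of several terms at once. The function name `label_query` and the use of a `VALUES` clause are illustrative assumptions, not part of the assignment; the prefixed names rely on the `wd:` and `wdt:` prefixes declared in the setup cell.

```python
def label_query(prefixed_names):
    """Build a SELECT query listing the schema.org name of each given term.

    `prefixed_names` is a list such as ["wdt:P17", "wd:Q839954"].
    (Hypothetical helper, for illustration only.)
    """
    values = " ".join(prefixed_names)
    return (
        "SELECT ?p ?name\n"
        "WHERE {\n"
        f"  VALUES ?p {{ {values} }}\n"
        "  ?p <http://schema.org/name> ?name .\n"
        "}"
    )


query = label_query(["wdt:P17", "wdt:P31", "wd:Q839954"])
print(query)
```

The resulting string could then be passed to `run_query` from the setup cell.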
    
    

In [2]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-17e02ab93a-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
       results = sparql.query()
       json_results = results.convert()
       if len(json_results['results']['bindings'])==0:
          print("Empty")
          return 0
    
       for bindings in json_results['results']['bindings']:
          print( [ (var, value['value'])  for var, value in bindings.items() ] )

       return len(json_results['results']['bindings'])

    except Exception as e :
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
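
`run_query` only prints the bindings and returns their count. When the values are needed for further processing, the JSON result can be flattened into plain dictionaries. The sketch below assumes the standard SPARQL JSON results layout (`results` → `bindings`) that `run_query` already relies on; the helper name `bindings_to_rows` is hypothetical, and the sample row is taken from the output shown later in this notebook.

```python
from typing import Dict, List


def bindings_to_rows(json_results: dict) -> List[Dict[str, str]]:
    """Flatten SPARQL JSON bindings into one {variable: value} dict per row."""
    rows = []
    for binding in json_results["results"]["bindings"]:
        rows.append({var: cell["value"] for var, cell in binding.items()})
    return rows


# Hand-written sample in the SPARQL JSON results layout (no endpoint needed).
sample = {
    "head": {"vars": ["p", "plabel"]},
    "results": {
        "bindings": [
            {
                "p": {"type": "uri",
                      "value": "http://www.wikidata.org/entity/Q1046434"},
                "plabel": {"type": "literal", "value": "Casa Romuli"},
            }
        ]
    },
}

rows = bindings_to_rows(sample)
print(rows)
```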


# GEO Workflow Series ("archaeological sites") 

Consider the following exploratory information need:

> Search for archaeological sites in the world, across countries and continents, and in reference to their culture

## Useful URIs for the current workflow


The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P279`    | subclass      | predicate |
| `wdt:P17`     | country       | predicate |
| `wd:Q38`      | Italy  | node      |
| `wd:Q641556`  | Verona Arena  | node      |
| `wd:Q1747689` | Ancient Rome  | node |
| `wd:Q46`      | Europe        | node |
| `wd:Q173527`  | Knossos       | node |
| `wd:Q839954`  | archaeological site | node |


Also consider

```
?p wdt:P17 wd:Q38  . 
?p wdt:P31 wd:Q839954  . 
```

is the BGP to retrieve all **archaeological sites in Italy**.
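
The same BGP can be reused for other countries by swapping the country node. The sketch below is one possible way to do that; `sites_in_country_query` is a hypothetical helper, and the identifier check is an added precaution so that only a plain item id ends up in the query text.

```python
import re


def sites_in_country_query(country_qid, limit=10):
    """Build a query listing archaeological sites (wd:Q839954) of a country."""
    # Accept only identifiers such as "Q38".
    if not re.fullmatch(r"Q[1-9][0-9]*", country_qid):
        raise ValueError(f"not a Wikidata item id: {country_qid!r}")
    return (
        "SELECT ?p ?plabel\n"
        "WHERE {\n"
        f"  ?p wdt:P17 wd:{country_qid} .\n"
        "  ?p wdt:P31 wd:Q839954 .\n"
        "  ?p <http://schema.org/name> ?plabel .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


query = sites_in_country_query("Q38")  # Q38 is Italy, as given in the table
print(query)
```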

## Workload Goals

1. Identify the BGPs that connect an archaeological site to its country, continent, and culture

2. Identify the BGP to retrieve other types of an archaeological site, e.g., human settlement or theatre

3. Is there any relevant numerical attribute that describes these sites, e.g., number of visitors?

4. Analyze the number of archaeological sites per type and country
 
   4.1 Which country has the most archaeological sites? Which country has the most human settlements?
   
   4.2 Which countries have Ancient Rome sites? Which other "archaeological cultures" are described?
   
   4.3 Which country has the most diverse set of civilizations or cultures across its sites?
   
   4.4 If you are interested in visiting some sites, which country would you pick? Based on what criteria?


In [1]:
# start your workflow here

In [4]:
queryString = """
SELECT COUNT( ?p)
WHERE { 
?p wdt:P17 wd:Q38  . 
?p wdt:P31 wd:Q839954  . 

}
"""

print("Results")
run_query(queryString)

Results
[('callret-0', '1869')]


1

In [7]:
# get all archaeological sites in Italy

queryString = """
SELECT ?p ?plabel
WHERE { 
?p wdt:P17 wd:Q38  . 
?p wdt:P31 wd:Q839954  . 

?p <http://schema.org/name> ?plabel . 

}
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/entity/Q1046434'), ('plabel', 'Casa Romuli')]
[('p', 'http://www.wikidata.org/entity/Q17624209'), ('plabel', 'Catacomba di Santa Savinilla')]
[('p', 'http://www.wikidata.org/entity/Q17631557'), ('plabel', 'Gymnasium')]
[('p', 'http://www.wikidata.org/entity/Q17637300'), ('plabel', 'Nuraghe Sirai')]
[('p', 'http://www.wikidata.org/entity/Q17651475'), ('plabel', 'Porta Venere')]
[('p', 'http://www.wikidata.org/entity/Q17652208'), ('plabel', 'Excavation at Largo Perosi - Via Giulia (Rome)')]
[('p', 'http://www.wikidata.org/entity/Q21208533'), ('plabel', 'Temple of Fortuna Respiciens')]
[('p', 'http://www.wikidata.org/entity/Q908472'), ('plabel', 'Domus Augustana')]
[('p', 'http://www.wikidata.org/entity/Q1665908'), ('plabel', 'Interamna Lirenas')]
[('p', 'http://www.wikidata.org/entity/Q22907580'), ('plabel', 'Palace of Domitian')]


10

In [10]:
# get all predicates and objects from "Verona Arena"

queryString = """
SELECT ?p ?plabel (GROUP_CONCAT(?olabel; separator="|") as ?objs)
WHERE { 

wd:Q641556 ?p ?o .

?p <http://schema.org/name> ?plabel . 
?o <http://schema.org/name> ?olabel . 

}
GROUP BY ?p ?plabel
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P17'), ('plabel', 'country'), ('objs', 'Italy')]
[('p', 'http://www.wikidata.org/prop/direct/P276'), ('plabel', 'location'), ('objs', 'piazza Bra')]
[('p', 'http://www.wikidata.org/prop/direct/P8933'), ('plabel', 'category for the view from the item'), ('objs', 'Category:Views from the Arena (Verona)')]
[('p', 'http://www.wikidata.org/prop/direct/P131'), ('plabel', 'located in the administrative territorial entity'), ('objs', 'Verona')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('plabel', 'different from'), ('objs', 'Roman theatre of Verona')]
[('p', 'http://www.wikidata.org/prop/direct/P149'), ('plabel', 'architectural style'), ('objs', 'ancient Roman architecture')]
[('p', 'http://www.wikidata.org/prop/direct/P186'), ('plabel', 'made from material'), ('objs', 'stone')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('plabel', 'instance of'), ('objs', 'ancient Roman structure|Roman amphitheatre|theatre|tourist attracti

10

In [11]:
# get all predicates and objects from "archaeological site"

queryString = """
SELECT ?p ?plabel (GROUP_CONCAT(?olabel; separator="|") as ?objs)
WHERE { 

wd:Q839954 ?p ?o .

?p <http://schema.org/name> ?plabel . 
?o <http://schema.org/name> ?olabel . 

}
GROUP BY ?p ?plabel
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P2579'), ('plabel', 'studied by'), ('objs', 'archaeology')]
[('p', 'http://www.wikidata.org/prop/direct/P1424'), ('plabel', "topic's main template"), ('objs', 'Template:Infobox ancient site|Template:Infobox historic site')]
[('p', 'http://www.wikidata.org/prop/direct/P1269'), ('plabel', 'facet of'), ('objs', 'cultural heritage')]
[('p', 'http://www.wikidata.org/prop/direct/P6104'), ('plabel', 'maintained by WikiProject'), ('objs', 'WikiProject Historical Place')]
[('p', 'http://www.wikidata.org/prop/direct/P910'), ('plabel', "topic's main category"), ('objs', 'Category:Archaeological sites')]
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('plabel', 'described by source'), ('objs', 'Armenian Soviet Encyclopedia|Green Map Icons')]
[('p', 'http://www.wikidata.org/prop/direct/P1687'), ('plabel', 'Wikidata property'), ('objs', 'archaeological site of')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('plabel', 'subclass of'), (

8

In [13]:
# get all predicates and objects from "Italy"

queryString = """
SELECT ?p ?plabel (GROUP_CONCAT(?olabel; separator="|") as ?objs)
WHERE { 

wd:Q38 ?p ?o .

?p <http://schema.org/name> ?plabel . 
?o <http://schema.org/name> ?olabel . 

}
GROUP BY ?p ?plabel
ORDER BY (?plabel)
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P5125'), ('plabel', 'Wikimedia outline'), ('objs', 'outline of Italy')]
[('p', 'http://www.wikidata.org/prop/direct/P85'), ('plabel', 'anthem'), ('objs', 'Il Canto degli Italiani')]
[('p', 'http://www.wikidata.org/prop/direct/P122'), ('plabel', 'basic form of government'), ('objs', 'parliamentary republic')]
[('p', 'http://www.wikidata.org/prop/direct/P36'), ('plabel', 'capital'), ('objs', 'Rome')]
[('p', 'http://www.wikidata.org/prop/direct/P1740'), ('plabel', 'category for films shot at this location'), ('objs', 'Category:Films shot in Italy')]
[('p', 'http://www.wikidata.org/prop/direct/P7867'), ('plabel', 'category for maps'), ('objs', 'Category:Maps of Italy')]
[('p', 'http://www.wikidata.org/prop/direct/P1465'), ('plabel', 'category for people who died here'), ('objs', 'Category:Deaths in Italy')]
[('p', 'http://www.wikidata.org/prop/direct/P1464'), ('plabel', 'category for people born here'), ('objs', 'Category:Births in Italy'

58

In [20]:
# GOAL (1)
# get all archaeological sites and the related information about country, continent and culture
# NOTE: I retrieved the labels only in order to get a cleaner output

queryString = """
SELECT ?site ?sitelbl (GROUP_CONCAT(DISTINCT ?culturelbl; separator="|") as ?cult) (GROUP_CONCAT(DISTINCT ?countrylbl; separator="|") as ?contr) (GROUP_CONCAT(DISTINCT ?continlbl; separator="|") as ?cont)
WHERE { 

?site wdt:P31 wd:Q839954 ;
      wdt:P17 ?country ;
      wdt:P2596 ?culture .
      
?country wdt:P30 ?contin .

?site <http://schema.org/name> ?sitelbl . 
?country <http://schema.org/name> ?countrylbl .
?contin <http://schema.org/name> ?continlbl . 
?culture <http://schema.org/name> ?culturelbl . 

}
GROUP BY ?site ?sitelbl
ORDER BY (?sitelbl)
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('site', 'http://www.wikidata.org/entity/Q4553352'), ('sitelbl', '1790 Footprints'), ('cult', 'Native Hawaiians'), ('contr', 'United States of America'), ('cont', 'Insular Oceania|North America')]
[('site', 'http://www.wikidata.org/entity/Q95627065'), ('sitelbl', '21SL55'), ('cult', 'indigenous peoples of North America'), ('contr', 'United States of America'), ('cont', 'Insular Oceania|North America')]
[('site', 'http://www.wikidata.org/entity/Q16995461'), ('sitelbl', 'Aaron Hilton Site'), ('cult', 'African-American culture'), ('contr', 'United States of America'), ('cont', 'Insular Oceania|North America')]
[('site', 'http://www.wikidata.org/entity/Q304670'), ('sitelbl', 'Abadiyeh'), ('cult', 'Ancient Egypt'), ('contr', 'Egypt'), ('cont', 'Africa|Asia')]
[('site', 'http://www.wikidata.org/entity/Q602078'), ('sitelbl', 'Aballava'), ('cult', 'Roman Empire'), ('contr', 'United Kingdom'), ('cont', 'Europe')]
[('site', 'http://www.wikidata.org/entity/Q318664'), ('sitelbl', 'Abellin

20
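
The `GROUP_CONCAT` columns come back as single strings joined by `|`. A small sketch of how they could be turned back into lists on the Python side follows; `split_concat` is a hypothetical helper, and the sample row is copied from the output above. It assumes that no label contains the separator itself.

```python
def split_concat(row, fields, separator="|"):
    """Return a copy of `row` where each concatenated field becomes a list."""
    out = dict(row)
    for field in fields:
        value = row.get(field, "")
        out[field] = value.split(separator) if value else []
    return out


# Row taken from the results printed above.
row = {
    "sitelbl": "Abadiyeh",
    "cult": "Ancient Egypt",
    "contr": "Egypt",
    "cont": "Africa|Asia",
}

parsed = split_concat(row, ["cult", "contr", "cont"])
print(parsed)
```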

In [39]:
# get the classes of which each archaeological site is an instance

queryString = """
SELECT ?site ?sitelbl (GROUP_CONCAT(?type; separator="|") as ?typ) (GROUP_CONCAT(?typelbl; separator="|") as ?typlbl)
WHERE { 

?site wdt:P31 wd:Q839954 .
?site wdt:P31 ?type .

?site <http://schema.org/name> ?sitelbl . 
?type <http://schema.org/name> ?typelbl . 

}
GROUP BY ?site ?sitelbl
ORDER BY (?sitelbl)
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('site', 'http://www.wikidata.org/entity/Q106032525'), ('sitelbl', '"CUEVAS DEL ARROYO DE LAS AGUITAS"'), ('typ', 'http://www.wikidata.org/entity/Q839954|http://www.wikidata.org/entity/Q105440620'), ('typlbl', 'archaeological site|BADACOR')]
[('site', 'http://www.wikidata.org/entity/Q17986627'), ('sitelbl', '"Congarees" Site'), ('typ', 'http://www.wikidata.org/entity/Q839954'), ('typlbl', 'archaeological site')]
[('site', 'http://www.wikidata.org/entity/Q42898357'), ('sitelbl', '"La Cetățui" archaeological site in Tismana'), ('typ', 'http://www.wikidata.org/entity/Q839954'), ('typlbl', 'archaeological site')]
[('site', 'http://www.wikidata.org/entity/Q3209474'), ('sitelbl', '"La Grande Cité"'), ('typ', 'http://www.wikidata.org/entity/Q2065736|http://www.wikidata.org/entity/Q839954'), ('typlbl', 'cultural property|archaeological site')]
[('site', 'http://www.wikidata.org/entity/Q37300520'), ('sitelbl', '"Riparo Ranaldi" prehistoric site'), ('typ', 'http://www.wikidata.org/entit

100

In [33]:
# get all objects from "Verona Arena"->"instanceOf"

queryString = """
SELECT ?o ?olabel
WHERE { 

wd:Q641556 wdt:P31 ?o .

?o <http://schema.org/name> ?olabel . 

}
"""

print("Results")
run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q14752696'), ('olabel', 'ancient Roman structure')]
[('o', 'http://www.wikidata.org/entity/Q7362268'), ('olabel', 'Roman amphitheatre')]
[('o', 'http://www.wikidata.org/entity/Q24354'), ('olabel', 'theatre')]
[('o', 'http://www.wikidata.org/entity/Q570116'), ('olabel', 'tourist attraction')]
[('o', 'http://www.wikidata.org/entity/Q839954'), ('olabel', 'archaeological site')]
[('o', 'http://www.wikidata.org/entity/Q153562'), ('olabel', 'opera house')]


6

In [36]:
# get node which "theatre" is an instance of

queryString = """
SELECT ?o ?olabel
WHERE { 

wd:Q24354 wdt:P31 ?o .

?o <http://schema.org/name> ?olabel . 

}
"""

print("Results")
run_query(queryString)

Results
Empty


0

In [37]:
# get all predicates from "theatre" node

queryString = """
SELECT ?p ?plabel (GROUP_CONCAT(?olabel; separator="|") as ?objs)
WHERE { 

wd:Q24354 ?p ?o .

?p <http://schema.org/name> ?plabel . 
?o <http://schema.org/name> ?olabel . 

}
GROUP BY ?p ?plabel
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('plabel', 'has part'), ('objs', 'stage|front of house|theater space|backstage|stage tower')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('plabel', 'different from'), ('objs', 'theatre company|theatre')]
[('p', 'http://www.wikidata.org/prop/direct/P910'), ('plabel', "topic's main category"), ('objs', 'Category:Theatres')]
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('plabel', 'described by source'), ('objs', "Small Brockhaus and Efron Encyclopedic Dictionary|Real'nyj slovar' klassicheskih drevnostej po Ljubkeru|Brockhaus and Efron Encyclopedic Dictionary")]
[('p', 'http://www.wikidata.org/prop/direct/P366'), ('plabel', 'use'), ('objs', 'theatre')]
[('p', 'http://www.wikidata.org/prop/direct/P1963'), ('plabel', 'properties for this type'), ('objs', 'named after|country|owned by|located in the administrative territorial entity|operator|has part|street address|architect|official website|located on street|locati

7

In [46]:
# GOAL (2)
# get all the other types of each archaeological site

queryString = """
SELECT ?site ?sitelbl ?type ?typelbl
WHERE { 

?site wdt:P31 wd:Q839954 .
?site wdt:P31 ?type .

?site <http://schema.org/name> ?sitelbl . 
?type <http://schema.org/name> ?typelbl . 

FILTER (?type != wd:Q839954)

}
GROUP BY ?site ?sitelbl ?type ?typelbl
ORDER BY (?sitelbl)
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('site', 'http://www.wikidata.org/entity/Q106032525'), ('sitelbl', '"CUEVAS DEL ARROYO DE LAS AGUITAS"'), ('type', 'http://www.wikidata.org/entity/Q105440620'), ('typelbl', 'BADACOR')]
[('site', 'http://www.wikidata.org/entity/Q3209474'), ('sitelbl', '"La Grande Cité"'), ('type', 'http://www.wikidata.org/entity/Q2065736'), ('typelbl', 'cultural property')]
[('site', 'http://www.wikidata.org/entity/Q106589023'), ('sitelbl', '"TRAMPA DE ZORRO IX - DE LA PAMPA"'), ('type', 'http://www.wikidata.org/entity/Q105440620'), ('typelbl', 'BADACOR')]
[('site', 'http://www.wikidata.org/entity/Q106589026'), ('sitelbl', '"TRAMPA DE ZORRO VI - DE LAS PAMP"'), ('type', 'http://www.wikidata.org/entity/Q105440620'), ('typelbl', 'BADACOR')]
[('site', 'http://www.wikidata.org/entity/Q106589028'), ('sitelbl', '"TRAMPA DE ZORRO VII - DE LA PAMP"'), ('type', 'http://www.wikidata.org/entity/Q105440620'), ('typelbl', 'BADACOR')]
[('site', 'http://www.wikidata.org/entity/Q106589030'), ('sitelbl', '"TRAM

100

In [3]:
# get all predicates and subjects to "Verona Arena"

queryString = """
SELECT ?p ?plabel (GROUP_CONCAT(?slabel; separator="|") as ?subjs)
WHERE { 

?s ?p wd:Q641556 .

?p <http://schema.org/name> ?plabel . 
?s <http://schema.org/name> ?slabel . 

}
GROUP BY ?p ?plabel
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P915'), ('plabel', 'filming location'), ('subjs', 'Live in Verona')]
[('p', 'http://www.wikidata.org/prop/direct/P69'), ('plabel', 'educated at'), ('subjs', 'Jean Pierre Gamarra')]
[('p', 'http://www.wikidata.org/prop/direct/P276'), ('plabel', 'location'), ('subjs', 'Festivalbar 1977|Festivalbar 1982|Festivalbar 1983|Festivalbar 1989|Festivalbar 2001|Festivalbar 2002|Festivalbar 2005|Wind Music Awards 2012|Wind Music Awards 2011|Festivalbar 1975|Festivalbar 1981|Festivalbar 1991|Festivalbar 1998|Festivalbar 2007|Wind Music Awards 2018|Festivalbar 2003|Festivalbar 1978|Festivalbar 1979|Festivalbar 1990|SEAT Music Awards 2020|Wind Music Awards 2017|Festivalbar 2004|Festivalbar 1999|Festivalbar 2006|Festivalbar 1980|Festivalbar 2000|Festivalbar 1976|Festivalbar 1988|Wind Music Awards 2010|Wind Music Awards 2016|Wind Music Awards 2009|Wind Music Awards 2015|SEAT Music Awards 2019')]
[('p', 'http://www.wikidata.org/prop/direct/P483'), ('pl

9

In [18]:
# GOAL (3)
# check whether objects retrieved from an archaeological site through 2 paths are numerical
# ANSWER: NO

queryString = """
SELECT ?p ?plabel (GROUP_CONCAT(?olabel; separator="|") as ?objs) (GROUP_CONCAT(?o2label; separator="|") as ?objs2)
WHERE { 

?site wdt:P31 wd:Q839954 ;
      ?p ?o .
?o ?p2 ?o2 .

?site <http://schema.org/name> ?sitelbl . 
?o <http://schema.org/name> ?olabel . 
?o2 <http://schema.org/name> ?o2label . 

FILTER (!isLiteral(?olabel) || !isLiteral(?o2label))

}
"""

print("Results")
run_query(queryString)

Results
Empty


0
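
The filter in the query above tests the *labels* of the objects, and `schema:name` values are presumably always literals, so the condition can never hold and the empty result does not by itself rule out numerical attributes. A possible alternative, sketched below, is to test the object values directly with the SPARQL 1.1 `isNumeric` function. This query has not been run against the endpoint; whether numeric values are exposed under the `wdt:` predicates of this dump is an assumption, and `NUMERIC_PROPERTIES_QUERY`, `NUMERIC_DATATYPES` and `is_numeric_cell` are illustrative names.

```python
# Candidate query: properties of archaeological sites whose values are numeric.
NUMERIC_PROPERTIES_QUERY = """
SELECT ?p ?plabel (COUNT(DISTINCT ?site) AS ?sites)
WHERE {
  ?site wdt:P31 wd:Q839954 ;
        ?p ?value .
  FILTER (isNumeric(?value))
  OPTIONAL { ?p <http://schema.org/name> ?plabel . }
}
GROUP BY ?p ?plabel
ORDER BY DESC(?sites)
LIMIT 50
"""

# XSD datatypes treated as numeric here (an illustrative subset).
NUMERIC_DATATYPES = {
    "http://www.w3.org/2001/XMLSchema#integer",
    "http://www.w3.org/2001/XMLSchema#decimal",
    "http://www.w3.org/2001/XMLSchema#double",
    "http://www.w3.org/2001/XMLSchema#float",
}


def is_numeric_cell(cell):
    """Client-side check of one SPARQL JSON binding cell.

    Some endpoints report typed values as "typed-literal" instead of "literal".
    """
    return (
        cell.get("type") in ("literal", "typed-literal")
        and cell.get("datatype") in NUMERIC_DATATYPES
    )


print(is_numeric_cell({"type": "literal", "value": "Italy"}))
```

The query string could be passed to `run_query`; `is_numeric_cell` covers the case where the filtering is done in Python instead.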

In [4]:
# get all the classes that archaeological sites are instances of

queryString = """
SELECT DISTINCT ?o ?olabel
WHERE { 

?s wdt:P31 wd:Q839954 ;
   wdt:P31 ?o .

?o <http://schema.org/name> ?olabel . 

}
ORDER BY ?olabel
"""

print("Results")
run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q24068255'), ('olabel', '13 Kohanic cities')]
[('o', 'http://www.wikidata.org/entity/Q1243603'), ('olabel', '14 regions of Augustan Rome')]
[('o', 'http://www.wikidata.org/entity/Q37947353'), ('olabel', 'Alaska state park')]
[('o', 'http://www.wikidata.org/entity/Q104880747'), ('olabel', 'Amluk-Dara stupa')]
[('o', 'http://www.wikidata.org/entity/Q4115349'), ('olabel', 'Ancient City of Aleppo')]
[('o', 'http://www.wikidata.org/entity/Q93342462'), ('olabel', 'Ancient Greek archaeological site')]
[('o', 'http://www.wikidata.org/entity/Q95463156'), ('olabel', 'Ancient Roman dam')]
[('o', 'http://www.wikidata.org/entity/Q1161819'), ('olabel', 'Ancient Roman unit of measurement')]
[('o', 'http://www.wikidata.org/entity/Q3467688'), ('olabel', 'Ancient settlements in Turkey')]
[('o', 'http://www.wikidata.org/entity/Q37944886'), ('olabel', 'Arizona state park')]
[('o', 'http://www.wikidata.org/entity/Q37947493'), ('olabel', 'Arkansas state park')]

1117

In [56]:
# GOAL (4.1)
# get the number of archaeological sites and human settlements for each country
# To obtain both answers, change the variable used in "ORDER BY"
# Country with the most archaeological sites: UNITED KINGDOM
# Country with the most human settlements: MEXICO

queryString = """
SELECT ?country ?countrylbl (COUNT(DISTINCT ?arc_site) AS ?site_count) (COUNT(DISTINCT ?hs) AS ?hs_count)
WHERE { 
    {
        ?arc_site wdt:P31 wd:Q839954 ;
                  wdt:P17 ?country .
    }
    UNION
    {
        ?hs wdt:P31 wd:Q486972 ;
            wdt:P17 ?country .
    }
    
    ?country <http://schema.org/name> ?countrylbl .
}
GROUP BY ?country ?countrylbl
ORDER BY DESC (?site_count)
"""

print("Results")
run_query(queryString)

Results
[('country', 'http://www.wikidata.org/entity/Q145'), ('countrylbl', 'United Kingdom'), ('site_count', '7026'), ('hs_count', '3900')]
[('country', 'http://www.wikidata.org/entity/Q29'), ('countrylbl', 'Spain'), ('site_count', '6168'), ('hs_count', '12488')]
[('country', 'http://www.wikidata.org/entity/Q221'), ('countrylbl', 'North Macedonia'), ('site_count', '4329'), ('hs_count', '266')]
[('country', 'http://www.wikidata.org/entity/Q183'), ('countrylbl', 'Germany'), ('site_count', '2825'), ('hs_count', '26744')]
[('country', 'http://www.wikidata.org/entity/Q217'), ('countrylbl', 'Moldova'), ('site_count', '2674'), ('hs_count', '3')]
[('country', 'http://www.wikidata.org/entity/Q30'), ('countrylbl', 'United States of America'), ('site_count', '2159'), ('hs_count', '6390')]
[('country', 'http://www.wikidata.org/entity/Q414'), ('countrylbl', 'Argentina'), ('site_count', '2073'), ('hs_count', '1590')]
[('country', 'http://www.wikidata.org/entity/Q38'), ('countrylbl', 'Italy'), ('sit

430
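
Instead of re-running the query with a different `ORDER BY`, both answers can be derived from a single result set. The sketch below uses only some of the rows visible in the output above, so the human-settlement winner within this sample differs from the answer over the full result. Counts arrive as strings and have to be converted before comparing; `top_country` is a hypothetical helper.

```python
def top_country(rows, field):
    """Return the label of the country with the largest value in `field`."""
    # Values are strings in the result set; compare them as integers.
    return max(rows, key=lambda r: int(r[field]))["countrylbl"]


# Subset of the rows printed above.
rows = [
    {"countrylbl": "United Kingdom", "site_count": "7026", "hs_count": "3900"},
    {"countrylbl": "Spain", "site_count": "6168", "hs_count": "12488"},
    {"countrylbl": "North Macedonia", "site_count": "4329", "hs_count": "266"},
    {"countrylbl": "Germany", "site_count": "2825", "hs_count": "26744"},
    {"countrylbl": "United States of America", "site_count": "2159", "hs_count": "6390"},
]

print(top_country(rows, "site_count"))
print(top_country(rows, "hs_count"))
```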

In [34]:
# GOAL (4.2)
# get the archaeological site's culture and the related countries

queryString = """
SELECT ?culture ?culturelbl (GROUP_CONCAT(DISTINCT ?countrylbl; separator="|") AS ?countries)
WHERE { 

?site wdt:P31 wd:Q839954 ;
      wdt:P2596 ?culture ;
      wdt:P17 ?country .

?culture <http://schema.org/name> ?culturelbl .
?country <http://schema.org/name> ?countrylbl .

}
GROUP BY ?culture ?culturelbl
ORDER BY ?culturelbl
"""

print("Results")
run_query(queryString)

Results
[('culture', 'http://www.wikidata.org/entity/Q643335'), ('culturelbl', 'Abealzu-Filigosa culture'), ('countries', 'Italy')]
[('culture', 'http://www.wikidata.org/entity/Q210774'), ('culturelbl', 'Acheulean'), ('countries', 'Israel|Tanzania|Zambia')]
[('culture', 'http://www.wikidata.org/entity/Q220379'), ('culturelbl', 'Adriatic Veneti'), ('countries', 'Italy')]
[('culture', 'http://www.wikidata.org/entity/Q3007177'), ('culturelbl', 'African-American culture'), ('countries', 'United States of America')]
[('culture', 'http://www.wikidata.org/entity/Q203507'), ('culturelbl', 'Amorites'), ('countries', 'Syria|Turkey')]
[('culture', 'http://www.wikidata.org/entity/Q2429397'), ('culturelbl', 'Ancient Carthage'), ('countries', 'Italy|Spain')]
[('culture', 'http://www.wikidata.org/entity/Q35966'), ('culturelbl', 'Ancient Celts'), ('countries', 'France|Germany|Italy|Spain')]
[('culture', 'http://www.wikidata.org/entity/Q11768'), ('culturelbl', 'Ancient Egypt'), ('countries', 'Ancient E

268

In [38]:
# GOAL (4.3)
# get the count of different cultures in each country
# ANSWER: Italy, with 61 cultures

queryString = """
SELECT ?country ?countrylbl (COUNT(DISTINCT ?culture) AS ?cult_num)
WHERE { 

?site wdt:P31 wd:Q839954 ;
      wdt:P2596 ?culture ;
      wdt:P17 ?country .

?culture <http://schema.org/name> ?culturelbl .
?country <http://schema.org/name> ?countrylbl .

}
GROUP BY ?country ?countrylbl
ORDER BY DESC (?cult_num)
"""

print("Results")
run_query(queryString)

Results
[('country', 'http://www.wikidata.org/entity/Q38'), ('countrylbl', 'Italy'), ('cult_num', '61')]
[('country', 'http://www.wikidata.org/entity/Q30'), ('countrylbl', 'United States of America'), ('cult_num', '46')]
[('country', 'http://www.wikidata.org/entity/Q801'), ('countrylbl', 'Israel'), ('cult_num', '16')]
[('country', 'http://www.wikidata.org/entity/Q858'), ('countrylbl', 'Syria'), ('cult_num', '15')]
[('country', 'http://www.wikidata.org/entity/Q43'), ('countrylbl', 'Turkey'), ('cult_num', '15')]
[('country', 'http://www.wikidata.org/entity/Q29'), ('countrylbl', 'Spain'), ('cult_num', '14')]
[('country', 'http://www.wikidata.org/entity/Q183'), ('countrylbl', 'Germany'), ('cult_num', '14')]
[('country', 'http://www.wikidata.org/entity/Q41'), ('countrylbl', 'Greece'), ('cult_num', '14')]
[('country', 'http://www.wikidata.org/entity/Q796'), ('countrylbl', 'Iraq'), ('cult_num', '14')]
[('country', 'http://www.wikidata.org/entity/Q419'), ('countrylbl', 'Peru'), ('cult_num', '1

98

In [50]:
# get interesting information about a country (bad query)

queryString = """
SELECT ?country ?countrylbl ?continentlbl (COUNT(DISTINCT ?site) AS ?site_num) (COUNT(DISTINCT ?culture) AS ?cult_num) (COUNT(DISTINCT ?type) AS ?type_num)
WHERE { 

    ?site wdt:P31 wd:Q839954 ;
          wdt:P2596 ?culture ;
          wdt:P31 ?type ;
          wdt:P17 ?country .
          
    ?country wdt:P30 ?continent .

    ?culture <http://schema.org/name> ?culturelbl .
    ?country <http://schema.org/name> ?countrylbl .
    ?continent <http://schema.org/name> ?continentlbl .

}
GROUP BY ?country ?countrylbl ?continentlbl
ORDER BY DESC (?site_num)
"""

print("Results")
run_query(queryString)

Results
[('country', 'http://www.wikidata.org/entity/Q38'), ('countrylbl', 'Italy'), ('continentlbl', 'Europe'), ('site_num', '667'), ('cult_num', '61'), ('type_num', '120')]
[('country', 'http://www.wikidata.org/entity/Q41'), ('countrylbl', 'Greece'), ('continentlbl', 'Europe'), ('site_num', '156'), ('cult_num', '14'), ('type_num', '66')]
[('country', 'http://www.wikidata.org/entity/Q79'), ('countrylbl', 'Egypt'), ('continentlbl', 'Asia'), ('site_num', '135'), ('cult_num', '9'), ('type_num', '34')]
[('country', 'http://www.wikidata.org/entity/Q79'), ('countrylbl', 'Egypt'), ('continentlbl', 'Africa'), ('site_num', '135'), ('cult_num', '9'), ('type_num', '34')]
[('country', 'http://www.wikidata.org/entity/Q30'), ('countrylbl', 'United States of America'), ('continentlbl', 'Insular Oceania'), ('site_num', '100'), ('cult_num', '46'), ('type_num', '40')]
[('country', 'http://www.wikidata.org/entity/Q30'), ('countrylbl', 'United States of America'), ('continentlbl', 'North America'), ('sit

117
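
Two things make this query unsatisfactory, judging from the output: countries linked to more than one continent (Egypt, United States of America) appear once per continent, and the counts only cover sites that have a culture (667 sites for Italy here, against 1869 counted earlier). The duplicate rows can be merged on the Python side, as in the sketch below; `merge_continents` is a hypothetical helper and the sample rows are copied from the output above.

```python
def merge_continents(rows):
    """Collapse rows that differ only by continent into one row per country."""
    merged = {}
    for row in rows:
        entry = merged.get(row["country"])
        if entry is None:
            # Copy everything except the continent label, then start a list.
            entry = {k: v for k, v in row.items() if k != "continentlbl"}
            entry["continents"] = []
            merged[row["country"]] = entry
        if row["continentlbl"] not in entry["continents"]:
            entry["continents"].append(row["continentlbl"])
    return list(merged.values())


rows = [
    {"country": "http://www.wikidata.org/entity/Q38", "countrylbl": "Italy",
     "continentlbl": "Europe", "site_num": "667"},
    {"country": "http://www.wikidata.org/entity/Q41", "countrylbl": "Greece",
     "continentlbl": "Europe", "site_num": "156"},
    {"country": "http://www.wikidata.org/entity/Q79", "countrylbl": "Egypt",
     "continentlbl": "Asia", "site_num": "135"},
    {"country": "http://www.wikidata.org/entity/Q79", "countrylbl": "Egypt",
     "continentlbl": "Africa", "site_num": "135"},
]

merged = merge_continents(rows)
for entry in merged:
    print(entry["countrylbl"], entry["continents"], entry["site_num"])
```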

In [71]:
# GOAL (4.4)
# get interesting information about a country
# I would pick the country to visit according to its number of sites, number of cultures, number of site types, and its continent

queryString = """
SELECT ?country ?countrylbl ?continentlbl ?site_num ?cult_num ?type_num
WHERE { 

    {
        SELECT ?country (COUNT(DISTINCT ?culture) AS ?cult_num)
        WHERE {
            ?site wdt:P31 wd:Q839954 ;
                  wdt:P17 ?country ;
                  wdt:P2596 ?culture .
        }
        GROUP BY ?country
    }
    {
        SELECT ?country (COUNT(DISTINCT ?type) AS ?type_num)
        WHERE {
            ?site wdt:P31 wd:Q839954 ;
                  wdt:P17 ?country ;
                  wdt:P31 ?type .
        }
        GROUP BY ?country
    }
    {
        SELECT ?country (COUNT(DISTINCT ?site) AS ?site_num)
        WHERE {
            ?site wdt:P31 wd:Q839954 ;
                  wdt:P17 ?country .
        }
        GROUP BY ?country
    }
    
    ?country wdt:P30 ?continent .

    ?country <http://schema.org/name> ?countrylbl .
    ?continent <http://schema.org/name> ?continentlbl .
}
ORDER BY DESC (?site_num)
LIMIT 40
"""

print("Results")
run_query(queryString)

Results
[('country', 'http://www.wikidata.org/entity/Q145'), ('countrylbl', 'United Kingdom'), ('continentlbl', 'Europe'), ('site_num', '7026'), ('cult_num', '8'), ('type_num', '177')]
[('country', 'http://www.wikidata.org/entity/Q29'), ('countrylbl', 'Spain'), ('continentlbl', 'Europe'), ('site_num', '6168'), ('cult_num', '16'), ('type_num', '184')]
[('country', 'http://www.wikidata.org/entity/Q183'), ('countrylbl', 'Germany'), ('continentlbl', 'Europe'), ('site_num', '2825'), ('cult_num', '14'), ('type_num', '188')]
[('country', 'http://www.wikidata.org/entity/Q30'), ('countrylbl', 'United States of America'), ('continentlbl', 'North America'), ('site_num', '2159'), ('cult_num', '46'), ('type_num', '189')]
[('country', 'http://www.wikidata.org/entity/Q30'), ('countrylbl', 'United States of America'), ('continentlbl', 'Insular Oceania'), ('site_num', '2159'), ('cult_num', '46'), ('type_num', '189')]
[('country', 'http://www.wikidata.org/entity/Q414'), ('countrylbl', 'Argentina'), ('co

40
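
One way to make the "which country would you pick" criteria explicit is a weighted score over the three counts. The sketch below is only an illustration: the equal weights are an arbitrary assumption, `rank_countries` is a hypothetical helper, and the sample uses four of the rows printed above (with the United States of America listed once rather than once per continent).

```python
def rank_countries(rows, weights):
    """Rank countries by a weighted sum of counts, each scaled by its maximum."""
    maxima = {key: max(r[key] for r in rows) or 1 for key in weights}
    total_weight = sum(weights.values())
    scored = []
    for r in rows:
        score = sum(w * r[key] / maxima[key] for key, w in weights.items())
        scored.append((r["countrylbl"], round(score / total_weight, 3)))
    # Highest score first.
    return sorted(scored, key=lambda item: item[1], reverse=True)


rows = [
    {"countrylbl": "United Kingdom", "site_num": 7026, "cult_num": 8, "type_num": 177},
    {"countrylbl": "Spain", "site_num": 6168, "cult_num": 16, "type_num": 184},
    {"countrylbl": "Germany", "site_num": 2825, "cult_num": 14, "type_num": 188},
    {"countrylbl": "United States of America", "site_num": 2159, "cult_num": 46, "type_num": 189},
]

# Hypothetical weights: every criterion counts the same.
weights = {"site_num": 1.0, "cult_num": 1.0, "type_num": 1.0}

ranking = rank_countries(rows, weights)
for name, score in ranking:
    print(name, score)
```

Changing the weights, for example favouring `site_num`, shifts the ranking, which is why the chosen criteria should be stated alongside the answer.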