# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload starts with a usually vague, open-ended question, and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```

is the BGP returning a human-readable name of a property or a class in Wikidata.

In [1]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-830baa5607-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString,verbose = True):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        results = sparql.query()
        json_results = results.convert()
        if len(json_results['results']['bindings'])==0:
            print("Empty")
            return []
        array = []
        for bindings in json_results['results']['bindings']:
            app =  [ (var, value['value'])  for var, value in bindings.items() ] 
            if verbose:
                print( app)
            array.append(app)
        if verbose:
            print(len(array))
        return array

    except Exception as e :
        print("The operation failed", e)
        # return an empty list so callers always get a list back
        return []

# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
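
`run_query` returns each row as a list of `(variable, value)` tuples, which is verbose to read when a result set is long. The following is a minimal sketch of two post-processing helpers; `rows_to_dicts` and `short_iri` are hypothetical names that are not part of the original setup, and the sample rows are copied from the output format shown in this notebook.

```python
def rows_to_dicts(rows):
    """Turn run_query output (a list of lists of (var, value) tuples) into dicts."""
    return [dict(row) for row in rows or []]


def short_iri(iri):
    """Abbreviate Wikidata IRIs with the wd:/wdt: prefixes declared in the setup."""
    prefixes = {
        "http://www.wikidata.org/entity/": "wd:",
        "http://www.wikidata.org/prop/direct/": "wdt:",
    }
    for base, prefix in prefixes.items():
        if iri.startswith(base):
            return prefix + iri[len(base):]
    # leave IRIs from other namespaces untouched
    return iri


# Sample rows in the same shape that run_query prints
sample = [
    [("p", "http://www.wikidata.org/prop/direct/P106"), ("name", "occupation")],
    [("p", "http://www.wikidata.org/prop/direct/P1344"), ("name", "participant in")],
]

for record in rows_to_dicts(sample):
    print(short_iri(record["p"]), "-", record["name"])
```

Inside the notebook the helpers could be applied to any result, for example `rows_to_dicts(run_query(queryString, verbose=False))`.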

# Sport Workflow Series ("Running explorative search") 

Consider the following exploratory information need:

> Investigate association football players and find the main BGPs related to this sport. Compare the awards and the competition participation of the players.

## Useful URIs for the current workflow
The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P106`    | profession    | predicate | 
| `wdt:P279`    | subclass      | predicate |
| `wdt:P27`     | nationality   | predicate |
| `wd:Q5`       | Human         | node      |
| `wd:Q9124`    | Eliud Kipchoge| node      |
| `wd:Q853003` | athletics at the 2008 Summer Olympics – men's 200 metres | node |





Also consider

```
wd:Q9124 ?p ?obj .
```

is the BGP to retrieve all **properties of Eliud Kipchoge**.


Please note that when you return a resource, you should return both its IRI and its label. In particular, when the task requires you to identify a BGP, the result set must always be a list of IRI-label pairs.
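
As a minimal sketch of how the two BGPs combine into a query that returns IRI and label together, the hypothetical helper below (`properties_with_labels_query` is not part of the notebook) only builds the query text. It assumes the `wd:` and `sc:` prefixes that `run_query` prepends, and it does not contact the endpoint.

```python
def properties_with_labels_query(entity, limit=20):
    """Build a query listing the properties of `entity` together with their labels.

    `entity` is a prefixed name such as "wd:Q9124".
    """
    return f"""
SELECT DISTINCT ?p ?name
WHERE {{
   # the properties of the entity
   {entity} ?p ?obj .
   # the human-readable label of each property
   ?p sc:name ?name .
}}
LIMIT {int(limit)}
"""


example_query = properties_with_labels_query("wd:Q9124")
print(example_query)
```

In the notebook the generated text could be passed to `run_query(example_query)`.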


The workload should:


1. Identify the BGP for runner.

2. Identify the BGP for long-distance running.

3. Return the disciplines of long-distance running (e.g. marathon) (the result set must be a list of pairs of discipline IRI and label).

4. Consider marathons run in Europe. Return the number of different runners who won a marathon, grouped by their country of citizenship (the result set must be a list of triples of country IRI, label and #different runners).

5. Identify the BGP for Olympic Games.

6. Consider only the Summer Olympic Games of 2004, 2008, 2012 and 2016. Return the number of marathon runners who participated in all these editions, grouped by their country of citizenship (if a runner participated in more than one edition of the games, count him/her only once) (the result set must be a list of country IRI, label and #marathon runners).

7. For each discipline of long-distance running, find who holds the world record (the result set must be a list of elements with discipline IRI and label, and athlete IRI and label).

## Task 1
Identify the BGP for runner

In [2]:
# query example
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q9124 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1000'), ('name', 'record held')]
[('p', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P1146'), ('name', 'World Athletics athlete ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1344'), ('name', 'participant in')]
[('p', 'http://www.wikidata.org/prop/direct/P1412'), ('name', 'languages spoken, written or signed')]
[('p', 'http://www.wikidata.org/prop/direct/P1447'), ('name', 'Sports-Reference.com Olympic athlete ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1532'), ('name', 'country for sport')]
[('p', 'http://www.wikidata.org/prop/direct/P1559'), ('name', 'name in native language')]
[('p', 'http://www.wikidata.org/prop/direct/P172'), ('name', 'ethnic group')]
[('p', 'http://www.wikidata.org/prop/direct/P1741'), ('name', 'GTAA ID')]
[('p', 'http://www.wikidata.org/prop/direct/P18'), ('name', 'image')]
[('p', 'http://www.wikidata.org/prop/direct/P19'), ('name', 

In [3]:
# aim: get all the properties of 'Eliud Kipchoge'
# result: we got all the resource properties. Now we can investigate both the properties 'sport' and 'occupation'
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q9124 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1000'), ('name', 'record held')]
[('p', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P1146'), ('name', 'World Athletics athlete ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1344'), ('name', 'participant in')]
[('p', 'http://www.wikidata.org/prop/direct/P1412'), ('name', 'languages spoken, written or signed')]
[('p', 'http://www.wikidata.org/prop/direct/P1447'), ('name', 'Sports-Reference.com Olympic athlete ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1532'), ('name', 'country for sport')]
[('p', 'http://www.wikidata.org/prop/direct/P1559'), ('name', 'name in native language')]
[('p', 'http://www.wikidata.org/prop/direct/P172'), ('name', 'ethnic group')]
[('p', 'http://www.wikidata.org/prop/direct/P1741'), ('name', 'GTAA ID')]
[('p', 'http://www.wikidata.org/prop/direct/P18'), ('name', 'image')]
[('p', 'http://www.wikidata.org/prop/direct/P19'), ('name', 

In [4]:
# aim: check the objects related to the property 'sport'
# result: we found the sport he practices: 'athletics'. He may be a runner, since running is part of athletics
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q9124 wdt:P641 ?obj .
   # get the label
   ?obj sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q542'), ('name', 'athletics')]
1


In [5]:
# aim: check the objects related to the property 'occupation'
# result: we found that he was a 'marathon runner'. We hope that the property 'subclass of' exists so that it might relate it to 'runner'
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q9124 wdt:P106 ?obj .
   # get the label
   ?obj sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q13382460'), ('name', 'marathon runner')]
[('obj', 'http://www.wikidata.org/entity/Q11513337'), ('name', 'athletics competitor')]
[('obj', 'http://www.wikidata.org/entity/Q4439155'), ('name', 'long-distance runner')]
3


In [6]:
# aim: check the properties related to the object 'marathon runner'
# result: we found the property 'subclass of'. We need to explore it.
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q13382460 ?p ?obj .
   # get the label
   ?p sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P373'), ('name', 'Commons category')]
[('p', 'http://www.wikidata.org/prop/direct/P425'), ('name', 'field of this occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P910'), ('name', "topic's main category")]
5


In [8]:
# aim: check the objects related to 'subclass of' of 'marathon runner'
# result: we found that it's a subclass of 'long-distance runner'
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q13382460 wdt:P279 ?obj .
   # get the label
   ?obj sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q4439155'), ('name', 'long-distance runner')]
1


Final query for this task

In [9]:
# aim: directly check the objects of a possible 'subclass of' property of 'long-distance runner'
# result: we found the 'runner' object!
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q4439155 wdt:P279 ?obj .
   # get the label
   ?obj sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q12803959'), ('name', 'runner')]
1
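
The three queries above follow the 'subclass of' chain one hop at a time (marathon runner, long-distance runner, runner). As a hedged alternative, the sketch below builds a single query with a SPARQL 1.1 property path (`wdt:P279+`, one or more hops). It assumes the endpoint supports property paths and it has not been run here; `superclass_chain_query` is a hypothetical helper name.

```python
def superclass_chain_query(entity, limit=100):
    """Build a query returning every class reachable via one or more 'subclass of' hops."""
    return f"""
SELECT DISTINCT ?obj ?name
WHERE {{
   # one or more 'subclass of' (wdt:P279) hops starting from the given class
   {entity} wdt:P279+ ?obj .
   # get the label
   ?obj sc:name ?name .
}}
LIMIT {int(limit)}
"""


chain_query = superclass_chain_query("wd:Q13382460")  # 'marathon runner'
print(chain_query)
```

Unlike the step-by-step exploration, this form would also return any classes above 'runner', so the result still has to be inspected by hand.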


## Task 2
Identify the BGP for long distance running

In [4]:
# aim: get all the properties of 'runner'
# result: all the properties of 'runner'
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q12803959 ?p ?obj .
   # get the label
   ?p sc:name ?name .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1014'), ('name', 'Art & Architecture Thesaurus ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1036'), ('name', 'Dewey Decimal Classification')]
[('p', 'http://www.wikidata.org/prop/direct/P1296'), ('name', 'Gran Enciclopèdia Catalana ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from')]
[('p', 'http://www.wikidata.org/prop/direct/P227'), ('name', 'GND ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2347'), ('name', 'YSO ID')]
[('p', 'http://www.wikidata.org/prop/direct/P244'), ('name', 'Library of Congress authority ID')]
[('p', 'http://www.wikidata.org/prop/direct/P268'), ('name', 'Bibliothèque nationale de France ID')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P373'), ('name', 'Commons category')]
[('p', 'http://www.wikidata.org/prop/direct

In [11]:
# aim: check if 'long-distance running' is a subclass of 'athletics'
# result: we found 'running'. Maybe 'long-distance running' is a subclass of 'running'
queryString = """
SELECT DISTINCT ?s ?name
WHERE {
   # bind something
   ?s wdt:P279 wd:Q542 .
   # get the label
   ?s sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q3312129'), ('name', 'track and field')]
[('s', 'http://www.wikidata.org/entity/Q1757181'), ('name', 'para athletics')]
[('s', 'http://www.wikidata.org/entity/Q56199072'), ('name', 'sprint')]
[('s', 'http://www.wikidata.org/entity/Q10235779'), ('name', 'indoor athletics')]
[('s', 'http://www.wikidata.org/entity/Q202391'), ('name', 'racewalking')]
[('s', 'http://www.wikidata.org/entity/Q2093999'), ('name', 'trail running')]
[('s', 'http://www.wikidata.org/entity/Q1934817'), ('name', 'road running')]
[('s', 'http://www.wikidata.org/entity/Q3694569'), ('name', 'running')]
[('s', 'http://www.wikidata.org/entity/Q20900741'), ('name', "women's athletics")]
[('s', 'http://www.wikidata.org/entity/Q99463481'), ('name', "men's athletics")]
[('s', 'http://www.wikidata.org/entity/Q1753702'), ('name', 'wheelchair athletics')]
[('s', 'http://www.wikidata.org/entity/Q1548072'), ('name', 'Parelloop')]
[('s', 'http://www.wikidata.org/entity/Q2997936'), ('n

In [12]:
# aim: check if 'long-distance running' is a subclass of 'running'
# result: we didn't find it. Maybe instead we need to check some data regarding the object 'long-distance runner'
queryString = """
SELECT DISTINCT ?s ?name
WHERE {
   # bind something
   ?s wdt:P279 wd:Q3694569 .
   # get the label
   ?s sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q675770'), ('name', 'stadion')]
[('s', 'http://www.wikidata.org/entity/Q3001473'), ('name', 'Barefoot running')]
[('s', 'http://www.wikidata.org/entity/Q2167372'), ('name', 'mountain running')]
[('s', 'http://www.wikidata.org/entity/Q27685564'), ('name', '20,000 m run')]
4


In [14]:
# aim: check if 'long-distance running' maybe is 'played' by the 'long-distance runner'
# result: we found the property 'practiced by' that could be useful
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   ?s ?p wd:Q4439155 .
   # get the label
   ?p sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from')]
[('p', 'http://www.wikidata.org/prop/direct/P2416'), ('name', 'sports discipline competed in')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P301'), ('name', "category's main topic")]
[('p', 'http://www.wikidata.org/prop/direct/P3095'), ('name', 'practiced by')]
[('p', 'http://www.wikidata.org/prop/direct/P466'), ('name', 'occupant')]
[('p', 'http://www.wikidata.org/prop/direct/P641'), ('name', 'sport')]
[('p', 'http://www.wikidata.org/prop/direct/P971'), ('name', 'category combines topics')]
9


Final query for this task

In [17]:
# aim: check if 'long-distance running' is 'practiced by' the 'long-distance runner'
# result: we found 'long-distance running'!
queryString = """
SELECT DISTINCT ?s ?name
WHERE {
   # bind something
   ?s wdt:P3095 wd:Q4439155 .
   # get the label
   ?s sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q917206'), ('name', 'long-distance running')]
[('s', 'http://www.wikidata.org/entity/Q1934817'), ('name', 'road running')]
2


## Task 3
Return the disciplines of long-distance running (e.g. marathon) (the result set must be a list of pairs of discipline IRI and label).

In [3]:
# aim: get all the properties related to 'long-distance running'
# result: we found some properties that can be useful such as 'part of', 'subclass of', 'sport' and 'sports discipline competed in'
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   ?s ?p wd:Q917206 .
   # get the label
   ?p sc:name ?name .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P156'), ('name', 'followed by')]
[('p', 'http://www.wikidata.org/prop/direct/P2416'), ('name', 'sports discipline competed in')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P301'), ('name', "category's main topic")]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('name', 'part of')]
[('p', 'http://www.wikidata.org/prop/direct/P413'), ('name', 'position played on team / speciality')]
[('p', 'http://www.wikidata.org/prop/direct/P425'), ('name', 'field of this occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P641'), ('name', 'sport')]
[('p', 'http://www.wikidata.org/prop/direct/P921'), ('name', 'main subject')]
11


In [10]:
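# aim: get the subjects linked to 'long-distance running' via 'part of', 'sports discipline competed in', 'subclass of' and 'sport'
# result: 'sports discipline competed in' only returns people's names, so it can be dropped ('sport' returns events such as '2014 Rotterdam Marathon')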
queryString = """
SELECT DISTINCT ?s ?name ?p
WHERE {
   # bind something
   ?s ?p wd:Q917206 .
   # get the label
   ?s sc:name ?name .
   FILTER (?p IN(wdt:P361,wdt:P2416,wdt:P279,wdt:P641))
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q26844379'), ('name', 'one-hour run'), ('p', 'http://www.wikidata.org/prop/direct/P279')]
[('s', 'http://www.wikidata.org/entity/Q2815830'), ('name', '2 miles run'), ('p', 'http://www.wikidata.org/prop/direct/P279')]
[('s', 'http://www.wikidata.org/entity/Q215677'), ('name', 'half marathon'), ('p', 'http://www.wikidata.org/prop/direct/P279')]
[('s', 'http://www.wikidata.org/entity/Q310447'), ('name', 'Ville Ritola'), ('p', 'http://www.wikidata.org/prop/direct/P2416')]
[('s', 'http://www.wikidata.org/entity/Q18655088'), ('name', '2014 Rotterdam Marathon'), ('p', 'http://www.wikidata.org/prop/direct/P641')]
[('s', 'http://www.wikidata.org/entity/Q14912256'), ('name', 'Günter Wöller'), ('p', 'http://www.wikidata.org/prop/direct/P2416')]
[('s', 'http://www.wikidata.org/entity/Q163892'), ('name', '10,000 metres'), ('p', 'http://www.wikidata.org/prop/direct/P279')]
[('s', 'http://www.wikidata.org/entity/Q240500'), ('name', '5000 metres'), ('p', 

Final query for this task

In [8]:
# aim: get all the properties of 'part of' and 'subclass of'
# result: the disciplines for 'long-distance running'. We notice that '5k run' and '5000 metres' could be the same discipline (they relates to 'long-distance running' through different
#         properties). The same holds for '10,000 metres' and '10K run'
queryString = """
SELECT DISTINCT ?s ?name ?p
WHERE {
   # bind something
   ?s ?p wd:Q917206 .
   # get the label
   ?s sc:name ?name .
   FILTER (?p IN(wdt:P361,wdt:P279))
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q26844379'), ('name', 'one-hour run'), ('p', 'http://www.wikidata.org/prop/direct/P279')]
[('s', 'http://www.wikidata.org/entity/Q2815830'), ('name', '2 miles run'), ('p', 'http://www.wikidata.org/prop/direct/P279')]
[('s', 'http://www.wikidata.org/entity/Q215677'), ('name', 'half marathon'), ('p', 'http://www.wikidata.org/prop/direct/P279')]
[('s', 'http://www.wikidata.org/entity/Q163892'), ('name', '10,000 metres'), ('p', 'http://www.wikidata.org/prop/direct/P279')]
[('s', 'http://www.wikidata.org/entity/Q240500'), ('name', '5000 metres'), ('p', 'http://www.wikidata.org/prop/direct/P279')]
[('s', 'http://www.wikidata.org/entity/Q40244'), ('name', 'marathon'), ('p', 'http://www.wikidata.org/prop/direct/P279')]
[('s', 'http://www.wikidata.org/entity/Q2774730'), ('name', '10K run'), ('p', 'http://www.wikidata.org/prop/direct/P279')]
[('s', 'http://www.wikidata.org/entity/Q500050'), ('name', 'cross country running'), ('p', 'http://www.wikida
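
The task asks for pairs of discipline IRI and label, while the query above also projects `?p`, so a discipline reached through both properties would appear twice. A minimal sketch of a post-processing step follows; `iri_label_pairs` is a hypothetical helper, the first two sample rows are taken from the output above, and the third row is an invented duplicate used only to show the deduplication.

```python
def iri_label_pairs(rows, iri_var="s", label_var="name"):
    """Project run_query rows onto unique (IRI, label) pairs, keeping first-seen order."""
    seen = set()
    pairs = []
    for row in rows:
        record = dict(row)
        pair = (record[iri_var], record[label_var])
        if pair not in seen:
            seen.add(pair)
            pairs.append(pair)
    return pairs


sample = [
    [("s", "http://www.wikidata.org/entity/Q40244"), ("name", "marathon"),
     ("p", "http://www.wikidata.org/prop/direct/P279")],
    [("s", "http://www.wikidata.org/entity/Q215677"), ("name", "half marathon"),
     ("p", "http://www.wikidata.org/prop/direct/P279")],
    # invented duplicate: the same discipline reached through 'part of'
    [("s", "http://www.wikidata.org/entity/Q40244"), ("name", "marathon"),
     ("p", "http://www.wikidata.org/prop/direct/P361")],
]

pairs = iri_label_pairs(sample)
for iri, label in pairs:
    print(iri, label)
```

The same effect could be obtained in SPARQL by selecting only `?s ?name` with `DISTINCT`; the Python version keeps the original query and its output untouched.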

## Task 4
Consider marathons run in Europe. Return the number of different runners who won a marathon, grouped by their country of citizenship (the result set must be a list of triples of country IRI, label and #different runners).

In [11]:
# aim: get all the properties related to 'marathon' to find an 'instance of' or similar property
# result: we found the property 'instance of'
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   ?s ?p wd:Q40244 .
   # get the label
   ?p sc:name ?name .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P101'), ('name', 'field of work')]
[('p', 'http://www.wikidata.org/prop/direct/P1269'), ('name', 'facet of')]
[('p', 'http://www.wikidata.org/prop/direct/P138'), ('name', 'named after')]
[('p', 'http://www.wikidata.org/prop/direct/P155'), ('name', 'follows')]
[('p', 'http://www.wikidata.org/prop/direct/P2094'), ('name', 'competition class')]
[('p', 'http://www.wikidata.org/prop/direct/P2416'), ('name', 'sports discipline competed in')]
[('p', 'http://www.wikidata.org/prop/direct/P2522'), ('name', 'victory')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P301'), ('name', "category's main topic")]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P360'), ('name', 'is a list of')]
[('p', 'http://www.wikidata.org/prop/direct/P413'), ('name', 'position played on team / speciality')]
[('p', 'http:/

In [13]:
# aim: get all the objects that are 'instance of' the class 'marathon'
# result: we got the marathons recorded as instances of 'marathon'. Now we need to find the ones held in Europe
queryString = """
SELECT DISTINCT ?s ?name ?p
WHERE {
   # bind something
   ?s wdt:P31 wd:Q40244 .
   # get the label
   ?s sc:name ?name .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q17021963'), ('name', 'Niagara Falls International Marathon')]
[('s', 'http://www.wikidata.org/entity/Q17051087'), ('name', 'Kaohsiung International Marathon')]
[('s', 'http://www.wikidata.org/entity/Q4494720'), ('name', "1966 European Athletics Championships – men's marathon")]
[('s', 'http://www.wikidata.org/entity/Q22073467'), ('name', 'Morgantown Marathon')]
[('s', 'http://www.wikidata.org/entity/Q448958'), ('name', 'Hamburg Marathon')]
[('s', 'http://www.wikidata.org/entity/Q596830'), ('name', 'Lake Biwa Marathon')]
[('s', 'http://www.wikidata.org/entity/Q818638'), ('name', 'Beppu-Ōita Marathon')]
[('s', 'http://www.wikidata.org/entity/Q1341108'), ('name', 'Rotterdam Marathon')]
[('s', 'http://www.wikidata.org/entity/Q1473381'), ('name', 'Fukuoka International Marathon')]
[('s', 'http://www.wikidata.org/entity/Q5091447'), ('name', 'Chennai Marathon')]
[('s', 'http://www.wikidata.org/entity/Q1256376'), ('name', 'Milano City Marathon')]

In [14]:
# aim: get all the properties of 'Milano City Marathon'
# result: we found the property 'country'
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q1256376 ?p ?o.
   # get the label
   ?p sc:name ?name .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P17'), ('name', 'country')]
[('p', 'http://www.wikidata.org/prop/direct/P18'), ('name', 'image')]
[('p', 'http://www.wikidata.org/prop/direct/P2257'), ('name', 'event interval')]
[('p', 'http://www.wikidata.org/prop/direct/P2397'), ('name', 'YouTube channel ID')]
[('p', 'http://www.wikidata.org/prop/direct/P276'), ('name', 'location')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P373'), ('name', 'Commons category')]
[('p', 'http://www.wikidata.org/prop/direct/P4794'), ('name', 'season starts')]
[('p', 'http://www.wikidata.org/prop/direct/P571'), ('name', 'inception')]
[('p', 'http://www.wikidata.org/prop/direct/P641'), ('name', 'sport')]
[('p', 'http://www.wikidata.org/prop/direct/P646'), ('name', 'Freebase ID')]
[('p', 'http://www.wikidata.org/prop/direct/P856'), ('name', 'official website')]
[('p', 'http://www.wikidata.org/prop/direct/P910'), ('name', "topic

In [15]:
# aim: get the 'country' where the 'Milano City Marathon' is held
# result: we found 'Italy' as expected
queryString = """
SELECT DISTINCT ?o ?name
WHERE {
   # bind something
   wd:Q1256376 wdt:P17 ?o .
   # get the label
   ?o sc:name ?name .
}
LIMIT 5
"""

print("Results")
x=run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q38'), ('name', 'Italy')]
1


In [16]:
# aim: get all the properties of 'Italy'
# result: we found the 'continent' P30 property
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q38 ?p ?o .
   # get the label
   ?p sc:name ?name .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1005'), ('name', 'Portuguese National Library ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1036'), ('name', 'Dewey Decimal Classification')]
[('p', 'http://www.wikidata.org/prop/direct/P1081'), ('name', 'Human Development Index')]
[('p', 'http://www.wikidata.org/prop/direct/P1082'), ('name', 'population')]
[('p', 'http://www.wikidata.org/prop/direct/P1120'), ('name', 'number of deaths')]
[('p', 'http://www.wikidata.org/prop/direct/P1151'), ('name', "topic's main Wikimedia portal")]
[('p', 'http://www.wikidata.org/prop/direct/P1198'), ('name', 'unemployment rate')]
[('p', 'http://www.wikidata.org/prop/direct/P122'), ('name', 'basic form of government')]
[('p', 'http://www.wikidata.org/prop/direct/P1225'), ('name', 'U.S. National Archives Identifier')]
[('p', 'http://www.wikidata.org/prop/direct/P1245'), ('name', 'OmegaWiki Defined Meaning')]
[('p', 'http://www.wikidata.org/prop/direct/P1296'), ('name', 'Gran Enciclopèdia Catalan

In [17]:
# aim: get the 'continent' (P30) of 'Italy'
# result: we found 'Europe'
queryString = """
SELECT DISTINCT ?o ?name
WHERE {
   # bind something
   wd:Q38 wdt:P30 ?o .
   # get the label
   ?o sc:name ?name .
}
LIMIT 5
"""

print("Results")
x=run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q46'), ('name', 'Europe')]
1
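
The setup cell also defines `run_ask_query`, which suits yes/no checks such as "is Italy in Europe?". The following is a minimal sketch, assuming the endpoint answers ASK queries in the standard SPARQL JSON form with a top-level `boolean` field; `ask_to_bool` is a hypothetical helper that is not part of the notebook.

```python
def ask_to_bool(json_result):
    """Extract the answer of an ASK query from its SPARQL JSON result.

    Returns None when the call failed (run_ask_query returns None on errors).
    """
    if not json_result:
        return None
    return bool(json_result.get("boolean"))


# ASK whether Italy (wd:Q38) has continent (wdt:P30) Europe (wd:Q46)
ask_query = """
ASK {
   wd:Q38 wdt:P30 wd:Q46 .
}
"""

# Inside the notebook: ask_to_bool(run_ask_query(ask_query))
# Offline illustration with a hand-written result in the standard shape:
print(ask_to_bool({"head": {}, "boolean": True}))
```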


In [18]:
# aim: get the marathons run in 'Europe'
# result: we found marathons run in 'Europe'
queryString = """
SELECT DISTINCT ?europeanmarathon ?name
WHERE {
   # anything directly linked to 'marathon'
   ?europeanmarathon ?p wd:Q40244 .
   # keep only those held in a European country
   ?europeanmarathon wdt:P17 ?country .
   ?country wdt:P30 wd:Q46 .
   # get the label
   ?europeanmarathon sc:name ?name
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('europeanmarathon', 'http://www.wikidata.org/entity/Q161222'), ('name', 'Berlin Marathon')]
[('europeanmarathon', 'http://www.wikidata.org/entity/Q528634'), ('name', 'Frankfurt Marathon')]
[('europeanmarathon', 'http://www.wikidata.org/entity/Q1071285'), ('name', 'Paris Marathon')]
[('europeanmarathon', 'http://www.wikidata.org/entity/Q679106'), ('name', 'Madrid Marathon')]
[('europeanmarathon', 'http://www.wikidata.org/entity/Q807910'), ('name', 'Barcelona Marathon')]
[('europeanmarathon', 'http://www.wikidata.org/entity/Q19946825'), ('name', 'Chișinău Marathon')]
[('europeanmarathon', 'http://www.wikidata.org/entity/Q22080291'), ('name', 'Almaty Marathon')]
[('europeanmarathon', 'http://www.wikidata.org/entity/Q24993287'), ('name', 'Stirling Scottish Marathon')]
[('europeanmarathon', 'http://www.wikidata.org/entity/Q6861249'), ('name', 'Milton Keynes Marathon')]
[('europeanmarathon', 'http://www.wikidata.org/entity/Q20894807'), ('name', '2015 Berlin Marathon')]
[('europeanm

In [26]:
# aim: get all the properties related to '2006 Berlin Marathon'
# result: we didn't find any interesting property, but we noticed that '2006 Berlin Marathon' may be related to 'Berlin Marathon' via 'instance of'
queryString = """
SELECT DISTINCT ?p1 ?name1 ?p2 ?name2
WHERE {
   # bind something
   ?s ?p1 wd:Q13813466 .
   wd:Q13813466 ?p2 ?o.
   # get the label
   ?p1 sc:name ?name1 .
   ?p2 sc:name ?name2 .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p1', 'http://www.wikidata.org/prop/direct/P155'), ('name1', 'follows'), ('p2', 'http://www.wikidata.org/prop/direct/P155'), ('name2', 'follows')]
[('p1', 'http://www.wikidata.org/prop/direct/P156'), ('name1', 'followed by'), ('p2', 'http://www.wikidata.org/prop/direct/P155'), ('name2', 'follows')]
[('p1', 'http://www.wikidata.org/prop/direct/P156'), ('name1', 'followed by'), ('p2', 'http://www.wikidata.org/prop/direct/P156'), ('name2', 'followed by')]
[('p1', 'http://www.wikidata.org/prop/direct/P155'), ('name1', 'follows'), ('p2', 'http://www.wikidata.org/prop/direct/P156'), ('name2', 'followed by')]
[('p1', 'http://www.wikidata.org/prop/direct/P155'), ('name1', 'follows'), ('p2', 'http://www.wikidata.org/prop/direct/P17'), ('name2', 'country')]
[('p1', 'http://www.wikidata.org/prop/direct/P156'), ('name1', 'followed by'), ('p2', 'http://www.wikidata.org/prop/direct/P17'), ('name2', 'country')]
[('p1', 'http://www.wikidata.org/prop/direct/P156'), ('name1', 'followed by'), (
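
The query above joins two independent patterns, so every incoming property is paired with every outgoing one, which is why 'follows' and 'followed by' repeat across the rows. As a hedged alternative, the sketch below builds a query that lists each property once and tags its direction with `UNION` and `BIND`. It has not been run against the endpoint, and `property_overview_query` is a hypothetical helper name.

```python
def property_overview_query(entity, limit=200):
    """Build a query listing incoming and outgoing properties of `entity` separately."""
    return f"""
SELECT DISTINCT ?direction ?p ?name
WHERE {{
   {{
      # properties pointing to the entity
      ?s ?p {entity} .
      BIND("incoming" AS ?direction)
   }}
   UNION
   {{
      # properties starting from the entity
      {entity} ?p ?o .
      BIND("outgoing" AS ?direction)
   }}
   # get the label
   ?p sc:name ?name .
}}
LIMIT {int(limit)}
"""


overview_query = property_overview_query("wd:Q13813466")  # '2006 Berlin Marathon'
print(overview_query)
```

With this shape the number of rows grows with the sum of incoming and outgoing properties rather than with their product.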

In [42]:
# aim: get all the properties related to 'Berlin Marathon'
# result: we found the property 'participant in'
queryString = """
SELECT DISTINCT ?p1 ?name1 ?p2 ?name2
WHERE {
   # bind something
   ?s ?p1 wd:Q161222 .
   wd:Q161222 ?p2 ?o.
   # get the label
   ?p1 sc:name ?name1 .
   ?p2 sc:name ?name2 .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p1', 'http://www.wikidata.org/prop/direct/P1344'), ('name1', 'participant in'), ('p2', 'http://www.wikidata.org/prop/direct/P131'), ('name2', 'located in the administrative territorial entity')]
[('p1', 'http://www.wikidata.org/prop/direct/P301'), ('name1', "category's main topic"), ('p2', 'http://www.wikidata.org/prop/direct/P131'), ('name2', 'located in the administrative territorial entity')]
[('p1', 'http://www.wikidata.org/prop/direct/P31'), ('name1', 'instance of'), ('p2', 'http://www.wikidata.org/prop/direct/P131'), ('name2', 'located in the administrative territorial entity')]
[('p1', 'http://www.wikidata.org/prop/direct/P3450'), ('name1', 'sports season of league or competition'), ('p2', 'http://www.wikidata.org/prop/direct/P131'), ('name2', 'located in the administrative territorial entity')]
[('p1', 'http://www.wikidata.org/prop/direct/P527'), ('name1', 'has part'), ('p2', 'http://www.wikidata.org/prop/direct/P131'), ('name2', 'located in the administrative territ

In [41]:
# aim: explore the property 'victory' of 'marathon' just to see what it is related to
# result: we found a person but we don't know which marathon he or she participated in and if he or she won that marathon
queryString = """
SELECT DISTINCT ?s ?name ?p
WHERE {
   # bind something
   ?s wdt:P2522 wd:Q40244 .
   # get the label
   ?s sc:name ?name .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q5930934'), ('name', 'Jhon Cusi Huaman')]
1


In [28]:
# aim: get all the 'participant in' in the 'Berlin Marathon'
# result: we found an athlete participating in the 'Berlin Marathon'
queryString = """
SELECT DISTINCT ?s ?name
WHERE {
   # bind something
   ?s wdt:P1344 wd:Q161222 .
   # get the label
   ?s sc:name ?name .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q1705'), ('name', 'Wilson Kipsang Kiprotich')]
1


In [31]:
# aim: get all the properties of 'Wilson Kipsang Kiprotich'
# result: we didn't find any interesting properties apart from the 'country of citizenship' that will be used to group by country
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q1705 ?p ?o .
   # get the label
   ?p sc:name ?name .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1000'), ('name', 'record held')]
[('p', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P1146'), ('name', 'World Athletics athlete ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1344'), ('name', 'participant in')]
[('p', 'http://www.wikidata.org/prop/direct/P1412'), ('name', 'languages spoken, written or signed')]
[('p', 'http://www.wikidata.org/prop/direct/P1447'), ('name', 'Sports-Reference.com Olympic athlete ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1741'), ('name', 'GTAA ID')]
[('p', 'http://www.wikidata.org/prop/direct/P18'), ('name', 'image')]
[('p', 'http://www.wikidata.org/prop/direct/P19'), ('name', 'place of birth')]
[('p', 'http://www.wikidata.org/prop/direct/P2048'), ('name', 'height')]
[('p', 'http://www.wikidata.org/prop/direct/P2067'), ('name', 'mass')]
[('p', 'http://www.wikidata.org/prop/direct/P21'), ('name', 'sex or gender')]
[('p', 'htt

In [34]:
# aim: get all the properties that have 'Wilson Kipsang Kiprotich' as object
# result: we find the property 'winner'! Let's check the subject of that property
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   ?s ?p wd:Q1705 .
   # get the label
   ?p sc:name ?name .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1346'), ('name', 'winner')]
1


In [35]:
# aim: get the subject of the property 'winner' related to the object 'Wilson Kipsang Kiprotich'
# result: we found the marathons the athlete won! We need to use the 'winner' property to get the marathon winners
queryString = """
SELECT DISTINCT ?s ?name
WHERE {
   # bind something
   ?s wdt:P1346 wd:Q1705 .
   # get the label
   ?s sc:name ?name .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q28802674'), ('name', '2017 Tokyo Marathon')]
[('s', 'http://www.wikidata.org/entity/Q17634742'), ('name', '2014 New York City Marathon')]
2
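
The rows returned by `run_query` are lists of `(variable, value)` pairs, which are awkward to index by name. A minimal sketch of a conversion helper follows; `rows_to_dicts` is a hypothetical name that does not appear in the notebook, and the sample rows are copied from the output above.

```python
def rows_to_dicts(rows):
    """Turn run_query-style rows (lists of (variable, value) pairs) into dicts."""
    return [dict(row) for row in rows]


# Sample rows copied from the output of the previous cell.
sample = [
    [("s", "http://www.wikidata.org/entity/Q28802674"), ("name", "2017 Tokyo Marathon")],
    [("s", "http://www.wikidata.org/entity/Q17634742"), ("name", "2014 New York City Marathon")],
]

marathons = rows_to_dicts(sample)
for marathon in marathons:
    # The Q-identifier is the last path segment of the entity IRI.
    qid = marathon["s"].rsplit("/", 1)[-1]
    print(qid, marathon["name"])
```

In the notebook the same helper could be applied directly to `x`, the value returned by `run_query`.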


In [38]:
# aim: get all the European athletes who won at least 1 European marathon ***ASK IF IT HAS TO BE JUST ONE OR CAN BE AT LEAST ONE WON MARATHON ***
# result: we found European marathons with their corresponding winners
queryString = """
SELECT DISTINCT ?winner ?winnername ?europeanmarathon
WHERE {
   ?europeanmarathon ?p wd:Q40244 .
   
   ?europeanmarathon wdt:P17 ?countrymarathon .
   ?countrymarathon wdt:P30 wd:Q46 .
   ?europeanmarathon sc:name ?name .
   
   ?europeanmarathon wdt:P1346 ?winner .
   ?winner sc:name ?winnername .
   
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('winner', 'http://www.wikidata.org/entity/Q3659293'), ('winnername', 'Carlo Durante'), ('europeanmarathon', 'http://www.wikidata.org/entity/Q4814257')]
[('winner', 'http://www.wikidata.org/entity/Q2340625'), ('winnername', 'Koen Naert'), ('europeanmarathon', 'http://www.wikidata.org/entity/Q55780710')]
[('winner', 'http://www.wikidata.org/entity/Q275831'), ('winnername', 'Christelle Daunay'), ('europeanmarathon', 'http://www.wikidata.org/entity/Q17496398')]
[('winner', 'http://www.wikidata.org/entity/Q510629'), ('winnername', 'Jemima Sumgong'), ('europeanmarathon', 'http://www.wikidata.org/entity/Q23905992')]
[('winner', 'http://www.wikidata.org/entity/Q688458'), ('winnername', 'Marcel Hug'), ('europeanmarathon', 'http://www.wikidata.org/entity/Q23905992')]
[('winner', 'http://www.wikidata.org/entity/Q2433125'), ('winnername', 'Tilahun Regassa'), ('europeanmarathon', 'http://www.wikidata.org/entity/Q13813340')]
[('winner', 'http://www.wikidata.org/entity/Q9119'), ('winnername

In [40]:
# aim: get all the European athletes who won at least 1 European marathon ***ASK IF IT HAS TO BE JUST ONE OR CAN BE AT LEAST ONE WON MARATHON ***
# result: we found all the winners of European marathons
queryString = """
SELECT DISTINCT ?winner ?winnername
WHERE {
   ?europeanmarathon ?p wd:Q40244 .
   
   ?europeanmarathon wdt:P17 ?countrymarathon .
   ?countrymarathon wdt:P30 wd:Q46 .
   ?europeanmarathon sc:name ?name .
   
   ?europeanmarathon wdt:P1346 ?winner .
   ?winner sc:name ?winnername .
   
}
LIMIT 2000
"""

print("Results")
x=run_query(queryString)

Results
[('winner', 'http://www.wikidata.org/entity/Q3659293'), ('winnername', 'Carlo Durante')]
[('winner', 'http://www.wikidata.org/entity/Q2340625'), ('winnername', 'Koen Naert')]
[('winner', 'http://www.wikidata.org/entity/Q275831'), ('winnername', 'Christelle Daunay')]
[('winner', 'http://www.wikidata.org/entity/Q510629'), ('winnername', 'Jemima Sumgong')]
[('winner', 'http://www.wikidata.org/entity/Q688458'), ('winnername', 'Marcel Hug')]
[('winner', 'http://www.wikidata.org/entity/Q2433125'), ('winnername', 'Tilahun Regassa')]
[('winner', 'http://www.wikidata.org/entity/Q9119'), ('winnername', 'Kenenisa Bekele Beyecha')]
[('winner', 'http://www.wikidata.org/entity/Q2112488'), ('winnername', 'Tatyana McFadden')]
[('winner', 'http://www.wikidata.org/entity/Q16337559'), ('winnername', 'Henrique Crisóstomo')]
[('winner', 'http://www.wikidata.org/entity/Q9124'), ('winnername', 'Eliud Kipchoge')]
[('winner', 'http://www.wikidata.org/entity/Q180891'), ('winnername', 'Sergei Popov')]
[(

Final query for this task

In [106]:
# aim: get the countries and, for each, the number of athletes who won a European marathon ***ASK IF IT HAS TO BE JUST ONE OR CAN BE AT LEAST ONE WON MARATHON ***
# result: the countries and, for each, the number of athletes who won a European marathon
queryString = """
SELECT DISTINCT ?winnercountry ?winnercountryname (COUNT(?winner) AS ?numberofwinners)
WHERE {

   ?winner wdt:P27 ?winnercountry .
   ?winnercountry sc:name ?winnercountryname .
   
   {
       SELECT DISTINCT ?winner ?winnername
        WHERE {
           ?europeanmarathon ?p wd:Q40244 .

           ?europeanmarathon wdt:P17 ?countrymarathon .
           ?countrymarathon wdt:P30 wd:Q46 .
           ?europeanmarathon sc:name ?name .

           ?europeanmarathon wdt:P1346 ?winner .
           ?winner sc:name ?winnername .

        }
   }
   
   
} GROUP BY ?winnercountry ?winnercountryname
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('winnercountry', 'http://www.wikidata.org/entity/Q38'), ('winnercountryname', 'Italy'), ('numberofwinners', '1')]
[('winnercountry', 'http://www.wikidata.org/entity/Q31'), ('winnercountryname', 'Belgium'), ('numberofwinners', '1')]
[('winnercountry', 'http://www.wikidata.org/entity/Q142'), ('winnercountryname', 'France'), ('numberofwinners', '1')]
[('winnercountry', 'http://www.wikidata.org/entity/Q33'), ('winnercountryname', 'Finland'), ('numberofwinners', '3')]
[('winnercountry', 'http://www.wikidata.org/entity/Q114'), ('winnercountryname', 'Kenya'), ('numberofwinners', '11')]
[('winnercountry', 'http://www.wikidata.org/entity/Q45'), ('winnercountryname', 'Portugal'), ('numberofwinners', '1')]
[('winnercountry', 'http://www.wikidata.org/entity/Q30'), ('winnercountryname', 'United States of America'), ('numberofwinners', '1')]
[('winnercountry', 'http://www.wikidata.org/entity/Q145'), ('winnercountryname', 'United Kingdom'), ('numberofwinners', '3')]
[('winnercountry', 'http
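
The counts come back as strings, so sorting them as text would place '3' ahead of '11'. The sketch below converts them to integers before ranking the countries; `rank_countries` is a hypothetical helper and the sample rows are a subset of the output above.

```python
def rank_countries(rows):
    """Sort run_query-style rows by the numeric value of 'numberofwinners', descending."""
    records = [dict(row) for row in rows]
    for record in records:
        # Values arrive as strings from the JSON bindings; convert before sorting.
        record["numberofwinners"] = int(record["numberofwinners"])
    return sorted(records, key=lambda r: r["numberofwinners"], reverse=True)


# Subset of the rows printed by the final query of this task.
sample = [
    [("winnercountry", "http://www.wikidata.org/entity/Q38"),
     ("winnercountryname", "Italy"), ("numberofwinners", "1")],
    [("winnercountry", "http://www.wikidata.org/entity/Q33"),
     ("winnercountryname", "Finland"), ("numberofwinners", "3")],
    [("winnercountry", "http://www.wikidata.org/entity/Q114"),
     ("winnercountryname", "Kenya"), ("numberofwinners", "11")],
]

ranked = rank_countries(sample)
for record in ranked:
    print(record["winnercountryname"], record["numberofwinners"])
```

The same ordering could also be requested from the endpoint with an `ORDER BY DESC(?numberofwinners)` clause after the `GROUP BY`.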

## Task 5
Identify the BGP for Olympic Games

In [44]:
# aim: 'athletics' is for sure an Olympic Discipline. We check its properties
# result: we found the properties 'has part' and 'subclass of' 
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q542 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P2869'), ('name', 'record or record progression')]
[('p', 'http://www.wikidata.org/prop/direct/P1014'), ('name', 'Art & Architecture Thesaurus ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1036'), ('name', 'Dewey Decimal Classification')]
[('p', 'http://www.wikidata.org/prop/direct/P1151'), ('name', "topic's main Wikimedia portal")]
[('p', 'http://www.wikidata.org/prop/direct/P1225'), ('name', 'U.S. National Archives Identifier')]
[('p', 'http://www.wikidata.org/prop/direct/P1245'), ('name', 'OmegaWiki Defined Meaning')]
[('p', 'http://www.wikidata.org/prop/direct/P1282'), ('name', 'OpenStreetMap tag or key')]
[('p', 'http://www.wikidata.org/prop/direct/P1296'), ('name', 'Gran Enciclopèdia Catalana ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('name', 'described by source')]
[('p', 'http://www.wikidata.org/prop/direct/P1417'), ('name', 'Encyclopædia Britannica Online ID')]
[('p', 'http://www.wikidata.org/prop/direct

In [45]:
# aim: check the properties 'has part' and 'subclass of' related to 'athletics'
# result: we found the object 'Olympic sport'
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q542 ?p ?obj .
   # get the label
   ?obj sc:name ?name .
   FILTER (?p IN (wdt:P527,wdt:P279))
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q3312129'), ('name', 'track and field')]
[('obj', 'http://www.wikidata.org/entity/Q3216963'), ('name', 'throwing event')]
[('obj', 'http://www.wikidata.org/entity/Q11047409'), ('name', 'athletic culture')]
[('obj', 'http://www.wikidata.org/entity/Q21467672'), ('name', 'running disciplines of track and field')]
[('obj', 'http://www.wikidata.org/entity/Q10235779'), ('name', 'indoor athletics')]
[('obj', 'http://www.wikidata.org/entity/Q202391'), ('name', 'racewalking')]
[('obj', 'http://www.wikidata.org/entity/Q212434'), ('name', 'Olympic sport')]
[('obj', 'http://www.wikidata.org/entity/Q3924425'), ('name', 'combined track and field events')]
[('obj', 'http://www.wikidata.org/entity/Q52177345'), ('name', 'technical disciplines of athletics')]
[('obj', 'http://www.wikidata.org/entity/Q1151752'), ('name', 'jumping')]
10


In [46]:
# aim: check the properties of 'Olympic sport'
# result: we found the properties 'part of' and 'subclass of' 
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q212434 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1424'), ('name', "topic's main template")]
[('p', 'http://www.wikidata.org/prop/direct/P18'), ('name', 'image')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from')]
[('p', 'http://www.wikidata.org/prop/direct/P2670'), ('name', 'has parts of the class')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P3095'), ('name', 'practiced by')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P3417'), ('name', 'Quora topic ID')]
[('p', 'http://www.wikidata.org/prop/direct/P349'), ('name', 'National Diet Library ID')]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('name', 'part of')]
[('p', 'http://www.wikidata.org/prop/direct/P373'), ('name', 'Commons category')]
[('p', 'http://www.wikidata.org/prop/direct/P5250'), ('name', 'IAB code')]
[('p', 'http://www.wikidata.org

Final query for this task

In [48]:
# aim: check the properties 'part of' and 'subclass of' related to 'Olympic sport'
# result: we found the object 'Olympic Games'
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q212434 ?p ?obj .
   # get the label
   ?obj sc:name ?name .
   FILTER (?p IN (wdt:P361,wdt:P279))
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q51031626'), ('name', 'sport competition at a multi-sport event')]
[('obj', 'http://www.wikidata.org/entity/Q349'), ('name', 'sport')]
[('obj', 'http://www.wikidata.org/entity/Q5389'), ('name', 'Olympic Games')]
3
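
Several cells in this notebook repeat the same pattern: fix an entity, restrict `?p` to a few properties with `FILTER ... IN`, and read the labels of the objects. A minimal sketch of a query builder for that pattern is shown here; `build_neighbor_query` is a hypothetical helper, and it assumes the `wd:`, `wdt:` and `sc:` prefixes from the setup cell are prepended by `run_query`.

```python
def build_neighbor_query(entity, properties, limit=100):
    """Build a query listing the objects reachable from `entity` via the given properties."""
    prop_list = ",".join(f"wdt:{p}" for p in properties)
    return f"""
SELECT DISTINCT ?obj ?name
WHERE {{
   # outgoing edges of the entity
   wd:{entity} ?p ?obj .
   # get the label
   ?obj sc:name ?name .
   FILTER (?p IN ({prop_list}))
}}
LIMIT {limit}
"""


# Same shape as the final query of this task: 'Olympic sport' via 'part of' and 'subclass of'.
query = build_neighbor_query("Q212434", ["P361", "P279"])
print(query)
```

The resulting string can be passed to `run_query` in place of a hand-written `queryString`.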


## Task 6
Consider only the Summer Olympic Games of 2004, 2008, 2012 and 2016. Return the number of marathon runners who participated in all these editions, grouped by their country of citizenship (if a runner participated in more than one edition of the Games, count him/her only once). The result set must be a list of country IRI, label and number of marathon runners.

In [8]:
# aim: get all the properties of 'Olympic Games'
# result: all the properties of 'Olympic Games'
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q5389 ?p ?obj .
   # get the label
   ?p sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1051'), ('name', 'PSH ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1151'), ('name', "topic's main Wikimedia portal")]
[('p', 'http://www.wikidata.org/prop/direct/P1225'), ('name', 'U.S. National Archives Identifier')]
[('p', 'http://www.wikidata.org/prop/direct/P1245'), ('name', 'OmegaWiki Defined Meaning')]
[('p', 'http://www.wikidata.org/prop/direct/P1296'), ('name', 'Gran Enciclopèdia Catalana ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('name', 'described by source')]
[('p', 'http://www.wikidata.org/prop/direct/P1417'), ('name', 'Encyclopædia Britannica Online ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1424'), ('name', "topic's main template")]
[('p', 'http://www.wikidata.org/prop/direct/P1546'), ('name', 'motto')]
[('p', 'http://www.wikidata.org/prop/direct/P163'), ('name', 'flag')]
[('p', 'http://www.wikidata.org/prop/direct/P18'), ('name', 'image')]
[('p', 'http://www.wikidata.org/prop/direct/P1807

In [50]:
# aim: try to find if there exist subjects that are 'instance of' 'Olympic Games'
# result: the result set is empty
queryString = """
SELECT DISTINCT ?s ?name
WHERE {
   # bind something
   ?s wdt:P31 wd:Q5389 .
   # get the label
   ?s sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [51]:
# aim: find the properties related to 'Olympic Games'
# result: we found the 'subclass of' property that could be useful
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   ?s ?p wd:Q5389 .
   # get the label
   ?p sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1423'), ('name', 'template has topic')]
[('p', 'http://www.wikidata.org/prop/direct/P101'), ('name', 'field of work')]
[('p', 'http://www.wikidata.org/prop/direct/P1269'), ('name', 'facet of')]
[('p', 'http://www.wikidata.org/prop/direct/P1344'), ('name', 'participant in')]
[('p', 'http://www.wikidata.org/prop/direct/P138'), ('name', 'named after')]
[('p', 'http://www.wikidata.org/prop/direct/P156'), ('name', 'followed by')]
[('p', 'http://www.wikidata.org/prop/direct/P166'), ('name', 'award received')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from')]
[('p', 'http://www.wikidata.org/prop/direct/P2522'), ('name', 'victory')]
[('p', 'http://www.wikidata.org/prop/direct/P2650'), ('name', 'interested in')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P2959'), ('name', 'permanent duplicated item')]
[('p', 'http://www.wikidata.org/pr

In [54]:
# aim: try to find if there exist subjects that are 'subclass of' 'Olympic Games'
# result: we found 'Summer Olympic Games'. Maybe there are subjects that are 'instance of' 'Summer Olympic Games'
queryString = """
SELECT DISTINCT ?p ?name ?s ?sname
WHERE {
   # bind something
   ?s ?p wd:Q5389 .
   # get the label
   ?p sc:name ?name .
   ?s sc:name ?sname .
   FILTER REGEX(?sname,"summer","i").
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of'), ('s', 'http://www.wikidata.org/entity/Q159821'), ('sname', 'Summer Olympic Games')]
[('p', 'http://www.wikidata.org/prop/direct/P921'), ('name', 'main subject'), ('s', 'http://www.wikidata.org/entity/Q106087337'), ('sname', 'Summer Olympics in Athens 2004')]
[('p', 'http://www.wikidata.org/prop/direct/P971'), ('name', 'category combines topics'), ('s', 'http://www.wikidata.org/entity/Q7215607'), ('sname', 'Category:2012 Summer Olympics')]
[('p', 'http://www.wikidata.org/prop/direct/P971'), ('name', 'category combines topics'), ('s', 'http://www.wikidata.org/entity/Q25322394'), ('sname', "Category:Cycling at the Summer Olympics – Men's keirin")]
[('p', 'http://www.wikidata.org/prop/direct/P971'), ('name', 'category combines topics'), ('s', 'http://www.wikidata.org/entity/Q8204577'), ('sname', 'Category:2016 Summer Olympics')]
[('p', 'http://www.wikidata.org/prop/direct/P971'), ('name', 'category combine

In [55]:
# aim: try to find if there exist subjects that are 'instance of' 'Summer Olympic Games'
# result: we found the individual editions, including '2004 Summer Olympics'. We should investigate the marathons of the editions we need
queryString = """
SELECT DISTINCT ?s ?sname
WHERE {
   # bind something
   ?s wdt:P31 wd:Q159821 .
   # get the label
   ?s sc:name ?sname .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q8098'), ('sname', '1904 Summer Olympics')]
[('s', 'http://www.wikidata.org/entity/Q8150'), ('sname', '1936 Summer Olympics')]
[('s', 'http://www.wikidata.org/entity/Q8415'), ('sname', '1960 Summer Olympics')]
[('s', 'http://www.wikidata.org/entity/Q181278'), ('sname', '2020 Summer Olympics')]
[('s', 'http://www.wikidata.org/entity/Q8088'), ('sname', '1900 Summer Olympics')]
[('s', 'http://www.wikidata.org/entity/Q8438'), ('sname', '1972 Summer Olympics')]
[('s', 'http://www.wikidata.org/entity/Q8444'), ('sname', '1976 Summer Olympics')]
[('s', 'http://www.wikidata.org/entity/Q8470'), ('sname', '1988 Summer Olympics')]
[('s', 'http://www.wikidata.org/entity/Q8488'), ('sname', '1992 Summer Olympics')]
[('s', 'http://www.wikidata.org/entity/Q8531'), ('sname', '1996 Summer Olympics')]
[('s', 'http://www.wikidata.org/entity/Q8544'), ('sname', '2000 Summer Olympics')]
[('s', 'http://www.wikidata.org/entity/Q8558'), ('sname', '2004 Summer Olympi
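
Since the task only concerns the 2004, 2008, 2012 and 2016 editions, the list above can be narrowed by the year at the start of each label. The sketch below does that in plain Python; `select_editions` is a hypothetical helper, the sample rows are taken from this notebook's outputs, and it assumes every edition label has the form '<year> Summer Olympics'.

```python
import re


def select_editions(rows, years):
    """Map each requested year to the Q-identifier of the matching Summer Olympics edition."""
    selected = {}
    for row in rows:
        record = dict(row)
        match = re.match(r"(\d{4}) Summer Olympics$", record["sname"])
        if match and int(match.group(1)) in years:
            selected[int(match.group(1))] = record["s"].rsplit("/", 1)[-1]
    return selected


# A few rows from the result above; only 2004 is among the requested years.
sample = [
    [("s", "http://www.wikidata.org/entity/Q8544"), ("sname", "2000 Summer Olympics")],
    [("s", "http://www.wikidata.org/entity/Q8558"), ("sname", "2004 Summer Olympics")],
    [("s", "http://www.wikidata.org/entity/Q181278"), ("sname", "2020 Summer Olympics")],
]

editions = select_editions(sample, {2004, 2008, 2012, 2016})
print(editions)
```

Run against the full result set `x`, the helper would return one entry per requested year that is present in the data.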

In [56]:
# aim: get the properties that have '2004 Summer Olympics' as object
# result: the properties that have '2004 Summer Olympics' as object
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   ?s ?p wd:Q8558 .
   # get the label
   ?p sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P6291'), ('name', 'advertises')]
[('p', 'http://www.wikidata.org/prop/direct/P101'), ('name', 'field of work')]
[('p', 'http://www.wikidata.org/prop/direct/P1269'), ('name', 'facet of')]
[('p', 'http://www.wikidata.org/prop/direct/P1344'), ('name', 'participant in')]
[('p', 'http://www.wikidata.org/prop/direct/P155'), ('name', 'follows')]
[('p', 'http://www.wikidata.org/prop/direct/P156'), ('name', 'followed by')]
[('p', 'http://www.wikidata.org/prop/direct/P301'), ('name', "category's main topic")]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('name', 'part of')]
[('p', 'http://www.wikidata.org/prop/direct/P466'), ('name', 'occupant')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('name', 'has part')]
[('p', 'http://www.wikidata.org/prop/direct/P793'), ('name', 'significant event')]
[('p', 'http://www.wikidata.org/prop/direct/P971'), ('name', 'category combines topics')]
12


In [57]:
# aim: get all subjects that are related to 'Summer Olympic Games 2004' through 'part of', 'significant event' and 'has part'
# result: we found 'athletics at the 2004 Summer Olympics'
queryString = """
SELECT DISTINCT ?s ?name ?p
WHERE {
   # bind something
   ?s ?p wd:Q8558 .
   # get the label
   ?s sc:name ?name .
   FILTER (?p IN (wdt:P361,wdt:P527,wdt:P793))
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q1322006'), ('name', '2004 Summer Olympics torch relay'), ('p', 'http://www.wikidata.org/prop/direct/P361')]
[('s', 'http://www.wikidata.org/entity/Q118446'), ('name', 'rowing at the 2004 Summer Olympics'), ('p', 'http://www.wikidata.org/prop/direct/P361')]
[('s', 'http://www.wikidata.org/entity/Q15637418'), ('name', 'artistic gymnastics at the 2004 Summer Olympics'), ('p', 'http://www.wikidata.org/prop/direct/P361')]
[('s', 'http://www.wikidata.org/entity/Q159821'), ('name', 'Summer Olympic Games'), ('p', 'http://www.wikidata.org/prop/direct/P527')]
[('s', 'http://www.wikidata.org/entity/Q1602350'), ('name', 'rhythmic gymnastics at the 2004 Summer Olympics'), ('p', 'http://www.wikidata.org/prop/direct/P361')]
[('s', 'http://www.wikidata.org/entity/Q2022374'), ('name', 'trampoline gymnastics at the 2004 Summer Olympics'), ('p', 'http://www.wikidata.org/prop/direct/P361')]
[('s', 'http://www.wikidata.org/entity/Q220886'), ('name', 'swimming

In [58]:
# aim: get all the properties related to 'athletics at the 2004 Summer Olympics'
# result: all the properties related to 'athletics at the 2004 Summer Olympics'
queryString = """
SELECT DISTINCT ?p ?name 
WHERE {
   # bind something
   ?s ?p wd:Q339283 .
   # get the label
   ?p sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1423'), ('name', 'template has topic')]
[('p', 'http://www.wikidata.org/prop/direct/P1344'), ('name', 'participant in')]
[('p', 'http://www.wikidata.org/prop/direct/P155'), ('name', 'follows')]
[('p', 'http://www.wikidata.org/prop/direct/P301'), ('name', "category's main topic")]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('name', 'part of')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('name', 'has part')]
6


In [59]:
# aim: get all the subjects related with 'athletics at the 2004 Summer Olympics' through 'part of' and 'has part'
# result: we found 'athletics at the 2004 Summer Olympics – women's marathon' and 'athletics at the 2004 Summer Olympics – men's marathon'
queryString = """
SELECT DISTINCT ?s ?name ?p 
WHERE {
   # bind something
   ?s ?p wd:Q339283 .
   # get the label
   ?s sc:name ?name .
   FILTER (?p IN (wdt:P361,wdt:P527))
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q1433109'), ('name', "athletics at the 2004 Summer Olympics – men's pole vault"), ('p', 'http://www.wikidata.org/prop/direct/P361')]
[('s', 'http://www.wikidata.org/entity/Q3628756'), ('name', "athletics at the 2004 Summer Olympics – women's 4 × 400 metres relay"), ('p', 'http://www.wikidata.org/prop/direct/P361')]
[('s', 'http://www.wikidata.org/entity/Q1439323'), ('name', "athletics at the 2004 Summer Olympics – women's javelin throw"), ('p', 'http://www.wikidata.org/prop/direct/P361')]
[('s', 'http://www.wikidata.org/entity/Q8558'), ('name', '2004 Summer Olympics'), ('p', 'http://www.wikidata.org/prop/direct/P527')]
[('s', 'http://www.wikidata.org/entity/Q2313564'), ('name', "athletics at the 2004 Summer Olympics – men's 200 metres"), ('p', 'http://www.wikidata.org/prop/direct/P361')]
[('s', 'http://www.wikidata.org/entity/Q1433079'), ('name', "athletics at the 2004 Summer Olympics – women's long jump"), ('p', 'http://www.wikidata.org/p

In [60]:
# aim: get all the properties of 'athletics at the 2004 Summer Olympics – women's marathon' and 'athletics at the 2004 Summer Olympics – men's marathon'
# result: we found 'winner' and 'number of participants', but no property listing the participants directly
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   ?s ?p ?o .
   # get the label
   ?p sc:name ?name .
   FILTER (?s IN (wd:Q2443986,wd:Q2552272))
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1132'), ('name', 'number of participants')]
[('p', 'http://www.wikidata.org/prop/direct/P1346'), ('name', 'winner')]
[('p', 'http://www.wikidata.org/prop/direct/P155'), ('name', 'follows')]
[('p', 'http://www.wikidata.org/prop/direct/P156'), ('name', 'followed by')]
[('p', 'http://www.wikidata.org/prop/direct/P17'), ('name', 'country')]
[('p', 'http://www.wikidata.org/prop/direct/P18'), ('name', 'image')]
[('p', 'http://www.wikidata.org/prop/direct/P2094'), ('name', 'competition class')]
[('p', 'http://www.wikidata.org/prop/direct/P276'), ('name', 'location')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('name', 'part of')]
[('p', 'http://www.wikidata.org/prop/direct/P580'), ('name', 'start time')]
[('p', 'http://www.wikidata.org/prop/direct/P585'), ('name', 'point in time')]
[('p', 'http://www.wikidata.org/prop/direct/P641'), ('name', 'sport')]
[('p'

In [61]:
# aim: get all the properties that have 'athletics at the 2004 Summer Olympics – women's marathon' or 'athletics at the 2004 Summer Olympics – men's marathon' as object
# result: we found 'participant in'
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   ?s ?p ?o .
   # get the label
   ?p sc:name ?name .
   FILTER (?o IN (wd:Q2443986,wd:Q2552272))
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1344'), ('name', 'participant in')]
[('p', 'http://www.wikidata.org/prop/direct/P155'), ('name', 'follows')]
[('p', 'http://www.wikidata.org/prop/direct/P156'), ('name', 'followed by')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('name', 'has part')]
4


In [62]:
# aim: get all the athletes who were 'participant in' the 'athletics at the 2004 Summer Olympics – women's marathon' and 'athletics at the 2004 Summer Olympics – men's marathon'
# result: we found just 2 athletes... Now we read the task again, maybe we are required to find only the people who are 'marathon runner' as 'occupation' and who 'participated in' the 
#         Summer Olympic games... 
queryString = """
SELECT DISTINCT ?s ?name
WHERE {
   # bind something
   ?s wdt:P1344 ?o .
   # get the label
   ?s sc:name ?name .
   FILTER (?o IN (wd:Q2443986,wd:Q2552272))
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q2271929'), ('name', 'Bat-Ochiryn Ser-Od')]
[('s', 'http://www.wikidata.org/entity/Q1617873'), ('name', 'Marcel Matanin')]
2


In [10]:
# aim: get all the events that 'Bat-Ochiryn Ser-Od' (one of the 2 athletes found above) was a 'participant in'
# result: he is linked both to discipline-level items such as 'athletics at the 2016 Summer Olympics' and to event-level items such as
#         'athletics at the 2016 Summer Olympics – men's marathon'
queryString = """
SELECT DISTINCT ?o ?name
WHERE {
   # bind something
   wd:Q2271929 wdt:P1344 ?o .
   # get the label
   ?o sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q18193712'), ('name', 'athletics at the 2016 Summer Olympics')]
[('o', 'http://www.wikidata.org/entity/Q3287514'), ('name', "2007 World Championships in Athletics – Men's Marathon")]
[('o', 'http://www.wikidata.org/entity/Q3702617'), ('name', "2003 World Championships in Athletics – Men's Marathon")]
[('o', 'http://www.wikidata.org/entity/Q4605105'), ('name', "2005 World Championships in Athletics – Men's Marathon")]
[('o', 'http://www.wikidata.org/entity/Q20312969'), ('name', "athletics at the 2014 Asian Games – men's marathon")]
[('o', 'http://www.wikidata.org/entity/Q285408'), ('name', '2002 IAAF World Half Marathon Championships')]
[('o', 'http://www.wikidata.org/entity/Q21858018'), ('name', "athletics at the 2016 Summer Olympics – men's marathon")]
[('o', 'http://www.wikidata.org/entity/Q20804554'), ('name', "2015 World Championships in Athletics – men's marathon")]
[('o', 'http://www.wikidata.org/entity/Q1553760'), ('name', '2005 IAA

In [63]:
# aim: get the 'occupation' of 'Bat-Ochiryn Ser-Od'
# result: we found 'marathon runner' and 'athletics competitor'
queryString = """
SELECT DISTINCT ?o ?sname
WHERE {
   # bind something
   wd:Q2271929 wdt:P106 ?o.
   # get the label
   ?o sc:name ?sname .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q13382460'), ('sname', 'marathon runner')]
[('o', 'http://www.wikidata.org/entity/Q11513337'), ('sname', 'athletics competitor')]
2


In [64]:
# aim: find all the people that are 'marathon runner' as 'occupation'
# result: we found the people whose 'occupation' is 'marathon runner'
queryString = """
SELECT DISTINCT ?s ?sname
WHERE {
   # bind something
   ?s wdt:P106 wd:Q13382460.
   # get the label
   ?s sc:name ?sname .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q1701191'), ('sname', 'John Nada Saya')]
[('s', 'http://www.wikidata.org/entity/Q16006623'), ('sname', 'Harry Payne')]
[('s', 'http://www.wikidata.org/entity/Q16007889'), ('sname', 'John Tarrant')]
[('s', 'http://www.wikidata.org/entity/Q16008400'), ('sname', 'Waldemar Karlsson')]
[('s', 'http://www.wikidata.org/entity/Q16011969'), ('sname', 'John J. Reynolds')]
[('s', 'http://www.wikidata.org/entity/Q1602744'), ('sname', 'Hellen Mugo')]
[('s', 'http://www.wikidata.org/entity/Q16185418'), ('sname', 'Jerry Kiernan')]
[('s', 'http://www.wikidata.org/entity/Q16186777'), ('sname', 'George Malley')]
[('s', 'http://www.wikidata.org/entity/Q16210841'), ('sname', 'Elva Dryer')]
[('s', 'http://www.wikidata.org/entity/Q16212704'), ('sname', 'José Alejandro Semprún')]
[('s', 'http://www.wikidata.org/entity/Q16226788'), ('sname', 'Jacob Bragg')]
[('s', 'http://www.wikidata.org/entity/Q17010986'), ('sname', 'Naomi Sakashita')]
[('s', 'http://www.wikida

In [65]:
# aim: get all the athletes who competed in the 4 Olympic Games and whose 'occupation' is 'marathon runner'. Get also their countries
# result: it seems that no marathon runner has competed in all 4 Olympic Games...
queryString = """
SELECT DISTINCT ?athlete ?athletename ?athletecountry ?athletecountryname 
WHERE {
   # athlete participant in 
   #?athlete wdt:P1344 ?summerolympics .
   
   ?athlete wdt:P1344 wd:Q8558 .
   ?athlete wdt:P1344 wd:Q8567 .
   ?athlete wdt:P1344 wd:Q8577 .
   ?athlete wdt:P1344 wd:Q8613 .
   
   # athlete occupation marathon runner
   ?athlete wdt:P106 wd:Q13382460.
   
   # athlete country athletecountry
   ?athlete wdt:P27 ?athletecountry .
   
   ?athletecountry sc:name ?athletecountryname .
   
   # get the label
   ?athlete sc:name ?athletename .
   
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
Empty
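
The four `wdt:P1344` patterns in the query above require the same `?athlete` to be linked to every one of the four editions. An equivalent formulation, sketched below, lists the editions once in a `VALUES` block and keeps only athletes matched to all of them with `HAVING`. `build_all_editions_query` is a hypothetical helper, and the generated query has not been run against the endpoint.

```python
def build_all_editions_query(editions, limit=100):
    """Build a query for athletes who are 'participant in' every listed edition."""
    values = " ".join(f"wd:{q}" for q in editions)
    return f"""
SELECT ?athlete ?athletename
WHERE {{
   VALUES ?games {{ {values} }}
   ?athlete wdt:P1344 ?games .
   ?athlete sc:name ?athletename .
}}
GROUP BY ?athlete ?athletename
HAVING (COUNT(DISTINCT ?games) = {len(editions)})
LIMIT {limit}
"""


# The four editions used in the cell above (2004, 2008, 2012, 2016).
query = build_all_editions_query(["Q8558", "Q8567", "Q8577", "Q8613"])
print(query)
```

Lowering the threshold in the `HAVING` clause would relax the condition to "at least N editions", which the repeated-pattern form cannot express as easily.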


In [11]:
# aim: find all the athletes who competed in all 4 Olympic Games
# result: we found some athletes, so there are athletes who competed in all 4 Olympic Games. Maybe we should consider other optional properties apart from 'occupation', such as
#         'sports discipline competed in' and 'sport'
queryString = """
SELECT DISTINCT ?athlete ?athletename
WHERE {
   ?athlete wdt:P1344 wd:Q8558 .
   ?athlete wdt:P1344 wd:Q8567 .
   ?athlete wdt:P1344 wd:Q8577 .
   ?athlete wdt:P1344 wd:Q8613 .
   ?athlete sc:name ?athletename .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('athlete', 'http://www.wikidata.org/entity/Q2564215'), ('athletename', 'Tony Azevedo')]
[('athlete', 'http://www.wikidata.org/entity/Q316106'), ('athletename', 'José Manuel Calderón')]
[('athlete', 'http://www.wikidata.org/entity/Q240686'), ('athletename', 'Chris Brown')]
[('athlete', 'http://www.wikidata.org/entity/Q1332824'), ('athletename', 'Yūki Ōta')]
[('athlete', 'http://www.wikidata.org/entity/Q18718'), ('athletename', 'Géza Imre')]
[('athlete', 'http://www.wikidata.org/entity/Q955360'), ('athletename', 'Aldo Montano')]
[('athlete', 'http://www.wikidata.org/entity/Q312651'), ('athletename', 'Rudy Fernández')]
[('athlete', 'http://www.wikidata.org/entity/Q240726'), ('athletename', 'Diana Taurasi')]
[('athlete', 'http://www.wikidata.org/entity/Q2606477'), ('athletename', 'Pietro Figlioli')]
[('athlete', 'http://www.wikidata.org/entity/Q282285'), ('athletename', 'Richard Kruse')]
[('athlete', 'http://www.wikidata.org/entity/Q193052'), ('athletename', 'Pau Gasol')]
[('athl

In [8]:
# aim: get all the athletes who competed in the 4 Olympic Games and whose 'occupation' is 'marathon runner', or whose 'sport' or 'sports discipline competed in' is 'marathon'. Get also their countries
# result: still empty, even with the additional properties
queryString = """
SELECT DISTINCT ?athlete ?athletename ?athletecountry ?athletecountryname 
WHERE {
   # athlete participant in 
   
   ?athlete wdt:P1344 wd:Q8558 .
   ?athlete wdt:P1344 wd:Q8567 .
   ?athlete wdt:P1344 wd:Q8577 .
   ?athlete wdt:P1344 wd:Q8613 .
   
   # athlete occupation marathon runner
   ?athlete wdt:P106 ?occupation .
   ?athlete wdt:P2416 ?sportcompeted .
   ?athlete wdt:P641 ?sport .
   
   # athlete country athletecountry
   ?athlete wdt:P27 ?athletecountry .
   
   ?athletecountry sc:name ?athletecountryname .
   
   # get the label
   ?athlete sc:name ?athletename .
   
   FILTER (?occupation=wd:Q13382460 || ?sport=wd:Q40244 || ?sportcompeted=wd:Q40244 )
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
Empty
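
The query above requires `wdt:P106`, `wdt:P2416` and `wdt:P641` to all be present on the same athlete before the FILTER is evaluated, so an athlete carrying only one of the three properties is dropped. A minimal sketch of a variant that makes the three properties optional is given below. The helper `build_optional_query` is hypothetical (it is not part of the notebook), and whether the variant returns any athlete on this endpoint has not been checked. It relies on the SPARQL rule that `true || error` evaluates to true, so a single matching property is enough to keep a row.

```python
# Hypothetical helper (not part of the original notebook): builds a variant of
# the query above in which the three descriptive properties are optional.
GAMES = ["Q8558", "Q8567", "Q8577", "Q8613"]  # the 4 Olympic Games used above


def build_optional_query(games=GAMES, limit=100):
    # one mandatory 'participant in' pattern per Olympic Games
    participation = "\n".join(f"   ?athlete wdt:P1344 wd:{g} ." for g in games)
    return f"""
SELECT DISTINCT ?athlete ?athletename ?athletecountry ?athletecountryname
WHERE {{
{participation}
   # each descriptive property may be missing
   OPTIONAL {{ ?athlete wdt:P106 ?occupation . }}
   OPTIONAL {{ ?athlete wdt:P2416 ?sportcompeted . }}
   OPTIONAL {{ ?athlete wdt:P641 ?sport . }}
   ?athlete wdt:P27 ?athletecountry .
   ?athletecountry sc:name ?athletecountryname .
   ?athlete sc:name ?athletename .
   FILTER (?occupation=wd:Q13382460 || ?sport=wd:Q40244 || ?sportcompeted=wd:Q40244)
}}
LIMIT {limit}
"""


queryString = build_optional_query()
print(queryString)
```

In the notebook the resulting string can be passed to `run_query` like any other `queryString`.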


In [None]:
# aim: get all the athletes who competed in the 4 Olympic Games and whose 'occupation' is 'marathon runner'. Also get their countries
# result: it seems that no marathon athlete has competed in the 4 Olympic Games...
queryString = """
SELECT DISTINCT ?athlete ?athletename ?athletecountry ?athletecountryname 
WHERE {
   # athlete participant in 
   ?athlete wdt:P1344 ?summerolympics .
   ?athletics wdt:P361 ?summerolympics .
   ?marathon wdt:P361 ?athletics .
   

   
   # athlete occupation marathon runner
   ?athlete wdt:P106 ?occupation .
   ?athlete wdt:P2416 ?sportcompeted .
   ?athlete wdt:P641 ?sport .
   
   # athlete country athletecountry
   ?athlete wdt:P27 ?athletecountry .
   
   ?athletecountry sc:name ?athletecountryname .
   
   # get the label
   ?athlete sc:name ?athletename .
   
   FILTER (?summerolympics IN(wd:Q8558,wd:Q8567,wd:Q8577,wd:Q8613)).
   FILTER (?occupation=wd:Q13382460 || ?sport=wd:Q40244 || ?sportcompeted=wd:Q40244 )
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

In [5]:
# aim: get all the athletes who competed in the marathon discipline in one of the 4 Olympic Games. Also get their countries
# result: the empty set is found
# note: we forgot the labels here! REGEX is applied to the IRIs ?athletics and ?marathon instead of their names, so nothing matches
queryString = """
SELECT DISTINCT ?athlete ?athletename ?athletecountry ?athletecountryname 
WHERE {

   ?athletics wdt:P361 ?summerolympics .
   ?marathon wdt:P361 ?athletics .
   ?athlete wdt:P1344 ?marathon .
   
   
   # athlete country athletecountry
   ?athlete wdt:P27 ?athletecountry .
   
   ?athletecountry sc:name ?athletecountryname .
   
   # get the label
   ?athlete sc:name ?athletename .
   
   FILTER (?summerolympics IN(wd:Q8558,wd:Q8567,wd:Q8577,wd:Q8613)).
   FILTER REGEX(?athletics, "athletics" ) .
   FILTER REGEX(?marathon, "marathon" ) .
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
Empty
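
The empty result is consistent with how REGEX works: it expects a string, and Wikidata IRIs are opaque identifiers that do not contain words such as "marathon". The same effect can be reproduced in plain Python on rows shaped like the output of `run_query`. This is only an illustrative sketch: the helper `filter_rows`, the placeholder IRIs and the labels below are assumptions, not values taken from the endpoint.

```python
import re

# Hypothetical rows in the shape returned by run_query; the IRIs are
# placeholders and the labels are illustrative assumptions.
rows = [
    [("event", "http://www.wikidata.org/entity/Q0000001"), ("eventname", "men's marathon")],
    [("event", "http://www.wikidata.org/entity/Q0000002"), ("eventname", "women's 100 metres")],
]


def filter_rows(rows, var, pattern):
    """Keep the rows whose value for `var` matches the regular expression."""
    regex = re.compile(pattern)
    return [row for row in rows if regex.search(dict(row).get(var, ""))]


by_iri = filter_rows(rows, "event", "marathon")       # matching on the IRI
by_name = filter_rows(rows, "eventname", "marathon")  # matching on the label
print(len(by_iri), len(by_name))
```

Matching on the IRI keeps nothing, while matching on the label keeps the marathon row.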


In [6]:
# aim: get all the athletes who competed in one of the 4 Olympic Games in the marathon discipline. Get also their countries
# result: all the athletes who competed in one of the 4 Olympic Games in the marathon discipline.
queryString = """
SELECT DISTINCT ?athlete ?athletename ?athletecountry ?athletecountryname 
WHERE {

   ?athletics wdt:P361 ?summerolympics ;
       sc:name ?athleticsname.
   ?marathon wdt:P361 ?athletics ;
       sc:name ?marathonname.
   ?athlete wdt:P1344 ?marathon .
   
   
   # athlete country athletecountry
   ?athlete wdt:P27 ?athletecountry .
   
   ?athletecountry sc:name ?athletecountryname .
   
   # get the label
   ?athlete sc:name ?athletename .
   
   FILTER (?summerolympics IN(wd:Q8558,wd:Q8567,wd:Q8577,wd:Q8613)).
   #FILTER (?occupation=wd:Q13382460 || ?sport=wd:Q40244 || ?sportcompeted=wd:Q40244 ) .
   FILTER REGEX(?athleticsname, "athletics" ) .
   FILTER REGEX(?marathonname, "marathon" ) .
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('athlete', 'http://www.wikidata.org/entity/Q526508'), ('athletename', 'Helalia Johannes'), ('athletecountry', 'http://www.wikidata.org/entity/Q1030'), ('athletecountryname', 'Namibia')]
[('athlete', 'http://www.wikidata.org/entity/Q1702'), ('athletename', 'Stephen Kiprotich'), ('athletecountry', 'http://www.wikidata.org/entity/Q1036'), ('athletecountryname', 'Uganda')]
[('athlete', 'http://www.wikidata.org/entity/Q9124'), ('athletename', 'Eliud Kipchoge'), ('athletecountry', 'http://www.wikidata.org/entity/Q114'), ('athletecountryname', 'Kenya')]
[('athlete', 'http://www.wikidata.org/entity/Q26203953'), ('athletename', 'Evans Kiplagat Barkowet'), ('athletecountry', 'http://www.wikidata.org/entity/Q114'), ('athletecountryname', 'Kenya')]
[('athlete', 'http://www.wikidata.org/entity/Q24083616'), ('athletename', 'Callum Hawkins'), ('athletecountry', 'http://www.wikidata.org/entity/Q145'), ('athletecountryname', 'United Kingdom')]
[('athlete', 'http://www.wikidata.org/entity/Q529

In [8]:
# aim: get all the athletes who competed in the marathon discipline in one of the 4 Olympic Games, and the number of these Olympic Games each athlete competed in. Also get their countries
# result: all the athletes who competed in the marathon discipline in one of the 4 Olympic Games, each with the number of Games played.
queryString = """
SELECT DISTINCT ?athlete ?athletename ?athletecountry ?athletecountryname (COUNT(?summerolympics) as ?olympicsplayed)
WHERE {

   ?athletics wdt:P361 ?summerolympics ;
       sc:name ?athleticsname.
   ?marathon wdt:P361 ?athletics ;
       sc:name ?marathonname.
   ?athlete wdt:P1344 ?marathon .
   
   
   # athlete country athletecountry
   ?athlete wdt:P27 ?athletecountry .
   
   ?athletecountry sc:name ?athletecountryname .
   
   # get the label
   ?athlete sc:name ?athletename .
   
   FILTER (?summerolympics IN(wd:Q8558,wd:Q8567,wd:Q8577,wd:Q8613)).
   #FILTER (?occupation=wd:Q13382460 || ?sport=wd:Q40244 || ?sportcompeted=wd:Q40244 ) .
   FILTER REGEX(?athleticsname, "athletics" ) .
   FILTER REGEX(?marathonname, "marathon" ) .
} GROUP BY ?athlete ?athletename ?athletecountry ?athletecountryname
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('athlete', 'http://www.wikidata.org/entity/Q26472786'), ('athletename', 'Cristhian Pacheco'), ('athletecountry', 'http://www.wikidata.org/entity/Q419'), ('athletecountryname', 'Peru'), ('olympicsplayed', '1')]
[('athlete', 'http://www.wikidata.org/entity/Q9124'), ('athletename', 'Eliud Kipchoge'), ('athletecountry', 'http://www.wikidata.org/entity/Q114'), ('athletecountryname', 'Kenya'), ('olympicsplayed', '1')]
[('athlete', 'http://www.wikidata.org/entity/Q2301289'), ('athletename', 'Sondre Nordstad Moen'), ('athletecountry', 'http://www.wikidata.org/entity/Q20'), ('athletecountryname', 'Norway'), ('olympicsplayed', '1')]
[('athlete', 'http://www.wikidata.org/entity/Q2271929'), ('athletename', 'Bat-Ochiryn Ser-Od'), ('athletecountry', 'http://www.wikidata.org/entity/Q711'), ('athletecountryname', 'Mongolia'), ('olympicsplayed', '4')]
[('athlete', 'http://www.wikidata.org/entity/Q63691'), ('athletename', 'Galen Rupp'), ('athletecountry', 'http://www.wikidata.org/entity/Q30'),

In [9]:
# aim: get all the athletes who competed in the marathon discipline in all 4 Olympic Games and the number of Olympic Games each athlete competed in. Also get their countries
# result: all the athletes who competed in the marathon discipline in all 4 Olympic Games. We got only one athlete, so we might want to consider all the athletes whose
#         'sport' is 'marathon' or whose 'occupation' is 'marathon runner' and who participated in any event
queryString = """
SELECT DISTINCT ?athlete ?athletename ?athletecountry ?athletecountryname (COUNT(?summerolympics) as ?olympicsplayed)
WHERE {

   ?athletics wdt:P361 ?summerolympics ;
       sc:name ?athleticsname.
   ?marathon wdt:P361 ?athletics ;
       sc:name ?marathonname.
   ?athlete wdt:P1344 ?marathon .
   
   
   # athlete country athletecountry
   ?athlete wdt:P27 ?athletecountry .
   
   ?athletecountry sc:name ?athletecountryname .
   
   # get the label
   ?athlete sc:name ?athletename .
   
   FILTER (?summerolympics IN(wd:Q8558,wd:Q8567,wd:Q8577,wd:Q8613)).
   FILTER REGEX(?athleticsname, "athletics" ) .
   FILTER REGEX(?marathonname, "marathon" ) .
} GROUP BY ?athlete ?athletename ?athletecountry ?athletecountryname
HAVING (COUNT(?summerolympics)>=4)
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('athlete', 'http://www.wikidata.org/entity/Q2271929'), ('athletename', 'Bat-Ochiryn Ser-Od'), ('athletecountry', 'http://www.wikidata.org/entity/Q711'), ('athletecountryname', 'Mongolia'), ('olympicsplayed', '4')]
1
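
`COUNT(?summerolympics)` counts solution rows, not distinct Olympic Games, so an athlete could reach the threshold of 4 with fewer than 4 Games if some pattern matched more than once for the same Games. The sketch below shows the difference in plain Python on made-up data (the athlete names and rows are hypothetical). In SPARQL the corresponding change would be `COUNT(DISTINCT ?summerolympics)`.

```python
from collections import defaultdict

# Hypothetical (athlete, games) solution rows; the duplicated pair mimics a
# pattern that matched twice for the same Olympic Games.
solutions = [
    ("athleteA", "Q8558"), ("athleteA", "Q8567"),
    ("athleteA", "Q8567"), ("athleteA", "Q8577"),
    ("athleteB", "Q8558"), ("athleteB", "Q8567"),
    ("athleteB", "Q8577"), ("athleteB", "Q8613"),
]

row_counts = defaultdict(int)      # analogue of COUNT(?summerolympics)
distinct_games = defaultdict(set)  # analogue of COUNT(DISTINCT ?summerolympics)
for athlete, games in solutions:
    row_counts[athlete] += 1
    distinct_games[athlete].add(games)

by_rows = sorted(a for a, n in row_counts.items() if n >= 4)
by_distinct = sorted(a for a, g in distinct_games.items() if len(g) >= 4)
print(by_rows)
print(by_distinct)
```

Only `athleteB` actually took part in four different Games, yet counting rows also lets `athleteA` through.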


In [25]:
# aim: get all the athletes whose 'sport' is 'marathon' or whose 'occupation' is 'marathon runner' and who competed in any event of the 2004 Olympic Games.
#      We consider all the events because considering only marathon events would return only the athlete 'Bat-Ochiryn Ser-Od'
# result: there are only a few athletes... the query is probably wrong. We try to reformulate it

queryString = """
SELECT DISTINCT ?athlete ?athletename
WHERE {

   ?discipline wdt:P361 wd:Q8558 .

   ?subdiscipline wdt:P361 ?discipline .

   
   # athlete participated in any possible discipline at the Olympic Games of 2004
   OPTIONAL {?athlete ?participated ?subdiscipline} .
   OPTIONAL {?athlete ?participated ?discipline} .
   OPTIONAL {?athlete ?participated wd:Q8558} .
   
   # athlete occupation marathon runner
   ?athlete wdt:P106 ?occupation .
   ?athlete wdt:P2416 ?sportcompeted .
   ?athlete wdt:P641 ?sport .
   
   # athlete country athletecountry
   ?athlete wdt:P27 ?athletecountry .
   
   ?athletecountry sc:name ?athletecountryname .
   
   # get the label
   ?athlete sc:name ?athletename .
   
   FILTER (?occupation=wd:Q13382460 || ?sport=wd:Q40244 || ?sportcompeted=wd:Q40244 ) .
   FILTER (?participated=wdt:P1344) .
   
} GROUP BY ?athlete ?athletename
HAVING (COUNT(?participated)>=1)
LIMIT 20
"""
print("Results")
x=run_query(queryString)

Results
[('athlete', 'http://www.wikidata.org/entity/Q2271929'), ('athletename', 'Bat-Ochiryn Ser-Od')]
[('athlete', 'http://www.wikidata.org/entity/Q9124'), ('athletename', 'Eliud Kipchoge')]
[('athlete', 'http://www.wikidata.org/entity/Q1617873'), ('athletename', 'Marcel Matanin')]
3


In [14]:
# aim: get all the athletes whose 'sport' is 'marathon' or whose 'occupation' is 'marathon runner' and who competed in any event of the 2004 Olympic Games.
#      We consider all the events because considering only marathon events would return only the athlete 'Bat-Ochiryn Ser-Od'
# result: all the athletes whose 'sport' is 'marathon' or whose 'occupation' is 'marathon runner' and who competed in any event of the 2004 Olympic Games.

queryString = """
SELECT DISTINCT ?athlete ?athletename
WHERE {
   ?discipline wdt:P361 wd:Q8558 .

   ?subdiscipline wdt:P361 ?discipline .
   
   # athlete participated in any possible discipline at the Olympic Games of 2004
   ?athlete wdt:P1344 ?anydiscipline .
   
   # athlete occupation marathon runner
   ?athlete wdt:P106 ?occupation .
   ?athlete wdt:P2416 ?sportcompeted .
   ?athlete wdt:P641 ?sport .
   
   # athlete country athletecountry
   ?athlete wdt:P27 ?athletecountry .
   
   ?athletecountry sc:name ?athletecountryname .
   
   # get the label
   ?athlete sc:name ?athletename .
   
   FILTER (?occupation=wd:Q13382460 || ?sport=wd:Q40244 || ?sportcompeted=wd:Q40244 ) .
   FILTER (?anydiscipline=wd:Q8558 || ?anydiscipline=?discipline || ?anydiscipline=?subdiscipline)
   
} GROUP BY ?athlete ?athletename
LIMIT 20
"""
print("Results")
x=run_query(queryString)

Results
[('athlete', 'http://www.wikidata.org/entity/Q433878'), ('athletename', 'Nadia Ejjafini')]
[('athlete', 'http://www.wikidata.org/entity/Q6392970'), ('athletename', 'Kenza Wahbi')]
[('athlete', 'http://www.wikidata.org/entity/Q171500'), ('athletename', 'Gebrselassie Haile')]
[('athlete', 'http://www.wikidata.org/entity/Q466315'), ('athletename', 'Derartu Tulu')]
[('athlete', 'http://www.wikidata.org/entity/Q429888'), ('athletename', 'Werknesh Kidane')]
[('athlete', 'http://www.wikidata.org/entity/Q6763145'), ('athletename', 'Mariela González')]
[('athlete', 'http://www.wikidata.org/entity/Q16210935'), ('athletename', 'Maria McCambridge')]
[('athlete', 'http://www.wikidata.org/entity/Q274104'), ('athletename', 'Benita Johnson')]
[('athlete', 'http://www.wikidata.org/entity/Q1384426'), ('athletename', 'Wu Wen-Chien')]
[('athlete', 'http://www.wikidata.org/entity/Q2448283'), ('athletename', 'Ndabili Bashingili')]
[('athlete', 'http://www.wikidata.org/entity/Q276222'), ('athletename

In [26]:
# aim: get all the athletes whose 'sport' is 'marathon' or whose 'occupation' is 'marathon runner' and who competed in any event of each of the Olympic Games of 2004, 2008,
#      2012 and 2016. We consider all the events because considering only marathon events would return only the athlete 'Bat-Ochiryn Ser-Od'
# result: the 7 athletes whose 'sport' is 'marathon' or whose 'occupation' is 'marathon runner' and who competed in at least one event of each of the Olympic Games of 2004, 2008,
#      2012 and 2016.

queryString = """
SELECT DISTINCT ?athlete ?athletename
WHERE {
    {
        SELECT DISTINCT ?athlete ?athletename
        WHERE {
           ?discipline wdt:P361 wd:Q8558 .

           ?subdiscipline wdt:P361 ?discipline .


           # athlete participated in any possible discipline at the Olympic Games of 2004
           ?athlete wdt:P1344 ?anydiscipline .

           # athlete occupation marathon runner
           ?athlete wdt:P106 ?occupation .
           ?athlete wdt:P2416 ?sportcompeted .
           ?athlete wdt:P641 ?sport .

           # athlete country athletecountry
           ?athlete wdt:P27 ?athletecountry .

           ?athletecountry sc:name ?athletecountryname .

           # get the label
           ?athlete sc:name ?athletename .

           FILTER (?occupation=wd:Q13382460 || ?sport=wd:Q40244 || ?sportcompeted=wd:Q40244 ) .
           FILTER (?anydiscipline=wd:Q8558 || ?anydiscipline=?discipline || ?anydiscipline=?subdiscipline)

        }
    }
    .
    {
        SELECT DISTINCT ?athlete ?athletename
        WHERE {
           ?discipline wdt:P361 wd:Q8567 . 

           ?subdiscipline wdt:P361 ?discipline .


           # athlete participated in any possible discipline at the Olympic Games of 2008
           ?athlete wdt:P1344 ?anydiscipline .

           # athlete occupation marathon runner
           ?athlete wdt:P106 ?occupation .
           ?athlete wdt:P2416 ?sportcompeted .
           ?athlete wdt:P641 ?sport .

           # athlete country athletecountry
           ?athlete wdt:P27 ?athletecountry .

           ?athletecountry sc:name ?athletecountryname .

           # get the label
           ?athlete sc:name ?athletename .

           FILTER (?occupation=wd:Q13382460 || ?sport=wd:Q40244 || ?sportcompeted=wd:Q40244 ) .
           FILTER (?anydiscipline=wd:Q8567 || ?anydiscipline=?discipline || ?anydiscipline=?subdiscipline)

        }
    }
    .
    {
        SELECT DISTINCT ?athlete ?athletename
        WHERE {
           ?discipline wdt:P361 wd:Q8577 .

           ?subdiscipline wdt:P361 ?discipline .


           # athlete participated in any possible discipline at the Olympic Games of 2012
           ?athlete wdt:P1344 ?anydiscipline .

           # athlete occupation marathon runner
           ?athlete wdt:P106 ?occupation .
           ?athlete wdt:P2416 ?sportcompeted .
           ?athlete wdt:P641 ?sport .

           # athlete country athletecountry
           ?athlete wdt:P27 ?athletecountry .

           ?athletecountry sc:name ?athletecountryname .

           # get the label
           ?athlete sc:name ?athletename .

           FILTER (?occupation=wd:Q13382460 || ?sport=wd:Q40244 || ?sportcompeted=wd:Q40244 ) .
           FILTER (?anydiscipline=wd:Q8577 || ?anydiscipline=?discipline || ?anydiscipline=?subdiscipline)

        }
    }
    .
    {
        SELECT DISTINCT ?athlete ?athletename
        WHERE {
           ?discipline wdt:P361 wd:Q8613 .

           ?subdiscipline wdt:P361 ?discipline .


           # athlete participated in any possible discipline at the Olympic Games of 2016
           ?athlete wdt:P1344 ?anydiscipline .

           # athlete occupation marathon runner
           ?athlete wdt:P106 ?occupation .
           ?athlete wdt:P2416 ?sportcompeted .
           ?athlete wdt:P641 ?sport .

           # athlete country athletecountry
           ?athlete wdt:P27 ?athletecountry .

           ?athletecountry sc:name ?athletecountryname .

           # get the label
           ?athlete sc:name ?athletename .

           FILTER (?occupation=wd:Q13382460 || ?sport=wd:Q40244 || ?sportcompeted=wd:Q40244 ) .
           FILTER (?anydiscipline=wd:Q8613 || ?anydiscipline=?discipline || ?anydiscipline=?subdiscipline)

        }
    }
}
LIMIT 20
"""
print("Results")
x=run_query(queryString)

Results
[('athlete', 'http://www.wikidata.org/entity/Q192033'), ('athletename', 'Zersenay Tadese')]
[('athlete', 'http://www.wikidata.org/entity/Q443822'), ('athletename', 'Joanne Pavey')]
[('athlete', 'http://www.wikidata.org/entity/Q267879'), ('athletename', 'Olivera Jevtić')]
[('athlete', 'http://www.wikidata.org/entity/Q467303'), ('athletename', 'Souad Aït Salem')]
[('athlete', 'http://www.wikidata.org/entity/Q2271929'), ('athletename', 'Bat-Ochiryn Ser-Od')]
[('athlete', 'http://www.wikidata.org/entity/Q6003'), ('athletename', 'Tirunesh Dibaba')]
[('athlete', 'http://www.wikidata.org/entity/Q449558'), ('athletename', 'Kayoko Fukushi')]
7
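
The four subqueries above differ only in the identifier of the Olympic Games, so the query string can also be generated from a single template. The sketch below is one way to do it; `GAMES`, `SUBQUERY` and `build_query` are hypothetical names introduced here, and the generated text is meant to match the hand-written query apart from whitespace and comments.

```python
# Olympic Games used above, with the year mentioned in the comments
GAMES = {"Q8558": 2004, "Q8567": 2008, "Q8577": 2012, "Q8613": 2016}

# Template of one subquery; doubled braces are literal braces for str.format
SUBQUERY = """    {{
        SELECT DISTINCT ?athlete ?athletename
        WHERE {{
           ?discipline wdt:P361 wd:{games} .
           ?subdiscipline wdt:P361 ?discipline .
           # athlete participated in any possible discipline at the Olympic Games of {year}
           ?athlete wdt:P1344 ?anydiscipline .
           ?athlete wdt:P106 ?occupation .
           ?athlete wdt:P2416 ?sportcompeted .
           ?athlete wdt:P641 ?sport .
           ?athlete wdt:P27 ?athletecountry .
           ?athletecountry sc:name ?athletecountryname .
           ?athlete sc:name ?athletename .
           FILTER (?occupation=wd:Q13382460 || ?sport=wd:Q40244 || ?sportcompeted=wd:Q40244 ) .
           FILTER (?anydiscipline=wd:{games} || ?anydiscipline=?discipline || ?anydiscipline=?subdiscipline)
        }}
    }}"""


def build_query(games=GAMES, limit=20):
    # join one subquery per Olympic Games; the join keeps athletes present in all of them
    blocks = "\n    .\n".join(
        SUBQUERY.format(games=g, year=y) for g, y in games.items()
    )
    return f"SELECT DISTINCT ?athlete ?athletename\nWHERE {{\n{blocks}\n}}\nLIMIT {limit}\n"


queryString = build_query()
print(queryString)
```

Adding or removing an edition of the Games then only requires changing the `GAMES` dictionary.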


Final query for this task

In [24]:
# aim: get the countries of all the athletes whose 'sport' is 'marathon' or whose 'occupation' is 'marathon runner' and who competed in any event of each of the Olympic Games of 2004, 2008,
#      2012 and 2016, together with the number of such athletes per country. We consider all the events because considering only marathon events would return only the athlete 'Bat-Ochiryn Ser-Od'.
# result: the countries of these athletes, each with its number of athletes (7 countries, one athlete each).

queryString = """
SELECT DISTINCT ?athletecountry ?athletecountryname (COUNT(?athlete) as ?numberathletes)
WHERE {
    {
        SELECT DISTINCT ?athlete ?athletecountry ?athletecountryname
        WHERE {
           ?discipline wdt:P361 wd:Q8558 .

           ?subdiscipline wdt:P361 ?discipline .


           # athlete participated in any possible discipline at the Olympic Games of 2004
           ?athlete wdt:P1344 ?anydiscipline .

           # athlete occupation marathon runner
           ?athlete wdt:P106 ?occupation .
           ?athlete wdt:P2416 ?sportcompeted .
           ?athlete wdt:P641 ?sport .

           # athlete country athletecountry
           ?athlete wdt:P27 ?athletecountry .

           ?athletecountry sc:name ?athletecountryname .

           FILTER (?occupation=wd:Q13382460 || ?sport=wd:Q40244 || ?sportcompeted=wd:Q40244 ) .
           FILTER (?anydiscipline=wd:Q8558 || ?anydiscipline=?discipline || ?anydiscipline=?subdiscipline)

        }
    }
    .
    {
        SELECT DISTINCT ?athlete ?athletecountry ?athletecountryname
        WHERE {
           ?discipline wdt:P361 wd:Q8567 . 

           ?subdiscipline wdt:P361 ?discipline .


           # athlete participated in any possible discipline at the Olympic Games of 2008
           ?athlete wdt:P1344 ?anydiscipline .

           # athlete occupation marathon runner
           ?athlete wdt:P106 ?occupation .
           ?athlete wdt:P2416 ?sportcompeted .
           ?athlete wdt:P641 ?sport .

           # athlete country athletecountry
           ?athlete wdt:P27 ?athletecountry .

           ?athletecountry sc:name ?athletecountryname .

           FILTER (?occupation=wd:Q13382460 || ?sport=wd:Q40244 || ?sportcompeted=wd:Q40244 ) .
           FILTER (?anydiscipline=wd:Q8567 || ?anydiscipline=?discipline || ?anydiscipline=?subdiscipline)

        }
    }
    .
    {
        SELECT DISTINCT ?athlete ?athletecountry ?athletecountryname
        WHERE {
           ?discipline wdt:P361 wd:Q8577 .

           ?subdiscipline wdt:P361 ?discipline .


           # athlete participated in any possible discipline at the Olympic Games of 2012
           ?athlete wdt:P1344 ?anydiscipline .

           # athlete occupation marathon runner
           ?athlete wdt:P106 ?occupation .
           ?athlete wdt:P2416 ?sportcompeted .
           ?athlete wdt:P641 ?sport .

           # athlete country athletecountry
           ?athlete wdt:P27 ?athletecountry .

           ?athletecountry sc:name ?athletecountryname .

           FILTER (?occupation=wd:Q13382460 || ?sport=wd:Q40244 || ?sportcompeted=wd:Q40244 ) .
           FILTER (?anydiscipline=wd:Q8577 || ?anydiscipline=?discipline || ?anydiscipline=?subdiscipline)

        }
    }
    .
    {
        SELECT DISTINCT ?athlete ?athletecountry ?athletecountryname
        WHERE {
           ?discipline wdt:P361 wd:Q8613 .

           ?subdiscipline wdt:P361 ?discipline .


           # athlete participated in any possible discipline at the Olympic Games of 2016
           ?athlete wdt:P1344 ?anydiscipline .

           # athlete occupation marathon runner
           ?athlete wdt:P106 ?occupation .
           ?athlete wdt:P2416 ?sportcompeted .
           ?athlete wdt:P641 ?sport .

           # athlete country athletecountry
           ?athlete wdt:P27 ?athletecountry .

           ?athletecountry sc:name ?athletecountryname .

           FILTER (?occupation=wd:Q13382460 || ?sport=wd:Q40244 || ?sportcompeted=wd:Q40244 ) .
           FILTER (?anydiscipline=wd:Q8613 || ?anydiscipline=?discipline || ?anydiscipline=?subdiscipline)

        }
    }
} GROUP BY ?athletecountry ?athletecountryname
LIMIT 200
"""
print("Results")
x=run_query(queryString)

Results
[('athletecountry', 'http://www.wikidata.org/entity/Q986'), ('athletecountryname', 'Eritrea'), ('numberathletes', '1')]
[('athletecountry', 'http://www.wikidata.org/entity/Q403'), ('athletecountryname', 'Serbia'), ('numberathletes', '1')]
[('athletecountry', 'http://www.wikidata.org/entity/Q711'), ('athletecountryname', 'Mongolia'), ('numberathletes', '1')]
[('athletecountry', 'http://www.wikidata.org/entity/Q262'), ('athletecountryname', 'Algeria'), ('numberathletes', '1')]
[('athletecountry', 'http://www.wikidata.org/entity/Q17'), ('athletecountryname', 'Japan'), ('numberathletes', '1')]
[('athletecountry', 'http://www.wikidata.org/entity/Q145'), ('athletecountryname', 'United Kingdom'), ('numberathletes', '1')]
[('athletecountry', 'http://www.wikidata.org/entity/Q115'), ('athletecountryname', 'Ethiopia'), ('numberathletes', '1')]
7
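
`run_query` returns every row as a list of `(variable, value)` pairs, and all values are strings, including the count. A small sketch of how the rows could be turned into dictionaries and ranked in Python follows. The helper `rows_to_dicts` is hypothetical, and the sample contains three of the rows printed above.

```python
def rows_to_dicts(rows):
    """Turn each list of (variable, value) pairs into a dictionary."""
    return [dict(row) for row in rows]


# three of the rows printed above, in the format returned by run_query
sample = [
    [("athletecountry", "http://www.wikidata.org/entity/Q986"), ("athletecountryname", "Eritrea"), ("numberathletes", "1")],
    [("athletecountry", "http://www.wikidata.org/entity/Q403"), ("athletecountryname", "Serbia"), ("numberathletes", "1")],
    [("athletecountry", "http://www.wikidata.org/entity/Q711"), ("athletecountryname", "Mongolia"), ("numberathletes", "1")],
]

# sort by descending number of athletes, then alphabetically by country name
ranking = sorted(
    rows_to_dicts(sample),
    key=lambda r: (-int(r["numberathletes"]), r["athletecountryname"]),
)
for r in ranking:
    print(r["athletecountryname"], r["numberathletes"])
```

The same ordering could be requested from the endpoint with `ORDER BY DESC(?numberathletes)`.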


## Task 7
For each discipline of long-distance running, find who holds the World Record (the result set must be a list of elements with the discipline IRI and label, and the athlete IRI and label).

In [69]:
# aim: get all the properties of 'long-distance runner'
# result: we found 'field of this occupation'
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q4439155 ?p ?o .
   # get the label
   ?p sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from')]
[('p', 'http://www.wikidata.org/prop/direct/P2671'), ('name', 'Google Knowledge Graph ID')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P373'), ('name', 'Commons category')]
[('p', 'http://www.wikidata.org/prop/direct/P425'), ('name', 'field of this occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P910'), ('name', "topic's main category")]
7


In [70]:
# aim: get the object related to 'long-distance runner' through 'field of this occupation'
# result: we found 'long-distance running'
queryString = """
SELECT DISTINCT ?o ?name
WHERE {
   # bind something
   wd:Q4439155 wdt:P425 ?o .
   # get the label
   ?o sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q917206'), ('name', 'long-distance running')]
1


In [71]:
# aim: get all the properties that have 'long-distance running' as object
# result: we found 'part of'
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   ?s ?p wd:Q917206 .
   # get the label
   ?p sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P156'), ('name', 'followed by')]
[('p', 'http://www.wikidata.org/prop/direct/P2416'), ('name', 'sports discipline competed in')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P301'), ('name', "category's main topic")]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('name', 'part of')]
[('p', 'http://www.wikidata.org/prop/direct/P413'), ('name', 'position played on team / speciality')]
[('p', 'http://www.wikidata.org/prop/direct/P425'), ('name', 'field of this occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P641'), ('name', 'sport')]
[('p', 'http://www.wikidata.org/prop/direct/P921'), ('name', 'main subject')]
11
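
The exploration so far alternates between two query shapes: the properties leaving an entity (`wd:X ?p ?o`) and the properties pointing to it (`?s ?p wd:X`). A minimal sketch of a helper that builds both is shown below; `property_query` is a hypothetical name, not part of the notebook setup.

```python
def property_query(entity_id, direction="out", limit=100):
    """Build the query listing the named properties of an entity.

    direction="out": properties where the entity is the subject
    direction="in":  properties where the entity is the object
    """
    if direction == "out":
        pattern = f"wd:{entity_id} ?p ?o ."
    elif direction == "in":
        pattern = f"?s ?p wd:{entity_id} ."
    else:
        raise ValueError("direction must be 'out' or 'in'")
    return f"""
SELECT DISTINCT ?p ?name
WHERE {{
   {pattern}
   # get the label
   ?p sc:name ?name .
}}
LIMIT {limit}
"""


outgoing = property_query("Q4439155")        # 'long-distance runner'
incoming = property_query("Q917206", "in")   # 'long-distance running'
print(outgoing)
print(incoming)
```

Each string can then be passed to `run_query` in the notebook.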


In [4]:
# aim: get all the subjects related to 'long-distance running' through 'part of'
# result: we found just '5K run', but there should be many more disciplines
queryString = """
SELECT DISTINCT ?s ?name
WHERE {
   # bind something
   ?s wdt:P361 wd:Q917206 .
   # get the label
   ?s sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q9171043'), ('name', '5K run')]
1


In [73]:
# aim: get all the subjects related to 'long-distance running' through 'sports discipline competed in'
# result: we found athletes and not disciplines
queryString = """
SELECT DISTINCT ?s ?name
WHERE {
   # bind something
   ?s wdt:P2416 wd:Q917206 .
   # get the label
   ?s sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q310447'), ('name', 'Ville Ritola')]
[('s', 'http://www.wikidata.org/entity/Q14912256'), ('name', 'Günter Wöller')]
[('s', 'http://www.wikidata.org/entity/Q28074111'), ('name', 'Christel Rogiers')]
[('s', 'http://www.wikidata.org/entity/Q35309646'), ('name', 'Ignas Brasevičius')]
[('s', 'http://www.wikidata.org/entity/Q29963964'), ('name', 'Krishna Bahadur Basnet')]
[('s', 'http://www.wikidata.org/entity/Q29964020'), ('name', 'Hassan Karimou')]
[('s', 'http://www.wikidata.org/entity/Q225913'), ('name', 'Miruts Yifter')]
[('s', 'http://www.wikidata.org/entity/Q274531'), ('name', 'Maura Viceconte')]
[('s', 'http://www.wikidata.org/entity/Q983623'), ('name', 'Mariano Haro')]
[('s', 'http://www.wikidata.org/entity/Q5998'), ('name', 'Vivian Cheruiyot')]
[('s', 'http://www.wikidata.org/entity/Q5936464'), ('name', 'Otto Larsen')]
[('s', 'http://www.wikidata.org/entity/Q11478626'), ('name', 'Shinji Kawashima')]
[('s', 'http://www.wikidata.org/enti

In [74]:
# aim: get all the subjects related to 'long-distance running' through 'subclass of'
# result: we found the disciplines that belong to 'long-distance running'
queryString = """
SELECT DISTINCT ?s ?name
WHERE {
   # discipline subclass_of long-distance_running
   ?s wdt:P279 wd:Q917206 .
   # get the label
   ?s sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q26844379'), ('name', 'one-hour run')]
[('s', 'http://www.wikidata.org/entity/Q2815830'), ('name', '2 miles run')]
[('s', 'http://www.wikidata.org/entity/Q215677'), ('name', 'half marathon')]
[('s', 'http://www.wikidata.org/entity/Q163892'), ('name', '10,000 metres')]
[('s', 'http://www.wikidata.org/entity/Q240500'), ('name', '5000 metres')]
[('s', 'http://www.wikidata.org/entity/Q40244'), ('name', 'marathon')]
[('s', 'http://www.wikidata.org/entity/Q2774730'), ('name', '10K run')]
[('s', 'http://www.wikidata.org/entity/Q500050'), ('name', 'cross country running')]
[('s', 'http://www.wikidata.org/entity/Q26303'), ('name', 'ultramarathon')]
[('s', 'http://www.wikidata.org/entity/Q2164200'), ('name', 'one hour run')]
[('s', 'http://www.wikidata.org/entity/Q19767716'), ('name', '15K run')]
[('s', 'http://www.wikidata.org/entity/Q19827858'), ('name', '20 kilometres')]
[('s', 'http://www.wikidata.org/entity/Q2538895'), ('name', 'Waffenlauf')]
1

In [103]:
# aim: get all the properties that have 'one-hour run' as object, to find something related to records in this discipline
# result: the empty set is found
queryString = """
SELECT DISTINCT ?p 
WHERE {
   # bind something
   ?s ?p wd:Q26844379 .
   # get the label
   ?p sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [76]:
# aim: get all the properties relating 'one-hour run' to other resources to find something related to records in this discipline
# result: we found  'record or record progression'
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q26844379 ?p ?o .
   # get the label
   ?p sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P2869'), ('name', 'record or record progression')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P641'), ('name', 'sport')]
3


In [77]:
# aim: get all the objects related to 'one-hour run' through 'record or record progression'
# result: the empty set is found... We drop the label pattern to check whether the objects exist but simply have no label
queryString = """
SELECT DISTINCT ?o ?name
WHERE {
   # bind something
   wd:Q26844379 wdt:P2869 ?o .
   # get the label
   ?o sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [None]:
# aim: get all the objects related to 'one-hour run' through 'record or record progression', without labels
# result: we found 2 objects, but we don't know what they are, so we must inspect them further
queryString = """
SELECT DISTINCT ?o
WHERE {
   # bind something
   wd:Q26844379 wdt:P2869 ?o .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q16592455')]
[('o', 'http://www.wikidata.org/entity/Q16592458')]
2
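
The same check can be run for several disciplines at once instead of one at a time. The sketch below builds a `VALUES` clause from rows returned earlier by the 'subclass of' query (three of them are copied here) and keeps the label of the object optional, since the objects found above carry no `sc:name`. The helper `qid` and the variable `?oname` are hypothetical, and the query has not been run against the endpoint.

```python
def qid(iri):
    """Extract the Wikidata identifier from an entity IRI."""
    return iri.rsplit("/", 1)[-1]


# three of the disciplines returned by the 'subclass of' query above
rows = [
    [("s", "http://www.wikidata.org/entity/Q215677"), ("name", "half marathon")],
    [("s", "http://www.wikidata.org/entity/Q163892"), ("name", "10,000 metres")],
    [("s", "http://www.wikidata.org/entity/Q40244"), ("name", "marathon")],
]

values = " ".join(f"wd:{qid(dict(row)['s'])}" for row in rows)

queryString = f"""
SELECT DISTINCT ?s ?name ?o ?oname
WHERE {{
   VALUES ?s {{ {values} }}
   ?s wdt:P2869 ?o .
   ?s sc:name ?name .
   # the label of the object may be missing
   OPTIONAL {{ ?o sc:name ?oname . }}
}}
LIMIT 100
"""
print(queryString)
```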


In [79]:
# aim: get all the properties of 'Q16592455'
# result: we found that it is a list of something (probably records?). Maybe there are athletes related to this list, so we try to find out whether that is true
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q16592455 ?p ?o .
   # get the label
   ?p sc:name ?name
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P360'), ('name', 'is a list of')]
2


In [37]:
# aim: get all the subjects related through some property to 'Q16592455' or 'Q16592458'
# result: we found just 'record or record progression', which we had already found before. We decide to look for other paths to get to the records.
queryString = """
SELECT DISTINCT ?s ?sname ?p ?name
WHERE {
   # bind something
    ?s ?p ?o.
   ?p sc:name ?name .
   ?s sc:name ?sname .
   FILTER (?o IN (wd:Q16592455,wd:Q16592458))
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q26844379'), ('sname', 'one-hour run'), ('p', 'http://www.wikidata.org/prop/direct/P2869'), ('name', 'record or record progression')]
1


In [42]:
# aim: from the first query about 'Eliud Kipchoge' we found the property 'record held', which might be useful. We try to see which objects are related to him through it
# result: we found 'Marathon world record progression', which is probably a list of successive records for the marathon discipline! We try to check whether that is true
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   
   wd:Q9124 wdt:P1000 ?obj .
   # get the label
   ?obj sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q929585'), ('name', 'Marathon world record progression')]
1


In [81]:
# aim: get all the properties of 'Marathon world record progression'
# result: it is actually a list. Maybe there are athletes linked to it
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   
   wd:Q929585 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P360'), ('name', 'is a list of')]
[('p', 'http://www.wikidata.org/prop/direct/P641'), ('name', 'sport')]
3


In [82]:
# aim: get all the properties that point to 'Marathon world record progression'
# result: we found 'record held', so we look at the subjects who held these records
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   
   ?s ?p wd:Q929585 .
   # get the label
   ?p sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1000'), ('name', 'record held')]
1
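
The notebook alternates between two query shapes: the properties an entity has (entity as subject) and the properties that point to it (entity as object). A sketch of a builder for both shapes; `property_query` is a hypothetical helper, not part of the original notebook, and the returned string is meant to be passed to `run_query`, which prepends the prefixes.

```python
def property_query(qid, incoming=False, limit=20):
    """Return a SPARQL query listing the labelled properties of an entity.

    incoming=False: properties the entity has      (wd:QID ?p ?o)
    incoming=True:  properties pointing to entity  (?s ?p wd:QID)
    """
    pattern = "?s ?p wd:%s ." % qid if incoming else "wd:%s ?p ?o ." % qid
    return (
        "SELECT DISTINCT ?p ?name\n"
        "WHERE {\n"
        "   " + pattern + "\n"
        "   # get the label\n"
        "   ?p sc:name ?name .\n"
        "}\n"
        "LIMIT %d\n" % limit
    )


outgoing = property_query("Q929585")
incoming = property_query("Q929585", incoming=True)
print(incoming)
```

Keeping the direction explicit in one place makes it harder to confuse "properties of X" with "properties related to X", which the two cells above distinguish only by the position of `wd:Q929585`.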


In [5]:
# aim: get all the subjects related to 'Marathon world record progression' (the previous query returned 'record held' as the only labelled incoming property)
# result: we found the athletes who held a record for the marathon discipline. There are no dates saying when the records were set...
queryString = """
SELECT DISTINCT ?s ?name
WHERE {
   
   ?s ?p wd:Q929585 .
   # get the label
   ?s sc:name ?name.
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q1044585'), ('name', 'Caroline Walker')]
[('s', 'http://www.wikidata.org/entity/Q171500'), ('name', 'Gebrselassie Haile')]
[('s', 'http://www.wikidata.org/entity/Q728634'), ('name', 'Robert de Castella')]
[('s', 'http://www.wikidata.org/entity/Q9124'), ('name', 'Eliud Kipchoge')]
[('s', 'http://www.wikidata.org/entity/Q133667'), ('name', 'Sohn Kee-chung')]
[('s', 'http://www.wikidata.org/entity/Q7635174'), ('name', 'Suh Yun-bok')]
[('s', 'http://www.wikidata.org/entity/Q1705'), ('name', 'Wilson Kipsang Kiprotich')]
[('s', 'http://www.wikidata.org/entity/Q518358'), ('name', 'Beth Bonner')]
[('s', 'http://www.wikidata.org/entity/Q999436'), ('name', 'Morio Shigematsu')]
[('s', 'http://www.wikidata.org/entity/Q815310'), ('name', 'Belayneh Densamo')]
[('s', 'http://www.wikidata.org/entity/Q454958'), ('name', 'Carlos Lopes')]
[('s', 'http://www.wikidata.org/entity/Q456857'), ('name', 'Johnny Hayes')]
[('s', 'http://www.wikidata.org/entity/Q47108

In [86]:
# aim: get all the objects related to 'Q16592455', together with their labels
# result: we found that it is a 'Wikimedia list article' and that it is a list of 'women's world record'
queryString = """
SELECT DISTINCT ?o ?name
WHERE {
   # bind something
   wd:Q16592455 ?p ?o .
   # get the label
   ?o sc:name ?name
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q13406463'), ('name', 'Wikimedia list article')]
[('o', 'http://www.wikidata.org/entity/Q24033838'), ('name', "women's world record")]
2


In [87]:
# aim: get all the objects that are superclasses of 'women's world record'
# result: we found 'world record'
queryString = """
SELECT DISTINCT ?o ?name
WHERE {
   # bind something
   wd:Q24033838 wdt:P279 ?o .
   # get the label
   ?o sc:name ?name
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q688615'), ('name', 'world record')]
1


In [88]:
# aim: get all the subjects that are 'instance of' 'world record'
# result: we found some world records but not the ones we are looking for
queryString = """
SELECT DISTINCT ?s ?name
WHERE {
   # bind something
   ?s wdt:P31 wd:Q688615 .
   # get the label
   ?s sc:name ?name
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q1139050'), ('name', "Men's 400 metres hurdles world record progression")]
[('s', 'http://www.wikidata.org/entity/Q1139195'), ('name', 'decathlon world record progression')]
[('s', 'http://www.wikidata.org/entity/Q83573'), ('name', "Women's heptathlon world record progression")]
[('s', 'http://www.wikidata.org/entity/Q1053563'), ('name', "Men's 4 x 400 metres relay world record progression")]
[('s', 'http://www.wikidata.org/entity/Q1140490'), ('name', "Men's 400 metres world record progression")]
[('s', 'http://www.wikidata.org/entity/Q30748015'), ('name', "men's 100 metres world record")]
[('s', 'http://www.wikidata.org/entity/Q70466'), ('name', "Men's discus throw world record progression")]
[('s', 'http://www.wikidata.org/entity/Q1050464'), ('name', '1500 metres world record progression')]
[('s', 'http://www.wikidata.org/entity/Q1364158'), ('name', '800 metres world record progression')]
[('s', 'http://www.wikidata.org/entity/Q95989670'

In [91]:
# aim: get all the objects related to 'Marathon world record progression'
# result: we didn't find anything useful... 'Marathon world record progression' is associated with 'athletics' and not with 'marathon'...
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   
   wd:Q929585 ?p ?obj .
   # get the label
   ?obj sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q1241356'), ('name', 'record')]
[('obj', 'http://www.wikidata.org/entity/Q13406463'), ('name', 'Wikimedia list article')]
[('obj', 'http://www.wikidata.org/entity/Q542'), ('name', 'athletics')]
[('obj', 'http://www.wikidata.org/entity/Q208410'), ('name', 'List of world records in athletics')]
4


In [2]:
# aim: get all the properties that point to 'Men's 400 metres hurdles world record progression'
# result: we didn't find anything new: only 'record or record progression' and 'record held'
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   
   ?s ?p wd:Q1139050 .
   # get the label
   ?p sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P2869'), ('name', 'record or record progression')]
[('p', 'http://www.wikidata.org/prop/direct/P1000'), ('name', 'record held')]
2


In [46]:
# aim: get all the objects related to 'marathon' through 'record held' or 'record or record progression'
# result: we found the 2 athletes linked to the discipline as record holders, but not the record progression
queryString = """
SELECT DISTINCT ?o ?name
WHERE {
   # objects linked to 'marathon' through 'record held' or 'record or record progression'
   wd:Q40244 wdt:P1000|wdt:P2869 ?o .
   # get the label
   ?o sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q9124'), ('name', 'Eliud Kipchoge')]
[('o', 'http://www.wikidata.org/entity/Q63223'), ('name', 'Dennis Kipruto Kimetto')]
2


In [5]:
# aim: get all the properties of 'Eliud Kipchoge'
# result: we did not find any date related to his 'record held'
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # all the outgoing properties of 'Eliud Kipchoge'
   wd:Q9124 ?p ?o .
   # get the label
   ?p sc:name ?name .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1000'), ('name', 'record held')]
[('p', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P1146'), ('name', 'World Athletics athlete ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1344'), ('name', 'participant in')]
[('p', 'http://www.wikidata.org/prop/direct/P1412'), ('name', 'languages spoken, written or signed')]
[('p', 'http://www.wikidata.org/prop/direct/P1447'), ('name', 'Sports-Reference.com Olympic athlete ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1532'), ('name', 'country for sport')]
[('p', 'http://www.wikidata.org/prop/direct/P1559'), ('name', 'name in native language')]
[('p', 'http://www.wikidata.org/prop/direct/P172'), ('name', 'ethnic group')]
[('p', 'http://www.wikidata.org/prop/direct/P1741'), ('name', 'GTAA ID')]
[('p', 'http://www.wikidata.org/prop/direct/P18'), ('name', 'image')]
[('p', 'http://www.wikidata.org/prop/direct/P19'), ('name', 

In [35]:
# aim: get all the progressions that could match the subclasses of 'long-distance running' through regexes
# result: we found some of them, we try to increase the "range" of the regex
queryString = """
SELECT DISTINCT ?progression ?progressionname
WHERE {
   
   ?s wdt:P1000|wdt:P2869 ?progression .
   # get the label
   ?s sc:name ?name.
   ?progression sc:name ?progressionname
   FILTER REGEX(?progressionname,"marathon|10000|10K|5K|5000|one hour|one-hour|cross country","i").
   FILTER REGEX(?progressionname,"progression","i").
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('progression', 'http://www.wikidata.org/entity/Q17069829'), ('progressionname', '5000 metres world record progression')]
[('progression', 'http://www.wikidata.org/entity/Q3922786'), ('progressionname', "Men's 5000 metres world record progression")]
[('progression', 'http://www.wikidata.org/entity/Q3922787'), ('progressionname', "Women's 5000 metres world record progression")]
[('progression', 'http://www.wikidata.org/entity/Q929585'), ('progressionname', 'Marathon world record progression')]
[('progression', 'http://www.wikidata.org/entity/Q875819'), ('progressionname', 'World record progression 5000 m speed skating women')]
[('progression', 'http://www.wikidata.org/entity/Q1054893'), ('progressionname', 'World record progression 5000 m speed skating men')]
[('progression', 'http://www.wikidata.org/entity/Q1054902'), ('progressionname', 'World record progression 10000 m speed skating men')]
7
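
The regex matches on the label only, so progressions of other sports that share a distance slip through (the three speed skating rows above). A rough sketch that reproduces the two filters offline with Python's `re` module and adds a third, negative filter. The labels are copied from the output above; the `speed skating` exclusion is an assumption for illustration and not part of the original query, and Python's regex dialect only approximates SPARQL `REGEX` (for these plain alternations the behaviour should be the same).

```python
import re

# Labels copied from the cell output above
labels = [
    "5000 metres world record progression",
    "Men's 5000 metres world record progression",
    "Women's 5000 metres world record progression",
    "Marathon world record progression",
    "World record progression 5000 m speed skating women",
    "World record progression 5000 m speed skating men",
    "World record progression 10000 m speed skating men",
]

# Same two patterns as the FILTER REGEX(..., "i") lines of the query
distance = re.compile(r"marathon|10000|10K|5K|5000|one hour|one-hour|cross country", re.IGNORECASE)
progression = re.compile(r"progression", re.IGNORECASE)
# Hypothetical extra filter: drop labels that name another sport
other_sport = re.compile(r"speed skating", re.IGNORECASE)

matched = [label for label in labels if distance.search(label) and progression.search(label)]
running_only = [label for label in matched if not other_sport.search(label)]

for label in running_only:
    print(label)
```

In SPARQL the same idea would be a further `FILTER (!REGEX(?progressionname, "speed skating", "i"))`, although filtering on the 'sport' property of the progression would be more robust than filtering on labels.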


In [33]:
# aim: same as the previous query, with '10,000' and '2 miles' added to the regex
# result: '10,000 metres world record progression' now appears; we try to increase the "range" of the regex further
queryString = """
SELECT DISTINCT ?progression ?progressionname
WHERE {
   
   ?s wdt:P1000|wdt:P2869 ?progression .
   # get the label
   ?s sc:name ?name.
   ?progression sc:name ?progressionname
   FILTER REGEX(?progressionname,"marathon|10000|10,000|10K|5K|5000|one hour|one-hour|cross country|2 miles","i").
   FILTER REGEX(?progressionname,"progression","i").
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('progression', 'http://www.wikidata.org/entity/Q16714855'), ('progressionname', '10,000 metres world record progression')]
[('progression', 'http://www.wikidata.org/entity/Q17069829'), ('progressionname', '5000 metres world record progression')]
[('progression', 'http://www.wikidata.org/entity/Q3922786'), ('progressionname', "Men's 5000 metres world record progression")]
[('progression', 'http://www.wikidata.org/entity/Q3922787'), ('progressionname', "Women's 5000 metres world record progression")]
[('progression', 'http://www.wikidata.org/entity/Q929585'), ('progressionname', 'Marathon world record progression')]
[('progression', 'http://www.wikidata.org/entity/Q875819'), ('progressionname', 'World record progression 5000 m speed skating women')]
[('progression', 'http://www.wikidata.org/entity/Q1054893'), ('progressionname', 'World record progression 5000 m speed skating men')]
[('progression', 'http://www.wikidata.org/entity/Q1054902'), ('progressionname', 'World record pr

In [40]:
# aim: get all the progressions that could match the subclasses of 'long-distance running' through regexes, with a much wider regex
# result: among the relevant progressions we found just those of 5000 metres, 10,000 metres and marathon. We have to match the progressions with the sports manually
#         and finally return all the athletes that held a record for that sport as the final query
queryString = """
SELECT DISTINCT ?progression ?progressionname
WHERE {
   
   ?s wdt:P1000|wdt:P2869 ?progression .
   # get the label
   ?s sc:name ?name.
   ?progression sc:name ?progressionname .
   FILTER REGEX(?progressionname,"marathon|10000|10,000|10K|5K|5000|one hour|one-hour|cross country|2 miles|20 kilometers|20,000|20000|15K|15,000|Waffenlauf|2|1","i").
   FILTER REGEX(?progressionname,"progression","i").
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('progression', 'http://www.wikidata.org/entity/Q1135532'), ('progressionname', 'World record progression 4 × 100 metres freestyle relay')]
[('progression', 'http://www.wikidata.org/entity/Q83583'), ('progressionname', "Women's 20 kilometres walk world record progression")]
[('progression', 'http://www.wikidata.org/entity/Q1188561'), ('progressionname', 'World record progression 4 × 100 metres medley relay')]
[('progression', 'http://www.wikidata.org/entity/Q1050464'), ('progressionname', '1500 metres world record progression')]
[('progression', 'http://www.wikidata.org/entity/Q1066353'), ('progressionname', "Men's 100 metres world record progression")]
[('progression', 'http://www.wikidata.org/entity/Q1316690'), ('progressionname', "Women's 100 metres world record progression")]
[('progression', 'http://www.wikidata.org/entity/Q1187829'), ('progressionname', "men's 200 metres world record progression")]
[('progression', 'http://www.wikidata.org/entity/Q1050432'), ('progressio
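
The widened regex returns many unrelated progressions because the single-digit alternatives `2` and `1` match any label containing those digits. A small sketch that shows which alternative is responsible for each match; it uses a shortened list of alternatives and labels copied from the output above, and `matching_alternatives` is a hypothetical helper, not part of the original notebook.

```python
import re

# A shortened list of the alternatives used in the query above
alternatives = ["marathon", "10,000", "5000", "20 kilometers", "2", "1"]


def matching_alternatives(label):
    """Return the alternatives that occur in `label`, ignoring case."""
    return [alt for alt in alternatives if re.search(re.escape(alt), label, re.IGNORECASE)]


# Labels copied from the notebook outputs
labels = [
    "Marathon world record progression",
    "World record progression 4 × 100 metres freestyle relay",
    "Women's 20 kilometres walk world record progression",
    "1500 metres world record progression",
]
for label in labels:
    print(label, "->", matching_alternatives(label))
```

Note that the 20 kilometres walk is matched by `2` and not by `20 kilometers`, because the label uses the spelling "kilometres".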

In [5]:
# aim: get all the athletes who held a record in the marathon discipline
# result: the athletes who held a record in the marathon discipline
queryString = """
SELECT DISTINCT (wd:Q40244 as ?discipline) ?disciplinename ?athlete ?athletename
WHERE {
   wd:Q40244 sc:name ?disciplinename .
   ?athlete wdt:P1000 wd:Q929585 .
   # get the label
   ?athlete sc:name ?athletename.
}
LIMIT 5
"""

print("Results")
x=run_query(queryString)

Results
[('discipline', 'http://www.wikidata.org/entity/Q40244'), ('disciplinename', 'marathon'), ('athlete', 'http://www.wikidata.org/entity/Q1044585'), ('athletename', 'Caroline Walker')]
[('discipline', 'http://www.wikidata.org/entity/Q40244'), ('disciplinename', 'marathon'), ('athlete', 'http://www.wikidata.org/entity/Q171500'), ('athletename', 'Gebrselassie Haile')]
[('discipline', 'http://www.wikidata.org/entity/Q40244'), ('disciplinename', 'marathon'), ('athlete', 'http://www.wikidata.org/entity/Q728634'), ('athletename', 'Robert de Castella')]
[('discipline', 'http://www.wikidata.org/entity/Q40244'), ('disciplinename', 'marathon'), ('athlete', 'http://www.wikidata.org/entity/Q9124'), ('athletename', 'Eliud Kipchoge')]
[('discipline', 'http://www.wikidata.org/entity/Q40244'), ('disciplinename', 'marathon'), ('athlete', 'http://www.wikidata.org/entity/Q133667'), ('athletename', 'Sohn Kee-chung')]
5
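
Once a query returns `discipline`/`athlete` pairs, the rows can be grouped per discipline for easier reading. A minimal sketch, assuming the row shape produced by `run_query`; `group_by_discipline` is a hypothetical helper, not part of the original notebook, and the sample rows are the five rows printed above.

```python
from collections import defaultdict


def group_by_discipline(rows):
    """Map each discipline name to the list of athlete names, in result order."""
    grouped = defaultdict(list)
    for row in rows:
        record = dict(row)  # (variable, value) pairs -> dict
        grouped[record['disciplinename']].append(record['athletename'])
    return dict(grouped)


def make_row(athlete_qid, athlete_name):
    """Rebuild a result row of the marathon query above."""
    return [
        ('discipline', 'http://www.wikidata.org/entity/Q40244'),
        ('disciplinename', 'marathon'),
        ('athlete', 'http://www.wikidata.org/entity/' + athlete_qid),
        ('athletename', athlete_name),
    ]


# The five rows printed above
rows = [
    make_row('Q1044585', 'Caroline Walker'),
    make_row('Q171500', 'Gebrselassie Haile'),
    make_row('Q728634', 'Robert de Castella'),
    make_row('Q9124', 'Eliud Kipchoge'),
    make_row('Q133667', 'Sohn Kee-chung'),
]
grouped = group_by_discipline(rows)
print(grouped)
```

With a single discipline the grouping is trivial; it becomes useful when the same helper is applied to results that span several disciplines.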


In [4]:
# aim: get all the athletes who held a record in the marathon, 10,000 metres and 5000 metres disciplines
# result: only the athletes who held a record in the marathon discipline are returned. The other records must be found in other ways...
queryString = """
SELECT DISTINCT ?discipline ?disciplinename ?athlete ?athletename
WHERE {
    {
        # athletes who held a record for the marathon
        SELECT DISTINCT (wd:Q40244 as ?discipline) ?disciplinename ?athlete ?athletename WHERE
        {
           wd:Q40244 sc:name ?disciplinename .
           ?athlete wdt:P1000 wd:Q929585 .
           # get the label
           ?athlete sc:name ?athletename.
        }
    }
    UNION
    {
        # athletes who held a record for the 10000 metres 
        SELECT DISTINCT (wd:Q163892 as ?discipline) ?disciplinename ?athlete ?athletename WHERE
        {
           wd:Q163892 sc:name ?disciplinename .
           ?athlete wdt:P1000 wd:Q16714855 .
           # get the label
           ?athlete sc:name ?athletename.
        }
    
    }
    UNION
    {
        # athletes who held a record for the 5000 metres 
        SELECT DISTINCT (wd:Q240500 as ?discipline) ?disciplinename ?athlete ?athletename WHERE
        {
           wd:Q240500 sc:name ?disciplinename .
           ?athlete wdt:P1000 wd:Q17069829 .
           # get the label
           ?athlete sc:name ?athletename.
        }
    
    }
}
LIMIT 150
"""

print("Results")
x=run_query(queryString)

Results
[('discipline', 'http://www.wikidata.org/entity/Q40244'), ('disciplinename', 'marathon'), ('athlete', 'http://www.wikidata.org/entity/Q1044585'), ('athletename', 'Caroline Walker')]
[('discipline', 'http://www.wikidata.org/entity/Q40244'), ('disciplinename', 'marathon'), ('athlete', 'http://www.wikidata.org/entity/Q171500'), ('athletename', 'Gebrselassie Haile')]
[('discipline', 'http://www.wikidata.org/entity/Q40244'), ('disciplinename', 'marathon'), ('athlete', 'http://www.wikidata.org/entity/Q728634'), ('athletename', 'Robert de Castella')]
[('discipline', 'http://www.wikidata.org/entity/Q40244'), ('disciplinename', 'marathon'), ('athlete', 'http://www.wikidata.org/entity/Q9124'), ('athletename', 'Eliud Kipchoge')]
[('discipline', 'http://www.wikidata.org/entity/Q40244'), ('disciplinename', 'marathon'), ('athlete', 'http://www.wikidata.org/entity/Q133667'), ('athletename', 'Sohn Kee-chung')]
[('discipline', 'http://www.wikidata.org/entity/Q40244'), ('disciplinename', 'marath
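
The three UNION branches above differ only in the discipline and the progression identifiers, so the query text can be generated from a list of pairs. A sketch under that assumption; `union_block` and `build_union_query` are hypothetical helpers, not part of the original notebook, and the identifier pairs are the ones used in the query above.

```python
def union_block(discipline, record):
    """One UNION branch: athletes whose 'record held' (P1000) is `record`."""
    return (
        "    {\n"
        "        SELECT DISTINCT (wd:%s AS ?discipline) ?disciplinename ?athlete ?athletename WHERE\n"
        "        {\n"
        "           wd:%s sc:name ?disciplinename .\n"
        "           ?athlete wdt:P1000 wd:%s .\n"
        "           ?athlete sc:name ?athletename .\n"
        "        }\n"
        "    }"
    ) % (discipline, discipline, record)


def build_union_query(pairs, limit=150):
    """Join one branch per (discipline, record) pair with UNION."""
    blocks = [union_block(discipline, record) for discipline, record in pairs]
    return (
        "SELECT DISTINCT ?discipline ?disciplinename ?athlete ?athletename\n"
        "WHERE {\n"
        + "\n    UNION\n".join(blocks)
        + "\n}\nLIMIT %d\n" % limit
    )


# (discipline, progression) pairs taken from the query above
pairs = [
    ("Q40244", "Q929585"),     # marathon
    ("Q163892", "Q16714855"),  # 10,000 metres
    ("Q240500", "Q17069829"),  # 5000 metres
]
generated_query = build_union_query(pairs)
print(generated_query)
```

The generated string can be passed to `run_query` like any hand-written `queryString`. It only covers the "athlete holds progression" shape, so disciplines whose records are modelled differently would still need their own branch.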

In [4]:
# aim: get all the properties that point to '10,000 metres world record progression'
# result: only 'record or record progression' is found
queryString = """
SELECT DISTINCT ?p ?pname
WHERE {
   ?s ?p wd:Q16714855 .
   # get the label
   ?p sc:name ?pname.
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P2869'), ('pname', 'record or record progression')]
1


In [5]:
# aim: get all the subjects related to '10,000 metres world record progression', hoping to find the athletes who held a record
# result: no athletes have been found, only the discipline '10,000 metres'... we need to check all the properties of '10,000 metres'
queryString = """
SELECT DISTINCT ?s ?sname
WHERE {
   ?s ?p wd:Q16714855 .
   # get the label
   ?s sc:name ?sname.
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q163892'), ('sname', '10,000 metres')]
1


In [6]:
# aim: get all the properties of '10,000 metres'
# result: 'statistical leader' has been found
queryString = """
SELECT DISTINCT ?p ?pname
WHERE {
   wd:Q163892 ?p ?o .
   # get the label
   ?p sc:name ?pname.
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P2869'), ('pname', 'record or record progression')]
[('p', 'http://www.wikidata.org/prop/direct/P1417'), ('pname', 'Encyclopædia Britannica Online ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2581'), ('pname', 'BabelNet ID')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('pname', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('pname', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P641'), ('pname', 'sport')]
[('p', 'http://www.wikidata.org/prop/direct/P646'), ('pname', 'Freebase ID')]
[('p', 'http://www.wikidata.org/prop/direct/P910'), ('pname', "topic's main category")]
[('p', 'http://www.wikidata.org/prop/direct/P3279'), ('pname', 'statistical leader')]
9


In [7]:
# aim: get all the 'statistical leader's related to '10,000 metres'
# result: 'Kenenisa Bekele Beyecha' has been found
queryString = """
SELECT DISTINCT ?o ?oname
WHERE {
   wd:Q163892 wdt:P3279 ?o .
   # get the label
   ?o sc:name ?oname.
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q9119'), ('oname', 'Kenenisa Bekele Beyecha')]
1


In [8]:
# aim: get all the properties of 'Kenenisa Bekele Beyecha'
# result: 'record held' has been found
queryString = """
SELECT DISTINCT ?p ?pname
WHERE {
   wd:Q9119 ?p ?o .
   # get the label
   ?p sc:name ?pname.
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1000'), ('pname', 'record held')]
[('p', 'http://www.wikidata.org/prop/direct/P106'), ('pname', 'occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P1146'), ('pname', 'World Athletics athlete ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1285'), ('pname', 'Munzinger Sport number')]
[('p', 'http://www.wikidata.org/prop/direct/P1296'), ('pname', 'Gran Enciclopèdia Catalana ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('pname', 'described by source')]
[('p', 'http://www.wikidata.org/prop/direct/P1344'), ('pname', 'participant in')]
[('p', 'http://www.wikidata.org/prop/direct/P1417'), ('pname', 'Encyclopædia Britannica Online ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1447'), ('pname', 'Sports-Reference.com Olympic athlete ID')]
[('p', 'http://www.wikidata.org/prop/direct/P166'), ('pname', 'award received')]
[('p', 'http://www.wikidata.org/prop/direct/P1741'), ('pname', 'GTAA ID')]
[('p', 'http://www.wik

In [9]:
# aim: get all the objects related to 'Kenenisa Bekele Beyecha' through 'record held'
# result: we found that for '5000 metres' and '10,000 metres' the records are stored differently from the 'marathon' ones: 'record held' points to the
#         discipline or to a world record item, not to a progression. We need to check whether there is a statistical leader for 'marathon' as well
queryString = """
SELECT DISTINCT ?o ?oname
WHERE {
   wd:Q9119 wdt:P1000 ?o .
   # get the label
   ?o sc:name ?oname.
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q240500'), ('oname', '5000 metres')]
[('o', 'http://www.wikidata.org/entity/Q52159822'), ('oname', "men's 10,000 meters world record in track and field")]
2


In [10]:
# aim: get all the properties of 'marathon'
# result: 'statistical leader' has not been found, but we found 'record held'. If it points to only one athlete, he/she may be the current record holder
queryString = """
SELECT DISTINCT ?p ?pname
WHERE {
   wd:Q40244 ?p ?o .
   # get the label
   ?p sc:name ?pname.
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1000'), ('pname', 'record held')]
[('p', 'http://www.wikidata.org/prop/direct/P1225'), ('pname', 'U.S. National Archives Identifier')]
[('p', 'http://www.wikidata.org/prop/direct/P1245'), ('pname', 'OmegaWiki Defined Meaning')]
[('p', 'http://www.wikidata.org/prop/direct/P1296'), ('pname', 'Gran Enciclopèdia Catalana ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('pname', 'described by source')]
[('p', 'http://www.wikidata.org/prop/direct/P138'), ('pname', 'named after')]
[('p', 'http://www.wikidata.org/prop/direct/P1417'), ('pname', 'Encyclopædia Britannica Online ID')]
[('p', 'http://www.wikidata.org/prop/direct/P18'), ('pname', 'image')]
[('p', 'http://www.wikidata.org/prop/direct/P227'), ('pname', 'GND ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2347'), ('pname', 'YSO ID')]
[('p', 'http://www.wikidata.org/prop/direct/P244'), ('pname', 'Library of Congress authority ID')]
[('p', 'http://www.wikidata.org/prop/di

In [11]:
# aim: get all the objects related to 'marathon' through 'record held'
# result: we found 2 athletes holding a record
queryString = """
SELECT DISTINCT ?o ?oname
WHERE {
   wd:Q40244 wdt:P1000 ?o .
   # get the label
   ?o sc:name ?oname.
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q9124'), ('oname', 'Eliud Kipchoge')]
[('o', 'http://www.wikidata.org/entity/Q63223'), ('oname', 'Dennis Kipruto Kimetto')]
2


In [20]:
# aim: get all the objects of 'record held' whose name could match the subclasses of 'long-distance running' through regexes
# result: the objects of 'record held' include disciplines (e.g. '10,000 metres', '5000 metres') and record items (e.g. "men's 10,000 meters world record in track and field"),
#         plus unrelated distances such as '100 metres' that the wide regex also matches
queryString = """
SELECT DISTINCT ?record ?recordname
WHERE {
   
   ?s wdt:P1000 ?record .
   # get the label
   ?record sc:name ?recordname.
   FILTER REGEX(?recordname,"marathon|10000|10,000|10K|5K|5000|one hour|one-hour|cross country|2 miles|20 kilometers|20,000|20000|15K|15,000|Waffenlauf|2|1|kilometers|km","i").
}
LIMIT 300
"""

print("Results")
x=run_query(queryString)

Results
[('record', 'http://www.wikidata.org/entity/Q164761'), ('recordname', '100 metres')]
[('record', 'http://www.wikidata.org/entity/Q211155'), ('recordname', '200 metres')]
[('record', 'http://www.wikidata.org/entity/Q1066353'), ('recordname', "Men's 100 metres world record progression")]
[('record', 'http://www.wikidata.org/entity/Q1316690'), ('recordname', "Women's 100 metres world record progression")]
[('record', 'http://www.wikidata.org/entity/Q1316274'), ('recordname', 'World record progression 1500 metres freestyle')]
[('record', 'http://www.wikidata.org/entity/Q163892'), ('recordname', '10,000 metres')]
[('record', 'http://www.wikidata.org/entity/Q191691'), ('recordname', '1500 metres')]
[('record', 'http://www.wikidata.org/entity/Q240500'), ('recordname', '5000 metres')]
[('record', 'http://www.wikidata.org/entity/Q52159822'), ('recordname', "men's 10,000 meters world record in track and field")]
[('record', 'http://www.wikidata.org/entity/Q9171043'), ('recordname', '5K r

In [15]:
# aim: get all the subjects that have a 'record held' property and whose name could match the subclasses of 'long-distance running' through regexes
# result: besides '5000 metres' and 'marathon' we get many unrelated items (e.g. air crashes), because the alternatives '2' and '1' match any name containing those digits
queryString = """
SELECT DISTINCT ?record ?recordname
WHERE {
   
   ?record wdt:P1000 ?o .
   # get the label
   ?record sc:name ?recordname.
   FILTER REGEX(?recordname,"marathon|10000|10,000|10K|5K|5000|one hour|one-hour|cross country|2 miles|20 kilometers|20,000|20000|15K|15,000|Waffenlauf|2|1","i").
}
LIMIT 300
"""

print("Results")
x=run_query(queryString)

Results
[('record', 'http://www.wikidata.org/entity/Q19980064'), ('recordname', 'Shindand An-12 crash')]
[('record', 'http://www.wikidata.org/entity/Q240500'), ('recordname', '5000 metres')]
[('record', 'http://www.wikidata.org/entity/Q40244'), ('recordname', 'marathon')]
[('record', 'http://www.wikidata.org/entity/Q4581223'), ('recordname', '1983 TAAG Angola Airlines 737 Boeing crash')]
[('record', 'http://www.wikidata.org/entity/Q3046072'), ('recordname', 'Air Algérie Flight 6289')]
[('record', 'http://www.wikidata.org/entity/Q26172'), ('recordname', 'Junkers F-13')]
[('record', 'http://www.wikidata.org/entity/Q19983192'), ('recordname', 'Air France D.342 crash')]
[('record', 'http://www.wikidata.org/entity/Q949020'), ('recordname', 'USAir Flight 1493')]
[('record', 'http://www.wikidata.org/entity/Q19993119'), ('recordname', 'Cuangar An-26 shootdown')]
[('record', 'http://www.wikidata.org/entity/Q7184944'), ('recordname', 'Philippine Airlines Flight 143')]
[('record', 'http://www.wik

Final query for this task

In [3]:
# aim: get all the athletes who held a record for the disciplines 10,000 metres, marathon, 5000 metres, one hour run and 5K run
#      (5K run is probably the same as 5000 metres run)
# result: all the athletes who held a record for the disciplines 10,000 metres, marathon, 5000 metres, one hour run and 5K run
#      (5K run is probably the same as 5000 metres run). We couldn't find dates, so we can't establish who currently holds the record for
#      any of the given disciplines

queryString = """
SELECT DISTINCT ?discipline ?disciplinename ?athlete ?athletename
WHERE {
    {
        # athletes who held a record for the marathon world record progression
        SELECT DISTINCT (wd:Q40244 as ?discipline) ?disciplinename ?athlete ?athletename WHERE
        {
           wd:Q40244 sc:name ?disciplinename .
           ?athlete wdt:P1000 wd:Q929585 .
           # get the label
           ?athlete sc:name ?athletename.
        }
    }
    UNION
    {
        # get all the athletes associated to 'marathon' through 'record held'
        SELECT DISTINCT (wd:Q40244 as ?discipline) ?disciplinename ?athlete ?athletename WHERE
        {
           wd:Q40244 sc:name ?disciplinename ;
                wdt:P1000 ?athlete .
           # get the label
           ?athlete sc:name ?athletename.
        }
    }
    UNION
    {
        # athletes who held a record for the 10000 metres 
        SELECT DISTINCT (wd:Q163892 as ?discipline) ?disciplinename ?athlete ?athletename WHERE
        {
           wd:Q163892 sc:name ?disciplinename .
           ?athlete wdt:P1000 ?recordheld .
           # get the label
           ?athlete sc:name ?athletename.
           FILTER(?recordheld IN (wd:Q52159822,wd:Q16714855,wd:Q163892))
        }
    
    }
    UNION
    {
        # athletes who held a record for the 5000 metres 
        SELECT DISTINCT (wd:Q240500 as ?discipline) ?disciplinename ?athlete ?athletename WHERE
        {
           wd:Q240500 sc:name ?disciplinename .
           ?athlete wdt:P1000 wd:Q240500 .
           # get the label
           ?athlete sc:name ?athletename.
        }
    
    }
    UNION
    {
        # athletes associated to '5000 metres' through 'record held' (reverse direction)
        SELECT DISTINCT (wd:Q240500 as ?discipline) ?disciplinename ?athlete ?athletename WHERE
        {
           wd:Q240500 sc:name ?disciplinename .
           wd:Q240500 wdt:P1000 ?athlete .
           # get the label
           ?athlete sc:name ?athletename.
        }
    
    }
    UNION
    {
        # athletes who held a record for the 5K run
        SELECT DISTINCT (wd:Q9171043 as ?discipline) ?disciplinename ?athlete ?athletename WHERE
        {
           wd:Q9171043 sc:name ?disciplinename .
           ?athlete wdt:P1000 wd:Q9171043 .
           # get the label
           ?athlete sc:name ?athletename.
        }
    }
    UNION
    {
        # athletes who held a record for the one hour run
        SELECT DISTINCT (wd:Q2164200 as ?discipline) ?disciplinename ?athlete ?athletename WHERE
        {
           wd:Q2164200 sc:name ?disciplinename .
           ?athlete wdt:P1000 wd:Q2164200 .
           # get the label
           ?athlete sc:name ?athletename.
        }
    }
}
LIMIT 150
"""

print("Results")
x=run_query(queryString)

Results
[('discipline', 'http://www.wikidata.org/entity/Q40244'), ('disciplinename', 'marathon'), ('athlete', 'http://www.wikidata.org/entity/Q1044585'), ('athletename', 'Caroline Walker')]
[('discipline', 'http://www.wikidata.org/entity/Q40244'), ('disciplinename', 'marathon'), ('athlete', 'http://www.wikidata.org/entity/Q171500'), ('athletename', 'Gebrselassie Haile')]
[('discipline', 'http://www.wikidata.org/entity/Q40244'), ('disciplinename', 'marathon'), ('athlete', 'http://www.wikidata.org/entity/Q728634'), ('athletename', 'Robert de Castella')]
[('discipline', 'http://www.wikidata.org/entity/Q40244'), ('disciplinename', 'marathon'), ('athlete', 'http://www.wikidata.org/entity/Q9124'), ('athletename', 'Eliud Kipchoge')]
[('discipline', 'http://www.wikidata.org/entity/Q40244'), ('disciplinename', 'marathon'), ('athlete', 'http://www.wikidata.org/entity/Q133667'), ('athletename', 'Sohn Kee-chung')]
[('discipline', 'http://www.wikidata.org/entity/Q40244'), ('disciplinename', 'marath