# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [WikiData](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload starts with a usually vague, open-ended question, and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that is present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.
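This name lookup is typically attached to an exploration pattern such as `<entity> ?p ?o`. A minimal sketch of a query builder for that pattern follows. `describe_entity_query` is a hypothetical helper (it is not part of the notebook setup), and it assumes that the `wd:` prefix is declared before the query is sent, as the setup cell of this notebook does.

```python
def describe_entity_query(entity, limit=30):
    """Build a query listing predicates and objects of `entity` with optional names.

    `entity` is a prefixed name such as "wd:Q688615".
    Hypothetical helper: it only assembles the query text.
    """
    return f"""
SELECT DISTINCT ?pName ?oName ?p ?o
WHERE {{
    {entity} ?p ?o .
    OPTIONAL {{ ?p <http://schema.org/name> ?pName . }}
    OPTIONAL {{ ?o <http://schema.org/name> ?oName . }}
}}
LIMIT {limit}
"""


query = describe_entity_query("wd:Q688615", limit=10)
print(query)
```

The name patterns are wrapped in `OPTIONAL` so that predicates or objects without a name are still listed.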
    
    

In [1]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-ea4433b172-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# SELECT and CONSTRUCT queries
def run_query(queryString):
    """Run a query, print each result row, and return the number of rows (0 if empty)."""
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try:
        results = sparql.query()
        json_results = results.convert()
        bindings_list = json_results['results']['bindings']
        if len(bindings_list) == 0:
            print("Empty")
            return 0

        for bindings in bindings_list:
            print([(var, value['value']) for var, value in bindings.items()])

        return len(bindings_list)

    except Exception as e:
        # On failure the error is printed and the function implicitly returns None
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
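`run_query` prints each row, which is convenient for exploration but awkward when the results need further processing. As a sketch, the helper below (hypothetical, not part of the original setup) turns a result document in the SPARQL JSON results layout, the same `results`/`bindings` structure accessed above, into a list of plain dictionaries. The sample document is hand-written for illustration and mirrors one row printed later in this notebook.

```python
def bindings_to_rows(json_results):
    """Flatten SPARQL JSON results into a list of {variable: value} dictionaries."""
    rows = []
    for bindings in json_results["results"]["bindings"]:
        rows.append({var: cell["value"] for var, cell in bindings.items()})
    return rows


# Hand-written sample in the SPARQL JSON results layout (illustrative only).
sample = {
    "head": {"vars": ["name", "instance"]},
    "results": {
        "bindings": [
            {
                "name": {"type": "literal", "value": "human"},
                "instance": {"type": "uri", "value": "http://www.wikidata.org/entity/Q5"},
            }
        ]
    },
}

rows = bindings_to_rows(sample)
print(rows)
```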


# Sport Workflow Series ("World Records explorative search") 

Consider the following exploratory information need:

> compile a list of athletes that held world records across some disciplines

## Useful URIs for the current workflow


The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P106`    | profession    | predicate | 
| `wdt:P279`    | subclass      | predicate |
| `wdt:P27`     | nationality   | predicate |
| `wd:Q1189`    | Usain Bolt    | node      |
| `wd:Q766`     | Jamaica       | node |
| `wd:Q688615`  | World Record  | node |
| `wd:Q542`     | athletics     | node |



Also consider

```
?a wdt:P106/wdt:P279 wd:Q2066131
```

is the BGP to retrieve all instances of **athlete**

```
?a wdt:P279 wd:Q688615
```

is the BGP to retrieve the types of **world records**
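The `/` in the athlete pattern is a SPARQL sequence path: it stands for two triple patterns joined through an intermediate variable. The sketch below spells out that expansion as text. `expand_sequence_path` and the generated variable names (`?v0`, and so on) are hypothetical and only meant to illustrate the equivalence.

```python
def expand_sequence_path(subject, predicates, obj):
    """Rewrite `subject p1/p2/.../pn obj` as n triple patterns with fresh variables."""
    if not predicates:
        raise ValueError("at least one predicate is required")
    patterns = []
    current = subject
    for i, predicate in enumerate(predicates):
        is_last = i == len(predicates) - 1
        target = obj if is_last else f"?v{i}"  # fresh intermediate variable
        patterns.append(f"{current} {predicate} {target} .")
        current = target
    return "\n".join(patterns)


expanded = expand_sequence_path("?a", ["wdt:P106", "wdt:P279"], "wd:Q2066131")
print(expanded)
```

Written this way, it is easier to see that the pattern follows exactly one `wdt:P279` step, so it reaches only values of `wdt:P106` that are direct subclasses of `wd:Q2066131`.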


## Workload Goals

1. Identify the BGP for holding a world record

2. Identify the BGP for the types of competitions in which a world record is classified

3. How many world records are held by Italian athletes?

4. Compare the number of world records held across different dimensions
 
   4.1 In which specific sport does France have the most world records?
   
   4.2 How many world records are held across genders?
   
   4.3 Which sport has the highest number of world records?


In [2]:
# start your workflow here

In [3]:
queryString = """
SELECT COUNT(*)
WHERE { 

?a wdt:P106/wdt:P279 wd:Q2066131 .

} 
"""

print("Results")
run_query(queryString)

Results
[('callret-0', '929008')]


1

## Part 1
### Identify the BGP for holding a world record

In [4]:
queryString = """
SELECT ?record ?recordName
WHERE { 

    ?record wdt:P279 wd:Q688615 ;
        sc:name ?recordName .

} LIMIT 10

"""

print("Results")
run_query(queryString)

Results
[('record', 'http://www.wikidata.org/entity/Q1066353'), ('recordName', "Men's 100 metres world record progression")]
[('record', 'http://www.wikidata.org/entity/Q3240929'), ('recordName', 'largest known prime number')]
[('record', 'http://www.wikidata.org/entity/Q24033838'), ('recordName', "women's world record")]
[('record', 'http://www.wikidata.org/entity/Q24033834'), ('recordName', "men's world record")]
[('record', 'http://www.wikidata.org/entity/Q24255295'), ('recordName', 'junior world record')]
[('record', 'http://www.wikidata.org/entity/Q23580887'), ('recordName', 'highest temperature recorded on Earth')]
[('record', 'http://www.wikidata.org/entity/Q16883666'), ('recordName', 'speed record')]
[('record', 'http://www.wikidata.org/entity/Q69907823'), ('recordName', 'world best time')]
[('record', 'http://www.wikidata.org/entity/Q3422413'), ('recordName', "Women's 100 metres hurdles world record progression")]
[('record', 'http://www.wikidata.org/entity/Q1136298'), ('recor

10

### Find information related to world record

In [5]:
queryString = """
SELECT DISTINCT ?pName ?oName ?p ?o
WHERE { 

    ?record wdt:P279 wd:Q688615 ;
        ?p ?o;
        sc:name ?recordName .
    OPTIONAL{ ?o sc:name ?oName .}
    OPTIONAL{ ?p sc:name ?pName .}
}
ORDER BY ?pName

"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'), ('o', 'http://wikiba.se/ontology#Item')]
[('p', 'http://www.wikidata.org/prop/direct-normalized/P2671'), ('o', 'http://g.co/kg/g/11bxdrpxb3')]
[('p', 'http://www.wikidata.org/prop/direct-normalized/P508'), ('o', 'http://purl.org/bncf/tid/37534')]
[('p', 'http://www.wikidata.org/prop/direct-normalized/P6366'), ('o', 'http://ma-graph.org/entity/81416803')]
[('p', 'http://www.wikidata.org/prop/direct-normalized/P646'), ('o', 'http://g.co/kg/m/05m291')]
[('p', 'http://www.wikidata.org/prop/direct-normalized/P646'), ('o', 'http://g.co/kg/m/0117z5t6')]
[('p', 'http://www.wikidata.org/prop/direct-normalized/P646'), ('o', 'http://g.co/kg/m/06zl5vw')]
[('p', 'http://www.wikidata.org/prop/direct-normalized/P646'), ('o', 'http://g.co/kg/m/0117q9c8')]
[('p', 'http://schema.org/name'), ('o', "Men's 100 metres world record progression")]
[('p', 'http://schema.org/name'), ('o', 'largest known prime number')]
[('p', 'http://schema.org/

54

### Replace wdt:P279 with a variable predicate

In [6]:
queryString = """
SELECT DISTINCT ?pName ?oName ?p ?o
WHERE { 

    ?record ?x wd:Q688615 ;
        ?p ?o;
        sc:name ?recordName .
    OPTIONAL{ ?o sc:name ?oName .}
    OPTIONAL{ ?p sc:name ?pName .}
}
ORDER BY ?pName
LIMIT 30

"""

print("Results")
run_query(queryString)

Results
[('pName', 'Atlas Obscura identifier'), ('p', 'http://www.wikidata.org/prop/direct/P7772'), ('o', 'umbrella-cover-museum')]
[('pName', 'BNCF Thesaurus ID'), ('p', 'http://www.wikidata.org/prop/direct/P508'), ('o', '37534')]
[('pName', 'Bibliothèque nationale de France ID'), ('p', 'http://www.wikidata.org/prop/direct/P268'), ('o', '11870796x')]
[('pName', 'College Football Data Warehouse ID'), ('p', 'http://www.wikidata.org/prop/direct/P3560'), ('o', '1381')]
[('pName', 'Commons category'), ('p', 'http://www.wikidata.org/prop/direct/P373'), ('o', 'Records')]
[('pName', 'Commons category'), ('p', 'http://www.wikidata.org/prop/direct/P373'), ('o', 'Guinness World Records')]
[('pName', 'Commons category'), ('p', 'http://www.wikidata.org/prop/direct/P373'), ('o', 'World records')]
[('pName', 'Dailymotion channel ID'), ('p', 'http://www.wikidata.org/prop/direct/P2942'), ('o', 'guinnessworldrecords')]
[('pName', "Enciclopèdia de l'Esport Català ID"), ('p', 'http://www.wikidata.org/pro

30

### Find predicates and objects related to World Record (different from before)

In [7]:
queryString = """
SELECT DISTINCT ?pName ?oName ?p ?o
WHERE { 

    wd:Q688615 ?p ?o;
        sc:name ?recordName .
    OPTIONAL{ ?o sc:name ?oName .}
    OPTIONAL{ ?p sc:name ?pName .}
}
ORDER BY ?pName
LIMIT 30

"""

print("Results")
run_query(queryString)

Results
[('pName', 'Commons category'), ('p', 'http://www.wikidata.org/prop/direct/P373'), ('o', 'World records')]
[('pName', 'Freebase ID'), ('p', 'http://www.wikidata.org/prop/direct/P646'), ('o', '/m/0b_559')]
[('pName', 'KBpedia ID'), ('p', 'http://www.wikidata.org/prop/direct/P8408'), ('o', 'WorldRecord')]
[('pName', 'Quora topic ID'), ('p', 'http://www.wikidata.org/prop/direct/P3417'), ('o', 'World-Records')]
[('pName', 'Zhihu topic ID'), ('p', 'http://www.wikidata.org/prop/direct/P3553'), ('o', '19632840')]
[('pName', 'icon'), ('p', 'http://www.wikidata.org/prop/direct/P2910'), ('o', 'http://commons.wikimedia.org/wiki/Special:FilePath/Sport%20records%20icon%20WR.svg')]
[('pName', 'instance of'), ('oName', 'criterion'), ('p', 'http://www.wikidata.org/prop/direct/P31'), ('o', 'http://www.wikidata.org/entity/Q1789452')]
[('pName', 'instance of'), ('oName', 'quality'), ('p', 'http://www.wikidata.org/prop/direct/P31'), ('o', 'http://www.wikidata.org/entity/Q1207505')]
[('pName', 'sai

15

### Find the class that Usain Bolt is an instance of

In [8]:
queryString = """
SELECT DISTINCT ?name ?instance
WHERE { 
    wd:Q1189 wdt:P31 ?instance .
    ?instance sc:name ?name .
}
LIMIT 5

"""

print("Results")
run_query(queryString)

Results
[('name', 'human'), ('instance', 'http://www.wikidata.org/entity/Q5')]


1

### Search for predicates linking persons to world records

In [9]:
queryString = """
SELECT DISTINCT ?pName ?p
WHERE { 

    ?record wdt:P279 wd:Q688615 ;
        sc:name ?recordName .
   
   ?person ?p ?record ;
           wdt:P31 wd:Q5 .
    ?p sc:name ?pName .
}


"""

print("Results")
run_query(queryString)

Results
[('pName', 'record held'), ('p', 'http://www.wikidata.org/prop/direct/P1000')]
[('pName', 'significant event'), ('p', 'http://www.wikidata.org/prop/direct/P793')]


2

### Retrieve record holders and the names of their records

In [10]:
queryString = """
SELECT DISTINCT ?personName (GROUP_CONCAT(DISTINCT ?recordName; SEPARATOR=", ") AS ?records) 
WHERE { 
 
    ?person wdt:P1000 ?record ;     # record held
        wdt:P31 wd:Q5 ;
        sc:name ?personName .
    
    ?record wdt:P279 wd:Q688615 ;
        sc:name ?recordName .
   
} GROUP BY ?personName

"""

print("Results")
run_query(queryString)

Results
[('personName', 'Usain Bolt'), ('records', "Men's 100 metres world record progression")]
[('personName', 'Yordanka Donkova'), ('records', "Women's 100 metres hurdles world record progression")]
[('personName', 'Kliment Kolesnikov'), ('records', 'junior world record')]
[('personName', 'Yaroslava Mahuchikh'), ('records', 'junior world record')]


4

### Add OPTIONAL 

In [11]:
queryString = """
SELECT DISTINCT ?personName (GROUP_CONCAT(DISTINCT ?recordName; SEPARATOR=", ") AS ?records)
WHERE { 
 
    ?person wdt:P1000 ?record ;     # record held
           wdt:P31 wd:Q5 .
    OPTIONAL {?person sc:name ?personName .}
    
    ?record ?p wd:Q688615 .
    
    OPTIONAL{ ?record sc:name ?recordName . }
    
    FILTER regex(?recordName, ".*world.*")

} GROUP BY ?personName
LIMIT 15

"""

print("Results")
run_query(queryString)

Results
[('personName', 'Genzebe Dibaba'), ('records', '1500 metres world record progression')]
[('personName', 'Tomáš Dvořák'), ('records', 'decathlon world record progression')]
[('personName', 'Phil Shinnick'), ('records', 'Long jump world record progression')]
[('personName', 'Caitlyn Jenner'), ('records', 'decathlon world record progression')]
[('personName', 'Guido Kratschmer'), ('records', 'decathlon world record progression')]
[('personName', 'Sten Pettersson'), ('records', "Men's 400 metres hurdles world record progression")]
[('personName', 'Sara Simeoni'), ('records', "Men's high jump world record progression")]
[('personName', 'Russ Hodge'), ('records', 'decathlon world record progression')]
[('personName', 'Edward Lindberg'), ('records', "Men's 4 x 400 metres relay world record progression")]
[('personName', 'Sasha Zhoya'), ('records', 'list of world junior records in athletics')]
[('personName', 'Ted Meredith'), ('records', "800 metres world record progression, Men's 4 x 

15

### Result part 1
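From the last query we can see that some athletes have held more than one record. The BGP for holding a world record is `?person wdt:P1000 ?record` (record held), where `?record` is connected to `wd:Q688615`.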
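To make the observation concrete, the concatenated `?records` string can be split back into individual record names. This is only a sketch: `count_records` is a hypothetical helper, it assumes that no record name itself contains the `", "` separator, and the two sample rows are copied from outputs shown in this notebook.

```python
def count_records(rows, separator=", "):
    """Map each person to the number of record names in their concatenated string."""
    return {person: len(records.split(separator)) for person, records in rows}


rows = [
    ("Yordanka Donkova", "Women's 100 metres hurdles world record progression"),
    ("Usain Bolt",
     "Men's 100 metres world record progression, "
     "Men's 4 × 100 metres relay world record progression, "
     "men's 200 metres world record progression"),
]

counts = count_records(rows)
multi_holders = [person for person, n in counts.items() if n > 1]
print(counts)
print(multi_holders)
```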

## Part 2
### Identify the BGP for the types of competitions in which a world record is classified

Identify information related to "2017 World Championships in Athletics – men's 100 metres" (`wd:Q30765148`), which was found as the object of `wdt:P1344`.

In [12]:
queryString = """
SELECT DISTINCT ?pName ?oName ?p ?o
WHERE{
    wd:Q30765148 ?p ?o .
    OPTIONAL{?p sc:name ?pName.}
    OPTIONAL{?o sc:name ?oName.}
 
}LIMIT 30

"""

print("Results")
run_query(queryString)

Results
[('pName', 'winner'), ('oName', 'Justin Gatlin'), ('p', 'http://www.wikidata.org/prop/direct/P1346'), ('o', 'http://www.wikidata.org/entity/Q1652')]
[('pName', 'follows'), ('oName', "2015 World Championships in Athletics – men's 100 metres"), ('p', 'http://www.wikidata.org/prop/direct/P155'), ('o', 'http://www.wikidata.org/entity/Q20785653')]
[('pName', 'followed by'), ('oName', "2019 World Championships in Athletics – men's 100 metres"), ('p', 'http://www.wikidata.org/prop/direct/P156'), ('o', 'http://www.wikidata.org/entity/Q61013304')]
[('pName', 'competition class'), ('oName', '100 metres'), ('p', 'http://www.wikidata.org/prop/direct/P2094'), ('o', 'http://www.wikidata.org/entity/Q164761')]
[('pName', 'competition class'), ('oName', 'male'), ('p', 'http://www.wikidata.org/prop/direct/P2094'), ('o', 'http://www.wikidata.org/entity/Q6581097')]
[('pName', 'location'), ('oName', 'London Stadium'), ('p', 'http://www.wikidata.org/prop/direct/P276'), ('o', 'http://www.wikidata.org

18

### Search information about the event 2017 World Championships in Athletics

In [13]:
queryString = """
SELECT DISTINCT ?pName ?oName ?p ?o
WHERE{
    wd:Q175508 ?p ?o .
    OPTIONAL{?p sc:name ?pName.}
    OPTIONAL{?o sc:name ?oName.}
 
}LIMIT 30

"""

print("Results")
run_query(queryString)

Results
[('pName', 'home venue'), ('oName', 'London Stadium'), ('p', 'http://www.wikidata.org/prop/direct/P115'), ('o', 'http://www.wikidata.org/entity/Q172190')]
[('pName', 'follows'), ('oName', '2015 World Championships in Athletics'), ('p', 'http://www.wikidata.org/prop/direct/P155'), ('o', 'http://www.wikidata.org/entity/Q1026555')]
[('pName', 'followed by'), ('oName', '2019 World Championships in Athletics'), ('p', 'http://www.wikidata.org/prop/direct/P156'), ('o', 'http://www.wikidata.org/entity/Q15694436')]
[('pName', 'country'), ('oName', 'United Kingdom'), ('p', 'http://www.wikidata.org/prop/direct/P17'), ('o', 'http://www.wikidata.org/entity/Q145')]
[('pName', 'location'), ('oName', 'London'), ('p', 'http://www.wikidata.org/prop/direct/P276'), ('o', 'http://www.wikidata.org/entity/Q84')]
[('pName', 'instance of'), ('oName', 'World Championships in Athletics'), ('p', 'http://www.wikidata.org/prop/direct/P31'), ('o', 'http://www.wikidata.org/entity/Q182653')]
[('pName', 'instan

26

### Search superclasses of sports season
One result row links `wdt:P31` (instance of) to `wd:Q27020041` (sports season).

In [14]:
queryString = """
SELECT DISTINCT ?oName ?o
WHERE{
    wd:Q27020041 wdt:P31|wdt:P279* ?o .
    ?o sc:name ?oName.
    
    FILTER (regex(?oName, "sport")||regex(?oName, "event"))
 
}LIMIT 30

"""

print("Results")
run_query(queryString)

Results
[('oName', 'sports competition'), ('o', 'http://www.wikidata.org/entity/Q13406554')]
[('oName', 'recurring sporting event'), ('o', 'http://www.wikidata.org/entity/Q18608583')]
[('oName', 'sports season'), ('o', 'http://www.wikidata.org/entity/Q27020041')]
[('oName', 'recurring event'), ('o', 'http://www.wikidata.org/entity/Q15275719')]
[('oName', 'event'), ('o', 'http://www.wikidata.org/entity/Q1656682')]


5
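One detail worth keeping in mind for the path used above: in SPARQL property paths, `*` binds tighter than `/`, which binds tighter than `|`, so `wdt:P31|wdt:P279*` reads as `wdt:P31 | (wdt:P279*)`. The sketch below evaluates both that path and the sequence `wdt:P31/wdt:P279*` on a tiny hand-made graph. All node names are hypothetical, and the evaluator is a toy, not a description of how the endpoint works.

```python
# Hypothetical toy graph: one event, its class, and the superclass of that class.
TRIPLES = {
    ("ex:final2017", "P31", "ex:championship"),
    ("ex:championship", "P279", "ex:competition"),
}


def objects(triples, subject, predicate):
    """Objects reachable from `subject` through one `predicate` edge."""
    return {o for (s, p, o) in triples if s == subject and p == predicate}


def star(triples, subject, predicate):
    """Nodes reachable from `subject` through zero or more `predicate` edges."""
    seen = {subject}
    frontier = [subject]
    while frontier:
        node = frontier.pop()
        for nxt in objects(triples, node, predicate):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def alternative_path(triples, subject):
    """Evaluate P31 | (P279*) starting from `subject`."""
    return objects(triples, subject, "P31") | star(triples, subject, "P279")


def sequence_path(triples, subject):
    """Evaluate P31 / (P279*) starting from `subject`."""
    reached = set()
    for cls in objects(triples, subject, "P31"):
        reached |= star(triples, cls, "P279")
    return reached


alt = alternative_path(TRIPLES, "ex:final2017")
seq = sequence_path(TRIPLES, "ex:final2017")
print(sorted(alt))
print(sorted(seq))
```

With the alternative, an instance of a subclass of the target class is not reached, while the sequence form reaches it.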

### Search sport events

In [15]:
queryString = """
SELECT DISTINCT ?eventName
WHERE { 
    ?event wdt:P31|wdt:P279* wd:Q13406554 ;
        sc:name ?eventName .      
 
}LIMIT 15

"""

print("Results")
run_query(queryString)

Results
[('eventName', 'División de Honor Juvenil de Fútbol')]
[('eventName', 'Copa de la Reina de Baloncesto')]
[('eventName', 'Copa America')]
[('eventName', "EHF Women's Champions League")]
[('eventName', 'football in Azerbaijan')]
[('eventName', 'North African Cup Winners Cup')]
[('eventName', '2015 Omloop Het Nieuwsblad')]
[('eventName', 'Deutschland-Cup Gerätturnen 2019')]
[('eventName', 'German Male Junior Gymnastics Championships 2017')]
[('eventName', 'German Male Gymnastics Championships 2017')]
[('eventName', 'Sailing World Championships')]
[('eventName', 'Snowboard World Junior Championships')]
[('eventName', 'NBB Cup')]
[('eventName', 'Swedish Cross-Country Skiing Championships')]
[('eventName', '2013 UK Open')]


15

### Search predicates and objects of subjects that are related to world records

In [16]:
queryString = """
SELECT DISTINCT ?pName ?oName ?p ?o
WHERE { 
    
    ?s ?x wd:Q688615 ;
        ?p ?o .
    OPTIONAL{ ?p sc:name ?pName .}
    OPTIONAL{ ?o sc:name ?oName .}
    
 
}LIMIT 30

"""

print("Results")
run_query(queryString)

Results
[('pName', 'occupation'), ('oName', 'swimmer'), ('p', 'http://www.wikidata.org/prop/direct/P106'), ('o', 'http://www.wikidata.org/entity/Q10843402')]
[('pName', 'occupation'), ('oName', 'athletics competitor'), ('p', 'http://www.wikidata.org/prop/direct/P106'), ('o', 'http://www.wikidata.org/entity/Q11513337')]
[('pName', 'occupation'), ('oName', 'shot putter'), ('p', 'http://www.wikidata.org/prop/direct/P106'), ('o', 'http://www.wikidata.org/entity/Q18534714')]
[('pName', 'occupation'), ('oName', 'American football player'), ('p', 'http://www.wikidata.org/prop/direct/P106'), ('o', 'http://www.wikidata.org/entity/Q19204627')]
[('pName', 'occupation'), ('oName', 'short track speed skater'), ('p', 'http://www.wikidata.org/prop/direct/P106'), ('o', 'http://www.wikidata.org/entity/Q18200514')]
[('pName', 'located in the administrative territorial entity'), ('oName', 'Peaks Island, Maine'), ('p', 'http://www.wikidata.org/prop/direct/P131'), ('o', 'http://www.wikidata.org/entity/Q715

30

### Search predicates and objects associated with world record instances

In [17]:
queryString = """
SELECT DISTINCT ?pName ?oName ?p ?o
WHERE { 
 
    ?person wdt:P1000 ?record ;     # record held
           wdt:P31 wd:Q5 .
    OPTIONAL {?person sc:name ?personName .}
    
    ?record wdt:P31 wd:Q688615 ;
            ?p ?o .
    OPTIONAL { ?p sc:name ?pName .}
    OPTIONAL { ?o sc:name ?oName .} 
    OPTIONAL{ ?record sc:name ?recordName . } 
 
}

"""

print("Results")
run_query(queryString)

Results
[('pName', 'instance of'), ('oName', 'world record'), ('p', 'http://www.wikidata.org/prop/direct/P31'), ('o', 'http://www.wikidata.org/entity/Q688615')]
[('pName', 'sport'), ('oName', 'athletics'), ('p', 'http://www.wikidata.org/prop/direct/P641'), ('o', 'http://www.wikidata.org/entity/Q542')]
[('pName', 'competition class'), ('oName', "men's sports"), ('p', 'http://www.wikidata.org/prop/direct/P2094'), ('o', 'http://www.wikidata.org/entity/Q35555522')]
[('pName', 'sport'), ('oName', 'heptathlon'), ('p', 'http://www.wikidata.org/prop/direct/P641'), ('o', 'http://www.wikidata.org/entity/Q243264')]
[('pName', 'competition class'), ('oName', "men's 100 metres"), ('p', 'http://www.wikidata.org/prop/direct/P2094'), ('o', 'http://www.wikidata.org/entity/Q55243390')]
[('pName', 'competition class'), ('oName', '20 ft catamaran'), ('p', 'http://www.wikidata.org/prop/direct/P2094'), ('o', 'http://www.wikidata.org/entity/Q107718305')]
[('pName', 'sport'), ('oName', 'sailing'), ('p', 'http

35

### Check whether a record has parts (wdt:P527)

In [18]:
queryString = """
SELECT DISTINCT ?personName ?recordName ?eventName
WHERE { 
 
    ?person wdt:P1000 ?record ;     # record held
           wdt:P31 wd:Q5 .
    OPTIONAL {?person sc:name ?personName .}
    
    ?record wdt:P31 wd:Q688615 ;
            wdt:P527 ?event .       # has part
    OPTIONAL{ ?event sc:name ?eventName . }
    OPTIONAL{ ?record sc:name ?recordName . } 
 
}

"""

print("Results")
run_query(queryString)

Results
[('personName', 'Ted Meredith'), ('recordName', '800 metres world record progression'), ('eventName', "Women's 800 metres world record progression")]
[('personName', 'Ted Meredith'), ('recordName', '800 metres world record progression'), ('eventName', "Men's 800 metres world record progression")]


2

### Search for a relation with sports competitions

In [19]:
queryString = """
SELECT DISTINCT ?personName ?recordName ?eventName
WHERE { 
 
    ?person wdt:P1000 ?record ;     # record held
           wdt:P31 wd:Q5 .
    OPTIONAL {?person sc:name ?personName .}
    
    ?record wdt:P31 wd:Q688615 ;
            wdt:P527 ?event .      # has part
    ?event wdt:P31|wdt:P279* wd:Q13406554 .  # event instance of or subclass of sport event
    OPTIONAL{ ?event sc:name ?eventName . }
    OPTIONAL{ ?record sc:name ?recordName . } 
 
}

"""

print("Results")
run_query(queryString)

Results
Empty


0

### Try to see objects and predicates associated with the event

In [20]:
queryString = """
SELECT DISTINCT ?pName ?oName ?p ?o
WHERE { 
 
    ?person wdt:P1000 ?record ;     # record held
           wdt:P31 wd:Q5 .
    OPTIONAL {?person sc:name ?personName .}
    
    ?record wdt:P31 wd:Q688615 ;
            wdt:P527 ?event .       # has part
    ?event ?p ?o .
    OPTIONAL{ ?p sc:name ?pName . }
    OPTIONAL{ ?o sc:name ?oName . } 
 
}

"""

print("Results")
run_query(queryString)

Results
[('pName', 'instance of'), ('oName', 'Wikimedia list article'), ('p', 'http://www.wikidata.org/prop/direct/P31'), ('o', 'http://www.wikidata.org/entity/Q13406463')]
[('pName', 'is a list of'), ('oName', "women's world record"), ('p', 'http://www.wikidata.org/prop/direct/P360'), ('o', 'http://www.wikidata.org/entity/Q24033838')]
[('p', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'), ('o', 'http://wikiba.se/ontology#Item')]
[('p', 'http://schema.org/name'), ('o', "Men's 800 metres world record progression")]
[('p', 'http://schema.org/name'), ('o', "Women's 800 metres world record progression")]


5

### Search for events related to a record holder

In [21]:
queryString = """
SELECT DISTINCT ?eventName
WHERE { 
 
    ?person wdt:P1000 ?record ;     # record held
           wdt:P31 wd:Q5 ;
           wdt:P793 ?event .
    OPTIONAL {?person sc:name ?personName .}
    OPTIONAL {?event sc:name ?eventName .}
    
    ?record wdt:P31 wd:Q688615 .
}

"""

print("Results")
run_query(queryString)

Results
[('eventName', 'lighting the Olympic cauldron')]
[('eventName', 'male-to-female sex reassignment surgery')]


2

### Retrieve the events associated with record holders

In [22]:
queryString = """
SELECT DISTINCT ?personName ?eventName (GROUP_CONCAT(DISTINCT ?recordName; SEPARATOR=", ") AS ?records)
WHERE { 
 
    ?person wdt:P1000 ?record ;     # record held
           wdt:P31 wd:Q5 ;
           wdt:P793 ?event ;
           sc:name ?personName .
    
    ?event sc:name ?eventName .
    
    OPTIONAL{ ?record sc:name ?recordName . }

} GROUP BY ?personName ?eventName
LIMIT 15

"""

print("Results")
run_query(queryString)

Results
[('personName', 'Rafer Johnson'), ('eventName', 'lighting the Olympic cauldron'), ('records', 'decathlon world record progression')]
[('personName', 'Caitlyn Jenner'), ('eventName', 'male-to-female sex reassignment surgery'), ('records', 'decathlon world record progression')]
[('personName', 'Usain Bolt'), ('eventName', '2012 Summer Olympics'), ('records', "Men's 100 metres world record progression, Men's 4 × 100 metres relay world record progression, men's 200 metres world record progression")]


3

### Retrieve information about event

In [23]:
queryString = """
SELECT DISTINCT ?pName ?oName ?p ?o
WHERE { 
 
    ?person wdt:P1000 ?record ;     # record held
           wdt:P31 wd:Q5 ;
           wdt:P793 ?event ;
           sc:name ?personName .
    
    ?event sc:name ?eventName ;
        ?p ?o .
    ?p sc:name ?pName .
    ?o sc:name ?oName .

} LIMIT 30


"""

print("Results")
run_query(queryString)

Results
[('pName', 'has effect'), ('oName', 'neovagina'), ('p', 'http://www.wikidata.org/prop/direct/P1542'), ('o', 'http://www.wikidata.org/entity/Q10336336')]
[('pName', 'follows'), ('oName', '2008 Summer Olympics'), ('p', 'http://www.wikidata.org/prop/direct/P155'), ('o', 'http://www.wikidata.org/entity/Q8567')]
[('pName', 'followed by'), ('oName', '2016 Summer Olympics'), ('p', 'http://www.wikidata.org/prop/direct/P156'), ('o', 'http://www.wikidata.org/entity/Q8613')]
[('pName', 'country'), ('oName', 'United Kingdom'), ('p', 'http://www.wikidata.org/prop/direct/P17'), ('o', 'http://www.wikidata.org/entity/Q145')]
[('pName', 'uses'), ('oName', 'Olympic torch'), ('p', 'http://www.wikidata.org/prop/direct/P2283'), ('o', 'http://www.wikidata.org/entity/Q15056334')]
[('pName', 'uses'), ('oName', 'Olympic cauldron'), ('p', 'http://www.wikidata.org/prop/direct/P2283'), ('o', 'http://www.wikidata.org/entity/Q3643463')]
[('pName', 'has parts of the class'), ('oName', 'Olympic sporting event

30

### Connect information about events and world record

In [24]:
queryString = """
SELECT ?recordName ?partOf
WHERE { 
 
    ?person wdt:P1000 ?record ;     # record held
           wdt:P31 wd:Q5 ;
           wdt:P793 ?event .
    
    OPTIONAL{ ?event wdt:P2670 ?isPartOf .}
    ?isPartOf sc:name ?partOf .
    
    OPTIONAL{ ?record sc:name ?recordName . }

}
LIMIT 30

"""

print("Results")
run_query(queryString)

Results
[('recordName', "Men's 4 × 100 metres relay world record progression"), ('partOf', 'Olympic sporting event')]
[('recordName', "Men's 100 metres world record progression"), ('partOf', 'Olympic sporting event')]
[('recordName', "men's 200 metres world record progression"), ('partOf', 'Olympic sporting event')]


3

### Result part 2
As the last query shows, world records can be connected to the events in which they were set by following `wdt:P793` (significant event) from the record holder and then `wdt:P2670` from the event.

Only 3 results are returned, which is probably because this information is sparsely recorded in our database.

## Part 3
### How many world records are held by Italian athletes

### Search country predicate 

In [25]:
queryString = """
SELECT ?sName ?pName ?s ?p
WHERE {
    ?s ?p wd:Q766 .
    OPTIONAL { ?s sc:name ?sName . }
    OPTIONAL { ?p sc:name ?pName . }
} 
LIMIT 25

"""

print("Results")
run_query(queryString)

Results
[('sName', 'Jamaica at the 2014 Winter Olympics'), ('pName', 'country'), ('s', 'http://www.wikidata.org/entity/Q15487664'), ('p', 'http://www.wikidata.org/prop/direct/P17')]
[('sName', 'Falmouth'), ('pName', 'country'), ('s', 'http://www.wikidata.org/entity/Q1394786'), ('p', 'http://www.wikidata.org/prop/direct/P17')]
[('sName', 'French Jamaican'), ('pName', 'country'), ('s', 'http://www.wikidata.org/entity/Q16201967'), ('p', 'http://www.wikidata.org/prop/direct/P17')]
[('sName', 'Irish people in Jamaica'), ('pName', 'country'), ('s', 'http://www.wikidata.org/entity/Q16202231'), ('p', 'http://www.wikidata.org/prop/direct/P17')]
[('sName', 'Jamaicans'), ('pName', 'located in the administrative territorial entity'), ('s', 'http://www.wikidata.org/entity/Q6127469'), ('p', 'http://www.wikidata.org/prop/direct/P131')]
[('sName', 'Jamaicans'), ('pName', 'named after'), ('s', 'http://www.wikidata.org/entity/Q6127469'), ('p', 'http://www.wikidata.org/prop/direct/P138')]
[('sName', 'Jam

25

### Find Italy URI

In [26]:
queryString = """
SELECT DISTINCT ?italy ?name
WHERE {
    ?s wdt:P17 ?italy .
    ?italy sc:name ?name .
    FILTER regex(?name,"[Ii]taly")
} 

"""

print("Results")
run_query(queryString)

Results
[('italy', 'http://www.wikidata.org/entity/Q913582'), ('name', 'Roman Italy')]
[('italy', 'http://www.wikidata.org/entity/Q4345530'), ('name', 'Northern Italy')]
[('italy', 'http://www.wikidata.org/entity/Q223936'), ('name', 'Kingdom of Italy')]
[('italy', 'http://www.wikidata.org/entity/Q172579'), ('name', 'Kingdom of Italy')]
[('italy', 'http://www.wikidata.org/entity/Q38'), ('name', 'Italy')]
[('italy', 'http://www.wikidata.org/entity/Q838931'), ('name', 'Kingdom of Italy')]
[('italy', 'http://www.wikidata.org/entity/Q3155864'), ('name', 'Geographical region of Italy')]
[('italy', 'http://www.wikidata.org/entity/Q2480041'), ('name', 'Italy in the Middle Ages')]
[('italy', 'http://www.wikidata.org/entity/Q48742118'), ('name', 'Kingdom of Italy under Fascism')]
[('italy', 'http://www.wikidata.org/entity/Q5343710'), ('name', 'Kingdom of Italy')]


10

### Identify data about Italian athletes

In [27]:
queryString = """
SELECT ?aName ?a
WHERE {
    ?a wdt:P106/wdt:P279 wd:Q2066131 ;
        wdt:P27 wd:Q38 ; #italy
        sc:name ?aName .
} LIMIT 5

"""

print("Results")
run_query(queryString)

Results
[('aName', 'Cristina Tartarone'), ('a', 'http://www.wikidata.org/entity/Q96754211')]
[('aName', 'Barbara Masi'), ('a', 'http://www.wikidata.org/entity/Q16531085')]
[('aName', 'Manuela Manetta'), ('a', 'http://www.wikidata.org/entity/Q3845091')]
[('aName', 'Andrea Capella'), ('a', 'http://www.wikidata.org/entity/Q80350481')]
[('aName', 'Davide Bianchetti'), ('a', 'http://www.wikidata.org/entity/Q3703498')]


5

### Search for Italian athletes who held a world record

In [28]:
queryString = """
SELECT ?aName (GROUP_CONCAT(DISTINCT ?recordName; SEPARATOR=", ") AS ?records)
WHERE {
    ?a wdt:P106/wdt:P279 wd:Q2066131 ;
        wdt:P27 wd:Q38 ; #italy
        wdt:P1000 ?record ; 
        sc:name ?aName .
    ?record sc:name ?recordName ;
            ?is wd:Q688615 .
            
} GROUP BY ?aName
LIMIT 10

"""

print("Results")
run_query(queryString)

Results
[('aName', 'Sara Simeoni'), ('records', "Men's high jump world record progression")]
[('aName', 'Salvatore Morale'), ('records', "Men's 400 metres hurdles world record progression")]


2

### Count the world records held by Italian athletes

In [29]:
queryString = """
SELECT (COUNT(DISTINCT ?record) AS ?howMany)
WHERE {
    ?a wdt:P106/wdt:P279 wd:Q2066131 ;
        wdt:P27 wd:Q38 ; #italy
        wdt:P1000 ?record .
        
    ?record ?is wd:Q688615 .
            
} 

"""

print("Results")
run_query(queryString)

Results
[('howMany', '2')]


1

### Results part 3
We can conclude that Italian athletes hold 2 world records. The previous query also shows the category of each record and the name of the athlete who holds it.
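The counting query generalizes to other countries by changing only the country node. A minimal sketch of such a builder follows. `records_by_country_query` is a hypothetical helper, and the query text mirrors the one used above, including its reliance on `wdt:P27` to identify the athlete's country.

```python
def records_by_country_query(country):
    """Build the world-record count query for athletes of a given country node."""
    return f"""
SELECT (COUNT(DISTINCT ?record) AS ?howMany)
WHERE {{
    ?a wdt:P106/wdt:P279 wd:Q2066131 ;
        wdt:P27 {country} ;
        wdt:P1000 ?record .

    ?record ?is wd:Q688615 .
}}
"""


italy_query = records_by_country_query("wd:Q38")
print(italy_query)
```

The resulting string can be passed to `run_query` like any other query in this notebook.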

## Part 4
Compare number of world records held across different dimensions

### Part 4.1 In which specific sport does France have the most world records

### Find France URI

In [30]:
queryString = """
SELECT DISTINCT ?france ?name
WHERE {
    ?s wdt:P17 ?france .
    ?france sc:name ?name .
    FILTER regex(?name,"[Ff]rance")
} 

"""

print("Results")
run_query(queryString)

Results
[('france', 'http://www.wikidata.org/entity/Q3088906'), ('name', 'France–United Kingdom border')]
[('france', 'http://www.wikidata.org/entity/Q203396'), ('name', 'Overseas France')]
[('france', 'http://www.wikidata.org/entity/Q69323'), ('name', 'Kingdom of France')]
[('france', 'http://www.wikidata.org/entity/Q170604'), ('name', 'New France')]
[('france', 'http://www.wikidata.org/entity/Q142'), ('name', 'France')]
[('france', 'http://www.wikidata.org/entity/Q69808'), ('name', 'Vichy France')]
[('france', 'http://www.wikidata.org/entity/Q70972'), ('name', 'Kingdom of France')]
[('france', 'http://www.wikidata.org/entity/Q152582'), ('name', 'Free France')]
[('france', 'http://www.wikidata.org/entity/Q99294601'), ('name', 'France')]
[('france', 'http://www.wikidata.org/entity/Q1573912'), ('name', 'Kingdom of France')]


10
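The regular expression matches any name containing "France", so several historical or regional entities show up. As a sketch, filtering the printed rows for an exact name narrows the candidates. The rows below are copied from the output above, and `exact_matches` is a hypothetical helper.

```python
def exact_matches(rows, name):
    """Return the IRIs whose name equals `name` exactly."""
    return [iri for iri, row_name in rows if row_name == name]


rows = [
    ("http://www.wikidata.org/entity/Q203396", "Overseas France"),
    ("http://www.wikidata.org/entity/Q69323", "Kingdom of France"),
    ("http://www.wikidata.org/entity/Q142", "France"),
    ("http://www.wikidata.org/entity/Q69808", "Vichy France"),
    ("http://www.wikidata.org/entity/Q99294601", "France"),
]

candidates = exact_matches(rows, "France")
print(candidates)
```

Two candidates remain, so the name alone does not settle which node is the country.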

### Is wd:Q142 the correct one?

In [31]:
queryString = """
SELECT DISTINCT ?pName ?oName ?p
WHERE {
     wd:Q142 ?p ?o .
    ?p sc:name ?pName .
    ?o sc:name ?oName
} LIMIT 50

"""

print("Results")
run_query(queryString)

Results
[('pName', 'category for people born here'), ('oName', 'Category:Births in France'), ('p', 'http://www.wikidata.org/prop/direct/P1464')]
[('pName', 'has part'), ('oName', 'Loire Valley'), ('p', 'http://www.wikidata.org/prop/direct/P527')]
[('pName', 'top-level Internet domain'), ('oName', '.fr'), ('p', 'http://www.wikidata.org/prop/direct/P78')]
[('pName', 'flag'), ('oName', 'flag of France'), ('p', 'http://www.wikidata.org/prop/direct/P163')]
[('pName', 'contains administrative territorial entity'), ('oName', 'Clipperton Island'), ('p', 'http://www.wikidata.org/prop/direct/P150')]
[('pName', 'contains administrative territorial entity'), ('oName', 'Saint Barthélemy'), ('p', 'http://www.wikidata.org/prop/direct/P150')]
[('pName', 'contains administrative territorial entity'), ('oName', 'Martinique'), ('p', 'http://www.wikidata.org/prop/direct/P150')]
[('pName', 'anthem'), ('oName', 'La Marseillaise'), ('p', 'http://www.wikidata.org/prop/direct/P85')]
[('pName', 'production stat

50

### Count French record-holding athletes per record category

In [32]:
queryString = """
SELECT ?recordName (COUNT(DISTINCT ?a) AS ?howMany)
WHERE {
    ?a wdt:P106/wdt:P279 wd:Q2066131 ;
        wdt:P27 wd:Q142 ; #france
        wdt:P1000 ?record .
        
    ?record ?is wd:Q688615 ;
        sc:name ?recordName
            
} GROUP BY ?recordName
ORDER BY DESC(?howMany)


"""

print("Results")
run_query(queryString)

Results
[('recordName', 'decathlon world record progression'), ('howMany', '2')]
[('recordName', 'list of world junior records in athletics'), ('howMany', '1')]
[('recordName', "Women's heptathlon world record progression"), ('howMany', '1')]


3

### Display only the top category (the specific sport in which France has the most world records)

In [33]:
queryString = """
SELECT ?recordName
WHERE {
    ?a wdt:P106/wdt:P279 wd:Q2066131 ;
        wdt:P27 wd:Q142 ; #france
        wdt:P1000 ?record .
        
    ?record ?is wd:Q688615 ;
        sc:name ?recordName .
            
} GROUP BY ?recordName
ORDER BY DESC(COUNT(DISTINCT ?a))
LIMIT 1


"""

print("Results")
run_query(queryString)

Results
[('recordName', 'decathlon world record progression')]


1

### Search for the predicates associated with a record

In [34]:
queryString = """
SELECT DISTINCT ?pName ?p
WHERE {
    ?a wdt:P106/wdt:P279 wd:Q2066131 ;
        wdt:P27 wd:Q142 ; #france
        wdt:P1000 ?record .
        
    ?record ?is wd:Q688615 ;
        sc:name ?recordName ;
        ?p ?o .
    ?p sc:name ?pName .        
}

"""

print("Results")
run_query(queryString)

Results
[('pName', 'instance of'), ('p', 'http://www.wikidata.org/prop/direct/P31')]
[('pName', 'is a list of'), ('p', 'http://www.wikidata.org/prop/direct/P360')]
[('pName', 'sport'), ('p', 'http://www.wikidata.org/prop/direct/P641')]


3

### Retrieve the sport name of each category and return the one with the highest number of records

In [35]:
queryString = """
SELECT ?sportName
WHERE {
    ?a wdt:P106/wdt:P279 wd:Q2066131 ;
        wdt:P27 wd:Q142 ; #france
        wdt:P1000 ?record .
        
    ?record ?is wd:Q688615 ;
          wdt:P641 ?sport .
    ?sport sc:name ?sportName .
            
} GROUP BY ?sportName
ORDER BY DESC(COUNT(DISTINCT ?a))
LIMIT 1


"""

print("Results")
run_query(queryString)

Results
[('sportName', 'athletics')]


1

### Result part 4.1
Athletics is the sport in which French athletes hold the highest number of world records (the ranking counts the distinct record-holding athletes per sport).
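
The same query shape was used for Italy and for France, with only the country entity changing. A minimal sketch of how the query string could be parameterised by country; `QUERY_TEMPLATE` and `build_top_sport_query` are hypothetical names introduced for illustration, and the function only builds the string without contacting the endpoint:

```python
import re

# Template of the Part 4.1 query; doubled braces are literal braces for str.format.
QUERY_TEMPLATE = """
SELECT ?sportName
WHERE {{
    ?a wdt:P106/wdt:P279 wd:Q2066131 ;
        wdt:P27 wd:{country} ;
        wdt:P1000 ?record .

    ?record ?is wd:Q688615 ;
          wdt:P641 ?sport .
    ?sport sc:name ?sportName .
}} GROUP BY ?sportName
ORDER BY DESC(COUNT(DISTINCT ?a))
LIMIT 1
"""


def build_top_sport_query(country_qid):
    """Return the query text for one country, given an identifier such as 'Q142'."""
    # Accept only well-formed Wikidata item identifiers.
    if not re.fullmatch(r"Q[1-9][0-9]*", country_qid):
        raise ValueError("expected a Wikidata item id like 'Q142'")
    return QUERY_TEMPLATE.format(country=country_qid)


print(build_top_sport_query("Q142"))
```

In the notebook, the returned string could be passed to the existing helper, for example `run_query(build_top_sport_query("Q38"))` for Italy.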

### Part 4.2 How many world records are held across genders

### Search for the gender predicate

In [36]:
queryString = """
SELECT DISTINCT ?pName ?oName ?p ?o
WHERE{
    wd:Q1189 ?p ?o .
    OPTIONAL{?p sc:name ?pName.}
    OPTIONAL{?o sc:name ?oName.}
    
    FILTER regex(?pName, "gender" )
 
}LIMIT 30

"""

print("Results")
run_query(queryString)

Results
[('pName', 'sex or gender'), ('oName', 'male'), ('p', 'http://www.wikidata.org/prop/direct/P21'), ('o', 'http://www.wikidata.org/entity/Q6581097')]


1

### Display athletes who hold world records, grouped by gender

In [37]:
queryString = """
SELECT ?record ?recordName ?genderType (COUNT(DISTINCT ?a) AS ?howMany)
WHERE {
    ?a wdt:P106/wdt:P279 wd:Q2066131 ;
        wdt:P21 ?gender ;
        wdt:P1000 ?record .
        
    ?record ?is wd:Q688615 ;
          sc:name ?recordName .
    
    ?gender sc:name ?genderType .
            
} GROUP BY ?record ?recordName ?genderType
ORDER BY ?recordName
LIMIT 15

"""

print("Results")
run_query(queryString)

Results
[('record', 'http://www.wikidata.org/entity/Q1364158'), ('recordName', '800 metres world record progression'), ('genderType', 'male'), ('howMany', '1')]
[('record', 'http://www.wikidata.org/entity/Q208410'), ('recordName', 'List of world records in athletics'), ('genderType', 'male'), ('howMany', '1')]
[('record', 'http://www.wikidata.org/entity/Q1066353'), ('recordName', "Men's 100 metres world record progression"), ('genderType', 'male'), ('howMany', '1')]
[('record', 'http://www.wikidata.org/entity/Q1053563'), ('recordName', "Men's 4 x 400 metres relay world record progression"), ('genderType', 'male'), ('howMany', '5')]
[('record', 'http://www.wikidata.org/entity/Q1139050'), ('recordName', "Men's 400 metres hurdles world record progression"), ('genderType', 'male'), ('howMany', '14')]
[('record', 'http://www.wikidata.org/entity/Q1140490'), ('recordName', "Men's 400 metres world record progression"), ('genderType', 'male'), ('howMany', '1')]
[('record', 'http://www.wikidata.

15

### Inspect the properties of one record (wd:Q1364158) to see how it is split by gender

In [38]:
queryString = """
SELECT ?pName ?oName ?p ?o
WHERE {    
    wd:Q1364158 ?p ?o .
    ?p sc:name ?pName .
    ?o sc:name ?oName .
            
} 

"""

print("Results")
run_query(queryString)

Results
[('pName', 'has part'), ('oName', "Women's 800 metres world record progression"), ('p', 'http://www.wikidata.org/prop/direct/P527'), ('o', 'http://www.wikidata.org/entity/Q2757357')]
[('pName', 'sport'), ('oName', 'athletics'), ('p', 'http://www.wikidata.org/prop/direct/P641'), ('o', 'http://www.wikidata.org/entity/Q542')]
[('pName', 'instance of'), ('oName', 'world record'), ('p', 'http://www.wikidata.org/prop/direct/P31'), ('o', 'http://www.wikidata.org/entity/Q688615')]
[('pName', 'has part'), ('oName', "Men's 800 metres world record progression"), ('p', 'http://www.wikidata.org/prop/direct/P527'), ('o', 'http://www.wikidata.org/entity/Q67173303')]


4

### Check whether each record has the "has part" predicate

In [39]:
queryString = """
SELECT ?compName ?genderType (COUNT(DISTINCT ?a) AS ?howMany)
WHERE {
    ?a wdt:P106/wdt:P279 wd:Q2066131 ;
        wdt:P21 ?gender ;
        wdt:P1000 ?record .
        
    ?record ?is wd:Q688615 ;
        wdt:P527 ?comp ;
        sc:name ?recordName .
    ?comp sc:name ?compName .
    ?gender sc:name ?genderType .
            
} GROUP BY ?compName ?genderType
ORDER BY ?compName
LIMIT 15

"""

print("Results")
run_query(queryString)

Results
[('compName', "Men's 800 metres world record progression"), ('genderType', 'male'), ('howMany', '1')]
[('compName', "Women's 800 metres world record progression"), ('genderType', 'male'), ('howMany', '1')]


2

### Result part 4.2
As we can see, not all records are connected through a "has part" predicate. It is therefore better to relate the gender of the athlete who holds the record to the gender category of the competition in which the record was set.
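
One way to relate the two, sketched here under explicit assumptions: the competition's gender is guessed from the "Men's" / "Women's" prefix of the record label, and the gender labels are assumed to be `male` and `female` (only `male` appears in the outputs above). `competition_gender` and `check` are hypothetical helpers, and the sample rows are copied from the outputs of the previous queries:

```python
def competition_gender(record_name):
    """Guess the competition gender from the record label; None if it has no prefix."""
    lowered = record_name.lower()
    if lowered.startswith("women's"):
        return "female"  # assumed label, not seen in the outputs above
    if lowered.startswith("men's"):
        return "male"
    return None


def check(record_name, athlete_gender):
    """Classify a (record label, athlete gender) pair."""
    guessed = competition_gender(record_name)
    if guessed is None:
        return "unknown"
    return "consistent" if guessed == athlete_gender else "mismatch"


# (record or component label, athlete gender) pairs copied from the outputs above.
rows = [
    ("Men's 100 metres world record progression", "male"),
    ("800 metres world record progression", "male"),
    ("Women's 800 metres world record progression", "male"),
]

for name, gender in rows:
    print(name, "->", check(name, gender))
```

The third row comes from the "has part" query and is flagged as a mismatch: following "has part" attaches both the men's and the women's progression to the same male athlete.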

### Part 4.3 Which sport has the highest number of world records

In [40]:
queryString = """
SELECT ?sportName (COUNT(DISTINCT ?record) AS ?howManyRecords)
WHERE {
    ?a wdt:P106/wdt:P279 wd:Q2066131 ;
        wdt:P1000 ?record .
        
    ?record ?is wd:Q688615 ;
          wdt:P641 ?sport .
    ?sport sc:name ?sportName .
            
} GROUP BY ?sportName
ORDER BY DESC(?howManyRecords)
LIMIT 3


"""

print("Results")
run_query(queryString)

Results
[('sportName', 'athletics'), ('howManyRecords', '11')]
[('sportName', 'swimming'), ('howManyRecords', '2')]
[('sportName', 'heptathlon'), ('howManyRecords', '1')]


3

### Result part 4.3
In this query it is easy to connect each record to its sport, so grouping by sport lets us count how many world records each sport has in our database.

In this case athletics has the highest number of records (11).
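
The helper in the setup prints every value as a string, so any ranking done on the Python side has to convert the counts to integers first. A minimal sketch, assuming the bindings follow the shape the setup code reads (`value['value']`); `to_rows` is a hypothetical helper, only the `value` key of each binding is shown, and the sample values are copied from the output above:

```python
def to_rows(bindings):
    """Flatten SPARQL JSON bindings into plain dicts of variable -> value string."""
    return [{var: cell["value"] for var, cell in b.items()} for b in bindings]


# Values copied from the output above, deliberately listed out of order.
bindings = [
    {"sportName": {"value": "swimming"}, "howManyRecords": {"value": "2"}},
    {"sportName": {"value": "athletics"}, "howManyRecords": {"value": "11"}},
    {"sportName": {"value": "heptathlon"}, "howManyRecords": {"value": "1"}},
]

rows = to_rows(bindings)

# Comparing the raw strings ranks "2" above "11"; casting to int fixes that.
top_as_text = max(rows, key=lambda r: r["howManyRecords"])
top_as_int = max(rows, key=lambda r: int(r["howManyRecords"]))

print(top_as_text["sportName"], top_as_int["sportName"])
```

Inside SPARQL the `ORDER BY DESC(?howManyRecords)` clause already sorts numerically, so the conversion only matters when the results are post-processed in Python.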