# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained previously.

An exploratory workload starts with a usually vague, open-ended question, and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.
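
Because this label BGP can be attached to any variable, the "list everything about one entity" query that recurs throughout the notebook can be generated instead of retyped. The following is a minimal sketch, assuming a hypothetical helper name `describe_entity_query` (not part of the original setup) and the `wd:` prefix declared in the setup cell:

```python
def describe_entity_query(qid, limit=None):
    """Build a SPARQL query listing the labelled outgoing edges of one entity.

    `describe_entity_query` is a hypothetical helper, not part of the setup cell.
    The returned string relies on the `wd:` prefix that the setup cell prepends.
    """
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError(f"expected a Wikidata item id such as 'Q142', got {qid!r}")
    lines = [
        "SELECT ?p ?pname ?o ?oname WHERE {",
        f"    wd:{qid} ?p ?o .",
        "    # label BGP: human-readable names of the property and of the object",
        "    ?p <http://schema.org/name> ?pname .",
        "    ?o <http://schema.org/name> ?oname .",
        "}",
    ]
    if limit is not None:
        lines.append(f"LIMIT {int(limit)}")
    return "\n".join(lines)


# Q8544 is the 2000 Summer Olympics, one of the entities inspected in this notebook.
query = describe_entity_query("Q8544", limit=30)
print(query)
```

The resulting string can be passed to `run_query` from the setup cell.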
    
    

In [2]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-68b0245169-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try:
        results = sparql.query()
        json_results = results.convert()
        if len(json_results['results']['bindings']) == 0:
            print("Empty")
            return 0

        for bindings in json_results['results']['bindings']:
            print([(var, value['value']) for var, value in bindings.items()])

        return len(json_results['results']['bindings'])

    except Exception as e:
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
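
`run_query` only prints the bindings, which makes later comparisons in Python awkward. The sketch below assumes the same JSON layout that `run_query` already reads (`results`, then `bindings`, then `value`) and flattens a converted response into plain dictionaries. `bindings_to_rows` is a hypothetical name, and the sample response is hand-written for illustration only:

```python
def bindings_to_rows(json_results):
    """Flatten a SPARQL JSON response into a list of {variable: value} dicts."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in json_results["results"]["bindings"]
    ]


# Hand-written response in the same layout, for illustration only.
sample = {
    "head": {"vars": ["c", "country"]},
    "results": {
        "bindings": [
            {
                "c": {"type": "uri", "value": "http://www.wikidata.org/entity/Q142"},
                "country": {"type": "literal", "value": "France"},
            }
        ]
    },
}

rows = bindings_to_rows(sample)
print(rows)
```

A variant of `run_query` could return such rows instead of the row count, so that results can be grouped or compared without copying them from the printed output.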


# Sport Workflow Series ("Olympic explorative search") 

Consider the following exploratory information need:

> investigate the results, in terms of participations and competitions won, in the past editions of the olympics between France and Germany

## Useful URIs for the current workflow


The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P106`    | profession    | predicate | 
| `wdt:P279`    | subclass      | predicate |
| `wdt:P27`     | nationality   | predicate |
| `wd:Q159821`  | Olympic games | node      |
| `wd:Q10843402`| Swimmer       | node |
| `wd:Q2066131` | Athlete       | node |
| `wd:Q142`     | France        | node |
| `wd:Q183`     | Germany       | node |


Also consider


```
?a wdt:P106/wdt:P279 wd:Q2066131
```

is the BGP to retrieve all instances of **athlete**


```
?p wdt:P27 wd:Q142
```

is the BGP to retrieve all **citizens of France**

## Workload Goals

1. Identify the BGP for participating or winning some competition

2. Identify the BGP for different sports or types of competitions

3. Is there any French or German athlete with double citizenship participating in any Olympic event?

4. Compare number of participations and winning events across different dimensions
 
   4.1 In which specific sport does France have more winners than Germany?
   
   4.2 Which country has the highest number of female winners?
   
   4.3 For each country, which are the athletes with the highest number of participations?


In [3]:
# start your workflow here

In [4]:
queryString = """
SELECT COUNT(*)
WHERE { 

?a wdt:P106/wdt:P279 wd:Q2066131

} 
"""

print("Results")
run_query(queryString)

Results
[('callret-0', '929008')]


1

### 1. Identify the BGP for participating or winning some competition

Let's search for swimmers

In [63]:
queryString = """
SELECT ?a ?athlete ?sname ?p ?pname WHERE {
   
    ?a wdt:P106/wdt:P279 wd:Q10843402 ;
        wdt:P27 ?country ;
        ?p ?s .
    ?s ?p2 wd:Q159821 .
    
    # this returns the labels
    ?a <http://schema.org/name> ?athlete .
    ?s <http://schema.org/name> ?sname .
    ?p <http://schema.org/name> ?pname .
    
    
     
     FILTER ((?country = wd:Q183) || (?country = wd:Q142)) .
}
LIMIT 100

"""
print("Results")
run_query(queryString)

Results
[('a', 'http://www.wikidata.org/entity/Q2375792'), ('athlete', 'Muriel Hermine'), ('sname', '1988 Summer Olympics'), ('p', 'http://www.wikidata.org/prop/direct/P1344'), ('pname', 'participant in')]
[('a', 'http://www.wikidata.org/entity/Q19519973'), ('athlete', 'Karine Schuler'), ('sname', '1988 Summer Olympics'), ('p', 'http://www.wikidata.org/prop/direct/P1344'), ('pname', 'participant in')]
[('a', 'http://www.wikidata.org/entity/Q18685209'), ('athlete', 'Anne Capron'), ('sname', '1988 Summer Olympics'), ('p', 'http://www.wikidata.org/prop/direct/P1344'), ('pname', 'participant in')]
[('a', 'http://www.wikidata.org/entity/Q453274'), ('athlete', 'Sylvie Fréchette'), ('sname', '1992 Summer Olympics'), ('p', 'http://www.wikidata.org/prop/direct/P1344'), ('pname', 'participant in')]
[('a', 'http://www.wikidata.org/entity/Q16146393'), ('athlete', 'Marianne Aeschbacher'), ('sname', '1992 Summer Olympics'), ('p', 'http://www.wikidata.org/prop/direct/P1344'), ('pname', 'participant i

48

Let's check whether one of them (say Q5085930) has a property that links to competitions.

In [24]:
queryString = """
SELECT ?p ?pname ?o ?oname WHERE {
   
    wd:Q5085930 ?p ?o .
    
    # this returns the labels
    ?p <http://schema.org/name> ?pname .
    ?o <http://schema.org/name> ?oname .

        
}

"""
print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P3373'), ('pname', 'sibling'), ('o', 'http://www.wikidata.org/entity/Q6308161'), ('oname', 'Julie Fabre')]
[('p', 'http://www.wikidata.org/prop/direct/P735'), ('pname', 'given name'), ('o', 'http://www.wikidata.org/entity/Q264002'), ('oname', 'Charlotte')]
[('p', 'http://www.wikidata.org/prop/direct/P641'), ('pname', 'sport'), ('o', 'http://www.wikidata.org/entity/Q180692'), ('oname', 'synchronized swimming')]
[('p', 'http://www.wikidata.org/prop/direct/P27'), ('pname', 'country of citizenship'), ('o', 'http://www.wikidata.org/entity/Q142'), ('oname', 'France')]
[('p', 'http://www.wikidata.org/prop/direct/P1412'), ('pname', 'languages spoken, written or signed'), ('o', 'http://www.wikidata.org/entity/Q150'), ('oname', 'French')]
[('p', 'http://www.wikidata.org/prop/direct/P641'), ('pname', 'sport'), ('o', 'http://www.wikidata.org/entity/Q31920'), ('oname', 'swimming')]
[('p', 'http://www.wikidata.org/prop/direct/P19'), ('pname', 'plac

12

I see that P1344 'participant in' is what I am searching for. 

Now let's search something for winning, using as example the 2000 Summer Olympics (Q8544).

In [53]:
queryString = """
SELECT ?p ?pname ?o ?oname WHERE {
    wd:Q8544 ?p ?o .
    
    # this returns the labels
    ?p <http://schema.org/name> ?pname .
    ?o <http://schema.org/name> ?oname .

}
LIMIT 30

"""
print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('pname', 'has part'), ('o', 'http://www.wikidata.org/entity/Q513725'), ('oname', 'synchronized swimming at the 2000 Summer Olympics')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('pname', 'has part'), ('o', 'http://www.wikidata.org/entity/Q579844'), ('oname', 'fencing at the 2000 Summer Olympics')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('pname', 'has part'), ('o', 'http://www.wikidata.org/entity/Q847738'), ('oname', 'cycling at the 2000 Summer Olympics')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('pname', 'has part'), ('o', 'http://www.wikidata.org/entity/Q679703'), ('oname', 'judo at the 2000 Summer Olympics')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('pname', 'instance of'), ('o', 'http://www.wikidata.org/entity/Q159821'), ('oname', 'Summer Olympic Games')]
[('p', 'http://www.wikidata.org/prop/direct/P2670'), ('pname', 'has parts of the class'), ('o', 'http://www.wikidata.org/ent

30

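Let's look for winners in a team sport. The query below inspects Q754589, which the results identify as an association football entry that is part of the 2000 Summer Olympics.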

In [37]:
queryString = """
SELECT ?p ?pname ?o ?oname WHERE {
   
   wd:Q754589 ?p ?o .
    
    # this returns the labels
    ?p <http://schema.org/name> ?pname .
    ?o <http://schema.org/name> ?oname .
    
    
}
LIMIT 10

"""
print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('pname', 'instance of'), ('o', 'http://www.wikidata.org/entity/Q27020041'), ('oname', 'sports season')]
[('p', 'http://www.wikidata.org/prop/direct/P641'), ('pname', 'sport'), ('o', 'http://www.wikidata.org/entity/Q2736'), ('oname', 'association football')]
[('p', 'http://www.wikidata.org/prop/direct/P17'), ('pname', 'country'), ('o', 'http://www.wikidata.org/entity/Q408'), ('oname', 'Australia')]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('pname', 'part of'), ('o', 'http://www.wikidata.org/entity/Q8544'), ('oname', '2000 Summer Olympics')]
[('p', 'http://www.wikidata.org/prop/direct/P3450'), ('pname', 'sports season of league or competition'), ('o', 'http://www.wikidata.org/entity/Q188317'), ('oname', 'association football at the Summer Olympics')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('pname', 'has part'), ('o', 'http://www.wikidata.org/entity/Q795004'), ('oname', "football at the 2000 Summer Olympic

10

Let's analyze Q377600: football at the 2000 Summer Olympics – men's tournament

In [39]:
queryString = """
SELECT ?p ?pname ?o ?oname WHERE {
   
   wd:Q377600 ?p ?o .
    
    # this returns the labels
    ?p <http://schema.org/name> ?pname .
    ?o <http://schema.org/name> ?oname .
    
    
}

"""
print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('pname', 'instance of'), ('o', 'http://www.wikidata.org/entity/Q18536594'), ('oname', 'Olympic sporting event')]
[('p', 'http://www.wikidata.org/prop/direct/P641'), ('pname', 'sport'), ('o', 'http://www.wikidata.org/entity/Q2736'), ('oname', 'association football')]
[('p', 'http://www.wikidata.org/prop/direct/P17'), ('pname', 'country'), ('o', 'http://www.wikidata.org/entity/Q408'), ('oname', 'Australia')]
[('p', 'http://www.wikidata.org/prop/direct/P664'), ('pname', 'organizer'), ('o', 'http://www.wikidata.org/entity/Q40970'), ('oname', 'International Olympic Committee')]
[('p', 'http://www.wikidata.org/prop/direct/P2094'), ('pname', 'competition class'), ('o', 'http://www.wikidata.org/entity/Q31930761'), ('oname', "men's association football")]
[('p', 'http://www.wikidata.org/prop/direct/P1346'), ('pname', 'winner'), ('o', 'http://www.wikidata.org/entity/Q175309'), ('oname', 'Cameroon national football team')]
[('p', 'http:/

9
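
The outputs above suggest two candidate patterns: `wdt:P1344` ('participant in') links an athlete to a competition, and `wdt:P1346` ('winner') links a competition to its winner. The sketch below shows one way the two could be combined into a single query string. It assumes the prefixes from the setup cell, and the variable names are illustrative:

```python
# Participation: athlete -> competition (P1344, 'participant in').
PARTICIPATION_BGP = "?athlete wdt:P1344 ?competition ."
# Victory: competition -> winner (P1346, 'winner').
WINNER_BGP = "?competition wdt:P1346 ?winner ."

# OPTIONAL keeps participations even when no winner is recorded.
queryString = f"""
SELECT ?athlete ?competition ?winner WHERE {{
    {PARTICIPATION_BGP}
    OPTIONAL {{ {WINNER_BGP} }}
}}
LIMIT 10
"""
print(queryString)
```

Note that the winner found above is a team (a national football team), so `?winner` is not necessarily a person.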

### 2. Identify the BGP for different sports or types of competitions



In [44]:
queryString = """
SELECT ?p ?pname ?o ?oname WHERE {
   
    wd:Q159821 ?p ?o .        
    
    # this returns the labels
    ?p <http://schema.org/name> ?pname .
    ?o <http://schema.org/name> ?oname .
    
}
"""
print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('pname', 'has part'), ('o', 'http://www.wikidata.org/entity/Q8098'), ('oname', '1904 Summer Olympics')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('pname', 'has part'), ('o', 'http://www.wikidata.org/entity/Q8150'), ('oname', '1936 Summer Olympics')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('pname', 'has part'), ('o', 'http://www.wikidata.org/entity/Q8415'), ('oname', '1960 Summer Olympics')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('pname', 'has part'), ('o', 'http://www.wikidata.org/entity/Q181278'), ('oname', '2020 Summer Olympics')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('pname', 'instance of'), ('o', 'http://www.wikidata.org/entity/Q18608583'), ('oname', 'recurring sporting event')]
[('p', 'http://www.wikidata.org/prop/direct/P664'), ('pname', 'organizer'), ('o', 'http://www.wikidata.org/entity/Q40970'), ('oname', 'International Olympic Committee')]
[('p', 'http://www.wikida

42

Let's select '2008 Summer Olympics'

In [45]:
queryString = """
SELECT ?p ?pname ?o ?oname WHERE {
   
   wd:Q8567 ?p ?o .        
    
    # this returns the labels
    ?p <http://schema.org/name> ?pname .
    ?o <http://schema.org/name> ?oname .
    
}
"""
print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('pname', 'has part'), ('o', 'http://www.wikidata.org/entity/Q221132'), ('oname', 'field hockey at the 2008 Summer Olympics')]
[('p', 'http://www.wikidata.org/prop/direct/P542'), ('pname', 'officially opened by'), ('o', 'http://www.wikidata.org/entity/Q15029'), ('oname', 'Hu Jintao')]
[('p', 'http://www.wikidata.org/prop/direct/P543'), ('pname', 'oath made by'), ('o', 'http://www.wikidata.org/entity/Q197335'), ('oname', 'Zhang Yining')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('pname', 'has part'), ('o', 'http://www.wikidata.org/entity/Q387526'), ('oname', 'table tennis at the 2008 Summer Olympics')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('pname', 'has part'), ('o', 'http://www.wikidata.org/entity/Q220654'), ('oname', 'wrestling at the 2008 Summer Olympics')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('pname', 'has part'), ('o', 'http://www.wikidata.org/entity/Q204844'), ('oname', 'swimming 

60

Let's use 'canoeing at the 2008 Summer Olympics'

In [48]:
queryString = """
SELECT ?p ?p2 WHERE {
   
    wd:Q318197 ?p wd:Q8567 .
    wd:Q8567 ?p2 wd:Q159821 .
    
}
"""
print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('p2', 'http://www.wikidata.org/prop/direct/P31')]


1

In [None]:
So the BGP is
    ?sport wdt:P361 ?olympicEvent .
    ?olympicEvent wdt:P31 wd:Q159821 .
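
As a sketch of how this BGP could be used, the query string below lists sports events together with the Olympic edition they belong to. It assumes the prefixes from the setup cell, and the variable names `?sportName` and `?editionName` are illustrative. Prefixed names are case-sensitive, so the property is written `wdt:P31`.

```python
queryString = """
SELECT ?sport ?sportName ?olympicEvent ?editionName WHERE {
    ?sport wdt:P361 ?olympicEvent .          # 'part of' an edition
    ?olympicEvent wdt:P31 wd:Q159821 .       # the edition is an instance of Olympic games

    # label BGP for readable names
    ?sport <http://schema.org/name> ?sportName .
    ?olympicEvent <http://schema.org/name> ?editionName .
}
LIMIT 20
"""
print(queryString)
```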

### 3. Is there any French or German athlete with double citizenship participating in any Olympic event?

In [None]:
I use P1344 (participant in) together with the class wd:Q18536594 (Olympic sporting event), which I reached through the property P2670 (has parts of the class).

In [36]:
queryString = """
SELECT DISTINCT ?a ?aname ?c1 ?country1 ?c2 ?country2 WHERE {
   
    ?a wdt:P106/wdt:P279 wd:Q2066131 ;
        wdt:P27 ?c1 ;
        wdt:P27 ?c2 ;
        wdt:P1344 ?competition .
    ?competition wdt:P31 wd:Q18536594 .
    
        
    
    # this returns the labels
    ?a <http://schema.org/name> ?aname .
    ?c1 <http://schema.org/name> ?country1 .   
    ?c2 <http://schema.org/name> ?country2 .   
     
      FILTER ((?c1 = wd:Q183) || (?c1 = wd:Q142) || (?c2 = wd:Q183) || (?c2 = wd:Q142)) . 
      FILTER (?c1 != ?c2) .
}
LIMIT 20

"""
print("Results")
run_query(queryString)

Results
[('a', 'http://www.wikidata.org/entity/Q26252931'), ('aname', 'Jean-Pierre Bourhis'), ('c1', 'http://www.wikidata.org/entity/Q1041'), ('country1', 'Senegal'), ('c2', 'http://www.wikidata.org/entity/Q142'), ('country2', 'France')]
[('a', 'http://www.wikidata.org/entity/Q66684569'), ('aname', 'Simon Bachmann'), ('c1', 'http://www.wikidata.org/entity/Q1042'), ('country1', 'Seychelles'), ('c2', 'http://www.wikidata.org/entity/Q142'), ('country2', 'France')]
[('a', 'http://www.wikidata.org/entity/Q26252931'), ('aname', 'Jean-Pierre Bourhis'), ('c1', 'http://www.wikidata.org/entity/Q142'), ('country1', 'France'), ('c2', 'http://www.wikidata.org/entity/Q1041'), ('country2', 'Senegal')]
[('a', 'http://www.wikidata.org/entity/Q66684569'), ('aname', 'Simon Bachmann'), ('c1', 'http://www.wikidata.org/entity/Q142'), ('country1', 'France'), ('c2', 'http://www.wikidata.org/entity/Q1042'), ('country2', 'Seychelles')]
[('a', 'http://www.wikidata.org/entity/Q1512523'), ('aname', 'Gerhard Schulz

20

I see that some athletes are citizens of the German Democratic Republic. I decide to exclude historical states that no longer exist.
Let's look at the properties of the German Democratic Republic.

In [29]:
queryString = """
SELECT DISTINCT ?p ?pname ?o ?oname WHERE {
   
    wd:Q16957 ?p ?o .
    
    # this returns the labels
    ?p <http://schema.org/name> ?pname .
    ?o <http://schema.org/name> ?oname .
}
"""
print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1151'), ('pname', "topic's main Wikimedia portal"), ('o', 'http://www.wikidata.org/entity/Q14615751'), ('oname', 'Portal:East Germany')]
[('p', 'http://www.wikidata.org/prop/direct/P38'), ('pname', 'currency'), ('o', 'http://www.wikidata.org/entity/Q16068'), ('oname', 'Deutsche Mark')]
[('p', 'http://www.wikidata.org/prop/direct/P150'), ('pname', 'contains administrative territorial entity'), ('o', 'http://www.wikidata.org/entity/Q48054'), ('oname', 'Erfurt District')]
[('p', 'http://www.wikidata.org/prop/direct/P150'), ('pname', 'contains administrative territorial entity'), ('o', 'http://www.wikidata.org/entity/Q48066'), ('oname', 'Halle District')]
[('p', 'http://www.wikidata.org/prop/direct/P150'), ('pname', 'contains administrative territorial entity'), ('o', 'http://www.wikidata.org/entity/Q48083'), ('oname', 'Magdeburg District')]
[('p', 'http://www.wikidata.org/prop/direct/P150'), ('pname', 'contains administrative territoria

80

I use instance of --> historical country (Q3024240).

In [39]:
queryString = """
SELECT DISTINCT ?a ?aname ?c1 ?country1 ?c2 ?country2 WHERE {
   
    ?a wdt:P106/wdt:P279 wd:Q2066131 ;
        wdt:P27 ?c1 ;
        wdt:P27 ?c2 ;
        wdt:P1344 ?competition .
    ?competition wdt:P31 wd:Q18536594 .
    
        
    
    # this returns the labels
    ?a <http://schema.org/name> ?aname .
    ?c1 <http://schema.org/name> ?country1 .   
    ?c2 <http://schema.org/name> ?country2 .   
     
      FILTER ((?c1 = wd:Q183) || (?c1 = wd:Q142) || (?c2 = wd:Q183) || (?c2 = wd:Q142)) . 
      FILTER (?c1 != ?c2) .
      FILTER NOT EXISTS {?c1 wdt:P31 wd:Q3024240} .
      FILTER NOT EXISTS {?c2 wdt:P31 wd:Q3024240} .
}
LIMIT 20

"""
print("Results")
run_query(queryString)

Results
[('a', 'http://www.wikidata.org/entity/Q26252931'), ('aname', 'Jean-Pierre Bourhis'), ('c1', 'http://www.wikidata.org/entity/Q1041'), ('country1', 'Senegal'), ('c2', 'http://www.wikidata.org/entity/Q142'), ('country2', 'France')]
[('a', 'http://www.wikidata.org/entity/Q26252931'), ('aname', 'Jean-Pierre Bourhis'), ('c1', 'http://www.wikidata.org/entity/Q142'), ('country1', 'France'), ('c2', 'http://www.wikidata.org/entity/Q1041'), ('country2', 'Senegal')]
[('a', 'http://www.wikidata.org/entity/Q20016855'), ('aname', 'Stefanie Horn'), ('c1', 'http://www.wikidata.org/entity/Q183'), ('country1', 'Germany'), ('c2', 'http://www.wikidata.org/entity/Q38'), ('country2', 'Italy')]
[('a', 'http://www.wikidata.org/entity/Q20016855'), ('aname', 'Stefanie Horn'), ('c1', 'http://www.wikidata.org/entity/Q38'), ('country1', 'Italy'), ('c2', 'http://www.wikidata.org/entity/Q183'), ('country2', 'Germany')]
[('a', 'http://www.wikidata.org/entity/Q248926'), ('aname', 'Josefa Idem'), ('c1', 'http:/

20
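
Each athlete appears twice in this output because the pattern matches both orderings of `?c1` and `?c2`. One way to collapse the mirrored rows is to post-process the bindings in Python. The sketch below assumes rows shaped as `(athlete, country1, country2)` tuples taken from the output above, and `unique_citizenship_pairs` is a hypothetical helper name:

```python
def unique_citizenship_pairs(rows):
    """Keep one row per athlete and unordered pair of citizenships."""
    seen = set()
    unique = []
    for athlete, country1, country2 in rows:
        # frozenset ignores the order of the two countries
        key = (athlete, frozenset((country1, country2)))
        if key not in seen:
            seen.add(key)
            unique.append((athlete, *sorted((country1, country2))))
    return unique


# Rows copied from the output above.
rows = [
    ("Jean-Pierre Bourhis", "Senegal", "France"),
    ("Jean-Pierre Bourhis", "France", "Senegal"),
    ("Stefanie Horn", "Germany", "Italy"),
    ("Stefanie Horn", "Italy", "Germany"),
]
pairs = unique_citizenship_pairs(rows)
print(pairs)
```

A similar effect could probably be obtained directly in SPARQL by replacing `FILTER (?c1 != ?c2)` with an ordering condition such as `FILTER (STR(?c1) < STR(?c2))`, which also frees half of the `LIMIT 20` budget for distinct athletes.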

### 4. Compare number of participations and winning events across different dimensions

#### 4.1 In which specific sport does France have more winners than Germany?


Let's check whether volleyball at the 2008 Summer Olympics (Q318185) is linked to a winner.


In [7]:
queryString = """
SELECT DISTINCT ?p ?oname ?o WHERE {

    wd:Q318185 ?p ?o .
         
    ?p <http://schema.org/name> ?pname .
    ?o <http://schema.org/name> ?oname .

    
}
"""
print("Results")
run_query(queryString)    

Results
[('p', 'http://www.wikidata.org/prop/direct/P1424'), ('oname', 'Template:VolleyballAt2008SummerOlympics'), ('o', 'http://www.wikidata.org/entity/Q13364506')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('oname', 'volleyball at the Summer Olympics'), ('o', 'http://www.wikidata.org/entity/Q6876')]
[('p', 'http://www.wikidata.org/prop/direct/P276'), ('oname', 'Capital Indoor Stadium'), ('o', 'http://www.wikidata.org/entity/Q1589654')]
[('p', 'http://www.wikidata.org/prop/direct/P17'), ('oname', "People's Republic of China"), ('o', 'http://www.wikidata.org/entity/Q148')]
[('p', 'http://www.wikidata.org/prop/direct/P641'), ('oname', 'volleyball'), ('o', 'http://www.wikidata.org/entity/Q1734')]
[('p', 'http://www.wikidata.org/prop/direct/P156'), ('oname', 'volleyball at the 2012 Summer Olympics'), ('o', 'http://www.wikidata.org/entity/Q209112')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('oname', 'Olympic sports discipline event'), ('o', 'http://www.wikidata.org/entit

12

In [9]:
queryString = """
SELECT DISTINCT ?p ?pname ?o WHERE {

    wd:Q535378 ?p ?o .
    
    ?p <http://schema.org/name> ?pname .


}
"""
print("Results")
run_query(queryString)    

Results
[('p', 'http://www.wikidata.org/prop/direct/P1132'), ('pname', 'number of participants'), ('o', '144')]
[('p', 'http://www.wikidata.org/prop/direct/P1350'), ('pname', 'number of matches played/races/starts'), ('o', '38')]
[('p', 'http://www.wikidata.org/prop/direct/P276'), ('pname', 'location'), ('o', 'http://www.wikidata.org/entity/Q1589654')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('pname', 'instance of'), ('o', 'http://www.wikidata.org/entity/Q18536594')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('pname', 'instance of'), ('o', 'http://www.wikidata.org/entity/Q27020041')]
[('p', 'http://www.wikidata.org/prop/direct/P3450'), ('pname', 'sports season of league or competition'), ('o', 'http://www.wikidata.org/entity/Q3892574')]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('pname', 'part of'), ('o', 'http://www.wikidata.org/entity/Q318185')]
[('p', 'http://www.wikidata.org/prop/direct/P373'), ('pname', 'Commons category'), ('o', '2008 Summer Olympics

15

So I understand that Wikidata only lists individual winners for the tournaments and does not link team players as winners.

In [27]:
queryString = """
SELECT DISTINCT (COUNT(?winner) as ?numWinner) ?country WHERE {

    ?comp wdt:P1346 ?w ;
            wdt:P361*/wdt:P31 wd:Q159821 .
    ?w wdt:P27 ?c .
         
    ?comp <http://schema.org/name> ?competition .
    ?w <http://schema.org/name> ?winner .
    ?c <http://schema.org/name> ?country .
    
    FILTER ((?c = wd:Q183) || (?c = wd:Q142)) .
}
GROUP BY ?country
ORDER BY desc(?numWinner)
LIMIT 10
"""
print("Results")
run_query(queryString)    

Results
[('numWinner', '470'), ('country', 'Germany')]
[('numWinner', '312'), ('country', 'France')]


2

In [31]:
queryString = """
SELECT DISTINCT ?p ?pname ?o ?oname WHERE {

    ?comp wdt:P1346 ?w ;
            ?p ?o ;
            wdt:P361*/wdt:P31 wd:Q159821 .
        
         
    ?p <http://schema.org/name> ?pname .
    ?o <http://schema.org/name> ?oname .
}
LIMIT 100
"""
print("Results")
run_query(queryString) 

Results
[('p', 'http://www.wikidata.org/prop/direct/P1346'), ('pname', 'winner'), ('o', 'http://www.wikidata.org/entity/Q20984245'), ('oname', 'Zhang Yufei')]
[('p', 'http://www.wikidata.org/prop/direct/P155'), ('pname', 'follows'), ('o', 'http://www.wikidata.org/entity/Q25991464'), ('oname', "swimming at the 2016 Summer Olympics – women's 200 metre butterfly")]
[('p', 'http://www.wikidata.org/prop/direct/P2094'), ('pname', 'competition class'), ('o', 'http://www.wikidata.org/entity/Q78157296'), ('oname', "women's 200 metre butterfly")]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('pname', 'instance of'), ('o', 'http://www.wikidata.org/entity/Q18536594'), ('oname', 'Olympic sporting event')]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('pname', 'part of'), ('o', 'http://www.wikidata.org/entity/Q39080758'), ('oname', 'swimming at the 2020 Summer Olympics')]
[('p', 'http://www.wikidata.org/prop/direct/P641'), ('pname', 'sport'), ('o', 'http://www.wikidata.org/entity/Q31920

100

In [None]:
P641 (sport) is what I need.

In [44]:
queryString = """
SELECT DISTINCT (COUNT(?winner) as ?numWinner) ?country ?sportName WHERE {

    ?comp wdt:P1346 ?w ;
            wdt:P641 ?sport ;
            wdt:P361*/wdt:P31 wd:Q159821 .
    ?w wdt:P27 ?c .
         
    ?comp <http://schema.org/name> ?competition .
    ?w <http://schema.org/name> ?winner .
    ?c <http://schema.org/name> ?country .
    ?sport <http://schema.org/name> ?sportName .
    
    FILTER ((?c = wd:Q183) || (?c = wd:Q142)) .
}
GROUP BY ?country ?sportName
ORDER BY desc(?numWinner)
LIMIT 5
"""
print("Results")
run_query(queryString)    

Results
[('numWinner', '99'), ('country', 'Germany'), ('sportName', 'track cycling')]
[('numWinner', '86'), ('country', 'France'), ('sportName', 'track cycling')]
[('numWinner', '73'), ('country', 'Germany'), ('sportName', 'athletics')]
[('numWinner', '48'), ('country', 'Germany'), ('sportName', 'rowing')]
[('numWinner', '43'), ('country', 'France'), ('sportName', 'fencing')]


5

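Fencing appears to be the first sport in this ranking where France does better than Germany: France has 43 fencing winners, while no German fencing row appears among the higher counts.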
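
Reading the comparison off a ranked list becomes error-prone once more sports are involved. The sketch below does the comparison in Python, assuming rows of `(count, country, sport)` and a hypothetical helper `sports_where_ahead`. The sample rows are only the five printed above, so a sport missing for a country defaults to 0 here; that default is meaningful only on the full result set, that is, with the query run without `LIMIT`.

```python
from collections import defaultdict


def sports_where_ahead(rows, leader="France", other="Germany"):
    """Return the sports where `leader` has strictly more winners than `other`."""
    per_sport = defaultdict(dict)
    for count, country, sport in rows:
        per_sport[sport][country] = int(count)
    return sorted(
        sport
        for sport, counts in per_sport.items()
        if counts.get(leader, 0) > counts.get(other, 0)
    )


# The five rows printed above (truncated by LIMIT 5).
rows = [
    ("99", "Germany", "track cycling"),
    ("86", "France", "track cycling"),
    ("73", "Germany", "athletics"),
    ("48", "Germany", "rowing"),
    ("43", "France", "fencing"),
]
ahead = sports_where_ahead(rows)
print(ahead)
```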

#### 4.2 Which country has the highest number of female winners?


Recall that P21 is the property for the gender of a person.

In [4]:
queryString = """
SELECT DISTINCT (COUNT(?winner) as ?numWinner) ?gender ?country WHERE {

    
    
    ?comp wdt:P1346 ?w ;
            wdt:P361*/wdt:P31 wd:Q159821 .
    ?w wdt:P27 ?c ;
        wdt:P21 ?g .
         
    ?comp <http://schema.org/name> ?competition .
    ?w <http://schema.org/name> ?winner .
    ?c <http://schema.org/name> ?country .
    ?g <http://schema.org/name> ?gender .
   
    FILTER ((?c = wd:Q183) || (?c = wd:Q142)) .
    FILTER REGEX (?gender, "female") .
}
GROUP BY ?gender ?country 
ORDER BY desc(?numWinner)
LIMIT 10
"""
print("Results")
run_query(queryString)   

Results
[('numWinner', '148'), ('gender', 'female'), ('country', 'Germany')]
[('numWinner', '45'), ('gender', 'female'), ('country', 'France')]


2

#### 4.3 For each country, which are the athletes with the highest number of participations?

In [3]:
queryString = """
SELECT DISTINCT (COUNT(?c) as ?numPartecipations) ?athlete ?country WHERE {

    ?a wdt:P27 ?c ;
            wdt:P1344 ?competition .
    ?competition wdt:P31 wd:Q159821 .
    
    ?c <http://schema.org/name> ?country .
    ?a <http://schema.org/name> ?athlete .

    
    FILTER ((?c = wd:Q183) || (?c = wd:Q142)) .

}
GROUP BY ?athlete ?country
ORDER BY desc(?numPartecipations)
LIMIT 2
"""
print("Results")
run_query(queryString)   

Results
[('numPartecipations', '7'), ('athlete', 'Jeannie Longo'), ('country', 'France')]
[('numPartecipations', '7'), ('athlete', 'Ralf Schumann'), ('country', 'Germany')]


2
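
`LIMIT 2` happens to return one athlete per country here, but that is not guaranteed in general: the two highest counts could belong to the same country, and ties are cut off. The sketch below selects the top athletes per country in Python, assuming rows of `(count, athlete, country)` obtained from the same query with a larger limit. `top_per_country` is a hypothetical helper name:

```python
def top_per_country(rows):
    """Map each country to (highest count, sorted athletes reaching that count)."""
    best = {}
    for count, athlete, country in rows:
        count = int(count)
        top_count, athletes = best.get(country, (0, []))
        if count > top_count:
            best[country] = (count, [athlete])
        elif count == top_count:
            # keep every athlete tied at the current maximum
            best[country] = (count, sorted(athletes + [athlete]))
    return best


# Rows copied from the output above.
rows = [
    ("7", "Jeannie Longo", "France"),
    ("7", "Ralf Schumann", "Germany"),
]
best = top_per_country(rows)
print(best)
```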