# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload starts with a usually vague, open-ended question and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.
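
As a minimal sketch of how this BGP can be reused, the helper below builds an exploration query that lists the named properties leaving or entering a node. The function name `build_property_query` and its parameters are hypothetical (they are not part of the notebook), and the `wd:` prefix is assumed to be the one declared in the setup cell.

```python
def build_property_query(entity, outgoing=True, limit=20):
    """Return a SPARQL query listing the named properties of `entity`.

    `entity` is a prefixed name such as "wd:Q169898"; `outgoing` selects
    whether the entity is the subject (True) or the object (False).
    """
    # Put the entity on the subject or the object side of the triple pattern.
    pattern = f"{entity} ?p ?o ." if outgoing else f"?s ?p {entity} ."
    return f"""
SELECT DISTINCT ?p ?pname
WHERE {{
  {pattern}
  ?p <http://schema.org/name> ?pname .
}}
LIMIT {int(limit)}
"""


# Example: properties that point to Scuderia Ferrari (IRI from the given table).
print(build_property_query("wd:Q169898", outgoing=False))
```

The returned string has the same shape as the hand-written exploration queries and can be passed to `run_query` unchanged.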
    
    

In [2]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-NOTEBOOK_CODE_HERE-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try:
        results = sparql.query()
        json_results = results.convert()
        if len(json_results['results']['bindings']) == 0:
            print("Empty")
            return 0

        for bindings in json_results['results']['bindings']:
            print([(var, value['value']) for var, value in bindings.items()])

        return len(json_results['results']['bindings'])

    except Exception as e:
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
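
`run_query` prints each binding and returns only the row count. When a later step needs the values themselves, a small pure function can flatten the JSON returned by `results.convert()`. This is a sketch and not part of the original setup: `bindings_to_rows` is a hypothetical helper, and the sample document is hand-built from two rows of the F1 team listing in this notebook, following the standard SPARQL JSON results layout.

```python
def bindings_to_rows(json_results):
    """Flatten a SPARQL JSON result into a list of {variable: value} dicts."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in json_results["results"]["bindings"]
    ]


# Hand-built sample in the SPARQL JSON results layout (no endpoint needed).
sample = {
    "head": {"vars": ["s", "sname"]},
    "results": {"bindings": [
        {"s": {"type": "uri", "value": "http://www.wikidata.org/entity/Q141818"},
         "sname": {"type": "literal", "value": "Scuderia Toro Rosso"}},
        {"s": {"type": "uri", "value": "http://www.wikidata.org/entity/Q173663"},
         "sname": {"type": "literal", "value": "Red Bull Racing"}},
    ]},
}

rows = bindings_to_rows(sample)
print([row["sname"] for row in rows])
```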


# Sport Workflow Series ("F1 pilots explorative search") 

Consider the following exploratory information need:

> explore F1 pilots, their nationality, racing teams, and the country of the parent organization of the team

## Useful URIs for the current workflow


The following are given:

| IRI             | Description        | Role      |
| --------------- | ------------------ | --------- |
| `wdt:P1647`     | subproperty        | predicate |
| `wdt:P31`       | instance of        | predicate |
| `wdt:P106`      | profession         | predicate |
| `wdt:P279`      | subclass           | predicate |
| `wdt:P27`       | nationality        | predicate |
| `wdt:P17`       | country            | predicate |
| `wd:Q10497835`  | F1 Team            | node      |
| `wd:Q10841764`  | F1 Driver          | node      |
| `wd:Q9671`      | Michael Schumacher | node      |
| `wd:Q169898`    | Scuderia Ferrari   | node      |
| `wd:Q27586`     | Ferrari            | node      |
| `wd:Q183`       | Germany            | node      |


Also consider

```
?p wdt:P106 wd:Q10841764 .
?p wdt:P27 wd:Q183 .
```

is the BGP to retrieve all **F1 drivers who are citizens of Germany**

```
?p wdt:P17 wd:Q183 .
?p wdt:P31 wd:Q10497835 .

``` 

is the BGP to retrieve all **F1 teams of Germany**

## Workload Goals

1. Identify the BGP for obtaining the team of a given F1 driver and also the F1 competitions in which they participated

2. Identify the BGP that connects an F1 team to a parent company

3. How many F1 drivers and F1 teams exist in each country?

4. Compare the number of F1 drivers to their nationality

   4.1 Is there any driver who joined a team in their own country?

   4.2 Which team employed the largest number of drivers?

   4.3 For each country, which are the teams and drivers with the highest number of F1 race participations?


In [3]:
# start your workflow here

In [4]:
queryString = """
SELECT COUNT(*)
WHERE { 

?p wdt:P106 wd:Q10841764 .
?p wdt:P27 wd:Q183 .

} 
"""

print("Results")
run_query(queryString)

Results
[('callret-0', '58')]


1

# Workload 1

In [None]:
# 1. Identify the BGP for obtaining the team of a given F1 driver and also the F1 competitions in which they participated

In [44]:
# To find the IRI of each team and to explore it, we can start from the given node wd:Q10497835 --> F1 Team

In [45]:
queryString = """
SELECT  ?s ?sname
WHERE { 

?s wdt:P31 wd:Q10497835 .

?s <http://schema.org/name> ?sname .

} 

LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q141818'), ('sname', 'Scuderia Toro Rosso')]
[('s', 'http://www.wikidata.org/entity/Q173663'), ('sname', 'Red Bull Racing')]
[('s', 'http://www.wikidata.org/entity/Q142381'), ('sname', 'Minardi')]
[('s', 'http://www.wikidata.org/entity/Q171320'), ('sname', 'British American Racing')]
[('s', 'http://www.wikidata.org/entity/Q171420'), ('sname', 'HRT Formula One Team')]
[('s', 'http://www.wikidata.org/entity/Q171992'), ('sname', 'Tyrrell Racing')]
[('s', 'http://www.wikidata.org/entity/Q172765'), ('sname', 'Spyker F1')]
[('s', 'http://www.wikidata.org/entity/Q172974'), ('sname', 'Forti')]
[('s', 'http://www.wikidata.org/entity/Q173012'), ('sname', 'British Racing Motors')]
[('s', 'http://www.wikidata.org/entity/Q173054'), ('sname', 'Midland F1 Racing')]


10

In [46]:
# Explore one F1 team: for example Scuderia Ferrari --> wd:Q169898

In [47]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

?s ?p wd:Q169898 .

?p <http://schema.org/name> ?pname .
} 

LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P108'), ('pname', 'employer')]
[('p', 'http://www.wikidata.org/prop/direct/P1346'), ('pname', 'winner')]
[('p', 'http://www.wikidata.org/prop/direct/P176'), ('pname', 'manufacturer')]
[('p', 'http://www.wikidata.org/prop/direct/P301'), ('pname', "category's main topic")]
[('p', 'http://www.wikidata.org/prop/direct/P355'), ('pname', 'subsidiary')]
[('p', 'http://www.wikidata.org/prop/direct/P463'), ('pname', 'member of')]
[('p', 'http://www.wikidata.org/prop/direct/P54'), ('pname', 'member of sports team')]
[('p', 'http://www.wikidata.org/prop/direct/P664'), ('pname', 'organizer')]
[('p', 'http://www.wikidata.org/prop/direct/P749'), ('pname', 'parent organization')]
[('p', 'http://www.wikidata.org/prop/direct/P921'), ('pname', 'main subject')]
[('p', 'http://www.wikidata.org/prop/direct/P1923'), ('pname', 'participating team')]
[('p', 'http://www.wikidata.org/prop/direct/P3764'), ('pname', 'pole position')]


12

In [48]:
# I want to see which entities are 'member of' Scuderia Ferrari --> wdt:P463

In [49]:
queryString = """
SELECT DISTINCT ?s ?sname
WHERE { 

?s wdt:P463 wd:Q169898 .

?s <http://schema.org/name> ?sname .
} 

LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q171432'), ('sname', 'Didier Pironi')]


1

In [50]:
# I want to see which entities have Scuderia Ferrari as 'employer' --> wdt:P108

In [51]:
queryString = """
SELECT DISTINCT ?s ?sname
WHERE { 

?s wdt:P108 wd:Q169898 .

?s <http://schema.org/name> ?sname .
} 

LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q2835947'), ('sname', 'Ben Agathangelou')]
[('s', 'http://www.wikidata.org/entity/Q7107447'), ('sname', 'Ossi Oikarinen')]
[('s', 'http://www.wikidata.org/entity/Q3165116'), ('sname', 'Jean-Claude Migeot')]
[('s', 'http://www.wikidata.org/entity/Q173631'), ('sname', 'Pat Fry')]
[('s', 'http://www.wikidata.org/entity/Q19788193'), ('sname', 'Mattia Binotto')]
[('s', 'http://www.wikidata.org/entity/Q173692'), ('sname', 'Loïc Bigois')]
[('s', 'http://www.wikidata.org/entity/Q173232'), ('sname', 'Jock Clear')]
[('s', 'http://www.wikidata.org/entity/Q173629'), ('sname', 'Nikolas Tombazis')]
[('s', 'http://www.wikidata.org/entity/Q18572825'), ('sname', 'Maurizio Arrivabene')]
[('s', 'http://www.wikidata.org/entity/Q63992826'), ('sname', 'Diane Holl')]
[('s', 'http://www.wikidata.org/entity/Q173658'), ('sname', 'Mauro Forghieri')]
[('s', 'http://www.wikidata.org/entity/Q173555'), ('sname', 'Nigel Stepney')]


12

In [None]:
# I want to see which entities have Scuderia Ferrari as 'participating team' --> wdt:P1923

In [52]:
queryString = """
SELECT DISTINCT ?s ?sname
WHERE { 

?s wdt:P1923 wd:Q169898 .

?s <http://schema.org/name> ?sname .
} 

LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q17455748'), ('sname', '2016 Formula One World Championship')]
[('s', 'http://www.wikidata.org/entity/Q20755615'), ('sname', '2017 Formula One World Championship')]
[('s', 'http://www.wikidata.org/entity/Q23051084'), ('sname', '2018 Formula One World Championship')]


3

In [53]:
# Now explore the outgoing properties of the given node wd:Q10841764 --> F1 Driver

In [54]:
queryString = """
SELECT  ?p ?pname
WHERE { 

wd:Q10841764 ?p ?o .

?p <http://schema.org/name> ?pname .
} 

LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1424'), ('pname', "topic's main template")]
[('p', 'http://www.wikidata.org/prop/direct/P2354'), ('pname', 'has list')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('pname', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('pname', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P425'), ('pname', 'field of this occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P8408'), ('pname', 'KBpedia ID')]
[('p', 'http://www.wikidata.org/prop/direct/P910'), ('pname', "topic's main category")]


7

In [55]:
# Now explore the properties that point to F1 driver (incoming edges)

In [56]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

?s ?p wd:Q10841764 .

?p <http://schema.org/name> ?pname .
} 

LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1423'), ('pname', 'template has topic')]
[('p', 'http://www.wikidata.org/prop/direct/P106'), ('pname', 'occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P1269'), ('pname', 'facet of')]
[('p', 'http://www.wikidata.org/prop/direct/P1535'), ('pname', 'used by')]
[('p', 'http://www.wikidata.org/prop/direct/P301'), ('pname', "category's main topic")]
[('p', 'http://www.wikidata.org/prop/direct/P3095'), ('pname', 'practiced by')]
[('p', 'http://www.wikidata.org/prop/direct/P360'), ('pname', 'is a list of')]
[('p', 'http://www.wikidata.org/prop/direct/P971'), ('pname', 'category combines topics')]


8

In [57]:
# Now explore a given driver wd:Q9671 Michael Schumacher

In [58]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

wd:Q9671 ?p ?o .

?p <http://schema.org/name> ?pname .
} 

LIMIT 80
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1006'), ('pname', 'Nationale Thesaurus voor Auteurs ID')]
[('p', 'http://www.wikidata.org/prop/direct/P103'), ('pname', 'native language')]
[('p', 'http://www.wikidata.org/prop/direct/P106'), ('pname', 'occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P108'), ('pname', 'employer')]
[('p', 'http://www.wikidata.org/prop/direct/P109'), ('pname', 'signature')]
[('p', 'http://www.wikidata.org/prop/direct/P1258'), ('pname', 'Rotten Tomatoes ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1263'), ('pname', 'NNDB people ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1266'), ('pname', 'AlloCiné person ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1273'), ('pname', 'CANTIC ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1280'), ('pname', 'CONOR.SI ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1285'), ('pname', 'Munzinger Sport number')]
[('p', 'http://www.wikidata.org/prop/direct/P1296'), ('pname', 'Gran Enciclopèd

80

In [None]:
# Explore 'occupation' --> wdt:P106

In [31]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

wd:Q9671 wdt:P106 ?p .

?p <http://schema.org/name> ?pname .
} 

LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/entity/Q10841764'), ('pname', 'Formula One driver')]
[('p', 'http://www.wikidata.org/entity/Q193391'), ('pname', 'diplomat')]
[('p', 'http://www.wikidata.org/entity/Q3014296'), ('pname', 'motorcycle rider')]


3

In [None]:
# Explore 'number of matches played/races/starts' --> wdt:P1350

In [60]:
queryString = """
SELECT DISTINCT ?p 
WHERE { 

wd:Q9671 wdt:P1350 ?p .

#?p <http://schema.org/name> ?pname .
} 

LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('p', '308')]


1

In [None]:
# Explore 'participant in' --> wdt:P1344

In [64]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

wd:Q9671 wdt:P1344 ?p .

?p <http://schema.org/name> ?pname .
} 

LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/entity/Q69713'), ('pname', '1994 Formula One season')]
[('p', 'http://www.wikidata.org/entity/Q69722'), ('pname', '1995 Formula One season')]
[('p', 'http://www.wikidata.org/entity/Q69805'), ('pname', '2000 Formula One season')]
[('p', 'http://www.wikidata.org/entity/Q69835'), ('pname', '2001 Formula One season')]
[('p', 'http://www.wikidata.org/entity/Q69844'), ('pname', '2002 Formula One season')]
[('p', 'http://www.wikidata.org/entity/Q69859'), ('pname', '2003 Formula One season')]
[('p', 'http://www.wikidata.org/entity/Q69869'), ('pname', '2004 Formula One season')]


7

In [None]:
# Explore 'competition class' --> wdt:P2094

In [65]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

wd:Q9671 wdt:P2094 ?p .

?p <http://schema.org/name> ?pname .
} 

LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/entity/Q187579'), ('pname', '24 Hours of Le Mans')]
[('p', 'http://www.wikidata.org/entity/Q1968'), ('pname', 'Formula One')]
[('p', 'http://www.wikidata.org/entity/Q283085'), ('pname', 'Formula Nippon')]


3

In [38]:
# I explore the nodes for 'member of sports team' --> wdt:P54

In [39]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

wd:Q9671 wdt:P54 ?p .

?p <http://schema.org/name> ?pname .
} 

LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/entity/Q169898'), ('pname', 'Scuderia Ferrari')]
[('p', 'http://www.wikidata.org/entity/Q171329'), ('pname', 'Benetton Formula')]
[('p', 'http://www.wikidata.org/entity/Q171367'), ('pname', 'Jordan Grand Prix')]
[('p', 'http://www.wikidata.org/entity/Q172721'), ('pname', 'Mercedes')]


4

In [None]:
# Here I find a way to see all the teams a driver has raced for

In [66]:
# From the results above:
# 'member of sports team' --> wdt:P54
# 'participant in' --> wdt:P1344

In [None]:
# I try to find the F1 team and the F1 competitions for F1 drivers

In [69]:
queryString = """
SELECT DISTINCT ?d ?dname (GROUP_CONCAT(DISTINCT ?sname ; separator = ", ") AS ?scuderie) (GROUP_CONCAT(DISTINCT ?cname ; separator = ", ") AS ?competizioni)
WHERE { 

?d wdt:P106 wd:Q10841764 ;  # ?d --> driver F1
   wdt:P54 ?s ;             # ?s --> their scuderia
   wdt:P1344 ?c .           # ?c --> their competitions
   

?d <http://schema.org/name> ?dname .
?s <http://schema.org/name> ?sname .
?c <http://schema.org/name> ?cname .
} 

LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('d', 'http://www.wikidata.org/entity/Q172815'), ('dname', 'Keith Greene'), ('scuderie', 'Gilby Engineering'), ('competizioni', '24 Hours of Le Mans')]
[('d', 'http://www.wikidata.org/entity/Q172759'), ('dname', 'Riccardo Patrese'), ('scuderie', 'Alfa Romeo'), ('competizioni', '24 Hours of Le Mans')]
[('d', 'http://www.wikidata.org/entity/Q171479'), ('dname', 'Jean-Louis Schlesser'), ('scuderie', 'Maurer Motorsport (Germany), RAM Racing, Williams Grand Prix Engineering'), ('competizioni', '24 Hours of Le Mans')]
[('d', 'http://www.wikidata.org/entity/Q369348'), ('dname', 'Perry McCarthy'), ('scuderie', 'Andrea Moda Formula'), ('competizioni', '24 Hours of Le Mans')]
[('d', 'http://www.wikidata.org/entity/Q173336'), ('dname', 'Innes Ireland'), ('scuderie', 'Bernard White Racing, British Racing Partnership, Reg Parnell Racing, Team Lotus'), ('competizioni', '24 Hours of Le Mans')]
[('d', 'http://www.wikidata.org/entity/Q171481'), ('dname', 'Tony Brooks'), ('scuderie', 'British R

20
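
The aggregated query above joins two multi-valued properties (`wdt:P54` and `wdt:P1344`), so without `GROUP_CONCAT(DISTINCT ...)` each driver would yield one row per (team, competition) pair. As a sketch under that assumption, the same grouping can be done on the client side. `group_by_driver` is a hypothetical helper, and the rows are built from a few of the Michael Schumacher values returned earlier rather than fetched from the endpoint.

```python
from collections import defaultdict
from itertools import product


def group_by_driver(rows):
    """Collect the distinct teams and competitions of each driver."""
    teams = defaultdict(set)
    competitions = defaultdict(set)
    for dname, sname, cname in rows:
        teams[dname].add(sname)
        competitions[dname].add(cname)
    return {d: (sorted(teams[d]), sorted(competitions[d])) for d in teams}


# Values copied from the Michael Schumacher results above (two of each).
teams = ["Scuderia Ferrari", "Benetton Formula"]
seasons = ["1994 Formula One season", "1995 Formula One season"]

# The un-aggregated join yields one row per (team, competition) pair.
rows = [("Michael Schumacher", t, s) for t, s in product(teams, seasons)]

grouped = group_by_driver(rows)
print(len(rows), grouped["Michael Schumacher"])
```

Four joined rows collapse into one entry per driver, which is the effect `GROUP_CONCAT(DISTINCT ...)` has inside the query.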

In [42]:
#
#

# Workload 2

In [None]:
# 2. Identify the BGP that connects an F1 team to a parent company

In [None]:
# We can start by analyzing the F1 team class --> wd:Q10497835

In [44]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

?s ?p wd:Q10497835 .

?p <http://schema.org/name> ?pname .
} 

LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('pname', 'different from')]
[('p', 'http://www.wikidata.org/prop/direct/P301'), ('pname', "category's main topic")]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('pname', 'instance of')]


3

In [None]:
# We pick a team and analyze its properties

In [45]:
queryString = """
SELECT DISTINCT ?s ?sname
WHERE { 

?s wdt:P31 wd:Q10497835 .

?s <http://schema.org/name> ?sname .
} 

LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q141818'), ('sname', 'Scuderia Toro Rosso')]
[('s', 'http://www.wikidata.org/entity/Q173663'), ('sname', 'Red Bull Racing')]
[('s', 'http://www.wikidata.org/entity/Q142381'), ('sname', 'Minardi')]
[('s', 'http://www.wikidata.org/entity/Q171320'), ('sname', 'British American Racing')]
[('s', 'http://www.wikidata.org/entity/Q171420'), ('sname', 'HRT Formula One Team')]
[('s', 'http://www.wikidata.org/entity/Q171992'), ('sname', 'Tyrrell Racing')]
[('s', 'http://www.wikidata.org/entity/Q172765'), ('sname', 'Spyker F1')]
[('s', 'http://www.wikidata.org/entity/Q172974'), ('sname', 'Forti')]
[('s', 'http://www.wikidata.org/entity/Q173012'), ('sname', 'British Racing Motors')]
[('s', 'http://www.wikidata.org/entity/Q173054'), ('sname', 'Midland F1 Racing')]
[('s', 'http://www.wikidata.org/entity/Q173177'), ('sname', 'March Engineering')]
[('s', 'http://www.wikidata.org/entity/Q174008'), ('sname', 'Caterham F1')]
[('s', 'http://www.wikidata.org/en

20

In [None]:
# Choose 'Red Bull Racing' --> wd:Q173663

In [46]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

wd:Q173663 ?p ?o .

?p <http://schema.org/name> ?pname .
} 

LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1056'), ('pname', 'product or material produced')]
[('p', 'http://www.wikidata.org/prop/direct/P127'), ('pname', 'owned by')]
[('p', 'http://www.wikidata.org/prop/direct/P159'), ('pname', 'headquarters location')]
[('p', 'http://www.wikidata.org/prop/direct/P17'), ('pname', 'country')]
[('p', 'http://www.wikidata.org/prop/direct/P18'), ('pname', 'image')]
[('p', 'http://www.wikidata.org/prop/direct/P2002'), ('pname', 'Twitter username')]
[('p', 'http://www.wikidata.org/prop/direct/P2003'), ('pname', 'Instagram username')]
[('p', 'http://www.wikidata.org/prop/direct/P2013'), ('pname', 'Facebook ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2397'), ('pname', 'YouTube channel ID')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('pname', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P3417'), ('pname', 'Quora topic ID')]
[('p', 'http://www.wikidata.org/prop/direct/P355'), ('pname', 'subsidiary')]
[('p', 'http://www

20

In [47]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

?s ?p wd:Q173663.

?p <http://schema.org/name> ?pname .
} 

LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P108'), ('pname', 'employer')]
[('p', 'http://www.wikidata.org/prop/direct/P1346'), ('pname', 'winner')]
[('p', 'http://www.wikidata.org/prop/direct/P176'), ('pname', 'manufacturer')]
[('p', 'http://www.wikidata.org/prop/direct/P1830'), ('pname', 'owner of')]
[('p', 'http://www.wikidata.org/prop/direct/P301'), ('pname', "category's main topic")]
[('p', 'http://www.wikidata.org/prop/direct/P54'), ('pname', 'member of sports team')]
[('p', 'http://www.wikidata.org/prop/direct/P749'), ('pname', 'parent organization')]
[('p', 'http://www.wikidata.org/prop/direct/P1923'), ('pname', 'participating team')]


8

In [57]:
# Here I find 'parent organization' --> wdt:P749, which seems to be exactly what I'm looking for

In [None]:
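# I try it on Red Bull Racing. With the team in the object position, the query returns the entities
# whose parent organization is Red Bull Racing (its child organizations), not the parent of the team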

In [56]:
queryString = """
SELECT DISTINCT ?s ?sname
WHERE { 

?s wdt:P749 wd:Q173663.

?s <http://schema.org/name> ?sname .
} 

LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q2079949'), ('sname', 'Red Bull Junior Team')]


1

In [58]:
# Now I want to identify all F1 teams connected to other organizations through 'parent organization' (if existing)

In [59]:
queryString = """
SELECT DISTINCT ?f1t ?f1teamname (GROUP_CONCAT(DISTINCT ?pcname ; separator = ", ") AS ?parentcompanies)
WHERE { 

?pc wdt:P749 ?f1t.             # ?pc has ?f1t as parent organization (swap subject and object to get the parent of ?f1t)
?f1t wdt:P31 wd:Q10497835 .     # all F1 teams
    
    

?f1t <http://schema.org/name> ?f1teamname .
?pc  <http://schema.org/name> ?pcname .
} 

LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('f1t', 'http://www.wikidata.org/entity/Q169898'), ('f1teamname', 'Scuderia Ferrari'), ('parentcompanies', 'Ferrari Driver Academy')]
[('f1t', 'http://www.wikidata.org/entity/Q172030'), ('f1teamname', 'McLaren'), ('parentcompanies', 'McLaren Young Driver Programme')]
[('f1t', 'http://www.wikidata.org/entity/Q35931'), ('f1teamname', 'Ligier'), ('parentcompanies', 'Microcar')]
[('f1t', 'http://www.wikidata.org/entity/Q173663'), ('f1teamname', 'Red Bull Racing'), ('parentcompanies', 'Red Bull Junior Team')]


4
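
The direction of `wdt:P749` matters here: with the team as object, the rows above list organizations whose parent is the team (for example 'Ferrari Driver Academy' for Scuderia Ferrari). A minimal sketch of both directions follows. `p749_query` and its flag are hypothetical names, and whether this dataset holds a triple with the team as subject (for instance towards `wd:Q27586`, Ferrari, from the given table) is an assumption that still has to be checked by running the query.

```python
def p749_query(team, team_is_subject=True, limit=20):
    """Build a 'parent organization' query around `team` (e.g. "wd:Q169898").

    team_is_subject=True  -> ?org is the parent organization of the team.
    team_is_subject=False -> ?org is an organization whose parent is the team.
    """
    if team_is_subject:
        pattern = f"{team} wdt:P749 ?org ."
    else:
        pattern = f"?org wdt:P749 {team} ."
    return f"""
SELECT DISTINCT ?org ?orgname
WHERE {{
  {pattern}
  ?org <http://schema.org/name> ?orgname .
}}
LIMIT {int(limit)}
"""


# Parent of Scuderia Ferrari versus organizations that have it as parent.
print(p749_query("wd:Q169898", team_is_subject=True))
print(p749_query("wd:Q169898", team_is_subject=False))
```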

In [None]:
#
#

# Workload 3

In [None]:
# 3. How many F1 drivers and F1 teams exist in each country?

In [None]:
# Get:
# wd:Q10841764 --> F1 driver
# wdt:P27 --> nationality

In [None]:
# Count the F1 drivers for each nationality (all F1 drivers in the history of F1)

In [5]:
queryString = """
SELECT DISTINCT  ?countryname count(?d)
WHERE { 

?d   wdt:P27 ?country ;
     wdt:P106 wd:Q10841764 .    # ?d --> driver
             
    
    

#?d <http://schema.org/name> ?dname .
?country  <http://schema.org/name> ?countryname .
} 

LIMIT 40
"""

print("Results")
run_query(queryString)

Results
[('countryname', 'Poland'), ('callret-1', '2')]
[('countryname', 'Australia'), ('callret-1', '3')]
[('countryname', 'New Zealand'), ('callret-1', '1')]
[('countryname', 'France'), ('callret-1', '14')]
[('countryname', 'Northern Ireland'), ('callret-1', '1')]
[('countryname', 'Mexico'), ('callret-1', '5')]
[('countryname', 'Spain'), ('callret-1', '18')]
[('countryname', 'Belgium'), ('callret-1', '28')]
[('countryname', 'Venezuela'), ('callret-1', '1')]
[('countryname', 'Thailand'), ('callret-1', '2')]
[('countryname', 'United Kingdom'), ('callret-1', '172')]
[('countryname', 'Kingdom of Italy'), ('callret-1', '9')]
[('countryname', 'Finland'), ('callret-1', '9')]
[('countryname', 'West Germany'), ('callret-1', '1')]
[('countryname', 'German Democratic Republic'), ('callret-1', '3')]
[('countryname', 'Italy'), ('callret-1', '19')]
[('countryname', 'United Kingdom of Great Britain and Ireland'), ('callret-1', '16')]
[('countryname', 'Germany'), ('callret-1', '58')]
[('countryname'

34
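
The endpoint used here accepts an aggregate without an alias and without `GROUP BY`, which is why the count column is named `callret-1`. SPARQL 1.1 itself requires the `(COUNT(...) AS ?var)` form and a `GROUP BY` on every non-aggregated projected variable. The sketch below shows a portable version of the same count; the function name and the `?drivers` alias are illustrative choices, and `COUNT(DISTINCT ?d)` is a deliberate change that may give different numbers from the plain `count(?d)` above if a driver is matched more than once.

```python
def drivers_per_country_query(limit=40):
    """Portable SPARQL 1.1 form of the drivers-per-country count."""
    return f"""
SELECT ?countryname (COUNT(DISTINCT ?d) AS ?drivers)
WHERE {{
  ?d wdt:P27 ?country ;
     wdt:P106 wd:Q10841764 .    # ?d --> F1 driver
  ?country <http://schema.org/name> ?countryname .
}}
GROUP BY ?countryname
ORDER BY DESC(?drivers)
LIMIT {int(limit)}
"""


print(drivers_per_country_query())
```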

In [None]:
# Now I have to find a way to count the F1 teams per country

In [100]:
queryString = """
SELECT DISTINCT  ?cname count(?f1t)
WHERE { 

 ?f1t   wdt:P17   ?c .               # ?c --> country
 ?f1t   wdt:P31 wd:Q10497835 .       # ?f1t --> all formula one teams
    

?c  <http://schema.org/name> ?cname .
} 

GROUP BY (?cname)
LIMIT 40
"""

print("Results")
run_query(queryString)

Results
[('cname', 'Australia'), ('callret-1', '1')]
[('cname', 'France'), ('callret-1', '6')]
[('cname', 'Netherlands'), ('callret-1', '2')]
[('cname', 'Spain'), ('callret-1', '1')]
[('cname', 'United Kingdom'), ('callret-1', '26')]
[('cname', 'Italy'), ('callret-1', '16')]
[('cname', 'Germany'), ('callret-1', '2')]
[('cname', 'Malaysia'), ('callret-1', '1')]
[('cname', 'Russia'), ('callret-1', '2')]
[('cname', 'Japan'), ('callret-1', '2')]
[('cname', 'Switzerland'), ('callret-1', '1')]


11

In [None]:
# Now combine the two queries above: drivers per country, with the team count attached through OPTIONAL

In [121]:
queryString = """
SELECT DISTINCT  ?countryname (count(distinct ?d) as ?drivers) (count (distinct ?f1t) as ?f1teams)
WHERE { 

  
?d wdt:P106 wd:Q10841764 ;     # ?d --> driver
        wdt:P27 ?country .     
    
OPTIONAL{
?f1t  wdt:P17   ?country ;              # ?country --> country of the team
      wdt:P31 wd:Q10497835 .   }      # ?f1t --> all formula one teams
    
{SELECT DISTINCT  ?country
WHERE { 
?d wdt:P106 wd:Q10841764 ;     # ?d --> driver
        wdt:P27 ?country . 
}}
#?c  <http://schema.org/name> ?countrynam .
?country  <http://schema.org/name> ?countryname .

} 


LIMIT 40
"""

print("Results")
run_query(queryString)

Results
[('countryname', 'South Africa'), ('drivers', '27'), ('f1teams', '0')]
[('countryname', 'Brazil'), ('drivers', '6'), ('f1teams', '0')]
[('countryname', 'Portugal'), ('drivers', '5'), ('f1teams', '0')]
[('countryname', 'Ireland'), ('drivers', '7'), ('f1teams', '0')]
[('countryname', 'Argentina'), ('drivers', '25'), ('f1teams', '0')]
[('countryname', 'Wales'), ('drivers', '1'), ('f1teams', '0')]
[('countryname', 'Georgia'), ('drivers', '1'), ('f1teams', '0')]
[('countryname', 'United Kingdom'), ('drivers', '172'), ('f1teams', '26')]
[('countryname', 'Poland'), ('drivers', '2'), ('f1teams', '0')]
[('countryname', 'Canada'), ('drivers', '16'), ('f1teams', '0')]
[('countryname', 'Spain'), ('drivers', '18'), ('f1teams', '1')]
[('countryname', 'United Kingdom of Great Britain and Ireland'), ('drivers', '16'), ('f1teams', '0')]
[('countryname', 'Kingdom of Italy'), ('drivers', '9'), ('f1teams', '0')]
[('countryname', 'Australia'), ('drivers', '3'), ('f1teams', '1')]
[('countryname', 'J

34
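
The `OPTIONAL` block behaves like a left join from driver countries to team countries: a country without teams still appears, with a team count of 0. The sketch below reproduces that join on the client side. `merge_counts` is a hypothetical helper, and the two dictionaries are small excerpts copied from the per-country outputs above, not the full result sets.

```python
def merge_counts(drivers, teams):
    """Left-join team counts onto driver counts, defaulting to 0 teams."""
    return {
        country: (n_drivers, teams.get(country, 0))
        for country, n_drivers in drivers.items()
    }


# Excerpts of the two per-country results shown above.
drivers = {"United Kingdom": 172, "Spain": 18, "Australia": 3, "Poland": 2}
teams = {"United Kingdom": 26, "Spain": 1, "Australia": 1}

merged = merge_counts(drivers, teams)
for country, (n_drivers, n_teams) in merged.items():
    print(country, n_drivers, n_teams)
```

For these four countries the merged pairs agree with the rows returned by the combined query.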

In [None]:
#
#

# Workload 4

In [None]:
# 4. Compare number of F1 drivers to their nationality

## Workload 4.1

In [127]:
# 4.1 Is there any driver who joined a team in their own country?

In [None]:
# Here I find all the F1 drivers who were members of a team based in their own country of citizenship (wdt:P27).
# I use GROUP_CONCAT to show all the teams (scuderie) of the same driver. All the IRIs come from the queries above.

In [22]:
queryString = """
SELECT DISTINCT  ?countryname ?d ?dname (GROUP_CONCAT(DISTINCT ?scuderia ; separator = ", ") AS ?scuderie) ?country_teams
WHERE {


?f1t  wdt:P17   ?cc ;                # ?cc --> country of the team (scuderia)
      wdt:P31 wd:Q10497835 .       
    
  
?d   wdt:P106 wd:Q10841764 ;       # ?d --> driver
     wdt:P54 ?f1t ;                # ?f1t --> formula one team (scuderia)
     wdt:P27 ?c .                  # ?c --> country
        

?d  <http://schema.org/name> ?dname .
?c  <http://schema.org/name> ?countryname .
?cc  <http://schema.org/name> ?country_teams .
?f1t <http://schema.org/name> ?scuderia .


FILTER(?c = ?cc)
} 


LIMIT 40
"""

print("Results")
run_query(queryString)

Results
[('countryname', 'Italy'), ('d', 'http://www.wikidata.org/entity/Q171312'), ('dname', 'Mario Andretti'), ('scuderie', 'Alfa Romeo'), ('country_teams', 'Italy')]
[('countryname', 'United Kingdom'), ('d', 'http://www.wikidata.org/entity/Q17319645'), ('dname', 'George Russell'), ('scuderie', 'Williams Grand Prix Engineering'), ('country_teams', 'United Kingdom')]
[('countryname', 'France'), ('d', 'http://www.wikidata.org/entity/Q171767'), ('dname', 'Johnny Servoz-Gavin'), ('scuderie', 'Equipe Matra Sports'), ('country_teams', 'France')]
[('countryname', 'Italy'), ('d', 'http://www.wikidata.org/entity/Q171518'), ('dname', 'Arturo Merzario'), ('scuderie', 'Scuderia Ferrari'), ('country_teams', 'Italy')]
[('countryname', 'United Kingdom'), ('d', 'http://www.wikidata.org/entity/Q9673'), ('dname', 'Lewis Hamilton'), ('scuderie', 'McLaren, Mercedes'), ('country_teams', 'United Kingdom')]
[('countryname', 'United Kingdom'), ('d', 'http://www.wikidata.org/entity/Q171720'), ('dname', 'Jack

28
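
Because `?c` and `?cc` are both IRIs, `FILTER(?c = ?cc)` can also be expressed by using one shared variable in both triple patterns, which turns the comparison into part of the join. This is a sketch of that alternative, not the query that produced the rows above: the function name and the variable names are illustrative, and the aggregation of team names is left out to keep it short.

```python
def same_country_query(limit=40):
    """Drivers who raced for a team whose country equals their citizenship."""
    return f"""
SELECT DISTINCT ?dname ?teamname ?countryname
WHERE {{
  ?team wdt:P31 wd:Q10497835 ;     # F1 team
        wdt:P17 ?country .         # country of the team
  ?d wdt:P106 wd:Q10841764 ;       # F1 driver
     wdt:P54 ?team ;               # member of that team
     wdt:P27 ?country .            # same ?country: citizenship must match
  ?d       <http://schema.org/name> ?dname .
  ?team    <http://schema.org/name> ?teamname .
  ?country <http://schema.org/name> ?countryname .
}}
LIMIT {int(limit)}
"""


print(same_country_query())
```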

In [None]:
#
#

## Workload 4.2

In [None]:
# 4.2 Which team employed the largest number of drivers?

In [None]:
# For this question too, I'm going to use the IRIs found above.
# I start by counting the (driver, team) pairs linked by 'member of sports team' --> wdt:P54

In [138]:
queryString = """
SELECT  count(?d)
WHERE { 

?d   wdt:P106 wd:Q10841764 ;       # ?d --> driver
     wdt:P54 ?f1t .                # ?f1t --> formula one team (scuderia)

?d <http://schema.org/name> ?dname .
} 

LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('callret-0', '306')]


1

In [None]:
# I find the teams and their number of F1 drivers over the whole history of Formula One

In [9]:
queryString = """
SELECT DISTINCT ?f1tname (count(?d) AS ?numerodrivers)
WHERE { 

?d   wdt:P106 wd:Q10841764 ;       # ?d --> driver
     wdt:P54 ?f1t .                # ?f1t --> formula one team (scuderia)

?d <http://schema.org/name> ?dname .
?f1t <http://schema.org/name> ?f1tname .

#FILTER REGEX(?f1tname, "Toro")
} 
ORDER BY DESC (?numerodrivers) 

LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('f1tname', 'Team Lotus'), ('numerodrivers', '17')]
[('f1tname', 'Scuderia Ferrari'), ('numerodrivers', '15')]
[('f1tname', 'British Racing Motors'), ('numerodrivers', '14')]
[('f1tname', 'McLaren'), ('numerodrivers', '14')]
[('f1tname', 'Cooper Car Company'), ('numerodrivers', '13')]
[('f1tname', 'Williams Grand Prix Engineering'), ('numerodrivers', '13')]
[('f1tname', 'Alfa Romeo'), ('numerodrivers', '10')]
[('f1tname', 'Brabham'), ('numerodrivers', '9')]
[('f1tname', 'Scuderia Centro Sud'), ('numerodrivers', '9')]
[('f1tname', 'Reg Parnell Racing'), ('numerodrivers', '8')]


10

In [None]:
# Now, to get the team with the maximum number of drivers, I can order by descending count
# and set LIMIT to 1

In [155]:
queryString = """
SELECT DISTINCT ?f1tname (count(?d) AS ?numerodrivers)
WHERE { 

?d   wdt:P106 wd:Q10841764 ;       # ?d --> driver
     wdt:P54 ?f1t .                # ?f1t --> formula one team (scuderia)

?d <http://schema.org/name> ?dname .
?f1t <http://schema.org/name> ?f1tname .

#FILTER REGEX(?f1tname, "Toro")
} 
ORDER BY DESC (?numerodrivers) 

LIMIT 1
"""

print("Results")
run_query(queryString)

Results
[('f1tname', 'Team Lotus'), ('numerodrivers', '17')]


1
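
`ORDER BY DESC ... LIMIT 1` returns a single row even when several teams share the maximum (in the top-10 list above, two teams are tied at 14 and two at 13). Team Lotus is alone at 17, so the answer is not affected here, but a tie-aware selection is easy to sketch on the client side. `top_with_ties` is a hypothetical helper and the rows are copied from the top-10 output.

```python
def top_with_ties(rows):
    """Return every name that reaches the maximum count."""
    if not rows:
        return []
    best = max(count for _, count in rows)
    return [name for name, count in rows if count == best]


# (team, number of drivers) pairs copied from the top-10 output above.
rows = [
    ("Team Lotus", 17),
    ("Scuderia Ferrari", 15),
    ("British Racing Motors", 14),
    ("McLaren", 14),
]

print(top_with_ties(rows))
print(top_with_ties(rows[2:]))
```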

In [None]:
# 
#

## Workload 4.3

In [None]:
# 4.3 For each country, which are the teams and drivers with the highest number of F1 race participations?

In [None]:
# In workload 1 I already found 'number of matches played/races/starts' --> wdt:P1350,
# which I can use for this question

In [None]:
# Start by finding the drivers with the highest number of F1 races

In [4]:
queryString = """
SELECT DISTINCT ?dname ?nr
WHERE { 

?d   wdt:P106 wd:Q10841764 ;       # ?d --> driver
     wdt:P1350 ?nr .

?d <http://schema.org/name> ?dname .
#?f1t <http://schema.org/name> ?f1tname .   # disabled: ?f1t is bound nowhere else, so this only adds a cross product


} 


LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('dname', 'Michael Schumacher'), ('nr', '308')]


1

In [6]:
# 'number of matches played/races/starts' --> wdt:P1350 does not seem to be usable:
# among the F1 drivers it is set only for Michael Schumacher,
# so the question cannot be answered this way

In [None]:
# Try wd:Q9673 --> "Lewis Hamilton" with wdt:P1355: it gives empty results

In [5]:
queryString = """
SELECT DISTINCT ?dname ?nr 
WHERE { 

wd:Q9673  wdt:P1355 ?nr .

wd:Q9673 <http://schema.org/name> ?dname .

#FILTER (isNumeric(?nr))

} 

LIMIT 100
"""

print("Results")
run_query(queryString)

Results
Empty


0
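
An emptiness check like the one above can also be phrased as an ASK query, which is what `run_ask_query` from the setup cell is for. The sketch below builds such a query and reads the answer from the JSON document. Both function names are hypothetical, and the two responses are hand-built in the standard SPARQL JSON layout for ASK results rather than returned by the endpoint.

```python
def has_value_ask(entity, prop):
    """Build an ASK query checking whether `entity` has any value for `prop`."""
    return f"ASK {{ {entity} {prop} ?value . }}"


def ask_to_bool(json_result):
    """Extract the answer of an ASK result; None means the query failed."""
    if json_result is None:
        return None
    return bool(json_result["boolean"])


query = has_value_ask("wd:Q9673", "wdt:P1350")
print(query)

# Hand-built ASK responses (no endpoint needed).
print(ask_to_bool({"head": {}, "boolean": False}))
print(ask_to_bool(None))
```

In the notebook this would be called as `ask_to_bool(run_ask_query(query))`, since `run_ask_query` returns the converted JSON, or `None` when the request fails.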