# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload starts with a usually vague, open-ended question and does not assume that the person issuing it has a clear understanding of the data in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP (basic graph pattern) that returns the human-readable name of a property or a class in Wikidata.
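As a minimal sketch, the name pattern above can be generated for any variable by a small helper. The function `name_pattern` is hypothetical and is not part of the notebook setup; it only builds a string.

```python
def name_pattern(var, name_var=None):
    """Return the triple pattern that binds the human-readable name of ?var.

    `name_pattern` is a hypothetical helper, not part of the original setup.
    """
    var = var.lstrip("?")
    # Default to "<var>name", the naming style used by the queries in this notebook.
    name_var = (name_var or var + "name").lstrip("?")
    return f"?{var} <http://schema.org/name> ?{name_var} ."


print(name_pattern("p", "name"))
```

The returned string can be concatenated into the `WHERE` clause of any query in the workflow.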
    
    

In [1]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-f4291c389f-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# SELECT and CONSTRUCT queries
def run_query(queryString):
    """Run a query, print one line per result row, and return the row count."""
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try:
        results = sparql.query()
        json_results = results.convert()
        bindings_list = json_results['results']['bindings']
        if len(bindings_list) == 0:
            print("Empty")
            return 0

        for bindings in bindings_list:
            print([(var, value['value']) for var, value in bindings.items()])

        return len(bindings_list)

    except Exception as e:
        # On failure the error is printed and the function implicitly returns None.
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
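`run_query` prints every row and returns only the row count. If the rows are needed for further processing in Python, a sketch like the following could flatten the JSON results first. The helper `bindings_to_rows` is hypothetical, and the sample dictionary is hand-written in the SPARQL JSON results layout rather than fetched from the endpoint.

```python
def bindings_to_rows(json_results):
    """Flatten a SPARQL JSON result into a list of {variable: value} dicts."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in json_results["results"]["bindings"]
    ]


# Hand-written sample in the SPARQL JSON results layout (not endpoint output).
sample = {
    "head": {"vars": ["obj", "objname"]},
    "results": {"bindings": [
        {"obj": {"type": "uri", "value": "http://www.wikidata.org/entity/Q19317"},
         "objname": {"type": "literal", "value": "FIFA World Cup"}},
    ]},
}

rows = bindings_to_rows(sample)
print(rows)
```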


# Sport Workflow Series ("FIFA World Cup events explorative search") 

Consider the following exploratory information need:

> Explore the participants and winners of different editions of the FIFA World Cup

## Useful URIs for the current workflow


The following are given:

| IRI           | Description                  | Role      |
| ------------- | ---------------------------- | --------- |
| `wdt:P1647`   | subproperty                  | predicate |
| `wdt:P31`     | instance of                  | predicate |
| `wdt:P106`    | profession                   | predicate |
| `wdt:P279`    | subclass                     | predicate |
| `wdt:P27`     | nationality                  | predicate |
| `wdt:P1532`   | country for sport            | predicate |
| `wd:Q29`      | Spain                        | node      |
| `wd:Q42267`   | Spain national football team | node      |
| `wd:Q19317`   | FIFA World Cup               | node      |
| `wd:Q176883`  | 2010 FIFA World Cup          | node      |
| `wd:Q208401`  | 2010 FIFA World Cup Final    | node      |
| `wd:Q6979593` | national football team       | node      |


Also consider

```
?a wdt:P31 wd:Q6979593
```

is the BGP to retrieve all the **national football teams**.


## Workload Goals

1. Identify the BGP for connecting national football teams with their participation in the finals or in the world cups

2. Identify the BGP that connects a country with a national football team or, as a location, with a world cup

3. In which countries did a FIFA world cup or world cup final take place? With how many participants?

4. Compare number of participants and winners of different world cups
 
   4.1 How many times each country participated in a world cup?
   
   4.2 Which country participated in the most finals? Which country won the most finals?
   
   4.3 Who participated in the most world cups among Italy, France, Spain, and Germany?


In [1]:
# start your workflow here

In [2]:
queryString = """
SELECT COUNT(*)
WHERE { 

?a wdt:P31 wd:Q6979593
} 
"""

print("Results")
run_query(queryString)

Results
[('callret-0', '1220')]


1

## Task 1 -- Identify the BGP for connecting national football teams with the participations in the finals or in the world cups

#### I find all properties of national football teams

In [4]:
queryString = """
SELECT DISTINCT ?p ?pname (COUNT(?p) AS ?number) WHERE { 

?team wdt:P31 wd:Q6979593;
      ?p ?obj.
      
?p <http://schema.org/name> ?pname .
} GROUP BY ?p ?pname
ORDER BY DESC (?number)
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1352'), ('pname', 'ranking'), ('number', '14027')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('pname', 'instance of'), ('number', '1376')]
[('p', 'http://www.wikidata.org/prop/direct/P641'), ('pname', 'sport'), ('number', '1207')]
[('p', 'http://www.wikidata.org/prop/direct/P2094'), ('pname', 'competition class'), ('number', '1199')]
[('p', 'http://www.wikidata.org/prop/direct/P17'), ('pname', 'country'), ('number', '1124')]
[('p', 'http://www.wikidata.org/prop/direct/P1532'), ('pname', 'country for sport'), ('number', '1055')]
[('p', 'http://www.wikidata.org/prop/direct/P41'), ('pname', 'flag image'), ('number', '1019')]
[('p', 'http://www.wikidata.org/prop/direct/P3441'), ('pname', 'FIFA country code'), ('number', '952')]
[('p', 'http://www.wikidata.org/prop/direct/P127'), ('pname', 'owned by'), ('number', '919')]
[('p', 'http://www.wikidata.org/prop/direct/P646'), ('pname', 'Freebase ID'), ('number', '883')]
[('p', 'http:

115

In [None]:
queryString = """
SELECT DISTINCT ?p ?pname WHERE { 

?team wdt:P31 wd:Q6979593.
?sub ?p ?team.
      
?p <http://schema.org/name> ?pname .
} 
"""

print("Results")
run_query(queryString)

#### I check the property "participant in" (P1344)

In [9]:
queryString = """
SELECT DISTINCT ?sub ?subname WHERE { 

?team wdt:P31 wd:Q6979593.
?sub wdt:P1344 ?team.
      
?sub <http://schema.org/name> ?subname .
} 
"""

print("Results")
run_query(queryString)

Results
[('sub', 'http://www.wikidata.org/entity/Q100973539'), ('subname', 'Hlieb Roŭda')]
[('sub', 'http://www.wikidata.org/entity/Q19857005'), ('subname', 'Robert Burbano')]
[('sub', 'http://www.wikidata.org/entity/Q16941646'), ('subname', 'Javier Charcopa')]
[('sub', 'http://www.wikidata.org/entity/Q21070693'), ('subname', 'Eddy Corozo')]
[('sub', 'http://www.wikidata.org/entity/Q98668875'), ('subname', 'Thiago Prieto Acosta')]
[('sub', 'http://www.wikidata.org/entity/Q21682606'), ('subname', 'Bryan Ruiz')]
[('sub', 'http://www.wikidata.org/entity/Q18387214'), ('subname', 'Kasim Nuhu')]
[('sub', 'http://www.wikidata.org/entity/Q18417603'), ('subname', 'Osbaldo Lastra')]


8

#### Nothing useful. Let's try "participant" (P710)

In [None]:
queryString = """
SELECT DISTINCT ?sub ?subname WHERE { 

?team wdt:P31 wd:Q6979593.
?sub wdt:P710 ?team.
      
?sub <http://schema.org/name> ?subname .
} 
"""

print("Results")
run_query(queryString)

#### Nothing useful here either. Let's try "participating team" (P1923)

In [None]:
queryString = """
SELECT DISTINCT ?sub ?subname WHERE { 

?team wdt:P31 wd:Q6979593.
?sub wdt:P1923 ?team.
      
?sub <http://schema.org/name> ?subname .
} 
"""

print("Results")
run_query(queryString)

#### Found. Now I select only the World Cups (wd:Q19317)

In [12]:
queryString = """
SELECT DISTINCT ?sub ?subname WHERE { 

?team wdt:P31 wd:Q6979593.
?sub wdt:P1923 ?team;
     wdt:P31 wd:Q19317.
      
?sub <http://schema.org/name> ?subname .
} 
"""

print("Results")
run_query(queryString)

Results
Empty


0

#### The World Cups are not stored as "instance of" "FIFA World Cup". Let's pick, for example, the 2010 World Cup (Q176883) and inspect its properties

In [14]:
queryString = """
SELECT DISTINCT ?p ?pname WHERE { 

wd:Q176883 ?p ?obj.
      
?p <http://schema.org/name> ?pname .
} 
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P4967'), ('pname', 'prerequisite')]
[('p', 'http://www.wikidata.org/prop/direct/P1132'), ('pname', 'number of participants')]
[('p', 'http://www.wikidata.org/prop/direct/P1346'), ('pname', 'winner')]
[('p', 'http://www.wikidata.org/prop/direct/P1350'), ('pname', 'number of matches played/races/starts')]
[('p', 'http://www.wikidata.org/prop/direct/P1424'), ('pname', "topic's main template")]
[('p', 'http://www.wikidata.org/prop/direct/P17'), ('pname', 'country')]
[('p', 'http://www.wikidata.org/prop/direct/P18'), ('pname', 'image')]
[('p', 'http://www.wikidata.org/prop/direct/P2094'), ('pname', 'competition class')]
[('p', 'http://www.wikidata.org/prop/direct/P214'), ('pname', 'VIAF ID')]
[('p', 'http://www.wikidata.org/prop/direct/P227'), ('pname', 'GND ID')]
[('p', 'http://www.wikidata.org/prop/direct/P244'), ('pname', 'Library of Congress authority ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2581'), ('pname', 'BabelNet ID')]


42

#### Let's print its "instance of"

In [15]:
queryString = """
SELECT DISTINCT ?obj ?objname WHERE { 

wd:Q176883 wdt:P31 ?obj.
      
?obj <http://schema.org/name> ?objname .
} 
"""

print("Results")
run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q27020041'), ('objname', 'sports season')]


1

#### Let's use this (instance of "sports season")

In [None]:
queryString = """
SELECT DISTINCT ?sub ?subname WHERE { 

?team wdt:P31 wd:Q6979593.
?sub wdt:P1923 ?team;
     wdt:P31 wd:Q27020041.
      
?sub <http://schema.org/name> ?subname .
} 
"""

print("Results")
run_query(queryString)

#### Still too much data. I check the property "sports season of league or competition" (P3450)

In [17]:
queryString = """
SELECT DISTINCT ?obj ?objname WHERE { 

wd:Q176883 wdt:P3450 ?obj.
      
?obj <http://schema.org/name> ?objname .
} 
"""

print("Results")
run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q19317'), ('objname', 'FIFA World Cup')]


1

#### Found the connection!!

In [19]:
queryString = """
SELECT DISTINCT ?sub ?subname WHERE { 

?team wdt:P31 wd:Q6979593.
?sub wdt:P1923 ?team;
     wdt:P3450 wd:Q19317.
      
?sub <http://schema.org/name> ?subname .
} 
"""

print("Results")
run_query(queryString)

Results
[('sub', 'http://www.wikidata.org/entity/Q101751'), ('subname', '1994 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q131591'), ('subname', '1938 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q132664'), ('subname', '1970 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q48432'), ('subname', '1930 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q176883'), ('subname', '2010 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q37285'), ('subname', '2006 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q46934'), ('subname', '1982 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q46938'), ('subname', '1986 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q46943'), ('subname', '1978 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q47735'), ('subname', '2002 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q79859'), ('subname', '2014 FIFA World Cup')]
[('sub', 'http://www.wikidata.o

20

#### I run the query for the Spain national team (wd:Q42267)

In [2]:
queryString = """
SELECT DISTINCT ?sub ?subname WHERE { 


?sub wdt:P1923 wd:Q42267;
     wdt:P3450 wd:Q19317.
      
?sub <http://schema.org/name> ?subname .
} 
"""

print("Results")
run_query(queryString)

Results
[('sub', 'http://www.wikidata.org/entity/Q101751'), ('subname', '1994 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q176883'), ('subname', '2010 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q37285'), ('subname', '2006 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q46934'), ('subname', '1982 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q46938'), ('subname', '1986 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q46943'), ('subname', '1978 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q47735'), ('subname', '2002 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q79859'), ('subname', '2014 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q101730'), ('subname', '1998 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q132515'), ('subname', '1950 FIFA World Cup')]
[('sub', 'http://www.wikidata.org/entity/Q132529'), ('subname', '1990 FIFA World Cup')]
[('sub', 'http://www.wikidata.

14
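The BGP found in Task 1 can be reused for any team by substituting its identifier. The sketch below only builds the query string; `world_cup_editions_query` is a hypothetical helper, and its output is meant to be passed to the `run_query` function defined in the setup cell.

```python
def world_cup_editions_query(team_qid):
    """Build the Task 1 query for one team, e.g. team_qid="Q42267" (Spain)."""
    # Accept only plain Wikidata item ids such as "Q42267".
    if not (team_qid.startswith("Q") and team_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item id: {team_qid!r}")
    return f"""
SELECT DISTINCT ?sub ?subname WHERE {{
  ?sub wdt:P1923 wd:{team_qid} ;
       wdt:P3450 wd:Q19317 .
  ?sub <http://schema.org/name> ?subname .
}}
"""


queryString = world_cup_editions_query("Q42267")
print(queryString)
```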

## Task 2 - Identify the BGP that connects a country with a national football team or as a location to a world cup

#### Let's see if the given property "country for sport" (wdt:P1532) gives what we need

In [5]:
queryString = """
SELECT DISTINCT ?obj ?objname WHERE { 

?team wdt:P31 wd:Q6979593;
      wdt:P1532 ?obj.
      
?obj <http://schema.org/name> ?objname .
} 
"""

print("Results")
run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q180573'), ('objname', 'South Vietnam')]
[('obj', 'http://www.wikidata.org/entity/Q16644'), ('objname', 'Northern Mariana Islands')]
[('obj', 'http://www.wikidata.org/entity/Q951302'), ('objname', 'Saaremaa')]
[('obj', 'http://www.wikidata.org/entity/Q3356210'), ('objname', 'Frøya')]
[('obj', 'http://www.wikidata.org/entity/Q223'), ('objname', 'Greenland')]
[('obj', 'http://www.wikidata.org/entity/Q43048'), ('objname', 'Rhodes')]
[('obj', 'http://www.wikidata.org/entity/Q786'), ('objname', 'Dominican Republic')]
[('obj', 'http://www.wikidata.org/entity/Q42000'), ('objname', 'Tahiti')]
[('obj', 'http://www.wikidata.org/entity/Q25362'), ('objname', 'Saint Barthélemy')]
[('obj', 'http://www.wikidata.org/entity/Q170468'), ('objname', 'United Arab Republic')]
[('obj', 'http://www.wikidata.org/entity/Q17054'), ('objname', 'Martinique')]
[('obj', 'http://www.wikidata.org/entity/Q216923'), ('objname', 'Chinese Taipei')]
[('obj', 'http://www.wiki

281

#### Found the connection. Now, to get all the countries that hosted a World Cup, I try the property "country" (P17) of the FIFA World Cup

In [23]:
###WRONG QUERY

queryString = """
SELECT DISTINCT ?obj ?objname WHERE { 

wd:Q19317 wdt:P17 ?obj.
      
?obj <http://schema.org/name> ?objname .
} 
"""

print("Results")
run_query(queryString)

Results
Empty


0

#### I retrieve all the World Cups with the BGP found before and then I get the countries with property P17

In [25]:
queryString = """
SELECT DISTINCT ?obj ?objname WHERE { 

?wc wdt:P3450 wd:Q19317;
    wdt:P17 ?obj.
      
?obj <http://schema.org/name> ?objname .
} 
"""

print("Results")
run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q142'), ('objname', 'France')]
[('obj', 'http://www.wikidata.org/entity/Q145'), ('objname', 'United Kingdom')]
[('obj', 'http://www.wikidata.org/entity/Q155'), ('objname', 'Brazil')]
[('obj', 'http://www.wikidata.org/entity/Q159'), ('objname', 'Russia')]
[('obj', 'http://www.wikidata.org/entity/Q16'), ('objname', 'Canada')]
[('obj', 'http://www.wikidata.org/entity/Q17'), ('objname', 'Japan')]
[('obj', 'http://www.wikidata.org/entity/Q183'), ('objname', 'Germany')]
[('obj', 'http://www.wikidata.org/entity/Q258'), ('objname', 'South Africa')]
[('obj', 'http://www.wikidata.org/entity/Q298'), ('objname', 'Chile')]
[('obj', 'http://www.wikidata.org/entity/Q30'), ('objname', 'United States of America')]
[('obj', 'http://www.wikidata.org/entity/Q34'), ('objname', 'Sweden')]
[('obj', 'http://www.wikidata.org/entity/Q38'), ('objname', 'Italy')]
[('obj', 'http://www.wikidata.org/entity/Q39'), ('objname', 'Switzerland')]
[('obj', 'http://www.wikida

18

#### I use the first BGP to retrieve the team of Spain

In [9]:
queryString = """
SELECT DISTINCT ?sub ?subname WHERE { 

?sub wdt:P1532 wd:Q29;
     wdt:P31 wd:Q6979593.
      
?sub <http://schema.org/name> ?subname .

} 
"""

print("Results")
run_query(queryString)

Results
[('sub', 'http://www.wikidata.org/entity/Q1129595'), ('subname', 'Spain national under-21 football team')]
[('sub', 'http://www.wikidata.org/entity/Q2356290'), ('subname', 'Spain national under-20 football team')]
[('sub', 'http://www.wikidata.org/entity/Q2637337'), ('subname', 'Spain national under-23 football team')]
[('sub', 'http://www.wikidata.org/entity/Q2713491'), ('subname', 'Spain national under-19 football team')]
[('sub', 'http://www.wikidata.org/entity/Q3589784'), ('subname', 'Spain national under-18 football team')]
[('sub', 'http://www.wikidata.org/entity/Q3873503'), ('subname', 'Spain national under-16 football team')]
[('sub', 'http://www.wikidata.org/entity/Q42267'), ('subname', 'Spain national association football team')]
[('sub', 'http://www.wikidata.org/entity/Q18128271'), ('subname', "Spain women's national under-19 football team")]
[('sub', 'http://www.wikidata.org/entity/Q6124177'), ('subname', 'Spain national under-15 football team')]
[('sub', 'http://ww

11

In [10]:
queryString = """
SELECT DISTINCT ?sub ?subname WHERE { 

?sub wdt:P1532 wd:Q29;
     wdt:P31 wd:Q6979593.
      
?sub <http://schema.org/name> ?subname .

FILTER CONTAINS (?subname, "national association")
} 
"""

print("Results")
run_query(queryString)

Results
[('sub', 'http://www.wikidata.org/entity/Q42267'), ('subname', 'Spain national association football team')]


1

#### I write an ASK query to check whether a country ever hosted a World Cup (I run it with Spain as the example)

In [11]:
queryString = """
ASK WHERE { 

?wc wdt:P3450 wd:Q19317;
    wdt:P17 wd:Q29.
      
} 
"""

print("Results")
run_ask_query(queryString)

Results


{'head': {'link': []}, 'boolean': False}

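#### However, I know this is not true, as Spain hosted a World Cup in 1982... that we won :) The graph must therefore be missing the `wdt:P17` statement linking that edition to Spain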
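`run_ask_query` returns the whole JSON response, and the answer itself sits under the `boolean` key. A minimal sketch for extracting it follows; `ask_result` is a hypothetical helper, and the dictionary is the response printed above.

```python
def ask_result(json_results):
    """Return the boolean answer of an ASK query, or None if it is missing."""
    if not isinstance(json_results, dict):
        return None  # e.g. run_ask_query returned None after a failure
    return json_results.get("boolean")


# The response printed by the ASK cell above, copied verbatim.
response = {'head': {'link': []}, 'boolean': False}
print(ask_result(response))  # prints False
```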

## Task 3 - In which countries did a FIFA world cup or world cup final take place? With how many participants?

#### I print all properties of "FIFA World Cup" (wd:Q19317)

In [27]:
queryString = """
SELECT DISTINCT ?p ?pname WHERE { 

wd:Q19317 ?p ?obj.
      
?p <http://schema.org/name> ?pname.
} 
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P7441'), ('pname', 'WorldFootball.net competition ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1015'), ('pname', 'NORAF ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1151'), ('pname', "topic's main Wikimedia portal")]
[('p', 'http://www.wikidata.org/prop/direct/P1296'), ('pname', 'Gran Enciclopèdia Catalana ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1417'), ('pname', 'Encyclopædia Britannica Online ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1424'), ('pname', "topic's main template")]
[('p', 'http://www.wikidata.org/prop/direct/P1705'), ('pname', 'native label')]
[('p', 'http://www.wikidata.org/prop/direct/P18'), ('pname', 'image')]
[('p', 'http://www.wikidata.org/prop/direct/P2002'), ('pname', 'Twitter username')]
[('p', 'http://www.wikidata.org/prop/direct/P2003'), ('pname', 'Instagram username')]
[('p', 'http://www.wikidata.org/prop/direct/P2013'), ('pname', 'Facebook ID')]
[('p', 'http://www.wikidata.org/

45

#### Not useful. I search for all properties of the specific editions of the World Cup

In [29]:
queryString = """
SELECT DISTINCT ?p ?pname WHERE { 

?wc wdt:P3450 wd:Q19317;
    ?p ?obj.
      
?p <http://schema.org/name> ?pname .
} 
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P4967'), ('pname', 'prerequisite')]
[('p', 'http://www.wikidata.org/prop/direct/P1132'), ('pname', 'number of participants')]
[('p', 'http://www.wikidata.org/prop/direct/P1151'), ('pname', "topic's main Wikimedia portal")]
[('p', 'http://www.wikidata.org/prop/direct/P131'), ('pname', 'located in the administrative territorial entity')]
[('p', 'http://www.wikidata.org/prop/direct/P1346'), ('pname', 'winner')]
[('p', 'http://www.wikidata.org/prop/direct/P1350'), ('pname', 'number of matches played/races/starts')]
[('p', 'http://www.wikidata.org/prop/direct/P1424'), ('pname', "topic's main template")]
[('p', 'http://www.wikidata.org/prop/direct/P154'), ('pname', 'logo image')]
[('p', 'http://www.wikidata.org/prop/direct/P1546'), ('pname', 'motto')]
[('p', 'http://www.wikidata.org/prop/direct/P155'), ('pname', 'follows')]
[('p', 'http://www.wikidata.org/prop/direct/P1617'), ('pname', 'BBC Things ID')]
[('p', 'http://www.wikidata.org/prop/

68

#### Found "number of participants" (wdt:P1132). For each edition of the World Cup I print the country that hosted it and the number of participants

In [39]:
queryString = """
SELECT DISTINCT ?Edition (GROUP_CONCAT(DISTINCT ?ctrname; separator=", ") AS ?Country) ?Number WHERE { 

?wc wdt:P3450 wd:Q19317;
    wdt:P1132 ?Number;
    wdt:P17 ?c.
      
?wc <http://schema.org/name> ?Edition.
?c <http://schema.org/name> ?ctrname.
} 
GROUP BY ?Edition ?Number
"""

print("Results")
run_query(queryString)

Results
[('Edition', '2010 FIFA World Cup'), ('Country', 'South Africa'), ('Number', '32')]
[('Edition', '2014 FIFA World Cup'), ('Country', 'Brazil'), ('Number', '32')]
[('Edition', '1950 FIFA World Cup'), ('Country', 'Brazil'), ('Number', '13')]
[('Edition', '2006 FIFA World Cup'), ('Country', 'Germany'), ('Number', '32')]
[('Edition', '1954 FIFA World Cup'), ('Country', 'Switzerland'), ('Number', '16')]
[('Edition', '1930 FIFA World Cup'), ('Country', 'Uruguay'), ('Number', '13')]
[('Edition', '1978 FIFA World Cup'), ('Country', 'Argentina'), ('Number', '16')]
[('Edition', '1966 FIFA World Cup'), ('Country', 'United Kingdom'), ('Number', '16')]
[('Edition', '1994 FIFA World Cup'), ('Country', 'United States of America'), ('Number', '24')]
[('Edition', '1998 FIFA World Cup'), ('Country', 'France'), ('Number', '32')]
[('Edition', '1958 FIFA World Cup'), ('Country', 'Sweden'), ('Number', '16')]
[('Edition', '1938 FIFA World Cup'), ('Country', 'France'), ('Number', '15')]
[('Edition', '

23
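The query above uses `GROUP_CONCAT(DISTINCT ...)` so that an edition with more than one host country still yields a single row. The Python sketch below mimics that grouping on a few rows. The rows are illustrative and are not endpoint output (the 2002 edition was co-hosted by Japan and South Korea), and `group_concat` is a hypothetical helper. Note that SPARQL does not guarantee the order of the concatenated values, whereas this sketch keeps first-seen order.

```python
from collections import defaultdict


def group_concat(rows, separator=", "):
    """Group (edition, country, number) rows like GROUP_CONCAT(DISTINCT ...)."""
    grouped = defaultdict(list)
    for edition, country, number in rows:
        countries = grouped[(edition, number)]
        if country not in countries:  # emulate DISTINCT
            countries.append(country)
    return {key: separator.join(names) for key, names in grouped.items()}


# Illustrative rows (assumed, not copied from the endpoint).
rows = [
    ("2010 FIFA World Cup", "South Africa", "32"),
    ("2002 FIFA World Cup", "Japan", "32"),
    ("2002 FIFA World Cup", "South Korea", "32"),
    ("2002 FIFA World Cup", "Japan", "32"),  # duplicate, dropped by DISTINCT
]

result = group_concat(rows)
for (edition, number), countries in result.items():
    print(edition, "|", countries, "|", number)
```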

## Task 4.1 - How many times each country participated in a world cup?

#### I reuse the BGP found for task 1 to count the participation

In [42]:
queryString = """
SELECT DISTINCT ?teamname (COUNT (?sub) as ?participation) WHERE { 

?team wdt:P31 wd:Q6979593.
?sub wdt:P1923 ?team;
     wdt:P3450 wd:Q19317.
      
?team <http://schema.org/name> ?teamname.
}
GROUP BY (?teamname)
ORDER BY DESC (?participation)
"""

print("Results")
run_query(queryString)

Results
[('teamname', 'Brazil national football team'), ('participation', '20')]
[('teamname', 'Germany national association football team'), ('participation', '18')]
[('teamname', 'Italy national association football team'), ('participation', '18')]
[('teamname', 'Argentina national football team'), ('participation', '16')]
[('teamname', 'Mexico national football team'), ('participation', '15')]
[('teamname', 'Spain national association football team'), ('participation', '14')]
[('teamname', 'England national association football team'), ('participation', '14')]
[('teamname', 'France national association football team'), ('participation', '14')]
[('teamname', 'Uruguay national football team'), ('participation', '12')]
[('teamname', 'Belgium national football team'), ('participation', '12')]
[('teamname', 'Sweden national association football team'), ('participation', '11')]
[('teamname', "United States men's national soccer team"), ('participation', '10')]
[('teamname', 'Swiss nationa

81

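## Task 4.2 - Which country participated in the most finals? Which country won the most finals?

#### I search for the final event of each World Cup. I try the property "final event" (wdt:P3967)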

In [44]:
queryString = """
SELECT DISTINCT ?obj ?objname WHERE { 

?wc wdt:P3450 wd:Q19317;
    wdt:P3967 ?obj.
      
?obj <http://schema.org/name> ?objname .
} 
"""

print("Results")
run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q208401'), ('objname', '2010 FIFA World Cup Final')]
[('obj', 'http://www.wikidata.org/entity/Q31043671'), ('objname', '2018 FIFA World Cup Final')]
[('obj', 'http://www.wikidata.org/entity/Q585295'), ('objname', '1998 FIFA World Cup Final')]
[('obj', 'http://www.wikidata.org/entity/Q715021'), ('objname', '2002 FIFA World Cup Final')]
[('obj', 'http://www.wikidata.org/entity/Q754483'), ('objname', 'Uruguay v Brazil 1950')]
[('obj', 'http://www.wikidata.org/entity/Q665946'), ('objname', '1954 FIFA World Cup Final')]
[('obj', 'http://www.wikidata.org/entity/Q610156'), ('objname', '1986 FIFA World Cup Final')]
[('obj', 'http://www.wikidata.org/entity/Q858415'), ('objname', '1978 FIFA World Cup Final')]
[('obj', 'http://www.wikidata.org/entity/Q909551'), ('objname', '1982 FIFA World Cup Final')]
[('obj', 'http://www.wikidata.org/entity/Q389104'), ('objname', '1994 FIFA World Cup Final')]
[('obj', 'http://www.wikidata.org/entity/Q15926885'), 

21

#### I found the finals. I inspect one of them (wd:Q208401) to look at its properties

In [45]:
queryString = """
SELECT DISTINCT ?p ?pname WHERE { 

wd:Q208401 ?p ?obj.
      
?p <http://schema.org/name> ?pname.
} 
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1363'), ('pname', 'points/goal scored by')]
[('p', 'http://www.wikidata.org/prop/direct/P1652'), ('pname', 'referee')]
[('p', 'http://www.wikidata.org/prop/direct/P1346'), ('pname', 'winner')]
[('p', 'http://www.wikidata.org/prop/direct/P155'), ('pname', 'follows')]
[('p', 'http://www.wikidata.org/prop/direct/P156'), ('pname', 'followed by')]
[('p', 'http://www.wikidata.org/prop/direct/P17'), ('pname', 'country')]
[('p', 'http://www.wikidata.org/prop/direct/P18'), ('pname', 'image')]
[('p', 'http://www.wikidata.org/prop/direct/P276'), ('pname', 'location')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('pname', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('pname', 'part of')]
[('p', 'http://www.wikidata.org/prop/direct/P373'), ('pname', 'Commons category')]
[('p', 'http://www.wikidata.org/prop/direct/P585'), ('pname', 'point in time')]
[('p', 'http://www.wikidata.org/prop/direct/P641'), ('pname', 'sport'

18

In [46]:
queryString = """
SELECT DISTINCT ?p ?pname WHERE { 

?sub ?p wd:Q208401.
      
?p <http://schema.org/name> ?pname.
} 
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1344'), ('pname', 'participant in')]
[('p', 'http://www.wikidata.org/prop/direct/P155'), ('pname', 'follows')]
[('p', 'http://www.wikidata.org/prop/direct/P156'), ('pname', 'followed by')]
[('p', 'http://www.wikidata.org/prop/direct/P3967'), ('pname', 'final event')]
[('p', 'http://www.wikidata.org/prop/direct/P793'), ('pname', 'significant event')]


5

#### I use "participating team" (wdt:P1923) to get all the participants of the finals. Example with 2010 final

In [47]:
queryString = """
SELECT DISTINCT ?obj ?objname WHERE { 

wd:Q208401 wdt:P1923 ?obj.
      
?obj <http://schema.org/name> ?objname.
} 
"""

print("Results")
run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q47050'), ('objname', 'Netherlands national association football team')]
[('obj', 'http://www.wikidata.org/entity/Q42267'), ('objname', 'Spain national association football team')]


2

#### Now for all finals of all World Cups

In [25]:
queryString = """
SELECT DISTINCT ?Edition ?Nation WHERE{

?wc wdt:P3450 wd:Q19317;
    wdt:P3967 ?final.
    
?final wdt:P1923 ?team.
      
?wc <http://schema.org/name> ?Edition.
?team <http://schema.org/name> ?Nation.
} 
"""

print("Results")
run_query(queryString)

Results
[('Edition', '1994 FIFA World Cup'), ('Nation', 'Italy national association football team')]
[('Edition', '1938 FIFA World Cup'), ('Nation', 'Hungary national association football team')]
[('Edition', '1970 FIFA World Cup'), ('Nation', 'Italy national association football team')]
[('Edition', '1970 FIFA World Cup'), ('Nation', 'Brazil national football team')]
[('Edition', '1930 FIFA World Cup'), ('Nation', 'Uruguay national football team')]
[('Edition', '1930 FIFA World Cup'), ('Nation', 'Argentina national football team')]
[('Edition', '2018 FIFA World Cup'), ('Nation', 'Croatia at the 2018 FIFA World Cup')]
[('Edition', '2018 FIFA World Cup'), ('Nation', 'France at the 2018 FIFA World Cup\u200e')]
[('Edition', '2010 FIFA World Cup'), ('Nation', 'Spain national association football team')]
[('Edition', '2010 FIFA World Cup'), ('Nation', 'Netherlands national association football team')]
[('Edition', '2006 FIFA World Cup'), ('Nation', 'France national association football team

34

In [4]:
queryString = """
SELECT DISTINCT ?Nation WHERE{

wd:Q389104 wdt:P1923 ?team.

?team <http://schema.org/name> ?Nation.
} 
"""

print("Results")
run_query(queryString)

Results
[('Nation', 'Italy national association football team')]


1

#### Some data is missing: for this final I get only 1 team (instead of 2)
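A quick way to spot such gaps is to count the teams per edition and flag every final with fewer than two. The sketch below does this on a few rows copied from the output of the finals query above; `incomplete_finals` is a hypothetical helper.

```python
from collections import defaultdict


def incomplete_finals(rows, expected=2):
    """Return the editions whose final lists fewer than `expected` teams."""
    teams_per_edition = defaultdict(set)
    for edition, team in rows:
        teams_per_edition[edition].add(team)
    return sorted(
        edition
        for edition, teams in teams_per_edition.items()
        if len(teams) < expected
    )


# A subset of the (Edition, Nation) rows printed above.
rows = [
    ("1994 FIFA World Cup", "Italy national association football team"),
    ("1970 FIFA World Cup", "Italy national association football team"),
    ("1970 FIFA World Cup", "Brazil national football team"),
    ("2010 FIFA World Cup", "Spain national association football team"),
    ("2010 FIFA World Cup", "Netherlands national association football team"),
]

print(incomplete_finals(rows))
```

Any count computed from these finals is therefore a lower bound for the teams affected.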

#### I count participations

In [6]:
queryString = """
SELECT DISTINCT ?Nation (COUNT (?final) AS ?nr_finalist) WHERE{

?wc wdt:P3450 wd:Q19317;
    wdt:P3967 ?final.
    
?final wdt:P1923 ?team.
      
?team <http://schema.org/name> ?Nation.
} 
GROUP BY ?Nation
ORDER BY DESC (?nr_finalist)
"""

print("Results")
run_query(queryString)

Results
[('Nation', 'Germany national association football team'), ('nr_finalist', '6')]
[('Nation', 'Brazil national football team'), ('nr_finalist', '5')]
[('Nation', 'Italy national association football team'), ('nr_finalist', '4')]
[('Nation', 'Argentina national football team'), ('nr_finalist', '3')]
[('Nation', 'Hungary national association football team'), ('nr_finalist', '2')]
[('Nation', 'Uruguay national football team'), ('nr_finalist', '2')]
[('Nation', 'Czechoslovakia national association football team'), ('nr_finalist', '2')]
[('Nation', 'Netherlands national association football team'), ('nr_finalist', '2')]
[('Nation', 'France national association football team'), ('nr_finalist', '2')]
[('Nation', 'France at the 2018 FIFA World Cup\u200e'), ('nr_finalist', '1')]
[('Nation', 'Spain national association football team'), ('nr_finalist', '1')]
[('Nation', 'Sweden national association football team'), ('nr_finalist', '1')]
[('Nation', 'England at the 1966 FIFA World Cup'), ('

15
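In these results some finals point to an edition-specific entity (for example "France at the 2018 FIFA World Cup") instead of the national team, so the count of one country is split across two names. The sketch below merges such rows with a name-based heuristic. This is an assumption made for illustration, not something found in the graph: `country_of` and `merge_counts` are hypothetical helpers, and the pattern only covers the two name shapes visible in the output above.

```python
import re
from collections import Counter

# Heuristic: "<Country> national ... team" or "<Country> at the <year> FIFA World Cup".
_TEAM_NAME = re.compile(r"^(.*?) (?:national\b.*|at the \d{4} FIFA World Cup)$")


def country_of(team_name):
    """Extract the country part of a team name; fall back to the cleaned name."""
    cleaned = team_name.replace("\u200e", "").strip()  # drop the left-to-right mark
    match = _TEAM_NAME.match(cleaned)
    return match.group(1) if match else cleaned


def merge_counts(rows):
    """Sum the counts of all names that map to the same country."""
    totals = Counter()
    for name, count in rows:
        totals[country_of(name)] += int(count)
    return totals


# Rows copied from the finalist counts printed above.
rows = [
    ("France national association football team", "2"),
    ("France at the 2018 FIFA World Cup\u200e", "1"),
    ("Spain national association football team", "1"),
]

merged = merge_counts(rows)
print(merged.most_common())
```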

#### For the team that won the most finals I use the property "winner" (wdt:P1346) of the finals

In [7]:
queryString = """
SELECT DISTINCT ?Nation (COUNT (?final) AS ?nr_winner) WHERE{

?wc wdt:P3450 wd:Q19317;
    wdt:P3967 ?final.
    
?final wdt:P1346 ?team.
      
?team <http://schema.org/name> ?Nation.
} 
GROUP BY ?Nation
ORDER BY DESC (?nr_winner)
"""

print("Results")
run_query(queryString)

Results
[('Nation', 'Brazil national football team'), ('nr_winner', '3')]
[('Nation', 'Italy national association football team'), ('nr_winner', '3')]
[('Nation', 'Germany national association football team'), ('nr_winner', '2')]
[('Nation', 'Uruguay national football team'), ('nr_winner', '2')]
[('Nation', 'France at the 2018 FIFA World Cup\u200e'), ('nr_winner', '1')]
[('Nation', 'Spain national association football team'), ('nr_winner', '1')]
[('Nation', 'England at the 1966 FIFA World Cup'), ('nr_winner', '1')]


7

## Task 4.3 - Who participated in the most world cups among Italy, France, Spain, and Germany?

#### Let's try to retrieve the team with CONTAINS

In [8]:
queryString = """
SELECT DISTINCT  * WHERE{

?team wdt:P31 wd:Q6979593.

?team <http://schema.org/name> ?name.

FILTER CONTAINS(?name,"Italy")
}
"""
print("Results")
run_query(queryString)

Results
[('team', 'http://www.wikidata.org/entity/Q3873531'), ('name', "Italy women's national under-19 football team")]
[('team', 'http://www.wikidata.org/entity/Q3873502'), ('name', 'Italy national under-16 football team')]
[('team', 'http://www.wikidata.org/entity/Q676899'), ('name', 'Italy national association football team')]
[('team', 'http://www.wikidata.org/entity/Q2331232'), ('name', 'Italy national under-17 football team')]
[('team', 'http://www.wikidata.org/entity/Q3632340'), ('name', 'Italy under-21 Serie B representative team')]
[('team', 'http://www.wikidata.org/entity/Q3873548'), ('name', 'Italy national under-20 football team')]
[('team', 'http://www.wikidata.org/entity/Q922698'), ('name', 'Italy national under-21 football team')]
[('team', 'http://www.wikidata.org/entity/Q3873513'), ('name', 'Italy national under-18 football team')]
[('team', 'http://www.wikidata.org/entity/Q2266580'), ('name', 'Italy national under-19 football team')]
[('team', 'http://www.wikidata.or

12

#### I inspect the team that I'm interested in (wd:Q676899)

In [None]:
queryString = """
SELECT DISTINCT ?p ?pname WHERE{

wd:Q676899 ?p ?obj.

?p <http://schema.org/name> ?pname.

}
"""
print("Results")
run_query(queryString)

#### Search for "Instance of"

In [10]:
queryString = """
SELECT DISTINCT ?obj ?objname WHERE{

wd:Q676899 wdt:P31 ?obj.

?obj <http://schema.org/name> ?objname.

}
"""
print("Results")
run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q6979593'), ('objname', 'national association football team')]


1

#### It was a loop. I reuse the work done in Task 1 to get the teams that participated in the World Cups

In [24]:
queryString = """
SELECT DISTINCT ?name WHERE { 

?team wdt:P31 wd:Q6979593.
?sub wdt:P1923 ?team;
     wdt:P3450 wd:Q19317.
      
?team <http://schema.org/name> ?name.

FILTER REGEX (?name, "(Italy|Germany|France|Spain)")
} 
"""

print("Results")
run_query(queryString)

Results
[('name', 'Italy national association football team')]
[('name', 'France national association football team')]
[('name', 'Spain national association football team')]
[('name', 'Germany national association football team')]
[('name', 'East Germany national football team')]


5

#### Found! Now I count the participations

In [28]:
queryString = """
SELECT DISTINCT ?name (COUNT(?sub) AS ?participation) WHERE { 

?team wdt:P31 wd:Q6979593.
?sub wdt:P1923 ?team;
     wdt:P3450 wd:Q19317.
      
?team <http://schema.org/name> ?name.

FILTER REGEX (?name, "(Italy|Germany|France|Spain)")
} 
GROUP BY ?name
ORDER BY DESC (?participation)
"""

print("Results")
run_query(queryString)

Results
[('name', 'Germany national association football team'), ('participation', '18')]
[('name', 'Italy national association football team'), ('participation', '18')]
[('name', 'Spain national association football team'), ('participation', '14')]
[('name', 'France national association football team'), ('participation', '14')]
[('name', 'East Germany national football team'), ('participation', '1')]


5
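The unanchored pattern `(Italy|Germany|France|Spain)` also matches "East Germany national football team", which is why a fifth row appears. The sketch below reproduces the effect with Python's `re` module on the five names returned above and shows that anchoring the pattern at the start of the name removes the extra match. SPARQL `REGEX` follows XPath regular-expression syntax, where `^` is also a start anchor, but the anchored variant has not been run against the endpoint here.

```python
import re

# The five team names returned by the query above.
names = [
    "Germany national association football team",
    "Italy national association football team",
    "Spain national association football team",
    "France national association football team",
    "East Germany national football team",
]

loose = re.compile(r"(Italy|Germany|France|Spain)")
anchored = re.compile(r"^(Italy|Germany|France|Spain) ")

loose_matches = [name for name in names if loose.search(name)]
anchored_matches = [name for name in names if anchored.search(name)]

print(len(loose_matches), len(anchored_matches))
print(anchored_matches)
```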