# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [WikiData](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query is related to the information obtained previously.

An exploratory workload starts with a usually vague, open-ended question, and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook
3. Do not delete useless queries. Keep everything that is syntactically valid

```
?p <http://schema.org/name> ?name .
```
    
is the BGP that returns the human-readable name of a property or a class in Wikidata.
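
As a minimal sketch of how this BGP is typically combined with a discovery query, the hypothetical helper below (`with_name` is not part of the notebook setup) appends the name pattern for a given variable to the body of a WHERE clause. It only builds the query string; nothing is sent to the endpoint.

```python
SCHEMA_NAME = "<http://schema.org/name>"


def with_name(where_body, var, name_var):
    """Append the name BGP for ?var to a WHERE body (illustrative helper)."""
    return f"{where_body.rstrip()}\n?{var} {SCHEMA_NAME} ?{name_var} ."


# Example: list the properties of the 2010 FIFA World Cup with their names.
body = with_name("wd:Q176883 ?p ?o .", "p", "pname")
query_string = "SELECT DISTINCT ?p ?pname\nWHERE {\n" + body + "\n}"
print(query_string)
```

The resulting string could then be passed to a function such as `run_query`, which prepends the prefixes.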
    
    

In [1]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-d0ebe8389f-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try:
        results = sparql.query()
        json_results = results.convert()
        if len(json_results['results']['bindings']) == 0:
            print("Empty")
            return 0

        # print one list of (variable, value) pairs per result row
        for bindings in json_results['results']['bindings']:
            print([(var, value['value']) for var, value in bindings.items()])

        return len(json_results['results']['bindings'])

    except Exception as e:
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
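

`run_query` only prints the bindings. As a sketch, assuming the standard SPARQL JSON results layout that the function above already reads (`results`, then `bindings`), a hypothetical helper `bindings_to_rows` could flatten the same structure into plain dictionaries for further processing in Python. The sample dictionary is hand-written for illustration and is not fetched from the endpoint.

```python
def bindings_to_rows(json_results):
    """Flatten SPARQL JSON bindings into a list of {variable: value} dicts."""
    rows = []
    for binding in json_results["results"]["bindings"]:
        rows.append({var: cell["value"] for var, cell in binding.items()})
    return rows


# Hand-written sample in the SPARQL JSON results layout (illustrative only).
sample = {
    "head": {"vars": ["i", "instanceOf"]},
    "results": {
        "bindings": [
            {
                "i": {"type": "uri", "value": "http://www.wikidata.org/entity/Q27020041"},
                "instanceOf": {"type": "literal", "value": "sports season"},
            }
        ]
    },
}

rows = bindings_to_rows(sample)
print(rows)
```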


# Sport Workflow Series ("FIFA World Cup events explorative search") 

Consider the following exploratory information need:

> Explore the participants and winners of different editions of the FIFA World Cup

## Useful URIs for the current workflow


The following are given:

| IRI           | Description    | Role      |
| -----------   | -----------    |-----------|
| `wdt:P1647`   | subproperty    | predicate |
| `wdt:P31`     | instance of    | predicate |
| `wdt:P106`    | profession     | predicate | 
| `wdt:P279`    | subclass       | predicate |
| `wdt:P27`     | nationality    | predicate |
| `wdt:P1532`   | country of sport    | predicate |
| `wd:Q29`      | Spain           | node |
| `wd:Q42267`   | Spain national football team | node |
| `wd:Q19317`   | FIFA World Cup  | node |
| `wd:Q176883`  | 2010 FIFA World Cup  | node |
| `wd:Q208401`  | 2010 FIFA World Cup Final | node      |
| `wd:Q6979593`  | national football team | node      |


Also consider

```
?a wdt:P31 wd:Q6979593
```

is the BGP to retrieve all the **national football teams**.


## Workload Goals

1. Identify the BGP for connecting national football teams with the participations in the finals or in the world cups

2. Identify the BGP that connects a country with a national football team or, as a location, with a world cup

3. In which countries did a FIFA World Cup or a World Cup final take place? With how many participants?

4. Compare number of participants and winners of different world cups
 
   4.1 How many times each country participated in a world cup?
   
   4.2 Which country participated in the most finals? Which country won the most finals?
   
   4.3 Who participated in the most world cups among Italy, France, Spain, and Germany?


In [1]:
# start your workflow here

In [2]:
queryString = """
SELECT COUNT(*)
WHERE { 

?a wdt:P31 wd:Q6979593

} 
"""

print("Results")
run_query(queryString)

Results
[('callret-0', '1220')]


1

## World Cup and World Cup Finals (Football National Teams)
This section contains the queries used to answer the first question: _Identify the BGP for connecting national football teams with the participations in the finals or in the world cups_

### BGP for obtaining all the editions of the FIFA World Cup
To answer the questions I first need the editions of the FIFA World Cup. I check whether the 2010 FIFA World Cup (wd:Q176883) is an instance of the FIFA World Cup (wd:Q19317).

In [3]:
queryString = """
SELECT ?i ?instanceOf
WHERE { 

wd:Q176883 wdt:P31 ?i.

?i <http://schema.org/name> ?instanceOf.

} 
"""

print("Results")
run_query(queryString)

Results
[('i', 'http://www.wikidata.org/entity/Q27020041'), ('instanceOf', 'sports season')]


1

Since my assumption was wrong, I look for a connection between the two nodes in both directions.

In [4]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

wd:Q176883 ?p wd:Q19317.

?p <http://schema.org/name> ?pname.

} 
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P3450'), ('pname', 'sports season of league or competition')]


1

I discovered the property "sports season of league or competition" (wdt:P3450). I now check for the other direction.

In [5]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

wd:Q19317 ?p wd:Q176883.

?p <http://schema.org/name> ?pname.

} 
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('pname', 'has part')]


1

I found the property "has part" (wdt:P527). I first check whether each of the two properties returns the information I am looking for; if both do, I then check whether they return the same results.

I start by checking whether the property "sports season of league or competition" (wdt:P3450) returns all editions of the FIFA World Cup.

In [6]:
queryString = """
SELECT DISTINCT ?edition
WHERE { 

?s wdt:P3450 wd:Q19317.

?s <http://schema.org/name> ?edition.

} 
ORDER BY ?edition
"""

print("Results")
run_query(queryString)

Results
[('edition', '1930 FIFA World Cup')]
[('edition', '1934 FIFA World Cup')]
[('edition', '1938 FIFA World Cup')]
[('edition', '1942 FIFA World Cup')]
[('edition', '1946 FIFA World Cup')]
[('edition', '1950 FIFA World Cup')]
[('edition', '1954 FIFA World Cup')]
[('edition', '1958 FIFA World Cup')]
[('edition', '1962 FIFA World Cup')]
[('edition', '1966 FIFA World Cup')]
[('edition', '1970 FIFA World Cup')]
[('edition', '1974 FIFA World Cup')]
[('edition', '1978 FIFA World Cup')]
[('edition', '1982 FIFA World Cup')]
[('edition', '1986 FIFA World Cup')]
[('edition', '1990 FIFA World Cup')]
[('edition', '1994 FIFA World Cup')]
[('edition', '1998 FIFA World Cup')]
[('edition', '2002 FIFA World Cup')]
[('edition', '2006 FIFA World Cup')]
[('edition', '2010 FIFA World Cup')]
[('edition', '2014 FIFA World Cup')]
[('edition', '2018 FIFA World Cup')]
[('edition', '2022 FIFA World Cup')]
[('edition', '2026 FIFA World Cup')]
[('edition', '2030 FIFA World Cup')]
[('edition', '2034 FIFA World 

27

I now do the same thing for property "has part" (wdt:P527).

In [7]:
queryString = """
SELECT DISTINCT ?edition
WHERE { 

wd:Q19317 wdt:P527 ?o.

?o <http://schema.org/name> ?edition.

} 
ORDER BY ?edition
"""

print("Results")
run_query(queryString)

Results
[('edition', '1930 FIFA World Cup')]
[('edition', '1934 FIFA World Cup')]
[('edition', '1938 FIFA World Cup')]
[('edition', '1942 FIFA World Cup')]
[('edition', '1946 FIFA World Cup')]
[('edition', '1950 FIFA World Cup')]
[('edition', '1954 FIFA World Cup')]
[('edition', '1958 FIFA World Cup')]
[('edition', '1962 FIFA World Cup')]
[('edition', '1966 FIFA World Cup')]
[('edition', '1970 FIFA World Cup')]
[('edition', '1974 FIFA World Cup')]
[('edition', '1978 FIFA World Cup')]
[('edition', '1982 FIFA World Cup')]
[('edition', '1986 FIFA World Cup')]
[('edition', '1990 FIFA World Cup')]
[('edition', '1994 FIFA World Cup')]
[('edition', '1998 FIFA World Cup')]
[('edition', '2002 FIFA World Cup')]
[('edition', '2006 FIFA World Cup')]
[('edition', '2010 FIFA World Cup')]
[('edition', '2014 FIFA World Cup')]
[('edition', '2018 FIFA World Cup')]
[('edition', '2022 FIFA World Cup')]
[('edition', '2026 FIFA World Cup')]
[('edition', '2030 FIFA World Cup')]
[('edition', '2034 FIFA World 

27

The two properties are inverses of each other: they return the same set of editions, so I can use either of them to retrieve all the FIFA World Cup editions.
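
A minimal sketch of that comparison in plain Python, assuming the edition names returned by the two queries have been collected into lists. The notebook's `run_query` only prints them, so the short lists below are hand-copied excerpts of the outputs above, not the full results.

```python
def same_results(values_a, values_b):
    """Return True when both queries produced the same set of values."""
    return set(values_a) == set(values_b)


# Excerpts copied from the two outputs above (first three editions only).
via_p3450 = ["1930 FIFA World Cup", "1934 FIFA World Cup", "1938 FIFA World Cup"]
via_p527 = ["1938 FIFA World Cup", "1930 FIFA World Cup", "1934 FIFA World Cup"]

print(same_results(via_p3450, via_p527))
# Values present in only one of the two lists (empty when the lists agree).
print(set(via_p3450) ^ set(via_p527))
```

Comparing sets rather than lists ignores row order and duplicates, which matches the use of `DISTINCT` in the two queries.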

I notice that future editions are included as well, so I check whether each FIFA World Cup edition has a property for the time at which it took (or will take) place, in order to filter out future editions.

In [8]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

?s wdt:P3450 wd:Q19317.

?s ?p ?o.

?p <http://schema.org/name> ?pname.

} 
ORDER BY ?pname
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1617'), ('pname', 'BBC Things ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2581'), ('pname', 'BabelNet ID')]
[('p', 'http://www.wikidata.org/prop/direct/P268'), ('pname', 'Bibliothèque nationale de France ID')]
[('p', 'http://www.wikidata.org/prop/direct/P373'), ('pname', 'Commons category')]
[('p', 'http://www.wikidata.org/prop/direct/P935'), ('pname', 'Commons gallery')]
[('p', 'http://www.wikidata.org/prop/direct/P3509'), ('pname', 'Dagens Nyheter topic ID')]
[('p', 'http://www.wikidata.org/prop/direct/P3219'), ('pname', 'Encyclopædia Universalis ID')]
[('p', 'http://www.wikidata.org/prop/direct/P7829'), ('pname', 'English Vikidia ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2013'), ('pname', 'Facebook ID')]
[('p', 'http://www.wikidata.org/prop/direct/P6262'), ('pname', 'Fandom article ID')]
[('p', 'http://www.wikidata.org/prop/direct/P646'), ('pname', 'Freebase ID')]
[('p', 'http://www.wikidata.org/prop/direct/P7818')

68

Interesting properties retrieved:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P17`   | country  | predicate |
| `wdt:P393`     | edition   | predicate |
| `wdt:P582`    | end time    | predicate | 
| `wdt:P3967`    | final event    | predicate | 
| `wdt:P276`    | location    | predicate | 
| `wdt:P1350`    | number of matches    | predicate | 
| `wdt:P1351`    | number of points/goals/set scored    | predicate | 
| `wdt:P1132`    | number of participants   | predicate | 
| `wdt:P710`    | participant    | predicate | 
| `wdt:P1923`    | participating team   | predicate | 
| `wdt:P580`    | start time    | predicate | 
| `wdt:P1346`    | winner    | predicate | 

As far as temporal information is concerned, I can filter out future editions by checking that the start time is earlier than NOW(). Before doing that, I check whether future editions have this property.

In [9]:
queryString = """
SELECT ?s ?sname
WHERE { 

?s wdt:P3450 wd:Q19317;
    <http://schema.org/name> ?sname.

FILTER NOT EXISTS {?s wdt:P580 ?o.}
} 
ORDER BY ?sname
"""

print("Results")
run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q203851'), ('sname', '1942 FIFA World Cup')]
[('s', 'http://www.wikidata.org/entity/Q65273310'), ('sname', '1946 FIFA World Cup')]


2

These two editions never actually took place, so I can filter them out as well. Since future editions do have this property, I can return all past editions of the FIFA World Cup with the following query.

In [10]:
queryString = """
SELECT ?s ?sname
WHERE { 

?s wdt:P3450 wd:Q19317;
    wdt:P580 ?start;
    <http://schema.org/name> ?sname.
    
FILTER(?start < NOW()).

}
ORDER BY ?sname
"""

print("Results")
run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q48432'), ('sname', '1930 FIFA World Cup')]
[('s', 'http://www.wikidata.org/entity/Q1477177'), ('sname', '1934 FIFA World Cup')]
[('s', 'http://www.wikidata.org/entity/Q131591'), ('sname', '1938 FIFA World Cup')]
[('s', 'http://www.wikidata.org/entity/Q132515'), ('sname', '1950 FIFA World Cup')]
[('s', 'http://www.wikidata.org/entity/Q131075'), ('sname', '1954 FIFA World Cup')]
[('s', 'http://www.wikidata.org/entity/Q132533'), ('sname', '1958 FIFA World Cup')]
[('s', 'http://www.wikidata.org/entity/Q160813'), ('sname', '1962 FIFA World Cup')]
[('s', 'http://www.wikidata.org/entity/Q134202'), ('sname', '1966 FIFA World Cup')]
[('s', 'http://www.wikidata.org/entity/Q132664'), ('sname', '1970 FIFA World Cup')]
[('s', 'http://www.wikidata.org/entity/Q166121'), ('sname', '1974 FIFA World Cup')]
[('s', 'http://www.wikidata.org/entity/Q46943'), ('sname', '1978 FIFA World Cup')]
[('s', 'http://www.wikidata.org/entity/Q46934'), ('sname', '1982 FIFA

21
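
The effect of the two conditions in the query above (a required start time and `?start < NOW()`) can be sketched in plain Python. The records are illustrative: the future date is a placeholder written for this example (the notebook never prints `wdt:P580` values), and `None` stands for an edition without a start time. A fixed reference date replaces `NOW()` so that the example is reproducible.

```python
from datetime import date


def past_editions(editions, today):
    """Keep editions that have a start date earlier than `today`.

    Editions without a start date are dropped, mirroring the fact that a
    required triple pattern on wdt:P580 excludes them.
    """
    return [
        name
        for name, start in editions
        if start is not None and start < today
    ]


# Illustrative records: (edition name, start date or None).
editions = [
    ("1930 FIFA World Cup", date(1930, 7, 13)),
    ("1942 FIFA World Cup", None),              # no start time in the KG
    ("2034 FIFA World Cup", date(2034, 1, 1)),  # placeholder future date
]

print(past_editions(editions, today=date(2022, 1, 1)))
```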

Since I retrieved some interesting properties, I run a query that orders the editions by the average number of goals scored per match during the tournament. For this I use the properties "number of matches" (wdt:P1350) and "number of points/goals/set scored" (wdt:P1351) discovered above.

In [11]:
queryString = """
SELECT ?sname ?matches ?goals (?goals/?matches) AS ?score
WHERE { 

?s wdt:P3450 wd:Q19317;
    wdt:P580 ?start;
    wdt:P1351 ?goals;
    wdt:P1350 ?matches;
    <http://schema.org/name> ?sname.
    
FILTER(?start < NOW()).

}
ORDER BY DESC(?score)
"""

print("Results")
run_query(queryString)

Results
[('sname', '1954 FIFA World Cup'), ('matches', '26'), ('goals', '140'), ('score', '5.384615384615385')]
[('sname', '1938 FIFA World Cup'), ('matches', '18'), ('goals', '84'), ('score', '4.666666666666667')]
[('sname', '1934 FIFA World Cup'), ('matches', '17'), ('goals', '70'), ('score', '4.117647058823529')]
[('sname', '1950 FIFA World Cup'), ('matches', '22'), ('goals', '88'), ('score', '4')]
[('sname', '1930 FIFA World Cup'), ('matches', '18'), ('goals', '70'), ('score', '3.888888888888889')]
[('sname', '1958 FIFA World Cup'), ('matches', '35'), ('goals', '126'), ('score', '3.6')]
[('sname', '1970 FIFA World Cup'), ('matches', '32'), ('goals', '95'), ('score', '2.96875')]
[('sname', '1982 FIFA World Cup'), ('matches', '52'), ('goals', '146'), ('score', '2.807692307692308')]
[('sname', '1966 FIFA World Cup'), ('matches', '32'), ('goals', '89'), ('score', '2.78125')]
[('sname', '1962 FIFA World Cup'), ('matches', '32'), ('goals', '89'), ('score', '2.78125')]
[('sname', '1994 FI

21

I discovered that 140 goals were scored in the 1954 edition in only 26 matches. The 1994 edition has a similar number of goals (141) but 52 matches, twice as many as in 1954. This means that teams scored about half as many goals per match in 1994 as in 1954.
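
The comparison can be verified with the figures quoted above. The 1994 values (141 goals in 52 matches) are taken from the text, because the corresponding output row is truncated.

```python
def goals_per_match(goals, matches):
    """Average number of goals scored per match."""
    return goals / matches


avg_1954 = goals_per_match(140, 26)
avg_1994 = goals_per_match(141, 52)

print(round(avg_1954, 2), round(avg_1994, 2))
# Ratio between the two averages: close to 0.5, i.e. about half.
print(round(avg_1994 / avg_1954, 2))
```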

### Connection between National Football Teams and FIFA World Cups

Firstly, I count how many properties there are with a national football team as subject.

In [14]:
queryString = """
SELECT COUNT(DISTINCT ?p) AS ?subjProps
WHERE { 

?ft wdt:P31 wd:Q6979593;
    ?p ?o.

} 
"""

print("Results")
run_query(queryString)

Results
[('subjProps', '126')]


1

Now I count how many properties have a national football team as object.

In [15]:
queryString = """
SELECT COUNT(DISTINCT ?p) AS ?objProps
WHERE { 

?ft wdt:P31 wd:Q6979593.
?s ?p ?ft.

} 
"""

print("Results")
run_query(queryString)

Results
[('objProps', '45')]


1

Since there are not many, I can return all the properties that have a national football team as object. As in the other notebooks, I order the results by the number of football teams appearing with each property.

In [17]:
queryString = """
SELECT ?p ?pname COUNT(?ft) AS ?teams
WHERE { 

?ft wdt:P31 wd:Q6979593.
?s ?p ?ft.

?p <http://schema.org/name> ?pname.

} 
GROUP BY ?p ?pname
ORDER BY DESC(?teams)
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P54'), ('pname', 'member of sports team'), ('teams', '94684')]
[('p', 'http://www.wikidata.org/prop/direct/P1923'), ('pname', 'participating team'), ('teams', '2658')]
[('p', 'http://www.wikidata.org/prop/direct/P1346'), ('pname', 'winner'), ('teams', '842')]
[('p', 'http://www.wikidata.org/prop/direct/P1830'), ('pname', 'owner of'), ('teams', '609')]
[('p', 'http://www.wikidata.org/prop/direct/P6087'), ('pname', 'coach of sports team'), ('teams', '350')]
[('p', 'http://www.wikidata.org/prop/direct/P5138'), ('pname', 'season of club or team'), ('teams', '323')]
[('p', 'http://www.wikidata.org/prop/direct/P301'), ('pname', "category's main topic"), ('teams', '293')]
[('p', 'http://www.wikidata.org/prop/direct/P466'), ('pname', 'occupant'), ('teams', '270')]
[('p', 'http://www.wikidata.org/prop/direct/P710'), ('pname', 'participant'), ('teams', '229')]
[('p', 'http://www.wikidata.org/prop/direct/P1269'), ('pname', 'facet of'), ('teams',

43

I recall that I am looking for the connection between national football teams and their participation in the finals or the world cups, so a useful property could be "participating team" (wdt:P1923). I therefore return the types of the subjects connected to the football teams by this property.

In [18]:
queryString = """
SELECT ?i ?instanceOf COUNT(DISTINCT ?s) AS ?subjs
WHERE { 

?ft wdt:P31 wd:Q6979593.
?s wdt:P1923 ?ft;
    wdt:P31 ?i.

?i <http://schema.org/name> ?instanceOf.

} 
GROUP BY ?i ?instanceOf
ORDER BY DESC(?subjs)
"""

print("Results")
run_query(queryString)

Results
[('i', 'http://www.wikidata.org/entity/Q16466010'), ('instanceOf', 'association football match'), ('subjs', '431')]
[('i', 'http://www.wikidata.org/entity/Q17315159'), ('instanceOf', 'international association football match'), ('subjs', '295')]
[('i', 'http://www.wikidata.org/entity/Q27020041'), ('instanceOf', 'sports season'), ('subjs', '116')]
[('i', 'http://www.wikidata.org/entity/Q65770283'), ('instanceOf', 'association football final'), ('subjs', '97')]
[('i', 'http://www.wikidata.org/entity/Q1366722'), ('instanceOf', 'final'), ('subjs', '51')]
[('i', 'http://www.wikidata.org/entity/Q2122052'), ('instanceOf', 'qualification'), ('subjs', '24')]
[('i', 'http://www.wikidata.org/entity/Q12708896'), ('instanceOf', 'FIFA World Cup final'), ('subjs', '18')]
[('i', 'http://www.wikidata.org/entity/Q107540719'), ('instanceOf', 'edition of the UEFA European Championship'), ('subjs', '16')]
[('i', 'http://www.wikidata.org/entity/Q59658968'), ('instanceOf', 'UEFA Euro Final'), ('subjs

43

I now return some triples in order to better understand the property.

In [19]:
queryString = """
SELECT ?s ?sname ?ft ?ftname
WHERE { 

?ft wdt:P31 wd:Q6979593.
?s wdt:P1923 ?ft.

?s <http://schema.org/name> ?sname.
?ft <http://schema.org/name> ?ftname.

} 
LIMIT 25
"""

print("Results")
run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q3512678'), ('sname', 'Muratti'), ('ft', 'http://www.wikidata.org/entity/Q1478243'), ('ftname', 'Guernsey official association football team')]
[('s', 'http://www.wikidata.org/entity/Q29572139'), ('sname', '2019 UEFA European Under-21 Championship qualification Group 7'), ('ft', 'http://www.wikidata.org/entity/Q2660379'), ('ftname', 'North Macedonia national under-21 football team')]
[('s', 'http://www.wikidata.org/entity/Q29572050'), ('sname', '2019 UEFA European Under-21 Championship qualification Group 4'), ('ft', 'http://www.wikidata.org/entity/Q2630537'), ('ftname', 'Latvia national under-21 football team')]
[('s', 'http://www.wikidata.org/entity/Q29572211'), ('sname', '2019 UEFA European Under-21 Championship qualification Group 9'), ('ft', 'http://www.wikidata.org/entity/Q651714'), ('ftname', 'Bulgaria national under-21 football team')]
[('s', 'http://www.wikidata.org/entity/Q62338382'), ('sname', 'Cyprus v San Marino'), ('ft', 'htt

25

Since I know wd:Q176883 is the node for the 2010 FIFA World Cup I try to see if such property connects some national football teams to the World Cup.

In [20]:
queryString = """
SELECT ?ft ?ftname
WHERE { 

?ft wdt:P31 wd:Q6979593.
wd:Q176883 wdt:P1923 ?ft.


?ft <http://schema.org/name> ?ftname.

} 
"""

print("Results")
run_query(queryString)

Results
[('ft', 'http://www.wikidata.org/entity/Q676899'), ('ftname', 'Italy national association football team')]
[('ft', 'http://www.wikidata.org/entity/Q268208'), ('ftname', 'Australia national association football team')]
[('ft', 'http://www.wikidata.org/entity/Q134916'), ('ftname', 'Uruguay national football team')]
[('ft', 'http://www.wikidata.org/entity/Q170327'), ('ftname', 'South Africa national football team')]
[('ft', 'http://www.wikidata.org/entity/Q14132'), ('ftname', 'North Korea national football team')]
[('ft', 'http://www.wikidata.org/entity/Q174512'), ('ftname', 'Slovakia national association football team')]
[('ft', 'http://www.wikidata.org/entity/Q175145'), ('ftname', 'Ivory Coast national football team')]
[('ft', 'http://www.wikidata.org/entity/Q180187'), ('ftname', 'Honduras national football team')]
[('ft', 'http://www.wikidata.org/entity/Q172221'), ('ftname', 'Slovenia national football team')]
[('ft', 'http://www.wikidata.org/entity/Q47050'), ('ftname', 'Nether

32

From this result I conclude that the property "participating team" (wdt:P1923) connects the national football teams with their participation in the World Cups. Before moving on, I check whether other properties connect a national football team to the FIFA World Cup. To do that I again use the node wd:Q176883 ("2010 FIFA World Cup").

In [21]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

?ft wdt:P31 wd:Q6979593.
wd:Q176883 ?p ?ft.


?p <http://schema.org/name> ?pname.

} 
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1346'), ('pname', 'winner')]
[('p', 'http://www.wikidata.org/prop/direct/P1923'), ('pname', 'participating team')]


2

I see another property, "winner" (wdt:P1346), that could be useful for later questions. Now I check whether other properties connect the two entities in the other direction.

In [22]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

?ft wdt:P31 wd:Q6979593.
?ft ?p wd:Q176883.


?p <http://schema.org/name> ?pname.

} 
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1344'), ('pname', 'participant in')]
[('p', 'http://www.wikidata.org/prop/direct/P2522'), ('pname', 'victory')]


2

Looking in the other direction I found the properties "participant in" (wdt:P1344) and "victory" (wdt:P2522), which look like the inverses of "participating team" (wdt:P1923) and "winner" (wdt:P1346). I verify this assumption by checking whether the properties return different results.
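
To make the notion of inverse properties concrete, the sketch below normalises both directions to `(world cup, team)` pairs and reports the pairs found through only one property. The pairs are toy data with made-up labels; they are not query results.

```python
def inverse_mismatches(forward_pairs, backward_pairs):
    """Compare a property with its candidate inverse.

    forward_pairs: (subject, object) pairs, e.g. world cup -> team (wdt:P1923).
    backward_pairs: (subject, object) pairs, e.g. team -> world cup (wdt:P1344).
    Returns the (world cup, team) pairs seen only forward and only backward.
    """
    forward = set(forward_pairs)
    backward = {(obj, subj) for subj, obj in backward_pairs}
    return forward - backward, backward - forward


# Toy data, not taken from the endpoint.
p1923 = [("WC-A", "Team X"), ("WC-A", "Team Y")]
p1344 = [("Team X", "WC-A")]

only_forward, only_backward = inverse_mismatches(p1923, p1344)
print(only_forward)   # pairs missing from the backward property
print(only_backward)  # pairs missing from the forward property
```

Two properties behave as exact inverses on the data only when both returned sets are empty.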

The first check compares the number of national football teams returned by each property.

In [23]:
queryString = """
SELECT ?edition COUNT(DISTINCT ?t1) AS ?teams1923 COUNT(DISTINCT ?t2) AS ?teams1344
WHERE{

    ?wc wdt:P3450 wd:Q19317;
        wdt:P580 ?start;
        <http://schema.org/name> ?edition.

    ?t1 wdt:P31 wd:Q6979593.   
    ?t2 wdt:P31 wd:Q6979593.
    
    ?wc wdt:P1923 ?t1.
    ?t2 wdt:P1344 ?wc.
    
    FILTER(?start < NOW()).

}
GROUP BY ?edition
ORDER BY ?edition
"""

print("Results")
run_query(queryString)

Results
[('edition', '1930 FIFA World Cup'), ('teams1923', '13'), ('teams1344', '13')]
[('edition', '1934 FIFA World Cup'), ('teams1923', '16'), ('teams1344', '16')]
[('edition', '1938 FIFA World Cup'), ('teams1923', '15'), ('teams1344', '15')]
[('edition', '1950 FIFA World Cup'), ('teams1923', '13'), ('teams1344', '12')]
[('edition', '1954 FIFA World Cup'), ('teams1923', '16'), ('teams1344', '16')]
[('edition', '1958 FIFA World Cup'), ('teams1923', '16'), ('teams1344', '16')]
[('edition', '1962 FIFA World Cup'), ('teams1923', '16'), ('teams1344', '16')]
[('edition', '1966 FIFA World Cup'), ('teams1923', '16'), ('teams1344', '16')]
[('edition', '1970 FIFA World Cup'), ('teams1923', '16'), ('teams1344', '16')]
[('edition', '1974 FIFA World Cup'), ('teams1923', '16'), ('teams1344', '16')]
[('edition', '1978 FIFA World Cup'), ('teams1923', '16'), ('teams1344', '16')]
[('edition', '1982 FIFA World Cup'), ('teams1923', '24'), ('teams1344', '24')]
[('edition', '1986 FIFA World Cup'), ('teams

20

I discovered that two editions have different results for the two properties. To see which one holds the right information I also return the property "number of participants" (wdt:P1132), discovered in the previous section, and check the counts against it. I assume that if the number is the same for 2 out of 3 properties, then those properties hold the correct information.

In [24]:
queryString = """
SELECT ?edition ?numParticipants COUNT(DISTINCT ?t1) AS ?teams1923 COUNT(DISTINCT ?t2) AS ?teams1344
WHERE{

    ?wc wdt:P3450 wd:Q19317;
        wdt:P580 ?start;
        wdt:P1132 ?numParticipants;
        <http://schema.org/name> ?edition.

    ?t1 wdt:P31 wd:Q6979593.   
    ?t2 wdt:P31 wd:Q6979593.
    
    ?wc wdt:P1923 ?t1.
    ?t2 wdt:P1344 ?wc.
    
    FILTER(?start < NOW()).

}
GROUP BY ?edition ?numParticipants
ORDER BY ?edition
"""

print("Results")
run_query(queryString)

Results
[('edition', '1930 FIFA World Cup'), ('numParticipants', '13'), ('teams1923', '13'), ('teams1344', '13')]
[('edition', '1934 FIFA World Cup'), ('numParticipants', '16'), ('teams1923', '16'), ('teams1344', '16')]
[('edition', '1938 FIFA World Cup'), ('numParticipants', '15'), ('teams1923', '15'), ('teams1344', '15')]
[('edition', '1950 FIFA World Cup'), ('numParticipants', '13'), ('teams1923', '13'), ('teams1344', '12')]
[('edition', '1954 FIFA World Cup'), ('numParticipants', '16'), ('teams1923', '16'), ('teams1344', '16')]
[('edition', '1958 FIFA World Cup'), ('numParticipants', '16'), ('teams1923', '16'), ('teams1344', '16')]
[('edition', '1962 FIFA World Cup'), ('numParticipants', '16'), ('teams1923', '16'), ('teams1344', '16')]
[('edition', '1966 FIFA World Cup'), ('numParticipants', '16'), ('teams1923', '16'), ('teams1344', '16')]
[('edition', '1970 FIFA World Cup'), ('numParticipants', '16'), ('teams1923', '16'), ('teams1344', '16')]
[('edition', '1974 FIFA World Cup'), (

20
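
The 2-out-of-3 rule stated above can be sketched as a small function (`majority_count` is a hypothetical name, not part of the notebook). The example applies it to the 1950 row of the output above.

```python
from collections import Counter


def majority_count(counts):
    """Return (value, sources) when at least two sources agree, else None.

    counts maps a source name (here a property) to the number it reports.
    """
    value, votes = Counter(counts.values()).most_common(1)[0]
    if votes < 2:
        return None
    sources = sorted(name for name, count in counts.items() if count == value)
    return value, sources


# 1950 FIFA World Cup row from the output above.
row_1950 = {"wdt:P1132": 13, "wdt:P1923": 13, "wdt:P1344": 12}
print(majority_count(row_1950))
```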

I decide to use the property "participating team" (wdt:P1923) to connect the National Football Teams with the FIFA World Cups. 

Now that I found the connection between the FIFA World Cup and the National Football Teams I can return all the teams that participated in each edition.

In [25]:
queryString = """
SELECT ?edition GROUP_CONCAT(?team; separator="; ") AS ?particapatingTeams
WHERE{

    ?wc wdt:P3450 wd:Q19317;
        wdt:P580 ?start;
        <http://schema.org/name> ?edition.

    ?nft wdt:P31 wd:Q6979593;
         <http://schema.org/name> ?team.  
    
    ?wc wdt:P1923 ?nft.
    
    FILTER(?start < NOW()).

}
GROUP BY ?edition
ORDER BY ?edition
"""

print("Results")
run_query(queryString)

Results
[('edition', '1930 FIFA World Cup'), ('particapatingTeams', "Romania national association football team; Uruguay national football team; Peru national football team; France national association football team; Argentina national football team; Brazil national football team; Belgium national football team; Paraguay national football team; Yugoslavia national association football team; Mexico national football team; United States men's national soccer team; Bolivia national football team; Chile national football team")]
[('edition', '1934 FIFA World Cup'), ('particapatingTeams', "Romania national association football team; Italy national association football team; Austria national association football team; Netherlands national association football team; France national association football team; Spain national association football team; Argentina national football team; Brazil national football team; Belgium national football team; Germany national association football team; Hung

20

### Connection between National Football Teams and FIFA World Cup Finals

Firstly, I need to return all the FIFA World Cup finals. I recall I found the property "final event" (wdt:P3967) when exploring the FIFA World Cup therefore I check if such property connects each FIFA World Cup edition with its final.

In [26]:
queryString = """
SELECT ?edition ?fe ?final
WHERE{

    ?wc wdt:P3450 wd:Q19317;
        wdt:P580 ?start;
        wdt:P3967 ?fe;
        <http://schema.org/name> ?edition.

    ?fe <http://schema.org/name> ?final.  
    
    FILTER(?start < NOW()).

}
ORDER BY ?edition
"""

print("Results")
run_query(queryString)

Results
[('edition', '1930 FIFA World Cup'), ('fe', 'http://www.wikidata.org/entity/Q63843'), ('final', '1930 FIFA World Cup Final')]
[('edition', '1934 FIFA World Cup'), ('fe', 'http://www.wikidata.org/entity/Q945608'), ('final', '1934 FIFA World Cup Final')]
[('edition', '1938 FIFA World Cup'), ('fe', 'http://www.wikidata.org/entity/Q945575'), ('final', '1938 FIFA World Cup Final')]
[('edition', '1950 FIFA World Cup'), ('fe', 'http://www.wikidata.org/entity/Q754483'), ('final', 'Uruguay v Brazil 1950')]
[('edition', '1954 FIFA World Cup'), ('fe', 'http://www.wikidata.org/entity/Q665946'), ('final', '1954 FIFA World Cup Final')]
[('edition', '1958 FIFA World Cup'), ('fe', 'http://www.wikidata.org/entity/Q1078991'), ('final', '1958 FIFA World Cup Final')]
[('edition', '1962 FIFA World Cup'), ('fe', 'http://www.wikidata.org/entity/Q1191064'), ('final', '1962 FIFA World Cup Final')]
[('edition', '1966 FIFA World Cup'), ('fe', 'http://www.wikidata.org/entity/Q1065912'), ('final', '1966 FI

21

I conclude that the property "final event" (wdt:P3967) allows me to retrieve all FIFA World Cup finals. Now I return the finals' properties, ordered by the number of distinct objects connected by each of them.

In [27]:
queryString = """
SELECT ?p ?pname COUNT(DISTINCT ?o) AS ?objs
WHERE{

    ?wc wdt:P3450 wd:Q19317;
        wdt:P580 ?start;
        wdt:P3967 ?fe.

    ?fe ?p ?o.
    
    ?p <http://schema.org/name> ?pname.  
    
    FILTER(?start < NOW()).

}
ORDER BY DESC(?objs)
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P710'), ('pname', 'participant'), ('objs', '51')]
[('p', 'http://www.wikidata.org/prop/direct/P1652'), ('pname', 'referee'), ('objs', '22')]
[('p', 'http://www.wikidata.org/prop/direct/P585'), ('pname', 'point in time'), ('objs', '21')]
[('p', 'http://www.wikidata.org/prop/direct/P156'), ('pname', 'followed by'), ('objs', '21')]
[('p', 'http://www.wikidata.org/prop/direct/P1110'), ('pname', 'attendance'), ('objs', '21')]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('pname', 'part of'), ('objs', '21')]
[('p', 'http://www.wikidata.org/prop/direct/P155'), ('pname', 'follows'), ('objs', '20')]
[('p', 'http://www.wikidata.org/prop/direct/P646'), ('pname', 'Freebase ID'), ('objs', '20')]
[('p', 'http://www.wikidata.org/prop/direct/P276'), ('pname', 'location'), ('objs', '19')]
[('p', 'http://www.wikidata.org/prop/direct/P1923'), ('pname', 'participating team'), ('objs', '15')]
[('p', 'http://www.wikidata.org/prop/direct/P1363'), ('p

36

Interesting properties retrieved:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P710`   | participant  | predicate |
| `wdt:P1652`     | referee   | predicate |
| `wdt:P585`    | point in time   | predicate | 
| `wdt:P276`    | location   | predicate | 
| `wdt:P1346`    | winner   | predicate | 
| `wdt:P1363`    | points/goal scored by  | predicate | 

I check if the property "participant" returns the National Football Teams involved in the FIFA World Cup Final.

In [28]:
queryString = """
SELECT ?final GROUP_CONCAT(DISTINCT ?team; separator=", ") AS ?participants
WHERE{

    ?wc wdt:P3450 wd:Q19317;
        wdt:P580 ?start;
        wdt:P3967 ?fe.

    ?fe wdt:P710 ?p;
        <http://schema.org/name> ?final.
    
    ?p <http://schema.org/name> ?team.  
    
    FILTER(?start < NOW()).

}
GROUP BY ?final
ORDER BY ?final
"""

print("Results")
run_query(queryString)

Results
[('final', '1998 FIFA World Cup Final'), ('participants', 'Brazil national football team, France national association football team')]
[('final', '2002 FIFA World Cup Final'), ('participants', 'Bernd Schneider, Cafu, Carsten Jancker, Carsten Ramelow, Christian Ziege, Christoph Metzelder, Denílson de Oliveira Araújo, Dida, Dietmar Hamann, Edmílson, Edílson, Frank Baumann, Gerald Asamoah, Gilberto Silva, Hans-Jörg Butt, Jens Jeremies, Jens Lehmann, Jenílson Ângelo de Souza, José Kléberson, Juliano Belletti, Juninho Paulista, Jörg Böhme, Kaká, Lars Ricken, Luiz Bombonato Goulart, Lúcio, Marco Bode, Marcos, Marko Rehmer, Miroslav Klose, Oliver Bierhoff, Oliver Kahn, Oliver Neuville, Ricardinho, Rivaldo, Roberto Carlos, Rogério Ceni, Ronaldinho, Ronaldo, Roque Júnior, Sebastian Kehl, Thomas Linke, Torsten Frings, Vampeta, Ânderson Polga')]
[('final', '2006 FIFA World Cup Final'), ('participants', 'Daniele De Rossi, Francesco Totti')]
[('final', 'Uruguay v Brazil 1950'), ('participan

4
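
The mixed output above can be inspected with a small check on the concatenated names. This is only a sketch: `lists_teams` is a hypothetical helper (not part of the notebook setup), and it relies on the assumption that every team label ends with the word "team". The 2002 row is shortened to its first two names.

```python
# Rows copied from the output above; the long 2002 row is shortened to its first two names.
rows = [
    ("1998 FIFA World Cup Final",
     "Brazil national football team, France national association football team"),
    ("2002 FIFA World Cup Final", "Bernd Schneider, Cafu"),
    ("2006 FIFA World Cup Final", "Daniele De Rossi, Francesco Totti"),
]


def lists_teams(participants: str) -> bool:
    """Heuristic (an assumption): every team label ends with the word 'team'."""
    return all(name.endswith("team") for name in participants.split(", "))


# Keep only the finals whose "participant" values all look like teams.
finals_with_teams = [final for final, participants in rows if lists_teams(participants)]
print(finals_with_teams)
```

On these rows only the 1998 final passes the check, which matches what the raw output suggests.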

As the results show, the property returns different types of information: teams for some finals and individual players for others. I therefore retrieve all properties that have a FIFA World Cup Final as the object.

In [29]:
queryString = """
SELECT ?p ?pname COUNT(DISTINCT ?fe) AS ?finals
WHERE{

    ?wc wdt:P3450 wd:Q19317;
        wdt:P580 ?start;
        wdt:P3967 ?fe.

    ?s ?p ?fe.
    
    ?p <http://schema.org/name> ?pname.  
    
    FILTER(?start < NOW()).

}
ORDER BY DESC(?finals)
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P155'), ('pname', 'follows'), ('finals', '21')]
[('p', 'http://www.wikidata.org/prop/direct/P3967'), ('pname', 'final event'), ('finals', '21')]
[('p', 'http://www.wikidata.org/prop/direct/P156'), ('pname', 'followed by'), ('finals', '20')]
[('p', 'http://www.wikidata.org/prop/direct/P1344'), ('pname', 'participant in'), ('finals', '7')]
[('p', 'http://www.wikidata.org/prop/direct/P793'), ('pname', 'significant event'), ('finals', '3')]
[('p', 'http://www.wikidata.org/prop/direct/P1855'), ('pname', 'Wikidata property example'), ('finals', '3')]
[('p', 'http://www.wikidata.org/prop/direct/P2522'), ('pname', 'victory'), ('finals', '2')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('pname', 'has part'), ('finals', '2')]
[('p', 'http://www.wikidata.org/prop/direct/P921'), ('pname', 'main subject'), ('finals', '2')]
[('p', 'http://www.wikidata.org/prop/direct/P366'), ('pname', 'use'), ('finals', '1')]


10

Since I cannot find any useful information with this approach I check how a National Football Team is connected to the final.

In [30]:
queryString = """
SELECT DISTINCT ?p ?pname 
WHERE{

    ?nft wdt:P31 wd:Q6979593.
    
    ?wc wdt:P3450 wd:Q19317;
        wdt:P580 ?start;
        wdt:P3967 ?fe.

    ?nft ?p ?fe.
    
    ?p <http://schema.org/name> ?pname.  
    
    FILTER(?start < NOW()).

}
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1344'), ('pname', 'participant in')]


1

I also check the other direction.

In [31]:
queryString = """
SELECT DISTINCT ?p ?pname 
WHERE{

    ?nft wdt:P31 wd:Q6979593.
    
    ?wc wdt:P3450 wd:Q19317;
        wdt:P580 ?start;
        wdt:P3967 ?fe.

    ?fe ?p ?nft.
    
    ?p <http://schema.org/name> ?pname.  
    
    FILTER(?start < NOW()).

}
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1346'), ('pname', 'winner')]
[('p', 'http://www.wikidata.org/prop/direct/P710'), ('pname', 'participant')]
[('p', 'http://www.wikidata.org/prop/direct/P1923'), ('pname', 'participating team')]


3

I discovered that FIFA World Cup Finals and National Football Teams are connected by the properties "participant" (wdt:P710), "participating team" (wdt:P1923), "winner" (wdt:P1346) and "participant in" (wdt:P1344). I already checked the property "participant" and discovered it returns teams only for the "1998 FIFA World Cup Final". I now check the other two participation properties, starting from "participating team" (wdt:P1923).

In [32]:
queryString = """
SELECT ?final GROUP_CONCAT(DISTINCT ?team; separator=", ") AS ?participants
WHERE{

    ?nft wdt:P31 wd:Q6979593;
         <http://schema.org/name> ?team.
    
    ?wc wdt:P3450 wd:Q19317;
        wdt:P580 ?start;
        wdt:P3967 ?fe.

    ?fe wdt:P1923 ?nft;
        <http://schema.org/name> ?final.
    
    FILTER(?start < NOW()).

}
GROUP BY ?final
ORDER BY ?final
"""

print("Results")
run_query(queryString)

Results
[('final', '1930 FIFA World Cup Final'), ('participants', 'Argentina national football team, Uruguay national football team')]
[('final', '1934 FIFA World Cup Final'), ('participants', 'Czechoslovakia national association football team, Italy national association football team')]
[('final', '1938 FIFA World Cup Final'), ('participants', 'Hungary national association football team')]
[('final', '1954 FIFA World Cup Final'), ('participants', 'Germany national association football team, Hungary national association football team')]
[('final', '1958 FIFA World Cup Final'), ('participants', 'Brazil national football team, Sweden national association football team')]
[('final', '1962 FIFA World Cup Final'), ('participants', 'Czechoslovakia national association football team')]
[('final', '1970 FIFA World Cup Final'), ('participants', 'Brazil national football team, Italy national association football team')]
[('final', '1974 FIFA World Cup Final'), ('participants', 'Germany national 

19

I discovered this property also provides information for the 1998 World Cup Final, therefore I do not need to use the property "participant in" (wdt:P1344). I now check the property "participant" (wdt:P710) again, this time restricted to National Football Teams.

In [33]:
queryString = """
SELECT ?final GROUP_CONCAT(DISTINCT ?team; separator=", ") AS ?participants
WHERE{

    ?nft wdt:P31 wd:Q6979593;
         <http://schema.org/name> ?team.
    
    ?wc wdt:P3450 wd:Q19317;
        wdt:P580 ?start;
        wdt:P3967 ?fe.

    ?fe wdt:P710 ?nft;
        <http://schema.org/name> ?final.
    
    FILTER(?start < NOW()).

}
GROUP BY ?final
ORDER BY ?final
"""

print("Results")
run_query(queryString)

Results
[('final', '1998 FIFA World Cup Final'), ('participants', 'Brazil national football team, France national association football team')]


1

Again, this property provides no additional information, therefore I will use the property "participating team" (wdt:P1923), although I can clearly see it does not provide information for all the FIFA World Cup Finals (only 19 are returned) and it does not always return both teams (for example, only one team is listed for the 1938 and 1962 finals).
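
The incomplete finals can be listed directly from the "participating team" results. The sketch below uses only the rows visible in the truncated output above and assumes that a complete final lists exactly two teams; the variable names are illustrative.

```python
# Rows copied from the visible part of the "participating team" output (truncated after 1970).
rows = {
    "1930 FIFA World Cup Final": "Argentina national football team, Uruguay national football team",
    "1934 FIFA World Cup Final": "Czechoslovakia national association football team, Italy national association football team",
    "1938 FIFA World Cup Final": "Hungary national association football team",
    "1954 FIFA World Cup Final": "Germany national association football team, Hungary national association football team",
    "1958 FIFA World Cup Final": "Brazil national football team, Sweden national association football team",
    "1962 FIFA World Cup Final": "Czechoslovakia national association football team",
    "1970 FIFA World Cup Final": "Brazil national football team, Italy national association football team",
}

# A final is played by exactly two teams; any other count marks incomplete data.
incomplete = [final for final, teams in rows.items() if len(teams.split(", ")) != 2]
print(incomplete)
```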

## Country Information 
This section contains the queries used to answer the second question: _Identify the BGP that connects a country with a national football team or as a location to a world cup_

### National Football Team Country
Here I retrieve the connection between a country and a national football team.

I recall I have the property "nationality" (wdt:P27), therefore I check if this property provides the information I am looking for.

In [34]:
queryString = """
SELECT ?team ?c ?country
WHERE { 

?nft wdt:P31 wd:Q6979593;
     wdt:P27 ?c;
     <http://schema.org/name> ?team.
     
?c <http://schema.org/name> ?country.

} 
LIMIT 15
"""

print("Results")
run_query(queryString)

Results
Empty


0

Since I found no results I check how "Spain national football team" (wd:Q42267) is connected to "Spain" (wd:Q29).

In [35]:
queryString = """
SELECT ?p ?pname
WHERE { 

wd:Q42267 ?p wd:Q29.
     
?p <http://schema.org/name> ?pname.

} 
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1532'), ('pname', 'country for sport')]
[('p', 'http://www.wikidata.org/prop/direct/P17'), ('pname', 'country')]


2

I found the properties "country" (wdt:P17) and "country for sport" (wdt:P1532). I check how many National Football Teams have these properties before deciding which one to use.

In [36]:
queryString = """
SELECT COUNT(DISTINCT ?nft) AS ?teams17
WHERE { 

?nft wdt:P31 wd:Q6979593.

FILTER EXISTS { ?nft wdt:P17 ?c. }

} 
"""

print("Results")
run_query(queryString)

Results
[('teams17', '1123')]


1

In [37]:
queryString = """
SELECT COUNT(DISTINCT ?nft) AS ?teams1532
WHERE { 

?nft wdt:P31 wd:Q6979593.

FILTER EXISTS { ?nft wdt:P1532 ?c. }

} 
"""

print("Results")
run_query(queryString)

Results
[('teams1532', '1054')]


1

Since both properties are present for most teams, I check if there are teams that have the property "country for sport" (wdt:P1532) but do not have the property "country" (wdt:P17).

In [38]:
queryString = """
SELECT COUNT(DISTINCT ?nft) AS ?teams
WHERE { 
{
    ?nft wdt:P31 wd:Q6979593.

    FILTER EXISTS { ?nft wdt:P1532 ?c. }
    
}
MINUS{

    ?nft wdt:P31 wd:Q6979593.

    FILTER EXISTS { ?nft wdt:P17 ?c. }
    
}

} 
"""

print("Results")
run_query(queryString)

Results
[('teams', '55')]


1

Since 55 teams have "country for sport" (wdt:P1532) but not "country" (wdt:P17), I will use both properties when looking for a team's country.
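
The three counts above are enough to work out the overlap between the two properties. The following is a small arithmetic sketch on the numbers returned by the queries; the variable names are illustrative and no new query is involved.

```python
# Counts returned by the three queries above.
teams_p17 = 1123    # teams with "country" (wdt:P17)
teams_p1532 = 1054  # teams with "country for sport" (wdt:P1532)
only_p1532 = 55     # teams with wdt:P1532 but without wdt:P17

both = teams_p1532 - only_p1532  # teams that have both properties
only_p17 = teams_p17 - both      # teams with wdt:P17 but without wdt:P1532
either = teams_p17 + only_p1532  # teams covered when both properties are used

print(both, only_p17, either)
```

Using both properties covers 55 more teams than "country" alone, at the cost of possibly returning two countries for the same team.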

### Countries Hosting a FIFA World Cup
Here I retrieve information about which countries hosted a World Cup edition.

I recall I discovered FIFA World Cups have the property "country" (wdt:P17) and "location" (wdt:P276) therefore I check if such properties provide the same information.

In [39]:
queryString = """
SELECT ?wc ?country ?location
WHERE { 

?s wdt:P3450 wd:Q19317;
    wdt:P580 ?start;
    wdt:P17 ?c;
    wdt:P276 ?l;
    <http://schema.org/name> ?wc.
    
?c <http://schema.org/name> ?country.
?l <http://schema.org/name> ?location.

FILTER(?start < NOW()).

}
ORDER BY ?wc
"""

print("Results")
run_query(queryString)

Results
[('wc', '1930 FIFA World Cup'), ('country', 'Uruguay'), ('location', 'Estadio Centenario')]
[('wc', '1930 FIFA World Cup'), ('country', 'Uruguay'), ('location', 'Estadio Gran Parque Central')]
[('wc', '1930 FIFA World Cup'), ('country', 'Uruguay'), ('location', 'Estadio Pocitos')]
[('wc', '1934 FIFA World Cup'), ('country', 'Italy'), ('location', 'Stadio Giuseppe Meazza')]
[('wc', '1934 FIFA World Cup'), ('country', 'Italy'), ('location', 'Stadio Olimpico Grande Torino')]
[('wc', '1934 FIFA World Cup'), ('country', 'Italy'), ('location', 'Stadio Artemio Franchi (Florence)')]
[('wc', '1934 FIFA World Cup'), ('country', 'Italy'), ('location', "Stadio Renato Dall'Ara")]
[('wc', '1934 FIFA World Cup'), ('country', 'Italy'), ('location', 'Stadio Luigi Ferraris')]
[('wc', '1934 FIFA World Cup'), ('country', 'Italy'), ('location', 'Stadio Giuseppe Grezar')]
[('wc', '1934 FIFA World Cup'), ('country', 'Italy'), ('location', 'Stadio Nazionale PNF')]
[('wc', '1934 FIFA World Cup'), ('cou

216

The "location" property actually returns the stadiums where the FIFA World Cup matches were held. I use only the property "country" (wdt:P17) to retrieve the country where a FIFA World Cup edition was hosted.

## World Cups (Countries)
This section contains the queries used to answer the third question: _In which countries took place a FIFA world cup or world cup final? With how many participants?_

Firstly, by means of the property "country" (wdt:P17) I return the hosting country of each FIFA World Cup, while with the property "number of participants" (wdt:P1132) I return how many teams took part in it.

In [40]:
queryString = """
SELECT DISTINCT ?wc ?country ?numParticipants
WHERE { 

?s wdt:P3450 wd:Q19317;
    wdt:P580 ?start;
    wdt:P1132 ?numParticipants;
    wdt:P17 ?c;
    <http://schema.org/name> ?wc.
    
?c <http://schema.org/name> ?country.

FILTER(?start < NOW()).

}
ORDER BY ?wc
"""

print("Results")
run_query(queryString)

Results
[('wc', '1930 FIFA World Cup'), ('country', 'Uruguay'), ('numParticipants', '13')]
[('wc', '1934 FIFA World Cup'), ('country', 'Italy'), ('numParticipants', '16')]
[('wc', '1938 FIFA World Cup'), ('country', 'France'), ('numParticipants', '15')]
[('wc', '1950 FIFA World Cup'), ('country', 'Brazil'), ('numParticipants', '13')]
[('wc', '1954 FIFA World Cup'), ('country', 'Switzerland'), ('numParticipants', '16')]
[('wc', '1958 FIFA World Cup'), ('country', 'Sweden'), ('numParticipants', '16')]
[('wc', '1962 FIFA World Cup'), ('country', 'Chile'), ('numParticipants', '16')]
[('wc', '1966 FIFA World Cup'), ('country', 'United Kingdom'), ('numParticipants', '16')]
[('wc', '1970 FIFA World Cup'), ('country', 'Mexico'), ('numParticipants', '16')]
[('wc', '1974 FIFA World Cup'), ('country', 'Germany'), ('numParticipants', '16')]
[('wc', '1978 FIFA World Cup'), ('country', 'Argentina'), ('numParticipants', '16')]
[('wc', '1982 FIFA World Cup'), ('country', 'Mexico'), ('numParticipants',

22

From the result I notice there is a mistake in the data. The 1982 FIFA World Cup (wd:Q46934) was hosted in Spain, but the database returns Mexico as its country. I check whether the database contains incorrect information or whether another property is used for the hosting countries.

In [41]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

wd:Q46934 wdt:P17 wd:Q29.
    
?p <http://schema.org/name> ?pname.

}
"""

print("Results")
run_query(queryString)

Results
Empty


0

Since the query finds no "country" (wdt:P17) statement between the 1982 World Cup and Spain, I conclude that the database contains incorrect information.
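
The query above fixes the property to wdt:P17, so it only tests whether that single statement exists. A more general check, which was not run in this notebook, would leave the property as a variable, in the same way as the earlier query between the Spain national football team and Spain. The sketch below only builds such a query string; `connecting_properties_query` is a hypothetical helper and no result is claimed for it.

```python
def connecting_properties_query(subject: str, obj: str) -> str:
    """Build a SPARQL query listing every property that links subject to obj."""
    return f"""
SELECT DISTINCT ?p ?pname
WHERE {{
    {subject} ?p {obj} .
    ?p <http://schema.org/name> ?pname .
}}
"""


# 1982 FIFA World Cup (wd:Q46934) and Spain (wd:Q29), as used above.
queryString = connecting_properties_query("wd:Q46934", "wd:Q29")
print(queryString)
# In the notebook this string would then be passed to run_query(queryString).
```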

The country hosting the FIFA World Cup final of an edition is the same country hosting the whole tournament. However, when exploring the finals I discovered the property "location" (wdt:P276), so I can return the stadium where each final was played.

In [42]:
queryString = """
SELECT ?final ?location
WHERE{

    ?wc wdt:P3450 wd:Q19317;
        wdt:P580 ?start;
        wdt:P3967 ?fe.

    ?fe wdt:P276 ?l;
        <http://schema.org/name> ?final.  
        
    ?l <http://schema.org/name> ?location.
    
    FILTER(?start < NOW()).

}
ORDER BY ?final
"""

print("Results")
run_query(queryString)

Results
[('final', '1930 FIFA World Cup Final'), ('location', 'Estadio Centenario')]
[('final', '1934 FIFA World Cup Final'), ('location', 'Stadio Nazionale PNF')]
[('final', '1938 FIFA World Cup Final'), ('location', 'Stade Olympique Yves-du-Manoir')]
[('final', '1954 FIFA World Cup Final'), ('location', 'Wankdorf Stadium')]
[('final', '1958 FIFA World Cup Final'), ('location', 'Råsunda Stadium')]
[('final', '1962 FIFA World Cup Final'), ('location', 'Estadio Nacional Julio Martínez Prádanos')]
[('final', '1966 FIFA World Cup Final'), ('location', 'Wembley Stadium')]
[('final', '1970 FIFA World Cup Final'), ('location', 'Estadio Azteca')]
[('final', '1974 FIFA World Cup Final'), ('location', 'Olympic Stadium of Munich')]
[('final', '1978 FIFA World Cup Final'), ('location', 'Estadio Monumental Antonio Vespucio Liberti')]
[('final', '1982 FIFA World Cup Final'), ('location', 'Madrid')]
[('final', '1986 FIFA World Cup Final'), ('location', 'Estadio Azteca')]
[('final', '1990 FIFA World 

21

I check if the location node has the property "country" (wdt:P17).

In [43]:
queryString = """
SELECT ?final ?location ?country
WHERE{

    ?wc wdt:P3450 wd:Q19317;
        wdt:P580 ?start;
        wdt:P3967 ?fe.

    ?fe wdt:P276 ?l;
        <http://schema.org/name> ?final.  
        
    ?l wdt:P17 ?c;
        <http://schema.org/name> ?location.
        
    ?c <http://schema.org/name> ?country.
    
    FILTER(?start < NOW()).

}
ORDER BY ?final
"""

print("Results")
run_query(queryString)

Results
[('final', '1930 FIFA World Cup Final'), ('location', 'Estadio Centenario'), ('country', 'Uruguay')]
[('final', '1934 FIFA World Cup Final'), ('location', 'Stadio Nazionale PNF'), ('country', 'Italy')]
[('final', '1938 FIFA World Cup Final'), ('location', 'Stade Olympique Yves-du-Manoir'), ('country', 'France')]
[('final', '1954 FIFA World Cup Final'), ('location', 'Wankdorf Stadium'), ('country', 'Switzerland')]
[('final', '1958 FIFA World Cup Final'), ('location', 'Råsunda Stadium'), ('country', 'Sweden')]
[('final', '1962 FIFA World Cup Final'), ('location', 'Estadio Nacional Julio Martínez Prádanos'), ('country', 'Chile')]
[('final', '1966 FIFA World Cup Final'), ('location', 'Wembley Stadium'), ('country', 'United Kingdom')]
[('final', '1970 FIFA World Cup Final'), ('location', 'Estadio Azteca'), ('country', 'Mexico')]
[('final', '1974 FIFA World Cup Final'), ('location', 'Olympic Stadium of Munich'), ('country', 'Germany')]
[('final', '1978 FIFA World Cup Final'), ('locat

21

In this way I have more detailed information and I know the country that hosted the final in editions with multiple hosting countries (for example the 2002 FIFA World Cup).

## Participants and Winners
This section contains the queries used to answer the last question: _Compare number of participants and winners of different world cups._

### Countries Participation
_How many times each country participated in a world cup?_

I first count how many World Cups each country hosted, by means of the property "country" (wdt:P17).

In [44]:
queryString = """
SELECT ?country COUNT(DISTINCT ?wc) AS ?hostedWCs
WHERE { 

?wc wdt:P3450 wd:Q19317;
    wdt:P580 ?start;
    wdt:P17 ?c.
    
?c <http://schema.org/name> ?country.

FILTER(?start < NOW()).

}
ORDER BY DESC(?hostedWCs)
"""

print("Results")
run_query(queryString)

Results
[('country', 'Mexico'), ('hostedWCs', '3')]
[('country', 'Brazil'), ('hostedWCs', '2')]
[('country', 'France'), ('hostedWCs', '2')]
[('country', 'Germany'), ('hostedWCs', '2')]
[('country', 'Italy'), ('hostedWCs', '2')]
[('country', 'South Africa'), ('hostedWCs', '1')]
[('country', 'Sweden'), ('hostedWCs', '1')]
[('country', 'South Korea'), ('hostedWCs', '1')]
[('country', 'Argentina'), ('hostedWCs', '1')]
[('country', 'United Kingdom'), ('hostedWCs', '1')]
[('country', 'Uruguay'), ('hostedWCs', '1')]
[('country', 'Japan'), ('hostedWCs', '1')]
[('country', 'Russia'), ('hostedWCs', '1')]
[('country', 'Chile'), ('hostedWCs', '1')]
[('country', 'United States of America'), ('hostedWCs', '1')]
[('country', 'Switzerland'), ('hostedWCs', '1')]


16

I recall Mexico is wrongly connected to the 1982 World Cup therefore it actually hosted 2 World Cups and Spain should be in the result set with 1 World Cup (1982 World Cup).
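
The manual correction described above can be written down explicitly. This is only a sketch on a copy of the counts printed above (top entries only); it does not change anything in the database.

```python
# Counts copied from the output above (only the entries relevant to the correction).
hosted = {"Mexico": 3, "Brazil": 2, "France": 2, "Germany": 2, "Italy": 2}

corrected = dict(hosted)
corrected["Mexico"] -= 1                            # the 1982 edition is wrongly attributed to Mexico
corrected["Spain"] = corrected.get("Spain", 0) + 1  # Spain hosted the 1982 edition
print(corrected)
```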

By means of the properties "country" (wdt:P17) and "country for sport" (wdt:P1532) I can also retrieve the country of each national football team and return the countries that participated in the most FIFA World Cup editions.

In [46]:
queryString = """
SELECT ?country COUNT(DISTINCT ?wc) AS ?participations
WHERE{

    ?wc wdt:P3450 wd:Q19317;
        wdt:P580 ?start.

    ?nft wdt:P31 wd:Q6979593;
         wdt:P17|wdt:P1532 ?c.  
    
    ?wc wdt:P1923 ?nft.
    
    ?c <http://schema.org/name> ?country.
    
    FILTER(?start < NOW()).

}
GROUP BY ?country
ORDER BY DESC(?participations)
LIMIT 25
"""

print("Results")
run_query(queryString)

Results
[('country', 'Brazil'), ('participations', '20')]
[('country', 'Germany'), ('participations', '18')]
[('country', 'Italy'), ('participations', '18')]
[('country', 'Argentina'), ('participations', '16')]
[('country', 'United Kingdom'), ('participations', '16')]
[('country', 'Mexico'), ('participations', '15')]
[('country', 'England'), ('participations', '14')]
[('country', 'Spain'), ('participations', '14')]
[('country', 'France'), ('participations', '14')]
[('country', 'Uruguay'), ('participations', '12')]
[('country', 'Belgium'), ('participations', '12')]
[('country', 'Sweden'), ('participations', '11')]
[('country', 'Netherlands'), ('participations', '10')]
[('country', 'United States of America'), ('participations', '10')]
[('country', 'Switzerland'), ('participations', '10')]
[('country', 'South Korea'), ('participations', '9')]
[('country', 'Chile'), ('participations', '9')]
[('country', 'Hungary'), ('participations', '9')]
[('country', 'Yugoslavia'), ('participations', '8

25

### World Cup Finals
_Which country participated in the most finals? Which country won the most finals?_

I found the connection between FIFA World Cup finals and the national football teams in a previous section and the same goes for the country information of each team. I recall data are incomplete as far as final participants are concerned. 

In [47]:
queryString = """
SELECT ?country COUNT(DISTINCT ?fe) AS ?finals
WHERE { 

?nft wdt:P31 wd:Q6979593;
     wdt:P17|wdt:P1532 ?c.
    
?wc wdt:P3450 wd:Q19317;
    wdt:P580 ?start;
    wdt:P3967 ?fe.

?fe wdt:P1923 ?nft.

?c <http://schema.org/name> ?country.
     
FILTER(?start < NOW()).

} 
GROUP BY ?country
ORDER BY DESC(?finals)
"""

print("Results")
run_query(queryString)

Results
[('country', 'Germany'), ('finals', '6')]
[('country', 'Brazil'), ('finals', '5')]
[('country', 'Italy'), ('finals', '4')]
[('country', 'Argentina'), ('finals', '3')]
[('country', 'Czechoslovakia'), ('finals', '2')]
[('country', 'Netherlands'), ('finals', '2')]
[('country', 'Uruguay'), ('finals', '2')]
[('country', 'Hungary'), ('finals', '2')]
[('country', 'France'), ('finals', '2')]
[('country', 'Sweden'), ('finals', '1')]
[('country', 'Spain'), ('finals', '1')]


11

I recall I found the property "winner" (wdt:P1346) for the FIFA World Cup final. Firstly I check if such property is present for each final.

In [48]:
queryString = """
SELECT ?final ?winner 
WHERE { 
    
?wc wdt:P3450 wd:Q19317;
    wdt:P580 ?start;
    wdt:P3967 ?f.

?f wdt:P1346 ?w.

?f <http://schema.org/name> ?final.
?w <http://schema.org/name> ?winner.
     
FILTER(?start < NOW()).

} 
ORDER BY ?final
"""

print("Results")
run_query(queryString)

Results
[('final', '1930 FIFA World Cup Final'), ('winner', 'Uruguay national football team')]
[('final', '1934 FIFA World Cup Final'), ('winner', 'Italy national association football team')]
[('final', '1938 FIFA World Cup Final'), ('winner', 'Italy national association football team')]
[('final', '1954 FIFA World Cup Final'), ('winner', 'Germany national association football team')]
[('final', '1958 FIFA World Cup Final'), ('winner', 'Brazil national football team')]
[('final', '1966 FIFA World Cup Final'), ('winner', 'England at the 1966 FIFA World Cup')]
[('final', '1970 FIFA World Cup Final'), ('winner', 'Brazil national football team')]
[('final', '2002 FIFA World Cup Final'), ('winner', 'Brazil national football team')]
[('final', '2006 FIFA World Cup Final'), ('winner', 'Italy national association football team')]
[('final', '2010 FIFA World Cup Final'), ('winner', 'Spain national association football team')]
[('final', '2014 FIFA World Cup Final'), ('winner', 'Germany national

13

The winner is missing for several finals, in particular for the World Cups held between 1974 and 1998. I recall this property is also connected to the FIFA World Cup itself, therefore I check if the information is present there. Of course, whoever won the FIFA World Cup tournament won the final as well.

In [49]:
queryString = """
SELECT ?edition ?winner 
WHERE { 
    
?wc wdt:P3450 wd:Q19317;
    wdt:P580 ?start;
    wdt:P1346 ?w.

?wc <http://schema.org/name> ?edition.
?w <http://schema.org/name> ?winner.
     
FILTER(?start < NOW()).

} 
ORDER BY ?edition
"""

print("Results")
run_query(queryString)

Results
[('edition', '1930 FIFA World Cup'), ('winner', 'Uruguay national football team')]
[('edition', '1934 FIFA World Cup'), ('winner', 'Italy national association football team')]
[('edition', '1938 FIFA World Cup'), ('winner', 'Italy national association football team')]
[('edition', '1950 FIFA World Cup'), ('winner', 'Uruguay national football team')]
[('edition', '1954 FIFA World Cup'), ('winner', 'Germany national association football team')]
[('edition', '1958 FIFA World Cup'), ('winner', 'Brazil national football team')]
[('edition', '1962 FIFA World Cup'), ('winner', 'Brazil national football team')]
[('edition', '1966 FIFA World Cup'), ('winner', 'England national association football team')]
[('edition', '1970 FIFA World Cup'), ('winner', 'Brazil national football team')]
[('edition', '1974 FIFA World Cup'), ('winner', 'Germany national association football team')]
[('edition', '1978 FIFA World Cup'), ('winner', 'Argentina national football team')]
[('edition', '1982 FIFA 

21

I use the property "winner" (wdt:P1346) connected to the FIFA World Cup since it has more complete information. Now I can return the countries that won the most finals.

In [50]:
queryString = """
SELECT ?c ?country COUNT(DISTINCT ?wc) AS ?wins
WHERE { 

?nft wdt:P31 wd:Q6979593;
     wdt:P17|wdt:P1532 ?c.
    
?wc wdt:P3450 wd:Q19317;
    wdt:P580 ?start;
    wdt:P1346 ?nft.

?c <http://schema.org/name> ?country.
     
FILTER(?start < NOW()).

} 
GROUP BY ?c ?country
ORDER BY DESC(?wins)
"""

print("Results")
run_query(queryString)

Results
[('c', 'http://www.wikidata.org/entity/Q155'), ('country', 'Brazil'), ('wins', '5')]
[('c', 'http://www.wikidata.org/entity/Q38'), ('country', 'Italy'), ('wins', '4')]
[('c', 'http://www.wikidata.org/entity/Q183'), ('country', 'Germany'), ('wins', '4')]
[('c', 'http://www.wikidata.org/entity/Q142'), ('country', 'France'), ('wins', '2')]
[('c', 'http://www.wikidata.org/entity/Q77'), ('country', 'Uruguay'), ('wins', '2')]
[('c', 'http://www.wikidata.org/entity/Q414'), ('country', 'Argentina'), ('wins', '2')]
[('c', 'http://www.wikidata.org/entity/Q145'), ('country', 'United Kingdom'), ('wins', '1')]
[('c', 'http://www.wikidata.org/entity/Q21'), ('country', 'England'), ('wins', '1')]
[('c', 'http://www.wikidata.org/entity/Q29'), ('country', 'Spain'), ('wins', '1')]


9
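
In the result above both "United Kingdom" and "England" have one win, most likely the same 1966 title of the "England national association football team". This is what the alternative path `wdt:P17|wdt:P1532` produces when a team has two different country values: the same win is counted once per country. The sketch below reproduces the effect on a small mapping; the two countries of the England team are an assumption that was not verified with a query in this notebook.

```python
from collections import Counter

# Team -> countries reached through wdt:P17|wdt:P1532 (the England entry is an assumption).
team_countries = {
    "England national association football team": {"United Kingdom", "England"},
    "Spain national association football team": {"Spain"},
}
# Team -> number of editions won, taken from the results above.
team_wins = {
    "England national association football team": 1,
    "Spain national association football team": 1,
}

wins_by_country = Counter()
for team, wins in team_wins.items():
    for country in team_countries[team]:
        wins_by_country[country] += wins  # one count per (team, country) pair

print(dict(wins_by_country))
print(sum(wins_by_country.values()), "country-level wins for", sum(team_wins.values()), "titles")
```

When reading the ranking, the wins of "United Kingdom" and "England" should therefore not be added together.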

### Italy, France, Spain, Germany
_Who participated in the most world cups among Italy, France, Spain, and Germany?_

I recall:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wd:Q38`   | Italy   | node |
| `wd:Q142`   | France   | node |
| `wd:Q29`   | Spain  | node |
| `wd:Q183`   | Germany   | node |

Therefore I filter the query counting the participations of each country in order to select only the countries of interest. I recall I use the property "participating team" (wdt:P1923) to assess the team participation.

In [51]:
queryString = """
SELECT ?country COUNT(DISTINCT ?wc) AS ?participations
WHERE{

    ?wc wdt:P3450 wd:Q19317;
        wdt:P580 ?start;
        wdt:P1923 ?nft.

    ?nft wdt:P31 wd:Q6979593;
         wdt:P17|wdt:P1532 ?c.  
    
    ?c <http://schema.org/name> ?country.
    
    FILTER(?start < NOW()).
    FILTER(?c = wd:Q38 || ?c = wd:Q142 || ?c = wd:Q29 || ?c = wd:Q183).

}
GROUP BY ?country
ORDER BY DESC(?participations)
"""

print("Results")
run_query(queryString)

Results
[('country', 'Germany'), ('participations', '18')]
[('country', 'Italy'), ('participations', '18')]
[('country', 'Spain'), ('participations', '14')]
[('country', 'France'), ('participations', '14')]


4

Germany and Italy participated in the most FIFA World Cups (18 each) and they also won more editions (4 each) than France (2) and Spain (1).
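
The comparison can be made explicit by joining the two result sets. The sketch below copies the participations from the last query and the wins from the "winner" query for the four countries; the variable names are illustrative.

```python
# Values copied from the two result sets above.
participations = {"Germany": 18, "Italy": 18, "Spain": 14, "France": 14}
wins = {"Italy": 4, "Germany": 4, "France": 2, "Spain": 1}

# Sort by participations, then by wins (both descending), then by name.
summary = sorted(
    ((country, participations[country], wins[country]) for country in participations),
    key=lambda row: (-row[1], -row[2], row[0]),
)
for country, played, won in summary:
    print(f"{country}: {played} participations, {won} wins")
```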