# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained by the previous ones.

An exploratory workload starts with a usually vague, open-ended question, and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you found them via a SPARQL query that is present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning the human-readable name of a property or a class in Wikidata.

In [8]:
## SETUP used later
import sys
import os
import json
import pandas as pd
sys.path.insert(1, '/locale/data/jupyter/prando/notebook/sparqlthesis/')
import modules.evaluation as evaluation
from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-sport3-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString,verbose = True):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://gracevirtuoso.dei.unipd.it/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        results = sparql.query()
        json_results = results.convert()
        if len(json_results['results']['bindings'])==0:
            print("Empty")
            return []
        array = []
        for bindings in json_results['results']['bindings']:
            app =  [ (var, value['value'])  for var, value in bindings.items() ] 
            if verbose:
                print( app)
            array.append(app)
        if verbose:
            print(len(array))
        return array

    except Exception as e:
        print("The operation failed", e)
        # return an empty list so that callers can still iterate over the result
        return []

# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://gracevirtuoso.dei.unipd.it/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
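
`run_query` returns each result row as a list of `(variable, value)` tuples. The following is a minimal sketch of a helper that turns those rows into dictionaries, so that later cells can look values up by variable name. `rows_to_dicts` is a hypothetical name introduced here for illustration; it is not part of the notebook or of SPARQLWrapper.

```python
def rows_to_dicts(rows):
    """Convert run_query-style rows into a list of {variable: value} dicts.

    `rows` is expected to be a list of lists of (variable, value) tuples,
    the shape returned by run_query. None or an empty list yields [].
    """
    return [dict(row) for row in (rows or [])]


# Example with a row copied from the Task 1 output
sample = [[("obj", "http://www.wikidata.org/entity/Q937857"),
           ("name", "association football player")]]
records = rows_to_dicts(sample)
print(records[0]["name"])  # association football player
```

Since `pandas` is already imported in the setup cell, the resulting list could also be passed to `pd.DataFrame(records)` for a tabular view.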

# Sport Workflow Series ("Association Football Players explorative search") 

Consider the following exploratory information need:

> Investigate association football players and find the main BGPs related to this sport. Compare players by country, by the awards they received, and by their participation in the FIFA World Cup.

**Important consideration: in Wikidata, this sport is called ASSOCIATION FOOTBALL**

## Useful URIs for the current workflow
The following are given:

| IRI         | Description                          | Role      |
| ----------- | ------------------------------------ | --------- |
| `wdt:P1647` | subproperty of                       | predicate |
| `wdt:P31`   | instance of                          | predicate |
| `wdt:P106`  | occupation (profession)              | predicate |
| `wdt:P279`  | subclass of                          | predicate |
| `wdt:P27`   | country of citizenship (nationality) | predicate |
| `wd:Q5`     | Human                                | node      |
| `wd:Q11571` | Cristiano Ronaldo                    | node      |





Also consider

```
wd:Q11571 ?p ?obj .
```

is the BGP to retrieve all **properties of Cristiano Ronaldo**

Please consider that when you return a resource, you should return both its IRI and its label. In particular, when a task requires you to identify a BGP, the result set must always be a list of (IRI, label) pairs.
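
The IRI and label requirement can be checked mechanically on the rows returned by `run_query`. The sketch below assumes that each row is a list of `(variable, value)` tuples and that the IRI comes first and the label second; `is_iri_label_pair` is a hypothetical helper, and the "looks like an IRI" test is a deliberately simple prefix check.

```python
def is_iri_label_pair(row):
    """Return True if a run_query-style row holds exactly one IRI and one label.

    Assumption: the first binding is the IRI and the second is the label.
    """
    if len(row) != 2:
        return False
    (_, iri), (_, label) = row
    prefixes = ("http://", "https://")
    looks_like_iri = iri.startswith(prefixes)
    label_is_text = bool(label) and not label.startswith(prefixes)
    return looks_like_iri and label_is_text


good = [("obj", "http://www.wikidata.org/entity/Q937857"),
        ("name", "association football player")]
bad = [("name", "association football player")]
print(is_iri_label_pair(good), is_iri_label_pair(bad))  # True False
```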



The workload should:


1. Identify the BGP for association football player

2. Identify the BGP for association football club

3. Identify the BGP for association football league

4. Return, for each European country, the number of association football leagues (the result set must be a list of triples: country IRI, label and #leagues).

5. Identify the BGP for the Ballon d'Or award

6. Return the countries that have at least two different players of their nationality (consider the country of citizenship) who won the Ballon d'Or (the result set must be a list of triples: country IRI, label and #different players).

7. Identify the BGP of the FIFA World Cup

8. Which players participated in more than 4 FIFA World Cup editions? (the result set must be a list of triples: player IRI, label and #participations).

9. Consider the countries that won the FIFA World Cup at least twice. Find the players from those countries who won the Ballon d'Or (the result set must be a list of pairs: player IRI and label).


In [2]:
## startup the evaluation
# setup the file and create the empty json
ipname = "s3.ipynb"
pt = os.getcwd()+os.sep+ipname
evaluation.setup(pt)

The index of this workflow is: 0_6


## Task 1 
Identify the BGP for association football player

In [3]:
# starting from Cristiano Ronaldo I look at his properties
queryString = """
SELECT DISTINCT ?pr ?name
WHERE { 
    wd:Q11571 ?pr ?obj.
    ?pr <http://schema.org/name> ?name .
}
"""

print("Results")
x = run_query(queryString)

Results
[('pr', 'http://www.wikidata.org/prop/direct/P1005'), ('name', 'Portuguese National Library ID')]
[('pr', 'http://www.wikidata.org/prop/direct/P1006'), ('name', 'Nationale Thesaurus voor Auteurs ID')]
[('pr', 'http://www.wikidata.org/prop/direct/P1015'), ('name', 'NORAF ID')]
[('pr', 'http://www.wikidata.org/prop/direct/P103'), ('name', 'native language')]
[('pr', 'http://www.wikidata.org/prop/direct/P1038'), ('name', 'relative')]
[('pr', 'http://www.wikidata.org/prop/direct/P1050'), ('name', 'medical condition')]
[('pr', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('pr', 'http://www.wikidata.org/prop/direct/P109'), ('name', 'signature')]
[('pr', 'http://www.wikidata.org/prop/direct/P118'), ('name', 'league')]
[('pr', 'http://www.wikidata.org/prop/direct/P1263'), ('name', 'NNDB people ID')]
[('pr', 'http://www.wikidata.org/prop/direct/P1266'), ('name', 'AlloCiné person ID')]
[('pr', 'http://www.wikidata.org/prop/direct/P1273'), ('name', 'CANTIC ID')]
[

In [4]:
# starting from Cristiano Ronaldo I look at his occupations
queryString = """
SELECT DISTINCT ?obj ?name
WHERE { 
    wd:Q11571 wdt:P106 ?obj.
    ?obj <http://schema.org/name> ?name .
} 
LIMIT 100
"""

print("Results")
x = run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q937857'), ('name', 'association football player')]
1


In [3]:
### insert the result of TASK 1 in the file
og_uri = "http://www.wikidata.org/entity/Q937857"
og_name = "association football player"
obj = {"uri":og_uri,"name":og_name}
evaluation.add_result(evaluation.get_index_workflow(pt),"1", evaluation.TYPE_SINGLE ,"uri", [obj] ,"all")

The index of this workflow is: 0_6
The path is /locale/data/jupyter/prando/notebook/2022/results/workflow0_6.json
JSON object updated


`wd:Q937857` is the node for association football player in Wikidata.

## Task 2
Identify the BGP for association football club

In [21]:
# in the first query I found P54 that is member of sports team. Use it and explore
queryString = """
SELECT DISTINCT ?team ?name
WHERE { 
    wd:Q11571 wdt:P54 ?team .
    ?team <http://schema.org/name> ?name .
}
"""

print("Results")
x= run_query(queryString)

Results
[('team', 'http://www.wikidata.org/entity/Q3590758'), ('name', 'Portugal Olympic football team')]
[('team', 'http://www.wikidata.org/entity/Q1422'), ('name', 'Juventus F.C.')]
[('team', 'http://www.wikidata.org/entity/Q18656'), ('name', 'Manchester United F.C.')]
[('team', 'http://www.wikidata.org/entity/Q8682'), ('name', 'Real Madrid CF')]
[('team', 'http://www.wikidata.org/entity/Q1630430'), ('name', 'Portugal national under-21 football team')]
[('team', 'http://www.wikidata.org/entity/Q267245'), ('name', 'Portugal national association football team')]
[('team', 'http://www.wikidata.org/entity/Q75729'), ('name', 'Sporting CP')]
[('team', 'http://www.wikidata.org/entity/Q1772776'), ('name', 'Portugal national under-20 football team')]
[('team', 'http://www.wikidata.org/entity/Q3590754'), ('name', 'Portugal national under-17 football team')]
[('team', 'http://www.wikidata.org/entity/Q21079208'), ('name', 'Portugal national under-15 football team')]
10


In [22]:
# look for the classes (instance of) of the teams found above
queryString = """
SELECT DISTINCT ?inst ?name
WHERE { 
    wd:Q11571 wdt:P54 ?team .
    ?team wdt:P31 ?inst.  
    ?inst <http://schema.org/name> ?name .
}
"""

print("Results")
x= run_query(queryString)

Results
[('inst', 'http://www.wikidata.org/entity/Q23759293'), ('name', 'national Olympic football team')]
[('inst', 'http://www.wikidata.org/entity/Q15944511'), ('name', 'association football team')]
[('inst', 'http://www.wikidata.org/entity/Q476028'), ('name', 'association football club')]
[('inst', 'http://www.wikidata.org/entity/Q4830453'), ('name', 'business')]
[('inst', 'http://www.wikidata.org/entity/Q891723'), ('name', 'public company')]
[('inst', 'http://www.wikidata.org/entity/Q6979593'), ('name', 'national association football team')]
[('inst', 'http://www.wikidata.org/entity/Q847017'), ('name', 'sports club')]
[('inst', 'http://www.wikidata.org/entity/Q23847779'), ('name', 'national under-23 association football team')]
[('inst', 'http://www.wikidata.org/entity/Q23904673'), ('name', 'national under-15 association football team')]
9


In [4]:
### insert the result of TASK 2 in the file
og_uri = "http://www.wikidata.org/entity/Q476028"
og_name = "association football club"
obj = {"uri":og_uri,"name":og_name}
evaluation.add_result(evaluation.get_index_workflow(pt),"2", evaluation.TYPE_SINGLE ,"uri", [obj] ,"all")

The index of this workflow is: 0_6
The path is /locale/data/jupyter/prando/notebook/2022/results/workflow0_6.json
JSON object updated


`wd:Q476028` is the node for association football club in Wikidata.

## Task 3 
Identify the BGP for association football league

In [23]:
# In a previous query I found Real Madrid (wd:Q8682); look at its properties
queryString = """
SELECT DISTINCT ?p ?pName
WHERE { 

wd:Q8682 ?p ?obj.

?p <http://schema.org/name> ?pName .
}
"""

print("Results")
x = run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1015'), ('pName', 'NORAF ID')]
[('p', 'http://www.wikidata.org/prop/direct/P112'), ('pName', 'founded by')]
[('p', 'http://www.wikidata.org/prop/direct/P115'), ('pName', 'home venue')]
[('p', 'http://www.wikidata.org/prop/direct/P118'), ('pName', 'league')]
[('p', 'http://www.wikidata.org/prop/direct/P1273'), ('pName', 'CANTIC ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1296'), ('pName', 'Gran Enciclopèdia Catalana ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('pName', 'described by source')]
[('p', 'http://www.wikidata.org/prop/direct/P1411'), ('pName', 'nominated for')]
[('p', 'http://www.wikidata.org/prop/direct/P1417'), ('pName', 'Encyclopædia Britannica Online ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1424'), ('pName', "topic's main template")]
[('p', 'http://www.wikidata.org/prop/direct/P159'), ('pName', 'headquarters location')]
[('p', 'http://www.wikidata.org/prop/direct/P166'), ('pName', 'award 

In [24]:
# Found property league (wdt:P118). Let's see the result
queryString = """
SELECT DISTINCT ?league ?name
WHERE { 

wd:Q8682 wdt:P118 ?league.

?league <http://schema.org/name> ?name .
}
"""

print("Results")
x = run_query(queryString)

Results
[('league', 'http://www.wikidata.org/entity/Q324867'), ('name', 'La Liga')]
1


In [25]:
# Real Madrid plays in La Liga. Find the instance.
queryString = """
SELECT DISTINCT ?inst ?name
WHERE { 

wd:Q8682 wdt:P118 ?league.
?league wdt:P31 ?inst.

?inst <http://schema.org/name> ?name .
}
"""

print("Results")
x = run_query(queryString)

Results
[('inst', 'http://www.wikidata.org/entity/Q15991290'), ('name', 'professional sports league')]
[('inst', 'http://www.wikidata.org/entity/Q15991303'), ('name', 'association football league')]
2


In [5]:
### insert the result of TASK 3 in the file
og_uri = "http://www.wikidata.org/entity/Q15991303"
og_name = "association football league"
obj = {"uri":og_uri,"name":og_name}
evaluation.add_result(evaluation.get_index_workflow(pt),"3", evaluation.TYPE_SINGLE ,"uri", [obj] ,"all")

The index of this workflow is: 0_6
The path is /locale/data/jupyter/prando/notebook/2022/results/workflow0_6.json
JSON object updated


`wd:Q15991303` is the node for association football league in Wikidata.
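
Tasks 2 and 3 follow the same pattern: start from a known node, follow one property, then ask for the class (`wdt:P31`) of what was reached. A minimal sketch of a query builder for this pattern is shown below. `class_of_linked_nodes_query` is a hypothetical helper, and it assumes the `wd:`, `wdt:` and `sc:` prefixes are declared by the caller, as `run_query` does through `prefixString`.

```python
def class_of_linked_nodes_query(node, prop):
    """Build a SELECT query listing the classes (wdt:P31) of the nodes
    reached from `node` through `prop`, together with their labels.

    `node` and `prop` are prefixed names such as "wd:Q8682" and "wdt:P118".
    """
    return f"""
SELECT DISTINCT ?inst ?name
WHERE {{
    {node} {prop} ?linked .
    ?linked wdt:P31 ?inst .
    ?inst sc:name ?name .
}}
"""


# Rebuilds the Task 3 query: classes of the leagues Real Madrid plays in
query = class_of_linked_nodes_query("wd:Q8682", "wdt:P118")
print(query)
```

The returned string could then be passed to `run_query(query)` like any hand-written query.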

## Task 4
Find how many association football leagues each European country has

First of all we have to identify countries of Europe. 

In [49]:
# bind the football player and look at his nationality
queryString = """
SELECT DISTINCT ?nation ?name
WHERE { 

?player wdt:P106 wd:Q937857;
        wdt:P27 ?nation.

?nation sc:name ?name .
}
ORDER BY (?name)
LIMIT 20
"""

print("Results")
x = run_query(queryString)

Results
[('nation', 'http://www.wikidata.org/entity/Q13417403'), ('name', '.الجزائر')]
[('nation', 'http://www.wikidata.org/entity/Q889'), ('name', 'Afghanistan')]
[('nation', 'http://www.wikidata.org/entity/Q170366'), ('name', 'Ahwaz')]
[('nation', 'http://www.wikidata.org/entity/Q137014'), ('name', 'AlSaudiah')]
[('nation', 'http://www.wikidata.org/entity/Q222'), ('name', 'Albania')]
[('nation', 'http://www.wikidata.org/entity/Q1242155'), ('name', 'Albanians in Kosovo')]
[('nation', 'http://www.wikidata.org/entity/Q262'), ('name', 'Algeria')]
[('nation', 'http://www.wikidata.org/entity/Q218272'), ('name', 'Algeria')]
[('nation', 'http://www.wikidata.org/entity/Q2415901'), ('name', 'Allied-occupied Germany')]
[('nation', 'http://www.wikidata.org/entity/Q16641'), ('name', 'American Samoa')]
[('nation', 'http://www.wikidata.org/entity/Q228'), ('name', 'Andorra')]
[('nation', 'http://www.wikidata.org/entity/Q916'), ('name', 'Angola')]
[('nation', 'http://www.wikidata.org/entity/Q25228'),

In [53]:
# look at properties of the nationalities
queryString = """
SELECT DISTINCT ?x ?name
WHERE { 

?player wdt:P106 wd:Q937857;
        wdt:P27 ?nation.
?nation wdt:P31 ?x.

?x sc:name ?name .
}
"""

print("Results")
x = run_query(queryString)

Results
[('x', 'http://www.wikidata.org/entity/Q1489259'), ('name', 'superpower')]
[('x', 'http://www.wikidata.org/entity/Q1323642'), ('name', 'transcontinental country')]
[('x', 'http://www.wikidata.org/entity/Q99541706'), ('name', 'historical unrecognized state')]
[('x', 'http://www.wikidata.org/entity/Q112099'), ('name', 'island nation')]
[('x', 'http://www.wikidata.org/entity/Q123480'), ('name', 'landlocked country')]
[('x', 'http://www.wikidata.org/entity/Q1520223'), ('name', 'constitutional republic')]
[('x', 'http://www.wikidata.org/entity/Q179164'), ('name', 'unitary state')]
[('x', 'http://www.wikidata.org/entity/Q20181813'), ('name', 'colonial power')]
[('x', 'http://www.wikidata.org/entity/Q3024240'), ('name', 'historical country')]
[('x', 'http://www.wikidata.org/entity/Q3624078'), ('name', 'sovereign state')]
[('x', 'http://www.wikidata.org/entity/Q41614'), ('name', 'constitutional monarchy')]
[('x', 'http://www.wikidata.org/entity/Q4209223'), ('name', 'Rechtsstaat')]
[('x

In [56]:
# Q6256 is the class of countries
# look at the properties of countries to find the one for the continent
queryString = """
SELECT DISTINCT ?prop ?name
WHERE { 

?nation wdt:P31 wd:Q6256;
        ?prop ?x.

?prop sc:name ?name .
FILTER(REGEX(?name,"ontin")).
}
"""

print("Results")
x = run_query(queryString)

Results
[('prop', 'http://www.wikidata.org/prop/direct/P30'), ('name', 'continent')]
1
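
The filter above matches the fragment `"ontin"` so that both "Continent" and "continent" are found. An alternative is the standard SPARQL `REGEX` flag `"i"`, which makes the match case-insensitive. The sketch below builds such a query; `property_search_query` is a hypothetical helper, and it assumes the `wdt:` and `sc:` prefixes are declared by the caller.

```python
def property_search_query(class_iri, keyword):
    """Build a query listing the properties used by instances of `class_iri`
    whose label contains `keyword`, ignoring case.

    `keyword` is inserted as-is into a REGEX pattern, so it must not contain
    double quotes, and regex metacharacters keep their special meaning.
    """
    if '"' in keyword:
        raise ValueError("keyword must not contain double quotes")
    return f"""
SELECT DISTINCT ?prop ?name
WHERE {{
    ?s wdt:P31 {class_iri} ;
       ?prop ?o .
    ?prop sc:name ?name .
    FILTER(REGEX(?name, "{keyword}", "i"))
}}
"""


# Same search as above, written with the full keyword
query = property_search_query("wd:Q6256", "continent")
print(query)
```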


In [57]:
# Q6256 is the class of countries
# P30 is the continent property: list its values and find Europe
queryString = """
SELECT DISTINCT ?x ?name
WHERE { 

?nation wdt:P31 wd:Q6256;
        wdt:P30 ?x.

?x sc:name ?name .
}
"""

print("Results")
x = run_query(queryString)

Results
[('x', 'http://www.wikidata.org/entity/Q15'), ('name', 'Africa')]
[('x', 'http://www.wikidata.org/entity/Q18'), ('name', 'South America')]
[('x', 'http://www.wikidata.org/entity/Q46'), ('name', 'Europe')]
[('x', 'http://www.wikidata.org/entity/Q48'), ('name', 'Asia')]
[('x', 'http://www.wikidata.org/entity/Q49'), ('name', 'North America')]
[('x', 'http://www.wikidata.org/entity/Q538'), ('name', 'Insular Oceania')]
[('x', 'http://www.wikidata.org/entity/Q5401'), ('name', 'Eurasia')]
[('x', 'http://www.wikidata.org/entity/Q27611'), ('name', 'Central America')]
8


`wd:Q46` is the Europe node.

In [59]:
# look at the properties of instances of football league to find something related to countries
queryString = """
SELECT DISTINCT ?prop ?name
WHERE { 
?league wdt:P31 wd:Q15991303;
        ?prop ?country.
?country wdt:P31 wd:Q6256.

?prop sc:name ?name .
}
"""

print("Results")
x = run_query(queryString)

Results
[('prop', 'http://www.wikidata.org/prop/direct/P17'), ('name', 'country')]
[('prop', 'http://www.wikidata.org/prop/direct/P2541'), ('name', 'operating area')]
[('prop', 'http://www.wikidata.org/prop/direct/P276'), ('name', 'location')]
3


In [9]:
# using all the information found above, I can count how many leagues each country has
queryString = """
SELECT DISTINCT ?country ?name (COUNT(DISTINCT ?league) AS ?leagues)
WHERE { 
#leagues
?league wdt:P31 wd:Q15991303;
        wdt:P17 ?country.
# europe countries
?country wdt:P31 wd:Q6256;
        wdt:P30 wd:Q46.

?country sc:name ?name .
}
GROUP BY ?country ?name
ORDER BY ?name
"""

print("Results")
x = run_query(queryString)

Results
[('country', 'http://www.wikidata.org/entity/Q222'), ('name', 'Albania'), ('leagues', '3')]
[('country', 'http://www.wikidata.org/entity/Q228'), ('name', 'Andorra'), ('leagues', '2')]
[('country', 'http://www.wikidata.org/entity/Q40'), ('name', 'Austria'), ('leagues', '12')]
[('country', 'http://www.wikidata.org/entity/Q184'), ('name', 'Belarus'), ('leagues', '3')]
[('country', 'http://www.wikidata.org/entity/Q219'), ('name', 'Bulgaria'), ('leagues', '4')]
[('country', 'http://www.wikidata.org/entity/Q224'), ('name', 'Croatia'), ('leagues', '5')]
[('country', 'http://www.wikidata.org/entity/Q229'), ('name', 'Cyprus'), ('leagues', '13')]
[('country', 'http://www.wikidata.org/entity/Q213'), ('name', 'Czech Republic'), ('leagues', '199')]
[('country', 'http://www.wikidata.org/entity/Q35'), ('name', 'Denmark'), ('leagues', '8')]
[('country', 'http://www.wikidata.org/entity/Q33'), ('name', 'Finland'), ('leagues', '14')]
[('country', 'http://www.wikidata.org/entity/Q142'), ('name', '

In [10]:
objs = []
for i in x:
    f_uri = i[0][1]
    f_name = i[1][1]
    leagues = i[2][1]
    obj = {}
    obj["refers_to"] = f_uri
    obj["refers_to_name"] = f_name
    obj["check"] = "value"
    obj["value"]= leagues
    objs.append(obj)
evaluation.add_result(evaluation.get_index_workflow(pt),"4", evaluation.TYPE_REFERRED ,"value", objs)

The index of this workflow is: 0_6
The path is /locale/data/jupyter/prando/notebook/2022/results/workflow0_6.json
JSON object updated
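
The cell above reads each row by position (`i[0][1]`, `i[1][1]`, `i[2][1]`), which depends on the order in which the bindings are listed. A minimal sketch of the same construction using variable names instead is given below. `referred_values` is a hypothetical helper, not part of the `evaluation` module.

```python
def referred_values(rows, iri_var, label_var, value_var):
    """Build the list of result objects stored for 'referred value' tasks.

    Each row is a list of (variable, value) tuples as returned by run_query.
    Values are looked up by variable name rather than by position.
    """
    objs = []
    for row in rows:
        bindings = dict(row)
        objs.append({
            "refers_to": bindings[iri_var],
            "refers_to_name": bindings[label_var],
            "check": "value",
            "value": bindings[value_var],
        })
    return objs


# Row copied from the Task 4 output
rows = [[("country", "http://www.wikidata.org/entity/Q222"),
         ("name", "Albania"), ("leagues", "3")]]
objs = referred_values(rows, "country", "name", "leagues")
print(objs[0]["refers_to_name"], objs[0]["value"])  # Albania 3
```

The returned list has the same keys as the `objs` list built by hand, so it should be usable with `evaluation.add_result` in the same way.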


## Task 5
Find the BGP for the Ballon d'Or award

In [46]:
# football players properties
queryString = """
SELECT DISTINCT ?prop ?name
WHERE { 

?player wdt:P106 wd:Q937857;
        ?prop ?obj.
FILTER(!isLiteral(?obj)).
?prop sc:name ?name .
}
"""

print("Results")
x = run_query(queryString)

Results
[('prop', 'http://www.wikidata.org/prop/direct/P103'), ('name', 'native language')]
[('prop', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('prop', 'http://www.wikidata.org/prop/direct/P108'), ('name', 'employer')]
[('prop', 'http://www.wikidata.org/prop/direct/P118'), ('name', 'league')]
[('prop', 'http://www.wikidata.org/prop/direct/P1343'), ('name', 'described by source')]
[('prop', 'http://www.wikidata.org/prop/direct/P1344'), ('name', 'participant in')]
[('prop', 'http://www.wikidata.org/prop/direct/P140'), ('name', 'religion')]
[('prop', 'http://www.wikidata.org/prop/direct/P1412'), ('name', 'languages spoken, written or signed')]
[('prop', 'http://www.wikidata.org/prop/direct/P1532'), ('name', 'country for sport')]
[('prop', 'http://www.wikidata.org/prop/direct/P166'), ('name', 'award received')]
[('prop', 'http://www.wikidata.org/prop/direct/P18'), ('name', 'image')]
[('prop', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different fro

In [47]:
# see the objects of the property award received (P166)
queryString = """
SELECT DISTINCT ?obj ?name
WHERE { 

?player wdt:P106 wd:Q937857;
        wdt:P166 ?obj.
?obj sc:name ?name .
FILTER(REGEX(?name,"all")).
}
"""

print("Results")
x = run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q1998642'), ('name', 'Belgian professional football awards')]
[('obj', 'http://www.wikidata.org/entity/Q1478322'), ('name', 'Footballer of the month')]
[('obj', 'http://www.wikidata.org/entity/Q311830'), ('name', 'Footballer of the Year')]
[('obj', 'http://www.wikidata.org/entity/Q1323117'), ('name', 'English Football Hall of Fame')]
[('obj', 'http://www.wikidata.org/entity/Q4835326'), ('name', 'BC Sports Hall of Fame')]
[('obj', 'http://www.wikidata.org/entity/Q11901935'), ('name', "Finnish FA's Women's Footballer of the Year")]
[('obj', 'http://www.wikidata.org/entity/Q17355204'), ('name', 'World Cup Golden Ball')]
[('obj', 'http://www.wikidata.org/entity/Q166177'), ('name', "Ballon d'Or")]
[('obj', 'http://www.wikidata.org/entity/Q373232'), ('name', 'UEFA Club Football Awards')]
[('obj', 'http://www.wikidata.org/entity/Q1187699'), ('name', 'UEFA Club Footballer of the Year')]
[('obj', 'http://www.wikidata.org/entity/Q1478325'), ('name

In [11]:
### insert the result of TASK 5 in the file
og_uri = "http://www.wikidata.org/entity/Q166177"
og_name = "Ballon d'Or"
obj = {"uri":og_uri,"name":og_name}
evaluation.add_result(evaluation.get_index_workflow(pt),"5", evaluation.TYPE_SINGLE ,"uri", [obj] ,"all")

The index of this workflow is: 0_6
The path is /locale/data/jupyter/prando/notebook/2022/results/workflow0_6.json
JSON object updated


`wd:Q166177` is the Ballon d'Or

## Task 6
Find the countries that have at least two players of their nationality who won the Ballon d'Or

In [12]:
# find the countries with at least two distinct players who won the Ballon d'Or
queryString = """
SELECT ?country ?name (COUNT(DISTINCT ?people) AS ?won)
WHERE { 

?people wdt:P27 ?country;
        wdt:P106 wd:Q937857;
        wdt:P166 wd:Q166177.

?country sc:name ?name
}
GROUP BY ?country ?name
HAVING (COUNT(DISTINCT ?people) > 1)
ORDER BY DESC (?won)
"""

print("Results")
x = run_query(queryString)

Results
[('country', 'http://www.wikidata.org/entity/Q145'), ('name', 'United Kingdom'), ('won', '6')]
[('country', 'http://www.wikidata.org/entity/Q38'), ('name', 'Italy'), ('won', '6')]
[('country', 'http://www.wikidata.org/entity/Q183'), ('name', 'Germany'), ('won', '5')]
[('country', 'http://www.wikidata.org/entity/Q29'), ('name', 'Spain'), ('won', '4')]
[('country', 'http://www.wikidata.org/entity/Q155'), ('name', 'Brazil'), ('won', '4')]
[('country', 'http://www.wikidata.org/entity/Q142'), ('name', 'France'), ('won', '4')]
[('country', 'http://www.wikidata.org/entity/Q212'), ('name', 'Ukraine'), ('won', '3')]
[('country', 'http://www.wikidata.org/entity/Q45'), ('name', 'Portugal'), ('won', '3')]
[('country', 'http://www.wikidata.org/entity/Q414'), ('name', 'Argentina'), ('won', '3')]
[('country', 'http://www.wikidata.org/entity/Q15180'), ('name', 'Soviet Union'), ('won', '3')]
[('country', 'http://www.wikidata.org/entity/Q29999'), ('name', 'Kingdom of the Netherlands'), ('won', '

In [13]:
objs = []
for i in x:
    f_uri = i[0][1]
    f_name = i[1][1]
    n_players = i[2][1]  # number of distinct Ballon d'Or winners of this country
    obj = {}
    obj["refers_to"] = f_uri
    obj["refers_to_name"] = f_name
    obj["check"] = "value"
    obj["value"]= n_players
    objs.append(obj)
evaluation.add_result(evaluation.get_index_workflow(pt),"6", evaluation.TYPE_REFERRED ,"value", objs)

The index of this workflow is: 0_6
The path is /locale/data/jupyter/prando/notebook/2022/results/workflow0_6.json
JSON object updated
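
Task 6 asks for *different* players per country. As a cross-check, the same grouping can be reproduced client-side from `(player, country)` pairs. The sketch below keeps one set of players per country, which mirrors what `COUNT(DISTINCT ...)` computes in SPARQL. `countries_with_min_players` is a hypothetical helper and the identifiers in the example are made up.

```python
from collections import defaultdict


def countries_with_min_players(pairs, minimum=2):
    """Count distinct players per country and keep countries reaching `minimum`.

    `pairs` is an iterable of (player_iri, country_iri) tuples. A set per
    country ensures that a repeated pair is counted only once.
    """
    players_by_country = defaultdict(set)
    for player, country in pairs:
        players_by_country[country].add(player)
    return {country: len(players)
            for country, players in players_by_country.items()
            if len(players) >= minimum}


# Made-up identifiers, for illustration only
pairs = [("player:A", "country:X"), ("player:A", "country:X"),
         ("player:B", "country:X"), ("player:C", "country:Y")]
print(countries_with_min_players(pairs))  # {'country:X': 2}
```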


## Task 7
Identify the BGP of the FIFA World Cup

In [41]:
# print properties of the players that are not literals
queryString = """
SELECT DISTINCT ?prop ?name
WHERE { 

?player wdt:P106 wd:Q937857;
        ?prop ?x.
FILTER(!isLiteral(?x)).
?prop sc:name ?name .
}
"""

print("Results")
x = run_query(queryString)

Results
[('prop', 'http://www.wikidata.org/prop/direct/P103'), ('name', 'native language')]
[('prop', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('prop', 'http://www.wikidata.org/prop/direct/P108'), ('name', 'employer')]
[('prop', 'http://www.wikidata.org/prop/direct/P118'), ('name', 'league')]
[('prop', 'http://www.wikidata.org/prop/direct/P1343'), ('name', 'described by source')]
[('prop', 'http://www.wikidata.org/prop/direct/P1344'), ('name', 'participant in')]
[('prop', 'http://www.wikidata.org/prop/direct/P140'), ('name', 'religion')]
[('prop', 'http://www.wikidata.org/prop/direct/P1412'), ('name', 'languages spoken, written or signed')]
[('prop', 'http://www.wikidata.org/prop/direct/P1532'), ('name', 'country for sport')]
[('prop', 'http://www.wikidata.org/prop/direct/P166'), ('name', 'award received')]
[('prop', 'http://www.wikidata.org/prop/direct/P18'), ('name', 'image')]
[('prop', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different fro

In [42]:
# print the objects of the property participant in (P1344)
queryString = """
SELECT DISTINCT ?x ?name
WHERE { 

?player wdt:P106 wd:Q937857;
        wdt:P1344 ?x.
?x sc:name ?name .
}
LIMIT 20
"""

print("Results")
x = run_query(queryString)

Results
[('x', 'http://www.wikidata.org/entity/Q104439579'), ('name', '2020 Belarusian Premier League relegation play-offs')]
[('x', 'http://www.wikidata.org/entity/Q77927099'), ('name', '2020 Belarusian Premier League')]
[('x', 'http://www.wikidata.org/entity/Q11831154'), ('name', '2019 Africa Cup of Nations')]
[('x', 'http://www.wikidata.org/entity/Q11865693'), ('name', 'football at the 2016 Summer Olympics')]
[('x', 'http://www.wikidata.org/entity/Q8150'), ('name', '1936 Summer Olympics')]
[('x', 'http://www.wikidata.org/entity/Q48432'), ('name', '1930 FIFA World Cup')]
[('x', 'http://www.wikidata.org/entity/Q18479838'), ('name', "football at the 2016 Summer Olympics – women's tournament")]
[('x', 'http://www.wikidata.org/entity/Q62589205'), ('name', "football at the 2020 Summer Olympics – women's tournament")]
[('x', 'http://www.wikidata.org/entity/Q102920'), ('name', 'UEFA Euro 2004')]
[('x', 'http://www.wikidata.org/entity/Q1185431'), ('name', '2015 Pan American Games')]
[('x', '

Many of these objects seem to belong to a superclass or to a broader competition. Starting point: `wd:Q37285`

In [43]:
# print the properties of wd:Q37285
queryString = """
SELECT DISTINCT ?prop ?name
WHERE { 

wd:Q37285 ?prop ?x.
?prop sc:name ?name .
}
"""

print("Results")
x = run_query(queryString)

Results
[('prop', 'http://www.wikidata.org/prop/direct/P1132'), ('name', 'number of participants')]
[('prop', 'http://www.wikidata.org/prop/direct/P1151'), ('name', "topic's main Wikimedia portal")]
[('prop', 'http://www.wikidata.org/prop/direct/P1346'), ('name', 'winner')]
[('prop', 'http://www.wikidata.org/prop/direct/P1350'), ('name', 'number of matches played/races/starts')]
[('prop', 'http://www.wikidata.org/prop/direct/P1424'), ('name', "topic's main template")]
[('prop', 'http://www.wikidata.org/prop/direct/P1546'), ('name', 'motto')]
[('prop', 'http://www.wikidata.org/prop/direct/P17'), ('name', 'country')]
[('prop', 'http://www.wikidata.org/prop/direct/P18'), ('name', 'image')]
[('prop', 'http://www.wikidata.org/prop/direct/P2094'), ('name', 'competition class')]
[('prop', 'http://www.wikidata.org/prop/direct/P214'), ('name', 'VIAF ID')]
[('prop', 'http://www.wikidata.org/prop/direct/P227'), ('name', 'GND ID')]
[('prop', 'http://www.wikidata.org/prop/direct/P244'), ('name', 'L

In [44]:
# use the property sports season of league or competition (P3450)
queryString = """
SELECT DISTINCT ?x ?name
WHERE { 

wd:Q37285 wdt:P3450 ?x.
?x sc:name ?name .
}
"""

print("Results")
x = run_query(queryString)

Results
[('x', 'http://www.wikidata.org/entity/Q19317'), ('name', 'FIFA World Cup')]
1


In [14]:
### insert the result of TASK 7 in the file
og_uri = "http://www.wikidata.org/entity/Q19317"
og_name = "FIFA World Cup"
obj = {"uri":og_uri,"name":og_name}
evaluation.add_result(evaluation.get_index_workflow(pt),"7", evaluation.TYPE_SINGLE ,"uri", [obj] ,"all")

The index of this workflow is: 0_6
The path is /locale/data/jupyter/prando/notebook/2022/results/workflow0_6.json
JSON object updated


`wd:Q19317` is the FIFA World Cup 

## Task 8
Which players participated in more than 4 FIFA World Cup editions?

In [15]:
# count the World Cup editions each player took part in and keep those with more than 4
queryString = """
SELECT DISTINCT ?player ?name (COUNT(DISTINCT ?wc) AS ?wcs)
WHERE { 

?player wdt:P106 wd:Q937857;
        wdt:P1344 ?wc.
?wc wdt:P3450 wd:Q19317.
?player sc:name ?name
}
GROUP BY ?player ?name
HAVING (COUNT(DISTINCT ?wc) > 4)
ORDER BY DESC (?wcs)
"""

print("Results")
x = run_query(queryString)

Results
[('player', 'http://www.wikidata.org/entity/Q186330'), ('name', 'Rafael Márquez'), ('wcs', '5')]
[('player', 'http://www.wikidata.org/entity/Q68060'), ('name', 'Gianluigi Buffon'), ('wcs', '5')]
[('player', 'http://www.wikidata.org/entity/Q265428'), ('name', 'Antonio Carbajal'), ('wcs', '5')]
[('player', 'http://www.wikidata.org/entity/Q36386'), ('name', 'Lothar Matthäus'), ('wcs', '5')]
4


In [16]:
objs = []
for i in x:
    f_uri = i[0][1]
    f_name = i[1][1]
    participations = i[2][1]  # number of World Cup editions played
    obj = {}
    obj["refers_to"] = f_uri
    obj["refers_to_name"] = f_name
    obj["check"] = "value"
    obj["value"]= participations
    objs.append(obj)
evaluation.add_result(evaluation.get_index_workflow(pt),"8", evaluation.TYPE_REFERRED ,"value", objs)

The index of this workflow is: 0_6
The path is /locale/data/jupyter/prando/notebook/2022/results/workflow0_6.json
JSON object updated


## Task 9
Consider the countries that won the FIFA World Cup at least twice and find their players who won the Ballon d'Or

In [72]:
# verify connections between countries and FIFA World Cup
queryString = """
SELECT DISTINCT ?prop ?name
WHERE { 

?nation wdt:P31 wd:Q6256.
?wc wdt:P3450 wd:Q19317.
?nation ?prop ?wc.

?prop sc:name ?name.
}

"""

print("Results")
x = run_query(queryString)

Results
[('prop', 'http://www.wikidata.org/prop/direct/P793'), ('name', 'significant event')]
1


In [77]:
# start from the FIFA World Cup editions and look at their properties to find something related to the winner
queryString = """
SELECT DISTINCT ?prop ?name
WHERE { 

?wc wdt:P3450 wd:Q19317;
    ?prop ?x.
?prop sc:name ?name.
}

"""

print("Results")
x = run_query(queryString)

Results
[('prop', 'http://www.wikidata.org/prop/direct/P4967'), ('name', 'prerequisite')]
[('prop', 'http://www.wikidata.org/prop/direct/P1132'), ('name', 'number of participants')]
[('prop', 'http://www.wikidata.org/prop/direct/P1151'), ('name', "topic's main Wikimedia portal")]
[('prop', 'http://www.wikidata.org/prop/direct/P131'), ('name', 'located in the administrative territorial entity')]
[('prop', 'http://www.wikidata.org/prop/direct/P1346'), ('name', 'winner')]
[('prop', 'http://www.wikidata.org/prop/direct/P1350'), ('name', 'number of matches played/races/starts')]
[('prop', 'http://www.wikidata.org/prop/direct/P1424'), ('name', "topic's main template")]
[('prop', 'http://www.wikidata.org/prop/direct/P154'), ('name', 'logo image')]
[('prop', 'http://www.wikidata.org/prop/direct/P1546'), ('name', 'motto')]
[('prop', 'http://www.wikidata.org/prop/direct/P155'), ('name', 'follows')]
[('prop', 'http://www.wikidata.org/prop/direct/P1617'), ('name', 'BBC Things ID')]
[('prop', 'http

In [80]:
# use the property winner (P1346) and print some results to understand
queryString = """
SELECT DISTINCT ?cup ?x ?name
WHERE { 

?wc wdt:P3450 wd:Q19317;
    wdt:P1346 ?x.
?x sc:name ?name.
?wc sc:name ?cup.
}
ORDER BY ?cup

"""

print("Results")
x = run_query(queryString)

Results
[('cup', '1930 FIFA World Cup'), ('x', 'http://www.wikidata.org/entity/Q134916'), ('name', 'Uruguay national football team')]
[('cup', '1934 FIFA World Cup'), ('x', 'http://www.wikidata.org/entity/Q676899'), ('name', 'Italy national association football team')]
[('cup', '1938 FIFA World Cup'), ('x', 'http://www.wikidata.org/entity/Q676899'), ('name', 'Italy national association football team')]
[('cup', '1950 FIFA World Cup'), ('x', 'http://www.wikidata.org/entity/Q134916'), ('name', 'Uruguay national football team')]
[('cup', '1954 FIFA World Cup'), ('x', 'http://www.wikidata.org/entity/Q43310'), ('name', 'Germany national association football team')]
[('cup', '1958 FIFA World Cup'), ('x', 'http://www.wikidata.org/entity/Q83459'), ('name', 'Brazil national football team')]
[('cup', '1962 FIFA World Cup'), ('x', 'http://www.wikidata.org/entity/Q83459'), ('name', 'Brazil national football team')]
[('cup', '1966 FIFA World Cup'), ('x', 'http://www.wikidata.org/entity/Q47762'), ('

The winners are national football teams. Next, check how these teams are related to their countries.

In [81]:
# find the properties that connect the winning teams to a country (Q6256)
queryString = """
SELECT DISTINCT ?prop ?name
WHERE { 

?wc wdt:P3450 wd:Q19317;
    wdt:P1346 ?x.
    
?country wdt:P31 wd:Q6256.
?x ?prop ?country.
    
?prop sc:name ?name.
}

"""

print("Results")
x = run_query(queryString)

Results
[('prop', 'http://www.wikidata.org/prop/direct/P1532'), ('name', 'country for sport')]
[('prop', 'http://www.wikidata.org/prop/direct/P17'), ('name', 'country')]
2


In [82]:
# use the property country (P17) and print some results to check the team-to-country mapping
queryString = """
SELECT DISTINCT ?x ?team ?country ?cName
WHERE { 

?wc wdt:P3450 wd:Q19317;
    wdt:P1346 ?x.
    
?country wdt:P31 wd:Q6256.
?x wdt:P17 ?country.
    
?country sc:name ?cName.
?x sc:name ?team.
}

"""

print("Results")
x = run_query(queryString)

Results
[('x', 'http://www.wikidata.org/entity/Q47774'), ('team', 'France national association football team'), ('country', 'http://www.wikidata.org/entity/Q142'), ('cName', 'France')]
[('x', 'http://www.wikidata.org/entity/Q47762'), ('team', 'England national association football team'), ('country', 'http://www.wikidata.org/entity/Q145'), ('cName', 'United Kingdom')]
[('x', 'http://www.wikidata.org/entity/Q83459'), ('team', 'Brazil national football team'), ('country', 'http://www.wikidata.org/entity/Q155'), ('cName', 'Brazil')]
[('x', 'http://www.wikidata.org/entity/Q676899'), ('team', 'Italy national association football team'), ('country', 'http://www.wikidata.org/entity/Q38'), ('cName', 'Italy')]
[('x', 'http://www.wikidata.org/entity/Q79800'), ('team', 'Argentina national football team'), ('country', 'http://www.wikidata.org/entity/Q414'), ('cName', 'Argentina')]
[('x', 'http://www.wikidata.org/entity/Q134916'), ('team', 'Uruguay national football team'), ('country', 'http://www.

In [11]:
# print each country with the number of FIFA World Cups won
queryString = """
SELECT ?country ?cName (COUNT(?wc) AS ?wcs)
WHERE { 

?wc wdt:P3450 wd:Q19317;
    wdt:P1346 ?x.
    
?x wdt:P17 ?country.
    
?country sc:name ?cName.
}
GROUP BY ?country ?cName

"""

print("Results")
x = run_query(queryString)

Results
[('country', 'http://www.wikidata.org/entity/Q38'), ('cName', 'Italy'), ('wcs', '4')]
[('country', 'http://www.wikidata.org/entity/Q183'), ('cName', 'Germany'), ('wcs', '4')]
[('country', 'http://www.wikidata.org/entity/Q155'), ('cName', 'Brazil'), ('wcs', '5')]
[('country', 'http://www.wikidata.org/entity/Q29'), ('cName', 'Spain'), ('wcs', '1')]
[('country', 'http://www.wikidata.org/entity/Q414'), ('cName', 'Argentina'), ('wcs', '2')]
[('country', 'http://www.wikidata.org/entity/Q142'), ('cName', 'France'), ('wcs', '2')]
[('country', 'http://www.wikidata.org/entity/Q145'), ('cName', 'United Kingdom'), ('wcs', '1')]
[('country', 'http://www.wikidata.org/entity/Q77'), ('cName', 'Uruguay'), ('wcs', '2')]
8
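
The task only concerns countries with at least two wins, so the counts above still need to be filtered. A minimal sketch of that filter in plain Python is shown here, assuming rows in the shape printed by `run_query` (one list of `(variable, value)` pairs per result). The rows are copied from the output above, and the helper name `at_least` is hypothetical.

```python
# Rows in the shape printed above: one list of (variable, value) pairs per result.
# Values are copied from the output of the previous cell; counts arrive as strings.
rows = [
    [("country", "http://www.wikidata.org/entity/Q38"), ("cName", "Italy"), ("wcs", "4")],
    [("country", "http://www.wikidata.org/entity/Q183"), ("cName", "Germany"), ("wcs", "4")],
    [("country", "http://www.wikidata.org/entity/Q155"), ("cName", "Brazil"), ("wcs", "5")],
    [("country", "http://www.wikidata.org/entity/Q29"), ("cName", "Spain"), ("wcs", "1")],
    [("country", "http://www.wikidata.org/entity/Q414"), ("cName", "Argentina"), ("wcs", "2")],
    [("country", "http://www.wikidata.org/entity/Q142"), ("cName", "France"), ("wcs", "2")],
    [("country", "http://www.wikidata.org/entity/Q145"), ("cName", "United Kingdom"), ("wcs", "1")],
    [("country", "http://www.wikidata.org/entity/Q77"), ("cName", "Uruguay"), ("wcs", "2")],
]


def at_least(rows, minimum=2):
    """Return {country name: wins} for countries with at least `minimum` wins."""
    result = {}
    for row in rows:
        record = dict(row)          # look up values by variable name, not position
        wins = int(record["wcs"])   # the endpoint returns the count as a string
        if wins >= minimum:
            result[record["cName"]] = wins
    return result


multi_winners = at_least(rows)
print(sorted(multi_winners))
```

The same restriction can be expressed inside SPARQL with `HAVING`, which avoids moving the full result set into Python.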


In [19]:
# list the football players who received the Ballon d'Or and are citizens of a country that won the FIFA World Cup at least twice
queryString = """
SELECT DISTINCT ?player ?name
WHERE{
    {
        SELECT ?country 
        WHERE { 
            ?wc wdt:P3450 wd:Q19317;
                wdt:P1346 ?x.
            ?x wdt:P17 ?country.
        }
        GROUP BY ?country
        HAVING (COUNT(DISTINCT ?wc) > 1)
    }
    
    ?player wdt:P106 wd:Q937857;
        wdt:P166 wd:Q166177;
        wdt:P27 ?country.
    ?player sc:name ?name.
}

"""

print("Results")
x = run_query(queryString)

Results
[('player', 'http://www.wikidata.org/entity/Q4457'), ('name', 'Franz Beckenbauer')]
[('player', 'http://www.wikidata.org/entity/Q4261'), ('name', 'Michel Platini')]
[('player', 'http://www.wikidata.org/entity/Q39444'), ('name', 'Ronaldinho')]
[('player', 'http://www.wikidata.org/entity/Q531814'), ('name', 'Kaká')]
[('player', 'http://www.wikidata.org/entity/Q13494'), ('name', 'Jean-Pierre Papin')]
[('player', 'http://www.wikidata.org/entity/Q102027'), ('name', 'Fabio Cannavaro')]
[('player', 'http://www.wikidata.org/entity/Q36386'), ('name', 'Lothar Matthäus')]
[('player', 'http://www.wikidata.org/entity/Q123118'), ('name', 'Raymond Kopa')]
[('player', 'http://www.wikidata.org/entity/Q43744'), ('name', 'Gerd Müller')]
[('player', 'http://www.wikidata.org/entity/Q1835'), ('name', 'Zinedine Zidane')]
[('player', 'http://www.wikidata.org/entity/Q230046'), ('name', 'Gianni Rivera')]
[('player', 'http://www.wikidata.org/entity/Q220700'), ('name', 'Omar Sívori')]
[('player', 'http://
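
The task statement asks how many players each country has, while the query above returns the players themselves. A possible variant that aggregates per country is sketched below. It reuses only the classes and properties already present in the query above; the variable names `count_query`, `?cName`, and `?players` are illustrative, and the query has not been run against the endpoint here.

```python
# Sketch only: same graph pattern as the query above, aggregated per country.
count_query = """
SELECT ?country ?cName (COUNT(DISTINCT ?player) AS ?players)
WHERE {
    {
        # countries whose national team won more than one FIFA World Cup
        SELECT ?country
        WHERE {
            ?wc wdt:P3450 wd:Q19317;
                wdt:P1346 ?x.
            ?x wdt:P17 ?country.
        }
        GROUP BY ?country
        HAVING (COUNT(DISTINCT ?wc) > 1)
    }

    ?player wdt:P106 wd:Q937857;
        wdt:P166 wd:Q166177;
        wdt:P27 ?country.
    ?country sc:name ?cName.
}
GROUP BY ?country ?cName
ORDER BY DESC(?players)
"""

# In the notebook this would be executed with the helper from the setup cell:
# x = run_query(count_query)
```

Counting with `DISTINCT` guards against a player being counted more than once when several bindings match the same player and country.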

In [20]:
obj = [{"uri":r[0][1],"name":r[1][1]} for r in x]
evaluation.add_result(evaluation.get_index_workflow(pt),"9", evaluation.TYPE_SET ,"uri", obj)

The index of this workflow is: 0_6
The path is /locale/data/jupyter/prando/notebook/2022/results/workflow0_6.json
JSON object updated
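
The conversion above reads each row by position (`r[0][1]`, `r[1][1]`), which works as long as the bindings arrive in the order `player`, `name`. A slightly more defensive sketch looks the values up by variable name instead. The helper `rows_to_objects` is hypothetical, and the two sample rows are copied from the output of the previous query.

```python
# Sample rows in the shape produced by run_query (copied from the output above).
rows = [
    [("player", "http://www.wikidata.org/entity/Q4457"), ("name", "Franz Beckenbauer")],
    [("player", "http://www.wikidata.org/entity/Q4261"), ("name", "Michel Platini")],
]


def rows_to_objects(rows, uri_var="player", name_var="name"):
    """Build the {"uri": ..., "name": ...} objects by variable name, not position."""
    objects = []
    for row in rows:
        record = dict(row)  # [(var, value), ...] -> {var: value}
        objects.append({"uri": record[uri_var], "name": record[name_var]})
    return objects


obj = rows_to_objects(rows)
print(obj[0])
```

The resulting `obj` has the same structure as the list built above, so it could be passed to `evaluation.add_result` unchanged.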
