# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload usually starts with a vague, open-ended question, and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP that returns the human-readable name of a property or a class in Wikidata.
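
As an illustration of the BGP above, the following minimal sketch builds a query string that looks up the name of a single resource. The helper name `label_query` is hypothetical and not part of the notebook; the full IRI is used so that the string does not depend on any prefix declaration.

```python
def label_query(iri):
    """Return a SPARQL query string asking for the schema.org name of one IRI.

    `label_query` is a hypothetical helper introduced only for illustration.
    """
    return f"""
SELECT ?name
WHERE {{
    <{iri}> <http://schema.org/name> ?name .
}}
"""


# wdt:P31 ("instance of") written as a full IRI
query = label_query("http://www.wikidata.org/prop/direct/P31")
print(query)
```

The resulting string can be passed to the `run_query` function defined in the setup cell.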

In [2]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-movie2-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# SELECT and CONSTRUCT queries
def run_query(queryString, verbose=True):
    """Run a query; return one list of (variable, value) tuples per result row."""
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try:
        results = sparql.query()
        json_results = results.convert()
        if len(json_results['results']['bindings']) == 0:
            print("Empty")
            return []
        array = []
        for bindings in json_results['results']['bindings']:
            # Keep only the plain value of each bound variable
            app = [(var, value['value']) for var, value in bindings.items()]
            if verbose:
                print(app)
            array.append(app)
        if verbose:
            print(len(array))
        return array

    except Exception as e:
        print("The operation failed", e)
        # Return an empty list so that callers always receive a list
        return []


# ASK queries
def run_ask_query(queryString):
    """Run an ASK query; return the converted JSON response, or None on failure."""
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try:
        return sparql.query().convert()

    except Exception as e:
        print("The operation failed", e)
        return None
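
`run_query` returns each result row as a list of `(variable, value)` tuples. The sketch below shows one possible way to turn such rows into IRI-label pairs. The helper names `rows_to_dicts` and `iri_label_pairs` are hypothetical, and the sample row is copied from the output of the `wdt:P31` query in Task 1, so the snippet runs without contacting the endpoint.

```python
def rows_to_dicts(rows):
    """Turn each row (a list of (variable, value) tuples) into a dict."""
    return [dict(row) for row in rows]


def iri_label_pairs(rows, iri_var, label_var):
    """Extract (IRI, label) pairs from rows shaped like the output of run_query."""
    return [(d[iri_var], d[label_var]) for d in rows_to_dicts(rows)]


# One row in the shape produced by run_query (taken from Task 1)
sample = [
    [('obj', 'http://www.wikidata.org/entity/Q29168811'),
     ('name', 'animated feature film')],
]

pairs = iri_label_pairs(sample, 'obj', 'name')
print(pairs)
```

Keeping the rows as tuples, as the setup code does, preserves the order of the variables; converting to dicts only makes access by variable name more convenient.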

# Movie Workflow Series ("Disney explorative search") 

Consider the following exploratory information need:

> investigate films produced by Disney and compare workers among films.

## Useful URIs for the current workflow
The following are given:

| IRI         | Description        | Role      |
| ----------- | ------------------ | --------- |
| `wdt:P1647` | subproperty        | predicate |
| `wdt:P31`   | instance of        | predicate |
| `wdt:P106`  | profession         | predicate |
| `wdt:P279`  | subclass           | predicate |
| `wdt:P27`   | nationality        | predicate |
| `wdt:P3342` | Significant person | predicate |
| `wd:Q5`     | Human              | node      |
| `wd:Q36479` | The Lion King      | node      |





Also consider

```
wd:Q36479 ?p ?obj .
```

is the BGP that retrieves all **properties of The Lion King**.

Please note that whenever you return a resource, you should return both its IRI and its label. In particular, when a task requires you to identify a BGP, the result set must always be a list of IRI-label pairs.

The workload should:


1. Identify the BGP for films

2. Identify the BGP for Walt Disney Animation Studios

3. Identify the BGP for Walt Disney (person)

4. List all the films created by Walt Disney Animation Studios while Walt Disney was alive (the result set must be a list of pairs of film IRI and label).

5. Find the voice actors who participated in more than 5 films by Walt Disney Animation Studios (the result set must be a list of triples of voice actor IRI, label, and number of films).

6. Identify the BGP for the film "The Little Mermaid"

7. Find who worked on both "The Little Mermaid" and "The Lion King" (the result set must be a list of pairs of worker IRI and label).

8. Identify the BGP for Academy Awards

9. For each film created by Walt Disney Animation Studios, return the number of Academy Award nominations. Return only films with at least one nomination (the result set must be a list of triples of film IRI, label, and number of nominations).

## Task 1

In [4]:
queryString = """
SELECT DISTINCT ?p ?name
WHERE { 
    wd:Q36479 ?p ?obj .
    FILTER(!isLiteral(?obj)).
    ?p sc:name ?name.
} 
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P3092'), ('name', 'film crew member')]
[('p', 'http://www.wikidata.org/prop/direct/P1040'), ('name', 'film editor')]
[('p', 'http://www.wikidata.org/prop/direct/P136'), ('name', 'genre')]
[('p', 'http://www.wikidata.org/prop/direct/P1411'), ('name', 'nominated for')]
[('p', 'http://www.wikidata.org/prop/direct/P1434'), ('name', 'takes place in fictional universe')]
[('p', 'http://www.wikidata.org/prop/direct/P144'), ('name', 'based on')]
[('p', 'http://www.wikidata.org/prop/direct/P154'), ('name', 'logo image')]
[('p', 'http://www.wikidata.org/prop/direct/P1552'), ('name', 'has quality')]
[('p', 'http://www.wikidata.org/prop/direct/P156'), ('name', 'followed by')]
[('p', 'http://www.wikidata.org/prop/direct/P162'), ('name', 'producer')]
[('p', 'http://www.wikidata.org/prop/direct/P1657'), ('name', 'MPAA film rating')]
[('p', 'http://www.wikidata.org/prop/direct/P166'), ('name', 'award received')]
[('p', 'http://www.wikidata.org/prop/d

In [5]:
queryString = """
SELECT DISTINCT ?obj ?name
WHERE { 
    wd:Q36479 wdt:P31 ?obj .
    ?obj sc:name ?name.
} 
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q29168811'), ('name', 'animated feature film')]
1


In [7]:
queryString = """
SELECT DISTINCT ?obj ?name
WHERE { 
    wd:Q36479 wdt:P31/wdt:P279+ ?obj .
    ?obj sc:name ?name.
} 
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q15621286'), ('name', 'intellectual work')]
[('obj', 'http://www.wikidata.org/entity/Q20937557'), ('name', 'series')]
[('obj', 'http://www.wikidata.org/entity/Q2150504'), ('name', 'signal')]
[('obj', 'http://www.wikidata.org/entity/Q35120'), ('name', 'entity')]
[('obj', 'http://www.wikidata.org/entity/Q2995644'), ('name', 'result')]
[('obj', 'http://www.wikidata.org/entity/Q1150771'), ('name', 'output')]
[('obj', 'http://www.wikidata.org/entity/Q11028'), ('name', 'information')]
[('obj', 'http://www.wikidata.org/entity/Q11424'), ('name', 'film')]
[('obj', 'http://www.wikidata.org/entity/Q17537576'), ('name', 'creative work')]
[('obj', 'http://www.wikidata.org/entity/Q386724'), ('name', 'work')]
[('obj', 'http://www.wikidata.org/entity/Q42848'), ('name', 'data')]
[('obj', 'http://www.wikidata.org/entity/Q202866'), ('name', 'animated film')]
[('obj', 'http://www.wikidata.org/entity/Q174984'), ('name', 'signal')]
[('obj', 'http://www.wikida

`wd:Q11424` is the class of film
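
Given this class, one possible way to write the BGP for films is sketched below. It assumes that instances of subclasses (such as `animated feature film`) should also count as films, hence the `wdt:P279*` path. The function name `films_query` and the `LIMIT` clause are illustrative choices, not part of the original workload.

```python
FILM_CLASS = "wd:Q11424"  # class "film", found with the queries above


def films_query(limit=10):
    """Build a query listing films (direct or via subclasses) with their labels.

    `films_query` is a hypothetical helper; the LIMIT only keeps the output short.
    """
    return f"""
SELECT DISTINCT ?film ?name
WHERE {{
    ?film wdt:P31/wdt:P279* {FILM_CLASS} .
    ?film sc:name ?name .
}}
LIMIT {limit}
"""


queryString = films_query(limit=5)
print(queryString)
```

The string relies on the `wd:`, `wdt:` and `sc:` prefixes that `run_query` prepends, so it is meant to be executed with `run_query(queryString)`.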

## Task 2

In [12]:
queryString = """
SELECT DISTINCT ?p ?name ?obj ?nO
WHERE { 
    wd:Q36479 ?p ?obj .
    FILTER(!isLiteral(?obj)).
    ?obj sc:name ?nO.
    FILTER(REGEX(?nO,"[Dd]isney")).
    ?p sc:name ?name.
} 
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P179'), ('name', 'part of the series'), ('obj', 'http://www.wikidata.org/entity/Q56070713'), ('nO', 'Walt Disney Animation Studios film')]
[('p', 'http://www.wikidata.org/prop/direct/P272'), ('name', 'production company'), ('obj', 'http://www.wikidata.org/entity/Q1047410'), ('nO', 'Walt Disney Animation Studios')]
[('p', 'http://www.wikidata.org/prop/direct/P272'), ('name', 'production company'), ('obj', 'http://www.wikidata.org/entity/Q191224'), ('nO', 'Walt Disney Pictures')]
[('p', 'http://www.wikidata.org/prop/direct/P750'), ('name', 'distributed by'), ('obj', 'http://www.wikidata.org/entity/Q1323594'), ('nO', 'Walt Disney Studios Motion Pictures')]
[('p', 'http://www.wikidata.org/prop/direct/P750'), ('name', 'distributed by'), ('obj', 'http://www.wikidata.org/entity/Q54958752'), ('nO', 'Disney+')]
5


`wd:Q1047410` is the Walt Disney Animation Studios

## Task 3

In [13]:
queryString = """
SELECT DISTINCT ?p ?name ?obj ?nO
WHERE { 
    wd:Q1047410 ?p ?obj .
    FILTER(!isLiteral(?obj)).
    ?obj sc:name ?nO.
    FILTER(REGEX(?nO,"[Dd]isney")).
    ?p sc:name ?name.
} 
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P112'), ('name', 'founded by'), ('obj', 'http://www.wikidata.org/entity/Q347432'), ('nO', 'Roy O. Disney')]
[('p', 'http://www.wikidata.org/prop/direct/P112'), ('name', 'founded by'), ('obj', 'http://www.wikidata.org/entity/Q8704'), ('nO', 'Walt Disney')]
[('p', 'http://www.wikidata.org/prop/direct/P127'), ('name', 'owned by'), ('obj', 'http://www.wikidata.org/entity/Q7414'), ('nO', 'The Walt Disney Company')]
[('p', 'http://www.wikidata.org/prop/direct/P127'), ('name', 'owned by'), ('obj', 'http://www.wikidata.org/entity/Q191224'), ('nO', 'Walt Disney Pictures')]
[('p', 'http://www.wikidata.org/prop/direct/P749'), ('name', 'parent organization'), ('obj', 'http://www.wikidata.org/entity/Q2216535'), ('nO', 'Walt Disney Studios')]
[('p', 'http://www.wikidata.org/prop/direct/P910'), ('name', "topic's main category"), ('obj', 'http://www.wikidata.org/entity/Q25088476'), ('nO', 'Category:Walt Disney Animation Studios')]
6


`wd:Q8704` is Walt Disney

## Task 4

In [18]:
queryString = """
SELECT DISTINCT ?p ?name
WHERE { 
    wd:Q36479 ?p ?obj .
    ?p sc:name ?name.
    FILTER(REGEX(?name,"[Dd]ate")).
} 
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P577'), ('name', 'publication date')]
1


In [17]:
queryString = """
SELECT DISTINCT ?prop ?name
WHERE { 
    wd:Q8704 ?prop ?obj .
    ?prop sc:name ?name.
    FILTER(REGEX(?name,"[Dd]ate")).
} 
"""

print("Results")
x=run_query(queryString)

Results
[('prop', 'http://www.wikidata.org/prop/direct/P569'), ('name', 'date of birth')]
[('prop', 'http://www.wikidata.org/prop/direct/P570'), ('name', 'date of death')]
2


In [23]:
queryString = """
SELECT DISTINCT ?film ?name
WHERE { 
    wd:Q8704 wdt:P570 ?death.
    ?film wdt:P272 wd:Q1047410 ;
          wdt:P577 ?publ.
    FILTER(?publ < ?death).
    ?film sc:name ?name.
} 
"""

print("Results")
x=run_query(queryString)

Results
[('film', 'http://www.wikidata.org/entity/Q2275783'), ('name', 'The Barn Dance')]
[('film', 'http://www.wikidata.org/entity/Q3521221'), ('name', 'The Haunted House')]
[('film', 'http://www.wikidata.org/entity/Q3392773'), ('name', 'Playful Pluto')]
[('film', 'http://www.wikidata.org/entity/Q2700743'), ('name', 'Clock Cleaners')]
[('film', 'http://www.wikidata.org/entity/Q842306'), ('name', 'Saludos Amigos')]
[('film', 'http://www.wikidata.org/entity/Q3207087'), ('name', 'Hockey Homicide')]
[('film', 'http://www.wikidata.org/entity/Q3178943'), ('name', "Donald's Dilemma")]
[('film', 'http://www.wikidata.org/entity/Q2399982'), ('name', 'Tea for Two Hundred')]
[('film', 'http://www.wikidata.org/entity/Q3520106'), ('name', 'The Brave Engineer')]
[('film', 'http://www.wikidata.org/entity/Q3228508'), ('name', 'Hold That Pose')]
[('film', 'http://www.wikidata.org/entity/Q3028329'), ('name', 'Two Weeks Vacation')]
[('film', 'http://www.wikidata.org/entity/Q2090328'), ('name', 'Trick or 

## Task 5

In [7]:
queryString = """
SELECT ?va ?name (COUNT(DISTINCT ?film) AS ?counts)
WHERE { 
    ?film wdt:P272 wd:Q1047410 .
    ?film wdt:P725 ?va.
    ?va sc:name ?name.
} 
GROUP BY ?va ?name
HAVING (COUNT(DISTINCT ?film) > 5)
"""

print("Results")
x=run_query(queryString)

Results
[('va', 'http://www.wikidata.org/entity/Q264418'), ('name', 'Russi Taylor'), ('counts', '6')]
[('va', 'http://www.wikidata.org/entity/Q310292'), ('name', 'Jim Cummings'), ('counts', '10')]
[('va', 'http://www.wikidata.org/entity/Q178882'), ('name', 'Kristen Bell'), ('counts', '6')]
[('va', 'http://www.wikidata.org/entity/Q349350'), ('name', 'Alan Tudyk'), ('counts', '7')]
[('va', 'http://www.wikidata.org/entity/Q296577'), ('name', 'Frank Welker'), ('counts', '7')]
5


## Task 6

In [35]:
queryString = """
SELECT DISTINCT ?film ?name
WHERE { 
    ?film wdt:P272 wd:Q1047410 .
    ?film sc:name ?name.
    FILTER(REGEX(?name,"[Ll]ittle")).
} 
"""

print("Results")
x=run_query(queryString)

Results
[('film', 'http://www.wikidata.org/entity/Q944287'), ('name', 'The Little Mermaid')]
[('film', 'http://www.wikidata.org/entity/Q270940'), ('name', 'Chicken Little')]
[('film', 'http://www.wikidata.org/entity/Q1633009'), ('name', 'The Little Matchgirl')]
3


`wd:Q944287` is The Little Mermaid

## Task 7

In [41]:
queryString = """
SELECT DISTINCT ?people ?name
WHERE { 
    VALUES ?films {wd:Q944287 wd:Q36479}.
    ?films ?prop ?people .
    ?people wdt:P31 wd:Q5.
    ?people sc:name ?name.
} 
GROUP BY ?people ?name
HAVING (COUNT (DISTINCT ?films) = 2 )
"""

print("Results")
x=run_query(queryString)

Results
[('people', 'http://www.wikidata.org/entity/Q310292'), ('name', 'Jim Cummings')]
1


## Task 8

In [2]:
queryString = """
SELECT DISTINCT ?pr ?name
WHERE {
    ?film wdt:P272 wd:Q1047410 ;
          ?pr ?o.
    ?pr sc:name ?name.
} 
"""

print("Results")
x=run_query(queryString)

Results
[('pr', 'http://www.wikidata.org/prop/direct/P3383'), ('name', 'film poster')]
[('pr', 'http://www.wikidata.org/prop/direct/P7132'), ('name', 'ComingSoon.it film ID')]
[('pr', 'http://www.wikidata.org/prop/direct/P2438'), ('name', 'narrator')]
[('pr', 'http://www.wikidata.org/prop/direct/P3092'), ('name', 'film crew member')]
[('pr', 'http://www.wikidata.org/prop/direct/P4021'), ('name', 'danskefilm animated film ID')]
[('pr', 'http://www.wikidata.org/prop/direct/P5201'), ('name', 'IMDA rating')]
[('pr', 'http://www.wikidata.org/prop/direct/P6359'), ('name', 'Crew United title ID')]
[('pr', 'http://www.wikidata.org/prop/direct/P8371'), ('name', 'references work, tradition or theory')]
[('pr', 'http://www.wikidata.org/prop/direct/P4437'), ('name', 'FPB rating')]
[('pr', 'http://www.wikidata.org/prop/direct/P7679'), ('name', 'Cinépolis KLIC ID')]
[('pr', 'http://www.wikidata.org/prop/direct/P8889'), ('name', 'MTRCB rating')]
[('pr', 'http://www.wikidata.org/prop/direct/P7458'), (

In [3]:
queryString = """
SELECT DISTINCT ?o ?name
WHERE {
    ?film wdt:P272 wd:Q1047410 ;
          wdt:P166 ?o.
    ?o sc:name ?name.
} 
"""

print("Results")
x=run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q135498'), ('name', 'Genesis Award')]
[('o', 'http://www.wikidata.org/entity/Q1800189'), ('name', 'Annie Award for Best Animated Feature')]
[('o', 'http://www.wikidata.org/entity/Q878902'), ('name', 'Golden Globe Award for Best Animated Feature Film')]
[('o', 'http://www.wikidata.org/entity/Q917808'), ('name', 'Academy Award for Best Animated Short Film')]
[('o', 'http://www.wikidata.org/entity/Q281939'), ('name', 'Academy Award for Best Film Editing')]
[('o', 'http://www.wikidata.org/entity/Q823422'), ('name', 'National Film Registry')]
[('o', 'http://www.wikidata.org/entity/Q393686'), ('name', 'Academy Award for Best Visual Effects')]
[('o', 'http://www.wikidata.org/entity/Q488645'), ('name', 'Academy Award for Best Sound Editing')]
[('o', 'http://www.wikidata.org/entity/Q488651'), ('name', 'Academy Award for Best Original Score')]
[('o', 'http://www.wikidata.org/entity/Q106800'), ('name', 'Academy Award for Best Animated Feature')]
[('o

In [5]:
queryString = """
SELECT DISTINCT ?class ?name
WHERE {
    wd:Q106800 wdt:P31 ?class.
    ?class sc:name ?name.
} 
"""

print("Results")
x=run_query(queryString)

Results
[('class', 'http://www.wikidata.org/entity/Q96474688'), ('name', 'award for best film by genre')]
[('class', 'http://www.wikidata.org/entity/Q19020'), ('name', 'Academy Awards')]
2


## Task 9

In [16]:
queryString = """
SELECT ?film ?name (COUNT(DISTINCT ?rec) AS ?received)
WHERE {
    ?film wdt:P272 wd:Q1047410 ;
          wdt:P166 ?rec.
    ?rec wdt:P31 wd:Q19020.
    ?film sc:name ?name.
} 
GROUP BY ?film ?name
"""

print("Results")
x=run_query(queryString)

Results
[('film', 'http://www.wikidata.org/entity/Q246283'), ('name', 'Frozen'), ('received', '1')]
[('film', 'http://www.wikidata.org/entity/Q36479'), ('name', 'The Lion King'), ('received', '1')]
[('film', 'http://www.wikidata.org/entity/Q379873'), ('name', 'Who Framed Roger Rabbit'), ('received', '3')]
[('film', 'http://www.wikidata.org/entity/Q16857388'), ('name', 'Feast'), ('received', '1')]
[('film', 'http://www.wikidata.org/entity/Q13091172'), ('name', 'Big Hero 6'), ('received', '1')]
[('film', 'http://www.wikidata.org/entity/Q2252695'), ('name', 'Paperman'), ('received', '1')]
[('film', 'http://www.wikidata.org/entity/Q179673'), ('name', 'Beauty and the Beast'), ('received', '1')]
[('film', 'http://www.wikidata.org/entity/Q15270647'), ('name', 'Zootopia'), ('received', '1')]
8


In [23]:
queryString = """
SELECT ?film ?name (COUNT(DISTINCT ?rec) AS ?received)
WHERE {
    ?film wdt:P272 wd:Q1047410 ;
          wdt:P1411 ?rec.
    ?rec wdt:P31 wd:Q19020.
    ?film sc:name ?name.
} 
GROUP BY ?film ?name
"""

print("Results")
x=run_query(queryString)

Results
[('film', 'http://www.wikidata.org/entity/Q246283'), ('name', 'Frozen'), ('received', '1')]
[('film', 'http://www.wikidata.org/entity/Q36479'), ('name', 'The Lion King'), ('received', '1')]
[('film', 'http://www.wikidata.org/entity/Q379873'), ('name', 'Who Framed Roger Rabbit'), ('received', '6')]
[('film', 'http://www.wikidata.org/entity/Q16857388'), ('name', 'Feast'), ('received', '1')]
[('film', 'http://www.wikidata.org/entity/Q18647981'), ('name', 'Moana'), ('received', '1')]
[('film', 'http://www.wikidata.org/entity/Q2399982'), ('name', 'Tea for Two Hundred'), ('received', '1')]
[('film', 'http://www.wikidata.org/entity/Q215617'), ('name', 'Sleeping Beauty'), ('received', '1')]
[('film', 'http://www.wikidata.org/entity/Q212792'), ('name', 'Bolt'), ('received', '1')]
[('film', 'http://www.wikidata.org/entity/Q28891'), ('name', 'Wreck-It Ralph'), ('received', '1')]
[('film', 'http://www.wikidata.org/entity/Q640679'), ('name', 'Rugged Bear'), ('received', '1')]
[('film', 'htt

In [21]:
queryString = """
SELECT ?film ?name ?received ?nomination
WHERE {
    {SELECT ?film ?name (COUNT(DISTINCT ?rec) AS ?nomination) WHERE{
        
        ?rec wdt:P31 wd:Q19020.
        ?film wdt:P272 wd:Q1047410 .
        ?film sc:name ?name.
        OPTIONAL{
            ?film wdt:P1411 ?rec.
        }
        
    }GROUP BY ?film ?name}
    {SELECT ?film ?name (COUNT(DISTINCT ?rec) AS ?received) WHERE{
        ?rec wdt:P31 wd:Q19020.
        ?film wdt:P272 wd:Q1047410 .
        ?film sc:name ?name.
        OPTIONAL{
            ?film wdt:P166 ?rec.
        }
        
    }GROUP BY ?film ?name}
} 
"""

print("Results")
x=run_query(queryString)

Results
[('film', 'http://www.wikidata.org/entity/Q179673'), ('name', 'Beauty and the Beast'), ('received', '63'), ('nomination', '63')]
[('film', 'http://www.wikidata.org/entity/Q97656754'), ('name', 'Zenimation'), ('received', '63'), ('nomination', '63')]
[('film', 'http://www.wikidata.org/entity/Q28891'), ('name', 'Wreck-It Ralph'), ('received', '63'), ('nomination', '63')]
[('film', 'http://www.wikidata.org/entity/Q3036139'), ('name', 'Beezy Bear'), ('received', '63'), ('nomination', '63')]
[('film', 'http://www.wikidata.org/entity/Q1655222'), ('name', "Mickey's Once Upon a Christmas"), ('received', '63'), ('nomination', '63')]
[('film', 'http://www.wikidata.org/entity/Q3236849'), ('name', "Let's Stick Together"), ('received', '63'), ('nomination', '63')]
[('film', 'http://www.wikidata.org/entity/Q22350906'), ('name', 'Tummy Trouble'), ('received', '63'), ('nomination', '63')]
[('film', 'http://www.wikidata.org/entity/Q2031788'), ('name', 'Tarzan & Jane'), ('received', '63'), ('nom