# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained by the previous ones.

An exploratory workload starts with a usually vague, open-ended question and does not assume that the person issuing it has a clear understanding of the data in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook.
3. You do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP that returns the human-readable name of a property or a class in Wikidata.
    
    

In [1]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-horror_franchises-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString,verbose = True):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        results = sparql.query()
        json_results = results.convert()
        if len(json_results['results']['bindings'])==0:
            print("Empty")
            return []
        array = []
        for bindings in json_results['results']['bindings']:
            app =  [ (var, value['value'])  for var, value in bindings.items() ] 
            if verbose:
                print( app)
            array.append(app)
        if verbose:
            print(len(array))
        return array

    except Exception as e :
        print("The operation failed", e)
        # return an empty list so that callers can still iterate over the result
        return []
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
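
`run_query` returns each solution as a list of `(variable, value)` tuples, so later cells address fields by position (for example `i[0][1]`). The following is a minimal sketch of converting such rows into dictionaries keyed by variable name. `rows_to_dicts` is a hypothetical helper that is not part of the notebook, and the sample row only mimics the output format shown in this notebook.

```python
def rows_to_dicts(rows):
    """Turn run_query-style rows into dicts keyed by variable name."""
    # Treat a missing result (None) the same as an empty list
    return [dict(row) for row in (rows or [])]


# Sample row in the same shape that run_query prints
sample = [
    [
        ("film", "http://www.wikidata.org/entity/Q221103"),
        ("name", "Halloween"),
        ("cost", "325000"),
    ],
]

records = rows_to_dicts(sample)
print(records[0]["name"])
```

Reading values by name keeps the code working when a query returns its variables in a different order or leaves an optional variable unbound.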

# Movie Workflow Series ("Horror Franchises explorative search") 

Consider the following exploratory scenario:


> We are interested in horror movies, in particular the slasher subgenre, and we want to compare the Halloween and the Friday the 13th franchises.

Background knowledge useful for better conducting the search: Halloween is a movie from 1978 with many sequels and the set of Halloween + all the sequels is called the "Halloween franchise". The same considerations hold for Friday the 13th.


## Useful URIs for the current workflow
The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P106`    | profession    | predicate | 
| `wdt:P279`    | subclass      | predicate |
| `wd:Q221103`     | Halloween       | node |
| `wd:Q1243029`| Friday the 13th       | node |





Also consider

```
wd:Q1243029 ?p ?obj .
```

is the BGP to retrieve all **properties of Friday the 13th**

The workload should


1. Investigate the workers of the two original movies and check any commonality 

2. Check who was the most famous actor/actress on Halloween and Friday the 13th at the time of the release (e.g., the actor/actress who acted in more movies before and/or the actor/actress who won more academy awards)

3. Check which actor/actress did more movies after the release of these two films (consider the films separately)

4. Get all the movies of the two franchises (return year and director)

5. Who is most involved in a franchise (who did something in the movies, e.g., writer, director, producer)? 

6. Investigate which franchise is highly regarded by the critics overall

7. Investigate which franchise cost more 

To save the results found by this ground truth, we create a JSON object. Before creating it, we have to identify the workflow. All workflows were indexed by the script notebook when the statistics were computed, and their names and indexes were saved in a CSV file, `/indexes/workflows.csv`. So, before running the ground-truth queries, we read the name of the current workflow (always in the second cell of the notebook) and look up its index in that CSV file. This way we always know which notebook the results belong to. The results of a ground truth are stored in the folder `/results/` in a file named **workflow{index}.json**, where `index` is the index found in the CSV file.

The JSON object is composed of:
- the name of the workflow
- the results object

The results object is a dictionary whose keys are the goal numbers of the workflow and whose values are the expected results.
The result of each goal can be of two types:
- **single** when the result is well known, for example one or more specific `URI`s. In this case there is an array of objects that make up the expected result. Each object can have different fields, usually a URI and the related name. When the values array has more than one element, it must be specified whether the expected result is the whole set of elements or any one of them.
- **set** when the result consists of several elements and is not well defined, that is, when no query returns exactly the wanted results. Here too there is an array of objects that make up the expected result.

Another field specifies the type of check to perform during evaluation to determine whether the student found the expected result. Usually we verify `URIs`, but the results can also be numbers or names. This helps in the evaluation phase because we know exactly what to compare when evaluating the student's notebook.
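
As an illustration of the structure described above, here is a minimal sketch of such a JSON object with one **single** and one **set** result. The field names (`description`, `results`, `type`, `check`, `any_all`, `values`) follow the ones used by the helper cells of this notebook; the goal numbers and the values are purely illustrative and are not the ground truth of this workflow.

```python
import json

# Illustrative object: the goals "1" and "2" and their values are examples only
ground_truth = {
    "description": "Workflow's name: example",
    "results": {
        "1": {
            "type": "single",
            "check": ["uri", "name"],
            "any_all": "all",  # the whole array is expected, not just one element
            "values": [
                {"uri": "http://www.wikidata.org/entity/Q221103", "name": "Halloween"},
                {"uri": "http://www.wikidata.org/entity/Q1243029", "name": "Friday the 13th"},
            ],
        },
        "2": {
            "type": "set",
            "check": "uri",
            "values": [{"uri": "http://www.wikidata.org/entity/Q221103"}],
        },
    },
}

serialized = json.dumps(ground_truth, indent=2)
print(serialized)
```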

In [2]:
## Before starting, set up the variables used to save the results
# open this notebook as JSON
import os
import json
import pandas as pd
ipname = "horror_franchises.ipynb"
with open(os.path.join(os.getcwd(), ipname)) as nb_file:
    data = json.load(nb_file)
# get the list of the cells
df = pd.DataFrame(data['cells'])
# the first line of this cell holds the title of the notebook
source = df['source'][2][0].replace("\n","").replace("#","").strip()
### source contains the name of the workflow

#read the workflow indexes
from pathlib import Path
parent = str(Path(os.getcwd()).parent.absolute())
# this is the filepath of the indexes of all workflows
work_filepath = parent+os.sep+"indexes/workflows.csv"
workflow_indexes = pd.read_csv(work_filepath)

# this variable will contain the index of this workflow
global index_workflow

index_workflow = -1
for index, row in workflow_indexes.iterrows():
    if row["name"] == source:
        index_workflow = str(row["macro"])+"_"+str(row["index"])
if index_workflow == -1:
    print("ERROR: WORKFLOW DOES NOT EXIST IN THE INDEXES")
else:
    print("The index of this workflow is:",index_workflow)
    
# create the results directory if it does not exist
result_path = parent+"/results"
os.makedirs(result_path, exist_ok=True)
    
# this variable contains the filepath of the ground truth's json object of this workflow
global work_filename
work_filename = result_path+os.sep+"workflow"+str(index_workflow)+".json"

global dict_
global result
result = {}
dict_ = {"description":"Workflow's name: "+source}
fd = open(work_filename,"w")
json.dump(dict_, fd)
fd.close()

The index of this workflow is: 5_2


In [3]:
## define constant variable
global TYPE_SET
TYPE_SET = "set"
global TYPE_SINGLE
TYPE_SINGLE = "single"
global TYPE_REFERRED
TYPE_REFERRED = "referred"

In [4]:
# value is always an array of elements
# any_all is a string that must be either "any" or "all"
def add_result(goal_num, type_ , check ,value, any_all = None ,type_inside = None, elements_per_tuple = 1):
    global work_filename
    global dict_
    global result
    
    # add the object in the dictionary
    # create the current dictionary for the specific goal, or get it if it already exists from the big JSON object
    current_dict = {}
    if goal_num in result:
        current_dict = result[goal_num]
    
    
    if type_ == TYPE_SET:
        ## the result is a set of elements, so it is not well defined
        current_dict["type"] = type_
        current_dict["check"] = check
        current_dict["values"] = value
    elif type_ == TYPE_SINGLE:
        ## the result is well defined: it has one or more elements, and the any_all field must be set
        current_dict["type"] = TYPE_SINGLE
        current_dict["check"] = check
        if any_all is not None and check != "ask":
            current_dict["any_all"] = any_all
        current_dict["values"] = value
    elif type_ == TYPE_REFERRED:
        current_dict["type"] = type_
        current_dict["check"] = check
        current_dict["elements_per_tuple"] = elements_per_tuple
        current_dict["values"] = value
        
        
    ## put the current dictionary for the specific goal in the result's object inside the JSON object
    result[goal_num] = current_dict
    ## put the result's object in the JSON object
    dict_["results"] = result
    
    ## save the JSON object
    fd = open(work_filename,"w")
    json.dump(dict_, fd)
    fd.close()
    
    print("JSON object updated")

### Task 1: Investigate the workers of the two original movies and check any commonality

Find the filmmaking occupation class

In [5]:
# find workers in a film
queryString = """
SELECT DISTINCT ?work ?pname
WHERE { 

wd:Q1243029 ?p ?obj .
?obj wdt:P31 wd:Q5;
    wdt:P106 ?job.
?job wdt:P31 ?work.

#Labels
?work sc:name ?pname .
} 
"""

print("Results")
x=run_query(queryString)

Results
[('work', 'http://www.wikidata.org/entity/Q4220920'), ('pname', 'filmmaking occupation')]
[('work', 'http://www.wikidata.org/entity/Q12737077'), ('pname', 'occupation')]
[('work', 'http://www.wikidata.org/entity/Q28640'), ('pname', 'profession')]
[('work', 'http://www.wikidata.org/entity/Q66715801'), ('pname', 'musical profession')]
[('work', 'http://www.wikidata.org/entity/Q108300140'), ('pname', 'occupation group according to ISCO')]
[('work', 'http://www.wikidata.org/entity/Q15839299'), ('pname', 'theatrical occupation')]
6


In [6]:
# compare all workers 
queryString = """
SELECT DISTINCT ?film ?name (COUNT(DISTINCT ?worker) AS ?workers)
WHERE { 
    VALUES ?film {wd:Q1243029 wd:Q221103}
    ?film ?p ?worker .
    ?worker wdt:P31 wd:Q5 ;
            wdt:P106/wdt:P31/wdt:P279* wd:Q4220920 .

    #Labels
    ?film sc:name ?name .
}
GROUP BY ?film ?name
"""

print("Results")
x=run_query(queryString)

Results
[('film', 'http://www.wikidata.org/entity/Q1243029'), ('name', 'Friday the 13th'), ('workers', '13')]
[('film', 'http://www.wikidata.org/entity/Q221103'), ('name', 'Halloween'), ('workers', '15')]
2


In [7]:
# check if someone worked on both films
queryString = """
SELECT DISTINCT ?worker ?name
WHERE { 
    VALUES ?film {wd:Q1243029 wd:Q221103}
    ?film ?p ?worker .
    ?worker wdt:P31 wd:Q5 ;
            wdt:P106/wdt:P31/wdt:P279* wd:Q4220920 .

    #Labels
    ?worker sc:name ?name .
}
GROUP BY ?worker ?name
HAVING (COUNT(DISTINCT ?film)>1)
"""

print("Results")
x=run_query(queryString)

Results


KeyboardInterrupt: 
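
The query above was interrupted before it returned. One alternative, sketched here under the assumption that a lighter query returning `?film ?worker` pairs can be run with `run_query`, is to compute the intersection in Python. `workers_by_film` and `common_workers` are hypothetical helpers, and the rows below are made-up placeholders (`http://example.org/...`), not Wikidata results.

```python
def workers_by_film(rows):
    """Group worker URIs by film from run_query-style rows with film and worker."""
    grouped = {}
    for row in rows:
        record = dict(row)
        grouped.setdefault(record["film"], set()).add(record["worker"])
    return grouped


def common_workers(rows):
    """Return the workers that appear in every film present in the rows."""
    groups = list(workers_by_film(rows).values())
    return set.intersection(*groups) if groups else set()


# Placeholder rows: hypothetical URIs, used only to exercise the helpers
rows = [
    [("film", "http://example.org/filmA"), ("worker", "http://example.org/w1")],
    [("film", "http://example.org/filmA"), ("worker", "http://example.org/w2")],
    [("film", "http://example.org/filmB"), ("worker", "http://example.org/w2")],
    [("film", "http://example.org/filmB"), ("worker", "http://example.org/w3")],
]

shared = common_workers(rows)
print(shared)
```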

In [None]:
# compare cast members
queryString = """
SELECT DISTINCT ?film ?name (COUNT(?worker) AS ?cast)
WHERE { 
    VALUES ?film {wd:Q1243029 wd:Q221103}
    ?film wdt:P161 ?worker .

    #Labels
    ?film sc:name ?name .
}
GROUP BY ?film ?name
"""

print("Results")
x=run_query(queryString)

In [None]:
# compare countries 
queryString = """
SELECT DISTINCT ?film ?name ?country ?cname (COUNT(*) AS ?howMany)
WHERE { 
    VALUES ?film {wd:Q1243029 wd:Q221103}
    ?film ?p ?worker .
    ?worker wdt:P31 wd:Q5 ;
            wdt:P27 ?country;
            wdt:P106/wdt:P31/wdt:P279* wd:Q4220920 .

    #Labels
    ?film sc:name ?name .
    ?country sc:name ?cname .
}
GROUP BY ?film ?name ?country ?cname
ORDER BY DESC (?howMany)
"""

print("Results")
x=run_query(queryString)

### Task 2: Check who was the most famous actor/actress on Halloween and Friday the 13th at the time of the release. 

To answer this question, we need the release dates of the two films and a criterion for how famous an actor/actress is.
Here we simply use the number of films in which the actor/actress acted before the release.

In [49]:
# count how many films each actor of Friday the 13th and Halloween acted in before the release, separately for each film
queryString = """
SELECT ?f ?name ?actor ?aName (COUNT(DISTINCT ?film) AS ?films)
WHERE { 
    # friday 13th and Halloween
    VALUES ?f {wd:Q1243029 wd:Q221103}.
    ?f wdt:P161 ?actor;
                wdt:P577 ?datef.
    ?film wdt:P161 ?actor;
          wdt:P577 ?date.

    FILTER(?date<?datef).

    ?f sc:name ?name.
    ?actor sc:name ?aName.
} 
GROUP BY ?f ?name ?actor  ?aName
ORDER BY DESC(?films)
"""

print("Results")
x=run_query(queryString)

Results
[('f', 'http://www.wikidata.org/entity/Q221103'), ('name', 'Halloween'), ('actor', 'http://www.wikidata.org/entity/Q314914'), ('aName', 'Donald Pleasence'), ('films', '80')]
[('f', 'http://www.wikidata.org/entity/Q221103'), ('name', 'Halloween'), ('actor', 'http://www.wikidata.org/entity/Q3093664'), ('aName', 'Gabriel Cattand'), ('films', '25')]
[('f', 'http://www.wikidata.org/entity/Q221103'), ('name', 'Halloween'), ('actor', 'http://www.wikidata.org/entity/Q2865227'), ('aName', 'Arthur Malet'), ('films', '12')]
[('f', 'http://www.wikidata.org/entity/Q1243029'), ('name', 'Friday the 13th'), ('actor', 'http://www.wikidata.org/entity/Q263765'), ('aName', 'Betsy Palmer'), ('films', '7')]
[('f', 'http://www.wikidata.org/entity/Q221103'), ('name', 'Halloween'), ('actor', 'http://www.wikidata.org/entity/Q1064022'), ('aName', 'Charles Cyphers'), ('films', '7')]
[('f', 'http://www.wikidata.org/entity/Q221103'), ('name', 'Halloween'), ('actor', 'http://www.wikidata.org/entity/Q270079')
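
The rows above are sorted by count across both films, so the top actor of each film still has to be picked out. A minimal sketch of doing that in Python follows. `top_per_film` is a hypothetical helper, and the sample rows reuse only the names and counts visible in the output above. Note that counts arrive as strings and must be converted before comparing.

```python
def top_per_film(rows, count_var="films"):
    """Pick, for each film, the actor with the highest count."""
    best = {}
    for row in rows:
        record = dict(row)
        count = int(record[count_var])  # values arrive as strings
        current = best.get(record["name"])
        if current is None or count > current[1]:
            best[record["name"]] = (record["aName"], count)
    return best


# Names and counts copied from the recorded output, other variables omitted
rows = [
    [("name", "Halloween"), ("aName", "Donald Pleasence"), ("films", "80")],
    [("name", "Halloween"), ("aName", "Gabriel Cattand"), ("films", "25")],
    [("name", "Halloween"), ("aName", "Arthur Malet"), ("films", "12")],
    [("name", "Friday the 13th"), ("aName", "Betsy Palmer"), ("films", "7")],
    [("name", "Halloween"), ("aName", "Charles Cyphers"), ("films", "7")],
]

top = top_per_film(rows)
print(top)
```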

In [8]:
### insert the result of TASK 2 in the file
halloween_uri = "http://www.wikidata.org/entity/Q221103"
halloween_name = "Halloween"
friday_uri = "http://www.wikidata.org/entity/Q1243029"
friday_name = "Friday the 13th"
pleasence_uri = "http://www.wikidata.org/entity/Q314914"
pleasence_name = "Donald Pleasence"
palmer_uri = "http://www.wikidata.org/entity/Q263765"
palmer_name = "Betsy Palmer"

halloween_res = {"uri":pleasence_uri,"name":pleasence_name,"check":["uri","name"],"refers_to":halloween_uri,"refers_to_name":halloween_name}
friday_res = {"uri":palmer_uri,"name":palmer_name,"check":["uri","name"],"refers_to":friday_uri,"refers_to_name":friday_name}
obj = [halloween_res,friday_res]
add_result("2", TYPE_REFERRED ,["uri","name"],obj)

JSON object updated


### Task 3: Check which actor/actress did more movies after the release of these two films (consider the films separately).

Retrieve, for each of the two films, the cast members ranked by the number of films they acted in after its release.

In [16]:
queryString = """
SELECT ?f ?name ?actor ?aName (COUNT(DISTINCT ?film) AS ?films)
WHERE { 
    VALUES ?f {wd:Q1243029 wd:Q221103}.
    ?f wdt:P161 ?actor;
                wdt:P577 ?datef.
    ?film wdt:P161 ?actor;
          wdt:P577 ?date.

    FILTER(?date>?datef).

    ?f sc:name ?name.
    ?actor sc:name ?aName.
} 
GROUP BY ?f ?name ?actor  ?aName
ORDER BY DESC(?films)

"""

print("Results")
x=run_query(queryString)

Results
[('f', 'http://www.wikidata.org/entity/Q1243029'), ('name', 'Friday the 13th'), ('actor', 'http://www.wikidata.org/entity/Q3454165'), ('aName', 'Kevin Bacon'), ('films', '64')]
[('f', 'http://www.wikidata.org/entity/Q221103'), ('name', 'Halloween'), ('actor', 'http://www.wikidata.org/entity/Q314914'), ('aName', 'Donald Pleasence'), ('films', '55')]
[('f', 'http://www.wikidata.org/entity/Q221103'), ('name', 'Halloween'), ('actor', 'http://www.wikidata.org/entity/Q106997'), ('aName', 'Jamie Lee Curtis'), ('films', '52')]
[('f', 'http://www.wikidata.org/entity/Q221103'), ('name', 'Halloween'), ('actor', 'http://www.wikidata.org/entity/Q270079'), ('aName', 'P. J. Soles'), ('films', '20')]
[('f', 'http://www.wikidata.org/entity/Q221103'), ('name', 'Halloween'), ('actor', 'http://www.wikidata.org/entity/Q95008'), ('aName', 'John Carpenter'), ('films', '18')]
[('f', 'http://www.wikidata.org/entity/Q221103'), ('name', 'Halloween'), ('actor', 'http://www.wikidata.org/entity/Q3093664'), 

In [9]:
### insert the result of TASK 3 in the file
halloween_uri = "http://www.wikidata.org/entity/Q221103"
halloween_name = "Halloween"
friday_uri = "http://www.wikidata.org/entity/Q1243029"
friday_name = "Friday the 13th"
bacon_uri = "http://www.wikidata.org/entity/Q3454165"
bacon_name = "Kevin Bacon"
pleasence_uri = "http://www.wikidata.org/entity/Q314914"
pleasence_name = "Donald Pleasence"

halloween_res = {"uri":pleasence_uri,"name":pleasence_name,"check":["uri","name"],"refers_to":halloween_uri,"refers_to_name":halloween_name}
friday_res = {"uri":bacon_uri,"name":bacon_name,"check":["uri","name"],"refers_to":friday_uri,"refers_to_name":friday_name}
obj = [halloween_res,friday_res]
add_result("3", TYPE_REFERRED ,["uri","name"],obj)

JSON object updated


### Task 4: Get all the movies of the two franchises (return year and director)

`wdt:P179` is the predicate *part of the series*, which is useful for finding the entity that represents each franchise.

In [41]:
queryString = """
SELECT DISTINCT ?franchise ?name WHERE{
    
    VALUES ?f {wd:Q221103 wd:Q1243029 }.
    ?f wdt:P179 ?franchise.
    ?franchise sc:name ?name .

}

LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('franchise', 'http://www.wikidata.org/entity/Q1364022'), ('name', 'Halloween')]
[('franchise', 'http://www.wikidata.org/entity/Q1138878'), ('name', 'Friday the 13th')]
2


Since a film can have more than one director and more than one release year (usually depending on the country of release), this query returns all **film-director-year** triples as the result set instead of using `GROUP_CONCAT`.

In [10]:
queryString = """
SELECT DISTINCT ?film ?fName ?dir ?director ?year WHERE{
    
    VALUES ?f {wd:Q221103 wd:Q1243029 }.
    ?f wdt:P179 ?o.
    ?film wdt:P179 ?o;
          wdt:P57  ?dir;
          wdt:P577 ?date.
    BIND(year(xsd:dateTime(?date)) as ?year).
    #Labels
    ?dir sc:name ?director .
    ?film sc:name ?fName .

}

LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('film', 'http://www.wikidata.org/entity/Q959853'), ('fName', 'Halloween 4: The Return of Michael Myers'), ('dir', 'http://www.wikidata.org/entity/Q1268483'), ('director', 'Dwight H. Little'), ('year', '1988')]
[('film', 'http://www.wikidata.org/entity/Q959853'), ('fName', 'Halloween 4: The Return of Michael Myers'), ('dir', 'http://www.wikidata.org/entity/Q1268483'), ('director', 'Dwight H. Little'), ('year', '1989')]
[('film', 'http://www.wikidata.org/entity/Q1454799'), ('fName', 'Friday the 13th Part VI: Jason Lives'), ('dir', 'http://www.wikidata.org/entity/Q730985'), ('director', 'Tom McLoughlin'), ('year', '1986')]
[('film', 'http://www.wikidata.org/entity/Q203705'), ('fName', 'Halloween III: Season of the Witch'), ('dir', 'http://www.wikidata.org/entity/Q574983'), ('director', 'Tommy Lee Wallace'), ('year', '1982')]
[('film', 'http://www.wikidata.org/entity/Q221103'), ('fName', 'Halloween'), ('dir', 'http://www.wikidata.org/entity/Q95008'), ('director', 'John Carpenter'
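
Because the query returns flat film-director-year rows, a film with several release years appears more than once. A minimal sketch of regrouping the rows per film in Python follows. `group_by_film` is a hypothetical helper, and the sample rows reuse only the names and years visible in the output above.

```python
from collections import defaultdict


def group_by_film(rows):
    """Collect the distinct directors and years reported for each film."""
    films = defaultdict(lambda: {"directors": set(), "years": set()})
    for row in rows:
        record = dict(row)
        films[record["fName"]]["directors"].add(record["director"])
        films[record["fName"]]["years"].add(int(record["year"]))
    return dict(films)


# Names and years copied from the recorded output, URIs omitted for brevity
rows = [
    [("fName", "Halloween 4: The Return of Michael Myers"), ("director", "Dwight H. Little"), ("year", "1988")],
    [("fName", "Halloween 4: The Return of Michael Myers"), ("director", "Dwight H. Little"), ("year", "1989")],
    [("fName", "Friday the 13th Part VI: Jason Lives"), ("director", "Tom McLoughlin"), ("year", "1986")],
]

grouped = group_by_film(rows)
print(sorted(grouped["Halloween 4: The Return of Michael Myers"]["years"]))
```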

In [11]:
objs=[]
for i in x:
    f_uri = i[0][1]
    f_name = i[1][1]
    dir_uri = i[2][1]
    dir_name = i[3][1]
    year = i[4][1]
    # add the year element
    obj = {}
    obj["refers_to"] = f_uri
    obj["refers_to_name"] = f_name
    obj["check"] = "value"
    obj["value"] = year
    objs.append(obj)
    # add the director element
    obj = {}
    obj["refers_to"] = f_uri
    obj["refers_to_name"] = f_name
    obj["check"] = ["uri","name"]
    obj["uri"] = dir_uri
    obj["name"] = dir_name
    objs.append(obj)
add_result("4", TYPE_REFERRED ,["uri","name","value"],objs)

JSON object updated


### Task 5: Who is most involved in a franchise (who did something in the movies, e.g., writer, director, producer)?

Retrieve the most involved workers in each of the two franchises.

In [57]:
queryString = """
SELECT ?franchise ?fName ?worker ?wName (COUNT(DISTINCT ?film) AS ?Appearances)
{  
    VALUES ?f {wd:Q221103 wd:Q1243029 }.
    ?f wdt:P179 ?franchise.
    ?film wdt:P179 ?franchise. #franchise
    ?film ?p ?worker.
    ?worker wdt:P31 wd:Q5 ;
            wdt:P106/wdt:P31/wdt:P279* wd:Q4220920 .
    
    #Labels
    ?worker sc:name ?wName.
    ?franchise sc:name ?fName .
}
GROUP BY ?franchise ?fName ?worker ?wName
ORDER BY DESC(?Appearances)
LIMIT 6
"""
print("Results")
x=run_query(queryString)

Results
[('franchise', 'http://www.wikidata.org/entity/Q1138878'), ('fName', 'Friday the 13th'), ('worker', 'http://www.wikidata.org/entity/Q664025'), ('wName', 'Harry Manfredini'), ('Appearances', '9')]
[('franchise', 'http://www.wikidata.org/entity/Q1138878'), ('fName', 'Friday the 13th'), ('worker', 'http://www.wikidata.org/entity/Q709325'), ('wName', 'Sean S. Cunningham'), ('Appearances', '6')]
[('franchise', 'http://www.wikidata.org/entity/Q1364022'), ('fName', 'Halloween'), ('worker', 'http://www.wikidata.org/entity/Q380471'), ('wName', 'Moustapha Akkad'), ('Appearances', '5')]
[('franchise', 'http://www.wikidata.org/entity/Q1364022'), ('fName', 'Halloween'), ('worker', 'http://www.wikidata.org/entity/Q314914'), ('wName', 'Donald Pleasence'), ('Appearances', '5')]
[('franchise', 'http://www.wikidata.org/entity/Q1364022'), ('fName', 'Halloween'), ('worker', 'http://www.wikidata.org/entity/Q106997'), ('wName', 'Jamie Lee Curtis'), ('Appearances', '5')]
[('franchise', 'http://www.wi

In [12]:
### insert the result of TASK 5 in the file
halloween_uri = "http://www.wikidata.org/entity/Q1364022"
halloween_name = "Halloween"
friday_uri = "http://www.wikidata.org/entity/Q1138878"
friday_name = "Friday the 13th"
manfredini_uri = "http://www.wikidata.org/entity/Q664025"
manfredini_name = "Harry Manfredini"
pleasence_uri = "http://www.wikidata.org/entity/Q314914"
pleasence_name = "Donald Pleasence"

halloween_res = {"uri":pleasence_uri,"name":pleasence_name,"check":["uri","name"],"refers_to":halloween_uri,"refers_to_name":halloween_name}
friday_res = {"uri":manfredini_uri,"name":manfredini_name,"check":["uri","name"],"refers_to":friday_uri,"refers_to_name":friday_name}
obj = [halloween_res,friday_res]
add_result("5", TYPE_REFERRED ,["uri","name"],obj)

JSON object updated


### Task 6: Investigate which franchise is highly regarded by the critics overall

Since this Wikidata snapshot does not contain review scores, I use the total number of awards won and nominations received by the films of each franchise instead.

In [59]:
queryString = """
SELECT DISTINCT ?franchise ?name (COUNT(?o) AS ?wins) (COUNT(?x) AS ?nominations)
{ 
    VALUES ?f {wd:Q221103 wd:Q1243029 }.
    ?f wdt:P179 ?franchise.
    ?film wdt:P179 ?franchise. #franchise
    OPTIONAL{?film wdt:P166 ?o.}
    OPTIONAL{?film wdt:P1411 ?x.}
    
    ?franchise sc:name ?name
}
GROUP BY ?franchise ?name
"""
print("Results")
x=run_query(queryString)

Results
[('franchise', 'http://www.wikidata.org/entity/Q1138878'), ('name', 'Friday the 13th'), ('wins', '0'), ('nominations', '2')]
[('franchise', 'http://www.wikidata.org/entity/Q1364022'), ('name', 'Halloween'), ('wins', '1'), ('nominations', '0')]
2


#### TASK 6 considerations
I used the number of nominations and awards won to measure how much the critics liked the films of the two franchises. 
Friday the 13th was nominated twice for the Golden Raspberry Award. These nominations count against it, because they were for worst picture and worst
actress, so I'd say that critics liked the Halloween franchise more than the Friday the 13th one.

### Task 7: Investigate which franchise cost more 

Check how many movies have a cost (P2130) and how many have a budget (P2769).

In [45]:
#check how many movies have cost (P2130)
queryString = """
SELECT ?film ?name ?cost WHERE
{  
    VALUES ?franchise {wd:Q1364022 wd:Q1138878}.
    ?film wdt:P179 ?franchise.
    ?film wdt:P2130 ?cost.
    ?film <http://schema.org/name> ?name .

}
ORDER BY DESC(?cost)
"""
print("Results")
x=run_query(queryString)

Results
[('film', 'http://www.wikidata.org/entity/Q1454799'), ('name', 'Friday the 13th Part VI: Jason Lives'), ('cost', '3000000')]
[('film', 'http://www.wikidata.org/entity/Q1454798'), ('name', 'Friday the 13th Part VII: The New Blood'), ('cost', '2800000')]
[('film', 'http://www.wikidata.org/entity/Q1243019'), ('name', 'Friday the 13th: The Final Chapter'), ('cost', '2600000')]
[('film', 'http://www.wikidata.org/entity/Q1243013'), ('name', 'Friday the 13th Part III'), ('cost', '2250000')]
[('film', 'http://www.wikidata.org/entity/Q1243009'), ('name', 'Friday the 13th Part 2'), ('cost', '1250000')]
[('film', 'http://www.wikidata.org/entity/Q221103'), ('name', 'Halloween'), ('cost', '325000')]
6


In [46]:
#check how many movies have budget (P2769)
queryString = """
SELECT ?film ?name ?budget WHERE
{  
    VALUES ?franchise {wd:Q1364022 wd:Q1138878}.
    ?film wdt:P179 ?franchise.
    ?film wdt:P2769 ?budget.
    ?film <http://schema.org/name> ?name .

}
ORDER BY DESC(?budget)
"""
print("Results")
x=run_query(queryString)

Results
[('film', 'http://www.wikidata.org/entity/Q176488'), ('name', 'Halloween: Resurrection'), ('budget', '15000000')]
[('film', 'http://www.wikidata.org/entity/Q909063'), ('name', 'Halloween'), ('budget', '15000000')]
[('film', 'http://www.wikidata.org/entity/Q1138055'), ('name', 'Halloween: The Curse of Michael Myers'), ('budget', '5000000')]
[('film', 'http://www.wikidata.org/entity/Q976728'), ('name', 'Halloween 5: The Revenge of Michael Myers'), ('budget', '3000000')]
[('film', 'http://www.wikidata.org/entity/Q203705'), ('name', 'Halloween III: Season of the Witch'), ('budget', '2500000')]
5


The two result sets are disjoint, so we can take their union.
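
The disjointness can be checked directly on the film identifiers returned by the two queries above. This is a minimal sketch that copies the Q-identifiers from the recorded outputs; in the notebook they could instead be collected from the values returned by `run_query`.

```python
# Films with a cost (P2130), taken from the first result set
cost_films = {"Q1454799", "Q1454798", "Q1243019", "Q1243013", "Q1243009", "Q221103"}
# Films with a budget (P2769), taken from the second result set
budget_films = {"Q176488", "Q909063", "Q1138055", "Q976728", "Q203705"}

overlap = cost_films & budget_films
all_films = cost_films | budget_films
print(len(overlap), len(all_films))
```

An empty overlap means no film is counted twice when cost and budget are combined.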

In [47]:
# print movie with cost/budget
queryString = """
SELECT ?film ?name ?cost ?budget WHERE
{  
    VALUES ?franchise {wd:Q1364022 wd:Q1138878}.
    ?film wdt:P179 ?franchise.
    {{?film wdt:P2130 ?cost.} UNION {?film wdt:P2769 ?budget.}}
    ?film <http://schema.org/name> ?name .

}
ORDER BY DESC (?cost) DESC (?budget)
"""
print("Results")
x=run_query(queryString)

Results
[('film', 'http://www.wikidata.org/entity/Q1454799'), ('name', 'Friday the 13th Part VI: Jason Lives'), ('cost', '3000000')]
[('film', 'http://www.wikidata.org/entity/Q1454798'), ('name', 'Friday the 13th Part VII: The New Blood'), ('cost', '2800000')]
[('film', 'http://www.wikidata.org/entity/Q1243019'), ('name', 'Friday the 13th: The Final Chapter'), ('cost', '2600000')]
[('film', 'http://www.wikidata.org/entity/Q1243013'), ('name', 'Friday the 13th Part III'), ('cost', '2250000')]
[('film', 'http://www.wikidata.org/entity/Q1243009'), ('name', 'Friday the 13th Part 2'), ('cost', '1250000')]
[('film', 'http://www.wikidata.org/entity/Q221103'), ('name', 'Halloween'), ('cost', '325000')]
[('film', 'http://www.wikidata.org/entity/Q176488'), ('name', 'Halloween: Resurrection'), ('budget', '15000000')]
[('film', 'http://www.wikidata.org/entity/Q909063'), ('name', 'Halloween'), ('budget', '15000000')]
[('film', 'http://www.wikidata.org/entity/Q1138055'), ('name', 'Halloween: The Cur

In [13]:
# final result
queryString = """
SELECT ?franchise ?name (SUM(?cost) AS ?t_cost) (SUM(?budget) AS ?t_budget) WHERE
{  
    VALUES ?franchise {wd:Q1364022 wd:Q1138878}.
    ?film wdt:P179 ?franchise.
    {{?film wdt:P2130 ?cost.} UNION {?film wdt:P2769 ?budget.}}
    ?franchise sc:name ?name .

}
GROUP BY ?franchise ?name
"""
print("Results")
x=run_query(queryString)

Results
[('franchise', 'http://www.wikidata.org/entity/Q1138878'), ('name', 'Friday the 13th'), ('t_cost', '11900000')]
[('franchise', 'http://www.wikidata.org/entity/Q1364022'), ('name', 'Halloween'), ('t_cost', '325000'), ('t_budget', '40500000')]
2
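
The rows above bind `t_budget` only for Halloween, so reading the totals by position depends on which sums are present in each row. The following is a minimal sketch that reads the sums by variable name and adds them, treating a missing sum as 0. `total_spend` is a hypothetical helper, and adding cost and budget together assumes that the two properties are interchangeable measures of spending.

```python
def total_spend(row):
    """Add t_cost and t_budget for one franchise row, treating a missing sum as 0."""
    record = dict(row)
    return int(record.get("t_cost", 0)) + int(record.get("t_budget", 0))


# Rows copied from the recorded output of the query above
rows = [
    [("franchise", "http://www.wikidata.org/entity/Q1138878"), ("name", "Friday the 13th"), ("t_cost", "11900000")],
    [("franchise", "http://www.wikidata.org/entity/Q1364022"), ("name", "Halloween"), ("t_cost", "325000"), ("t_budget", "40500000")],
]

totals = {dict(row)["name"]: total_spend(row) for row in rows}
print(totals)
```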


In [14]:
### insert the result of TASK 7 in the file
obj1 = {"refers_to":x[0][0][1],"check":"value","refers_to_name":x[0][1][1],"value":x[0][2][1]}
obj2 = {"refers_to":x[1][0][1],"check":"value","refers_to_name":x[1][1][1],"value":x[1][3][1]}
obj = [obj1,obj2]

add_result("7", TYPE_REFERRED ,"value",obj)

JSON object updated
