# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [WikiData](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload starts with a usually vague, open-ended question, and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that is present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP that returns the human-readable name of a property or a class in Wikidata.

In [1]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-61211000ef-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString,verbose = True):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        results = sparql.query()
        json_results = results.convert()
        if len(json_results['results']['bindings'])==0:
            print("Empty")
            return []
        array = []
        for bindings in json_results['results']['bindings']:
            app =  [ (var, value['value'])  for var, value in bindings.items() ] 
            if verbose:
                print( app)
            array.append(app)
        if verbose:
            print(len(array))
        return array

    except Exception as e :
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
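
Each row returned by `run_query` is a list of `(variable, value)` tuples, which is awkward to index by variable name. As a minimal sketch, a helper such as `rows_to_dicts` (a hypothetical name, not part of the original setup) could turn each row into a dictionary. The sample rows below only mimic the output format of `run_query`.

```python
def rows_to_dicts(rows):
    """Turn run_query rows (lists of (var, value) tuples) into dicts.

    `rows` may be None, because run_query returns None when the request fails.
    """
    return [dict(row) for row in (rows or [])]


# Sample rows in the same shape that run_query prints and returns.
sample = [
    [("p", "http://www.wikidata.org/prop/direct/P31"), ("name", "instance of")],
    [("p", "http://www.wikidata.org/prop/direct/P279"), ("name", "subclass of")],
]

records = rows_to_dicts(sample)
for record in records:
    print(record["p"], "-", record["name"])
```

With this in place, `rows_to_dicts(run_query(queryString, verbose=False))` would give one dictionary per result row.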

# Movie Workflow Series ("Film Genre and composer explorative search") 

Consider the following exploratory information need:

> Investigate the results concerning the different film genres over the years and the composers for the cinema.

## Useful URIs for the current workflow
The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P106`    | occupation    | predicate | 
| `wdt:P279`    | subclass      | predicate |
| `wdt:P27`     | nationality   | predicate |
| `wdt:P3342`     | Significant person       | predicate |
| `wd:Q5`| Human       | node |
| `wd:Q25089`| Woody Allen       | node |





Also consider

```
wd:Q25089 ?p ?obj .
```

is the BGP to retrieve all **properties of Woody Allen**

Please note that whenever you return a resource, you should return both its IRI and its label. In particular, when a task requires you to identify a BGP, the result set must always be a list of IRI-label pairs.
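
The two BGPs given so far (the properties of a node and the `sc:name` label) can be combined to list properties as IRI-label pairs. The sketch below builds such a query string. The function name `properties_with_labels_query` is hypothetical, and the `wd:` and `sc:` prefixes are assumed to be supplied by `prefixString` from the setup cell.

```python
def properties_with_labels_query(entity, limit=20):
    """Build a SELECT query listing the properties of `entity` as IRI-label pairs."""
    return f"""
SELECT DISTINCT ?p ?name
WHERE {{
   {entity} ?p ?obj .
   ?p sc:name ?name .
}}
LIMIT {limit}
"""


# Query text for the properties of Woody Allen (wd:Q25089).
woody_allen_query = properties_with_labels_query("wd:Q25089")
print(woody_allen_query)
```

The resulting string can be passed to `run_query` in the same way as the hand-written queries in the tasks.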

The workload should:


1. Identify the BGP for films

2. Identify the BGP for composer

3. Identify the BGP for film genre

4. Find how many films were released in the United States of America from 2010-01-01 to 2015-12-31 for each available film genre (the result set must be genre IRI, label and #films).

5. Consider the timespan from 2001-01-01 to the present. Find the number of films released in this timespan grouped by film genre, and return only the genres with more than 50 films released every year (the result set must be genre IRI and label).

6. Consider the composers of Western films and their country of citizenship. Count the number of Western films a composer worked on, grouped by country of citizenship, and return the top 10 (the result set must be country IRI, label and #films).

7. Consider the decades from 1961 to 1970 and from 2001 to 2010, and compare the total number of Western films released in each one (the result set must be two pairs: decade and #Western films of that decade).

8. Consider the decades from 1961 to 1970 and from 2001 to 2010, and select only Western films. Then, for each decade, compare the average cast size per film (the result set must be two pairs: decade and average cast size of that decade).

## Task 1
Identify the BGP for films

In [13]:
# Reach film-related classes from Woody Allen: "notable work" (P800) -> "genre" (P136) -> "subclass of" (P279)
queryString = """
SELECT DISTINCT ?filmwanted ?name
WHERE {
   # bind something
   wd:Q25089 ?p ?obj .
   wd:Q25089 wdt:P800 ?film.
   ?film wdt:P136 ?genre.
   ?genre wdt:P279 ?filmwanted.
   # get the label
   ?filmwanted sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('filmwanted', 'http://www.wikidata.org/entity/Q2973201'), ('name', 'political film')]
[('filmwanted', 'http://www.wikidata.org/entity/Q15416'), ('name', 'television program')]
[('filmwanted', 'http://www.wikidata.org/entity/Q181001'), ('name', 'erotica')]
[('filmwanted', 'http://www.wikidata.org/entity/Q182015'), ('name', 'thriller')]
[('filmwanted', 'http://www.wikidata.org/entity/Q21010853'), ('name', 'drama')]
[('filmwanted', 'http://www.wikidata.org/entity/Q170539'), ('name', 'parody')]
[('filmwanted', 'http://www.wikidata.org/entity/Q185867'), ('name', 'film noir')]
[('filmwanted', 'http://www.wikidata.org/entity/Q905770'), ('name', 'soft science fiction')]
[('filmwanted', 'http://www.wikidata.org/entity/Q1257444'), ('name', 'film adaptation')]
[('filmwanted', 'http://www.wikidata.org/entity/Q15062348'), ('name', 'dystopian fiction')]
[('filmwanted', 'http://www.wikidata.org/entity/Q1054574'), ('name', 'romance film')]
[('filmwanted', 'http://www.wikidata.org/entity/Q114

Final query for this task

In [8]:
# write your final query

queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q11424 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1014'), ('name', 'Art & Architecture Thesaurus ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1036'), ('name', 'Dewey Decimal Classification')]
[('p', 'http://www.wikidata.org/prop/direct/P1051'), ('name', 'PSH ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1151'), ('name', "topic's main Wikimedia portal")]
[('p', 'http://www.wikidata.org/prop/direct/P1225'), ('name', 'U.S. National Archives Identifier')]
[('p', 'http://www.wikidata.org/prop/direct/P1245'), ('name', 'OmegaWiki Defined Meaning')]
[('p', 'http://www.wikidata.org/prop/direct/P1256'), ('name', 'Iconclass notation')]
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('name', 'described by source')]
[('p', 'http://www.wikidata.org/prop/direct/P1368'), ('name', 'LNB ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1417'), ('name', 'Encyclopædia Britannica Online ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1424'), ('name', "topic's main template")]
[('p',

## Task 2
Identify the BGP for composer

In [31]:
# write your queries
# Look for the composer property among the "properties for this type" (P1963) of film (Q11424)
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q11424 wdt:P1963 ?obj .
   # get the label
   ?obj sc:name ?name.
}limit 20
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/P345'), ('name', 'IMDb ID')]
[('obj', 'http://www.wikidata.org/entity/P1040'), ('name', 'film editor')]
[('obj', 'http://www.wikidata.org/entity/P344'), ('name', 'director of photography')]
[('obj', 'http://www.wikidata.org/entity/P1476'), ('name', 'title')]
[('obj', 'http://www.wikidata.org/entity/P2047'), ('name', 'duration')]
[('obj', 'http://www.wikidata.org/entity/P162'), ('name', 'producer')]
[('obj', 'http://www.wikidata.org/entity/P921'), ('name', 'main subject')]
[('obj', 'http://www.wikidata.org/entity/P577'), ('name', 'publication date')]
[('obj', 'http://www.wikidata.org/entity/P495'), ('name', 'country of origin')]
[('obj', 'http://www.wikidata.org/entity/P3816'), ('name', 'film script')]
[('obj', 'http://www.wikidata.org/entity/P179'), ('name', 'part of the series')]
[('obj', 'http://www.wikidata.org/entity/P480'), ('name', 'FilmAffinity ID')]
[('obj', 'http://www.wikidata.org/entity/P136'), ('name', 'genre')]
[('obj', 'htt

Final query for this task

In [18]:
# write your final query

queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:P86 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1628'), ('name', 'equivalent property')]
[('p', 'http://www.wikidata.org/prop/direct/P2236'), ('name', 'external subproperty')]
[('p', 'http://www.wikidata.org/prop/direct/P2888'), ('name', 'exact match')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P1659'), ('name', 'see also')]
[('p', 'http://www.wikidata.org/prop/direct/P1629'), ('name', 'Wikidata item of this property')]
[('p', 'http://www.wikidata.org/prop/direct/P1647'), ('name', 'subproperty of')]
[('p', 'http://www.wikidata.org/prop/direct/P1855'), ('name', 'Wikidata property example')]
[('p', 'http://www.wikidata.org/prop/direct/P2302'), ('name', 'property constraint')]
[('p', 'http://www.wikidata.org/prop/direct/P2875'), ('name', 'property usage tracking category')]
[('p', 'http://www.wikidata.org/prop/direct/P3254'), ('name', 'property proposal discussion')]
[('p', 'http://www.wikidata.org/prop/dir

## Task 3
Identify the BGP for film genre

In [173]:
# write your queries
queryString = """ 
SELECT DISTINCT ?obj ?name 
WHERE {
   # bind something
   #wd:Q11424 ?p ?obj . #we get genre data
   
   #wd:Q483394 ?p ?obj. #from genre we get "list of genres" by "has list" relation.

   wd:Q6621591 ?p ?obj. #from "list of genres" by "is a list of" relation we get to film genre. 
   
  
   # get the label
   ?obj sc:name ?name.
}limit 20
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q15961987'), ('name', 'television genre')]
[('obj', 'http://www.wikidata.org/entity/Q483394'), ('name', 'genre')]
[('obj', 'http://www.wikidata.org/entity/Q201658'), ('name', 'film genre')]
[('obj', 'http://www.wikidata.org/entity/Q659563'), ('name', 'video game genre')]
[('obj', 'http://www.wikidata.org/entity/Q13406463'), ('name', 'Wikimedia list article')]
[('obj', 'http://www.wikidata.org/entity/Q188451'), ('name', 'music genre')]
[('obj', 'http://www.wikidata.org/entity/Q223393'), ('name', 'literary genre')]
7


Final query for this task

In [174]:
# write your final query
queryString = """ 
SELECT DISTINCT ?p ?name 
WHERE {
  wd:Q201658 ?p ?obj.
  ?p sc:name ?name.
}limit 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1014'), ('name', 'Art & Architecture Thesaurus ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1256'), ('name', 'Iconclass notation')]
[('p', 'http://www.wikidata.org/prop/direct/P227'), ('name', 'GND ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2354'), ('name', 'has list')]
[('p', 'http://www.wikidata.org/prop/direct/P244'), ('name', 'Library of Congress authority ID')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P3235'), ('name', 'PhilPapers topic')]
[('p', 'http://www.wikidata.org/prop/direct/P3553'), ('name', 'Zhihu topic ID')]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('name', 'part of')]
[('p', 'http://www.wikidata.org/prop/direct/P373'), ('name', 'Commons category')]
[('p', 'http://www.wikidata.org/prop/direct/P3827'), ('name', 'JSTOR topic ID')]
[('p', 'htt

## Task 4
Find how many films were released in the United States of America from 2010-01-01 to 2015-12-31 for each available film genre (the result set must be genre IRI, label and #films).

In [139]:
# write your queries
# Films are tied to the USA through "country of origin"; the USA node is reachable from Woody Allen via "country of citizenship".
# The film "genre" relation gives the genre data.
# The film "publication date" relation gives the date data.
queryString = """
SELECT DISTINCT ?obj ?name 
WHERE {
   # bind something
   wd:Q11424 ?p ?obj. # inspect film (Q11424): its "properties for this type" include "country of origin", which tells where a film was released
   
   # get the label
   ?obj sc:name ?name.
 
   }

"""

print("Results")
x=run_query(queryString)

Results
Empty


Final query for this task

In [151]:
# write your final query
# Films are restricted to those whose "country of origin" (P495) is the USA (Q30);
# Q30 is reachable from Woody Allen via "country of citizenship" (P27).
# The film "genre" relation (P136) gives the genre, "publication date" (P577) gives the date.
queryString = """
SELECT DISTINCT (count(?s) as ?filmNum) ?genre ?name 
WHERE {
   # bind something
   ?s wdt:P495 wd:Q30. # films whose country of origin is the USA
   ?s wdt:P136 ?genre. # genre of each of these films
   ?s wdt:P577 ?date.  # publication date
   
   # get the label
   ?genre sc:name ?name.
   filter((?date > "2010-01-01"^^xsd:dateTime) && (?date < "2015-12-31"^^xsd:dateTime))
   }
   group by ?genre ?name # one count per genre
   limit 20

"""

print("Results")
x=run_query(queryString)

Results
[('filmNum', '2400'), ('genre', 'http://www.wikidata.org/entity/Q130232'), ('name', 'drama')]
[('filmNum', '873'), ('genre', 'http://www.wikidata.org/entity/Q200092'), ('name', 'horror film')]
[('filmNum', '56'), ('genre', 'http://www.wikidata.org/entity/Q858330'), ('name', 'romance novel')]
[('filmNum', '167'), ('genre', 'http://www.wikidata.org/entity/Q10992055'), ('name', 'speculative fiction novel')]
[('filmNum', '10'), ('genre', 'http://www.wikidata.org/entity/Q459435'), ('name', 'mockumentary')]
[('filmNum', '19'), ('genre', 'http://www.wikidata.org/entity/Q1379958'), ('name', 'contemporary Christian music')]
[('filmNum', '5'), ('genre', 'http://www.wikidata.org/entity/Q101856615'), ('name', 'zombie television program')]
[('filmNum', '130'), ('genre', 'http://www.wikidata.org/entity/Q2975633'), ('name', 'coming-of-age story')]
[('filmNum', '35'), ('genre', 'http://www.wikidata.org/entity/Q1033891'), ('name', 'martial arts film')]
[('filmNum', '53'), ('genre', 'http://www.
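
The date bounds of this task are easy to mistype (the task text originally read `2015-31-12`), and the `xsd:dateTime` lexical form includes a time component. As a hedged sketch, the bounds could be generated with Python's `datetime` module, which rejects an impossible month or day. The helper name `xsd_datetime_literal` is hypothetical, the `xsd:` prefix is assumed to be predefined by the endpoint (as the queries above already rely on), and how the endpoint treats date-only literals is not verified here.

```python
from datetime import date, datetime, time


def xsd_datetime_literal(day):
    """Format a date as an xsd:dateTime literal at midnight of that day."""
    stamp = datetime.combine(day, time.min).isoformat()
    return f'"{stamp}"^^xsd:dateTime'


start = xsd_datetime_literal(date(2010, 1, 1))
# An exclusive upper bound on the following day covers the whole of 2015-12-31.
end = xsd_datetime_literal(date(2016, 1, 1))

date_filter = f"FILTER(?date >= {start} && ?date < {end})"
print(date_filter)
```

Note that `date(2015, 31, 12)` raises a `ValueError`, so a swapped month and day is caught before the query is sent.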

## Task 5
Consider the timespan from 2001-01-01 to the present. Find the number of films released in this timespan grouped by film genre, and return only the genres with more than 50 films released every year (the result set must be genre IRI and label).

In [152]:
# write your queries
# adapted from the previous query
queryString = """
SELECT DISTINCT (count(?s) as ?filmNum) ?genre ?name # add ?date to the SELECT to see the separation by date
WHERE {
   # bind something
   ?s wdt:P136 ?genre. # genre of each film
   ?s wdt:P577 ?date.
   
   # get the label
   ?genre sc:name ?name.
   filter((?date > "2001"^^xsd:dateTime) ) # updated date restriction
   }group by ?genre ?name ?date # ?date in the GROUP BY splits the counts by publication date as well as by genre
   having(count(?s) > 50) # keep only the groups with more than 50 films
   limit 20

"""

print("Results")
x=run_query(queryString)

Results
[('filmNum', '303'), ('genre', 'http://www.wikidata.org/entity/Q157443'), ('name', 'comedy film')]
[('filmNum', '54'), ('genre', 'http://www.wikidata.org/entity/Q9778'), ('name', 'electronic music')]
[('filmNum', '67'), ('genre', 'http://www.wikidata.org/entity/Q973150'), ('name', 'mandopop')]
[('filmNum', '229'), ('genre', 'http://www.wikidata.org/entity/Q200092'), ('name', 'horror film')]
[('filmNum', '200'), ('genre', 'http://www.wikidata.org/entity/Q200092'), ('name', 'horror film')]
[('filmNum', '55'), ('genre', 'http://www.wikidata.org/entity/Q645928'), ('name', 'biographical film')]
[('filmNum', '61'), ('genre', 'http://www.wikidata.org/entity/Q645928'), ('name', 'biographical film')]
[('filmNum', '62'), ('genre', 'http://www.wikidata.org/entity/Q20442589'), ('name', 'LGBT-related film')]
[('filmNum', '53'), ('genre', 'http://www.wikidata.org/entity/Q637866'), ('name', 'book review')]
[('filmNum', '70'), ('genre', 'http://www.wikidata.org/entity/Q235858'), ('name', 'trad

Final query for this task

In [159]:
# write your final query
# adapted from the previous query
queryString = """
SELECT DISTINCT (count(?s) as ?filmNum) ?genre ?name # add ?date to the SELECT to see the separation by date
WHERE {
   # bind something
   ?s wdt:P136 ?genre. # genre of each film
   ?s wdt:P577 ?date.
   
   # get the label
   ?genre sc:name ?name.
   filter((?date > "2001"^^xsd:dateTime) ) # updated date restriction
   }group by ?genre ?name ?date # ?date in the GROUP BY splits the counts by publication date as well as by genre
   having(count(?s) > 50) # keep only the groups with more than 50 films
   limit 20

"""

print("Results")
x=run_query(queryString)

Results
[('filmNum', '303'), ('genre', 'http://www.wikidata.org/entity/Q157443'), ('name', 'comedy film')]
[('filmNum', '54'), ('genre', 'http://www.wikidata.org/entity/Q9778'), ('name', 'electronic music')]
[('filmNum', '67'), ('genre', 'http://www.wikidata.org/entity/Q973150'), ('name', 'mandopop')]
[('filmNum', '229'), ('genre', 'http://www.wikidata.org/entity/Q200092'), ('name', 'horror film')]
[('filmNum', '200'), ('genre', 'http://www.wikidata.org/entity/Q200092'), ('name', 'horror film')]
[('filmNum', '55'), ('genre', 'http://www.wikidata.org/entity/Q645928'), ('name', 'biographical film')]
[('filmNum', '61'), ('genre', 'http://www.wikidata.org/entity/Q645928'), ('name', 'biographical film')]
[('filmNum', '62'), ('genre', 'http://www.wikidata.org/entity/Q20442589'), ('name', 'LGBT-related film')]
[('filmNum', '53'), ('genre', 'http://www.wikidata.org/entity/Q637866'), ('name', 'book review')]
[('filmNum', '70'), ('genre', 'http://www.wikidata.org/entity/Q235858'), ('name', 'trad
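
The requirement "more than 50 films released every year" can also be checked on the client side. The sketch below assumes a per-year query had returned rows with the variables `genre`, `year` and `filmNum`. Those variable names, the function name `genres_above_every_year` and the sample numbers are all illustrative assumptions, not results from the endpoint.

```python
from collections import defaultdict


def genres_above_every_year(rows, first_year, last_year, threshold=50):
    """Return the genres whose count exceeds `threshold` in every year of the span."""
    per_genre = defaultdict(dict)  # genre IRI -> {year: film count}
    for row in rows:
        rec = dict(row)
        per_genre[rec["genre"]][int(rec["year"])] = int(rec["filmNum"])
    years = range(first_year, last_year + 1)
    return sorted(
        genre
        for genre, counts in per_genre.items()
        # A year with no row counts as zero films, so the genre is dropped.
        if all(counts.get(year, 0) > threshold for year in years)
    )


# Made-up rows in run_query's output format (not real Wikidata counts).
sample = [
    [("genre", "http://example.org/genreA"), ("year", "2001"), ("filmNum", "60")],
    [("genre", "http://example.org/genreA"), ("year", "2002"), ("filmNum", "75")],
    [("genre", "http://example.org/genreB"), ("year", "2001"), ("filmNum", "90")],
    [("genre", "http://example.org/genreB"), ("year", "2002"), ("filmNum", "12")],
]

selected = genres_above_every_year(sample, 2001, 2002)
print(selected)
```

Only `genreA` passes, because `genreB` falls below the threshold in 2002.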

## Task 6
Consider the composers of Western films and their country of citizenship. Count the number of Western films a composer worked on, grouped by country of citizenship, and return the top 10 (the result set must be country IRI, label and #films).

In [190]:
# write your queries

queryString = """
SELECT DISTINCT ?country ?name (count(?film) as ?filmNum) 
WHERE {
    #?sub ?p wd:Q201658. #we find "western films" through "film genre" data
    
    ?film ?p wd:Q172980. #we get all the movies which are western. 
    ?film wdt:P86 ?composer. #we get the composers from these movies.
    ?composer wdt:P27 ?country. #we get the country of citizenship.
    
    ?country sc:name ?name.

   
   } group by ?country ?name
   order by desc(?filmNum)
   limit 10
   

"""

print("Results")
x=run_query(queryString)

Results
[('country', 'http://www.wikidata.org/entity/Q30'), ('name', 'United States of America'), ('filmNum', '1080')]
[('country', 'http://www.wikidata.org/entity/Q145'), ('name', 'United Kingdom'), ('filmNum', '59')]
[('country', 'http://www.wikidata.org/entity/Q183'), ('name', 'Germany'), ('filmNum', '45')]
[('country', 'http://www.wikidata.org/entity/Q34266'), ('name', 'Russian Empire'), ('filmNum', '45')]
[('country', 'http://www.wikidata.org/entity/Q38'), ('name', 'Italy'), ('filmNum', '41')]
[('country', 'http://www.wikidata.org/entity/Q40'), ('name', 'Austria'), ('filmNum', '36')]
[('country', 'http://www.wikidata.org/entity/Q15180'), ('name', 'Soviet Union'), ('filmNum', '36')]
[('country', 'http://www.wikidata.org/entity/Q172579'), ('name', 'Kingdom of Italy'), ('filmNum', '32')]
[('country', 'http://www.wikidata.org/entity/Q28513'), ('name', 'Austria-Hungary'), ('filmNum', '31')]
[('country', 'http://www.wikidata.org/entity/Q142'), ('name', 'France'), ('filmNum', '22')]
10


Final query for this task

In [208]:
# write your final query

queryString = """
SELECT DISTINCT ?country ?name (count(?film) as ?filmNum) 
WHERE {
    #?sub ?p wd:Q201658. #we find "western films" through "film genre" data
    
    ?film ?p wd:Q172980. #we get all the movies which are western. 
    ?film wdt:P86 ?composer. #we get the composers from these movies.
    ?composer wdt:P27 ?country. #we get the country of citizenship.
    
    ?country sc:name ?name.

   
   } group by ?country ?name
   order by desc(?filmNum)
   limit 10
   

"""

print("Results")
x=run_query(queryString)

Results
[('country', 'http://www.wikidata.org/entity/Q30'), ('name', 'United States of America'), ('filmNum', '1080')]
[('country', 'http://www.wikidata.org/entity/Q145'), ('name', 'United Kingdom'), ('filmNum', '59')]
[('country', 'http://www.wikidata.org/entity/Q183'), ('name', 'Germany'), ('filmNum', '45')]
[('country', 'http://www.wikidata.org/entity/Q34266'), ('name', 'Russian Empire'), ('filmNum', '45')]
[('country', 'http://www.wikidata.org/entity/Q38'), ('name', 'Italy'), ('filmNum', '41')]
[('country', 'http://www.wikidata.org/entity/Q40'), ('name', 'Austria'), ('filmNum', '36')]
[('country', 'http://www.wikidata.org/entity/Q15180'), ('name', 'Soviet Union'), ('filmNum', '36')]
[('country', 'http://www.wikidata.org/entity/Q172579'), ('name', 'Kingdom of Italy'), ('filmNum', '32')]
[('country', 'http://www.wikidata.org/entity/Q28513'), ('name', 'Austria-Hungary'), ('filmNum', '31')]
[('country', 'http://www.wikidata.org/entity/Q142'), ('name', 'France'), ('filmNum', '22')]
10


## Task 7
Consider the decades from 1961 to 1970 and from 2001 to 2010, and compare the total number of Western films released in each one (the result set must be two pairs: decade and #Western films of that decade).

In [304]:
# write your queries
# A UNION combines the two decades; films are counted per release year.
queryString = """
SELECT DISTINCT ?year (count(?film) as ?filmTotal)
WHERE {
    {
    ?film ?p wd:Q172980. # all films linked to the western genre
    ?film wdt:P577 ?releaseDate. # release date
    bind(STRBEFORE(str(?releaseDate), '-') AS ?year). # keep only the year of the date
    
    filter((?releaseDate > "1961"^^xsd:dateTime)&&(?releaseDate < "1970"^^xsd:dateTime)).
    }
    UNION
    {
    ?film ?p wd:Q172980. # all films linked to the western genre
    ?film wdt:P577 ?releaseDate.
    bind(STRBEFORE(str(?releaseDate), '-') AS ?year).

    filter((?releaseDate > "2001"^^xsd:dateTime)&&(?releaseDate < "2010"^^xsd:dateTime)).
    }   
   }
    group by ?year
    limit 20
   

"""

print("Results")
x=run_query(queryString)

Results
[('year', '2009'), ('filmTotal', '7')]
[('year', '2002'), ('filmTotal', '8')]
[('year', '1968'), ('filmTotal', '24')]
[('year', '1970'), ('filmTotal', '35')]
[('year', '2001'), ('filmTotal', '4')]
[('year', '1966'), ('filmTotal', '32')]
[('year', '1965'), ('filmTotal', '38')]
[('year', '2005'), ('filmTotal', '8')]
[('year', '1961'), ('filmTotal', '6')]
[('year', '2007'), ('filmTotal', '14')]
[('year', '1964'), ('filmTotal', '26')]
[('year', '1967'), ('filmTotal', '29')]
[('year', '2008'), ('filmTotal', '16')]
[('year', '2004'), ('filmTotal', '7')]
[('year', '2006'), ('filmTotal', '8')]
[('year', '1969'), ('filmTotal', '28')]
[('year', '1963'), ('filmTotal', '12')]
[('year', '1962'), ('filmTotal', '19')]
[('year', '2010'), ('filmTotal', '13')]
[('year', '2003'), ('filmTotal', '12')]
20
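
The task asks for one total per decade, while the rows above are per year. As a minimal sketch, the per-year rows could be folded into decades on the client side. The function name `count_by_decade` and the `DECADES` mapping are hypothetical, and the sample holds only four of the rows printed above.

```python
def count_by_decade(rows, decades):
    """Sum the per-year film counts into one total per decade."""
    totals = {label: 0 for label in decades}
    for row in rows:
        rec = dict(row)
        year = int(rec["year"])
        for label, (start, end) in decades.items():
            if start <= year <= end:
                totals[label] += int(rec["filmTotal"])
    return totals


# Decade boundaries as stated in the task (inclusive on both ends).
DECADES = {"1961-1970": (1961, 1970), "2001-2010": (2001, 2010)}

# Four rows copied from the output above, in run_query's format.
sample = [
    [("year", "2009"), ("filmTotal", "7")],
    [("year", "2002"), ("filmTotal", "8")],
    [("year", "1968"), ("filmTotal", "24")],
    [("year", "1970"), ("filmTotal", "35")],
]

totals = count_by_decade(sample, DECADES)
print(totals)
```

Passing the full list `x` returned by the cell above instead of `sample` would fold all of its rows into the two decade totals.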


Final query for this task

## Task 8
Consider the decades from 1961 to 1970 and from 2001 to 2010, and select only Western films. Then, for each decade, compare the average cast size per film (the result set must be two pairs: decade and average cast size of that decade).

In [308]:
# write your queries
#we use union to get two different data for two different decades.
queryString = """
SELECT DISTINCT ?year ?name (count(?cast) as ?castNum)
WHERE {
    {
    ?film ?p wd:Q172980. #we get all the movies which are western. 
    ?film wdt:P577 ?releaseDate. #we get release date.
    bind(STRBEFORE(str(?releaseDate), '-')AS ?year). #we take only the year of the date.
    
    filter((?releaseDate > "1961"^^xsd:dateTime)&&(?releaseDate < "1970"^^xsd:dateTime)).
    
    ?film wdt:P161 ?cast.
    
    }
    UNION
    {
    ?film ?p wd:Q172980. #we get all the movies which are western. 
    ?film wdt:P577 ?releaseDate.
    bind(STRBEFORE(str(?releaseDate), '-')AS ?year).

    filter((?releaseDate > "2001"^^xsd:dateTime)&&(?releaseDate < "2010"^^xsd:dateTime)).
    
    ?film wdt:P161 ?cast.
    
    }   
   }
    group by ?year ?name
    limit 20
   

"""

print("Results")
x=run_query(queryString)

Results
The operation failed QueryBadFormed: a bad request has been sent to the endpoint, probably the sparql query is bad formed. 

Response:
b'Virtuoso 37000 Error SP030: SPARQL compiler, line 25: syntax error at \'UNION\'\n\nSPARQL query:\ndefine sql:big-data-const 0\n#output-format:application/sparql-results+json\n\n##-61211000ef-##\nPREFIX wd: <http://www.wikidata.org/entity/> \nPREFIX wdt: <http://www.wikidata.org/prop/direct/> \nPREFIX sc: <http://schema.org/>\n\n\nSELECT DISTINCT ?year ?name ?castNum1 ?castNum2\nWHERE {\n    select ?film ?name (avg(?cast) as ?castNum1)\n    where{\n    ?film ?p wd:Q172980. #we get all the movies which are western. \n    ?film wdt:P577 ?releaseDate. #we get release date.\n    bind(STRBEFORE(str(?releaseDate), \'-\')AS ?year). #we take only the year of the date.\n    \n    filter((?releaseDate > "1961"^^xsd:dateTime)&&(?releaseDate < "1970"^^xsd:dateTime)).\n    \n    ?film wdt:P161 ?cast.\n    \n    } group by ?film ?name\n      having (count(?c

Final query for this task

In [306]:
# write your final query
# list cast members ("cast member", P161) of western films together with a label
queryString = """
SELECT DISTINCT ?cast ?name 
WHERE {
      ?film ?p wd:Q172980.
      ?film sc:name ?name.
      ?film wdt:P161 ?cast.
      ?cast sc:name ?name.
   }
    limit 20
   

"""

print("Results")
x=run_query(queryString)

Results
[('cast', 'http://www.wikidata.org/entity/Q16889560'), ('name', 'Peggy Montgomery')]
[('cast', 'http://www.wikidata.org/entity/Q266888'), ('name', 'Vinessa Shaw')]
[('cast', 'http://www.wikidata.org/entity/Q7305588'), ('name', 'Redd Harper')]
[('cast', 'http://www.wikidata.org/entity/Q3040473'), ('name', 'Duane Thompson')]
[('cast', 'http://www.wikidata.org/entity/Q7373408'), ('name', 'Roy Stewart')]
[('cast', 'http://www.wikidata.org/entity/Q3760487'), ('name', 'George F. Marion')]
[('cast', 'http://www.wikidata.org/entity/Q749098'), ('name', 'Bob Burns')]
[('cast', 'http://www.wikidata.org/entity/Q1797384'), ('name', 'Lutz Moik')]
[('cast', 'http://www.wikidata.org/entity/Q1794297'), ('name', 'Kurt Zips')]
[('cast', 'http://www.wikidata.org/entity/Q433198'), ('name', 'Paula Malcomson')]
[('cast', 'http://www.wikidata.org/entity/Q232840'), ('name', 'Honor Blackman')]
[('cast', 'http://www.wikidata.org/entity/Q455877'), ('name', 'Stole Aranđelović')]
[('cast', 'http://www.wikid
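
The average cast size per film in each decade is not computed by the queries above. One way to obtain it, sketched here under assumptions, is to fetch one row per film and cast member and aggregate on the client side. The sketch assumes rows with the variables `film`, `year` and `cast`. Those names, the function name `average_cast_by_decade` and the sample IRIs are illustrative and do not come from the endpoint.

```python
from collections import defaultdict


def average_cast_by_decade(rows, decades):
    """Average number of distinct cast members per film, for each decade."""
    cast_per_film = defaultdict(set)  # (decade label, film IRI) -> cast IRIs
    for row in rows:
        rec = dict(row)
        year = int(rec["year"])
        for label, (start, end) in decades.items():
            if start <= year <= end:
                cast_per_film[(label, rec["film"])].add(rec["cast"])
    sizes = defaultdict(list)  # decade label -> list of cast sizes
    for (label, _film), cast in cast_per_film.items():
        sizes[label].append(len(cast))
    return {label: sum(values) / len(values) for label, values in sizes.items()}


DECADES = {"1961-1970": (1961, 1970), "2001-2010": (2001, 2010)}

# Made-up rows in run_query's output format (placeholder IRIs).
sample = [
    [("film", "ex:f1"), ("year", "1965"), ("cast", "ex:a")],
    [("film", "ex:f1"), ("year", "1965"), ("cast", "ex:b")],
    [("film", "ex:f1"), ("year", "1965"), ("cast", "ex:c")],
    [("film", "ex:f2"), ("year", "1969"), ("cast", "ex:a")],
    [("film", "ex:f3"), ("year", "2005"), ("cast", "ex:d")],
    [("film", "ex:f3"), ("year", "2005"), ("cast", "ex:e")],
    [("film", "ex:f3"), ("year", "2005"), ("cast", "ex:f")],
    [("film", "ex:f3"), ("year", "2005"), ("cast", "ex:g")],
]

averages = average_cast_by_decade(sample, DECADES)
print(averages)
```

Films without any cast member never appear in such rows, so they do not contribute to the average in this sketch.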