# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query is related to the information obtained previously.

An exploratory workload usually starts with a vague, open-ended question, and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook.
3. You do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.

In [1]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-f457feb412-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString,verbose = True):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        results = sparql.query()
        json_results = results.convert()
        if len(json_results['results']['bindings'])==0:
            print("Empty")
            return []
        array = []
        for bindings in json_results['results']['bindings']:
            app =  [ (var, value['value'])  for var, value in bindings.items() ] 
            if verbose:
                print( app)
            array.append(app)
        if verbose:
            print(len(array))
        return array

    except Exception as e :
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
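
The rows returned by `run_query` are lists of `(variable, value)` tuples, as the outputs in this notebook show. A minimal sketch of a post-processing helper, assuming that shape; `rows_to_dicts` is a hypothetical name and is not part of SPARQLWrapper:

```python
def rows_to_dicts(rows):
    """Turn run_query-style rows into dictionaries keyed by variable name."""
    # Treat a missing result (None) as "no rows".
    return [dict(row) for row in (rows or [])]


# Sample rows in the shape printed by run_query.
sample = [
    [("p", "http://www.wikidata.org/prop/direct/P106"), ("name", "occupation")],
    [("p", "http://www.wikidata.org/prop/direct/P108"), ("name", "employer")],
]
records = rows_to_dicts(sample)
print(records[0]["name"])
```

Dictionaries make it easier to pick a single column, for example `[r["p"] for r in records]`.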

# Movie Workflow Series ("Film Genre and composer explorative search") 

Consider the following exploratory information need:

> Investigate the results concerning the different film genres over the years and the composers for the cinema.

## Useful URIs for the current workflow
The following are given:

| IRI         | Description        | Role      |
| ----------- | ------------------ | --------- |
| `wdt:P1647` | subproperty        | predicate |
| `wdt:P31`   | instance of        | predicate |
| `wdt:P106`  | occupation         | predicate |
| `wdt:P279`  | subclass           | predicate |
| `wdt:P27`   | nationality        | predicate |
| `wdt:P3342` | Significant person | predicate |
| `wd:Q5`     | Human              | node      |
| `wd:Q25089` | Woody Allen        | node      |





Also consider

```
wd:Q25089 ?p ?obj .
```

is the BGP to retrieve all **properties of Woody Allen**.

Please consider that when you return a resource, you should return the IRI and the label of the resource. In particular, when the task requires you to identify a BGP, the result set must always be a list of IRI-label pairs.
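
The IRI-label requirement can be checked mechanically on the rows printed by `run_query`. A small sketch, assuming rows are lists of `(variable, value)` tuples and that IRIs start with `http://` or `https://`; `is_iri_label_pair` is a hypothetical helper:

```python
def is_iri_label_pair(row):
    """Return True if a row holds exactly one IRI and one plain label."""
    values = [value for _, value in row]
    if len(values) != 2:
        return False
    iris = [v for v in values if v.startswith(("http://", "https://"))]
    return len(iris) == 1


good = [("or", "http://www.wikidata.org/entity/Q11424"), ("o", "film")]
bad = [("or", "http://www.wikidata.org/entity/Q11424")]  # label is missing
print(is_iri_label_pair(good), is_iri_label_pair(bad))
```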

The workload should:


1. Identify the BGP for films

2. Identify the BGP for composer

3. Identify the BGP for film genre

4. Find how many films have been released in the United States of America from 2010-01-01 to 2015-12-31 for each film genre available (the result set must be genre IRI, label and #films).

5. Consider the timespan from 2001-01-01 to the present. Find the number of films released in this timespan grouped by film genre and return only the genres with more than 50 films released every year (the result set must be genre IRI and label).

6. Consider the composers of Western films and their country of citizenship. Count the number of Western films a composer worked on, grouped by country of citizenship, and return the top 10 (the result set must be country IRI, label and #films).

7. Consider the decades from 1961 to 1970 and from 2001 to 2010, and compare the total number of Western films released in each one (the result set must be two pairs of decade and #Western films of that decade).

8. Consider the decades from 1961 to 1970 and from 2001 to 2010, and select only Western films. Then, for each decade, compare the average cast size per film (the result set must be two pairs of decade and average cast size of that decade).

## Task 1
Identify the BGP for films

Exploring Woody Allen:

In [22]:
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   wd:Q25089 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1003'), ('name', 'National Library of Romania ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1005'), ('name', 'Portuguese National Library ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1006'), ('name', 'Nationale Thesaurus voor Auteurs ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1015'), ('name', 'NORAF ID')]
[('p', 'http://www.wikidata.org/prop/direct/P103'), ('name', 'native language')]
[('p', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P108'), ('name', 'employer')]
[('p', 'http://www.wikidata.org/prop/direct/P109'), ('name', 'signature')]
[('p', 'http://www.wikidata.org/prop/direct/P1150'), ('name', 'Regensburg Classification')]
[('p', 'http://www.wikidata.org/prop/direct/P1207'), ('name', 'NUKAT ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1220'), ('name', 'Internet Broadway Database person ID')]
[('p', 'http://www.wikidata.org/prop/direct/P

In [21]:
queryString = """
SELECT DISTINCT ?pr ?p ?or ?o
WHERE {
   # bind something
   wd:Q25089 ?pr ?or .
   # get the label
   ?pr sc:name ?p.
   ?or sc:name ?o
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('pr', 'http://www.wikidata.org/prop/direct/P138'), ('p', 'named after'), ('or', 'http://www.wikidata.org/entity/Q349357'), ('o', 'Woody Herman')]
[('pr', 'http://www.wikidata.org/prop/direct/P26'), ('p', 'spouse'), ('or', 'http://www.wikidata.org/entity/Q1140914'), ('o', 'Soon-Yi Previn')]
[('pr', 'http://www.wikidata.org/prop/direct/P40'), ('p', 'child'), ('or', 'http://www.wikidata.org/entity/Q22673707'), ('o', 'Bechet Dumaine Allen')]
[('pr', 'http://www.wikidata.org/prop/direct/P40'), ('p', 'child'), ('or', 'http://www.wikidata.org/entity/Q22673708'), ('o', 'Manzie Tio Allen')]
[('pr', 'http://www.wikidata.org/prop/direct/P19'), ('p', 'place of birth'), ('or', 'http://www.wikidata.org/entity/Q18426'), ('o', 'The Bronx')]
[('pr', 'http://www.wikidata.org/prop/direct/P451'), ('p', 'unmarried partner'), ('or', 'http://www.wikidata.org/entity/Q102642'), ('o', 'Diane Keaton')]
[('pr', 'http://www.wikidata.org/prop/direct/P166'), ('p', 'award received'), ('or', 'http://www.wiki

I try to find the film class as the intersection of the properties shared by two films:

In [21]:
queryString = """
SELECT DISTINCT ?pr ?p ?or ?o
WHERE {
   # bind something
   wd:Q206124 ?pr ?or . # Check properties in common between Midnight in Paris
   wd:Q971865 ?pr ?or . # and What's Up, Tiger Lily 
   # get the label
   ?pr sc:name ?p.
   ?or sc:name ?o.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('pr', 'http://www.wikidata.org/prop/direct/P31'), ('p', 'instance of'), ('or', 'http://www.wikidata.org/entity/Q11424'), ('o', 'film')]
[('pr', 'http://www.wikidata.org/prop/direct/P364'), ('p', 'original language of film or TV show'), ('or', 'http://www.wikidata.org/entity/Q1860'), ('o', 'English')]
[('pr', 'http://www.wikidata.org/prop/direct/P437'), ('p', 'distribution format'), ('or', 'http://www.wikidata.org/entity/Q723685'), ('o', 'video on demand')]
[('pr', 'http://www.wikidata.org/prop/direct/P462'), ('p', 'color'), ('or', 'http://www.wikidata.org/entity/Q22006653'), ('o', 'color')]
[('pr', 'http://www.wikidata.org/prop/direct/P495'), ('p', 'country of origin'), ('or', 'http://www.wikidata.org/entity/Q30'), ('o', 'United States of America')]
[('pr', 'http://www.wikidata.org/prop/direct/P57'), ('p', 'director'), ('or', 'http://www.wikidata.org/entity/Q25089'), ('o', 'Woody Allen')]
[('pr', 'http://www.wikidata.org/prop/direct/P58'), ('p', 'screenwriter'), ('or', 'http:
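
The query above is effectively a set intersection: it keeps only the (property, value) pairs that both films share. The same idea in plain Python, as an illustration only; the shared pairs are copied from the output above, while the differing pair for the second film is a placeholder and not real data:

```python
# (property, value) pairs for each film.
midnight_in_paris = {
    ("wdt:P31", "wd:Q11424"),    # instance of film
    ("wdt:P57", "wd:Q25089"),    # director Woody Allen
    ("wdt:P136", "wd:Q157394"),  # genre fantasy film (appears later in the notebook)
}
whats_up_tiger_lily = {
    ("wdt:P31", "wd:Q11424"),
    ("wdt:P57", "wd:Q25089"),
    ("wdt:P136", "wd:EXAMPLE"),  # placeholder value, not a real item
}

# Pairs that are not common to both films drop out of the intersection.
shared = midnight_in_paris & whats_up_tiger_lily
print(sorted(shared))
```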

Final query for this task

In [22]:
queryString = """
SELECT DISTINCT ?or ?o
WHERE {
   wd:Q206124 wdt:P31 ?or . # Midnight in Paris, instance of, ??
   # get the label
   ?or sc:name ?o.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('or', 'http://www.wikidata.org/entity/Q11424'), ('o', 'film')]
1


## Task 2
Identify the BGP for composer

In [28]:
queryString = """
SELECT DISTINCT ?pr ?p ?or ?o
WHERE {
   wd:Q206124 ?pr ?or . # Midnight in Paris, ??, ??
   # get the label
   ?pr sc:name ?p.
   ?or sc:name ?o.
   FILTER regex(?p, "composer").
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('pr', 'http://www.wikidata.org/prop/direct/P86'), ('p', 'composer'), ('or', 'http://www.wikidata.org/entity/Q7608199'), ('o', 'Stephane Wrembel')]
1


What I found is the IRI of the "composer" property (?film "composer" ?person). Does composer also exist as an occupation? Let's check.

In [24]:
queryString = """
SELECT DISTINCT ?or ?o
WHERE {
   # bind something
   wd:Q25089 wdt:P106 ?or .
   # get the label
   ?or sc:name ?o
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('or', 'http://www.wikidata.org/entity/Q10800557'), ('o', 'film actor')]
[('or', 'http://www.wikidata.org/entity/Q15981151'), ('o', 'jazz musician')]
[('or', 'http://www.wikidata.org/entity/Q18844224'), ('o', 'science fiction writer')]
[('or', 'http://www.wikidata.org/entity/Q1930187'), ('o', 'journalist')]
[('or', 'http://www.wikidata.org/entity/Q214917'), ('o', 'playwright')]
[('or', 'http://www.wikidata.org/entity/Q245068'), ('o', 'comedian')]
[('or', 'http://www.wikidata.org/entity/Q2526255'), ('o', 'film director')]
[('or', 'http://www.wikidata.org/entity/Q28389'), ('o', 'screenwriter')]
[('or', 'http://www.wikidata.org/entity/Q3282637'), ('o', 'film producer')]
[('or', 'http://www.wikidata.org/entity/Q36180'), ('o', 'writer')]
[('or', 'http://www.wikidata.org/entity/Q36834'), ('o', 'composer')]
[('or', 'http://www.wikidata.org/entity/Q482980'), ('o', 'author')]
[('or', 'http://www.wikidata.org/entity/Q639669'), ('o', 'musician')]
[('or', 'http://www.wikidata.org/entity/Q

Final query for this task

In [30]:
queryString = """
SELECT DISTINCT ?composerr ?composer
WHERE {
   # bind something
   {
       wd:Q25089 wdt:P106 ?composerr .
       # get the label
       ?composerr sc:name ?composer.
       FILTER regex(?composer, "composer", "i").
   }UNION{
       wd:Q206124 ?composerr  ?or . # Midnight in Paris, ??, ??
       # get the label
       ?composerr  sc:name ?composer.
       FILTER regex(?composer, "composer").
   }

}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('composerr', 'http://www.wikidata.org/entity/Q36834'), ('composer', 'composer')]
[('composerr', 'http://www.wikidata.org/prop/direct/P86'), ('composer', 'composer')]
2


## Task 3
Identify the BGP for film genre

In [31]:
queryString = """
SELECT DISTINCT ?pr ?p ?or ?o
WHERE {
   wd:Q206124 ?pr ?or . # Midnight in Paris, ??, ??
   # get the label
   ?pr sc:name ?p.
   ?or sc:name ?o.
   FILTER regex(?p, "genre").
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('pr', 'http://www.wikidata.org/prop/direct/P136'), ('p', 'genre'), ('or', 'http://www.wikidata.org/entity/Q102706436'), ('o', 'magic realist film')]
[('pr', 'http://www.wikidata.org/prop/direct/P136'), ('p', 'genre'), ('or', 'http://www.wikidata.org/entity/Q157394'), ('o', 'fantasy film')]
[('pr', 'http://www.wikidata.org/prop/direct/P136'), ('p', 'genre'), ('or', 'http://www.wikidata.org/entity/Q471839'), ('o', 'science fiction film')]
[('pr', 'http://www.wikidata.org/prop/direct/P136'), ('p', 'genre'), ('or', 'http://www.wikidata.org/entity/Q860626'), ('o', 'romantic comedy')]
4


Final query for this task

In [32]:
queryString = """
SELECT DISTINCT ?pr ?p
WHERE {
   wd:Q206124 ?pr ?or . # Midnight in Paris, ??, ??
   # get the label
   ?pr sc:name ?p.
   FILTER regex(?p, "genre").
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('pr', 'http://www.wikidata.org/prop/direct/P136'), ('p', 'genre')]
1


## Task 4
Find how many films have been released in the United States of America from 2010-01-01 to 2015-12-31 for each film genre available (the result set must be genre IRI, label and #films).

I did not manage to complete this query. As far as I can tell, the requested data is not in the database.

While properties such as "first published" and "published in" exist (and there are many near-synonyms), I did not find any pattern that ties a release date to a release place, such as the following hypothetical one:
```sparql
?film wdt:releasedIn [
  wdt:place "usa" ;
  wdt:date "releaseDate"
] .
```

Intersecting the first publication date with the publication country is not enough, because it misses some cases. For example, a film could be first published in 2009 and then released in the USA two years later, and this query would not catch it.
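
The missed case can be made concrete with a toy example. The release records below are hypothetical, not taken from Wikidata; the sketch only contrasts the "first publication date" approximation with a per-country check:

```python
from datetime import date

# Hypothetical release records for one film: (country, release date).
releases = [("France", date(2009, 5, 20)), ("USA", date(2011, 3, 4))]

start, end = date(2010, 1, 1), date(2015, 12, 31)

# Approximation: only the first publication date is available.
first_published = min(d for _, d in releases)
approx_match = start <= first_published <= end

# What the task asks: is there a USA release inside the range?
true_match = any(c == "USA" and start <= d <= end for c, d in releases)

print(approx_match, true_match)
```

The approximation rejects the film even though it was released in the USA within the requested range.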

In [24]:
queryString = """
SELECT DISTINCT  ?or ?o
WHERE {
   wd:Q206124 wdt:P27 ?or . # Midnight in Paris, nationality, ??
   # get the label
   ?or sc:name ?o.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [27]:
queryString = """
SELECT DISTINCT  ?or ?pr ?p
WHERE {
   wd:Q206124 ?pr ?or . # Midnight in Paris, ??, ??
   # keep only the objects without a label (e.g. literal values)
   MINUS {
       wd:Q206124 ?pr ?or .
       ?or sc:name ?o.
   }
   
   ?pr sc:name ?p
}
LIMIT 2000
"""

print("Results")
x=run_query(queryString)

Results
[('or', 'midnightinparis'), ('pr', 'http://www.wikidata.org/prop/direct/P1237'), ('p', 'Box Office Mojo film ID (former scheme)')]
[('or', 'm/midnight_in_paris'), ('pr', 'http://www.wikidata.org/prop/direct/P1258'), ('p', 'Rotten Tomatoes ID')]
[('or', '178300'), ('pr', 'http://www.wikidata.org/prop/direct/P1265'), ('p', 'AlloCiné film ID')]
[('or', 'topic/Midnight-in-Paris'), ('pr', 'http://www.wikidata.org/prop/direct/P1417'), ('p', 'Encyclopædia Britannica Online ID')]
[('or', 'Midnight in Paris'), ('pr', 'http://www.wikidata.org/prop/direct/P1476'), ('p', 'title')]
[('or', 'v515754'), ('pr', 'http://www.wikidata.org/prop/direct/P1562'), ('p', 'AllMovie title ID')]
[('or', 'movie/midnight-in-paris'), ('pr', 'http://www.wikidata.org/prop/direct/P1712'), ('p', 'Metacritic ID')]
[('or', 'http://commons.wikimedia.org/wiki/Special:FilePath/Owen%20Wilson%20Woody%20Allen%20Cannes%202011.jpg'), ('pr', 'http://www.wikidata.org/prop/direct/P18'), ('p', 'image')]
[('or', '73172'), ('pr

In [36]:
queryString = """
SELECT DISTINCT  ?or ?o
WHERE {
   wd:Q25089 wdt:P27 ?or . # Woody Allen, nationality, ??
   # get the label
   ?or sc:name ?o.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('or', 'http://www.wikidata.org/entity/Q30'), ('o', 'United States of America')]
1


In [37]:
queryString = """
SELECT DISTINCT ?pr ?p
WHERE {
   wd:Q206124 ?pr wd:Q30 . # Midnight in Paris, ??, USA
   # get the label
   ?pr sc:name ?p.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('pr', 'http://www.wikidata.org/prop/direct/P495'), ('p', 'country of origin')]
1


In [38]:
queryString = """
SELECT DISTINCT ?pr ?p ?or ?o
WHERE {
   wd:Q206124 ?pr ?or . # Midnight in Paris, ??, ??
   # get the label
   ?pr sc:name ?p.
   ?or sc:name ?o.
   FILTER regex(?p, "releas").
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [39]:
queryString = """
SELECT DISTINCT ?sr ?s ?pr ?p
WHERE {
   ?sr wdt:P31 wd:Q11424 .
   ?sr ?pr wd:Q30 . # a generic film, ??, USA
   # get the label
   ?sr sc:name ?s.
   ?pr sc:name ?p.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('sr', 'http://www.wikidata.org/entity/Q1193742'), ('s', 'The Thief of Bagdad'), ('pr', 'http://www.wikidata.org/prop/direct/P162'), ('p', 'producer')]
[('sr', 'http://www.wikidata.org/entity/Q24901279'), ('s', 'Friended to Death'), ('pr', 'http://www.wikidata.org/prop/direct/P17'), ('p', 'country')]
[('sr', 'http://www.wikidata.org/entity/Q2251688'), ('s', 'Star Tours'), ('pr', 'http://www.wikidata.org/prop/direct/P17'), ('p', 'country')]
[('sr', 'http://www.wikidata.org/entity/Q6296468'), ('s', 'To Rob a Thief'), ('pr', 'http://www.wikidata.org/prop/direct/P17'), ('p', 'country')]
[('sr', 'http://www.wikidata.org/entity/Q75836361'), ('s', 'Let There Be Light'), ('pr', 'http://www.wikidata.org/prop/direct/P17'), ('p', 'country')]
[('sr', 'http://www.wikidata.org/entity/Q108046141'), ('s', 'Five Years North'), ('pr', 'http://www.wikidata.org/prop/direct/P17'), ('p', 'country')]
[('sr', 'http://www.wikidata.org/entity/Q7327737'), ('s', 'Richard McGeagh'), ('pr', 'http://www.wik

In [40]:
queryString = """
SELECT DISTINCT ?pr ?p
WHERE {
   ?sr wdt:P31 wd:Q11424 .
   ?sr ?pr wd:Q30 . # generic film, ??, USA
   # get the label
   ?sr sc:name ?s.
   ?pr sc:name ?p.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('pr', 'http://www.wikidata.org/prop/direct/P162'), ('p', 'producer')]
[('pr', 'http://www.wikidata.org/prop/direct/P17'), ('p', 'country')]
[('pr', 'http://www.wikidata.org/prop/direct/P27'), ('p', 'country of citizenship')]
[('pr', 'http://www.wikidata.org/prop/direct/P291'), ('p', 'place of publication')]
[('pr', 'http://www.wikidata.org/prop/direct/P495'), ('p', 'country of origin')]
[('pr', 'http://www.wikidata.org/prop/direct/P840'), ('p', 'narrative location')]
[('pr', 'http://www.wikidata.org/prop/direct/P750'), ('p', 'distributed by')]
[('pr', 'http://www.wikidata.org/prop/direct/P915'), ('p', 'filming location')]
[('pr', 'http://www.wikidata.org/prop/direct/P921'), ('p', 'main subject')]
[('pr', 'http://www.wikidata.org/prop/direct/P1433'), ('p', 'published in')]
10


In [41]:
queryString = """
SELECT DISTINCT ?pr ?p
WHERE {
   ?or wdt:P31 wd:Q11424 .
   wd:Q30 ?pr ?or . # USA, ??, a film
   # get the label
   ?or sc:name ?o.
   ?pr sc:name ?p.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [46]:
queryString = """
SELECT DISTINCT ?pr ?p
WHERE {
   ?sr wdt:P31 wd:Q11424 .
   ?sr ?pr ?or . # generic film, ??, ??
   # get the label
   ?sr sc:name ?s.
   ?pr sc:name ?p.
   FILTER regex(?p, "date").
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('pr', 'http://www.wikidata.org/prop/direct/P577'), ('p', 'publication date')]
[('pr', 'http://www.wikidata.org/prop/direct/P1191'), ('p', 'date of first performance')]
[('pr', 'http://www.wikidata.org/prop/direct/P2754'), ('p', 'production date')]
[('pr', 'http://www.wikidata.org/prop/direct/P2913'), ('p', 'date depicted')]
[('pr', 'http://www.wikidata.org/prop/direct/P3893'), ('p', 'public domain date')]
[('pr', 'http://www.wikidata.org/prop/direct/P1619'), ('p', 'date of official opening')]
[('pr', 'http://www.wikidata.org/prop/direct/P6949'), ('p', 'announcement date')]
[('pr', 'http://www.wikidata.org/prop/direct/P3999'), ('p', 'date of official closure')]
[('pr', 'http://www.wikidata.org/prop/direct/P569'), ('p', 'date of birth')]
[('pr', 'http://www.wikidata.org/prop/direct/P2960'), ('p', 'archive date')]
10


In [50]:
queryString = """
SELECT DISTINCT ?genrer SAMPLE(?genre)
WHERE {
   ?filmr wdt:P31 wd:Q11424;  # generic film
       ?releasedIn wd:Q30; # released in the USA
       ?releaseDate ?date; # with a publication date
       wdt:P136 ?genrer.   # and a genre
   
   VALUES(?releasedIn){
       (wdt:P17)   # country
       (wdt:P291)  # place of publication
       (wdt:P495)  # country of origin
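       (wdt:P1433) # published in
   }
   
   # NOTE: this binds ?releasedDate, but the pattern above uses ?releaseDate,
   # so the block does not restrict the date property
   VALUES(?releasedDate){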
       (wdt:P577)   # publication date
       (wdt:P1191)  # date of first performance
       (wdt:P1619)  # date of official opening
   }
   

   # get the label
   ?genrer sc:name ?genre.
} GROUP BY ?genrer
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('genrer', 'http://www.wikidata.org/entity/Q1257444'), ('callret-1', 'film adaptation')]
[('genrer', 'http://www.wikidata.org/entity/Q15858553'), ('callret-1', 'wuxia film')]
[('genrer', 'http://www.wikidata.org/entity/Q43911809'), ('callret-1', 'supernatural horror film')]
[('genrer', 'http://www.wikidata.org/entity/Q596138'), ('callret-1', 'educational film')]
[('genrer', 'http://www.wikidata.org/entity/Q25360500'), ('callret-1', 'aftershow')]
[('genrer', 'http://www.wikidata.org/entity/Q9155615'), ('callret-1', 'fantastique')]
[('genrer', 'http://www.wikidata.org/entity/Q1473699'), ('callret-1', 'full motion video')]
[('genrer', 'http://www.wikidata.org/entity/Q5035283'), ('callret-1', 'caper story')]
[('genrer', 'http://www.wikidata.org/entity/Q775344'), ('callret-1', 'teen drama')]
[('genrer', 'http://www.wikidata.org/entity/Q3648909'), ('callret-1', 'popular-science film')]
[('genrer', 'http://www.wikidata.org/entity/Q604725'), ('callret-1', 'psychological horror')]
[('g

In [8]:
queryString = """
SELECT DISTINCT ?filmr ?film ?date
WHERE {
   ?filmr wdt:P31 wd:Q11424;  # generic film
       ?releaseDate ?date. # with a publication date
       
    VALUES(?releasedDate){
       (wdt:P577)   # publication date
       (wdt:P1191)  # date of first performance
       (wdt:P1619)  # date of official opening
   }
   
   FILTER (?date > "2010-01-01T00:00:00"^^xsd:dateTime)
   FILTER (?date < "2015-31-12T00:00:00"^^xsd:dateTime)

   # get the label
   ?genrer sc:name ?film.
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
Empty
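
One thing worth checking in the query above is the upper bound `2015-31-12`: read as year-month-day it has month 31, so it is not a valid `xsd:dateTime`, which may contribute to the empty result. A minimal sketch that validates the bounds before building the `FILTER` lines; `date_range_filter` is a hypothetical helper, and the inclusive bounds are an assumption about how the range should be read:

```python
from datetime import date


def date_range_filter(var, start, end):
    """Build two SPARQL FILTER lines after validating ISO (YYYY-MM-DD) dates."""
    # date.fromisoformat raises ValueError for impossible dates such as month 31.
    date.fromisoformat(start)
    date.fromisoformat(end)
    return (
        f'FILTER ({var} >= "{start}T00:00:00Z"^^xsd:dateTime)\n'
        f'FILTER ({var} <= "{end}T23:59:59Z"^^xsd:dateTime)'
    )


print(date_range_filter("?date", "2010-01-01", "2015-12-31"))

try:
    date_range_filter("?date", "2010-01-01", "2015-31-12")
except ValueError as err:
    print("rejected:", err)
```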


In [14]:
queryString = """
SELECT DISTINCT ?filmr ?film ?releasedDate ?date
WHERE {
   ?filmr wdt:P31 wd:Q11424;  # generic film
       ?releaseDate ?date. # with a publication date
       
    VALUES(?releasedDate){
       (wdt:P577)   # publication date
       (wdt:P1191)  # date of first performance
       (wdt:P1619)  # date of official opening
   }
   
   # get the label
   ?filmr sc:name ?film.
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('filmr', 'http://www.wikidata.org/entity/Q8178307'), ('film', '15 Park Avenue'), ('releasedDate', 'http://www.wikidata.org/prop/direct/P577'), ('date', 'http://wikiba.se/ontology#Item')]
[('filmr', 'http://www.wikidata.org/entity/Q8178307'), ('film', '15 Park Avenue'), ('releasedDate', 'http://www.wikidata.org/prop/direct/P1191'), ('date', 'http://wikiba.se/ontology#Item')]
[('filmr', 'http://www.wikidata.org/entity/Q8178307'), ('film', '15 Park Avenue'), ('releasedDate', 'http://www.wikidata.org/prop/direct/P1619'), ('date', 'http://wikiba.se/ontology#Item')]
[('filmr', 'http://www.wikidata.org/entity/Q8197668'), ('film', 'Idiot Love'), ('releasedDate', 'http://www.wikidata.org/prop/direct/P577'), ('date', 'http://wikiba.se/ontology#Item')]
[('filmr', 'http://www.wikidata.org/entity/Q8197668'), ('film', 'Idiot Love'), ('releasedDate', 'http://www.wikidata.org/prop/direct/P1191'), ('date', 'http://wikiba.se/ontology#Item')]
[('filmr', 'http://www.wikidata.org/entity/Q8197668'

In [13]:
queryString = """
SELECT DISTINCT ?releasedDate
WHERE {
       
    VALUES(?releasedDate){
       (wdt:P577)   # publication date
       (wdt:P1191)  # date of first performance
       (wdt:P1619)  # date of official opening
   }
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('releasedDate', 'http://www.wikidata.org/prop/direct/P577')]
[('releasedDate', 'http://www.wikidata.org/prop/direct/P1191')]
[('releasedDate', 'http://www.wikidata.org/prop/direct/P1619')]
3
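
The `VALUES` block on its own binds the three properties as expected, so the unconstrained results in the earlier cells are more likely due to a naming mismatch: the triple patterns use `?releaseDate` while `VALUES` binds `?releasedDate`. A rough sketch of a check that lists variables occurring only once in a query string, which tends to flag this kind of typo; `single_use_variables` is a hypothetical helper and the regular expressions are a simplification of SPARQL syntax:

```python
import re
from collections import Counter


def single_use_variables(query):
    """Return the variables that appear exactly once in a SPARQL query string."""
    # Drop comments first so variables mentioned in comments are not counted.
    # Simplification: assumes '#' never occurs inside an IRI or a literal.
    no_comments = re.sub(r"#.*", "", query)
    counts = Counter(re.findall(r"\?\w+", no_comments))
    return sorted(var for var, n in counts.items() if n == 1)


query = """
SELECT DISTINCT ?filmr ?date
WHERE {
   ?filmr wdt:P31 wd:Q11424;
       ?releaseDate ?date.
   VALUES(?releasedDate){ (wdt:P577) (wdt:P1191) (wdt:P1619) }
}
"""
print(single_use_variables(query))
```

A variable that is used once is not always a mistake, but here both names show up in the list, which points at the typo.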


In [16]:
queryString = """
SELECT DISTINCT ?filmr ?film ?releasedDate ?dateObj ?pr ?p ?or ?o
WHERE {
   ?filmr wdt:P31 wd:Q11424;  # generic film
       ?releaseDate ?dateObj. # with a publication date
       
       ?dateObj ?pr ?or.
       
    VALUES(?releasedDate){
       (wdt:P577)   # publication date
       (wdt:P1191)  # date of first performance
       (wdt:P1619)  # date of official opening
   }
   
   # get the label
   ?filmr sc:name ?film.
   ?pr sc:name ?p.
   ?or sc:name ?o.
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('filmr', 'http://www.wikidata.org/entity/Q8452990'), ('film', "Sunday League - Pepik Hnatek's Final Match"), ('releasedDate', 'http://www.wikidata.org/prop/direct/P577'), ('dateObj', 'http://www.wikidata.org/entity/Q12024203'), ('pr', 'http://www.wikidata.org/prop/direct/P102'), ('p', 'member of political party'), ('or', 'http://www.wikidata.org/entity/Q10728124'), ('o', 'ANO 2011')]
[('filmr', 'http://www.wikidata.org/entity/Q8452990'), ('film', "Sunday League - Pepik Hnatek's Final Match"), ('releasedDate', 'http://www.wikidata.org/prop/direct/P1191'), ('dateObj', 'http://www.wikidata.org/entity/Q12024203'), ('pr', 'http://www.wikidata.org/prop/direct/P102'), ('p', 'member of political party'), ('or', 'http://www.wikidata.org/entity/Q10728124'), ('o', 'ANO 2011')]
[('filmr', 'http://www.wikidata.org/entity/Q8452990'), ('film', "Sunday League - Pepik Hnatek's Final Match"), ('releasedDate', 'http://www.wikidata.org/prop/direct/P1619'), ('dateObj', 'http://www.wikidata.org/en

In [22]:
queryString = """
SELECT DISTINCT ?dateObj ?pr ?p ?or ?o
WHERE {
   wd:Q8452990 wdt:P31 wd:Q11424;  # Sunday League - Pepik Hnatek's Final Match, instance of, film
       wdt:P577 ?dateObj. # with a publication date
       
       ?dateObj ?pr ?or.
       
   # get the label
   ?pr sc:name ?p.
   ?or sc:name ?o.
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [24]:
queryString = """
SELECT DISTINCT ?dateObj ?pr ?p ?or ?o
WHERE {
   wd:Q206124 wdt:P31 wd:Q11424;  # Midnight in Paris, instance of, film
       wdt:P577 ?dateObj. # with a publication date
       
       ?dateObj ?pr ?or.
       
   # get the label
   ?pr sc:name ?p.
   ?or sc:name ?o.
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [12]:
queryString = """
SELECT DISTINCT ?dateObj
WHERE {
    
   wd:Q206124 wdt:P577 ?dateObj. # with a publication date
        FILTER (?dateObj > "2010-01-01T00:00:00Z"^^xsd:dateTime)
        FILTER (?dateObj < "2020-01-01T00:00:00Z"^^xsd:dateTime) .
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [22]:
queryString = """
SELECT DISTINCT ?filmr ?film ?dateObj
WHERE {
    ?filmr wdt:P31 wd:Q11424.
    ?filmr wdt:P577 ?dateObj. # with a publication date
        FILTER (?dateObj > "2010-01-01T00:00:00Z"^^xsd:dateTime)
        FILTER (?dateObj < "2020-01-01T00:00:00Z"^^xsd:dateTime) .
       
    ?filmr sc:name ?film
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('filmr', 'http://www.wikidata.org/entity/Q8452990'), ('film', "Sunday League - Pepik Hnatek's Final Match"), ('dateObj', '2012-03-29T00:00:00Z')]
[('filmr', 'http://www.wikidata.org/entity/Q284184'), ('film', 'Ra.One'), ('dateObj', '2011-01-01T00:00:00Z')]
[('filmr', 'http://www.wikidata.org/entity/Q284184'), ('film', 'Ra.One'), ('dateObj', '2011-10-26T00:00:00Z')]
[('filmr', 'http://www.wikidata.org/entity/Q13693967'), ('film', 'Cross'), ('dateObj', '2012-10-04T00:00:00Z')]
[('filmr', 'http://www.wikidata.org/entity/Q15464247'), ('film', 'What the Fish'), ('dateObj', '2013-01-01T00:00:00Z')]
[('filmr', 'http://www.wikidata.org/entity/Q15475237'), ('film', 'Mediastan'), ('dateObj', '2013-10-02T00:00:00Z')]
[('filmr', 'http://www.wikidata.org/entity/Q15490985'), ('film', 'Jailhouse Heat'), ('dateObj', '2011-01-01T00:00:00Z')]
[('filmr', 'http://www.wikidata.org/entity/Q15046558'), ('film', 'Life After Beth'), ('dateObj', '2014-01-01T00:00:00Z')]
[('filmr', 'http://www.wikidata

In [23]:
queryString = """
SELECT DISTINCT ?filmr ?film ?dateObj
WHERE {
    ?filmr wdt:P31 wd:Q11424.
    
    ?filmr (wdt:P291|wdt:P17|wdt:P495|wdt:P1433)  wd:Q30.  # country

    
    ?filmr (wdt:P577|wdt:P1191|wdt:P1619) ?dateObj. # with a publication date
        FILTER (?dateObj > "2010-01-01T00:00:00Z"^^xsd:dateTime)
        FILTER (?dateObj < "2020-01-01T00:00:00Z"^^xsd:dateTime) .
       
    ?filmr sc:name ?film
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('filmr', 'http://www.wikidata.org/entity/Q24901279'), ('film', 'Friended to Death'), ('dateObj', '2014-05-02T00:00:00Z')]
[('filmr', 'http://www.wikidata.org/entity/Q75836361'), ('film', 'Let There Be Light'), ('dateObj', '2017-01-01T00:00:00Z')]
[('filmr', 'http://www.wikidata.org/entity/Q15028216'), ('film', 'Wild Card'), ('dateObj', '2015-01-01T00:00:00Z')]
[('filmr', 'http://www.wikidata.org/entity/Q15028216'), ('film', 'Wild Card'), ('dateObj', '2015-01-29T00:00:00Z')]
[('filmr', 'http://www.wikidata.org/entity/Q15028216'), ('film', 'Wild Card'), ('dateObj', '2015-02-12T00:00:00Z')]
[('filmr', 'http://www.wikidata.org/entity/Q55403066'), ('film', 'No Postage Necessary'), ('dateObj', '2018-07-06T00:00:00Z')]
[('filmr', 'http://www.wikidata.org/entity/Q55402869'), ('film', 'Sharp Edges'), ('dateObj', '2018-07-06T00:00:00Z')]
[('filmr', 'http://www.wikidata.org/entity/Q15490985'), ('film', 'Jailhouse Heat'), ('dateObj', '2011-01-01T00:00:00Z')]
[('filmr', 'http://www.wikida
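
The property-path alternation used above (`p1|p2|...`) can be assembled from a list, which keeps the candidate properties in one place. A small sketch; `alt_path` is a hypothetical helper, and the property lists are the ones from the query above:

```python
def alt_path(properties):
    """Join prefixed property names into a SPARQL alternative path."""
    if not properties:
        raise ValueError("at least one property is required")
    return "(" + "|".join(properties) + ")"


country_props = ["wdt:P291", "wdt:P17", "wdt:P495", "wdt:P1433"]
date_props = ["wdt:P577", "wdt:P1191", "wdt:P1619"]

# Two triple patterns equivalent to the ones written by hand in the query.
pattern = (
    f"?filmr {alt_path(country_props)} wd:Q30 .\n"
    f"?filmr {alt_path(date_props)} ?dateObj ."
)
print(pattern)
```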

In [21]:
queryString = """
SELECT DISTINCT ?filmr ?film ?or ?pr
WHERE {
    ?filmr wdt:P31 wd:Q11424.
    

    
    ?filmr (wdt:P577|wdt:P1191|wdt:P1619) [?pr ?or]. # with a publication date

   # get the label
   ?filmr sc:name ?film.

}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [27]:
queryString = """
SELECT DISTINCT ?filmr ?film ?o ?p
WHERE {
    ?filmr wdt:P31 wd:Q11424.
    
    BIND(str(?or) as ?o)
    BIND(str(?pr) as ?p)

    
    ?filmr (wdt:P577|wdt:P1191|wdt:P1619) [?pr ?or]. # with a publication date

   # get the label
   ?filmr sc:name ?film.

}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [29]:
queryString = """
SELECT DISTINCT ?filmr ?res
WHERE {
    ?filmr wdt:P291 ?res.

}
LIMIT 10
"""

print("Results")
x=run_query(queryString)


Results
[('filmr', 'http://www.wikidata.org/entity/Q11945446'), ('res', 'http://www.wikidata.org/entity/Q13933')]
[('filmr', 'http://www.wikidata.org/entity/Q100288782'), ('res', 'http://www.wikidata.org/entity/Q1073686')]
[('filmr', 'http://www.wikidata.org/entity/Q55671290'), ('res', 'http://www.wikidata.org/entity/Q1073686')]
[('filmr', 'http://www.wikidata.org/entity/Q76824204'), ('res', 'http://www.wikidata.org/entity/Q1627591')]
[('filmr', 'http://www.wikidata.org/entity/Q55666016'), ('res', 'http://www.wikidata.org/entity/Q2156427')]
[('filmr', 'http://www.wikidata.org/entity/Q55671120'), ('res', 'http://www.wikidata.org/entity/Q982554')]
[('filmr', 'http://www.wikidata.org/entity/Q56237018'), ('res', 'http://www.wikidata.org/entity/Q1376400')]
[('filmr', 'http://www.wikidata.org/entity/Q55666605'), ('res', 'http://www.wikidata.org/entity/Q2299339')]
[('filmr', 'http://www.wikidata.org/entity/Q19750830'), ('res', 'http://www.wikidata.org/entity/Q2562317')]
[('filmr', 'http://www

In [30]:
queryString = """
SELECT DISTINCT ?filmr ?res
WHERE {
    ?filmr wdt:P291 wd:Q30.

}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('filmr', 'http://www.wikidata.org/entity/Q25411938')]
[('filmr', 'http://www.wikidata.org/entity/Q3186989')]
[('filmr', 'http://www.wikidata.org/entity/Q15753706')]
[('filmr', 'http://www.wikidata.org/entity/Q2746500')]
[('filmr', 'http://www.wikidata.org/entity/Q2915291')]
[('filmr', 'http://www.wikidata.org/entity/Q1101326')]
[('filmr', 'http://www.wikidata.org/entity/Q17002288')]
[('filmr', 'http://www.wikidata.org/entity/Q17008085')]
[('filmr', 'http://www.wikidata.org/entity/Q17012904')]
[('filmr', 'http://www.wikidata.org/entity/Q17016033')]
10
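
Results like the ones above contain full IRIs, which are awkward to paste back into a query. A minimal sketch of a helper that shortens them using the prefixes declared in the setup cell; `to_prefixed` and `PREFIXES` are hypothetical names:

```python
# Namespaces taken from the PREFIX declarations in the setup cell.
PREFIXES = {
    "http://www.wikidata.org/entity/": "wd:",
    "http://www.wikidata.org/prop/direct/": "wdt:",
    "http://schema.org/": "sc:",
}


def to_prefixed(iri):
    """Shorten a full IRI to its prefixed form, or return it unchanged."""
    for namespace, prefix in PREFIXES.items():
        if iri.startswith(namespace):
            return prefix + iri[len(namespace):]
    return iri


print(to_prefixed("http://www.wikidata.org/entity/Q25411938"))
print(to_prefixed("http://www.wikidata.org/prop/direct/P291"))
```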


In [31]:
queryString = """
SELECT DISTINCT ?filmr ?res
WHERE {
    ?filmr  wdt:P31 wd:Q11424.
    ?filmr wdt:P291 wd:Q30.

}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('filmr', 'http://www.wikidata.org/entity/Q15028216')]
[('filmr', 'http://www.wikidata.org/entity/Q55403066')]
[('filmr', 'http://www.wikidata.org/entity/Q55402869')]
[('filmr', 'http://www.wikidata.org/entity/Q59587231')]
[('filmr', 'http://www.wikidata.org/entity/Q98411645')]
5


In [33]:
queryString = """
SELECT DISTINCT ?filmr ?s
WHERE {
    ?filmr  wdt:P31 wd:Q11424.
    ?filmr ?s [wdt:P291 wd:Q30].

}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('filmr', 'http://www.wikidata.org/entity/Q1758112'), ('s', 'http://www.wikidata.org/prop/direct/P144')]
[('filmr', 'http://www.wikidata.org/entity/Q160560'), ('s', 'http://www.wikidata.org/prop/direct/P138')]
[('filmr', 'http://www.wikidata.org/entity/Q160560'), ('s', 'http://www.wikidata.org/prop/direct/P144')]
[('filmr', 'http://www.wikidata.org/entity/Q300547'), ('s', 'http://www.wikidata.org/prop/direct/P144')]
[('filmr', 'http://www.wikidata.org/entity/Q300624'), ('s', 'http://www.wikidata.org/prop/direct/P156')]
[('filmr', 'http://www.wikidata.org/entity/Q2132944'), ('s', 'http://www.wikidata.org/prop/direct/P1889')]
[('filmr', 'http://www.wikidata.org/entity/Q152456'), ('s', 'http://www.wikidata.org/prop/direct/P144')]
[('filmr', 'http://www.wikidata.org/entity/Q18547944'), ('s', 'http://www.wikidata.org/prop/direct/P144')]
[('filmr', 'http://www.wikidata.org/entity/Q195274'), ('s', 'http://www.wikidata.org/prop/direct/P144')]
[('filmr', 'http://www.wikidata.org/entity

In [34]:
queryString = """
SELECT DISTINCT  ?s
WHERE {
    ?filmr  wdt:P31 wd:Q11424.
    ?filmr ?s [wdt:P291 wd:Q30].

}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/prop/direct/P144')]
[('s', 'http://www.wikidata.org/prop/direct/P138')]
[('s', 'http://www.wikidata.org/prop/direct/P156')]
[('s', 'http://www.wikidata.org/prop/direct/P1889')]
[('s', 'http://www.wikidata.org/prop/direct/P155')]
[('s', 'http://www.wikidata.org/prop/direct/P921')]
[('s', 'http://www.wikidata.org/prop/direct/P406')]
[('s', 'http://www.wikidata.org/prop/direct/P272')]
8


In [35]:
queryString = """
SELECT DISTINCT  ?s ?sl
WHERE {
    ?filmr  wdt:P31 wd:Q11424.
    ?filmr ?s [wdt:P291 wd:Q30].

    ?s sc:name ?sl
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/prop/direct/P138'), ('sl', 'named after')]
[('s', 'http://www.wikidata.org/prop/direct/P144'), ('sl', 'based on')]
[('s', 'http://www.wikidata.org/prop/direct/P155'), ('sl', 'follows')]
[('s', 'http://www.wikidata.org/prop/direct/P156'), ('sl', 'followed by')]
[('s', 'http://www.wikidata.org/prop/direct/P1889'), ('sl', 'different from')]
[('s', 'http://www.wikidata.org/prop/direct/P921'), ('sl', 'main subject')]
[('s', 'http://www.wikidata.org/prop/direct/P406'), ('sl', 'soundtrack release')]
[('s', 'http://www.wikidata.org/prop/direct/P272'), ('sl', 'production company')]
8
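
The rows returned by `run_query` are lists of `(variable, value)` pairs, which is awkward to look things up in. A minimal sketch of a helper that turns them into a property-to-label dictionary follows; `rows_to_dicts` and `label_map` are hypothetical names, not part of the notebook setup, and the sample rows are copied from the output above.

```python
def rows_to_dicts(rows):
    """Turn run_query output (list of rows, each a list of (var, value) pairs) into dicts."""
    return [dict(row) for row in rows]


def label_map(rows, iri_var, label_var):
    """Map each IRI to its label, skipping rows where either variable is unbound."""
    return {
        d[iri_var]: d[label_var]
        for d in rows_to_dicts(rows)
        if iri_var in d and label_var in d
    }


# Two rows copied from the output above
sample_rows = [
    [('s', 'http://www.wikidata.org/prop/direct/P138'), ('sl', 'named after')],
    [('s', 'http://www.wikidata.org/prop/direct/P144'), ('sl', 'based on')],
]

labels = label_map(sample_rows, 's', 'sl')
print(labels['http://www.wikidata.org/prop/direct/P144'])
```

In the notebook itself this would be called as `label_map(x, 's', 'sl')` right after the cell above.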


Final query for this task

In [37]:
queryString = """
SELECT DISTINCT ?genrer (SAMPLE(?genre) as ?genreLabel) (COUNT(distinct ?filmr) as ?filmCount)
WHERE {

    ?filmr wdt:P31 wd:Q11424.  # generic film
    
    ?filmr wdt:P136 ?genrer.   # with genre
    
    ?filmr (wdt:P291|wdt:P17|wdt:P495|wdt:P1433)  wd:Q30.  # country = USA

    
    ?filmr (wdt:P577|wdt:P1191|wdt:P1619) ?dateObj. # with a publication date
        FILTER (?dateObj > "2010-01-01T00:00:00Z"^^xsd:dateTime)
        FILTER (?dateObj < "2020-01-01T00:00:00Z"^^xsd:dateTime) .
       
   # get the label
   ?genrer sc:name ?genre.
} GROUP BY ?genrer
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
The operation failed EndPointInternalError: endpoint returned code 500 and response. 

Response:
b'Virtuoso 42000 Error The estimated execution time 26992 (sec) exceeds the limit of 800 (sec).\n\nSPARQL query:\n\n##-f457feb412-##\nPREFIX wd: <http://www.wikidata.org/entity/> \nPREFIX wdt: <http://www.wikidata.org/prop/direct/> \nPREFIX sc: <http://schema.org/>\n\n\nSELECT DISTINCT ?genrer (SAMPLE(?genre) as ?genreLabel) (COUNT(distinct ?filmr) as ?filmCount)\nWHERE {\n\n    ?filmr wdt:P31 wd:Q11424.  # generic film\n    \n    ?filmr wdt:P136 ?genrer.   # with genre\n    \n    ?filmr (wdt:P291|wdt:P17|wdt:P495|wdt:P1433)  wd:Q30.  # country = USA\n\n    \n    ?filmr (wdt:P577|wdt:P1191|wdt:P1619) ?dateObj. # with a publication date\n        FILTER (?dateObj > "2010-01-01T00:00:00Z"^^xsd:dateTime)\n        FILTER (?dateObj < "2020-01-01T00:00:00Z"^^xsd:dateTime) .\n       \n   # get the label\n   ?genrer sc:name ?genre.\n} GROUP BY ?genrer\nLIMIT 100\n'


In [4]:
queryString = """
SELECT DISTINCT ?filmr ?film
WHERE {
    ?filmr wdt:P31 wd:Q11424.
    
    ?filmr (wdt:P291|wdt:P17|wdt:P495|wdt:P1433)  wd:Q30.  # country

    
    ?filmr (wdt:P577|wdt:P1191|wdt:P1619) ?dateObj. # with a publication date
        FILTER (?dateObj > "2010-01-01T00:00:00Z"^^xsd:dateTime)
        FILTER (?dateObj < "2015-12-31T00:00:00Z"^^xsd:dateTime) .
       
    ?filmr sc:name ?film
}
LIMIT 100"""

print("Results")
x=run_query(queryString)

Results
[('filmr', 'http://www.wikidata.org/entity/Q24901279'), ('film', 'Friended to Death')]
[('filmr', 'http://www.wikidata.org/entity/Q15028216'), ('film', 'Wild Card')]
[('filmr', 'http://www.wikidata.org/entity/Q15490985'), ('film', 'Jailhouse Heat')]
[('filmr', 'http://www.wikidata.org/entity/Q15046558'), ('film', 'Life After Beth')]
[('filmr', 'http://www.wikidata.org/entity/Q15596853'), ('film', 'Burying the Ex')]
[('filmr', 'http://www.wikidata.org/entity/Q15735966'), ('film', 'Dani Daniels: Dare')]
[('filmr', 'http://www.wikidata.org/entity/Q1758112'), ('film', "Ender's Game")]
[('filmr', 'http://www.wikidata.org/entity/Q15856724'), ('film', 'Avengers XXX – A Porn Parody')]
[('filmr', 'http://www.wikidata.org/entity/Q15857417'), ('film', 'E.T. XXX – A Dreamzone Parody')]
[('filmr', 'http://www.wikidata.org/entity/Q15867624'), ('film', 'Real Housewives of the San Fernando Valley – A XXX Parody')]
[('filmr', 'http://www.wikidata.org/entity/Q15882932'), ('film', 'Titanoboa: Mon

In [9]:
queryString = """
SELECT DISTINCT ?genrer SAMPLE(?genre) (COUNT(distinct ?filmr) AS ?filmCount)
WHERE {
   ?filmr wdt:P31 wd:Q11424;  # generic film
       ?releasedIn wd:Q30; # released in the USA
       ?releaseDate ?date; # with a publication date
       wdt:P136 ?genrer.   # and a genre
   
   VALUES(?releasedIn){
       (wdt:P17)   # country
       (wdt:P291)  # place of publication
       (wdt:P495)  # country of origin
       (wdt:P1433) # published in
   }
   
   VALUES(?releasedDate){
       (wdt:P577)   # publication date
       (wdt:P1191)  # date of first performance
       (wdt:P1619)  # date of official opening
   }
   
    FILTER (?date > "2010-01-01T00:00:00Z"^^xsd:dateTime)
    FILTER (?date < "2015-12-31T00:00:00Z"^^xsd:dateTime) .
       

   

   # get the label
   ?genrer sc:name ?genre.
} GROUP BY ?genrer ORDER BY DESC (?filmCount)
LIMIT 500
"""

print("Results")
x=run_query(queryString)

Results
[('genrer', 'http://www.wikidata.org/entity/Q130232'), ('callret-1', 'drama'), ('filmCount', '1433')]
[('genrer', 'http://www.wikidata.org/entity/Q93204'), ('callret-1', 'documentary film'), ('filmCount', '1029')]
[('genrer', 'http://www.wikidata.org/entity/Q200092'), ('callret-1', 'horror film'), ('filmCount', '612')]
[('genrer', 'http://www.wikidata.org/entity/Q157443'), ('callret-1', 'comedy film'), ('filmCount', '612')]
[('genrer', 'http://www.wikidata.org/entity/Q188473'), ('callret-1', 'action film'), ('filmCount', '523')]
[('genrer', 'http://www.wikidata.org/entity/Q2484376'), ('callret-1', 'thriller film'), ('filmCount', '469')]
[('genrer', 'http://www.wikidata.org/entity/Q471839'), ('callret-1', 'science fiction film'), ('filmCount', '294')]
[('genrer', 'http://www.wikidata.org/entity/Q20442589'), ('callret-1', 'LGBT-related film'), ('filmCount', '283')]
[('genrer', 'http://www.wikidata.org/entity/Q859369'), ('callret-1', 'comedy-drama'), ('filmCount', '272')]
[('genre
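
One detail worth double-checking in the query above: the triple pattern uses `?releaseDate`, while the `VALUES` block binds `?releasedDate`. SPARQL treats these as two different variables, so the `VALUES` block does not restrict which predicate supplies `?date`. The same spelling recurs in the queries that follow. The results shown are those of the query as written. Below is a minimal sketch, assuming the intent was to restrict the predicate, that generates both pieces from a single name so they cannot drift apart; `predicate_alternatives` is a hypothetical helper.

```python
def predicate_alternatives(var_name, predicates):
    """Return the predicate variable and a VALUES block that binds that same variable."""
    var = "?" + var_name
    rows = "\n".join(f"       ({p})" for p in predicates)
    values_block = f"   VALUES({var}){{\n{rows}\n   }}"
    return var, values_block


date_var, date_values = predicate_alternatives(
    "releaseDate", ["wdt:P577", "wdt:P1191", "wdt:P1619"]
)

# The triple pattern and the VALUES block now share one variable name
pattern = f"   ?filmr {date_var} ?date .\n{date_values}"
print(pattern)
```

The resulting string can be pasted into `queryString` in place of the hand-written pattern and `VALUES` block.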

## Task 5
Consider the timespan from 2001-01-01 to nowadays. Find the number of films released in this timespan divided by film genre and return only those with more than 50 films released every year (the result set must be genre IRI and label).

First, I want to make sure I can use the NOW() function correctly:

In [42]:
queryString = """
SELECT DISTINCT ?now
WHERE {
   BIND(NOW() as ?now)
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('now', '2022-12-20T23:24:50.117375')]
1


In [43]:
queryString = """
SELECT DISTINCT ?year
WHERE {

   BIND(NOW() as ?now)
   VALUES(?year){
       ("1970-01-01T00:00:00"^^xsd:dateTime)
       ("1979-01-01T00:00:00"^^xsd:dateTime)
       ("2023-01-01T00:00:00"^^xsd:dateTime)
   }
   
     FILTER (?now > ?year)

}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('year', '1970-01-01T00:00:00')]
[('year', '1979-01-01T00:00:00')]
2
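
The same comparison can be reproduced offline in plain Python as a sanity check. This is only a sketch: it uses the timestamp printed by `NOW()` in the earlier cell instead of the live clock, and the three candidate dates from the `VALUES` block.

```python
from datetime import datetime

# Value returned by NOW() in the cell above
now = datetime.fromisoformat("2022-12-20T23:24:50.117375")

candidates = [
    "1970-01-01T00:00:00",
    "1979-01-01T00:00:00",
    "2023-01-01T00:00:00",
]

# Keep only the dates strictly earlier than "now", as FILTER (?now > ?year) does
kept = [c for c in candidates if now > datetime.fromisoformat(c)]
print(kept)
```

The 2023 date is dropped, which matches the two rows returned by the endpoint.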


In [20]:
queryString = """
SELECT DISTINCT ?genrer SAMPLE(?genre) SAMPLE(?year)
WHERE {

   ?filmr wdt:P31 wd:Q11424;  # generic film
       ?releaseDate ?date; # with a publication date
       wdt:P136 ?genrer.   # and a genre
   
   
   VALUES(?releasedDate){
       (wdt:P577)   # publication date
       (wdt:P1191)  # date of first performance
       (wdt:P1619)  # date of official opening
   }
   
      BIND(NOW() as ?now)

    FILTER (?date > "2001-01-01T00:00:00Z"^^xsd:dateTime)
    FILTER (?date < ?now) .
       

    BIND(YEAR(?date) as ?year)

   # get the label
   ?genrer sc:name ?genre.
} GROUP BY ?genrer ?year 
HAVING (COUNT (DISTINCT ?filmr) > 50)
LIMIT 500


"""

print("Results")
x=run_query(queryString)

Results
[('genrer', 'http://www.wikidata.org/entity/Q645928'), ('callret-1', 'biographical film'), ('callret-2', '2018')]
[('genrer', 'http://www.wikidata.org/entity/Q130232'), ('callret-1', 'drama'), ('callret-2', '2019')]
[('genrer', 'http://www.wikidata.org/entity/Q842256'), ('callret-1', 'musical film'), ('callret-2', '2003')]
[('genrer', 'http://www.wikidata.org/entity/Q471839'), ('callret-1', 'science fiction film'), ('callret-2', '2011')]
[('genrer', 'http://www.wikidata.org/entity/Q859369'), ('callret-1', 'comedy-drama'), ('callret-2', '2004')]
[('genrer', 'http://www.wikidata.org/entity/Q2484376'), ('callret-1', 'thriller film'), ('callret-2', '2004')]
[('genrer', 'http://www.wikidata.org/entity/Q459290'), ('callret-1', 'independent film'), ('callret-2', '2016')]
[('genrer', 'http://www.wikidata.org/entity/Q130232'), ('callret-1', 'drama'), ('callret-2', '2009')]
[('genrer', 'http://www.wikidata.org/entity/Q859369'), ('callret-1', 'comedy-drama'), ('callret-2', '2009')]
[('gen

In [21]:
queryString = """
SELECT DISTINCT ?genrer SAMPLE(?genre) (COUNT(DISTINCT ?filmr) as ?filmCount)
WHERE {

   ?filmr wdt:P31 wd:Q11424;  # generic film
       ?releaseDate ?date; # with a publication date
       wdt:P136 ?genrer.   # and a genre
   
   
   VALUES(?releasedDate){
       (wdt:P577)   # publication date
       (wdt:P1191)  # date of first performance
       (wdt:P1619)  # date of official opening
   }
   
      BIND(NOW() as ?now)

    FILTER (?date > "2001-01-01T00:00:00Z"^^xsd:dateTime)
    FILTER (?date < ?now) .
       

    BIND(YEAR(?date) as ?year)

   # get the label
   ?genrer sc:name ?genre.
} GROUP BY ?genrer ?year 
HAVING (COUNT (DISTINCT ?filmr) > 50)
LIMIT 500


"""

print("Results")
x=run_query(queryString)

Results
[('genrer', 'http://www.wikidata.org/entity/Q645928'), ('callret-1', 'biographical film'), ('filmCount', '167')]
[('genrer', 'http://www.wikidata.org/entity/Q130232'), ('callret-1', 'drama'), ('filmCount', '1262')]
[('genrer', 'http://www.wikidata.org/entity/Q842256'), ('callret-1', 'musical film'), ('filmCount', '63')]
[('genrer', 'http://www.wikidata.org/entity/Q471839'), ('callret-1', 'science fiction film'), ('filmCount', '87')]
[('genrer', 'http://www.wikidata.org/entity/Q859369'), ('callret-1', 'comedy-drama'), ('filmCount', '92')]
[('genrer', 'http://www.wikidata.org/entity/Q2484376'), ('callret-1', 'thriller film'), ('filmCount', '118')]
[('genrer', 'http://www.wikidata.org/entity/Q459290'), ('callret-1', 'independent film'), ('filmCount', '53')]
[('genrer', 'http://www.wikidata.org/entity/Q130232'), ('callret-1', 'drama'), ('filmCount', '1352')]
[('genrer', 'http://www.wikidata.org/entity/Q859369'), ('callret-1', 'comedy-drama'), ('filmCount', '123')]
[('genrer', 'http

In [9]:
queryString = """
SELECT DISTINCT ?genrer SAMPLE(?genreLabel) (SUM(?filmCount) as ?totalFilmCount)
WHERE {
{
    SELECT DISTINCT ?genrer ?year (SAMPLE(?genre) as ?genreLabel) (COUNT(DISTINCT ?filmr) as ?filmCount)
    WHERE {

       ?filmr wdt:P31 wd:Q11424;  # generic film
           ?releaseDate ?date; # with a publication date
           wdt:P136 ?genrer.   # and a genre


       VALUES(?releasedDate){
           (wdt:P577)   # publication date
           (wdt:P1191)  # date of first performance
           (wdt:P1619)  # date of official opening
       }

          BIND(NOW() as ?now)

        FILTER (?date > "2001-01-01T00:00:00Z"^^xsd:dateTime)
        FILTER (?date < ?now) .


        BIND(YEAR(?date) as ?year)

       # get the label
       ?genrer sc:name ?genre.
    } GROUP BY ?genrer ?year 
    HAVING (COUNT (DISTINCT ?filmr) > 50)
}

} GROUP BY ?genrer
LIMIT 500

"""

print("Results")
x=run_query(queryString)

Results
[('genrer', 'http://www.wikidata.org/entity/Q471839'), ('callret-1', 'science fiction film'), ('totalFilmCount', '1466')]
[('genrer', 'http://www.wikidata.org/entity/Q93204'), ('callret-1', 'documentary film'), ('totalFilmCount', '19525')]
[('genrer', 'http://www.wikidata.org/entity/Q959790'), ('callret-1', 'crime film'), ('totalFilmCount', '2241')]
[('genrer', 'http://www.wikidata.org/entity/Q369747'), ('callret-1', 'war film'), ('totalFilmCount', '418')]
[('genrer', 'http://www.wikidata.org/entity/Q1339864'), ('callret-1', 'sport film'), ('totalFilmCount', '143')]
[('genrer', 'http://www.wikidata.org/entity/Q188473'), ('callret-1', 'action film'), ('totalFilmCount', '4417')]
[('genrer', 'http://www.wikidata.org/entity/Q1146335'), ('callret-1', 'teen film'), ('totalFilmCount', '581')]
[('genrer', 'http://www.wikidata.org/entity/Q157394'), ('callret-1', 'fantasy film'), ('totalFilmCount', '1310')]
[('genrer', 'http://www.wikidata.org/entity/Q130232'), ('callret-1', 'drama'), ('

In [10]:
queryString = """
SELECT DISTINCT ?genrer SAMPLE(?genreLabel) (SUM(?filmCount) as ?totalFilmCount)
(COUNT(DISTINCT ?year) AS ?numberOfYearsPassed)
WHERE {
{
    SELECT DISTINCT ?genrer ?year (SAMPLE(?genre) as ?genreLabel) (COUNT(DISTINCT ?filmr) as ?filmCount)
    WHERE {

       ?filmr wdt:P31 wd:Q11424;  # generic film
           ?releaseDate ?date; # with a publication date
           wdt:P136 ?genrer.   # and a genre


       VALUES(?releasedDate){
           (wdt:P577)   # publication date
           (wdt:P1191)  # date of first performance
           (wdt:P1619)  # date of official opening
       }

          BIND(NOW() as ?now)

        FILTER (?date > "2001-01-01T00:00:00Z"^^xsd:dateTime)
        FILTER (?date < ?now) .


        BIND(YEAR(?date) as ?year)

       # get the label
       ?genrer sc:name ?genre.
    } GROUP BY ?genrer ?year 
    HAVING (COUNT (DISTINCT ?filmr) > 50)
}

} GROUP BY ?genrer
LIMIT 500

"""

print("Results")
x=run_query(queryString)

Results
[('genrer', 'http://www.wikidata.org/entity/Q471839'), ('callret-1', 'science fiction film'), ('totalFilmCount', '1466'), ('numberOfYearsPassed', '18')]
[('genrer', 'http://www.wikidata.org/entity/Q93204'), ('callret-1', 'documentary film'), ('totalFilmCount', '19525'), ('numberOfYearsPassed', '21')]
[('genrer', 'http://www.wikidata.org/entity/Q959790'), ('callret-1', 'crime film'), ('totalFilmCount', '2241'), ('numberOfYearsPassed', '19')]
[('genrer', 'http://www.wikidata.org/entity/Q369747'), ('callret-1', 'war film'), ('totalFilmCount', '418'), ('numberOfYearsPassed', '7')]
[('genrer', 'http://www.wikidata.org/entity/Q1339864'), ('callret-1', 'sport film'), ('totalFilmCount', '143'), ('numberOfYearsPassed', '1')]
[('genrer', 'http://www.wikidata.org/entity/Q188473'), ('callret-1', 'action film'), ('totalFilmCount', '4417'), ('numberOfYearsPassed', '21')]
[('genrer', 'http://www.wikidata.org/entity/Q1146335'), ('callret-1', 'teen film'), ('totalFilmCount', '581'), ('numberOfY

Before inserting the filter in the final query, I would like to check that I'm calculating the right number of years:

In [15]:
queryString = """
SELECT DISTINCT ?maxNumberOfYears WHERE {
    
    BIND(YEAR(NOW()) - YEAR("2001-01-01T00:00:00Z"^^xsd:dateTime)  as ?maxNumberOfYears)
}
"""

print("Results")
x=run_query(queryString)

Results
[('maxNumberOfYears', '21')]
1
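
The same arithmetic in plain Python, with the current year fixed to 2022 (the year reported by `NOW()` earlier). The difference 2022 - 2001 is 21, whereas the calendar years 2001 through 2022 inclusive number 22. Which value is the right ceiling depends on whether the current, partial year should count. The function names are hypothetical and this is only a sanity check.

```python
def year_difference(start_year, current_year):
    """What YEAR(NOW()) - YEAR(start) computes in the query above."""
    return current_year - start_year


def calendar_years_inclusive(start_year, current_year):
    """Number of distinct calendar years from start_year to current_year, both included."""
    return len(range(start_year, current_year + 1))


print(year_difference(2001, 2022))
print(calendar_years_inclusive(2001, 2022))
```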


Final query for this task

Here the strategy is to fold the data in two steps.
- First (inner query), I count the films per genre per year. In this step, I also discard the (genre, year) pairs that do not have more than 50 films.

- Then (outer query), I sum the yearly counts for every genre. I also count how many years survive the step above: if they are fewer than the maximum, some year did not have more than 50 films, so the genre must be excluded from the results.
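
The two-step fold can be sketched in plain Python on a toy dataset. The function name, the `(film, genre, year)` tuples and the small threshold are all made up for illustration; the real query uses a threshold of 50 and runs on the endpoint.

```python
from collections import defaultdict


def genres_above_threshold_every_year(film_rows, threshold, expected_years):
    """film_rows: iterable of (film, genre, year) tuples."""
    # Step 1 (inner query): distinct films per (genre, year),
    # keeping only the pairs with more than `threshold` films
    films = defaultdict(set)
    for film, genre, year in film_rows:
        films[(genre, year)].add(film)
    per_year = {key: len(s) for key, s in films.items() if len(s) > threshold}

    # Step 2 (outer query): sum the yearly counts and count the surviving years
    totals = defaultdict(int)
    years = defaultdict(int)
    for (genre, _year), count in per_year.items():
        totals[genre] += count
        years[genre] += 1

    # Keep a genre only if every expected year survived step 1
    return {g: totals[g] for g in totals if years[g] == expected_years}


# Toy data: "western" has only one film in 2002, so it is dropped
rows = [
    ("f1", "drama", 2001), ("f2", "drama", 2001),
    ("f3", "drama", 2002), ("f4", "drama", 2002),
    ("f5", "western", 2001), ("f6", "western", 2001),
    ("f7", "western", 2002),
]
result = genres_above_threshold_every_year(rows, threshold=1, expected_years=2)
print(result)  # {'drama': 4}
```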

In [18]:
queryString = """
SELECT DISTINCT ?genrer SAMPLE(?genreLabel) (SUM(?filmCount) as ?totalFilmCount)
WHERE {
{
    SELECT DISTINCT ?genrer ?year (SAMPLE(?genre) as ?genreLabel) (COUNT(DISTINCT ?filmr) as ?filmCount)
    WHERE {

       ?filmr wdt:P31 wd:Q11424;  # generic film
           ?releaseDate ?date; # with a publication date
           wdt:P136 ?genrer.   # and a genre


       VALUES(?releasedDate){
           (wdt:P577)   # publication date
           (wdt:P1191)  # date of first performance
           (wdt:P1619)  # date of official opening
       }

          BIND(NOW() as ?now)

        FILTER (?date > "2001-01-01T00:00:00Z"^^xsd:dateTime)
        FILTER (?date < ?now) .


        BIND(YEAR(?date) as ?year)

       # get the label
       ?genrer sc:name ?genre.
    } GROUP BY ?genrer ?year 
    HAVING (COUNT (DISTINCT ?filmr) > 50)
}

} GROUP BY ?genrer
HAVING (COUNT(?year) = (YEAR(NOW()) - YEAR("2001-01-01T00:00:00Z"^^xsd:dateTime)))
LIMIT 500

"""

print("Results")
x=run_query(queryString)

Results
[('genrer', 'http://www.wikidata.org/entity/Q93204'), ('callret-1', 'documentary film'), ('totalFilmCount', '19525')]
[('genrer', 'http://www.wikidata.org/entity/Q188473'), ('callret-1', 'action film'), ('totalFilmCount', '4417')]
[('genrer', 'http://www.wikidata.org/entity/Q130232'), ('callret-1', 'drama'), ('totalFilmCount', '25175')]
[('genrer', 'http://www.wikidata.org/entity/Q157443'), ('callret-1', 'comedy film'), ('totalFilmCount', '10720')]
4


## Task 6
Consider the composers of Western films and their country of citizenship. Count the number of Western films on which a composer worked, grouped by the composer's country of citizenship, and return the top 10 (the result set must be country IRI, label and #films).

This query and the following ones are very close to others I wrote in past workflows and in the pre-exam. I copied and pasted them, adapting the model to the one I discovered with the previous queries. After some struggle with the syntax, I got the results right away.

Final query for this task

In [6]:
queryString = """
SELECT DISTINCT ?countryr  SAMPLE(?country) (COUNT (?filmr) as ?filmCount)
WHERE{
    ?filmr wdt:P86 ?composerr.
    ?composerr wdt:P27 ?countryr.
    
    
    
    ?filmr wdt:P136 wd:Q172980.
    # you may use regexes to get every subgenre, such as Acid westerns.
    # if you wish to select other than pure westerns, use instead:
    # ?filmr wdt:P136 [sc:name ?genre] FILTER regex(?genre, "estern").
    
    ?countryr sc:name ?country.
    
}GROUP BY ?countryr
ORDER BY DESC (?filmCount)
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('countryr', 'http://www.wikidata.org/entity/Q30'), ('callret-1', 'United States of America'), ('filmCount', '1080')]
[('countryr', 'http://www.wikidata.org/entity/Q145'), ('callret-1', 'United Kingdom'), ('filmCount', '59')]
[('countryr', 'http://www.wikidata.org/entity/Q34266'), ('callret-1', 'Russian Empire'), ('filmCount', '45')]
[('countryr', 'http://www.wikidata.org/entity/Q183'), ('callret-1', 'Germany'), ('filmCount', '45')]
[('countryr', 'http://www.wikidata.org/entity/Q38'), ('callret-1', 'Italy'), ('filmCount', '41')]
[('countryr', 'http://www.wikidata.org/entity/Q15180'), ('callret-1', 'Soviet Union'), ('filmCount', '36')]
[('countryr', 'http://www.wikidata.org/entity/Q40'), ('callret-1', 'Austria'), ('filmCount', '36')]
[('countryr', 'http://www.wikidata.org/entity/Q172579'), ('callret-1', 'Kingdom of Italy'), ('filmCount', '32')]
[('countryr', 'http://www.wikidata.org/entity/Q28513'), ('callret-1', 'Austria-Hungary'), ('filmCount', '31')]
[('countryr', 'http://ww
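
The commented-out alternative in the query above selects genres whose label contains "estern". A minimal sketch of what that pattern accepts, using Python's `re` module on a few hand-written labels (the labels are illustrative, not query results):

```python
import re


def is_western_label(label):
    """True if the label contains the substring matched by regex(?genre, "estern")."""
    return re.search("estern", label) is not None


labels = ["Western", "spaghetti western", "acid Western", "drama"]
print([label for label in labels if is_western_label(label)])
```

The pattern is deliberately loose so that it catches both "Western" and "western". It would also accept any unrelated label that happens to contain "estern", so the matched genres are worth inspecting before counting films.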

## Task 7
Consider the decades from 1961 to 1970, and from 2001 to 2010, and compare the total number of western films released in each one (the result set must be two pairs: decade and #western films of that decade).

Final query for this task

In [5]:

queryString = """
SELECT DISTINCT ?decade ?westernCount
WHERE{
    {
        SELECT (SAMPLE(?decade1) as ?decade) (COUNT (?filmr1) as ?westernCount){
           BIND("1961-1970" as ?decade1)
        
           ?filmr1 wdt:P31 wd:Q11424;  # generic film
               ?releaseDate1 ?date1;   # with a publication date
               wdt:P136 wd:Q172980.   # and genre = western


           VALUES(?releasedDate1){
               (wdt:P577)   # publication date
               (wdt:P1191)  # date of first performance
               (wdt:P1619)  # date of official opening
           }

            FILTER (?date1 > "1961-01-01T00:00:00Z"^^xsd:dateTime)
            FILTER (?date1 < "1970-01-01T00:00:00Z"^^xsd:dateTime) .

    } }UNION   {
        SELECT (SAMPLE(?decade2) as ?decade) (COUNT (?filmr2) as ?westernCount){
           BIND("2001-2010" as ?decade2)
        
           ?filmr2 wdt:P31 wd:Q11424;  # generic film
               ?releaseDate2 ?date2;   # with a publication date
               wdt:P136 wd:Q172980.   # and genre = western


           VALUES(?releasedDate2){
               (wdt:P577)   # publication date
               (wdt:P1191)  # date of first performance
               (wdt:P1619)  # date of official opening
           }

            FILTER (?date2 > "2001-01-01T00:00:00Z"^^xsd:dateTime)
            FILTER (?date2 < "2010-01-01T00:00:00Z"^^xsd:dateTime) .

    }}
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('decade', '1961-1970'), ('westernCount', '621')]
[('decade', '2001-2010'), ('westernCount', '189')]
2
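
Note that the filters above stop at `1970-01-01` and `2010-01-01`, so films dated during 1970 and 2010 fall outside the two ranges. If each decade is meant to include its last year, a half-open interval ending on 1 January of the following year is one way to express it. The sketch below only builds the filter string; `decade_filter` is a hypothetical helper and the counts above were not recomputed with it.

```python
def decade_filter(date_var, first_year, last_year):
    """Build a FILTER covering first_year-01-01 (included) up to last_year-12-31 (included)."""
    start = f'"{first_year}-01-01T00:00:00Z"^^xsd:dateTime'
    end = f'"{last_year + 1}-01-01T00:00:00Z"^^xsd:dateTime'
    return f"FILTER ({date_var} >= {start} && {date_var} < {end})"


print(decade_filter("?date1", 1961, 1970))
print(decade_filter("?date2", 2001, 2010))
```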


## Task 8
Consider the decades from 1961 to 1970, and from 2001 to 2010, and select only western films. Then, for each decade, compare the average cast size per film (the result set must be two pairs: decade and average cast size of that decade).

Let's find the "is in the cast" property:

In [4]:
queryString = """
SELECT DISTINCT ?pr ?p ?or ?o
WHERE {
   wd:Q206124 ?pr ?or . # Midnight in Paris, ??, ??
   # get the label
   ?pr sc:name ?p.
   ?or sc:name ?o.
   FILTER regex(?p, "cast", "i").
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('pr', 'http://www.wikidata.org/prop/direct/P161'), ('p', 'cast member'), ('or', 'http://www.wikidata.org/entity/Q104514'), ('o', 'Adrien Brody')]
[('pr', 'http://www.wikidata.org/prop/direct/P161'), ('p', 'cast member'), ('or', 'http://www.wikidata.org/entity/Q143945'), ('o', 'Carla Bruni')]
[('pr', 'http://www.wikidata.org/prop/direct/P161'), ('p', 'cast member'), ('or', 'http://www.wikidata.org/entity/Q17176312'), ('o', 'Laurent Claret')]
[('pr', 'http://www.wikidata.org/prop/direct/P161'), ('p', 'cast member'), ('or', 'http://www.wikidata.org/entity/Q1930952'), ('o', 'Michel Vuillermoz')]
[('pr', 'http://www.wikidata.org/prop/direct/P161'), ('p', 'cast member'), ('or', 'http://www.wikidata.org/entity/Q1986254'), ('o', 'Sonia Rolland')]
[('pr', 'http://www.wikidata.org/prop/direct/P161'), ('p', 'cast member'), ('or', 'http://www.wikidata.org/entity/Q231096'), ('o', 'Léa Seydoux')]
[('pr', 'http://www.wikidata.org/prop/direct/P161'), ('p', 'cast member'), ('or', 'http://www.

Final query for this task

In [2]:
queryString = """
SELECT DISTINCT ?decade ?castMemberAvg
WHERE{
    {
        SELECT (SAMPLE(?decade1) as ?decade) (AVG (?castMemberCount1) as ?castMemberAvg){
           BIND("1961-1970" as ?decade1)
           
           {
               SELECT DISTINCT ?filmr1 (COUNT(distinct ?castMember1) as ?castMemberCount1) WHERE {
                   ?filmr1 wdt:P31 wd:Q11424;  # generic film
                   ?releaseDate1 ?date1;       # with a publication date
                   wdt:P136 wd:Q172980;        # and genre = western
                   wdt:P161 ?castMember1.       # and its cast members


                   VALUES(?releasedDate1){
                       (wdt:P577)   # publication date
                       (wdt:P1191)  # date of first performance
                       (wdt:P1619)  # date of official opening
                   }

                    FILTER (?date1 > "1961-01-01T00:00:00Z"^^xsd:dateTime)
                    FILTER (?date1 < "1970-01-01T00:00:00Z"^^xsd:dateTime) .
               }GROUP BY ?filmr1
           
           }
       }

    }UNION   {
        SELECT (SAMPLE(?decade2) as ?decade) (AVG (?castMemberCount2) as ?castMemberAvg){
           BIND("2001-2010" as ?decade2)
           
           {
               SELECT DISTINCT ?filmr2 (COUNT(distinct ?castMember2) as ?castMemberCount2) WHERE {
                   ?filmr2 wdt:P31 wd:Q11424;  # generic film
                   ?releaseDate2 ?date2;       # with a publication date
                   wdt:P136 wd:Q172980;        # and genre = western
                   wdt:P161 ?castMember2.       # and its cast members


                   VALUES(?releasedDate2){
                       (wdt:P577)   # publication date
                       (wdt:P1191)  # date of first performance
                       (wdt:P1619)  # date of official opening
                   }

                   FILTER (?date2 > "2001-01-01T00:00:00Z"^^xsd:dateTime)
                   FILTER (?date2 < "2010-01-01T00:00:00Z"^^xsd:dateTime) .
               }GROUP BY ?filmr2
           
           }
       }
    }
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('decade', '1961-1970'), ('castMemberAvg', '12.628272251308901')]
[('decade', '2001-2010'), ('castMemberAvg', '8.928571428571429')]
2
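
The nested aggregation used in each branch (count the distinct cast members per film, then average those counts) can be sketched in plain Python. The function name and the toy `(film, cast_member)` pairs are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def average_cast_size(cast_rows):
    """cast_rows: iterable of (film, cast_member) pairs, possibly with duplicates."""
    # Inner step: distinct cast members per film
    cast = defaultdict(set)
    for film, member in cast_rows:
        cast[film].add(member)
    # Outer step: average of the per-film counts
    return mean(len(members) for members in cast.values())


# Toy data: f1 has two distinct members (one listed twice), f2 has one
rows = [("f1", "a"), ("f1", "b"), ("f1", "b"), ("f2", "c")]
print(average_cast_size(rows))  # 1.5
```

As in the query, a film with no cast member recorded never appears in the input, so it does not lower the average.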
