# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain knowledge graph (KG): [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained previously.

An exploratory workload starts with a usually vague, open-ended question, and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.

In [1]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-7c962ddae6-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString,verbose = True):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        results = sparql.query()
        json_results = results.convert()
        if len(json_results['results']['bindings'])==0:
            print("Empty")
            return []
        array = []
        for bindings in json_results['results']['bindings']:
            app =  [ (var, value['value'])  for var, value in bindings.items() ] 
            if verbose:
                print( app)
            array.append(app)
        if verbose:
            print(len(array))
        return array

    except Exception as e :
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
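
`run_query` returns every result row as a list of `(variable, value)` tuples, and it returns `None` when the request fails. A minimal sketch of how such rows could be turned into dictionaries for easier post-processing is given below. `rows_to_dicts` is a hypothetical helper that is not part of the given setup, and the sample row only mimics the format printed by `run_query`.

```python
def rows_to_dicts(rows):
    """Convert run_query-style rows (lists of (variable, value) tuples) into dicts."""
    # run_query falls through to None on an exception, so treat None as "no rows"
    return [dict(row) for row in (rows or [])]


# Sample row in the format produced by run_query
sample_rows = [
    [("object", "http://www.wikidata.org/entity/Q5398426"), ("label", "television series")],
]

records = rows_to_dicts(sample_rows)
print(records[0]["label"])
```

With dictionaries, a value can be read by variable name instead of by position in the tuple list.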

# Movie Workflow Series ("TV series HIMYM explorative search")


Consider the following exploratory scenario:


> We are interested in the TV series "How I met your mother" and we want to investigate the main aspects related to the actors and directors involved in the production, know the number of seasons, and check which episodes had the highest success/impact.


## Useful URIs for the current workflow
The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P106`    | profession    | predicate | 
| `wdt:P279`    | subclass      | predicate |
| `wdt:P4969`    | derivative work      | predicate |
| `wd:Q147235` | How I met your mother        | node |
| `wd:Q23831` | The Office (US)        | node |



Also consider

```
wd:Q23831 ?p ?obj .
```

is the BGP to retrieve all **properties of The Office (US)**

Please consider that when you return a resource, you should return both the IRI and the label of the resource. In particular, when the task requires you to identify a BGP, the result set must always be a list of IRI - label pairs.

The workload should

1. Identify the BGP for tv series

2. Return the number of seasons and episodes per season of the tv series (the result set must be triples of season IRI, label and #episodes).

3. Get the number of episodes in which the cast members played a role. Who are the most present actors? (the result set must be a list of triples actor/actress IRI, label and #episodes)

4. Check which actor acted in the most films while working on "How I met your mother" (the result set must be a list of triples actor/actress IRI, label and #films).

5. Compare HIMYM with the tv series "The Office (US)" in terms of number of seasons, episodes and cast members (the result set must be two elements, one for each tv series, of tv series IRI, label, #seasons, #episodes and #cast members).

6. Return how many of the actors who are members of the cast of the tv series have a [Kevin Bacon number](https://en.wikipedia.org/wiki/Six_Degrees_of_Kevin_Bacon#:~:text=Kevin%20Bacon%20himself%20has%20a,Bacon%20number%20is%20N%2B1.) equal to 2 (the result set must be a list of pairs actor/actress IRI and label).

7. Consider the actors who are members of the cast of HIMYM. Amongst the tv series in which these actors acted, return only those which received more than 2 awards (the result set must be triples of tv series IRI, label, #awards won).

## Task 1
Identify the BGP for tv series

In [2]:
# write your queries
# getting object properties of node 'HIMYM'
queryString = """
SELECT DISTINCT ?object_property ?label
WHERE
{
   wd:Q147235 ?object_property ?object.
   
   FILTER(!isLiteral(?object)).
   
   ?object_property sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('object_property', 'http://www.wikidata.org/prop/direct/P2438'), ('label', 'narrator')]
[('object_property', 'http://www.wikidata.org/prop/direct/P136'), ('label', 'genre')]
[('object_property', 'http://www.wikidata.org/prop/direct/P138'), ('label', 'named after')]
[('object_property', 'http://www.wikidata.org/prop/direct/P1411'), ('label', 'nominated for')]
[('object_property', 'http://www.wikidata.org/prop/direct/P1424'), ('label', "topic's main template")]
[('object_property', 'http://www.wikidata.org/prop/direct/P154'), ('label', 'logo image')]
[('object_property', 'http://www.wikidata.org/prop/direct/P161'), ('label', 'cast member')]
[('object_property', 'http://www.wikidata.org/prop/direct/P162'), ('label', 'producer')]
[('object_property', 'http://www.wikidata.org/prop/direct/P166'), ('label', 'award received')]
[('object_property', 'http://www.wikidata.org/prop/direct/P170'), ('label', 'creator')]
[('object_property', 'http://www.wikidata.org/prop/direct/P18'), ('labe

In [3]:
# getting data properties of node 'HIMYM'
queryString = """
SELECT DISTINCT ?data_property ?label
WHERE
{
   wd:Q147235 ?data_property ?object.
   
   FILTER(isLiteral(?object))
   
   ?data_property sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('data_property', 'http://www.wikidata.org/prop/direct/P1113'), ('label', 'number of episodes')]
[('data_property', 'http://www.wikidata.org/prop/direct/P1258'), ('label', 'Rotten Tomatoes ID')]
[('data_property', 'http://www.wikidata.org/prop/direct/P1267'), ('label', 'AlloCiné series ID')]
[('data_property', 'http://www.wikidata.org/prop/direct/P1476'), ('label', 'title')]
[('data_property', 'http://www.wikidata.org/prop/direct/P1562'), ('label', 'AllMovie title ID')]
[('data_property', 'http://www.wikidata.org/prop/direct/P1712'), ('label', 'Metacritic ID')]
[('data_property', 'http://www.wikidata.org/prop/direct/P1874'), ('label', 'Netflix ID')]
[('data_property', 'http://www.wikidata.org/prop/direct/P1922'), ('label', 'first line')]
[('data_property', 'http://www.wikidata.org/prop/direct/P2002'), ('label', 'Twitter username')]
[('data_property', 'http://www.wikidata.org/prop/direct/P2047'), ('label', 'duration')]
[('data_property', 'http://www.wikidata.org/prop/direct/P21

Final query for this task

In [4]:
# write your final query
queryString = """
SELECT ?object ?label
WHERE
{
   wd:Q147235 wdt:P31 ?object.
   
   ?object sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('object', 'http://www.wikidata.org/entity/Q5398426'), ('label', 'television series')]
1
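
Since `wd:Q147235` is an instance of `wd:Q5398426` (television series), the BGP for tv series can be written as `?series wdt:P31 wd:Q5398426`. The sketch below only builds a query string that uses this BGP. `tv_series_query` and its `limit` parameter are illustrative names, and the query has not been run in this notebook.

```python
def tv_series_query(limit=10):
    """Build a SELECT query that lists tv series as IRI - label pairs."""
    return f"""
SELECT DISTINCT ?series ?label
WHERE
{{
   ?series wdt:P31 wd:Q5398426.

   ?series sc:name ?label.
}}
LIMIT {int(limit)}
"""


example_query = tv_series_query(5)
print(example_query)
```

The resulting string could be passed to `run_query` in the same way as the other queries.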


## Task 2
Return the number of seasons and episodes per season of the tv series (the result set must be triples of season IRI, label and #episodes).

In [5]:
# s:'HIMYM' p:'list of episodes' o:what node?
queryString = """
SELECT ?object ?label
WHERE
{
   wd:Q147235 wdt:P1811 ?object.
   
   ?object sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('object', 'http://www.wikidata.org/entity/Q785891'), ('label', 'list of How I Met Your Mother episodes')]
1


In [6]:
# getting data properties of node 'list of How I met your mother episodes'
queryString = """
SELECT DISTINCT ?data_property ?label
WHERE
{
   wd:Q785891 ?data_property ?object.
   
   FILTER(isLiteral(?object)).
   
   ?data_property sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('data_property', 'http://www.wikidata.org/prop/direct/P3921'), ('label', 'Wikidata SPARQL query equivalent')]
1


In [7]:
# getting only number of seasons of node 'HIMYM'
queryString = """
SELECT ?num_season
WHERE
{
   wd:Q147235 wdt:P2437 ?num_season.
}
"""

print("Results")
x=run_query(queryString)

Results
[('num_season', '9')]
1


In [8]:
# getting only number of episodes of node 'HIMYM'
queryString = """
SELECT ?num_episode
WHERE
{
   wd:Q147235 wdt:P1113 ?num_episode.
}
"""

print("Results")
x=run_query(queryString)

Results
[('num_episode', '208')]
1


In [9]:
# checking 'has parts of the class' object property of node 'HIMYM'
queryString = """
SELECT ?object ?label
WHERE
{
   wd:Q147235 wdt:P2670 ?object.
   
   ?object sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('object', 'http://www.wikidata.org/entity/Q21664088'), ('label', 'two-part episode')]
1


In [10]:
# checking 'has part' object property of node 'HIMYM'
queryString = """
SELECT ?object ?label
WHERE
{
   wd:Q147235 wdt:P527 ?object.
   
   ?object sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('object', 'http://www.wikidata.org/entity/Q2438066'), ('label', 'How I Met Your Mother, season 6')]
[('object', 'http://www.wikidata.org/entity/Q2715578'), ('label', 'How I Met Your Mother, season 1')]
[('object', 'http://www.wikidata.org/entity/Q13567027'), ('label', 'How I Met Your Mother, season 9')]
[('object', 'http://www.wikidata.org/entity/Q2567330'), ('label', 'How I Met Your Mother, season 4')]
[('object', 'http://www.wikidata.org/entity/Q338715'), ('label', 'How I Met Your Mother, season 8')]
[('object', 'http://www.wikidata.org/entity/Q582332'), ('label', 'How I Met Your Mother, season 5')]
[('object', 'http://www.wikidata.org/entity/Q3468515'), ('label', 'How I Met Your Mother, season 2')]
[('object', 'http://www.wikidata.org/entity/Q2555117'), ('label', 'How I Met Your Mother, season 3')]
[('object', 'http://www.wikidata.org/entity/Q2472427'), ('label', 'How I Met Your Mother, season 7')]
9


In [11]:
# getting object/data properties of an arbitrary season (9)
queryString = """
SELECT DISTINCT ?property ?label
WHERE
{
   wd:Q13567027 ?property ?object.
   
   ?property sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('property', 'http://www.wikidata.org/prop/direct/P1113'), ('label', 'number of episodes')]
[('property', 'http://www.wikidata.org/prop/direct/P1258'), ('label', 'Rotten Tomatoes ID')]
[('property', 'http://www.wikidata.org/prop/direct/P1712'), ('label', 'Metacritic ID')]
[('property', 'http://www.wikidata.org/prop/direct/P179'), ('label', 'part of the series')]
[('property', 'http://www.wikidata.org/prop/direct/P2529'), ('label', 'ČSFD film ID')]
[('property', 'http://www.wikidata.org/prop/direct/P2638'), ('label', 'TV.com ID')]
[('property', 'http://www.wikidata.org/prop/direct/P2704'), ('label', 'EIDR content ID')]
[('property', 'http://www.wikidata.org/prop/direct/P31'), ('label', 'instance of')]
[('property', 'http://www.wikidata.org/prop/direct/P3302'), ('label', 'Open Media Database film ID')]
[('property', 'http://www.wikidata.org/prop/direct/P364'), ('label', 'original language of film or TV show')]
[('property', 'http://www.wikidata.org/prop/direct/P437'), ('label', 

In [12]:
# checking 'number of episodes' property of an arbitrary season (9)
queryString = """
SELECT ?num_episode_season_9
WHERE
{
   wd:Q13567027 wdt:P1113 ?num_episode_season_9.
}
"""

print("Results")
x=run_query(queryString)

Results
[('num_episode_season_9', '24')]
1


Final query for this task

In [13]:
# write your final query
queryString = """
SELECT ?season_iri ?label ?num_episode
WHERE
{
   wd:Q147235 wdt:P527 ?season_iri.
   ?season_iri wdt:P1113 ?num_episode.
   
   ?season_iri sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('season_iri', 'http://www.wikidata.org/entity/Q2438066'), ('label', 'How I Met Your Mother, season 6'), ('num_episode', '24')]
[('season_iri', 'http://www.wikidata.org/entity/Q2715578'), ('label', 'How I Met Your Mother, season 1'), ('num_episode', '22')]
[('season_iri', 'http://www.wikidata.org/entity/Q13567027'), ('label', 'How I Met Your Mother, season 9'), ('num_episode', '24')]
[('season_iri', 'http://www.wikidata.org/entity/Q2567330'), ('label', 'How I Met Your Mother, season 4'), ('num_episode', '24')]
[('season_iri', 'http://www.wikidata.org/entity/Q338715'), ('label', 'How I Met Your Mother, season 8'), ('num_episode', '24')]
[('season_iri', 'http://www.wikidata.org/entity/Q582332'), ('label', 'How I Met Your Mother, season 5'), ('num_episode', '24')]
[('season_iri', 'http://www.wikidata.org/entity/Q3468515'), ('label', 'How I Met Your Mother, season 2'), ('num_episode', '22')]
[('season_iri', 'http://www.wikidata.org/entity/Q2555117'), ('label', 'How I Met Your Moth
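
The per-season counts can be cross-checked against the series-level `number of episodes` value (208) found earlier. A minimal sketch of the summation is shown here. It uses only the seven rows that are fully visible in the output above, so the sum is partial, and `total_episodes` is an illustrative helper name.

```python
# Rows copied from the visible part of the output above (run_query format, IRIs omitted)
visible_rows = [
    [("label", "How I Met Your Mother, season 6"), ("num_episode", "24")],
    [("label", "How I Met Your Mother, season 1"), ("num_episode", "22")],
    [("label", "How I Met Your Mother, season 9"), ("num_episode", "24")],
    [("label", "How I Met Your Mother, season 4"), ("num_episode", "24")],
    [("label", "How I Met Your Mother, season 8"), ("num_episode", "24")],
    [("label", "How I Met Your Mother, season 5"), ("num_episode", "24")],
    [("label", "How I Met Your Mother, season 2"), ("num_episode", "22")],
]


def total_episodes(rows):
    """Sum the num_episode values; run_query delivers numbers as strings."""
    return sum(int(dict(row)["num_episode"]) for row in rows)


partial_total = total_episodes(visible_rows)
print(partial_total)
```

Applied to the full result `x` of the final query, the same function would give a total that can be compared with 208.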

## Task 3
Get the number of episodes in which the cast members played a role. Who are the most present actors? (the result set must be a list of triples actor/actress IRI, label and #episodes)

In [14]:
# write your queries
# checking 'cast member' object property of node 'HIMYM' (actual count: 480)
queryString = """
SELECT DISTINCT ?object ?label
WHERE
{
   wd:Q147235 wdt:P161 ?object.
   
   ?object sc:name ?label.
}
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('object', 'http://www.wikidata.org/entity/Q465556'), ('label', 'Annie Ilonzeh')]
[('object', 'http://www.wikidata.org/entity/Q269891'), ('label', 'Julianna Guill')]
[('object', 'http://www.wikidata.org/entity/Q4888924'), ('label', 'Benjamin Koldyke')]
[('object', 'http://www.wikidata.org/entity/Q446031'), ('label', 'Nikki Griffin')]
[('object', 'http://www.wikidata.org/entity/Q312705'), ('label', 'John Cho')]
[('object', 'http://www.wikidata.org/entity/Q469579'), ('label', 'Mircea Monroe')]
[('object', 'http://www.wikidata.org/entity/Q516659'), ('label', 'Virginia Williams')]
[('object', 'http://www.wikidata.org/entity/Q522856'), ('label', 'Kate Micucci')]
[('object', 'http://www.wikidata.org/entity/Q200566'), ('label', 'Cobie Smulders')]
[('object', 'http://www.wikidata.org/entity/Q435839'), ('label', 'Ashley Williams')]
[('object', 'http://www.wikidata.org/entity/Q446481'), ('label', 'Harvey Fierstein')]
[('object', 'http://www.wikidata.org/entity/Q2040329'), ('label', 'Hay

In [15]:
# getting object/data properties of an arbitrary actress (Annie Ilonzeh) taken from last query's output
queryString = """
SELECT DISTINCT ?property ?label
WHERE
{
   wd:Q465556 ?property ?object.
   
   ?property sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('property', 'http://www.wikidata.org/prop/direct/P106'), ('label', 'occupation')]
[('property', 'http://www.wikidata.org/prop/direct/P1266'), ('label', 'AlloCiné person ID')]
[('property', 'http://www.wikidata.org/prop/direct/P172'), ('label', 'ethnic group')]
[('property', 'http://www.wikidata.org/prop/direct/P19'), ('label', 'place of birth')]
[('property', 'http://www.wikidata.org/prop/direct/P2002'), ('label', 'Twitter username')]
[('property', 'http://www.wikidata.org/prop/direct/P2003'), ('label', 'Instagram username')]
[('property', 'http://www.wikidata.org/prop/direct/P2013'), ('label', 'Facebook ID')]
[('property', 'http://www.wikidata.org/prop/direct/P2019'), ('label', 'AllMovie person ID')]
[('property', 'http://www.wikidata.org/prop/direct/P2031'), ('label', 'work period (start)')]
[('property', 'http://www.wikidata.org/prop/direct/P21'), ('label', 'sex or gender')]
[('property', 'http://www.wikidata.org/prop/direct/P214'), ('label', 'VIAF ID')]
[('property', 'htt

In [16]:
# checking 'has part' object property of an arbitrary season (9)
queryString = """
SELECT ?object ?label
WHERE
{
   wd:Q13567027 wdt:P527 ?object.
   
   ?object sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('object', 'http://www.wikidata.org/entity/Q15846036'), ('label', 'Unpause')]
[('object', 'http://www.wikidata.org/entity/Q15846040'), ('label', 'Slapsgiving 3: Slappointment in Slapmarra')]
[('object', 'http://www.wikidata.org/entity/Q15919313'), ('label', 'Rally')]
[('object', 'http://www.wikidata.org/entity/Q15983482'), ('label', 'How Your Mother Met Me')]
[('object', 'http://www.wikidata.org/entity/Q15983495'), ('label', 'Sunrise')]
[('object', 'http://www.wikidata.org/entity/Q16738646'), ('label', 'Daisy')]
[('object', 'http://www.wikidata.org/entity/Q16744027'), ('label', 'The Poker Game')]
[('object', 'http://www.wikidata.org/entity/Q16745693'), ('label', 'Gary Blauman')]
[('object', 'http://www.wikidata.org/entity/Q16746127'), ('label', 'The Locket')]
[('object', 'http://www.wikidata.org/entity/Q16746145'), ('label', 'Coming Back')]
[('object', 'http://www.wikidata.org/entity/Q16746190'), ('label', 'The Broken Code')]
[('object', 'http://www.wikidata.org/entity/Q167462

In [17]:
# getting object/data properties of an arbitrary episode (Unpause) of an arbitrary season (9)
queryString = """
SELECT DISTINCT ?property ?label
WHERE
{
   wd:Q15846036 ?property ?object.
   
   ?property sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('property', 'http://www.wikidata.org/prop/direct/P1258'), ('label', 'Rotten Tomatoes ID')]
[('property', 'http://www.wikidata.org/prop/direct/P1476'), ('label', 'title')]
[('property', 'http://www.wikidata.org/prop/direct/P155'), ('label', 'follows')]
[('property', 'http://www.wikidata.org/prop/direct/P156'), ('label', 'followed by')]
[('property', 'http://www.wikidata.org/prop/direct/P1562'), ('label', 'AllMovie title ID')]
[('property', 'http://www.wikidata.org/prop/direct/P179'), ('label', 'part of the series')]
[('property', 'http://www.wikidata.org/prop/direct/P2529'), ('label', 'ČSFD film ID')]
[('property', 'http://www.wikidata.org/prop/direct/P2638'), ('label', 'TV.com ID')]
[('property', 'http://www.wikidata.org/prop/direct/P2704'), ('label', 'EIDR content ID')]
[('property', 'http://www.wikidata.org/prop/direct/P31'), ('label', 'instance of')]
[('property', 'http://www.wikidata.org/prop/direct/P3212'), ('label', 'ISAN')]
[('property', 'http://www.wikidata.org/prop/d

In [18]:
# getting all 'cast member' of an arbitrary episode (Unpause) of an arbitrary season (9)
queryString = """
SELECT ?actor_iri ?label
WHERE
{
   wd:Q15846036 wdt:P161 ?actor_iri.
   
   ?actor_iri sc:name ?label.
}
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [19]:
# getting all 'cast member' of a different episode (Rally) of the same season (9) to check whether 
# this one is also empty or not
queryString = """
SELECT ?actor_iri ?label
WHERE
{
   wd:Q15919313 wdt:P161 ?actor_iri.
   
   ?actor_iri sc:name ?label.
}
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [20]:
# checking 'has part' object property of an arbitrary season (7)
queryString = """
SELECT ?object ?label
WHERE
{
   wd:Q2472427 wdt:P527 ?object.
   
   ?object sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('object', 'http://www.wikidata.org/entity/Q6372650'), ('label', 'Karma')]
[('object', 'http://www.wikidata.org/entity/Q7720231'), ('label', 'The Broath')]
[('object', 'http://www.wikidata.org/entity/Q5582491'), ('label', 'Good Crazy')]
[('object', 'http://www.wikidata.org/entity/Q7065844'), ('label', "Now We're Even")]
[('object', 'http://www.wikidata.org/entity/Q7842088'), ('label', 'Trilogy Time')]
[('object', 'http://www.wikidata.org/entity/Q11158351'), ('label', 'Mystery vs. History')]
[('object', 'http://www.wikidata.org/entity/Q16745474'), ('label', 'Noretta')]
[('object', 'http://www.wikidata.org/entity/Q16745507'), ('label', 'The Burning Beekeeper')]
[('object', 'http://www.wikidata.org/entity/Q4638335'), ('label', '46 Minutes')]
[('object', 'http://www.wikidata.org/entity/Q48984467'), ('label', "The Magician's Code (Part 1)")]
[('object', 'http://www.wikidata.org/entity/Q48984493'), ('label', "The Magician's Code (Part 2)")]
[('object', 'http://www.wikidata.org/entit

In [21]:
# getting all 'cast member' of another episode (wd:Q5446960) to check whether 
# this one is also empty or not
# note: the previous output lists 'Karma' as wd:Q6372650, so the IRI queried here is a different node

# this one is not empty; so, it seems like some episodes do not have their cast members listed
queryString = """
SELECT ?actor_iri ?label
WHERE
{
   wd:Q5446960 wdt:P161 ?actor_iri.
   
   ?actor_iri sc:name ?label.
}
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('actor_iri', 'http://www.wikidata.org/entity/Q200566'), ('label', 'Cobie Smulders')]
[('actor_iri', 'http://www.wikidata.org/entity/Q1189102'), ('label', 'Nazanin Boniadi')]
[('actor_iri', 'http://www.wikidata.org/entity/Q220536'), ('label', 'Kal Penn')]
[('actor_iri', 'http://www.wikidata.org/entity/Q485310'), ('label', 'Neil Patrick Harris')]
[('actor_iri', 'http://www.wikidata.org/entity/Q223455'), ('label', 'Josh Radnor')]
[('actor_iri', 'http://www.wikidata.org/entity/Q199927'), ('label', 'Alyson Hannigan')]
[('actor_iri', 'http://www.wikidata.org/entity/Q202304'), ('label', 'Jason Segel')]
[('actor_iri', 'http://www.wikidata.org/entity/Q329744'), ('label', 'Martin Short')]
[('actor_iri', 'http://www.wikidata.org/entity/Q333544'), ('label', 'Bob Saget')]
[('actor_iri', 'http://www.wikidata.org/entity/Q3109076'), ('label', 'Jay Acovone')]
10
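
The empty results above suggest that cast members are recorded only for some episodes. One way to quantify this, sketched here under the assumption that the endpoint supports SPARQL 1.1 `FILTER NOT EXISTS`, is to list the episodes that have no `wdt:P161` statement. `episodes_without_cast_query` is an illustrative name, and the query has not been run in this notebook.

```python
def episodes_without_cast_query(series="wd:Q147235"):
    """Build a query listing the episodes of a series that have no cast member."""
    return f"""
SELECT DISTINCT ?episode ?label
WHERE
{{
   {series} wdt:P527 ?season.
   ?season wdt:P527 ?episode.

   FILTER NOT EXISTS {{ ?episode wdt:P161 ?actor. }}

   ?episode sc:name ?label.
}}
"""


missing_cast_query = episodes_without_cast_query()
print(missing_cast_query)
```

Any episode returned by such a query contributes nothing to per-actor episode counts, so those counts are lower bounds.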


Final query for this task

In [22]:
# write your final query
queryString = """
SELECT ?actor_iri ?label (COUNT(*) AS ?num_episode)
WHERE
{
   wd:Q147235 wdt:P527 ?season.
   ?season wdt:P527 ?episode.
   ?episode wdt:P161 ?actor_iri.
   
   ?actor_iri sc:name ?label.
}
GROUP BY ?actor_iri ?label
ORDER BY DESC(?num_episode)
"""

print("Results")
x=run_query(queryString)

Results
[('actor_iri', 'http://www.wikidata.org/entity/Q223455'), ('label', 'Josh Radnor'), ('num_episode', '145')]
[('actor_iri', 'http://www.wikidata.org/entity/Q485310'), ('label', 'Neil Patrick Harris'), ('num_episode', '145')]
[('actor_iri', 'http://www.wikidata.org/entity/Q200566'), ('label', 'Cobie Smulders'), ('num_episode', '145')]
[('actor_iri', 'http://www.wikidata.org/entity/Q202304'), ('label', 'Jason Segel'), ('num_episode', '145')]
[('actor_iri', 'http://www.wikidata.org/entity/Q199927'), ('label', 'Alyson Hannigan'), ('num_episode', '143')]
[('actor_iri', 'http://www.wikidata.org/entity/Q333544'), ('label', 'Bob Saget'), ('num_episode', '142')]
[('actor_iri', 'http://www.wikidata.org/entity/Q229914'), ('label', 'Lyndsy Fonseca'), ('num_episode', '48')]
[('actor_iri', 'http://www.wikidata.org/entity/Q297128'), ('label', 'David Henrie'), ('num_episode', '48')]
[('actor_iri', 'http://www.wikidata.org/entity/Q16149506'), ('label', 'Charlene Amoia'), ('num_episode', '17')]
[
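
Several actors are tied at the top of this ranking, so "the most present actor" is not a single row. A small sketch of extracting all tied leaders on the client side follows. `most_present` is an illustrative helper, and the rows are copied from the visible part of the output above with IRIs omitted. The counts arrive as strings and must be converted to integers before comparing, because as strings `"48"` sorts after `"145"`.

```python
rows = [
    [("label", "Josh Radnor"), ("num_episode", "145")],
    [("label", "Neil Patrick Harris"), ("num_episode", "145")],
    [("label", "Cobie Smulders"), ("num_episode", "145")],
    [("label", "Jason Segel"), ("num_episode", "145")],
    [("label", "Alyson Hannigan"), ("num_episode", "143")],
    [("label", "Bob Saget"), ("num_episode", "142")],
    [("label", "Lyndsy Fonseca"), ("num_episode", "48")],
]


def most_present(rows):
    """Return the labels of all actors that share the highest episode count."""
    counts = []
    for row in rows:
        record = dict(row)
        counts.append((record["label"], int(record["num_episode"])))
    top = max(n for _, n in counts)
    return [label for label, n in counts if n == top]


top_actors = most_present(rows)
print(top_actors)
```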

## Task 4
Check which actor acted in the most films while working on "How I met your mother" (the result set must be a list of triples actor/actress IRI, label and #films).

In [23]:
# write your queries
# getting whatever an arbitrary actor (Josh Radnor) has played in using the 'cast member' and also by the use of
# 'instance of' property we see whether it is a film or not
queryString = """
SELECT DISTINCT ?subject ?label ?object ?instance_of
WHERE
{
   ?subject wdt:P161 wd:Q223455.
   ?subject wdt:P31 ?object.
   
   ?subject sc:name ?label.
   ?object sc:name ?instance_of.
}
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('subject', 'http://www.wikidata.org/entity/Q4137975'), ('label', 'Liberal Arts'), ('object', 'http://www.wikidata.org/entity/Q11424'), ('instance_of', 'film')]
[('subject', 'http://www.wikidata.org/entity/Q4690774'), ('label', 'Afternoon Delight'), ('object', 'http://www.wikidata.org/entity/Q11424'), ('instance_of', 'film')]
[('subject', 'http://www.wikidata.org/entity/Q52715605'), ('label', 'Social Animals'), ('object', 'http://www.wikidata.org/entity/Q11424'), ('instance_of', 'film')]
[('subject', 'http://www.wikidata.org/entity/Q18170028'), ('label', 'The Galapagos Affair'), ('object', 'http://www.wikidata.org/entity/Q11424'), ('instance_of', 'film')]
[('subject', 'http://www.wikidata.org/entity/Q1050720'), ('label', 'Happythankyoumoreplease'), ('object', 'http://www.wikidata.org/entity/Q11424'), ('instance_of', 'film')]
[('subject', 'http://www.wikidata.org/entity/Q1145848'), ('label', 'Not Another Teen Movie'), ('object', 'http://www.wikidata.org/entity/Q11424'), ('insta

In [24]:
# getting only the films that an arbitrary actor (Josh Radnor) has played in
queryString = """
SELECT DISTINCT ?film ?label 
WHERE
{
   ?film wdt:P161 wd:Q223455.
   ?film wdt:P31 wd:Q11424.
   
   ?film sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('film', 'http://www.wikidata.org/entity/Q4137975'), ('label', 'Liberal Arts')]
[('film', 'http://www.wikidata.org/entity/Q4690774'), ('label', 'Afternoon Delight')]
[('film', 'http://www.wikidata.org/entity/Q52715605'), ('label', 'Social Animals')]
[('film', 'http://www.wikidata.org/entity/Q18170028'), ('label', 'The Galapagos Affair')]
[('film', 'http://www.wikidata.org/entity/Q1050720'), ('label', 'Happythankyoumoreplease')]
[('film', 'http://www.wikidata.org/entity/Q1145848'), ('label', 'Not Another Teen Movie')]
[('film', 'http://www.wikidata.org/entity/Q2884686'), ('label', "Barney's Great Adventure")]
7


In [25]:
# getting object/data properties of an arbitrary film (Liberal Arts)
queryString = """
SELECT DISTINCT ?property ?label
WHERE
{
   wd:Q4137975 ?property ?object.
   
   ?property sc:name ?label.
}
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('property', 'http://www.wikidata.org/prop/direct/P1040'), ('label', 'film editor')]
[('property', 'http://www.wikidata.org/prop/direct/P1237'), ('label', 'Box Office Mojo film ID (former scheme)')]
[('property', 'http://www.wikidata.org/prop/direct/P1258'), ('label', 'Rotten Tomatoes ID')]
[('property', 'http://www.wikidata.org/prop/direct/P1265'), ('label', 'AlloCiné film ID')]
[('property', 'http://www.wikidata.org/prop/direct/P136'), ('label', 'genre')]
[('property', 'http://www.wikidata.org/prop/direct/P1476'), ('label', 'title')]
[('property', 'http://www.wikidata.org/prop/direct/P1562'), ('label', 'AllMovie title ID')]
[('property', 'http://www.wikidata.org/prop/direct/P161'), ('label', 'cast member')]
[('property', 'http://www.wikidata.org/prop/direct/P1712'), ('label', 'Metacritic ID')]
[('property', 'http://www.wikidata.org/prop/direct/P1874'), ('label', 'Netflix ID')]
[('property', 'http://www.wikidata.org/prop/direct/P1970'), ('label', 'MovieMeter film ID')]
[('pro

In [26]:
# checking the 'publication date' property of the films that an arbitrary actor (Josh Radnor) has played in
queryString = """
SELECT ?film ?label ?publication_date
WHERE
{
   ?film wdt:P161 wd:Q223455.
   ?film wdt:P31 wd:Q11424.
   ?film wdt:P577 ?publication_date.
   
   ?film sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('film', 'http://www.wikidata.org/entity/Q4137975'), ('label', 'Liberal Arts'), ('publication_date', '2012-01-01T00:00:00Z')]
[('film', 'http://www.wikidata.org/entity/Q4690774'), ('label', 'Afternoon Delight'), ('publication_date', '2013-01-21T00:00:00Z')]
[('film', 'http://www.wikidata.org/entity/Q4690774'), ('label', 'Afternoon Delight'), ('publication_date', '2015-05-07T00:00:00Z')]
[('film', 'http://www.wikidata.org/entity/Q52715605'), ('label', 'Social Animals'), ('publication_date', '2018-06-01T00:00:00Z')]
[('film', 'http://www.wikidata.org/entity/Q18170028'), ('label', 'The Galapagos Affair'), ('publication_date', '2013-01-01T00:00:00Z')]
[('film', 'http://www.wikidata.org/entity/Q1050720'), ('label', 'Happythankyoumoreplease'), ('publication_date', '2010-01-01T00:00:00Z')]
[('film', 'http://www.wikidata.org/entity/Q1145848'), ('label', 'Not Another Teen Movie'), ('publication_date', '2001-12-07T00:00:00Z')]
[('film', 'http://www.wikidata.org/entity/Q1145848'), ('labe

In [27]:
# getting the 'start time' and 'end time' properties of node 'HIMYM'
queryString = """
SELECT ?label ?start_time ?end_time
WHERE
{
   wd:Q147235 wdt:P580 ?start_time.
   wd:Q147235 wdt:P582 ?end_time.

   wd:Q147235 sc:name ?label
}
"""

print("Results")
x=run_query(queryString)

Results
[('label', 'How I Met Your Mother'), ('start_time', '2005-09-19T00:00:00Z'), ('end_time', '2014-03-31T00:00:00Z')]
1


In [28]:
# getting only the films that an arbitrary actor (Josh Radnor) has played in, 
# that were published between the start and end time of 'HIMYM'
queryString = """
SELECT DISTINCT ?film ?label 
WHERE
{
   ?film wdt:P161 wd:Q223455.
   ?film wdt:P31 wd:Q11424.
   ?film wdt:P577 ?publication_date.
   
   FILTER("2005-09-19T00:00:00Z"^^xsd:dateTime <= ?publication_date && ?publication_date <= "2014-03-31T00:00:00Z"^^xsd:dateTime)
   
   ?film sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('film', 'http://www.wikidata.org/entity/Q4137975'), ('label', 'Liberal Arts')]
[('film', 'http://www.wikidata.org/entity/Q4690774'), ('label', 'Afternoon Delight')]
[('film', 'http://www.wikidata.org/entity/Q18170028'), ('label', 'The Galapagos Affair')]
[('film', 'http://www.wikidata.org/entity/Q1050720'), ('label', 'Happythankyoumoreplease')]
4
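
The date filter can be reproduced on the client side to check which films fall inside the HIMYM broadcast window. The sketch below uses the publication dates that are fully visible in the earlier output and the start and end times of the series. A film qualifies as soon as one of its publication dates lies in the window. The variable names are illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%SZ"
start = datetime.strptime("2005-09-19T00:00:00Z", FMT)
end = datetime.strptime("2014-03-31T00:00:00Z", FMT)

# (label, publication date) pairs copied from the visible output rows
publication_dates = [
    ("Liberal Arts", "2012-01-01T00:00:00Z"),
    ("Afternoon Delight", "2013-01-21T00:00:00Z"),
    ("Afternoon Delight", "2015-05-07T00:00:00Z"),
    ("Social Animals", "2018-06-01T00:00:00Z"),
    ("The Galapagos Affair", "2013-01-01T00:00:00Z"),
    ("Happythankyoumoreplease", "2010-01-01T00:00:00Z"),
    ("Not Another Teen Movie", "2001-12-07T00:00:00Z"),
]

# A set keeps each film once even if several of its dates match
in_window = sorted(
    {label for label, date in publication_dates
     if start <= datetime.strptime(date, FMT) <= end}
)
print(in_window)
```

The four labels obtained this way match the four films returned by the query above.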


Final query for this task

In [29]:
# write your final query
queryString = """
SELECT DISTINCT ?actor_iri ?label (COUNT(?film) AS ?num_film)
WHERE
{
   wd:Q147235 wdt:P161 ?actor_iri.
   ?film wdt:P161 ?actor_iri.
   ?film wdt:P31 wd:Q11424.
   ?film wdt:P577 ?publication_date.
   
   FILTER("2005-09-19T00:00:00Z"^^xsd:dateTime <= ?publication_date && ?publication_date <= "2014-03-31T00:00:00Z"^^xsd:dateTime)
   
   ?actor_iri sc:name ?label.
}
GROUP BY ?actor_iri ?label
ORDER BY DESC(?num_film)
"""

print("Results")
x=run_query(queryString)

Results
[('actor_iri', 'http://www.wikidata.org/entity/Q192165'), ('label', 'Danny Glover'), ('num_film', '60')]
[('actor_iri', 'http://www.wikidata.org/entity/Q255293'), ('label', 'Geoff Stults'), ('num_film', '39')]
[('actor_iri', 'http://www.wikidata.org/entity/Q1319539'), ('label', 'Thomas Lennon'), ('num_film', '36')]
[('actor_iri', 'http://www.wikidata.org/entity/Q229669'), ('label', 'Malin Åkerman'), ('num_film', '35')]
[('actor_iri', 'http://www.wikidata.org/entity/Q236189'), ('label', 'Judy Greer'), ('num_film', '33')]
[('actor_iri', 'http://www.wikidata.org/entity/Q1319744'), ('label', 'Will Forte'), ('num_film', '32')]
[('actor_iri', 'http://www.wikidata.org/entity/Q23547'), ('label', 'Bryan Cranston'), ('num_film', '30')]
[('actor_iri', 'http://www.wikidata.org/entity/Q312705'), ('label', 'John Cho'), ('num_film', '30')]
[('actor_iri', 'http://www.wikidata.org/entity/Q1189470'), ('label', 'Jimmi Simpson'), ('num_film', '30')]
[('actor_iri', 'http://www.wikidata.org/entity/Q
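
Because `?film wdt:P577 ?publication_date` can match several dates for the same film, `COUNT(?film)` counts a film once per matching date, whereas `COUNT(DISTINCT ?film)` would count it once. The difference is illustrated below with made-up bindings; the actor and film identifiers are hypothetical and are not query results.

```python
from collections import defaultdict

# Hypothetical (actor, film, publication date) bindings, one per matched solution
bindings = [
    ("actor_x", "film_1", "2010-01-01"),
    ("actor_x", "film_1", "2010-06-15"),  # same film, second date inside the window
    ("actor_x", "film_2", "2012-03-01"),
]

row_counts = defaultdict(int)       # behaves like COUNT(?film)
distinct_films = defaultdict(set)   # behaves like COUNT(DISTINCT ?film)
for actor, film, _date in bindings:
    row_counts[actor] += 1
    distinct_films[actor].add(film)

distinct_counts = {actor: len(films) for actor, films in distinct_films.items()}
print(dict(row_counts), distinct_counts)
```

If the number of films rather than the number of release events is wanted, the distinct variant is the safer choice, and the figures above may then be lower.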

## Task 5
Compare HIMYM with the tv series "The Office (US)" in terms of number of seasons, episodes and cast members (the result set must be two elements, one for each tv series, of tv series IRI, label, #seasons, #episodes and #cast members).

In [30]:
# write your queries
# getting number of seasons and episodes of node 'HIMYM'
queryString = """
SELECT wd:Q147235 ?label ?num_season ?num_episode
WHERE
{
   wd:Q147235 wdt:P2437 ?num_season.
   wd:Q147235 wdt:P1113 ?num_episode.
   
   wd:Q147235 sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('callret-0', 'http://www.wikidata.org/entity/Q147235'), ('label', 'How I Met Your Mother'), ('num_season', '9'), ('num_episode', '208')]
1


In [31]:
# getting only number of cast members of node 'HIMYM'
queryString = """
SELECT wd:Q147235 ?label (COUNT (?cast_member) AS ?num_cast)
WHERE
{
   wd:Q147235 wdt:P161 ?cast_member.
   
   wd:Q147235 sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('callret-0', 'http://www.wikidata.org/entity/Q147235'), ('label', 'How I Met Your Mother'), ('num_cast', '480')]
1


Final query for this task

In [32]:
# write your final query
queryString = """
SELECT ?series_iri ?label ?num_season ?num_episode (COUNT (?cast_member) AS ?num_cast)
WHERE
{
   values ?series_iri { wd:Q147235 wd:Q23831 }
   
   ?series_iri wdt:P2437 ?num_season.
   ?series_iri wdt:P1113 ?num_episode.
   ?series_iri wdt:P161 ?cast_member.
   
   ?series_iri sc:name ?label.
}
GROUP BY ?series_iri ?label ?num_season ?num_episode
"""

print("Results")
x=run_query(queryString)

Results
[('series_iri', 'http://www.wikidata.org/entity/Q147235'), ('label', 'How I Met Your Mother'), ('num_season', '9'), ('num_episode', '208'), ('num_cast', '480')]
[('series_iri', 'http://www.wikidata.org/entity/Q23831'), ('label', 'The Office'), ('num_season', '9'), ('num_episode', '201'), ('num_cast', '25')]
2
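
Each row returned by `run_query` is a list of `(variable, value)` tuples, as the printed output shows. A minimal sketch of turning such rows into dictionaries for a side-by-side comparison could look like the following. The helper name `rows_to_dicts` is hypothetical, and the sample rows are copied from the output above rather than fetched from the endpoint.

```python
def rows_to_dicts(rows):
    """Convert run_query-style rows (lists of (var, value) tuples) to dicts."""
    return [dict(row) for row in rows]


# Sample rows copied from the printed output above (not fetched live).
sample = [
    [('series_iri', 'http://www.wikidata.org/entity/Q147235'),
     ('label', 'How I Met Your Mother'),
     ('num_season', '9'), ('num_episode', '208'), ('num_cast', '480')],
    [('series_iri', 'http://www.wikidata.org/entity/Q23831'),
     ('label', 'The Office'),
     ('num_season', '9'), ('num_episode', '201'), ('num_cast', '25')],
]

series = rows_to_dicts(sample)
for s in series:
    # Every value arrives as a string, so cast before comparing numbers.
    print(s['label'], int(s['num_season']), int(s['num_episode']), int(s['num_cast']))

# Difference in episode counts between the first and second series.
episode_gap = int(series[0]['num_episode']) - int(series[1]['num_episode'])
```

Casting to `int` matters here because comparing the raw strings would order `'208'` and `'25'` lexicographically.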


## Task 6
Return how many of the actors who are members of the cast of the TV series have a [Kevin Bacon number](https://en.wikipedia.org/wiki/Six_Degrees_of_Kevin_Bacon#:~:text=Kevin%20Bacon%20himself%20has%20a,Bacon%20number%20is%20N%2B1.) equal to 2 (the result set must be a list of pairs of actor/actress IRI and label).
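
As the linked page defines it, Kevin Bacon himself has number 0, and an actor whose closest co-star has number N gets N+1. That is a shortest-path distance in the co-star graph, so it can be sketched as a breadth-first search. The snippet below is only an illustration under stated assumptions: the function name `bacon_numbers` is hypothetical and the cast data is a made-up toy example, not Wikidata.

```python
from collections import deque


def bacon_numbers(casts, source):
    """Breadth-first search over the co-star graph implied by `casts`.

    `casts` maps a title to the set of its cast members; two actors are
    linked when they share at least one title.
    """
    # Build the adjacency map: actor -> set of co-stars.
    costars = {}
    for members in casts.values():
        for actor in members:
            costars.setdefault(actor, set()).update(members - {actor})

    distance = {source: 0}
    queue = deque([source])
    while queue:
        actor = queue.popleft()
        for other in costars.get(actor, ()):
            if other not in distance:  # first visit gives the shortest distance
                distance[other] = distance[actor] + 1
                queue.append(other)
    return distance


# Toy data with made-up names, purely for illustration.
toy_casts = {
    "Film A": {"Bacon", "Alice"},
    "Film B": {"Alice", "Bob"},
    "Film C": {"Bob", "Carol", "Alice"},
}
numbers = bacon_numbers(toy_casts, "Bacon")

# Keep only the members of a (toy) series cast whose number is exactly 2.
series_cast = {"Alice", "Bob", "Carol"}
kbn_2 = sorted(a for a in series_cast if numbers.get(a) == 2)
print(kbn_2)
```

In the toy data Alice appears with Bacon directly, so she has number 1 and is excluded even though she also co-stars with Bob and Carol. A query for number 2 has to remove the number-1 actors for the same reason.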

In [33]:
# write your queries
# getting the object/data properties of an arbitrary actress (Annie Ilonzeh) found previously
queryString = """
SELECT DISTINCT ?property ?label
WHERE
{
   wd:Q465556 ?property ?object.
   
   ?property sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('property', 'http://www.wikidata.org/prop/direct/P106'), ('label', 'occupation')]
[('property', 'http://www.wikidata.org/prop/direct/P1266'), ('label', 'AlloCiné person ID')]
[('property', 'http://www.wikidata.org/prop/direct/P172'), ('label', 'ethnic group')]
[('property', 'http://www.wikidata.org/prop/direct/P19'), ('label', 'place of birth')]
[('property', 'http://www.wikidata.org/prop/direct/P2002'), ('label', 'Twitter username')]
[('property', 'http://www.wikidata.org/prop/direct/P2003'), ('label', 'Instagram username')]
[('property', 'http://www.wikidata.org/prop/direct/P2013'), ('label', 'Facebook ID')]
[('property', 'http://www.wikidata.org/prop/direct/P2019'), ('label', 'AllMovie person ID')]
[('property', 'http://www.wikidata.org/prop/direct/P2031'), ('label', 'work period (start)')]
[('property', 'http://www.wikidata.org/prop/direct/P21'), ('label', 'sex or gender')]
[('property', 'http://www.wikidata.org/prop/direct/P214'), ('label', 'VIAF ID')]
[('property', 'htt

In [34]:
# checking 'occupation' object property of the arbitrary actress (Annie Ilonzeh)
queryString = """
SELECT ?object ?label
WHERE
{
   wd:Q465556 wdt:P106 ?object.
   
   ?object sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('object', 'http://www.wikidata.org/entity/Q10798782'), ('label', 'television actor')]
[('object', 'http://www.wikidata.org/entity/Q10800557'), ('label', 'film actor')]
[('object', 'http://www.wikidata.org/entity/Q33999'), ('label', 'actor')]
3


In [35]:
# finding the actor whose name is 'Kevin Bacon'
queryString = """
SELECT DISTINCT ?actor ?label 
WHERE
{
   ?actor wdt:P106 wd:Q33999.
   
   FILTER REGEX(?label, "Kevin Bacon")
   
   ?actor sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('actor', 'http://www.wikidata.org/entity/Q3454165'), ('label', 'Kevin Bacon')]
1


In [36]:
# finding the movies or series in which KB worked directly with a cast member of HIMYM (these cast members have KBN=1)
queryString = """
SELECT ?cast_member_of_himym ?label_actor ?label_film_or_series
WHERE
{
    wd:Q147235 wdt:P161 ?cast_member_of_himym.
    
    ?cast_member_of_himym sc:name ?label_actor.
    
    FILTER (?cast_member_of_himym = ?cast_member_of_bacon_movie)
    {
        SELECT ?cast_member_of_bacon_movie ?label_film_or_series
        WHERE
        {
            ?film_or_series wdt:P161 ?cast_member_of_bacon_movie.
            {
                SELECT DISTINCT ?film_or_series ?label_film_or_series
                WHERE
                {
                   ?film_or_series wdt:P161 wd:Q3454165.

                   ?film_or_series sc:name ?label_film_or_series.
                }
            }
        }
    }
}
"""

print("Results")
x=run_query(queryString)

Results
[('cast_member_of_himym', 'http://www.wikidata.org/entity/Q269891'), ('label_actor', 'Julianna Guill'), ('label_film_or_series', 'Crazy, Stupid, Love.')]
[('cast_member_of_himym', 'http://www.wikidata.org/entity/Q312705'), ('label_actor', 'John Cho'), ('label_film_or_series', 'The Air I Breathe')]
[('cast_member_of_himym', 'http://www.wikidata.org/entity/Q234715'), ('label_actor', 'Jamie-Lynn Sigler'), ('label_film_or_series', 'Skum Rocks!')]
[('cast_member_of_himym', 'http://www.wikidata.org/entity/Q311271'), ('label_actor', 'John Lithgow'), ('label_film_or_series', 'Footloose')]
[('cast_member_of_himym', 'http://www.wikidata.org/entity/Q234137'), ('label_actor', 'Megan Mullally'), ('label_film_or_series', 'Queens Logic')]
[('cast_member_of_himym', 'http://www.wikidata.org/entity/Q530646'), ('label_actor', 'Ray Wise'), ('label_film_or_series', 'X-Men: First Class')]
[('cast_member_of_himym', 'http://www.wikidata.org/entity/Q199929'), ('label_actor', 'Jennifer Morrison'), ('lab

Final query for this task

In [37]:
# write your final query
queryString = """
SELECT DISTINCT ?cast_member_with_kbn_2 ?label
WHERE
{
    {
        ### among the cast members of all films in which HIMYM cast members with KBN=1 have acted, ###
        ### select those that are also in the cast of HIMYM ###

        ?film_or_series_2 wdt:P161 ?cast_member_with_kbn_2.

        FILTER (?cast_member_with_kbn_2 = ?cast_member_of_himym)

        ?cast_member_with_kbn_2 sc:name ?label.
        {
            SELECT DISTINCT ?film_or_series_2 ?cast_member_of_himym
            WHERE
            {
                ### find all films in which HIMYM cast members with KBN=1 have acted ###
                ?film_or_series_2 wdt:P161 ?cast_member_of_himym.
                {
                    SELECT DISTINCT ?cast_member_of_himym 
                    WHERE
                    {
                        ### select those cast members of KB movies that are also cast of HIMYM ###
                        wd:Q147235 wdt:P161 ?cast_member_of_himym.

                        FILTER (?cast_member_of_himym = ?cast_member_of_bacon_movie)
                        {
                            SELECT DISTINCT ?cast_member_of_bacon_movie 
                            WHERE
                            {
                                ### find the cast members (KBN=1) of all films that KB acted in ###
                                ?film_or_series wdt:P161 ?cast_member_of_bacon_movie.
                                {
                                    SELECT DISTINCT ?film_or_series 
                                    WHERE
                                    {
                                       ### find all films that KB acted in ###
                                       ?film_or_series wdt:P161 wd:Q3454165.
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    } 
    ### exclude actors with KBN=1 from the result, since they also co-star with one another ###
    MINUS
    {
        SELECT DISTINCT ?cast_member_of_himym 
        WHERE
        {
            ### select those cast members of KB movies that are also cast of HIMYM ###
            wd:Q147235 wdt:P161 ?cast_member_of_himym.

            FILTER (?cast_member_of_himym = ?cast_member_of_bacon_movie)
            {
                SELECT DISTINCT ?cast_member_of_bacon_movie 
                WHERE
                {
                    ### find the cast members (KBN=1) of all films that KB acted in ###
                    ?film_or_series wdt:P161 ?cast_member_of_bacon_movie.
                    {
                        SELECT DISTINCT ?film_or_series 
                        WHERE
                        {
                           ### find all films that KB acted in ###
                           ?film_or_series wdt:P161 wd:Q3454165.
                        }
                    }
                }
            }
        }
    }
}
"""

print("Results")
x=run_query(queryString)

Results
[('cast_member_with_kbn_2', 'http://www.wikidata.org/entity/Q5357354'), ('label', 'Todd Stashwick')]
1


## Task 7
Consider the actors who are members of the cast of HIMYM. Among the TV series in which these actors acted, return only those which received more than 2 awards (the result set must be triples of TV series IRI, label, #awards won).
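
The task boils down to counting distinct awards per series and keeping the series whose count is strictly greater than 2. A minimal sketch of that count-then-filter step in plain Python is shown below. The function name `series_with_more_awards_than` and the toy `(series, award)` pairs are assumptions made for illustration, not data from Wikidata.

```python
from collections import defaultdict


def series_with_more_awards_than(pairs, threshold):
    """Count distinct awards per series and keep counts above `threshold`."""
    awards = defaultdict(set)
    for series, award in pairs:
        awards[series].add(award)  # a set ignores repeated (series, award) pairs
    kept = {s: len(a) for s, a in awards.items() if len(a) > threshold}
    # Highest count first; ties are broken by series name.
    return sorted(kept.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy pairs with made-up names; "Award 3" is repeated on purpose.
toy_pairs = [
    ("Series X", "Award 1"), ("Series X", "Award 2"),
    ("Series X", "Award 3"), ("Series X", "Award 3"),
    ("Series Y", "Award 1"), ("Series Y", "Award 2"),
    ("Series Z", "Award 1"),
]
result = series_with_more_awards_than(toy_pairs, 2)
print(result)
```

Series Y has exactly 2 awards and is dropped, which is the difference between "more than 2" and "at least 2". The repeated pair for Series X shows why the count is taken over distinct awards: the same series can be matched once for every cast member it shares with HIMYM.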

In [38]:
# write your queries
# which class is HIMYM an 'instance of'? (s: HIMYM, p: 'instance of', o: ?)
queryString = """
SELECT ?object ?label
WHERE
{
   wd:Q147235 wdt:P31 ?object.
   
   ?object sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('object', 'http://www.wikidata.org/entity/Q5398426'), ('label', 'television series')]
1


In [39]:
# getting only the television series that an arbitrary actor (Josh Radnor) has played in
queryString = """
SELECT DISTINCT ?series ?label 
WHERE
{
   ?series wdt:P161 wd:Q223455.
   ?series wdt:P31 wd:Q5398426.
   
   ?series sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('series', 'http://www.wikidata.org/entity/Q147235'), ('label', 'How I Met Your Mother')]
[('series', 'http://www.wikidata.org/entity/Q30607098'), ('label', 'Rise')]
[('series', 'http://www.wikidata.org/entity/Q22074985'), ('label', 'Mercy Street')]
3


In [40]:
# getting number of awards of the node 'HIMYM'
queryString = """
SELECT wd:Q147235 ?label (COUNT (?award) AS ?num_award)
WHERE
{
   wd:Q147235 wdt:P166 ?award.
   
   wd:Q147235 sc:name ?label.
}
"""

print("Results")
x=run_query(queryString)

Results
[('callret-0', 'http://www.wikidata.org/entity/Q147235'), ('label', 'How I Met Your Mother'), ('num_award', '7')]
1


Final query for this task

In [41]:
# write your final query
queryString = """
SELECT ?series_iri ?label ?num_award
WHERE
{
    FILTER (?num_award > "2"^^xsd:integer).
    {
        SELECT ?series_iri ?label (COUNT (DISTINCT ?award) AS ?num_award)
        WHERE
        {
           wd:Q147235 wdt:P161 ?actor_iri.
           ?series_iri wdt:P161 ?actor_iri.
           ?series_iri wdt:P31 wd:Q5398426.
           ?series_iri wdt:P166 ?award.

           ?series_iri sc:name ?label.
        }
        GROUP BY ?series_iri ?label
        ORDER BY DESC(?num_award)
    }
}
"""

print("Results")
x=run_query(queryString)

Results
[('series_iri', 'http://www.wikidata.org/entity/Q23628'), ('label', 'The Sopranos'), ('num_award', '22')]
[('series_iri', 'http://www.wikidata.org/entity/Q1079'), ('label', 'Breaking Bad'), ('num_award', '21')]
[('series_iri', 'http://www.wikidata.org/entity/Q1132439'), ('label', 'The Practice'), ('num_award', '12')]
[('series_iri', 'http://www.wikidata.org/entity/Q16756'), ('label', 'Modern Family'), ('num_award', '11')]
[('series_iri', 'http://www.wikidata.org/entity/Q79784'), ('label', 'Friends'), ('num_award', '10')]
[('series_iri', 'http://www.wikidata.org/entity/Q1136370'), ('label', 'General Hospital'), ('num_award', '10')]
[('series_iri', 'http://www.wikidata.org/entity/Q8539'), ('label', 'The Big Bang Theory'), ('num_award', '10')]
[('series_iri', 'http://www.wikidata.org/entity/Q244803'), ('label', 'Ally McBeal'), ('num_award', '9')]
[('series_iri', 'http://www.wikidata.org/entity/Q438406'), ('label', "Grey's Anatomy"), ('num_award', '8')]
[('series_iri', 'http://www.