# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query is related to the information obtained previously.

An exploratory workload starts with a usually vague, open-ended question, and does not assume the person issuing the workload has a clear understanding of the data contained in the target database or its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.
    
    

In [2]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-6c184b625b-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# SELECT and CONSTRUCT queries
def run_query(queryString, verbose=True):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        results = sparql.query()
        json_results = results.convert()
        if len(json_results['results']['bindings'])==0:
            print("Empty")
            return []
        array = []
        for bindings in json_results['results']['bindings']:
            app =  [ (var, value['value'])  for var, value in bindings.items() ] 
            if verbose:
                print( app)
            array.append(app)
        if verbose:
            print(len(array))
        return array

    except Exception as e:
        print("The operation failed", e)
        return []  # keep the return type consistent with the empty-result case
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
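
To make the shape of the values returned by `run_query` concrete, the following sketch applies the same flattening to a hand-written response in the SPARQL 1.1 JSON results layout, with no endpoint involved. The helper name `flatten_bindings` and the sample response are assumptions made for illustration; they are not part of the original setup.

```python
# Hypothetical helper (not in the original notebook): mirrors the loop inside
# run_query, but works on an already-parsed JSON response.
def flatten_bindings(json_results):
    """Turn SPARQL JSON results into a list of rows of (variable, value) tuples."""
    rows = []
    for bindings in json_results["results"]["bindings"]:
        rows.append([(var, value["value"]) for var, value in bindings.items()])
    return rows


# Hand-written sample in the SPARQL JSON results layout (illustrative data).
sample_response = {
    "head": {"vars": ["obj", "name"]},
    "results": {
        "bindings": [
            {
                "obj": {"type": "uri", "value": "http://www.wikidata.org/entity/Q5398426"},
                "name": {"type": "literal", "value": "television series"},
            }
        ]
    },
}

sample_rows = flatten_bindings(sample_response)
print(sample_rows)
```

Each row is a list of `(variable, value)` tuples, so `dict(row)` gives direct access by variable name.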

# Movie Workflow Series ("TV series Without a Trace explorative search")


Consider the following exploratory scenario:


> We are interested in the TV series "Without a Trace" and we want to investigate the main aspects related to the actors and directors involved in the production, know the number of seasons, and check which episodes had the highest success/impact.


## Useful URIs for the current workflow
The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P106`    | profession    | predicate | 
| `wdt:P279`    | subclass      | predicate |
| `wdt:P4969`    | derivative work      | predicate |
| `wd:Q826477` | Without a Trace    | node |
| `wd:Q733960` | Cold Case       | node |



Also consider

```
wd:Q826477 ?p ?obj .
```

is the BGP to retrieve all **properties of Without a Trace**.

Please consider that when you return a resource, you should return the IRI and the label of the resource. In particular, when the task requires you to identify a BGP, the result set must always be a list of IRI-label pairs.
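
The IRI-label requirement can be checked mechanically on the rows returned by `run_query`. The sketch below is one possible way to do it; the helper name `to_iri_label_pairs` and its arguments are hypothetical, and the sample row only imitates the output format used in this notebook.

```python
def to_iri_label_pairs(rows, iri_var, label_var):
    """Extract (IRI, label) pairs from rows shaped like run_query's output."""
    pairs = []
    for row in rows:
        record = dict(row)  # each row is a list of (variable, value) tuples
        iri = record[iri_var]
        # Assumption: every resource of interest lives under the Wikidata domain.
        if not iri.startswith("http://www.wikidata.org/"):
            raise ValueError(f"{iri_var} is not a Wikidata IRI: {iri!r}")
        pairs.append((iri, record[label_var]))
    return pairs


rows = [[("p", "http://www.wikidata.org/prop/direct/P161"), ("name", "cast member")]]
pairs = to_iri_label_pairs(rows, "p", "name")
print(pairs)
```

A row that lacks either variable raises a `KeyError`, which flags a query that forgot to return the label.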



The workload should

1. Identify the BGP for television series

2. Return the number of seasons and episodes per season of the TV series (the result set must be triples of season IRI, label and #episodes).

3. Get the number of episodes in which the cast members played a role. Who are the most present actors? (the result set must be a list of triples actor/actress IRI, label and #episodes)

4. Check which actor acted in the most films while working on "Without a Trace" (the result set must be a list of triples actor/actress IRI, label and #films).

5. Compare Without a Trace with the TV series "Cold Case" in terms of number of seasons, episodes and cast members (the result set must be two elements, one for each TV series, of TV series IRI, label, #seasons, #episodes and #cast members).

6. Return the actors who are members of the cast of Without a Trace and have a [Kevin Bacon number](https://en.wikipedia.org/wiki/Six_Degrees_of_Kevin_Bacon#:~:text=Kevin%20Bacon%20himself%20has%20a,Bacon%20number%20is%20N%2B1.) equal to 2 (the result set must be a list of pairs actor/actress IRI and label).

7. Consider the actors who are members of the cast of Cold Case. Among the TV series in which these actors acted, return only those which received more than 2 awards (the result set must be triples of TV series IRI, label, #awards won).

## Task 1
Identify the BGP for television series

In [2]:
# query example
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q826477 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1113'), ('name', 'number of episodes')]
[('p', 'http://www.wikidata.org/prop/direct/P1258'), ('name', 'Rotten Tomatoes ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1267'), ('name', 'AlloCiné series ID')]
[('p', 'http://www.wikidata.org/prop/direct/P136'), ('name', 'genre')]
[('p', 'http://www.wikidata.org/prop/direct/P1476'), ('name', 'title')]
[('p', 'http://www.wikidata.org/prop/direct/P154'), ('name', 'logo image')]
[('p', 'http://www.wikidata.org/prop/direct/P1562'), ('name', 'AllMovie title ID')]
[('p', 'http://www.wikidata.org/prop/direct/P161'), ('name', 'cast member')]
[('p', 'http://www.wikidata.org/prop/direct/P170'), ('name', 'creator')]
[('p', 'http://www.wikidata.org/prop/direct/P1712'), ('name', 'Metacritic ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1811'), ('name', 'list of episodes')]
[('p', 'http://www.wikidata.org/prop/direct/P2047'), ('name', 'duration')]
[('p', 'http://www.wikidata.org/prop/direct/P2

Final query for this task

In [7]:
# write your final query
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q826477 wdt:P31 ?obj .
   # get the label
   ?obj sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q5398426'), ('name', 'television series')]
1


## Task 2
Return the number of seasons and episodes per season of the TV series (the result set must be triples of season IRI, label and #episodes).

In [11]:
# write your queries
#exploring the list of episodes object
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q826477 wdt:P1811 ?obj .
   # get the label
   ?obj sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q282789'), ('name', 'list of Without a Trace episodes')]
1


In [12]:
#not going anywhere but we've still retrieved the BGP of episode
queryString = """
SELECT DISTINCT ?p ?namep ?obj ?name
WHERE {
   # bind something
   wd:Q282789 ?p ?obj .
   # get the label
   ?p sc:name ?namep .
   ?obj sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P360'), ('namep', 'is a list of'), ('obj', 'http://www.wikidata.org/entity/Q1983062'), ('name', 'episode')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('namep', 'instance of'), ('obj', 'http://www.wikidata.org/entity/Q13406463'), ('name', 'Wikimedia list article')]
[('p', 'http://www.wikidata.org/prop/direct/P360'), ('namep', 'is a list of'), ('obj', 'http://www.wikidata.org/entity/Q21191270'), ('name', 'television series episode')]
3


In [21]:
#looking at what the property "number of seasons" is doing. We would like to retrieve the object season. Maybe using the episode object from before
queryString = """
SELECT DISTINCT ?obj 
WHERE {
   # bind something
   wd:Q826477 wdt:P2437 ?obj .
   # get the label
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('obj', '7')]
1


In [7]:
#looking at the properties of episode
queryString = """
SELECT DISTINCT ?p ?namep ?obj ?name
WHERE {
   # bind something
   wd:Q1983062 ?p ?obj .
   # get the label
   ?p sc:name ?namep .
   ?obj sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1687'), ('namep', 'Wikidata property'), ('obj', 'http://www.wikidata.org/entity/P1113'), ('name', 'number of episodes')]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('namep', 'part of'), ('obj', 'http://www.wikidata.org/entity/Q15709880'), ('name', 'serial')]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('namep', 'part of'), ('obj', 'http://www.wikidata.org/entity/Q526877'), ('name', 'web series')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('namep', 'different from'), ('obj', 'http://www.wikidata.org/entity/Q21191270'), ('name', 'television series episode')]
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('namep', 'described by source'), ('obj', 'http://www.wikidata.org/entity/Q2657718'), ('name', 'Armenian Soviet Encyclopedia')]
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('namep', 'described by source'), ('obj', 'http://www.wikidata.org/entity/Q63284758'), ('name', 'Desktop Encyclopedic Dict

In [25]:
#maybe trying the other way around
queryString = """
SELECT DISTINCT ?p ?namep
WHERE {
   # bind something
   ?s ?p wd:Q1983062 .
   # get the label
   ?p sc:name ?namep .
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('namep', 'different from')]
[('p', 'http://www.wikidata.org/prop/direct/P2670'), ('namep', 'has parts of the class')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('namep', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('namep', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P360'), ('namep', 'is a list of')]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('namep', 'part of')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('namep', 'has part')]
[('p', 'http://www.wikidata.org/prop/direct/P921'), ('namep', 'main subject')]
[('p', 'http://www.wikidata.org/prop/direct/P971'), ('namep', 'category combines topics')]
[('p', 'http://www.wikidata.org/prop/direct/P1629'), ('namep', 'Wikidata item of this property')]
10


In [32]:
#trying to see which objects can contain episodes
#not season apparently.
queryString = """
SELECT DISTINCT ?s ?name
WHERE {
   # bind something
   ?s wdt:P527 wd:Q1983062 ;
       sc:name ?name .
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q15709880'), ('name', 'serial')]
[('s', 'http://www.wikidata.org/entity/Q838795'), ('name', 'comic strip')]
2


In [5]:
#wait, we obtained the BGP of "television series" in the previous task; why haven't we used it?
queryString = """
SELECT DISTINCT ?p ?namep ?obj ?name
WHERE {
   # bind something
   wd:Q5398426 ?p ?obj .
   # get the label
   ?p sc:name ?namep .
   ?obj sc:name ?name.
}
LIMIT 40
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1963'), ('namep', 'properties for this type'), ('obj', 'http://www.wikidata.org/entity/P1113'), ('name', 'number of episodes')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('namep', 'subclass of'), ('obj', 'http://www.wikidata.org/entity/Q15416'), ('name', 'television program')]
[('p', 'http://www.wikidata.org/prop/direct/P1963'), ('namep', 'properties for this type'), ('obj', 'http://www.wikidata.org/entity/P580'), ('name', 'start time')]
[('p', 'http://www.wikidata.org/prop/direct/P1963'), ('namep', 'properties for this type'), ('obj', 'http://www.wikidata.org/entity/P582'), ('name', 'end time')]
[('p', 'http://www.wikidata.org/prop/direct/P1963'), ('namep', 'properties for this type'), ('obj', 'http://www.wikidata.org/entity/P1476'), ('name', 'title')]
[('p', 'http://www.wikidata.org/prop/direct/P5869'), ('namep', 'model item'), ('obj', 'http://www.wikidata.org/entity/Q131758'), ('name', 'Desperate Housewives')]
[('p', 'ht

In [37]:
#we obtained a lot of BGP that will surely be useful later and among them we retrieved television series season.
queryString = """
SELECT DISTINCT ?p ?namep ?obj ?name
WHERE {
   # bind something
   wd:Q3464665 ?p ?obj .
   # get the label
   ?p sc:name ?namep .
   ?obj sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1963'), ('namep', 'properties for this type'), ('obj', 'http://www.wikidata.org/entity/P1113'), ('name', 'number of episodes')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('namep', 'subclass of'), ('obj', 'http://www.wikidata.org/entity/Q70990126'), ('name', 'creative work season')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('namep', 'subclass of'), ('obj', 'http://www.wikidata.org/entity/Q15416'), ('name', 'television program')]
[('p', 'http://www.wikidata.org/prop/direct/P1963'), ('namep', 'properties for this type'), ('obj', 'http://www.wikidata.org/entity/P580'), ('name', 'start time')]
[('p', 'http://www.wikidata.org/prop/direct/P1963'), ('namep', 'properties for this type'), ('obj', 'http://www.wikidata.org/entity/P582'), ('name', 'end time')]
[('p', 'http://www.wikidata.org/prop/direct/P2670'), ('namep', 'has parts of the class'), ('obj', 'http://www.wikidata.org/entity/Q21191270'), ('name', 'television seri

In [17]:
#maybe we can try to look at the properties of Game of Thrones, season 2 and see how it is linked to the series.
queryString = """
SELECT DISTINCT ?p ?name ?obj ?nameo
WHERE {
   # bind something
   wd:Q302358 ?p ?obj .
   
   # get the label
   ?p sc:name ?name .
   ?obj sc:name ?nameo .
}
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P136'), ('name', 'genre'), ('obj', 'http://www.wikidata.org/entity/Q98526239'), ('nameo', 'television series based on a novel')]
[('p', 'http://www.wikidata.org/prop/direct/P449'), ('name', 'original broadcaster'), ('obj', 'http://www.wikidata.org/entity/Q23633'), ('nameo', 'HBO')]
[('p', 'http://www.wikidata.org/prop/direct/P364'), ('name', 'original language of film or TV show'), ('obj', 'http://www.wikidata.org/entity/Q1860'), ('nameo', 'English')]
[('p', 'http://www.wikidata.org/prop/direct/P495'), ('name', 'country of origin'), ('obj', 'http://www.wikidata.org/entity/Q30'), ('nameo', 'United States of America')]
[('p', 'http://www.wikidata.org/prop/direct/P437'), ('name', 'distribution format'), ('obj', 'http://www.wikidata.org/entity/Q723685'), ('nameo', 'video on demand')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of'), ('obj', 'http://www.wikidata.org/entity/Q3464665'), ('nameo', 'television series 

In [10]:
#we found the property "part of the series"
queryString = """
SELECT DISTINCT ?season ?name 
WHERE {
   ?season wdt:P31 wd:Q3464665 ;
       wdt:P179 wd:Q826477 ;
       sc:name ?name .
}
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('season', 'http://www.wikidata.org/entity/Q1120248'), ('name', 'Without a Trace, season 1')]
[('season', 'http://www.wikidata.org/entity/Q3729810'), ('name', 'Without a Trace, season 4')]
[('season', 'http://www.wikidata.org/entity/Q3729811'), ('name', 'Without a Trace, season 5')]
[('season', 'http://www.wikidata.org/entity/Q3729812'), ('name', 'Without a Trace, season 2')]
[('season', 'http://www.wikidata.org/entity/Q3729815'), ('name', 'Without a Trace, season 6')]
[('season', 'http://www.wikidata.org/entity/Q3729816'), ('name', 'Without a Trace, season 7')]
[('season', 'http://www.wikidata.org/entity/Q3729817'), ('name', 'Without a Trace, season 3')]
7


Final query for this task

In [11]:
# write your final query
queryString = """
SELECT DISTINCT ?season ?name ?episodes
WHERE {
   ?season wdt:P31 wd:Q3464665 ;
       wdt:P179 wd:Q826477 ;
       wdt:P1113 ?episodes ;
       sc:name ?name .
}
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('season', 'http://www.wikidata.org/entity/Q1120248'), ('name', 'Without a Trace, season 1'), ('episodes', '23')]
[('season', 'http://www.wikidata.org/entity/Q3729810'), ('name', 'Without a Trace, season 4'), ('episodes', '24')]
[('season', 'http://www.wikidata.org/entity/Q3729811'), ('name', 'Without a Trace, season 5'), ('episodes', '24')]
[('season', 'http://www.wikidata.org/entity/Q3729812'), ('name', 'Without a Trace, season 2'), ('episodes', '24')]
[('season', 'http://www.wikidata.org/entity/Q3729815'), ('name', 'Without a Trace, season 6'), ('episodes', '18')]
[('season', 'http://www.wikidata.org/entity/Q3729816'), ('name', 'Without a Trace, season 7'), ('episodes', '24')]
[('season', 'http://www.wikidata.org/entity/Q3729817'), ('name', 'Without a Trace, season 3'), ('episodes', '23')]
7
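
As a quick sanity check, the per-season counts above can be put in season order and summed in plain Python. The sketch below hard-codes the seven rows printed above rather than querying the endpoint; the variable and function names are illustrative.

```python
# Rows copied from the output above, reduced to (label, #episodes).
season_rows = [
    ("Without a Trace, season 1", 23),
    ("Without a Trace, season 4", 24),
    ("Without a Trace, season 5", 24),
    ("Without a Trace, season 2", 24),
    ("Without a Trace, season 6", 18),
    ("Without a Trace, season 7", 24),
    ("Without a Trace, season 3", 23),
]


def season_number(label):
    # Labels end in "season N", so the last token is the season number.
    return int(label.rsplit(" ", 1)[1])


ordered = sorted(season_rows, key=lambda row: season_number(row[0]))
total_episodes = sum(count for _, count in ordered)

for label, count in ordered:
    print(label, count)
print("seasons:", len(ordered), "episodes:", total_episodes)
```

The seven seasons match the value returned earlier by `wdt:P2437`, and the per-season counts add up to 160 episodes.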


## Task 3
Get the number of episodes in which the cast members played a role. Who are the most present actors? (the result set must be a list of triples actor/actress IRI, label and #episodes)

In [12]:
# write your queries
#getting the cast members of the series.
#strange that there are only two; maybe the data is not complete
queryString = """
SELECT DISTINCT ?member ?name
WHERE {
   # bind something
   wd:Q826477 wdt:P161 ?member.
   # get the label
   ?member sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('member', 'http://www.wikidata.org/entity/Q308124'), ('name', 'Anthony LaPaglia')]
[('member', 'http://www.wikidata.org/entity/Q235075'), ('name', 'Poppy Montgomery')]
2


In [21]:
#now we're gonna try to retrieve the episodes of WaT
queryString = """
SELECT DISTINCT  ?episode ?name
WHERE {
   ?episode wdt:P31 wd:Q21191270 ; 
            wdt:P179 wd:Q826477 ;
            sc:name ?name .
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('episode', 'http://www.wikidata.org/entity/Q52667904'), ('name', 'Maple Street')]
[('episode', 'http://www.wikidata.org/entity/Q52667908'), ('name', 'Underground Railroad')]
[('episode', 'http://www.wikidata.org/entity/Q52667910'), ('name', 'Hang On to Me')]
[('episode', 'http://www.wikidata.org/entity/Q52667965'), ('name', 'Our Sons and Daughters')]
[('episode', 'http://www.wikidata.org/entity/Q52667968'), ('name', 'A Tree Falls')]
[('episode', 'http://www.wikidata.org/entity/Q52667970'), ('name', 'Trip Box')]
[('episode', 'http://www.wikidata.org/entity/Q52667973'), ('name', 'Moving On')]
[('episode', 'http://www.wikidata.org/entity/Q52667986'), ('name', 'Life Rules')]
[('episode', 'http://www.wikidata.org/entity/Q52667989'), ('name', 'The Line')]
[('episode', 'http://www.wikidata.org/entity/Q52667991'), ('name', 'Wannabe')]
[('episode', 'http://www.wikidata.org/entity/Q52667993'), ('name', 'Risen')]
[('episode', 'http://www.wikidata.org/entity/Q52667996'), ('name', 'Gung H

Final query for this task

In [26]:
#okay now I have other actors, that's weird. And so there are 5 principal actors.
# write your final query
queryString = """
SELECT DISTINCT  ?cast ?name (COUNT(?episode) AS ?nbepisodes)
WHERE {
   ?episode wdt:P31 wd:Q21191270 ; 
            wdt:P179 wd:Q826477 ;
            wdt:P161 ?cast.
   ?cast sc:name ?name.
} GROUP BY ?cast ?name
ORDER BY DESC(?nbepisodes)
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('cast', 'http://www.wikidata.org/entity/Q308124'), ('name', 'Anthony LaPaglia'), ('nbepisodes', '160')]
[('cast', 'http://www.wikidata.org/entity/Q235075'), ('name', 'Poppy Montgomery'), ('nbepisodes', '160')]
[('cast', 'http://www.wikidata.org/entity/Q503040'), ('name', 'Eric Close'), ('nbepisodes', '160')]
[('cast', 'http://www.wikidata.org/entity/Q551608'), ('name', 'Enrique Murciano'), ('nbepisodes', '160')]
[('cast', 'http://www.wikidata.org/entity/Q430872'), ('name', 'Marianne Jean-Baptiste'), ('nbepisodes', '160')]
[('cast', 'http://www.wikidata.org/entity/Q459638'), ('name', 'Stacy Edwards'), ('nbepisodes', '1')]
[('cast', 'http://www.wikidata.org/entity/Q706513'), ('name', 'Charles S. Dutton'), ('nbepisodes', '1')]
[('cast', 'http://www.wikidata.org/entity/Q5363239'), ('name', 'Elizabeth Morehead'), ('nbepisodes', '1')]
[('cast', 'http://www.wikidata.org/entity/Q64684485'), ('name', 'Caris Vujcec'), ('nbepisodes', '1')]
[('cast', 'http://www.wikidata.org/entity/Q1721
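
Five actors share the highest count, so "the most present actors" is a tie rather than a single winner. A small post-processing step, sketched here on rows copied from the first lines of the output above, keeps every actor tied for first place. The function name `most_present` is hypothetical.

```python
def most_present(rows):
    """Return the rows whose episode count equals the maximum count."""
    top = max(count for _, _, count in rows)
    return [row for row in rows if row[2] == top]


# (IRI, label, #episodes), copied from the output above.
cast_rows = [
    ("http://www.wikidata.org/entity/Q308124", "Anthony LaPaglia", 160),
    ("http://www.wikidata.org/entity/Q235075", "Poppy Montgomery", 160),
    ("http://www.wikidata.org/entity/Q503040", "Eric Close", 160),
    ("http://www.wikidata.org/entity/Q551608", "Enrique Murciano", 160),
    ("http://www.wikidata.org/entity/Q430872", "Marianne Jean-Baptiste", 160),
    ("http://www.wikidata.org/entity/Q459638", "Stacy Edwards", 1),
    ("http://www.wikidata.org/entity/Q706513", "Charles S. Dutton", 1),
]

top_cast = most_present(cast_rows)
for iri, name, count in top_cast:
    print(iri, name, count)
```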

## Task 4
Check which actor acted in the most films while working on "Without a Trace" (the result set must be a list of triples actor/actress IRI, label and #films).

In [17]:
# write your queries
#We have to get the BGP of actor first, taking the occupations of Anthony LaPaglia for instance
queryString = """
SELECT DISTINCT  ?o ?nameo 
WHERE {
   # bind something
   wd:Q308124 wdt:P106 ?o.
   ?o sc:name ?nameo.    
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q10798782'), ('nameo', 'television actor')]
[('o', 'http://www.wikidata.org/entity/Q10800557'), ('nameo', 'film actor')]
[('o', 'http://www.wikidata.org/entity/Q2259451'), ('nameo', 'stage actor')]
[('o', 'http://www.wikidata.org/entity/Q3282637'), ('nameo', 'film producer')]
[('o', 'http://www.wikidata.org/entity/Q33999'), ('nameo', 'actor')]
[('o', 'http://www.wikidata.org/entity/Q937857'), ('nameo', 'association football player')]
6


In [14]:
#now we want to get the BGP of film
#maybe looking at the properties/objects of these properties of film actor.
#not working
queryString = """
SELECT DISTINCT ?p ?name ?o ?nameo
WHERE {
   # bind something
   wd:Q10800557 ?p ?o .
   ?p sc:name ?name. 
   ?o sc:name ?nameo .
   FILTER (!isLiteral(?o))
}
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of'), ('o', 'http://www.wikidata.org/entity/Q4220920'), ('nameo', 'filmmaking occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P910'), ('name', "topic's main category"), ('o', 'http://www.wikidata.org/entity/Q5479723'), ('nameo', 'Category:Film actors')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from'), ('o', 'http://www.wikidata.org/entity/Q10798782'), ('nameo', 'television actor')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of'), ('o', 'http://www.wikidata.org/entity/Q33999'), ('nameo', 'actor')]
[('p', 'http://www.wikidata.org/prop/direct/P1687'), ('name', 'Wikidata property'), ('o', 'http://www.wikidata.org/entity/P161'), ('nameo', 'cast member')]
5


In [18]:
#maybe just trying to find a film played by Anthony LaPaglia 
queryString = """
SELECT DISTINCT  ?s ?name
WHERE {
   # bind something
   ?s wdt:P161 wd:Q308124;
       sc:name ?name.    
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q255376'), ('name', 'The Client')]
[('s', 'http://www.wikidata.org/entity/Q21511509'), ('name', 'Annabelle: Creation')]
[('s', 'http://www.wikidata.org/entity/Q12126147'), ('name', 'Mental')]
[('s', 'http://www.wikidata.org/entity/Q3805682'), ('name', 'Jack the Dog')]
[('s', 'http://www.wikidata.org/entity/Q826477'), ('name', 'Without a Trace')]
[('s', 'http://www.wikidata.org/entity/Q1352085'), ('name', 'Analyze That')]
[('s', 'http://www.wikidata.org/entity/Q7728500'), ('name', 'The Custodian')]
[('s', 'http://www.wikidata.org/entity/Q16954197'), ('name', 'The Code')]
[('s', 'http://www.wikidata.org/entity/Q3549263'), ('name', 'Underground: The Julian Assange Story')]
[('s', 'http://www.wikidata.org/entity/Q1570302'), ('name', 'Lansky')]
[('s', 'http://www.wikidata.org/entity/Q1125349'), ('name', 'Autumn in New York')]
[('s', 'http://www.wikidata.org/entity/Q52667904'), ('name', 'Maple Street')]
[('s', 'http://www.wikidata.org/entity/Q52

In [30]:
#the only one I know in there and which I'm sure is a film is Annabelle. Let's use that
queryString = """
SELECT DISTINCT  ?obj ?name
WHERE {
   # bind something
   wd:Q21511509 wdt:P31 ?obj.
   ?obj sc:name ?name.    
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q11424'), ('name', 'film')]
1


In [29]:
#retrieving the properties linking an actor with a film
queryString = """
SELECT DISTINCT  ?p ?name
WHERE {
   # bind something
   ?actor wdt:P106 wd:Q33999.
   ?film wdt:P31 wd:Q11424.
   ?actor ?p ?film.
   ?p sc:name ?name. 
}
LIMIT 40
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P800'), ('name', 'notable work')]
[('p', 'http://www.wikidata.org/prop/direct/P1441'), ('name', 'present in work')]
[('p', 'http://www.wikidata.org/prop/direct/P1455'), ('name', 'list of works')]
[('p', 'http://www.wikidata.org/prop/direct/P161'), ('name', 'cast member')]
[('p', 'http://www.wikidata.org/prop/direct/P175'), ('name', 'performer')]
[('p', 'http://www.wikidata.org/prop/direct/P101'), ('name', 'field of work')]
[('p', 'http://www.wikidata.org/prop/direct/P1283'), ('name', 'filmography')]
[('p', 'http://www.wikidata.org/prop/direct/P27'), ('name', 'country of citizenship')]
[('p', 'http://www.wikidata.org/prop/direct/P725'), ('name', 'voice actor')]
[('p', 'http://www.wikidata.org/prop/direct/P2318'), ('name', 'debut participant')]
[('p', 'http://www.wikidata.org/prop/direct/P1344'), ('name', 'participant in')]
[('p', 'http://www.wikidata.org/prop/direct/P608'), ('name', 'exhibition history')]
[('p', 'http://www.wikidata.or

In [33]:
#that was my final query (with LIMIT 1), but after seeing that the property cast member on the series retrieves only part of the cast, I'm gonna adapt it
queryString = """
SELECT DISTINCT ?member ?name (COUNT(?film) AS ?nbfilms)
WHERE {
   # bind something
   wd:Q826477 wdt:P161 ?member.
   ?film wdt:P31 wd:Q11424;
        wdt:P161 ?member.
   ?member sc:name ?name.
} GROUP BY ?member ?name
ORDER BY DESC(?nbfilms)
LIMIT 5
"""

print("Results")
x=run_query(queryString)

Results
[('member', 'http://www.wikidata.org/entity/Q308124'), ('name', 'Anthony LaPaglia'), ('nbfilms', '47')]
[('member', 'http://www.wikidata.org/entity/Q235075'), ('name', 'Poppy Montgomery'), ('nbfilms', '5')]
2


Final query for this task

In [37]:
# write your final query
queryString = """
SELECT DISTINCT ?cast ?name (COUNT(?film) AS ?nbfilms)
WHERE {
    ?film wdt:P31 wd:Q11424;
        wdt:P161 ?member.
   ?member sc:name ?name.
    {
        SELECT DISTINCT ?cast WHERE{         #This subquery just to have the whole cast and not only two people
            ?episode wdt:P31 wd:Q21191270 ; 
                wdt:P179 wd:Q826477 ;
                wdt:P161 ?cast.
        }
    }
    FILTER(?member = ?cast)
} GROUP BY ?cast ?name
ORDER BY DESC(?nbfilms)
LIMIT 1
"""

print("Results")
x=run_query(queryString)

Results
[('cast', 'http://www.wikidata.org/entity/Q355163'), ('name', 'Bruce Davison'), ('nbfilms', '56')]
1
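
The query above counts every film of each cast member, whatever its release date. The task also says "while working on", so one possible refinement is to keep only films released during the run of the series. The sketch below is untested against the endpoint and rests on two assumptions: that `wd:Q826477` carries `wdt:P580` (start time) and `wdt:P582` (end time), which this notebook only shows as properties listed for the television series class, and that films carry `wdt:P577` (publication date), a property that would first have to be found through a query in the notebook. The variable name `dated_films_query` is illustrative.

```python
# Sketch only: not run against the endpoint, so no result is shown.
# wdt:P577 ("publication date") is an assumption, not found by a query above.
dated_films_query = """
SELECT ?cast ?name (COUNT(DISTINCT ?film) AS ?nbfilms)
WHERE {
    # whole cast, taken from the episodes as in the final query
    ?episode wdt:P31 wd:Q21191270 ;
             wdt:P179 wd:Q826477 ;
             wdt:P161 ?cast .
    ?cast sc:name ?name .

    # run of the series (assumed to be recorded on the series item)
    wd:Q826477 wdt:P580 ?start ;
               wdt:P582 ?end .

    # films of the same person released during that run
    ?film wdt:P31 wd:Q11424 ;
          wdt:P161 ?cast ;
          wdt:P577 ?released .
    FILTER(?released >= ?start && ?released <= ?end)
}
GROUP BY ?cast ?name
ORDER BY DESC(?nbfilms)
LIMIT 1
"""

# With the setup cell loaded, it could be executed as:
# x = run_query(dated_films_query)
print(dated_films_query)
```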


## Task 5
Compare Without a Trace with the TV series "Cold Case" in terms of number of seasons, episodes and cast members (the result set must be two elements, one for each TV series, of TV series IRI, label, #seasons, #episodes and #cast members).

In [47]:
# write your queries
#that was my final query before doing the task 3.   
queryString = """
SELECT DISTINCT ?serie ?name ?seasons ?episodes (COUNT(?cast) as ?nbactors)
WHERE {
   VALUES (?serie) {
   (wd:Q826477)
   (wd:Q733960)
   }
   ?serie wdt:P2437 ?seasons ;
       wdt:P1113 ?episodes;
       wdt:P161 ?cast ;
       sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('serie', 'http://www.wikidata.org/entity/Q733960'), ('name', 'Cold Case'), ('seasons', '7'), ('episodes', '156'), ('nbactors', '478')]
[('serie', 'http://www.wikidata.org/entity/Q826477'), ('name', 'Without a Trace'), ('seasons', '7'), ('episodes', '160'), ('nbactors', '2')]
2


Final query for this task

In [48]:
# write your final query
#But having only 2 cast members for WaT is still very weird, so I'm gonna retrieve all the people being cast members in an episode of the series instead.
queryString = """
SELECT DISTINCT ?serie2 ?name ?seasons ?episodes ?nbactors
WHERE {
   
   ?serie2 wdt:P2437 ?seasons ;
       wdt:P1113 ?episodes;
       sc:name ?name.
   {
        SELECT DISTINCT ?serie (COUNT(?cast) as ?nbactors) WHERE{         #This subquery just to have the whole cast and not only two people
            ?episode wdt:P31 wd:Q21191270 ; 
                wdt:P179 ?serie ;
                wdt:P161 ?cast.
            ?cast wdt:P106 wd:Q33999.
            FILTER (?serie IN (wd:Q826477, wd:Q733960))
        }
    }
    FILTER(?serie2 = ?serie)
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('serie2', 'http://www.wikidata.org/entity/Q733960'), ('name', 'Cold Case'), ('seasons', '7'), ('episodes', '156'), ('nbactors', '1296')]
[('serie2', 'http://www.wikidata.org/entity/Q826477'), ('name', 'Without a Trace'), ('seasons', '7'), ('episodes', '160'), ('nbactors', '821')]
2
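
Note that `COUNT(?cast)` counts one row per (episode, cast member) match, so an actor who appears in many episodes is counted many times. The toy example below, with made-up episode and actor identifiers, illustrates the difference between that count and a distinct count, which is what `COUNT(DISTINCT ?cast)` would compute in SPARQL.

```python
# Hypothetical (episode, cast member) matches, as the subquery would produce them.
matches = [
    ("episode_1", "actor_A"),
    ("episode_1", "actor_B"),
    ("episode_2", "actor_A"),
    ("episode_3", "actor_A"),
    ("episode_3", "actor_C"),
]

appearances = len(matches)                          # analogue of COUNT(?cast)
distinct_cast = len({cast for _, cast in matches})  # analogue of COUNT(DISTINCT ?cast)

print("appearances:", appearances)
print("distinct cast members:", distinct_cast)
```

Under this reading, the `nbactors` values above are numbers of appearances rather than numbers of distinct people.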


## Task 6
Return the actors who are members of the cast of Without a Trace and have a [Kevin Bacon number](https://en.wikipedia.org/wiki/Six_Degrees_of_Kevin_Bacon#:~:text=Kevin%20Bacon%20himself%20has%20a,Bacon%20number%20is%20N%2B1.) equal to 2 (the result set must be a list of pairs actor/actress IRI and label).

In [58]:
# write your queries
#we first need to retrieve the BGP of Kevin Bacon
queryString = """

SELECT DISTINCT ?kevBacon ?kevin
WHERE {
   ?kevBacon wdt:P106 wd:Q33999;
       sc:name ?kevin.
   FILTER (?kevin = "Kevin Bacon") .
}
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [61]:
#not working, let's see with actors that played with Anthony LaPaglia
queryString = """
SELECT DISTINCT ?actor ?name
WHERE {
   ?film wdt:P31 wd:Q11424 ;
       wdt:P161 wd:Q308124 ;
       wdt:P161 ?actor.
    ?actor sc:name ?name.
    FILTER REGEX (?name, "^Kevin")
}
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('actor', 'http://www.wikidata.org/entity/Q586857'), ('name', 'Kevin Conway')]
[('actor', 'http://www.wikidata.org/entity/Q1334725'), ('name', 'Kevin Corrigan')]
[('actor', 'http://www.wikidata.org/entity/Q3454165'), ('name', 'Kevin Bacon')]
[('actor', 'http://www.wikidata.org/entity/Q1740104'), ('name', 'Kevin Chamberlin')]
4


Final query for this task

In [6]:
# write your final query
queryString = """
SELECT DISTINCT ?WaTactor ?name
WHERE {
       ?episode wdt:P31 wd:Q21191270 ; #to have the whole cast
                wdt:P179 wd:Q826477 ;
                wdt:P161 ?WaTactor.
       ?film wdt:P161 ?WaTactor;
           wdt:P161 ?actor.
       ?WaTactor sc:name ?name .

    {
        SELECT DISTINCT ?bacon1
        WHERE {
           ?bacon1 wdt:P106 wd:Q33999.
           ?film wdt:P31 wd:Q11424 ;
               wdt:P161 ?bacon1 ;
               wdt:P161 wd:Q3454165.
        }
    } FILTER (?actor = ?bacon1) 
}
LIMIT 50
"""

print("Results")
x=run_query(queryString)


Results
[('WaTactor', 'http://www.wikidata.org/entity/Q139341'), ('name', 'Zach Grenier')]
[('WaTactor', 'http://www.wikidata.org/entity/Q373989'), ('name', 'David Paymer')]
[('WaTactor', 'http://www.wikidata.org/entity/Q3265830'), ('name', 'Lucille Soong')]
[('WaTactor', 'http://www.wikidata.org/entity/Q430872'), ('name', 'Marianne Jean-Baptiste')]
[('WaTactor', 'http://www.wikidata.org/entity/Q238501'), ('name', 'Alex Veadov')]
[('WaTactor', 'http://www.wikidata.org/entity/Q308124'), ('name', 'Anthony LaPaglia')]
[('WaTactor', 'http://www.wikidata.org/entity/Q355163'), ('name', 'Bruce Davison')]
[('WaTactor', 'http://www.wikidata.org/entity/Q706513'), ('name', 'Charles S. Dutton')]
[('WaTactor', 'http://www.wikidata.org/entity/Q1125651'), ('name', 'Thom Barry')]
[('WaTactor', 'http://www.wikidata.org/entity/Q235075'), ('name', 'Poppy Montgomery')]
[('WaTactor', 'http://www.wikidata.org/entity/Q22704603'), ('name', 'Lily Knight')]
[('WaTactor', 'http://www.wikidata.org/entity/Q445325'
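
For reference, the Bacon number is a shortest-path distance in the co-star graph: Kevin Bacon has number 0, his direct co-stars have number 1, and so on. The pattern in the query above matches every cast member within two co-star hops, so it can also return actors whose number is 1 (for instance Anthony LaPaglia, who shares a film with Kevin Bacon in the earlier output). A breadth-first search makes "exactly 2" explicit. The sketch below runs on a made-up toy filmography; the function name and the data are assumptions for illustration only.

```python
from collections import deque


def bacon_numbers(casts, source):
    """Breadth-first search over the co-star graph induced by film casts."""
    # Two actors are neighbours if they share at least one film.
    neighbours = {}
    for cast in casts.values():
        for actor in cast:
            neighbours.setdefault(actor, set()).update(a for a in cast if a != actor)

    distance = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for other in neighbours.get(current, ()):
            if other not in distance:  # first visit is the shortest path
                distance[other] = distance[current] + 1
                queue.append(other)
    return distance


# Toy data: film -> cast list (hypothetical names).
toy_casts = {
    "film_1": ["Kevin Bacon", "actor_A"],
    "film_2": ["actor_A", "actor_B"],
    "film_3": ["actor_B", "actor_C"],
}

numbers = bacon_numbers(toy_casts, "Kevin Bacon")
exactly_two = sorted(actor for actor, n in numbers.items() if n == 2)
print(exactly_two)
```

In SPARQL, the same restriction would amount to excluding Kevin Bacon himself and anyone who shares a film with him from the final result.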

## Task 7
Consider the actors who are members of the cast of Cold Case. Among the TV series in which these actors acted, return only those which received more than 2 awards (the result set must be triples of TV series IRI, label, #awards won).

In [3]:
# write your queries
#Getting the cast members of Cold Case
queryString = """
SELECT DISTINCT ?member ?name
WHERE {
   # bind something
   wd:Q733960 wdt:P161 ?member.
   # get the label
   ?member sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('member', 'http://www.wikidata.org/entity/Q270382'), ('name', 'Justina Machado')]
[('member', 'http://www.wikidata.org/entity/Q269891'), ('name', 'Julianna Guill')]
[('member', 'http://www.wikidata.org/entity/Q128821'), ('name', 'Stephen Full')]
[('member', 'http://www.wikidata.org/entity/Q1340897'), ('name', 'Martin Spanjers')]
[('member', 'http://www.wikidata.org/entity/Q275476'), ('name', 'Kay Lenz')]
[('member', 'http://www.wikidata.org/entity/Q441617'), ('name', 'Gabrielle Miller')]
[('member', 'http://www.wikidata.org/entity/Q449249'), ('name', 'Chelsea Hobbs')]
[('member', 'http://www.wikidata.org/entity/Q613378'), ('name', 'Bruno Campos')]
[('member', 'http://www.wikidata.org/entity/Q242206'), ('name', 'Bree Turner')]
[('member', 'http://www.wikidata.org/entity/Q258246'), ('name', 'Lois Smith')]
[('member', 'http://www.wikidata.org/entity/Q261926'), ('name', 'Muse Watson')]
[('member', 'http://www.wikidata.org/entity/Q266179'), ('name', 'Johnathon Schaech')]
[('member

Final query for this task

In [9]:
# write your final query
#the property award received was found somewhere during the task 4
queryString = """
SELECT DISTINCT ?serie2 ?name (COUNT(?award) AS ?nbawards) 
WHERE {
    ?serie2 wdt:P31 wd:Q5398426;
            wdt:P166 ?award;
            sc:name ?name.
    FILTER (?serie2 = ?serie) 
    {
        SELECT DISTINCT ?serie
        WHERE {
           wd:Q733960 wdt:P161 ?member .
           ?serie wdt:P31 wd:Q5398426 ;
                wdt:P161 ?member. 
        }
    } 
}GROUP BY ?serie2 ?name 
HAVING (COUNT(?award) > 2)
ORDER BY DESC (?nbawards)
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('serie2', 'http://www.wikidata.org/entity/Q1079'), ('name', 'Breaking Bad'), ('nbawards', '21')]
[('serie2', 'http://www.wikidata.org/entity/Q56153643'), ('name', 'Watchmen'), ('nbawards', '20')]
[('serie2', 'http://www.wikidata.org/entity/Q1132439'), ('name', 'The Practice'), ('nbawards', '12')]
[('serie2', 'http://www.wikidata.org/entity/Q16756'), ('name', 'Modern Family'), ('nbawards', '11')]
[('serie2', 'http://www.wikidata.org/entity/Q8539'), ('name', 'The Big Bang Theory'), ('nbawards', '10')]
[('serie2', 'http://www.wikidata.org/entity/Q244803'), ('name', 'Ally McBeal'), ('nbawards', '9')]
[('serie2', 'http://www.wikidata.org/entity/Q1145764'), ('name', 'Guiding Light'), ('nbawards', '9')]
[('serie2', 'http://www.wikidata.org/entity/Q30599007'), ('name', 'Succession'), ('nbawards', '8')]
[('serie2', 'http://www.wikidata.org/entity/Q438406'), ('name', "Grey's Anatomy"), ('nbawards', '8')]
[('serie2', 'http://www.wikidata.org/entity/Q1030713'), ('name', 'Another World'),