# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained previously.

An exploratory workload usually starts with a vague, open-ended question, and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that is present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP that returns the human-readable name of a property or a class in Wikidata.

In [12]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-03aeb85bfd-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString,verbose = True):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        results = sparql.query()
        json_results = results.convert()
        if len(json_results['results']['bindings'])==0:
            print("Empty")
            return []
        array = []
        for bindings in json_results['results']['bindings']:
            app =  [ (var, value['value'])  for var, value in bindings.items() ] 
            if verbose:
                print( app)
            array.append(app)
        if verbose:
            print(len(array))
        return array

    except Exception as e :
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
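
`run_query` returns each row as a list of `(variable, value)` tuples, an empty list when nothing matches, and `None` when the request fails. A small helper, sketched here under the assumption that rows keep that shape, turns them into dictionaries so later cells can read a column by name. `rows_to_dicts` is a hypothetical name and is not part of the original setup.

```python
def rows_to_dicts(rows):
    """Convert run_query-style rows into a list of dicts keyed by variable name."""
    # run_query returns None when the request fails and [] when nothing matches.
    if not rows:
        return []
    return [dict(row) for row in rows]


# Sample row in the same shape that run_query prints.
sample = [
    [("p", "http://www.wikidata.org/entity/Q907311"), ("getname", "Netflix")],
]
print(rows_to_dicts(sample)[0]["getname"])  # prints: Netflix
```

In the notebook this could be used as `rows_to_dicts(run_query(queryString, verbose=False))`.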

# Movie Workflow Series ("Production company explorative search") 

Consider the following exploratory information need:

> Investigate the main companies working on cinema-related content. We want to know which are the main television production companies and film production companies, which company distributes the most films, and some considerations about awards.

## Useful URIs for the current workflow
The following are given:

| IRI         | Description   | Role      |
| ----------- | ------------- | --------- |
| `wdt:P1647` | subproperty   | predicate |
| `wdt:P31`   | instance of   | predicate |
| `wdt:P106`  | profession    | predicate |
| `wdt:P279`  | subclass      | predicate |
| `wdt:P27`   | nationality   | predicate |
| `wd:Q5`     | Human         | node      |
| `wd:Q36479` | The Lion King | node      |





Also consider

```
wd:Q36479 ?p ?obj .
```

is the BGP to retrieve all **properties of The Lion King**

Please consider that when you return a resource, you should return both the IRI and the label of the resource. In particular, when the task requires you to identify a BGP, the result set must always be a list of IRI-label pairs.
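
The IRI-plus-label convention can be folded into a small query builder. The sketch below generalizes the `wd:Q36479 ?p ?obj .` pattern to any entity and attaches `sc:name` labels to both the property and the value. The function name `outgoing_properties_query` and the variable names are assumptions made for illustration.

```python
def outgoing_properties_query(entity, limit=20):
    """Build a SELECT query returning (property IRI, label, value IRI, label) rows."""
    return f"""
SELECT DISTINCT ?p ?pName ?obj ?objName
WHERE {{
   # outgoing edges of the entity
   {entity} ?p ?obj .
   # labels for the property and for the value
   ?p sc:name ?pName .
   ?obj sc:name ?objName .
}}
LIMIT {int(limit)}
"""


queryString = outgoing_properties_query("wd:Q36479")
print(queryString)
```

Because `?obj` must have an `sc:name`, properties whose values are plain literals (identifiers, dates) are dropped from the result; remove that triple pattern to see them as well.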

The workload should:


1. Identify the BGP for films

2. Identify the BGP for Netflix

3. Identify the BGP for television production company

4. Identify the BGP for film production company

5. Find the top-5 companies (amongst television production company and film production company) which produced the highest number of crime films (the result set must be a list of triples company IRI, label and #crime film).

6. Find the company (amongst television production company and film production company) which distributed more films (of any kind of film genre) that they did not produce (the result set must be a list of triples company IRI, label and #film).

7. Find how many companies are listed in every stock exchange market (the result set must be a list of triples stock exchange IRI, label and #companies). 

8. Identify the BGP for Academy Award

9. Find the companies (among television and film production companies) that won at least 5 Academy Awards for Best Actress for the movies they produced (the result set must be a list of triples company IRI, label and #awards).

## Task 1
Identify the BGP for films

In [32]:
# query example
queryString = """
SELECT DISTINCT ?p ?getname
WHERE {
   # bind something
   wd:Q36479 ?p ?obj .
   #get the label
   ?p sc:name ?getname.

}
LIMIT 20
"""

print("Results")
x=run_query(queryString)



Results
[('get', 'http://www.wikidata.org/prop/direct/P3092'), ('getname', 'film crew member')]
[('get', 'http://www.wikidata.org/prop/direct/P4021'), ('getname', 'danskefilm animated film ID')]
[('get', 'http://www.wikidata.org/prop/direct/P7679'), ('getname', 'Cinépolis KLIC ID')]
[('get', 'http://www.wikidata.org/prop/direct/P1015'), ('getname', 'NORAF ID')]
[('get', 'http://www.wikidata.org/prop/direct/P1040'), ('getname', 'film editor')]
[('get', 'http://www.wikidata.org/prop/direct/P1237'), ('getname', 'Box Office Mojo film ID (former scheme)')]
[('get', 'http://www.wikidata.org/prop/direct/P1258'), ('getname', 'Rotten Tomatoes ID')]
[('get', 'http://www.wikidata.org/prop/direct/P1265'), ('getname', 'AlloCiné film ID')]
[('get', 'http://www.wikidata.org/prop/direct/P1273'), ('getname', 'CANTIC ID')]
[('get', 'http://www.wikidata.org/prop/direct/P136'), ('getname', 'genre')]
[('get', 'http://www.wikidata.org/prop/direct/P1411'), ('getname', 'nominated for')]
[('get', 'http://www.w

Final query for this task

In [37]:
#[('get', 'http://www.wikidata.org/prop/direct/P136'), ('getname', 'genre')]
queryString = """
SELECT DISTINCT ?p ?getname
WHERE {
   # bind something
   wd:Q36479 wdt:P136 ?p.
   #get the label
   ?p sc:name ?getname.

}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/entity/Q2143665'), ('getname', "children's film")]
[('p', 'http://www.wikidata.org/entity/Q130232'), ('getname', 'drama')]
[('p', 'http://www.wikidata.org/entity/Q842256'), ('getname', 'musical film')]
[('p', 'http://www.wikidata.org/entity/Q28968511'), ('getname', 'traditionally animated film')]
4


In [93]:
#[('get', 'http://www.wikidata.org/prop/direct/P136'), ('getname', 'genre')]
#[('p', 'http://www.wikidata.org/entity/Q28968511'), ('getname', 'traditionally animated film')]
queryString = """
SELECT DISTINCT ?p ?getname
WHERE {
   # bind something
   wd:Q28968511 ?obj ?p.
   #get the label
   ?p sc:name ?getname.

}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/entity/Q17175676'), ('getname', 'animated cartoon')]
[('p', 'http://www.wikidata.org/entity/Q201658'), ('getname', 'film genre')]
[('p', 'http://www.wikidata.org/entity/Q11425'), ('getname', 'animation')]
[('p', 'http://www.wikidata.org/entity/Q202866'), ('getname', 'animated film')]
[('p', 'http://www.wikidata.org/entity/Q184781'), ('getname', 'traditional animation')]
[('p', 'http://www.wikidata.org/entity/Q6643876'), ('getname', 'list of traditionally animated feature films')]
6


In [94]:
#[('p', 'http://www.wikidata.org/entity/Q28968511'), ('getname', 'traditionally animated film')]
#[('p', 'http://www.wikidata.org/entity/Q842256'), ('getname', 'musical film')]
queryString = """
SELECT DISTINCT ?p ?getname
WHERE {
   # bind something
   wd:Q842256 ?p ?obj.
   #get the label
   ?p sc:name ?getname.

}
LIMIT 50
"""

print("Results")
x=run_query(queryString)


Results
[('p', 'http://www.wikidata.org/prop/direct/P1225'), ('getname', 'U.S. National Archives Identifier')]
[('p', 'http://www.wikidata.org/prop/direct/P1417'), ('getname', 'Encyclopædia Britannica Online ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('getname', 'different from')]
[('p', 'http://www.wikidata.org/prop/direct/P227'), ('getname', 'GND ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2347'), ('getname', 'YSO ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2354'), ('getname', 'has list')]
[('p', 'http://www.wikidata.org/prop/direct/P244'), ('getname', 'Library of Congress authority ID')]
[('p', 'http://www.wikidata.org/prop/direct/P268'), ('getname', 'Bibliothèque nationale de France ID')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('getname', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('getname', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P3219'), ('getname', 'Encyclopædia Universalis ID')]
[('p', 'http
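
None of the cells above asks directly for the class of The Lion King. One possible next step, using only `wdt:P31` and `wdt:P279` from the table of given IRIs, is sketched below. It was not run here, so no result is shown, and the variable names are illustrative.

```python
# Sketch: classes of The Lion King and, when present, their direct superclasses.
queryString = """
SELECT DISTINCT ?class ?className ?super ?superName
WHERE {
   wd:Q36479 wdt:P31 ?class .
   ?class sc:name ?className .
   OPTIONAL {
      ?class wdt:P279 ?super .
      ?super sc:name ?superName .
   }
}
LIMIT 20
"""
print(queryString)
```

If a class labelled as a film appears among the results, a pattern of the form `?film wdt:P31 <that class>` would be a candidate BGP for films.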

## Task 2
Identify the BGP for Netflix

In [40]:
#[('p', 'http://www.wikidata.org/prop/direct/P227'), ('getname', 'GND ID')]
queryString = """
SELECT DISTINCT ?p ?getname
WHERE {
   # bind something
   ?p wdt:P227 ?i.
   #get the label
   ?p sc:name ?getname.
   FILTER(REGEX(?getname, "Netflix"^^xsd:string))

}
LIMIT 20
"""

print("Results")
x=run_query(queryString)


Results
[('p', 'http://www.wikidata.org/entity/Q907311'), ('getname', 'Netflix')]
1


In [42]:
#[('p', 'http://www.wikidata.org/entity/Q907311'), ('getname', 'Netflix')]
queryString = """
SELECT DISTINCT ?p ?as ?i ?op
WHERE {
   # bind something
   wd:Q907311 ?p ?i.
   #get the label
   ?p sc:name ?as.
   ?i sc:name ?op
   
   

}
LIMIT 20
"""

print("Results")
x=run_query(queryString)


Results
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('as', 'instance of'), ('i', 'http://www.wikidata.org/entity/Q10689397'), ('op', 'television production company')]
[('p', 'http://www.wikidata.org/prop/direct/P400'), ('as', 'platform'), ('i', 'http://www.wikidata.org/entity/Q23930024'), ('op', 'streaming media receiver')]
[('p', 'http://www.wikidata.org/prop/direct/P166'), ('as', 'award received'), ('i', 'http://www.wikidata.org/entity/Q838121'), ('op', 'Peabody Awards')]
[('p', 'http://www.wikidata.org/prop/direct/P127'), ('as', 'owned by'), ('i', 'http://www.wikidata.org/entity/Q849363'), ('op', 'The Vanguard Group')]
[('p', 'http://www.wikidata.org/prop/direct/P1056'), ('as', 'product or material produced'), ('i', 'http://www.wikidata.org/entity/Q220499'), ('op', 'streaming media')]
[('p', 'http://www.wikidata.org/prop/direct/P452'), ('as', 'industry'), ('i', 'http://www.wikidata.org/entity/Q220499'), ('op', 'streaming media')]
[('p', 'http://www.wikidata.org/prop/direct/P4

Final query for this task
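
No final query was recorded for this task. A minimal sketch, assuming that `wd:Q907311` (the IRI returned by the first cell of this task) is the intended node, returns the required IRI-label pair:

```python
# Sketch of a final query for Task 2: Netflix as an IRI-label pair.
queryString = """
SELECT DISTINCT ?netflix ?name
WHERE {
   VALUES ?netflix { wd:Q907311 }
   ?netflix sc:name ?name .
}
"""
print(queryString)
```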

## Task 3
Identify the BGP for television production company

In [None]:
#[('p', 'http://www.wikidata.org/prop/direct/P31'), ('as', 'instance of'), ('i', 'http://www.wikidata.org/entity/Q10689397'), ('op', 'television production company')]
#[('p', 'http://www.wikidata.org/entity/Q907311'), ('getname', 'Netflix')]
queryString = """
SELECT DISTINCT ?p ?as ?i ?op
WHERE {
   # bind something
   wd:Q1068 ?p ?i.
   #get the label
   ?p sc:name ?as.
   ?i sc:name ?op
   
   

}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('as', 'subclass of'), ('i', 'http://www.wikidata.org/entity/Q1762059'), ('op', 'film production company')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('as', 'subclass of'), ('i', 'http://www.wikidata.org/entity/Q19364326'), ('op', 'media studio facility')]
[('p', 'http://www.wikidata.org/prop/direct/P452'), ('as', 'industry'), ('i', 'http://www.wikidata.org/entity/Q1407228'), ('op', 'television production')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('as', 'subclass of'), ('i', 'http://www.wikidata.org/entity/Q11396960'), ('op', 'production company')]
[('p', 'http://www.wikidata.org/prop/direct/P910'), ('as', "topic's main category"), ('i', 'http://www.wikidata.org/entity/Q8836004'), ('op', 'Category:Television production companies')]
5


Final query for this task

In [45]:
# write your final query
#[('p', 'http://www.wikidata.org/prop/direct/P31'), ('as', 'instance of'), ('i', 'http://www.wikidata.org/entity/Q10689397'), ('op', 'television production company')]
#[('p', 'http://www.wikidata.org/entity/Q907311'), ('getname', 'Netflix')]
queryString = """
SELECT DISTINCT ?p ?as ?i ?op
WHERE {
   # bind something
   wd:Q10689397 ?p ?i.
   #get the label
   ?p sc:name ?as.
   ?i sc:name ?op
   
   

}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('as', 'subclass of'), ('i', 'http://www.wikidata.org/entity/Q1762059'), ('op', 'film production company')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('as', 'subclass of'), ('i', 'http://www.wikidata.org/entity/Q19364326'), ('op', 'media studio facility')]
[('p', 'http://www.wikidata.org/prop/direct/P452'), ('as', 'industry'), ('i', 'http://www.wikidata.org/entity/Q1407228'), ('op', 'television production')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('as', 'subclass of'), ('i', 'http://www.wikidata.org/entity/Q11396960'), ('op', 'production company')]
[('p', 'http://www.wikidata.org/prop/direct/P910'), ('as', "topic's main category"), ('i', 'http://www.wikidata.org/entity/Q8836004'), ('op', 'Category:Television production companies')]
5
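
The cells above describe the class `wd:Q10689397` itself. To list the companies that belong to it, the class can be used as the object of `wdt:P31`, the same way Netflix is linked to it in Task 2. The builder below is a sketch; `instances_query` is a hypothetical helper name.

```python
def instances_query(class_iri, limit=20):
    """Build a query listing direct instances of a class as IRI-label pairs."""
    return f"""
SELECT DISTINCT ?company ?name
WHERE {{
   ?company wdt:P31 {class_iri} .
   ?company sc:name ?name .
}}
LIMIT {int(limit)}
"""


# Television production companies, and film production companies (IRI from the results above).
tv_companies_query = instances_query("wd:Q10689397")
film_companies_query = instances_query("wd:Q1762059")
print(tv_companies_query)
```

Only direct instances are returned; instances of subclasses would need an extra `wdt:P279` step.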


## Task 4
Identify the BGP for film production company

In [46]:
#[('p', 'http://www.wikidata.org/prop/direct/P279'), ('as', 'subclass of'), ('i', 'http://www.wikidata.org/entity/Q1762059'), ('op', 'film production company')]
queryString = """
SELECT DISTINCT ?p ?as ?i ?op
WHERE {
   # bind something
   wd:Q1762059 ?p ?i.
   #get the label
   ?p sc:name ?as.
   ?i sc:name ?op
   
   

}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('as', 'different from'), ('i', 'http://www.wikidata.org/entity/Q375336'), ('op', 'film studio')]
[('p', 'http://www.wikidata.org/prop/direct/P452'), ('as', 'industry'), ('i', 'http://www.wikidata.org/entity/Q1415395'), ('op', 'film industry')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('as', 'subclass of'), ('i', 'http://www.wikidata.org/entity/Q104649845'), ('op', 'film organization')]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('as', 'part of'), ('i', 'http://www.wikidata.org/entity/Q625568'), ('op', 'show business')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('as', 'subclass of'), ('i', 'http://www.wikidata.org/entity/Q11396960'), ('op', 'production company')]
[('p', 'http://www.wikidata.org/prop/direct/P910'), ('as', "topic's main category"), ('i', 'http://www.wikidata.org/entity/Q6483398'), ('op', 'Category:Film production companies')]
6


Final query for this task

In [47]:
# write your final query
#[('p', 'http://www.wikidata.org/prop/direct/P279'), ('as', 'subclass of'), ('i', 'http://www.wikidata.org/entity/Q1762059'), ('op', 'film production company')]
queryString = """
SELECT DISTINCT ?p ?as ?i ?op
WHERE {
   # bind something
   wd:Q1762059 ?p ?i.
   #get the label
   ?p sc:name ?as.
   ?i sc:name ?op
   
   

}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('as', 'different from'), ('i', 'http://www.wikidata.org/entity/Q375336'), ('op', 'film studio')]
[('p', 'http://www.wikidata.org/prop/direct/P452'), ('as', 'industry'), ('i', 'http://www.wikidata.org/entity/Q1415395'), ('op', 'film industry')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('as', 'subclass of'), ('i', 'http://www.wikidata.org/entity/Q104649845'), ('op', 'film organization')]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('as', 'part of'), ('i', 'http://www.wikidata.org/entity/Q625568'), ('op', 'show business')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('as', 'subclass of'), ('i', 'http://www.wikidata.org/entity/Q11396960'), ('op', 'production company')]
[('p', 'http://www.wikidata.org/prop/direct/P910'), ('as', "topic's main category"), ('i', 'http://www.wikidata.org/entity/Q6483398'), ('op', 'Category:Film production companies')]
6


## Task 5
Find the top-5 companies (amongst television production company and film production company) which produced the highest number of crime films (the result set must be a list of triples company IRI, label and #crime film).

In [61]:
#[('get', 'http://www.wikidata.org/prop/direct/P136'), ('getname', 'genre')]
queryString = """
SELECT DISTINCT ?p ?obj ?top ?films
WHERE {
   # bind something
   ?p wdt:P136 ?obj.
   #get the label
   ?p sc:name ?top.
   ?obj sc:name ?films
   FILTER(REGEX(?films,"crime film"^^xsd:string ))

}
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/entity/Q13693967'), ('obj', 'http://www.wikidata.org/entity/Q959790'), ('top', 'Cross'), ('films', 'crime film')]
[('p', 'http://www.wikidata.org/entity/Q20724304'), ('obj', 'http://www.wikidata.org/entity/Q959790'), ('top', 'The Purge: Election Year'), ('films', 'crime film')]
[('p', 'http://www.wikidata.org/entity/Q15865603'), ('obj', 'http://www.wikidata.org/entity/Q959790'), ('top', 'After Dark'), ('films', 'crime film')]
[('p', 'http://www.wikidata.org/entity/Q1755064'), ('obj', 'http://www.wikidata.org/entity/Q959790'), ('top', "Sharky's Machine"), ('films', 'crime film')]
[('p', 'http://www.wikidata.org/entity/Q11510316'), ('obj', 'http://www.wikidata.org/entity/Q959790'), ('top', 'Love Under Fire'), ('films', 'crime film')]
[('p', 'http://www.wikidata.org/entity/Q16020981'), ('obj', 'http://www.wikidata.org/entity/Q959790'), ('top', 'The Amazing Partnership'), ('films', 'crime film')]
[('p', 'http://www.wikidata.org/entity/Q16028315'), ('

Final query for this task

In [67]:
#[('get', 'http://www.wikidata.org/prop/direct/P136'), ('getname', 'genre')]
#[('get', 'http://www.wikidata.org/entity/Q11424'), ('getname', 'film')]
queryString = """
SELECT DISTINCT ?p ?obj ?top ?films ?pcn ?abc
WHERE {
   # bind something
   ?p wdt:P136 ?obj;
      wdt:P272 ?pcn.
   #get the label
   ?p sc:name ?top.
   ?obj sc:name ?films.
   ?pcn sc:name ?abc
   FILTER(REGEX(?films,"crime film"^^xsd:string ))
   

}
LIMIT 80
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/entity/Q26266013'), ('obj', 'http://www.wikidata.org/entity/Q959790'), ('top', 'Tatort: Die Kunst des Krieges'), ('films', 'crime film'), ('pcn', 'http://www.wikidata.org/entity/Q2367528'), ('abc', 'Superfilm')]
[('p', 'http://www.wikidata.org/entity/Q459057'), ('obj', 'http://www.wikidata.org/entity/Q959790'), ('top', 'Monster'), ('films', 'crime film'), ('pcn', 'http://www.wikidata.org/entity/Q5204823'), ('abc', 'DEJ Productions')]
[('p', 'http://www.wikidata.org/entity/Q496113'), ('obj', 'http://www.wikidata.org/entity/Q959790'), ('top', 'City of Industry'), ('films', 'crime film'), ('pcn', 'http://www.wikidata.org/entity/Q17346576'), ('abc', 'Largo Entertainment')]
[('p', 'http://www.wikidata.org/entity/Q498983'), ('obj', 'http://www.wikidata.org/entity/Q959790'), ('top', 'Young and Innocent'), ('films', 'crime film'), ('pcn', 'http://www.wikidata.org/entity/Q2756503'), ('abc', 'Gaumont-British Picture Corporation')]
[('p', 'http://www.wikida
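
The query above lists individual crime films with their production companies, but it does not count them or restrict the companies to the two requested classes. A possible aggregated form is sketched below. It relies only on IRIs that appear in earlier results (`wd:Q959790` for crime film, `wdt:P272` for production company, `wd:Q10689397` and `wd:Q1762059` for the company classes); the variable names are illustrative and the query was not run here.

```python
# Sketch: top-5 television/film production companies by number of crime films.
queryString = """
SELECT ?company ?name (COUNT(DISTINCT ?film) AS ?crimeFilms)
WHERE {
   VALUES ?type { wd:Q10689397 wd:Q1762059 }
   ?company wdt:P31 ?type ;
            sc:name ?name .
   ?film wdt:P136 wd:Q959790 ;
         wdt:P272 ?company .
}
GROUP BY ?company ?name
ORDER BY DESC(?crimeFilms)
LIMIT 5
"""
print(queryString)
```

Matching the genre by IRI instead of by `REGEX` on the label avoids picking up other genres whose labels merely contain the words "crime film".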

## Task 6
Find the company (amongst television production company and film production company) which distributed more films (of any kind of film genre) that they did not produce (the result set must be a list of triples company IRI, label and #film).

In [68]:
# TOTAL CRIME FILMS PER PRODUCTION COMPANY
#[('get', 'http://www.wikidata.org/prop/direct/P136'), ('getname', 'genre')]
#[('get', 'http://www.wikidata.org/entity/Q11424'), ('getname', 'film')]
queryString = """
SELECT DISTINCT (COUNT(?p) AS ?tf) ?pcn ?abc
WHERE {
   # bind something
   ?p wdt:P136 ?obj;
      wdt:P272 ?pcn.
   #get the label
   ?p sc:name ?top.
   ?obj sc:name ?films.
   ?pcn sc:name ?abc
   FILTER(REGEX(?films,"crime film"^^xsd:string ))
   

}
GROUP BY ?abc ?pcn
LIMIT 80
"""

print("Results")
x=run_query(queryString)

Results
[('tf', '2'), ('pcn', 'http://www.wikidata.org/entity/Q47153414'), ('abc', 'Athos Films')]
[('tf', '1'), ('pcn', 'http://www.wikidata.org/entity/Q108183211'), ('abc', 'Lumen Films')]
[('tf', '1'), ('pcn', 'http://www.wikidata.org/entity/Q17377479'), ('abc', 'Edward Small Productions')]
[('tf', '1'), ('pcn', 'http://www.wikidata.org/entity/Q605307'), ('abc', 'Les films du losange')]
[('tf', '1'), ('pcn', 'http://www.wikidata.org/entity/Q624886'), ('abc', 'FilmDistrict')]
[('tf', '1'), ('pcn', 'http://www.wikidata.org/entity/Q82110'), ('abc', 'Steven Seagal')]
[('tf', '2'), ('pcn', 'http://www.wikidata.org/entity/Q4341103'), ('abc', 'Profit')]
[('tf', '1'), ('pcn', 'http://www.wikidata.org/entity/Q2748604'), ('abc', 'Film i Väst')]
[('tf', '1'), ('pcn', 'http://www.wikidata.org/entity/Q7127609'), ('abc', 'Palisades Tartan')]
[('tf', '1'), ('pcn', 'http://www.wikidata.org/entity/Q43380'), ('abc', 'CBS')]
[('tf', '3'), ('pcn', 'http://www.wikidata.org/entity/Q5453219'), ('abc', 'Fi

Final query for this task

In [70]:
# TOP 10 PRODUCTION COMPANIES BY NUMBER OF CRIME FILMS
#[('get', 'http://www.wikidata.org/prop/direct/P136'), ('getname', 'genre')]
#[('get', 'http://www.wikidata.org/entity/Q11424'), ('getname', 'film')]
queryString = """
SELECT DISTINCT (COUNT(?p) AS ?tf) ?pcn ?abc
WHERE {
   # bind something
   ?p wdt:P136 ?obj;
      wdt:P272 ?pcn.
   #get the label
   ?p sc:name ?top.
   ?obj sc:name ?films.
   ?pcn sc:name ?abc
   FILTER(REGEX(?films,"crime film"^^xsd:string ))
   

}
GROUP BY ?abc ?pcn
ORDER BY DESC(?tf)
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('tf', '161'), ('pcn', 'http://www.wikidata.org/entity/Q126399'), ('abc', 'Warner Bros.')]
[('tf', '118'), ('pcn', 'http://www.wikidata.org/entity/Q186941'), ('abc', 'Columbia Pictures')]
[('tf', '113'), ('pcn', 'http://www.wikidata.org/entity/Q168383'), ('abc', 'Universal Pictures')]
[('tf', '112'), ('pcn', 'http://www.wikidata.org/entity/Q179200'), ('abc', 'Metro-Goldwyn-Mayer')]
[('tf', '109'), ('pcn', 'http://www.wikidata.org/entity/Q159846'), ('abc', 'Paramount Pictures')]
[('tf', '90'), ('pcn', 'http://www.wikidata.org/entity/Q434841'), ('abc', '20th Century Studios')]
[('tf', '87'), ('pcn', 'http://www.wikidata.org/entity/Q267282'), ('abc', 'RKO Pictures')]
[('tf', '37'), ('pcn', 'http://www.wikidata.org/entity/Q913462'), ('abc', 'Gaumont Film Company')]
[('tf', '35'), ('pcn', 'http://www.wikidata.org/entity/Q2450848'), ('abc', 'StudioCanal')]
[('tf', '34'), ('pcn', 'http://www.wikidata.org/entity/Q79202'), ('abc', 'New Line Cinema')]
10
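
The ranking above still counts crime films that each company produced, whereas the task asks for films a company distributed without producing them. One way to express "distributed but not produced" is `FILTER NOT EXISTS`, sketched below. The distributor property is passed in as a parameter because it was not discovered in this notebook; `wdt:P750` in the usage line is an assumption that should first be confirmed with a property-listing query on a film. The helper name is hypothetical.

```python
def distributed_not_produced_query(distributor_prop, limit=1):
    """Build a query counting films a company distributed but did not produce."""
    return f"""
SELECT ?company ?name (COUNT(DISTINCT ?film) AS ?films)
WHERE {{
   VALUES ?type {{ wd:Q10689397 wd:Q1762059 }}
   ?company wdt:P31 ?type ;
            sc:name ?name .
   ?film {distributor_prop} ?company .
   FILTER NOT EXISTS {{ ?film wdt:P272 ?company }}
}}
GROUP BY ?company ?name
ORDER BY DESC(?films)
LIMIT {int(limit)}
"""


# ASSUMPTION: wdt:P750 is taken to be the "distributed by" property; verify it in the KG first.
queryString = distributed_not_produced_query("wdt:P750")
print(queryString)
```

The sketch does not check that `?film` is actually a film; adding the film BGP from Task 1 would tighten it.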


## Task 7
Find how many companies are listed in every stock exchange market (the result set must be a list of triples stock exchange IRI, label and #companies).

In [73]:
# WE START FROM THE 'PRODUCTION COMPANY' SUPERCLASS FOUND IN TASK 3 TO LOOK FOR COMPANIES LISTED ON A STOCK EXCHANGE
#[('p', 'http://www.wikidata.org/prop/direct/P279'), ('as', 'subclass of'), ('i', 'http://www.wikidata.org/entity/Q11396960'), ('op', 'production company')]
queryString = """
SELECT DISTINCT ?p ?as ?op ?i
WHERE {
   # bind something
   wd:Q11396960 ?p ?i.
   #get the label
   ?p sc:name ?as.
   ?i sc:name ?op
   
   

}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P101'), ('as', 'field of work'), ('op', 'audiovisual production'), ('i', 'http://www.wikidata.org/entity/Q21198548')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('as', 'subclass of'), ('op', 'media company'), ('i', 'http://www.wikidata.org/entity/Q1331793')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('as', 'subclass of'), ('op', 'company'), ('i', 'http://www.wikidata.org/entity/Q783794')]
[('p', 'http://www.wikidata.org/prop/direct/P1687'), ('as', 'Wikidata property'), ('op', 'production company'), ('i', 'http://www.wikidata.org/entity/P272')]
4


Final query for this task

In [74]:
# WE EXPLORE 'MEDIA COMPANY' FURTHER TO FIND HOW COMPANIES RELATE TO STOCK EXCHANGES
#[('p', 'http://www.wikidata.org/prop/direct/P279'), ('as', 'subclass of'), ('op', 'media company'), ('i', 'http://www.wikidata.org/entity/Q1331793')]
queryString = """
SELECT DISTINCT ?p ?as ?op ?i
WHERE {
   # bind something
   wd:Q1331793 ?p ?i.
   #get the label
   ?p sc:name ?as.
   ?i sc:name ?op
   
   

}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('as', 'different from'), ('op', 'media studio facility'), ('i', 'http://www.wikidata.org/entity/Q19364326')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('as', 'subclass of'), ('op', 'business'), ('i', 'http://www.wikidata.org/entity/Q4830453')]
[('p', 'http://www.wikidata.org/prop/direct/P2283'), ('as', 'uses'), ('op', 'communication'), ('i', 'http://www.wikidata.org/entity/Q11024')]
[('p', 'http://www.wikidata.org/prop/direct/P101'), ('as', 'field of work'), ('op', 'mass media'), ('i', 'http://www.wikidata.org/entity/Q11033')]
[('p', 'http://www.wikidata.org/prop/direct/P101'), ('as', 'field of work'), ('op', 'communication medium'), ('i', 'http://www.wikidata.org/entity/Q340169')]
[('p', 'http://www.wikidata.org/prop/direct/P2283'), ('as', 'uses'), ('op', 'communication medium'), ('i', 'http://www.wikidata.org/entity/Q340169')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('as', 'subclass of'), ('op', 'comp

In [75]:
# NOW WE SELECT 'BUSINESS' FROM THE RESULTS AND EXPLORE IT FURTHER.
#[('p', 'http://www.wikidata.org/prop/direct/P279'), ('as', 'subclass of'), ('op', 'business'), ('i', 'http://www.wikidata.org/entity/Q4830453')]
queryString = """
SELECT DISTINCT ?p ?as ?op ?i
WHERE {
   # bind something
   wd:Q4830453 ?p ?i.
   #get the label
   ?p sc:name ?as.
   ?i sc:name ?op
   
   

}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1963'), ('as', 'properties for this type'), ('op', 'stock exchange'), ('i', 'http://www.wikidata.org/entity/P414')]
[('p', 'http://www.wikidata.org/prop/direct/P1963'), ('as', 'properties for this type'), ('op', 'Twitter username'), ('i', 'http://www.wikidata.org/entity/P2002')]
[('p', 'http://www.wikidata.org/prop/direct/P1963'), ('as', 'properties for this type'), ('op', 'Legal Entity Identifier'), ('i', 'http://www.wikidata.org/entity/P1278')]
[('p', 'http://www.wikidata.org/prop/direct/P1963'), ('as', 'properties for this type'), ('op', 'OpenCorporates ID'), ('i', 'http://www.wikidata.org/entity/P1320')]
[('p', 'http://www.wikidata.org/prop/direct/P1963'), ('as', 'properties for this type'), ('op', 'product or material produced'), ('i', 'http://www.wikidata.org/entity/P1056')]
[('p', 'http://www.wikidata.org/prop/direct/P1963'), ('as', 'properties for this type'), ('op', 'phone number'), ('i', 'http://www.wikidata.org/entity/P132

In [77]:
# 'STOCK EXCHANGE' (P414) APPEARS IN THE FIRST ROW OF THE PREVIOUS RESULT; NOW WE EXPLORE IT FURTHER.
#[('p', 'http://www.wikidata.org/prop/direct/P1963'), ('as', 'properties for this type'), ('op', 'stock exchange'), ('i', 'http://www.wikidata.org/entity/P414')]
queryString = """
SELECT DISTINCT ?p ?as ?op ?i
WHERE {
   # bind something
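   # NOTE: ?P (here) and ?p (below) are different variables because SPARQL variable names are case-sensitive
   ?P wdt:P414 ?i.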
   #get the label
   ?p sc:name ?as.
   ?i sc:name ?op
   
   

}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/entity/P348'), ('as', 'software version identifier'), ('op', 'Philippine Stock Exchange'), ('i', 'http://www.wikidata.org/entity/Q1526647')]
[('p', 'http://www.wikidata.org/entity/P4669'), ('as', 'tabular software version'), ('op', 'Philippine Stock Exchange'), ('i', 'http://www.wikidata.org/entity/Q1526647')]
[('p', 'http://www.wikidata.org/entity/P747'), ('as', 'has edition or translation'), ('op', 'Philippine Stock Exchange'), ('i', 'http://www.wikidata.org/entity/Q1526647')]
[('p', 'http://www.wikidata.org/entity/Q1002728'), ('as', 'Fonsi Nieto'), ('op', 'Philippine Stock Exchange'), ('i', 'http://www.wikidata.org/entity/Q1526647')]
[('p', 'http://www.wikidata.org/entity/Q100326534'), ('as', "Al Stewart's albums in chronological order"), ('op', 'Philippine Stock Exchange'), ('i', 'http://www.wikidata.org/entity/Q1526647')]
[('p', 'http://www.wikidata.org/entity/Q100327667'), ('as', 'Category:Adelina Lopes Vieira'), ('op', 'Philippine Stock Ex
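
The unrelated labels in this output ('software version identifier', 'Fonsi Nieto', ...) come from the query binding `?P` in the first triple pattern and `?p` in the second: SPARQL treats them as two different variables, so the result is a cross product. A small check for this kind of slip is sketched below; `find_case_clashes` is a hypothetical helper and the regular expression is a simplification that ignores comments and string literals.

```python
import re
from collections import defaultdict


def find_case_clashes(query):
    """Return groups of SPARQL variables whose names differ only by letter case."""
    groups = defaultdict(set)
    # Variables start with ? or $ followed by word characters.
    for name in re.findall(r"[?$](\w+)", query):
        groups[name.lower()].add(name)
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


query = """
SELECT DISTINCT ?p ?as ?op ?i
WHERE {
   ?P wdt:P414 ?i.
   ?p sc:name ?as.
   ?i sc:name ?op
}
"""
print(find_case_clashes(query))  # prints: [['P', 'p']]
```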

In [79]:
# TOTAL NUMBER OF COMPANIES PER STOCK EXCHANGE
# THERE ARE ABOUT 200 COUNTRIES IN THE WORLD, SO WE SET THE LIMIT TO 200 TO SEE THE NUMBER OF COMPANIES ON EACH COUNTRY'S STOCK EXCHANGE.
queryString = """
SELECT DISTINCT (COUNT(?P) AS ?tnc) ?old ?i
WHERE {
   # bind something
   ?P wdt:P414 ?i.
   #get the label
   ?i sc:name ?old
   
   

}
GROUP BY ?old ?i
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('tnc', '40'), ('old', 'SIX Swiss Exchange'), ('i', 'http://www.wikidata.org/entity/Q661834')]
[('tnc', '6'), ('old', 'Xetra'), ('i', 'http://www.wikidata.org/entity/Q819468')]
[('tnc', '460'), ('old', 'Tokyo Stock Exchange'), ('i', 'http://www.wikidata.org/entity/Q217475')]
[('tnc', '1'), ('old', 'Bilbao Stock Exchange'), ('i', 'http://www.wikidata.org/entity/Q4890307')]
[('tnc', '11'), ('old', 'PFTS Ukraine Stock Exchange'), ('i', 'http://www.wikidata.org/entity/Q1984912')]
[('tnc', '346'), ('old', 'Nepal Stock Exchange'), ('i', 'http://www.wikidata.org/entity/Q6529599')]
[('tnc', '58'), ('old', 'Nasdaq Helsinki'), ('i', 'http://www.wikidata.org/entity/Q581755')]
[('tnc', '3'), ('old', 'Bulgarian Stock Exchange'), ('i', 'http://www.wikidata.org/entity/Q1003968')]
[('tnc', '1'), ('old', 'CEX.io'), ('i', 'http://www.wikidata.org/entity/Q22907145')]
[('tnc', '2'), ('old', 'Euronext Growth'), ('i', 'http://www.wikidata.org/entity/Q2718247')]
[('tnc', '1'), ('old', 'Old Stock Exc
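
The count above is cut off at 200 rows in arbitrary order, so some exchanges may be missing from the result. A variant without `LIMIT`, ordered by the count and using `COUNT(DISTINCT ...)` so that a subject linked twice to the same exchange is counted once, is sketched below. Variable names are illustrative and the query was not run here.

```python
# Sketch: number of listed entities per stock exchange, largest first.
queryString = """
SELECT ?exchange ?name (COUNT(DISTINCT ?company) AS ?companies)
WHERE {
   ?company wdt:P414 ?exchange .
   ?exchange sc:name ?name .
}
GROUP BY ?exchange ?name
ORDER BY DESC(?companies)
"""
print(queryString)
```

As in the original cell, nothing here verifies that `?company` is actually a company; any subject with a `wdt:P414` statement is counted.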

## Task 8
Identify the BGP for Academy Award

In [90]:
queryString = """
SELECT DISTINCT ?P ?top ?films ?obj
WHERE {
   {
      # bind something
      wd:Q11424 ?P ?obj.
      # get the label
      # NOTE: ?p is not the same variable as ?P (names are case-sensitive), so ?top is unrelated to ?P
      ?p sc:name ?top.
      ?obj sc:name ?films.
   }
   UNION
   {
      ?P ?obj wd:Q11424.
      ?p sc:name ?top.
      ?obj sc:name ?films.
   }
}
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('P', 'http://www.wikidata.org/prop/direct/P2579'), ('top', 'software version identifier'), ('films', 'film studies'), ('obj', 'http://www.wikidata.org/entity/Q1660187')]
[('P', 'http://www.wikidata.org/prop/direct/P2579'), ('top', 'tabular software version'), ('films', 'film studies'), ('obj', 'http://www.wikidata.org/entity/Q1660187')]
[('P', 'http://www.wikidata.org/prop/direct/P2579'), ('top', 'has edition or translation'), ('films', 'film studies'), ('obj', 'http://www.wikidata.org/entity/Q1660187')]
[('P', 'http://www.wikidata.org/prop/direct/P2579'), ('top', 'Fonsi Nieto'), ('films', 'film studies'), ('obj', 'http://www.wikidata.org/entity/Q1660187')]
[('P', 'http://www.wikidata.org/prop/direct/P2579'), ('top', "Al Stewart's albums in chronological order"), ('films', 'film studies'), ('obj', 'http://www.wikidata.org/entity/Q1660187')]
[('P', 'http://www.wikidata.org/prop/direct/P2579'), ('top', 'Category:Adelina Lopes Vieira'), ('films', 'film studies'), ('obj', 'http:/

In [95]:
# now we have to filter for Academy Awards
queryString = """
SELECT DISTINCT ?P ?top ?films ?obj
WHERE {
   # bind something
    ?P wdt:P166 ?obj.
   #get the label
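   # NOTE: ?p is unbound by the pattern above (?P is a different variable), so this matches any labelled resource
   ?p sc:name ?top.
   FILTER(REGEX(?top,"Academy Award"^^xsd:string))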
   }
   
LIMIT 50
"""

print("Results")
x=run_query(queryString)


Results
[('P', 'http://www.wikidata.org/entity/Q8201073'), ('top', 'Category:Academy Award for Best Documentary Feature templates'), ('obj', 'http://www.wikidata.org/entity/Q25421476')]
[('P', 'http://www.wikidata.org/entity/Q8201073'), ('top', 'Category:Academy Award for Best Original Song templates'), ('obj', 'http://www.wikidata.org/entity/Q25421476')]
[('P', 'http://www.wikidata.org/entity/Q8201073'), ('top', 'Category:Films whose cinematographer won the Best Cinematography Academy Award'), ('obj', 'http://www.wikidata.org/entity/Q25421476')]
[('P', 'http://www.wikidata.org/entity/Q8201073'), ('top', 'Africa Movie Academy Award for Best Actress in a Supporting Role'), ('obj', 'http://www.wikidata.org/entity/Q25421476')]
[('P', 'http://www.wikidata.org/entity/Q8201073'), ('top', 'Africa Movie Academy Award for Most Promising Actor'), ('obj', 'http://www.wikidata.org/entity/Q25421476')]
[('P', 'http://www.wikidata.org/entity/Q8201073'), ('top', '"Sunshine", "Queen" in the spotlight a
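
In the output above every row repeats the same `?P` and `?obj` while `?top` varies, which is what a cross product between two unrelated patterns looks like. SPARQL variable names are case-sensitive, so `?P` and `?p` never join. The helper below is a minimal sketch of a check for this kind of slip; `find_case_clashes` is a hypothetical name, not part of the notebook setup, and it assumes `#` only ever starts a comment in the query text.

```python
import re
from collections import defaultdict


def find_case_clashes(query):
    """Return groups of SPARQL variable names that differ only by letter case."""
    # Drop '#' comments so variables mentioned only in comments are ignored.
    # Simplification: assumes '#' never appears inside an IRI or a string literal.
    lines = [line.split("#", 1)[0] for line in query.splitlines()]
    names = set(re.findall(r"[?$](\w+)", "\n".join(lines)))

    groups = defaultdict(set)
    for name in names:
        groups[name.lower()].add(name)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


sample = """
SELECT DISTINCT ?P ?top ?obj
WHERE {
    ?P wdt:P166 ?obj .
    ?p sc:name ?top .
}
"""
clashes = find_case_clashes(sample)
print(clashes)  # [['P', 'p']]
```

Running the check on a `queryString` before passing it to `run_query` flags the pair without touching the endpoint.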

In [96]:
# write your final query
# Now filter for Academy Awards using the label of the object
queryString = """
SELECT DISTINCT ?P ?top ?films ?obj
WHERE {
   # bind something
    ?P wdt:P166 ?obj.
   #get the label
   ?obj sc:name ?top.
   FILTER(REGEX(?top,"Academy Award"))
   }
   
LIMIT 50
"""

print("Results")
x=run_query(queryString)


Results
[('P', 'http://www.wikidata.org/entity/Q231276'), ('top', 'Africa Movie Academy Award for Best Actress in a Supporting Role'), ('obj', 'http://www.wikidata.org/entity/Q17001892')]
[('P', 'http://www.wikidata.org/entity/Q16194334'), ('top', 'Africa Movie Academy Award for Best Actress in a Supporting Role'), ('obj', 'http://www.wikidata.org/entity/Q17001892')]
[('P', 'http://www.wikidata.org/entity/Q8062489'), ('top', 'Africa Movie Academy Award for Best Actress in a Supporting Role'), ('obj', 'http://www.wikidata.org/entity/Q17001892')]
[('P', 'http://www.wikidata.org/entity/Q24289792'), ('top', 'Africa Movie Academy Award for Best Actress in a Supporting Role'), ('obj', 'http://www.wikidata.org/entity/Q17001892')]
[('P', 'http://www.wikidata.org/entity/Q13638782'), ('top', 'Africa Movie Academy Award for Best Actress in a Supporting Role'), ('obj', 'http://www.wikidata.org/entity/Q17001892')]
[('P', 'http://www.wikidata.org/entity/Q24572815'), ('top', 'Africa Movie Academy Awa
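
The unanchored pattern also matches labels such as 'Africa Movie Academy Award for Best Actress in a Supporting Role', as the rows above show. The sketch below uses Python's `re` module as a stand-in for SPARQL `REGEX` to illustrate what anchoring the pattern with `^` would change; it assumes the two regex dialects agree on this simple case, and the label list is a small sample copied from outputs in this notebook.

```python
import re

# Sample labels taken from query outputs in this notebook.
labels = [
    "Africa Movie Academy Award for Best Actress in a Supporting Role",
    "Academy Award for Best Picture",
    "Academy Award for Best Actress",
    "45th Academy Awards",
]

# Unanchored: the phrase may appear anywhere in the label.
unanchored = [label for label in labels if re.search(r"Academy Award", label)]

# Anchored: the label has to start with the phrase.
anchored = [label for label in labels if re.search(r"^Academy Award", label)]

print(len(unanchored))  # 4
print(anchored)
```

On the SPARQL side the corresponding change would be `REGEX(?top, "^Academy Award")`, which also drops ceremony labels like '45th Academy Awards'.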

In [97]:
# Here we further explore the 45th Academy Awards
#[('P', 'http://www.wikidata.org/entity/Q489951'), ('top', '45th Academy Awards'),
queryString = """
SELECT DISTINCT ?P ?top ?films ?obj
WHERE {
   # bind something
   wd:Q489951 ?P ?obj.
   #get the label
   ?obj sc:name ?top.
   FILTER(REGEX(?top,"Academy Award"))
   }
   
LIMIT 50
"""

print("Results")
x=run_query(queryString)


Results
[('P', 'http://www.wikidata.org/prop/direct/P166'), ('top', '45th Academy Awards'), ('obj', 'http://www.wikidata.org/entity/Q918125')]
[('P', 'http://www.wikidata.org/prop/direct/P1411'), ('top', 'Academy Award for Best Cinematography, Color'), ('obj', 'http://www.wikidata.org/entity/Q21995136')]
[('P', 'http://www.wikidata.org/prop/direct/P166'), ('top', 'Academy Awards'), ('obj', 'http://www.wikidata.org/entity/Q19020')]
[('P', 'http://www.wikidata.org/prop/direct/P1411'), ('top', 'Academy Award for Best Cinematography'), ('obj', 'http://www.wikidata.org/entity/Q131520')]
[('P', 'http://www.wikidata.org/prop/direct/P166'), ('top', 'Academy Award for Best Cinematography'), ('obj', 'http://www.wikidata.org/entity/Q131520')]
[('P', 'http://www.wikidata.org/prop/direct/P166'), ('top', '53rd Academy Awards'), ('obj', 'http://www.wikidata.org/entity/Q767355')]
6
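
The rows above mix two properties, P166 and P1411, so it helps to group the labels per property before deciding which one to follow. This is a minimal sketch that works on the list-of-tuples shape returned by `run_query`; `group_by_property` is a hypothetical helper and the sample rows are copied from the output above.

```python
from collections import defaultdict


def group_by_property(rows, prop_var="P", label_var="top"):
    """Group the labels in run_query-style rows by the property local name."""
    grouped = defaultdict(list)
    for row in rows:
        binding = dict(row)
        # Keep only the local name, e.g. 'P166' from '.../prop/direct/P166'.
        prop = binding[prop_var].rsplit("/", 1)[-1]
        grouped[prop].append(binding[label_var])
    return dict(grouped)


rows = [
    [("P", "http://www.wikidata.org/prop/direct/P166"),
     ("top", "45th Academy Awards"),
     ("obj", "http://www.wikidata.org/entity/Q918125")],
    [("P", "http://www.wikidata.org/prop/direct/P1411"),
     ("top", "Academy Award for Best Cinematography, Color"),
     ("obj", "http://www.wikidata.org/entity/Q21995136")],
    [("P", "http://www.wikidata.org/prop/direct/P166"),
     ("top", "Academy Awards"),
     ("obj", "http://www.wikidata.org/entity/Q19020")],
]

grouped = group_by_property(rows)
print(grouped["P166"])   # ['45th Academy Awards', 'Academy Awards']
print(grouped["P1411"])  # ['Academy Award for Best Cinematography, Color']
```

In the notebook the same call would take `x`, the value returned by `run_query`, instead of the hand-copied `rows`.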


In [98]:
#FINAL QUERY
#BGP FOR ACADEMY AWARDS
#[('P', 'http://www.wikidata.org/prop/direct/P166'), ('top', 'Academy Awards'), ('obj', 'http://www.wikidata.org/entity/Q19020')]
queryString = """
SELECT DISTINCT ?P ?top ?films ?obj
WHERE {
   # bind something
   wd:Q19020 ?P ?obj.
   #get the label
   ?obj sc:name ?top.
   #FILTER(REGEX(?top,"Academy Award"^^xsd:))
   }
   
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('P', 'http://www.wikidata.org/prop/direct/P527'), ('top', 'Academy Award for Best Picture'), ('obj', 'http://www.wikidata.org/entity/Q102427')]
[('P', 'http://www.wikidata.org/prop/direct/P527'), ('top', 'Academy Award for Best Director'), ('obj', 'http://www.wikidata.org/entity/Q103360')]
[('P', 'http://www.wikidata.org/prop/direct/P527'), ('top', 'Academy Award for Best Actress'), ('obj', 'http://www.wikidata.org/entity/Q103618')]
[('P', 'http://www.wikidata.org/prop/direct/P527'), ('top', 'Academy Award for Best Supporting Actress'), ('obj', 'http://www.wikidata.org/entity/Q106301')]
[('P', 'http://www.wikidata.org/prop/direct/P527'), ('top', 'Academy Award for Best Writing, Adapted Screenplay'), ('obj', 'http://www.wikidata.org/entity/Q107258')]
[('P', 'http://www.wikidata.org/prop/direct/P527'), ('top', 'Academy Award for Best Original Song'), ('obj', 'http://www.wikidata.org/entity/Q112243')]
[('P', 'http://www.wikidata.org/prop/direct/P17'), ('top', 'United States of A

## Task 9
Find the companies (among television and film production companies) that won at least 5 Academy Awards for Best Actress for the movies they produced. The result set must be a list of triples: company IRI, label, and number of awards.
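
The "at least 5" condition maps naturally onto `GROUP BY` with `HAVING`. The builder below is only a sketch of that shape under strong assumptions: that the award statement sits on the film itself, that a single property links a film to its production company, and that counting distinct films is an acceptable proxy for counting awards. `build_award_count_query` and its parameters are hypothetical, the producer property is left as a placeholder to be filled in from the exploration, and the restriction to television and film production companies is not included.

```python
def build_award_count_query(award_iri, award_prop, producer_prop, min_awards=5):
    """Build a SPARQL query counting awarded films per production company.

    All three IRIs are supplied by the caller; they have to come from
    earlier exploratory queries, nothing is hard-coded here.
    """
    return f"""
SELECT ?company ?label (COUNT(DISTINCT ?film) AS ?awards)
WHERE {{
   # assumption: the film itself carries the award statement
   ?film <{award_prop}> <{award_iri}> .
   # assumption: one property links the film to its production company
   ?film <{producer_prop}> ?company .
   ?company sc:name ?label .
}}
GROUP BY ?company ?label
HAVING (COUNT(DISTINCT ?film) >= {int(min_awards)})
ORDER BY DESC(?awards)
"""


queryString = build_award_count_query(
    award_iri="http://www.wikidata.org/entity/Q103618",     # Academy Award for Best Actress
    award_prop="http://www.wikidata.org/prop/direct/P166",  # property seen in the outputs above
    producer_prop="http://example.org/PRODUCER_PROPERTY",   # placeholder, not a real property
)
print(queryString)
```

The `sc:` prefix is resolved by `prefixString` when the string is passed to `run_query`.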

In [99]:
# write your queries
# Using the category of Academy Award winners

#[('P', 'http://www.wikidata.org/prop/direct/P2517'),
queryString = """
SELECT DISTINCT ?P ?top ?films ?obj
WHERE {
   # bind something
   wd:Q19020 wdt:P2517 ?obj.
   #get the label
   ?obj sc:name ?top.
   #FILTER(REGEX(?top,"Academy Award"^^xsd:))
   }
   
LIMIT 50
"""

print("Results")
x=run_query(queryString)


Results
[('top', 'Category:Academy Award winners'), ('obj', 'http://www.wikidata.org/entity/Q6468204')]
1


In [100]:
#[('top', 'Category:Academy Award winners'), ('obj', 'http://www.wikidata.org/entity/Q6468204')]
queryString = """
SELECT DISTINCT ?P ?top ?films ?obj
WHERE {
   # bind something
   wd:Q6468204 ?P ?obj.
   #get the label
   ?obj sc:name ?top.
   #FILTER(REGEX(?top,"Academy Award"^^xsd:))
   }
   
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('P', 'http://www.wikidata.org/prop/direct/P971'), ('top', 'award winner'), ('obj', 'http://www.wikidata.org/entity/Q21096945')]
[('P', 'http://www.wikidata.org/prop/direct/P971'), ('top', 'Academy Awards'), ('obj', 'http://www.wikidata.org/entity/Q19020')]
[('P', 'http://www.wikidata.org/prop/direct/P31'), ('top', 'Wikimedia category'), ('obj', 'http://www.wikidata.org/entity/Q4167836')]
[('P', 'http://www.wikidata.org/prop/direct/P4224'), ('top', 'human'), ('obj', 'http://www.wikidata.org/entity/Q5')]
4


In [103]:
# Explore P971 without fixing the subject (intended: all award winners)
#[('P', 'http://www.wikidata.org/prop/direct/P971'), ('top', 'award winner'),
#[('top', 'Category:Academy Award winners'), ('obj', 'http://www.wikidata.org/entity/Q6468204')]
queryString = """
SELECT DISTINCT ?P ?top ?films ?obj
WHERE {
   # bind something
   ?p wdt:P971 ?obj.
   #get the label
   ?obj sc:name ?top.
   #FILTER(REGEX(?top,"Academy Award"^^xsd:))
   }
   
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('top', 'Category:1920 elections in Oceania'), ('obj', 'http://www.wikidata.org/entity/Q10145708')]
[('top', 'Category:2002 elections in South America'), ('obj', 'http://www.wikidata.org/entity/Q10147007')]
[('top', 'Category:Akita Shoten magazines'), ('obj', 'http://www.wikidata.org/entity/Q10237146')]
[('top', 'Category:2009 in Australian sport'), ('obj', 'http://www.wikidata.org/entity/Q11167612')]
[('top', 'Category:1981 in New Zealand sport'), ('obj', 'http://www.wikidata.org/entity/Q11721574')]
[('top', 'Category:1926 in Argentine sport'), ('obj', 'http://www.wikidata.org/entity/Q13237425')]
[('top', 'Category:1928 in Argentine sport'), ('obj', 'http://www.wikidata.org/entity/Q13237485')]
[('top', 'Category:1928 in South American sport'), ('obj', 'http://www.wikidata.org/entity/Q13237495')]
[('top', 'Category:1932 in table tennis'), ('obj', 'http://www.wikidata.org/entity/Q13237669')]
[('top', 'Category:1944 in Antarctica'), ('obj', 'http://www.wikidata.org/entity/Q13238
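
Leaving both ends of the P971 triple unconstrained returns unrelated categories, as the rows above show. One way to narrow it, sketched here, is to pin the two P971 values seen earlier on 'Category:Academy Award winners' (Q21096945 and Q19020). `build_category_query` is a hypothetical helper, and whether other categories share both values in this graph is not known in advance.

```python
def build_category_query(topic_iris, limit=50):
    """Build a query for subjects whose P971 values include every given topic."""
    patterns = "\n".join(f"   ?cat wdt:P971 <{iri}> ." for iri in topic_iris)
    return f"""
SELECT DISTINCT ?cat ?name
WHERE {{
{patterns}
   ?cat sc:name ?name .
}}
LIMIT {int(limit)}
"""


queryString = build_category_query([
    "http://www.wikidata.org/entity/Q21096945",  # award winner
    "http://www.wikidata.org/entity/Q19020",     # Academy Awards
])
print(queryString)
```

The `wdt:` and `sc:` prefixes are supplied by `prefixString` when the string goes through `run_query`.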