# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload usually starts with a vague, open-ended question and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.

In [2]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-7b2fc88989-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# SELECT and CONSTRUCT queries; each result row is a list of (variable, value) tuples
def run_query(queryString, verbose=True):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        results = sparql.query()
        json_results = results.convert()
        if len(json_results['results']['bindings'])==0:
            print("Empty")
            return []
        array = []
        for bindings in json_results['results']['bindings']:
            app =  [ (var, value['value'])  for var, value in bindings.items() ] 
            if verbose:
                print( app)
            array.append(app)
        if verbose:
            print(len(array))
        return array

    except Exception as e :
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)

# Movie Workflow Series ("Production company explorative search") 

Consider the following exploratory information need:

> Investigate the main companies working on cinema-related content. We want to know which are the main television production companies and film production companies, which company distributes the most films, and some considerations about awards.

## Useful URIs for the current workflow
The following are given:

| IRI         | Description   | Role      |
| ----------- | ------------- | --------- |
| `wdt:P1647` | subproperty   | predicate |
| `wdt:P31`   | instance of   | predicate |
| `wdt:P106`  | profession    | predicate |
| `wdt:P279`  | subclass      | predicate |
| `wdt:P27`   | nationality   | predicate |
| `wd:Q5`     | Human         | node      |
| `wd:Q36479` | The Lion King | node      |





Also consider

```
wd:Q36479 ?p ?obj .
```

is the BGP to retrieve all **properties of The Lion King**.
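
The two BGPs given so far can be combined to list the labelled properties of any entity, not only The Lion King. The following is a minimal sketch, assuming a hypothetical helper named `properties_query` (it is not part of the notebook setup) and the `wd:` and `sc:` prefixes declared in `prefixString`:

```python
def properties_query(entity_id, limit=500):
    """Build a SPARQL query listing the properties of one entity with their labels.

    `properties_query` is a hypothetical helper; `entity_id` is a Wikidata
    identifier such as "Q36479".
    """
    return f"""
SELECT DISTINCT ?p ?name
WHERE {{
   # every property used on the entity
   wd:{entity_id} ?p ?obj .
   # human-readable name of the property
   ?p sc:name ?name .
}}
ORDER BY ?name
LIMIT {limit}
"""


query = properties_query("Q36479")
print(query)
```

The resulting string can be passed to `run_query` unchanged, since `run_query` prepends the prefixes itself.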

Please note that whenever you return a resource, you should return both its IRI and its label. In particular, when a task requires you to identify a BGP, the result set must always be a list of IRI-label pairs.
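
Since `run_query` returns each row as a list of `(variable, value)` tuples, the IRI-label requirement can be checked on the client side. This is a minimal sketch, assuming a hypothetical helper named `to_iri_label_pairs` that is not part of the notebook setup; the sample row is taken from the Task 1 results in this notebook:

```python
def to_iri_label_pairs(rows, iri_var, label_var):
    """Convert run_query-style rows into a list of (IRI, label) pairs.

    Each row is a list of (variable, value) tuples, as built in run_query.
    Rows missing either variable are skipped. `to_iri_label_pairs` is a
    hypothetical helper, not part of the notebook setup.
    """
    pairs = []
    for row in rows or []:  # run_query returns None when the request fails
        values = dict(row)
        if iri_var in values and label_var in values:
            pairs.append((values[iri_var], values[label_var]))
    return pairs


# Sample row in the format produced by run_query
rows = [[("inst", "http://www.wikidata.org/entity/Q11424"), ("iname", "film")]]
print(to_iri_label_pairs(rows, "inst", "iname"))
```

A result whose length differs from the number of input rows signals that some row lacks its IRI or its label.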

The workload should:


1. Identify the BGP for films

2. Identify the BGP for Netflix

3. Identify the BGP for television production company

4. Identify the BGP for film production company

5. Find the top-5 companies (among television production companies and film production companies) that produced the highest number of crime films (the result set must be a list of triples: company IRI, label and #crime films).

6. Find the company (among television production companies and film production companies) that distributed the most films (of any genre) that it did not produce (the result set must be a list of triples: company IRI, label and #films).

7. Find how many companies are listed on each stock exchange (the result set must be a list of triples: stock exchange IRI, label and #companies).

8. Identify the BGP for Academy Award

9. Find the companies (among television and film production companies) that won at least 5 Academy Awards for Best Actress for the movies they produced (the result set must be a list of triples: company IRI, label and #awards).

## Task 1
Identify the BGP for films

In [3]:
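# All the properties of The Lion King (Q36479).
# Note: ?iname is not bound in this query, so the ORDER BY below has no effect on the row order.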
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q36479 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
ORDER BY ?iname
LIMIT 500
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P3092'), ('name', 'film crew member')]
[('p', 'http://www.wikidata.org/prop/direct/P4021'), ('name', 'danskefilm animated film ID')]
[('p', 'http://www.wikidata.org/prop/direct/P7679'), ('name', 'Cinépolis KLIC ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1015'), ('name', 'NORAF ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1040'), ('name', 'film editor')]
[('p', 'http://www.wikidata.org/prop/direct/P1237'), ('name', 'Box Office Mojo film ID (former scheme)')]
[('p', 'http://www.wikidata.org/prop/direct/P1258'), ('name', 'Rotten Tomatoes ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1265'), ('name', 'AlloCiné film ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1273'), ('name', 'CANTIC ID')]
[('p', 'http://www.wikidata.org/prop/direct/P136'), ('name', 'genre')]
[('p', 'http://www.wikidata.org/prop/direct/P1411'), ('name', 'nominated for')]
[('p', 'http://www.wikidata.org/prop/direct/P1417'), ('name', 'Encyclopædia B

In [8]:
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   wd:Q36479 ?p ?obj .
   ?obj wdt:P31 ?inst.
   
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('inst', 'http://www.wikidata.org/entity/Q19020'), ('iname', 'Academy Awards')]
[('inst', 'http://www.wikidata.org/entity/Q23817731'), ('iname', 'CNC film classification category')]
[('inst', 'http://www.wikidata.org/entity/Q27915572'), ('iname', 'CNC film classification category')]
[('inst', 'http://www.wikidata.org/entity/Q23830575'), ('iname', 'Filmiroda rating category')]
[('inst', 'http://www.wikidata.org/entity/Q841645'), ('iname', 'Internet television')]
[('inst', 'http://www.wikidata.org/entity/Q23660208'), ('iname', 'MPAA classification category')]
[('inst', 'http://www.wikidata.org/entity/Q28980086'), ('iname', 'RTC film classification category')]
[('inst', 'http://www.wikidata.org/entity/Q15708861'), ('iname', 'Red-billed Hornbill')]
[('inst', 'http://www.wikidata.org/entity/Q4167836'), ('iname', 'Wikimedia category')]
[('inst', 'http://www.wikidata.org/entity/Q482994'), ('iname', 'album')]
[('inst', 'http://www.wikidata.org/entity/Q15711870'), ('iname', 'animated c

In [9]:
# Explore the properties of animated feature film (Q29168811).
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q29168811 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
LIMIT 500
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P461'), ('name', 'opposite of')]
[('p', 'http://www.wikidata.org/prop/direct/P8408'), ('name', 'KBpedia ID')]
[('p', 'http://www.wikidata.org/prop/direct/P910'), ('name', "topic's main category")]
4


In [10]:
# Superclasses (P279) of the objects linked from animated feature film.
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   wd:Q29168811 ?p ?obj .
   ?obj wdt:P279 ?inst.
   
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('inst', 'http://www.wikidata.org/entity/Q202866'), ('iname', 'animated film')]
[('inst', 'http://www.wikidata.org/entity/Q11424'), ('iname', 'film')]
[('inst', 'http://www.wikidata.org/entity/Q24862'), ('iname', 'short film')]
3


Final query for this task

In [11]:
# The previous query gives the BGP for film (Q11424), found among the superclasses reachable from animated feature film.
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   wd:Q29168811 ?p ?obj .
   ?obj wdt:P279 ?inst.
   
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('inst', 'http://www.wikidata.org/entity/Q202866'), ('iname', 'animated film')]
[('inst', 'http://www.wikidata.org/entity/Q11424'), ('iname', 'film')]
[('inst', 'http://www.wikidata.org/entity/Q24862'), ('iname', 'short film')]
3


## Task 2
Identify the BGP for Netflix

In [14]:
# Netflix is a distribution company, so look for the distributor property and then for the companies it links to.
# Start from the Netflix ID property (P1874) of The Lion King and see what it contains.
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   wd:Q36479 ?p ?obj .
   ?obj wdt:P1874 ?inst.
   
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)


Results
Empty


Final query for this task

In [15]:
# Keep exploring properties: look at the values of "distributed by" (P750).
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   wd:Q36479 ?p ?obj .
   ?obj wdt:P750 ?inst.
   
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)


Results
[('inst', 'http://www.wikidata.org/entity/Q54958752'), ('iname', 'Disney+')]
[('inst', 'http://www.wikidata.org/entity/Q1097348'), ('iname', 'InterCom')]
[('inst', 'http://www.wikidata.org/entity/Q135288'), ('iname', 'Microsoft Store')]
[('inst', 'http://www.wikidata.org/entity/Q907311'), ('iname', 'Netflix')]
[('inst', 'http://www.wikidata.org/entity/Q1323594'), ('iname', 'Walt Disney Studios Motion Pictures')]
5


## Task 3
Identify the BGP for television production company

In [17]:
# List the properties of Netflix (Q907311).
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q907311 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
ORDER BY ?iname
LIMIT 500
"""

print("Results")
x=run_query(queryString)


Results
[('p', 'http://www.wikidata.org/prop/direct/P8939'), ('name', 'Google Play developer slug')]
[('p', 'http://www.wikidata.org/prop/direct/P1056'), ('name', 'product or material produced')]
[('p', 'http://www.wikidata.org/prop/direct/P112'), ('name', 'founded by')]
[('p', 'http://www.wikidata.org/prop/direct/P1128'), ('name', 'employees')]
[('p', 'http://www.wikidata.org/prop/direct/P127'), ('name', 'owned by')]
[('p', 'http://www.wikidata.org/prop/direct/P1411'), ('name', 'nominated for')]
[('p', 'http://www.wikidata.org/prop/direct/P1417'), ('name', 'Encyclopædia Britannica Online ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1424'), ('name', "topic's main template")]
[('p', 'http://www.wikidata.org/prop/direct/P1454'), ('name', 'legal form')]
[('p', 'http://www.wikidata.org/prop/direct/P154'), ('name', 'logo image')]
[('p', 'http://www.wikidata.org/prop/direct/P1581'), ('name', 'official blog')]
[('p', 'http://www.wikidata.org/prop/direct/P159'), ('name', 'headquarters lo

In [18]:
# Classes (P31) of the objects linked from Netflix.
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   wd:Q907311 ?p ?obj .
   ?obj wdt:P31 ?inst.
   
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [25]:
# Try another path: from the properties of The Lion King, look at the values of "production company" (P272).
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   wd:Q36479 ?p ?obj .
   ?obj wdt:P272 ?inst.
   
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('inst', 'http://www.wikidata.org/entity/Q2081339'), ('iname', 'Disneytoon Studios')]
[('inst', 'http://www.wikidata.org/entity/Q3866664'), ('iname', 'Moving Picture Company')]
[('inst', 'http://www.wikidata.org/entity/Q7414'), ('iname', 'The Walt Disney Company')]
[('inst', 'http://www.wikidata.org/entity/Q191224'), ('iname', 'Walt Disney Pictures')]
4


In [26]:
# Explore the properties of Walt Disney Pictures (Q191224).
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q191224 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
ORDER BY ?iname
LIMIT 500
"""

print("Results")
x=run_query(queryString)


Results
[('p', 'http://www.wikidata.org/prop/direct/P1056'), ('name', 'product or material produced')]
[('p', 'http://www.wikidata.org/prop/direct/P112'), ('name', 'founded by')]
[('p', 'http://www.wikidata.org/prop/direct/P127'), ('name', 'owned by')]
[('p', 'http://www.wikidata.org/prop/direct/P1454'), ('name', 'legal form')]
[('p', 'http://www.wikidata.org/prop/direct/P154'), ('name', 'logo image')]
[('p', 'http://www.wikidata.org/prop/direct/P159'), ('name', 'headquarters location')]
[('p', 'http://www.wikidata.org/prop/direct/P17'), ('name', 'country')]
[('p', 'http://www.wikidata.org/prop/direct/P1830'), ('name', 'owner of')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from')]
[('p', 'http://www.wikidata.org/prop/direct/P213'), ('name', 'ISNI')]
[('p', 'http://www.wikidata.org/prop/direct/P214'), ('name', 'VIAF ID')]
[('p', 'http://www.wikidata.org/prop/direct/P244'), ('name', 'Library of Congress authority ID')]
[('p', 'http://www.wikidata.org/prop/d

In [27]:
# Classes (P31) of the objects linked from Walt Disney Pictures.
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   wd:Q191224 ?p ?obj .
   ?obj wdt:P31 ?inst.
   
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('inst', 'http://www.wikidata.org/entity/Q35657'), ('iname', 'U.S. state')]
[('inst', 'http://www.wikidata.org/entity/Q4167836'), ('iname', 'Wikimedia category')]
[('inst', 'http://www.wikidata.org/entity/Q1107679'), ('iname', 'animation studio')]
[('inst', 'http://www.wikidata.org/entity/Q1549591'), ('iname', 'big city')]
[('inst', 'http://www.wikidata.org/entity/Q4830453'), ('iname', 'business')]
[('inst', 'http://www.wikidata.org/entity/Q13218391'), ('iname', 'charter city')]
[('inst', 'http://www.wikidata.org/entity/Q1093829'), ('iname', 'city of the United States')]
[('inst', 'http://www.wikidata.org/entity/Q21980538'), ('iname', 'commercial organization')]
[('inst', 'http://www.wikidata.org/entity/Q1520223'), ('iname', 'constitutional republic')]
[('inst', 'http://www.wikidata.org/entity/Q7540587'), ('iname', 'corporate headquarters')]
[('inst', 'http://www.wikidata.org/entity/Q6256'), ('iname', 'country')]
[('inst', 'http://www.wikidata.org/entity/Q5255892'), ('iname', 

In [28]:
# Explore the properties of film production company (Q1762059).
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q1762059 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
ORDER BY ?iname
LIMIT 500
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from')]
[('p', 'http://www.wikidata.org/prop/direct/P227'), ('name', 'GND ID')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('name', 'part of')]
[('p', 'http://www.wikidata.org/prop/direct/P373'), ('name', 'Commons category')]
[('p', 'http://www.wikidata.org/prop/direct/P452'), ('name', 'industry')]
[('p', 'http://www.wikidata.org/prop/direct/P646'), ('name', 'Freebase ID')]
[('p', 'http://www.wikidata.org/prop/direct/P8408'), ('name', 'KBpedia ID')]
[('p', 'http://www.wikidata.org/prop/direct/P910'), ('name', "topic's main category")]
9


In [29]:
# Superclasses (P279) of the objects linked from film production company.
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   wd:Q1762059 ?p ?obj .
   ?obj wdt:P279 ?inst.
   
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('inst', 'http://www.wikidata.org/entity/Q20897549'), ('iname', 'art institution')]
[('inst', 'http://www.wikidata.org/entity/Q783794'), ('iname', 'company')]
[('inst', 'http://www.wikidata.org/entity/Q3152824'), ('iname', 'cultural institution')]
[('inst', 'http://www.wikidata.org/entity/Q19595701'), ('iname', 'entertainment industry')]
[('inst', 'http://www.wikidata.org/entity/Q104649845'), ('iname', 'film organization')]
[('inst', 'http://www.wikidata.org/entity/Q1331793'), ('iname', 'media company')]
[('inst', 'http://www.wikidata.org/entity/Q19364326'), ('iname', 'media studio facility')]
7


In [31]:
# List all the direct subclasses of film production company.
queryString = """

SELECT DISTINCT ?production ?productionName 
WHERE {

   #bind something 
    ?production wdt:P279 wd:Q1762059 .  
    
   # get the label 
    ?production sc:name  ?productionName . 
   
}

LIMIT 100


"""

print("Results")
x=run_query(queryString)


Results
[('production', 'http://www.wikidata.org/entity/Q10689397'), ('productionName', 'television production company')]
[('production', 'http://www.wikidata.org/entity/Q735427'), ('productionName', 'Poverty Row')]
2


Final query for this task

In [32]:
# The previous query gives the BGP for television production company (Q10689397).
queryString = """

SELECT DISTINCT ?production ?productionName 
WHERE {

   #bind something 
    ?production wdt:P279 wd:Q1762059 . 
    
   # get the label 
    ?production sc:name  ?productionName . 
   
}

LIMIT 100


"""

print("Results")
x=run_query(queryString)

Results
[('production', 'http://www.wikidata.org/entity/Q10689397'), ('productionName', 'television production company')]
[('production', 'http://www.wikidata.org/entity/Q735427'), ('productionName', 'Poverty Row')]
2


## Task 4
Identify the BGP for film production company

In [33]:
# While searching for television production company, film production company appeared among the classes reachable from Walt Disney Pictures.
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   wd:Q191224 ?p ?obj . # Walt Disney Pictures
   ?obj wdt:P31 ?inst.
   
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)



Results
[('inst', 'http://www.wikidata.org/entity/Q35657'), ('iname', 'U.S. state')]
[('inst', 'http://www.wikidata.org/entity/Q4167836'), ('iname', 'Wikimedia category')]
[('inst', 'http://www.wikidata.org/entity/Q1107679'), ('iname', 'animation studio')]
[('inst', 'http://www.wikidata.org/entity/Q1549591'), ('iname', 'big city')]
[('inst', 'http://www.wikidata.org/entity/Q4830453'), ('iname', 'business')]
[('inst', 'http://www.wikidata.org/entity/Q13218391'), ('iname', 'charter city')]
[('inst', 'http://www.wikidata.org/entity/Q1093829'), ('iname', 'city of the United States')]
[('inst', 'http://www.wikidata.org/entity/Q21980538'), ('iname', 'commercial organization')]
[('inst', 'http://www.wikidata.org/entity/Q1520223'), ('iname', 'constitutional republic')]
[('inst', 'http://www.wikidata.org/entity/Q7540587'), ('iname', 'corporate headquarters')]
[('inst', 'http://www.wikidata.org/entity/Q6256'), ('iname', 'country')]
[('inst', 'http://www.wikidata.org/entity/Q5255892'), ('iname', 

Final query for this task

In [36]:
# Final query: restrict the classes found above to film production company (Q1762059).
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   wd:Q191224 ?p ?obj .
   ?obj wdt:P31 ?inst.
   
   VALUES ?inst {wd:Q1762059}
   
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('inst', 'http://www.wikidata.org/entity/Q1762059'), ('iname', 'film production company')]
1


## Task 5
Find the top-5 companies (among television production companies and film production companies) that produced the highest number of crime films (the result set must be a list of triples: company IRI, label and #crime films).
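
One point worth checking in this task is how films are counted. A company that is an instance of both classes matches the type filter twice, so counting result rows may count the same film twice, while counting distinct films per company does not. The sketch below illustrates the difference on the client side; `top_k_companies` and the toy rows are hypothetical and are not taken from Wikidata:

```python
from collections import defaultdict


def top_k_companies(rows, k=5):
    """Return the k companies with the most distinct films.

    `rows` is an iterable of (company, film, company_type) tuples, one per
    query solution. Ties are broken by company name to keep the order stable.
    """
    films_by_company = defaultdict(set)
    for company, film, _company_type in rows:
        films_by_company[company].add(film)  # the set drops repeated films
    counts = [(company, len(films)) for company, films in films_by_company.items()]
    counts.sort(key=lambda item: (-item[1], item[0]))
    return counts[:k]


# Hypothetical toy data: company A has both types, so film f1 appears twice.
toy_rows = [
    ("A", "f1", "television production company"),
    ("A", "f1", "film production company"),
    ("A", "f2", "film production company"),
    ("B", "f3", "film production company"),
]
print(top_k_companies(toy_rows, k=5))
```

In SPARQL, the corresponding choice is between `COUNT(*)` and `COUNT(DISTINCT ?movie)`.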

In [37]:
# First, find the identifier for crime film.
# Start from the genre property (P136) of The Lion King.
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   wd:Q36479 ?p ?obj .
   ?obj wdt:P136 ?inst.
   
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)


Results
[('inst', 'http://www.wikidata.org/entity/Q5073733'), ('iname', 'Character comedy')]
[('inst', 'http://www.wikidata.org/entity/Q1152184'), ('iname', 'Shakespearean comedy')]
[('inst', 'http://www.wikidata.org/entity/Q378988'), ('iname', 'adult contemporary music')]
[('inst', 'http://www.wikidata.org/entity/Q319221'), ('iname', 'adventure film')]
[('inst', 'http://www.wikidata.org/entity/Q53094'), ('iname', 'black comedy')]
[('inst', 'http://www.wikidata.org/entity/Q40831'), ('iname', 'comedy')]
[('inst', 'http://www.wikidata.org/entity/Q157443'), ('iname', 'comedy film')]
[('inst', 'http://www.wikidata.org/entity/Q2975633'), ('iname', 'coming-of-age story')]
[('inst', 'http://www.wikidata.org/entity/Q850412'), ('iname', 'contemporary R&B')]
[('inst', 'http://www.wikidata.org/entity/Q1616750'), ('iname', 'deadpan')]
[('inst', 'http://www.wikidata.org/entity/Q130232'), ('iname', 'drama')]
[('inst', 'http://www.wikidata.org/entity/Q1361932'), ('iname', 'family film')]
[('inst', 'h

In [43]:
# Classes (P31) of the objects linked from musical (Q2743).
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   wd:Q2743 ?p ?obj .
   ?obj wdt:P31 ?inst.
   
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('inst', 'http://www.wikidata.org/entity/Q19361238'), ('iname', 'Wikidata metaclass')]
[('inst', 'http://www.wikidata.org/entity/Q4167836'), ('iname', 'Wikimedia category')]
[('inst', 'http://www.wikidata.org/entity/Q19887878'), ('iname', 'Wikimedia infobox template')]
[('inst', 'http://www.wikidata.org/entity/Q33532284'), ('iname', 'Wikimedia list of lists')]
[('inst', 'http://www.wikidata.org/entity/Q151885'), ('iname', 'concept')]
[('inst', 'http://www.wikidata.org/entity/Q5107'), ('iname', 'continent')]
[('inst', 'http://www.wikidata.org/entity/Q201658'), ('iname', 'film genre')]
[('inst', 'http://www.wikidata.org/entity/Q4263830'), ('iname', 'literary form')]
[('inst', 'http://www.wikidata.org/entity/Q223393'), ('iname', 'literary genre')]
[('inst', 'http://www.wikidata.org/entity/Q19478619'), ('iname', 'metaclass')]
[('inst', 'http://www.wikidata.org/entity/Q188451'), ('iname', 'music genre')]
[('inst', 'http://www.wikidata.org/entity/Q1239328'), ('iname', 'national ency

In [44]:
# Get all the instances of film genre (Q201658).
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   ?inst wdt:P31 wd:Q201658.
   
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)


Results
[('inst', 'http://www.wikidata.org/entity/Q4674071'), ('iname', 'Acid Western')]
[('inst', 'http://www.wikidata.org/entity/Q108179271'), ('iname', 'Arthouse animation')]
[('inst', 'http://www.wikidata.org/entity/Q223770'), ('iname', 'B movie')]
[('inst', 'http://www.wikidata.org/entity/Q4836991'), ('iname', 'B western')]
[('inst', 'http://www.wikidata.org/entity/Q3461143'), ('iname', 'BDSM film')]
[('inst', 'http://www.wikidata.org/entity/Q31086090'), ('iname', 'Barbarian film')]
[('inst', 'http://www.wikidata.org/entity/Q84041'), ('iname', 'Bavarian porn')]
[('inst', 'http://www.wikidata.org/entity/Q7717463'), ('iname', 'Bible film')]
[('inst', 'http://www.wikidata.org/entity/Q48816645'), ('iname', 'Black film')]
[('inst', 'http://www.wikidata.org/entity/Q93196'), ('iname', 'Bollywood')]
[('inst', 'http://www.wikidata.org/entity/Q16247268'), ('iname', 'Bollywood horror film')]
[('inst', 'http://www.wikidata.org/entity/Q4950139'), ('iname', 'Bourekas film')]
[('inst', 'http://w

In [46]:
# With crime film (Q959790) identified, get all the films whose genre (P136) is crime film.
queryString = """
SELECT DISTINCT ?movie ?movieName 
WHERE {
   # bind something
   ?movie wdt:P136 wd:Q959790.
   
   # get the label
   ?movie sc:name ?movieName.
}
ORDER BY ?movieName
LIMIT 200
"""

print("Results")
x=run_query(queryString)


Results
[('movie', 'http://www.wikidata.org/entity/Q16691116'), ('movieName', '"Dream" Agency')]
[('movie', 'http://www.wikidata.org/entity/Q99644282'), ('movieName', '#martyisdead')]
[('movie', 'http://www.wikidata.org/entity/Q2966024'), ('movieName', "'R Xmas")]
[('movie', 'http://www.wikidata.org/entity/Q426517'), ('movieName', '...And Justice for All')]
[('movie', 'http://www.wikidata.org/entity/Q4545163'), ('movieName', '...tick...tick...tick...')]
[('movie', 'http://www.wikidata.org/entity/Q1537986'), ('movieName', '.45')]
[('movie', 'http://www.wikidata.org/entity/Q4545428'), ('movieName', '.com for Murder')]
[('movie', 'http://www.wikidata.org/entity/Q27588607'), ('movieName', '1 + 3 = 4')]
[('movie', 'http://www.wikidata.org/entity/Q4595935'), ('movieName', '1 Day')]
[('movie', 'http://www.wikidata.org/entity/Q2806572'), ('movieName', '1-Ichi')]
[('movie', 'http://www.wikidata.org/entity/Q27890273'), ('movieName', '10 Kalpanakal')]
[('movie', 'http://www.wikidata.org/entity/Q2

In [49]:
# Also retrieve the production company (P272) of each film.
queryString = """
SELECT DISTINCT ?movie ?movieName ?company ?companyName
WHERE {
   # bind something
   ?movie wdt:P136 wd:Q959790.
   ?movie wdt:P272 ?company. 
   
   # get the label
   ?movie sc:name ?movieName.
   ?company sc:name ?companyName .
}
ORDER BY ?companyName
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('movie', 'http://www.wikidata.org/entity/Q705232'), ('movieName', 'Arsène Lupin'), ('company', 'http://www.wikidata.org/entity/Q1554656'), ('companyName', '01 Distribution')]
[('movie', 'http://www.wikidata.org/entity/Q3233538'), ('movieName', 'Rivals'), ('company', 'http://www.wikidata.org/entity/Q13515711'), ('companyName', '13ème Rue Universal')]
[('movie', 'http://www.wikidata.org/entity/Q2296287'), ('movieName', 'The Chicago Code'), ('company', 'http://www.wikidata.org/entity/Q2717612'), ('companyName', '20th Century Fox Television')]
[('movie', 'http://www.wikidata.org/entity/Q11510316'), ('movieName', 'Love Under Fire'), ('company', 'http://www.wikidata.org/entity/Q434841'), ('companyName', '20th Century Studios')]
[('movie', 'http://www.wikidata.org/entity/Q586880'), ('movieName', 'Vanishing Point'), ('company', 'http://www.wikidata.org/entity/Q434841'), ('companyName', '20th Century Studios')]
[('movie', 'http://www.wikidata.org/entity/Q2533178'), ('movieName', 'Dead

In [52]:
# Keep only the companies that are instances of television production company or film production company.
queryString = """
SELECT DISTINCT ?movie ?movieName ?company ?companyName
WHERE {
   # bind something
   ?movie wdt:P136 wd:Q959790.
   ?movie wdt:P272 ?company. 
   ?company wdt:P31 ?type .
            #Television production company    #film production company 
   VALUES ?type {wd:Q10689397                    wd:Q1762059}
   # get the label
   ?movie sc:name ?movieName.
   ?company sc:name ?companyName .
}
ORDER BY ?companyName
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('movie', 'http://www.wikidata.org/entity/Q2296287'), ('movieName', 'The Chicago Code'), ('company', 'http://www.wikidata.org/entity/Q2717612'), ('companyName', '20th Century Fox Television')]
[('movie', 'http://www.wikidata.org/entity/Q11510316'), ('movieName', 'Love Under Fire'), ('company', 'http://www.wikidata.org/entity/Q434841'), ('companyName', '20th Century Studios')]
[('movie', 'http://www.wikidata.org/entity/Q586880'), ('movieName', 'Vanishing Point'), ('company', 'http://www.wikidata.org/entity/Q434841'), ('companyName', '20th Century Studios')]
[('movie', 'http://www.wikidata.org/entity/Q2533178'), ('movieName', 'Deadline – U.S.A.'), ('company', 'http://www.wikidata.org/entity/Q434841'), ('companyName', '20th Century Studios')]
[('movie', 'http://www.wikidata.org/entity/Q23780457'), ('movieName', 'Kingsman: The Golden Circle'), ('company', 'http://www.wikidata.org/entity/Q434841'), ('companyName', '20th Century Studios')]
[('movie', 'http://www.wikidata.org/entity/

Final query for this task

In [59]:
# Finally, count the crime films for each company.
queryString = """
SELECT DISTINCT ?company ?companyName (COUNT(*) AS ?crimefilms)
WHERE {
   # bind something
   ?movie wdt:P136 wd:Q959790.
   ?movie wdt:P272 ?company. 
   ?company wdt:P31 ?type .
            #Television production company    #film production company 
   VALUES ?type {wd:Q10689397                    wd:Q1762059}
   # get the label
   
   ?company sc:name ?companyName .
}
GROUP BY ?company ?companyName
ORDER BY DESC(?crimefilms) # order by ?crimefilms to put the companies with the most films first
LIMIT 5 # only the top 5 are needed
"""

print("Results")
x=run_query(queryString)

Results
[('company', 'http://www.wikidata.org/entity/Q434841'), ('companyName', '20th Century Studios'), ('crimefilms', '180')]
[('company', 'http://www.wikidata.org/entity/Q126399'), ('companyName', 'Warner Bros.'), ('crimefilms', '164')]
[('company', 'http://www.wikidata.org/entity/Q186941'), ('companyName', 'Columbia Pictures'), ('crimefilms', '118')]
[('company', 'http://www.wikidata.org/entity/Q168383'), ('companyName', 'Universal Pictures'), ('crimefilms', '113')]
[('company', 'http://www.wikidata.org/entity/Q179200'), ('companyName', 'Metro-Goldwyn-Mayer'), ('crimefilms', '112')]
5


## Task 6
Find the company (among television production companies and film production companies) that distributed the most films (of any genre) that it did not produce (the result set must be a list of triples: company IRI, label and #films).
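
The condition "distributed but did not produce" amounts to a set difference over (film, company) pairs. The sketch below shows that logic on the client side; `count_distributed_not_produced` and the toy pairs are hypothetical and are not taken from Wikidata:

```python
from collections import Counter


def count_distributed_not_produced(distributed, produced):
    """Count, per company, the films it distributed without producing them.

    `distributed` and `produced` are iterables of (film, company) pairs.
    Repeated pairs are counted once.
    """
    only_distributed = set(distributed) - set(produced)
    return Counter(company for _film, company in only_distributed)


# Hypothetical toy data: X distributes f1, f2 and f3 but produced only f1.
toy_distributed = [("f1", "X"), ("f1", "Y"), ("f2", "X"), ("f3", "X")]
toy_produced = [("f1", "X"), ("f2", "Z")]

counts = count_distributed_not_produced(toy_distributed, toy_produced)
print(counts.most_common(1))
```

In SPARQL the same exclusion could be written with `FILTER NOT EXISTS { ?film wdt:P272 ?distributor }`, assuming `?film wdt:P750 ?distributor` binds the distributor.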

In [61]:
# First, get all the films (instances of Q11424).
queryString = """
SELECT DISTINCT ?film ?filmName
WHERE {
   # bind something
   ?film wdt:P31 wd:Q11424.
   
   
   # get the label
   ?film sc:name ?filmName.
  
}
ORDER BY ?filmName
LIMIT 50
"""

print("Results")
x=run_query(queryString)



Results
[('film', 'http://www.wikidata.org/entity/Q4540204'), ('filmName', '!Women Art Revolution')]
[('film', 'http://www.wikidata.org/entity/Q4032288'), ('filmName', '"713" Requests Permission to Land')]
[('film', 'http://www.wikidata.org/entity/Q93765322'), ('filmName', '"By the railroad" film')]
[('film', 'http://www.wikidata.org/entity/Q615254'), ('filmName', '"Crocodile" Dundee')]
[('film', 'http://www.wikidata.org/entity/Q1140745'), ('filmName', '"Crocodile" Dundee II')]
[('film', 'http://www.wikidata.org/entity/Q16691116'), ('filmName', '"Dream" Agency')]
[('film', 'http://www.wikidata.org/entity/Q3596104'), ('filmName', '"FF.SS." – Cioè: "...che mi hai portato a fare sopra a Posillipo se non mi vuoi più bene?"')]
[('film', 'http://www.wikidata.org/entity/Q4540285'), ('filmName', '"Good Hair" and Other Dubious Distinctions')]
[('film', 'http://www.wikidata.org/entity/Q93764465'), ('filmName', '"Interrogation", film, 1982')]
[('film', 'http://www.wikidata.org/entity/Q821560'), (

In [68]:
# Now get the companies that distributed them (P750).
queryString = """
SELECT DISTINCT ?filmName ?distributorName 
WHERE {

   # bind something
   ?film wdt:P31 wd:Q11424.
   ?film wdt:P750 ?distributor.
   
  
   # get the label
   ?film sc:name ?filmName.
    ?distributor sc:name ?distributorName .
    
}
ORDER BY ?filmName ?distributorName
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('filmName', '!Women Art Revolution'), ('distributorName', 'Zeitgeist Films')]
[('filmName', '"Crocodile" Dundee'), ('distributorName', 'FandangoNow')]
[('filmName', '"Crocodile" Dundee'), ('distributorName', 'Netflix')]
[('filmName', '"Crocodile" Dundee'), ('distributorName', 'Paramount Pictures')]
[('filmName', '"Crocodile" Dundee II'), ('distributorName', 'FandangoNow')]
[('filmName', '"Crocodile" Dundee II'), ('distributorName', 'Netflix')]
[('filmName', '"Crocodile" Dundee II'), ('distributorName', 'Paramount Pictures')]
[('filmName', '"Weird Al" Yankovic Live!'), ('distributorName', 'Volcano')]
[('filmName', '"Weird Al" Yankovic Live!: The Alpocalypse Tour'), ('distributorName', 'Comedy Central')]
[('filmName', '"Weird Al" Yankovic: The Ultimate Video Collection'), ('distributorName', 'Volcano')]
[('filmName', '#1 Cheerleader Camp'), ('distributorName', 'Netflix')]
[('filmName', '#1 Cheerleader Camp'), ('distributorName', 'The Asylum')]
[('filmName', '#Alive'), ('distrib

In [65]:
# Now I also get the companies that produced them.
queryString = """
SELECT DISTINCT ?filmName ?distributorName ?producerName
WHERE {

   # bind something
   ?film wdt:P31 wd:Q11424.
   ?film wdt:P750 ?distributor.
   ?film wdt:P272 ?producer .

   # get the label
   ?film sc:name ?filmName.
    ?distributor sc:name ?distributorName .
    ?producer sc:name ?producerName .
}
ORDER BY ?filmName ?distributorName ?producerName
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('filmName', '"Crocodile" Dundee'), ('distributorName', 'FandangoNow'), ('producerName', 'Paramount Pictures')]
[('filmName', '"Crocodile" Dundee'), ('distributorName', 'Netflix'), ('producerName', 'Paramount Pictures')]
[('filmName', '"Crocodile" Dundee'), ('distributorName', 'Paramount Pictures'), ('producerName', 'Paramount Pictures')]
[('filmName', '"Crocodile" Dundee II'), ('distributorName', 'FandangoNow'), ('producerName', 'Paramount Pictures')]
[('filmName', '"Crocodile" Dundee II'), ('distributorName', 'Netflix'), ('producerName', 'Paramount Pictures')]
[('filmName', '"Crocodile" Dundee II'), ('distributorName', 'Paramount Pictures'), ('producerName', 'Paramount Pictures')]
[('filmName', '#1 Cheerleader Camp'), ('distributorName', 'Netflix'), ('producerName', 'The Asylum')]
[('filmName', '#1 Cheerleader Camp'), ('distributorName', 'The Asylum'), ('producerName', 'The Asylum')]
[('filmName', "'Neath the Arizona Skies"), ('distributorName', 'Monogram Pictures'), ('produ

In [70]:
# Now I keep only the distributors that distributed the movies but did not produce them.
queryString = """
SELECT DISTINCT ?filmName ?distributorName ?producerName
WHERE {

   # bind something
   ?film wdt:P31 wd:Q11424.
   ?film wdt:P750 ?distributor.
   ?film wdt:P272 ?producer .
   
   FILTER (?distributor != ?producer) #The values of the producer and the distributor can't be the same.
   # get the label
   ?film sc:name ?filmName.
    ?distributor sc:name ?distributorName .
    ?producer sc:name ?producerName .
}
ORDER BY ?filmName ?distributorName ?producerName
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('filmName', '"Crocodile" Dundee'), ('distributorName', 'FandangoNow'), ('producerName', 'Paramount Pictures')]
[('filmName', '"Crocodile" Dundee'), ('distributorName', 'Netflix'), ('producerName', 'Paramount Pictures')]
[('filmName', '"Crocodile" Dundee II'), ('distributorName', 'FandangoNow'), ('producerName', 'Paramount Pictures')]
[('filmName', '"Crocodile" Dundee II'), ('distributorName', 'Netflix'), ('producerName', 'Paramount Pictures')]
[('filmName', '#1 Cheerleader Camp'), ('distributorName', 'Netflix'), ('producerName', 'The Asylum')]
[('filmName', "'Neath the Arizona Skies"), ('distributorName', 'Monogram Pictures'), ('producerName', 'Lone Star Productions')]
[('filmName', "'Neath the Arizona Skies"), ('distributorName', 'Netflix'), ('producerName', 'Lone Star Productions')]
[('filmName', '-30-'), ('distributorName', 'Warner Bros.'), ('producerName', 'Mark VII Limited')]
[('filmName', '...And Justice for All'), ('distributorName', 'Netflix'), ('producerName', 'Colum
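
`FILTER (?distributor != ?producer)` is evaluated once per result row, so a film with several producers can still match through a producer other than the distributor. A minimal sketch in plain Python, on hypothetical toy data (film `F1` and companies `A`, `B`, `C` are made up), contrasting that row-wise filter with the stricter reading "the distributor is not among the film's producers". In SPARQL the stricter reading would correspond to `FILTER NOT EXISTS { ?film wdt:P272 ?distributor }`.

```python
# Hypothetical toy graph: F1 is distributed by A and C, and produced by A and B.
distributed_by = {("F1", "A"), ("F1", "C")}
produced_by = {("F1", "A"), ("F1", "B")}

# Row-wise filter, like FILTER (?distributor != ?producer):
# keep a (film, distributor) pair if at least one joined row has a different producer.
pairwise = {
    (film, dist)
    for film, dist in distributed_by
    for film2, prod in produced_by
    if film == film2 and dist != prod
}

# Stricter filter, like FILTER NOT EXISTS { ?film wdt:P272 ?distributor }:
# keep a (film, distributor) pair only if that distributor is not a producer of the film.
not_produced = {
    (film, dist) for film, dist in distributed_by if (film, dist) not in produced_by
}

print(sorted(pairwise))      # A survives because of the row whose producer is B
print(sorted(not_produced))  # only C remains
```

Which reading the task intends is a judgment call; the sketch only shows that the two filters can disagree when a film has more than one producer.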

In [71]:
# Now I keep only the companies that are television production companies or film production companies
queryString = """
SELECT DISTINCT ?filmName ?distributorName ?producerName
WHERE {

   # bind something
   ?film wdt:P31 wd:Q11424.
   ?film wdt:P750 ?distributor.
   
   ?distributor wdt:P31 ?disType. # the class the distributor is an instance of
   
   ?film wdt:P272 ?producer .
   ?producer wdt:P31 ?proType . # the class the producer is an instance of
   
                   #Television production company    #film production company 
   VALUES ?disType {wd:Q10689397                    wd:Q1762059} 
   VALUES ?proType {wd:Q10689397                    wd:Q1762059}
   
   
   FILTER (?distributor != ?producer) #The values of the producer and the distributor can't be the same. 
   
   
   # get the label
   ?film sc:name ?filmName.
    ?distributor sc:name ?distributorName .
    ?producer sc:name ?producerName .
}
ORDER BY ?filmName ?distributorName ?producerName
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('filmName', '"Crocodile" Dundee'), ('distributorName', 'Netflix'), ('producerName', 'Paramount Pictures')]
[('filmName', '"Crocodile" Dundee II'), ('distributorName', 'Netflix'), ('producerName', 'Paramount Pictures')]
[('filmName', '#1 Cheerleader Camp'), ('distributorName', 'Netflix'), ('producerName', 'The Asylum')]
[('filmName', '...And Justice for All'), ('distributorName', 'Netflix'), ('producerName', 'Columbia Pictures')]
[('filmName', '1, 2, 3, Sun'), ('distributorName', 'Netflix'), ('producerName', 'Gaumont Film Company')]
[('filmName', '10'), ('distributorName', 'Netflix'), ('producerName', 'Orion Pictures')]
[('filmName', '10'), ('distributorName', 'Warner Bros.'), ('producerName', 'Orion Pictures')]
[('filmName', '10 Cloverfield Lane'), ('distributorName', 'Netflix'), ('producerName', 'Bad Robot Productions')]
[('filmName', '10 Cloverfield Lane'), ('distributorName', 'Netflix'), ('producerName', 'Paramount Pictures')]
[('filmName', '10 Cloverfield Lane'), ('distri

Final query for this task

In [76]:
# The last thing to do is to get the number of films for each distributor and show only the one with the most films.

queryString = """
SELECT DISTINCT ?distributor ?distributorName (COUNT(*) AS ?films)
WHERE {

   # bind something
   ?film wdt:P31 wd:Q11424.
   ?film wdt:P750 ?distributor.
   
   ?distributor wdt:P31 ?disType. # the class the distributor is an instance of
   
   ?film wdt:P272 ?producer .
   ?producer wdt:P31 ?proType . # the class the producer is an instance of
   
                   #Television production company    #film production company 
   VALUES ?disType {wd:Q10689397                    wd:Q1762059} 
   VALUES ?proType {wd:Q10689397                    wd:Q1762059}
   
   
   FILTER (?distributor != ?producer) #The values of the producer and the distributor can't be the same. 
   
   
   # get the label
   ?film sc:name ?filmName.
    ?distributor sc:name ?distributorName .
    ?producer sc:name ?producerName .
}
GROUP BY ?distributor ?distributorName
ORDER BY DESC (?films) 
LIMIT 1
"""

print("Results")
x=run_query(queryString)

Results
[('distributor', 'http://www.wikidata.org/entity/Q907311'), ('distributorName', 'Netflix'), ('films', '10756')]
1
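
`COUNT(*)` counts solution rows, and a film contributes one row per matching producer, so the figure can exceed the number of distinct films. A minimal sketch in plain Python, using the nine complete rows printed by the `In [71]` cell, that contrasts a row count with a distinct-film count per distributor. In SPARQL the second would correspond to `COUNT(DISTINCT ?film)`. Film names stand in for film IRIs here, an assumption that is only safe for this small sample.

```python
from collections import Counter, defaultdict

# (filmName, distributorName, producerName) rows copied from the In [71] output.
rows = [
    ('"Crocodile" Dundee', "Netflix", "Paramount Pictures"),
    ('"Crocodile" Dundee II', "Netflix", "Paramount Pictures"),
    ("#1 Cheerleader Camp", "Netflix", "The Asylum"),
    ("...And Justice for All", "Netflix", "Columbia Pictures"),
    ("1, 2, 3, Sun", "Netflix", "Gaumont Film Company"),
    ("10", "Netflix", "Orion Pictures"),
    ("10", "Warner Bros.", "Orion Pictures"),
    ("10 Cloverfield Lane", "Netflix", "Bad Robot Productions"),
    ("10 Cloverfield Lane", "Netflix", "Paramount Pictures"),
]

# Like COUNT(*): one unit per row, so a film with two producers counts twice.
row_counts = Counter(dist for _, dist, _ in rows)

# Like COUNT(DISTINCT ?film): each film counts once per distributor.
films_by_dist = defaultdict(set)
for film, dist, _ in rows:
    films_by_dist[dist].add(film)
film_counts = {dist: len(films) for dist, films in films_by_dist.items()}

print(row_counts["Netflix"], film_counts["Netflix"])
```

In this sample the two counts for Netflix differ by one, because "10 Cloverfield Lane" appears with two producers.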


## Task 7
Find how many companies are listed in every stock exchange market (the result set must be a list of triples: stock exchange IRI, label and #companies).

In [84]:
# I need the identifier for company. I look at the superclasses (P279) of the entities linked to film production company (Q1762059).
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   wd:Q1762059 ?p ?obj .
   ?obj wdt:P279 ?inst.
   
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('inst', 'http://www.wikidata.org/entity/Q20897549'), ('iname', 'art institution')]
[('inst', 'http://www.wikidata.org/entity/Q783794'), ('iname', 'company')]
[('inst', 'http://www.wikidata.org/entity/Q3152824'), ('iname', 'cultural institution')]
[('inst', 'http://www.wikidata.org/entity/Q19595701'), ('iname', 'entertainment industry')]
[('inst', 'http://www.wikidata.org/entity/Q104649845'), ('iname', 'film organization')]
[('inst', 'http://www.wikidata.org/entity/Q1331793'), ('iname', 'media company')]
[('inst', 'http://www.wikidata.org/entity/Q19364326'), ('iname', 'media studio facility')]
7


In [79]:
# Let's look for that property among the properties of the companies, starting with the properties of Netflix.
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q907311 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
ORDER BY DESC (?name)
LIMIT 500
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P2283'), ('name', 'uses')]
[('p', 'http://www.wikidata.org/prop/direct/P2139'), ('name', 'total revenue')]
[('p', 'http://www.wikidata.org/prop/direct/P2403'), ('name', 'total assets')]
[('p', 'http://www.wikidata.org/prop/direct/P1424'), ('name', "topic's main template")]
[('p', 'http://www.wikidata.org/prop/direct/P910'), ('name', "topic's main category")]
[('p', 'http://www.wikidata.org/prop/direct/P355'), ('name', 'subsidiary')]
[('p', 'http://www.wikidata.org/prop/direct/P3984'), ('name', 'subreddit')]
[('p', 'http://www.wikidata.org/prop/direct/P414'), ('name', 'stock exchange')]
[('p', 'http://www.wikidata.org/prop/direct/P8687'), ('name', 'social media followers')]
[('p', 'http://www.wikidata.org/prop/direct/P9138'), ('name', 'service status information URL')]
[('p', 'http://www.wikidata.org/prop/direct/P4354'), ('name', 'search formatter URL')]
[('p', 'http://www.wikidata.org/prop/direct/P1056'), ('name', 'product or material

In [80]:
# Now I get the values of the property stock exchange (P414) on the entities linked to The Lion King (Q36479).
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   wd:Q36479 ?p ?obj .
   ?obj wdt:P414 ?inst.
   
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('inst', 'http://www.wikidata.org/entity/Q82059'), ('iname', 'NASDAQ')]
1


In [9]:
# Let's explore the classes (P31) of the entities linked to NASDAQ (Q82059).
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   wd:Q82059 ?p ?obj .
   ?obj wdt:P31 ?inst.
   
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)


Results
[('inst', 'http://www.wikidata.org/entity/Q202686'), ('iname', 'Commonwealth realm')]
[('inst', 'http://www.wikidata.org/entity/Q4167836'), ('iname', 'Wikimedia category')]
[('inst', 'http://www.wikidata.org/entity/Q11266439'), ('iname', 'Wikimedia template')]
[('inst', 'http://www.wikidata.org/entity/Q1549591'), ('iname', 'big city')]
[('inst', 'http://www.wikidata.org/entity/Q4830453'), ('iname', 'business')]
[('inst', 'http://www.wikidata.org/entity/Q5119'), ('iname', 'capital')]
[('inst', 'http://www.wikidata.org/entity/Q515'), ('iname', 'city')]
[('inst', 'http://www.wikidata.org/entity/Q6256'), ('iname', 'country')]
[('inst', 'http://www.wikidata.org/entity/Q8142'), ('iname', 'currency')]
[('inst', 'http://www.wikidata.org/entity/Q28795800'), ('iname', 'decimal currency')]
[('inst', 'http://www.wikidata.org/entity/Q14083'), ('iname', 'dollar')]
[('inst', 'http://www.wikidata.org/entity/Q223832'), ('iname', 'dominion of the British Empire')]
[('inst', 'http://www.wikidata.

In [24]:
# I get all the companies (instances of company, Q783794) that have the property stock exchange, and also show the stock exchange they are listed on.
queryString = """

SELECT DISTINCT ?company ?companyName ?stock ?stockName
WHERE {

   #bind something
   ?company wdt:P31 wd:Q783794 .
   ?company wdt:P414 ?stock .
   
   # get the label 
   ?company sc:name ?companyName .
   ?stock sc:name ?stockName .
}
ORDER BY ?stockName ?companyName
LIMIT 5
#LIMIT 1000 # I commented this limit out to keep the page short. With it you can see all the companies on NASDAQ, and Netflix is not
            # among them. So this query does not return every company, and therefore probably not every
            # stock exchange market either.

"""

print("Results")
x=run_query(queryString)


Results
[('company', 'http://www.wikidata.org/entity/Q30153680'), ('companyName', 'Orion Land'), ('stock', 'http://www.wikidata.org/entity/Q438511'), ('stockName', 'Alternative Investment Market')]
[('company', 'http://www.wikidata.org/entity/Q104839317'), ('companyName', '1300 SMILES'), ('stock', 'http://www.wikidata.org/entity/Q732670'), ('stockName', 'Australian Securities Exchange')]
[('company', 'http://www.wikidata.org/entity/Q20311412'), ('companyName', 'APA Group'), ('stock', 'http://www.wikidata.org/entity/Q732670'), ('stockName', 'Australian Securities Exchange')]
[('company', 'http://www.wikidata.org/entity/Q55639119'), ('companyName', 'Afterpay'), ('stock', 'http://www.wikidata.org/entity/Q732670'), ('stockName', 'Australian Securities Exchange')]
[('company', 'http://www.wikidata.org/entity/Q67174463'), ('companyName', 'AppsVillage'), ('stock', 'http://www.wikidata.org/entity/Q732670'), ('stockName', 'Australian Securities Exchange')]
5


Netflix is not among the companies listed on NASDAQ, so the last query does not return every company. Instead, I look for every entity that has the property stock exchange (P414).
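
The check described here, whether a known entity shows up in a result set, can also be done in code instead of by reading the printed output. A minimal sketch, assuming rows in the shape returned by `run_query`. `has_value` is a hypothetical helper, the two sample rows are copied from the `In [24]` output, and the Netflix IRI is the one returned by the Task 6 final query.

```python
def has_value(rows, var, value):
    """Return True if any run_query-style row binds `var` to `value`."""
    return any((var, value) in row for row in rows)


NETFLIX = "http://www.wikidata.org/entity/Q907311"

# Two rows copied from the In [24] output.
sample = [
    [("company", "http://www.wikidata.org/entity/Q30153680"),
     ("companyName", "Orion Land"),
     ("stock", "http://www.wikidata.org/entity/Q438511"),
     ("stockName", "Alternative Investment Market")],
    [("company", "http://www.wikidata.org/entity/Q104839317"),
     ("companyName", "1300 SMILES"),
     ("stock", "http://www.wikidata.org/entity/Q732670"),
     ("stockName", "Australian Securities Exchange")],
]

print(has_value(sample, "company", NETFLIX))
```

Run against the full result with `verbose=False`, the same call would answer the question without printing thousands of rows.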

In [23]:
# I get all the companies and the stock exchange each one is listed on.
queryString = """

SELECT DISTINCT ?company ?companyName ?stock ?stockName
WHERE {

   #bind something
   ?company wdt:P414 ?stock.
   
   # get the label 
   ?company sc:name ?companyName .
   ?stock sc:name ?stockName .
}
ORDER BY ?stockName ?companyName 
LIMIT 5
#LIMIT 5000 # I commented this limit out; I set it this high to check whether I finally got all the companies, and it seems so.

"""

print("Results")
x=run_query(queryString)

Results
[('company', 'http://www.wikidata.org/entity/Q4819798'), ('companyName', 'AudioBoom'), ('stock', 'http://www.wikidata.org/entity/Q438511'), ('stockName', 'Alternative Investment Market')]
[('company', 'http://www.wikidata.org/entity/Q17070435'), ('companyName', 'Beowulf Mining'), ('stock', 'http://www.wikidata.org/entity/Q438511'), ('stockName', 'Alternative Investment Market')]
[('company', 'http://www.wikidata.org/entity/Q17008542'), ('companyName', 'Datatec'), ('stock', 'http://www.wikidata.org/entity/Q438511'), ('stockName', 'Alternative Investment Market')]
[('company', 'http://www.wikidata.org/entity/Q50390569'), ('companyName', 'European Metals Holdings'), ('stock', 'http://www.wikidata.org/entity/Q438511'), ('stockName', 'Alternative Investment Market')]
[('company', 'http://www.wikidata.org/entity/Q16984856'), ('companyName', 'Horizon Discovery'), ('stock', 'http://www.wikidata.org/entity/Q438511'), ('stockName', 'Alternative Investment Market')]
5


Final query for this task

In [27]:
# I drop the company columns, keep only the stock exchange, and count the companies listed on each one.
queryString = """

SELECT DISTINCT  ?stock ?stockName (COUNT(*) AS ?companies)
WHERE {

   #bind something
   ?company wdt:P414 ?stock.
   
   # get the label 
   
   ?stock sc:name ?stockName .
}
GROUP BY ?stock ?stockName
ORDER BY ?stockName  
LIMIT 200


"""

print("Results")
x=run_query(queryString)

Results
[('stock', 'http://www.wikidata.org/entity/Q438511'), ('stockName', 'Alternative Investment Market'), ('companies', '8')]
[('stock', 'http://www.wikidata.org/entity/Q11285412'), ('stockName', 'Ambitious'), ('companies', '1')]
[('stock', 'http://www.wikidata.org/entity/Q463881'), ('stockName', 'American depositary receipt'), ('companies', '1')]
[('stock', 'http://www.wikidata.org/entity/Q74039053'), ('stockName', 'Amsterdam Stock Exchange'), ('companies', '1')]
[('stock', 'http://www.wikidata.org/entity/Q683787'), ('stockName', 'Armenia Securities Exchange'), ('companies', '2')]
[('stock', 'http://www.wikidata.org/entity/Q755341'), ('stockName', 'Athens Exchange'), ('companies', '13')]
[('stock', 'http://www.wikidata.org/entity/Q732670'), ('stockName', 'Australian Securities Exchange'), ('companies', '146')]
[('stock', 'http://www.wikidata.org/entity/Q12721279'), ('stockName', 'Australian Stock Exchange'), ('companies', '2')]
[('stock', 'http://www.wikidata.org/entity/Q796297'),
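
The counts come back from the endpoint as strings (for example `'146'`), so ranking them in Python needs an integer conversion first. A minimal sketch, assuming rows in the shape returned by `run_query`, that builds the (IRI, label, #companies) triples the task asks for and sorts them by count. `to_triples` is a hypothetical helper and the four sample rows are copied from the output above.

```python
def to_triples(rows):
    """Turn run_query-style rows into (iri, label, count) triples with an int count."""
    triples = []
    for row in rows:
        binding = dict(row)
        triples.append(
            (binding["stock"], binding["stockName"], int(binding["companies"]))
        )
    return triples


# Four rows copied from the final Task 7 output.
sample = [
    [("stock", "http://www.wikidata.org/entity/Q438511"),
     ("stockName", "Alternative Investment Market"), ("companies", "8")],
    [("stock", "http://www.wikidata.org/entity/Q683787"),
     ("stockName", "Armenia Securities Exchange"), ("companies", "2")],
    [("stock", "http://www.wikidata.org/entity/Q755341"),
     ("stockName", "Athens Exchange"), ("companies", "13")],
    [("stock", "http://www.wikidata.org/entity/Q732670"),
     ("stockName", "Australian Securities Exchange"), ("companies", "146")],
]

# Sort numerically, largest first; sorting the raw strings would put '8' above '146'.
ranked = sorted(to_triples(sample), key=lambda triple: triple[2], reverse=True)
for iri, label, count in ranked:
    print(count, label, iri)
```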

## Task 8
Identify the BGP for Academy Award

In [34]:
# Look at the awards received (P166) by the entities linked to The Lion King (Q36479).
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   wd:Q36479 ?p ?obj .
   ?obj wdt:P166 ?inst.
   
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)


Results
[('inst', 'http://www.wikidata.org/entity/Q52382875'), ('iname', 'AAAS Fellow')]
[('inst', 'http://www.wikidata.org/entity/Q103916'), ('iname', 'Academy Award for Best Actor')]
[('inst', 'http://www.wikidata.org/entity/Q106800'), ('iname', 'Academy Award for Best Animated Feature')]
[('inst', 'http://www.wikidata.org/entity/Q488651'), ('iname', 'Academy Award for Best Original Score')]
[('inst', 'http://www.wikidata.org/entity/Q112243'), ('iname', 'Academy Award for Best Original Song')]
[('inst', 'http://www.wikidata.org/entity/Q106301'), ('iname', 'Academy Award for Best Supporting Actress')]
[('inst', 'http://www.wikidata.org/entity/Q19020'), ('iname', 'Academy Awards')]
[('inst', 'http://www.wikidata.org/entity/Q727328'), ('iname', 'Academy Honorary Award')]
[('inst', 'http://www.wikidata.org/entity/Q8610'), ('iname', 'Academy Scientific and Technical Award')]
[('inst', 'http://www.wikidata.org/entity/Q566905'), ('iname', 'Annie Award')]
[('inst', 'http://www.wikidata.org/e
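
Several cells in this task and the previous one share one exploratory pattern: start from a known entity, follow any property to its neighbours, and list the values those neighbours have for one chosen property. A minimal sketch of a helper that builds that query text. `neighbour_values_query` is a hypothetical name, and the returned string is meant to be passed to the notebook's `run_query`, which adds the prefixes.

```python
def neighbour_values_query(entity, prop, limit=200):
    """Build the 'values of `prop` on the neighbours of `entity`' query text."""
    return f"""
SELECT DISTINCT ?inst ?iname
WHERE {{
   # follow any property from the start entity
   wd:{entity} ?p ?obj .
   ?obj wdt:{prop} ?inst .

   # get the label
   ?inst sc:name ?iname .
}}
ORDER BY ?iname
LIMIT {limit}
"""


# Same shape as the cell above: awards received (P166) around The Lion King (Q36479).
query = neighbour_values_query("Q36479", "P166")
print(query)
```

Calling it with `("Q36479", "P414")` or `("Q82059", "P31")` would reproduce the text of the stock exchange and NASDAQ cells in Task 7.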

Final query for this task

In [3]:
# I see Academy Awards (Q19020) in the list of awards received by the entities linked to The Lion King.
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   wd:Q36479 ?p ?obj .
   ?obj wdt:P166 ?inst.
   
   VALUES ?inst {wd:Q19020}
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)


Results
[('inst', 'http://www.wikidata.org/entity/Q19020'), ('iname', 'Academy Awards')]
1


## Task 9
Find the companies (among television and film production companies) that won at least 5 Academy Awards for Best Actress for the movies they produced (the result set must be a list of triples company IRI, label and #awards).

In [31]:
# First I'll look for the films that won the Academy Award for Best Actress, but to do that I need the identifier for the Academy Award for
# Best Actress.
queryString = """
SELECT DISTINCT ?inst ?iname 
WHERE {
   # bind something
   ?inst wdt:P31 wd:Q19020 .
   
   
   # get the label
   ?inst sc:name ?iname.
}
ORDER BY ?iname
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('inst', 'http://www.wikidata.org/entity/Q103916'), ('iname', 'Academy Award for Best Actor')]
[('inst', 'http://www.wikidata.org/entity/Q103618'), ('iname', 'Academy Award for Best Actress')]
[('inst', 'http://www.wikidata.org/entity/Q106800'), ('iname', 'Academy Award for Best Animated Feature')]
[('inst', 'http://www.wikidata.org/entity/Q917808'), ('iname', 'Academy Award for Best Animated Short Film')]
[('inst', 'http://www.wikidata.org/entity/Q22253133'), ('iname', 'Academy Award for Best Art Direction, Black and White')]
[('inst', 'http://www.wikidata.org/entity/Q22253131'), ('iname', 'Academy Award for Best Art Direction, Color')]
[('inst', 'http://www.wikidata.org/entity/Q501004'), ('iname', 'Academy Award for Best Assistant Director')]
[('inst', 'http://www.wikidata.org/entity/Q131520'), ('iname', 'Academy Award for Best Cinematography')]
[('inst', 'http://www.wikidata.org/entity/Q21995139'), ('iname', 'Academy Award for Best Cinematography, Black-and-White')]
[('inst

In [4]:
#Now I get all the films that won the Academy Award for best actress Q103618
queryString = """
SELECT DISTINCT ?film ?filmName 
WHERE {

   # bind something
   ?film wdt:P31 wd:Q11424 . #I get all the films. 
   ?film wdt:P166 wd:Q103618. #Now I filter to only get the ones that won the award. 
   
   
   
   # get the label
   ?film sc:name ?filmName.
}
ORDER BY ?filmName
LIMIT 200
"""

print("Results")
x=run_query(queryString)


Results
[('film', 'http://www.wikidata.org/entity/Q212129'), ('filmName', 'A Streetcar Named Desire')]
[('film', 'http://www.wikidata.org/entity/Q1622883'), ('filmName', 'A Touch of Class')]
[('film', 'http://www.wikidata.org/entity/Q916431'), ('filmName', "Alice Doesn't Live Here Anymore")]
[('film', 'http://www.wikidata.org/entity/Q486591'), ('filmName', 'Anastasia')]
[('film', 'http://www.wikidata.org/entity/Q233464'), ('filmName', 'Annie Hall')]
[('film', 'http://www.wikidata.org/entity/Q221594'), ('filmName', 'As Good as It Gets')]
[('film', 'http://www.wikidata.org/entity/Q180214'), ('filmName', 'Black Swan')]
[('film', 'http://www.wikidata.org/entity/Q2907178'), ('filmName', 'Blue Jasmine')]
[('film', 'http://www.wikidata.org/entity/Q1849813'), ('filmName', 'Blue Sky')]
[('film', 'http://www.wikidata.org/entity/Q375594'), ('filmName', 'Born Yesterday')]
[('film', 'http://www.wikidata.org/entity/Q465646'), ('filmName', "Boys Don't Cry")]
[('film', 'http://www.wikidata.org/entity/

In [22]:
# Now that I have all the movies that won the award for best actress, I have to get the production company.
queryString = """
SELECT DISTINCT ?film ?filmName ?companyName
WHERE {

   # bind something
   ?film wdt:P31 wd:Q11424 . #I get all the films. 
   ?film wdt:P166 wd:Q103618. #Now I filter to only get the ones that won the award. 
   ?film wdt:P272 ?company. 
   
   
   # get the label
   ?film sc:name ?filmName.
   ?company sc:name ?companyName. 
}
ORDER BY ?filmName
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('film', 'http://www.wikidata.org/entity/Q212129'), ('filmName', 'A Streetcar Named Desire'), ('companyName', 'Warner Bros.')]
[('film', 'http://www.wikidata.org/entity/Q916431'), ('filmName', "Alice Doesn't Live Here Anymore"), ('companyName', 'Warner Bros.')]
[('film', 'http://www.wikidata.org/entity/Q486591'), ('filmName', 'Anastasia'), ('companyName', '20th Century Studios')]
[('film', 'http://www.wikidata.org/entity/Q221594'), ('filmName', 'As Good as It Gets'), ('companyName', 'TriStar Pictures')]
[('film', 'http://www.wikidata.org/entity/Q221594'), ('filmName', 'As Good as It Gets'), ('companyName', 'Gracie Films')]
[('film', 'http://www.wikidata.org/entity/Q180214'), ('filmName', 'Black Swan'), ('companyName', 'Phoenix Pictures')]
[('film', 'http://www.wikidata.org/entity/Q180214'), ('filmName', 'Black Swan'), ('companyName', 'RatPac-Dune Entertainment')]
[('film', 'http://www.wikidata.org/entity/Q1849813'), ('filmName', 'Blue Sky'), ('companyName', 'Orion Pictures')]


In [14]:
# The next step is to keep only the companies (among television and film production companies) and count how many award-winning films each one produced.
queryString = """
SELECT DISTINCT ?company ?companyName (COUNT(*) AS ?awards) #I count how many times each company won. 
WHERE {

   # bind something
   ?film wdt:P31 wd:Q11424 . #I get all the films. 
   ?film wdt:P166 wd:Q103618. #Now I filter to only get the ones that won the award. 
   ?film wdt:P272 ?company. 
   ?company wdt:P31 ?type . #I get the type of the company into ?type
   
   # I make sure that the companies are television production companies or film production companies.
    #Television production company    #film production company 
   VALUES ?type {wd:Q10689397                    wd:Q1762059} 
   
   
   # get the label
   
   ?company sc:name ?companyName. 
}
GROUP BY ?company ?companyName
ORDER BY ?companyName
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('company', 'http://www.wikidata.org/entity/Q434841'), ('companyName', '20th Century Studios'), ('awards', '10')]
[('company', 'http://www.wikidata.org/entity/Q4824145'), ('companyName', 'Australian Film Commission'), ('awards', '1')]
[('company', 'http://www.wikidata.org/entity/Q971446'), ('companyName', 'BBC Film'), ('awards', '2')]
[('company', 'http://www.wikidata.org/entity/Q100706708'), ('companyName', 'Babelsberg Film GmbH'), ('awards', '1')]
[('company', 'http://www.wikidata.org/entity/Q3639301'), ('companyName', 'BiM Distribuzione'), ('awards', '1')]
[('company', 'http://www.wikidata.org/entity/Q17386859'), ('companyName', 'Blueprint Pictures'), ('awards', '1')]
[('company', 'http://www.wikidata.org/entity/Q1706678'), ('companyName', 'Brandywine Productions'), ('awards', '1')]
[('company', 'http://www.wikidata.org/entity/Q622848'), ('companyName', 'Castle Rock Entertainment'), ('awards', '2')]
[('company', 'http://www.wikidata.org/entity/Q4036210'), ('companyName', 'C

Final query for this task

In [45]:
# The last step is to keep only the companies with 5 or more awards.
queryString = """
SELECT DISTINCT ?company ?companyName (COUNT(*) AS ?awards) #I count how many times each company won. 
WHERE {

   # bind something
   ?film wdt:P31 wd:Q11424 . #I get all the films. 
   ?film wdt:P166 wd:Q103618. #Now I filter to only get the ones that won the award. 
   ?film wdt:P272 ?company. 
   ?company wdt:P31 ?type . #I get the type of the company into ?type
   
   # I make sure that the companies are television production companies or film production companies.
    #Television production company    #film production company 
   VALUES ?type { wd:Q10689397                 wd:Q1762059} 
   
 
   # get the label
   
   ?company sc:name ?companyName. 
}
GROUP BY ?company ?companyName 
HAVING (count(*)>=5) # HAVING keeps only the groups with at least 5 awards
ORDER BY ?companyName
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('company', 'http://www.wikidata.org/entity/Q434841'), ('companyName', '20th Century Studios'), ('awards', '10')]
[('company', 'http://www.wikidata.org/entity/Q179200'), ('companyName', 'Metro-Goldwyn-Mayer'), ('awards', '11')]
[('company', 'http://www.wikidata.org/entity/Q159846'), ('companyName', 'Paramount Pictures'), ('awards', '11')]
[('company', 'http://www.wikidata.org/entity/Q126399'), ('companyName', 'Warner Bros.'), ('awards', '9')]
4
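
`HAVING` filters whole groups after aggregation, whereas `FILTER` works on individual rows before it. A minimal sketch in plain Python of that post-aggregation step, applied to per-company counts copied from the two outputs above. The threshold of 5 comes from the task statement, and the dictionary holds only the rows that were printed in full, not the complete result.

```python
# Per-company award counts copied from the In [14] and In [45] outputs.
awards = {
    "20th Century Studios": 10,
    "Australian Film Commission": 1,
    "BBC Film": 2,
    "Castle Rock Entertainment": 2,
    "Metro-Goldwyn-Mayer": 11,
    "Paramount Pictures": 11,
    "Warner Bros.": 9,
}

MIN_AWARDS = 5  # threshold from the task statement

# Like HAVING (COUNT(*) >= 5): drop every group whose aggregate is below the threshold.
winners = {name: n for name, n in awards.items() if n >= MIN_AWARDS}

for name in sorted(winners):
    print(name, winners[name])
```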
