# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload starts with a usually vague, open-ended question and does not assume that the person issuing it has a clear understanding of the data in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that is present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP that returns the human-readable name of a property or a class in Wikidata.

In [None]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-e49c8d021d-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString,verbose = True):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        results = sparql.query()
        json_results = results.convert()
        if len(json_results['results']['bindings'])==0:
            print("Empty")
            return []
        array = []
        for bindings in json_results['results']['bindings']:
            app =  [ (var, value['value'])  for var, value in bindings.items() ] 
            if verbose:
                print( app)
            array.append(app)
        if verbose:
            print(len(array))
        return array

    except Exception as e :
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
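
Each row returned by `run_query` is a list of `(variable, value)` tuples, which is awkward to filter or aggregate later in the workload. The following is a minimal sketch of a helper that turns those rows into dictionaries. The name `rows_to_dicts` is hypothetical and not part of the original setup; the sample rows are copied from the Task 1 output.

```python
def rows_to_dicts(rows):
    """Convert run_query-style rows (lists of (var, value) tuples) into dicts."""
    # run_query returns None when the request fails, so treat that as "no rows"
    return [dict(row) for row in (rows or [])]


# Two rows shaped like the output of run_query
sample = [
    [("obj", "http://www.wikidata.org/entity/Q80"), ("name", "Tim Berners-Lee")],
    [("obj", "http://www.wikidata.org/entity/Q92749"), ("name", "Robert Cailliau")],
]
records = rows_to_dicts(sample)
print(records[0]["name"])
```

With dictionaries, a value can be read by variable name (`records[0]["obj"]`) instead of by position in the tuple list.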

# History Workflow Series ("World Wide Web") 

Consider the following exploratory information need:

> Investigate the origins of the World Wide Web and related academic activities and scientists.

## Useful URIs for the current workflow
The following are given:

| IRI         | Description        | Role      |
| ----------- | ------------------ | --------- |
| `wdt:P1647` | subproperty        | predicate |
| `wdt:P31`   | instance of        | predicate |
| `wdt:P106`  | occupation         | predicate |
| `wdt:P279`  | subclass           | predicate |
| `wdt:P27`   | nationality        | predicate |
| `wdt:P3342` | significant person | predicate |
| `wd:Q5`     | human              | node      |
| `wd:Q466`   | World Wide Web     | node      |





Also consider

```
wd:Q466 ?p ?obj .
```

is the BGP to retrieve all **properties of the World Wide Web**.

Whenever you return a resource, return both its IRI and its label. In particular, when a task requires you to identify a BGP, the result set must always be a list of IRI-label pairs.
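
Because of the IRI-label requirement, almost every query in the workload ends with the same `sc:name` pattern. The following is a minimal sketch of a helper that appends that pattern to an arbitrary BGP. The name `with_label` is hypothetical, and the `sc:` prefix is assumed to be the one declared in the setup cell.

```python
def with_label(pattern, var, label_var="name"):
    """Build a SELECT query for `pattern` that also binds the sc:name label of `var`."""
    return (
        f"SELECT DISTINCT ?{var} ?{label_var}\n"
        "WHERE {\n"
        f"   {pattern}\n"
        f"   ?{var} sc:name ?{label_var} .\n"
        "}"
    )


# The BGP for all properties of the World Wide Web, with labels attached
queryString = with_label("wd:Q466 ?p ?obj .", "p")
print(queryString)
```

The resulting string can be passed to `run_query` unchanged, since the prefixes are prepended there.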

The workload should:

1. Find the inventors of World Wide Web (return IRI and name).

2. Identify the BGP for hypertext system.

3. Find all the hypertext systems created before 1980 (return the IRI and name of the system and the inception date).

4. Identify the BGP for computer scientist.

5. Find how many computer scientists there are for each continent (consider their citizenship). Return the IRI and name of the continent and the number of scientists for each continent. 

6. Find all the computer scientists who taught at the University of Cambridge (return IRI and name).

7. Find all the computer scientists who wrote at least 5 books (please consider only the instances of book, exclude "literary work" or other type of work related to books). Return the IRI and name of the computer scientist and the number of books. 

## Task 1
Find the inventors of World Wide Web (return IRI and name).

In [16]:
# retrieve all properties of World Wide Web
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q466 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
#LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1051'), ('name', 'PSH ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1225'), ('name', 'U.S. National Archives Identifier')]
[('p', 'http://www.wikidata.org/prop/direct/P1417'), ('name', 'Encyclopædia Britannica Online ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1430'), ('name', 'OpenPlaques subject ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1482'), ('name', 'Stack Exchange tag')]
[('p', 'http://www.wikidata.org/prop/direct/P1542'), ('name', 'has effect')]
[('p', 'http://www.wikidata.org/prop/direct/P1813'), ('name', 'short name')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from')]
[('p', 'http://www.wikidata.org/prop/direct/P2004'), ('name', 'NALT ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2163'), ('name', 'FAST ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2184'), ('name', 'history of topic')]
[('p', 'http://www.wikidata.org/prop/direct/P227'), ('name', 'GND ID')]
[('p', 

In [None]:
# from the previous query we obtain [('p', 'http://www.wikidata.org/prop/direct/P61'), ('name', 'discoverer or inventor')]
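
The property list is long and the printed output is truncated, so the relevant row is easy to miss. A small sketch of how the rows returned by `run_query` could be scanned for a keyword in the label; `find_by_label` is a hypothetical helper, and the sample rows reuse values that appear in this notebook.

```python
def find_by_label(rows, keyword):
    """Return (iri, label) pairs whose label contains `keyword`, ignoring case."""
    matches = []
    for row in rows or []:
        record = dict(row)
        label = record.get("name", "")
        if keyword.lower() in label.lower():
            # the IRI is stored under whichever variable is not the label
            iri = next((v for k, v in record.items() if k != "name"), None)
            matches.append((iri, label))
    return matches


sample = [
    [("p", "http://www.wikidata.org/prop/direct/P1542"), ("name", "has effect")],
    [("p", "http://www.wikidata.org/prop/direct/P61"), ("name", "discoverer or inventor")],
]
matches = find_by_label(sample, "inventor")
print(matches)
```

In the notebook the same call would be made on the result of the query, for example `find_by_label(x, "inventor")`.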

Final query for this task

In [18]:
# let's use the property P61 to retrieve the inventors of 'World Wide Web'

queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q466 wdt:P61 ?obj.
   # get the label
   ?obj sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q80'), ('name', 'Tim Berners-Lee')]
[('obj', 'http://www.wikidata.org/entity/Q92749'), ('name', 'Robert Cailliau')]
2


## Task 2
Identify the BGP for hypertext system.

Final query for this task

In [24]:
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q466 wdt:P31 ?obj.
   # get the label
   ?obj sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q121182'), ('name', 'information system')]
[('obj', 'http://www.wikidata.org/entity/Q65966993'), ('name', 'hypertext system')]
2


In [None]:
#[('obj', 'http://www.wikidata.org/entity/Q65966993'), ('name', 'hypertext system')]

## Task 3
Find all the hypertext systems created before 1980 (return the IRI and name of the system and the inception date).

In [34]:
# first of all let's get all the hypertext systems in the database

queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   ?obj wdt:P31 wd:Q65966993.
   # get the label
   ?obj sc:name ?name.
}
#LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q466'), ('name', 'World Wide Web')]
[('obj', 'http://www.wikidata.org/entity/Q370979'), ('name', 'Amigaguide')]
[('obj', 'http://www.wikidata.org/entity/Q2385520'), ('name', 'ENQUIRE')]
[('obj', 'http://www.wikidata.org/entity/Q4994212'), ('name', 'Hypertext Editing System')]
[('obj', 'http://www.wikidata.org/entity/Q5448331'), ('name', 'File Retrieval and Editing System')]
[('obj', 'http://www.wikidata.org/entity/Q785345'), ('name', 'Project Xanadu')]
[('obj', 'http://www.wikidata.org/entity/Q1799609'), ('name', 'HyTime')]
[('obj', 'http://www.wikidata.org/entity/Q7742259'), ('name', 'The Interactive Encyclopedia System')]
[('obj', 'http://www.wikidata.org/entity/Q74587961'), ('name', 'MaxThink')]
[('obj', 'http://www.wikidata.org/entity/Q74590695'), ('name', 'HyperRez')]
[('obj', 'http://www.wikidata.org/entity/Q74667091'), ('name', 'HOUDINI')]
[('obj', 'http://www.wikidata.org/entity/Q66561170'), ('name', 'Visual Knowledge Builder')]


In [36]:
# let's look at the properties of one hypertext system (the World Wide Web) and try to find something like a date of creation

queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q466 ?p ?obj.
   # get the label
   ?p sc:name ?name.
}
#LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1051'), ('name', 'PSH ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1225'), ('name', 'U.S. National Archives Identifier')]
[('p', 'http://www.wikidata.org/prop/direct/P1417'), ('name', 'Encyclopædia Britannica Online ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1430'), ('name', 'OpenPlaques subject ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1482'), ('name', 'Stack Exchange tag')]
[('p', 'http://www.wikidata.org/prop/direct/P1542'), ('name', 'has effect')]
[('p', 'http://www.wikidata.org/prop/direct/P1813'), ('name', 'short name')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from')]
[('p', 'http://www.wikidata.org/prop/direct/P2004'), ('name', 'NALT ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2163'), ('name', 'FAST ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2184'), ('name', 'history of topic')]
[('p', 'http://www.wikidata.org/prop/direct/P227'), ('name', 'GND ID')]
[('p', 

In [None]:
# [('p', 'http://www.wikidata.org/prop/direct/P575'), ('name', 'time of discovery or invention')]

In [41]:
# let's try to get hypertext systems and their date of invention

queryString = """
SELECT DISTINCT ?obj ?name ?date 
WHERE {
   # bind something
   ?obj wdt:P31 wd:Q65966993;
        wdt:P575 ?date.
   # get the label
   ?obj sc:name ?name.
   #?date sc:name ?inceptiondate.
}
#LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q466'), ('name', 'World Wide Web'), ('date', '1990-01-01T00:00:00Z')]
1


In [None]:
# [('p', 'http://www.wikidata.org/prop/direct/P575'), ('name', 'time of discovery or invention')] returns a date only for the World Wide Web, so let's look for a different property

In [None]:
# [('p', 'http://www.wikidata.org/prop/direct/P571'), ('name', 'inception')]

In [46]:
# let's try to get hypertext systems and their date of invention

queryString = """
SELECT DISTINCT ?obj ?name ?date 
WHERE {
   # bind something
   ?obj wdt:P31 wd:Q65966993;
        wdt:P571 ?date.
   # get the label
   ?obj sc:name ?name.
}
#LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q466'), ('name', 'World Wide Web'), ('date', '1989-03-12T00:00:00Z')]
[('obj', 'http://www.wikidata.org/entity/Q466'), ('name', 'World Wide Web'), ('date', '1990-12-01T00:00:00Z')]
[('obj', 'http://www.wikidata.org/entity/Q370979'), ('name', 'Amigaguide'), ('date', '1991-01-01T00:00:00Z')]
[('obj', 'http://www.wikidata.org/entity/Q2385520'), ('name', 'ENQUIRE'), ('date', '1980-01-01T00:00:00Z')]
[('obj', 'http://www.wikidata.org/entity/Q2385520'), ('name', 'ENQUIRE'), ('date', '1984-01-01T00:00:00Z')]
[('obj', 'http://www.wikidata.org/entity/Q4994212'), ('name', 'Hypertext Editing System'), ('date', '1967-01-01T00:00:00Z')]
[('obj', 'http://www.wikidata.org/entity/Q5448331'), ('name', 'File Retrieval and Editing System'), ('date', '1968-01-01T00:00:00Z')]
[('obj', 'http://www.wikidata.org/entity/Q785345'), ('name', 'Project Xanadu'), ('date', '1960-01-01T00:00:00Z')]
[('obj', 'http://www.wikidata.org/entity/Q1799609'), ('name', 'HyTime')

In [45]:
# using the property inception we are able to get all hypertext systems and their date of invention

Final query for this task

In [49]:
# the next step is to filter and keep only the hypertext systems created before 1980 (return the IRI and name of the system and the inception date)

queryString = """
SELECT DISTINCT ?obj ?name ?date 
WHERE {
   # bind something
   ?obj wdt:P31 wd:Q65966993;
        wdt:P571 ?date.
   # get the label
   ?obj sc:name ?name.
   FILTER (?date < "1980"^^xsd:gYear).
}
#LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q2385520'), ('name', 'ENQUIRE'), ('date', '1980-01-01T00:00:00Z')]
[('obj', 'http://www.wikidata.org/entity/Q4994212'), ('name', 'Hypertext Editing System'), ('date', '1967-01-01T00:00:00Z')]
[('obj', 'http://www.wikidata.org/entity/Q5448331'), ('name', 'File Retrieval and Editing System'), ('date', '1968-01-01T00:00:00Z')]
[('obj', 'http://www.wikidata.org/entity/Q785345'), ('name', 'Project Xanadu'), ('date', '1960-01-01T00:00:00Z')]
[('obj', 'http://www.wikidata.org/entity/Q471'), ('name', 'Memex'), ('date', '1939-01-01T00:00:00Z')]
[('obj', 'http://www.wikidata.org/entity/Q1050365'), ('name', 'oNLine System'), ('date', '1962-01-01T00:00:00Z')]
[('obj', 'http://www.wikidata.org/entity/Q8063246'), ('name', 'ZOG'), ('date', '1977-01-01T00:00:00Z')]
[('obj', 'http://www.wikidata.org/entity/Q3625272'), ('name', 'Aspen Movie Map'), ('date', '1978-01-01T00:00:00Z')]
[('obj', 'http://www.wikidata.org/entity/Q5358242'), ('name', 'Electronic D
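
The recorded output includes ENQUIRE with an inception date of `1980-01-01T00:00:00Z`, which is not strictly before 1980. A plausible cause is the comparison between an `xsd:dateTime` value and an `xsd:gYear` literal, although this has not been verified against the endpoint. The sketch below checks the returned rows on the client side. `before_year` is a hypothetical helper, and it assumes four-digit CE dates in the format shown in the output. A filter such as `FILTER (YEAR(?date) < 1980)` might achieve the same thing inside the query, but it is untested here.

```python
from datetime import datetime


def before_year(rows, year, date_var="date"):
    """Keep rows whose `date_var` value falls strictly before 1 January of `year`."""
    kept = []
    for row in rows or []:
        record = dict(row)
        # dates in the output look like '1967-01-01T00:00:00Z'
        date = datetime.strptime(record[date_var], "%Y-%m-%dT%H:%M:%SZ")
        if date.year < year:
            kept.append(row)
    return kept


sample = [
    [("obj", "http://www.wikidata.org/entity/Q2385520"), ("name", "ENQUIRE"),
     ("date", "1980-01-01T00:00:00Z")],
    [("obj", "http://www.wikidata.org/entity/Q4994212"), ("name", "Hypertext Editing System"),
     ("date", "1967-01-01T00:00:00Z")],
]
kept = before_year(sample, 1980)
print(kept)
```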

## Task 4
Identify the BGP for computer scientist.

In [None]:
# let's try to find the BGP for computer scientist using the entity associated to Tim Berners-Lee, inventor of the World Wide Web

Final query for this task

In [52]:
# return all the occupations of Tim Berners-Lee (Q80)

queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q80 wdt:P106 ?obj.
   # get the label
   ?obj sc:name ?name.
}
#LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q1622272'), ('name', 'university teacher')]
[('obj', 'http://www.wikidata.org/entity/Q169470'), ('name', 'physicist')]
[('obj', 'http://www.wikidata.org/entity/Q205375'), ('name', 'inventor')]
[('obj', 'http://www.wikidata.org/entity/Q81096'), ('name', 'engineer')]
[('obj', 'http://www.wikidata.org/entity/Q82594'), ('name', 'computer scientist')]
[('obj', 'http://www.wikidata.org/entity/Q5482740'), ('name', 'programmer')]
[('obj', 'http://www.wikidata.org/entity/Q6859454'), ('name', 'web developer')]
7


In [None]:
# [('obj', 'http://www.wikidata.org/entity/Q82594'), ('name', 'computer scientist')]

## Task 5
Find how many computer scientists there are for each continent (consider their citizenship). Return the IRI and name of the continent and the number of scientists for each continent.

In [79]:
# first of all let's get all the computer scientists in the world

queryString = """
SELECT DISTINCT ?computer_scientist ?name
WHERE {
   # bind something
   ?computer_scientist wdt:P31 wd:Q5;
   wdt:P106 wd:Q82594.
   # get the label
   ?computer_scientist sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('computer_scientist', 'http://www.wikidata.org/entity/Q8190500'), ('name', 'Seinosuke Toda')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q603999'), ('name', 'Kevin Poulsen')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q11109023'), ('name', 'Michael Barr')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q20731777'), ('name', 'Valeria de Paiva')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q24901829'), ('name', 'Ernesto Damiani')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q61856956'), ('name', 'Richard Garner')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q6539960'), ('name', 'Liang-Jie Zhang')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q92669'), ('name', 'Jeffrey Shallit')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q15842362'), ('name', 'Richard Clippinger')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q15858435'), ('name', 'Hardo Bourguignon')]
[('computer_scientis

In [153]:
# let's look at the properties of one computer scientist to find out whether there is a citizenship property

queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q8190500 ?p ?obj.
   # get the label
   ?p sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P108'), ('name', 'employer')]
[('p', 'http://www.wikidata.org/prop/direct/P166'), ('name', 'award received')]
[('p', 'http://www.wikidata.org/prop/direct/P1814'), ('name', 'name in kana')]
[('p', 'http://www.wikidata.org/prop/direct/P184'), ('name', 'doctoral advisor')]
[('p', 'http://www.wikidata.org/prop/direct/P19'), ('name', 'place of birth')]
[('p', 'http://www.wikidata.org/prop/direct/P21'), ('name', 'sex or gender')]
[('p', 'http://www.wikidata.org/prop/direct/P213'), ('name', 'ISNI')]
[('p', 'http://www.wikidata.org/prop/direct/P214'), ('name', 'VIAF ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2456'), ('name', 'DBLP author ID')]
[('p', 'http://www.wikidata.org/prop/direct/P27'), ('name', 'country of citizenship')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P349'), ('nam

In [None]:
# [('p', 'http://www.wikidata.org/prop/direct/P27'), ('name', 'country of citizenship')]

In [93]:
# let's get all the computer scientists in the world with their nationality

queryString = """
SELECT DISTINCT ?computer_scientist ?name ?nation
WHERE {
   # bind something
   ?computer_scientist wdt:P31 wd:Q5;
                  wdt:P106 wd:Q82594;
                  wdt:P27 ?nationality.
   # get the label
   ?computer_scientist sc:name ?name.
   ?nationality sc:name ?nation.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('computer_scientist', 'http://www.wikidata.org/entity/Q3018520'), ('name', 'David Naccache'), ('nation', 'Morocco')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q1045568'), ('name', 'Carsten Haitzler'), ('nation', 'Nigeria')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q28648829'), ('name', 'Clarisse Iribagiza'), ('nation', 'Rwanda')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q26207354'), ('name', 'Lum Zhaveli'), ('nation', 'Kosovo')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q24565815'), ('name', 'Narasimhiah Seshagiri'), ('nation', 'British India')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q253960'), ('name', 'Ravi Sethi'), ('nation', 'British India')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q21715732'), ('name', 'Syed Jan Abas'), ('nation', 'British India')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q53977158'), ('name', 'Svetlana Selezneva'), ('nation', 'Ukrainian Soviet Socia

In [100]:
# let's list the countries of citizenship so we can pick one and look at its properties

queryString = """
SELECT DISTINCT ?nationality ?nation
WHERE {
   # bind something
   ?computer_scientist wdt:P31 wd:Q5;
                  wdt:P106 wd:Q82594;
                  wdt:P27 ?nationality.
   # get the label
   ?nationality sc:name ?nation.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('nationality', 'http://www.wikidata.org/entity/Q1028'), ('nation', 'Morocco')]
[('nationality', 'http://www.wikidata.org/entity/Q1033'), ('nation', 'Nigeria')]
[('nationality', 'http://www.wikidata.org/entity/Q1037'), ('nation', 'Rwanda')]
[('nationality', 'http://www.wikidata.org/entity/Q1246'), ('nation', 'Kosovo')]
[('nationality', 'http://www.wikidata.org/entity/Q129286'), ('nation', 'British India')]
[('nationality', 'http://www.wikidata.org/entity/Q133356'), ('nation', 'Ukrainian Soviet Socialist Republic')]
[('nationality', 'http://www.wikidata.org/entity/Q13426199'), ('nation', 'Republic of China 1912–1949')]
[('nationality', 'http://www.wikidata.org/entity/Q140359'), ('nation', 'First Czechoslovak Republic')]
[('nationality', 'http://www.wikidata.org/entity/Q142'), ('nation', 'France')]
[('nationality', 'http://www.wikidata.org/entity/Q145'), ('nation', 'United Kingdom')]
[('nationality', 'http://www.wikidata.org/entity/Q148'), ('nation', "People's Republic of China"

In [98]:
# let's get the property to retrieve the continent from a country

queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q1028 ?p ?obj.
   # get the label
   ?p sc:name ?name.
   
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1036'), ('name', 'Dewey Decimal Classification')]
[('p', 'http://www.wikidata.org/prop/direct/P1081'), ('name', 'Human Development Index')]
[('p', 'http://www.wikidata.org/prop/direct/P1082'), ('name', 'population')]
[('p', 'http://www.wikidata.org/prop/direct/P1151'), ('name', "topic's main Wikimedia portal")]
[('p', 'http://www.wikidata.org/prop/direct/P1198'), ('name', 'unemployment rate')]
[('p', 'http://www.wikidata.org/prop/direct/P122'), ('name', 'basic form of government')]
[('p', 'http://www.wikidata.org/prop/direct/P1225'), ('name', 'U.S. National Archives Identifier')]
[('p', 'http://www.wikidata.org/prop/direct/P1245'), ('name', 'OmegaWiki Defined Meaning')]
[('p', 'http://www.wikidata.org/prop/direct/P1296'), ('name', 'Gran Enciclopèdia Catalana ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1304'), ('name', 'central bank')]
[('p', 'http://www.wikidata.org/prop/direct/P1313'), ('name', 'office held by head of governm

In [None]:
#[('p', 'http://www.wikidata.org/prop/direct/P30'), ('name', 'continent')]

In [114]:
# Return the IRI and name of the country and the number of scientists for each country

queryString = """
SELECT DISTINCT ?country ?nation COUNT (?nation) as ?number_of_scientist
WHERE {
   # bind something
   ?computer_scientist wdt:P31 wd:Q5;
                  wdt:P106 wd:Q82594;
                  wdt:P27 ?country.
   # get the label
   #?computer_scientist sc:name ?name.
   ?country sc:name ?nation.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('country', 'http://www.wikidata.org/entity/Q191'), ('nation', 'Estonia'), ('number_of_scientist', '53')]
[('country', 'http://www.wikidata.org/entity/Q219060'), ('nation', 'State of Palestine'), ('number_of_scientist', '1')]
[('country', 'http://www.wikidata.org/entity/Q668'), ('nation', 'India'), ('number_of_scientist', '107')]
[('country', 'http://www.wikidata.org/entity/Q1036'), ('nation', 'Uganda'), ('number_of_scientist', '4')]
[('country', 'http://www.wikidata.org/entity/Q713750'), ('nation', 'West Germany'), ('number_of_scientist', '2')]
[('country', 'http://www.wikidata.org/entity/Q189'), ('nation', 'Iceland'), ('number_of_scientist', '7')]
[('country', 'http://www.wikidata.org/entity/Q399'), ('nation', 'Armenia'), ('number_of_scientist', '3')]
[('country', 'http://www.wikidata.org/entity/Q186320'), ('nation', 'Grand Duchy of Baden'), ('number_of_scientist', '1')]
[('country', 'http://www.wikidata.org/entity/Q1009'), ('nation', 'Cameroon'), ('number_of_scientist', '5'

Final query for this task

In [116]:
# Return the IRI and name of the continent and the number of scientists for each continent

queryString = """
SELECT DISTINCT ?continent ?name COUNT (?continent) as ?number_of_scientist
WHERE {
   # bind something
   ?computer_scientist wdt:P31 wd:Q5;
                  wdt:P106 wd:Q82594;
                  wdt:P27 ?country.
   ?country wdt:P30 ?continent.
   # get the label
   ?continent sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('continent', 'http://www.wikidata.org/entity/Q5401'), ('name', 'Eurasia'), ('number_of_scientist', '23')]
[('continent', 'http://www.wikidata.org/entity/Q46'), ('name', 'Europe'), ('number_of_scientist', '3487')]
[('continent', 'http://www.wikidata.org/entity/Q538'), ('name', 'Insular Oceania'), ('number_of_scientist', '2479')]
[('continent', 'http://www.wikidata.org/entity/Q15'), ('name', 'Africa'), ('number_of_scientist', '76')]
[('continent', 'http://www.wikidata.org/entity/Q18'), ('name', 'South America'), ('number_of_scientist', '92')]
[('continent', 'http://www.wikidata.org/entity/Q48'), ('name', 'Asia'), ('number_of_scientist', '953')]
[('continent', 'http://www.wikidata.org/entity/Q49'), ('name', 'North America'), ('number_of_scientist', '2782')]
7
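
`COUNT (?continent)` counts one row per (scientist, country) pair, so a scientist with two citizenships on the same continent contributes twice. The query also relies on the endpoint accepting an unparenthesised aggregate without `GROUP BY`, whereas standard SPARQL 1.1 expects the form `(COUNT(DISTINCT ?computer_scientist) AS ?number_of_scientist)` together with `GROUP BY ?continent ?name`. The sketch below shows the distinct count on the client side. `count_distinct` is a hypothetical helper, and the scientist identifiers in the sample are made up for illustration.

```python
from collections import defaultdict


def count_distinct(rows, group_var, item_var):
    """Count distinct `item_var` values for each `group_var` value."""
    groups = defaultdict(set)
    for row in rows or []:
        record = dict(row)
        groups[record[group_var]].add(record[item_var])
    return {group: len(items) for group, items in groups.items()}


europe = "http://www.wikidata.org/entity/Q46"
asia = "http://www.wikidata.org/entity/Q48"
# 'scientist-A' appears twice for Europe, as if holding two European citizenships
sample = [
    [("continent", europe), ("computer_scientist", "scientist-A")],
    [("continent", europe), ("computer_scientist", "scientist-A")],
    [("continent", europe), ("computer_scientist", "scientist-B")],
    [("continent", asia), ("computer_scientist", "scientist-C")],
]
counts = count_distinct(sample, "continent", "computer_scientist")
print(counts)
```

For this to work on real data, the query would need to return `?continent` and `?computer_scientist` per row rather than the aggregated count.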


## Task 6
Find all the computer scientists who taught at the University of Cambridge (return IRI and name).

In [117]:
# let's get the old query that gets all the computer scientists

queryString = """
SELECT DISTINCT ?computer_scientist ?name
WHERE {
   # bind something
   ?computer_scientist wdt:P31 wd:Q5;
   wdt:P106 wd:Q82594.
   # get the label
   ?computer_scientist sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('computer_scientist', 'http://www.wikidata.org/entity/Q8190500'), ('name', 'Seinosuke Toda')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q603999'), ('name', 'Kevin Poulsen')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q11109023'), ('name', 'Michael Barr')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q20731777'), ('name', 'Valeria de Paiva')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q24901829'), ('name', 'Ernesto Damiani')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q61856956'), ('name', 'Richard Garner')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q6539960'), ('name', 'Liang-Jie Zhang')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q92669'), ('name', 'Jeffrey Shallit')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q15842362'), ('name', 'Richard Clippinger')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q15858435'), ('name', 'Hardo Bourguignon')]
[('computer_scientis

In [146]:
# let's run a query to find the University of Cambridge entity among the employers of computer scientists

queryString = """
SELECT DISTINCT ?obj ?uni
WHERE {
   # bind something
   ?computer_scientist wdt:P31 wd:Q5;
                  wdt:P106 wd:Q82594;
                  wdt:P108 ?obj.
   # get the label
   #?computer_scientist sc:name ?name.
   ?obj sc:name ?uni.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q5500827'), ('uni', 'Freedom of the Press Foundation')]
[('obj', 'http://www.wikidata.org/entity/Q1381966'), ('uni', 'Flipkart')]
[('obj', 'http://www.wikidata.org/entity/Q2119882'), ('uni', 'Google Nest')]
[('obj', 'http://www.wikidata.org/entity/Q4863446'), ('uni', 'Barrett Technology')]
[('obj', 'http://www.wikidata.org/entity/Q6647356'), ('uni', 'Lit Motors')]
[('obj', 'http://www.wikidata.org/entity/Q359'), ('uni', 'WikiLeaks')]
[('obj', 'http://www.wikidata.org/entity/Q317087'), ('uni', 'University of Lübeck')]
[('obj', 'http://www.wikidata.org/entity/Q2304477'), ('uni', 'South East European University')]
[('obj', 'http://www.wikidata.org/entity/Q2303478'), ('uni', 'Synopsys')]
[('obj', 'http://www.wikidata.org/entity/Q11838090'), ('uni', 'Faculty of Mathematics and Computer Science of Jagiellonian University')]
[('obj', 'http://www.wikidata.org/entity/Q1681557'), ('uni', 'Junta para Ampliación de Estudios')]
[('obj', 'http://www.w

In [None]:
#[('obj', 'http://www.wikidata.org/entity/Q35794'), ('uni', 'University of Cambridge')]
#[('obj', 'http://www.wikidata.org/entity/Q1622272'), ('name', 'university teacher')]
#[('obj', 'http://www.wikidata.org/entity/Q82594'), ('name', 'computer scientist')]

Final query for this task

In [150]:
# Find all the computer scientists who taught at the University of Cambridge (return IRI and name).
# This query requires both 'university teacher' and 'computer scientist' as occupations.

queryString = """
SELECT DISTINCT ?computer_scientist ?name
WHERE {
   # bind something
   ?computer_scientist wdt:P31 wd:Q5;
           wdt:P108 wd:Q35794;
           wdt:P106 wd:Q1622272;
           wdt:P106 wd:Q82594.
   # get the label
   ?computer_scientist sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('computer_scientist', 'http://www.wikidata.org/entity/Q46633'), ('name', 'Charles Babbage')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q62857'), ('name', 'Maurice Wilkes')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q451770'), ('name', 'Douglas Hartree')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q4707402'), ('name', 'Alan Mycroft')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q63126308'), ('name', 'Peter Földiák')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q216811'), ('name', 'John Venn')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q100586732'), ('name', 'Ferenc Huszár')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q316330'), ('name', 'Daniel Bell')]
8


## Task 7
Find all the computer scientists who wrote at least 5 books (please consider only the instances of book, exclude "literary work" or other type of work related to books). Return the IRI and name of the computer scientist and the number of books.

In [151]:
# let's get the old query that gets all the computer scientists

queryString = """
SELECT DISTINCT ?computer_scientist ?name
WHERE {
   # bind something
   ?computer_scientist wdt:P31 wd:Q5;
   wdt:P106 wd:Q82594.
   # get the label
   ?computer_scientist sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('computer_scientist', 'http://www.wikidata.org/entity/Q8190500'), ('name', 'Seinosuke Toda')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q603999'), ('name', 'Kevin Poulsen')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q11109023'), ('name', 'Michael Barr')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q20731777'), ('name', 'Valeria de Paiva')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q24901829'), ('name', 'Ernesto Damiani')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q61856956'), ('name', 'Richard Garner')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q6539960'), ('name', 'Liang-Jie Zhang')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q92669'), ('name', 'Jeffrey Shallit')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q15842362'), ('name', 'Richard Clippinger')]
[('computer_scientist', 'http://www.wikidata.org/entity/Q15858435'), ('name', 'Hardo Bourguignon')]
[('computer_scientis

In [198]:


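# let's look at the occupations of Daniel Bell (Q316330), one of the computer scientists found in Task 6
queryString = """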
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q316330 wdt:P106 ?obj.
   # get the label
   ?obj sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q1622272'), ('name', 'university teacher')]
[('obj', 'http://www.wikidata.org/entity/Q1930187'), ('name', 'journalist')]
[('obj', 'http://www.wikidata.org/entity/Q36180'), ('name', 'writer')]
[('obj', 'http://www.wikidata.org/entity/Q37226'), ('name', 'teacher')]
[('obj', 'http://www.wikidata.org/entity/Q82594'), ('name', 'computer scientist')]
[('obj', 'http://www.wikidata.org/entity/Q2306091'), ('name', 'sociologist')]
[('obj', 'http://www.wikidata.org/entity/Q6051619'), ('name', 'opinion journalist')]
7


In [None]:
#[('obj', 'http://www.wikidata.org/entity/Q36180'), ('name', 'writer')]

In [207]:

# let's look at the entities linked from 'writer' (Q36180) to find something related to authorship
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q36180 ?b ?p.
   # get the label
   ?p sc:name ?name.
}
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/entity/Q37260'), ('name', 'writing')]
[('p', 'http://www.wikidata.org/entity/Q2250012'), ('name', 'writing')]
[('p', 'http://www.wikidata.org/entity/Q5849863'), ('name', 'Category:Writers')]
[('p', 'http://www.wikidata.org/entity/P50'), ('name', 'author')]
[('p', 'http://www.wikidata.org/entity/Q12737077'), ('name', 'occupation')]
[('p', 'http://www.wikidata.org/entity/Q28640'), ('name', 'profession')]
[('p', 'http://www.wikidata.org/entity/Q4164871'), ('name', 'position')]
[('p', 'http://www.wikidata.org/entity/Q482980'), ('name', 'author')]
[('p', 'http://www.wikidata.org/entity/Q98970039'), ('name', 'WikiProject PCC Wikidata Pilot/University of Washington')]
[('p', 'http://www.wikidata.org/entity/Q13019738'), ('name', 'list of lists of writers')]
[('p', 'http://www.wikidata.org/entity/Q6963972'), ('name', 'Category:Wikipedia categories named after writers')]
[('p', 'http://www.wikidata.org/entity/P2093'), ('name', 'author name string')]
[('p',

In [None]:
#[('p', 'http://www.wikidata.org/entity/P50'), ('name', 'author')]
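
The result above mixes properties (identifiers starting with `P`, such as `P50`) with items (identifiers starting with `Q`). The following sketch separates the two by looking at the last path segment of each URI. `local_id` and `split_properties_and_items` are hypothetical helper names, and the sample rows are copied from the output above.

```python
def local_id(uri):
    """Return the last path segment of an entity URI, e.g. 'P50'."""
    return uri.rsplit('/', 1)[-1]


def split_properties_and_items(rows, var):
    """Split run_query-style rows into (properties, items) by ID prefix."""
    props, items = [], []
    for row in rows:
        record = dict(row)
        ident = local_id(record[var])
        # Property IDs start with 'P'; everything else is treated as an item
        target = props if ident.startswith('P') else items
        target.append((ident, record['name']))
    return props, items


# Rows copied from the output above (no endpoint call is made here)
sample = [
    [('p', 'http://www.wikidata.org/entity/Q37260'), ('name', 'writing')],
    [('p', 'http://www.wikidata.org/entity/P50'), ('name', 'author')],
    [('p', 'http://www.wikidata.org/entity/Q482980'), ('name', 'author')],
    [('p', 'http://www.wikidata.org/entity/P2093'), ('name', 'author name string')],
]

props, items = split_properties_and_items(sample, 'p')
print(props)
print(items)
```

This makes it easier to tell apart the property `P50` and the item `Q482980`, which share the name 'author'.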

In [208]:

queryString = """
SELECT DISTINCT ?job ?name
WHERE {
   # every object linked from wd:Q482980 (author) through any predicate ?b
   wd:Q482980 ?b ?job.
   # human-readable name of each object
   ?job sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('job', 'http://www.wikidata.org/entity/Q9701158'), ('name', 'Category:Authors')]
[('job', 'http://www.wikidata.org/entity/P50'), ('name', 'author')]
[('job', 'http://www.wikidata.org/entity/P3931'), ('name', 'copyright holder')]
[('job', 'http://www.wikidata.org/entity/Q12737077'), ('name', 'occupation')]
[('job', 'http://www.wikidata.org/entity/Q42'), ('name', 'Douglas Adams')]
[('job', 'http://www.wikidata.org/entity/Q692'), ('name', 'William Shakespeare')]
[('job', 'http://www.wikidata.org/entity/Q8209871'), ('name', 'Autor')]
[('job', 'http://www.wikidata.org/entity/P170'), ('name', 'creator')]
[('job', 'http://www.wikidata.org/entity/Q2500638'), ('name', 'creator')]
[('job', 'http://www.wikidata.org/entity/Q8347428'), ('name', 'Category:Categories by writer')]
[('job', 'http://www.wikidata.org/entity/Q10859333'), ('name', 'authorship')]
[('job', 'http://www.wikidata.org/entity/Q36322'), ('name', 'Jane Austen')]
[('job', 'http://www.wikidata.org/entity/P1778'), ('name', '

In [215]:
queryString = """
SELECT DISTINCT ?job ?name
WHERE {
   # every object linked from wd:Q692 (William Shakespeare) through any predicate ?b
   wd:Q692 ?b ?job.
   # human-readable name of each object
   ?job sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('job', 'http://www.wikidata.org/entity/Q662550'), ('name', "Shakespeare's sonnets")]
[('job', 'http://www.wikidata.org/entity/Q661222'), ('name', 'King John')]
[('job', 'http://www.wikidata.org/entity/Q215750'), ('name', 'Julius Caesar')]
[('job', 'http://www.wikidata.org/entity/Q38933'), ('name', 'fever')]
[('job', 'http://www.wikidata.org/entity/Q53590'), ('name', 'The Two Noble Kinsmen')]
[('job', 'http://www.wikidata.org/entity/Q128610'), ('name', "Love's Labour's Lost")]
[('job', 'http://www.wikidata.org/entity/Q130283'), ('name', 'Macbeth')]
[('job', 'http://www.wikidata.org/entity/Q181598'), ('name', 'King Lear')]
[('job', 'http://www.wikidata.org/entity/Q332387'), ('name', 'The Taming of the Shrew')]
[('job', 'http://www.wikidata.org/entity/Q83186'), ('name', 'Romeo and Juliet')]
[('job', 'http://www.wikidata.org/entity/Q506505'), ('name', 'The Comedy of Errors')]
[('job', 'http://www.wikidata.org/entity/Q64759522'), ('name', 'Star Trek Concordance')]
[('job', 'http:/

In [225]:
queryString = """
SELECT DISTINCT ?job ?name
WHERE {
   # every object linked from wd:Q83186 (Romeo and Juliet) through any predicate ?b
   wd:Q83186 ?b ?job.
   # human-readable name of each object
   ?job sc:name ?name.
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('job', 'http://www.wikidata.org/entity/Q673361'), ('name', 'Romeo')]
[('job', 'http://www.wikidata.org/entity/Q911487'), ('name', 'Romeo and Juliet')]
[('job', 'http://www.wikidata.org/entity/Q19968309'), ('name', 'Romeo and Juliet')]
[('job', 'http://www.wikidata.org/entity/Q25379'), ('name', 'play')]
[('job', 'http://www.wikidata.org/entity/Q191489'), ('name', 'melodrama')]
[('job', 'http://www.wikidata.org/entity/Q1071307'), ('name', 'I Capuleti e i Montecchi')]
[('job', 'http://www.wikidata.org/entity/Q179876'), ('name', 'Kingdom of England')]
[('job', 'http://www.wikidata.org/entity/Q1860'), ('name', 'English')]
[('job', 'http://www.wikidata.org/entity/Q19652'), ('name', 'public domain')]
[('job', 'http://www.wikidata.org/entity/Q2028'), ('name', 'Verona')]
10
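
The exploration cells in this part of the notebook repeat one pattern: take an entity, follow any predicate, and fetch the name of whatever is reached. The following sketch builds that query string from an entity ID. `outgoing_objects_query` is a hypothetical helper that is not part of the notebook setup, and the ID check is an assumption added to avoid pasting malformed text into the query.

```python
import re


def outgoing_objects_query(entity_id, limit=20):
    """Build the 'all named objects linked from an entity' query used above."""
    # Assumption: IDs look like Q83186 or P50; reject anything else
    if not re.fullmatch(r'[QP][1-9][0-9]*', entity_id):
        raise ValueError(f"not a Wikidata ID: {entity_id!r}")
    return f"""
SELECT DISTINCT ?job ?name
WHERE {{
   # every object linked from the entity through any predicate ?b
   wd:{entity_id} ?b ?job.
   # human-readable name of each object
   ?job sc:name ?name.
}}
LIMIT {int(limit)}
"""


queryString = outgoing_objects_query('Q83186', limit=10)
print(queryString)
```

The resulting string can be passed to `run_query` exactly like the hand-written queries, so each query still appears in the notebook output.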


In [238]:
queryString = """
SELECT DISTINCT ?job ?name
WHERE {
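   # same pattern on wd:Q83186 (Romeo and Juliet), with the LIMIT raised to 100 to see more objects
   wd:Q83186 ?b ?job.
   # human-readable name of each object
   ?job sc:name ?name.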
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('job', 'http://www.wikidata.org/entity/Q673361'), ('name', 'Romeo')]
[('job', 'http://www.wikidata.org/entity/Q911487'), ('name', 'Romeo and Juliet')]
[('job', 'http://www.wikidata.org/entity/Q19968309'), ('name', 'Romeo and Juliet')]
[('job', 'http://www.wikidata.org/entity/Q25379'), ('name', 'play')]
[('job', 'http://www.wikidata.org/entity/Q191489'), ('name', 'melodrama')]
[('job', 'http://www.wikidata.org/entity/Q1071307'), ('name', 'I Capuleti e i Montecchi')]
[('job', 'http://www.wikidata.org/entity/Q179876'), ('name', 'Kingdom of England')]
[('job', 'http://www.wikidata.org/entity/Q1860'), ('name', 'English')]
[('job', 'http://www.wikidata.org/entity/Q19652'), ('name', 'public domain')]
[('job', 'http://www.wikidata.org/entity/Q2028'), ('name', 'Verona')]
[('job', 'http://www.wikidata.org/entity/Q6247'), ('name', 'Mantua')]
[('job', 'http://www.wikidata.org/entity/Q7725634'), ('name', 'literary work')]
[('job', 'http://www.wikidata.org/entity/Q80930'), ('name', 'traged

In [239]:
queryString = """
SELECT DISTINCT ?job ?name
WHERE {
   # every object linked from wd:Q7725634 (literary work) through any predicate ?b
   wd:Q7725634 ?b ?job.
   # human-readable name of each object
   ?job sc:name ?name.
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('job', 'http://www.wikidata.org/entity/P747'), ('name', 'has edition or translation')]
[('job', 'http://www.wikidata.org/entity/Q3919906'), ('name', 'Category:Literary works')]
[('job', 'http://www.wikidata.org/entity/P214'), ('name', 'VIAF ID')]
[('job', 'http://www.wikidata.org/entity/Q234460'), ('name', 'text')]
[('job', 'http://www.wikidata.org/entity/P373'), ('name', 'Commons category')]
[('job', 'http://www.wikidata.org/entity/P123'), ('name', 'publisher')]
[('job', 'http://www.wikidata.org/entity/P407'), ('name', 'language of work or name')]
[('job', 'http://www.wikidata.org/entity/P50'), ('name', 'author')]
[('job', 'http://www.wikidata.org/entity/P921'), ('name', 'main subject')]
[('job', 'http://www.wikidata.org/entity/Q47461344'), ('name', 'written work')]
[('job', 'http://www.wikidata.org/entity/Q8242'), ('name', 'literature')]
[('job', 'http://www.wikidata.org/entity/Q3331189'), ('name', 'version, edition, or translation')]
[('job', 'http://www.wikidata.org/entit

Final query for this task

In [None]:
# write your final query