# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload starts with a usually vague, open-ended question, and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.

In [1]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-528af84d0a-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
"""

# select and construct queries
def run_query(queryString,verbose = True):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        results = sparql.query()
        json_results = results.convert()
        if len(json_results['results']['bindings'])==0:
            print("Empty")
            return []
        array = []
        for bindings in json_results['results']['bindings']:
            app =  [ (var, value['value'])  for var, value in bindings.items() ] 
            if verbose:
                print( app)
            array.append(app)
        if verbose:
            print(len(array))
        return array

    except Exception as e :
        print("The operation failed", e)
        # return an empty list so callers always receive a list
        return []

# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
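
The row format that `run_query` prints and returns can be reproduced offline on a hand-written response. The following is a minimal sketch, assuming the standard SPARQL JSON results layout; `flatten_bindings` and the sample dictionary are illustrative names and data, not part of the notebook.

```python
def flatten_bindings(json_results):
    """Turn a SPARQL JSON result into a list of rows of (variable, value) pairs."""
    rows = []
    for binding in json_results["results"]["bindings"]:
        # keep only the plain value of each bound variable
        rows.append([(var, cell["value"]) for var, cell in binding.items()])
    return rows

# Hand-written sample mimicking an endpoint response (illustrative only).
sample = {
    "head": {"vars": ["obj", "name"]},
    "results": {"bindings": [
        {"obj": {"type": "uri", "value": "http://www.wikidata.org/entity/Q80"},
         "name": {"type": "literal", "value": "Tim Berners-Lee"}},
    ]},
}

rows = flatten_bindings(sample)
print(rows)
```

Datatypes and language tags are dropped by this flattening, which is worth keeping in mind when a value printed in the output is later compared against a literal inside a `FILTER`.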



# History Workflow Series ("World Wide Web") 

Consider the following exploratory information need:

> Investigate the origins of the World Wide Web and related academic activities and scientists.

## Useful URIs for the current workflow
The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P106`    | occupation    | predicate | 
| `wdt:P279`    | subclass      | predicate |
| `wdt:P27`     | nationality   | predicate |
| `wdt:P3342`     | Significant person       | predicate |
| `wd:Q5`| Human       | node |
| `wd:Q466`| World Wide Web      | node |





Also consider

```
wd:Q466 ?p ?obj .
```

is the BGP to retrieve all **properties of World Wide Web**

Please note that whenever you return a resource, you should return both its IRI and its label. In particular, when a task requires you to identify a BGP, the result set must always be a list of (IRI, label) pairs.
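
The two BGPs above (properties of a node, plus the `sc:name` label) recur in almost every exploration step. As a rough sketch, they can be generated from a small helper; `properties_query` is a hypothetical name, and the prefixes `wd:` and `sc:` are assumed to be prepended by `run_query` as in the setup cell.

```python
def properties_query(entity, limit=200):
    """Build a SELECT query listing the properties of `entity` with their labels."""
    return f"""
SELECT DISTINCT ?p ?name
WHERE {{
   # all outgoing properties of the entity
   {entity} ?p ?obj .
   # human-readable label of each property
   ?p sc:name ?name .
}}
LIMIT {int(limit)}
"""

query = properties_query("wd:Q466", limit=20)
print(query)
```

The returned string can be passed to `run_query` unchanged; only the entity and the limit vary between exploration steps.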

The workload should:

1. Find the inventors of World Wide Web (return IRI and name).

2. Identify the BGP for hypertext system.

3. Find all the hypertext systems created before 1980 (return the IRI and name of the system and the inception date).

4. Identify the BGP for computer scientist.

5. Find how many computer scientists there are for each continent (consider their citizenship). Return the IRI and name of the continent and the number of scientists for each continent.

6. Find all the computer scientists who taught at the University of Cambridge (return IRI and name).

7. Find all the computer scientists who wrote at least 5 books (please consider only the instances of book, and exclude "literary work" or other types of work related to books). Return the IRI and name of the computer scientist and the number of books.

## Task 1
Find the inventors of World Wide Web (return IRI and name).

In [13]:
# query example
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q466 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
LIMIT 20
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1051'), ('name', 'PSH ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1225'), ('name', 'U.S. National Archives Identifier')]
[('p', 'http://www.wikidata.org/prop/direct/P1417'), ('name', 'Encyclopædia Britannica Online ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1430'), ('name', 'OpenPlaques subject ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1482'), ('name', 'Stack Exchange tag')]
[('p', 'http://www.wikidata.org/prop/direct/P1542'), ('name', 'has effect')]
[('p', 'http://www.wikidata.org/prop/direct/P1813'), ('name', 'short name')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from')]
[('p', 'http://www.wikidata.org/prop/direct/P2004'), ('name', 'NALT ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2163'), ('name', 'FAST ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2184'), ('name', 'history of topic')]
[('p', 'http://www.wikidata.org/prop/direct/P227'), ('name', 'GND ID')]
[('p', 

In [4]:
# aim: check more properties for the 'World Wide Web'
# result: the property 'discoverer or inventor' has been found
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q466 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1051'), ('name', 'PSH ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1225'), ('name', 'U.S. National Archives Identifier')]
[('p', 'http://www.wikidata.org/prop/direct/P1417'), ('name', 'Encyclopædia Britannica Online ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1430'), ('name', 'OpenPlaques subject ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1482'), ('name', 'Stack Exchange tag')]
[('p', 'http://www.wikidata.org/prop/direct/P1542'), ('name', 'has effect')]
[('p', 'http://www.wikidata.org/prop/direct/P1813'), ('name', 'short name')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from')]
[('p', 'http://www.wikidata.org/prop/direct/P2004'), ('name', 'NALT ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2163'), ('name', 'FAST ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2184'), ('name', 'history of topic')]
[('p', 'http://www.wikidata.org/prop/direct/P227'), ('name', 'GND ID')]
[('p', 

In [7]:
# aim: check the property 'discoverer or inventor' of 'World Wide Web' to find the inventors of the W.W.W.
# result: the names of the inventors of the W.W.W. are returned, but the query has an error: ?p in the SELECT clause should be replaced by ?obj
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q466 wdt:P61 ?obj .
   # get the label
   ?obj sc:name ?name.
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('name', 'Tim Berners-Lee')]
[('name', 'Robert Cailliau')]
2


Final query for this task

In [8]:
# aim: check the property 'discoverer or inventor' of 'World Wide Web' to find the inventors of the W.W.W.
# result: the inventors of the W.W.W.
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q466 wdt:P61 ?obj .
   # get the label
   ?obj sc:name ?name.
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q80'), ('name', 'Tim Berners-Lee')]
[('obj', 'http://www.wikidata.org/entity/Q92749'), ('name', 'Robert Cailliau')]
2
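
The results come back as full IRIs, while the queries use prefixed names. A small sketch of the conversion, assuming only the `wd:` and `wdt:` namespaces declared in the setup cell; `to_prefixed` is an illustrative helper, not part of the notebook.

```python
# Namespaces copied from the PREFIX declarations of the setup cell.
PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",
    "wdt": "http://www.wikidata.org/prop/direct/",
}

def to_prefixed(iri):
    """Shorten a full IRI to a prefixed name usable in later queries."""
    for prefix, namespace in PREFIXES.items():
        if iri.startswith(namespace):
            return f"{prefix}:{iri[len(namespace):]}"
    # unknown namespace: fall back to the bracketed full IRI
    return f"<{iri}>"

print(to_prefixed("http://www.wikidata.org/entity/Q80"))
print(to_prefixed("http://www.wikidata.org/prop/direct/P61"))
```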


## Task 2
Identify the BGP for hypertext system.

In [9]:
# aim: check more properties for 'Tim Berners-Lee'
# result: the property 'notable work' has been found. Tim Berners-Lee did not invent the first hypertext system (Memex),
#         but he contributed the HTTP protocol, which is closely related to hypertext systems, so we proceed from his works
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q80 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P2930'), ('name', 'INSPIRE-HEP author ID')]
[('p', 'http://www.wikidata.org/prop/direct/P6479'), ('name', 'IEEE Xplore author ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1006'), ('name', 'Nationale Thesaurus voor Auteurs ID')]
[('p', 'http://www.wikidata.org/prop/direct/P101'), ('name', 'field of work')]
[('p', 'http://www.wikidata.org/prop/direct/P1015'), ('name', 'NORAF ID')]
[('p', 'http://www.wikidata.org/prop/direct/P103'), ('name', 'native language')]
[('p', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P108'), ('name', 'employer')]
[('p', 'http://www.wikidata.org/prop/direct/P1153'), ('name', 'Scopus author ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1207'), ('name', 'NUKAT ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1233'), ('name', 'Internet Speculative Fiction Database author ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1263'), ('na

In [10]:
# aim: check the 'notable work' of 'Tim Berners-Lee'
# result: we found 'HyperText Markup Language' and 'Hypertext Transfer Protocol' that are related to hypertext systems.
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q80 wdt:P800 ?obj .
   # get the label
   ?obj sc:name ?name.
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q466'), ('name', 'World Wide Web')]
[('obj', 'http://www.wikidata.org/entity/Q8811'), ('name', 'HyperText Markup Language')]
[('obj', 'http://www.wikidata.org/entity/Q11288'), ('name', 'web server')]
[('obj', 'http://www.wikidata.org/entity/Q8777'), ('name', 'Hypertext Transfer Protocol')]
[('obj', 'http://www.wikidata.org/entity/Q590890'), ('name', 'WorldWideWeb')]
5


In [12]:
# aim: check all the properties for  'HyperText Markup Language' and 'Hypertext Transfer Protocol'
# result: the list of properties of the two resources is returned; we inspect 'instance of' next
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   ?s ?p ?obj .
   # get the label
   ?p sc:name ?name.
   FILTER (?s IN (wd:Q8811,wd:Q8777))
}
LIMIT 400
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P4742'), ('name', 'Uniform Resource Identifier Scheme')]
[('p', 'http://www.wikidata.org/prop/direct/P5814'), ('name', 'IANA service name')]
[('p', 'http://www.wikidata.org/prop/direct/P1014'), ('name', 'Art & Architecture Thesaurus ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1051'), ('name', 'PSH ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1163'), ('name', 'media type')]
[('p', 'http://www.wikidata.org/prop/direct/P1195'), ('name', 'file extension')]
[('p', 'http://www.wikidata.org/prop/direct/P1245'), ('name', 'OmegaWiki Defined Meaning')]
[('p', 'http://www.wikidata.org/prop/direct/P1324'), ('name', 'source code repository')]
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('name', 'described by source')]
[('p', 'http://www.wikidata.org/prop/direct/P1401'), ('name', 'bug tracking system')]
[('p', 'http://www.wikidata.org/prop/direct/P1417'), ('name', 'Encyclopædia Britannica Online ID')]
[('p', 'http://www.wikida

In [17]:
# aim: check the 'instance of' property of 'HyperText Markup Language' and 'Hypertext Transfer Protocol'
# result: we found 'markup language'. We may be drifting away from the objective, but let's give it a try anyway
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   ?s wdt:P31 ?obj .
   # get the label
   ?obj sc:name ?name.
   FILTER (?s IN (wd:Q8811,wd:Q8777))
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q15836568'), ('name', 'computer network protocol')]
[('obj', 'http://www.wikidata.org/entity/Q17537576'), ('name', 'creative work')]
[('obj', 'http://www.wikidata.org/entity/Q20202982'), ('name', 'Web API')]
[('obj', 'http://www.wikidata.org/entity/Q235557'), ('name', 'file format')]
[('obj', 'http://www.wikidata.org/entity/Q2661442'), ('name', 'W3C Recommendation')]
[('obj', 'http://www.wikidata.org/entity/Q37045'), ('name', 'markup language')]
6


In [19]:
# aim: check the 'markup language' object with its properties
# result: we did not find anything special regarding 'hypertext system'. Maybe we can start again from 'World Wide Web' instead
queryString = """
SELECT DISTINCT ?p ?name ?obj ?objname
WHERE {
   # bind something
   wd:Q37045 ?p ?obj .
   # get the label
   ?p sc:name ?name .
   ?obj sc:name ?objname .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of'), ('obj', 'http://www.wikidata.org/entity/Q494823'), ('objname', 'data format')]
[('p', 'http://www.wikidata.org/prop/direct/P910'), ('name', "topic's main category"), ('obj', 'http://www.wikidata.org/entity/Q5102677'), ('objname', 'Category:Markup languages')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from'), ('obj', 'http://www.wikidata.org/entity/Q24451526'), ('objname', 'data serialization format')]
[('p', 'http://www.wikidata.org/prop/direct/P2670'), ('name', 'has parts of the class'), ('obj', 'http://www.wikidata.org/entity/Q857525'), ('objname', 'annotation')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of'), ('obj', 'http://www.wikidata.org/entity/Q17176533'), ('objname', 'software component')]
[('p', 'http://www.wikidata.org/prop/direct/P2670'), ('name', 'has parts of the class'), ('obj', 'http://www.wikidata.org/entity/Q1260632'), ('objn

Final query for this task

In [20]:
# aim: check the 'instance of' and 'subclass' properties of 'World Wide Web'
# result: we found 'hypertext system'
queryString = """
SELECT DISTINCT ?p ?name ?obj ?objname
WHERE {
   # bind something
   wd:Q466 ?p ?obj .
   # get the label
   ?p sc:name ?name.
   ?obj sc:name ?objname.
   FILTER(?p IN(wdt:P31,wdt:P279))
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of'), ('obj', 'http://www.wikidata.org/entity/Q1668024'), ('objname', 'service on internet')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of'), ('obj', 'http://www.wikidata.org/entity/Q1076968'), ('objname', 'digital media')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of'), ('obj', 'http://www.wikidata.org/entity/Q121182'), ('objname', 'information system')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of'), ('obj', 'http://www.wikidata.org/entity/Q65966993'), ('objname', 'hypertext system')]
4


## Task 3
Find all the hypertext systems created before 1980 (return the IRI and name of the system and the inception date).

In [3]:
# aim: get all the properties for 'hypertext system'
# result: all the properties for 'hypertext system'
queryString = """
SELECT DISTINCT ?p ?name 
WHERE {
   # bind something
   wd:Q65966993 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P3847'), ('name', 'Open Library subject ID')]
3


In [22]:
# aim: find all the resources in relation with 'hypertext system' by means of 'instance of'
# result: we found the instances of 'hypertext system'
queryString = """
SELECT DISTINCT ?system ?name
WHERE {
   # bind something
   ?system wdt:P31 wd:Q65966993 .
   # get the label
   ?system sc:name ?name.
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('system', 'http://www.wikidata.org/entity/Q466'), ('name', 'World Wide Web')]
[('system', 'http://www.wikidata.org/entity/Q370979'), ('name', 'Amigaguide')]
[('system', 'http://www.wikidata.org/entity/Q2385520'), ('name', 'ENQUIRE')]
[('system', 'http://www.wikidata.org/entity/Q4994212'), ('name', 'Hypertext Editing System')]
[('system', 'http://www.wikidata.org/entity/Q5448331'), ('name', 'File Retrieval and Editing System')]
[('system', 'http://www.wikidata.org/entity/Q785345'), ('name', 'Project Xanadu')]
[('system', 'http://www.wikidata.org/entity/Q1799609'), ('name', 'HyTime')]
[('system', 'http://www.wikidata.org/entity/Q7742259'), ('name', 'The Interactive Encyclopedia System')]
[('system', 'http://www.wikidata.org/entity/Q74587961'), ('name', 'MaxThink')]
[('system', 'http://www.wikidata.org/entity/Q74590695'), ('name', 'HyperRez')]
[('system', 'http://www.wikidata.org/entity/Q74667091'), ('name', 'HOUDINI')]
[('system', 'http://www.wikidata.org/entity/Q66561170'), ('

In [23]:
# aim: check the 'inception' property of 'World Wide Web'
# result: the result set is empty. The value is probably a literal (hopefully a date), which has no sc:name label.
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q466 wdt:P571 ?obj .
   # get the label
   ?obj sc:name ?name.
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [24]:
# aim: check the 'inception' property of 'World Wide Web'
# result: we found two possible dates. When a system has multiple dates, we take the earliest one.
queryString = """
SELECT DISTINCT ?obj
WHERE {
   # bind something
   wd:Q466 wdt:P571 ?obj .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('obj', '1989-03-12T00:00:00Z')]
[('obj', '1990-12-01T00:00:00Z')]
2


Final query for this task

In [30]:
# aim: find all the 'hypertext system' instances whose earliest inception date is before 1980
# result: all the 'hypertext system' instances created before 1980, each with its earliest inception date
queryString = """
SELECT DISTINCT ?system ?name ?dateinception
WHERE {
   {
       SELECT ?system ?name (MIN(?datesinception) AS ?dateinception) WHERE{
          # bind something
          ?system wdt:P31 wd:Q65966993 ;
             # get the label
             sc:name ?name ;
             #get the date
             wdt:P571 ?datesinception .
       }GROUP BY ?system ?name
   }
   
   FILTER (?dateinception < "1980-01-01T00:00:00Z"^^xsd:dateTime)
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('system', 'http://www.wikidata.org/entity/Q1050365'), ('name', 'oNLine System'), ('dateinception', '1962-01-01T00:00:00Z')]
[('system', 'http://www.wikidata.org/entity/Q8063246'), ('name', 'ZOG'), ('dateinception', '1977-01-01T00:00:00Z')]
[('system', 'http://www.wikidata.org/entity/Q4994212'), ('name', 'Hypertext Editing System'), ('dateinception', '1967-01-01T00:00:00Z')]
[('system', 'http://www.wikidata.org/entity/Q5448331'), ('name', 'File Retrieval and Editing System'), ('dateinception', '1968-01-01T00:00:00Z')]
[('system', 'http://www.wikidata.org/entity/Q785345'), ('name', 'Project Xanadu'), ('dateinception', '1960-01-01T00:00:00Z')]
[('system', 'http://www.wikidata.org/entity/Q3625272'), ('name', 'Aspen Movie Map'), ('dateinception', '1978-01-01T00:00:00Z')]
[('system', 'http://www.wikidata.org/entity/Q5358242'), ('name', 'Electronic Document System'), ('dateinception', '1978-01-01T00:00:00Z')]
[('system', 'http://www.wikidata.org/entity/Q471'), ('name', 'Memex'), ('d
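
The logic of the final query (take the earliest inception date per system, then keep the systems whose earliest date precedes 1980) can be checked offline. This is a minimal sketch on a few rows copied from the outputs above; `parse_wikidata_date` and `earliest_before` are illustrative helpers, and the timestamp format is assumed to always match the one printed by the endpoint.

```python
from datetime import datetime, timezone

def parse_wikidata_date(text):
    """Parse timestamps such as '1989-03-12T00:00:00Z' into aware datetimes."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)

def earliest_before(rows, cutoff):
    """Mimic MIN(?date) ... GROUP BY ?system followed by FILTER(?date < cutoff)."""
    earliest = {}
    for system, date_text in rows:
        date = parse_wikidata_date(date_text)
        if system not in earliest or date < earliest[system]:
            earliest[system] = date
    return {s: d for s, d in earliest.items() if d < cutoff}

# (system, inception) rows taken from the outputs shown earlier
rows = [
    ("wd:Q466", "1989-03-12T00:00:00Z"),
    ("wd:Q466", "1990-12-01T00:00:00Z"),
    ("wd:Q785345", "1960-01-01T00:00:00Z"),
]
cutoff = datetime(1980, 1, 1, tzinfo=timezone.utc)
selected = earliest_before(rows, cutoff)
print(sorted(selected))
```

Aggregating before filtering matters: filtering the raw dates first would also select a system that has one early and one late date, but would report it without guaranteeing that a single date per system is returned.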

## Task 4
Identify the BGP for computer scientist

Final query for this task

In [31]:
# aim: check the property 'occupation' for 'Tim Berners-Lee'. He should be considered also as a computer scientist among other possible jobs
# result: the resource 'computer scientist' (wd:Q82594) is found among his occupations
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q80 wdt:P106 ?obj .
   # get the label
   ?obj sc:name ?name.
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q1622272'), ('name', 'university teacher')]
[('obj', 'http://www.wikidata.org/entity/Q169470'), ('name', 'physicist')]
[('obj', 'http://www.wikidata.org/entity/Q205375'), ('name', 'inventor')]
[('obj', 'http://www.wikidata.org/entity/Q81096'), ('name', 'engineer')]
[('obj', 'http://www.wikidata.org/entity/Q82594'), ('name', 'computer scientist')]
[('obj', 'http://www.wikidata.org/entity/Q5482740'), ('name', 'programmer')]
[('obj', 'http://www.wikidata.org/entity/Q6859454'), ('name', 'web developer')]
7


## Task 5
Find how many computer scientists there are for each continent (consider their citizenship). Return the IRI and name of the continent and the number of scientists for each continent.

In [6]:
# aim: get all the properties of 'computer scientist'
# result: the properties of 'computer scientist'
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q82594 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1014'), ('name', 'Art & Architecture Thesaurus ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1036'), ('name', 'Dewey Decimal Classification')]
[('p', 'http://www.wikidata.org/prop/direct/P1245'), ('name', 'OmegaWiki Defined Meaning')]
[('p', 'http://www.wikidata.org/prop/direct/P1296'), ('name', 'Gran Enciclopèdia Catalana ID')]
[('p', 'http://www.wikidata.org/prop/direct/P18'), ('name', 'image')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from')]
[('p', 'http://www.wikidata.org/prop/direct/P227'), ('name', 'GND ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2283'), ('name', 'uses')]
[('p', 'http://www.wikidata.org/prop/direct/P2354'), ('name', 'has list')]
[('p', 'http://www.wikidata.org/prop/direct/P244'), ('name', 'Library of Congress authority ID')]
[('p', 'http://www.wikidata.org/prop/direct/P268'), ('name', 'Bibliothèque nationale de France ID')]
[('p', 'http://www.wikidata.org/prop/direct

In [3]:
# aim: check the property 'country of citizenship' for 'Tim Berners-Lee', then look for a property of the country that relates it to a continent
# result: the property 'continent' is found
queryString = """
SELECT DISTINCT ?country ?countryname ?p ?pname
WHERE {
   # bind something
    wd:Q80 wdt:P27 ?country .
   
   # get the country label
   ?country sc:name ?countryname ;
   #get the country properties
            ?p ?obj .
   ?p sc:name ?pname .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('country', 'http://www.wikidata.org/entity/Q145'), ('countryname', 'United Kingdom'), ('p', 'http://www.wikidata.org/prop/direct/P1036'), ('pname', 'Dewey Decimal Classification')]
[('country', 'http://www.wikidata.org/entity/Q145'), ('countryname', 'United Kingdom'), ('p', 'http://www.wikidata.org/prop/direct/P1081'), ('pname', 'Human Development Index')]
[('country', 'http://www.wikidata.org/entity/Q145'), ('countryname', 'United Kingdom'), ('p', 'http://www.wikidata.org/prop/direct/P1082'), ('pname', 'population')]
[('country', 'http://www.wikidata.org/entity/Q145'), ('countryname', 'United Kingdom'), ('p', 'http://www.wikidata.org/prop/direct/P1125'), ('pname', 'Gini coefficient')]
[('country', 'http://www.wikidata.org/entity/Q145'), ('countryname', 'United Kingdom'), ('p', 'http://www.wikidata.org/prop/direct/P1151'), ('pname', "topic's main Wikimedia portal")]
[('country', 'http://www.wikidata.org/entity/Q145'), ('countryname', 'United Kingdom'), ('p', 'http://www.wikid

In [5]:
# aim: try getting some persons with their job and continent by filtering the job label with 'computer scientist'
#      (note for the reader: the next queries are unnecessary; we already had the 'computer scientist' URI but had
#      forgotten it when writing them. The URI is used in the last cells of this task to select the job of
#      the considered people)
# result: the result set is empty. Maybe the filter is not built correctly
queryString = """
SELECT DISTINCT ?personname ?continent ?continentname
WHERE {
   # bind something
   ?person wdt:P106 ?job ;
           wdt:P27 ?country ;
           sc:name ?personname .
           
   # getting the continent
   ?country wdt:P30 ?continent .
   
   ?continent sc:name ?continentname .
   
   # get the job label
   ?job sc:name ?jobname .
   FILTER (?jobname = 'computer scientist'^^xsd:string)
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [6]:
# aim: try getting some persons with their job and continents without filter to check if the above filter was incorrect
# result: the above filter was incorrect. Maybe double quotes?
queryString = """
SELECT DISTINCT ?personname ?jobname ?continentname
WHERE {
   # bind something
   ?person wdt:P106 ?job ;
           wdt:P27 ?country ;
           sc:name ?personname .
           
   # getting the continent
   ?country wdt:P30 ?continent .
   # getting the continent label
   ?continent sc:name ?continentname .
   
   # get the job label
   ?job sc:name ?jobname .
   #FILTER (?jobname = 'computer scientist')
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('personname', 'Zhao Weigu'), ('jobname', 'military commander'), ('continentname', 'Asia')]
[('personname', 'Wei Xianxin'), ('jobname', 'politician'), ('continentname', 'Asia')]
[('personname', 'Qooqai tayu'), ('jobname', 'tribal chief'), ('continentname', 'Asia')]
[('personname', 'Zhao Shoudu'), ('jobname', 'military commander'), ('continentname', 'Asia')]
[('personname', 'Zhao Weiji'), ('jobname', 'military commander'), ('continentname', 'Asia')]
[('personname', 'Zhao Defang'), ('jobname', 'military personnel'), ('continentname', 'Asia')]
[('personname', 'Emperor Taizu of Song'), ('jobname', 'ruler'), ('continentname', 'Asia')]
[('personname', 'Emperor Taizu of Song'), ('jobname', 'military commander'), ('continentname', 'Asia')]
[('personname', 'Zhao Dezhao'), ('jobname', 'military personnel'), ('continentname', 'Asia')]
[('personname', 'Qian Chen'), ('jobname', 'politician'), ('continentname', 'Asia')]
10


In [15]:
# aim: try getting some persons with their job and continents by filtering their job label with 'computer scientist'
# result: empty set is found. The filter is still not built correctly... Maybe without xsd:string ?
queryString = """
SELECT DISTINCT ?personname ?jobname ?continentname
WHERE {
   # bind something
   ?person wdt:P106 ?job ;
           wdt:P27 ?country ;
           sc:name ?personname .
           
   # getting the continent
   ?country wdt:P30 ?continent .
   # getting the continent label
   ?continent sc:name ?continentname .
   
   # get the job label
   ?job sc:name ?jobname .
   FILTER (?jobname = "computer scientist"^^xsd:string)
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [16]:
# aim: try getting some persons with their job and continents by filtering their job label with 'computer scientist'
# result: empty set is found. The filter is still not built correctly. Let's go with a regex
queryString = """
SELECT DISTINCT ?personname ?jobname ?continentname
WHERE {
   # bind something
   ?person wdt:P106 ?job ;
           wdt:P27 ?country ;
           sc:name ?personname .
           
   # getting the continent
   ?country wdt:P30 ?continent .
   # getting the continent label
   ?continent sc:name ?continentname .
   
   # get the job label
   ?job sc:name ?jobname .
   FILTER (?jobname = "computer scientist")
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [9]:
# aim: try getting some persons with their job and continents by filtering their job label with 'computer scientist' through a regex
# result: the regex is fine but it's slow. We try with exact regex matching
queryString = """
SELECT DISTINCT ?personname ?jobname ?continentname
WHERE {
   # bind something
   ?person wdt:P106 ?job ;
           wdt:P27 ?country ;
           sc:name ?personname .
           
   # getting the continent
   ?country wdt:P30 ?continent .
   # getting the continent label
   ?continent sc:name ?continentname .
   
   # get the job label
   ?job sc:name ?jobname .
   FILTER regex(?jobname, "computer scientist", "i")
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('personname', 'Seinosuke Toda'), ('jobname', 'computer scientist'), ('continentname', 'Asia')]
[('personname', 'Kevin Poulsen'), ('jobname', 'computer scientist'), ('continentname', 'North America')]
[('personname', 'Kevin Poulsen'), ('jobname', 'computer scientist'), ('continentname', 'Insular Oceania')]
[('personname', 'Michael Barr'), ('jobname', 'computer scientist'), ('continentname', 'North America')]
[('personname', 'Valeria de Paiva'), ('jobname', 'computer scientist'), ('continentname', 'South America')]
[('personname', 'Valeria de Paiva'), ('jobname', 'computer scientist'), ('continentname', 'Europe')]
[('personname', 'Ernesto Damiani'), ('jobname', 'computer scientist'), ('continentname', 'Europe')]
[('personname', 'Liang-Jie Zhang'), ('jobname', 'computer scientist'), ('continentname', 'North America')]
[('personname', 'Liang-Jie Zhang'), ('jobname', 'computer scientist'), ('continentname', 'Insular Oceania')]
[('personname', 'Jeffrey Shallit'), ('jobname', 'compu
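
The output above shows the same person under more than one continent, either because of multiple citizenships or because a country spans several continents. When counting scientists per continent, each person should therefore be counted once per continent, which is what `COUNT(DISTINCT ?person)` with `GROUP BY ?continent` would express in SPARQL. A minimal offline sketch of that counting, using rows copied from the output above plus one artificial duplicate; `count_people_per_continent` is an illustrative name.

```python
from collections import defaultdict

def count_people_per_continent(rows):
    """Count distinct people per continent from (person, continent) rows."""
    people = defaultdict(set)
    for person, continent in rows:
        # a set ignores repeated rows for the same person and continent
        people[continent].add(person)
    return {continent: len(names) for continent, names in people.items()}

rows = [
    ("Seinosuke Toda", "Asia"),
    ("Kevin Poulsen", "North America"),
    ("Kevin Poulsen", "Insular Oceania"),
    ("Valeria de Paiva", "South America"),
    ("Valeria de Paiva", "Europe"),
    ("Ernesto Damiani", "Europe"),
    ("Ernesto Damiani", "Europe"),  # artificial duplicate, not in the output
]
counts = count_people_per_continent(rows)
print(counts)
```

Names are used as keys here only because the exploratory queries return names; in the actual workload the person IRI is the safer key, since two different people can share a name.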

In [10]:
# aim: try getting some persons with their job and continents by filtering their job label with 'computer scientist' through an exact matching regex
# result: the regex is not working. It's better to use the 'computer scientist's URI for filtering.

queryString = """
SELECT DISTINCT ?personname ?jobname ?continentname
WHERE {
   # bind something
   ?person wdt:P106 ?job ;
           wdt:P27 ?country ;
           sc:name ?personname .
           
   # getting the continent
   ?country wdt:P30 ?continent .
   # getting the continent label
   ?continent sc:name ?continentname .
   
   # get the job label
   ?job sc:name ?jobname .
   FILTER regex(?jobname, "^computer scientist$")
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [11]:
# aim: try getting some persons with their job and continents by filtering their job label with 'computer scientist' through a regex obtaining the 
#      'computer scientist's URI
# result: we got the 'computer scientist's URI. We already had it so all the above steps could have been avoided!
queryString = """
SELECT DISTINCT ?personname ?job ?jobname ?continentname
WHERE {
   # bind something
   ?person wdt:P106 ?job ;
           wdt:P27 ?country ;
           sc:name ?personname .
           
   # getting the continent
   ?country wdt:P30 ?continent .
   # getting the continent label
   ?continent sc:name ?continentname .
   
   # get the job label
   ?job sc:name ?jobname .
   FILTER regex(?jobname, "computer scientist")
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('personname', 'Seinosuke Toda'), ('job', 'http://www.wikidata.org/entity/Q82594'), ('jobname', 'computer scientist'), ('continentname', 'Asia')]
[('personname', 'Kevin Poulsen'), ('job', 'http://www.wikidata.org/entity/Q82594'), ('jobname', 'computer scientist'), ('continentname', 'North America')]
[('personname', 'Kevin Poulsen'), ('job', 'http://www.wikidata.org/entity/Q82594'), ('jobname', 'computer scientist'), ('continentname', 'Insular Oceania')]
[('personname', 'Michael Barr'), ('job', 'http://www.wikidata.org/entity/Q82594'), ('jobname', 'computer scientist'), ('continentname', 'North America')]
[('personname', 'Valeria de Paiva'), ('job', 'http://www.wikidata.org/entity/Q82594'), ('jobname', 'computer scientist'), ('continentname', 'South America')]
[('personname', 'Valeria de Paiva'), ('job', 'http://www.wikidata.org/entity/Q82594'), ('jobname', 'computer scientist'), ('continentname', 'Europe')]
[('personname', 'Ernesto Damiani'), ('job', 'http://www.wikidata.org/e

In [12]:
# aim: get some persons with their job and continent to check that everything is fine before grouping
# result: the result set seems to be ok
queryString = """
SELECT DISTINCT ?personname ?jobname ?continentname
WHERE {
   # bind something
   ?person wdt:P106 ?job ;
           wdt:P27 ?country ;
           sc:name ?personname .
           
   # getting the continent
   ?country wdt:P30 ?continent .
   # getting the continent label
   ?continent sc:name ?continentname .
   
   # get the job label
   ?job sc:name ?jobname .
   FILTER (?job = wd:Q82594)
}
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('personname', 'David Naccache'), ('jobname', 'computer scientist'), ('continentname', 'Africa')]
[('personname', 'Carsten Haitzler'), ('jobname', 'computer scientist'), ('continentname', 'Africa')]
[('personname', 'Clarisse Iribagiza'), ('jobname', 'computer scientist'), ('continentname', 'Africa')]
[('personname', 'Seymour Papert'), ('jobname', 'computer scientist'), ('continentname', 'Africa')]
[('personname', 'Wanda Orlikowski'), ('jobname', 'computer scientist'), ('continentname', 'Africa')]
[('personname', 'Theo de Raadt'), ('jobname', 'computer scientist'), ('continentname', 'Africa')]
[('personname', 'Lior Pachter'), ('jobname', 'computer scientist'), ('continentname', 'Africa')]
[('personname', 'Valeria de Paiva'), ('jobname', 'computer scientist'), ('continentname', 'South America')]
[('personname', 'Renato M. E. Sabbatini'), ('jobname', 'computer scientist'), ('continentname', 'South America')]
[('personname', 'Jean Paul Jacob'), ('jobname', 'computer scientist'), (

Final query for this task

In [14]:
# aim: get the number of persons who have worked as 'computer scientist' for each continent
# result: the number of persons who have worked as 'computer scientist' for each continent
queryString = """
SELECT DISTINCT ?continent ?continentname (COUNT(*) AS ?numberofscientists) 
WHERE {
   # bind something
   ?person wdt:P106 ?job ;
           wdt:P27 ?country ;
           sc:name ?personname .
           
   # getting the continent
   ?country wdt:P30 ?continent .
   # getting the continent label
   ?continent sc:name ?continentname .
   
   # get the job label
   ?job sc:name ?jobname .
   FILTER (?job = wd:Q82594)
} GROUP BY ?continent ?continentname
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('continent', 'http://www.wikidata.org/entity/Q18'), ('continentname', 'South America'), ('numberofscientists', '92')]
[('continent', 'http://www.wikidata.org/entity/Q538'), ('continentname', 'Insular Oceania'), ('numberofscientists', '2479')]
[('continent', 'http://www.wikidata.org/entity/Q49'), ('continentname', 'North America'), ('numberofscientists', '2782')]
[('continent', 'http://www.wikidata.org/entity/Q46'), ('continentname', 'Europe'), ('numberofscientists', '3443')]
[('continent', 'http://www.wikidata.org/entity/Q5401'), ('continentname', 'Eurasia'), ('numberofscientists', '23')]
[('continent', 'http://www.wikidata.org/entity/Q48'), ('continentname', 'Asia'), ('numberofscientists', '908')]
[('continent', 'http://www.wikidata.org/entity/Q15'), ('continentname', 'Africa'), ('numberofscientists', '76')]
7
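The counts above come from `COUNT(*)`, which counts result rows rather than persons. A row is produced for every combination of person label, citizenship and continent, so one person can be counted more than once within a continent. Counting once per continent does not remove the overlap between continents: Kevin Poulsen appears under both North America and Insular Oceania in an earlier result. Below is a minimal sketch of a variant that counts each person once per continent, assuming the same prefixes as `prefixString`. The names `distinctCountQuery` and `count_distinct_persons` are illustrative and not part of the notebook, the variant drops the person and job label patterns (which may change which persons are included), and the sample rows are made up to show the effect.

```python
# Hypothetical variant of the final Task 5 query: each person counts once per continent.
distinctCountQuery = """
SELECT ?continent ?continentname (COUNT(DISTINCT ?person) AS ?numberofscientists)
WHERE {
   ?person wdt:P106 wd:Q82594 ;
           wdt:P27 ?country .
   ?country wdt:P30 ?continent .
   ?continent sc:name ?continentname .
}
GROUP BY ?continent ?continentname
ORDER BY DESC(?numberofscientists)
"""


def count_distinct_persons(rows):
    """Count distinct persons per continent from (person, continent) pairs."""
    seen = {}
    for person, continent in rows:
        seen.setdefault(continent, set()).add(person)
    return {continent: len(persons) for continent, persons in seen.items()}


# Illustrative rows only: the second row is a made-up duplicate,
# such as a second label for the same person would produce.
sample = [
    ("Kevin Poulsen", "North America"),
    ("Kevin Poulsen", "North America"),
    ("Kevin Poulsen", "Insular Oceania"),
    ("Michael Barr", "North America"),
]
print(count_distinct_persons(sample))
```

The query string could be passed to `run_query` like the others. The local helper only illustrates what `COUNT(DISTINCT ?person)` does to duplicated rows.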


## Task 6
Find all the computer scientists who taught at the University of Cambridge (return IRI and name)

In [18]:
# aim: get all the properties of 'university teacher'
# result: nothing really special is found. Maybe the property 'work location' helps, since we discovered that Tim Berners-Lee is or was a 'university teacher'
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q1622272 ?p ?obj .
   # get the label
   ?p sc:name ?name.
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1036'), ('name', 'Dewey Decimal Classification')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from')]
[('p', 'http://www.wikidata.org/prop/direct/P227'), ('name', 'GND ID')]
[('p', 'http://www.wikidata.org/prop/direct/P244'), ('name', 'Library of Congress authority ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2671'), ('name', 'Google Knowledge Graph ID')]
[('p', 'http://www.wikidata.org/prop/direct/P268'), ('name', 'Bibliothèque nationale de France ID')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('name', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P3417'), ('name', 'Quora topic ID')]
[('p', 'http://www.wikidata.org/prop/direct/P3911'), ('name', 'STW Thesaurus for Economics ID')]
[('p', 'http://www.wikidata.org/prop/direct/P425'), ('name', 'field of this occupation')]
[('p', 'http://www.wikidata.org/

In [19]:
# aim: get all the objects related to 'work location' of Tim Berners-Lee
# result: we just found that he worked at CERN: no relation with universities. Maybe instead of checking his profession as a teacher, we should check potential
#         relations with his student career
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q80 wdt:P937 ?obj .
   # get the label
   ?obj sc:name ?name.
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q42944'), ('name', 'CERN')]
1


In [20]:
# aim: get all the objects related to 'educated at' and 'academic degree' to find possible universities
# result: found some objects that might be related to universities
queryString = """
SELECT DISTINCT ?education ?educationname ?obj ?name
WHERE {
   # bind something
   wd:Q80 ?education ?obj .
   # get the label
   ?obj sc:name ?name .
   ?education sc:name ?educationname.
   FILTER (?education IN (wdt:P512,wdt:P69))
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('education', 'http://www.wikidata.org/prop/direct/P69'), ('educationname', 'educated at'), ('obj', 'http://www.wikidata.org/entity/Q73094'), ('name', "The Queen's College")]
[('education', 'http://www.wikidata.org/prop/direct/P69'), ('educationname', 'educated at'), ('obj', 'http://www.wikidata.org/entity/Q5369138'), ('name', 'Emanuel School')]
[('education', 'http://www.wikidata.org/prop/direct/P512'), ('educationname', 'academic degree'), ('obj', 'http://www.wikidata.org/entity/Q121594'), ('name', 'professor')]
[('education', 'http://www.wikidata.org/prop/direct/P512'), ('educationname', 'academic degree'), ('obj', 'http://www.wikidata.org/entity/Q1765120'), ('name', 'Bachelor of Arts')]
4
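Each result row is printed as a list of `(variable, value)` pairs, which is awkward to filter by eye. Below is a minimal sketch of two helpers for inspecting such rows, assuming `run_query` returns the list it prints. The names `rows_to_dicts` and `column` are hypothetical and not part of the notebook; the sample rows are copied from the output above.

```python
def rows_to_dicts(rows):
    """Turn [[(var, value), ...], ...] into [{var: value, ...}, ...]."""
    return [dict(row) for row in rows]


def column(rows, var):
    """Collect the values bound to one variable, skipping rows where it is unbound."""
    return [d[var] for d in rows_to_dicts(rows) if var in d]


# Two rows copied from the result above, in the shape printed by run_query.
sample = [
    [("education", "http://www.wikidata.org/prop/direct/P69"), ("educationname", "educated at"),
     ("obj", "http://www.wikidata.org/entity/Q73094"), ("name", "The Queen's College")],
    [("education", "http://www.wikidata.org/prop/direct/P512"), ("educationname", "academic degree"),
     ("obj", "http://www.wikidata.org/entity/Q1765120"), ("name", "Bachelor of Arts")],
]

# Keep only the institutions reached through 'educated at'.
educated_at = [d["name"] for d in rows_to_dicts(sample) if d["educationname"] == "educated at"]
print(educated_at)
```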


In [21]:
# aim: get all the objects and paths related to 'educated at' and 'academic degree' to find possible universities
# result: there are many results. We decide to check something about 'Bachelor of Arts'
queryString = """
SELECT DISTINCT ?educationname ?obj ?name ?p2 ?namep2
WHERE {
   # bind something
   wd:Q80 ?education ?obj .
   ?obj ?p2 ?obj2 .
   ?p2 sc:name ?namep2 .
   #?obj2 sc:name ?nameobj2 .
   # get the label
   ?obj sc:name ?name .
   ?education sc:name ?educationname.
   FILTER (?education IN (wdt:P512,wdt:P69))
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('educationname', 'academic degree'), ('obj', 'http://www.wikidata.org/entity/Q121594'), ('name', 'professor'), ('p2', 'http://www.wikidata.org/prop/direct/P3729'), ('namep2', 'next lower rank')]
[('educationname', 'educated at'), ('obj', 'http://www.wikidata.org/entity/Q73094'), ('name', "The Queen's College"), ('p2', 'http://www.wikidata.org/prop/direct/P3057'), ('namep2', 'Charity Commission no.')]
[('educationname', 'educated at'), ('obj', 'http://www.wikidata.org/entity/Q73094'), ('name', "The Queen's College"), ('p2', 'http://www.wikidata.org/prop/direct/P1004'), ('namep2', 'MusicBrainz place ID')]
[('educationname', 'academic degree'), ('obj', 'http://www.wikidata.org/entity/Q121594'), ('name', 'professor'), ('p2', 'http://www.wikidata.org/prop/direct/P1014'), ('namep2', 'Art & Architecture Thesaurus ID')]
[('educationname', 'educated at'), ('obj', 'http://www.wikidata.org/entity/Q73094'), ('name', "The Queen's College"), ('p2', 'http://www.wikidata.org/prop/direct/P112

In [22]:
# aim: check if 'Bachelor of Arts' is maybe one of the possible degrees in some university
# result: we found that it's a bachelor's degree
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q1765120 wdt:P31 ?obj .
   # get the label
   ?obj sc:name ?name .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q163727'), ('name', "bachelor's degree")]
1


In [23]:
# aim: check properties and objects related to 'bachelor's degree'
# result: we don't find anything particularly useful
queryString = """
SELECT DISTINCT ?p ?name ?obj ?objname
WHERE {
   # bind something
   wd:Q163727 ?p ?obj .
   # get the label
   ?p sc:name ?name .
   ?obj sc:name ?objname .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from'), ('obj', 'http://www.wikidata.org/entity/Q749020'), ('objname', 'licentiate')]
[('p', 'http://www.wikidata.org/prop/direct/P460'), ('name', 'said to be the same as'), ('obj', 'http://www.wikidata.org/entity/Q749020'), ('objname', 'licentiate')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('name', 'has part'), ('obj', 'http://www.wikidata.org/entity/Q4838815'), ('objname', 'Bachelor of Music')]
[('p', 'http://www.wikidata.org/prop/direct/P460'), ('name', 'said to be the same as'), ('obj', 'http://www.wikidata.org/entity/Q1054679'), ('objname', 'laurea')]
[('p', 'http://www.wikidata.org/prop/direct/P1001'), ('name', 'applies to jurisdiction'), ('obj', 'http://www.wikidata.org/entity/Q145'), ('objname', 'United Kingdom')]
[('p', 'http://www.wikidata.org/prop/direct/P1001'), ('name', 'applies to jurisdiction'), ('obj', 'http://www.wikidata.org/entity/Q16'), ('objname', 'Canada')]
[('p', 'http://

In [24]:
# aim: get all the objects and properties related to 'university teacher'
# result: we found some properties that might relate to a possible university class or instance
queryString = """
SELECT DISTINCT ?p ?name ?obj ?objname
WHERE {
   # bind something
   wd:Q1622272 ?p ?obj .
   # get the label
   ?p sc:name ?name .
   ?obj sc:name ?objname .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P460'), ('name', 'said to be the same as'), ('obj', 'http://www.wikidata.org/entity/Q1569495'), ('objname', 'lecturer')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from'), ('obj', 'http://www.wikidata.org/entity/Q5669847'), ('objname', 'assistant professor')]
[('p', 'http://www.wikidata.org/prop/direct/P425'), ('name', 'field of this occupation'), ('obj', 'http://www.wikidata.org/entity/Q136822'), ('objname', 'higher education')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from'), ('obj', 'http://www.wikidata.org/entity/Q121594'), ('objname', 'professor')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from'), ('obj', 'http://www.wikidata.org/entity/Q28004591'), ('objname', 'professeur des universités')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from'), ('obj', 'http://www.wikidata.org/entity/Q37226'), ('objname', 'teach

In [25]:
# aim: get all the objects and properties related to 'teacher' and 'academic'
# result: we did not find useful data
queryString = """
SELECT DISTINCT ?p ?name ?obj ?objname
WHERE {
   # bind something
   ?s ?p ?obj .
   # get the label
   ?p sc:name ?name .
   ?obj sc:name ?objname .
   FILTER (?s IN (wd:Q37226, wd:Q3400985))
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('name', 'described by source'), ('obj', 'http://www.wikidata.org/entity/Q106727050'), ('objname', 'Metropolitan Museum of Art Tagging Vocabulary')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from'), ('obj', 'http://www.wikidata.org/entity/Q1231865'), ('objname', 'pedagogue')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of'), ('obj', 'http://www.wikidata.org/entity/Q12737077'), ('objname', 'occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('name', 'different from'), ('obj', 'http://www.wikidata.org/entity/Q1622272'), ('objname', 'university teacher')]
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('name', 'described by source'), ('obj', 'http://www.wikidata.org/entity/Q19180675'), ('objname', 'Small Brockhaus and Efron Encyclopedic Dictionary')]
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('name', 'described by source'), ('obj', 'http://

In [26]:
# aim: get all the entities related to 'university teacher'
# result: we found some university teachers that can be used to get other data
queryString = """
SELECT DISTINCT ?s ?sname ?p ?name
WHERE {
   # bind something
   ?s ?p wd:Q1622272 .
   # get the label
   ?p sc:name ?name .
   ?s sc:name ?sname .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)


Results
[('s', 'http://www.wikidata.org/entity/Q28475249'), ('sname', 'Joaquín Garrigues Díaz-Cañabate'), ('p', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('s', 'http://www.wikidata.org/entity/Q8166543'), ('sname', 'Hajo Bernett'), ('p', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('s', 'http://www.wikidata.org/entity/Q8190500'), ('sname', 'Seinosuke Toda'), ('p', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('s', 'http://www.wikidata.org/entity/Q8194957'), ('sname', 'Alexandre Azedo Lacerda'), ('p', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('s', 'http://www.wikidata.org/entity/Q8196240'), ('sname', 'Alois Richard Nykl'), ('p', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('s', 'http://www.wikidata.org/entity/Q8201210'), ('sname', 'Antonio Escohotado'), ('p', 'http://www.wikidata.org/prop/direct/P106'), ('name', 'occupation')]
[('s', 'http://www.wikidata.o

In [27]:
# aim: at this point we try again, following the paths from 'Emanuel School' and 'The Queen's College', both related to Tim Berners-Lee
# result: we found the University of Oxford! Maybe now we can retrieve the University of Cambridge and get the list of professors who worked there!
queryString = """
SELECT DISTINCT ?s ?p ?name ?obj ?objname
WHERE {
   # bind something
   ?s ?p ?obj .
   # get the label
   ?p sc:name ?name .
   ?obj sc:name ?objname .
   FILTER (?s IN (wd:Q5369138,wd:Q73094))
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q5369138'), ('p', 'http://www.wikidata.org/prop/direct/P112'), ('name', 'founded by'), ('obj', 'http://www.wikidata.org/entity/Q4768325'), ('objname', 'Anne Fiennes')]
[('s', 'http://www.wikidata.org/entity/Q5369138'), ('p', 'http://www.wikidata.org/prop/direct/P140'), ('name', 'religion'), ('obj', 'http://www.wikidata.org/entity/Q106039'), ('objname', 'Christian')]
[('s', 'http://www.wikidata.org/entity/Q73094'), ('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of'), ('obj', 'http://www.wikidata.org/entity/Q19844914'), ('objname', 'university building')]
[('s', 'http://www.wikidata.org/entity/Q73094'), ('p', 'http://www.wikidata.org/prop/direct/P31'), ('name', 'instance of'), ('obj', 'http://www.wikidata.org/entity/Q2581649'), ('objname', 'college of the University of Oxford')]
[('s', 'http://www.wikidata.org/entity/Q5369138'), ('p', 'http://www.wikidata.org/prop/direct/P131'), ('name', 'located in the administrative t

In [28]:
# aim: get all the properties and objects related to 'University of Oxford'
# result: we found the object 'public university' and 'higher education institution' so maybe the 'University of Cambridge' is an instance of them
queryString = """
SELECT DISTINCT ?p ?name ?obj ?objname
WHERE {
   # bind something
   wd:Q34433 ?p ?obj .
   # get the label
   ?p sc:name ?name .
   ?obj sc:name ?objname .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P3876'), ('name', 'category for alumni of educational institution'), ('obj', 'http://www.wikidata.org/entity/Q8237442'), ('objname', 'Category:Alumni of the University of Oxford')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('name', 'has part'), ('obj', 'http://www.wikidata.org/entity/Q128497'), ('objname', "St Hugh's College")]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('name', 'has part'), ('obj', 'http://www.wikidata.org/entity/Q2035359'), ('objname', "St Antony's College")]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('name', 'has part'), ('obj', 'http://www.wikidata.org/entity/Q5273413'), ('objname', "St Stephen's House")]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('name', 'has part'), ('obj', 'http://www.wikidata.org/entity/Q1546625'), ('objname', 'Griffith Institute')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('name', 'has part'), ('obj', 'http://www.wikidata.org/entity/Q73094'), 

In [31]:
# aim: get all the entities that are instances of 'public university' or 'higher education institution' and whose label contains 'Cambridge'
# result: we found a business school that is probably not related to the actual 'University of Cambridge'...
queryString = """
SELECT DISTINCT ?s ?sname
WHERE {
   # bind something
   ?s wdt:P31 ?obj .
   # get the label
   ?s sc:name ?sname .
   FILTER (?obj IN (wd:Q875538,wd:Q38723)) .
   
   FILTER regex(?sname, "cambridge", "i")
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q22914845'), ('sname', 'Cambridge Business School in Prague')]
1


In [34]:
# aim: get all the entities whose label contains 'university' and whose headquarters or location label contains 'Cambridge', using regexes
# result: we finally found the 'University of Cambridge'
queryString = """
SELECT DISTINCT ?s ?sname
WHERE {
   # bind something
   ?s ?p ?obj .
   # get the label
   ?s sc:name ?sname .
   ?obj sc:name ?objname
   FILTER (?p IN (wdt:P159,wdt:P276)) .
   FILTER regex(?sname, "university", "i") .
   FILTER regex(?objname, "cambridge", "i") .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q784593'), ('sname', 'Cambridge University Boat Club')]
[('s', 'http://www.wikidata.org/entity/Q35794'), ('sname', 'University of Cambridge')]
[('s', 'http://www.wikidata.org/entity/Q912887'), ('sname', 'Cambridge University Press')]
[('s', 'http://www.wikidata.org/entity/Q540751'), ('sname', 'Anglia Ruskin University')]
[('s', 'http://www.wikidata.org/entity/Q100993034'), ('sname', 'Anglia Ruskin University - Cambridge Campus')]
[('s', 'http://www.wikidata.org/entity/Q100993031'), ('sname', 'Anglia Ruskin University Ashcroft International Business School - Cambridge Campus')]
[('s', 'http://www.wikidata.org/entity/Q100993035'), ('sname', 'Anglia Ruskin University Department of English Communication Film and Media Cambridge Campus')]
[('s', 'http://www.wikidata.org/entity/Q101046178'), ('sname', 'University of Cambridge Primary Care Unit')]
[('s', 'http://www.wikidata.org/entity/Q101010473'), ('sname', 'Anglia Ruskin University Department 
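Once the exact label is known, a lookup by label shows which classes the entity belongs to, which may explain why the earlier class-based search missed it. Below is a minimal sketch, assuming labels are stored under `sc:name` as in the other queries and that comparing on `STR(?sname)` is acceptable whether or not the literal carries a language tag. The function name `label_lookup_query` is hypothetical.

```python
def label_lookup_query(label, limit=50):
    """Build a query returning the entities with exactly this label and their classes."""
    # Escape backslashes and double quotes so the label is a valid SPARQL string literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return f"""
SELECT DISTINCT ?s ?type ?typename
WHERE {{
   ?s sc:name ?sname ;
      wdt:P31 ?type .
   ?type sc:name ?typename .
   FILTER (STR(?sname) = "{escaped}")
}}
LIMIT {limit}
"""


cambridgeLookup = label_lookup_query("University of Cambridge")
print(cambridgeLookup)
```

The resulting string could be passed to `run_query` like the other queries. An exact comparison is stricter than the regexes used above, so it only helps after the label has been discovered.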

In [2]:
# aim: get all the properties related to the 'University of Cambridge'
# result: we just found a 'has list' property that might contain a list of students/professors
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   wd:Q35794 ?p ?obj .
   # get the label
   ?p sc:name ?name .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P6733'), ('name', 'National Taxonomy of Exempt Entities code')]
[('p', 'http://www.wikidata.org/prop/direct/P1004'), ('name', 'MusicBrainz place ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1005'), ('name', 'Portuguese National Library ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1017'), ('name', 'Vatican Library ID (former scheme)')]
[('p', 'http://www.wikidata.org/prop/direct/P1296'), ('name', 'Gran Enciclopèdia Catalana ID')]
[('p', 'http://www.wikidata.org/prop/direct/P131'), ('name', 'located in the administrative territorial entity')]
[('p', 'http://www.wikidata.org/prop/direct/P1329'), ('name', 'phone number')]
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('name', 'described by source')]
[('p', 'http://www.wikidata.org/prop/direct/P1416'), ('name', 'affiliation')]
[('p', 'http://www.wikidata.org/prop/direct/P1417'), ('name', 'Encyclopædia Britannica Online ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1

In [3]:
# aim: get the objects of 'has list'
# result: we found an object that might be perfect for the final query!
queryString = """
SELECT DISTINCT ?obj ?name
WHERE {
   # bind something
   wd:Q35794 wdt:P2354 ?obj .
   # get the label
   ?obj sc:name ?name .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q6635048'), ('name', 'list of professorships at the University of Cambridge')]
1


In [4]:
# aim: get all the objects and properties related to 'list of professorships at the University of Cambridge'
# result: we didn't find the actual list of professors 
queryString = """
SELECT DISTINCT ?p ?pname ?obj ?name
WHERE {
   # bind something
   wd:Q6635048 ?p ?obj .
   # get the label
   ?p sc:name ?pname .
   ?obj sc:name ?name .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('pname', 'instance of'), ('obj', 'http://www.wikidata.org/entity/Q13406463'), ('name', 'Wikimedia list article')]
[('p', 'http://www.wikidata.org/prop/direct/P360'), ('pname', 'is a list of'), ('obj', 'http://www.wikidata.org/entity/Q1363750'), ('name', 'chair')]
[('p', 'http://www.wikidata.org/prop/direct/P360'), ('pname', 'is a list of'), ('obj', 'http://www.wikidata.org/entity/Q81752537'), ('name', 'professorship')]
[('p', 'http://www.wikidata.org/prop/direct/P360'), ('pname', 'is a list of'), ('obj', 'http://www.wikidata.org/entity/Q2138652'), ('name', 'Regius Professor')]
[('p', 'http://www.wikidata.org/prop/direct/P1754'), ('pname', 'category related to list'), ('obj', 'http://www.wikidata.org/entity/Q7018986'), ('name', 'Category:Professorships at the University of Cambridge')]
5


In [3]:
# aim: get all the properties related to 'list of professorships at the University of Cambridge'
# result: we didn't find the actual list of professors but maybe we need to explore the 'Category:Professorships at the University of Cambridge' object
queryString = """
SELECT DISTINCT ?p ?pname
WHERE {
   # bind something
   ?s ?p  wd:Q6635048 .
   # get the label
   ?p sc:name ?pname .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1753'), ('pname', 'list related to category')]
[('p', 'http://www.wikidata.org/prop/direct/P2354'), ('pname', 'has list')]
2


In [5]:
# aim: get all the objects and properties related to 'Category:Professorships at the University of Cambridge'
# result: we didn't find anything, we are looping...
queryString = """
SELECT DISTINCT ?p ?pname ?obj ?name
WHERE {
   # bind something
   wd:Q7018986 ?p ?obj .
   # get the label
   ?p sc:name ?pname .
   ?obj sc:name ?name .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P971'), ('pname', 'category combines topics'), ('obj', 'http://www.wikidata.org/entity/Q35794'), ('name', 'University of Cambridge')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('pname', 'instance of'), ('obj', 'http://www.wikidata.org/entity/Q4167836'), ('name', 'Wikimedia category')]
[('p', 'http://www.wikidata.org/prop/direct/P971'), ('pname', 'category combines topics'), ('obj', 'http://www.wikidata.org/entity/Q81752537'), ('name', 'professorship')]
[('p', 'http://www.wikidata.org/prop/direct/P4224'), ('pname', 'category contains'), ('obj', 'http://www.wikidata.org/entity/Q81752537'), ('name', 'professorship')]
[('p', 'http://www.wikidata.org/prop/direct/P1753'), ('pname', 'list related to category'), ('obj', 'http://www.wikidata.org/entity/Q6635048'), ('name', 'list of professorships at the University of Cambridge')]
5


In [9]:
# aim: get all the properties in relation with the 'University of Cambridge'
# result: we just found some properties related to possible workers at the university
queryString = """
SELECT DISTINCT ?p ?name
WHERE {
   # bind something
   ?s ?p wd:Q35794 .
   # get the label
   ?p sc:name ?name .
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P2389'), ('name', 'organization directed by the office or position')]
[('p', 'http://www.wikidata.org/prop/direct/P108'), ('name', 'employer')]
[('p', 'http://www.wikidata.org/prop/direct/P123'), ('name', 'publisher')]
[('p', 'http://www.wikidata.org/prop/direct/P1269'), ('name', 'facet of')]
[('p', 'http://www.wikidata.org/prop/direct/P127'), ('name', 'owned by')]
[('p', 'http://www.wikidata.org/prop/direct/P137'), ('name', 'operator')]
[('p', 'http://www.wikidata.org/prop/direct/P138'), ('name', 'named after')]
[('p', 'http://www.wikidata.org/prop/direct/P1416'), ('name', 'affiliation')]
[('p', 'http://www.wikidata.org/prop/direct/P159'), ('name', 'headquarters location')]
[('p', 'http://www.wikidata.org/prop/direct/P178'), ('name', 'developer')]
[('p', 'http://www.wikidata.org/prop/direct/P20'), ('name', 'place of death')]
[('p', 'http://www.wikidata.org/prop/direct/P2541'), ('name', 'operating area')]
[('p', 'http://www.wikidata.o

In [19]:
# aim: explore the entities linked to 'University of Cambridge' through the properties 'member of', 'employer', 'work location' and 'professorship'
# result: the first 100 rows are all 'employer' matches, so we check each property separately
queryString = """
SELECT DISTINCT ?p ?pname ?s ?name
WHERE {
   # bind something
   ?s ?p wd:Q35794 .
   # get the label
   ?p sc:name ?pname .
   ?s sc:name ?name .
   FILTER (?p IN (wdt:P463,wdt:P108,wdt:P937,wdt:P803))
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P108'), ('pname', 'employer'), ('s', 'http://www.wikidata.org/entity/Q17022827'), ('name', 'Ashok Venkitaraman')]
[('p', 'http://www.wikidata.org/prop/direct/P108'), ('pname', 'employer'), ('s', 'http://www.wikidata.org/entity/Q8990547'), ('name', 'Liu Yu')]
[('p', 'http://www.wikidata.org/prop/direct/P108'), ('pname', 'employer'), ('s', 'http://www.wikidata.org/entity/Q19335551'), ('name', 'Mildred Blaxter')]
[('p', 'http://www.wikidata.org/prop/direct/P108'), ('pname', 'employer'), ('s', 'http://www.wikidata.org/entity/Q21613245'), ('name', 'Richard Henry Yapp')]
[('p', 'http://www.wikidata.org/prop/direct/P108'), ('pname', 'employer'), ('s', 'http://www.wikidata.org/entity/Q29654050'), ('name', 'Duncan Odom')]
[('p', 'http://www.wikidata.org/prop/direct/P108'), ('pname', 'employer'), ('s', 'http://www.wikidata.org/entity/Q44172566'), ('name', 'Sharon Peacock')]
[('p', 'http://www.wikidata.org/prop/direct/P108'), ('pname', 'employer

In [20]:
# aim: explore the objects with property 'member of' in relation with 'University of Cambridge'
# result: found just one object and it's a comic character...
queryString = """
SELECT DISTINCT ?p ?pname ?s ?name
WHERE {
   # bind something
   ?s ?p wd:Q35794 .
   # get the label
   ?p sc:name ?pname .
   ?s sc:name ?name .
   FILTER (?p IN (wdt:P463))
}
LIMIT 30
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P463'), ('pname', 'member of'), ('s', 'http://www.wikidata.org/entity/Q7861585'), ('name', 'Tyrone Cash')]
1


In [21]:
# aim: explore the objects with property 'work location' in relation with 'University of Cambridge'
# result: found just a few entities
queryString = """
SELECT DISTINCT ?p ?pname ?s ?name
WHERE {
   # bind something
   ?s ?p wd:Q35794 .
   # get the label
   ?p sc:name ?pname .
   ?s sc:name ?name .
   FILTER (?p IN (wdt:P937))
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P937'), ('pname', 'work location'), ('s', 'http://www.wikidata.org/entity/Q29609215'), ('name', 'Isaiah (Shy) Arkin')]
[('p', 'http://www.wikidata.org/prop/direct/P937'), ('pname', 'work location'), ('s', 'http://www.wikidata.org/entity/Q98276775'), ('name', 'John Lee')]
[('p', 'http://www.wikidata.org/prop/direct/P937'), ('pname', 'work location'), ('s', 'http://www.wikidata.org/entity/Q5739815'), ('name', 'Herman Hedwig Bernard')]
[('p', 'http://www.wikidata.org/prop/direct/P937'), ('pname', 'work location'), ('s', 'http://www.wikidata.org/entity/Q23761569'), ('name', 'Ralph Symons')]
4


In [22]:
# aim: explore the objects with property 'professorship' in relation with 'University of Cambridge'
# result: found just a few entities
queryString = """
SELECT DISTINCT ?p ?pname ?s ?name
WHERE {
   # bind something
   ?s ?p wd:Q35794 .
   # get the label
   ?p sc:name ?pname .
   ?s sc:name ?name .
   FILTER (?p IN (wdt:P803))
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P803'), ('pname', 'professorship'), ('s', 'http://www.wikidata.org/entity/Q75349437'), ('name', 'Friedrich Max Müller')]
[('p', 'http://www.wikidata.org/prop/direct/P803'), ('pname', 'professorship'), ('s', 'http://www.wikidata.org/entity/Q532695'), ('name', 'Michael Levey')]
2
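The three separate checks above could be summarised in one query that counts the distinct subjects per property. Below is a minimal sketch, assuming the same prefixes as `prefixString` and that `sc:name` labels are available on the `wdt:` property IRIs, as the results above suggest. The function name `count_by_property_query` is hypothetical.

```python
def count_by_property_query(property_ids, target_id):
    """Build a query counting the distinct subjects linked to target_id per property."""
    values = " ".join(f"wdt:{pid}" for pid in property_ids)
    return f"""
SELECT ?p ?pname (COUNT(DISTINCT ?s) AS ?n)
WHERE {{
   VALUES ?p {{ {values} }}
   ?s ?p wd:{target_id} .
   ?p sc:name ?pname .
}}
GROUP BY ?p ?pname
ORDER BY DESC(?n)
"""


# 'member of', 'employer', 'work location', 'professorship' against 'University of Cambridge'.
countQuery = count_by_property_query(["P463", "P108", "P937", "P803"], "Q35794")
print(countQuery)
```

Unlike the filtered query with `LIMIT 100`, the aggregate is not dominated by whichever property happens to be returned first.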


In [4]:
# aim: get all the persons whose occupations include both 'university teacher' and 'computer scientist'
# result: we found persons who worked both as university teachers and as computer scientists
queryString = """
SELECT DISTINCT ?person ?personname
WHERE {
   # bind something
   ?person wdt:P106 ?job1 ;
           wdt:P106 ?job2 ;
           sc:name ?personname .
           
           
   #the person must be a computer scientist and a university teacher
   FILTER ((?job1 = wd:Q82594) && (?job2 = wd:Q1622272))
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('person', 'http://www.wikidata.org/entity/Q8190500'), ('personname', 'Seinosuke Toda')]
[('person', 'http://www.wikidata.org/entity/Q92669'), ('personname', 'Jeffrey Shallit')]
[('person', 'http://www.wikidata.org/entity/Q15917430'), ('personname', 'A. Ollongren')]
[('person', 'http://www.wikidata.org/entity/Q16216373'), ('personname', 'Klaus Fuchs-Kittowski')]
[('person', 'http://www.wikidata.org/entity/Q32772993'), ('personname', 'Dawson Engler')]
[('person', 'http://www.wikidata.org/entity/Q92881'), ('personname', 'Douglas McIlroy')]
[('person', 'http://www.wikidata.org/entity/Q173714'), ('personname', 'Max Euwe')]
[('person', 'http://www.wikidata.org/entity/Q19296289'), ('personname', 'Michael Backes')]
[('person', 'http://www.wikidata.org/entity/Q6899861'), ('personname', 'Moni Naor')]
[('person', 'http://www.wikidata.org/entity/Q433933'), ('personname', 'Vincent Rijmen')]
[('person', 'http://www.wikidata.org/entity/Q578036'), ('personname', 'Ron Rivest')]
[('person', 'h

In [6]:
# aim: get all the persons who are both 'computer scientist' and 'university teacher' or 'lecturer', and whose 'employer' is the 'University of Cambridge'
# result: we found persons who were teachers and computer scientists employed by the University of Cambridge
queryString = """
SELECT DISTINCT ?person ?personname
WHERE {
   # bind something
   ?person wdt:P106 ?job1 ;
           wdt:P106 ?job2 ;
           sc:name ?personname ;
           wdt:P108 wd:Q35794 .
           
           
   #the person must be a 'computer scientist' and a 'university teacher' or 'lecturer'
   FILTER ((?job1 = wd:Q82594) && (?job2 IN(wd:Q1622272,wd:Q1569495 ))) .
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('person', 'http://www.wikidata.org/entity/Q46633'), ('personname', 'Charles Babbage')]
[('person', 'http://www.wikidata.org/entity/Q62857'), ('personname', 'Maurice Wilkes')]
[('person', 'http://www.wikidata.org/entity/Q451770'), ('personname', 'Douglas Hartree')]
[('person', 'http://www.wikidata.org/entity/Q4707402'), ('personname', 'Alan Mycroft')]
[('person', 'http://www.wikidata.org/entity/Q10348961'), ('personname', 'Peter Russell')]
[('person', 'http://www.wikidata.org/entity/Q63126308'), ('personname', 'Peter Földiák')]
[('person', 'http://www.wikidata.org/entity/Q216811'), ('personname', 'John Venn')]
[('person', 'http://www.wikidata.org/entity/Q100586732'), ('personname', 'Ferenc Huszár')]
[('person', 'http://www.wikidata.org/entity/Q316330'), ('personname', 'Daniel Bell')]
9
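
The `FILTER` on `?job1` and `?job2` in the cell above can also be expressed by writing the constant directly in the triple pattern and listing the alternatives in a `VALUES` clause. The sketch below only builds and prints the query string. `values_clause` is a hypothetical helper that is not part of the notebook setup, and whether this form runs faster on this endpoint has not been checked.

```python
def values_clause(var, qids):
    """Render a SPARQL VALUES clause binding ?var to the given Wikidata Q-ids."""
    iris = " ".join("wd:" + qid for qid in qids)
    return "VALUES ?" + var + " { " + iris + " }"


# 'university teacher' (Q1622272) or 'lecturer' (Q1569495), as in the cell above
teaching_jobs = values_clause("job2", ["Q1622272", "Q1569495"])

queryString = """
SELECT DISTINCT ?person ?personname
WHERE {
   %s
   ?person wdt:P106 wd:Q82594 ;    # computer scientist
           wdt:P106 ?job2 ;        # one of the teaching occupations
           sc:name ?personname ;
           wdt:P108 wd:Q35794 .    # employer: University of Cambridge
}
LIMIT 10
""" % teaching_jobs

print(queryString)
```

The resulting string can be passed to `run_query` in the same way as the other queries.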


In [29]:
# aim: get all the persons who worked as computer scientists with the property 'employer' related to 'University of Cambridge'
# result: we found persons who were computer scientists at the University of Cambridge but this might mean that they were only researchers so we choose the
#         previous query to fulfill the objective of this task
queryString = """
SELECT DISTINCT ?person ?personname
WHERE {
   # bind something
   ?person wdt:P106 ?job1 ;
           sc:name ?personname ;
           wdt:P108 wd:Q35794 .
           
           
   #the person must be a computer scientist
   FILTER (?job1 = wd:Q82594) .
}
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('person', 'http://www.wikidata.org/entity/Q92944'), ('personname', 'David Wheeler')]
[('person', 'http://www.wikidata.org/entity/Q7251'), ('personname', 'Alan Turing')]
[('person', 'http://www.wikidata.org/entity/Q365578'), ('personname', 'Nathan Myhrvold')]
[('person', 'http://www.wikidata.org/entity/Q446862'), ('personname', 'David J. C. MacKay')]
[('person', 'http://www.wikidata.org/entity/Q92431724'), ('personname', 'Andrew Pitts')]
[('person', 'http://www.wikidata.org/entity/Q46633'), ('personname', 'Charles Babbage')]
[('person', 'http://www.wikidata.org/entity/Q7259'), ('personname', 'Ada Lovelace')]
[('person', 'http://www.wikidata.org/entity/Q62857'), ('personname', 'Maurice Wilkes')]
[('person', 'http://www.wikidata.org/entity/Q7176624'), ('personname', 'Peter Robinson')]
[('person', 'http://www.wikidata.org/entity/Q451770'), ('personname', 'Douglas Hartree')]
[('person', 'http://www.wikidata.org/entity/Q6135125'), ('personname', 'James H. Davenport')]
[('person', '

Final query for this task

In [7]:
# aim: get all the persons who worked both as university professors and computer scientists with the property 'employer' related to 'University of Cambridge'
# result: we found persons who were professors and computer scientists at the University of Cambridge
queryString = """
SELECT DISTINCT ?person ?personname
WHERE {
   # bind something
   ?person wdt:P106 ?job1 ;
           wdt:P106 ?job2 ;
           sc:name ?personname ;
           wdt:P108 wd:Q35794 .
           
           
   #the person must be a 'computer scientist' and a 'university teacher' or 'lecturer'
   FILTER ((?job1 = wd:Q82594) && (?job2 IN(wd:Q1622272,wd:Q1569495 ))) .
}
LIMIT 10
"""

print("Results")
x=run_query(queryString)

Results
[('person', 'http://www.wikidata.org/entity/Q46633'), ('personname', 'Charles Babbage')]
[('person', 'http://www.wikidata.org/entity/Q62857'), ('personname', 'Maurice Wilkes')]
[('person', 'http://www.wikidata.org/entity/Q451770'), ('personname', 'Douglas Hartree')]
[('person', 'http://www.wikidata.org/entity/Q4707402'), ('personname', 'Alan Mycroft')]
[('person', 'http://www.wikidata.org/entity/Q10348961'), ('personname', 'Peter Russell')]
[('person', 'http://www.wikidata.org/entity/Q63126308'), ('personname', 'Peter Földiák')]
[('person', 'http://www.wikidata.org/entity/Q216811'), ('personname', 'John Venn')]
[('person', 'http://www.wikidata.org/entity/Q100586732'), ('personname', 'Ferenc Huszár')]
[('person', 'http://www.wikidata.org/entity/Q316330'), ('personname', 'Daniel Bell')]
9


## Task 7
Find all the computer scientists who wrote at least 5 books (please consider only the instances of book, exclude "literary work" or other type of work related to books). Return the IRI and name of the computer scientist and the number of books.

In [32]:
# aim: since we don't have particular properties that might refer to books, we try to look for some possible class containing the word 'book'
# result: we found something that might be related to books
queryString = """
SELECT DISTINCT ?book ?bookname
WHERE {
   # bind something
   ?book sc:name ?bookname .
           
           
   FILTER REGEX(?bookname, "book", "i") .
}
LIMIT 10
"""
print("Results")
x=run_query(queryString)

Results
[('book', 'http://www.wikidata.org/entity/Q8130203'), ('bookname', 'Category:1920s books')]
[('book', 'http://www.wikidata.org/entity/Q8212873'), ('bookname', 'Category:810s books')]
[('book', 'http://www.wikidata.org/entity/Q8213977'), ('bookname', 'Category:970s books')]
[('book', 'http://www.wikidata.org/entity/Q8227293'), ('bookname', 'Category:Aikido books')]
[('book', 'http://www.wikidata.org/entity/Q8352313'), ('bookname', 'Category:Cengage books')]
[('book', 'http://www.wikidata.org/entity/Q8361917'), ('bookname', 'Category:Danish booksellers')]
[('book', 'http://www.wikidata.org/entity/Q8362395'), ('bookname', "Category:Chinese children's books")]
[('book', 'http://www.wikidata.org/entity/Q8366938'), ('bookname', 'Category:Christian theology books')]
[('book', 'http://www.wikidata.org/entity/Q8376756'), ('bookname', 'Category:Dinosaur books')]
[('book', 'http://www.wikidata.org/entity/Q8385961'), ('bookname', 'Category:Doctor Who book covers')]
10


In [34]:
# aim: get the properties of 'Category:Dinosaur books'
# result: we found only the 'instance of' property: its value might be something related to books
queryString = """
SELECT DISTINCT ?p ?pname
WHERE {
   # bind something
   wd:Q8376756 ?p ?obj .
   ?p sc:name ?pname .
}
LIMIT 100
"""
print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('pname', 'instance of')]
1


In [36]:
# aim: check the property 'instance of' of 'Category:Dinosaur books'
# result: we found it's a 'Wikimedia category'
queryString = """
SELECT DISTINCT ?obj ?oname
WHERE {
   # bind something
   wd:Q8376756 wdt:P31 ?obj .
   ?obj sc:name ?oname .
}
LIMIT 100
"""
print("Results")
x=run_query(queryString)

Results
[('obj', 'http://www.wikidata.org/entity/Q4167836'), ('oname', 'Wikimedia category')]
1


In [46]:
# aim: check if 'Category:Dinosaur books' might be related to books
# result: no results. We need to choose another way to proceed
queryString = """
SELECT DISTINCT ?s ?sname
WHERE {
   # bind something
   ?s ?p wd:Q8376756 .
   ?s sc:name ?sname .
}
LIMIT 100
"""
print("Results")
x=run_query(queryString)

Results
Empty


In [38]:
# aim: since we don't have particular properties that might refer to books, we try to look for some possible class containing the word 'book' but not containing the word 'Category:'
# result: we found something that might be related to books, but at first glance these are not actually books
queryString = """
SELECT DISTINCT ?book ?bookname
WHERE {
   # bind something
   ?book sc:name ?bookname .
           
           
   FILTER REGEX(?bookname, "book", "i") .
   FILTER (!REGEX(?bookname, "Category:")) .
}
LIMIT 10
"""
print("Results")
x=run_query(queryString)

Results
[('book', 'http://www.wikidata.org/entity/Q2594405'), ('bookname', 'Modern Age of Comic Books')]
[('book', 'http://www.wikidata.org/entity/Q3430019'), ('bookname', 'Redbook')]
[('book', 'http://www.wikidata.org/entity/Q8614441'), ('bookname', 'Help:Books/License')]
[('book', 'http://www.wikidata.org/entity/Q13865328'), ('bookname', 'Murder Book')]
[('book', 'http://www.wikidata.org/entity/Q15753688'), ('bookname', 'Yearbook of European Law')]
[('book', 'http://www.wikidata.org/entity/Q15759152'), ('bookname', 'Yearbook of Oriental Studies')]
[('book', 'http://www.wikidata.org/entity/Q15759228'), ('bookname', 'Canadian Yearbook of International Law')]
[('book', 'http://www.wikidata.org/entity/Q15759994'), ('bookname', 'Biblionews and Australian Notes and Queries: journal for book collectors')]
[('book', 'http://www.wikidata.org/entity/Q15761830'), ('bookname', 'Vienna Yearbook of Population Research')]
[('book', 'http://www.wikidata.org/entity/Q10876287'), ('bookname', 'Chung Hw

In [39]:
# aim: look for a class containing the word 'book' that may contain possible instances of books
# result: we found a sketchbook, it might be related to a more general class 'book'
queryString = """
SELECT DISTINCT ?book ?bookname
WHERE {
   # bind something
   ?book sc:name ?bookname .
   ?obj wdt:P31 ?book .
           
           
   FILTER REGEX(?bookname, "book", "i") .
}
LIMIT 1
"""
print("Results")
x=run_query(queryString)

Results
[('book', 'http://www.wikidata.org/entity/Q60520'), ('bookname', 'sketchbook')]
1


In [40]:
# aim: check the properties of 'sketchbook'
# result: the 'sketchbook' properties relate it to sketches and paintings rather than to books
queryString = """
SELECT DISTINCT ?p ?pname ?o ?oname
WHERE {
   # bind something
   wd:Q60520 ?p ?o .
   ?p sc:name ?pname .
   ?o sc:name ?oname .
}
LIMIT 1
"""
print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('pname', 'has part'), ('o', 'http://www.wikidata.org/entity/Q5078274'), ('oname', 'sketch')]
1


In [41]:
# aim: look for a class containing the word 'book' that may contain possible subclasses (there could be classes like children book and so on...)
# result: we found workbook but a workbook is an electronic sheet to perform calculations so it can't be related to books
queryString = """
SELECT DISTINCT ?book ?bookname
WHERE {
   # bind something
   ?book sc:name ?bookname .
   ?obj wdt:P279 ?book .
           
           
   FILTER REGEX(?bookname, "book", "i") .
}
LIMIT 1
"""
print("Results")
x=run_query(queryString)

Results
[('book', 'http://www.wikidata.org/entity/Q26853769'), ('bookname', 'workbook')]
1


In [7]:
# aim: look for a class containing the word 'book' whose instances have a property whose name contains the word 'author'
# result: we found 'book series': its instances are linked to something through an 'author'-like property
queryString = """
SELECT DISTINCT ?book ?bookname
WHERE {
   # bind something
   ?book sc:name ?bookname .
   ?obj wdt:P31 ?book .
   ?obj ?pauthor ?author .
   ?pauthor sc:name ?pauthorname
           
   FILTER REGEX(?bookname, "book", "i") .
   FILTER REGEX(?pauthorname, "author", "i") .
}
LIMIT 1
"""
print("Results")
x=run_query(queryString)

Results
[('book', 'http://www.wikidata.org/entity/Q277759'), ('bookname', 'book series')]
1


In [6]:
# aim: look for a resource containing the word 'author'
# result: we didn't find an 'author' property...
queryString = """
SELECT DISTINCT ?p ?pname
WHERE {
   # bind something
   ?p sc:name ?pname .
           
           
   FILTER REGEX(?pname, "author", "i") .
}
LIMIT 20
"""
print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/entity/Q788451'), ('pname', 'author citation')]
[('p', 'http://www.wikidata.org/entity/Q8354433'), ('pname', 'Category:Certified albums by certifying authority')]
[('p', 'http://www.wikidata.org/entity/Q8371763'), ('pname', 'Category:Delaware River and Bay Authority facilities')]
[('p', 'http://www.wikidata.org/entity/Q8397920'), ('pname', 'Category:Communications authorities')]
[('p', 'http://www.wikidata.org/entity/Q8458417'), ('pname', 'Category:Financial regulatory authorities of the United States')]
[('p', 'http://www.wikidata.org/entity/Q15892203'), ('pname', 'Oxfordian theory of Shakespeare authorship')]
[('p', 'http://www.wikidata.org/entity/Q8616115'), ('pname', 'Help:Wikipedia: The Missing Manual/Author and Publisher Information')]
[('p', 'http://www.wikidata.org/entity/Q8675188'), ('pname', 'Category:New York State Thruway Authority')]
[('p', 'http://www.wikidata.org/entity/Q8703893'), ('pname', 'Category:Santa Clara Valley Transportat

In [9]:
# aim: look, with a higher LIMIT, for resources whose name contains the word 'book' but not 'Category:' and that have the property 'subclass of', which might link them
#      to a 'book' class
# result: we found an interesting resource called 'banned book' that might be a subclass of a possible book class
queryString = """
SELECT DISTINCT ?book ?bookname
WHERE {
   # bind something
   ?book sc:name ?bookname .
   ?book wdt:P279 ?subclass .
           
           
   FILTER REGEX(?bookname, "book", "i") .
   FILTER (!REGEX(?bookname, "Category:")) .
}
LIMIT 50
"""
print("Results")
x=run_query(queryString)

Results
[('book', 'http://www.wikidata.org/entity/Q17048770'), ('bookname', 'Splatbook')]
[('book', 'http://www.wikidata.org/entity/Q65770793'), ('bookname', 'banned book')]
[('book', 'http://www.wikidata.org/entity/Q60520'), ('bookname', 'sketchbook')]
[('book', 'http://www.wikidata.org/entity/Q22976748'), ('bookname', 'stamp book')]
[('book', 'http://www.wikidata.org/entity/Q26853769'), ('bookname', 'workbook')]
[('book', 'http://www.wikidata.org/entity/Q21818614'), ('bookname', 'liturgical book of the Roman Rite')]
[('book', 'http://www.wikidata.org/entity/Q2477770'), ('bookname', 'Ultrabook')]
[('book', 'http://www.wikidata.org/entity/Q488192'), ('bookname', 'MacBook Air')]
[('book', 'http://www.wikidata.org/entity/Q22759'), ('bookname', 'book censorship')]
[('book', 'http://www.wikidata.org/entity/Q105810973'), ('bookname', 'award for best book (by genre)')]
[('book', 'http://www.wikidata.org/entity/Q2538131'), ('bookname', 'OER textbook')]
[('book', 'http://www.wikidata.org/entit
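
The substring match on "book" also returns labels such as 'MacBook Air' and 'Ultrabook'. As a rough illustration, the sketch below applies a stricter pattern to the labels copied from the output above, using Python's `re` module locally. The pattern `(^| )books?( |$)` is an assumption made for this example and has not been run against the endpoint.

```python
import re

# Labels copied from the output above
labels = [
    "Splatbook", "banned book", "sketchbook", "stamp book", "workbook",
    "liturgical book of the Roman Rite", "Ultrabook", "MacBook Air",
    "book censorship", "award for best book (by genre)", "OER textbook",
]

# "book" or "books" delimited by spaces or by the start/end of the label
whole_word = re.compile(r"(^| )books?( |$)", re.IGNORECASE)

kept = [label for label in labels if whole_word.search(label)]
for label in kept:
    print(label)
```

The stricter pattern drops product names and compounds, but it also drops 'OER textbook', so it narrows the candidates rather than replacing the manual inspection.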

In [10]:
# aim: get the properties of 'banned book'
# result: we found the 'subclass of' property! Let's see if it is a subclass of a book class
queryString = """
SELECT DISTINCT ?p ?pn
WHERE {
   # bind something
   wd:Q65770793 ?p ?o .
   ?p sc:name ?pn .
}
LIMIT 100
"""
print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P2347'), ('pn', 'YSO ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2354'), ('pn', 'has list')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('pn', 'subclass of')]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('pn', 'part of')]
[('p', 'http://www.wikidata.org/prop/direct/P366'), ('pn', 'use')]
[('p', 'http://www.wikidata.org/prop/direct/P5398'), ('pn', 'TDKIV term ID')]
[('p', 'http://www.wikidata.org/prop/direct/P910'), ('pn', "topic's main category")]
7


In [11]:
# aim: check the super class of 'banned book'
# result: we found the 'book' class! Let's check its properties
queryString = """
SELECT DISTINCT ?o ?on
WHERE {
   # bind something
   wd:Q65770793 wdt:P279 ?o .
   ?o sc:name ?on .
}
LIMIT 100
"""
print("Results")
x=run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q571'), ('on', 'book')]
[('o', 'http://www.wikidata.org/entity/Q66475652'), ('on', 'banned work')]
2


In [12]:
# aim: get the properties of the 'book' class
# result: we found the 'author' property! Now we can try to solve the task
queryString = """
SELECT DISTINCT ?p ?pn ?o ?on
WHERE {
   # bind something
   wd:Q571 ?p ?o .
   ?o sc:name ?on .
   ?p sc:name ?pn .
}
LIMIT 100
"""
print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1151'), ('pn', "topic's main Wikimedia portal"), ('o', 'http://www.wikidata.org/entity/Q10880143'), ('on', 'Portal:Books')]
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('pn', 'described by source'), ('o', 'http://www.wikidata.org/entity/Q106727050'), ('on', 'Metropolitan Museum of Art Tagging Vocabulary')]
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('pn', 'described by source'), ('o', 'http://www.wikidata.org/entity/Q19180675'), ('on', 'Small Brockhaus and Efron Encyclopedic Dictionary')]
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('pn', 'described by source'), ('o', 'http://www.wikidata.org/entity/Q602358'), ('on', 'Brockhaus and Efron Encyclopedic Dictionary')]
[('p', 'http://www.wikidata.org/prop/direct/P1424'), ('pn', "topic's main template"), ('o', 'http://www.wikidata.org/entity/Q22835068'), ('on', 'Template:Infobox book 2')]
[('p', 'http://www.wikidata.org/prop/direct/P1424'), ('pn', "topic's main 

In [10]:
# aim: get the computer scientists who are authors of something
# result: we retrieved some computer scientists. Now we have to make sure that what they have written is a book
queryString = """
SELECT DISTINCT ?personname ?book ?bookname
WHERE {
   # bind something
   ?person wdt:P106 ?job ;
           sc:name ?personname .

   ?book wdt:P50 ?person ;
         sc:name ?bookname .
   FILTER (?job = wd:Q82594)
}
LIMIT 5
"""

print("Results")
x=run_query(queryString)

Results
[('personname', 'C Lee Giles'), ('book', 'http://www.wikidata.org/entity/Q21012586'), ('bookname', 'Persistence of Web references in scientific research')]
[('personname', 'Steve Lawrence'), ('book', 'http://www.wikidata.org/entity/Q21012586'), ('bookname', 'Persistence of Web references in scientific research')]
[('personname', 'C Lee Giles'), ('book', 'http://www.wikidata.org/entity/Q28140859'), ('bookname', 'Accessibility of information on the web')]
[('personname', 'Steve Lawrence'), ('book', 'http://www.wikidata.org/entity/Q28140859'), ('bookname', 'Accessibility of information on the web')]
[('personname', 'C Lee Giles'), ('book', 'http://www.wikidata.org/entity/Q61011726'), ('bookname', 'Digital libraries and autonomous citation indexing')]
5


In [12]:
# aim: get the types of the works that computer scientists have authored
# result: we retrieved many types (e.g. 'review article', 'doctoral thesis', 'conference paper'). Now we have to keep only the works that are books
queryString = """
SELECT DISTINCT ?instanceof ?instanceofname 
WHERE {
   # bind something
   ?person wdt:P106 ?job ;
           sc:name ?personname .

   ?book wdt:P50 ?person ;
         sc:name ?bookname ;
         wdt:P31 ?instanceof.
    ?instanceof sc:name ?instanceofname
   FILTER (?job = wd:Q82594)
}
LIMIT 500
"""

print("Results")
x=run_query(queryString)

Results
[('instanceof', 'http://www.wikidata.org/entity/Q815382'), ('instanceofname', 'meta-analysis')]
[('instanceof', 'http://www.wikidata.org/entity/Q7318358'), ('instanceofname', 'review article')]
[('instanceof', 'http://www.wikidata.org/entity/Q23011722'), ('instanceofname', 'TED talk')]
[('instanceof', 'http://www.wikidata.org/entity/Q871232'), ('instanceofname', 'editorial')]
[('instanceof', 'http://www.wikidata.org/entity/Q1668024'), ('instanceofname', 'service on internet')]
[('instanceof', 'http://www.wikidata.org/entity/Q188860'), ('instanceofname', 'software library')]
[('instanceof', 'http://www.wikidata.org/entity/Q189210'), ('instanceofname', 'web application')]
[('instanceof', 'http://www.wikidata.org/entity/Q187685'), ('instanceofname', 'doctoral thesis')]
[('instanceof', 'http://www.wikidata.org/entity/Q1348305'), ('instanceofname', 'erratum')]
[('instanceof', 'http://www.wikidata.org/entity/Q23927052'), ('instanceofname', 'conference paper')]
[('instanceof', 'http:/

In [3]:
# aim: get the computer scientists who are authors of at least one instance of 'book'
# result: we retrieved the computer scientists who have written at least one book
queryString = """
SELECT DISTINCT ?personname ?book ?bookname
WHERE {
   # bind something
   ?person wdt:P106 ?job ;
           sc:name ?personname .

   ?book wdt:P50 ?person ;
         sc:name ?bookname ;
         wdt:P31 wd:Q571 .
   FILTER (?job = wd:Q82594 )
}
LIMIT 5
"""

print("Results")
x=run_query(queryString)

Results
[('personname', 'Eric S. Raymond'), ('book', 'http://www.wikidata.org/entity/Q467363'), ('bookname', 'The Cathedral and the Bazaar')]
[('personname', 'Peter Dayan'), ('book', 'http://www.wikidata.org/entity/Q98835934'), ('bookname', 'Theoretical neuroscience: computational and mathematical modeling of neural systems')]
[('personname', 'Cal Newport'), ('book', 'http://www.wikidata.org/entity/Q96653616'), ('bookname', "So Good They Can't Ignore You")]
[('personname', 'Edward Tufte'), ('book', 'http://www.wikidata.org/entity/Q56741142'), ('bookname', 'The Visual Display of Quantitative Information')]
[('personname', 'Yannis Tzitzikas'), ('book', 'http://www.wikidata.org/entity/Q105881996'), ('bookname', "Cinderella's Stick A Fairy Tale for Digital Preservation")]
5


In [4]:
# aim: get the computer scientists who are authors of at least one instance of 'book', with the number of books written
# result: we retrieved the computer scientists who have written at least one book, together with the number of books each of them has written
queryString = """
SELECT DISTINCT ?person ?personname (COUNT(?book) AS ?numberOfBooks )
WHERE {
    {
       SELECT DISTINCT ?person ?personname ?book WHERE{
       
           # bind something
           ?person wdt:P106 ?job ;
                   sc:name ?personname .

           ?book wdt:P50 ?person ;
                 wdt:P31 wd:Q571 .
           FILTER (?job = wd:Q82594)
       }
    }
}GROUP BY ?person ?personname
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('person', 'http://www.wikidata.org/entity/Q2524992'), ('personname', 'Thomas H. Cormen'), ('numberOfBooks', '1')]
[('person', 'http://www.wikidata.org/entity/Q104439764'), ('personname', 'Ole Tange'), ('numberOfBooks', '1')]
[('person', 'http://www.wikidata.org/entity/Q1128109'), ('personname', 'Constanze Kurz'), ('numberOfBooks', '1')]
[('person', 'http://www.wikidata.org/entity/Q3109900'), ('personname', 'Ralph Kimball'), ('numberOfBooks', '1')]
[('person', 'http://www.wikidata.org/entity/Q77961153'), ('personname', 'Gayle Laakmann McDowell'), ('numberOfBooks', '1')]
[('person', 'http://www.wikidata.org/entity/Q31040791'), ('personname', 'Michael Machtey'), ('numberOfBooks', '1')]
[('person', 'http://www.wikidata.org/entity/Q93380549'), ('personname', 'Richard Szeliski'), ('numberOfBooks', '1')]
[('person', 'http://www.wikidata.org/entity/Q719413'), ('personname', 'David Heinemeier Hansson'), ('numberOfBooks', '1')]
[('person', 'http://www.wikidata.org/entity/Q3098629'), ('

In [5]:
# aim: get the computer scientists who are authors of at least 5 books
# result: no computer scientist has written at least 5 instances of 'book'
queryString = """
SELECT DISTINCT ?person ?personname (COUNT(?book) AS ?numberOfBooks )
WHERE {
    {
       SELECT DISTINCT ?person ?personname ?book WHERE{
       
           # bind something
           ?person wdt:P106 ?job ;
                   sc:name ?personname .

           ?book wdt:P50 ?person ;
                 wdt:P31 wd:Q571 .
           FILTER (?job = wd:Q82594)
       }
    }
}GROUP BY ?person ?personname
HAVING (COUNT(?book)>=5)
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
Empty


In [21]:
# since the result set is empty, we investigate further and find other classes whose instances could be considered 'books'
# aim: get the computer scientists who are authors of at least one instance of 'book', 'volume', 'scientific book', 'non-fiction book', 'textbook' or 'book series'
# result: we retrieved the computer scientists who have written at least one work of these types
queryString = """
SELECT DISTINCT ?personname ?book ?bookname
WHERE {
   # bind something
   ?person wdt:P106 ?job ;
           sc:name ?personname .

   ?book wdt:P50 ?person ;
         sc:name ?bookname ;
         wdt:P31 ?booktype .
   FILTER (?job = wd:Q82594 && ?booktype IN (wd:Q571,wd:Q1238720,wd:Q7433672,wd:Q20540385,wd:Q83790,wd:Q277759))
}
LIMIT 5
"""

print("Results")
x=run_query(queryString)

Results
[('personname', 'Alicja Corona Klenczon'), ('book', 'http://www.wikidata.org/entity/Q87123937'), ('bookname', 'Krzysztof Klenczon. Historia jednej znajomości')]
[('personname', 'Ross J. Anderson'), ('book', 'http://www.wikidata.org/entity/Q47482099'), ('bookname', 'Security Engineering: A Guide to Building Dependable Distributed Systems')]
[('personname', 'Stephen Emmott'), ('book', 'http://www.wikidata.org/entity/Q97127584'), ('bookname', 'Ten billions')]
[('personname', 'Donald Knuth'), ('book', 'http://www.wikidata.org/entity/Q5157615'), ('bookname', 'Computers and Typesetting')]
[('personname', 'Eric S. Raymond'), ('book', 'http://www.wikidata.org/entity/Q467363'), ('bookname', 'The Cathedral and the Bazaar')]
5


In [24]:
# aim: get the computer scientists who are authors of at least one instance of the book-related classes above, with the number of books written
# result: we retrieved the computer scientists who have written at least one such book, together with the number of books each of them has written
queryString = """
SELECT DISTINCT ?person ?personname (COUNT(?book) AS ?numberOfBooks )
WHERE {
    {
       SELECT DISTINCT ?person ?personname ?book WHERE{
       
           # bind something
           ?person wdt:P106 ?job ;
                   sc:name ?personname .

           ?book wdt:P50 ?person ;
                 wdt:P31 ?booktype .
           FILTER (?job = wd:Q82594 && ?booktype IN (wd:Q571,wd:Q1238720,wd:Q7433672,wd:Q20540385,wd:Q83790,wd:Q277759))
       }
    }
}GROUP BY ?person ?personname
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('person', 'http://www.wikidata.org/entity/Q17457'), ('personname', 'Donald Knuth'), ('numberOfBooks', '5')]
[('person', 'http://www.wikidata.org/entity/Q2524992'), ('personname', 'Thomas H. Cormen'), ('numberOfBooks', '1')]
[('person', 'http://www.wikidata.org/entity/Q104439764'), ('personname', 'Ole Tange'), ('numberOfBooks', '1')]
[('person', 'http://www.wikidata.org/entity/Q1128109'), ('personname', 'Constanze Kurz'), ('numberOfBooks', '1')]
[('person', 'http://www.wikidata.org/entity/Q3109900'), ('personname', 'Ralph Kimball'), ('numberOfBooks', '1')]
[('person', 'http://www.wikidata.org/entity/Q6932746'), ('personname', 'Muhammad Suhail Zubairy'), ('numberOfBooks', '1')]
[('person', 'http://www.wikidata.org/entity/Q105467898'), ('personname', 'Aitzol Astigarraga'), ('numberOfBooks', '2')]
[('person', 'http://www.wikidata.org/entity/Q57654605'), ('personname', 'Agustin Arrieta Urtizberea'), ('numberOfBooks', '1')]
[('person', 'http://www.wikidata.org/entity/Q77961153'), (

Final query for this task

In [26]:
# aim: get the computer scientists who are authors of at least 5 books (instances of the book-related classes above)
# result: we found 4 computer scientists who have written at least 5 books
queryString = """
SELECT DISTINCT ?person ?personname (COUNT(?book) AS ?numberOfBooks )
WHERE {
    {
       SELECT DISTINCT ?person ?personname ?book WHERE{
       
           # bind something
           ?person wdt:P106 ?job ;
                   sc:name ?personname .

           ?book wdt:P50 ?person ;
                 wdt:P31 ?booktype .
           FILTER (?job = wd:Q82594 && ?booktype IN (wd:Q571,wd:Q1238720,wd:Q7433672,wd:Q20540385,wd:Q83790,wd:Q277759))
       }
    }
}GROUP BY ?person ?personname
HAVING (COUNT(?book)>=5)
LIMIT 100
"""

print("Results")
x=run_query(queryString)

Results
[('person', 'http://www.wikidata.org/entity/Q17457'), ('personname', 'Donald Knuth'), ('numberOfBooks', '5')]
[('person', 'http://www.wikidata.org/entity/Q16517758'), ('personname', 'Iñaki Alegria'), ('numberOfBooks', '11')]
[('person', 'http://www.wikidata.org/entity/Q20492519'), ('personname', 'Olatz Arbelaitz'), ('numberOfBooks', '12')]
[('person', 'http://www.wikidata.org/entity/Q50177430'), ('personname', 'Kepa Sarasola Gabiola'), ('numberOfBooks', '7')]
4
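
The inner `SELECT DISTINCT ?person ?personname ?book` matters because a book that is an instance of two of the accepted classes matches the pattern twice, once per `?booktype`. The sketch below reproduces the effect in plain Python on hypothetical rows (the identifiers `A`, `B`, `B1`, ... are invented for illustration and do not come from the endpoint).

```python
from collections import defaultdict

# Hypothetical rows (person, book, booktype): book B1 matches two of the
# accepted classes, so the join produces it twice for author A.
rows = [
    ("A", "B1", "Q571"),
    ("A", "B1", "Q83790"),
    ("A", "B2", "Q571"),
    ("B", "B3", "Q571"),
]

naive = defaultdict(int)      # COUNT(?book) over the raw join
distinct = defaultdict(set)   # count over DISTINCT (?person, ?book) pairs
for person, book, _booktype in rows:
    naive[person] += 1
    distinct[person].add(book)

min_books = 2  # plays the role of the HAVING threshold (5 in the task)
result = {p: len(books) for p, books in distinct.items() if len(books) >= min_books}

print(dict(naive))   # raw counts, inflated for author A
print(result)        # authors reaching the threshold on distinct books
```

In SPARQL the same deduplication could presumably also be obtained with `COUNT(DISTINCT ?book)` in a single query; the notebook keeps the subquery form.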


In [2]:
#
#
# OBSERVATION: the first queries used to find the class 'book' can be expensive even though the LIMIT has a low value. Therefore we show here an alternative way
# that could have been used to find the class 'book'.
# aim: in the previous task we found the persons who are both computer scientists and university teachers. Some of them must have written something, so some kind of
#      'author' property should point to them. Therefore we list the distinct properties pointing to them whose name contains 'author'.

# result: we found the 'author' property. Now we just have to check whether it connects them to books.
queryString = """
SELECT DISTINCT ?p ?pname
WHERE {
   # bind something
   ?person wdt:P106 wd:Q82594 ;
           wdt:P106 wd:Q1622272 .
   
   ?s ?p ?person.
   ?p sc:name ?pname .
           
   FILTER REGEX (?pname, "author","i").
}
LIMIT 50
"""

print("Results")
x=run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P50'), ('pname', 'author')]
[('p', 'http://www.wikidata.org/prop/direct/P2679'), ('pname', 'author of foreword')]
[('p', 'http://www.wikidata.org/prop/direct/P2680'), ('pname', 'author of afterword')]
3


In [3]:
# aim: get all the resources whose name contains the word 'book' and whose author is both a computer scientist and a university teacher
# result: we retrieved some results, but not all of them are books (e.g. book reviews), so we need to keep only the ones that are 'instance of' a 'book' class

queryString = """
SELECT DISTINCT ?s ?sname
WHERE {
   # bind something
   ?person wdt:P106 wd:Q82594 ;
           wdt:P106 wd:Q1622272 .
   
   ?s wdt:P50 ?person.
   ?s sc:name ?sname .
           
   FILTER REGEX (?sname, "book","i").
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q56385312'), ('sname', 'Book Review: Extremal graph theory')]
[('s', 'http://www.wikidata.org/entity/Q56387458'), ('sname', 'Book Review: Winning ways for your mathematical plays')]
[('s', 'http://www.wikidata.org/entity/Q55970196'), ('sname', 'Book Review: Introduction to Logic and to the Methodology of Deductive Sciences')]
[('s', 'http://www.wikidata.org/entity/Q50842887'), ('sname', "The ABC Programmer's Handbook")]
[('s', 'http://www.wikidata.org/entity/Q57066571'), ('sname', 'Making friends and communicating on Facebook: Implications for the access to social capital')]
[('s', 'http://www.wikidata.org/entity/Q57066567'), ('sname', 'Corrigendum to “Making friends and communicating on Facebook: Implications for social capital” [Soc. Netw. (2014) 29–41]')]
[('s', 'http://www.wikidata.org/entity/Q57066614'), ('sname', 'Bookmaker consensus and agreement for the UEFA Champions League 2008/2009')]
[('s', 'http://www.wikidata.org/entity/Q1959

In [4]:
# aim: get the classes, with a name containing the word 'book', of the resources whose author is both a computer scientist and a university teacher
# result: we retrieved the 'book' class and other book-related classes in a more efficient way

queryString = """
SELECT DISTINCT ?artworkclass ?name
WHERE {
   # bind something
   ?person wdt:P106 wd:Q82594 ;
           wdt:P106 wd:Q1622272 .
   
   ?s wdt:P50 ?person;
       wdt:P31 ?artworkclass.
   ?artworkclass sc:name ?name .
           
   FILTER REGEX (?name, "book","i").
}
LIMIT 200
"""

print("Results")
x=run_query(queryString)

Results
[('artworkclass', 'http://www.wikidata.org/entity/Q571'), ('name', 'book')]
[('artworkclass', 'http://www.wikidata.org/entity/Q83790'), ('name', 'textbook')]
[('artworkclass', 'http://www.wikidata.org/entity/Q277759'), ('name', 'book series')]
[('artworkclass', 'http://www.wikidata.org/entity/Q7433672'), ('name', 'scientific book')]
[('artworkclass', 'http://www.wikidata.org/entity/Q20540385'), ('name', 'non-fiction book')]
[('artworkclass', 'http://www.wikidata.org/entity/Q57933693'), ('name', 'book edition')]
[('artworkclass', 'http://www.wikidata.org/entity/Q637866'), ('name', 'book review')]
7
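
The classes listed above can be turned into the `IN (...)` list used in the final query of this task. The sketch below works on rows in the format returned by `run_query`, copied from the output above. The `excluded` set is a manual choice made for this example (editions and reviews are not books themselves), and `qid` is a hypothetical helper.

```python
# Rows in the format returned by run_query, copied from the output above
rows = [
    [('artworkclass', 'http://www.wikidata.org/entity/Q571'), ('name', 'book')],
    [('artworkclass', 'http://www.wikidata.org/entity/Q83790'), ('name', 'textbook')],
    [('artworkclass', 'http://www.wikidata.org/entity/Q277759'), ('name', 'book series')],
    [('artworkclass', 'http://www.wikidata.org/entity/Q7433672'), ('name', 'scientific book')],
    [('artworkclass', 'http://www.wikidata.org/entity/Q20540385'), ('name', 'non-fiction book')],
    [('artworkclass', 'http://www.wikidata.org/entity/Q57933693'), ('name', 'book edition')],
    [('artworkclass', 'http://www.wikidata.org/entity/Q637866'), ('name', 'book review')],
]

# Classes judged not to be books themselves (manual choice, not derived from the data)
excluded = {"book edition", "book review"}


def qid(iri):
    """Return the trailing Q-id of a Wikidata entity IRI."""
    return iri.rsplit("/", 1)[-1]


classes = [dict(row) for row in rows]
in_list = ",".join(
    "wd:" + qid(c["artworkclass"]) for c in classes if c["name"] not in excluded
)
booktype_filter = "FILTER (?booktype IN (" + in_list + "))"
print(booktype_filter)
```

This list does not contain 'volume' (`wd:Q1238720`), which the final query also accepts, because its name does not contain the word 'book'.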
