# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain knowledge graph (KG): [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload starts with a usually vague, open-ended question and does not assume that the person issuing it has a clear understanding of the data in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP that returns the human-readable name of a property or a class in Wikidata.

In [1]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-5965f015f6-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
       results = sparql.query()
       json_results = results.convert()
       if len(json_results['results']['bindings'])==0:
          print("Empty")
          return 0
    
       for bindings in json_results['results']['bindings']:
          print( [ (var, value['value'])  for var, value in bindings.items() ] )

       return len(json_results['results']['bindings'])

    except Exception as e :
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
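The helpers above print each binding and return only a row count. When a later cell needs the values themselves, the SPARQL JSON response can be flattened into plain dictionaries. The following is a minimal sketch, not part of the original setup: `bindings_to_rows` is a hypothetical helper name, and the sample dictionary is hand-written to mimic the `results`/`bindings` layout that `run_query` already reads.

```python
def bindings_to_rows(json_results):
    """Flatten a SPARQL JSON result into a list of {variable: value} dicts."""
    bindings = json_results.get("results", {}).get("bindings", [])
    return [{var: cell["value"] for var, cell in row.items()} for row in bindings]


# Hand-written sample in the SPARQL JSON results layout (not fetched from the endpoint).
sample = {
    "head": {"vars": ["p", "pName"]},
    "results": {
        "bindings": [
            {
                "p": {"type": "uri", "value": "http://www.wikidata.org/prop/direct/P50"},
                "pName": {"type": "literal", "value": "author"},
            }
        ]
    },
}

rows = bindings_to_rows(sample)
print(rows)
```

Returning rows instead of printing them keeps the original helpers untouched while making the results reusable in ordinary Python code.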

# Book Workflow Series ("Author comparison explorative search") 

Consider the following exploratory scenario:


>  Investigate Italian and French book authors in terms of awards, books published and copyright types



## Useful URIs for the current workflow
The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P279`    | subclass      | predicate |
| `wdt:P106`    | occupation    | predicate | 
| `wdt:P17`     | country       | predicate | 
| `wdt:P27`     | citizenship   | predicate | 
| `wd:Q36180`   | writer        | node |
| `wd:Q38`      | Italy         | node |
| `wd:Q172579`  | Kingdom of Italy        | node |
| `wd:Q142`     | France        | node |
| `wd:Q37922`   | Nobel Prize literature        | node |
| `wd:Q213678`  | Vatican Library        | node |


Also consider that

```
?p wdt:P27 wd:Q142
```

is the BGP to retrieve all **French citizens**


The workload should


1. Identify the BGP for obtaining the Italian and French writers who published a book in the last 50 years

2. Compare the number of books written by Italian and French writers

3. Count how many books written by Italian authors are now released with a "public domain" copyright form

4. How many Nobel Prizes in Literature have been won by authors from Italy and from the Kingdom of Italy?

5. Are there books by Nobel Prize in Literature winners that are not present in the Vatican Library? If so, which author has the most books not in the Vatican Library?

#### Example of query

In [2]:
queryString = """
SELECT COUNT(?p) WHERE { 

    ?p wdt:P27 wd:Q142 .
} 
"""

print("Results")
run_query(queryString)

Results
[('callret-0', '273456')]


1
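The count comes back under the endpoint-generated variable name `callret-0` because the aggregate in the query has no alias. When a query yields exactly one cell, the value can be read without relying on that name. A small sketch, using the row printed above as input; `single_value` is a hypothetical helper, not part of the notebook:

```python
def single_value(rows):
    """Return the only value of a one-row, one-column result."""
    if len(rows) != 1 or len(rows[0]) != 1:
        raise ValueError("expected exactly one row with one column")
    (_, value), = rows[0]  # unpack the single (variable, value) pair
    return value


# One row, in the (variable, value) form that run_query prints.
rows = [[("callret-0", "273456")]]
french_citizens = int(single_value(rows))
print(french_citizens)
```

The value is a string in the result, so it is converted with `int` before being used as a number.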

### My Workflow

#### ***Task 1 :*** Identify the BGP for obtaining the Italian and French writers who published a book in the last 50 years

First of all, I need to understand how books are represented in the graph.

I retrieve all the object properties that connect instances of ***writer (wd:Q36180)*** with other entities.

I hope to find something linking writers and books (e.g. something like "published" or "wrote").

In [3]:
queryString = """
SELECT DISTINCT ?p ?pName WHERE {

    # Search for properties of writers
    ?writer wdt:P106 wd:Q36180 ;
            ?p       ?o        .
    
    # This returns the labels
    ?p <http://schema.org/name> ?pName .
    
    # Exclude data properties
    FILTER(!isLiteral(?o))
}
LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P102'), ('pName', 'member of political party')]
[('p', 'http://www.wikidata.org/prop/direct/P1050'), ('pName', 'medical condition')]
[('p', 'http://www.wikidata.org/prop/direct/P106'), ('pName', 'occupation')]
[('p', 'http://www.wikidata.org/prop/direct/P1066'), ('pName', 'student of')]
[('p', 'http://www.wikidata.org/prop/direct/P108'), ('pName', 'employer')]
[('p', 'http://www.wikidata.org/prop/direct/P109'), ('pName', 'signature')]
[('p', 'http://www.wikidata.org/prop/direct/P119'), ('pName', 'place of burial')]
[('p', 'http://www.wikidata.org/prop/direct/P1196'), ('pName', 'manner of death')]
[('p', 'http://www.wikidata.org/prop/direct/P1303'), ('pName', 'instrument')]
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('pName', 'described by source')]
[('p', 'http://www.wikidata.org/prop/direct/P1344'), ('pName', 'participant in')]
[('p', 'http://www.wikidata.org/prop/direct/P136'), ('pName', 'genre')]
[('p', 'http://www.wikida

50

Since there are too many properties to scan by eye for something like "published" or "wrote", I try a different approach to find the URI for books.

I search for all the properties that connect an entity to the ***Vatican Library (wd:Q213678)***.

In [4]:
queryString = """
SELECT DISTINCT ?p ?pName WHERE {

    # Anything connected to Vatican Library
    ?s ?p wd:Q213678 .

    # This returns the labels
    ?p <http://schema.org/name> ?pName .

}
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P2389'), ('pName', 'organization directed by the office or position')]
[('p', 'http://www.wikidata.org/prop/direct/P9419'), ('pName', 'personal library at')]
[('p', 'http://www.wikidata.org/prop/direct/P108'), ('pName', 'employer')]
[('p', 'http://www.wikidata.org/prop/direct/P127'), ('pName', 'owned by')]
[('p', 'http://www.wikidata.org/prop/direct/P137'), ('pName', 'operator')]
[('p', 'http://www.wikidata.org/prop/direct/P1416'), ('pName', 'affiliation')]
[('p', 'http://www.wikidata.org/prop/direct/P189'), ('pName', 'location of discovery')]
[('p', 'http://www.wikidata.org/prop/direct/P195'), ('pName', 'collection')]
[('p', 'http://www.wikidata.org/prop/direct/P276'), ('pName', 'location')]
[('p', 'http://www.wikidata.org/prop/direct/P301'), ('pName', "category's main topic")]
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('pName', 'part of')]
[('p', 'http://www.wikidata.org/prop/direct/P485'), ('pName', 'archives at')]
[('p',

18

The property ***location (wdt:P276)*** looks like a good candidate for representing the library in which a book is held.

So, I want to discover what is linked to the ***Vatican Library (wd:Q213678)*** through the property ***location (wdt:P276)***.

In [5]:
queryString = """
SELECT DISTINCT ?s ?sName WHERE {

    # Anything connected to Vatican Library through property location
    ?s wdt:P276 wd:Q213678 .

    # This returns the labels
    ?s <http://schema.org/name> ?sName .  
}
"""

print("Results")
run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q3592714'), ('sName', 'Barberini Gospels')]
[('s', 'http://www.wikidata.org/entity/Q101438843'), ('sName', 'Rossiana Library')]
[('s', 'http://www.wikidata.org/entity/Q338714'), ('sName', 'Anjou Legendarium')]
[('s', 'http://www.wikidata.org/entity/Q1250195'), ('sName', 'Codex Assemanius')]
[('s', 'http://www.wikidata.org/entity/Q1106235'), ('sName', 'Codex Ríos')]
[('s', 'http://www.wikidata.org/entity/Q33231619'), ('sName', 'breviary of  Matthias Corvin')]
[('s', 'http://www.wikidata.org/entity/Q77463414'), ('sName', 'Apostolic Library Regina 1682')]
[('s', 'http://www.wikidata.org/entity/Q77463433'), ('sName', 'Apostolic Library Reg. 1323')]
[('s', 'http://www.wikidata.org/entity/Q55831099'), ('sName', 'Codex vaticanus latinus 3199')]
[('s', 'http://www.wikidata.org/entity/Q16912485'), ('sName', 'Divine Comedy Illustrated by Botticelli')]
[('s', 'http://www.wikidata.org/entity/Q16939517'), ('sName', 'Codex Parisino-petropolitanus')]
[('

24

I use the property ***instanceOf (wdt:P31)*** to discover the classes of the retrieved entities.

In [6]:
queryString = """
SELECT DISTINCT ?s ?sName ?class ?className WHERE {

    # Retrieve the class of everything that is connected to Vatican Library through property location
    ?s  wdt:P276 wd:Q213678 ;
        wdt:P31  ?class     .

    # this returns the labels
    ?s     <http://schema.org/name> ?sName     .
    ?class <http://schema.org/name> ?className .   
}
"""

print("Results")
run_query(queryString)

Results
[('s', 'http://www.wikidata.org/entity/Q1250195'), ('sName', 'Codex Assemanius'), ('class', 'http://www.wikidata.org/entity/Q1754581'), ('className', 'Evangeliary')]
[('s', 'http://www.wikidata.org/entity/Q101438843'), ('sName', 'Rossiana Library'), ('class', 'http://www.wikidata.org/entity/Q3052382'), ('className', 'fonds')]
[('s', 'http://www.wikidata.org/entity/Q744655'), ('sName', 'Chigi codex'), ('class', 'http://www.wikidata.org/entity/Q213924'), ('className', 'codex')]
[('s', 'http://www.wikidata.org/entity/Q55831099'), ('sName', 'Codex vaticanus latinus 3199'), ('class', 'http://www.wikidata.org/entity/Q213924'), ('className', 'codex')]
[('s', 'http://www.wikidata.org/entity/Q77463433'), ('sName', 'Apostolic Library Reg. 1323'), ('class', 'http://www.wikidata.org/entity/Q5292'), ('className', 'encyclopedia')]
[('s', 'http://www.wikidata.org/entity/Q77464627'), ('sName', 'Cod. Palat. lat. 1074'), ('class', 'http://www.wikidata.org/entity/Q5292'), ('className', 'encyclope

38

Since the classes seem to represent different types of books, I follow the property ***subclassOf (wdt:P279)*** starting from the first type of book (***Evangeliary (wd:Q1754581)***).

In [7]:
queryString = """
SELECT DISTINCT ?class ?className WHERE {

    # I use the property "subclassOf" on the Evangeliary class
    wd:Q1754581 wdt:P279 ?class .

    # This returns the labels
    ?class <http://schema.org/name> ?className . 
}
"""

print("Results")
run_query(queryString)

Results
[('class', 'http://www.wikidata.org/entity/Q3831847'), ('className', 'liturgical book')]


1

I try to use ***subclassOf (wdt:P279)*** one more time on the just retrieved class: ***liturgical book (wd:Q3831847)***.

In [8]:
queryString = """
SELECT DISTINCT ?class ?className WHERE {

    # I use the "subclassOf" property on the Liturgical Book class
    wd:Q3831847 wdt:P279 ?class .

    # this returns the labels
    ?class <http://schema.org/name> ?className .   
}
"""

print("Results")
run_query(queryString)

Results
[('class', 'http://www.wikidata.org/entity/Q12617225'), ('className', 'religious literature')]
[('class', 'http://www.wikidata.org/entity/Q571'), ('className', 'book')]
[('class', 'http://www.wikidata.org/entity/Q1338914'), ('className', 'handbook')]


3

I discovered the class ***book (wd:Q571)***. 

However, the BGP ***?book wdt:P31 wd:Q571*** is not sufficient to identify a book, because a book may be an instance of a subclass of ***book (wd:Q571)*** rather than of the class itself.

For example, a book can be an instance of ***liturgical book (wd:Q3831847)***, as in the previous query.

So, the correct BGP to retrieve book entities is:
```
?book wdt:P31/wdt:P279* wd:Q571.
```
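To see why the property path matters, the sketch below replays it on a toy graph built from three facts found in the earlier cells: Codex Assemanius is an instance of Evangeliary, Evangeliary is a subclass of liturgical book, and liturgical book is a subclass of book. The in-memory triples and the names `objects` and `is_instance_of` are illustrative assumptions; on the real graph the endpoint evaluates the path itself.

```python
P31, P279 = "wdt:P31", "wdt:P279"

# Toy triples taken from the results of the previous queries.
triples = {
    ("wd:Q1250195", P31, "wd:Q1754581"),   # Codex Assemanius -> Evangeliary
    ("wd:Q1754581", P279, "wd:Q3831847"),  # Evangeliary -> liturgical book
    ("wd:Q3831847", P279, "wd:Q571"),      # liturgical book -> book
}


def objects(subject, predicate):
    """All objects reachable from `subject` in one step over `predicate`."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def is_instance_of(entity, target):
    """Emulate `entity wdt:P31/wdt:P279* target` on the toy graph."""
    frontier = list(objects(entity, P31))  # exactly one P31 step
    seen = set()
    while frontier:
        cls = frontier.pop()
        if cls == target:
            return True
        if cls not in seen:
            seen.add(cls)
            frontier.extend(objects(cls, P279))  # zero or more P279 steps
    return False


direct = "wd:Q571" in objects("wd:Q1250195", P31)
via_path = is_instance_of("wd:Q1250195", "wd:Q571")
print(direct)    # the plain wdt:P31 pattern does not reach wd:Q571
print(via_path)  # the path pattern does
```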

I want to check how books and writers are connected.

In [9]:
queryString = """
SELECT DISTINCT ?p ?pName WHERE {
    
    # Retrieve instances of book
    ?book wdt:P31/wdt:P279* wd:Q571 .

    # Search for writers
    ?writer wdt:P106 wd:Q36180 .
    
    # Connection between writer and book
    ?book ?p ?writer.
    
    # This returns the labels
    ?p <http://schema.org/name> ?pName.
    
}
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P110'), ('pName', 'illustrator')]
[('p', 'http://www.wikidata.org/prop/direct/P50'), ('pName', 'author')]
[('p', 'http://www.wikidata.org/prop/direct/P736'), ('pName', 'cover art by')]
[('p', 'http://www.wikidata.org/prop/direct/P170'), ('pName', 'creator')]
[('p', 'http://www.wikidata.org/prop/direct/P175'), ('pName', 'performer')]
[('p', 'http://www.wikidata.org/prop/direct/P180'), ('pName', 'depicts')]
[('p', 'http://www.wikidata.org/prop/direct/P287'), ('pName', 'designed by')]
[('p', 'http://www.wikidata.org/prop/direct/P58'), ('pName', 'screenwriter')]
[('p', 'http://www.wikidata.org/prop/direct/P1877'), ('pName', 'after a work by')]
[('p', 'http://www.wikidata.org/prop/direct/P1891'), ('pName', 'signatory')]


10

I can probably use the property ***author (wdt:P50)***. I retrieve some example (book, author) pairs to check.

In [10]:
queryString = """
SELECT DISTINCT ?bookName ?writerName WHERE {
    
    # Retrieve instances of book and their author
    ?book wdt:P31/wdt:P279* wd:Q571 ;
          wdt:P50           ?writer .

    # Search for writers
    ?writer wdt:P106 wd:Q36180 .
    
    # This returns the labels
    ?writer <http://schema.org/name> ?writerName .
    ?book   <http://schema.org/name> ?bookName   .   
}
LIMIT 30
"""

print("Results")
run_query(queryString)

Results
[('bookName', 'Book Girl'), ('writerName', 'Mizuki Nomura')]
[('bookName', 'All Star Batman & Robin, the Boy Wonder'), ('writerName', 'Frank Miller')]
[('bookName', 'The Dark Knight Strikes Again'), ('writerName', 'Frank Miller')]
[('bookName', 'Batwoman: Elegy'), ('writerName', 'Greg Rucka')]
[('bookName', 'Kiss of the Spider Woman'), ('writerName', 'Manuel Puig')]
[('bookName', 'Giant Days'), ('writerName', 'John Allison')]
[('bookName', 'Hapless Child'), ('writerName', 'Edward Gorey')]
[('bookName', 'Ore ga Ojousama Gakkou ni "Shomin Sample" Toshite Rachirareta Ken'), ('writerName', 'Takafumi Nanatsuki')]
[('bookName', 'X-Treme X-Men'), ('writerName', 'Chris Claremont')]
[('bookName', 'Wolverine'), ('writerName', 'Chris Claremont')]
[('bookName', 'Sex'), ('writerName', 'Madonna')]
[('bookName', 'Etymologicum Magnum'), ('writerName', 'anonymous')]
[('bookName', 'Black Summer'), ('writerName', 'Warren Ellis')]
[('bookName', 'Asterix Conquers Rome'), ('writerName', 'René Goscin

30

Now I want to discover the data properties of an entity of ***book (wd:Q571)***.

In [11]:
queryString = """
SELECT DISTINCT ?p ?pname WHERE {

    # Retrieve instances of book
    ?book  (wdt:P31 / wdt:P279*) wd:Q571;
           ?p ?o.
            
    # This returns the labels
    ?p <http://schema.org/name> ?pname .
    
    # Only data properties
    FILTER(isLiteral(?o))
}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P478'), ('pname', 'volume')]
[('p', 'http://www.wikidata.org/prop/direct/P4480'), ('pname', 'INDUCKS publication ID')]
[('p', 'http://www.wikidata.org/prop/direct/P4484'), ('pname', 'INDUCKS issue ID')]
[('p', 'http://www.wikidata.org/prop/direct/P4110'), ('pname', 'Crunchyroll ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2093'), ('pname', 'author name string')]
[('p', 'http://www.wikidata.org/prop/direct/P481'), ('pname', 'Palissy ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1014'), ('pname', 'Art & Architecture Thesaurus ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1104'), ('pname', 'number of pages')]
[('p', 'http://www.wikidata.org/prop/direct/P1113'), ('pname', 'number of episodes')]
[('p', 'http://www.wikidata.org/prop/direct/P1235'), ('pname', 'ISFDB series ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1267'), ('pname', 'AlloCiné series ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1296'), ('pname', 

100

I have just discovered different data properties that seem to be useful such as:
* ***publication date (wdt:P577)***
* ***title (wdt:P1476)***
* ***edition number (wdt:P393)***
* ...

Now I can finally answer the first question: I search for Italian and French writers who published a book in the last 50 years.

In [12]:
queryString = """
SELECT DISTINCT ?bookName ?writerName ?countryName ?publishDate WHERE {
    
    # Retrieve instances of book, publication date and author
    ?book (wdt:P31 / wdt:P279*) wd:Q571      ;
          wdt:P577              ?publishDate ;
          wdt:P50               ?writer      .
          
    # Retrieve writers and their nationality
    ?writer wdt:P106 wd:Q36180 ;
            wdt:P27  ?country  .
    
   
    # This returns the labels
    ?book    <http://schema.org/name> ?bookName    .
    ?writer  <http://schema.org/name> ?writerName  .
    ?country <http://schema.org/name> ?countryName .
    
    # Only Italian and French writers
    FILTER (?country = wd:Q38 || ?country = wd:Q142)
    
    # Only book published in the last 50 years
    FILTER (?publishDate > "1971-01-01T00:00:00Z"^^xsd:dateTime)     
}
LIMIT 30
"""

print("Results")
run_query(queryString)

Results
[('bookName', 'Quand sort la recluse'), ('writerName', 'Fred Vargas'), ('countryName', 'France'), ('publishDate', '2017-05-10T00:00:00Z')]
[('bookName', 'Asterix Conquers Rome'), ('writerName', 'René Goscinny'), ('countryName', 'France'), ('publishDate', '1976-01-01T00:00:00Z')]
[('bookName', 'Wojtyla segreto'), ('writerName', 'Giacomo Galeazzi'), ('countryName', 'Italy'), ('publishDate', '2011-01-01T00:00:00Z')]
[('bookName', 'La France russe, enquête sur les réseaux Poutine'), ('writerName', 'Nicolas Hénin'), ('countryName', 'France'), ('publishDate', '2016-05-25T00:00:00Z')]
[('bookName', 'Reassembling the Social. An introduction to Actor-Network-Theory'), ('writerName', 'Bruno Latour'), ('countryName', 'France'), ('publishDate', '2005-01-01T00:00:00Z')]
[('bookName', 'Jihad Academy'), ('writerName', 'Nicolas Hénin'), ('countryName', 'France'), ('publishDate', '2015-11-05T00:00:00Z')]
[('bookName', 'Comprendre le terrorisme'), ('writerName', 'Nicolas Hénin'), ('countryName',

30

Same as before, but showing only info about the authors.

In [13]:
queryString = """
SELECT DISTINCT ?writerName ?countryName WHERE {
    
    # Retrieve instances of book, publication date and author
    ?book (wdt:P31 / wdt:P279*) wd:Q571      ;
          wdt:P577              ?publishDate ;
          wdt:P50               ?writer      .
          
    # Retrieve writers and their nationality
    ?writer wdt:P106 wd:Q36180 ;
            wdt:P27  ?country  .
    
   
    # This returns the labels
    ?book    <http://schema.org/name> ?bookName    .
    ?writer  <http://schema.org/name> ?writerName  .
    ?country <http://schema.org/name> ?countryName .
    
    # Only Italian and French writers
    FILTER (?country = wd:Q38 || ?country = wd:Q142)
    
    # Only book published in the last 50 years
    FILTER (?publishDate > "1971-01-01T00:00:00Z"^^xsd:dateTime)     
}
LIMIT 30
"""

print("Results")
run_query(queryString)

Results
[('writerName', 'Fred Vargas'), ('countryName', 'France')]
[('writerName', 'René Goscinny'), ('countryName', 'France')]
[('writerName', 'Giacomo Galeazzi'), ('countryName', 'Italy')]
[('writerName', 'Nicolas Hénin'), ('countryName', 'France')]
[('writerName', 'Bruno Latour'), ('countryName', 'France')]
[('writerName', 'Daniel Pennacchioni'), ('countryName', 'France')]
[('writerName', 'François Place'), ('countryName', 'France')]
[('writerName', 'Christian Laux'), ('countryName', 'France')]
[('writerName', 'Amin Maalouf'), ('countryName', 'France')]
[('writerName', 'Antonio Manzini'), ('countryName', 'Italy')]
[('writerName', 'Michel del Castillo'), ('countryName', 'France')]
[('writerName', 'Cyril Gély'), ('countryName', 'France')]
[('writerName', 'Ben Pastor'), ('countryName', 'Italy')]
[('writerName', 'Donato Carrisi'), ('countryName', 'Italy')]
[('writerName', 'Sebastiano Vassalli'), ('countryName', 'Italy')]
[('writerName', 'Dacia Maraini'), ('countryName', 'Italy')]
[('wri

30

#### END TASK 1:
The BGP for obtaining the Italian and French writers who published a book in the last 50 years is:
```
?book (wdt:P31 / wdt:P279*)  wd:Q571      ;
       wdt:P577              ?publishDate ;
       wdt:P50               ?writer      .
          
?writer wdt:P106 wd:Q36180 ;
        wdt:P27  ?country  .
    
FILTER (?country = wd:Q38 || ?country = wd:Q142)
    
FILTER (?publishDate > "1971-01-01T00:00:00Z"^^xsd:dateTime) 
```

In particular, I have to use two ***FILTER*** clauses, one for the nationality of the authors and one for the publication date.
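The date literal in the second filter is hard-coded. A small sketch of how it could be derived instead, assuming that "the last 50 years" means from 1 January of the year 50 years before the current one (which gives 1971 when run in 2021). `cutoff_filter` is a hypothetical helper, not part of the notebook, and the strict `>` comparison simply mirrors the query above.

```python
from datetime import date


def cutoff_filter(today, years=50, var="?publishDate"):
    """Build the publication-date FILTER for the last `years` years."""
    start = date(today.year - years, 1, 1)
    literal = f'"{start.isoformat()}T00:00:00Z"^^xsd:dateTime'
    return f"FILTER ({var} > {literal})"


# A fixed date is used here so the output is reproducible.
clause = cutoff_filter(date(2021, 11, 15))
print(clause)
```

Called with a date in 2021, this reproduces the literal used in the BGP above; passing `date.today()` would keep the window moving with the calendar.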

### ***Task 2 :*** Compare the number of books written by Italian and French writers.

I can use the property ***author (wdt:P50)*** discovered before.

In [14]:
queryString = """
SELECT ?numBookIT ?numBookFR WHERE{

    {   # Retrieve the number of books written by Italian authors
        SELECT COUNT(DISTINCT ?book) AS ?numBookIT WHERE {
        
            # Retrieve instances of book and their author
            ?book  (wdt:P31 / wdt:P279*) wd:Q571 ;
                    wdt:P50              ?writer .

            # Retrieve Italian writers
            ?writer wdt:P106 wd:Q36180 ;
                    wdt:P27  wd:Q38    .
        }
    }
    {   # Retrieve the number of books written by French authors
        SELECT COUNT(DISTINCT ?book) AS ?numBookFR WHERE {
        
            # Retrieve instances of book and their author
            ?book  (wdt:P31 / wdt:P279*) wd:Q571 ;
                    wdt:P50              ?writer .

            # Retrieve French writers
            ?writer wdt:P106 wd:Q36180 ;
                    wdt:P27  wd:Q142   .
        }
    }
}
"""

print("Results")
run_query(queryString)

Results
[('numBookIT', '767'), ('numBookFR', '1454')]


1

#### END TASK 2:
I have 1454 books written by French writers and only 767 by Italian writers.
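The counts come back as strings in the result bindings, so they need converting before any arithmetic. A short sketch of the comparison, using the row printed above as input (the variable names `row`, `it` and `fr` are illustrative):

```python
# Row as printed by run_query for the Task 2 query.
row = dict([("numBookIT", "767"), ("numBookFR", "1454")])

it = int(row["numBookIT"])
fr = int(row["numBookFR"])

ratio = fr / it
print(f"French/Italian book ratio: {ratio:.2f}")
```

On these counts, French writers have roughly 1.9 times as many books as Italian writers.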

### ***Task 3 :*** Count how many books written by Italian authors are now released with a "public domain" copyright form.

First of all, I have to understand how copyright licenses are represented.

I retrieve the object properties that connect instances of ***book (wd:Q571)*** with other entities. I also use ***REGEX*** to keep only the properties whose name contains the word "copyright".

In [15]:
queryString = """
SELECT DISTINCT ?p ?pName WHERE {

    # Retrieve instances of book and connect them to other entities
    ?book  (wdt:P31 / wdt:P279*) wd:Q571 ;
           ?p                    ?o      .
            
    # This returns the labels
    ?p <http://schema.org/name> ?pName .
    
    # Exclude data properties
    FILTER(!isLiteral(?o))
    
    # Only properties that contain the word copyright
    FILTER(REGEX(?pName,"copyright"))
}
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P6216'), ('pName', 'copyright status')]
[('p', 'http://www.wikidata.org/prop/direct/P3931'), ('pName', 'copyright holder')]
[('p', 'http://www.wikidata.org/prop/direct/P275'), ('pName', 'copyright license')]
[('p', 'http://www.wikidata.org/prop/direct/P7763'), ('pName', 'copyright status as a creator')]


4
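Note that `REGEX(?pName, "copyright")` is case-sensitive, so a label starting with a capital "C" would not match; SPARQL's `REGEX` accepts an optional third argument `"i"` for case-insensitive matching. The Python sketch below emulates both behaviours on a few labels. The capitalised label is invented purely for illustration; the others come from the results above and from earlier cells. `regex_filter` is a hypothetical helper that handles only the `"i"` flag.

```python
import re

labels = [
    "copyright status",    # from the results above
    "copyright license",   # from the results above
    "author",              # seen earlier, should not match
    "Copyright holder",    # invented capitalisation, for illustration only
]


def regex_filter(names, pattern, flags=""):
    """Emulate FILTER(REGEX(?name, pattern, flags)) for the "i" flag only."""
    re_flags = re.IGNORECASE if "i" in flags else 0
    return [name for name in names if re.search(pattern, name, re_flags)]


case_sensitive = regex_filter(labels, "copyright")
case_insensitive = regex_filter(labels, "copyright", "i")
print(case_sensitive)
print(case_insensitive)
```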

I found the property ***copyright license (wdt:P275)***. I search for the entities connected to instances of ***book (wd:Q571)*** through this property.

In [16]:
queryString = """
SELECT DISTINCT ?license ?licenseName WHERE {
    
    # Retrieve instances of book and their copyright license 
    ?book  (wdt:P31 / wdt:P279*) wd:Q571  ;
            wdt:P275             ?license .
            
    # This returns the labels
    ?license <http://schema.org/name> ?licenseName .
    
}
"""

print("Results")
run_query(queryString)

Results
[('license', 'http://www.wikidata.org/entity/Q80939351'), ('licenseName', 'Open Licence 2.0')]
[('license', 'http://www.wikidata.org/entity/Q14946043'), ('licenseName', 'Creative Commons Attribution-ShareAlike 3.0 Unported')]
[('license', 'http://www.wikidata.org/entity/Q20007257'), ('licenseName', 'Creative Commons Attribution 4.0 International')]
[('license', 'http://www.wikidata.org/entity/Q178285'), ('licenseName', 'freeware')]
[('license', 'http://www.wikidata.org/entity/Q34179348'), ('licenseName', 'Creative Commons Attribution-NonCommercial 4.0 International')]
[('license', 'http://www.wikidata.org/entity/Q24082749'), ('licenseName', 'Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International')]
[('license', 'http://www.wikidata.org/entity/Q19068204'), ('licenseName', 'Creative Commons Attribution-NonCommercial-NoDerivs 2.5 Generic')]
[('license', 'http://www.wikidata.org/entity/Q26259495'), ('licenseName', 'Creative Commons Attribution  3.0 IGO')]
[('license'

44

I use the property ***instanceOf (wdt:P31)*** to discover the classes of the just retrieved entities.

In [17]:
queryString = """
SELECT DISTINCT ?class ?className WHERE {
    
    # instances of book with a copyright license property
    ?book  (wdt:P31 / wdt:P279*) wd:Q571 ;
           wdt:P275              ?license.
           
    # using the "instanceOf" property        
    ?license wdt:P31 ?class .
    
    # this returns the labels
    ?class <http://schema.org/name> ?className . 
}
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('class', 'http://www.wikidata.org/entity/Q196294'), ('className', 'free license')]
[('class', 'http://www.wikidata.org/entity/Q284742'), ('className', 'Creative Commons license')]
[('class', 'http://www.wikidata.org/entity/Q30939938'), ('className', 'Open Definition conformant license')]
[('class', 'http://www.wikidata.org/entity/Q79719'), ('className', 'license')]
[('class', 'http://www.wikidata.org/entity/Q3238057'), ('className', 'proprietary license')]
[('class', 'http://www.wikidata.org/entity/Q1139274'), ('className', 'copyleft license')]
[('class', 'http://www.wikidata.org/entity/Q5975031'), ('className', 'copyleft free software license')]
[('class', 'http://www.wikidata.org/entity/Q797551'), ('className', 'waiver')]
[('class', 'http://www.wikidata.org/entity/Q1156659'), ('className', 'OSI-approved license')]
[('class', 'http://www.wikidata.org/entity/Q25047642'), ('className', 'public domain equivalent license')]
[('class', 'http://www.wikidata.org/entity/Q51754273'),

24

I found the class ***public license (wd:Q7257461)***.

To retrieve copyright licenses that are public I can use this BGP: 
```
?copyright wdt:P31/wdt:P279* wd:Q7257461
```
Now, I can search for books written by Italian authors that are now released with a "public domain" copyright form.

In [18]:
queryString = """
SELECT ?bookName ?writerName WHERE {
    
    # Retrieve instances of book,their authors and their copyright licenses
    ?book (wdt:P31 / wdt:P279*) wd:Q571  ;
           wdt:P50              ?writer  ;
           wdt:P275             ?license .
    
    # I want only Public Licenses
    ?license wdt:P31/wdt:P279* wd:Q7257461 .
    
    # Retrieve Italian writers
    ?writer wdt:P106 wd:Q36180 ;
            wdt:P27  wd:Q38    .

    # This returns the labels
    ?book    <http://schema.org/name> ?bookName    .
    ?writer  <http://schema.org/name> ?writerName  .
}
"""

print("Results")
run_query(queryString)


Results
Empty


0

I found no books written by Italian authors with a ***public license (wd:Q7257461)*** copyright form. Maybe books by Italian authors do not use licenses of the class ***public license (wd:Q7257461)***.

I try to discover which copyright licenses are used by Italian authors using the property ***copyright license (wdt:P275)***.

In [19]:
queryString = """
SELECT DISTINCT ?license ?licenseName WHERE {
    
    # Retrieve instances of book,their authors and their copyright licenses
    ?book (wdt:P31 / wdt:P279*) wd:Q571  ;
           wdt:P50              ?writer  ;
           wdt:P275             ?license .
    
    # Retrieve Italian writers
    ?writer wdt:P106 wd:Q36180 ;
            wdt:P27  wd:Q38    .

    # This returns the labels
    ?book    <http://schema.org/name> ?bookName    .
    ?writer  <http://schema.org/name> ?writerName  .
    ?license <http://schema.org/name> ?licenseName .
    
   
}
"""

print("Results")
run_query(queryString)

Results
Empty


0

Again zero results. I use ***OPTIONAL*** to check whether there is a problem with the first part of the query.

In [20]:
queryString = """
SELECT ?book ?bookName ?writerName ?license ?licenseName WHERE {
   
    # Retrieve instances of book and their authors
    ?book (wdt:P31 / wdt:P279*) wd:Q571  ;
           wdt:P50              ?writer  .
    
    # Retrieve Italian writers
    ?writer wdt:P106 wd:Q36180 ;
            wdt:P27  wd:Q38    .

    # This returns the labels
    ?book    <http://schema.org/name> ?bookName    .
    ?writer  <http://schema.org/name> ?writerName  .
    
    # Optional: book with a copyright license property
    OPTIONAL{
    ?book    wdt:P275                 ?license     .
    ?license <http://schema.org/name> ?licenseName .
    }       
}
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('book', 'http://www.wikidata.org/entity/Q59920205'), ('bookName', 'Wojtyla segreto'), ('writerName', 'Giacomo Galeazzi')]
[('book', 'http://www.wikidata.org/entity/Q15810418'), ('bookName', 'Frederick'), ('writerName', 'Leo Lionni')]
[('book', 'http://www.wikidata.org/entity/Q3938022'), ('bookName', 'Revolt Against the Modern World'), ('writerName', 'Julius Evola')]
[('book', 'http://www.wikidata.org/entity/Q4015950'), ('bookName', 'Voices'), ('writerName', 'Dacia Maraini')]
[('book', 'http://www.wikidata.org/entity/Q22263556'), ('bookName', 'Terre selvagge'), ('writerName', 'Sebastiano Vassalli')]
[('book', 'http://www.wikidata.org/entity/Q3874134'), ('bookName', 'Necropolis'), ('writerName', 'Boris Pahor')]
[('book', 'http://www.wikidata.org/entity/Q3878628'), ('bookName', 'Nostra Signora dei Turchi'), ('writerName', 'Carmelo Bene')]
[('book', 'http://www.wikidata.org/entity/Q3824656'), ('bookName', 'The Cave of the Winds'), ('writerName', 'Ben Pastor')]
[('book', 'http://w

20

Instances of ***book (wd:Q571)*** written by Italian authors are correctly retrieved. Maybe books by Italian authors do not use the property ***copyright license (wdt:P275)***.

I want to check this using an ***ASK*** query.

In [21]:
queryString = """ASK {

    # Retrieve instances of book,their authors and their copyright licenses
    ?book (wdt:P31 / wdt:P279*) wd:Q571  ;
           wdt:P50              ?writer  ;
           wdt:P275             ?license .
    
    # Retrieve Italian writers
    ?writer wdt:P106 wd:Q36180 ;
            wdt:P27  wd:Q38    .
}
"""

print("Results")
run_ask_query(queryString)

Results


{'head': {'link': []}, 'boolean': False}

Instances of ***book (wd:Q571)*** written by Italian authors do not use the property ***copyright license (wdt:P275)***.
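`run_ask_query` returns the whole response dictionary; the answer itself sits under the `boolean` key. A minimal sketch of extracting it, using the dictionary printed above as input (`ask_to_bool` is a hypothetical helper name):

```python
def ask_to_bool(response):
    """Return the truth value of an ASK response, or None if the call failed."""
    if response is None:  # run_ask_query returns None after an exception
        return None
    return bool(response["boolean"])


# Response dictionary as printed by the ASK cell above.
response = {"head": {"link": []}, "boolean": False}
answer = ask_to_bool(response)
print(answer)
```

Keeping `None` distinct from `False` separates "the pattern does not match" from "the request did not complete".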

I try to remove the constraint about the nationality. I search for books that are now released with a "public domain" copyright form.

In [22]:
queryString = """
SELECT DISTINCT ?bookName ?writerName WHERE {
    
    # Retrieve instances of book,their authors and their copyright licenses
    ?book (wdt:P31 / wdt:P279*) wd:Q571  ;
           wdt:P50              ?writer  ;
           wdt:P275             ?license .
    
    # I want only Public Licenses
    ?license wdt:P31/wdt:P279* wd:Q7257461 .
    
    # Retrieve writers
    ?writer wdt:P106 wd:Q36180 .

    # This returns the labels
    ?book    <http://schema.org/name> ?bookName    .
    ?writer  <http://schema.org/name> ?writerName  .
}
"""

print("Results")
run_query(queryString)


Results
[('bookName', 'The Cathedral and the Bazaar'), ('writerName', 'Eric S. Raymond')]
[('bookName', 'Open Library. Guidelines for libraries on the use of open licenses and open access publishing'), ('writerName', 'Anatoly Tsapenko\u200e')]
[('bookName', 'Wikipedia. Introductory course'), ('writerName', 'Reda Kerbush')]
[('bookName', 'The Social Structure of the First Crusade'), ('writerName', 'Conor Kostick')]
[('bookName', 'Swarmwise'), ('writerName', 'Rickard Falkvinge')]
[('bookName', '"Roads and Bridges, The Unseen Labor behind Our Digital Structure"'), ('writerName', 'Nadia Eghbal')]


6

#### END TASK 3:
I was able to find some books released with a "Public Domain" copyright form. 

Hence, the query was correct, but on Wikidata the books written by Italian authors do not have the property ***copyright license (wdt:P275)***.

### ***Task 4:*** How many Literature Nobel Prizes were won by authors from Italy and from the Kingdom of Italy?

First, I have to understand how Nobel Prizes are represented in the graph.

Search for all the properties that connect a writer to the ***Literature Nobel Prize (wd:Q37922)***.

In [23]:
queryString = """
SELECT DISTINCT ?p ?pName WHERE {

    # Retrieve writers and try to connect them to Literature Nobel Prize
    ?writer wdt:P106 wd:Q36180 ;
            ?p       wd:Q37922 .

    # This returns the labels
    ?p <http://schema.org/name> ?pName.
    
}
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1411'), ('pName', 'nominated for')]
[('p', 'http://www.wikidata.org/prop/direct/P166'), ('pName', 'award received')]
[('p', 'http://www.wikidata.org/prop/direct/P793'), ('pName', 'significant event')]


3
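
The property URIs returned above can be rewritten in the prefixed form used in the text (for example `wdt:P166`). A minimal sketch, assuming the three prefixes declared in the setup cell; `shorten` is a hypothetical helper name and is not part of the notebook.

```python
# Prefixes as declared in the setup cell of the notebook
PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",
    "wdt": "http://www.wikidata.org/prop/direct/",
    "sc": "http://schema.org/",
}


def shorten(uri):
    """Rewrite a full URI as prefix:localName when a known prefix matches."""
    for prefix, namespace in PREFIXES.items():
        if uri.startswith(namespace):
            return prefix + ":" + uri[len(namespace):]
    # Unknown namespace: return the URI unchanged
    return uri


for uri in [
    "http://www.wikidata.org/prop/direct/P1411",
    "http://www.wikidata.org/prop/direct/P166",
    "http://www.wikidata.org/prop/direct/P793",
]:
    print(shorten(uri))
```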

I show some examples of authors who won the ***Literature Nobel Prize (wd:Q37922)***, using the property ***award received (wdt:P166)***.

In [24]:
queryString = """
SELECT DISTINCT ?writer ?writerName WHERE {

    # Retrieve writers who won Literature Nobel Prize
    ?writer wdt:P106 wd:Q36180 ;
            wdt:P166 wd:Q37922 .

    # this returns the labels
    ?writer <http://schema.org/name> ?writerName.
    
}
LIMIT 30
"""

print("Results")
run_query(queryString)

Results
[('writer', 'http://www.wikidata.org/entity/Q159552'), ('writerName', 'Johannes V. Jensen')]
[('writer', 'http://www.wikidata.org/entity/Q254032'), ('writerName', 'Olga Tokarczuk')]
[('writer', 'http://www.wikidata.org/entity/Q34743'), ('writerName', 'Rudyard Kipling')]
[('writer', 'http://www.wikidata.org/entity/Q5878'), ('writerName', 'Gabriel García Márquez')]
[('writer', 'http://www.wikidata.org/entity/Q41042'), ('writerName', 'Harold Pinter')]
[('writer', 'http://www.wikidata.org/entity/Q57074'), ('writerName', 'Nelly Sachs')]
[('writer', 'http://www.wikidata.org/entity/Q42122'), ('writerName', 'Tomas Tranströmer')]
[('writer', 'http://www.wikidata.org/entity/Q80900'), ('writerName', 'Pearl S. Buck')]
[('writer', 'http://www.wikidata.org/entity/Q33760'), ('writerName', 'Bertrand Russell')]
[('writer', 'http://www.wikidata.org/entity/Q7241'), ('writerName', 'Rabindranath Tagore')]
[('writer', 'http://www.wikidata.org/entity/Q83184'), ('writerName', 'Jacinto Benavente')]
[('

30

Search for Italian authors who won the ***Literature Nobel Prize (wd:Q37922)***.

In [25]:
queryString = """
SELECT DISTINCT ?writer ?writerName WHERE {
      
    # Retrieve writers from Italy who won Literature Nobel Prize
    ?writer wdt:P106 wd:Q36180 ;
            wdt:P166 wd:Q37922 ;
            wdt:P27  wd:Q38    .

    # This returns the labels
    ?writer <http://schema.org/name> ?writerName .
      
}
"""

print("Results")
run_query(queryString)

Results
[('writer', 'http://www.wikidata.org/entity/Q765'), ('writerName', 'Dario Fo')]
[('writer', 'http://www.wikidata.org/entity/Q83038'), ('writerName', 'Salvatore Quasimodo')]


2

How many Literature Nobel Prizes were won by authors from ***Italy (wd:Q38)*** and from the ***Kingdom of Italy (wd:Q172579)***?

In [26]:
queryString = """
SELECT (COUNT(DISTINCT ?writer) AS ?numAwardedWriters) (GROUP_CONCAT(DISTINCT ?writerName; separator=", ") AS ?winners) WHERE {

    # writers who won a Literature Nobel Prize and their nationality
    ?writer wdt:P106 wd:Q36180 ;
            wdt:P166 wd:Q37922 ;
            wdt:P27  ?country  .

    # This returns the labels
    ?writer <http://schema.org/name> ?writerName .

    # I want only authors from Italy and Kingdom Of Italy
    FILTER (?country = wd:Q38 || ?country = wd:Q172579)      
}
"""

print("Results")
run_query(queryString)

Results
[('numAwardedWriters', '5'), ('winners', 'Dario Fo, Giosuè Carducci, Grazia Deledda, Luigi Pirandello, Salvatore Quasimodo')]


1

In [27]:
queryString = """
SELECT DISTINCT ?writer ?writerName ?countryName WHERE {

    # writers who won a Literature Nobel Prize and their nationality
    ?writer wdt:P106 wd:Q36180 ;
            wdt:P166 wd:Q37922 ;
            wdt:P27  ?country  .

    # This returns the labels
    ?writer  <http://schema.org/name> ?writerName .
    ?country <http://schema.org/name> ?countryName .

    # I want only authors from Italy and Kingdom Of Italy
    FILTER (?country = wd:Q38 || ?country = wd:Q172579)   
}
"""

print("Results")
run_query(queryString)

Results
[('writer', 'http://www.wikidata.org/entity/Q7728'), ('writerName', 'Grazia Deledda'), ('countryName', 'Kingdom of Italy')]
[('writer', 'http://www.wikidata.org/entity/Q43440'), ('writerName', 'Giosuè Carducci'), ('countryName', 'Kingdom of Italy')]
[('writer', 'http://www.wikidata.org/entity/Q1403'), ('writerName', 'Luigi Pirandello'), ('countryName', 'Kingdom of Italy')]
[('writer', 'http://www.wikidata.org/entity/Q765'), ('writerName', 'Dario Fo'), ('countryName', 'Italy')]
[('writer', 'http://www.wikidata.org/entity/Q83038'), ('writerName', 'Salvatore Quasimodo'), ('countryName', 'Kingdom of Italy')]
[('writer', 'http://www.wikidata.org/entity/Q83038'), ('writerName', 'Salvatore Quasimodo'), ('countryName', 'Italy')]


6

***Salvatore Quasimodo (wd:Q83038)*** is shown twice because he lived during the transition from the ***Kingdom of Italy (wd:Q172579)*** to ***Italy (wd:Q38)***, so both countries are returned for him.
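
The duplicate can also be checked offline by grouping the rows by writer. The sketch below copies the six rows printed by the previous query; `countries_by_writer` is a hypothetical helper, not part of the notebook setup.

```python
from collections import defaultdict

# Rows copied from the output above: (writer URI, writer name, country name)
rows = [
    ("http://www.wikidata.org/entity/Q7728", "Grazia Deledda", "Kingdom of Italy"),
    ("http://www.wikidata.org/entity/Q43440", "Giosuè Carducci", "Kingdom of Italy"),
    ("http://www.wikidata.org/entity/Q1403", "Luigi Pirandello", "Kingdom of Italy"),
    ("http://www.wikidata.org/entity/Q765", "Dario Fo", "Italy"),
    ("http://www.wikidata.org/entity/Q83038", "Salvatore Quasimodo", "Kingdom of Italy"),
    ("http://www.wikidata.org/entity/Q83038", "Salvatore Quasimodo", "Italy"),
]


def countries_by_writer(rows):
    """Group the country names by writer, keyed on the writer URI."""
    countries = defaultdict(set)
    names = {}
    for uri, name, country in rows:
        countries[uri].add(country)
        names[uri] = name
    return {names[uri]: sorted(found) for uri, found in countries.items()}


grouped = countries_by_writer(rows)
for name, found in grouped.items():
    print(name, found)
print(len(grouped))
```

Grouping on the writer URI rather than on the name mirrors the `COUNT(DISTINCT ?writer)` used in the counting query, so six rows collapse to five writers.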

#### END TASK 4:
I was able to answer the initial question.

In particular, there are 5 distinct authors from ***Italy (wd:Q38)*** and from the ***Kingdom of Italy (wd:Q172579)*** who won the ***Literature Nobel Prize (wd:Q37922)***.

### ***Task 5:*** Are there books by Literature Nobel Prize winners which are not present in the Vatican Library? (If so, who is the author with the most books not in the Vatican Library?)

Search for books by authors who won the ***Literature Nobel Prize (wd:Q37922)*** which are not present in the ***Vatican Library (wd:Q213678)***.

I can use the property ***location (wdt:P276)*** discovered in Task 1.

In [28]:
queryString = """
SELECT DISTINCT ?book ?bookName ?writer ?writerName WHERE {

    # Retrieve instances of book and their authors
    ?book (wdt:P31 / wdt:P279*) wd:Q571 ;
           wdt:P50              ?writer .
    
    # Retrieve writers who won a Nobel Prize
    ?writer wdt:P106 wd:Q36180 ;
            wdt:P166 wd:Q37922 .
  
    # this returns the labels
    ?book   <http://schema.org/name> ?bookName   .
    ?writer <http://schema.org/name> ?writerName .
    
    # The book must not be located in the Vatican Library
    FILTER NOT EXISTS {?book wdt:P276 wd:Q213678}
}
LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('book', 'http://www.wikidata.org/entity/Q935935'), ('bookName', 'Chronicle of a Death Foretold'), ('writer', 'http://www.wikidata.org/entity/Q5878'), ('writerName', 'Gabriel García Márquez')]
[('book', 'http://www.wikidata.org/entity/Q75700865'), ('bookName', 'The Lost Soul'), ('writer', 'http://www.wikidata.org/entity/Q254032'), ('writerName', 'Olga Tokarczuk')]
[('book', 'http://www.wikidata.org/entity/Q100552037'), ('bookName', 'Fra Fristaterne'), ('writer', 'http://www.wikidata.org/entity/Q159552'), ('writerName', 'Johannes V. Jensen')]
[('book', 'http://www.wikidata.org/entity/Q77012907'), ('bookName', 'Sto roků samoty'), ('writer', 'http://www.wikidata.org/entity/Q5878'), ('writerName', 'Gabriel García Márquez')]
[('book', 'http://www.wikidata.org/entity/Q48841223'), ('bookName', 'E.E.'), ('writer', 'http://www.wikidata.org/entity/Q254032'), ('writerName', 'Olga Tokarczuk')]
[('book', 'http://www.wikidata.org/entity/Q62092410'), ('bookName', 'Principia Mathematica II'),

50

Which winner of the ***Literature Nobel Prize (wd:Q37922)*** has the most books not in the ***Vatican Library (wd:Q213678)***?

In [29]:
queryString = """
SELECT ?writer ?writerName (COUNT(DISTINCT ?book) AS ?numBooks) WHERE {

    # Retrieve instances of book and their authors
    ?book (wdt:P31 / wdt:P279*) wd:Q571 ;
           wdt:P50              ?writer .

    # Retrieve writers who won a Nobel Prize
    ?writer wdt:P106 wd:Q36180 ;
            wdt:P166 wd:Q37922 .

    # This returns the labels
    ?book   <http://schema.org/name> ?bookName   .
    ?writer <http://schema.org/name> ?writerName .

    # The book must not be located in the Vatican Library
    FILTER NOT EXISTS { ?book wdt:P276 wd:Q213678 }
}
GROUP BY ?writer ?writerName
ORDER BY DESC(?numBooks)
LIMIT 1
"""

print("Results")
run_query(queryString)

Results
[('writer', 'http://www.wikidata.org/entity/Q81685'), ('writerName', 'François Mauriac'), ('numBooks', '15')]


1

#### END TASK 5:
I showed some books by winners of the ***Literature Nobel Prize (wd:Q37922)*** which are not present in the ***Vatican Library (wd:Q213678)***.

The author (winner of the ***Literature Nobel Prize (wd:Q37922)***) with the most books not in the Vatican Library is ***François Mauriac (wd:Q81685)***.
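
Note that `LIMIT 1` returns a single row even when several authors share the highest count, and `ORDER BY DESC(?numBooks)` alone does not determine which of them is returned. A minimal sketch of how ties could be kept when post-processing the counts in Python; the data is hypothetical and `top_writers` is not part of the notebook.

```python
def top_writers(counts):
    """Return every (writer, count) pair that shares the highest count."""
    if not counts:
        return []
    best = max(count for _, count in counts)
    return [(writer, count) for writer, count in counts if count == best]


# Hypothetical counts, for illustration only
sample = [("writer A", 15), ("writer B", 15), ("writer C", 11)]
print(top_writers(sample))
```

Running the grouped query with a larger `LIMIT` and applying such a filter would show whether the top position is shared.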


### Extra analytics query

#### On average, how many books are written by authors of the same nationality?

In [30]:
queryString = """
SELECT ?country ?countryName (AVG(?numBook) AS ?avgBookWritten) WHERE {

    # Retrieve the nationality of the writers
    ?writer wdt:P27 ?country.  
    
    # This returns the labels
    ?country <http://schema.org/name> ?countryName .

    # Number of books written by each writer
    {    SELECT ?writer (COUNT(DISTINCT ?book) AS ?numBook) WHERE {

         # Retrieve instances of books and their authors
         ?book (wdt:P31 / wdt:P279*)  wd:Q571 ;
                wdt:P50               ?writer .

         # Retrieve writers
         ?writer wdt:P106 wd:Q36180 .
        }
        GROUP BY (?writer)
     }
}
GROUP BY ?country ?countryName
ORDER BY DESC(?avgBookWritten)
LIMIT 10


"""

print("Results")
run_query(queryString)

Results
[('country', 'http://www.wikidata.org/entity/Q1072949'), ('countryName', 'Western Han'), ('avgBookWritten', '88')]
[('country', 'http://www.wikidata.org/entity/Q188712'), ('countryName', 'Empire of Japan'), ('avgBookWritten', '11')]
[('country', 'http://www.wikidata.org/entity/Q518101'), ('countryName', 'First Republic of Austria'), ('avgBookWritten', '8.666666666666667')]
[('country', 'http://www.wikidata.org/entity/Q503623'), ('countryName', 'Gold Coast'), ('avgBookWritten', '8.5')]
[('country', 'http://www.wikidata.org/entity/Q31'), ('countryName', 'Belgium'), ('avgBookWritten', '6')]
[('country', 'http://www.wikidata.org/entity/Q152750'), ('countryName', 'Protectorate of Bohemia and Moravia'), ('avgBookWritten', '6')]
[('country', 'http://www.wikidata.org/entity/Q172579'), ('countryName', 'Kingdom of Italy'), ('avgBookWritten', '4.9125')]
[('country', 'http://www.wikidata.org/entity/Q38'), ('countryName', 'Italy'), ('avgBookWritten', '4.642857142857143')]
[('country', 'http

10
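
The query is an average of per-writer counts: the inner subquery counts the distinct books of each writer, and the outer query averages those counts per country. The same computation can be sketched offline in Python. The data below is hypothetical and only illustrates the logic; it also shows two consequences of the join: a writer with two nationalities contributes to both countries, and a writer without any book does not enter the average.

```python
from collections import defaultdict

# Hypothetical output of the inner subquery: distinct books per writer
books_per_writer = {"w1": 3, "w2": 1, "w3": 8}

# Hypothetical (writer, country) pairs; w3 has two nationalities, w4 has no book
citizenships = [("w1", "X"), ("w2", "X"), ("w3", "X"), ("w3", "Y"), ("w4", "Y")]


def average_books_by_country(books_per_writer, citizenships):
    """Average the per-writer book counts inside each country."""
    per_country = defaultdict(list)
    for writer, country in citizenships:
        # Join with the subquery: writers without a counted book are dropped
        if writer in books_per_writer:
            per_country[country].append(books_per_writer[writer])
    return {c: sum(v) / len(v) for c, v in per_country.items()}


averages = average_books_by_country(books_per_writer, citizenships)
print(averages)
```

Because the average is taken over the writers of each country, a country with very few writers can reach the top of the ranking on the strength of a single prolific author.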

#### How many books have been written by each Literature Nobel Prize winner?

In [31]:
queryString = """
SELECT ?writer ?writerName (COUNT(DISTINCT ?book) AS ?numBook) WHERE {

    # Retrieve instances of books and their authors
    ?book (wdt:P31 / wdt:P279*)  wd:Q571 ;
           wdt:P50               ?writer .

    # Retrieve writers who won a Literature Nobel Prize
    ?writer wdt:P106 wd:Q36180 ;
            wdt:P166 wd:Q37922 .
            
    # This returns the labels
    ?writer <http://schema.org/name> ?writerName .
            
}
GROUP BY ?writer ?writerName
ORDER BY DESC(?numBook)
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('writer', 'http://www.wikidata.org/entity/Q81685'), ('writerName', 'François Mauriac'), ('numBook', '15')]
[('writer', 'http://www.wikidata.org/entity/Q41488'), ('writerName', 'Wole Soyinka'), ('numBook', '15')]
[('writer', 'http://www.wikidata.org/entity/Q43736'), ('writerName', 'Yasunari Kawabata'), ('numBook', '11')]
[('writer', 'http://www.wikidata.org/entity/Q47484'), ('writerName', 'André Gide'), ('numBook', '9')]
[('writer', 'http://www.wikidata.org/entity/Q42443'), ('writerName', 'Anatole France'), ('numBook', '9')]
[('writer', 'http://www.wikidata.org/entity/Q47162'), ('writerName', 'Romain Rolland'), ('numBook', '8')]
[('writer', 'http://www.wikidata.org/entity/Q133042'), ('writerName', 'Shmuel Yosef Agnon'), ('numBook', '6')]
[('writer', 'http://www.wikidata.org/entity/Q254032'), ('writerName', 'Olga Tokarczuk'), ('numBook', '6')]
[('writer', 'http://www.wikidata.org/entity/Q41502'), ('writerName', 'Henryk Sienkiewicz'), ('numBook', '6')]
[('writer', 'http://www.wi

10

#### For each country, show how many writers won a Literature Nobel Prize and the average number of books written by them.

In [32]:
queryString = """
SELECT ?countryName ?numAwardedWriters ?avgBookWritten WHERE{

    # This returns the labels
    ?country <http://schema.org/name> ?countryName .
     
    # Number of Nobel winners for each country
    {   SELECT ?country (COUNT(DISTINCT ?writer) AS ?numAwardedWriters)  WHERE {

        # Retrieve writers who won a Literature Nobel Prize and their nationality
        ?writer wdt:P106 wd:Q36180 ;
                wdt:P166 wd:Q37922 ;
                wdt:P27  ?country .
        }
        GROUP BY ?country      
    }

    # Average number of books for writers who won a Literature Nobel Prize of the same nationality
    {   SELECT ?country (AVG(?numBook) AS ?avgBookWritten) WHERE {

            # Retrieve the nationality of the writers
            ?writer wdt:P27 ?country. 

            # Number of books for each writer who won a Literature Nobel Prize
            {   SELECT ?writer ?writerName (COUNT(DISTINCT ?book) AS ?numBook) WHERE {

                # Retrieve instances of books and their authors
                ?book (wdt:P31 / wdt:P279*)  wd:Q571 ;
                       wdt:P50               ?writer .

                # Retrieve writers who won a Literature Nobel Prize
                ?writer wdt:P106 wd:Q36180 ;
                        wdt:P166 wd:Q37922 .

                # This returns the labels
                ?writer <http://schema.org/name> ?writerName .
            }
            GROUP BY ?writer ?writerName
            }
        }
        GROUP BY ?country
    }   
}
ORDER BY DESC(?numAwardedWriters)
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('countryName', 'France'), ('numAwardedWriters', '16'), ('avgBookWritten', '6.25')]
[('countryName', 'United States of America'), ('numAwardedWriters', '14'), ('avgBookWritten', '1.888888888888889')]
[('countryName', 'United Kingdom'), ('numAwardedWriters', '10'), ('avgBookWritten', '2')]
[('countryName', 'Sweden'), ('numAwardedWriters', '8'), ('avgBookWritten', '2')]
[('countryName', 'Russian Empire'), ('numAwardedWriters', '7'), ('avgBookWritten', '2.333333333333333')]
[('countryName', 'Germany'), ('numAwardedWriters', '6'), ('avgBookWritten', '1.5')]
[('countryName', 'Spain'), ('numAwardedWriters', '6'), ('avgBookWritten', '2')]
[('countryName', 'German Empire'), ('numAwardedWriters', '6'), ('avgBookWritten', '2.75')]
[('countryName', 'Soviet Union'), ('numAwardedWriters', '5'), ('avgBookWritten', '1.666666666666667')]
[('countryName', 'United Kingdom of Great Britain and Ireland'), ('numAwardedWriters', '5'), ('avgBookWritten', '1.75')]


10
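
The two columns are computed over different sets of writers. `?numAwardedWriters` counts every winner with a nationality, while `?avgBookWritten` is averaged only over the winners that have at least one book (and a name) in the graph, because the innermost subquery only yields writers joined to a book. The averages themselves give a hint of how many writers entered them: since the total number of books is a whole number, the average times the number of writers must be an integer. The sketch below applies this check; `possible_writer_counts` is a hypothetical helper, and it assumes the printed average is a fraction with a small denominator.

```python
from fractions import Fraction


def possible_writer_counts(average, max_writers):
    """List the group sizes n <= max_writers for which average * n is a whole number."""
    # Recover the exact fraction behind the printed floating-point average
    avg = Fraction(average).limit_denominator(1000)
    return [n for n in range(1, max_writers + 1) if (avg * n).denominator == 1]


# United States of America: 14 awarded writers, average 1.888888888888889
print(possible_writer_counts(1.888888888888889, 14))  # [9]

# France: 16 awarded writers, average 6.25
print(possible_writer_counts(6.25, 16))  # [4, 8, 12, 16]
```

For the United States row the average equals 17/9, which is consistent with only 9 of the 14 winners having a book counted. For France the check is not conclusive, since 4, 8, 12 and 16 writers are all compatible with 6.25.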

## END