# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained by the previous ones.

An exploratory workload starts with a usually vague, open-ended question, and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook
3. Do not delete useless queries. Keep everything that is syntactically valid

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.
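
A minimal sketch of how this pattern could be generated for any variable. The helper name `label_pattern` and its parameters are hypothetical and not part of the notebook setup.

```python
SCHEMA_NAME = "<http://schema.org/name>"

def label_pattern(var, label_var=None, optional=False):
    """Return a triple pattern that binds the schema.org name of ?var."""
    # Default label variable: the variable name followed by "name" (hypothetical convention).
    label_var = label_var or var + "name"
    triple = f"?{var} {SCHEMA_NAME} ?{label_var} ."
    # Wrap in OPTIONAL when nodes without a name should still be returned.
    return f"OPTIONAL {{ {triple} }}" if optional else triple

print(label_pattern("p"))
print(label_pattern("obj", "oname", optional=True))
```

The optional form is the one to prefer for object nodes, since literals and some entities carry no `schema:name`.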
    
    

In [1]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-b7cf4711ac-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try:
        results = sparql.query()
        json_results = results.convert()
        if len(json_results['results']['bindings']) == 0:
            print("Empty")
            return 0

        for bindings in json_results['results']['bindings']:
            print([(var, value['value']) for var, value in bindings.items()])

        return len(json_results['results']['bindings'])

    except Exception as e:
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
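
`run_query` only prints each row, which makes the results hard to reuse in later cells. A possible companion, sketched here under the assumption that the endpoint returns the standard SPARQL 1.1 JSON results layout, flattens the bindings into plain dictionaries. The name `bindings_to_rows` and the sample document are illustrative and are not real endpoint output.

```python
def bindings_to_rows(json_results):
    """Flatten a SPARQL JSON result into a list of {variable: value} dicts."""
    # ASK results have no "results" key, so fall back to an empty list.
    bindings = json_results.get("results", {}).get("bindings", [])
    return [{var: cell["value"] for var, cell in row.items()} for row in bindings]

# Hand-written sample in the SPARQL 1.1 JSON results layout.
sample = {
    "head": {"vars": ["pname", "p"]},
    "results": {"bindings": [
        {"pname": {"type": "literal", "value": "country"},
         "p": {"type": "uri", "value": "http://www.wikidata.org/prop/direct/P17"}},
    ]},
}

rows = bindings_to_rows(sample)
print(rows)
```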


# GEO Workflow Series ("Place of Birth, Death, and Burial") 

Consider the following exploratory information need:

> You want to visit cities connected to famous writers and poets, and you are deciding whether to visit France or Germany

## Useful URIs for the current workflow


The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P279`    | subclass      | predicate |
| `wdt:P106`    | occupation    | predicate |
| `wdt:P27`     | citizenship   | predicate |
| `wd:Q183`     | Germany       | node |
| `wd:Q142`     | France        | node |
| `wd:Q90`      | Paris         | node |
| `wd:Q49757`   | Poet          | node |
| `wd:Q36180`   | Writer        | node |
| `wd:Q501`     | Charles Baudelaire  | node      |
| `wd:Q272208`  | Montparnasse Cemetery       | node |


Also consider

```
?p wdt:P27 wd:Q142  . 
?p wdt:P106 wd:Q36180  . 
```

is the BGP to retrieve all **French writers**

## Workload Goals

1. Identify the BGP that connects people to their place of birth, death, or burial

2. Identify the BGP to obtain the country in which a place is located

3. How many poets and writers have a place of birth, death, or burial in Germany and France?

4. Analyze cities across the two countries
 
   4.1 Is there any poet for which the birth place and the place of burial are located in the same city either in Germany or France?
   
   4.2 Which cities host the place of birth of the largest number of poets or writers across the two countries?
   
   4.3 What are the top 3 cities in each country that you could visit? Based on what criteria?


In [3]:
# start your workflow here

In [2]:
queryString = """
SELECT COUNT(*)
WHERE { 

?p wdt:P27 wd:Q142  . 
?p wdt:P106 wd:Q36180  . 
} 
GROUP BY ?cult  ?arch
"""

print("Predicates")
run_query(queryString)

Predicates
[('callret-0', '14400')]


1

## Task 1
1. Identify the BGP that connects people to their place of birth, death, or burial

In [4]:
#From the given URIs I use the one related to "Charles Baudelaire" (wd:Q501) to see if I can find information about place of birth, death and burial looking at the outgoing properties 
queryString = """
SELECT ?pname ?oname ?p ?obj 
WHERE { 

wd:Q501 ?p ?obj .

?p <http://schema.org/name> ?pname .
OPTIONAL {?obj <http://schema.org/name> ?oname }.
} 
"""

print("Predicates")
run_query(queryString)

Predicates
[('pname', 'on focus list of Wikimedia project'), ('oname', 'WikiProject PCC Wikidata Pilot/Frick Art Reference Library'), ('p', 'http://www.wikidata.org/prop/direct/P5008'), ('obj', 'http://www.wikidata.org/entity/Q104694359')]
[('pname', 'educated at'), ('oname', 'Lycée Louis-le-Grand'), ('p', 'http://www.wikidata.org/prop/direct/P69'), ('obj', 'http://www.wikidata.org/entity/Q1059546')]
[('pname', 'occupation'), ('oname', 'essayist'), ('p', 'http://www.wikidata.org/prop/direct/P106'), ('obj', 'http://www.wikidata.org/entity/Q11774202')]
[('pname', 'ethnic group'), ('oname', 'French'), ('p', 'http://www.wikidata.org/prop/direct/P172'), ('obj', 'http://www.wikidata.org/entity/Q121842')]
[('pname', 'country of citizenship'), ('oname', 'France'), ('p', 'http://www.wikidata.org/prop/direct/P27'), ('obj', 'http://www.wikidata.org/entity/Q142')]
[('pname', 'native language'), ('oname', 'French'), ('p', 'http://www.wikidata.org/prop/direct/P103'), ('obj', 'http://www.wikidata.org

265

#### Comment
Looking at these results, we have discovered the following URIs:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P19`   | place of birth   | predicate |
| `wdt:P20`     | place of death   | predicate |
| `wdt:P119`    | place of burial      | predicate |
| `wd:Q5`    | human    | node |
| `wd:Q90`    | Paris    | node |

As an example, I show the graph pattern that retrieves the place of birth, place of death, and place of burial of a person (human), without executing it because of the huge number of matching nodes.

In [None]:
#EXAMPLE - NOT EXECUTED
queryString = """
SELECT ?pname ?oname ?p ?obj 
WHERE { 

?person ?p ?obj ;  
        wdt:P31 wd:Q5 .

FILTER(?p = wdt:P19 || ?p = wdt:P20 || ?p = wdt:P119).

?p <http://schema.org/name> ?pname .
OPTIONAL {?obj <http://schema.org/name> ?oname }.
} 
"""

print("Results")
run_query(queryString)
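
The `FILTER` over three predicates can also be written with a SPARQL 1.1 `VALUES` clause, which states the allowed predicates up front. The sketch below only builds the query text; the helper `values_clause`, the constant `PLACE_PROPS`, and the `LIMIT 20` are assumptions added for illustration, and support for `VALUES` on the course endpoint has not been checked here.

```python
def values_clause(var, terms):
    """Build a SPARQL 1.1 VALUES clause binding ?var to each term in turn."""
    return f"VALUES ?{var} {{ {' '.join(terms)} }}"

# Predicates discovered in Task 1: place of birth, death, and burial.
PLACE_PROPS = ["wdt:P19", "wdt:P20", "wdt:P119"]

clause = values_clause("p", PLACE_PROPS)
query = f"""
SELECT ?pname ?oname ?p ?obj
WHERE {{
  {clause}
  ?person wdt:P31 wd:Q5 ;
          ?p ?obj .
  ?p <http://schema.org/name> ?pname .
  OPTIONAL {{ ?obj <http://schema.org/name> ?oname }}
}}
LIMIT 20
"""
print(query)
```

A small `LIMIT` keeps such a sample query cheap enough to pass to `run_query` when only a few example rows are needed.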

## Task 2

2. Identify the BGP to obtain the country in which a place is located

#### Comment
The solution of this task is similar to the one for task 1 but in this case I'm going to start from Montparnasse Cemetery node (wd:Q272208)

In [5]:
#From the given URIs I use the one related to "Montparnasse Cemetery" (wd:Q272208) to see if I can find information about the country in which a place is located 
queryString = """
SELECT ?pname ?oname ?p ?obj 
WHERE { 

wd:Q272208 ?p ?obj .

?p <http://schema.org/name> ?pname .
OPTIONAL {?obj <http://schema.org/name> ?oname }.
} 
"""

print("Results")
run_query(queryString)

Results
[('pname', 'category of people buried here'), ('oname', 'Category:Burials at Montparnasse Cemetery'), ('p', 'http://www.wikidata.org/prop/direct/P1791'), ('obj', 'http://www.wikidata.org/entity/Q6444336')]
[('pname', 'country'), ('oname', 'France'), ('p', 'http://www.wikidata.org/prop/direct/P17'), ('obj', 'http://www.wikidata.org/entity/Q142')]
[('pname', 'country of origin'), ('oname', 'France'), ('p', 'http://www.wikidata.org/prop/direct/P495'), ('obj', 'http://www.wikidata.org/entity/Q142')]
[('pname', 'located in the administrative territorial entity'), ('oname', '14th arrondissement of Paris'), ('p', 'http://www.wikidata.org/prop/direct/P131'), ('obj', 'http://www.wikidata.org/entity/Q187153')]
[('pname', 'adjacent building'), ('oname', 'Montparnasse – Bienvenüe'), ('p', 'http://www.wikidata.org/prop/direct/P3032'), ('obj', 'http://www.wikidata.org/entity/Q764465')]
[('pname', 'located on street'), ('oname', 'boulevard Edgar-Quinet'), ('p', 'http://www.wikidata.org/prop/d

27

#### Comment
We can see the country of a place but not the city in which it is located, so let's find out how to extract this information.

In [7]:
# Properties for Paris
queryString = """
SELECT ?pname ?oname ?p ?obj 
WHERE { 

wd:Q90 ?p ?obj .

?p <http://schema.org/name> ?pname .
OPTIONAL {?obj <http://schema.org/name> ?oname }.
} 
"""

print("Results")
run_query(queryString)

Results
[('pname', 'located in or next to body of water'), ('oname', 'Canal Saint-Martin'), ('p', 'http://www.wikidata.org/prop/direct/P206'), ('obj', 'http://www.wikidata.org/entity/Q860172')]
[('pname', 'category of associated people'), ('oname', 'Category:People from Paris'), ('p', 'http://www.wikidata.org/prop/direct/P1792'), ('obj', 'http://www.wikidata.org/entity/Q8964470')]
[('pname', 'owner of'), ('oname', 'hôtel Dodun'), ('p', 'http://www.wikidata.org/prop/direct/P1830'), ('obj', 'http://www.wikidata.org/entity/Q19832938')]
[('pname', 'shares border with'), ('oname', 'Boulogne-Billancourt'), ('p', 'http://www.wikidata.org/prop/direct/P47'), ('obj', 'http://www.wikidata.org/entity/Q172455')]
[('pname', 'shares border with'), ('oname', 'Aubervilliers'), ('p', 'http://www.wikidata.org/prop/direct/P47'), ('obj', 'http://www.wikidata.org/entity/Q205632')]
[('pname', 'shares border with'), ('oname', 'Saint-Ouen-sur-Seine'), ('p', 'http://www.wikidata.org/prop/direct/P47'), ('obj', '

327

#### Comment
So we have the country property to retrieve where a place is located:
    
| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P17`   | country   | predicate |

Starting from a place or a city, we have the BGP to find the country it belongs to:

```
?loc wdt:P17 ?country
```
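
The BGPs from Task 1 and Task 2 can be chained: a person points to a place, and the place points to its country. A small sketch that writes out the chained pattern; the function name `place_country_bgp` and the default variable names are hypothetical.

```python
def place_country_bgp(person="per", prop="wdt:P19", place="place", country="c"):
    """Chain a person-to-place predicate with the place-to-country predicate wdt:P17."""
    return f"?{person} {prop} ?{place} .\n?{place} wdt:P17 ?{country} ."

# Country of the place of birth (default) and of the place of burial.
print(place_country_bgp())
print(place_country_bgp(prop="wdt:P119", place="bu"))
```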

## Task 3
3. How many poets and writers have a place of birth, death, or burial in Germany and France?

We start from the given URIs:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wd:Q183`     | Germany       | node |
| `wd:Q142`     | France        | node |
| `wd:Q49757`     | poet        | node |
| `wd:Q36180`     | writer        | node |

I'm going to use the wdt:P17 "country" property to locate a place 

In [5]:
#Number of poets and writers with place of birth or place of death or place of burial located in France 
queryString = """
SELECT (COUNT(DISTINCT ?wname) AS ?num)
WHERE { 

?per wdt:P106 ?job ;
    ?p ?obj .
?obj wdt:P17 ?c .

FILTER(?job = wd:Q36180 || ?job = wd:Q49757).
FILTER(?p = wdt:P19 || ?p = wdt:P20 || ?p = wdt:P119).
FILTER(?c = wd:Q142).

?per <http://schema.org/name> ?wname.
#?p <http://schema.org/name> ?pname .
#?obj <http://schema.org/name> ?oname .
#?c <http://schema.org/name> ?cname .
}
#GROUP BY ?wname ?cname
"""

print("Results")
run_query(queryString)

Results
[('num', '15788')]


1

In [7]:
#Number of poets and writers with place of birth or place of death or place of burial located in Germany
queryString = """
SELECT (COUNT(DISTINCT ?wname) AS ?num)
WHERE { 

?per wdt:P106 ?job ;
    ?p ?obj .
?obj wdt:P17 ?c .

FILTER(?job = wd:Q36180 || ?job = wd:Q49757).
FILTER(?p = wdt:P19 || ?p = wdt:P20 || ?p = wdt:P119).
FILTER(?c = wd:Q183).

?per <http://schema.org/name> ?wname.
#?p <http://schema.org/name> ?pname .
#?obj <http://schema.org/name> ?oname .
#?c <http://schema.org/name> ?cname .
}
#GROUP BY ?wname ?cname
"""

print("Results")
run_query(queryString)

Results
[('num', '26782')]


1
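
The counting queries in this task differ only in the country filter, so the query text could be generated from one template. This is a sketch rather than the notebook's method; `COUNT_TEMPLATE` and `count_query` are hypothetical names, and the resulting string would still be passed to `run_query`.

```python
COUNT_TEMPLATE = """
SELECT (COUNT(DISTINCT ?wname) AS ?num)
WHERE {{
  ?per wdt:P106 ?job ;
       ?p ?obj .
  ?obj wdt:P17 ?c .

  FILTER(?job = wd:Q36180 || ?job = wd:Q49757).
  FILTER(?p = wdt:P19 || ?p = wdt:P20 || ?p = wdt:P119).
  FILTER({country_filter}).

  ?per <http://schema.org/name> ?wname .
}}
"""

def count_query(countries):
    """Return the Task 3 counting query restricted to the given country IRIs."""
    if not countries:
        raise ValueError("at least one country IRI is required")
    # One equality test per country, joined with logical OR.
    country_filter = " || ".join(f"?c = {iri}" for iri in countries)
    return COUNT_TEMPLATE.format(country_filter=country_filter)

france_query = count_query(["wd:Q142"])
both_query = count_query(["wd:Q142", "wd:Q183"])
print(both_query)
```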

In [10]:
#Number of poets and writers with place of birth or place of death or place of burial located in France or Germany
queryString = """
SELECT (COUNT(DISTINCT ?wname) AS ?num)
WHERE { 

?per wdt:P106 ?job ;
    ?p ?obj .
?obj wdt:P17 ?c .

FILTER(?job = wd:Q36180 || ?job = wd:Q49757).
FILTER(?p = wdt:P19 || ?p = wdt:P20 || ?p = wdt:P119).
FILTER(?c = wd:Q142 || ?c = wd:Q183).

?per <http://schema.org/name> ?wname.
#?p <http://schema.org/name> ?pname .
#?obj <http://schema.org/name> ?oname .
#?c <http://schema.org/name> ?cname .
}
#GROUP BY ?wname ?cname
"""

print("Results")
run_query(queryString)

Results
[('num', '42260')]


1
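
The three counts can be cross-checked with inclusion-exclusion. Since each query counts distinct names (`?wname`), the difference below is the number of names that appear for both countries. It is not necessarily a number of distinct persons, because two people may share a name.

```python
france = 15788    # distinct names with a place in France
germany = 26782   # distinct names with a place in Germany
either = 42260    # distinct names with a place in France or Germany

# Inclusion-exclusion: |F and G| = |F| + |G| - |F or G|
both = france + germany - either
print(both)  # 310
```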

#### Comment
If we also want the names, we can run the query below (I interrupted it because of the size of the result). It also shows how the BGP found in Task 2 works.

In [9]:
#If we also want the names we can execute this query (I stopped it)
#Poets and writers with place of birth or place of death or place of burial located in France or Germany 
queryString = """
SELECT ?wname (GROUP_CONCAT(DISTINCT ?pname ; separator = ", ") AS ?info) ?cname
WHERE { 

?per wdt:P106 ?job ;
    ?p ?obj .
?obj wdt:P17 ?c .

FILTER(?job = wd:Q36180 || ?job = wd:Q49757).
FILTER(?p = wdt:P19 || ?p = wdt:P20 || ?p = wdt:P119).
FILTER(?c = wd:Q142 || ?c = wd:Q183).

?per <http://schema.org/name> ?wname.
?p <http://schema.org/name> ?pname .
?obj <http://schema.org/name> ?oname .
?c <http://schema.org/name> ?cname .
}
GROUP BY ?wname ?cname
ORDER BY ASC (?wname)
"""

print("Results")
run_query(queryString)

Results
[('wname', 'A. Augustin-Thierry'), ('info', 'place of birth, place of death'), ('cname', 'France')]
[('wname', 'A. F. Koshko'), ('info', 'place of death'), ('cname', 'France')]
[('wname', 'A. Gennevraye'), ('info', 'place of birth, place of death'), ('cname', 'France')]
[('wname', 'A. I︠A︡blonovskīĭ'), ('info', 'place of death'), ('cname', 'France')]
[('wname', 'A. Josef Štěpánek'), ('info', 'place of death'), ('cname', 'Germany')]
[('wname', 'A. M. W. Stirling'), ('info', 'place of death'), ('cname', 'Germany')]
[('wname', 'A. O. Osipovich-Novodvorskiĭ'), ('info', 'place of death'), ('cname', 'France')]
[('wname', 'A. Rennikov'), ('info', 'place of burial, place of death'), ('cname', 'France')]
[('wname', 'A. S. H. Pelikan'), ('info', 'place of birth'), ('cname', 'Germany')]
[('wname', 'A. Verley'), ('info', 'place of birth'), ('cname', 'France')]
[('wname', 'A.-Sébastien Kauffmann'), ('info', 'place of birth, place of death'), ('cname', 'France')]
[('wname', 'Aar van de Werfh

KeyboardInterrupt: 

Here we have an example of a poet or writer whose places of birth, death, and burial are in different countries:

[('wname', 'Albert Schmitt'), ('info', 'place of death'), ('cname', 'Germany')]  
[('wname', 'Albert Schmitt'), ('info', 'place of birth, place of burial'), ('cname', 'France')]
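
Such cases could be spotted automatically once the rows are available as tuples. A minimal sketch, assuming the rows have been collected into a Python list; the three rows are copied from the output above and the variable names are illustrative.

```python
from collections import defaultdict

# (name, place info, country) rows copied from the recorded output.
rows = [
    ("Albert Schmitt", "place of death", "Germany"),
    ("Albert Schmitt", "place of birth, place of burial", "France"),
    ("A. Verley", "place of birth", "France"),
]

# Collect the set of countries seen for each name.
countries_by_writer = defaultdict(set)
for name, _info, country in rows:
    countries_by_writer[name].add(country)

# Names linked to more than one country.
cross_border = sorted(n for n, cs in countries_by_writer.items() if len(cs) > 1)
print(cross_border)  # ['Albert Schmitt']
```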

## Task 4

4. Analyze cities across the two countries 

    4.1 Is there any poet for which the birth place and the place of burial are located in the same city either in Germany or France?  

    4.2 Which cities host the place of birth of the largest number of poets or writers across the two countries?  

    4.3 What are the top 3 cities in each country that you could visit? Based on what criteria?  

### Task 4.1
4.1 Is there any poet for which the birth place and the place of burial are located in the same city either in Germany or France?

#### Comment

Let's try to find the city to which a place belongs, taking Montparnasse Cemetery (wd:Q272208) as an example. We know that it is in Paris (wd:Q90), but is there any link between the two nodes?

In [21]:
#From Montparnasse Cemetery to Paris
queryString = """
SELECT ?oname ?p 
WHERE { 

wd:Q272208 ?p wd:Q90 .

?p <http://schema.org/name> ?pname .
#OPTIONAL {?obj <http://schema.org/name> ?oname }.
} 
"""

print("Results")
run_query(queryString)

Results
Empty


0

In [22]:
#From Paris to Montparnasse Cemetery
queryString = """
SELECT ?oname ?p 
WHERE { 

wd:Q90 ?p wd:Q272208 .

?p <http://schema.org/name> ?pname .
#OPTIONAL {?obj <http://schema.org/name> ?oname }.
} 
"""

print("Results")
run_query(queryString)

Results
Empty


0

#### Comment 
There are no direct links between the two nodes, so let's look for other connections that can link a place to its city:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P131`     | located in the administrative territorial entity       | predicate |
| `wd:Q187153`     | 14th arrondissement of Paris        | node |

Looking at the "14th arrondissement of Paris" node, I suppose that it has some relation with the "Paris" (wd:Q90) node.

In [24]:
#Let's check the outgoing edges of "14th arrondissement of Paris"
queryString = """
SELECT ?pname ?oname ?p 
WHERE { 

wd:Q187153 ?p ?obj .

?p <http://schema.org/name> ?pname .
OPTIONAL {?obj <http://schema.org/name> ?oname }.
} 
"""

print("Results")
run_query(queryString)

Results
[('pname', 'shares border with'), ('oname', '13th arrondissement of Paris'), ('p', 'http://www.wikidata.org/prop/direct/P47')]
[('pname', 'replaces'), ('oname', 'former 12th arrondissement of Paris'), ('p', 'http://www.wikidata.org/prop/direct/P1365')]
[('pname', 'country'), ('oname', 'France'), ('p', 'http://www.wikidata.org/prop/direct/P17')]
[('pname', 'shares border with'), ('oname', '5th arrondissement of Paris'), ('p', 'http://www.wikidata.org/prop/direct/P47')]
[('pname', 'shares border with'), ('oname', '6th arrondissement of Paris'), ('p', 'http://www.wikidata.org/prop/direct/P47')]
[('pname', 'instance of'), ('oname', 'municipal arrondissement'), ('p', 'http://www.wikidata.org/prop/direct/P31')]
[('pname', 'located in the administrative territorial entity'), ('oname', 'Paris'), ('p', 'http://www.wikidata.org/prop/direct/P131')]
[('pname', 'contains administrative territorial entity'), ('oname', 'Montparnasse'), ('p', 'http://www.wikidata.org/prop/direct/P150')]
[('pna

58

#### Comment

We have discovered that the "14th arrondissement of Paris" node is linked to the "Paris" node through the "located in the administrative territorial entity" (wdt:P131) property.  
To determine reliably which city a place belongs to, I search for zero or more occurrences of wdt:P131 between the place-of-burial node and the place-of-birth node. Zero occurrences means that the place of birth and the place of burial are the same node, which can itself be a city.
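
The zero-or-more semantics of `(wdt:P131)*` can be illustrated on a toy graph. The dictionary below copies only the two P131 edges seen in the outputs above and assumes a single parent per node, which real Wikidata does not guarantee; `LOCATED_IN` and `p131_closure` are hypothetical names.

```python
# Toy copy of the wdt:P131 edges seen above (child -> parent).
LOCATED_IN = {
    "Montparnasse Cemetery": "14th arrondissement of Paris",
    "14th arrondissement of Paris": "Paris",
}

def p131_closure(place, located_in=LOCATED_IN):
    """Return the place plus everything reachable via zero or more P131 steps."""
    reached = [place]  # zero steps: the place itself always matches
    # Follow parent links until there is none left or a node repeats (cycle guard).
    while reached[-1] in located_in and located_in[reached[-1]] not in reached:
        reached.append(located_in[reached[-1]])
    return reached

print(p131_closure("Montparnasse Cemetery"))
print("Paris" in p131_closure("Montparnasse Cemetery"))
```

A burial place matches a birth place under this path exactly when the birth place appears in the closure of the burial place, including the zero-step case where both are the same node.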

In [33]:
#Let's check if a place of burial is linked to a place of birth through zero or more "located in the administrative territorial entity" edges
queryString = """
SELECT DISTINCT ?pname ?burial ?birth ?cname
WHERE { 

?per wdt:P106 ?job ; 
    ?pobu ?bu ;
    ?pobi ?bi .
?bu (wdt:P131)* ?bi ;
    wdt:P17 ?c .
?bi wdt:P17 ?c .

FILTER(?job = wd:Q49757).
FILTER(?pobu = wdt:P119 && ?pobi = wdt:P19).
FILTER(?c = wd:Q142 || ?c = wd:Q183).

?per <http://schema.org/name> ?pname .
?bu <http://schema.org/name> ?burial .
?bi <http://schema.org/name> ?birth .
?c <http://schema.org/name> ?cname .
}
ORDER BY ASC (?pname)
"""

print("Results")
run_query(queryString)

Results
[('pname', 'Adolphe Lacuzon'), ('burial', 'Cimetière Saint-Jean de Valenciennes'), ('birth', 'Valenciennes'), ('cname', 'France')]
[('pname', 'Adélaïde Dufrénoy'), ('burial', 'Père Lachaise Cemetery'), ('birth', 'Paris'), ('cname', 'France')]
[('pname', "Adélaïde-Louise d'Eckmühl de Blocqueville"), ('burial', 'Père Lachaise Cemetery'), ('birth', 'Paris'), ('cname', 'France')]
[('pname', 'Agathe-Sophie Sasserno'), ('burial', 'Cimetière du Château'), ('birth', 'Nice'), ('cname', 'France')]
[('pname', 'Alain Jouffroy'), ('burial', 'Père Lachaise Cemetery'), ('birth', 'Paris'), ('cname', 'France')]
[('pname', 'Albert Saisset'), ('burial', 'Cemetery Saint-Martin de Perpignan'), ('birth', 'Perpignan'), ('cname', 'France')]
[('pname', 'Albert Tinchant'), ('burial', 'Père Lachaise Cemetery'), ('birth', 'Paris'), ('cname', 'France')]
[('pname', 'Alexandre Desrousseaux'), ('burial', 'cimetière de l’Est'), ('birth', 'Lille'), ('cname', 'France')]
[('pname', 'Alfred Le Roux'), ('burial', '

179

#### Comment
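In this way I obtain the city in which a place of burial is located.  
The task is answered because a poet is returned only when the city of the place of burial coincides with the place of birth. Because the path can climb past the city level, a birth place recorded as a larger administrative entity (for example a department) would also match.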

In [19]:
#OLD QUERY: an earlier attempt that is narrower than the previous search and does not fully answer the task
#It retrieves the poets whose birth place and burial place are the same node in France or Germany, without looking at the city in which the place is located
queryString = """
SELECT ?wname ?birth ?burial ?oname ?cname
WHERE { 

?per wdt:P106 ?job ;
    ?bi ?obj ;
    ?bu ?obj .
?obj wdt:P17 ?c .

FILTER(?job = wd:Q49757).
FILTER(?bi = wdt:P19 && ?bu = wdt:P119).
FILTER(?c = wd:Q142 || ?c = wd:Q183).

?per <http://schema.org/name> ?wname.
?bi <http://schema.org/name> ?birth .
?bu <http://schema.org/name> ?burial .
?obj <http://schema.org/name> ?oname .
?c <http://schema.org/name> ?cname .
}
ORDER BY ASC (?wname)
"""

print("Results")
run_query(queryString)

Results
[('wname', 'Alphonse Germain-Lacour'), ('birth', 'place of birth'), ('burial', 'place of burial'), ('oname', 'Moulins-sur-Orne'), ('cname', 'France')]
[('wname', 'Boris Vian'), ('birth', 'place of birth'), ('burial', 'place of burial'), ('oname', "Ville-d'Avray"), ('cname', 'France')]
[('wname', 'Charles Bécas'), ('birth', 'place of birth'), ('burial', 'place of burial'), ('oname', 'Ainhoa'), ('cname', 'France')]
[('wname', 'Charles-Louis Philippe'), ('birth', 'place of birth'), ('burial', 'place of burial'), ('oname', 'Cérilly'), ('cname', 'France')]
[('wname', 'Ernest Louis, Grand Duke of Hesse'), ('birth', 'place of birth'), ('burial', 'place of burial'), ('oname', 'Darmstadt'), ('cname', 'Germany')]
[('wname', 'Erwan ar Moal'), ('birth', 'place of birth'), ('burial', 'place of burial'), ('oname', 'Coadout'), ('cname', 'France')]
[('wname', 'Eugène Bizeau'), ('birth', 'place of birth'), ('burial', 'place of burial'), ('oname', 'Véretz'), ('cname', 'France')]
[('wname', 'Gast

22

### Task 4.2

4.2 Which cities host the place of birth of the largest number of poets or writers across the two countries?

In [73]:
# Count the poets and writers born in each city of the two countries; keep only the 50 highest counts to limit the computation
queryString = """
SELECT ?cname ?birth (COUNT(DISTINCT ?pname) AS ?num) WHERE { 

?per wdt:P106 ?job ; 
    ?pobi ?bi .
?bi wdt:P17 ?c .

FILTER(?job = wd:Q36180 || ?job = wd:Q49757).
FILTER(?pobi = wdt:P19).
FILTER(?c = wd:Q142 || ?c = wd:Q183).

?per <http://schema.org/name> ?pname .
#?bu <http://schema.org/name> ?burial .
?bi <http://schema.org/name> ?birth .
?c <http://schema.org/name> ?cname .
}
GROUP BY ?cname ?birth
ORDER BY DESC (?num)
LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('cname', 'France'), ('birth', 'Paris'), ('num', '2479')]
[('cname', 'Germany'), ('birth', 'Berlin'), ('num', '1904')]
[('cname', 'Germany'), ('birth', 'Hamburg'), ('num', '845')]
[('cname', 'Germany'), ('birth', 'Munich'), ('num', '770')]
[('cname', 'Germany'), ('birth', 'Leipzig'), ('num', '458')]
[('cname', 'Germany'), ('birth', 'Cologne'), ('num', '434')]
[('cname', 'Germany'), ('birth', 'Frankfurt am Main'), ('num', '420')]
[('cname', 'Germany'), ('birth', 'Stuttgart'), ('num', '372')]
[('cname', 'Germany'), ('birth', 'Dresden'), ('num', '365')]
[('cname', 'Germany'), ('birth', 'Hanover'), ('num', '310')]
[('cname', 'France'), ('birth', 'Lyon'), ('num', '255')]
[('cname', 'France'), ('birth', 'Marseille'), ('num', '255')]
[('cname', 'Germany'), ('birth', 'Düsseldorf'), ('num', '253')]
[('cname', 'Germany'), ('birth', 'Nuremberg'), ('num', '231')]
[('cname', 'France'), ('birth', 'Strasbourg'), ('num', '184')]
[('cname', 'Germany'), ('birth', 'Bonn'), ('num', '177')]
[('cna

50

### Task 4.3

4.3 What are the top 3 cities in each country that you could visit? Based on what criteria?

In [50]:
#In Task 2 we retrieved the properties of the "Paris" node; the "list of monuments" property (wdt:P1456) may lead to some useful information
# Let's look at the properties of the list-of-monuments node related to each city
queryString = """
SELECT ?bname ?pname ?oname WHERE { 

?per wdt:P106 ?job ; 
    ?pob ?b .
?b wdt:P17 ?c ;
    wdt:P1456 ?l.
?l ?p ?obj .

FILTER(?job = wd:Q36180 || ?job = wd:Q49757).
FILTER(?pob = wdt:P19).
FILTER(?c = wd:Q142).

?obj <http://schema.org/name> ?oname .
?b <http://schema.org/name> ?bname .
?p <http://schema.org/name> ?pname .
}
LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('bname', 'Payroux'), ('pname', 'instance of'), ('oname', 'Wikimedia list article')]
[('bname', 'Distré'), ('pname', 'country'), ('oname', 'France')]
[('bname', 'Distré'), ('pname', 'is a list of'), ('oname', 'Historical Monument')]
[('bname', 'Clermont'), ('pname', 'country'), ('oname', 'France')]
[('bname', 'Gruey-lès-Surance'), ('pname', 'is a list of'), ('oname', 'Historical Monument')]
[('bname', 'Goudelin'), ('pname', 'instance of'), ('oname', 'Wikimedia list article')]
[('bname', 'Le Coudray-Macouard'), ('pname', 'instance of'), ('oname', 'Wikimedia list article')]
[('bname', 'Saint-Florent-sur-Cher'), ('pname', 'instance of'), ('oname', 'Wikimedia list article')]
[('bname', 'Goudelin'), ('pname', 'country'), ('oname', 'France')]
[('bname', 'Gondrecourt-le-Château'), ('pname', 'country'), ('oname', 'France')]
[('bname', 'Thiers'), ('pname', 'instance of'), ('oname', 'Wikimedia list article')]
[('bname', 'Louannec'), ('pname', 'country'), ('oname', 'France')]
[('bname', 

50

#### Comment
No interesting properties found.
Let's try the "owner of" property (wdt:P1830); it may reveal something interesting.

In [52]:
#Let's see the nodes connected to a city through the "owner of" property 
queryString = """
SELECT DISTINCT ?cname ?bname ?oname WHERE { 

?per wdt:P106 ?job ; 
    ?pob ?b .
?b wdt:P17 ?c ;
    wdt:P1830 ?o .

FILTER(?job = wd:Q36180 || ?job = wd:Q49757).
FILTER(?pob = wdt:P19).
FILTER(?c = wd:Q142).

?per <http://schema.org/name> ?pname .
?b <http://schema.org/name> ?bname .
?c <http://schema.org/name> ?cname .
?o <http://schema.org/name> ?oname .
}
#GROUP BY ?cname ?bname
#ORDER BY DESC (?num)
LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('cname', 'France'), ('bname', 'Le Tampon'), ('oname', 'Stade Klébert Picard')]
[('cname', 'France'), ('bname', 'Biarritz'), ('oname', 'Parc des Sports Aguiléra')]
[('cname', 'France'), ('bname', 'Ploemeur'), ('oname', 'Château de Soye')]
[('cname', 'France'), ('bname', 'Yzeure'), ('oname', 'Stade de Bellevue')]
[('cname', 'France'), ('bname', 'La Rochelle'), ('oname', 'Hôtel de Ville de La Rochelle')]
[('cname', 'France'), ('bname', 'La Rochelle'), ('oname', 'Stade Marcel-Deflandre')]
[('cname', 'France'), ('bname', 'Antibes'), ('oname', 'Jean Bunoz Sports Hall')]
[('cname', 'France'), ('bname', 'Antibes'), ('oname', 'Château Grimaldi (Antibes)')]
[('cname', 'France'), ('bname', 'Reims'), ('oname', 'Église Saint-André (Reims)')]
[('cname', 'France'), ('bname', 'Aubervilliers'), ('oname', 'Church of Notre-Dame-des-Vertus, Aubervilliers')]
[('cname', 'France'), ('bname', 'Salins-les-Bains'), ('oname', 'Fort Saint-André')]
[('cname', 'France'), ('bname', 'Gaillac'), ('oname', "C

50

#### Comment
The results look like a list of monuments and other places or things to see in each city.  
So the number of destination nodes of the "owner of" property could be a fair measure of how interesting a city is.

In [54]:
#Let's try to print what Paris owns
queryString = """
SELECT ?oname WHERE { 

wd:Q90 wdt:P1830 ?o .

?o <http://schema.org/name> ?oname .
}
LIMIT 400
"""

print("Results")
run_query(queryString)

Results
[('oname', 'Arc de Triomphe')]
[('oname', 'La Sainte Catherine')]
[('oname', "Pavillon d'Armenonville")]
[('oname', 'Eiffel Tower')]
[('oname', 'Hôtel de Chalon-Luxembourg')]
[('oname', 'Belleville funicular tramway')]
[('oname', 'hôtel Dodun')]
[('oname', 'La Famille')]
[('oname', 'Parc des Princes')]
[('oname', 'fontaine de Jarente')]
[('oname', 'pyramid of bois de Vincennes')]
[('oname', 'Stade Jean-Bouin')]
[('oname', 'La Fraternité')]
[('oname', 'Accor Arena')]
[('oname', 'Jacob Wrestling with the Angel')]
[('oname', 'Alexander II')]
[('oname', 'Paris Bourse')]
[('oname', 'The Sleepers')]
[('oname', 'Grand Synagogue of Paris')]


19

In [82]:
#Let's count the nodes connected to a French city through "owner of" property
queryString = """
SELECT ?cname ?bname (COUNT(DISTINCT ?oname) AS ?num) WHERE { 

?per wdt:P106 ?job ; 
    ?pob ?b .
?b wdt:P17 ?c ;
    wdt:P1830 ?o .

FILTER(?job = wd:Q36180 || ?job = wd:Q49757).
FILTER(?pob = wdt:P19).
FILTER(?c = wd:Q142).

?per <http://schema.org/name> ?pname .
?b <http://schema.org/name> ?bname .
?c <http://schema.org/name> ?cname .
?o <http://schema.org/name> ?oname .
}
GROUP BY ?cname ?bname
ORDER BY DESC (?num)
LIMIT 6
"""

print("Results")
run_query(queryString)

Results
[('cname', 'France'), ('bname', 'Paris'), ('num', '19')]
[('cname', 'France'), ('bname', 'Bordeaux'), ('num', '5')]
[('cname', 'France'), ('bname', 'Toulouse'), ('num', '5')]
[('cname', 'France'), ('bname', 'Le Mans'), ('num', '4')]
[('cname', 'France'), ('bname', 'Sens'), ('num', '4')]
[('cname', 'France'), ('bname', 'Rouen'), ('num', '4')]


6

In [81]:
#Let's list the first six French cities for number of poets or writers born in it   
queryString = """
SELECT ?cname ?bname (COUNT(DISTINCT ?pname) AS ?num) WHERE { 

?per wdt:P106 ?job ; 
    ?pob ?b .
?b wdt:P17 ?c .

FILTER(?job = wd:Q36180 || ?job = wd:Q49757).
FILTER(?pob = wdt:P19).
FILTER(?c = wd:Q142).

?per <http://schema.org/name> ?pname .
?b <http://schema.org/name> ?bname .
?c <http://schema.org/name> ?cname .
}
GROUP BY ?cname ?bname
ORDER BY DESC (?num)
LIMIT 6
"""

print("Results")
run_query(queryString)

Results
[('cname', 'France'), ('bname', 'Paris'), ('num', '2479')]
[('cname', 'France'), ('bname', 'Lyon'), ('num', '255')]
[('cname', 'France'), ('bname', 'Marseille'), ('num', '255')]
[('cname', 'France'), ('bname', 'Strasbourg'), ('num', '184')]
[('cname', 'France'), ('bname', 'Bordeaux'), ('num', '175')]
[('cname', 'France'), ('bname', 'Toulouse'), ('num', '135')]


6

In [80]:
#Let's count the nodes connected to a German city through "owner of" property
queryString = """
SELECT ?cname ?bname (COUNT(DISTINCT ?oname) AS ?num) WHERE { 

?per wdt:P106 ?job ; 
    ?pob ?b .
?b wdt:P17 ?c ;
    wdt:P1830 ?o .

FILTER(?job = wd:Q36180 || ?job = wd:Q49757).
FILTER(?pob = wdt:P19).
FILTER(?c = wd:Q183).

?per <http://schema.org/name> ?pname .
?b <http://schema.org/name> ?bname .
?c <http://schema.org/name> ?cname .
?o <http://schema.org/name> ?oname .
}
GROUP BY ?cname ?bname
ORDER BY DESC (?num)
LIMIT 6
"""

print("Results")
run_query(queryString)

Results
[('cname', 'Germany'), ('bname', 'Berlin'), ('num', '19')]
[('cname', 'Germany'), ('bname', 'Witten'), ('num', '12')]
[('cname', 'Germany'), ('bname', 'Frankfurt am Main'), ('num', '8')]
[('cname', 'Germany'), ('bname', 'Bad Honnef'), ('num', '5')]
[('cname', 'Germany'), ('bname', 'Wolfsburg'), ('num', '4')]
[('cname', 'Germany'), ('bname', 'Essen'), ('num', '4')]


6

In [79]:
#Let's list the first six German cities for number of poets or writers born in it   
queryString = """
SELECT ?cname ?bname (COUNT(DISTINCT ?pname) AS ?num) WHERE { 

?per wdt:P106 ?job ; 
    ?pob ?b .
?b wdt:P17 ?c .

FILTER(?job = wd:Q36180 || ?job = wd:Q49757).
FILTER(?pob = wdt:P19).
FILTER(?c = wd:Q183).

?per <http://schema.org/name> ?pname .
?b <http://schema.org/name> ?bname .
?c <http://schema.org/name> ?cname .
}
GROUP BY ?cname ?bname
ORDER BY DESC (?num)
LIMIT 6
"""

print("Results")
run_query(queryString)

Results
[('cname', 'Germany'), ('bname', 'Berlin'), ('num', '1904')]
[('cname', 'Germany'), ('bname', 'Hamburg'), ('num', '845')]
[('cname', 'Germany'), ('bname', 'Munich'), ('num', '770')]
[('cname', 'Germany'), ('bname', 'Leipzig'), ('num', '458')]
[('cname', 'Germany'), ('bname', 'Cologne'), ('num', '434')]
[('cname', 'Germany'), ('bname', 'Frankfurt am Main'), ('num', '420')]


6

#### Comment
Supposing that I visit a city mainly because I'm interested in poets and writers, I'd like to choose a city where I can see places related to the lives of poets and writers, but that also offers some places and things of general interest.

For this purpose, the "owner of" queries give the cities with generally interesting places and things to see, while the other two queries tell how many poets and writers were born in each city, where I suppose I can find things to do or places to see related to them.

As a result, merging the outputs of the two types of queries, my suggestions for France are Paris, Bordeaux, and Toulouse; for Germany I suggest Berlin, Frankfurt am Main, and one of Hamburg and Munich.

I'm pretty sure that Paris has more than 19 places or things to visit, and the same holds for all the other cities considered. I restricted the estimate to what the "owner of" property retrieves, also because inspecting all the incoming edges of a city could have been too expensive in time and computation. For this reason this is only a "coarse-grained" estimation.
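
One possible way to make the merging step explicit is sketched below. It is an assumed reading of the criteria above, not a rule stated in the notebook: cities that appear in both top-6 lists come first, in order of births, and the remaining slots are filled by the cities with the most births. The lists are copied from the recorded outputs; `top_cities` is a hypothetical helper.

```python
# Top-6 lists copied from the outputs above (list order = rank).
births = {
    "France": ["Paris", "Lyon", "Marseille", "Strasbourg", "Bordeaux", "Toulouse"],
    "Germany": ["Berlin", "Hamburg", "Munich", "Leipzig", "Cologne", "Frankfurt am Main"],
}
owner_of = {
    "France": ["Paris", "Bordeaux", "Toulouse", "Le Mans", "Sens", "Rouen"],
    "Germany": ["Berlin", "Witten", "Frankfurt am Main", "Bad Honnef", "Wolfsburg", "Essen"],
}

def top_cities(by_births, by_owned, k=3):
    """Prefer cities present in both rankings, then fill up by number of births."""
    in_both = [c for c in by_births if c in by_owned]
    births_only = [c for c in by_births if c not in by_owned]
    return (in_both + births_only)[:k]

suggestions = {country: top_cities(births[country], owner_of[country]) for country in births}
print(suggestions)
```

With this rule the tie between Hamburg and Munich is resolved in favour of Hamburg, simply because it has more recorded births.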