# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain knowledge graph (KG): [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload usually starts with a vague, open-ended question and does not assume that the person issuing it has a clear understanding of the data in the target database or of its structure.

Remember that:

1. All queries must run in the Python notebook.
2. You may use a class or property only if you found it through a SPARQL query that is present in the notebook.
3. Do not delete useless queries; keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.
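
As an illustration, the pattern can be wrapped in a small helper that builds a complete query for one known IRI. This is only a sketch: the function name `name_lookup_query` is hypothetical, and the `sc:` and `wdt:` prefixes are assumed to be supplied by the `prefixString` of the setup cell.

```python
def name_lookup_query(prefixed_iri):
    """Build a SELECT query returning the sc:name of one property or class.

    Hypothetical helper; the prefixes are assumed to be added separately.
    """
    return f"""
SELECT ?name
WHERE {{
    {prefixed_iri} sc:name ?name .
}}
"""

# wdt:P1366 is one of the predicates given for the workflow
queryString = name_lookup_query("wdt:P1366")
print(queryString)
```

In the notebook the resulting string would be passed to `run_query`.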
    
    

In [45]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-77c320cba1-## 
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try:
        results = sparql.query()
        json_results = results.convert()
        if len(json_results['results']['bindings']) == 0:
            print("Empty")
            return 0

        for bindings in json_results['results']['bindings']:
            print([(var, value['value']) for var, value in bindings.items()])

        return len(json_results['results']['bindings'])

    except Exception as e:
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
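
`run_query` prints each binding and returns only the number of rows. When the rows themselves are needed for further processing, the bindings can be flattened into plain dictionaries. The sketch below is not part of the given setup: `bindings_to_rows` is a hypothetical helper, and the sample dictionary is hand-written in the SPARQL JSON results layout that `run_query` already relies on.

```python
def bindings_to_rows(json_results):
    """Flatten a SPARQL JSON result into a list of {variable: value} dicts."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in json_results["results"]["bindings"]
    ]

# Hand-written sample in the SPARQL JSON results layout (not fetched from the endpoint)
sample = {
    "head": {"vars": ["p", "pname"]},
    "results": {
        "bindings": [
            {
                "p": {"type": "uri", "value": "http://www.wikidata.org/prop/direct/P571"},
                "pname": {"type": "literal", "value": "inception"},
            }
        ]
    },
}

rows = bindings_to_rows(sample)
print(rows)
```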


# Politics Workflow Series ("Monarchies") 

Consider the following exploratory information need:

> You are investigating existing and historical monarchies.

## Useful URIs for the current workflow


The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P279`    | subclass      | predicate |
| `wdt:P17`     | country       | predicate |
| `wdt:P27`     | citizenship   | predicate |
| `wdt:P1366`   | replaced by    | predicate |
| `wdt:P122`    | basic form of government    | predicate |
| `wd:Q38`      | Italy          | node |
| `wd:Q145`     | United Kingdom           | node |
| `wd:Q7269`    | monarchy           | node |




Also consider

```
?p (wdt:P31|wdt:P122)/wdt:P279* wd:Q7269  . 
```

is the BGP to retrieve all **countries that are instances of some type of monarchy**.
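
To see what the property path matches, the following sketch evaluates the same pattern over a tiny in-memory list of triples: one step along `wdt:P31` or `wdt:P122`, followed by zero or more `wdt:P279` steps up to `wd:Q7269`. The `ex:` identifiers and the function names are made up for illustration; only the predicates and `wd:Q7269` come from the table above.

```python
def reaches_target(node, subclass_of, target):
    """True if target is reachable from node via zero or more subclass steps."""
    stack, seen = [node], set()
    while stack:
        current = stack.pop()
        if current == target:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(subclass_of.get(current, ()))
    return False


def monarchies(triples, target="wd:Q7269"):
    """Subjects matching (wdt:P31|wdt:P122)/wdt:P279* target."""
    subclass_of = {}
    for s, p, o in triples:
        if p == "wdt:P279":
            subclass_of.setdefault(s, set()).add(o)
    return sorted({
        s for s, p, o in triples
        if p in ("wdt:P31", "wdt:P122") and reaches_target(o, subclass_of, target)
    })


# Made-up toy data; the ex: names are placeholders, not Wikidata entities
triples = [
    ("ex:KingdomX", "wdt:P31", "ex:kingdom"),
    ("ex:kingdom", "wdt:P279", "wd:Q7269"),
    ("ex:StateY", "wdt:P122", "ex:constitutional_monarchy"),
    ("ex:constitutional_monarchy", "wdt:P279", "ex:limited_monarchy"),
    ("ex:limited_monarchy", "wdt:P279", "wd:Q7269"),
    ("ex:DirectW", "wdt:P31", "wd:Q7269"),
    ("ex:RepublicZ", "wdt:P31", "ex:republic"),
]

result = monarchies(triples)
print(result)
```

`ex:DirectW` matches through the zero-length case of `wdt:P279*`, while `ex:RepublicZ` is left out because its class never reaches `wd:Q7269`.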

## Workload Goals


1. Identify the BGP to retrieve states that were monarchies and have been replaced by some other state

2. Identify the BGP for obtaining statistical information about monarchies, e.g., population or inception date

3. Which is or was the country with the highest population?

4. How many countries are or have been monarchies across continents?

5. Analyze the number of countries that were monarchies
 
   5.1 How many sub-types of monarchy exist? How many instances exist of each one?
   
   5.2 Are there cities that have been capital of multiple kingdoms?
   
   5.3 Which are the top-3 monarchies with the earliest inception that still exist?


In [1]:
# start your workflow here

In [15]:
queryString = """
SELECT COUNT(*)
WHERE { 

?p (wdt:P31|wdt:P122)/wdt:P279* wd:Q7269  . 
} 
"""

print("Results")
run_query(queryString)

Results
[('callret-0', '2344')]


1
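
The endpoint accepts a bare `COUNT(*)` and labels the column `callret-0`. Standard SPARQL 1.1 expects the aggregate to be given an alias, and because the path combines two alternatives (`wdt:P31|wdt:P122`), the same `?p` may be matched more than once. A hedged variant that names the column and counts each entity once could look like this. The variable name `?monarchies` is an assumption, and the result may differ from the count above.

```python
# Sketch only: same BGP as above, with an aliased aggregate and DISTINCT
queryString = """
SELECT (COUNT(DISTINCT ?p) AS ?monarchies)
WHERE {
    ?p (wdt:P31|wdt:P122)/wdt:P279* wd:Q7269 .
}
"""

print(queryString)
```

In the notebook the string would be executed with `run_query(queryString)`.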

In [None]:
1. Identify the BGP to retrieve states that were monarchies and have been replaced by some other state

In [167]:
# list the non-literal properties of monarchies; of interest: replaced by -> P1366, continent -> P30, capital -> P36
queryString = """
SELECT distinct ?p ?pname
WHERE { 

?country (wdt:P31|wdt:P122)/wdt:P279* wd:Q7269  ; 
        ?p ?o.filter(!isliteral(?o))
        
?p sc:name ?pname.


} 
limit 5
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1080'), ('pname', 'from narrative universe')]
[('p', 'http://www.wikidata.org/prop/direct/P112'), ('pname', 'founded by')]
[('p', 'http://www.wikidata.org/prop/direct/P1151'), ('pname', "topic's main Wikimedia portal")]
[('p', 'http://www.wikidata.org/prop/direct/P122'), ('pname', 'basic form of government')]
[('p', 'http://www.wikidata.org/prop/direct/P1269'), ('pname', 'facet of')]


5

In [133]:
# get all monarchies that were replaced by another state
queryString = """
SELECT distinct ?countryname ?replacebyname
WHERE { 

?country (wdt:P31|wdt:P122)/wdt:P279* wd:Q7269  ; 
         wdt:P1366 ?replaceby.
        
?country sc:name ?countryname.     
?replaceby sc:name ?replacebyname.


} 
limit 5
"""

print("Results")
run_query(queryString)

Results
[('countryname', 'Jodhpur State'), ('replacebyname', 'India')]
[('countryname', 'Eastern Hungarian Kingdom'), ('replacebyname', 'Principality of Transylvania')]
[('countryname', 'East Francia'), ('replacebyname', 'Holy Roman Empire')]
[('countryname', 'Kingdom of Hawaiʻi'), ('replacebyname', 'Republic of Hawaii')]
[('countryname', 'Duchy of Luxembourg'), ('replacebyname', 'French First Republic')]


5

In [None]:
2. Identify the BGP for obtaining statistical information about monarchies, e.g., population or inception date

In [134]:
# find statistical properties (population -> P1082, inception date -> P571, religion -> P140, time period -> P2348)
queryString = """
SELECT distinct ?p ?pname
WHERE { 

?country (wdt:P31|wdt:P122)/wdt:P279* wd:Q7269  ; 
        ?p ?o.
        
?p sc:name ?pname.       
filter regex(?pname,"inception|population",'i')        



} 
#limit 5
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1082'), ('pname', 'population')]
[('p', 'http://www.wikidata.org/prop/direct/P571'), ('pname', 'inception')]


2

In [135]:
# retrieve population, religion and (optionally) time period of monarchies that have an inception date
# (population -> P1082, inception date -> P571, religion -> P140, time period -> P2348)
queryString = """
SELECT distinct ?countryname ?poluation ?periodname ?religionname
WHERE { 

    ?country (wdt:P31|wdt:P122)/wdt:P279* wd:Q7269 ;
               wdt:P1082 ?poluation;
               wdt:P571  ?inception ;
                wdt:P140 ?religion.
                
     OPTIONAL{  ?country wdt:P2348 ?period.?period sc:name ?periodname.
     }
                

    ?country sc:name ?countryname.
    ?religion sc:name ?religionname.  
     
} 
limit 5
"""

print("Results")
run_query(queryString)

Results
[('countryname', 'Goryeo'), ('poluation', '2100000'), ('periodname', 'Goryeo period'), ('religionname', 'Korean Buddhism')]
[('countryname', 'Safavid Empire'), ('poluation', '20000000'), ('periodname', 'early modern Iran'), ('religionname', 'Twelvers')]
[('countryname', 'Safavid Empire'), ('poluation', '20000000'), ('periodname', 'early modern Iran'), ('religionname', 'Islam')]
[('countryname', 'Roman Catholic Diocese of Bolzano-Brixen'), ('poluation', '525092'), ('religionname', 'Catholicism')]
[('countryname', 'Kingdom of Dalmatia'), ('poluation', '645666'), ('religionname', 'Catholic Church')]


5

In [None]:
3. Which is or was the country with the highest population?

In [140]:
# find all monarchies and their recorded populations,
# then keep the largest recorded population of each one and rank by it
queryString = """
SELECT DISTINCT  ?countryname  (max(?poluation) as ?lastpoluation )
WHERE { 

    ?country (wdt:P31|wdt:P122)/wdt:P279* wd:Q7269 ;
               wdt:P1082 ?poluation.

    ?country sc:name ?countryname.

} 

order by desc(?lastpoluation)
limit 5
"""

print("Results")
run_query(queryString)

Results
[('countryname', 'British Empire'), ('lastpoluation', '680000000')]
[('countryname', 'Qing dynasty'), ('lastpoluation', '432000000')]
[('countryname', 'Russian Empire'), ('lastpoluation', '181537800')]
[('countryname', 'Mongol Empire'), ('lastpoluation', '160000000')]
[('countryname', 'Mughal Empire'), ('lastpoluation', '150000000')]


5

In [None]:
4. How many countries are or have been monarchies across continents?

In [145]:
# count the monarchies on each continent (continent -> P30)
queryString = """
SELECT DISTINCT  ?continentname  (count(?countryname) as ?howmanycountries )
WHERE { 

    ?country (wdt:P31|wdt:P122)/wdt:P279* wd:Q7269 ;
               wdt:P30 ?continent.

    ?continent sc:name ?continentname.
    ?country sc:name ?countryname.

} 

order by desc(?howmanycountries)

"""

print("Results")
run_query(queryString)

Results
[('continentname', 'Europe'), ('howmanycountries', '478')]
[('continentname', 'Asia'), ('howmanycountries', '207')]
[('continentname', 'Africa'), ('howmanycountries', '72')]
[('continentname', 'North America'), ('howmanycountries', '41')]
[('continentname', 'Insular Oceania'), ('howmanycountries', '20')]
[('continentname', 'Eurasia'), ('howmanycountries', '16')]
[('continentname', 'Hyboria'), ('howmanycountries', '13')]
[('continentname', 'South America'), ('howmanycountries', '8')]
[('continentname', 'The Continent'), ('howmanycountries', '7')]
[('continentname', 'Americas'), ('howmanycountries', '7')]
[('continentname', 'Thuria'), ('howmanycountries', '6')]
[('continentname', 'Central America'), ('howmanycountries', '2')]
[('continentname', 'Australia'), ('howmanycountries', '2')]
[('continentname', 'Northern America and Mexico'), ('howmanycountries', '1')]
[('continentname', 'Antarctica'), ('howmanycountries', '1')]
[('continentname', 'Caribbean'), ('howmanycountries', '1')]

16
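
The list above mixes real continents with places such as Hyboria and Thuria, which suggests that fictional monarchies are included. One possible refinement, sketched below, drops entities that carry `wdt:P1080` ("from narrative universe", found by the property exploration for goal 1) and counts each country once per continent. Whether every fictional entry actually carries that property is an assumption, so the counts may still include some. The query has not been run against the endpoint, so no output is shown.

```python
# Sketch only: excludes entities with wdt:P1080 and groups explicitly
queryString = """
SELECT ?continentname (COUNT(DISTINCT ?country) AS ?howmanycountries)
WHERE {
    ?country (wdt:P31|wdt:P122)/wdt:P279* wd:Q7269 ;
             wdt:P30 ?continent .
    ?continent sc:name ?continentname .
    FILTER NOT EXISTS { ?country wdt:P1080 ?universe . }
}
GROUP BY ?continentname
ORDER BY DESC(?howmanycountries)
"""

print(queryString)
```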

In [None]:
5.1 >>How many sub-types of monarchy exist?<< How many instances exist of each one?

In [158]:
# list the classes that are direct subclasses of monarchy
queryString = """
SELECT DISTINCT  ?typename  
WHERE { 

    ?type wdt:P279 wd:Q7269 .


    ?type sc:name ?typename.

} 
limit 5


"""

print("Results")
run_query(queryString)

Results
[('typename', 'patrimonialism')]
[('typename', 'feudal monarchy')]
[('typename', 'despotism')]
[('typename', 'elective monarchy')]
[('typename', 'hereditary monarchy')]


5

In [159]:
# count the direct sub-types of monarchy
queryString = """
SELECT DISTINCT  Count(*)  
WHERE { 

    ?type wdt:P279 wd:Q7269 .


    ?type sc:name ?typename.

} 
"""

print("Results")
run_query(queryString)

Results
[('callret-0', '26')]


1

In [None]:
5.1 How many sub-types of monarchy exist? >>How many instances exist of each one?<<

In [163]:
# number of instances in each type of monarchy
queryString = """
SELECT DISTINCT  ?typename  (count(?countryname) as ?numberOfInstance)
WHERE { 

    ?type wdt:P279 wd:Q7269 .
    ?country (wdt:P31|wdt:P122)/wdt:P279* ?type .

   
    ?country sc:name ?countryname.
    ?type sc:name ?typename.

} 

order by desc(?numberOfInstance)
limit 5
"""

print("Results")
run_query(queryString)

Results
[('typename', 'principality'), ('numberOfInstance', '396')]
[('typename', 'kingdom'), ('numberOfInstance', '231')]
[('typename', 'constitutional monarchy'), ('numberOfInstance', '213')]
[('typename', 'duchy'), ('numberOfInstance', '188')]
[('typename', 'absolute monarchy'), ('numberOfInstance', '170')]


5

In [None]:
5.2 Are there cities that have been capital of multiple kingdoms?

In [184]:
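# For each capital (capital -> P36), count the monarchies that had it as their capital.
# Note: the ORDER BY sorts on the boolean (?numberOfCountry > 1), so capitals shared by
# several monarchies come first, but capitals of a single monarchy are not filtered out.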
queryString = """
SELECT DISTINCT  ?capitalname  (count(?countryname) AS ?numberOfCountry)
WHERE { 

    ?country (wdt:P31|wdt:P122)/wdt:P279* wd:Q7269 ;
                wdt:P36 ?capital .

    ?capital sc:name ?capitalname.
    ?country sc:name ?countryname.

} 

order by desc(?numberOfCountry >1)
limit 5
"""

print("Results")
run_query(queryString)

Results
[('capitalname', 'Coburg'), ('numberOfCountry', '4')]
[('capitalname', 'Nassau'), ('numberOfCountry', '2')]
[('capitalname', 'Kingstown'), ('numberOfCountry', '2')]
[('capitalname', 'Hilo'), ('numberOfCountry', '2')]
[('capitalname', 'Pataliputra'), ('numberOfCountry', '2')]


5
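
The query above orders by the boolean expression `?numberOfCountry > 1`, which puts shared capitals first but neither removes capitals of a single monarchy nor ranks the rows by count. A sketch that filters with `HAVING` and sorts by the count itself is given below. It assumes the endpoint supports standard SPARQL 1.1 grouping and has not been run here, so no output is shown.

```python
# Sketch only: keep capitals shared by more than one monarchy, ranked by count
queryString = """
SELECT ?capitalname (COUNT(DISTINCT ?country) AS ?numberOfCountry)
WHERE {
    ?country (wdt:P31|wdt:P122)/wdt:P279* wd:Q7269 ;
             wdt:P36 ?capital .
    ?capital sc:name ?capitalname .
}
GROUP BY ?capital ?capitalname
HAVING (COUNT(DISTINCT ?country) > 1)
ORDER BY DESC(?numberOfCountry)
LIMIT 5
"""

print(queryString)
```

Grouping on `?capital` as well as `?capitalname` keeps two different cities that share a name apart.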

In [None]:
5.3 Which are the top-3 monarchies with the earliest inception that still exist?

In [240]:
# find all monarchies that have not been replaced by another state and get their inception date
queryString = """
SELECT distinct ?countryname (?inception)
WHERE { 

    ?country (wdt:P31|wdt:P122)/wdt:P279* wd:Q7269 ;
               wdt:P571  ?inception .
               
    filter not exists {?country (wdt:P31|wdt:P122)/wdt:P279* wd:Q7269 ;
                                                wdt:P1366 ?replaceby.}
          
   FILTER ( datatype(?inception) = xsd:dateTime) 
   FILTER ( coalesce(xsd:datetime(str(?inception)), '!') != '!')
    ?country sc:name ?countryname.
} 
order by  (  xsd:dateTime(?inception) )
#limit 30
"""

print("Results")
run_query(queryString)

Results
[('countryname', 'Kushan Empire'), ('inception', '0030-01-01T00:00:00Z')]
[('countryname', 'Kingdom of Funan'), ('inception', '0068-01-01T00:00:00Z')]
[('countryname', 'Chen princedom'), ('inception', '0088-01-01T00:00:00Z')]
[('countryname', 'Champa'), ('inception', '0192-01-01T00:00:00Z')]
[('countryname', 'Kingdom of Araba'), ('inception', '0200-01-01T00:00:00Z')]
[('countryname', 'Shule Kingdom'), ('inception', '0200-01-01T00:00:00Z')]
[('countryname', 'Roman Catholic Diocese of Metz'), ('inception', '0201-01-01T00:00:00Z')]
[('countryname', 'Ghassanids'), ('inception', '0220-01-01T00:00:00Z')]
[('countryname', 'Sasanian Empire'), ('inception', '0226-01-01T00:00:00Z')]
[('countryname', 'Afrighids'), ('inception', '0305-01-01T00:00:00Z')]
[('countryname', 'Rouran Khaganate'), ('inception', '0330-01-01T00:00:00Z')]
[('countryname', 'Airgíalla'), ('inception', '0331-01-01T00:00:00Z')]
[('countryname', 'Kamarupa'), ('inception', '0350-01-01T00:00:00Z')]
[('countryname', 'Kingdo

802
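
The first rows above are historical states such as the Kushan Empire, so excluding replaced monarchies is not enough to keep only those that still exist. Following the same discovery approach used for goal 2, a next step could be to look for a property that records when a state ended. The regular expression below is a guess at how such a property might be named, not a known label, and the query has not been run here.

```python
# Sketch only: search monarchy properties whose name hints at an end date
queryString = """
SELECT DISTINCT ?p ?pname
WHERE {
    ?country (wdt:P31|wdt:P122)/wdt:P279* wd:Q7269 ;
             ?p ?o .
    ?p sc:name ?pname .
    FILTER regex(?pname, "dissol|abolish|end", "i")
}
"""

print(queryString)
```

If a suitable property is found, it can be added in a `FILTER NOT EXISTS` clause next to the `wdt:P1366` check, and a `LIMIT 3` on the ordered result then gives the top-3.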