# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload usually starts with a vague, open-ended question, and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that is present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.
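As a rough illustration of rule 2, the name BGP can be combined with a generic triple pattern to discover which properties a node uses, instead of assuming them. The sketch below only builds the query text; `build_property_discovery_query` is a hypothetical helper, not part of the notebook, and it assumes that the `wd:` prefix is declared by whatever code runs the query.

```python
def build_property_discovery_query(entity, limit=25):
    """Return a SPARQL query listing the properties used by `entity`.

    Hypothetical helper (not part of the notebook setup). It assumes the
    `wd:` prefix is declared elsewhere, e.g. by the notebook's prefix string.
    """
    return f"""
SELECT DISTINCT ?p ?name
WHERE {{
    {entity} ?p ?o .
    ?p <http://schema.org/name> ?name .
}}
LIMIT {limit}
"""


# wd:Q217305 is The New Yorker, one of the nodes given for this workflow.
queryString = build_property_discovery_query("wd:Q217305")
print(queryString)
```

The resulting string can then be passed to the notebook's query runner once the setup cell has been executed.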
    
    

In [1]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-NOTEBOOK_CODE_HERE-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try:
        results = sparql.query()
        json_results = results.convert()
        bindings_list = json_results['results']['bindings']
        if len(bindings_list) == 0:
            print("Empty")
            return 0

        # print one list of (variable, value) pairs per result row
        for bindings in bindings_list:
            print([(var, value['value']) for var, value in bindings.items()])

        # the return value is the number of result rows
        return len(bindings_list)

    except Exception as e:
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-03.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
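Unlike `run_query`, `run_ask_query` prints nothing on success: it returns the converted JSON response. In the SPARQL 1.1 JSON results format, the answer of an ASK query is stored under the `boolean` key. A minimal sketch of how that value could be read is shown below; `ask_result_to_bool` is a hypothetical helper, and the sample response is hand-written rather than taken from the endpoint.

```python
def ask_result_to_bool(result):
    """Extract the truth value from a converted ASK response.

    Hypothetical helper. Returns None when `result` is not a dictionary,
    which is the case when run_ask_query failed and returned None.
    """
    if not isinstance(result, dict):
        return None
    return result.get("boolean")


# Hand-written response in the SPARQL JSON results format (not real output).
sample = {"head": {}, "boolean": True}
print(ask_result_to_bool(sample))
print(ask_result_to_bool(None))
```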

# Book Workflow Series ("Political Magazines explorative search") 

Consider the following exploratory scenario:


>  Investigate the U.S. magazines that write about politics and their media presence



## Useful URIs for the current workflow
The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P921`    | main subject  | predicate | 
| `wdt:P17`     | country       | predicate | 
| `wdt:P279`    | subclass      | predicate |
| `wd:Q41298`   | Magazine      | node |
| `wd:Q7163`    | politics      | node |
| `wd:Q30`      | U.S.          | node |
| `wd:Q217305`  | The New Yorker  | node |







Also consider

```
?p wdt:P17 wd:Q30 .
?p wdt:P31/wdt:P279* wd:Q41298  . 
```

is the BGP to retrieve all **types of publications in the U.S.** that are instances of Magazine or of one of its subclasses.


The workload should


1. Identify the BGP for obtaining the US magazines that write about politics

2. Compare the number of social media followers and get the top five magazines, along with their main properties such as the place of publication and the subfields of work

3. Compare these follower counts with those of the top three US magazines writing mainly about sports

4. Get the names of notable employees working for The New Yorker and for any other political magazine published in the US. Check whether these employees have written any books and, if so, get the titles.

5. Has any employee of The New Yorker ever been nominated for a prize or award?

In [1]:
# start your workflow here

In [4]:
queryString = """
SELECT COUNT(?p)
WHERE { 

?p wdt:P17 wd:Q30 .
?p wdt:P31/wdt:P279* wd:Q41298  . 


} 
"""

print("Results")
run_query(queryString)

Results
[('callret-0', '5422')]


1
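The variable name `callret-0` in the output above appears to be generated by the endpoint because the aggregate has no alias. As a sketch, the same count can be written in the standard SPARQL 1.1 form, which names the result explicitly; the alias `?magazines` is an arbitrary choice made here, and the query has not been run against the endpoint.

```python
# Same BGP as above, with the aggregate aliased as required by SPARQL 1.1.
queryString = """
SELECT (COUNT(?p) AS ?magazines)
WHERE {
    ?p wdt:P17 wd:Q30 .
    ?p wdt:P31/wdt:P279* wd:Q41298 .
}
"""

print(queryString)
# run_query(queryString)  # requires the setup cell and access to the endpoint
```

If a magazine can match the pattern through more than one type, `COUNT(DISTINCT ?p)` may give a more faithful number; whether that changes the result here is not verified.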