# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload starts with a usually vague, open-ended question, and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that is present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.
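
As a minimal sketch of how this BGP can be used on its own, the helper below builds a query string that looks up the name of one known IRI. The function name `label_query` is hypothetical, and the `sc:` prefix is assumed to be declared by whatever runs the query (for example, the prefix string in the setup cell).

```python
def label_query(iri: str) -> str:
    """Build a SELECT query returning the sc:name of one prefixed IRI.

    `iri` is a prefixed name such as "wdt:P31" or "wd:Q183".
    The sc: prefix is assumed to be declared by the caller.
    """
    return f"SELECT DISTINCT ?name WHERE {{ {iri} sc:name ?name . }}"


# Example: the query that asks for the human-readable name of wdt:P31.
print(label_query("wdt:P31"))
```

The returned string can be passed to a query runner that prepends the prefixes before sending it to the endpoint.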
    
    

In [1]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-764433a2ca-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# SELECT and CONSTRUCT queries
def run_query(queryString):
    """Run a query, print each result row, and return the number of rows."""
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try:
        results = sparql.query()
        json_results = results.convert()
        bindings_list = json_results['results']['bindings']
        if len(bindings_list) == 0:
            print("Empty")
            return 0

        # Print every row as a list of (variable, value) pairs
        for bindings in bindings_list:
            print([(var, value['value']) for var, value in bindings.items()])

        return len(bindings_list)

    except Exception as e:
        # On failure the function prints the error and returns None
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    """Run an ASK query and return the parsed JSON response (None on failure)."""
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try:
        return sparql.query().convert()

    except Exception as e:
        print("The operation failed", e)
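
The setup helpers print each row but do not return the rows themselves. As a hedged sketch, the function below flattens a result in the SPARQL JSON results layout into plain dictionaries, which can be convenient for further processing. The name `bindings_to_rows` and the sample data are assumptions made for illustration; the sample is hand-written and is not real endpoint output.

```python
from typing import Any, Dict, List


def bindings_to_rows(json_results: Dict[str, Any]) -> List[Dict[str, str]]:
    """Flatten a SPARQL JSON result into a list of {variable: value} dicts."""
    bindings = json_results.get("results", {}).get("bindings", [])
    return [{var: cell["value"] for var, cell in row.items()} for row in bindings]


# Hand-written sample in the SPARQL JSON results layout (hypothetical data).
sample = {
    "head": {"vars": ["oName", "o"]},
    "results": {
        "bindings": [
            {
                "oName": {"type": "literal", "value": "human"},
                "o": {"type": "uri", "value": "http://www.wikidata.org/entity/Q5"},
            }
        ]
    },
}

rows = bindings_to_rows(sample)
print(rows)
```

Unbound variables (for example from `OPTIONAL` patterns) are simply absent from a row, so callers should use `row.get(...)` when a column may be missing.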


# Companies Workflow Series ("Business People in Germany") 

Consider the following exploratory information need:

> You are investigating important business people in Germany

## Useful URIs for the current workflow


The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P279`    | subclass      | predicate |
| `wdt:P17`      | country       | predicate |
| `wdt:P27`      | citizenship   | predicate |
| `wdt:P106`     | occupation   | predicate |
| `wd:Q183`      | Germany      | node      |
| `wd:Q2462658`  | manager      | node      |
| `wd:Q40966`    | Opel         | node      |
| `wd:Q56509715` | Michael Lohscheller | node |
| `wd:Q57479`    | Adam Opel    | node |



Also consider

```

?p wdt:P27 wd:Q183  . 
?p wdt:P106 wd:Q2462658  . 

```

is the BGP to retrieve all **German managers**.

## Workload Goals

1. Identify the BGP for obtaining C.E.O., managers, directors, or founders of German companies

2. Identify the BGP to retrieve gender and profession of people and industry of companies

3. Are there German companies with C.E.O., managers, or founders that are not German?

4. Are there people related to multiple German companies?

5. Analyze the number of business people per role, type of company, and gender
 
   5.1 How many people for each role and gender are there in Germany?
   
   5.2 Are there companies with multiple German people having important roles?
   
   5.3 In which sectors are important German business people working?
   
   5.4 Are there German business people related to non-German companies? How many?


In [2]:
# start your workflow here

In [3]:
queryString = """
SELECT COUNT(*)
WHERE { 

?p wdt:P27 wd:Q183  . 
?p wdt:P106 wd:Q2462658  . 
} 
"""

print("Results")
run_query(queryString)

Results
[('callret-0', '829')]


1

## Part 1
### Identify the BGP for obtaining C.E.O., managers, directors, or founders of German companies

### Search occupations

In [4]:
queryString = """
SELECT DISTINCT ?occupationName ?occupation
WHERE { 

?s wdt:P27 wd:Q183  ;
    wdt:P106 ?occupation  .
?occupation sc:name ?occupationName .   

} ORDER BY ?occupationName
LIMIT 15
"""

print("Results")
run_query(queryString)

Results
[('occupationName', 'Africanist'), ('occupation', 'http://www.wikidata.org/entity/Q15936497')]
[('occupationName', 'Agriculteur'), ('occupation', 'http://www.wikidata.org/entity/Q17123348')]
[('occupationName', 'Alberte'), ('occupation', 'http://www.wikidata.org/entity/Q42677192')]
[('occupationName', 'American football coach'), ('occupation', 'http://www.wikidata.org/entity/Q42331263')]
[('occupationName', 'American football player'), ('occupation', 'http://www.wikidata.org/entity/Q19204627')]
[('occupationName', 'Amtmann'), ('occupation', 'http://www.wikidata.org/entity/Q592996')]
[('occupationName', 'Amtsarzt'), ('occupation', 'http://www.wikidata.org/entity/Q480298')]
[('occupationName', 'Amtshauptmann'), ('occupation', 'http://www.wikidata.org/entity/Q481183')]
[('occupationName', 'Anglican priest'), ('occupation', 'http://www.wikidata.org/entity/Q3409375')]
[('occupationName', 'Anwalt'), ('occupation', 'http://www.wikidata.org/entity/Q613087')]
[('occupationName', 'Arabis

15

### Try to restrict occupation

In [5]:
queryString = """
SELECT DISTINCT ?occupationName ?occupation
WHERE { 

    ?s wdt:P27 wd:Q183  ;
        wdt:P106 ?occupation  .
    ?occupation sc:name ?occupationName .
    
    FILTER (regex(?occupationName,"director")||regex(?occupationName,"manager"))

} ORDER BY ?occupationName
LIMIT 15
"""

print("Results")
run_query(queryString)

Results
[('occupationName', 'IT project manager'), ('occupation', 'http://www.wikidata.org/entity/Q33060693')]
[('occupationName', 'Post office manager'), ('occupation', 'http://www.wikidata.org/entity/Q2106113')]
[('occupationName', 'advertising manager'), ('occupation', 'http://www.wikidata.org/entity/Q2560008')]
[('occupationName', 'art director'), ('occupation', 'http://www.wikidata.org/entity/Q706364')]
[('occupationName', 'artistic director'), ('occupation', 'http://www.wikidata.org/entity/Q1797162')]
[('occupationName', 'assistant director'), ('occupation', 'http://www.wikidata.org/entity/Q1757008')]
[('occupationName', 'association football manager'), ('occupation', 'http://www.wikidata.org/entity/Q628099')]
[('occupationName', 'bank manager'), ('occupation', 'http://www.wikidata.org/entity/Q32947888')]
[('occupationName', 'baseball manager'), ('occupation', 'http://www.wikidata.org/entity/Q1186921')]
[('occupationName', 'broadcast programming director'), ('occupation', 'http:/

15

### Try another way to find occupations
See data related to Opel

In [6]:
queryString = """
SELECT DISTINCT ?pName ?oName ?p ?o
WHERE { 
    wd:Q40966 ?p ?o .
    OPTIONAL { ?p sc:name ?pName }
    OPTIONAL { ?o sc:name ?oName }
} 
"""

print("Results")
run_query(queryString)

Results
[('pName', 'director / manager'), ('oName', 'Michael Lohscheller'), ('p', 'http://www.wikidata.org/prop/direct/P1037'), ('o', 'http://www.wikidata.org/entity/Q56509715')]
[('pName', 'product or material produced'), ('oName', 'van'), ('p', 'http://www.wikidata.org/prop/direct/P1056'), ('o', 'http://www.wikidata.org/entity/Q193468')]
[('pName', 'product or material produced'), ('oName', 'motor car'), ('p', 'http://www.wikidata.org/prop/direct/P1056'), ('o', 'http://www.wikidata.org/entity/Q1420')]
[('pName', 'founded by'), ('oName', 'Adam Opel'), ('p', 'http://www.wikidata.org/prop/direct/P112'), ('o', 'http://www.wikidata.org/entity/Q57479')]
[('pName', 'owned by'), ('oName', 'Stellantis'), ('p', 'http://www.wikidata.org/prop/direct/P127'), ('o', 'http://www.wikidata.org/entity/Q97439162')]
[('pName', 'named after'), ('oName', 'Adam Opel'), ('p', 'http://www.wikidata.org/prop/direct/P138'), ('o', 'http://www.wikidata.org/entity/Q57479')]
[('pName', "topic's main template"), ('oN

83

### Search for the classes that Opel is an instance or subclass of

In [7]:
queryString = """
SELECT DISTINCT ?oName ?o
WHERE { 
    wd:Q40966 wdt:P31|wdt:P279 ?o .
    ?o sc:name ?oName .

} LIMIT 25
"""

print("Results")
run_query(queryString)

Results
[('oName', 'automobile manufacturer'), ('o', 'http://www.wikidata.org/entity/Q786820')]
[('oName', 'car brand'), ('o', 'http://www.wikidata.org/entity/Q10429667')]
[('oName', 'subsidiary'), ('o', 'http://www.wikidata.org/entity/Q658255')]


3

### Search for the superclasses of the classes found above

In [8]:
queryString = """
SELECT DISTINCT ?subName ?sub
WHERE { 
    wd:Q40966 wdt:P31 ?o .
    OPTIONAL{ ?o wdt:P279 ?sub .}
    ?o sc:name ?oName .
    ?sub sc:name ?subName .
} LIMIT 25
"""

print("Results")
run_query(queryString)

Results
[('subName', 'manufacturer'), ('sub', 'http://www.wikidata.org/entity/Q13235160')]
[('subName', 'profession'), ('sub', 'http://www.wikidata.org/entity/Q28640')]
[('subName', 'brand'), ('sub', 'http://www.wikidata.org/entity/Q431289')]
[('subName', 'enterprise'), ('sub', 'http://www.wikidata.org/entity/Q6881511')]
[('subName', 'subsidiary entity'), ('sub', 'http://www.wikidata.org/entity/Q62079110')]
[('subName', 'commercial company'), ('sub', 'http://www.wikidata.org/entity/Q567521')]


6

### Find the managers/directors and founders of German companies

In [9]:
queryString = """
SELECT DISTINCT ?companyName (GROUP_CONCAT(distinct ?workerName ; SEPARATOR="; ") AS ?managers)
WHERE {
    {
        SELECT ?companyName ?workerName
        WHERE {
            ?company wdt:P1037 ?worker ; #manager
                 wdt:P112 ?founder ;
                 wdt:P17 wd:Q183 ;
                 sc:name ?companyName .
            ?worker sc:name ?workerName .
        }
    }
    UNION
    {
        SELECT ?companyName ?workerName
        WHERE {
            ?company wdt:P112 ?worker ; #founder
                 wdt:P17 wd:Q183 ;
                 sc:name ?companyName .
            ?worker sc:name ?workerName .
        }
    
    } 
           
} GROUP BY ?companyName
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('companyName', 'Die Burg'), ('managers', 'Willi Burth')]
[('companyName', 'Römheld & Moelle'), ('managers', 'Julius Römheld')]
[('companyName', 'Verlag Ferdinand Schöningh'), ('managers', 'Ferdinand Schöningh')]
[('companyName', 'Grafit Verlag'), ('managers', 'Rutger Booß')]
[('companyName', 'Aurelius Group'), ('managers', 'Dirk Markus')]
[('companyName', 'Freies Jüdisches Lehrhaus'), ('managers', 'Franz Rosenzweig')]
[('companyName', 'Aachener Domchor'), ('managers', 'Alcuin; Berthold Botzet; Charlemagne')]
[('companyName', 'Hulk Räckorz'), ('managers', 'Wizo')]
[('companyName', 'Chemische Fabrik Gebrüder Gravenhorst'), ('managers', 'Christoph Julius Gravenhorst; Johann Heinrich Gravenhorst')]
[('companyName', 'Max Bahr'), ('managers', 'Max Bahr')]
[('companyName', 'Bismarck-Gymnasium Karlsruhe'), ('managers', 'Johann Pistorius the Younger')]
[('companyName', 'K. G. Saur Verlag'), ('managers', 'Karl-Otto Saur')]
[('companyName', 'Kaiser-Wilhelm-Institut für Eisenforschung'),

20

### Search for the class that a person (Michael Lohscheller) is an instance of

In [10]:
queryString = """
SELECT DISTINCT  ?oName ?o
WHERE { 
    wd:Q56509715 wdt:P31 ?o .
    ?o sc:name ?oName .
} 
"""

print("Results")
run_query(queryString)

Results
[('oName', 'human'), ('o', 'http://www.wikidata.org/entity/Q5')]


1

### Search for human workers and their roles in German companies

In [11]:
queryString = """
SELECT DISTINCT ?workerName ?roleName ?role
WHERE {

    ?company ?role ?worker ;
         wdt:P17 wd:Q183 ;
         wdt:P279 wd:Q567521 ;
         sc:name ?companyName .
    
    ?worker sc:name ?workerName ;
            wdt:P31 wdt:Q5 .
    
    ?role sc:name ?roleName .
           
}
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
Empty


0

### Remove some predicates

In [12]:
queryString = """
SELECT DISTINCT ?companyName ?company
WHERE {

    ?company wdt:P17 wd:Q183 ;
         wdt:P279* wd:Q567521 ;
         sc:name ?companyName .           
}
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('companyName', 'Gemeinnützige GmbH'), ('company', 'http://www.wikidata.org/entity/Q1500859')]
[('companyName', 'Aktiengesellschaft'), ('company', 'http://www.wikidata.org/entity/Q422062')]
[('companyName', 'Offene Handelsgesellschaft'), ('company', 'http://www.wikidata.org/entity/Q19340991')]
[('companyName', 'Stiftung & Co. KG'), ('company', 'http://www.wikidata.org/entity/Q2348856')]
[('companyName', 'Entrepreneurial company (Germany)'), ('company', 'http://www.wikidata.org/entity/Q2498526')]
[('companyName', 'Stiftung & Co. KGaA'), ('company', 'http://www.wikidata.org/entity/Q2348860')]
[('companyName', 'Stiftung GmbH & Co. KG'), ('company', 'http://www.wikidata.org/entity/Q2348994')]


7

### Correct query (add * to wdt:P279)

In [13]:
queryString = """
SELECT DISTINCT ?workerName ?roleName ?role
WHERE {

    ?company ?role ?worker ;
         wdt:P17 wd:Q183 ;
         wdt:P279* wd:Q567521 ;
         sc:name ?companyName .
    
    ?worker sc:name ?workerName ;
            wdt:P31 wdt:Q5 .
    
    ?role sc:name ?roleName .
           
}
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
Empty


0

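### Correct the query again: remove the subclass-of triple on the company and add an instance-of triple, then check that the instance is a subclass of company
The person type is also written as `wd:Q5` here. The two previous attempts used `wdt:Q5`, which is an IRI in the property namespace and cannot match the class human.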

In [14]:
queryString = """
SELECT DISTINCT ?roleName ?role
WHERE {

    ?company ?role ?worker ;
         wdt:P17 wd:Q183 ;
         wdt:P31 ?instance ;
         sc:name ?companyName .
         
    ?instance wdt:P279* wd:Q567521 .
    
    ?worker wdt:P31 wd:Q5 .
    OPTIONAL{ ?worker sc:name ?workerName .}
    
    ?role sc:name ?roleName .
           
}
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('roleName', 'founded by'), ('role', 'http://www.wikidata.org/prop/direct/P112')]
[('roleName', 'named after'), ('role', 'http://www.wikidata.org/prop/direct/P138')]
[('roleName', 'owned by'), ('role', 'http://www.wikidata.org/prop/direct/P127')]
[('roleName', 'chief executive officer'), ('role', 'http://www.wikidata.org/prop/direct/P169')]
[('roleName', 'director / manager'), ('role', 'http://www.wikidata.org/prop/direct/P1037')]
[('roleName', 'board member'), ('role', 'http://www.wikidata.org/prop/direct/P3320')]
[('roleName', 'significant person'), ('role', 'http://www.wikidata.org/prop/direct/P3342')]
[('roleName', 'chairperson'), ('role', 'http://www.wikidata.org/prop/direct/P488')]
[('roleName', 'supervisory board member'), ('role', 'http://www.wikidata.org/prop/direct/P5052')]
[('roleName', 'owner of'), ('role', 'http://www.wikidata.org/prop/direct/P1830')]
[('roleName', 'investor'), ('role', 'http://www.wikidata.org/prop/direct/P1951')]


11

### Try reverse path

In [15]:
queryString = """
SELECT DISTINCT ?roleName ?role
WHERE {

    ?company wdt:P17 wd:Q183 ;
         wdt:P31 ?instance ;
         sc:name ?companyName .
         
    ?instance wdt:P279* wd:Q567521 .
    
    ?worker wdt:P31 wd:Q5 ;
            ?role ?company .
    OPTIONAL { ?worker sc:name ?workerName . }
    
    ?role sc:name ?roleName .
           
}
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('roleName', 'employer'), ('role', 'http://www.wikidata.org/prop/direct/P108')]
[('roleName', 'affiliation'), ('role', 'http://www.wikidata.org/prop/direct/P1416')]
[('roleName', 'field of work'), ('role', 'http://www.wikidata.org/prop/direct/P101')]
[('roleName', 'chief executive officer'), ('role', 'http://www.wikidata.org/prop/direct/P169')]
[('roleName', 'described by source'), ('role', 'http://www.wikidata.org/prop/direct/P1343')]
[('roleName', 'owner of'), ('role', 'http://www.wikidata.org/prop/direct/P1830')]
[('roleName', 'oral history at'), ('role', 'http://www.wikidata.org/prop/direct/P9600')]
[('roleName', 'member of'), ('role', 'http://www.wikidata.org/prop/direct/P463')]
[('roleName', 'has works in the collection'), ('role', 'http://www.wikidata.org/prop/direct/P6379')]
[('roleName', 'sponsor'), ('role', 'http://www.wikidata.org/prop/direct/P859')]
[('roleName', 'has written for'), ('role', 'http://www.wikidata.org/prop/direct/P6872')]
[('roleName', 'educated at')

14

### Result part 1
We can see that the last query also returns employees, so the previous one is better because it retrieves only the "important" roles in the companies.
The triple (?instance wdt:P279* wd:Q567521) is fundamental to retrieve only companies.
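
To illustrate why the `*` in `wdt:P279*` matters, the following minimal sketch mimics a zero-or-more path on a toy class hierarchy. The class names, the edges, and the helper names (`superclasses`, `is_company`) are hypothetical and serve only as an illustration; they are not taken from Wikidata.

```python
# Toy "subclass of" edges (child -> parents). Hypothetical data.
EDGES = {
    "car maker": ["manufacturer"],
    "manufacturer": ["company"],
    "job title": ["profession"],
}


def superclasses(cls, edges):
    """Return cls plus every class reachable via zero or more subclass edges."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in edges.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_company(instance_class, edges, target="company"):
    """Mimic `?instance wdt:P279* <target>` on the toy hierarchy."""
    return target in superclasses(instance_class, edges)


print(is_company("car maker", EDGES))   # reached in two steps
print(is_company("company", EDGES))     # reached in zero steps
print(is_company("job title", EDGES))   # never reaches "company"
```

A single-step check (one `wdt:P279` hop without `*`) would miss "car maker" in this toy graph, which is the kind of gap the path operator closes.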

The next two queries retrieve:
* all the C.E.O.s, managers, directors, or founders with their roles;
* all the C.E.O.s, managers, directors, or founders with their roles, grouped by company.

#### All the C.E.O.s, managers, directors, or founders with their roles

In [16]:
queryString = """
SELECT ?workerName (GROUP_CONCAT(DISTINCT ?roleName; SEPARATOR="; ") AS ?roles)
WHERE {

    ?company ?role ?worker ;
         wdt:P17 wd:Q183 ;
         wdt:P31 ?instance ;
         sc:name ?companyName .
         
    ?instance wdt:P279* wd:Q567521 .
    
    ?worker wdt:P31 wd:Q5 .
    OPTIONAL{ ?worker sc:name ?workerName .}
    
    ?role sc:name ?roleName .
           
}
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('workerName', 'Wilhelm Haspel'), ('roles', 'board member')]
[('workerName', 'Thomas Eichelmann'), ('roles', 'chairperson; supervisory board member')]
[('workerName', 'Robert Pferdmenges'), ('roles', 'supervisory board member')]
[('workerName', 'Robert Gentz'), ('roles', 'founded by')]
[('workerName', 'Karl Rapp'), ('roles', 'founded by')]
[('workerName', 'Hermann Schilling'), ('roles', 'board member')]
[('workerName', 'Norbert Reithofer'), ('roles', 'director / manager')]
[('workerName', 'Walter J. Zable'), ('roles', 'founded by')]
[('workerName', 'Edzard Reuter'), ('roles', 'board member')]
[('workerName', 'Tobias Meyer'), ('roles', 'board member')]
[('workerName', 'Werner Baumann'), ('roles', 'chief executive officer')]
[('workerName', 'Helmut Gottschalk'), ('roles', 'director / manager')]
[('workerName', 'Carsten Spohr'), ('roles', 'board member; chief executive officer')]
[('workerName', 'Oliver Bäte'), ('roles', 'director / manager')]
[('workerName', 'Ulrich Schmack'), (

20

### All the C.E.O.s, managers, directors, or founders with their roles, grouped by company

In [17]:
queryString = """
SELECT ?companyName (GROUP_CONCAT(DISTINCT ?workerInfo; SEPARATOR="; ") AS ?workersInfo)
WHERE {
    
    ?company ?role ?worker ;
         wdt:P17 wd:Q183 ;
         wdt:P31 ?instance ;
         sc:name ?companyName .

    ?instance wdt:P279* wd:Q567521 .

    ?worker wdt:P31 wd:Q5 ;
        sc:name ?workerName .

    ?role sc:name ?roleName .
    
    BIND(CONCAT(?roleName,": ",?workerName) as ?workerInfo)
      
} GROUP BY ?companyName
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('companyName', 'BMW'), ('workersInfo', 'board member: Helmut Panke; board member: Karl-Heinz Sonne; chief executive officer: Oliver Zipse; director / manager: Norbert Reithofer; founded by: Camillo Castiglioni; founded by: Franz Josef Popp; founded by: Karl Rapp; owned by: Stefan Quandt; owned by: Susanne Klatten; supervisory board member: Reinhard Hüttl')]
[('companyName', 'Dachser Group SE & Co. KG'), ('workersInfo', 'director / manager: Bernhard Simon; founded by: Thomas Dachser')]
[('companyName', 'Vattenfall'), ('workersInfo', 'director / manager: Tuomo Hatakka')]
[('companyName', 'Lanxess'), ('workersInfo', 'director / manager: Matthias Zachert')]
[('companyName', 'Friedrich Krupp AG'), ('workersInfo', 'board member: Alfred Hugenberg; board member: Alfried Krupp von Bohlen und Halbach; board member: Bertha Krupp; board member: Friedrich Janssen; board member: Otto Wiedfeldt; founded by: Friedrich Krupp; supervisory board member: Gustav Krupp von Bohlen und Halbach; supe

20

### Same as the previous query, but the roles of each person in a company are grouped together (if there is more than one)

In [18]:
queryString = """
#SELECT ?companyName (GROUP_CONCAT(DISTINCT ?workerInfo; SEPARATOR="; ") AS ?workersInfo)
SELECT ?workerName ?roles ?workerInfo
WHERE {
    {
        SELECT ?companyName ?workerName (GROUP_CONCAT(DISTINCT ?roleName; SEPARATOR=", ") AS ?roles)
        WHERE{
            ?company ?role ?worker ;
                 wdt:P17 wd:Q183 ;
                 wdt:P31 ?instance ;
                 sc:name ?companyName .

            ?instance wdt:P279* wd:Q567521 .

            ?worker wdt:P31 wd:Q5 ;
                sc:name ?workerName .

            ?role sc:name ?roleName .
        } GROUP BY ?companyName ?workerName
    }
    BIND( CONCAT(?workerName," : ",?roles) AS ?workerInfo)
    
} GROUP BY ?companyName
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('workerName', 'Werner Wenning'), ('roles', 'director / manager'), ('workerInfo', 'Werner Wenning : director / manager')]
[('workerName', 'Ernst Tengelmann'), ('roles', 'board member'), ('workerInfo', 'Ernst Tengelmann : board member')]
[('workerName', 'Dr. Marcus Chromik'), ('roles', 'board member'), ('workerInfo', 'Dr. Marcus Chromik : board member')]
[('workerName', 'Klaus Rehnig'), ('roles', 'supervisory board member'), ('workerInfo', 'Klaus Rehnig : supervisory board member')]
[('workerName', 'Adolf Dassler'), ('roles', 'founded by, named after'), ('workerInfo', 'Adolf Dassler : founded by, named after')]
[('workerName', 'Adolf von Brüning'), ('roles', 'founded by, named after'), ('workerInfo', 'Adolf von Brüning : founded by, named after')]
[('workerName', 'Oliver Zipse'), ('roles', 'chief executive officer'), ('workerInfo', 'Oliver Zipse : chief executive officer')]
[('workerName', 'Ernst Leitz senior'), ('roles', 'founded by, named after'), ('workerInfo', 'Ernst Leitz 

20
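
The two-level grouping attempted above (roles per worker, then workers per company, as the commented-out outer `SELECT` line suggests) can be mimicked in plain Python. This is only a sketch under stated assumptions: the rows and the function name `group_roles` are hypothetical, and the separators follow the ones used in the queries (`", "` between roles, `"; "` between workers).

```python
from collections import defaultdict

# Hypothetical (company, worker, role) rows, as a flat query might return them.
rows = [
    ("Company A", "Alice", "founded by"),
    ("Company A", "Alice", "named after"),
    ("Company A", "Bob", "chief executive officer"),
    ("Company B", "Carol", "board member"),
]


def group_roles(rows):
    """Group roles per (company, worker), then workers per company."""
    # Inner grouping: distinct roles of each worker within each company.
    roles = defaultdict(set)
    for company, worker, role in rows:
        roles[(company, worker)].add(role)

    # Outer grouping: one "worker : role, role" string per worker, per company.
    per_company = defaultdict(list)
    for (company, worker), role_set in sorted(roles.items()):
        per_company[company].append(f"{worker} : {', '.join(sorted(role_set))}")

    return {company: "; ".join(infos) for company, infos in per_company.items()}


grouped = group_roles(rows)
for company, info in grouped.items():
    print(company, "->", info)
```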

### See the results written above for Part 1

## Part 2
### Identify the BGP to retrieve gender and profession of people and industry of companies
Note: this part was completed last, so the predicates used here were retrieved by queries in the other parts.

In [19]:
queryString = """
SELECT DISTINCT ?workerName ?genderType (GROUP_CONCAT(DISTINCT ?roleName; SEPARATOR="; ") AS ?prefession) (GROUP_CONCAT(DISTINCT ?sector; SEPARATOR="; ") AS ?sectors)
WHERE {

    ?company ?role ?worker ;
         wdt:P17 wd:Q183 ;
         wdt:P31 ?instance ;
         wdt:P452 ?industry ;
         sc:name ?companyName .
         
    ?instance wdt:P279* wd:Q567521 .
    ?industry sc:name ?sector .
    
    ?worker wdt:P31 wd:Q5 ;
         wdt:P21 ?gender ;     #gender detail
        sc:name ?workerName .
        
    ?gender sc:name ?genderType .
    
    ?role sc:name ?roleName .
           
}
GROUP BY ?workerName ?genderType
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('workerName', 'Friedrich Engelhorn'), ('genderType', 'male'), ('prefession', 'founded by'), ('sectors', 'chemical industry; crop protection; extraction of petroleum; pharmaceutical industry')]
[('workerName', 'Jürgen E. Schrempp'), ('genderType', 'male'), ('prefession', 'chief executive officer'), ('sectors', 'Manufacture of motor vehicles; arms industry; automotive industry; aviation industry; vehicle construction')]
[('workerName', 'Uğur Şahin'), ('genderType', 'male'), ('prefession', 'board member; chief executive officer'), ('sectors', 'pharmaceutical industry')]
[('workerName', 'Ken Allen'), ('genderType', 'male'), ('prefession', 'board member'), ('sectors', 'logistics; mail; postal and telecommunications services')]
[('workerName', 'Eugen Vögler'), ('genderType', 'male'), ('prefession', 'board member'), ('sectors', 'Construction; construction industry')]
[('workerName', 'Karl Winnacker'), ('genderType', 'male'), ('prefession', 'board member'), ('sectors', 'chemical indu

20

### Try to find a "work" predicate or similar

In [20]:
queryString = """
SELECT DISTINCT ?pName ?p
WHERE { 
    wd:Q56509715 ?p ?o .
    ?p sc:name ?pName .
    FILTER (regex(?pName, ".*[Ww]ork.*")||regex(?pName, ".*[Ee]mploe.*"))
} 
"""

print("Results")
run_query(queryString)

Results
Empty


0

### List all the predicates associated with Michael Lohscheller

In [21]:
queryString = """
SELECT DISTINCT  ?pName ?p
WHERE { 
    wd:Q56509715 ?p ?o .
    ?p sc:name ?pName .
} 
"""

print("Results")
run_query(queryString)

Results
[('pName', 'occupation'), ('p', 'http://www.wikidata.org/prop/direct/P106')]
[('pName', 'Munzinger person ID'), ('p', 'http://www.wikidata.org/prop/direct/P1284')]
[('pName', 'languages spoken, written or signed'), ('p', 'http://www.wikidata.org/prop/direct/P1412')]
[('pName', 'image'), ('p', 'http://www.wikidata.org/prop/direct/P18')]
[('pName', 'place of birth'), ('p', 'http://www.wikidata.org/prop/direct/P19')]
[('pName', 'sex or gender'), ('p', 'http://www.wikidata.org/prop/direct/P21')]
[('pName', 'Google Knowledge Graph ID'), ('p', 'http://www.wikidata.org/prop/direct/P2671')]
[('pName', 'country of citizenship'), ('p', 'http://www.wikidata.org/prop/direct/P27')]
[('pName', 'instance of'), ('p', 'http://www.wikidata.org/prop/direct/P31')]
[('pName', 'Commons category'), ('p', 'http://www.wikidata.org/prop/direct/P373')]
[('pName', 'date of birth'), ('p', 'http://www.wikidata.org/prop/direct/P569')]
[('pName', 'family name'), ('p', 'http://www.wikidata.org/prop/direct/P734

13

### Retrieve all the workers with their gender, their occupations, and the sectors in which they work

In [22]:
queryString = """
SELECT DISTINCT ?workerName ?genderType 
(GROUP_CONCAT(DISTINCT ?occupationName; SEPARATOR="; ") AS ?prefession) (GROUP_CONCAT(DISTINCT ?sector; SEPARATOR="; ") AS ?sectors)
WHERE {

    ?company ?role ?worker ;
         wdt:P17 wd:Q183 ;
         wdt:P31 ?instance ;
         wdt:P452 ?industry ;
         sc:name ?companyName .
         
    ?instance wdt:P279* wd:Q567521 .
    ?industry sc:name ?sector .
    
    ?worker wdt:P31 wd:Q5 ;
         wdt:P21 ?gender ;     #gender detail
         wdt:P106 ?occupation ;
        sc:name ?workerName .
        
    ?gender sc:name ?genderType .
    ?occupation sc:name ?occupationName .
           
}
GROUP BY ?workerName ?genderType
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('workerName', 'Friedrich Engelhorn'), ('genderType', 'male'), ('prefession', 'entrepreneur'), ('sectors', 'chemical industry; crop protection; extraction of petroleum; pharmaceutical industry')]
[('workerName', 'Jürgen E. Schrempp'), ('genderType', 'male'), ('prefession', 'business executive'), ('sectors', 'Manufacture of motor vehicles; arms industry; automotive industry; aviation industry; vehicle construction')]
[('workerName', 'Uğur Şahin'), ('genderType', 'male'), ('prefession', 'entrepreneur; oncologist; researcher; university teacher'), ('sectors', 'pharmaceutical industry')]
[('workerName', 'Ken Allen'), ('genderType', 'male'), ('prefession', 'manager'), ('sectors', 'logistics; mail; postal and telecommunications services')]
[('workerName', 'Eugen Vögler'), ('genderType', 'male'), ('prefession', 'manager'), ('sectors', 'Construction; construction industry')]
[('workerName', 'Karl Winnacker'), ('genderType', 'male'), ('prefession', 'chemist; university teacher'), ('sec

20

### Result part 2
I tried to retrieve a "work" (or similar) predicate but did not find one. I then found the "occupation" predicate (`wdt:P106`), and after using it I conclude that it describes a person's profession much better than their role in the company. We can also see that several of them have more than one occupation.

## Part 3
### Are there German companies with C.E.O., managers, or founders that are not German?

In [23]:
queryString = """
SELECT DISTINCT ?companyName
WHERE {

    {
        ?company ?role ?worker ;
             wdt:P17 wd:Q183 ;
             wdt:P31 ?instance ;
             sc:name ?companyName .

        ?instance wdt:P279* wd:Q567521 .

        ?worker wdt:P31 wd:Q5 ;
                wdt:P27 ?nationality .
        ?worker sc:name ?workerName .
        ?nationality sc:name ?nationName .

        ?role sc:name ?roleName .
    }
    MINUS
    {
        ?company ?role ?worker ;
             wdt:P17 wd:Q183 ;
             wdt:P31 ?instance .
         
        ?instance wdt:P279* wd:Q567521 .

        ?worker wdt:P31 wd:Q5 ;
                wdt:P27 wd:Q183 .    
    }         
}
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('companyName', 'Hugo Boss')]
[('companyName', 'BMW')]
[('companyName', 'Siemens')]
[('companyName', 'Siemens Healthineers')]
[('companyName', 'Deutsche Bank')]
[('companyName', 'Metro AG')]
[('companyName', 'E.ON SE')]
[('companyName', 'TUI AG')]
[('companyName', 'Donner & Reuschel')]
[('companyName', 'Bayer')]
[('companyName', 'SAP SE')]
[('companyName', 'Dieter-Schwarz-Stiftung')]
[('companyName', 'Leica Camera')]
[('companyName', 'Grillo-Werke')]
[('companyName', 'Agfa-Gevaert')]
[('companyName', 'HHLA')]
[('companyName', 'Wiley')]
[('companyName', 'Willy Bogner GmbH & Co. KGaA')]
[('companyName', 'Südzucker')]
[('companyName', 'Deutsche Post AG')]


20

### Ask the same question with an ASK query

In [24]:
queryString = """
ASK WHERE {
        ?company ?role ?worker ;
             wdt:P17 wd:Q183 ;
             wdt:P31 ?instance .

        ?instance wdt:P279* wd:Q567521 .

        ?worker wdt:P31 wd:Q5 ;
                wdt:P27 ?nationality .
        
        ?nationality sc:name ?nationName .

    FILTER ( ?nationality != wd:Q183 )
   
           
}
"""

print("Results")
run_ask_query(queryString)

Results


{'head': {'link': []}, 'boolean': True}
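
The ASK helper returns the whole JSON response, or `None` if the request failed. As a small, hedged sketch, the hypothetical function `ask_to_bool` below extracts just the answer from a response shaped like the one printed above.

```python
from typing import Optional


def ask_to_bool(result: Optional[dict]) -> Optional[bool]:
    """Return the ASK answer, or None when the query failed or the key is missing."""
    if not isinstance(result, dict):
        return None
    value = result.get("boolean")
    return value if isinstance(value, bool) else None


# Response copied from the output above.
answer = ask_to_bool({'head': {'link': []}, 'boolean': True})
print(answer)
```

Returning `None` rather than `False` on failure keeps "the query failed" distinct from "the answer is no".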

### Try to display not only the companies but also the nationalities of the non-German workers
Wrong result: it also retrieves companies with only German workers.

In [25]:
queryString = """
SELECT DISTINCT ?companyName (GROUP_CONCAT(DISTINCT ?nationName; SEPARATOR=", ") AS ?nationalities)
WHERE {
        ?company ?role ?worker ;
             wdt:P17 wd:Q183 ;
             wdt:P31 ?instance ;
             sc:name ?companyName .

        ?instance wdt:P279* wd:Q567521 .

        ?worker wdt:P31 wd:Q5 ;
                wdt:P27 ?nationality .
        ?worker sc:name ?workerName .
        ?nationality sc:name ?nationName .

        ?role sc:name ?roleName .
    
    MINUS
    {
        ?company ?role ?worker ;
             wdt:P17 wd:Q183 ;
             wdt:P31 ?instance .
         
        ?instance wdt:P279* wd:Q567521 .

        ?worker wdt:P31 wd:Q5 ;
                wdt:P27 wd:Q183 . 
    }
}
GROUP BY ?companyName
LIMIT 15
"""

print("Results")
run_query(queryString)

Results
[('companyName', 'BMW'), ('nationalities', 'Austria, Austria-Hungary, Italy')]
[('companyName', 'Vattenfall'), ('nationalities', 'Finland')]
[('companyName', 'Lanxess'), ('nationalities', 'Germany')]
[('companyName', 'Hochtief'), ('nationalities', 'Germany')]
[('companyName', 'Wiley'), ('nationalities', 'United States of America')]
[('companyName', 'Jefferies & Company'), ('nationalities', 'United States of America')]
[('companyName', 'Volkswagen Group'), ('nationalities', 'Austria, Austria-Hungary, Czechoslovakia, Germany, Republic of German-Austria')]
[('companyName', 'Vonovia'), ('nationalities', 'Germany')]
[('companyName', 'Deutsche Bank'), ('nationalities', 'German Reich, Kingdom of Prussia')]
[('companyName', 'Aktiengesellschaft für Seeschiffahrt'), ('nationalities', 'German Reich')]
[('companyName', 'Air Berlin'), ('nationalities', 'Germany')]
[('companyName', 'Opel'), ('nationalities', 'German Reich')]
[('companyName', 'Siemens'), ('nationalities', 'Denmark, German Rei

15

### Rewrite the query to remove the wrong rows

In [26]:
queryString = """
SELECT DISTINCT ?companyName (GROUP_CONCAT(DISTINCT ?nationName; SEPARATOR=", ") AS ?nationalities)
WHERE {

    
    ?company ?role ?worker ;
         wdt:P17 wd:Q183 ;
         wdt:P31 ?instance ;
         sc:name ?companyName .

    ?instance wdt:P279* wd:Q567521 .

    ?worker wdt:P31 wd:Q5 ;
            wdt:P27 ?nationality .
    ?worker sc:name ?workerName .
    ?nationality sc:name ?nationName .

    ?role sc:name ?roleName .
    
    FILTER (?nationality != wd:Q183)
    
}
GROUP BY ?companyName
LIMIT 15
"""

print("Results")
run_query(queryString)

Results
[('companyName', 'BMW'), ('nationalities', 'Austria, Austria-Hungary, Italy')]
[('companyName', 'Vattenfall'), ('nationalities', 'Finland')]
[('companyName', 'Wiley'), ('nationalities', 'United States of America')]
[('companyName', 'Jefferies & Company'), ('nationalities', 'United States of America')]
[('companyName', 'Volkswagen Group'), ('nationalities', 'Austria, Austria-Hungary, Czechoslovakia, Republic of German-Austria')]
[('companyName', 'Deutsche Bank'), ('nationalities', 'German Reich, Kingdom of Prussia, United States of America')]
[('companyName', 'Aktiengesellschaft für Seeschiffahrt'), ('nationalities', 'German Reich')]
[('companyName', 'Opel'), ('nationalities', 'German Reich')]
[('companyName', 'Siemens'), ('nationalities', 'Denmark, German Empire, German Reich')]
[('companyName', 'Hoechst'), ('nationalities', 'German Empire, German Reich, United Kingdom of Great Britain and Ireland')]
[('companyName', 'Deutsche Post AG'), ('nationalities', 'Austria, Kingdom of E

15

### Result part 3
As we can see, there are companies whose CEOs, managers, or founders are not German. The first query shows only the companies' names, while the last one also shows the nationalities of the workers.

Having seen the problems with the previous query and their correction in the last one, we can conclude that the results of the first query could be wrong.
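
The difference between removing rows and removing workers can be reasoned about in plain Python. The sketch below uses hypothetical toy rows (not Wikidata results) to contrast a row-level filter, which is what `FILTER (?nationality != wd:Q183)` does, with excluding every worker who holds German citizenship. All function names are illustrative assumptions.

```python
# Hypothetical toy rows: (company, worker, citizenship). Not taken from Wikidata.
rows = [
    ("ExampleBank", "worker_a", "Germany"),
    ("ExampleBank", "worker_a", "United States of America"),  # dual citizenship
    ("ExampleBank", "worker_b", "Italy"),
    ("ExampleCars", "worker_c", "Germany"),
]


def row_level_filter(rows, excluded="Germany"):
    """Mimic FILTER(?nationality != wd:Q183): drop only the matching rows."""
    return [r for r in rows if r[2] != excluded]


def worker_level_exclusion(rows, excluded="Germany"):
    """Drop every row of any worker who holds the excluded citizenship."""
    excluded_workers = {worker for _, worker, nat in rows if nat == excluded}
    return [r for r in rows if r[1] not in excluded_workers]


def nationalities_per_company(rows):
    """Group the remaining citizenship labels by company, like GROUP_CONCAT."""
    grouped = {}
    for company, _, nat in rows:
        grouped.setdefault(company, set()).add(nat)
    return {company: sorted(nats) for company, nats in grouped.items()}


print(nationalities_per_company(row_level_filter(rows)))
print(nationalities_per_company(worker_level_exclusion(rows)))
```

With the row-level filter, a worker with dual citizenship still contributes the non-German citizenship. This might be why Deutsche Bank gains 'United States of America' in the rewritten query, although that is not verified here.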

## Part 4
### Are there people related to multiple German companies?

### Create a query to group companies for each worker

In [27]:
queryString = """
SELECT DISTINCT ?workerName (GROUP_CONCAT(DISTINCT ?companyName; SEPARATOR=", ") AS ?companies)
WHERE {

        ?company ?role ?worker ;
             wdt:P17 wd:Q183 ;
             wdt:P31 ?instance ;
             sc:name ?companyName .

        ?instance wdt:P279* wd:Q567521 .

        ?worker wdt:P31 wd:Q5 ;
                wdt:P27 ?nationality .
        ?worker sc:name ?workerName .
        ?nationality sc:name ?nationName .

        ?role sc:name ?roleName .
           
} GROUP BY ?workerName
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('workerName', 'Wilhelm Haspel'), ('companies', 'Daimler-Benz')]
[('workerName', 'Thomas Eichelmann'), ('companies', 'Wirecard')]
[('workerName', 'Robert Pferdmenges'), ('companies', 'Dresdner Bank')]
[('workerName', 'Robert Gentz'), ('companies', 'Zalando SE')]
[('workerName', 'Karl Rapp'), ('companies', 'BMW')]
[('workerName', 'Hermann Schilling'), ('companies', 'E.ON SE')]
[('workerName', 'Norbert Reithofer'), ('companies', 'BMW')]
[('workerName', 'Walter J. Zable'), ('companies', 'Cubic Corporation')]
[('workerName', 'Edzard Reuter'), ('companies', 'Daimler-Benz')]
[('workerName', 'Tobias Meyer'), ('companies', 'Deutsche Post AG')]
[('workerName', 'Werner Baumann'), ('companies', 'Bayer')]
[('workerName', 'Carsten Spohr'), ('companies', 'Lufthansa')]
[('workerName', 'Oliver Bäte'), ('companies', 'Allianz SE')]
[('workerName', 'Ulrich Schmack'), ('companies', 'Schmack Biogas Service')]
[('workerName', 'Carl Duisberg'), ('companies', 'Bayer')]
[('workerName', 'Werner Wenning

20

### Add a counter of companies to find who works for at least 2 companies

In [28]:
queryString = """
SELECT DISTINCT ?workerName (GROUP_CONCAT(DISTINCT ?companyName; SEPARATOR=", ") AS ?companies)
WHERE {

        ?company ?role ?worker ;
             wdt:P17 wd:Q183 ;
             wdt:P31 ?instance ;
             sc:name ?companyName .

        ?instance wdt:P279* wd:Q567521 .

        ?worker wdt:P31 wd:Q5 ;
                wdt:P27 ?nationality ;
                sc:name ?workerName .

        ?role sc:name ?roleName .
           
}
GROUP BY ?workerName 
HAVING ( (COUNT(DISTINCT ?companyName) AS ?howManyCompanies)  > 1 )
LIMIT 25
"""

print("Results")
run_query(queryString)

Results
[('workerName', 'Dieter Zetsche'), ('companies', 'Daimler AG, TUI AG')]
[('workerName', 'Ernst Werner von Siemens'), ('companies', 'Siemens, Siemens Healthineers')]
[('workerName', 'Dietmar Hopp'), ('companies', 'CureVac, SAP SE')]
[('workerName', 'Günther Quandt'), ('companies', 'Daimler-Benz, KUKA')]
[('workerName', 'Georg Adolf Solmssen'), ('companies', 'Dahlbusch Verwaltungs-AG, Deutsche Bank')]
[('workerName', 'Hugo Henkel'), ('companies', 'Deutsche Bank, Henkel AG & Co. KGaA')]


6

### Result part 4
We can see that there are 6 people (in our database) who are or were related to more than one company.
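
The `GROUP BY` / `HAVING` logic used above can be sketched in plain Python. The pairs below echo a few rows printed in this part. The function name and the `minimum` parameter are illustrative assumptions, not part of the notebook.

```python
from collections import defaultdict

# (worker, company) pairs echoing rows printed above; one pair is repeated
# on purpose to show the effect of counting DISTINCT companies.
pairs = [
    ("Dieter Zetsche", "Daimler AG"),
    ("Dieter Zetsche", "TUI AG"),
    ("Dieter Zetsche", "TUI AG"),
    ("Dietmar Hopp", "CureVac"),
    ("Dietmar Hopp", "SAP SE"),
    ("Karl Rapp", "BMW"),
]


def workers_with_multiple_companies(pairs, minimum=2):
    """Group companies per worker and keep workers with at least `minimum` distinct companies."""
    companies = defaultdict(set)
    for worker, company in pairs:
        companies[worker].add(company)
    return {
        worker: sorted(names)
        for worker, names in companies.items()
        if len(names) >= minimum
    }


print(workers_with_multiple_companies(pairs))
```

Note that grouping by name, as the query does with `?workerName`, merges two different people who share the same label. Grouping by the worker IRI would avoid that.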

## Part 5

### Part 5.1 How many people for each role and gender are there in Germany?

In [29]:
queryString = """
SELECT DISTINCT ?roleName (COUNT(DISTINCT ?worker) as ?howManyWorkers) 
WHERE {

        ?company ?role ?worker ;
             wdt:P17 wd:Q183 ;
             wdt:P31 ?instance ;
             sc:name ?companyName .

        ?instance wdt:P279* wd:Q567521 .

        ?worker wdt:P31 wd:Q5 .

        ?role sc:name ?roleName .
           
}
GROUP BY ?roleName 
"""

print("Results")
run_query(queryString)

Results
[('roleName', 'chairperson'), ('howManyWorkers', '8')]
[('roleName', 'chief executive officer'), ('howManyWorkers', '33')]
[('roleName', 'director / manager'), ('howManyWorkers', '33')]
[('roleName', 'founded by'), ('howManyWorkers', '60')]
[('roleName', 'named after'), ('howManyWorkers', '16')]
[('roleName', 'board member'), ('howManyWorkers', '92')]
[('roleName', 'owner of'), ('howManyWorkers', '1')]
[('roleName', 'owned by'), ('howManyWorkers', '11')]
[('roleName', 'significant person'), ('howManyWorkers', '4')]
[('roleName', 'supervisory board member'), ('howManyWorkers', '35')]
[('roleName', 'investor'), ('howManyWorkers', '1')]


11

### Search gender predicate

In [30]:
queryString = """
SELECT DISTINCT ?pName ?oName ?p ?o
WHERE { 
    wd:Q56509715 ?p ?o .
    ?p sc:name ?pName .
    ?o sc:name ?oName .
    
    FILTER regex(?pName,"gender")
} 
"""

print("Results")
run_query(queryString)

Results
[('pName', 'sex or gender'), ('oName', 'male'), ('p', 'http://www.wikidata.org/prop/direct/P21'), ('o', 'http://www.wikidata.org/entity/Q6581097')]


1

### Add gender predicate

In [31]:
queryString = """
SELECT DISTINCT ?roleName ?genderType (COUNT(DISTINCT ?worker) as ?howManyWorkers) 
WHERE {

        ?company ?role ?worker ;
             wdt:P17 wd:Q183 ;
             wdt:P31 ?instance ;
             sc:name ?companyName .

        ?instance wdt:P279* wd:Q567521 .

        ?worker wdt:P31 wd:Q5 ;
                wdt:P21 ?gender . #gender detail
        
        ?gender sc:name ?genderType .      

        ?role sc:name ?roleName .
           
}
GROUP BY ?roleName ?genderType
ORDER BY ?roleName ?genderType
"""

print("Results")
run_query(queryString)

Results
[('roleName', 'board member'), ('genderType', 'female'), ('howManyWorkers', '8')]
[('roleName', 'board member'), ('genderType', 'male'), ('howManyWorkers', '83')]
[('roleName', 'chairperson'), ('genderType', 'female'), ('howManyWorkers', '1')]
[('roleName', 'chairperson'), ('genderType', 'male'), ('howManyWorkers', '7')]
[('roleName', 'chief executive officer'), ('genderType', 'female'), ('howManyWorkers', '2')]
[('roleName', 'chief executive officer'), ('genderType', 'male'), ('howManyWorkers', '31')]
[('roleName', 'director / manager'), ('genderType', 'female'), ('howManyWorkers', '1')]
[('roleName', 'director / manager'), ('genderType', 'male'), ('howManyWorkers', '32')]
[('roleName', 'founded by'), ('genderType', 'male'), ('howManyWorkers', '60')]
[('roleName', 'investor'), ('genderType', 'male'), ('howManyWorkers', '1')]
[('roleName', 'named after'), ('genderType', 'male'), ('howManyWorkers', '16')]
[('roleName', 'owned by'), ('genderType', 'female'), ('howManyWorkers', '1

18

### Result part 5.1
We can see that (in our database), for each role, male workers greatly outnumber female workers. The first 2 rows (board member: 8 female, 83 male) show the largest difference.

Also, some roles have no female workers at all (for example: founded by, investor, and named after).
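
To make the comparison concrete, the counts printed above can be turned into a female share and a male/female gap per role. This is a minimal sketch that uses only the rows fully visible in the output (the truncated rows are left out). The helper names are illustrative assumptions.

```python
# Counts copied from the fully printed rows of the result above.
counts = {
    "board member": {"female": 8, "male": 83},
    "chairperson": {"female": 1, "male": 7},
    "chief executive officer": {"female": 2, "male": 31},
    "director / manager": {"female": 1, "male": 32},
    "founded by": {"male": 60},
    "investor": {"male": 1},
    "named after": {"male": 16},
}


def female_share(by_gender):
    """Fraction of workers in a role that are female (0.0 if none are listed)."""
    total = sum(by_gender.values())
    return by_gender.get("female", 0) / total


def gap(by_gender):
    """Absolute difference between male and female counts for a role."""
    return by_gender.get("male", 0) - by_gender.get("female", 0)


for role, by_gender in counts.items():
    print(f"{role}: female share {female_share(by_gender):.1%}, gap {gap(by_gender)}")

largest_gap_role = max(counts, key=lambda role: gap(counts[role]))
print("Largest gap:", largest_gap_role)
```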

### Part 5.2 Are there companies with multiple German people having important roles?

In [32]:
queryString = """
ASK WHERE {
    {
        SELECT (COUNT(DISTINCT ?company) AS ?howManyComapanies) WHERE
        {
            {
                SELECT ?company
                WHERE{
                    ?company ?role ?worker ;
                         wdt:P17 wd:Q183 ;
                         wdt:P31 ?instance .

                    ?instance wdt:P279* wd:Q567521 .

                    ?worker wdt:P31 wd:Q5 ;
                            wdt:P27 wd:Q183 .    
                } GROUP BY ?company
                HAVING (COUNT(DISTINCT ?workers) > 1)
            }
        }
    }
    FILTER ( ?howManyComapanies > 1)
             
}


"""

print("Results")
run_ask_query(queryString)

Results


{'head': {'link': []}, 'boolean': False}

### See the retrieved results 
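From the results below we can see that there are at least 2 companies with more than one German in an important role, so we can conclude that the previous query was wrong: its HAVING clause counts `?workers`, a variable that is never bound (the pattern uses `?worker`), so no group passes the test.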

In [33]:
queryString = """

SELECT ?companyName (COUNT(DISTINCT ?worker) AS ?howManyGerman)
WHERE{
    ?company ?role ?worker ;
         wdt:P17 wd:Q183 ;
         wdt:P31 ?instance ;
         sc:name ?companyName .

    ?instance wdt:P279* wd:Q567521 .

    ?worker wdt:P31 wd:Q5 ;
            wdt:P27 wd:Q183 .    

} 
GROUP BY ?companyName
ORDER BY DESC(?howManyGerman)
LIMIT 10

"""

print("Results")
run_query(queryString)

Results
[('companyName', 'Deutsche Bank'), ('howManyGerman', '10')]
[('companyName', 'Daimler-Benz'), ('howManyGerman', '10')]
[('companyName', 'BMW'), ('howManyGerman', '9')]
[('companyName', 'Friedrich Krupp AG'), ('howManyGerman', '8')]
[('companyName', 'BASF'), ('howManyGerman', '8')]
[('companyName', 'Hoechst'), ('howManyGerman', '6')]
[('companyName', 'Allianz SE'), ('howManyGerman', '6')]
[('companyName', 'Opel'), ('howManyGerman', '5')]
[('companyName', 'Bayer'), ('howManyGerman', '5')]
[('companyName', 'Gelsenkirchener Bergwerks-AG'), ('howManyGerman', '5')]


10

### Try the corrected ASK query

In [34]:
queryString = """
ASK WHERE{
    FILTER (?howManyCompany > 1)
    {
    SELECT (COUNT(?company) AS ?howManyCompany) WHERE
        {
            SELECT DISTINCT ?company
            WHERE{
                ?company ?role ?worker ;
                     wdt:P17 wd:Q183 ;
                     wdt:P31 ?instance ;
                     sc:name ?companyName .

                ?instance wdt:P279* wd:Q567521 .

                ?worker wdt:P31 wd:Q5 ;
                        wdt:P27 wd:Q183 .    

            } 
            GROUP BY ?company
            HAVING ((COUNT(DISTINCT ?worker) AS ?howManyGerman)>1)
        }
    }
     
}

"""

print("Results")
run_ask_query(queryString)

Results


{'head': {'link': []}, 'boolean': True}

### Results part 5.2
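The corrected ASK query returns True, so there are companies with more than one German person in an important role. Since Part 3 also found companies with important people who are not German, the data shows no sign of discrimination based on place of origin.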
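
The False answer of the first ASK query in this part can be reproduced in plain Python. That query counts `?workers` in its HAVING clause while the graph pattern binds `?worker`, and counting an unbound variable yields 0. The sketch below mimics this with hypothetical solution rows. The function names are illustrative assumptions.

```python
# Hypothetical solution rows, shaped like simplified SPARQL bindings.
solutions = [
    {"company": "c1", "worker": "w1"},
    {"company": "c1", "worker": "w2"},
    {"company": "c2", "worker": "w3"},
    {"company": "c2", "worker": "w4"},
]


def companies_with_multiple(solutions, counted_var):
    """Group by company; keep groups with more than one distinct value of counted_var."""
    groups = {}
    for solution in solutions:
        values = groups.setdefault(solution["company"], set())
        if counted_var in solution:  # an unbound variable adds nothing to the count
            values.add(solution[counted_var])
    return sorted(company for company, values in groups.items() if len(values) > 1)


def ask_more_than_one(solutions, counted_var):
    """Mimic the outer ASK: is there more than one qualifying company?"""
    return len(companies_with_multiple(solutions, counted_var)) > 1


print(ask_more_than_one(solutions, "workers"))  # misspelled variable, as in the first query
print(ask_more_than_one(solutions, "worker"))   # variable actually bound by the pattern
```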

### Part 5.3 In which sectors are important German business people working?

In [35]:
queryString = """

SELECT DISTINCT ?workerName (GROUP_CONCAT(DISTINCT ?sector; SEPARATOR=", ") AS ?sectors)
WHERE{
    ?company ?role ?worker ;
         wdt:P17 wd:Q183 ;
         wdt:P31 ?instance ;
         sc:name ?companyName .
    
    ?instance wdt:P279* wd:Q567521 ;
        sc:name ?sector .

    ?worker wdt:P31 wd:Q5 ;
        wdt:P27 wd:Q183 ;
        sc:name ?workerName .

} 
GROUP BY ?workerName
LIMIT 10

"""

print("Results")
run_query(queryString)

Results
[('workerName', 'Wilhelm Haspel'), ('sectors', 'public company')]
[('workerName', 'Thomas Eichelmann'), ('sectors', 'public company')]
[('workerName', 'Robert Gentz'), ('sectors', 'public company')]
[('workerName', 'Robert Pferdmenges'), ('sectors', 'public company')]
[('workerName', 'Karl Rapp'), ('sectors', 'public company')]
[('workerName', 'Norbert Reithofer'), ('sectors', 'public company')]
[('workerName', 'Hermann Schilling'), ('sectors', 'public company')]
[('workerName', 'Edzard Reuter'), ('sectors', 'public company')]
[('workerName', 'Werner Baumann'), ('sectors', 'public company')]
[('workerName', 'Carsten Spohr'), ('sectors', 'public company')]


10

### Use wdt:P452 industry predicate retrieved in part 1

In [36]:
queryString = """

SELECT DISTINCT ?workerName (GROUP_CONCAT(DISTINCT ?sector; SEPARATOR=", ") AS ?sectors)
WHERE{
    ?company ?role ?worker ;
         wdt:P17 wd:Q183 ;
         wdt:P31 ?instance ;
         wdt:P452 ?industry ;
         sc:name ?companyName .
    
    ?instance wdt:P279* wd:Q567521 .
    ?industry sc:name ?sector .

    ?worker wdt:P31 wd:Q5 ;
        wdt:P27 wd:Q183 ;
        sc:name ?workerName .

} 
GROUP BY ?workerName
LIMIT 10

"""

print("Results")
run_query(queryString)

Results
[('workerName', 'Wilhelm Haspel'), ('sectors', 'Manufacture of motor vehicles, arms industry, automotive industry, aviation industry, vehicle construction')]
[('workerName', 'Thomas Eichelmann'), ('sectors', 'financial services')]
[('workerName', 'Robert Gentz'), ('sectors', 'retail')]
[('workerName', 'Robert Pferdmenges'), ('sectors', 'Financial service activities, except insurance and pension funding, Other monetary intermediation, financial services')]
[('workerName', 'Karl Rapp'), ('sectors', 'Manufacture of motor vehicles, automotive industry, vehicle construction')]
[('workerName', 'Norbert Reithofer'), ('sectors', 'Manufacture of motor vehicles, automotive industry, vehicle construction')]
[('workerName', 'Hermann Schilling'), ('sectors', 'energy and water industries, energy sector')]
[('workerName', 'Edzard Reuter'), ('sectors', 'Manufacture of motor vehicles, arms industry, automotive industry, aviation industry, vehicle construction')]
[('workerName', 'Werner Baumann'

10

### Result 5.3 
We can see that some people work in more than one sector, while others work in only one.
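
One caveat when counting sectors from the concatenated string: some industry labels themselves contain a comma, so splitting on the `", "` separator over-counts. The sketch below uses the value printed for Robert Pferdmenges. The three-label split is an assumption based on reading the labels, and `" | "` is just one possible alternative separator.

```python
# Value copied from the 'Robert Pferdmenges' row above.
sectors = (
    "Financial service activities, except insurance and pension funding, "
    "Other monetary intermediation, financial services"
)

# Naive split on the GROUP_CONCAT separator used in the query.
naive = sectors.split(", ")
print(len(naive), naive)

# Assumption: the first label contains a comma, so there are really 3 sectors.
labels = [
    "Financial service activities, except insurance and pension funding",
    "Other monetary intermediation",
    "financial services",
]

# A separator that does not occur in the labels survives a round trip.
joined = " | ".join(labels)
print(joined.split(" | ") == labels)
```

Using a less common separator in `GROUP_CONCAT`, or adding a `COUNT(DISTINCT ?industry)` column to the query, would avoid parsing the string at all.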

### Part 5.4 Are there German business people related to non-German companies? How many?

In [37]:
queryString = """

SELECT DISTINCT ?companyName ?nationName (GROUP_CONCAT(DISTINCT ?workerName; SEPARATOR=", ") AS ?workers)
WHERE{
    ?company ?role ?worker ;
         wdt:P17 ?nation ;
         wdt:P31 ?instance ;
         wdt:P452 ?industry ;
         sc:name ?companyName .
    
    ?nation sc:name ?nationName .
    
    ?instance wdt:P279* wd:Q567521 .
    ?industry sc:name ?sector .

    ?worker wdt:P31 wd:Q5 ;
        wdt:P27 wd:Q183 ;
        sc:name ?workerName .
    
    FILTER( ?nation != wd:Q183 )
} 
GROUP BY ?companyName ?nationName
LIMIT 10

"""

print("Results")
run_query(queryString)

Results
[('companyName', 'MMX Mineração'), ('nationName', 'Brazil'), ('workers', 'Eike Batista')]
[('companyName', 'Lloyds Banking Group'), ('nationName', 'United Kingdom'), ('workers', 'Winfried Bischoff')]
[('companyName', 'Imperial Brands'), ('nationName', 'United Kingdom'), ('workers', 'Stefan Bomhard')]
[('companyName', 'Bâloise'), ('nationName', 'Switzerland'), ('workers', 'Alexander Bockelmann')]
[('companyName', 'ams AG'), ('nationName', 'Austria'), ('workers', 'Alexander Everke')]
[('companyName', 'Demandware'), ('nationName', 'United States of America'), ('workers', 'Stephan Schambach')]
[('companyName', 'Rolls-Royce'), ('nationName', 'United Kingdom'), ('workers', 'Jasmin Staiblin')]
[('companyName', 'Dreamville, Inc.'), ('nationName', 'United States of America'), ('workers', 'J Cole')]
[('companyName', 'Inchcape plc'), ('nationName', 'United Kingdom'), ('workers', 'Stefan Bomhard')]
[('companyName', 'Maersk'), ('nationName', 'Denmark'), ('workers', 'Dorothee Blessing')]


10

### Retrieve the number of German workers that are working for a non-German company

In [38]:
queryString = """

SELECT (COUNT(DISTINCT ?worker) AS ?howManyGermanWorksOutside)
WHERE{
    ?company ?role ?worker ;
         wdt:P17 ?nation ;
         wdt:P31 ?instance .
    
    ?nation sc:name ?nationName .
    
    ?instance wdt:P279* wd:Q567521 .

    ?worker wdt:P31 wd:Q5 ;
        wdt:P27 wd:Q183 .
    
    FILTER( ?nation != wd:Q183 )
}

"""

print("Results")
run_query(queryString)

Results
[('howManyGermanWorksOutside', '43')]


1

### Result part 5.4
There are 43 German people who work for non-German companies. Note that they hold important roles in these companies.