# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload starts with a usually vague, open-ended question and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook.
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook.
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.
    
    

In [2]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-NOTEBOOK_CODE_HERE-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
       results = sparql.query()
       json_results = results.convert()
       if len(json_results['results']['bindings'])==0:
          print("Empty")
          return 0
    
       for bindings in json_results['results']['bindings']:
          print( [ (var, value['value'])  for var, value in bindings.items() ] )

       return len(json_results['results']['bindings'])

    except Exception as e :
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
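
`run_query` only prints the bindings and returns their count. When a later step needs to reuse the results in Python (for example to sort or group them), it may help to flatten the response into plain dictionaries. The sketch below assumes the standard SPARQL JSON results layout that `run_query` already relies on; `bindings_to_rows` is a hypothetical helper name, not part of SPARQLWrapper, and the sample is written by hand from one row of this notebook's output.

```python
def bindings_to_rows(json_results):
    """Flatten a SPARQL JSON result into a list of {variable: value} dicts."""
    rows = []
    for binding in json_results["results"]["bindings"]:
        rows.append({var: cell["value"] for var, cell in binding.items()})
    return rows


# Hand-written sample in the same shape as the endpoint's JSON response.
sample = {
    "head": {"vars": ["p", "pname"]},
    "results": {"bindings": [
        {"p": {"type": "uri", "value": "http://www.wikidata.org/entity/Q4545554"},
         "pname": {"type": "literal", "value": "01 Communique"}},
    ]},
}
print(bindings_to_rows(sample))
```

Inside `run_query`, the same function could be applied to `json_results` before returning, if returning rows instead of a count is acceptable for the workflow.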


# Companies Workflow Series ("IT Companies explorative search") 

Consider the following exploratory information need:

> Compare companies across different sectors in the U.K., the U.S., and Canada, considering the number of employees, the companies owned or acquired, and revenue or assets.

## Useful URIs for the current workflow


The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P279`    | subclass      | predicate |
| `wdt:P17`      | country       | predicate |
| `wd:Q4830453`  | Business      | node      |
| `wd:Q13977`    | Bloomberg L.P.| node |
| `wd:Q502121`   | BlackBerry    | node |
| `wd:Q16`        | Canada        | node |
| `wd:Q145`      | U.K.          | node |
| `wd:Q30`       | U.S.A.        | node |


Also consider

```
?p wdt:P17 wd:Q16  . 
?p wdt:P31 wd:Q4830453  . 
```

is the BGP to retrieve all **Canadian businesses**.
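
Since the information need covers three countries, the same BGP can be generated for each country node in the table above. The following is a minimal sketch; `COUNTRIES` and `business_bgp` are hypothetical names introduced here for illustration.

```python
# Country nodes taken from the table of given URIs.
COUNTRIES = {"Canada": "wd:Q16", "U.K.": "wd:Q145", "U.S.A.": "wd:Q30"}


def business_bgp(country_iri, var="?p"):
    """Return the BGP selecting businesses (wd:Q4830453) located in one country."""
    return (
        f"{var} wdt:P17 {country_iri} .\n"
        f"{var} wdt:P31 wd:Q4830453 ."
    )


print(business_bgp(COUNTRIES["U.K."]))
```

The returned string can be pasted into the `WHERE` clause of any of the queries in this workflow.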

## Workload Goals

1. Identify the BGP for obtaining number of employees of a company and other relevant numerical attributes

2. Identify the BGP to retrieve all companies owned by a company

3. Is there some company that owns companies in other countries?

4. Companies have different 'legal forms'; compare the number of companies across the different legal forms

5. Analyze the number of employees  and other relevant numeric attributes
 
   5.1 What are the top-10 companies for a given attribute?
   
   5.2 For which companies is some form of income, market capitalization, or total assets defined? What are the min, max, and average in each category and country?
   
   5.3 Which business in each country owns the most businesses in other countries?


In [2]:
# start your workflow here

In [3]:
queryString = """
SELECT COUNT(*)
WHERE { 

?p wdt:P17 wd:Q16  . 
?p wdt:P31 wd:Q4830453  . 
} 
"""

print("Results")
run_query(queryString)

Results
[('callret-0', '2482')]


1

# Workload 1

In [None]:
# 1. Identify the BGP for obtaining number of employees of a company and other relevant numerical attributes

In [130]:
# Find all Canadian businesses

In [186]:
queryString = """
SELECT ?p ?pname
WHERE { 

?p wdt:P17 wd:Q16  . 
?p wdt:P31 wd:Q4830453  . 


# this returns the labels
    ?p <http://schema.org/name> ?pname .
    
} 
ORDER BY ASC (?pname) 
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/entity/Q4545554'), ('pname', '01 Communique')]
[('p', 'http://www.wikidata.org/entity/Q30253583'), ('pname', '123 Certification (Canada)')]
[('p', 'http://www.wikidata.org/entity/Q4549346'), ('pname', '13 Flames Empire')]
[('p', 'http://www.wikidata.org/entity/Q30282461'), ('pname', '360pi (Canada)')]
[('p', 'http://www.wikidata.org/entity/Q30267183'), ('pname', '3v Geomatics (Canada)')]
[('p', 'http://www.wikidata.org/entity/Q67029200'), ('pname', '407 ETR Concession Company Limited')]
[('p', 'http://www.wikidata.org/entity/Q67030462'), ('pname', '407 International Inc')]
[('p', 'http://www.wikidata.org/entity/Q47069776'), ('pname', '7Geese')]
[('p', 'http://www.wikidata.org/entity/Q4646438'), ('pname', '9 Story Media Group')]
[('p', 'http://www.wikidata.org/entity/Q4655664'), ('pname', 'A Buck or Two')]


10

In [132]:
# Analyze company properties

In [184]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

?s wdt:P17 wd:Q16  ; 
   wdt:P31 wd:Q4830453 ;
   ?p ?o .


# this returns the labels
    ?p <http://schema.org/name> ?pname .
 #   FILTER REGEX (?pname, "Apple") 
} 
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P2751'), ('pname', 'Roller Coaster Database ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1056'), ('pname', 'product or material produced')]
[('p', 'http://www.wikidata.org/prop/direct/P112'), ('pname', 'founded by')]
[('p', 'http://www.wikidata.org/prop/direct/P1128'), ('pname', 'employees')]
[('p', 'http://www.wikidata.org/prop/direct/P127'), ('pname', 'owned by')]
[('p', 'http://www.wikidata.org/prop/direct/P1454'), ('pname', 'legal form')]
[('p', 'http://www.wikidata.org/prop/direct/P154'), ('pname', 'logo image')]
[('p', 'http://www.wikidata.org/prop/direct/P159'), ('pname', 'headquarters location')]
[('p', 'http://www.wikidata.org/prop/direct/P17'), ('pname', 'country')]
[('p', 'http://www.wikidata.org/prop/direct/P18'), ('pname', 'image')]
[('p', 'http://www.wikidata.org/prop/direct/P2002'), ('pname', 'Twitter username')]
[('p', 'http://www.wikidata.org/prop/direct/P2003'), ('pname', 'Instagram username')]
[('p', 'http://w

20

In [None]:
# Properties chosen from the list above: number of employees, founded by, headquarters location,
# inception, total revenue, net profit

In [None]:
# BGP for obtaining the number of employees of a Canadian company and other relevant numerical attributes

In [43]:
queryString = """
SELECT DISTINCT ?p  ?pname ?numEmployees (GROUP_CONCAT(DISTINCT ?foundername ; separator = ", ") AS ?founders) ?inception 
?headquarterslocation ?totalrevenue ?netprofit WHERE { 

  ?p    wdt:P17 wd:Q16  ;
        wdt:P31 wd:Q4830453 ;
        wdt:P1128 ?numEmployees ;
        wdt:P112 ?founder ; 
        wdt:P571 ?inception ;
        wdt:P159 ?hql .
       
        
OPTIONAL { ?p wdt:P2295 ?netprofit } . 

OPTIONAL { ?p wdt:P2139 ?totalrevenue } . 

?p <http://schema.org/name> ?pname .
?founder <http://schema.org/name> ?foundername .
?hql <http://schema.org/name> ?headquarterslocation .

} LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/entity/Q378767'), ('pname', 'Adtranz'), ('numEmployees', '22715'), ('founders', 'ABB Group'), ('inception', '1996-01-01T00:00:00Z'), ('headquarterslocation', 'Berlin')]
[('p', 'http://www.wikidata.org/entity/Q5264469'), ('pname', 'D2L'), ('numEmployees', '800'), ('founders', 'John Baker'), ('inception', '1999-01-01T00:00:00Z'), ('headquarterslocation', 'Kitchener')]
[('p', 'http://www.wikidata.org/entity/Q835456'), ('pname', 'Sun Life Financial'), ('numEmployees', '15000'), ('founders', 'Matthew Hamilton Gault'), ('inception', '1865-01-01T00:00:00Z'), ('headquarterslocation', 'Sun Life Building')]
[('p', 'http://www.wikidata.org/entity/Q76039'), ('pname', "Hudson's Bay Company"), ('numEmployees', '70000'), ('founders', 'Médard des Groseilliers, Pierre-Esprit Radisson'), ('inception', '1670-05-02T00:00:00Z'), ('headquarterslocation', 'Toronto')]
[('p', 'http://www.wikidata.org/entity/Q1436921'), ('pname', 'Ford Motor Company of Canada'), ('numEmpl

16
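
The query above returns only 16 rows. One likely reason is that founder, inception, and headquarters are all mandatory, so a company missing any of them is dropped even if it has a number of employees. A possible variant, sketched below and not run against the endpoint, keeps only `wdt:P1128` mandatory and moves the other attributes into `OPTIONAL` blocks. It also adds an explicit `GROUP BY`, which standard SPARQL 1.1 requires when `GROUP_CONCAT` is mixed with plain variables.

```python
queryString = """
SELECT ?p ?pname ?numEmployees ?inception
       (GROUP_CONCAT(DISTINCT ?foundername ; separator = ", ") AS ?founders)
WHERE {
  ?p wdt:P17 wd:Q16 ;
     wdt:P31 wd:Q4830453 ;
     wdt:P1128 ?numEmployees ;      # mandatory: number of employees
     sc:name ?pname .

  # optional attributes: a missing value no longer removes the company
  OPTIONAL { ?p wdt:P571 ?inception }
  OPTIONAL { ?p wdt:P112 ?founder . ?founder sc:name ?foundername }
}
GROUP BY ?p ?pname ?numEmployees ?inception
LIMIT 100
"""

# In the notebook this would be executed with run_query(queryString).
print(queryString)
```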

In [155]:
#
#

# Workload 2

In [6]:
# 2. Identify the BGP to retrieve all companies owned by a company 

In [None]:
# Look for properties linked to companies 

In [28]:
queryString = """
SELECT DISTINCT ?p ?pname 
WHERE { 


?s ?p ?company .
?company wdt:P17 wd:Q16  ; 
         wdt:P31 wd:Q4830453 .
         


?p <http://schema.org/name> ?pname .

} 
 
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P108'), ('pname', 'employer')]
[('p', 'http://www.wikidata.org/prop/direct/P127'), ('pname', 'owned by')]
[('p', 'http://www.wikidata.org/prop/direct/P137'), ('pname', 'operator')]
[('p', 'http://www.wikidata.org/prop/direct/P176'), ('pname', 'manufacturer')]
[('p', 'http://www.wikidata.org/prop/direct/P1830'), ('pname', 'owner of')]
[('p', 'http://www.wikidata.org/prop/direct/P272'), ('pname', 'production company')]
[('p', 'http://www.wikidata.org/prop/direct/P301'), ('pname', "category's main topic")]
[('p', 'http://www.wikidata.org/prop/direct/P466'), ('pname', 'occupant')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('pname', 'has part')]
[('p', 'http://www.wikidata.org/prop/direct/P61'), ('pname', 'discoverer or inventor')]
[('p', 'http://www.wikidata.org/prop/direct/P749'), ('pname', 'parent organization')]
[('p', 'http://www.wikidata.org/prop/direct/P859'), ('pname', 'sponsor')]
[('p', 'http://www.wikidata.org/prop/dire

20

In [None]:
# P127 --> owned by

In [None]:
# Find all the companies owned by a company (here: owned by a Canadian company)
# Requiring ?c to be an instance of Business keeps only companies among the owned entities

In [30]:
queryString = """
SELECT DISTINCT (GROUP_CONCAT(DISTINCT ?companyname ; separator = ", ") AS ?listcompanies) ?ob ?companyn
WHERE { 


   ?company wdt:P17 wd:Q16  ; 
            wdt:P31 wd:Q4830453 .
   ?c wdt:P31 wd:Q4830453 ;
      wdt:P127 ?company .
   



?c <http://schema.org/name> ?companyname .
wdt:P127 <http://schema.org/name> ?ob .
?company  <http://schema.org/name> ?companyn .
} 
 
LIMIT 200
"""

print("Results")
run_query(queryString)

Results
[('listcompanies', 'Purolator Courier'), ('ob', 'owned by'), ('companyn', 'Canada Post')]
[('listcompanies', 'City National Bank'), ('ob', 'owned by'), ('companyn', 'Royal Bank of Canada')]
[('listcompanies', 'Sunrise Senior Living'), ('ob', 'owned by'), ('companyn', 'Revera')]
[('listcompanies', 'High Fidelity HDTV'), ('ob', 'owned by'), ('companyn', 'Blue Ant Media')]
[('listcompanies', 'LightLife'), ('ob', 'owned by'), ('companyn', 'Maple Leaf Foods')]
[('listcompanies', 'Carestream Health, Parkdean Resorts, SIG Combibloc GmbH, Tomkins plc'), ('ob', 'owned by'), ('companyn', 'Onex Corporation')]
[('listcompanies', 'ClearBank'), ('ob', 'owned by'), ('companyn', 'Clearwater Fine Foods')]
[('listcompanies', 'Banco Wiese Sudameris, Scotiabank Perú, Tangerine'), ('ob', 'owned by'), ('companyn', 'Scotiabank')]
[('listcompanies', 'Atlantic Broadband'), ('ob', 'owned by'), ('companyn', 'Cogeco')]
[('listcompanies', 'OMERS Infrastructure European Holdings'), ('ob', 'owned by'), ('com

59
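
The property list above also contains `wdt:P1830` (owner of), which points in the opposite direction from `wdt:P127` (owned by). Ownership may be recorded on either side, so a query that uses only `wdt:P127` can miss some pairs. The sketch below, which has not been run against the endpoint, combines both directions with `UNION`.

```python
queryString = """
SELECT DISTINCT ?company ?companyn ?c ?companyname
WHERE {
  ?company wdt:P17 wd:Q16 ;
           wdt:P31 wd:Q4830453 ;
           sc:name ?companyn .
  ?c wdt:P31 wd:Q4830453 ;
     sc:name ?companyname .

  # ownership stated on the owned side or on the owner side
  { ?c wdt:P127 ?company } UNION { ?company wdt:P1830 ?c }
}
LIMIT 200
"""

# In the notebook this would be executed with run_query(queryString).
print(queryString)
```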

In [178]:
#
#

# Workload 3

In [179]:
# 3. Is there some company that owns companies in other countries?

In [None]:
# Reusing the BGP above for Canadian companies, we look for companies that own companies in another country (?country different from Canada, wd:Q16)

In [None]:
# Returning ?countryname lets us check that the owned companies are indeed outside Canada

In [31]:
queryString = """
SELECT DISTINCT ?c ?companyname ?countryname ?ob ?companyn ?cont
WHERE { 

?c wdt:P127 ?company ;        # ?c is a company owned by ?company (a Canadian business)
   wdt:P17 ?country ;         # ?country is the country of the owned company; a value other than wd:Q16 (Canada) confirms foreign ownership
   wdt:P31 wd:Q4830453 .
?company wdt:P17 wd:Q16  ;  # ?company is a Canadian business
         wdt:P31 wd:Q4830453 .

?c        <http://schema.org/name> ?companyname .
?country  <http://schema.org/name> ?countryname .
wdt:P127  <http://schema.org/name> ?ob .
?company  <http://schema.org/name> ?companyn .
wd:Q16    <http://schema.org/name> ?cont .

FILTER(?country != wd:Q16)
} 
 
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('c', 'http://www.wikidata.org/entity/Q6412013'), ('companyname', 'King Rollo Films'), ('countryname', 'United Kingdom'), ('ob', 'owned by'), ('companyn', 'Cookie Jar Group'), ('cont', 'Canada')]
[('c', 'http://www.wikidata.org/entity/Q17069264'), ('companyname', 'Mindjet'), ('countryname', 'United States of America'), ('ob', 'owned by'), ('companyn', 'Corel'), ('cont', 'Canada')]
[('c', 'http://www.wikidata.org/entity/Q195997'), ('companyname', 'Parallels'), ('countryname', 'United States of America'), ('ob', 'owned by'), ('companyn', 'Corel'), ('cont', 'Canada')]
[('c', 'http://www.wikidata.org/entity/Q696884'), ('companyname', 'Ulead Systems'), ('countryname', 'Taiwan'), ('ob', 'owned by'), ('companyn', 'Corel'), ('cont', 'Canada')]
[('c', 'http://www.wikidata.org/entity/Q2723882'), ('companyname', 'Saks Fifth Avenue'), ('countryname', 'United States of America'), ('ob', 'owned by'), ('companyn', "Hudson's Bay Company"), ('cont', 'Canada')]
[('c', 'http://www.wikidata.org/e

20

In [None]:
# ASK: Is there some company that owns companies in other countries?
# Here restricted to: 'Is there some Canadian company that owns companies in other countries?'

In [201]:
queryString = """
ASK
WHERE { 

?c wdt:P127 ?company ;
   wdt:P17 ?country ;
   wdt:P31 wd:Q4830453 .
?company wdt:P17 wd:Q16  ;  # ?company from Canadian business 
         wdt:P31 wd:Q4830453 .

?c <http://schema.org/name> ?companyname .
?country <http://schema.org/name> ?countryname .
wdt:P127 <http://schema.org/name> ?ob .
?company  <http://schema.org/name> ?companyn .

FILTER(?country != wd:Q16)
} 
 
"""

print("Results")
run_ask_query(queryString)

Results


{'head': {'link': []}, 'boolean': True}
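
`run_ask_query` returns the whole response dictionary, and returns `None` when the request fails. A small sketch for extracting just the yes/no answer is given below; `ask_to_bool` is a hypothetical helper name.

```python
def ask_to_bool(ask_result):
    """Return the boolean answer of an ASK query, or None if the query failed."""
    if ask_result is None:          # run_ask_query returns None after an exception
        return None
    return bool(ask_result["boolean"])


# The response shown above, copied as a Python literal.
response = {'head': {'link': []}, 'boolean': True}
print(ask_to_bool(response))  # True
```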

In [4]:
#

# Workload 4

In [203]:
# 4. Companies have different 'legal forms'; compare the number of companies across the different legal forms

In [None]:
# Analyzing company properties to find 'legal form'

In [206]:
queryString = """
SELECT DISTINCT ?p ?pname
WHERE { 

?c wdt:P17 wd:Q16  ;
   wdt:P31 wd:Q4830453  ;
   ?p  ?o .

?p <http://schema.org/name> ?pname .

FILTER REGEX(?pname, "form")
} 
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1454'), ('pname', 'legal form')]
[('p', 'http://www.wikidata.org/prop/direct/P1670'), ('pname', 'Canadiana Authorities ID (former scheme)')]
[('p', 'http://www.wikidata.org/prop/direct/P740'), ('pname', 'location of formation')]
[('p', 'http://www.wikidata.org/prop/direct/P9138'), ('pname', 'service status information URL')]
[('p', 'http://www.wikidata.org/prop/direct/P4354'), ('pname', 'search formatter URL')]


5

In [None]:
# Analyzing 'legal form' --> wdt:P1454

In [213]:
queryString = """
SELECT DISTINCT ?lf ?legalform
WHERE { 

?c wdt:P31 wd:Q4830453  ;
   wdt:P1454  ?lf .

?lf <http://schema.org/name> ?legalform .

} 
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('lf', 'http://www.wikidata.org/entity/Q20746571'), ('legalform', 'public company limited by shares')]
[('lf', 'http://www.wikidata.org/entity/Q18340464'), ('legalform', 'sociedade anónima')]
[('lf', 'http://www.wikidata.org/entity/Q2624661'), ('legalform', 'Sp. z o.o.')]
[('lf', 'http://www.wikidata.org/entity/Q17990971'), ('legalform', 'public enterprise')]
[('lf', 'http://www.wikidata.org/entity/Q1780031'), ('legalform', 'Kommanditgesellschaft')]
[('lf', 'http://www.wikidata.org/entity/Q1518608'), ('legalform', 'limited liability company')]
[('lf', 'http://www.wikidata.org/entity/Q510785'), ('legalform', 'association without lucrative purpose')]
[('lf', 'http://www.wikidata.org/entity/Q64699475'), ('legalform', 'spółka akcyjna')]
[('lf', 'http://www.wikidata.org/entity/Q16917171'), ('legalform', 'private limited company')]
[('lf', 'http://www.wikidata.org/entity/Q11900271'), ('legalform', 'valtion liikelaitos')]


10

In [None]:
# Looking at the 'legal form' values linked to Business, we find 436 kinds of legal forms.
# For this reason it is more interesting to analyze only some of them.
# We can restrict the businesses to U.K. companies (node wd:Q145)
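
The number of distinct legal forms quoted above can be reproduced with a counting query. The following is a sketch that has not been run here, so the value it returns may differ from the one quoted.

```python
queryString = """
SELECT (COUNT(DISTINCT ?lf) AS ?num_legal_forms)
WHERE {
  ?c wdt:P31 wd:Q4830453 ;
     wdt:P1454 ?lf .
}
"""

# In the notebook this would be executed with run_query(queryString).
print(queryString)
```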

In [4]:
queryString = """
SELECT DISTINCT ?lf ?legalform (count(?c) as ?num_companies)
WHERE { 

?c wdt:P31 wd:Q4830453  ;
   wdt:P17 wd:Q145;
   wdt:P1454  ?lf .

?lf <http://schema.org/name> ?legalform .

} 
ORDER BY DESC (?num_companies)
LIMIT 100
"""

print("Results")
run_query(queryString)

Results
[('lf', 'http://www.wikidata.org/entity/Q5225895'), ('legalform', 'public limited company'), ('num_companies', '251')]
[('lf', 'http://www.wikidata.org/entity/Q6832945'), ('legalform', 'private company limited by shares'), ('num_companies', '220')]
[('lf', 'http://www.wikidata.org/entity/Q1589009'), ('legalform', 'privately held company'), ('num_companies', '48')]
[('lf', 'http://www.wikidata.org/entity/Q33685'), ('legalform', 'limited company'), ('num_companies', '46')]
[('lf', 'http://www.wikidata.org/entity/Q891723'), ('legalform', 'public company'), ('num_companies', '40')]
[('lf', 'http://www.wikidata.org/entity/Q134161'), ('legalform', 'joint-stock company'), ('num_companies', '37')]
[('lf', 'http://www.wikidata.org/entity/Q18624259'), ('legalform', 'private limited liability company'), ('num_companies', '32')]
[('lf', 'http://www.wikidata.org/entity/Q105162507'), ('legalform', 'Registered Society'), ('num_companies', '13')]
[('lf', 'http://www.wikidata.org/entity/Q658255

46

In [None]:
# To list every legal form with its number of companies (all companies worldwide), we can run the query below

In [7]:
queryString = """
SELECT DISTINCT ?lf ?legalform (count(?c) as ?num_companies)
WHERE { 

?c wdt:P31 wd:Q4830453  ;
   wdt:P1454  ?lf .

?lf <http://schema.org/name> ?legalform .

} 
ORDER BY DESC (?num_companies)
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('lf', 'http://www.wikidata.org/entity/Q15646299'), ('legalform', 'společnost s ručením omezeným'), ('num_companies', '13059')]
[('lf', 'http://www.wikidata.org/entity/Q3742494'), ('legalform', 'akciová společnost'), ('num_companies', '5551')]
[('lf', 'http://www.wikidata.org/entity/Q134161'), ('legalform', 'joint-stock company'), ('num_companies', '5168')]
[('lf', 'http://www.wikidata.org/entity/Q460178'), ('legalform', 'GmbH'), ('num_companies', '2823')]
[('lf', 'http://www.wikidata.org/entity/Q15649019'), ('legalform', 'Spoločnosť s ručením obmedzeným'), ('num_companies', '2302')]
[('lf', 'http://www.wikidata.org/entity/Q891723'), ('legalform', 'public company'), ('num_companies', '1407')]
[('lf', 'http://www.wikidata.org/entity/Q15042660'), ('legalform', 'aksjeselskap'), ('num_companies', '1312')]
[('lf', 'http://www.wikidata.org/entity/Q1480166'), ('legalform', 'kabushiki gaisha'), ('num_companies', '1228')]
[('lf', 'http://www.wikidata.org/entity/Q166280'), ('legalform',

10
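
The information need asks for a comparison across the U.K., the U.S., and Canada, while the queries above cover either the U.K. alone or the whole world. One way to get the three countries side by side is a `VALUES` clause with grouping by country and legal form. This is a sketch under that assumption and has not been run against the endpoint.

```python
queryString = """
SELECT ?countryname ?legalform (COUNT(DISTINCT ?c) AS ?num_companies)
WHERE {
  VALUES ?country { wd:Q16 wd:Q145 wd:Q30 }   # Canada, U.K., U.S.A.
  ?c wdt:P31 wd:Q4830453 ;
     wdt:P17 ?country ;
     wdt:P1454 ?lf .
  ?lf sc:name ?legalform .
  ?country sc:name ?countryname .
}
GROUP BY ?countryname ?legalform
ORDER BY ?countryname DESC(?num_companies)
"""

# In the notebook this would be executed with run_query(queryString).
print(queryString)
```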

In [None]:
#
#

# Workload 5

In [None]:
# 5. Analyze the number of employees and other relevant numeric attributes

## Workload 5.1

In [None]:
# 5.1 What are the top-10 companies for a given attribute?

In [32]:
# Choosing wdt:P2295 (net profit) as the attribute, we look for the top-10 companies from the U.S.A., the U.K., and Canada

In [None]:
# FROM GIVEN TABLE
# wd:Q16	Canada	node
# wd:Q145	U.K.	node
# wd:Q30	U.S.A.	node

In [51]:
queryString = """
SELECT DISTINCT  ?compname  ?netprofit 
WHERE { 

?comp  wdt:P2295 ?netprofit;
       wdt:P17 ?contry ;
       wdt:P31 wd:Q4830453  . 

?comp <http://schema.org/name> ?compname .
?country <http://schema.org/name> ?countryname .

FILTER ((?country = wd:Q16 or ?country = wd:Q145 or ?country = wd:Q30) && isNumeric(?netprofit))
} 
ORDER BY DESC (?netprofit)
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('compname', 'Samsung Electronics'), ('netprofit', '22730000000000')]
[('compname', 'Toyota'), ('netprofit', '2312694000000')]
[('compname', 'SoftBank'), ('netprofit', '1237812000000')]
[('compname', 'Sberbank of Russia'), ('netprofit', '760300000000')]
[('compname', 'Central Japan Railway Company'), ('netprofit', '391624000000')]
[('compname', 'Saudi Aramco'), ('netprofit', '330693000000')]
[('compname', 'Industrial and Commercial Bank of China'), ('netprofit', '317685000000')]
[('compname', 'East Japan Railway Company'), ('netprofit', '279435000000')]
[('compname', 'MOL Group'), ('netprofit', '251655000000')]
[('compname', 'AsiaCell'), ('netprofit', '243618000000')]


10

In [52]:
queryString = """
SELECT DISTINCT ?compname ?numEmployees
WHERE { 

?comp  wdt:P1128 ?numEmployees ;
       wdt:P17 ?contry ;
       wdt:P31 wd:Q4830453  . 

?comp <http://schema.org/name> ?compname .
?country <http://schema.org/name> ?countryname .

FILTER ((?country = wd:Q16 or ?country = wd:Q145 or ?country = wd:Q30) && isNumeric(?numEmployees))
} 
ORDER BY DESC (?numEmployees)
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('compname', 'Walmart'), ('numEmployees', '2500000')]
[('compname', 'China National Petroleum Corporation'), ('numEmployees', '1589508')]
[('compname', 'State Grid Corporation of China'), ('numEmployees', '1581000')]
[('compname', 'Amazon'), ('numEmployees', '1298000')]
[('compname', 'Randstad N.V.'), ('numEmployees', '658580')]
[('compname', 'Foxconn'), ('numEmployees', '618460')]
[('compname', 'G4S'), ('numEmployees', '618260')]
[('compname', 'RAO UES'), ('numEmployees', '577000')]
[('compname', 'Deutsche Post AG'), ('numEmployees', '547459')]
[('compname', 'PetroChina Company Limited'), ('numEmployees', '521566')]


10
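
Both result lists above contain companies that are not from the three target countries, such as Samsung Electronics, Toyota, or Deutsche Post AG. The likely cause is a variable mismatch: the triple pattern binds `?contry`, while the `FILTER` tests `?country`, which is bound only by the separate name lookup, so the country restriction never reaches the company. The `or` keyword is also not part of standard SPARQL, where the operator is `||`. A possible corrected form, not run against the endpoint, uses a single `?country` variable and a `VALUES` clause; `top10_query` is a hypothetical helper name.

```python
def top10_query(prop, value_var):
    """Build a top-10 query for one numeric property, limited to CA/UK/US businesses."""
    return f"""
SELECT DISTINCT ?compname ?countryname {value_var}
WHERE {{
  VALUES ?country {{ wd:Q16 wd:Q145 wd:Q30 }}   # Canada, U.K., U.S.A.
  ?comp {prop} {value_var} ;
        wdt:P17 ?country ;                      # same variable as in VALUES
        wdt:P31 wd:Q4830453 ;
        sc:name ?compname .
  ?country sc:name ?countryname .
  FILTER (isNumeric({value_var}))
}}
ORDER BY DESC({value_var})
LIMIT 10
"""


print(top10_query("wdt:P2295", "?netprofit"))
```

Calling `top10_query("wdt:P1128", "?numEmployees")` gives the corresponding query for the number of employees.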

In [None]:
#
#

## Workload 5.2

In [None]:
# 5.2 For which companies is some form of income, market capitalization, or total assets defined?
# What are the min, max, and average in each category and country?
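
No query for this goal is recorded in the notebook. One possible starting point, sketched below and not run against the endpoint, aggregates a numeric property per country with `MIN`, `MAX`, and `AVG`. It uses only the two income-related properties already seen in Workload 1, `wdt:P2139` (total revenue) and `wdt:P2295` (net profit). Properties for market capitalization or total assets would first have to be discovered with a query, as the general instructions require. `stats_query` is a hypothetical helper name.

```python
def stats_query(prop):
    """Build a per-country min/max/avg query for one numeric property."""
    return f"""
SELECT ?countryname (COUNT(DISTINCT ?comp) AS ?num_companies)
       (MIN(?value) AS ?min_value) (MAX(?value) AS ?max_value) (AVG(?value) AS ?avg_value)
WHERE {{
  VALUES ?country {{ wd:Q16 wd:Q145 wd:Q30 }}   # Canada, U.K., U.S.A.
  ?comp wdt:P31 wd:Q4830453 ;
        wdt:P17 ?country ;
        {prop} ?value .
  ?country sc:name ?countryname .
  FILTER (isNumeric(?value))
}}
GROUP BY ?countryname
"""


# Properties already used in Workload 1: total revenue and net profit.
for prop in ("wdt:P2139", "wdt:P2295"):
    print(stats_query(prop))
```

Note that the stored amounts may be expressed in different currencies, so the aggregates are only indicative unless the values are normalized first.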

## Workload 5.3

In [None]:
# 5.3 Which business in each country owns the most businesses in other countries?

In [3]:
queryString = """
SELECT DISTINCT ?c ?companyname ?countryname ?ob ?companyn ?cont
WHERE { 

?c wdt:P127 ?company ;        # ?c is a company owned by ?company
   wdt:P17 ?country ;         # ?country is the country of the owned company
   wdt:P31 wd:Q4830453 .
?company wdt:P17 ?co  ;  # ?co is the country of the owning company (any country, not only Canada)
         wdt:P31 wd:Q4830453 .

?c        <http://schema.org/name> ?companyname .
?country  <http://schema.org/name> ?countryname .
wdt:P127  <http://schema.org/name> ?ob .
?company  <http://schema.org/name> ?companyn .
?co    <http://schema.org/name> ?cont .

FILTER(?country != ?co)
} 

LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('c', 'http://www.wikidata.org/entity/Q223191'), ('companyname', 'Adria Airways'), ('countryname', 'Slovenia'), ('ob', 'owned by'), ('companyn', '4K KNDNS'), ('cont', 'Germany')]
[('c', 'http://www.wikidata.org/entity/Q4653345'), ('companyname', 'APM Terminals B.V'), ('countryname', 'Netherlands'), ('ob', 'owned by'), ('companyn', 'Maersk'), ('cont', 'Denmark')]
[('c', 'http://www.wikidata.org/entity/Q15919712'), ('companyname', 'HSBC Philippines'), ('countryname', 'Philippines'), ('ob', 'owned by'), ('companyn', 'HSBC'), ('cont', 'United Kingdom')]
[('c', 'http://www.wikidata.org/entity/Q1381966'), ('companyname', 'Flipkart'), ('countryname', 'India'), ('ob', 'owned by'), ('companyn', 'Walmart'), ('cont', 'United States of America')]
[('c', 'http://www.wikidata.org/entity/Q161906'), ('companyname', 'Oakley, Inc.'), ('countryname', 'United States of America'), ('ob', 'owned by'), ('companyn', 'Luxottica'), ('cont', 'Italy')]
[('c', 'http://www.wikidata.org/entity/Q17006657'), 

20

In [4]:
# Now I count the businesses owned by a company in another country

In [5]:
queryString = """
SELECT DISTINCT  ?companyn ?cont (count(?c) as ?owns_num)
WHERE { 


?c wdt:P127 ?company ;        # ?c is a company owned by ?company
   wdt:P17 ?country ;         # ?country is the country of the owned company
   wdt:P31 wd:Q4830453 .
?company wdt:P17 ?co  ;  # ?co is the country of the owning company (any country, not only Canada)
         wdt:P31 wd:Q4830453 .


?c        <http://schema.org/name> ?companyname .
?country  <http://schema.org/name> ?countryname .
?company  <http://schema.org/name> ?companyn .
?co    <http://schema.org/name> ?cont .

FILTER(?country != ?co)
} 
ORDER BY DESC (?owns_num)
LIMIT 20
"""

print("Results")
run_query(queryString)

Results
[('companyn', 'BlackRock'), ('cont', 'United States of America'), ('owns_num', '46')]
[('companyn', 'AG für Verkehrswesen'), ('cont', 'German Reich'), ('owns_num', '23')]
[('companyn', 'Orange'), ('cont', 'France'), ('owns_num', '18')]
[('companyn', 'Modern Times Group'), ('cont', 'Sweden'), ('owns_num', '17')]
[('companyn', 'MTN Group'), ('cont', 'South Africa'), ('owns_num', '15')]
[('companyn', 'Bauer Media Group'), ('cont', 'Germany'), ('owns_num', '15')]
[('companyn', 'Communicorp'), ('cont', 'United Kingdom'), ('owns_num', '15')]
[('companyn', 'MRG-Radio'), ('cont', 'Russia'), ('owns_num', '15')]
[('companyn', 'HSBC'), ('cont', 'United Kingdom'), ('owns_num', '14')]
[('companyn', 'Capital Group Companies'), ('cont', 'United States of America'), ('owns_num', '12')]
[('companyn', 'Stellantis'), ('cont', 'Netherlands'), ('owns_num', '11')]
[('companyn', 'Heineken'), ('cont', 'Netherlands'), ('owns_num', '11')]
[('companyn', 'Carlsberg Group'), ('cont', 'Denmark'), ('owns_num

20
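
The ranking above is global: it orders all owners together rather than picking one per country. Because the query has a `LIMIT 20`, a country's top owner may also be missing from the list altogether. A minimal sketch of the per-country step in Python is given below. It assumes the rows have first been collected as dictionaries keyed by variable name (which `run_query` does not do by itself) and that the `LIMIT` was removed or raised; `top_owner_per_country` is a hypothetical helper name.

```python
def top_owner_per_country(rows):
    """Keep, for each owner country, the company with the highest count."""
    best = {}
    for row in rows:
        country = row["cont"]
        count = int(row["owns_num"])          # the endpoint returns numbers as strings
        if country not in best or count > best[country][1]:
            best[country] = (row["companyn"], count)
    return best


# A few rows copied from the output above.
rows = [
    {"companyn": "BlackRock", "cont": "United States of America", "owns_num": "46"},
    {"companyn": "Communicorp", "cont": "United Kingdom", "owns_num": "15"},
    {"companyn": "HSBC", "cont": "United Kingdom", "owns_num": "14"},
    {"companyn": "Capital Group Companies", "cont": "United States of America", "owns_num": "12"},
]
print(top_owner_per_country(rows))
```

On ties the first company encountered is kept, so the input order decides which one is reported.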