# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained from the previous ones.

An exploratory workload starts with a usually vague, open-ended question and does not assume that the person issuing it has a clear understanding of the data in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook
3. Do not delete useless queries; keep everything that is syntactically valid

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.
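
As a minimal sketch of how this BGP can be embedded in a complete query, the helper below builds a SELECT string that looks up the name of a single IRI. The function name `label_query` is hypothetical and not part of the notebook; it assumes that the `wd:` and `wdt:` prefixes from the setup cell are prepended before the query is executed.

```python
def label_query(iri):
    """Build a SELECT query returning the human-readable name of one IRI.

    `iri` is a prefixed name such as "wdt:P31"; the prefix declarations are
    assumed to be added by the caller (for example by run_query).
    """
    return f"""
SELECT ?name
WHERE {{
    {iri} <http://schema.org/name> ?name .
}}
"""

queryString = label_query("wdt:P31")
print(queryString)
```

The resulting string can be passed to `run_query` once the setup cell has been executed.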
    
    

In [44]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-7fe02a57ad-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# SELECT and CONSTRUCT queries
def run_query(queryString):
    """Run a query, print one line per result row, and return the row count."""
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try:
        results = sparql.query()
        json_results = results.convert()
        bindings_list = json_results['results']['bindings']
        if len(bindings_list) == 0:
            print("Empty")
            return 0

        for bindings in bindings_list:
            print([(var, value['value']) for var, value in bindings.items()])

        return len(bindings_list)

    except Exception as e:
        # On failure the error is printed and the function returns None
        print("The operation failed", e)


# ASK queries
def run_ask_query(queryString):
    """Run an ASK query and return the converted JSON response."""
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try:
        return sparql.query().convert()

    except Exception as e:
        # On failure the error is printed and the function returns None
        print("The operation failed", e)
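
`run_query` only prints its rows, so results cannot be reused in later cells. The sketch below shows one way to flatten the JSON structure that `run_query` iterates over into plain dictionaries. The helper name `bindings_to_rows` is hypothetical, and the sample is hand-written in the same shape (`results` → `bindings` → `value`) rather than fetched from the endpoint.

```python
def bindings_to_rows(json_results):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in json_results["results"]["bindings"]
    ]


# Hand-written sample mimicking the shape of a converted JSON response
sample = {
    "results": {
        "bindings": [
            {
                "p": {"value": "http://www.wikidata.org/prop/direct/P1128"},
                "plbl": {"value": "employees"},
            },
        ]
    }
}

rows = bindings_to_rows(sample)
print(rows)
```

Keeping rows as dictionaries makes it possible to post-process a result (grouping, counting, filtering) in Python without re-running the query.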


# Companies Workflow Series ("IT Companies explorative search") 

Consider the following exploratory information need:

> Compare companies across different sectors in U.K., U.S., and Canada, consider number of employees, companies owned or acquired, and revenue or assets

## Useful URIs for the current workflow


The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P279`    | subclass      | predicate |
| `wdt:P17`      | country       | predicate |
| `wd:Q4830453`  | Business      | node      |
| `wd:Q13977`    | Bloomberg L.P.| node |
| `wd:Q502121`   | BlackBerry    | node |
| `wd:Q16`        | Canada        | node |
| `wd:Q145`      | U.K.          | node |
| `wd:Q30`       | U.S.A.        | node |


Also consider

```
?p wdt:P17 wd:Q16  . 
?p wdt:P31 wd:Q4830453  . 
```

is the BGP to retrieve all **Canadian businesses**.
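
Since the information need covers three countries, the same BGP can be generalized with a `VALUES` clause. The following is a sketch under the assumption that the endpoint supports SPARQL 1.1 `VALUES`; the helper name `businesses_by_country_query` is hypothetical, and the country IRIs are the ones listed in the table of given IRIs.

```python
# Country nodes from the table of given IRIs: Canada, U.K., U.S.A.
COUNTRIES = ["wd:Q16", "wd:Q145", "wd:Q30"]


def businesses_by_country_query(countries):
    """Build a query counting businesses for each of the given country IRIs."""
    values = " ".join(countries)
    return f"""
SELECT ?country (COUNT(DISTINCT ?p) AS ?num)
WHERE {{
    VALUES ?country {{ {values} }}
    ?p wdt:P17 ?country .
    ?p wdt:P31 wd:Q4830453 .
}}
GROUP BY ?country
"""


queryString = businesses_by_country_query(COUNTRIES)
print(queryString)
```

Binding `?country` with `VALUES` keeps a single query for all three countries instead of three nearly identical copies.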

## Workload Goals

1. Identify the BGP for obtaining number of employees of a company and other relevant numerical attributes

2. Identify the BGP to retrieve all companies owned by a company

3. Is there some company that owns companies in other countries?

4. Companies have different 'legal forms'. Compare the number of companies across the different legal forms

5. Analyze the number of employees and other relevant numeric attributes
 
   5.1 What are the top-10 companies for a given attribute?
   
   5.2 For which companies is some form of income, market capitalization, or total assets defined? What are the min, max, and average in each category and country?
   
   5.3 Which business in each country owns the most businesses in other countries?
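
Goal 5.1 has the same shape for every attribute, so it can be sketched as a query template with the property as a parameter. The helper name `top_k_query` is hypothetical. The property must still be discovered with a query first; the example value `wdt:P1128` is the `employees` property that the workflow finds further down.

```python
def top_k_query(prop, k=10):
    """Build a query ranking businesses by the value of one numeric property."""
    return f"""
SELECT ?company ?name ?value
WHERE {{
    ?company wdt:P31 wd:Q4830453 ;
             {prop} ?value .
    ?company <http://schema.org/name> ?name .
}}
ORDER BY DESC(?value)
LIMIT {k}
"""


queryString = top_k_query("wdt:P1128")
print(queryString)
```

A company with several values for the same property appears once per value in this sketch; aggregating with `MAX(?value)` per company would be one way to avoid that.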


In [1]:
# start your workflow here

In [2]:
queryString = """
SELECT COUNT(*)
WHERE { 

?p wdt:P17 wd:Q16  . 
?p wdt:P31 wd:Q4830453  . 
} 
"""

print("Results")
run_query(queryString)

Results
[('callret-0', '2482')]


1

In [5]:
# get the predicates used by instances of "business"

queryString = """
SELECT DISTINCT ?p ?plbl 
WHERE { 

?s wdt:P31 wd:Q4830453 ;
   ?p ?o .

?p <http://schema.org/name> ?plbl .

} 
GROUP BY ?p ?plbl
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P6518'), ('plbl', 'Business Registry code (Estonia)')]
[('p', 'http://www.wikidata.org/prop/direct/P131'), ('plbl', 'located in the administrative territorial entity')]
[('p', 'http://www.wikidata.org/prop/direct/P2186'), ('plbl', 'Wiki Loves Monuments ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2671'), ('plbl', 'Google Knowledge Graph ID')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('plbl', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P3417'), ('plbl', 'Quora topic ID')]
[('p', 'http://www.wikidata.org/prop/direct/P373'), ('plbl', 'Commons category')]
[('p', 'http://www.wikidata.org/prop/direct/P5508'), ('plbl', 'archINFORM person or group ID')]
[('p', 'http://www.wikidata.org/prop/direct/P968'), ('plbl', 'email address')]
[('p', 'http://www.wikidata.org/prop/direct/P5531'), ('plbl', 'Central Index Key')]
[('p', 'http://www.wikidata.org/prop/direct/P3418'), ('plbl', 'Google Play Store app ID')]
[('p', '

1659

In [10]:
# get the predicates used by "Business" instances, excluding those whose label contains "ID", "username", or "code"

queryString = """
SELECT DISTINCT ?p ?plbl 
WHERE { 

?s wdt:P31 wd:Q4830453 ;
   ?p ?o .

?p <http://schema.org/name> ?plbl .

FILTER (!CONTAINS (?plbl, "ID") && !CONTAINS (?plbl, "username") && !CONTAINS (?plbl, "code"))

} 
GROUP BY ?p ?plbl
ORDER BY ?plbl
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P2179'), ('plbl', 'ACM Classification Code (2012)')]
[('p', 'http://www.wikidata.org/prop/direct/P2472'), ('plbl', 'ACMA Radiocommunications Client Number')]
[('p', 'http://www.wikidata.org/prop/direct/P6269'), ('plbl', 'API endpoint')]
[('p', 'http://www.wikidata.org/prop/direct/P4162'), ('plbl', 'AUR package')]
[('p', 'http://www.wikidata.org/prop/direct/P1661'), ('plbl', 'Alexa rank')]
[('p', 'http://www.wikidata.org/prop/direct/P8895'), ('plbl', 'All the Tropes identifier')]
[('p', 'http://www.wikidata.org/prop/direct/P5749'), ('plbl', 'Amazon Standard Identification Number')]
[('p', 'http://www.wikidata.org/prop/direct/P3454'), ('plbl', 'Arch Linux package')]
[('p', 'http://www.wikidata.org/prop/direct/P7772'), ('plbl', 'Atlas Obscura identifier')]
[('p', 'http://www.wikidata.org/prop/direct/P3548'), ('plbl', 'Australian Business Number')]
[('p', 'http://www.wikidata.org/prop/direct/P3549'), ('plbl', 'Australian Company Number')]

718
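
The output still contains labels such as `ACM Classification Code (2012)` and `All the Tropes identifier`, because `CONTAINS` is case-sensitive and the filter does not cover the word "identifier". A possible refinement, sketched below with a hypothetical helper `exclusion_filter`, keeps `"ID"` case-sensitive (lowercasing it would also match words like "video") and compares the other terms against `LCASE(?plbl)`.

```python
def exclusion_filter(var, case_sensitive, case_insensitive):
    """Build a FILTER that rejects labels containing any of the given terms."""
    parts = [f'!CONTAINS({var}, "{term}")' for term in case_sensitive]
    parts += [
        f'!CONTAINS(LCASE({var}), "{term.lower()}")' for term in case_insensitive
    ]
    return "FILTER (" + " && ".join(parts) + ")"


clause = exclusion_filter("?plbl", ["ID"], ["code", "username", "identifier"])
print(clause)

# The same mismatch in plain Python: a case-sensitive test misses "Code"
label = "ACM Classification Code (2012)"
print("code" in label, "code" in label.lower())
```

The printed clause can replace the `FILTER` line of the query; whether more terms (for example "number") should be excluded is a judgment call that depends on which attributes are of interest.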

In [11]:
# get employees from BlackBerry

queryString = """
SELECT DISTINCT ?o ?olbl
WHERE { 

wd:Q502121 wdt:P1128 ?o .

?o <http://schema.org/name> ?olbl .

} 
ORDER BY ?olbl
"""

print("Results")
run_query(queryString)

Results
Empty


0
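
A plausible reason for the empty result: if the values of `wdt:P1128` are plain numbers (literals), they can never be the subject of the `<http://schema.org/name>` triple, so the mandatory label pattern removes every row. The sketch below, with a hypothetical helper `values_with_optional_label`, makes the label optional so that literal values are still returned.

```python
def values_with_optional_label(subject, prop):
    """Build a query returning the values of a property, with a label if one exists."""
    return f"""
SELECT DISTINCT ?o ?olbl
WHERE {{
    {subject} {prop} ?o .
    OPTIONAL {{ ?o <http://schema.org/name> ?olbl . }}
}}
ORDER BY ?o
"""


queryString = values_with_optional_label("wd:Q502121", "wdt:P1128")
print(queryString)
```

With this shape, `?olbl` is simply unbound for literal values, while entity-valued properties still show their names.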

In [13]:
# get the predicates used by "Business" instances whose label contains the string "employee"

queryString = """
SELECT DISTINCT ?p ?plbl 
WHERE { 

?s wdt:P31 wd:Q4830453 ;
   ?p ?o .

?p <http://schema.org/name> ?plbl .

FILTER (CONTAINS (?plbl, "employee"))

} 
GROUP BY ?p ?plbl
ORDER BY ?plbl
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P4195'), ('plbl', 'category for employees of the organization')]
[('p', 'http://www.wikidata.org/prop/direct/P1128'), ('plbl', 'employees')]


2

In [16]:
# get employees from all Business

queryString = """
SELECT DISTINCT ?o ?olbl
WHERE { 

?s wdt:P31 wd:Q4830453 ;
   wdt:P1128 ?o .

?o <http://schema.org/name> ?olbl .

} 
ORDER BY ?olbl
"""

print("Results")
run_query(queryString)

Results
Empty


0

In [18]:
# get predicates from BlackBerry

queryString = """
SELECT DISTINCT ?p ?plbl (GROUP_CONCAT(?olbl; separator="|") as ?objs)
WHERE { 

wd:Q502121 ?p ?o .

?p <http://schema.org/name> ?plbl .
?o <http://schema.org/name> ?olbl .

} 
GROUP BY ?p ?plbl
ORDER BY ?plbl
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P169'), ('plbl', 'chief executive officer'), ('objs', 'John S. Chen')]
[('p', 'http://www.wikidata.org/prop/direct/P17'), ('plbl', 'country'), ('objs', 'Canada')]
[('p', 'http://www.wikidata.org/prop/direct/P112'), ('plbl', 'founded by'), ('objs', 'Mike Lazaridis')]
[('p', 'http://www.wikidata.org/prop/direct/P159'), ('plbl', 'headquarters location'), ('objs', 'Waterloo')]
[('p', 'http://www.wikidata.org/prop/direct/P452'), ('plbl', 'industry'), ('objs', 'information technology|internet of things')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('plbl', 'instance of'), ('objs', 'software company|business|public company')]
[('p', 'http://www.wikidata.org/prop/direct/P1454'), ('plbl', 'legal form'), ('objs', 'corporation')]
[('p', 'http://www.wikidata.org/prop/direct/P740'), ('plbl', 'location of formation'), ('objs', 'Waterloo')]
[('p', 'http://www.wikidata.org/prop/direct/P463'), ('plbl', 'member of'), ('objs', 'Wi-Fi Alliance|CV

15

In [21]:
# get predicates to BlackBerry

queryString = """
SELECT DISTINCT ?p ?plbl (GROUP_CONCAT(?s; separator=" | ") as ?subjs) (GROUP_CONCAT(?slbl; separator="|") as ?subjslbl)
WHERE { 

?s ?p wd:Q502121 .

?p <http://schema.org/name> ?plbl .
?s <http://schema.org/name> ?slbl .

} 
GROUP BY ?p ?plbl
ORDER BY ?plbl
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P971'), ('plbl', 'category combines topics'), ('subjs', 'http://www.wikidata.org/entity/Q6419054'), ('subjslbl', 'Category:BlackBerry Limited mobile phones')]
[('p', 'http://www.wikidata.org/prop/direct/P301'), ('plbl', "category's main topic"), ('subjs', 'http://www.wikidata.org/entity/Q6805207'), ('subjslbl', 'Category:BlackBerry Limited')]
[('p', 'http://www.wikidata.org/prop/direct/P767'), ('plbl', 'contributor to the creative work or subject'), ('subjs', 'http://www.wikidata.org/entity/Q64720041'), ('subjslbl', 'Perioperative Ischemia Reduction Study (PROSE) Pilot')]
[('p', 'http://www.wikidata.org/prop/direct/P178'), ('plbl', 'developer'), ('subjs', 'http://www.wikidata.org/entity/Q879989 | http://www.wikidata.org/entity/Q880016 | http://www.wikidata.org/entity/Q676052 | http://www.wikidata.org/entity/Q879996 | http://www.wikidata.org/entity/Q2160367 | http://www.wikidata.org/entity/Q2613503 | http://www.wikidata.org/entity/Q169

8

In [23]:
# get predicates and objects from BlackBerry which is instance of "smartphone model series"

queryString = """
SELECT DISTINCT ?p ?plbl (GROUP_CONCAT(?olbl; separator="|") as ?objs) ?nodelbl
WHERE { 

wd:Q171819 ?p ?o ;
    wdt:P31 ?node .

?p <http://schema.org/name> ?plbl .
?o <http://schema.org/name> ?olbl .
?node <http://schema.org/name> ?nodelbl .

} 
GROUP BY ?p ?plbl ?nodelbl
ORDER BY ?plbl
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('plbl', 'instance of'), ('objs', 'smartphone model series'), ('nodelbl', 'smartphone model series')]
[('p', 'http://www.wikidata.org/prop/direct/P176'), ('plbl', 'manufacturer'), ('objs', 'BlackBerry'), ('nodelbl', 'smartphone model series')]
[('p', 'http://www.wikidata.org/prop/direct/P138'), ('plbl', 'named after'), ('objs', 'Rubus subg. Rubus'), ('nodelbl', 'smartphone model series')]
[('p', 'http://www.wikidata.org/prop/direct/P306'), ('plbl', 'operating system'), ('objs', 'Android|BlackBerry OS|BlackBerry 10'), ('nodelbl', 'smartphone model series')]
[('p', 'http://www.wikidata.org/prop/direct/P277'), ('plbl', 'programming language'), ('objs', 'C++'), ('nodelbl', 'smartphone model series')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('plbl', 'subclass of'), ('objs', 'cell phone model'), ('nodelbl', 'smartphone model series')]
[('p', 'http://www.wikidata.org/prop/direct/P910'), ('plbl', "topic's main category"), (

7

In [26]:
# get predicates and objects from John S. Chen node

queryString = """
SELECT DISTINCT ?p ?plbl (GROUP_CONCAT(?olbl; separator=" | ") as ?objs)
WHERE { 

wd:Q6256206 ?p ?o .

?p <http://schema.org/name> ?plbl .
?o <http://schema.org/name> ?olbl .

} 
GROUP BY ?p ?plbl ?nodelbl
ORDER BY ?plbl
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P27'), ('plbl', 'country of citizenship'), ('objs', 'United Kingdom')]
[('p', 'http://www.wikidata.org/prop/direct/P69'), ('plbl', 'educated at'), ('objs', 'Brown University | California Institute of Technology | Northfield Mount Hermon School | La Salle College')]
[('p', 'http://www.wikidata.org/prop/direct/P108'), ('plbl', 'employer'), ('objs', 'BlackBerry | Sybase | Siemens Nixdorf Informationssysteme | Unisys')]
[('p', 'http://www.wikidata.org/prop/direct/P734'), ('plbl', 'family name'), ('objs', 'Cheng')]
[('p', 'http://www.wikidata.org/prop/direct/P735'), ('plbl', 'given name'), ('objs', 'John')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('plbl', 'instance of'), ('objs', 'human')]
[('p', 'http://www.wikidata.org/prop/direct/P1412'), ('plbl', 'languages spoken, written or signed'), ('objs', 'English')]
[('p', 'http://www.wikidata.org/prop/direct/P463'), ('plbl', 'member of'), ('objs', 'Committee of 100')]
[('p', 'http://

11

In [31]:
# get objects from "employer" predicate

queryString = """
SELECT DISTINCT ?o ?olbl
WHERE { 

?s wdt:P108 ?o .

?o <http://schema.org/name> ?olbl .

} 
ORDER BY ?olbl
LIMIT 30
"""

print("Results")
run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q21211947'), ('olbl', '"Armbiotechnology" Scientific and Production Center NAS RA')]
[('o', 'http://www.wikidata.org/entity/Q4822213'), ('olbl', '"Aurel Vlaicu" University of Arad')]
[('o', 'http://www.wikidata.org/entity/Q19369275'), ('olbl', '"Filippo del Ponte" Hospital')]
[('o', 'http://www.wikidata.org/entity/Q12287726'), ('olbl', '"Narodna mladezh"')]
[('o', 'http://www.wikidata.org/entity/Q12730502'), ('olbl', '"Sextil Puscariu" Linguistics and Literary History Institute')]
[('o', 'http://www.wikidata.org/entity/Q105084947'), ('olbl', '"Simon Karas" Conservatory')]
[('o', 'http://www.wikidata.org/entity/Q22137571'), ('olbl', '#0')]
[('o', 'http://www.wikidata.org/entity/Q3202290'), ('olbl', "'A'A'")]
[('o', 'http://www.wikidata.org/entity/Q3044478'), ('olbl', "'s Heeren Loo")]
[('o', 'http://www.wikidata.org/entity/Q94015061'), ('olbl', "'s Rijks Ethnographisch Museum")]
[('o', 'http://www.wikidata.org/entity/Q2745889'), ('olbl', "'

30

In [32]:
# get all "employees"

queryString = """
SELECT DISTINCT ?s ?slbl ?o ?olbl
WHERE { 

?s wdt:P1128 ?o .

?s <http://schema.org/name> ?slbl .
?o <http://schema.org/name> ?olbl .

} 
ORDER BY ?olbl
LIMIT 30
"""

print("Results")
run_query(queryString)

Results
Empty


0

In [37]:
# get objects from path "BlackBerry"->"start time"

queryString = """
SELECT ?o ?olbl
WHERE { 

wd:Q502121 wdt:P580 ?o .

?o <http://schema.org/name> ?olbl .

} 
ORDER BY ?olbl
LIMIT 30
"""

print("Results")
run_query(queryString)

Results
Empty


0

In [41]:
# count, for each business, the people linked to it via the "employer" predicate (P108)

queryString = """
SELECT ?business ?buslbl (COUNT(DISTINCT ?person) AS ?emp_num)
WHERE { 

?person wdt:P108 ?business .
?business wdt:P31 wd:Q4830453 .

?business <http://schema.org/name> ?buslbl .

} 
GROUP BY ?business ?buslbl
ORDER BY DESC (?emp_num)
LIMIT 30
"""

print("Results")
run_query(queryString)

Results
[('business', 'http://www.wikidata.org/entity/Q9531'), ('buslbl', 'BBC'), ('emp_num', '1602')]
[('business', 'http://www.wikidata.org/entity/Q1130172'), ('buslbl', 'Mayo Clinic'), ('emp_num', '879')]
[('business', 'http://www.wikidata.org/entity/Q95'), ('buslbl', 'Google'), ('emp_num', '585')]
[('business', 'http://www.wikidata.org/entity/Q37156'), ('buslbl', 'IBM'), ('emp_num', '583')]
[('business', 'http://www.wikidata.org/entity/Q2283'), ('buslbl', 'Microsoft'), ('emp_num', '389')]
[('business', 'http://www.wikidata.org/entity/Q899140'), ('buslbl', 'Genentech'), ('emp_num', '275')]
[('business', 'http://www.wikidata.org/entity/Q206921'), ('buslbl', 'Pfizer'), ('emp_num', '237')]
[('business', 'http://www.wikidata.org/entity/Q312'), ('buslbl', 'Apple Inc.'), ('emp_num', '228')]
[('business', 'http://www.wikidata.org/entity/Q126399'), ('buslbl', 'Warner Bros.'), ('emp_num', '194')]
[('business', 'http://www.wikidata.org/entity/Q731938'), ('buslbl', 'AstraZeneca'), ('emp_num', 

30

In [45]:
# get relevant numeric information for each "Business"
# [number_of_employers; number_of_sub_organizations; number_of_developed_prod; number_of_manufactured_prod]

queryString = """
SELECT ?business ?buslbl (COUNT(DISTINCT ?person) AS ?emp_num) ?subs_num ?devs_num ?manuf_num
WHERE { 

?person wdt:P108 ?business .
?business wdt:P31 wd:Q4830453 .

{
    SELECT ?business (COUNT(DISTINCT ?sub_org) AS ?subs_num)
    WHERE {
    
        ?sub_org wdt:P749 ?business .
        
        ?business wdt:P31 wd:Q4830453 .
    }
    GROUP BY ?business

}

{
    SELECT ?business (COUNT(DISTINCT ?dev) AS ?devs_num)
    WHERE {
    
        ?dev wdt:P178 ?business .
        
        ?business wdt:P31 wd:Q4830453 .
    }
    GROUP BY ?business

}

{
    SELECT ?business (COUNT(DISTINCT ?manuf) AS ?manuf_num)
    WHERE {

        ?manuf wdt:P176 ?business .
        
        ?business wdt:P31 wd:Q4830453 .
    }
    GROUP BY ?business

}

?business <http://schema.org/name> ?buslbl .

} 
GROUP BY ?business ?buslbl ?subs_num ?devs_num ?manuf_num
ORDER BY DESC (?emp_num)
LIMIT 30
"""

print("Results")
run_query(queryString)

Results
[('business', 'http://www.wikidata.org/entity/Q9531'), ('buslbl', 'BBC'), ('emp_num', '1602'), ('subs_num', '23'), ('devs_num', '6'), ('manuf_num', '1')]
[('business', 'http://www.wikidata.org/entity/Q95'), ('buslbl', 'Google'), ('emp_num', '585'), ('subs_num', '34'), ('devs_num', '600'), ('manuf_num', '20')]
[('business', 'http://www.wikidata.org/entity/Q37156'), ('buslbl', 'IBM'), ('emp_num', '583'), ('subs_num', '37'), ('devs_num', '390'), ('manuf_num', '5560')]
[('business', 'http://www.wikidata.org/entity/Q2283'), ('buslbl', 'Microsoft'), ('emp_num', '389'), ('subs_num', '66'), ('devs_num', '1134'), ('manuf_num', '64')]
[('business', 'http://www.wikidata.org/entity/Q206921'), ('buslbl', 'Pfizer'), ('emp_num', '237'), ('subs_num', '16'), ('devs_num', '1'), ('manuf_num', '7')]
[('business', 'http://www.wikidata.org/entity/Q312'), ('buslbl', 'Apple Inc.'), ('emp_num', '228'), ('subs_num', '23'), ('devs_num', '672'), ('manuf_num', '140')]
[('business', 'http://www.wikidata.org

30
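
Mayo Clinic ranks second in the previous result but is missing from the rows shown here, which suggests that the three mandatory sub-selects act as inner joins: a business lacking any one of the three counts is dropped. The sketch below generates the same sub-selects wrapped in `OPTIONAL`, so that a missing count leaves the variable unbound instead of removing the business. The helper `count_block` is hypothetical, and the query may be slower on the endpoint.

```python
def count_block(item_var, prop, alias):
    """Return an OPTIONAL sub-select counting the items linked to ?business via prop."""
    return f"""
OPTIONAL {{
    SELECT ?business (COUNT(DISTINCT {item_var}) AS {alias})
    WHERE {{
        {item_var} {prop} ?business .
        ?business wdt:P31 wd:Q4830453 .
    }}
    GROUP BY ?business
}}
"""


# Same three properties as in the cell: P749, P178, P176
blocks = "".join([
    count_block("?sub_org", "wdt:P749", "?subs_num"),
    count_block("?dev", "wdt:P178", "?devs_num"),
    count_block("?manuf", "wdt:P176", "?manuf_num"),
])

queryString = f"""
SELECT ?business ?buslbl (COUNT(DISTINCT ?person) AS ?emp_num) ?subs_num ?devs_num ?manuf_num
WHERE {{
?person wdt:P108 ?business .
?business wdt:P31 wd:Q4830453 .
?business <http://schema.org/name> ?buslbl .
{blocks}
}}
GROUP BY ?business ?buslbl ?subs_num ?devs_num ?manuf_num
ORDER BY DESC (?emp_num)
LIMIT 30
"""

print(queryString)
```

Generating the blocks from one template also removes the three near-identical copies of the sub-select.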

In [53]:
# get nodes whose names contain "company"

queryString = """
SELECT DISTINCT ?o ?olbl
WHERE { 

?company wdt:P31 ?o .

?o <http://schema.org/name> ?olbl .
          
FILTER (CONTAINS(?olbl, "company"))

} 
ORDER BY ?olbl
LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q28456565'), ('olbl', 'Civic technology company')]
[('o', 'http://www.wikidata.org/entity/Q50825050'), ('olbl', 'Concession company')]
[('o', 'http://www.wikidata.org/entity/Q2498526'), ('olbl', 'Entrepreneurial company (Germany)')]
[('o', 'http://www.wikidata.org/entity/Q99945381'), ('olbl', "Farmers' company town")]
[('o', 'http://www.wikidata.org/entity/Q3055260'), ('olbl', 'French company')]
[('o', 'http://www.wikidata.org/entity/Q20031260'), ('olbl', 'IT consulting company')]
[('o', 'http://www.wikidata.org/entity/Q11367794'), ('olbl', 'Intermediate holding company')]
[('o', 'http://www.wikidata.org/entity/Q6054493'), ('olbl', 'International business company')]
[('o', 'http://www.wikidata.org/entity/Q6055831'), ('olbl', 'Internet company')]
[('o', 'http://www.wikidata.org/entity/Q56410106'), ('olbl', 'Limited stock company')]
[('o', 'http://www.wikidata.org/entity/Q105576474'), ('olbl', 'Marine company')]
[('o', 'http://www.wikidata.o

50

In [76]:
# get predicates from "commercial company"

queryString = """
SELECT DISTINCT ?p ?plbl (GROUP_CONCAT(?o; separator="|") AS ?objs) (GROUP_CONCAT(?olbl; separator="|") AS ?objslbl)
WHERE { 

wd:Q567521 ?p ?o .

?p <http://schema.org/name> ?plbl .
?o <http://schema.org/name> ?olbl .

} 
GROUP BY ?p ?plbl
ORDER BY ?plbl
LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('plbl', 'described by source'), ('objs', 'http://www.wikidata.org/entity/Q2041543'), ('objslbl', 'Ottův slovník naučný')]
[('p', 'http://www.wikidata.org/prop/direct/P1382'), ('plbl', 'partially coincident with'), ('objs', 'http://www.wikidata.org/entity/Q783794'), ('objslbl', 'company')]
[('p', 'http://www.wikidata.org/prop/direct/P279'), ('plbl', 'subclass of'), ('objs', 'http://www.wikidata.org/entity/Q5621421'), ('objslbl', 'private company')]


3

In [75]:
# get subjs and objs linking from "subclass of"
# the obj has to contain "company" string

queryString = """
SELECT DISTINCT ?o ?olbl
WHERE { 

?s wdt:P279 ?o .

?s <http://schema.org/name> ?slbl .
?o <http://schema.org/name> ?olbl .

FILTER (CONTAINS(?olbl, "company"))

} 
ORDER BY ?plbl
LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('o', 'http://www.wikidata.org/entity/Q30163575'), ('olbl', 'tank company')]
[('o', 'http://www.wikidata.org/entity/Q10689397'), ('olbl', 'television production company')]
[('o', 'http://www.wikidata.org/entity/Q66737123'), ('olbl', 'French company')]
[('o', 'http://www.wikidata.org/entity/Q627236'), ('olbl', 'company town')]
[('o', 'http://www.wikidata.org/entity/Q7497908'), ('olbl', "ship's company")]
[('o', 'http://www.wikidata.org/entity/Q7250430'), ('olbl', 'proprietary company')]
[('o', 'http://www.wikidata.org/entity/Q249556'), ('olbl', 'railway company')]
[('o', 'http://www.wikidata.org/entity/Q107100767'), ('olbl', 'company director')]
[('o', 'http://www.wikidata.org/entity/Q1762059'), ('olbl', 'film production company')]
[('o', 'http://www.wikidata.org/entity/Q219577'), ('olbl', 'holding company')]
[('o', 'http://www.wikidata.org/entity/Q2401749'), ('olbl', 'telecommunication company')]
[('o', 'http://www.wikidata.org/entity/Q2143354'), ('olbl', 'insurance company')]

50
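
The rows are not in alphabetical order because the query sorts by `?plbl`, a variable that is never bound in the `WHERE` clause (the label variable there is `?olbl`). The check below is a rough, regex-based sketch of how such slips could be caught before running a query; the helper `unbound_order_vars` is hypothetical and does not parse SPARQL properly.

```python
import re


def unbound_order_vars(query):
    """Return ORDER BY variables that do not occur anywhere before ORDER BY."""
    head, sep, tail = query.partition("ORDER BY")
    if not sep:
        return []
    order_clause = tail.split("LIMIT")[0]
    known = set(re.findall(r"\?\w+", head))
    return [var for var in re.findall(r"\?\w+", order_clause) if var not in known]


# Copy of the query from the cell, used only as input for the check
queryString = """
SELECT DISTINCT ?o ?olbl
WHERE {
?s wdt:P279 ?o .
?s <http://schema.org/name> ?slbl .
?o <http://schema.org/name> ?olbl .
FILTER (CONTAINS(?olbl, "company"))
}
ORDER BY ?plbl
LIMIT 50
"""

print(unbound_order_vars(queryString))
```

Replacing `?plbl` with `?olbl` in the `ORDER BY` clause would give an alphabetically sorted result.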

In [77]:
# get predicates and objs from "company" node

queryString = """
SELECT DISTINCT ?p ?plbl (GROUP_CONCAT(?olbl; separator="|") AS ?objslbl)
WHERE { 

wd:Q783794 ?p ?o .

?p <http://schema.org/name> ?plbl .
?o <http://schema.org/name> ?olbl .

} 
GROUP BY ?p ?plbl
ORDER BY ?plbl
LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1343'), ('plbl', 'described by source'), ('objslbl', 'Ottův slovník naučný|Granat Encyclopedic Dictionary')]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('plbl', 'different from'), ('objslbl', 'business|company (Italian legal concept)')]
[('p', 'http://www.wikidata.org/prop/direct/P527'), ('plbl', 'has part'), ('objslbl', 'department')]
[('p', 'http://www.wikidata.org/prop/direct/P1552'), ('plbl', 'has quality'), ('objslbl', 'corporate personhood')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('plbl', 'instance of'), ('objslbl', 'type of organisation')]
[('p', 'http://www.wikidata.org/prop/direct/P6104'), ('plbl', 'maintained by WikiProject'), ('objslbl', 'WikiProject Companies')]
[('p', 'http://www.wikidata.org/prop/direct/P460'), ('plbl', 'said to be the same as'), ('objslbl', 'business|enterprise|commercial organization')]
[('p', 'http://www.wikidata.org/prop/direct/P2579'), ('plbl', 'studied by'), ('objslbl', 'eco

11

In [79]:
# get predicates and subjs from "company" node

queryString = """
SELECT DISTINCT ?p ?plbl
WHERE { 

?s ?p wd:Q783794 .

?p <http://schema.org/name> ?plbl .
?s <http://schema.org/name> ?slbl .

} 
ORDER BY ?plbl
LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P1855'), ('plbl', 'Wikidata property example')]
[('p', 'http://www.wikidata.org/prop/direct/P199'), ('plbl', 'business division')]
[('p', 'http://www.wikidata.org/prop/direct/P971'), ('plbl', 'category combines topics')]
[('p', 'http://www.wikidata.org/prop/direct/P4224'), ('plbl', 'category contains')]
[('p', 'http://www.wikidata.org/prop/direct/P301'), ('plbl', "category's main topic")]
[('p', 'http://www.wikidata.org/prop/direct/P1889'), ('plbl', 'different from')]
[('p', 'http://www.wikidata.org/prop/direct/P1074'), ('plbl', 'fictional analog of')]
[('p', 'http://www.wikidata.org/prop/direct/P2670'), ('plbl', 'has parts of the class')]
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('plbl', 'instance of')]
[('p', 'http://www.wikidata.org/prop/direct/P360'), ('plbl', 'is a list of')]
[('p', 'http://www.wikidata.org/prop/direct/P1454'), ('plbl', 'legal form')]
[('p', 'http://www.wikidata.org/prop/direct/P921'), ('plbl', 'main su

21

In [4]:
# get relevant numeric information for each kind of company instance
# [number_of_employers; number_of_sub_organizations; number_of_developed_prod; number_of_manufactured_prod]
#
# NOTE: As an earlier query shows, no employees are linked to the businesses, so I consider only the people linked via the "employer" predicate.
#
# EDITED NOTE: I found the number of employees in later queries. GOAL (1) is addressed after GOAL (4).

queryString = """
SELECT ?company ?complbl (COUNT(DISTINCT ?person) AS ?emp_num) ?subs_num ?devs_num ?manuf_num
WHERE { 

?person wdt:P108 ?company .
?company wdt:P31 ?type .

?type <http://schema.org/name> ?typelbl .
        
FILTER (CONTAINS(?typelbl, "company"))

{
    SELECT ?company (COUNT(DISTINCT ?sub_org) AS ?subs_num)
    WHERE {
    
        ?sub_org wdt:P749 ?company .
        
        ?company wdt:P31 ?type .
        
        ?type <http://schema.org/name> ?typelbl .
        
        FILTER (CONTAINS(?typelbl, "company"))
    }
    GROUP BY ?company

}

{
    SELECT ?company (COUNT(DISTINCT ?dev) AS ?devs_num)
    WHERE {
    
        ?dev wdt:P178 ?company .
        
        ?company wdt:P31 ?type .
        
        ?type <http://schema.org/name> ?typelbl .
        
        FILTER (CONTAINS(?typelbl, "company"))
    }
    GROUP BY ?company

}

{
    SELECT ?company (COUNT(DISTINCT ?manuf) AS ?manuf_num)
    WHERE {

        ?manuf wdt:P176 ?company .
        
        ?company wdt:P31 ?type .
        
        ?type <http://schema.org/name> ?typelbl .
        
        FILTER (CONTAINS(?typelbl, "company"))
    }
    GROUP BY ?company

}

?company <http://schema.org/name> ?complbl .

} 
GROUP BY ?company ?complbl ?subs_num ?devs_num ?manuf_num
ORDER BY DESC (?emp_num)
LIMIT 30
"""

print("Results")
run_query(queryString)

Results
[('company', 'http://www.wikidata.org/entity/Q95'), ('complbl', 'Google'), ('emp_num', '585'), ('subs_num', '34'), ('devs_num', '600'), ('manuf_num', '20')]
[('company', 'http://www.wikidata.org/entity/Q37156'), ('complbl', 'IBM'), ('emp_num', '583'), ('subs_num', '37'), ('devs_num', '390'), ('manuf_num', '5560')]
[('company', 'http://www.wikidata.org/entity/Q2283'), ('complbl', 'Microsoft'), ('emp_num', '389'), ('subs_num', '66'), ('devs_num', '1134'), ('manuf_num', '64')]
[('company', 'http://www.wikidata.org/entity/Q212322'), ('complbl', 'GlaxoSmithKline'), ('emp_num', '237'), ('subs_num', '17'), ('devs_num', '4'), ('manuf_num', '12')]
[('company', 'http://www.wikidata.org/entity/Q206921'), ('complbl', 'Pfizer'), ('emp_num', '237'), ('subs_num', '16'), ('devs_num', '1'), ('manuf_num', '7')]
[('company', 'http://www.wikidata.org/entity/Q312'), ('complbl', 'Apple Inc.'), ('emp_num', '228'), ('subs_num', '23'), ('devs_num', '672'), ('manuf_num', '140')]
[('company', 'http://www

30

In [5]:
# GOAL (2)
# get companies which are owned by another company

queryString = """
SELECT DISTINCT ?company ?complbl ?owner ?ownerlbl
WHERE { 

?company wdt:P31 ?type1 ;
         wdt:P127 ?owner .
         
?owner wdt:P31 ?type2 .

?company <http://schema.org/name> ?complbl .
?owner <http://schema.org/name> ?ownerlbl .
?type1 <http://schema.org/name> ?type1lbl .
?type2 <http://schema.org/name> ?type2lbl .

FILTER (CONTAINS(?type1lbl, "company") && CONTAINS(?type2lbl, "company"))

} 
ORDER BY ?complbl
LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('company', 'http://www.wikidata.org/entity/Q28105059'), ('complbl', '.aol'), ('owner', 'http://www.wikidata.org/entity/Q27585'), ('ownerlbl', 'AOL')]
[('company', 'http://www.wikidata.org/entity/Q634019'), ('complbl', '20th Century Animation'), ('owner', 'http://www.wikidata.org/entity/Q7414'), ('ownerlbl', 'The Walt Disney Company')]
[('company', 'http://www.wikidata.org/entity/Q2717612'), ('complbl', '20th Century Fox Television'), ('owner', 'http://www.wikidata.org/entity/Q7414'), ('ownerlbl', 'The Walt Disney Company')]
[('company', 'http://www.wikidata.org/entity/Q434841'), ('complbl', '20th Century Studios'), ('owner', 'http://www.wikidata.org/entity/Q7414'), ('ownerlbl', 'The Walt Disney Company')]
[('company', 'http://www.wikidata.org/entity/Q407009'), ('complbl', '3'), ('owner', 'http://www.wikidata.org/entity/Q1671804'), ('ownerlbl', 'Investor AB')]
[('company', 'http://www.wikidata.org/entity/Q407009'), ('complbl', '3'), ('owner', 'http://www.wikidata.org/entity/Q1

50

In [7]:
# GOAL (3)
# get companies which are owned by another company located in a different country
# Answer: YES

queryString = """
SELECT DISTINCT ?company ?complbl ?c1lbl ?owner ?ownerlbl ?c2lbl
WHERE { 

?company wdt:P31 ?type1 ;
         wdt:P127 ?owner ;
         wdt:P17 ?country1 .
         
?owner wdt:P31 ?type2 ;
       wdt:P17 ?country2 .

?company <http://schema.org/name> ?complbl .
?owner <http://schema.org/name> ?ownerlbl .
?type1 <http://schema.org/name> ?type1lbl .
?type2 <http://schema.org/name> ?type2lbl .
?country1 <http://schema.org/name> ?c1lbl .
?country2 <http://schema.org/name> ?c2lbl .

FILTER (CONTAINS(?type1lbl, "company") && CONTAINS(?type2lbl, "company"))

FILTER (?country1 != ?country2)

} 
ORDER BY ?complbl
LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('company', 'http://www.wikidata.org/entity/Q407009'), ('complbl', '3'), ('c1lbl', 'United Kingdom'), ('owner', 'http://www.wikidata.org/entity/Q19548352'), ('ownerlbl', 'CK Hutchison Holdings'), ('c2lbl', "People's Republic of China")]
[('company', 'http://www.wikidata.org/entity/Q407009'), ('complbl', '3'), ('c1lbl', 'United Kingdom'), ('owner', 'http://www.wikidata.org/entity/Q1671804'), ('ownerlbl', 'Investor AB'), ('c2lbl', 'Sweden')]
[('company', 'http://www.wikidata.org/entity/Q407009'), ('complbl', '3'), ('c1lbl', 'Macau'), ('owner', 'http://www.wikidata.org/entity/Q19548352'), ('ownerlbl', 'CK Hutchison Holdings'), ('c2lbl', "People's Republic of China")]
[('company', 'http://www.wikidata.org/entity/Q407009'), ('complbl', '3'), ('c1lbl', 'Macau'), ('owner', 'http://www.wikidata.org/entity/Q1671804'), ('ownerlbl', 'Investor AB'), ('c2lbl', 'Sweden')]
[('company', 'http://www.wikidata.org/entity/Q407009'), ('complbl', '3'), ('c1lbl', 'Indonesia'), ('owner', 'http://www.

50
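
The same company appears once per country pair (company "3" fills the first rows), so raw row counts would overstate cross-border ownership. As a sketch toward goal 5.3, the offline snippet below counts distinct owned companies per owner. The rows are copied from the first four result lines of this cell, and the helper name `foreign_holdings` is hypothetical.

```python
from collections import defaultdict

# First four rows of the result, reduced to the fields needed here
rows = [
    {"company": "http://www.wikidata.org/entity/Q407009", "c1lbl": "United Kingdom",
     "owner": "http://www.wikidata.org/entity/Q19548352"},
    {"company": "http://www.wikidata.org/entity/Q407009", "c1lbl": "United Kingdom",
     "owner": "http://www.wikidata.org/entity/Q1671804"},
    {"company": "http://www.wikidata.org/entity/Q407009", "c1lbl": "Macau",
     "owner": "http://www.wikidata.org/entity/Q19548352"},
    {"company": "http://www.wikidata.org/entity/Q407009", "c1lbl": "Macau",
     "owner": "http://www.wikidata.org/entity/Q1671804"},
]


def foreign_holdings(rows):
    """Count, per owner, the distinct companies it owns in other countries."""
    owned = defaultdict(set)
    for row in rows:
        owned[row["owner"]].add(row["company"])
    return {owner: len(companies) for owner, companies in owned.items()}


counts = foreign_holdings(rows)
print(counts)
```

The equivalent step inside SPARQL would be a `COUNT(DISTINCT ?company)` grouped by `?owner`, which avoids transferring the duplicated rows at all.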

In [8]:
# get all "legal form" values of the companies

queryString = """
SELECT DISTINCT ?lform ?lformlbl
WHERE { 

?company wdt:P31 ?type ;
         wdt:P1454 ?lform .

?lform <http://schema.org/name> ?lformlbl .
?type <http://schema.org/name> ?typelbl .

FILTER (CONTAINS(?typelbl, "company"))

} 
ORDER BY ?lformlbl
LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('lform', 'http://www.wikidata.org/entity/Q240625'), ('lformlbl', '501(c) organization')]
[('lform', 'http://www.wikidata.org/entity/Q18325436'), ('lformlbl', '501(c)(3) organization')]
[('lform', 'http://www.wikidata.org/entity/Q56475875'), ('lformlbl', 'A natural person doing business according to the Trade Licensing Act not registered in the Commercial Register')]
[('lform', 'http://www.wikidata.org/entity/Q26839244'), ('lformlbl', 'A.E.')]
[('lform', 'http://www.wikidata.org/entity/Q61472992'), ('lformlbl', 'A.E.E.X.')]
[('lform', 'http://www.wikidata.org/entity/Q20659907'), ('lformlbl', 'AG & Co. KG')]
[('lform', 'http://www.wikidata.org/entity/Q292704'), ('lformlbl', 'AG & Co. KGaA')]
[('lform', 'http://www.wikidata.org/entity/Q15856579'), ('lformlbl', 'Akcinė bendrovė')]
[('lform', 'http://www.wikidata.org/entity/Q693737'), ('lformlbl', 'Aktiengesellschaft')]
[('lform', 'http://www.wikidata.org/entity/Q22084735'), ('lformlbl', 'Aktiengesellschaft')]
[('lform', 'http://w

50
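
The output lists "Aktiengesellschaft" twice with different entity IDs, so a label alone does not identify a legal form. A small sketch that detects such ambiguous labels; the list below is a subset of the printed rows and the `ambiguous_labels` helper is a hypothetical name introduced here:

```python
from collections import defaultdict

# (entity id, label) pairs taken from the printed output above.
legal_forms = [
    ("Q240625", "501(c) organization"),
    ("Q18325436", "501(c)(3) organization"),
    ("Q26839244", "A.E."),
    ("Q693737", "Aktiengesellschaft"),
    ("Q22084735", "Aktiengesellschaft"),
]


def ambiguous_labels(pairs):
    """Return the labels that are shared by more than one entity."""
    by_label = defaultdict(list)
    for qid, label in pairs:
        by_label[label].append(qid)
    return {label: qids for label, qids in by_label.items() if len(qids) > 1}


duplicates = ambiguous_labels(legal_forms)
print(duplicates)
```

This is why the later aggregation groups by both `?lform` and `?lformlbl` rather than by the label only.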

In [11]:
# GOAL (4)
# compare the number of companies grouped by their legal form

queryString = """
SELECT ?lform ?lformlbl (COUNT(DISTINCT ?company) AS ?num_comp)
WHERE { 

?company wdt:P31 ?type ;
         wdt:P1454 ?lform .

?lform <http://schema.org/name> ?lformlbl .
?type <http://schema.org/name> ?typelbl .

FILTER (CONTAINS(?typelbl, "company"))

} 
GROUP BY ?lform ?lformlbl
ORDER BY DESC (?num_comp)
LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('lform', 'http://www.wikidata.org/entity/Q783794'), ('lformlbl', 'company'), ('num_comp', '5087')]
[('lform', 'http://www.wikidata.org/entity/Q134161'), ('lformlbl', 'joint-stock company'), ('num_comp', '2270')]
[('lform', 'http://www.wikidata.org/entity/Q891723'), ('lformlbl', 'public company'), ('num_comp', '1607')]
[('lform', 'http://www.wikidata.org/entity/Q1589009'), ('lformlbl', 'privately held company'), ('num_comp', '1494')]
[('lform', 'http://www.wikidata.org/entity/Q1480166'), ('lformlbl', 'kabushiki gaisha'), ('num_comp', '747')]
[('lform', 'http://www.wikidata.org/entity/Q460178'), ('lformlbl', 'GmbH'), ('num_comp', '479')]
[('lform', 'http://www.wikidata.org/entity/Q422062'), ('lformlbl', 'Aktiengesellschaft'), ('num_comp', '443')]
[('lform', 'http://www.wikidata.org/entity/Q166280'), ('lformlbl', 'S.A.'), ('num_comp', '329')]
[('lform', 'http://www.wikidata.org/entity/Q5225895'), ('lformlbl', 'public limited company'), ('num_comp', '324')]
[('lform', 'http://www

50

In [13]:
# get employees from companies

queryString = """
SELECT DISTINCT ?empl ?empllbl
WHERE { 

?company wdt:P1128 ?empl ;
         wdt:P31 ?type .

?type <http://schema.org/name> ?typelbl .
?empl <http://schema.org/name> ?empllbl .

FILTER (CONTAINS(?typelbl, "company"))

} 
ORDER BY ?empllbl
"""

print("Results")
run_query(queryString)

Results
Empty


0

In [16]:
# get the top 10 companies by number of people linked to them through "employer" (P108)
#
# NOTE: As a previous query shows, no employee entities are linked to the businesses through P1128, so only the "employer" links are considered here.
#
# EDITED NOTE: The number of employees is found in later queries. GOAL (5.1) is placed right after GOAL (1).

queryString = """
SELECT ?company ?complbl (COUNT(DISTINCT ?person) AS ?emp_num)
WHERE { 

?person wdt:P108 ?company .
?company wdt:P31 ?type .

?type <http://schema.org/name> ?typelbl .
        
FILTER (CONTAINS(?typelbl, "company"))

?company <http://schema.org/name> ?complbl .

} 
GROUP BY ?company ?complbl
ORDER BY DESC (?emp_num)
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('company', 'http://www.wikidata.org/entity/Q95'), ('complbl', 'Google'), ('emp_num', '585')]
[('company', 'http://www.wikidata.org/entity/Q37156'), ('complbl', 'IBM'), ('emp_num', '583')]
[('company', 'http://www.wikidata.org/entity/Q2283'), ('complbl', 'Microsoft'), ('emp_num', '389')]
[('company', 'http://www.wikidata.org/entity/Q9684'), ('complbl', 'The New York Times'), ('emp_num', '308')]
[('company', 'http://www.wikidata.org/entity/Q193326'), ('complbl', 'Goldman Sachs'), ('emp_num', '256')]
[('company', 'http://www.wikidata.org/entity/Q2881259'), ('complbl', 'Royal Ballet of Flanders'), ('emp_num', '241')]
[('company', 'http://www.wikidata.org/entity/Q206921'), ('complbl', 'Pfizer'), ('emp_num', '237')]
[('company', 'http://www.wikidata.org/entity/Q212322'), ('complbl', 'GlaxoSmithKline'), ('emp_num', '237')]
[('company', 'http://www.wikidata.org/entity/Q312'), ('complbl', 'Apple Inc.'), ('emp_num', '228')]
[('company', 'http://www.wikidata.org/entity/Q1530721'), ('com

10

In [28]:
# get companies with both market capitalization and total assets

queryString = """
SELECT ?company ?complbl ?mark_cap ?tot_asset
WHERE { 

?company wdt:P31 ?type ;
         wdt:P2226 ?mark_cap .
    
?type <http://schema.org/name> ?typelbl .
        
FILTER (CONTAINS(?typelbl, "company"))

?company <http://schema.org/name> ?complbl .


{
    SELECT ?company ?tot_asset
    WHERE{
    ?company wdt:P31 ?type ;
         wdt:P2403 ?tot_asset .
         
    ?type <http://schema.org/name> ?typelbl .
    FILTER (CONTAINS(?typelbl, "company"))
    }
}

} 
GROUP BY ?company ?complbl ?mark_cap ?tot_asset
ORDER BY ?complbl
"""

print("Results")
run_query(queryString)

Results
[('company', 'http://www.wikidata.org/entity/Q5476713'), ('complbl', '21st Century Fox'), ('mark_cap', '27000000000'), ('tot_asset', '53.831')]
[('company', 'http://www.wikidata.org/entity/Q159433'), ('complbl', '3M'), ('mark_cap', '115280000000'), ('tot_asset', '47344000000')]
[('company', 'http://www.wikidata.org/entity/Q159433'), ('complbl', '3M'), ('mark_cap', '115280000000'), ('tot_asset', '34051000000')]
[('company', 'http://www.wikidata.org/entity/Q294508'), ('complbl', 'ALROSA'), ('mark_cap', '506000000000'), ('tot_asset', '518886000000')]
[('company', 'http://www.wikidata.org/entity/Q35476'), ('complbl', 'AT&T'), ('mark_cap', '204560000000'), ('tot_asset', '525761000000')]
[('company', 'http://www.wikidata.org/entity/Q306764'), ('complbl', 'Abbott Laboratories'), ('mark_cap', '57000000000'), ('tot_asset', '41247000000')]
[('company', 'http://www.wikidata.org/entity/Q128896'), ('complbl', 'Advanced Micro Devices'), ('mark_cap', '131930000000'), ('tot_asset', '8962000000

182
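
The direct `wdt:` values printed above are bare numbers without unit or currency, and at least one row (21st Century Fox with total assets of 53.831) looks like it was entered on a different scale. A rough sanity check, assuming that total assets many orders of magnitude below market capitalization are worth a manual look; the threshold and the `suspicious` helper are hypothetical choices, not something the dataset defines:

```python
from decimal import Decimal

# (label, market capitalization, total assets) from the printed output above.
rows = [
    ("21st Century Fox", "27000000000", "53.831"),
    ("3M", "115280000000", "47344000000"),
    ("3M", "115280000000", "34051000000"),
    ("ALROSA", "506000000000", "518886000000"),
    ("AT&T", "204560000000", "525761000000"),
    ("Abbott Laboratories", "57000000000", "41247000000"),
]

# Hypothetical cut-off: assets below one millionth of the market cap.
RATIO_THRESHOLD = Decimal("1e-6")


def suspicious(rows, threshold=RATIO_THRESHOLD):
    """Return labels whose assets / market cap ratio falls below the threshold."""
    flagged = []
    for label, cap, assets in rows:
        ratio = Decimal(assets) / Decimal(cap)
        if ratio < threshold:
            flagged.append(label)
    return flagged


flagged = suspicious(rows)
print(flagged)
```

A flagged row is only a hint that the value may use another scale or unit; it is not proof of an error.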

In [32]:
# inspect the raw values of the "number of employees" property (P1128)

queryString = """
SELECT DISTINCT ?o
WHERE { 

?company wdt:P1128 ?o .

} 
ORDER BY ?o
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('o', '0')]
[('o', '1')]
[('o', '2')]
[('o', '2.45')]
[('o', '3')]
[('o', '4')]
[('o', '4.4')]
[('o', '4.55')]
[('o', '5')]
[('o', '6')]


10
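
The values show that P1128 holds literals rather than entities, which explains why the earlier query asking for a `schema.org/name` of `?empl` came back empty. Some of the literals are fractional (2.45, 4.4, 4.55), which is odd for a headcount. A minimal sketch that separates whole from fractional values, using the ten values printed above; `split_headcounts` is a hypothetical helper:

```python
from decimal import Decimal

# Values copied from the printed output above.
raw_values = ["0", "1", "2", "2.45", "3", "4", "4.4", "4.55", "5", "6"]


def split_headcounts(values):
    """Split literal strings into whole numbers and fractional leftovers."""
    whole, fractional = [], []
    for text in values:
        number = Decimal(text)
        if number == number.to_integral_value():
            whole.append(int(number))
        else:
            fractional.append(text)
    return whole, fractional


whole, fractional = split_headcounts(raw_values)
print("whole:", whole)
print("fractional:", fractional)
```

Whether the fractional values are data-entry errors or figures expressed in thousands cannot be decided from this output alone.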

In [36]:
# GOAL (1)
# get relevant numeric information for each kind of company instance
# [number_of_employees; number_of_sub_organizations; number_of_developed_prod; number_of_manufactured_prod]

queryString = """
SELECT DISTINCT ?company ?complbl ?tot_emp ?subs_num ?devs_num ?manuf_num
WHERE { 

?company wdt:P31 ?type ;
         wdt:P1128 ?tot_emp .

?type <http://schema.org/name> ?typelbl .
        
FILTER (CONTAINS(?typelbl, "company"))

{
    SELECT ?company (COUNT(DISTINCT ?sub_org) AS ?subs_num)
    WHERE {
    
        ?sub_org wdt:P749 ?company .
        
        ?company wdt:P31 ?type .
        
        ?type <http://schema.org/name> ?typelbl .
        
        FILTER (CONTAINS(?typelbl, "company"))
    }
    GROUP BY ?company

}

{
    SELECT ?company (COUNT(DISTINCT ?dev) AS ?devs_num)
    WHERE {
    
        ?dev wdt:P178 ?company .
        
        ?company wdt:P31 ?type .
        
        ?type <http://schema.org/name> ?typelbl .
        
        FILTER (CONTAINS(?typelbl, "company"))
    }
    GROUP BY ?company

}

{
    SELECT ?company (COUNT(DISTINCT ?manuf) AS ?manuf_num)
    WHERE {

        ?manuf wdt:P176 ?company .
        
        ?company wdt:P31 ?type .
        
        ?type <http://schema.org/name> ?typelbl .
        
        FILTER (CONTAINS(?typelbl, "company"))
    }
    GROUP BY ?company

}

?company <http://schema.org/name> ?complbl .

} 
ORDER BY DESC (?tot_emp)
LIMIT 30
"""

print("Results")
run_query(queryString)

Results
[('company', 'http://www.wikidata.org/entity/Q3884'), ('complbl', 'Amazon'), ('tot_emp', '1298000'), ('subs_num', '53'), ('devs_num', '40'), ('manuf_num', '6')]
[('company', 'http://www.wikidata.org/entity/Q463094'), ('complbl', 'Foxconn'), ('tot_emp', '618460'), ('subs_num', '12'), ('devs_num', '1'), ('manuf_num', '64')]
[('company', 'http://www.wikidata.org/entity/Q157645'), ('complbl', 'Deutsche Post AG'), ('tot_emp', '547459'), ('subs_num', '8'), ('devs_num', '1'), ('manuf_num', '1')]
[('company', 'http://www.wikidata.org/entity/Q156578'), ('complbl', 'Volkswagen Group'), ('tot_emp', '513642'), ('subs_num', '35'), ('devs_num', '1'), ('manuf_num', '156')]
[('company', 'http://www.wikidata.org/entity/Q487494'), ('complbl', 'Tesco'), ('tot_emp', '423092'), ('subs_num', '16'), ('devs_num', '2'), ('manuf_num', '1')]
[('company', 'http://www.wikidata.org/entity/Q20718'), ('complbl', 'Samsung Electronics'), ('tot_emp', '392305'), ('subs_num', '12'), ('devs_num', '53'), ('manuf_num

30
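
The three subqueries are joined on `?company` without `OPTIONAL`, so a company appears only if it has at least one sub-organization, one developed product and one manufactured product. That would explain why Walmart, the top entry in the GOAL(5.1) output, is missing here. A sketch of the difference between that inner join and a left join that fills missing counts with 0; the Amazon and Foxconn figures come from the outputs, while leaving Walmart out of all three count tables is only a placeholder assumption, since the output does not show which count it lacks:

```python
# Employee figures from the printed outputs.
employees = {"Walmart": 2500000, "Amazon": 1298000, "Foxconn": 618460}

# Counts from the GOAL (1) output. Walmart is omitted as a placeholder
# assumption: the notebook does not show which of these it is missing from.
subsidiaries = {"Amazon": 53, "Foxconn": 12}
developed = {"Amazon": 40, "Foxconn": 1}
manufactured = {"Amazon": 6, "Foxconn": 64}


def inner_join(*tables):
    """Keep only keys present in every table (what the query does)."""
    keys = set(tables[0]).intersection(*tables[1:])
    return {key: tuple(table[key] for table in tables) for key in sorted(keys)}


def left_join(base, *others):
    """Keep every key of the base table, filling missing counts with 0."""
    return {
        key: (value,) + tuple(table.get(key, 0) for table in others)
        for key, value in base.items()
    }


inner = inner_join(employees, subsidiaries, developed, manufactured)
left = left_join(employees, subsidiaries, developed, manufactured)
print(inner)
print(left)
```

In SPARQL, the left-join behaviour would roughly correspond to wrapping each subquery in `OPTIONAL { ... }`.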

In [39]:
# GOAL(5.1)
# get the top 10 companies by number of employees

queryString = """
SELECT DISTINCT ?company ?complbl ?tot_emp
WHERE { 

?company wdt:P31 ?type ;
         wdt:P1128 ?tot_emp .

?type <http://schema.org/name> ?typelbl .
        
FILTER (CONTAINS(?typelbl, "company"))

?company <http://schema.org/name> ?complbl .

} 
ORDER BY DESC (?tot_emp)
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('company', 'http://www.wikidata.org/entity/Q483551'), ('complbl', 'Walmart'), ('tot_emp', '2500000')]
[('company', 'http://www.wikidata.org/entity/Q3884'), ('complbl', 'Amazon'), ('tot_emp', '1298000')]
[('company', 'http://www.wikidata.org/entity/Q819425'), ('complbl', 'Indian Railways'), ('tot_emp', '1230000')]
[('company', 'http://www.wikidata.org/entity/Q660770'), ('complbl', 'Russian Railways'), ('tot_emp', '711000')]
[('company', 'http://www.wikidata.org/entity/Q267840'), ('complbl', 'Randstad N.V.'), ('tot_emp', '658580')]
[('company', 'http://www.wikidata.org/entity/Q463094'), ('complbl', 'Foxconn'), ('tot_emp', '618460')]
[('company', 'http://www.wikidata.org/entity/Q1322045'), ('complbl', 'G4S'), ('tot_emp', '618260')]
[('company', 'http://www.wikidata.org/entity/Q157645'), ('complbl', 'Deutsche Post AG'), ('tot_emp', '547459')]
[('company', 'http://www.wikidata.org/entity/Q2667719'), ('complbl', 'Aviation Industry Corporation of China'), ('tot_emp', '535942')]
[('c

10

In [61]:
# GOAL(5.2) PART 1
# get companies which have operating income or total assets or market capitalization defined in the dataset

queryString = """
SELECT ?company ?complbl ?mark_cap ?tot_asset ?tot_income
WHERE { 

?company wdt:P31 ?type .

OPTIONAL { ?company wdt:P2226 ?mark_cap  }
OPTIONAL { ?company wdt:P2403 ?tot_asset  }
OPTIONAL { ?company wdt:P3362 ?tot_income  }
    
?type <http://schema.org/name> ?typelbl .
        
FILTER (CONTAINS(?typelbl, "company")) 
FILTER (EXISTS {?company wdt:P2226 ?mark_cap} || EXISTS {?company wdt:P2403 ?tot_asset} || EXISTS {?company wdt:P3362 ?tot_income})

?company <http://schema.org/name> ?complbl .

} 
GROUP BY ?company ?complbl ?mark_cap ?tot_asset ?tot_income
ORDER BY ?complbl
LIMIT 40
"""

print("Results")
run_query(queryString)

Results
[('company', 'http://www.wikidata.org/entity/Q5476713'), ('complbl', '21st Century Fox'), ('mark_cap', '27000000000'), ('tot_asset', '53.831'), ('tot_income', '5000000000')]
[('company', 'http://www.wikidata.org/entity/Q4030613'), ('complbl', '2GIS'), ('tot_asset', '3122878000'), ('tot_income', '261286000')]
[('company', 'http://www.wikidata.org/entity/Q226889'), ('complbl', '36,6'), ('tot_asset', '39909000000'), ('tot_income', '-13140000000')]
[('company', 'http://www.wikidata.org/entity/Q4636301'), ('complbl', '3D Systems'), ('mark_cap', '1036000000')]
[('company', 'http://www.wikidata.org/entity/Q159433'), ('complbl', '3M'), ('mark_cap', '115280000000'), ('tot_asset', '34051000000'), ('tot_income', '7161000000')]
[('company', 'http://www.wikidata.org/entity/Q159433'), ('complbl', '3M'), ('mark_cap', '115280000000'), ('tot_asset', '47344000000'), ('tot_income', '6100000000')]
[('company', 'http://www.wikidata.org/entity/Q159433'), ('complbl', '3M'), ('mark_cap', '115280000000

40

In [66]:
# GOAL(5.2) PART 2
# get the minimum, maximum and average market capitalization, total assets and operating income of companies for each country

queryString = """
SELECT ?country ?countrylbl (MIN(?mark_cap) AS ?cap_min) (MIN(?tot_asset) AS ?ass_min) (MIN(?tot_income) AS ?income_min) 
                            (MAX(?mark_cap) AS ?cap_max) (MAX(?tot_asset) AS ?ass_max) (MAX(?tot_income) AS ?income_max)
                            (AVG(?mark_cap) AS ?cap_avg) (AVG(?tot_asset) AS ?ass_avg) (AVG(?tot_income) AS ?income_avg)
WHERE { 

?company wdt:P31 ?type ;
         wdt:P17 ?country .

OPTIONAL { ?company wdt:P2226 ?mark_cap  }
OPTIONAL { ?company wdt:P2403 ?tot_asset  }
OPTIONAL { ?company wdt:P3362 ?tot_income  }
    
?type <http://schema.org/name> ?typelbl .
        
FILTER (CONTAINS(?typelbl, "company")) 
FILTER (EXISTS {?company wdt:P2226 ?mark_cap} || EXISTS {?company wdt:P2403 ?tot_asset} || EXISTS {?company wdt:P3362 ?tot_income})

?country <http://schema.org/name> ?countrylbl .

} 
GROUP BY ?country ?countrylbl
ORDER BY DESC (?income_avg)
LIMIT 40
"""

print("Results")
run_query(queryString)

Results
[('country', 'http://www.wikidata.org/entity/Q142'), ('countrylbl', 'France'), ('cap_min', '20348000'), ('ass_min', '13008'), ('income_min', '-59496000'), ('cap_max', '139000000000'), ('ass_max', '230978000000'), ('income_max', '44937000000000000'), ('cap_avg', '22910118350.272727272727273'), ('ass_avg', '33466980407.24'), ('income_avg', '3456694942939769.230769230769231')]
[('country', 'http://www.wikidata.org/entity/Q884'), ('countrylbl', 'South Korea'), ('ass_min', '6450000000'), ('income_min', '97000000'), ('ass_max', '262174300000000'), ('income_max', '29240000000000'), ('ass_avg', '51741428251287'), ('income_avg', '6459383119800')]
[('country', 'http://www.wikidata.org/entity/Q851'), ('countrylbl', 'Saudi Arabia'), ('cap_min', '2458000000000'), ('ass_min', '32525893'), ('income_min', '674871000000'), ('cap_max', '2458000000000'), ('ass_max', '1494126000000'), ('income_max', '674871000000'), ('cap_avg', '2458000000000'), ('ass_avg', '22599436479.117857142857143'), ('income

40
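
The France row suggests that a single extreme value dominates `AVG`: the maximum income is about 4.5e16 while the average is about 3.5e15. A rough back-of-the-envelope check, assuming the average was taken over 13 values; that count is a guess based on the fractional part .230769..., which matches 3/13, and the output itself does not state it:

```python
from fractions import Fraction

# Figures copied from the France row of the printed output above.
income_avg = Fraction("3456694942939769.230769230769231")
income_max = 44937000000000000

# Assumption: 13 aggregated values (the output does not give the count).
assumed_count = 13

# Reconstruct the sum, remove the maximum, and average what is left.
total = round(income_avg * assumed_count)
rest_total = total - income_max
rest_avg = Fraction(rest_total, assumed_count - 1)

print("reconstructed sum:", total)
print("sum without the maximum:", rest_total)
print("average without the maximum:", float(rest_avg))
```

Under that assumption the remaining values average a few billion, several orders of magnitude below the reported average, so ordering countries by `?income_avg` mostly ranks them by their largest outlier.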

In [74]:
# get the number of owners from a different country for each business

queryString = """
SELECT DISTINCT ?company ?complbl ?country1 ?country1lbl (COUNT(?owner) AS ?num_owners)
WHERE { 

?company wdt:P31 wd:Q4830453 ;
         wdt:P127 ?owner ;
         wdt:P17 ?country1 .
         
?owner wdt:P31 wd:Q4830453 ;
       wdt:P17 ?country2 .

?company <http://schema.org/name> ?complbl .
?country1 <http://schema.org/name> ?country1lbl .

FILTER (?country1 != ?country2)

} 
GROUP BY ?company ?complbl ?country1 ?country1lbl
ORDER BY DESC (?num_owners)
LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('company', 'http://www.wikidata.org/entity/Q936563'), ('complbl', 'Italian Bourse'), ('country1', 'http://www.wikidata.org/entity/Q38'), ('country1lbl', 'Italy'), ('num_owners', '7')]
[('company', 'http://www.wikidata.org/entity/Q30284353'), ('complbl', 'Cefriel'), ('country1', 'http://www.wikidata.org/entity/Q38'), ('country1lbl', 'Italy'), ('num_owners', '6')]
[('company', 'http://www.wikidata.org/entity/Q278810'), ('complbl', 'Pennon Group'), ('country1', 'http://www.wikidata.org/entity/Q145'), ('country1lbl', 'United Kingdom'), ('num_owners', '5')]
[('company', 'http://www.wikidata.org/entity/Q691563'), ('complbl', 'Nord Stream AG'), ('country1', 'http://www.wikidata.org/entity/Q39'), ('country1lbl', 'Switzerland'), ('num_owners', '5')]
[('company', 'http://www.wikidata.org/entity/Q11985300'), ('complbl', 'Vítkovice Steel'), ('country1', 'http://www.wikidata.org/entity/Q213'), ('country1lbl', 'Czech Republic'), ('num_owners', '5')]
[('company', 'http://www.wikidata.org/en

50

In [83]:
# GOAL (5.3)
# for each country, get the business with the highest number of owners from a different country

queryString = """

SELECT ?country1 ?country1lbl ?company ?companylbl ?max_num
WHERE
{
    {
        SELECT ?country1 (MAX(?num_owners) AS ?max_num)
        WHERE {

            {

                SELECT DISTINCT ?company ?country1 (COUNT(?owner) AS ?num_owners)
                WHERE { 

                ?company wdt:P31 wd:Q4830453 ;
                         wdt:P127 ?owner ;
                         wdt:P17 ?country1 .

                ?owner wdt:P31 wd:Q4830453 ;
                       wdt:P17 ?country2 .

                FILTER (?country1 != ?country2)

                } 
                GROUP BY ?company ?country1

            }

        }
        GROUP BY ?country1
        ORDER BY DESC (?max_num)
    }


    {
        SELECT DISTINCT ?company ?country1 (COUNT(?owner) AS ?num_owners)
        WHERE { 

        ?company wdt:P31 wd:Q4830453 ;
                 wdt:P127 ?owner ;
                 wdt:P17 ?country1 .

        ?owner wdt:P31 wd:Q4830453 ;
               wdt:P17 ?country2 .

        FILTER (?country1 != ?country2)

        } 
        GROUP BY ?company ?country1
    }
    
    ?country1 <http://schema.org/name> ?country1lbl .
    ?company <http://schema.org/name> ?companylbl .
    
    FILTER (?num_owners = ?max_num)
}
GROUP BY ?country1 ?country1lbl ?company ?companylbl ?max_num
ORDER BY DESC (?max_num)
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('country1', 'http://www.wikidata.org/entity/Q38'), ('country1lbl', 'Italy'), ('company', 'http://www.wikidata.org/entity/Q936563'), ('companylbl', 'Italian Bourse'), ('max_num', '7')]
[('country1', 'http://www.wikidata.org/entity/Q214'), ('country1lbl', 'Slovakia'), ('company', 'http://www.wikidata.org/entity/Q15966341'), ('companylbl', 'Tatry Mountain Resorts'), ('max_num', '5')]
[('country1', 'http://www.wikidata.org/entity/Q39'), ('country1lbl', 'Switzerland'), ('company', 'http://www.wikidata.org/entity/Q691563'), ('companylbl', 'Nord Stream AG'), ('max_num', '5')]
[('country1', 'http://www.wikidata.org/entity/Q145'), ('country1lbl', 'United Kingdom'), ('company', 'http://www.wikidata.org/entity/Q278810'), ('companylbl', 'Pennon Group'), ('max_num', '5')]
[('country1', 'http://www.wikidata.org/entity/Q785'), ('country1lbl', 'Jersey'), ('company', 'http://www.wikidata.org/entity/Q58425902'), ('companylbl', 'Penta Investments Group Limited'), ('max_num', '5')]
[('country1',

10
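
The query above follows a "maximum per group, then join back" pattern: one subquery computes the maximum owner count per country, the other recomputes the per-company counts, and the final `FILTER` keeps the companies that reach the maximum. The same idea sketched in Python, using the five complete rows from the In [74] output; `top_per_country` is a hypothetical helper and ties are kept on purpose:

```python
from collections import defaultdict

# (country, company, number of foreign owners) from the In [74] output.
rows = [
    ("Italy", "Italian Bourse", 7),
    ("Italy", "Cefriel", 6),
    ("United Kingdom", "Pennon Group", 5),
    ("Switzerland", "Nord Stream AG", 5),
    ("Czech Republic", "Vítkovice Steel", 5),
]


def top_per_country(rows):
    """Return, per country, every company that reaches the country's maximum."""
    # Step 1: maximum count per country (the inner MAX subquery).
    best = {}
    for country, _company, count in rows:
        best[country] = max(best.get(country, 0), count)
    # Step 2: keep the rows whose count equals that maximum (the FILTER).
    winners = defaultdict(list)
    for country, company, count in rows:
        if count == best[country]:
            winners[country].append((company, count))
    return dict(winners)


winners = top_per_country(rows)
for country, companies in winners.items():
    print(country, companies)
```

If two companies in one country share the maximum, both are returned, which is the behaviour to expect from the join on `?num_owners = ?max_num` as well.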

In [81]:
# get the maximum number of foreign owners of a business for each country

queryString = """
SELECT ?country1 (MAX(?num_owners) AS ?max_num)
WHERE {

    {

        SELECT DISTINCT ?company ?country1 (COUNT(?owner) AS ?num_owners)
        WHERE { 

        ?company wdt:P31 wd:Q4830453 ;
                 wdt:P127 ?owner ;
                 wdt:P17 ?country1 .

        ?owner wdt:P31 wd:Q4830453 ;
               wdt:P17 ?country2 .

        FILTER (?country1 != ?country2)

        } 
        GROUP BY ?company ?country1

    }

}
GROUP BY ?country1
ORDER BY DESC (?max_num)
"""

print("Results")
run_query(queryString)

Results
[('country1', 'http://www.wikidata.org/entity/Q38'), ('max_num', '7')]
[('country1', 'http://www.wikidata.org/entity/Q145'), ('max_num', '5')]
[('country1', 'http://www.wikidata.org/entity/Q785'), ('max_num', '5')]
[('country1', 'http://www.wikidata.org/entity/Q39'), ('max_num', '5')]
[('country1', 'http://www.wikidata.org/entity/Q213'), ('max_num', '5')]
[('country1', 'http://www.wikidata.org/entity/Q214'), ('max_num', '5')]
[('country1', 'http://www.wikidata.org/entity/Q28'), ('max_num', '4')]
[('country1', 'http://www.wikidata.org/entity/Q650667'), ('max_num', '4')]
[('country1', 'http://www.wikidata.org/entity/Q40'), ('max_num', '4')]
[('country1', 'http://www.wikidata.org/entity/Q219'), ('max_num', '4')]
[('country1', 'http://www.wikidata.org/entity/Q30'), ('max_num', '4')]
[('country1', 'http://www.wikidata.org/entity/Q408'), ('max_num', '4')]
[('country1', 'http://www.wikidata.org/entity/Q912'), ('max_num', '4')]
[('country1', 'http://www.wikidata.org/entity/Q16'), ('max

125