# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page)

An exploratory workload is composed of a set of queries, where each query builds on the information obtained previously.

An exploratory workload starts with a usually vague, open-ended question, and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the Python notebook
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook
3. You do not delete useless queries. Keep everything that is syntactically valid

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.
    
    

In [2]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-ea81ce57ad-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
       results = sparql.query()
       json_results = results.convert()
       if len(json_results['results']['bindings'])==0:
          print("Empty")
          return 0
    
       for bindings in json_results['results']['bindings']:
          print( [ (var, value['value'])  for var, value in bindings.items() ] )

       return len(json_results['results']['bindings'])

    except Exception as e :
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
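
`run_query` prints each row and returns only the row count. When the rows need further processing in Python, they can be flattened first. The following is a minimal sketch, assuming the standard SPARQL 1.1 JSON results layout that the function above already indexes (`results`, `bindings`, `value`); `bindings_to_rows` and the sample data are hypothetical and not part of the original notebook.

```python
from typing import Any, Dict, List


def bindings_to_rows(json_results: Dict[str, Any]) -> List[Dict[str, str]]:
    """Flatten SPARQL JSON results into one plain dict per result row."""
    bindings = json_results.get("results", {}).get("bindings", [])
    # Unbound variables are simply absent from a row, as in the printed output.
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in bindings
    ]


# Hand-written sample in the SPARQL JSON results layout (not real endpoint output).
sample = {
    "head": {"vars": ["cName", "obj"]},
    "results": {
        "bindings": [
            {
                "cName": {"type": "literal", "value": "Example Corp"},
                "obj": {"type": "literal", "value": "1000"},
            },
            {"cName": {"type": "literal", "value": "Other Ltd"}},
        ]
    },
}

rows = bindings_to_rows(sample)
print(rows)
```

Returning plain dictionaries keeps the printing logic separate from the data, so the same rows could be sorted, counted, or compared across queries.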


# Companies Workflow Series ("IT Companies explorative search") 

Consider the following exploratory information need:

> Compare companies across different sectors in U.K., U.S., and Canada, consider number of employees, companies owned or acquired, and revenue or assets

## Useful URIs for the current workflow


The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P279`    | subclass      | predicate |
| `wdt:P17`      | country       | predicate |
| `wd:Q4830453`  | Business      | node      |
| `wd:Q13977`    | Bloomberg L.P.| node |
| `wd:Q502121`   | BlackBerry    | node |
| `wd:Q16`        | Canada        | node |
| `wd:Q145`      | U.K.          | node |
| `wd:Q30`       | U.S.A.        | node |


Also consider

```
?p wdt:P17 wd:Q16  . 
?p wdt:P31 wd:Q4830453  . 
```

is the BGP to retrieve all **Canadian businesses**

## Workload Goals

1. Identify the BGP for obtaining number of employees of a company and other relevant numerical attributes

2. Identify the BGP to retrieve all companies owned by a company

3. Is there some company that owns companies in other countries?

4. Companies have different 'legal forms', compare the number of companies divided in different legal forms

5. Analyze the number of employees  and other relevant numeric attributes
 
   5.1 What are the top-10 companies for a given attribute?
   
   5.2 For which companies is defined some form of income or market capitalization or total assets? What is the min, max, and average in each category and country?
   
   5.3 Which business in each country owns more businesses in other countries?


In [2]:
# start your workflow here

In [3]:
queryString = """
SELECT COUNT(*)
WHERE { 

?p wdt:P17 wd:Q16  . 
?p wdt:P31 wd:Q4830453  . 
} 
"""

print("Results")
run_query(queryString)

Results
[('callret-0', '2482')]


1

**1 - Identify the BGP for obtaining number of employees of a company and other relevant numerical attributes**

In [5]:
queryString = """
SELECT DISTINCT ?p ?pName
WHERE { 

?company wdt:P17 wd:Q16  . 
?company wdt:P31 wd:Q4830453  . 
?company ?p ?obj .

?company <http://schema.org/name> ?cName .
?p <http://schema.org/name> ?pName .
} 
ORDER BY ASC(?pName)

"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P6269'), ('pName', 'API endpoint')]
[('p', 'http://www.wikidata.org/prop/direct/P6150'), ('pName', 'Academy Awards Database nominee ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1661'), ('pName', 'Alexa rank')]
[('p', 'http://www.wikidata.org/prop/direct/P6683'), ('pName', 'Alexander Turnbull Library ID')]
[('p', 'http://www.wikidata.org/prop/direct/P1728'), ('pName', 'AllMusic artist ID')]
[('p', 'http://www.wikidata.org/prop/direct/P9618'), ('pName', 'AlternativeTo software ID')]
[('p', 'http://www.wikidata.org/prop/direct/P6276'), ('pName', 'Amazon Music artist ID')]
[('p', 'http://www.wikidata.org/prop/direct/P3276'), ('pName', 'Angel List ID')]
[('p', 'http://www.wikidata.org/prop/direct/P3861'), ('pName', 'App Store app ID')]
[('p', 'http://www.wikidata.org/prop/direct/P5260'), ('pName', 'App Store developer ID')]
[('p', 'http://www.wikidata.org/prop/direct/P2850'), ('pName', 'Apple Music artist ID (American version)')]
[('

365

These could be numeric and/or otherwise useful properties <br>
[('p', 'http://www.wikidata.org/prop/direct/P4103'), ('pName', 'assets under management')] <br>
[('p', 'http://www.wikidata.org/prop/direct/P1128'), ('pName', 'employees')] <br>
[('p', 'http://www.wikidata.org/prop/direct/P8340'), ('pName', 'estimated value')] <br>
[('p', 'http://www.wikidata.org/prop/direct/P2295'), ('pName', 'net profit')] <br>
[('p', 'http://www.wikidata.org/prop/direct/P2403'), ('pName', 'total assets')] <br>
[('p', 'http://www.wikidata.org/prop/direct/P2137'), ('pName', 'total equity')] <br>
[('p', 'http://www.wikidata.org/prop/direct/P2139'), ('pName', 'total revenue')] <br>
[('p', 'http://www.wikidata.org/prop/direct/P127'), ('pName', 'owned by')] <br>
[('p', 'http://www.wikidata.org/prop/direct/P1830'), ('pName', 'owner of')] <br>
[('p', 'http://www.wikidata.org/prop/direct/P749'), ('pName', 'parent organization')] <br>
[('p', 'http://www.wikidata.org/prop/direct/P361'), ('pName', 'part of')] <br>
[('p', 'http://www.wikidata.org/prop/direct/P710'), ('pName', 'participant')] <br>
[('p', 'http://www.wikidata.org/prop/direct/P1344'), ('pName', 'participant in')] <br>
[('p', 'http://www.wikidata.org/prop/direct/P2652'), ('pName', 'partnership with')] <br>
[('p', 'http://www.wikidata.org/prop/direct/P2226'), ('pName', 'market capitalization')] <br>
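
Whether the candidate properties listed above really hold numeric values can be checked by looking at the datatype of their objects. The following is a minimal sketch that only builds the query string; `datatype_probe_query` is a hypothetical helper, and the use of the standard SPARQL `DATATYPE` function on this endpoint is an assumption that has not been run here.

```python
import re


def datatype_probe_query(property_id: str, limit: int = 20) -> str:
    """Build a query listing the distinct value datatypes of one property."""
    if not re.fullmatch(r"P\d+", property_id):
        raise ValueError(f"not a Wikidata property id: {property_id!r}")
    return f"""
SELECT DISTINCT (DATATYPE(?obj) AS ?dt)
WHERE {{
  ?company wdt:P31 wd:Q4830453 .
  ?company wdt:{property_id} ?obj .
}}
LIMIT {limit}
"""


# Property ids taken from the list above.
for pid in ["P4103", "P1128", "P8340", "P2295", "P2403", "P2137", "P2139", "P2226"]:
    print(datatype_probe_query(pid))
```

Each generated string could then be passed to `run_query`; a property whose values are entities rather than literals would leave `?dt` unbound.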

In [7]:
queryString = """
SELECT DISTINCT ?cName ?obj
WHERE { 

?company wdt:P17 wd:Q16  . 
?company wdt:P31 wd:Q4830453  . 
?company wdt:P4103 ?obj . #num Asset ?

?company <http://schema.org/name> ?cName .
OPTIONAL{?obj <http://schema.org/name> ?objName .}

} 
ORDER BY ASC(?cName)
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('cName', 'Brookfield Asset Management'), ('obj', '239825000000')]
[('cName', 'Caisse de dépôt et placement du Québec'), ('obj', '298500000000')]
[('cName', 'Great-West Lifeco'), ('obj', '565700000000')]
[('cName', 'Manulife Financial'), ('obj', '795100000000')]
[('cName', 'Royal Bank of Canada'), ('obj', '577200000000')]
[('cName', 'Sun Life Financial'), ('obj', '820800000000')]


6

In [None]:
I explore the other properties

In [22]:
queryString = """
SELECT DISTINCT (?cName AS ?companyName) (?obj1 AS ?Asset) (?obj2 AS ?numEm) (?ownby AS ?OwnedBy ) ?obj3 (?parentO AS ?Parent)
WHERE { 

?company wdt:P31 wd:Q4830453  . 
OPTIONAL {?company wdt:P4103 ?obj1 .} #num Asset
OPTIONAL {?company wdt:P1128 ?obj2 .} #num employees
OPTIONAL {?company wdt:P127 ?obj3 . #owned by
            ?obj3 <http://schema.org/name> ?ownby .}
OPTIONAL {?company wdt:P749 ?obj4 . #parent organization
            ?obj4 <http://schema.org/name> ?parentO .}

?company <http://schema.org/name> ?cName .

} 
ORDER BY ASC(?companyName)
LIMIT 500
"""

print("Results")
run_query(queryString)

Results
[('companyName', '"APR"')]
[('companyName', '"ATOEX"')]
[('companyName', '"Agentura REPRE" společnost s ručením omezeným')]
[('companyName', '"BETTINI S.R.L."')]
[('companyName', '"CZ TOP TRADE s.r.o."')]
[('companyName', '"ELVE svěřenský fond"')]
[('companyName', '"Hlásek, spol. s.r.o."')]
[('companyName', '"K" Line RoRo Bulk Ship Management')]
[('companyName', '"KF DOMENA" SPÓŁKA Z OGRANICZONA ODPOWIEDZIALNOSCIA')]
[('companyName', '"Kosmos" Tabak- & Cigarettenfabrik H. F. Wolf')]
[('companyName', '"Lékárna U nemocnice"')]
[('companyName', '"Nadace 700 let města Plzně"')]
[('companyName', '"Nadace Rudolfa Löwyho a plzeňských židů na záchranu plzeňských synagog"')]
[('companyName', '"R + R"')]
[('companyName', '"South Bohemia Star" společnost s ručením omezeným')]
[('companyName', '"Tabak Polska"')]
[('companyName', '"Yansour" Türkische Zigarettenfabrik H. Kemaleddin & Co. GmbH')]
[('companyName', '%Arabica')]
[('companyName', "''CableTV Co,Ltd")]
[('companyName', "''PDC")]
[(

500

It worked, so I can include the other properties. Test on BlackBerry.

In [14]:
queryString = """
SELECT DISTINCT (?obj1 AS ?Num_Asset) (?obj2 AS ?num_Employees) (?obj5 AS ?est_value) (?obj6 AS ?net_profit) (?obj7 AS ?tot_rev) (?ownby AS ?OwnedBy )(?parentO AS ?Parent) 
WHERE { 


OPTIONAL {wd:Q502121 wdt:P4103 ?obj1 .} #num Asset
OPTIONAL {wd:Q502121 wdt:P1128 ?obj2 .} #num employees
OPTIONAL {wd:Q502121 wdt:P127 ?obj3 . #owned by
            ?obj3 <http://schema.org/name> ?ownby .}
OPTIONAL {wd:Q502121 wdt:P749 ?obj4 . #parent organization
            ?obj4 <http://schema.org/name> ?parentO .}
OPTIONAL {wd:Q502121 wdt:P8340 ?obj5 .} #est value

OPTIONAL {wd:Q502121 wdt:P2295 ?obj6 .} #net profit

OPTIONAL {wd:Q502121 wdt:P2139 ?obj7 .} #tot revenue
} 

LIMIT 500
"""

print("Results")
run_query(queryString)

Results
[('num_Employees', '3647'), ('net_profit', '-152000000'), ('tot_rev', '1040000000')]


1

General BGP

In [19]:
queryString = """
SELECT DISTINCT ?cName (?obj1 AS ?Num_Asset) (?obj2 AS ?num_Employees) (?obj5 AS ?est_value) (?obj6 AS ?net_profit) (?obj7 AS ?tot_rev) (GROUP_CONCAT(DISTINCT ?ownby;separator='; ') AS ?OwnedBy )(?parentO AS ?Parent) 
WHERE { 

?company wdt:P31 wd:Q4830453  .

OPTIONAL {?company wdt:P4103 ?obj1 .} #num Asset
OPTIONAL {?company wdt:P1128 ?obj2 .} #num employees
OPTIONAL {?company wdt:P127 ?obj3 . #owned by
            ?obj3 <http://schema.org/name> ?ownby .}
OPTIONAL {?company wdt:P749 ?obj4 . #parent organization
            ?obj4 <http://schema.org/name> ?parentO .}
OPTIONAL {?company wdt:P8340 ?obj5 .} #est value

OPTIONAL {?company wdt:P2295 ?obj6 .} #net profit

OPTIONAL {?company wdt:P2139 ?obj7 .} #tot revenue

?company <http://schema.org/name> ?cName.
} 
ORDER BY DESC (?num_Employees)
LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('cName', 'Walmart'), ('num_Employees', '2500000'), ('net_profit', '13643000000'), ('tot_rev', '559151000000'), ('OwnedBy', 'BlackRock; State Street Corporation; The Vanguard Group; Walton Enterprises')]
[('cName', 'Amazon'), ('num_Employees', '1298000'), ('net_profit', '21300000000'), ('tot_rev', '386064000000'), ('OwnedBy', 'BlackRock; Jeff Bezos; The Vanguard Group')]
[('cName', 'Foxconn'), ('num_Employees', '618460'), ('net_profit', '151357164000'), ('tot_rev', '4358733357000'), ('OwnedBy', 'Terry Gou; The Vanguard Group')]
[('cName', 'PetroChina Company Limited'), ('num_Employees', '521566'), ('net_profit', '42098000000'), ('tot_rev', '1725428000000'), ('OwnedBy', 'China National Petroleum Corporation'), ('Parent', 'China National Petroleum Corporation')]
[('cName', 'Siemens'), ('num_Employees', '372000'), ('net_profit', '5584000000'), ('tot_rev', '83000000000'), ('OwnedBy', 'BlackRock; Qatar Investment Authority; Siemens family; treasury stock')]
[('cName', 'Rikushet'), 

50

**Answer 1** 

The BGP to obtain the number of employees and other numerical values is:

```
?company wdt:P31 wd:Q4830453  .

OPTIONAL {?company wdt:P4103 ?obj1 .} #num Asset
OPTIONAL {?company wdt:P1128 ?obj2 .} #num employees
OPTIONAL {?company wdt:P127 ?obj3 . #owned by
            ?obj3 <http://schema.org/name> ?ownby .}
OPTIONAL {?company wdt:P749 ?obj4 . #parent organization
            ?obj4 <http://schema.org/name> ?parentO .}
OPTIONAL {?company wdt:P8340 ?obj5 .} #est value

OPTIONAL {?company wdt:P2295 ?obj6 .} #net profit

OPTIONAL {?company wdt:P2139 ?obj7 .} #tot revenue

?company <http://schema.org/name> ?cName.

```
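
Since every numeric attribute in the BGP above follows the same `OPTIONAL` shape, the pattern can be generated from a mapping of property ids to variable names. This is a minimal sketch: the function name and the variable names are illustrative, and only the property ids come from the queries above.

```python
from typing import Dict

# Property ids found by the exploration query; variable names are illustrative.
NUMERIC_PROPERTIES: Dict[str, str] = {
    "P4103": "assets_under_management",
    "P1128": "employees",
    "P8340": "estimated_value",
    "P2295": "net_profit",
    "P2139": "total_revenue",
}


def optional_attributes_bgp(properties: Dict[str, str], subject: str = "company") -> str:
    """Build a BGP with one OPTIONAL block per numeric property."""
    lines = [f"?{subject} wdt:P31 wd:Q4830453 ."]
    for property_id, variable in properties.items():
        # Each attribute is optional so that companies missing it are kept.
        lines.append(f"OPTIONAL {{ ?{subject} wdt:{property_id} ?{variable} . }}")
    lines.append(f"?{subject} <http://schema.org/name> ?cName .")
    return "\n".join(lines)


bgp = optional_attributes_bgp(NUMERIC_PROPERTIES)
print(bgp)
```

Keeping the properties in one dictionary makes it easy to drop or add an attribute without editing the query text by hand.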

----------------

**2 - Identify the BGP to retrieve all companies owned by a company**

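Look at BlackBerry and Disney. Note that the next three queries select `?ownby` but bind `?ownBy` in the pattern; SPARQL variable names are case-sensitive, so the `OwnedBy` column is never bound in their results.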

In [15]:
queryString = """
SELECT (?ownby AS ?OwnedBy )(?parentO AS ?Parent) ?obj5 (?ownerOf AS ?OwnerOf)
WHERE { 

wd:Q502121 wdt:P31 wd:Q4830453  . 

OPTIONAL {wd:Q502121 wdt:P127 ?obj3 . #owned by
            ?obj3 <http://schema.org/name> ?ownBy .}
OPTIONAL {wd:Q502121 wdt:P749 ?obj4 . #parent organization
            ?obj4 <http://schema.org/name> ?parentO .}
OPTIONAL {wd:Q502121 wdt:P1830 ?obj5 . #owner of
            ?obj5 <http://schema.org/name> ?ownerOf .}

} 
LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('obj5', 'http://www.wikidata.org/entity/Q84790062'), ('OwnerOf', 'BlackBerry')]


1

In [25]:
queryString = """
SELECT (?ownby AS ?OwnedBy )(?parentO AS ?Parent) (?ownerOf AS ?OwnerOf) ?obj5
WHERE {

wd:Q7414 wdt:P31 wd:Q4830453  . 

OPTIONAL {wd:Q7414 wdt:P127 ?obj3 . #owned by
            ?obj3 <http://schema.org/name> ?ownBy .}
OPTIONAL {wd:Q7414 wdt:P749 ?obj4 . #parent organization
            ?obj4 <http://schema.org/name> ?parentO .}
OPTIONAL {wd:Q7414 wdt:P1830 ?obj5 . #owner of
            ?obj5 <http://schema.org/name> ?ownerOf .}

} 
LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('OwnerOf', 'Club Penguin'), ('obj5', 'http://www.wikidata.org/entity/Q390223')]
[('OwnerOf', 'YourTV'), ('obj5', 'http://www.wikidata.org/entity/Q22079560')]
[('OwnerOf', 'Endemol Shine UK'), ('obj5', 'http://www.wikidata.org/entity/Q5376047')]
[('OwnerOf', 'Walt Disney Television'), ('obj5', 'http://www.wikidata.org/entity/Q3565642')]
[('OwnerOf', 'Disneyland'), ('obj5', 'http://www.wikidata.org/entity/Q181185')]
[('OwnerOf', 'Beauty and the Beast'), ('obj5', 'http://www.wikidata.org/entity/Q16155122')]
[('OwnerOf', 'Disney Channel Netherlands & Flanders'), ('obj5', 'http://www.wikidata.org/entity/Q2592540')]
[('OwnerOf', '20,000 Leagues Under the Sea: Submarine Voyage'), ('obj5', 'http://www.wikidata.org/entity/Q2813838')]
[('OwnerOf', 'Disney Theatrical Group'), ('obj5', 'http://www.wikidata.org/entity/Q4132994')]
[('OwnerOf', 'Buena Vista International Television'), ('obj5', 'http://www.wikidata.org/entity/Q724613')]
[('OwnerOf', 'Hotstar'), ('obj5', 'http://www.wikidata.

50

In [27]:
##LucasArts
queryString = """
SELECT (?ownby AS ?OwnedBy )(?parentO AS ?Parent) (?ownerOf AS ?OwnerOf) ?obj5
WHERE {

wd:Q216611 wdt:P31 ?p

OPTIONAL {wd:Q216611 wdt:P127 ?obj3 . #owned by
            ?obj3 <http://schema.org/name> ?ownBy .}
OPTIONAL {wd:Q216611 wdt:P749 ?obj4 . #parent organization
            ?obj4 <http://schema.org/name> ?parentO .}
OPTIONAL {wd:Q216611 wdt:P1830 ?obj5 . #owner of
            ?obj5 <http://schema.org/name> ?ownerOf .}

} 
LIMIT 50
"""

print("Results")
run_query(queryString)

Results
[('Parent', 'Lucasfilm')]
[('Parent', 'Lucasfilm')]


2

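If a company owns a company that in turn owns others, I would like to be able to retrieve all the sub-companies.
Counting with a property path, Disney "owns" 345 companies. The count is over result rows (one per `?obj5`/`?obj6` pair), not over distinct companies.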

In [26]:
queryString = """
SELECT COUNT(*)
WHERE {

wd:Q7414 wdt:P31 wd:Q4830453  . 
wd:Q7414 wdt:P1830* ?obj5 . #owner of

OPTIONAL{?obj5 wdt:P1830+ ?obj6 .}

} 
"""

print("Results")
run_query(queryString)

Results
[('callret-0', '345')]


1

In [29]:
queryString = """
SELECT DISTINCT ?obj5 ?cName ?obj6 ?cName1
WHERE {

wd:Q7414 wdt:P31 wd:Q4830453  . 
wd:Q7414 wdt:P1830* ?obj5 . #owner of

?obj5 <http://schema.org/name> ?cName .

OPTIONAL{
?obj5 wdt:P1830+ ?obj6 .
?obj6 <http://schema.org/name> ?cName1 .
}

} 
ORDER BY ASC(?obj5)
"""

print("Results")
run_query(queryString)

Results
[('obj5', 'http://www.wikidata.org/entity/Q10279075'), ('cName', 'Fox Movies')]
[('obj5', 'http://www.wikidata.org/entity/Q1047009'), ('cName', 'Disney Store')]
[('obj5', 'http://www.wikidata.org/entity/Q1047410'), ('cName', 'Walt Disney Animation Studios')]
[('obj5', 'http://www.wikidata.org/entity/Q1052042'), ('cName', 'Epcot')]
[('obj5', 'http://www.wikidata.org/entity/Q1118859'), ('cName', 'Fox Crime')]
[('obj5', 'http://www.wikidata.org/entity/Q1141129'), ('cName', 'CrossGen')]
[('obj5', 'http://www.wikidata.org/entity/Q11681754'), ('cName', 'FXX')]
[('obj5', 'http://www.wikidata.org/entity/Q1169075'), ('cName', 'Lucasfilm Animation')]
[('obj5', 'http://www.wikidata.org/entity/Q1189112'), ('cName', 'Nat Geo Music')]
[('obj5', 'http://www.wikidata.org/entity/Q1200552'), ('cName', 'Marvel Entertainment')]
[('obj5', 'http://www.wikidata.org/entity/Q1204077'), ('cName', 'ImageMovers')]
[('obj5', 'http://www.wikidata.org/entity/Q1228987'), ('cName', 'Disney Digital 3-D')]
[('ob

598

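Changing the operator from `*` to `+`. The `*` operator also matches the zero-length path, so Disney itself appears as `?obj5`, whereas `+` requires at least one `owner of` step. With `+` I retrieve all the companies controlled by Disney.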

In [30]:
queryString = """
SELECT DISTINCT ?obj5 ?cName ?obj6 ?cName1
WHERE {

wd:Q7414 wdt:P31 wd:Q4830453  . 
wd:Q7414 wdt:P1830+ ?obj5 . #owner of

?obj5 <http://schema.org/name> ?cName .

OPTIONAL{
?obj5 wdt:P1830+ ?obj6 .
?obj6 <http://schema.org/name> ?cName1 .
}

} 
ORDER BY ASC(?obj5)
"""

print("Results")
run_query(queryString)

Results
[('obj5', 'http://www.wikidata.org/entity/Q10279075'), ('cName', 'Fox Movies')]
[('obj5', 'http://www.wikidata.org/entity/Q1047009'), ('cName', 'Disney Store')]
[('obj5', 'http://www.wikidata.org/entity/Q1047410'), ('cName', 'Walt Disney Animation Studios')]
[('obj5', 'http://www.wikidata.org/entity/Q1052042'), ('cName', 'Epcot')]
[('obj5', 'http://www.wikidata.org/entity/Q1118859'), ('cName', 'Fox Crime')]
[('obj5', 'http://www.wikidata.org/entity/Q1141129'), ('cName', 'CrossGen')]
[('obj5', 'http://www.wikidata.org/entity/Q11681754'), ('cName', 'FXX')]
[('obj5', 'http://www.wikidata.org/entity/Q1169075'), ('cName', 'Lucasfilm Animation')]
[('obj5', 'http://www.wikidata.org/entity/Q1189112'), ('cName', 'Nat Geo Music')]
[('obj5', 'http://www.wikidata.org/entity/Q1200552'), ('cName', 'Marvel Entertainment')]
[('obj5', 'http://www.wikidata.org/entity/Q1204077'), ('cName', 'ImageMovers')]
[('obj5', 'http://www.wikidata.org/entity/Q1228987'), ('cName', 'Disney Digital 3-D')]
[('ob

323
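
The two path operators used in the queries above differ only in whether the zero-length path is allowed. The following sketch reproduces that difference on a toy ownership graph; the graph and the function names are made up for illustration and do not come from Wikidata.

```python
from collections import deque
from typing import Dict, List, Set

# Toy "owner of" edges; the names are invented for illustration.
OWNS: Dict[str, List[str]] = {
    "Parent": ["StudioA", "StudioB"],
    "StudioA": ["LabelX"],
    "StudioB": [],
    "LabelX": [],
}


def one_or_more(graph: Dict[str, List[str]], start: str) -> Set[str]:
    """Nodes reachable through at least one edge, like wdt:P1830+."""
    seen: Set[str] = set()
    queue = deque(graph.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph.get(node, []))
    return seen


def zero_or_more(graph: Dict[str, List[str]], start: str) -> Set[str]:
    """Like wdt:P1830*: the start node itself plus everything reachable."""
    return {start} | one_or_more(graph, start)


print(sorted(one_or_more(OWNS, "Parent")))
print(sorted(zero_or_more(OWNS, "Parent")))
```

On this toy graph the `*` variant returns one extra node, the starting company, which is why the `*` query also lists Disney as its own `?obj5`.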

**Answer 2**

The BGP for the companies owned, directly or transitively, by a business is:

```
SELECT DISTINCT ?obj5 ?cName ?obj6 ?cName1
WHERE {

?company wdt:P31 wd:Q4830453  . 
?company wdt:P1830+ ?obj5 . #owner of

?obj5 <http://schema.org/name> ?cName .

OPTIONAL{
?obj5 wdt:P1830+ ?obj6 .
?obj6 <http://schema.org/name> ?cName1 .
}

} 
ORDER BY ASC(?obj5)

```

The simpler version, restricted to directly owned companies, is:

```
SELECT DISTINCT ?obj5
WHERE {

?company wdt:P31 wd:Q4830453  . 
?company wdt:P1830 ?obj5 . #owner of

} 
ORDER BY ASC(?obj5)

```
------------------

**3 - Is there some company that owns companies in other countries?**

I search for the answer using the simpler version (direct ownership only).

In [33]:
queryString = """
SELECT DISTINCT ?cName ?cnName ?cName1 ?cnName1
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 

?company wdt:P1830 ?obj5 . #owner of
?obj5 wdt:P17 ?country1.

?obj5 <http://schema.org/name> ?cName1 .
?company <http://schema.org/name> ?cName .
?country <http://schema.org/name> ?cnName .
?country1 <http://schema.org/name> ?cnName1 .

FILTER (?country1 != ?country)
} 
LIMIT 30
"""

print("Results")
run_query(queryString)

Results
[('cName', 'AG für Verkehrswesen'), ('cnName', 'German Reich'), ('cName1', 'Ostpreußische Kleinbahnen'), ('cnName1', 'Russia')]
[('cName', 'AG für Verkehrswesen'), ('cnName', 'German Reich'), ('cName1', 'Halle-Hettstedter Eisenbahn-Gesellschaft'), ('cnName1', 'German Democratic Republic')]
[('cName', 'AG für Verkehrswesen'), ('cnName', 'German Reich'), ('cName1', 'Industriebahn AG'), ('cnName1', 'Germany')]
[('cName', 'AG für Verkehrswesen'), ('cnName', 'German Reich'), ('cName1', 'Bremisch-Hannoversche Kleinbahn'), ('cnName1', 'Germany')]
[('cName', 'AG für Verkehrswesen'), ('cnName', 'German Reich'), ('cName1', 'Württembergische Nebenbahnen'), ('cnName1', 'Germany')]
[('cName', 'AG für Verkehrswesen'), ('cnName', 'German Reich'), ('cName1', 'Butzbach-Licher Eisenbahn'), ('cnName1', 'Germany')]
[('cName', 'AG für Verkehrswesen'), ('cnName', 'German Reich'), ('cName1', 'Frankfurt-Königsteiner Eisenbahn'), ('cnName1', 'Germany')]
[('cName', 'AG für Verkehrswesen'), ('cnName', 'G

30

Restricting to the USA (`wd:Q30`), the UK (`wd:Q145`), and Canada (`wd:Q16`)

In [45]:
queryString = """
SELECT DISTINCT ?cName ?cnName ?cName1 ?cnName1
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 

?company wdt:P1830 ?obj5 . #owner of
?obj5 wdt:P17 ?country1.

?obj5 <http://schema.org/name> ?cName1 .
?company <http://schema.org/name> ?cName .
?country <http://schema.org/name> ?cnName .
?country1 <http://schema.org/name> ?cnName1 .

FILTER (?country1 != ?country)
FILTER (?country = wd:Q30 || ?country = wd:Q145 || ?country = wd:Q16)
} 
ORDER BY ASC(?cName)
LIMIT 70
"""

print("Results")
run_query(queryString)

Results
[('cName', '3G Capital'), ('cnName', 'United States of America'), ('cName1', 'Lojas Americanas'), ('cnName1', 'Brazil')]
[('cName', '3G Capital'), ('cnName', 'United States of America'), ('cName1', 'América Latina Logística'), ('cnName1', 'Brazil')]
[('cName', '3G Capital'), ('cnName', 'United States of America'), ('cName1', 'Restaurant Brands International'), ('cnName1', 'Canada')]
[('cName', 'A&E Networks'), ('cnName', 'United States of America'), ('cName1', 'Crime & Investigation Network Asia'), ('cnName1', 'Singapore')]
[('cName', 'A&E Networks'), ('cnName', 'United States of America'), ('cName1', 'Crime & Investigation Network'), ('cnName1', 'United Kingdom')]
[('cName', 'A&E Networks'), ('cnName', 'United States of America'), ('cName1', 'History TV18'), ('cnName1', 'India')]
[('cName', 'A&E Networks'), ('cnName', 'United States of America'), ('cName1', 'H2'), ('cnName1', 'Brazil')]
[('cName', 'AES Corporation'), ('cnName', 'United States of America'), ('cName1', 'Shulbins

70
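
The country restriction above is written as a chain of `||` comparisons inside a `FILTER`. An alternative is a `VALUES` clause, which is part of SPARQL 1.1; whether this endpoint evaluates it identically has not been verified here. A minimal sketch of building such a clause, with `values_clause` as a hypothetical helper:

```python
import re
from typing import Iterable


def values_clause(variable: str, entity_ids: Iterable[str]) -> str:
    """Build a VALUES clause binding a variable to a fixed set of wd: entities."""
    ids = list(entity_ids)
    for entity_id in ids:
        if not re.fullmatch(r"Q\d+", entity_id):
            raise ValueError(f"not a Wikidata entity id: {entity_id!r}")
    iris = " ".join(f"wd:{entity_id}" for entity_id in ids)
    return f"VALUES ?{variable} {{ {iris} }}"


# USA, UK, and Canada, as given in the table of useful URIs.
print(values_clause("country", ["Q30", "Q145", "Q16"]))
```

The resulting line could replace the second `FILTER` in the query, which keeps the list of countries in one place when more are added.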

I want some numerical values: for each company, the number of owned companies located outside its own country.

In [52]:
queryString = """
SELECT DISTINCT ?company ?cName ?cnName COUNT(DISTINCT(?cName1)) AS ?numOfForeignComp
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 

?company wdt:P1830 ?obj5 . #owner of
?obj5 wdt:P17 ?country1.

?obj5 <http://schema.org/name> ?cName1 .
?company <http://schema.org/name> ?cName .
?country <http://schema.org/name> ?cnName .
?country1 <http://schema.org/name> ?cnName1 .

FILTER (?country1 != ?country)
FILTER (?country = wd:Q30 || ?country = wd:Q145 || ?country = wd:Q16)
} 
GROUP BY (?cName) (?cnName) (?company)
ORDER BY ASC(?cnName)

"""

print("Results")
run_query(queryString)

Results
[('company', 'http://www.wikidata.org/entity/Q835702'), ('cName', 'Corel'), ('cnName', 'Canada'), ('numOfForeignComp', '3')]
[('company', 'http://www.wikidata.org/entity/Q303439'), ('cName', 'Legoland Discovery Centre'), ('cnName', 'Canada'), ('numOfForeignComp', '1')]
[('company', 'http://www.wikidata.org/entity/Q83420552'), ('cName', 'Iskut IMC'), ('cnName', 'Canada'), ('numOfForeignComp', '1')]
[('company', 'http://www.wikidata.org/entity/Q76039'), ('cName', "Hudson's Bay Company"), ('cnName', 'Canada'), ('numOfForeignComp', '2')]
[('company', 'http://www.wikidata.org/entity/Q808908'), ('cName', 'Barrick Gold'), ('cnName', 'Canada'), ('numOfForeignComp', '5')]
[('company', 'http://www.wikidata.org/entity/Q7318069'), ('cName', 'Revera'), ('cnName', 'Canada'), ('numOfForeignComp', '1')]
[('company', 'http://www.wikidata.org/entity/Q1412968'), ('cName', 'NRJ Group'), ('cnName', 'Canada'), ('numOfForeignComp', '1')]
[('company', 'http://www.wikidata.org/entity/Q61862560'), ('cNa

167

In [None]:
Finally, the same counts aggregated per country

In [54]:
queryString = """
SELECT DISTINCT ?cnName COUNT(DISTINCT(?company)) AS ?numCompany COUNT(DISTINCT(?cName1)) AS ?numOfForeignComp
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 

?company wdt:P1830 ?obj5 . #owner of
?obj5 wdt:P17 ?country1.

?obj5 <http://schema.org/name> ?cName1 .
?company <http://schema.org/name> ?cName .
?country <http://schema.org/name> ?cnName .
?country1 <http://schema.org/name> ?cnName1 .

FILTER (?country1 != ?country)
FILTER (?country = wd:Q30 || ?country = wd:Q145 || ?country = wd:Q16)
} 
GROUP BY (?cnName) 
ORDER BY ASC(?cnName)

"""

print("Results")
run_query(queryString)

Results
[('cnName', 'Canada'), ('numCompany', '9'), ('numOfForeignComp', '16')]
[('cnName', 'United Kingdom'), ('numCompany', '53'), ('numOfForeignComp', '126')]
[('cnName', 'United States of America'), ('numCompany', '105'), ('numOfForeignComp', '327')]


3

**Answer 3**

As we can see from the previous queries, in each of the considered countries (UK, Canada, USA) there are companies that own companies in other countries.
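
The per-country aggregation can also be cross-checked on the client side from the row-level results. The following is a minimal sketch on made-up rows; the function name and the data are hypothetical, and the counting over names (rather than entity IRIs) mirrors the `COUNT(DISTINCT(?cName1))` used in the query.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# (owner, owner country, owned company, owned company's country)
Row = Tuple[str, str, str, str]


def foreign_ownership_by_country(rows: Iterable[Row]) -> Dict[str, Tuple[int, int]]:
    """Per owner country: (distinct owners, distinct foreign owned companies)."""
    owners: Dict[str, Set[str]] = defaultdict(set)
    owned: Dict[str, Set[str]] = defaultdict(set)
    for owner, owner_country, company, company_country in rows:
        if owner_country == company_country:
            continue  # same role as FILTER (?country1 != ?country)
        owners[owner_country].add(owner)
        owned[owner_country].add(company)
    return {country: (len(owners[country]), len(owned[country])) for country in owners}


# Made-up rows for illustration only, not endpoint output.
toy_rows = [
    ("OwnerA", "Canada", "SubX", "Brazil"),
    ("OwnerA", "Canada", "SubY", "Canada"),
    ("OwnerB", "Canada", "SubX", "Brazil"),
    ("OwnerC", "United Kingdom", "SubZ", "India"),
]
summary = foreign_ownership_by_country(toy_rows)
print(summary)
```

Using sets makes the distinct counting explicit: a foreign company owned by two businesses of the same country is counted once for that country.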

-------------
**4 - Companies have different 'legal forms', compare the number of companies divided in different legal forms**

From the first exploration query I can see that `wdt:P1454` is "legal form"

In [67]:
queryString = """
SELECT DISTINCT ?cnName COUNT(DISTINCT(?lgName)) AS ?numLegalForms
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 
?company wdt:P1454 ?legalForm .

?company <http://schema.org/name> ?cName .
?country <http://schema.org/name> ?cnName .
?legalForm <http://schema.org/name> ?lgName .


FILTER (?country = wd:Q30 || ?country = wd:Q145 || ?country = wd:Q16)
} 
GROUP BY (?cnName)
ORDER BY ASC(?cnName)

"""

print("Results")
run_query(queryString)

Results
[('cnName', 'Canada'), ('numLegalForms', '30')]
[('cnName', 'United Kingdom'), ('numLegalForms', '43')]
[('cnName', 'United States of America'), ('numLegalForms', '84')]


3

Now I look for the number of companies in each legal form, starting with Canada because it has the fewest distinct legal forms (30).

In [66]:
queryString = """
SELECT DISTINCT ?cnName ?lgName COUNT(DISTINCT(?company)) AS ?num
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 
?company wdt:P1454 ?legalForm .

?company <http://schema.org/name> ?cName .
?country <http://schema.org/name> ?cnName .
?legalForm <http://schema.org/name> ?lgName .


FILTER (?country = wd:Q16)
} 
GROUP BY (?cnName) (?lgName)
ORDER BY DESC (?num)

"""

print("Results")
run_query(queryString)

Results
[('cnName', 'Canada'), ('lgName', 'corporation'), ('num', '85')]
[('cnName', 'Canada'), ('lgName', 'joint-stock company'), ('num', '42')]
[('cnName', 'Canada'), ('lgName', 'public company'), ('num', '18')]
[('cnName', 'Canada'), ('lgName', 'privately held company'), ('num', '14')]
[('cnName', 'Canada'), ('lgName', 'company'), ('num', '11')]
[('cnName', 'Canada'), ('lgName', 'Crown corporation of Canada'), ('num', '11')]
[('cnName', 'Canada'), ('lgName', 'subsidiary'), ('num', '9')]
[('cnName', 'Canada'), ('lgName', 'S.A.'), ('num', '5')]
[('cnName', 'Canada'), ('lgName', 'cooperative'), ('num', '3')]
[('cnName', 'Canada'), ('lgName', 'GmbH & Co. KG'), ('num', '2')]
[('cnName', 'Canada'), ('lgName', 'Aktiengesellschaft'), ('num', '2')]
[('cnName', 'Canada'), ('lgName', 'private limited liability company'), ('num', '2')]
[('cnName', 'Canada'), ('lgName', 'limited partnership'), ('num', '2')]
[('cnName', 'Canada'), ('lgName', 'limited liability company'), ('num', '2')]
[('cnName',

30

Trying to include the UK as well (the USA is added in the query under Answer 4)

In [5]:
queryString = """

SELECT DISTINCT ?cnName ?lgName COUNT(DISTINCT(?company)) AS ?num
WHERE {


{
?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 
?company wdt:P1454 ?legalForm .

?company <http://schema.org/name> ?cName .
?country <http://schema.org/name> ?cnName .
?legalForm <http://schema.org/name> ?lgName .


FILTER (?country = wd:Q16)

}
UNION 
{
?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 
?company wdt:P1454 ?legalForm .

?company <http://schema.org/name> ?cName .
?country <http://schema.org/name> ?cnName .
?legalForm <http://schema.org/name> ?lgName .


FILTER (?country = wd:Q145)
}

} 
GROUP BY (?cnName) (?lgName)
ORDER BY DESC (?num) ASC(?cnName)


"""

print("Results")
run_query(queryString)

Results
[('cnName', 'United Kingdom'), ('lgName', 'public limited company'), ('num', '240')]
[('cnName', 'United Kingdom'), ('lgName', 'private company limited by shares'), ('num', '218')]
[('cnName', 'Canada'), ('lgName', 'corporation'), ('num', '85')]
[('cnName', 'United Kingdom'), ('lgName', 'privately held company'), ('num', '48')]
[('cnName', 'United Kingdom'), ('lgName', 'limited company'), ('num', '46')]
[('cnName', 'Canada'), ('lgName', 'joint-stock company'), ('num', '42')]
[('cnName', 'United Kingdom'), ('lgName', 'public company'), ('num', '40')]
[('cnName', 'United Kingdom'), ('lgName', 'joint-stock company'), ('num', '35')]
[('cnName', 'United Kingdom'), ('lgName', 'private limited liability company'), ('num', '32')]
[('cnName', 'Canada'), ('lgName', 'public company'), ('num', '18')]
[('cnName', 'Canada'), ('lgName', 'privately held company'), ('num', '14')]
[('cnName', 'United Kingdom'), ('lgName', 'Registered Society'), ('num', '13')]
[('cnName', 'United Kingdom'), ('lgN

73

**Answer 4**

Here I consider only the top 4 legal forms for each country (Canada, UK, USA) to get a compact view of the domain.

We can see that "privately held company" is present in the top 4 of each country.

In [12]:
queryString = """

SELECT DISTINCT  ?cnName ?lgName ?num WHERE {
    {
    
SELECT DISTINCT ?cnName ?lgName COUNT(DISTINCT(?company)) AS ?num
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 
?company wdt:P1454 ?legalForm .

?company <http://schema.org/name> ?cName .
?country <http://schema.org/name> ?cnName .
?legalForm <http://schema.org/name> ?lgName .


FILTER (?country = wd:Q16)
} 
GROUP BY (?cnName) (?lgName)
ORDER BY DESC (?num)
LIMIT 4

    }
  UNION
   {
     SELECT DISTINCT ?cnName ?lgName COUNT(DISTINCT(?company)) AS ?num
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 
?company wdt:P1454 ?legalForm .

?company <http://schema.org/name> ?cName .
?country <http://schema.org/name> ?cnName .
?legalForm <http://schema.org/name> ?lgName .


FILTER (?country = wd:Q145)
} 
GROUP BY (?cnName) (?lgName)
ORDER BY DESC (?num)
LIMIT 4
    }
UNION
{
     SELECT DISTINCT ?cnName ?lgName COUNT(DISTINCT(?company)) AS ?num
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 
?company wdt:P1454 ?legalForm .

?company <http://schema.org/name> ?cName .
?country <http://schema.org/name> ?cnName .
?legalForm <http://schema.org/name> ?lgName .


FILTER (?country = wd:Q30)
} 
GROUP BY (?cnName) (?lgName)
ORDER BY DESC (?num)
LIMIT 4
    }
    }
ORDER BY ASC (?cnName)
    """

print("Results")
run_query(queryString)

Results
[('cnName', 'Canada'), ('lgName', 'corporation'), ('num', '85')]
[('cnName', 'Canada'), ('lgName', 'joint-stock company'), ('num', '42')]
[('cnName', 'Canada'), ('lgName', 'public company'), ('num', '18')]
[('cnName', 'Canada'), ('lgName', 'privately held company'), ('num', '14')]
[('cnName', 'United Kingdom'), ('lgName', 'public limited company'), ('num', '240')]
[('cnName', 'United Kingdom'), ('lgName', 'private company limited by shares'), ('num', '218')]
[('cnName', 'United Kingdom'), ('lgName', 'privately held company'), ('num', '48')]
[('cnName', 'United Kingdom'), ('lgName', 'limited company'), ('num', '46')]
[('cnName', 'United States of America'), ('lgName', 'public company'), ('num', '433')]
[('cnName', 'United States of America'), ('lgName', 'joint-stock company'), ('num', '315')]
[('cnName', 'United States of America'), ('lgName', 'privately held company'), ('num', '281')]
[('cnName', 'United States of America'), ('lgName', 'limited liability company'), ('num', '192

12

----------------

**5 - Analyze the number of employees and other relevant numeric attributes**



***5.1 What are the top-10 companies for a given attribute?***

I consider the number of employees and the net profit, but the query can be personalized by keeping only the attributes of interest.
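
A minimal sketch of how such a personalization could look: the helper below builds a top-N query string for a single property. `top_n_query` and its parameter names are hypothetical and not part of the original notebook; the property and class IDs are the ones already used in this section. Unlike the cells that follow, the property pattern is not wrapped in `OPTIONAL`, so companies without a value are left out.

```python
def top_n_query(prop_id, alias, limit=10):
    """Build a SPARQL query listing the companies with the highest value of one property.

    prop_id: Wikidata property ID, e.g. "P1128" (number of employees)
    alias:   name of the output variable, without the leading "?"
    """
    # Guard against passing an entity ID (Q...) or a malformed string.
    if not (prop_id.startswith("P") and prop_id[1:].isdigit()):
        raise ValueError(f"not a property ID: {prop_id!r}")
    return f"""
SELECT DISTINCT ?cName ?{alias}
WHERE {{
  ?company wdt:P31 wd:Q4830453 .
  ?company wdt:{prop_id} ?{alias} .
  ?company <http://schema.org/name> ?cName .
}}
ORDER BY DESC(?{alias})
LIMIT {limit}
"""

queryString = top_n_query("P2295", "net_profit")  # P2295: net profit
print(queryString)
```

The resulting string can then be passed to `run_query` like any other query in the notebook.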

In [None]:
queryString = """
SELECT DISTINCT ?cName (?obj1 AS ?Num_Asset) (?obj2 AS ?num_Employees) (?obj5 AS ?est_value) (?obj6 AS ?net_profit) (?obj7 AS ?tot_rev) (GROUP_CONCAT(DISTINCT ?ownby;separator='; ') AS ?OwnedBy )(?parentO AS ?Parent) 
WHERE { 

?company wdt:P31 wd:Q4830453  .

OPTIONAL {?company wdt:P4103 ?obj1 .} #num Asset
OPTIONAL {?company wdt:P1128 ?obj2 .} #num employees
OPTIONAL {?company wdt:P127 ?obj3 . #owned by
            ?obj3 <http://schema.org/name> ?ownby .}
OPTIONAL {?company wdt:P749 ?obj4 . #parent organization
            ?obj4 <http://schema.org/name> ?parentO .}
OPTIONAL {?company wdt:P8340 ?obj5 .} #est value

OPTIONAL {?company wdt:P2295 ?obj6 .} #net profit

OPTIONAL {?company wdt:P2139 ?obj7 .} #tot revenue

?company <http://schema.org/name> ?cName.
} 
ORDER BY DESC (?num_Employees)
LIMIT 50
"""

print("Results")
run_query(queryString)

*TOP 10 for the number of EMPLOYEES*

In [20]:
queryString = """
SELECT DISTINCT ?cName (?obj2 AS ?num_Employees)
WHERE { 

?company wdt:P31 wd:Q4830453  .

OPTIONAL {?company wdt:P1128 ?obj2 .} #num employees

?company <http://schema.org/name> ?cName.
} 
ORDER BY DESC (?num_Employees)
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('cName', 'Rikushet'), ('num_Employees', 'nodeID://b70728')]
[('cName', 'Walmart'), ('num_Employees', '2500000')]
[('cName', 'China National Petroleum Corporation'), ('num_Employees', '1589508')]
[('cName', 'State Grid Corporation of China'), ('num_Employees', '1581000')]
[('cName', 'Amazon'), ('num_Employees', '1298000')]
[('cName', 'Randstad N.V.'), ('num_Employees', '658580')]
[('cName', 'Foxconn'), ('num_Employees', '618460')]
[('cName', 'G4S'), ('num_Employees', '618260')]
[('cName', 'RAO UES'), ('num_Employees', '577000')]
[('cName', 'DHL'), ('num_Employees', '570000')]


10

*TOP 10 for the net profit*, one of the most important attributes for a business

In [21]:
queryString = """
SELECT DISTINCT ?cName (?obj6 AS ?net_profit)
WHERE { 

?company wdt:P31 wd:Q4830453  .

OPTIONAL {?company wdt:P2295 ?obj6 .} #net profit

?company <http://schema.org/name> ?cName.
} 
ORDER BY DESC (?net_profit)
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('cName', 'Samsung Electronics'), ('net_profit', '22730000000000')]
[('cName', 'Toyota'), ('net_profit', '2312694000000')]
[('cName', 'Central Japan Railway Company'), ('net_profit', '391624000000')]
[('cName', 'Saudi Aramco'), ('net_profit', '330693000000')]
[('cName', 'Leica Camera'), ('net_profit', 'nodeID://b62766')]
[('cName', 'SoftBank'), ('net_profit', '1237812000000')]
[('cName', 'Sberbank of Russia'), ('net_profit', '760300000000')]
[('cName', 'Industrial and Commercial Bank of China'), ('net_profit', '317685000000')]
[('cName', 'East Japan Railway Company'), ('net_profit', '279435000000')]
[('cName', 'MOL Group'), ('net_profit', '251655000000')]


10
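
Both top-10 lists above contain a `nodeID://...` value (a blank node rather than a number), and the net-profit list is not in strictly descending numeric order. A minimal sketch of a client-side check, assuming the rows are available as (name, value) pairs: the notebook's `run_query` only prints them, so the sample rows below are copied by hand from the output above, and `numeric_top` is a hypothetical helper.

```python
def numeric_top(rows, limit=10):
    """Keep rows whose value parses as a number and sort them in descending order."""
    parsed = []
    for name, value in rows:
        try:
            parsed.append((name, float(value)))
        except ValueError:
            # e.g. 'nodeID://b62766': not a number, so the row is skipped
            continue
    return sorted(parsed, key=lambda r: r[1], reverse=True)[:limit]

# Sample rows copied from the net-profit output above.
rows = [
    ("Samsung Electronics", "22730000000000"),
    ("Toyota", "2312694000000"),
    ("Central Japan Railway Company", "391624000000"),
    ("Saudi Aramco", "330693000000"),
    ("Leica Camera", "nodeID://b62766"),
    ("SoftBank", "1237812000000"),
]
for name, value in numeric_top(rows):
    print(name, int(value))
```

This only re-sorts the values as plain numbers; it does not account for the currency in which each amount is expressed.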

------------------

***5.2 For which companies is defined some form of income or market capitalization or total assets? What is the min, max, and average in each category and country?***

In [23]:
queryString = """
SELECT DISTINCT ?lgName
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 
?company wdt:P1454 ?legalForm .

?company wdt:P2403 ?totasset .

?company <http://schema.org/name> ?cName .

?legalForm <http://schema.org/name> ?lgName .


FILTER (?country = wd:Q30 || ?country = wd:Q145 || ?country = wd:Q16)
} 

ORDER BY ASC(?cnName)

"""

print("Results")
run_query(queryString)

Results
[('lgName', 'public limited company')]
[('lgName', 'public company')]
[('lgName', 'limited liability company')]
[('lgName', 'privately held company')]
[('lgName', 'U.S. corporation')]
[('lgName', 'Delaware corporation')]
[('lgName', 'limited partnership')]
[('lgName', 'incorporation')]
[('lgName', 'government-sponsored enterprise')]
[('lgName', 'joint-stock company')]
[('lgName', 'corporation')]
[('lgName', 'private company')]
[('lgName', 'Washington corporation')]


13

**Answer 5.2** 

From the previous query, considering only the three countries, legal forms such as public limited company, corporation, etc. have a "total assets" value.

Testing for Canada, Q16

In [26]:
queryString = """
SELECT DISTINCT ?lgName MAX(?totasset)
WHERE {

?company wdt:P17  wd:Q16 .
?company wdt:P31 wd:Q4830453  . 
?company wdt:P1454 ?legalForm .

?company wdt:P2403 ?totasset .

?company <http://schema.org/name> ?cName .

?legalForm <http://schema.org/name> ?lgName .


} 

ORDER BY ASC(?cnName)

"""

print("Results")
run_query(queryString)

Results
[('lgName', 'corporation')]


1

Using market capitalization?

In [27]:
queryString = """
SELECT DISTINCT ?lgName MAX(?marketcap)
WHERE {

?company wdt:P17  wd:Q16 .
?company wdt:P31 wd:Q4830453  . 
?company wdt:P1454 ?legalForm .

?company wdt:P2226 ?marketcap .

?company <http://schema.org/name> ?cName .

?legalForm <http://schema.org/name> ?lgName .


} 

ORDER BY ASC(?cnName)

"""

print("Results")
run_query(queryString)

Results
[('lgName', 'corporation'), ('callret-1', '45900000000')]
[('lgName', 'public company'), ('callret-1', '648532358')]


2

I will use the market cap property, so I repeat the previous query with it.

In [28]:
queryString = """
SELECT DISTINCT ?lgName
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 
?company wdt:P1454 ?legalForm .

?company wdt:P2226 ?marketcap .

?company <http://schema.org/name> ?cName .

?legalForm <http://schema.org/name> ?lgName .


FILTER (?country = wd:Q30 || ?country = wd:Q145 || ?country = wd:Q16)
} 

ORDER BY ASC(?cnName)

"""

print("Results")
run_query(queryString)

Results
[('lgName', 'public limited company')]
[('lgName', 'public company')]
[('lgName', 'privately held company')]
[('lgName', 'U.S. corporation')]
[('lgName', 'Delaware corporation')]
[('lgName', 'incorporation')]
[('lgName', 'joint-stock company')]
[('lgName', 'corporation')]
[('lgName', 'Washington corporation')]


9

There are 9 different legal forms that have a market cap value.

In [36]:
queryString = """
SELECT DISTINCT ?lgName (MAX(?marketcap) AS ?max_mcap) (MIN(?marketcap) AS ?min_mcap) (AVG(?marketcap) AS ?avg_mcap)
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 
?company wdt:P1454 ?legalForm .

?company wdt:P2226 ?marketcap .

?company <http://schema.org/name> ?cName .
?country <http://schema.org/name> ?cnName .

?legalForm <http://schema.org/name> ?lgName .


FILTER (?country = wd:Q30 || ?country = wd:Q145 || ?country = wd:Q16)
} 
GROUP BY (?lgName)

"""

print("Results")
run_query(queryString)

Results
[('lgName', 'Washington corporation'), ('max_mcap', '2270000000000'), ('min_mcap', '407000000000'), ('avg_mcap', '1649250000000')]
[('lgName', 'U.S. corporation'), ('max_mcap', '2470000000000'), ('min_mcap', '41070000000'), ('avg_mcap', '985799166666.666666666666667')]
[('lgName', 'public limited company'), ('max_mcap', '185400000000'), ('min_mcap', '47000000000'), ('avg_mcap', '116200000000')]
[('lgName', 'joint-stock company'), ('max_mcap', '650340000000'), ('min_mcap', '3960000000'), ('avg_mcap', '171168050000')]
[('lgName', 'privately held company'), ('max_mcap', '62500000000'), ('min_mcap', '62500000000'), ('avg_mcap', '62500000000')]
[('lgName', 'Delaware corporation'), ('max_mcap', '1900000000000'), ('min_mcap', '1530000000'), ('avg_mcap', '345468080580.777777777777778')]
[('lgName', 'corporation'), ('max_mcap', '70600000000'), ('min_mcap', '6882000000'), ('avg_mcap', '41127333333.333333333333333')]
[('lgName', 'public company'), ('max_mcap', '2270000000000'), ('min_mcap

9

Trying to split the results by country, with one query for each value I am looking for: one for the max market cap, one for the min market cap, and one for the average.

In [None]:
Max Market cap

In [23]:
queryString = """
SELECT DISTINCT ?cnName ?cName ?lgName ?max_mcap
WHERE {

?company wdt:P17 ?country .
?company wdt:P1454 ?legalForm .

?company wdt:P2226 ?marketcap1 .

?company <http://schema.org/name> ?cName .
?country <http://schema.org/name> ?cnName .

?legalForm <http://schema.org/name> ?lgName .

FILTER (?marketcap1 = ?max_mcap)

{
SELECT DISTINCT ?country ?lgName (MAX(?marketcap) AS ?max_mcap)
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 
?company wdt:P1454 ?legalForm .

?company wdt:P2226 ?marketcap .


?legalForm <http://schema.org/name> ?lgName .


FILTER (?country = wd:Q30 || ?country = wd:Q145 || ?country = wd:Q16)


}
GROUP BY (?lgName) (?country)

} 

}
ORDER BY ASC (?cnName) ASC(?lgName)
"""

print("Results")
run_query(queryString)

Results
[('cnName', 'Canada'), ('cName', 'Thomson Reuters'), ('lgName', 'corporation'), ('max_mcap', '45900000000')]
[('cnName', 'Canada'), ('cName', 'Cineplex Entertainment'), ('lgName', 'public company'), ('max_mcap', '648532358')]
[('cnName', 'United Kingdom'), ('cName', 'HSBC'), ('lgName', 'public limited company'), ('max_mcap', '185400000000')]
[('cnName', 'United States of America'), ('cName', 'Alphabet Inc.'), ('lgName', 'Delaware corporation'), ('max_mcap', '1900000000000')]
[('cnName', 'United States of America'), ('cName', 'Apple Inc.'), ('lgName', 'U.S. corporation'), ('max_mcap', '2470000000000')]
[('cnName', 'United States of America'), ('cName', 'Microsoft'), ('lgName', 'Washington corporation'), ('max_mcap', '2270000000000')]
[('cnName', 'United States of America'), ('cName', 'eBay'), ('lgName', 'corporation'), ('max_mcap', '70600000000')]
[('cnName', 'United States of America'), ('cName', 'Amazon'), ('lgName', 'incorporation'), ('max_mcap', '1670000000000')]
[('cnName',

11

In [None]:
Min Market cap

In [21]:
queryString = """
SELECT DISTINCT ?cnName ?cName ?lgName ?min_mcap
WHERE {

?company wdt:P17 ?country .
?company wdt:P1454 ?legalForm .

?company wdt:P2226 ?marketcap1 .

?company <http://schema.org/name> ?cName .
?country <http://schema.org/name> ?cnName .

?legalForm <http://schema.org/name> ?lgName .

FILTER (?marketcap1 = ?max_mcap)

{
SELECT DISTINCT ?country ?lgName (MIN(?marketcap) AS ?min_mcap)
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 
?company wdt:P1454 ?legalForm .

?company wdt:P2226 ?marketcap .


?legalForm <http://schema.org/name> ?lgName .


FILTER (?country = wd:Q30 || ?country = wd:Q145 || ?country = wd:Q16)


}
GROUP BY (?lgName) (?country)
} 

}
ORDER BY ASC (?cnName) ASC(?lgName)
"""

print("Results")
run_query(queryString)

Results
[('cnName', 'Canada'), ('cName', 'BlackBerry'), ('lgName', 'corporation'), ('min_mcap', '6882000000')]
[('cnName', 'Canada'), ('cName', 'Thomson Reuters'), ('lgName', 'corporation'), ('min_mcap', '6882000000')]
[('cnName', 'Canada'), ('cName', 'Cineplex Entertainment'), ('lgName', 'public company'), ('min_mcap', '648532358')]
[('cnName', 'United Kingdom'), ('cName', 'HSBC'), ('lgName', 'public limited company'), ('min_mcap', '47000000000')]
[('cnName', 'United Kingdom'), ('cName', 'Lloyds Banking Group'), ('lgName', 'public limited company'), ('min_mcap', '47000000000')]
[('cnName', 'United States of America'), ('cName', 'Intel'), ('lgName', 'Delaware corporation'), ('min_mcap', '1530000000')]
[('cnName', 'United States of America'), ('cName', 'Alphabet Inc.'), ('lgName', 'Delaware corporation'), ('min_mcap', '1530000000')]
[('cnName', 'United States of America'), ('cName', 'Nvidia'), ('lgName', 'Delaware corporation'), ('min_mcap', '1530000000')]
[('cnName', 'United States of 

87

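As we can see, there are many companies with the same minimum market cap. This probably means that those are approximated values. Note, however, that the outer FILTER in this query still compares `?marketcap1` with `?max_mcap`, which the subquery does not bind (it binds `?min_mcap`), so the repeated values may simply be the group minimum paired with every company in the group.
I notice that Alphabet (i.e. Google) is present in both result sets, max and min market cap. Maybe this is due to different sources or to different moments in the company's history.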
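
The join between the outer pattern and the aggregating subquery can be mirrored on the client side, which makes the intended logic easier to check. A minimal sketch, assuming rows of (country, legal form, company, market cap); the sample rows are copied from query results in this section and `extremes_per_group` is a hypothetical helper, not part of the original notebook.

```python
from collections import defaultdict

def extremes_per_group(rows):
    """For each (country, legal form) group, return the (company, cap) pairs
    holding the maximum and the minimum market cap."""
    groups = defaultdict(list)
    for country, legal_form, company, cap in rows:
        groups[(country, legal_form)].append((company, int(cap)))
    result = {}
    for key, members in groups.items():
        result[key] = {
            "max": max(members, key=lambda m: m[1]),
            "min": min(members, key=lambda m: m[1]),
        }
    return result

# Sample rows copied from the market cap results in this section.
rows = [
    ("Canada", "corporation", "Thomson Reuters", "45900000000"),
    ("Canada", "corporation", "BlackBerry", "6882000000"),
    ("United Kingdom", "public limited company", "HSBC", "185400000000"),
    ("United Kingdom", "public limited company", "Lloyds Banking Group", "47000000000"),
    ("United States of America", "Delaware corporation", "Alphabet Inc.", "1900000000000"),
    ("United States of America", "Delaware corporation", "Alphabet Inc.", "1530000000"),
]
for key, ext in extremes_per_group(rows).items():
    print(key, ext)
```

A company with more than one market cap value, such as Alphabet Inc. in the sample rows, can hold both the maximum and the minimum of its group.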

Explore the property market cap

In [27]:
queryString = """
SELECT DISTINCT  ?p ?pName ?o
WHERE {

wd:P2226 ?p ?o .

OPTIONAL{?p <http://schema.org/name> ?pName .}

}

"""

print("Results")
run_query(queryString)

Results
[('p', 'http://www.wikidata.org/prop/direct/P31'), ('pName', 'instance of'), ('o', 'http://www.wikidata.org/entity/Q21451178')]
[('p', 'http://www.wikidata.org/prop/direct/P1896'), ('pName', 'source website for the property'), ('o', 'http://finance.yahoo.com')]
[('p', 'http://www.wikidata.org/prop/direct/P1629'), ('pName', 'Wikidata item of this property'), ('o', 'http://www.wikidata.org/entity/Q471719')]
[('p', 'http://www.wikidata.org/prop/direct/P1855'), ('pName', 'Wikidata property example'), ('o', 'http://www.wikidata.org/entity/Q312')]
[('p', 'http://www.wikidata.org/prop/direct/P2302'), ('pName', 'property constraint'), ('o', 'http://www.wikidata.org/entity/Q53869507')]
[('p', 'http://www.wikidata.org/prop/direct/P2302'), ('pName', 'property constraint'), ('o', 'http://www.wikidata.org/entity/Q21510851')]
[('p', 'http://www.wikidata.org/prop/direct/P2302'), ('pName', 'property constraint'), ('o', 'http://www.wikidata.org/entity/Q21514353')]
[('p', 'http://www.wikidata.or

28

In [29]:
queryString = """
SELECT DISTINCT ?cnName ?cName ?lgName ?cap
WHERE {

?company wdt:P17 ?country .
?company wdt:P1454 ?legalForm .

?company wdt:P2226 ?marketcap1 .

?company <http://schema.org/name> ?cName .
?country <http://schema.org/name> ?cnName .

?legalForm <http://schema.org/name> ?lgName .

FILTER (?marketcap1 = ?cap)

{
SELECT DISTINCT ?country ?lgName (MAX(?marketcap) AS ?cap)
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 
?company wdt:P1454 ?legalForm .

?company wdt:P2226 ?marketcap .


?legalForm <http://schema.org/name> ?lgName .


FILTER (?country = wd:Q30 || ?country = wd:Q145 || ?country = wd:Q16)


}
GROUP BY (?lgName) (?country)

} 
UNION
{
SELECT DISTINCT ?country ?lgName (MIN(?marketcap) AS ?cap)
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 
?company wdt:P1454 ?legalForm .

?company wdt:P2226 ?marketcap .


?legalForm <http://schema.org/name> ?lgName .


FILTER (?country = wd:Q30 || ?country = wd:Q145 || ?country = wd:Q16)


}
GROUP BY (?lgName) (?country)

} 


}
ORDER BY ASC (?cnName) ASC(?lgName)
"""

print("Results")
run_query(queryString)

Results
[('cnName', 'Canada'), ('cName', 'Thomson Reuters'), ('lgName', 'corporation'), ('cap', '45900000000')]
[('cnName', 'Canada'), ('cName', 'BlackBerry'), ('lgName', 'corporation'), ('cap', '6882000000')]
[('cnName', 'Canada'), ('cName', 'Cineplex Entertainment'), ('lgName', 'public company'), ('cap', '648532358')]
[('cnName', 'United Kingdom'), ('cName', 'HSBC'), ('lgName', 'public limited company'), ('cap', '185400000000')]
[('cnName', 'United Kingdom'), ('cName', 'Lloyds Banking Group'), ('lgName', 'public limited company'), ('cap', '47000000000')]
[('cnName', 'United States of America'), ('cName', 'Alphabet Inc.'), ('lgName', 'Delaware corporation'), ('cap', '1900000000000')]
[('cnName', 'United States of America'), ('cName', 'Alphabet Inc.'), ('lgName', 'Delaware corporation'), ('cap', '1530000000')]
[('cnName', 'United States of America'), ('cName', 'Apple Inc.'), ('lgName', 'U.S. corporation'), ('cap', '2470000000000')]
[('cnName', 'United States of America'), ('cName', 'No

19

In [57]:
queryString = """
SELECT DISTINCT ?cnName ?cmpn ?lgName ?capitalization WHERE {
{
SELECT DISTINCT ?cnName ?cmpn ?lgName ?max_mcap
WHERE {

?company wdt:P17 ?country .
?company wdt:P1454 ?legalForm .

?company wdt:P2226 ?marketcap1 .

?company <http://schema.org/name> ?cName1 .
?country <http://schema.org/name> ?cnName .

?legalForm <http://schema.org/name> ?lgName .

FILTER (?marketcap1 = ?max_mcap)

BIND (?cName1 AS ?cmpn)

{
SELECT DISTINCT ?country ?lgName (MAX(?marketcap) AS ?max_mcap)
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 
?company wdt:P1454 ?legalForm .

?company wdt:P2226 ?marketcap .


?legalForm <http://schema.org/name> ?lgName .


FILTER (?country = wd:Q30 || ?country = wd:Q145 || ?country = wd:Q16)


}
GROUP BY (?lgName) (?country)

} 

}
ORDER BY ASC (?cnName) ASC(?lgName)
}
UNION
{
SELECT DISTINCT ?cnName ?cmpn ?lgName ?min_mcap
WHERE {

?company wdt:P17 ?country .
?company wdt:P1454 ?legalForm .

?company wdt:P2226 ?marketcap2 .

?company <http://schema.org/name> ?cName2 .
?country <http://schema.org/name> ?cnName .

?legalForm <http://schema.org/name> ?lgName .

FILTER (?marketcap2 = ?min_mcap)

BIND (?cName2 AS ?cmpn)

{
SELECT DISTINCT ?country ?lgName (MIN(?marketcap) AS ?min_mcap)
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 
?company wdt:P1454 ?legalForm .

?company wdt:P2226 ?marketcap .


?legalForm <http://schema.org/name> ?lgName .


FILTER (?country = wd:Q30 || ?country = wd:Q145 || ?country = wd:Q16)


}
GROUP BY (?lgName) (?country)

} 

}
ORDER BY ASC (?cnName) ASC(?lgName)
}

}
ORDER BY ?cnName
"""

print("Results")
run_query(queryString)

Results
[('cnName', 'Canada'), ('cmpn', 'Thomson Reuters'), ('lgName', 'corporation'), ('capitalization', '1')]
[('cnName', 'Canada'), ('cmpn', 'Cineplex Entertainment'), ('lgName', 'public company'), ('capitalization', '1')]
[('cnName', 'Canada'), ('cmpn', 'BlackBerry'), ('lgName', 'corporation')]
[('cnName', 'Canada'), ('cmpn', 'Cineplex Entertainment'), ('lgName', 'public company')]
[('cnName', 'United Kingdom'), ('cmpn', 'HSBC'), ('lgName', 'public limited company'), ('capitalization', '1')]
[('cnName', 'United Kingdom'), ('cmpn', 'Lloyds Banking Group'), ('lgName', 'public limited company')]
[('cnName', 'United States of America'), ('cmpn', 'Alphabet Inc.'), ('lgName', 'Delaware corporation'), ('capitalization', '1')]
[('cnName', 'United States of America'), ('cmpn', 'Apple Inc.'), ('lgName', 'U.S. corporation'), ('capitalization', '1')]
[('cnName', 'United States of America'), ('cmpn', 'Microsoft'), ('lgName', 'Washington corporation'), ('capitalization', '1')]
[('cnName', 'Unite

22

***5.3 Which business in each country owns more businesses in other countries?***

It is a question similar to a previous one. I try to reuse that query, modifying it a little bit.

In [10]:
queryString = """

SELECT DISTINCT ?cnName ?cName COUNT(DISTINCT(?cName1)) AS ?numOfForeignComp
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 

?company wdt:P1830 ?obj5 . #owner of
?obj5 wdt:P17 ?country1.

?obj5 <http://schema.org/name> ?cName1 .

?company <http://schema.org/name> ?cName .
?country <http://schema.org/name> ?cnName .

?country1 <http://schema.org/name> ?cnName1 .

FILTER (?country1 != ?country)
FILTER (?country = wd:Q16)
} 
GROUP BY (?cnName) (?cName)
ORDER BY DESC(?numOfForeignComp)
LIMIT 1


"""

print("Results")
run_query(queryString)

Results
[('cnName', 'Canada'), ('cName', 'Barrick Gold'), ('numOfForeignComp', '5')]


1

In [11]:
queryString = """

SELECT DISTINCT  ?cnName ?cName ?numOfForeignComp WHERE {
    {
    
SELECT DISTINCT ?cnName ?cName COUNT(DISTINCT(?cName1)) AS ?numOfForeignComp
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 

?company wdt:P1830 ?obj5 . #owner of
?obj5 wdt:P17 ?country1.

?obj5 <http://schema.org/name> ?cName1 .

?company <http://schema.org/name> ?cName .
?country <http://schema.org/name> ?cnName .

?country1 <http://schema.org/name> ?cnName1 .

FILTER (?country1 != ?country)
FILTER (?country = wd:Q16)
} 
GROUP BY (?cnName) (?cName)
ORDER BY DESC(?numOfForeignComp)
LIMIT 1

    }
  UNION
   {
     SELECT DISTINCT ?cnName ?cName COUNT(DISTINCT(?cName1)) AS ?numOfForeignComp
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 

?company wdt:P1830 ?obj5 . #owner of
?obj5 wdt:P17 ?country1.

?obj5 <http://schema.org/name> ?cName1 .

?company <http://schema.org/name> ?cName .
?country <http://schema.org/name> ?cnName .

?country1 <http://schema.org/name> ?cnName1 .

FILTER (?country1 != ?country)
FILTER (?country = wd:Q30)
} 
GROUP BY (?cnName) (?cName)
ORDER BY DESC(?numOfForeignComp)
LIMIT 1
    }
UNION
{
     SELECT DISTINCT ?cnName ?cName COUNT(DISTINCT(?cName1)) AS ?numOfForeignComp
WHERE {

?company wdt:P17 ?country .
?company wdt:P31 wd:Q4830453  . 

?company wdt:P1830 ?obj5 . #owner of
?obj5 wdt:P17 ?country1.

?obj5 <http://schema.org/name> ?cName1 .

?company <http://schema.org/name> ?cName .
?country <http://schema.org/name> ?cnName .

?country1 <http://schema.org/name> ?cnName1 .

FILTER (?country1 != ?country)
FILTER (?country = wd:Q145)
} 
GROUP BY (?cnName) (?cName)
ORDER BY DESC(?numOfForeignComp)
LIMIT 1
    }
    }
ORDER BY ASC (?cnName)
    """

print("Results")
run_query(queryString)

Results
[('cnName', 'Canada'), ('cName', 'Barrick Gold'), ('numOfForeignComp', '5')]
[('cnName', 'United Kingdom'), ('cName', 'Liberty Global'), ('numOfForeignComp', '8')]
[('cnName', 'United States of America'), ('cName', 'The Walt Disney Company'), ('numOfForeignComp', '59')]


3

**Answer 5.3**

In Canada, Barrick Gold owns 5 different businesses outside its country.

In the UK, Liberty Global owns 8 different businesses outside its country.

In the USA, The Walt Disney Company owns 59 different businesses outside its country.
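
The three UNION branches used for this answer differ only in the country ID, so the query string could also be generated. A minimal sketch, assuming the same properties as above (P17, P31, P1830); `top_foreign_owner_query` is a hypothetical helper, and the aggregate is written in the standard `(COUNT(...) AS ?var)` form, which has not been verified against this endpoint.

```python
def top_foreign_owner_query(country_ids, per_country=1):
    """Build one UNION branch per country, each returning the company that
    owns the most businesses located in a different country."""
    branches = []
    for qid in country_ids:
        branches.append(f"""
  {{
    SELECT ?cnName ?cName (COUNT(DISTINCT ?cName1) AS ?numOfForeignComp)
    WHERE {{
      ?company wdt:P17 wd:{qid} ;
               wdt:P31 wd:Q4830453 ;
               wdt:P1830 ?owned ;
               <http://schema.org/name> ?cName .
      ?owned wdt:P17 ?country1 ;
             <http://schema.org/name> ?cName1 .
      wd:{qid} <http://schema.org/name> ?cnName .
      FILTER (?country1 != wd:{qid})
    }}
    GROUP BY ?cnName ?cName
    ORDER BY DESC(?numOfForeignComp)
    LIMIT {per_country}
  }}""")
    # Join the per-country subqueries with UNION inside one outer SELECT.
    body = "\n  UNION".join(branches)
    return f"SELECT ?cnName ?cName ?numOfForeignComp WHERE {{{body}\n}}\nORDER BY ASC(?cnName)\n"

# Canada, United States of America, United Kingdom
queryString = top_foreign_owner_query(["Q16", "Q30", "Q145"])
print(queryString)
```

Adding a fourth country then only requires one more ID in the list instead of another copied branch.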