# General instructions

The goal of the project is to materialize a set of **exploratory workloads** over a real-world, large-scale, open-domain KG: [WikiData](https://www.wikidata.org/wiki/Wikidata:Main_Page).

An exploratory workload is composed of a set of queries, where each query builds on the information obtained previously.

An exploratory workload starts with a usually vague, open-ended question, and does not assume that the person issuing the workload has a clear understanding of the data contained in the target database or of its structure.

Remember that:

1. All the queries must run in the python notebook
2. You can use classes and properties only if you find them via a SPARQL query that must be present in the notebook
3. Do not delete useless queries. Keep everything that is syntactically valid.

```
?p <http://schema.org/name> ?name .
```
    
is the BGP returning a human-readable name of a property or a class in Wikidata.
    
    

In [2]:
## SETUP used later

from SPARQLWrapper import SPARQLWrapper, JSON


prefixString = """
##-cdcf47393e-##
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/> 
PREFIX sc: <http://schema.org/>
"""

# select and construct queries
def run_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
       results = sparql.query()
       json_results = results.convert()
       if len(json_results['results']['bindings'])==0:
          print("Empty")
          return 0
    
       for bindings in json_results['results']['bindings']:
          print( [ (var, value['value'])  for var, value in bindings.items() ] )

       return len(json_results['results']['bindings'])

    except Exception as e :
        print("The operation failed", e)
    
# ASK queries
def run_ask_query(queryString):
    to_run = prefixString + "\n" + queryString

    sparql = SPARQLWrapper("http://a256-gc1-02.srv.aau.dk:5820/sparql")
    sparql.setTimeout(300)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(to_run)

    try :
        return sparql.query().convert()

    except Exception as e :
        print("The operation failed", e)
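
`run_query` prints each binding and returns only the row count. For later processing it can help to keep the rows themselves. The following is a minimal sketch, assuming the endpoint returns the standard SPARQL 1.1 JSON results layout; the helper name `bindings_to_rows` is hypothetical and not part of the notebook setup, and the sample row reuses one result printed later in this notebook.

```python
def bindings_to_rows(json_results):
    """Flatten a SPARQL JSON result into a list of {variable: value} dicts."""
    bindings = json_results.get("results", {}).get("bindings", [])
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in bindings
    ]


# Sample shaped like a SELECT response; no endpoint is contacted here.
sample = {
    "head": {"vars": ["pname", "p"]},
    "results": {
        "bindings": [
            {
                "pname": {"type": "literal", "value": "member of"},
                "p": {
                    "type": "uri",
                    "value": "http://www.wikidata.org/prop/direct/P463",
                },
            }
        ]
    },
}

rows = bindings_to_rows(sample)
print(rows)
```

In the notebook, the same helper could be applied to `results.convert()` inside `run_query` instead of printing each binding.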


# Companies Workflow Series ("Economy of EU States") 

Consider the following exploratory information need:

> Compare businesses across different sectors and types in E.U. countries

## Useful URIs for the current workflow


The following are given:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1647`   | subproperty   | predicate |
| `wdt:P31`     | instance of   | predicate |
| `wdt:P279`    | subclass      | predicate |
| `wdt:P17`      | country       | predicate |
| `wd:Q458`      | E.U.         | node |
| `wd:Q142`      | France       | node      |
| `wd:Q4830453`  | Business     | node      |
| `wd:Q6881511`  | Enterprise   | node      |
| `wd:Q29110228` | AXA          | node |
| `wd:Q43183`    | insurance    | node |




Also consider

```
{ 
?p wdt:P17 wd:Q142  . 
?p wdt:P31 wd:Q6881511  . 
} UNION {
?p wdt:P17 wd:Q142  . 
?p wdt:P31 wd:Q4830453  . 
}



```

is the BGP to retrieve all **French enterprises and businesses**
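
The same pattern can be generated for any country instead of being copied by hand. The sketch below rewrites the UNION with a `VALUES` clause, which should bind `?p` to the same entities; the function name `companies_in_country_bgp` and the constant `COMPANY_TYPES` are hypothetical helpers introduced only for illustration.

```python
# Classes given in the table above: enterprise and business.
COMPANY_TYPES = ("wd:Q6881511", "wd:Q4830453")


def companies_in_country_bgp(country, var="?p"):
    """Return a BGP matching enterprises and businesses located in `country`."""
    types = " ".join(COMPANY_TYPES)
    return (
        f"VALUES ?type {{ {types} }}\n"
        f"{var} wdt:P17 {country} .\n"
        f"{var} wdt:P31 ?type ."
    )


french = companies_in_country_bgp("wd:Q142")
print(french)
```

The returned string can be placed inside the `WHERE { ... }` block of a query passed to `run_query`.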

## Workload Goals

1. Identify the BGP for obtaining the type, legal form, and industry of a company

2. Identify the BGP to retrieve all companies owned by a company located in an EU country

3. Which company has the largest presence in E.U.?

4. Companies have different 'legal forms', compare the number of companies divided in different legal forms

5. Analyze the number of companies per type, legal form, and industry in each state
 
   5.1 What are the top-3 legal forms in the E.U.? 
   
   5.2 For which companies is defined some form of income or market capitalization or total assets? What is the min, max, and average in each country for a given legal form?
   
   5.3 Which business in each country owns more businesses in other E.U. countries?
   
   5.4 What can we say about industry sectors in various countries?


In [1]:
# start your workflow here

In [3]:
queryString = """
SELECT COUNT(*)
WHERE { 

{ 
?p wdt:P17 wd:Q142  . 
?p wdt:P31 wd:Q6881511  . 
} UNION {
?p wdt:P17 wd:Q142  . 
?p wdt:P31 wd:Q4830453  . 
}
} 
"""

print("Results")
run_query(queryString)

Results
[('callret-0', '4577')]


1

#### Comment 
Not every task specifies that only E.U. countries should be taken into account, so when it is not specified I also consider countries outside the E.U.

## Task 1

1. Identify the BGP for obtaining the type, legal form, and industry of a company

In [14]:
queryString = """
SELECT ?pname 
WHERE { 

wd:Q29110228 <http://schema.org/name> ?pname .
} 
"""

print("Predicates")
run_query(queryString)

Predicates
[('pname', 'Panasonic Europe Ltd')]


1

#### Comment 

I have discovered an error in the URIs given for the notebook: the company identified by wd:Q29110228 is not "AXA" but "Panasonic Europe Ltd". This is not a big issue, so I work with it.  
Let's start from "Panasonic Europe Ltd" to find the information about the type, legal form, and industry of a company.

In [4]:
#Let's look for the properties of "Panasonic Europe Ltd" (wd:Q29110228) and the nodes they connect to
queryString = """
SELECT ?pname ?oname ?p ?obj 
WHERE { 

wd:Q29110228 ?p ?obj .

?p <http://schema.org/name> ?pname .
OPTIONAL {?obj <http://schema.org/name> ?oname }.
} 
"""

print("Predicates")
run_query(queryString)

Predicates
[('pname', 'legal form'), ('oname', 'private company limited by shares'), ('p', 'http://www.wikidata.org/prop/direct/P1454'), ('obj', 'http://www.wikidata.org/entity/Q6832945')]
[('pname', 'parent organization'), ('oname', 'Panasonic Corporation'), ('p', 'http://www.wikidata.org/prop/direct/P749'), ('obj', 'http://www.wikidata.org/entity/Q53247')]
[('pname', 'country'), ('oname', 'United Kingdom'), ('p', 'http://www.wikidata.org/prop/direct/P17'), ('obj', 'http://www.wikidata.org/entity/Q145')]
[('pname', 'instance of'), ('oname', 'business'), ('p', 'http://www.wikidata.org/prop/direct/P31'), ('obj', 'http://www.wikidata.org/entity/Q4830453')]
[('pname', 'instance of'), ('oname', 'enterprise'), ('p', 'http://www.wikidata.org/prop/direct/P31'), ('obj', 'http://www.wikidata.org/entity/Q6881511')]
[('pname', 'owner of'), ('oname', 'Ficosa'), ('p', 'http://www.wikidata.org/prop/direct/P1830'), ('obj', 'http://www.wikidata.org/entity/Q5446533')]
[('pname', 'subsidiary'), ('oname'

20

In [6]:
# And now the incoming ones 
queryString = """
SELECT ?sname ?pname ?sub ?p  
WHERE { 

?sub ?p wd:Q29110228 .

?p <http://schema.org/name> ?pname .
?sub <http://schema.org/name> ?sname .
} 
"""

print("Predicates")
run_query(queryString)

Predicates
[('sname', 'Panasonic Corporation'), ('pname', 'subsidiary'), ('sub', 'http://www.wikidata.org/entity/Q53247'), ('p', 'http://www.wikidata.org/prop/direct/P355')]
[('sname', 'Panasonic AVC Networks Czech'), ('pname', 'owned by'), ('sub', 'http://www.wikidata.org/entity/Q41567544'), ('p', 'http://www.wikidata.org/prop/direct/P127')]
[('sname', 'Panasonic Automotive Systems Czech'), ('pname', 'owned by'), ('sub', 'http://www.wikidata.org/entity/Q52852506'), ('p', 'http://www.wikidata.org/prop/direct/P127')]


3

#### Comment 

This search gives us the legal form of a company, but not its type or its industry. It also reveals additional information, such as instance of (business/enterprise) and parent organization, that can help us find the missing pieces.

Brief recap:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1454`   | legal form   | predicate |
| `wdt:P749`     | parent organization   | predicate |

In [8]:
queryString = """
SELECT ?oname ?pname ?nname ?obj ?p ?node 
WHERE { 

wd:Q29110228 wdt:P1454|wdt:P749 ?obj .
?obj ?p ?node. 

?p <http://schema.org/name> ?pname .
?obj <http://schema.org/name> ?oname .
OPTIONAL {?node <http://schema.org/name> ?nname }.
}
ORDER BY ASC (?oname)
"""

print("Predicates")
run_query(queryString)

Predicates
[('oname', 'Panasonic Corporation'), ('pname', 'product or material produced'), ('nname', 'television'), ('obj', 'http://www.wikidata.org/entity/Q53247'), ('p', 'http://www.wikidata.org/prop/direct/P1056'), ('node', 'http://www.wikidata.org/entity/Q289')]
[('oname', 'Panasonic Corporation'), ('pname', 'product or material produced'), ('nname', 'laptop'), ('obj', 'http://www.wikidata.org/entity/Q53247'), ('p', 'http://www.wikidata.org/prop/direct/P1056'), ('node', 'http://www.wikidata.org/entity/Q3962')]
[('oname', 'Panasonic Corporation'), ('pname', 'product or material produced'), ('nname', 'desktop computer'), ('obj', 'http://www.wikidata.org/entity/Q53247'), ('p', 'http://www.wikidata.org/prop/direct/P1056'), ('node', 'http://www.wikidata.org/entity/Q56155')]
[('oname', 'Panasonic Corporation'), ('pname', 'product or material produced'), ('nname', 'computer hardware'), ('obj', 'http://www.wikidata.org/entity/Q53247'), ('p', 'http://www.wikidata.org/prop/direct/P1056'), ('

169

#### Comment

Thanks to this query we have discovered that a property called "industry" exists, which specifies the industry of a company, and that the "instance of" property probably refers to the type of the company. Let's now check whether the types tell us that we are dealing with a company.

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P452`   | industry   | predicate |

In [10]:
#Let's check whether the types have superclasses that let us understand if we are actually looking at a type of company
queryString = """
SELECT ?oname ?pname ?ctname ?obj ?p ?ct  
WHERE { 

wd:Q29110228 wdt:P31 ?obj .
?obj ?p ?ct .

?p <http://schema.org/name> ?pname .
?obj <http://schema.org/name> ?oname .
OPTIONAL {?ct <http://schema.org/name> ?ctname }.
}
ORDER BY ASC (?oname)
"""

print("Predicates")
run_query(queryString)

Predicates
[('oname', 'business'), ('pname', 'objective of project or action'), ('ctname', 'profit'), ('obj', 'http://www.wikidata.org/entity/Q4830453'), ('p', 'http://www.wikidata.org/prop/direct/P3712'), ('ct', 'http://www.wikidata.org/entity/Q2112073')]
[('oname', 'business'), ('pname', "topic's main Wikimedia portal"), ('ctname', 'Portal:Companies'), ('obj', 'http://www.wikidata.org/entity/Q4830453'), ('p', 'http://www.wikidata.org/prop/direct/P1151'), ('ct', 'http://www.wikidata.org/entity/Q11171715')]
[('oname', 'business'), ('pname', 'different from'), ('ctname', 'business'), ('obj', 'http://www.wikidata.org/entity/Q4830453'), ('p', 'http://www.wikidata.org/prop/direct/P1889'), ('ct', 'http://www.wikidata.org/entity/Q19862406')]
[('oname', 'business'), ('pname', 'different from'), ('ctname', 'Business'), ('obj', 'http://www.wikidata.org/entity/Q4830453'), ('p', 'http://www.wikidata.org/prop/direct/P1889'), ('ct', 'http://www.wikidata.org/entity/Q11418769')]
[('oname', 'business'

134

#### Comment
Comparing the results for "enterprise" and "business", we see that both are connected through the property "said to be the same as" to the same node, "company"; maybe this node is the one that identifies the concept of company. 
Three other triples catch my attention:
   - 'enterprise' - 'subclass of' - 'juridical person', 
   - 'enterprise' - 'instance of' - 'type of organisation', 
   - 'business' - 'subclass of' - 'organization'  

Let's not consider them yet; instead, let's see which other nodes are linked to "company" (wd:Q783794) through the property "said to be the same as" (wdt:P460).

In [12]:
# let's print the nodes connected through "said to be same as" with "company"
queryString = """
SELECT ?sname ?sub
WHERE { 

?sub wdt:P460 wd:Q783794 .

?sub <http://schema.org/name> ?sname .
}
ORDER BY ASC (?sname)
"""

print("Predicates")
run_query(queryString)

Predicates
[('sname', 'business'), ('sub', 'http://www.wikidata.org/entity/Q4830453')]
[('sname', 'commercial organization'), ('sub', 'http://www.wikidata.org/entity/Q21980538')]
[('sname', 'enterprise'), ('sub', 'http://www.wikidata.org/entity/Q6881511')]


3

In [13]:
# let's see the relations of "commercial organization" (wd:Q21980538)
queryString = """
SELECT ?pname ?oname ?p ?obj
WHERE { 

wd:Q21980538 ?p ?obj .

?p <http://schema.org/name> ?pname .
OPTIONAL { ?obj <http://schema.org/name> ?oname } .
}
ORDER BY ASC (?pname)
"""

print("Predicates")
run_query(queryString)

Predicates
[('pname', 'Freebase ID'), ('p', 'http://www.wikidata.org/prop/direct/P646'), ('obj', '/m/027230n')]
[('pname', 'Great Russian Encyclopedia Online ID'), ('p', 'http://www.wikidata.org/prop/direct/P2924'), ('obj', '2085412')]
[('pname', 'KBpedia ID'), ('p', 'http://www.wikidata.org/prop/direct/P8408'), ('obj', 'CommercialOrganization')]
[('pname', 'Quora topic ID'), ('p', 'http://www.wikidata.org/prop/direct/P3417'), ('obj', 'Commercial-Business')]
[('pname', 'Thesaurus for Graphic Materials ID'), ('p', 'http://www.wikidata.org/prop/direct/P5160'), ('obj', 'tgm002348')]
[('pname', 'opposite of'), ('oname', 'nonprofit organization'), ('p', 'http://www.wikidata.org/prop/direct/P461'), ('obj', 'http://www.wikidata.org/entity/Q163740')]
[('pname', 'said to be the same as'), ('oname', 'business'), ('p', 'http://www.wikidata.org/prop/direct/P460'), ('obj', 'http://www.wikidata.org/entity/Q4830453')]
[('pname', 'said to be the same as'), ('oname', 'company'), ('p', 'http://www.wikid

12

#### Comment
Looking at the results, we can consider only business and enterprise as types of company, so as not to broaden the search too much. The BGP to retrieve the type, legal form, and industry of a company is then:  

```
{  
?p wdt:P31 wd:Q6881511 ;
    wdt:P1454 ?legal ;
    wdt:P452 ?industry .
} UNION { 
?p wdt:P31 wd:Q4830453 ;
    wdt:P1454 ?legal ;
    wdt:P452 ?industry .
}
``` 

or

```
{  
?c wdt:P31 ?type ;
    wdt:P1454 ?legal ;
    wdt:P452 ?industry .

FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453).
}
```

To be more precise, we could make industry and legal form OPTIONAL patterns, in order to find all the companies and, only if present, their industry and legal form.
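
The OPTIONAL variant mentioned above could look like the following sketch. It keeps the FILTER form of the second BGP and wraps the legal form and industry triples in separate OPTIONAL blocks, so a company missing one of them is still returned. The function name `company_profile_query` and the `limit` parameter are hypothetical, and the query has not been run against the endpoint.

```python
def company_profile_query(limit=10):
    """Build a SELECT listing companies with optional legal form and industry."""
    return f"""
SELECT ?c ?type ?legal ?industry
WHERE {{
    ?c wdt:P31 ?type .
    FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453)
    OPTIONAL {{ ?c wdt:P1454 ?legal }}
    OPTIONAL {{ ?c wdt:P452 ?industry }}
}}
LIMIT {int(limit)}
"""


query = company_profile_query(5)
print(query)
```

Using two separate OPTIONAL blocks, rather than one block containing both triples, lets a company with a legal form but no industry still report its legal form.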

#### Comment
So from now on I consider a company to be a node connected to the node "business" or "enterprise" through the relation "instance of", to stay close to the main objective of the exploratory search (the part under the title at the top of the notebook).

## Task 2

2. Identify the BGP to retrieve all companies owned by a company located in an EU country

#### Comment
Maybe these relations can be useful:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1830`   | owner of   | predicate |
| `wdt:P749`     | parent organization   | predicate |

In [4]:
# let's see if the country France (wd:Q142) is connected to the E.U. node with some property
queryString = """
SELECT ?pname ?p
WHERE { 
 
wd:Q142 ?p wd:Q458 .

?p <http://schema.org/name> ?pname .
#OPTIONAL { ?obj <http://schema.org/name> ?oname } .
}
ORDER BY ASC (?pname)
"""

print("Predicates")
run_query(queryString)

Predicates
[('pname', 'member of'), ('p', 'http://www.wikidata.org/prop/direct/P463')]


1

#### Comment
We have extracted:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P463`   | member of   | predicate |
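
The setup cell defines `run_ask_query`, which fits a yes/no check such as "is this country a member of the E.U.?". The sketch below builds such an ASK query and reads the answer from the JSON response, assuming the standard SPARQL 1.1 JSON layout where an ASK result carries a top-level `boolean` field. The helper names `ask_membership_query` and `ask_answer` are hypothetical.

```python
def ask_membership_query(country, organization="wd:Q458"):
    """Build an ASK query testing whether `country` is a member of `organization`."""
    return f"ASK {{ {country} wdt:P463 {organization} . }}"


def ask_answer(json_result):
    """Extract the boolean from a SPARQL JSON ASK response, or None if absent."""
    if not isinstance(json_result, dict):
        return None  # run_ask_query returns None when the request fails
    return json_result.get("boolean")


query = ask_membership_query("wd:Q142")
print(query)

# In the notebook this would be: ask_answer(run_ask_query(query))
# Here a hand-written response stands in for the endpoint.
print(ask_answer({"head": {}, "boolean": True}))
```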

In [13]:
# Let's find all the E.U. companies that own other companies, which gives us all the companies owned by a company located in the E.U.
queryString = """
SELECT  DISTINCT ?company ?sname ?cname
WHERE { 

?c wdt:P31 ?type ;
    wdt:P17 ?country ;
    wdt:P1830 ?sub .
?sub wdt:P31 ?type2 .
?country wdt:P463 wd:Q458 .

FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453).
FILTER(?type2 = wd:Q6881511 || ?type2 = wd:Q4830453).

?c <http://schema.org/name> ?company .
?country <http://schema.org/name> ?cname .
?sub <http://schema.org/name> ?sname .
}
ORDER BY ASC (?company)
"""

print("Results")
run_query(queryString)

Results
[('company', '1.Tatranská, akciová spoločnosť'), ('sname', 'Carepar'), ('cname', 'Slovakia')]
[('company', '3TS Capital Partners'), ('sname', 'Zoot'), ('cname', 'Finland')]
[('company', '4check'), ('sname', 'About Me'), ('cname', 'Czech Republic')]
[('company', 'A.B.R. Holding'), ('sname', 'Automotodrom Brno'), ('cname', 'Czech Republic')]
[('company', 'A.P. Møller Holding'), ('sname', 'APMH Invest'), ('cname', 'Denmark')]
[('company', 'A.P. Møller Holding'), ('sname', 'Maersk'), ('cname', 'Denmark')]
[('company', 'A1 Telekom Austria Group'), ('sname', 'A1 Srbija'), ('cname', 'Austria')]
[('company', 'A1 Telekom Austria Group'), ('sname', 'A1 Hrvatska'), ('cname', 'Austria')]
[('company', 'AB - Credit'), ('sname', 'HC Sparta Praha'), ('cname', 'Czech Republic')]
[('company', 'AB - Credit'), ('sname', 'Spolchemie'), ('cname', 'Czech Republic')]
[('company', 'AB Wilh. Becker'), ('sname', 'Beckers Group'), ('cname', 'Sweden')]
[('company', 'AC Sparta Prague'), ('sname', 'ACS Prope

1969

#### Comment 
So this is the graph pattern to retrieve all the companies owned by an E.U. company:

```
{  
?c wdt:P31 ?type ;
    wdt:P17 ?country ;
    wdt:P1830 ?sub .
?sub wdt:P31 ?type2 .
?country wdt:P463 wd:Q458 .

FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453).
FILTER(?type2 = wd:Q6881511 || ?type2 = wd:Q4830453).
}
```

## Task 3

3. Which company has the largest presence in E.U.?

#### Comment 
I take the company with the largest presence in the E.U. to be the company, from anywhere in the world, that owns the largest number of other companies whose country is in the E.U.

In [21]:
# Let's find the company with the highest number of E.U. "sub-companies" (let's print only the first 10 companies to compare the results)
queryString = """
SELECT DISTINCT ?company (COUNT(DISTINCT ?sname) AS ?num) ?c
WHERE { 

?c wdt:P31 ?type ;
    (wdt:P1830)* ?sub .
?sub wdt:P31 ?type2 ;
    wdt:P17 ?country .
?country wdt:P463 wd:Q458 .

FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453).
FILTER(?type2 = wd:Q6881511 || ?type2 = wd:Q4830453).

?c <http://schema.org/name> ?company .
#?country <http://schema.org/name> ?cname .
?sub <http://schema.org/name> ?sname .
}
GROUP BY ?company ?c
ORDER BY DESC (?num)
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('company', 'BlackRock'), ('num', '37'), ('c', 'http://www.wikidata.org/entity/Q219635')]
[('company', 'PNC Financial Services'), ('num', '37'), ('c', 'http://www.wikidata.org/entity/Q38928')]
[('company', 'Caisse des dépôts et consignations'), ('num', '33'), ('c', 'http://www.wikidata.org/entity/Q846484')]
[('company', 'Societé Génerale'), ('num', '32'), ('c', 'http://www.wikidata.org/entity/Q270363')]
[('company', 'Adiele Holdings Limited'), ('num', '29'), ('c', 'http://www.wikidata.org/entity/Q58426620')]
[('company', 'Bonefeld Holdings Limited'), ('num', '27'), ('c', 'http://www.wikidata.org/entity/Q58425985')]
[('company', 'Burao Holdings Limited'), ('num', '27'), ('c', 'http://www.wikidata.org/entity/Q58426656')]
[('company', 'Orcamat Limited'), ('num', '27'), ('c', 'http://www.wikidata.org/entity/Q58426751')]
[('company', 'Bavariso Holdings Limited'), ('num', '27'), ('c', 'http://www.wikidata.org/entity/Q58426821')]
[('company', 'Penta Investments Limited'), ('num', '26

10

In [26]:
#Let's check the correctness of the results printing the names of the companies owned by "BlackRock" (wd:Q219635)
queryString = """
SELECT DISTINCT ?sname ?cname ?sub
WHERE { 

wd:Q219635 wdt:P31 ?type ;
    (wdt:P1830)* ?sub .
?sub wdt:P31 ?type2 ;
    wdt:P17 ?country .
?country wdt:P463 wd:Q458 .

FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453).
FILTER(?type2 = wd:Q6881511 || ?type2 = wd:Q4830453).

#?c <http://schema.org/name> ?company .
?country <http://schema.org/name> ?cname .
?sub <http://schema.org/name> ?sname .
}
ORDER BY ASC (?sname)
"""

print("Results")
run_query(queryString)

Results
[('sname', 'Air Productions'), ('cname', 'France'), ('sub', 'http://www.wikidata.org/entity/Q2828222')]
[('sname', 'Albani Brewery'), ('cname', 'Denmark'), ('sub', 'http://www.wikidata.org/entity/Q3246988')]
[('sname', 'Andrew Barclay Sons & Co.'), ('cname', 'United Kingdom'), ('sub', 'http://www.wikidata.org/entity/Q4756279')]
[('sname', 'Benskins Brewery'), ('cname', 'United Kingdom'), ('sub', 'http://www.wikidata.org/entity/Q4890217')]
[('sname', 'Birrificio Angelo Poretti'), ('cname', 'Italy'), ('sub', 'http://www.wikidata.org/entity/Q3640382')]
[('sname', 'Canal+ Spain'), ('cname', 'Spain'), ('sub', 'http://www.wikidata.org/entity/Q5823281')]
[('sname', 'Carlsberg Breweries'), ('cname', 'Denmark'), ('sub', 'http://www.wikidata.org/entity/Q27898237')]
[('sname', 'Carlsberg Group'), ('cname', 'Denmark'), ('sub', 'http://www.wikidata.org/entity/Q113215')]
[('sname', 'Carlsberg Polska'), ('cname', 'Poland'), ('sub', 'http://www.wikidata.org/entity/Q5042893')]
[('sname', 'Ceres

37

#### Comment
So the companies with the largest presence in the E.U. are "BlackRock" and "PNC Financial Services", tied at 37 owned companies each.
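
One caveat on the path used in the two queries above: in SPARQL 1.1, `(wdt:P1830)*` means "zero or more" steps, so it also matches the zero-length path from a company to itself. A company that is itself located in an E.U. member state would therefore be counted among its own sub-companies, whereas `(wdt:P1830)+` requires at least one ownership step. Whether this changes the ranking here has not been checked. The sketch below mimics the two operators on a small, hypothetical in-memory ownership graph; the function `reachable` and the node names are made up for illustration.

```python
def reachable(edges, start, include_self):
    """Nodes reachable from `start` by following `edges`.

    include_self=True mimics the SPARQL path operator `*` (zero or more steps);
    include_self=False mimics `+` (one or more steps).
    """
    seen = set()
    stack = list(edges.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(edges.get(node, ()))
    if include_self:
        seen.add(start)
    return seen


# Hypothetical "owner of" graph: A owns B, B owns C.
owner_of = {"A": ["B"], "B": ["C"]}

star = reachable(owner_of, "A", include_self=True)
plus = reachable(owner_of, "A", include_self=False)
print(sorted(star))
print(sorted(plus))
```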

## Task 4

4. Companies have different 'legal forms', compare the number of companies divided in different legal forms

In [29]:
#Let's try to count the different legal forms for the companies from all over the world
queryString = """
SELECT ?legal (COUNT(DISTINCT ?c) AS ?num)
WHERE { 

?c wdt:P31 ?type ;
    wdt:P1454 ?l .

FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453).

?l <http://schema.org/name> ?legal .
}
GROUP BY ?legal
ORDER BY DESC (?num)
"""

print("Results")
run_query(queryString)

Results
[('legal', 'společnost s ručením omezeným'), ('num', '13075')]
[('legal', 'akciová společnost'), ('num', '5557')]
[('legal', 'joint-stock company'), ('num', '5412')]
[('legal', 'GmbH'), ('num', '3287')]
[('legal', 'Spoločnosť s ručením obmedzeným'), ('num', '2307')]
[('legal', 'Aktiengesellschaft'), ('num', '2012')]
[('legal', 'public company'), ('num', '1443')]
[('legal', 'aksjeselskap'), ('num', '1318')]
[('legal', 'GmbH & Co. KG'), ('num', '1289')]
[('legal', 'S.A.'), ('num', '1248')]
[('legal', 'kabushiki gaisha'), ('num', '1246')]
[('legal', 'private limited liability company'), ('num', '1231')]
[('legal', 'privately held company'), ('num', '1002')]
[('legal', 'Gesellschaft mit beschränkter Haftung'), ('num', '974')]
[('legal', 'open joint-stock company'), ('num', '896')]
[('legal', 'akciová spoločnosť'), ('num', '838')]
[('legal', 'public limited company'), ('num', '726')]
[('legal', 'Oy'), ('num', '638')]
[('legal', 'Société par actions simplifiée'), ('num', '610')]
[('l

409

#### Comment 
Here we have a list of legal forms, each with the number of companies that have that legal form, but they are expressed in different languages; maybe there is a way to group the ones with the same meaning...

From the first task we have

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wd:Q6832945`   | private company limited by shares   | node |

Let's try to see its outgoing properties

In [41]:
# Let's see the properties of a node that is a "legal form" (in this case "private company limited by shares" (wd:Q6832945))
queryString = """
SELECT ?pname ?oname ?p ?obj
WHERE { 

wd:Q6832945 ?p ?obj.

?p <http://schema.org/name> ?pname .
#?l <http://schema.org/name> ?legal .
OPTIONAL { ?obj <http://schema.org/name> ?oname } .
}

"""

print("Results")
run_query(queryString)

Results
[('pname', 'applies to jurisdiction'), ('oname', 'United Kingdom'), ('p', 'http://www.wikidata.org/prop/direct/P1001'), ('obj', 'http://www.wikidata.org/entity/Q145')]
[('pname', 'country'), ('oname', 'United Kingdom'), ('p', 'http://www.wikidata.org/prop/direct/P17'), ('obj', 'http://www.wikidata.org/entity/Q145')]
[('pname', 'subclass of'), ('oname', 'joint-stock company'), ('p', 'http://www.wikidata.org/prop/direct/P279'), ('obj', 'http://www.wikidata.org/entity/Q134161')]
[('pname', 'subclass of'), ('oname', 'juridical person'), ('p', 'http://www.wikidata.org/prop/direct/P279'), ('obj', 'http://www.wikidata.org/entity/Q155076')]
[('pname', 'subclass of'), ('oname', 'private limited liability company'), ('p', 'http://www.wikidata.org/prop/direct/P279'), ('obj', 'http://www.wikidata.org/entity/Q18624259')]
[('pname', 'instance of'), ('oname', 'type of business entity in the UK'), ('p', 'http://www.wikidata.org/prop/direct/P31'), ('obj', 'http://www.wikidata.org/entity/Q576554

8

#### Comment 
The "official name" property could have done what we described in the previous comment, but its value is not a node, so it is not suitable for our purpose. We therefore use the output of the first query as the result for Task 4.

## Task 5

5. Analyze the number of companies per type, legal form, and industry in each state

#### Comment
In this case too, E.U. companies are not specified, so I consider only E.U. companies when the task explicitly asks for them.

### Task 5.1

    5.1 What are the top-3 legal forms in the E.U.? 

In [43]:
# Let's count the E.U. companies for each legal form and return the top 3 most used legal forms
queryString = """
SELECT ?legal (COUNT(DISTINCT ?c) AS ?num)
WHERE { 

?c wdt:P31 ?type ;
    wdt:P1454 ?l ;
    wdt:P17 ?country .
?country wdt:P463 wd:Q458 .

FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453).

#?c <http://schema.org/name> ?company .
#?country <http://schema.org/name> ?cname .
#?sub <http://schema.org/name> ?sname .
?l <http://schema.org/name> ?legal .
}
GROUP BY ?legal
ORDER BY DESC (?num)
LIMIT 3
"""

print("Results")
run_query(queryString)

Results
[('legal', 'společnost s ručením omezeným'), ('num', '13073')]
[('legal', 'akciová společnost'), ('num', '5546')]
[('legal', 'GmbH'), ('num', '2898')]


3

#### Comment 
These are the top 3 legal forms in the E.U.

### Task 5.2

    5.2 For which companies is defined some form of income or market capitalization or total assets? What is the min, max, and average in each country for a given legal form?

In [51]:
#Let's print the distinct properties of all the companies to see if there is information about income, market capitalization, or total assets
queryString = """
SELECT DISTINCT ?pname ?p
WHERE { 

?c wdt:P31 ?type ;
    ?p ?o .

FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453).

#?c <http://schema.org/name> ?company .
?p <http://schema.org/name> ?pname .
OPTIONAL { ?o <http://schema.org/name> ?oname } .
}
GROUP BY ?pname ?p
ORDER BY ASC (?pname)
"""

print("Results")
run_query(queryString)

Results
[('pname', 'AAA hotel ID'), ('p', 'http://www.wikidata.org/prop/direct/P8868')]
[('pname', 'AAGM person or institution ID'), ('p', 'http://www.wikidata.org/prop/direct/P8783')]
[('pname', 'ACM Classification Code (2012)'), ('p', 'http://www.wikidata.org/prop/direct/P2179')]
[('pname', 'ACMA Radiocommunications Client Number'), ('p', 'http://www.wikidata.org/prop/direct/P2472')]
[('pname', 'ACNP journal ID'), ('p', 'http://www.wikidata.org/prop/direct/P6981')]
[('pname', 'ADAGP artist ID'), ('p', 'http://www.wikidata.org/prop/direct/P3901')]
[('pname', 'AGORHA person/institution ID'), ('p', 'http://www.wikidata.org/prop/direct/P2342')]
[('pname', 'API endpoint'), ('p', 'http://www.wikidata.org/prop/direct/P6269')]
[('pname', 'ARAE ID'), ('p', 'http://www.wikidata.org/prop/direct/P9226')]
[('pname', 'ARCHON code'), ('p', 'http://www.wikidata.org/prop/direct/P3642')]
[('pname', 'ARTIC artist ID'), ('p', 'http://www.wikidata.org/prop/direct/P6295')]
[('pname', 'ARTIC artwork ID'), 

1702

#### Comment

We have discovered these properties:
    
| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P8340`     | estimated value   | predicate |
| `wdt:P2226`     | market capitalization   | predicate |
| `wdt:P2295`     | net profit   | predicate |
| `wdt:P3362`     | operating income   | predicate |
| `wdt:P2403`     | total assets   | predicate |
| `wdt:P2133`     | total debt   | predicate |
| `wdt:P2137`     | total equity   | predicate |
| `wdt:P2402`     | total expenditure   | predicate |
| `wdt:P2138`     | total liabilities   | predicate |
| `wdt:P1092`     | total produced   | predicate |
| `wdt:P2139`   | total revenue   | predicate |

Given the text of the task, I limit my analysis to "market capitalization" (wdt:P2226), "operating income" (wdt:P3362) and "total assets" (wdt:P2403).  
So let's print the legal forms of the companies that have at least one of these properties.

In [57]:
# let's try to print the legal form of the companies that have at least one of the previous properties
queryString = """
SELECT ?legal (GROUP_CONCAT(DISTINCT ?money ; separator = ", ") AS ?cash)
WHERE { 

?c wdt:P31 ?type ;
    wdt:P1454 ?l ;
    ?m ?o.

FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453).
FILTER(?m = wdt:P2226 || ?m = wdt:P3362 || ?m = wdt:P2403) .

#?c <http://schema.org/name> ?company .
?m <http://schema.org/name> ?money .
?l <http://schema.org/name> ?legal .
}
GROUP BY ?legal
ORDER BY ASC (?legal)
"""

print("Results")
run_query(queryString)

Results
[('legal', 'Aktiengesellschaft'), ('cash', 'market capitalization, operating income, total assets')]
[('legal', 'Altrechtlicher Verein'), ('cash', 'total assets')]
[('legal', 'Anonim Şirket'), ('cash', 'operating income')]
[('legal', 'Anpartsselskab'), ('cash', 'total assets')]
[('legal', 'Anstalt des öffentlichen Rechts'), ('cash', 'total assets')]
[('legal', 'Bankaktiebolag'), ('cash', 'market capitalization, operating income, total assets')]
[('legal', 'Besloten vennootschap'), ('cash', 'operating income, total assets')]
[('legal', 'Delaware corporation'), ('cash', 'market capitalization, operating income, total assets')]
[('legal', 'Eingetragene Genossenschaft'), ('cash', 'total assets')]
[('legal', 'Gemeinnützige GmbH'), ('cash', 'total assets')]
[('legal', 'Gesellschaft mit beschränkter Haftung'), ('cash', 'total assets')]
[('legal', 'GmbH'), ('cash', 'operating income, total assets')]
[('legal', 'GmbH & Co. KG'), ('cash', 'operating income, total assets')]
[('legal', 'Jo

86

#### Comment
For 86 legal forms out of 409 we have information about the economic situation.
The second part of this task is not completely clear, so I assume it asks only for the MAX, MIN and AVG of "total assets", "operating income" and "market capitalization".
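
All three aggregates could also be requested in a single query per property. The following is a sketch, not a tested query: the function name `stats_query` is hypothetical, and the inner `SELECT DISTINCT` is an assumption added so that a company typed as both business and enterprise is not weighted twice in the average. It also assumes the values are numeric literals.

```python
def stats_query(prop):
    """Build a query computing MIN, MAX and AVG of `prop` per legal form and country."""
    return f"""
SELECT ?legal ?cname (MIN(?o) AS ?min) (MAX(?o) AS ?max) (AVG(?o) AS ?avg)
WHERE {{
    {{
        SELECT DISTINCT ?c ?l ?country ?o
        WHERE {{
            ?c wdt:P31 ?type ;
                wdt:P1454 ?l ;
                wdt:P17 ?country ;
                {prop} ?o .
            FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453)
        }}
    }}
    ?country <http://schema.org/name> ?cname .
    ?l <http://schema.org/name> ?legal .
}}
GROUP BY ?legal ?cname
ORDER BY ASC(?legal)
"""


# The three properties selected for this task.
PROPERTIES = {
    "market capitalization": "wdt:P2226",
    "operating income": "wdt:P3362",
    "total assets": "wdt:P2403",
}

queries = {name: stats_query(prop) for name, prop in PROPERTIES.items()}
print(queries["total assets"])
```

Each generated string can be passed to `run_query` in its own cell.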

In [63]:
#For each legal form let's print the MAX "market capitalization" value for a company in each country that has at least one of them
queryString = """
SELECT ?legal ?cname (MAX(?o) AS ?maxmktcap)
WHERE { 

?c wdt:P31 ?type ;
    wdt:P1454 ?l ;
    wdt:P17 ?country ;
    ?m ?o.

FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453).
FILTER(?m = wdt:P2226) .

?c <http://schema.org/name> ?company .
?m <http://schema.org/name> ?money .
?country <http://schema.org/name> ?cname .
?l <http://schema.org/name> ?legal .
}
GROUP BY ?legal ?cname
ORDER BY ASC (?legal)
"""

print("Results")
run_query(queryString)

Results
[('legal', 'Aktiengesellschaft'), ('cname', 'Austria'), ('maxmktcap', '10960000000')]
[('legal', 'Aktiengesellschaft'), ('cname', 'Switzerland'), ('maxmktcap', '237363000000')]
[('legal', 'Aktiengesellschaft'), ('cname', 'Germany'), ('maxmktcap', '139800000000')]
[('legal', 'Bankaktiebolag'), ('cname', 'Finland'), ('maxmktcap', '415000000000')]
[('legal', 'Delaware corporation'), ('cname', 'United States of America'), ('maxmktcap', '1900000000000')]
[('legal', 'Naamloze vennootschap'), ('cname', 'Kingdom of the Netherlands'), ('maxmktcap', '9431200000')]
[('legal', 'Naamloze vennootschap'), ('cname', 'Netherlands'), ('maxmktcap', '35600000000')]
[('legal', 'S.A.'), ('cname', 'Greece'), ('maxmktcap', '3896025.7')]
[('legal', 'S.A.'), ('cname', 'France'), ('maxmktcap', '10203000000')]
[('legal', 'Societas Europaea'), ('cname', 'Germany'), ('maxmktcap', '8270000000')]
[('legal', 'Societas Europaea'), ('cname', 'Austria'), ('maxmktcap', '3500000000')]
[('legal', 'Société Anonyme'),

46

In [62]:
#For each legal form let's print the MAX "operating income" value for a company in each country that has at least one of them
queryString = """
SELECT ?legal ?cname (MAX(?o) AS ?maxopinc)
WHERE { 

?c wdt:P31 ?type ;
    wdt:P1454 ?l ;
    wdt:P17 ?country ;
    ?m ?o.

FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453).
FILTER(?m = wdt:P3362) .

#?c <http://schema.org/name> ?company .
?m <http://schema.org/name> ?money .
?country <http://schema.org/name> ?cname .
?l <http://schema.org/name> ?legal .
}
GROUP BY ?legal ?cname
ORDER BY ASC (?legal)
"""

print("Results")
run_query(queryString)

Results
[('legal', 'Aktiengesellschaft'), ('cname', 'Austria'), ('maxopinc', '264100000')]
[('legal', 'Aktiengesellschaft'), ('cname', 'Switzerland'), ('maxopinc', '50998000000')]
[('legal', 'Aktiengesellschaft'), ('cname', 'Germany'), ('maxopinc', '12902000000')]
[('legal', 'Anonim Şirket'), ('cname', 'Turkey'), ('maxopinc', '1154000000')]
[('legal', 'Bankaktiebolag'), ('cname', 'Finland'), ('maxopinc', '5127000000')]
[('legal', 'Besloten vennootschap'), ('cname', 'Switzerland'), ('maxopinc', '1491700000')]
[('legal', 'Besloten vennootschap'), ('cname', 'Netherlands'), ('maxopinc', '4889000000')]
[('legal', 'Delaware corporation'), ('cname', 'United States of America'), ('maxopinc', '41224000000')]
[('legal', 'GmbH'), ('cname', 'Germany'), ('maxopinc', '93166000')]
[('legal', 'GmbH & Co. KG'), ('cname', 'Germany'), ('maxopinc', '477377000')]
[('legal', 'GmbH & Co. KG'), ('cname', 'Spain'), ('maxopinc', '477377000')]
[('legal', 'Kōsha'), ('cname', 'South Korea'), ('maxopinc', '12004599

132

In [61]:
# For each legal form and country, print the MAX "total assets" among the companies that have this value
queryString = """
SELECT ?legal ?cname (MAX(?o) AS ?maxtotass)
WHERE { 

?c wdt:P31 ?type ;
    wdt:P1454 ?l ;
    wdt:P17 ?country ;
    ?m ?o.

FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453).
FILTER(?m = wdt:P2403) .

#?c <http://schema.org/name> ?company .
?m <http://schema.org/name> ?money .
?country <http://schema.org/name> ?cname .
?l <http://schema.org/name> ?legal .
}
GROUP BY ?legal ?cname
ORDER BY ASC (?legal)
"""

print("Results")
run_query(queryString)

Results
[('legal', 'Aktiengesellschaft'), ('cname', 'Austria'), ('maxtotass', '32112000000')]
[('legal', 'Aktiengesellschaft'), ('cname', 'Switzerland'), ('maxtotass', '439299000000')]
[('legal', 'Aktiengesellschaft'), ('cname', 'Germany'), ('maxtotass', '409732000000')]
[('legal', 'Aktiengesellschaft'), ('cname', 'German Reich'), ('maxtotass', '82339839')]
[('legal', 'Anpartsselskab'), ('cname', 'Denmark'), ('maxtotass', '86933')]
[('legal', 'Anstalt des öffentlichen Rechts'), ('cname', 'Germany'), ('maxtotass', '1332465865.19')]
[('legal', 'Bankaktiebolag'), ('cname', 'Finland'), ('maxtotass', '615659000000')]
[('legal', 'Besloten vennootschap'), ('cname', 'Switzerland'), ('maxtotass', '53801000000')]
[('legal', 'Besloten vennootschap'), ('cname', 'Netherlands'), ('maxtotass', '23647000000')]
[('legal', 'Delaware corporation'), ('cname', 'United States of America'), ('maxtotass', '319616000000')]
[('legal', 'Eingetragene Genossenschaft'), ('cname', 'Germany'), ('maxtotass', '49865800

143

In [65]:
# For each legal form and country, print the MIN "market capitalization" among the companies that have this value
queryString = """
SELECT ?legal ?cname (MIN(?o) AS ?minmktcap)
WHERE { 

?c wdt:P31 ?type ;
    wdt:P1454 ?l ;
    wdt:P17 ?country ;
    ?m ?o.

FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453).
FILTER(?m = wdt:P2226) .

?c <http://schema.org/name> ?company .
?m <http://schema.org/name> ?money .
?country <http://schema.org/name> ?cname .
?l <http://schema.org/name> ?legal .
}
GROUP BY ?legal ?cname
ORDER BY ASC (?legal)
"""

print("Results")
run_query(queryString)

Results
[('legal', 'Aktiengesellschaft'), ('cname', 'Austria'), ('minmktcap', '10960000000')]
[('legal', 'Aktiengesellschaft'), ('cname', 'Switzerland'), ('minmktcap', '172750000000')]
[('legal', 'Aktiengesellschaft'), ('cname', 'Germany'), ('minmktcap', '111760000')]
[('legal', 'Bankaktiebolag'), ('cname', 'Finland'), ('minmktcap', '415000000000')]
[('legal', 'Delaware corporation'), ('cname', 'United States of America'), ('minmktcap', '1530000000')]
[('legal', 'Naamloze vennootschap'), ('cname', 'Kingdom of the Netherlands'), ('minmktcap', '9431200000')]
[('legal', 'Naamloze vennootschap'), ('cname', 'Netherlands'), ('minmktcap', '102640000')]
[('legal', 'S.A.'), ('cname', 'Greece'), ('minmktcap', '3896025.7')]
[('legal', 'S.A.'), ('cname', 'France'), ('minmktcap', '157659497')]
[('legal', 'Societas Europaea'), ('cname', 'Germany'), ('minmktcap', '8270000000')]
[('legal', 'Societas Europaea'), ('cname', 'Austria'), ('minmktcap', '3500000000')]
[('legal', 'Société Anonyme'), ('cname',

46

In [66]:
# For each legal form and country, print the MIN "operating income" among the companies that have this value
queryString = """
SELECT ?legal ?cname (MIN(?o) AS ?minopinc)
WHERE { 

?c wdt:P31 ?type ;
    wdt:P1454 ?l ;
    wdt:P17 ?country ;
    ?m ?o.

FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453).
FILTER(?m = wdt:P3362) .

#?c <http://schema.org/name> ?company .
?m <http://schema.org/name> ?money .
?country <http://schema.org/name> ?cname .
?l <http://schema.org/name> ?legal .
}
GROUP BY ?legal ?cname
ORDER BY ASC (?legal)
"""

print("Results")
run_query(queryString)

Results
[('legal', 'Aktiengesellschaft'), ('cname', 'Austria'), ('minopinc', '-457000000')]
[('legal', 'Aktiengesellschaft'), ('cname', 'Switzerland'), ('minopinc', '22317000')]
[('legal', 'Aktiengesellschaft'), ('cname', 'Germany'), ('minopinc', '25000000')]
[('legal', 'Anonim Şirket'), ('cname', 'Turkey'), ('minopinc', '1154000000')]
[('legal', 'Bankaktiebolag'), ('cname', 'Finland'), ('minopinc', '5127000000')]
[('legal', 'Besloten vennootschap'), ('cname', 'Switzerland'), ('minopinc', '1491700000')]
[('legal', 'Besloten vennootschap'), ('cname', 'Netherlands'), ('minopinc', '1296000000')]
[('legal', 'Delaware corporation'), ('cname', 'United States of America'), ('minopinc', '-4408000000')]
[('legal', 'GmbH'), ('cname', 'Germany'), ('minopinc', '93166000')]
[('legal', 'GmbH & Co. KG'), ('cname', 'Germany'), ('minopinc', '219163000')]
[('legal', 'GmbH & Co. KG'), ('cname', 'Spain'), ('minopinc', '477377000')]
[('legal', 'Kōsha'), ('cname', 'South Korea'), ('minopinc', '12004599000')

132

In [67]:
# For each legal form and country, print the MIN "total assets" among the companies that have this value

queryString = """
SELECT ?legal ?cname (MIN(?o) AS ?mintotass)
WHERE { 

?c wdt:P31 ?type ;
    wdt:P1454 ?l ;
    wdt:P17 ?country ;
    ?m ?o.

FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453).
FILTER(?m = wdt:P2403) .

#?c <http://schema.org/name> ?company .
?m <http://schema.org/name> ?money .
?country <http://schema.org/name> ?cname .
?l <http://schema.org/name> ?legal .
}
GROUP BY ?legal ?cname
ORDER BY ASC (?legal)
"""

print("Results")
run_query(queryString)

Results
[('legal', 'Aktiengesellschaft'), ('cname', 'Austria'), ('mintotass', '1490000000')]
[('legal', 'Aktiengesellschaft'), ('cname', 'Switzerland'), ('mintotass', '284824000')]
[('legal', 'Aktiengesellschaft'), ('cname', 'Germany'), ('mintotass', '82339839')]
[('legal', 'Aktiengesellschaft'), ('cname', 'German Reich'), ('mintotass', '82339839')]
[('legal', 'Anpartsselskab'), ('cname', 'Denmark'), ('mintotass', '86933')]
[('legal', 'Anstalt des öffentlichen Rechts'), ('cname', 'Germany'), ('mintotass', '1332465865.19')]
[('legal', 'Bankaktiebolag'), ('cname', 'Finland'), ('mintotass', '615659000000')]
[('legal', 'Besloten vennootschap'), ('cname', 'Switzerland'), ('mintotass', '53801000000')]
[('legal', 'Besloten vennootschap'), ('cname', 'Netherlands'), ('mintotass', '14704000000')]
[('legal', 'Delaware corporation'), ('cname', 'United States of America'), ('mintotass', '1589422000')]
[('legal', 'Eingetragene Genossenschaft'), ('cname', 'Germany'), ('mintotass', '4986580075')]
[('l

143

In [68]:
# For each legal form and country, print the AVG "market capitalization" over the companies that have this value
queryString = """
SELECT ?legal ?cname (AVG(?o) AS ?avgmktcap)
WHERE { 

?c wdt:P31 ?type ;
    wdt:P1454 ?l ;
    wdt:P17 ?country ;
    ?m ?o.

FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453).
FILTER(?m = wdt:P2226) .

?c <http://schema.org/name> ?company .
?m <http://schema.org/name> ?money .
?country <http://schema.org/name> ?cname .
?l <http://schema.org/name> ?legal .
}
GROUP BY ?legal ?cname
ORDER BY ASC (?legal)
"""

print("Results")
run_query(queryString)

Results
[('legal', 'Aktiengesellschaft'), ('cname', 'Austria'), ('avgmktcap', '10960000000')]
[('legal', 'Aktiengesellschaft'), ('cname', 'Switzerland'), ('avgmktcap', '205056500000')]
[('legal', 'Aktiengesellschaft'), ('cname', 'Germany'), ('avgmktcap', '45480380014.285714285714286')]
[('legal', 'Bankaktiebolag'), ('cname', 'Finland'), ('avgmktcap', '415000000000')]
[('legal', 'Delaware corporation'), ('cname', 'United States of America'), ('avgmktcap', '285756789686.482758620689655')]
[('legal', 'Naamloze vennootschap'), ('cname', 'Kingdom of the Netherlands'), ('avgmktcap', '9431200000')]
[('legal', 'Naamloze vennootschap'), ('cname', 'Netherlands'), ('avgmktcap', '20354468571.428571428571429')]
[('legal', 'S.A.'), ('cname', 'Greece'), ('avgmktcap', '3896025.7')]
[('legal', 'S.A.'), ('cname', 'France'), ('avgmktcap', '3068109916.166666666666667')]
[('legal', 'Societas Europaea'), ('cname', 'Germany'), ('avgmktcap', '8270000000')]
[('legal', 'Societas Europaea'), ('cname', 'Austria')

46

In [69]:
# For each legal form and country, print the AVG "operating income" over the companies that have this value
queryString = """
SELECT ?legal ?cname (AVG(?o) AS ?avgopinc)
WHERE { 

?c wdt:P31 ?type ;
    wdt:P1454 ?l ;
    wdt:P17 ?country ;
    ?m ?o.

FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453).
FILTER(?m = wdt:P3362) .

#?c <http://schema.org/name> ?company .
?m <http://schema.org/name> ?money .
?country <http://schema.org/name> ?cname .
?l <http://schema.org/name> ?legal .
}
GROUP BY ?legal ?cname
ORDER BY ASC (?legal)
"""

print("Results")
run_query(queryString)

Results
[('legal', 'Aktiengesellschaft'), ('cname', 'Austria'), ('avgopinc', '-57266666.666666666666667')]
[('legal', 'Aktiengesellschaft'), ('cname', 'Switzerland'), ('avgopinc', '13280719500')]
[('legal', 'Aktiengesellschaft'), ('cname', 'Germany'), ('avgopinc', '5128921428.571428571428571')]
[('legal', 'Anonim Şirket'), ('cname', 'Turkey'), ('avgopinc', '1154000000')]
[('legal', 'Bankaktiebolag'), ('cname', 'Finland'), ('avgopinc', '5127000000')]
[('legal', 'Besloten vennootschap'), ('cname', 'Switzerland'), ('avgopinc', '1491700000')]
[('legal', 'Besloten vennootschap'), ('cname', 'Netherlands'), ('avgopinc', '2296200000')]
[('legal', 'Delaware corporation'), ('cname', 'United States of America'), ('avgopinc', '6759567700')]
[('legal', 'GmbH'), ('cname', 'Germany'), ('avgopinc', '93166000')]
[('legal', 'GmbH & Co. KG'), ('cname', 'Germany'), ('avgopinc', '360479250')]
[('legal', 'GmbH & Co. KG'), ('cname', 'Spain'), ('avgopinc', '477377000')]
[('legal', 'Kōsha'), ('cname', 'South K

132

In [70]:
# For each legal form and country, print the AVG "total assets" over the companies that have this value
queryString = """
SELECT ?legal ?cname (AVG(?o) AS ?avgtotass)
WHERE { 

?c wdt:P31 ?type ;
    wdt:P1454 ?l ;
    wdt:P17 ?country ;
    ?m ?o.

FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453).
FILTER(?m = wdt:P2403) .

#?c <http://schema.org/name> ?company .
?m <http://schema.org/name> ?money .
?country <http://schema.org/name> ?cname .
?l <http://schema.org/name> ?legal .
}
GROUP BY ?legal ?cname
ORDER BY ASC (?legal)
"""

print("Results")
run_query(queryString)

Results
[('legal', 'Aktiengesellschaft'), ('cname', 'Austria'), ('avgtotass', '12423400000')]
[('legal', 'Aktiengesellschaft'), ('cname', 'Switzerland'), ('avgtotass', '187858608000')]
[('legal', 'Aktiengesellschaft'), ('cname', 'Germany'), ('avgtotass', '118096647427.4375')]
[('legal', 'Aktiengesellschaft'), ('cname', 'German Reich'), ('avgtotass', '82339839')]
[('legal', 'Anpartsselskab'), ('cname', 'Denmark'), ('avgtotass', '86933')]
[('legal', 'Anstalt des öffentlichen Rechts'), ('cname', 'Germany'), ('avgtotass', '1332465865.19')]
[('legal', 'Bankaktiebolag'), ('cname', 'Finland'), ('avgtotass', '615659000000')]
[('legal', 'Besloten vennootschap'), ('cname', 'Switzerland'), ('avgtotass', '53801000000')]
[('legal', 'Besloten vennootschap'), ('cname', 'Netherlands'), ('avgtotass', '18073800000')]
[('legal', 'Delaware corporation'), ('cname', 'United States of America'), ('avgtotass', '64961191789.473684210526316')]
[('legal', 'Eingetragene Genossenschaft'), ('cname', 'Germany'), ('a

143

#### Comment

I am not sure I have fully satisfied the request, but in my opinion this is the solution that best fits the text of the task.
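
The nine queries above differ only in the aggregate function and in the property IRI. As a minimal sketch (not run against the endpoint), the three aggregates could be computed together, with one query per property. The names `MEASURES` and `build_stats_query` and the output variables `?max`, `?min`, `?avg` are illustrative assumptions; the IRIs are the ones already used above. The sketch drops the name lookups for `?c` and `?m`, so row counts could differ slightly from the cells above.

```python
# Sketch: build one query per measure that returns MAX, MIN and AVG together.
# MEASURES and build_stats_query are hypothetical names, not part of the notebook setup.
MEASURES = {
    "market capitalization": "wdt:P2226",
    "operating income": "wdt:P3362",
    "total assets": "wdt:P2403",
}

def build_stats_query(prop):
    """Return a SPARQL query string aggregating `prop` per legal form and country."""
    if prop not in MEASURES.values():
        raise ValueError(f"unexpected property: {prop}")
    return f"""
SELECT ?legal ?cname (MAX(?o) AS ?max) (MIN(?o) AS ?min) (AVG(?o) AS ?avg)
WHERE {{
  ?c wdt:P31 ?type ;
     wdt:P1454 ?l ;
     wdt:P17 ?country ;
     {prop} ?o .
  FILTER(?type = wd:Q6881511 || ?type = wd:Q4830453) .
  ?country <http://schema.org/name> ?cname .
  ?l <http://schema.org/name> ?legal .
}}
GROUP BY ?legal ?cname
ORDER BY ASC(?legal)
"""

# One query string per measure, keyed by its human-readable label.
queries = {label: build_stats_query(prop) for label, prop in MEASURES.items()}
```

Each string can then be passed to the existing helper, for example `run_query(queries["total assets"])`.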

### Task 5.3

    5.3 Which business in each country owns more businesses in other E.U. countries?

#### Comment

From the text of the task I understand that the countries considered are only the E.U. ones.

Reminder:

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P1830`   | owner of   | predicate |

In [101]:
# Count, for each business, the businesses it owns in a different E.U. country, then take the maximum count per country
queryString = """
SELECT ?cname (MAX(?num) AS ?max)
WHERE { 
?com wdt:P31 ?type ;
    wdt:P17 ?cou .
?cou wdt:P463 wd:Q458 .

FILTER(?com = ?c).
{
SELECT ?c (COUNT(DISTINCT ?sub) AS ?num)
WHERE { 

?c wdt:P31 ?type ;
    wdt:P17 ?country ;
    (wdt:P1830)* ?sub .
?sub wdt:P31 ?type ;
    wdt:P17 ?country2 .
?country wdt:P463 wd:Q458 .
?country2 wdt:P463 wd:Q458 .
    
FILTER(?type = wd:Q4830453).
FILTER(?country != ?country2) .
}
GROUP BY ?c
}

?com <http://schema.org/name> ?company .
?cou <http://schema.org/name> ?cname .
}
GROUP BY ?cname
ORDER BY DESC (?max)
"""

print("Results")
run_query(queryString)

Results
[('cname', 'France'), ('max', '26')]
[('cname', 'Cyprus'), ('max', '21')]
[('cname', 'Luxembourg'), ('max', '17')]
[('cname', 'Czech Republic'), ('max', '13')]
[('cname', 'United Kingdom'), ('max', '9')]
[('cname', 'Slovakia'), ('max', '8')]
[('cname', 'Germany'), ('max', '8')]
[('cname', 'Denmark'), ('max', '7')]
[('cname', 'Kingdom of the Netherlands'), ('max', '7')]
[('cname', 'Poland'), ('max', '5')]
[('cname', 'Belgium'), ('max', '5')]
[('cname', 'Finland'), ('max', '5')]
[('cname', 'Italy'), ('max', '5')]
[('cname', 'Malta'), ('max', '5')]
[('cname', 'Austria'), ('max', '4')]
[('cname', 'Sweden'), ('max', '3')]
[('cname', 'Ireland'), ('max', '3')]
[('cname', 'Hungary'), ('max', '2')]
[('cname', 'Slovenia'), ('max', '1')]
[('cname', 'Estonia'), ('max', '1')]
[('cname', 'Spain'), ('max', '1')]
[('cname', 'Greece'), ('max', '1')]
[('cname', 'Croatia'), ('max', '1')]
[('cname', 'Bulgaria'), ('max', '1')]
[('cname', 'Latvia'), ('max', '1')]
[('cname', 'Portugal'), ('max', '1')

28

#### Comment
REMEMBER TO CHECK THE RESULTS

#### Comment
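For each E.U. country, I have retrieved the maximum number of businesses in other E.U. countries owned by a single business of that country. I was not able to also print the name of the owning company, because of the rules for aggregation functions (a variable that is neither aggregated nor listed in the GROUP BY cannot be selected).  
The UK is no longer in the E.U., so the database is probably not up to date.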
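
One way around the aggregation limitation is to fetch the per-company counts (the inner subquery, extended with the company and country names) and pick the top owner per country in Python. The sketch below assumes a hypothetical variant of `run_query` that returns the bindings as a list of dicts instead of printing them. The function name `top_owner_per_country` and the sample rows are invented placeholders, not results from the database.

```python
# Sketch: keep, for each country, the company with the highest count.
# Assumes rows shaped like {"cname": ..., "company": ..., "num": ...};
# the sample rows below are made-up placeholders.
def top_owner_per_country(rows):
    """Return {country: (company, count)} keeping the largest count per country."""
    best = {}
    for row in rows:
        country, company, num = row["cname"], row["company"], int(row["num"])
        # Strict '>' keeps the first company seen when counts tie.
        if country not in best or num > best[country][1]:
            best[country] = (company, num)
    return best

sample_rows = [
    {"cname": "Country A", "company": "Company X", "num": "3"},
    {"cname": "Country A", "company": "Company Y", "num": "7"},
    {"cname": "Country B", "company": "Company Z", "num": "2"},
]

for country, (company, num) in top_owner_per_country(sample_rows).items():
    print(country, company, num)
```

Ties are resolved in favour of the first row seen; returning all tied companies would need a list per country instead of a single tuple.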

### Task 5.4

    5.4 What can we say about industry sectors in various countries?

#### Comment
As before, I am going to consider the whole globe, not only E.U. states, because the task does not specify otherwise.

We have

| IRI           | Description   | Role      |
| -----------   | -----------   |-----------|
| `wdt:P452`   | industry   | predicate |

In [3]:
# For each country, print the number of companies belonging to each industry present in its territory
queryString = """
SELECT ?cname ?industry (COUNT(DISTINCT ?c) AS ?num)
WHERE { 

?c wdt:P31 ?type ;
    wdt:P17 ?count ;
    wdt:P452 ?ind .

FILTER(?type = wd:Q4830453 || ?type = wd:Q4830453).

?ind <http://schema.org/name> ?industry .
?count <http://schema.org/name> ?cname .
}
GROUP BY ?cname ?industry
ORDER BY ASC (?cname) (?industry)
"""

print("Results")
run_query(queryString)

Results
[('cname', 'Afghanistan'), ('industry', 'film production company'), ('num', '1')]
[('cname', 'Afghanistan'), ('industry', 'mass media'), ('num', '1')]
[('cname', 'Afghanistan'), ('industry', 'mobile phone industry'), ('num', '2')]
[('cname', 'Afghanistan'), ('industry', 'telecommunications industry'), ('num', '2')]
[('cname', 'Afghanistan'), ('industry', 'television'), ('num', '1')]
[('cname', 'Albania'), ('industry', 'brewing'), ('num', '1')]
[('cname', 'Albania'), ('industry', 'electric utility'), ('num', '1')]
[('cname', 'Albania'), ('industry', 'energy supply'), ('num', '1')]
[('cname', 'Albania'), ('industry', 'finance'), ('num', '1')]
[('cname', 'Albania'), ('industry', 'logistics'), ('num', '1')]
[('cname', 'Albania'), ('industry', 'mass-market retailing'), ('num', '1')]
[('cname', 'Albania'), ('industry', 'mobile phone industry'), ('num', '3')]
[('cname', 'Albania'), ('industry', 'petroleum industry'), ('num', '2')]
[('cname', 'Albania'), ('industry', 'retail'), ('num',

10237

#### Comment
Let's verify that the results are correct.

In [24]:
# Limit the previous query to 10 rows and also print the node URIs
queryString = """
SELECT ?cname ?industry (COUNT(DISTINCT ?c) AS ?num) ?ind ?count
WHERE { 

?c wdt:P31 ?type ;
    wdt:P17 ?count ;
    wdt:P452 ?ind .

FILTER(?type = wd:Q4830453 || ?type = wd:Q4830453).

?ind <http://schema.org/name> ?industry .
?count <http://schema.org/name> ?cname .
}
GROUP BY ?cname ?industry ?ind ?count
ORDER BY ASC (?cname) (?industry)
LIMIT 10
"""

print("Results")
run_query(queryString)

Results
[('cname', 'Afghanistan'), ('industry', 'film production company'), ('num', '1'), ('ind', 'http://www.wikidata.org/entity/Q1762059'), ('count', 'http://www.wikidata.org/entity/Q889')]
[('cname', 'Afghanistan'), ('industry', 'mass media'), ('num', '1'), ('ind', 'http://www.wikidata.org/entity/Q11033'), ('count', 'http://www.wikidata.org/entity/Q889')]
[('cname', 'Afghanistan'), ('industry', 'mobile phone industry'), ('num', '2'), ('ind', 'http://www.wikidata.org/entity/Q56598901'), ('count', 'http://www.wikidata.org/entity/Q889')]
[('cname', 'Afghanistan'), ('industry', 'telecommunications industry'), ('num', '2'), ('ind', 'http://www.wikidata.org/entity/Q25245117'), ('count', 'http://www.wikidata.org/entity/Q889')]
[('cname', 'Afghanistan'), ('industry', 'television'), ('num', '1'), ('ind', 'http://www.wikidata.org/entity/Q289'), ('count', 'http://www.wikidata.org/entity/Q889')]
[('cname', 'Albania'), ('industry', 'brewing'), ('num', '1'), ('ind', 'http://www.wikidata.org/entit

10

In [3]:
# Let's print the names of the companies belonging to "mobile phone industry" (wd:Q56598901) in "Afghanistan" (wd:Q889)
queryString = """

SELECT ?company
WHERE { 

?com wdt:P31 ?type ;
    wdt:P17 wd:Q889 ;
    wdt:P452 wd:Q56598901 .

FILTER(?type = wd:Q4830453 || ?type = wd:Q4830453).

?com <http://schema.org/name> ?company .
}
"""

print("Results")
run_query(queryString)

Results
[('company', 'Roshan')]
[('company', 'Afghan Wireless')]


2

#### Comment
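The correctness check is successful: the two companies listed match the count of 2 returned by the previous queries.   
In this way we have counted, for each country, the companies of each industry present there. Note that the filter in these queries tests `wd:Q4830453` ("business") twice, so only instances of "business" are counted, not "enterprise" (`wd:Q6881511`).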
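
The manual check above can be repeated for any (country, industry) pair taken from the URI columns of the limited query. A minimal sketch, assuming a hypothetical helper `build_check_query` that only fills the two entity ids into the same query pattern:

```python
import re

# Sketch: rebuild the verification query for an arbitrary country/industry pair.
# build_check_query is a hypothetical helper, not part of the notebook setup.
QID = re.compile(r"Q[1-9][0-9]*")

def build_check_query(country_qid, industry_qid):
    """Return the company-listing query for the given Wikidata entity ids."""
    for qid in (country_qid, industry_qid):
        if not QID.fullmatch(qid):
            raise ValueError(f"not a Wikidata entity id: {qid!r}")
    return f"""
SELECT ?company
WHERE {{
  ?com wdt:P31 wd:Q4830453 ;
       wdt:P17 wd:{country_qid} ;
       wdt:P452 wd:{industry_qid} .
  ?com <http://schema.org/name> ?company .
}}
"""

# Same pair as the check above: Afghanistan and "mobile phone industry".
check_query = build_check_query("Q889", "Q56598901")
```

Passing `check_query` to `run_query` is expected to list the same companies as the cell above, provided the data behind the endpoint has not changed.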