<img src="_assets/openingextractiveslogo.png" alt="logos" width="400" style="float:right">

# <span style = 'color:#363487'>A notebook for analysing Beneficial Ownership Data Standard (BODS) data</span>

## Background

This report contains analyses of data produced in line with the [Beneficial Ownership Data Standard](https://standard.openownership.org/) (BODS), using an [initial dataset](https://data.gov.lv/dati/lv/dataset/plg-bods/resource/19a7d5f5-5586-4de2-a710-fc7145a129f2) released by the Register of Enterprises of the Republic of Latvia. The analyses presented here are a subset of a full BODS analysis notebook, which can be used to analyse any BODS dataset. The aim of the notebook is to inform discussion on how BODS data can be improved to make it as useful as possible.

It has been produced as part of the [Opening Extractives programme](https://www.openownership.org/en/topics/opening-extractives/) which is implemented jointly between the [Extractive Industries Transparency Initiative International Secretariat](https://eiti.org/opening-extractives) and Open Ownership. Opening Extractives aims to transform the availability and use of beneficial ownership data for effective governance in the extractive sector.

The structure of this notebook - and of the example feedback report - aligns with the Open Ownership [Principles for Effective Beneficial Ownership Disclosure](https://www.openownership.org/en/principles/) (OO Principles). 

The OO Principles provide a framework for implementing comprehensive beneficial ownership transparency reforms, and assessing and improving existing disclosure regimes. They seek to generate actionable and usable data across the widest range of policy applications of beneficial ownership data. Effective disclosure needs high quality, reliable data to maximise usability for all potential users and to minimise loopholes.

Here, we provide example queries that align to each of the principles, and accompanying guidance on how to interpret BODS data in light of the principles.


<div style="background-color:#f2f4f8">

## Setup

### How to complete this section

All code chunks in this section need to be run in order for subsequent queries to work. The code chunk that installs libraries from `requirements.txt` only needs to be run once if using the notebook locally, while the other chunks need to be run once per session. The majority of functions to run this notebook are in the `qbods` library, which can be found and edited within the script `qbods.py` that accompanies this notebook.

</div>

 <div style="background-color:#f2f4f8">


### Install and import libraries

</div>

In [None]:
!pip3 install -r requirements.txt

In [None]:
import qbods
import pandas as pd

 <div style="background-color:#f2f4f8">

### Download and unzip data

Insert the url pointing to the csv download from the [BODS data analysis tools](https://bods-data.openownership.org/). See the methodology document if this resource is not yet available.

</div>

In [None]:
qbods.getBodsData("https://s3.eu-west-1.amazonaws.com/oo-bodsdata/data/latvia/csv.zip")

 <div style="background-color:#f2f4f8">

### Read in data

</div>

In [None]:
d = qbods.readBodsData('csv')
list(d.keys())

## Principles

<div style = "background-color:#f2f4f8">

### How to complete this section

- For each section, run the queries
- Carefully check the tables and graphs and fix any errors

</div>

### 1. Robust definition

This section reports how the data maps to the [robust definition](https://www.openownership.org/en/principles/robust-definition/) principle. 

#### Natural Persons

- Robust and clear definitions of beneficial ownership should state that a beneficial owner should be a natural person

##### Interested parties

**Query description:** Provides a breakdown of the types of people and companies who own and control companies.

**Query result:** The below table and graph provide counts of interested parties in ownership or control statements, grouped by type (person type if from a person statement, or entity type if from an entity statement), and by whether at least one interest is described as beneficial ownership or control.

In [None]:
o121 = qbods.q121(d['ooc_interests'],d['ooc_statement'],d['person_statement'],d['entity_statement'])
o121[0]

In [None]:
o121[1]

**Query interpretation:** A robust definition of beneficial ownership will clearly specify that beneficial owners should be natural persons. In the resulting data we may expect to see interested party types that correspond to both entities and natural persons, but only interested party types that correspond to natural persons (knownPerson, unknownPerson, anonymousPerson) should be described as beneficial owners.

Anonymous and unknown persons may be described as beneficial owners, but, in accordance with the sufficient detail principle, these person types are expected to constitute a small overall proportion of all owner types, compared to known persons.

**Query feedback:** In the Latvia data the vast majority of interested parties are known persons, with a smaller set of registered entities. Though most statements do not contain information about beneficial ownership or control, of those that do, no statements have entities described as a beneficial owner. Instead, all statements where an interested party is a natural person are described as beneficial ownership, and all those where a registered entity is the interested party are described as not being beneficial ownership. This is in line with the [robust definition](https://www.openownership.org/en/principles/robust-definition/) principle, although it is possible for natural persons to be interested parties in ownership arrangements that do not constitute beneficial ownership.
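
The check described above is essentially a cross-tabulation of interested party type against the beneficial ownership flag. The following is a minimal sketch on toy data, not the implementation of `qbods.q121`; the column names `interestedPartyType` and `beneficialOwnershipOrControl` are assumptions made for illustration and may not match the CSV headers in your dataset.

```python
import pandas as pd

# Toy data; the column names are assumptions, not the real BODS CSV headers.
toy = pd.DataFrame({
    "interestedPartyType": ["knownPerson", "knownPerson", "registeredEntity",
                            "registeredEntity", "knownPerson"],
    "beneficialOwnershipOrControl": [True, True, False, False, None],
})

# Label the flag explicitly so that missing values get their own column.
flag = (toy["beneficialOwnershipOrControl"]
        .map({True: "True", False: "False"})
        .fillna("Missing"))
counts = pd.crosstab(toy["interestedPartyType"], flag)

# Count beneficial ownership interests held by non-person party types.
person_types = ["knownPerson", "unknownPerson", "anonymousPerson"]
is_entity = ~counts.index.isin(person_types)
entity_as_bo = int(counts.loc[is_entity, "True"].sum()) if "True" in counts else 0
```

A non-zero `entity_as_bo` would point to entities being described as beneficial owners, which the robust definition principle advises against.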

#### Ownership and control

- Definitions should cover all relevant forms of ownership and control

##### Interest types and beneficial ownership or control

**Query description:** Provides a breakdown of the ways in which people own and control companies.

**Query result:** The below table and graph provide counts of interests in ownership or control statements, grouped by interest type, and by whether the interest was classified as beneficial ownership or control.

In [None]:
o131 = qbods.q131(d['ooc_interests'],d['ooc_statement'])
o131[0]

In [None]:
o131[1]

**Query interpretation:** Providing details about ownership or control interests is not a required element, according to the [BODS v0.2 schema](https://standard.openownership.org/en/0.2.0/schema/index.html). However, we would expect Interest details to be recorded for a majority of ownership or control statements. A range (i.e. at least two) of interest types should be recorded across ownership or control statements. Beneficial owners in particular are expected to encompass [a range of interest types](https://standard.openownership.org/en/0.2.0/schema/reference.html?highlight=interest#interesttype), so we expect to see a range of values in the data where [beneficial ownership or control is True](https://standard.openownership.org/en/0.3.0/schema/guidance/repr-beneficial-ownership.html). In most jurisdictions, the majority of interest types are expected to be ‘shareholding’ or ‘voting’ rights, and only a minority of cases should be ‘other influence or control’.

**Query feedback:** In this dataset, most interest types are missing, which makes it difficult to ascertain the nature of ownership or control arrangements. Of those where there is interest data, almost all are 'other influence or control', which suggests either that BODS v0.2 did not properly capture the nature of ownership and control in this set of declaring entities, or that there was an issue with the data modelling, collection or publishing process.

### 2. Comprehensive coverage

This section reports how the data maps to the [comprehensive coverage](https://www.openownership.org/en/principles/comprehensive-coverage/) principle.

#### Coverage of entities

- All types of entities and arrangements through which ownership and control can be exercised should be included in declarations, unless reasonably exempt

##### Entity types declaring their beneficial ownership

**Query description:** Provides a breakdown of the entity types that are subjects of ownership or control arrangements.

**Query result:** The below table and figure show counts of subjects in ownership or control statements, grouped by entity type, and by whether at least one interest in the ownership or control statement is described as beneficial ownership or control.

In [None]:
o213 = qbods.q213(d['ooc_interests'],d['ooc_statement'],d['entity_statement'])
o213[0]

In [None]:
o213[1]

**Query interpretation:** Depending on the disclosure regime, we may expect to see a range of entity types declaring their beneficial owners. Missing interest data may impede the ability to determine which companies are declaring their beneficial owners.

**Query feedback:** In the [Latvia data](https://bods-data.openownership.org/source/latvia/), the vast majority of entities that are subjects in Ownership or Control Statements are Registered Entities, with a smaller number of arrangements and anonymous entities. There is insufficient data on beneficial ownership or control to determine if there are differences in beneficial ownership reporting between entity types.

##### Anonymous and unknown entities

**Query description:** Provides a breakdown of reasons given for missing entity information.

**Query result:** The below table and graph provide counts of entity statements, grouped by reasons given for missing entity information (to be used where entities are given a type unspecified or anonymous), and by entity type.

In [None]:
o221 = qbods.q221(d['entity_statement'])
o221[0]

In [None]:
o221[1]

**Query interpretation:** All instances where entity types are listed as anonymous or unknown are [expected to be accompanied by an explanation](https://standard.openownership.org/en/0.2.0/schema/reference.html?highlight=unspecifiedEntityDetails#entitystatement), so there should be no missing data on reasons provided for these entity types. In addition, a range of reasons is expected and we would not expect all reasons to be described as ‘unknown’. Reasons for unspecified entity details are not expected when the entity type is not 'anonymous entity' or 'unknown entity'. In instances where the reason given is that the subject or interested party is exempt from disclosure, accompanying free-text descriptions are likely to be useful to clarify why this might be the case.

**Query feedback:** In this dataset, no reasons are provided where unknown or anonymous entity types are used. However, the number of unknown and anonymous types used is very small. Unspecified entity details are provided for a small number of statements where the entity type is 'registered entity', which is not expected. All of these statements use the same reason, and all provide a freetext description. Together, this suggests that there are some issues with how data on unspecified and anonymous entities are being modeled, but we see no suggestion in these results that anonymous and unknown entities are being used to obscure information about companies in this dataset.
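
The two expectations in the interpretation above (a reason is present where the entity type is anonymous or unknown, and absent otherwise) can be expressed as two boolean filters. This is a sketch on toy data rather than the logic of `qbods.q221`; the column names `entityType` and `unspecifiedReason` and the example reason values are assumptions for illustration.

```python
import pandas as pd

# Toy entity statements; column names and reason values are illustrative assumptions.
entities = pd.DataFrame({
    "entityType": ["registeredEntity", "anonymousEntity",
                   "unknownEntity", "registeredEntity"],
    "unspecifiedReason": [None, None,
                          "interested-party-exempt-from-disclosure",
                          "no-beneficial-owners"],
})

needs_reason = entities["entityType"].isin(["anonymousEntity", "unknownEntity"])
has_reason = entities["unspecifiedReason"].notna()

# Anonymous or unknown entities that give no reason.
missing_reason = entities[needs_reason & ~has_reason]
# Other entity types that give a reason even though none is expected.
unexpected_reason = entities[~needs_reason & has_reason]
```

Both result tables would be empty in a dataset that fully follows the guidance.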

### 3. Sufficient detail

This section reports how the data maps to the [sufficient detail](https://www.openownership.org/en/principles/sufficient-detail/) principle.



#### Detail about people

- Sufficient information should be collected to be able to unambiguously identify people, entities, and arrangements, using clear identifiers for natural persons, legal entities and arrangements

##### Birth date

**Query description:** Provides a breakdown of people by the decade and year of their birth.

**Query result:** The table below provides counts of person statements, grouped by their decade of birth. The figure shows the distribution by individual year of birth.

In [None]:
o314 = qbods.q314(d['person_statement'])
o314[0]

In [None]:
o314[1]

**Query interpretation:**  A range of birth and death years are expected, though values outside realistic ranges (e.g. with very old year of death or very recent year of birth) may indicate verification issues.

**Query feedback:** Most birth years fall within a realistic range and follow a realistic distribution. However, the presence of some birth years after 2010, and of some in the future, suggests that there may be verification or publishing issues.
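
A plausibility check of this kind could be sketched as below. The function `classify_birth_year` and its thresholds (`oldest`, `min_age`) are hypothetical and chosen only for illustration; they are not part of `qbods`, and suitable cut-offs will depend on the disclosure regime.

```python
from datetime import date

def classify_birth_year(year, current_year=None, oldest=1900, min_age=10):
    """Hypothetical helper: label a birth year as plausible or not.

    `oldest` and `min_age` are illustrative thresholds, not BODS rules.
    """
    if current_year is None:
        current_year = date.today().year
    if year is None:
        return "missing"
    if year > current_year:
        return "future"
    if year > current_year - min_age:
        return "very recent"
    if year < oldest:
        return "very old"
    return "plausible"

# Example: label a handful of birth years relative to a fixed reference year.
labels = [classify_birth_year(y, current_year=2023)
          for y in (1980, 2015, 2030, 1850, None)]
```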

#### Detail about interests

- Where BO is held indirectly through multiple legal entities or legal arrangements, or ownership or control are exerted formally or informally through another natural person, sufficient information should be collected to understand full ownership chains

##### Size and composition of beneficial ownership chains

**Query description:** Provides an overview of the size and composition of ownership chains. Note that this query is run for a subsample of the dataset (here, 250 ownership chains).

**Query result:** The below table shows the average, minimum and maximum values for the number of nodes per ownership chain, the number of entities per chain, and the number of persons per chain. The graph provides a visual overview of these chains in network format.

In [None]:
o331 = qbods.q331(d['ooc_statement'], 250)
o331[0]

In [None]:
o331[1]

**Query interpretation:**  Values of two nodes per chain, and one entity statement and one person statement per chain would be expected if all declaring entities declared a single beneficial owner. Fewer than one person statement per chain would indicate that many chains exist with no person statements involved, and more than one person statement or entity statement per chain would indicate companies with multiple owners.

**Query feedback:** In this Latvia dataset most ownership chains contain a single entity and a single person. This suggests that most ownership chains in this dataset correspond to simple ownership structures, though some more complex chains can also be found.
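
To make the notion of "nodes per chain" concrete, the sketch below groups subject and interested party identifiers into connected chains using a union-find structure and returns the size of each chain. It is an illustration under simplifying assumptions (each ownership or control statement reduced to a `(subject, interested_party)` pair with made-up IDs), and `chain_sizes` is a hypothetical helper, not how `qbods.q331` is implemented.

```python
from collections import defaultdict

def chain_sizes(edges):
    """Group statement IDs into connected chains; return each chain's node count."""
    parent = {}

    def find(x):
        # Return the representative of x's chain, registering x if unseen.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    # Each edge links a subject to an interested party, merging their chains.
    for subject, party in edges:
        parent[find(subject)] = find(party)

    members = defaultdict(set)
    for node in parent:
        members[find(node)].add(node)
    return sorted(len(nodes) for nodes in members.values())

# Made-up IDs: e* are entities, p* are persons.
sizes = chain_sizes([("e1", "p1"), ("e2", "p2"), ("e2", "p3"), ("e3", "e2")])
```

Here the first chain has two nodes (one entity, one person), which is the simple structure described above, while the second has four nodes because `e2` has two owners and is itself an owner of `e3`.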

### 4. A central register

This section reports how the data maps to the [central register](https://www.openownership.org/en/principles/central-register/) principle.





#### Publisher information

- Beneficial ownership disclosures should be collated and held within a central register

##### Publisher details

**Query description:** Provides a breakdown of publisher details in beneficial ownership declarations.

**Query result:** The below table provides counts of publisher names in person statements, entity statements, and ownership or control statements.

In [None]:
o411 = qbods.q411(d['entity_statement'],d['person_statement'],d['ooc_statement'])
o411[0]

**Query interpretation:**  Publisher name information should be provided for the majority of statements, and if the data come from a single, central register, all statements should come from the same publisher.

**Query feedback:** All statements contain publisher information, and all statements are published by the same publisher. This is in line with the central register principle.
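
The two conditions above (publisher information present everywhere, and a single publisher across all statement types) can be checked as follows. This is a sketch on toy tables; `publisherName` is an assumed column name and the real CSV header may differ.

```python
import pandas as pd

# Toy tables; "publisherName" is an assumed column name.
tables = {
    "entity_statement": pd.DataFrame({"publisherName": ["Register A", "Register A"]}),
    "person_statement": pd.DataFrame({"publisherName": ["Register A"]}),
    "ooc_statement": pd.DataFrame({"publisherName": ["Register A", "Register A"]}),
}

# Stack the publisher column from all three statement types.
publishers = pd.concat(
    [t["publisherName"] for t in tables.values()], ignore_index=True
)
publisher_counts = publishers.value_counts(dropna=False)

# True only if no value is missing and exactly one distinct publisher appears.
all_present = bool(publishers.notna().all())
single_publisher = all_present and publishers.nunique() == 1
```

The same pattern would apply to any field that is expected to be constant across a release.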

### 5. Public access

This section reports how the data map to the [public access](https://www.openownership.org/en/principles/public-access/) principle.

#### Data sharing

- This data should be available as open data: published under a [specified licence](https://opendefinition.org/licenses/) which allows anyone to access, use, and share it without barriers such as identification, registration requirements, or the collection of data about users

##### License details

**Query description:** Provides a breakdown of licence details in beneficial ownership declarations.

**Query result:** The below table provides counts of licence types in person statements, entity statements, and ownership or control statements.

In [None]:
o511 = qbods.q511(d['entity_statement'],d['person_statement'],d['ooc_statement'])
o511[0]

**Query interpretation:**  All statements should have license information, and the license should be an [Open Definition](http://opendefinition.org/licenses/) recommended conformant license.

**Query feedback:** All statements have license information, with a link to the [license](http://opendefinition.org/licenses/cc-zero/) scheme. The license scheme itself is an Open Definition recommended conformant license, as per BODS guidance and the public access principle.

### 6. Structured data

This section reports how the data map to the [structured data](https://www.openownership.org/en/principles/structured-data/) principle.



##### BODS version

**Query description:** Provides a breakdown of which BODS version is declared in beneficial ownership declarations.

**Query result:** The below table and figure provide counts of BODS versions found in person statements, entity statements, and ownership or control statements.

In [None]:
o611 = qbods.q611(d['entity_statement'],d['person_statement'],d['ooc_statement'])
o611[0]

**Query interpretation:**  All statements should include the BODS version number, and all statements in a single release should conform to the same version of BODS.

**Query feedback:** All statements indicate they are published to BODS v0.2, which is in line with the structure and content of the data, and with the structured data principle.

### 7. Verification

This section reports how the data map to the [verification](https://www.openownership.org/en/principles/verification/) principle.

#### Source types

When data is submitted, measures should be taken to verify the:
- Beneficial owner
- Entity
- Ownership or control relationship between the beneficial owner and the entity


##### Source type details

**Query description:** Provides a breakdown of the source types in beneficial ownership declarations. This field includes the option to declare that a statement has been verified.

**Query result:** The below table provides counts of source types in person statements, entity statements and ownership or control statements.

In [None]:
o711 = qbods.q711(d['entity_statement'], d['person_statement'], d['ooc_statement'])
o711[0]

**Query interpretation:** All statements should have a source type object. If the implementing country is putting a verification process in place, then use of the verified source type may provide a useful means of monitoring this process. Note that each statement can have more than one source type.

**Query feedback:** All statements have a source type, but none in this dataset have a source type of verified, meaning that it is unclear from the data whether there is a verification process in place.
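
Because each statement can carry more than one source type, counts should be taken over the individual values rather than over statements. The sketch below uses made-up statements in a simplified structure (a `sourceTypes` list per statement); that structure is an assumption for illustration and is not the layout of the CSV tables read by `qbods`.

```python
from collections import Counter

# Made-up statements; the flat "sourceTypes" list is a simplifying assumption.
statements = [
    {"statementID": "s1", "sourceTypes": ["officialRegister"]},
    {"statementID": "s2", "sourceTypes": ["officialRegister", "verified"]},
    {"statementID": "s3", "sourceTypes": []},
]

# Count every source type value, so multi-valued statements contribute each value.
type_counts = Counter(t for s in statements for t in s["sourceTypes"])

# Statements with no source type at all.
without_source = [s["statementID"] for s in statements if not s["sourceTypes"]]

# Share of statements that include the "verified" source type.
verified_share = sum("verified" in s["sourceTypes"] for s in statements) / len(statements)
```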

### 8. Up to date and auditable

This section reports how the data maps to the [up to date and auditable](https://www.openownership.org/en/principles/up-to-date-and-auditable/) principle.



#### Statement dates

- Initial registration and subsequent changes to beneficial ownership should be legally required to be submitted in a timely manner, with information updated within a short, defined time period after changes occur

- Data should be confirmed as correct on at least an annual basis

- All changes in beneficial ownership should be reported

##### Statement date details

**Query description:** Provides a breakdown of statement dates in beneficial ownership declarations.

**Query result:** The below table and graph provide counts of the year provided in the statement dates for person statements, entity statements, and ownership or control statements.

In [None]:
o811 = qbods.q811(d['entity_statement'],d['person_statement'],d['ooc_statement'])
o811[0]

In [None]:
o811[1]

**Query interpretation:** Statement dates should be provided for all statements, with no unrealistically old or recent dates. Changes over time are expected, but should reflect known changes in declaration uptake/guidance.

**Query feedback:** Most, but not all, statements include statement dates. Implementing verification processes will be very challenging for statements with no date. All dates provided are within a realistic timeframe, with no very old dates or dates in the future.
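
Counting statements by year while keeping track of missing or unparseable dates could look like the sketch below. It runs on a toy series of date strings and is not the implementation of `qbods.q811`.

```python
import pandas as pd

# Toy statement dates, including a missing value and an unparseable one.
dates = pd.Series(["2019-05-01", "2021-11-30", None, "2021-01-15", "not a date"])

# Unparseable or missing values become NaT instead of raising an error.
years = pd.to_datetime(dates, errors="coerce").dt.year

# Counts per year, with missing dates kept as their own row.
by_year = years.value_counts(dropna=False).sort_index()
n_missing = int(years.isna().sum())

# Dates later than the current year would be a red flag.
n_future = int((years > pd.Timestamp.today().year).sum())
```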

### 9. Sanctions and enforcement

This section reports how the data maps to the [sanctions and enforcement](https://www.openownership.org/en/principles/sanctions-and-enforcement/) principle.

#### Undisclosed information

- Data on noncompliance should be made available

##### Reasons for interested parties not being disclosed

**Query description:** Provides a breakdown of reasons for missing information about owners in beneficial ownership declarations.

**Query result:** The below table provides counts of ownership or control statements where the interested party was listed as ‘unspecified’, grouped by the reason given for this status.

In [None]:
o921 = qbods.q921(d['ooc_statement']) 
o921

**Query interpretation:** All instances where an interested party is listed as unspecified are expected to be accompanied by an explanation, so there should be no missing data on reasons provided for these person types. In addition, a range of reasons are expected and we would not expect all reasons to be described as ‘unknown’ or ‘information-unknown-to-publisher’.

**Query feedback:** All instances where no interested party has been provided are accompanied by a reason, with all entries providing the same reason: that there are no beneficial owners. This is in line with the sanctions and enforcement principle guidance that data should be provided on non-compliance.
