# Using Arelle to analyse XBRL ESEF files

First of all, we specify which files we want to download from https://filings.xbrl.org/ .

In [1]:
import os

SAMPLE_URLS = ['https://filings.xbrl.org/222100VRLXV3FPMG4982/2022-12-31/ESEF/LU/0/ESEF_Allegro.eu_Group_Consolidated_Financial_Statements_31.12.2022.zip',
                'https://filings.xbrl.org/222100VRLXV3FPMG4982/2023-12-31/ESEF/LU/0/Allegroeu-2023-12-31-en.zip',
                'https://filings.xbrl.org/5493008JPA4HYMH1HX51/2022-12-31/ESEF/LU/0/SES%20Annual%20report%20-2022-12-31-en.zip',
                'https://filings.xbrl.org/5493008JPA4HYMH1HX51/2023-12-31/ESEF/LU/0/SES_Annual_report_-2023-12-31-en.zip']
WORKDIR = os.path.join('..', 'samples')
REPORTPACKAGES_DIR = os.path.join(WORKDIR, 'reports')
JSON_DATA_DIR = os.path.join(WORKDIR, 'data', 'json')
CSV_DATA_DIR = os.path.join(WORKDIR, 'data', 'csv')
TABLES_DATA_DIR = os.path.join(WORKDIR, 'data', 'tables')
os.makedirs(REPORTPACKAGES_DIR, exist_ok=True)
os.makedirs(JSON_DATA_DIR, exist_ok=True)
os.makedirs(CSV_DATA_DIR, exist_ok=True)
os.makedirs(TABLES_DATA_DIR, exist_ok=True)

For each sample URL, we extract the percent-decoded file name and keep it alongside the URL path.

In [2]:
import urllib.parse
import pathlib

# Each entry is [file name, URL path]; the path is later replaced by the local download path.
sample_filenames: list[list[str]] = []
for url in SAMPLE_URLS:
    parsed_url = urllib.parse.urlparse(url)
    filename = pathlib.PurePosixPath(urllib.parse.unquote(parsed_url.path)).parts[-1]
    sample_filenames.append([filename, parsed_url.path])
print(sample_filenames)

[['ESEF_Allegro.eu_Group_Consolidated_Financial_Statements_31.12.2022.zip', '/222100VRLXV3FPMG4982/2022-12-31/ESEF/LU/0/ESEF_Allegro.eu_Group_Consolidated_Financial_Statements_31.12.2022.zip'], ['Allegroeu-2023-12-31-en.zip', '/222100VRLXV3FPMG4982/2023-12-31/ESEF/LU/0/Allegroeu-2023-12-31-en.zip'], ['SES Annual report -2022-12-31-en.zip', '/5493008JPA4HYMH1HX51/2022-12-31/ESEF/LU/0/SES%20Annual%20report%20-2022-12-31-en.zip'], ['SES_Annual_report_-2023-12-31-en.zip', '/5493008JPA4HYMH1HX51/2023-12-31/ESEF/LU/0/SES_Annual_report_-2023-12-31-en.zip']]


Knowing the URL of each file and the destination directory, we download the files.

Each downloaded file keeps the name specified in its URL.

In [3]:
import urllib.request

def download_file(url, filename) -> str:
    download_path = os.path.join(REPORTPACKAGES_DIR, filename)
    print(f'Downloading "{url}" to "{download_path}"')
    urllib.request.urlretrieve(url, download_path)
    return download_path

for i, url in enumerate(SAMPLE_URLS):
    sample_filenames[i][1] = download_file(url, sample_filenames[i][0])

Downloading "https://filings.xbrl.org/222100VRLXV3FPMG4982/2022-12-31/ESEF/LU/0/ESEF_Allegro.eu_Group_Consolidated_Financial_Statements_31.12.2022.zip" to "..\samples\reports\ESEF_Allegro.eu_Group_Consolidated_Financial_Statements_31.12.2022.zip"
Downloading "https://filings.xbrl.org/222100VRLXV3FPMG4982/2023-12-31/ESEF/LU/0/Allegroeu-2023-12-31-en.zip" to "..\samples\reports\Allegroeu-2023-12-31-en.zip"
Downloading "https://filings.xbrl.org/5493008JPA4HYMH1HX51/2022-12-31/ESEF/LU/0/SES%20Annual%20report%20-2022-12-31-en.zip" to "..\samples\reports\SES Annual report -2022-12-31-en.zip"
Downloading "https://filings.xbrl.org/5493008JPA4HYMH1HX51/2023-12-31/ESEF/LU/0/SES_Annual_report_-2023-12-31-en.zip" to "..\samples\reports\SES_Annual_report_-2023-12-31-en.zip"
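
Re-running the notebook downloads every archive again. A minimal sketch of a guard that skips files already on disk could look like the following. `download_if_missing` and its `retrieve` parameter are hypothetical names introduced here for illustration; they are not part of the original notebook.

```python
import os
import urllib.request

def download_if_missing(url: str, download_path: str,
                        retrieve=urllib.request.urlretrieve) -> bool:
    """Download `url` to `download_path` unless the file already exists.

    Returns True if a download was performed and False if it was skipped.
    `retrieve` is an injectable (hypothetical) parameter so that the guard
    can be exercised without network access.
    """
    if os.path.exists(download_path):
        return False  # file is already on disk, nothing to do
    retrieve(url, download_path)
    return True
```

The check only tests for existence, so a partially downloaded file would be treated as complete; deleting it forces a fresh download.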


Finally, we can start using the Arelle API.

The API is still in beta and may change in the future.

The main principle is to create a `Session` object and run it with a `RuntimeOptions` object, which carries as many options as you like.

The current options are mostly strings or booleans.

To find out which options are available, read the command-line options documented at https://arelle.readthedocs.io/en/2.27.5/command_line.html .

You may also supply plug-in-specific options. For plug-ins supplied by Arelle itself, you must read the plug-in code to learn which options are available.

In the following example, the goal is to generate an OIM-JSON file from an ESEF report package.

An ESEF report package is essentially a ZIP archive containing the inline XBRL report alongside an _XBRL taxonomy extension_.

As written, the conversion does not validate the report, so the JSON result is produced as quickly as possible. To also validate the whole report package during the extraction, uncomment the option `validate=True`.

**N. B.** To validate an ESEF report package for 2022 or earlier, use `disclosureSystemName='esef-2022'`!
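
The choice between the two disclosure system names can be automated. The sketch below assumes that the reporting period end date appears in the URL as a `/YYYY-MM-DD/` segment, as it does in the sample URLs above. `pick_disclosure_system` is a hypothetical helper, and the fallback to `'esef'` when no date is found is an assumption.

```python
import re

def pick_disclosure_system(url: str) -> str:
    """Return 'esef-2022' for periods ending in 2022 or earlier, else 'esef'."""
    match = re.search(r'/(\d{4})-\d{2}-\d{2}/', url)
    if match is None:
        return 'esef'  # assumption: use the current rules when no date is found
    return 'esef-2022' if int(match.group(1)) <= 2022 else 'esef'

print(pick_disclosure_system(
    'https://filings.xbrl.org/222100VRLXV3FPMG4982/2022-12-31/ESEF/LU/0/'
    'ESEF_Allegro.eu_Group_Consolidated_Financial_Statements_31.12.2022.zip'))
# → esef-2022
print(pick_disclosure_system(
    'https://filings.xbrl.org/222100VRLXV3FPMG4982/2023-12-31/ESEF/LU/0/'
    'Allegroeu-2023-12-31-en.zip'))
# → esef
```

The returned string could then be passed as `disclosureSystemName` when building the `RuntimeOptions`.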

In [4]:
from arelle.api.Session import Session
from arelle.RuntimeOptions import RuntimeOptions

def convert_to_oim_json(file_and_path: list[str], targetDir: str) -> str:
    json_filename = os.path.splitext(file_and_path[0])[0] + '.json'  # swap the .zip extension for .json
    oim_json_path = os.path.join(targetDir, json_filename)
    print(f'JSON converting {file_and_path[0]} to {oim_json_path}')
    options = RuntimeOptions(
        entrypointFile=str(file_and_path[1]),
        disclosureSystemName='esef',
        internetConnectivity='online',
        keepOpen=False,
        logFormat="[%(messageCode)s] %(message)s - %(file)s",
        # deduplicateFacts='consistent-pairs',
        plugins='validate/ESEF|saveLoadableOIM',
        pluginOptions={
            'saveLoadableOIM': oim_json_path,
        },
        strictOptions=False,
        # validate=True
    )
    with Session() as session:
        session.run(options)
    return oim_json_path  # the function is annotated as returning the output path

for zip_file in sample_filenames:
    convert_to_oim_json(zip_file, JSON_DATA_DIR)

JSON converting ESEF_Allegro.eu_Group_Consolidated_Financial_Statements_31.12.2022.zip to ..\samples\data\json\ESEF_Allegro.eu_Group_Consolidated_Financial_Statements_31.12.2022.json
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Save Loadable OIM successful, version 1.2. - saveLoadableOIM 
[info] loaded in 4.94 secs at 2024-06-20T09:40:27 - C:\projects\xbrl\sources\arelleSamples\samples\reports\ESEF_Allegro.eu_Group_Consolidated_Financial_Statements_31.12.2022.zip\ESEF_Allegro.eu_Group_Consolidated_Financial_Statements_31.12.2022\reports\_IXDS 
JSON converting Allegroeu-2023-12-31-en.zip to ..\samples\data\json\Allegroeu-2023-12-31-en.json
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Save Loadable OIM successful, version 1.2. - saveLoa

If you go carefully through the JSON results, you will notice that several facts are duplicated! For a way to deduplicate the facts before generating the JSON file, see https://arelle.readthedocs.io/en/2.27.5/user_guides/fact_deduplication.html .
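
To spot such duplicates without reading the file by hand, one option is to group facts by their dimensions. The sketch below assumes the generated file follows the xBRL-JSON layout, in which a top-level `facts` object maps fact ids to entries with a `value` and a `dimensions` object (concept, entity, period, unit and any taxonomy-defined dimensions). `find_duplicate_facts` is a hypothetical helper, and the sample data is made up for illustration.

```python
import json
from collections import defaultdict

def find_duplicate_facts(report: dict) -> list[list[str]]:
    """Return groups of fact ids that share exactly the same dimensions."""
    groups: dict[str, list[str]] = defaultdict(list)
    for fact_id, fact in report.get('facts', {}).items():
        # Serialise the dimensions with sorted keys to get a comparable key.
        key = json.dumps(fact.get('dimensions', {}), sort_keys=True)
        groups[key].append(fact_id)
    return [ids for ids in groups.values() if len(ids) > 1]

# Made-up sample; a real report would be read with json.load() from the
# file written by the saveLoadableOIM plug-in.
sample_report = {
    'facts': {
        'f1': {'value': '100', 'dimensions': {'concept': 'ifrs-full:Revenue', 'period': '2022-01-01T00:00:00/2023-01-01T00:00:00', 'unit': 'iso4217:EUR'}},
        'f2': {'value': '100', 'dimensions': {'unit': 'iso4217:EUR', 'concept': 'ifrs-full:Revenue', 'period': '2022-01-01T00:00:00/2023-01-01T00:00:00'}},
        'f3': {'value': '7', 'dimensions': {'concept': 'ifrs-full:Assets', 'period': '2023-01-01T00:00:00', 'unit': 'iso4217:EUR'}},
    }
}
print(find_duplicate_facts(sample_report))
# → [['f1', 'f2']]
```

This only reports groups of duplicates; deciding which fact of a group to keep is left to the deduplication options described in the Arelle guide.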

Converting to OIM-CSV uses the same code. The only difference is that you specify a target file ending with `.csv`:

In [5]:
from arelle.api.Session import Session
from arelle.RuntimeOptions import RuntimeOptions

def convert_to_oim_csv(file_and_path: list[str], targetDir: str) -> str:
    csv_filename = os.path.splitext(file_and_path[0])[0] + '.csv'  # swap the .zip extension for .csv
    oim_csv_path = os.path.join(targetDir, csv_filename)
    print(f'CSV converting {file_and_path[0]} to {oim_csv_path}')
    options = RuntimeOptions(
        entrypointFile=str(file_and_path[1]),
        disclosureSystemName='esef',
        internetConnectivity='online',
        keepOpen=False,
        logFormat="[%(messageCode)s] %(message)s - %(file)s",
        # deduplicateFacts='consistent-pairs',
        plugins='validate/ESEF|saveLoadableOIM',
        pluginOptions={
            'saveLoadableOIM': oim_csv_path,
        },
        strictOptions=False,
        # validate=True
    )
    with Session() as session:
        session.run(options)
    return oim_csv_path  # the function is annotated as returning the output path

for zip_file in sample_filenames:
    convert_to_oim_csv(zip_file, CSV_DATA_DIR)

CSV converting ESEF_Allegro.eu_Group_Consolidated_Financial_Statements_31.12.2022.zip to ..\samples\data\csv\ESEF_Allegro.eu_Group_Consolidated_Financial_Statements_31.12.2022.csv
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Save Loadable OIM successful, version 1.2. - saveLoadableOIM 
[info] Activation of plug-in Save Loadable OIM successful, version 1.2. - saveLoadableOIM 
[info] Activation of plug-in Save Loadable OIM successful, version 1.2. - saveLoadableOIM 
[info] Activation of plug-in Save Loadable OIM successful, version 
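
Since the JSON and CSV conversions differ only in the extension of the target file, the path construction can be shared. The following is a minimal sketch; `oim_target_path` is a hypothetical helper name, not something Arelle provides.

```python
import os

def oim_target_path(zip_name: str, target_dir: str, extension: str) -> str:
    """Build the output path by swapping the archive extension for `extension`."""
    stem, _old_extension = os.path.splitext(zip_name)  # only the last suffix is removed
    return os.path.join(target_dir, stem + extension)

print(os.path.basename(oim_target_path('Allegroeu-2023-12-31-en.zip', 'csv', '.csv')))
# → Allegroeu-2023-12-31-en.csv
```

Because `os.path.splitext` removes only the final suffix, names containing further dots, such as `ESEF_Allegro.eu_Group_Consolidated_Financial_Statements_31.12.2022.zip`, keep the rest of their name intact.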

Next, we store the facts in a PostgreSQL database. We first open a connection to the database; its tables must be created once beforehand.

If the tables already exist, we clear their contents.

In [6]:
import pg_db_utils

pg_host = 'localhost'
pg_port = 5432
pg_user = 'xbrl_user'
pg_password = 'user123'
pg_db = 'xbrl_db'
# If your tables do not exist yet, execute:
# engine = pg_db_utils.init_DB(pg_user, pg_password, pg_host, pg_port, pg_db)
engine = pg_db_utils.connect_DB(pg_user, pg_password, pg_host, pg_port, pg_db)
DB_session = pg_db_utils.session_factory(engine)
pg_db_utils.delete_all(DB_session)

Opening connection to database xbrl_db


We now use Arelle to open the sample reports, iterate through all facts and save them to the database.
A per-report directory is also created for the presentation tables found in the report's taxonomy.

In [7]:
from arelle.api.Session import Session
from arelle.ModelInstanceObject import ModelContext
from arelle.RuntimeOptions import RuntimeOptions
from arelle.ValidateXbrlCalcs import inferredDecimals
from pg_db_utils import insert_document, insert_fact

def add_dimensions(context: ModelContext) -> dict[str, str]:
    dimensions_dict: dict[str, str] = {}
    for _qn, dim in sorted(context.qnameDims.items(), key=lambda item: item[0]):
        if dim.isExplicit:
            dim_value = dim.memberQname.clarkNotation
        else: # typed dimension
            if dim.typedMember.get("{http://www.w3.org/2001/XMLSchema-instance}nil") in ("true", "1"):
                dim_value = None
            else:
                dim_value = dim.typedMember.stringValue
        dim_name = dim.dimensionQname.clarkNotation
        dimensions_dict[dim_name] = dim_value
    return dimensions_dict

def extract_data(file_and_path: list[str], target_dir: str) -> None:
    table_files_dir = os.path.splitext(file_and_path[0])[0] + '_'  # archive name without .zip, plus '_'
    table_files_path = os.path.join(target_dir, table_files_dir)
    print(f'Extracting data from {file_and_path[0]} to database and directory {table_files_path}')
    document_id = insert_document(DB_session, file_and_path[0])
    options = RuntimeOptions(
        entrypointFile=str(file_and_path[1]),
        disclosureSystemName='esef',
        internetConnectivity='online',
        keepOpen=True,
        logFormat="[%(messageCode)s] %(message)s - %(file)s",
        # deduplicateFacts='consistent-pairs',
        plugins='validate/ESEF',
        strictOptions=False,
        # validate=True
    )
    os.makedirs(table_files_path, exist_ok=True)
    with Session() as session:
        session.run(options)
        model_xbrls = session.get_models()
        for model_xbrl in model_xbrls:
            facts = model_xbrl.facts
            for fact in facts:
                concept = fact.concept
                concept_full_name = concept.qname.clarkNotation
                context = fact.context
                entity = context.entity.stringValue
                if context.isInstantPeriod:
                    period_start = None
                    period_end = context.instantDatetime
                elif context.isStartEndPeriod:
                    period_start = context.startDatetime
                    period_end = context.endDatetime
                else: # forever period
                    period_start = period_end = None
                all_dimensions = add_dimensions(context)
                value = fact.value
                decimals = inferredDecimals(fact)
                unit_as_string = None
                if fact.unit is not None:  # explicit None test; truth-testing an element raises a warning
                    unit_as_string = fact.unit.value
                language = 'en'
                insert_fact(DB_session,
                            document_id,
                            value,
                            decimals,
                            concept_full_name,
                            entity,
                            period_start,
                            period_end,
                            unit_as_string,
                            language,
                            all_dimensions)
            model_xbrl.close()

for zip_file in sample_filenames:
    extract_data(zip_file, TABLES_DATA_DIR)

Extracting data from ESEF_Allegro.eu_Group_Consolidated_Financial_Statements_31.12.2022.zip to database and directory ..\samples\data\tables\ESEF_Allegro.eu_Group_Consolidated_Financial_Statements_31.12.2022_
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of pl

Extracting data from Allegroeu-2023-12-31-en.zip to database and directory ..\samples\data\tables\Allegroeu-2023-12-31-en_
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activation of plug-in Validate ESMA ESEF successful, version 1.2023.00. - validate/ESEF 
[info] Activa
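
The concept and dimension names stored above use Clark notation, that is `{namespace}localName`. When reading them back from the database it is often handy to split them again. The sketch below shows one way to do that; `split_clark` is a hypothetical helper and not part of Arelle or `pg_db_utils`.

```python
from typing import Optional

def split_clark(name: str) -> tuple[Optional[str], str]:
    """Split '{namespace}localName' into (namespace, localName).

    Names without a namespace part are returned as (None, name).
    """
    if name.startswith('{'):
        namespace, _separator, local_name = name[1:].partition('}')
        return namespace, local_name
    return None, name

print(split_clark('{http://www.w3.org/2001/XMLSchema-instance}nil'))
# → ('http://www.w3.org/2001/XMLSchema-instance', 'nil')
```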