# Tutorial: Convert Solvency 2 XBRL instances to CSV, HTML and pickle files

This tutorial describes how to convert XBRL instances to CSV, HTML and pickle files, one per template.

We use Arelle, an open-source package for processing XBRL. In addition, this repository contains code to process Solvency 2 and FTK instances efficiently.

In [1]:
from arelle import ModelManager, Cntlr, ModelXbrl, XbrlConst, RenderingEvaluator, \
                   ViewFileRenderedGrid, ModelFormulaObject
from arelle import PackageManager, FileSource

import src
import pandas as pd
import os
from os import listdir, walk, makedirs, environ
from os.path import isfile, join, exists, basename
from datetime import datetime

## Initialize the Arelle model manager

First we specify the directories with the taxonomy and instances. You can put your own instances in the data/instances directory, or you can specify here the directories that you want to use.

In [2]:
# the taxonomy packages (zip files) should be in data/taxonomies
# the instances you want to use should be in data/instances

XBRL_TAXONOMY_PATH = join('..', 'data', 'taxonomies')
XBRL_INSTANCES_PATH = join('..', 'data', 'instances')

LANGUAGE = "en-GB"
environ['XDG_CONFIG_HOME'] = XBRL_TAXONOMY_PATH

# The role defined in the model.xsd schema for resources representing row or column codes:
euRCcode = 'http://www.eurofiling.info/xbrl/role/rc-code'

To process XBRL, we need a controller and a modelmanager object. 

The controller determines where log messages go. Here they are printed in this notebook.

In [3]:
# Create the model manager
# logFileName = "logToPrint" -> log messages are printed in the notebook
# logFileName = "arelle.log" -> log messages are written to that file (use a .json or .xml extension for that format)

controller = Cntlr.Cntlr(logFileName = "logToPrint")
controller.webCache.workOffline = True
controller.logger.messageCodeFilter = None

modelmanager = ModelManager.initialize(controller)
modelmanager.defaultLang = LANGUAGE
modelmanager.formulaOptions = ModelFormulaObject.FormulaOptions()
modelmanager.loadCustomTransforms()

## Initialize taxonomies

In [4]:
taxonomies = ['FTK Taxonomy 2.1.0_tcm46-386386.zip', 
              'EIOPA_SolvencyII_XBRL_Taxonomy_2.4.0_with_external_hotfix.zip']
PackageManager.init(controller)
for taxonomy in taxonomies:
    fs = FileSource.openFileSource(filename = join(XBRL_TAXONOMY_PATH, taxonomy))
    fs.open()
    PackageManager.addPackage(controller, fs.baseurl)
    fs.close()
PackageManager.rebuildRemappings(controller)
PackageManager.save(controller)

2020-10-14 11:36:36,909 [arelle.packageRewriteOverlap] Packages overlap the same rewrite start string http://www.eurofiling.info/ - EIOPA_SolvencyII_XBRL_Taxonomy_2.4.0_with_external_hotfix.zip , FTK Taxonomy 2.1.0_tcm46-386386.zip 

2020-10-14 11:36:36,925 [arelle.packageRewriteOverlap] Packages overlap the same rewrite start string http://www.xbrl.org/ - EIOPA_SolvencyII_XBRL_Taxonomy_2.4.0_with_external_hotfix.zip , FTK Taxonomy 2.1.0_tcm46-386386.zip 
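
Before registering a package, it can help to confirm that each zip file looks like a taxonomy package. The sketch below is not part of Arelle or of this repository. It assumes the packages follow the XBRL Taxonomy Packages layout, in which a `META-INF/taxonomyPackage.xml` file sits below a single top-level directory. `has_package_metadata` and `make_demo_zip` are hypothetical helper names, and the demo uses small in-memory zip files instead of the real taxonomies. Older package formats may use a different layout, so a negative result is a hint rather than an error.

```python
import io
import zipfile


def has_package_metadata(zip_source):
    """Return True if the zip holds a META-INF/taxonomyPackage.xml entry."""
    with zipfile.ZipFile(zip_source) as archive:
        return any(name.endswith("META-INF/taxonomyPackage.xml")
                   for name in archive.namelist())


def make_demo_zip(entry_names):
    """Build a small in-memory zip so the example runs without real taxonomy files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in entry_names:
            archive.writestr(name, "<demo/>")
    buffer.seek(0)
    return buffer


good = make_demo_zip(["demo_taxonomy/META-INF/taxonomyPackage.xml",
                      "demo_taxonomy/model.xsd"])
bad = make_demo_zip(["demo_taxonomy/model.xsd"])
print(has_package_metadata(good), has_package_metadata(bad))
```

For a file on disk, the function can be called with a path, for example `has_package_metadata(join(XBRL_TAXONOMY_PATH, taxonomy))`.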



## Read XBRL-instance in the modelmanager

Now we are able to read and process an XBRL-instance.

We read the example instances provided with the taxonomy.

In [5]:
# Solvency 2:
prefix = ""
# the example instance of the quarterly templates for solo
instance_name = 'qrs_240_instance.xbrl'  
# the example instance of the annual templates
# instance_name = 'aeb_240_instance.xbrl'

# FTK:
# prefix = "FTK."
# the example instance of the FTK assets templates
# instance_name = 'DNB-NR_FTK-2019-06_2019-12-31_MOD_FTK-BEL.XBRL'

In [6]:
xbrl_instance = ModelXbrl.load(modelManager = modelmanager, 
                               url = join(XBRL_INSTANCES_PATH, instance_name))
RenderingEvaluator.init(xbrl_instance)

2020-10-14 11:37:12,276 [] Formula xpath2 grammar initialized in 0,92 secs - 

2020-10-14 11:37:13,560 [info:profileActivity] ... formula parameter checks 1,284 secs
 - qrs_240_instance.xbrl 

2020-10-14 11:37:14,954 [info:profileActivity] ... custom function checks and compilation 1,394 secs
 - qrs_240_instance.xbrl 



## Convert XBRL-instance to CSV and Pandas-pickle

For each template or table in the instance we export the results to a csv file and a Pandas pickle-file. 

A pandas pickle file preserves the indices of the DataFrame, whereas the CSV file does not. To work with the data afterwards, read the pickle file (an example is included below).
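
The difference can be seen with a toy DataFrame that has the same `(entity, period)` row index as the templates. This is a minimal sketch of general pandas behaviour with made-up values; how this repository writes its CSV files is not shown here, so the CSV part is an assumption about a plain `to_csv` call.

```python
import os
import tempfile

import pandas as pd

# Toy frame with an (entity, period) row index; the value is invented.
index = pd.MultiIndex.from_tuples(
    [("0LFF1WMNTWG5PTIYYI38", "2019-12-31")], names=["entity", "period"])
df = pd.DataFrame({"S.02.01.02.01,R0010,C0010": [100.0]}, index=index)

with tempfile.TemporaryDirectory() as tmp:
    pickle_path = os.path.join(tmp, "demo.pickle")
    csv_path = os.path.join(tmp, "demo.csv")
    df.to_pickle(pickle_path)
    df.to_csv(csv_path)
    from_pickle = pd.read_pickle(pickle_path)
    # Without further arguments the index levels come back as ordinary columns.
    from_csv = pd.read_csv(csv_path)
    # The index has to be restored explicitly when reading the CSV.
    from_csv_fixed = pd.read_csv(csv_path, index_col=["entity", "period"])

print(from_pickle.index.nlevels)
print(from_csv.index.nlevels)
print(list(from_csv_fixed.index.names))
```

The pickle round trip returns the two-level index unchanged, while the CSV needs `index_col` to rebuild it.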

The CSV files and the pickle files are stored in a subdirectory named after the XBRL instance (without the extension).

In [7]:
# The location of the csv-files
subdir = basename(instance_name).split(".")[0]
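
`split(".")[0]` keeps everything before the first dot, which works for the example instances. For file names that contain more than one dot, `os.path.splitext` removes only the final extension. The comparison below is a small sketch; `instance_subdir` is a hypothetical helper and the second file name is made up.

```python
from os.path import basename, splitext


def instance_subdir(instance_name):
    """Return the file name without its final extension."""
    return splitext(basename(instance_name))[0]


for name in ["qrs_240_instance.xbrl", "report.v2.xbrl"]:
    # left: the split-based result, right: the splitext-based result
    print(name, "->", basename(name).split(".")[0], "|", instance_subdir(name))
```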

In [8]:
# get the tables in the instance, sort them by short name (row-column code label) and print all of them
tables = list(xbrl_instance.modelRenderingTables)
tables.sort(key = lambda table: table.genLabel(lang = LANGUAGE,strip = True, role = euRCcode))
for table in tables:
    print(table.genLabel(lang = LANGUAGE,strip = True, role = euRCcode))

S.01.01.02.01
S.01.02.01.01
S.02.01.02.01
S.05.01.02.01
S.05.01.02.02
S.06.02.01.01
S.06.02.01.02
S.06.03.01.01
S.08.01.01.01
S.08.01.01.02
S.08.02.01.01
S.08.02.01.02
S.12.01.02.01
S.17.01.02.01
S.23.01.01.01
S.23.01.01.02
S.28.01.01.01
S.28.01.01.02
S.28.01.01.03
S.28.01.01.04
S.28.01.01.05
S.28.02.01.01
S.28.02.01.02
S.28.02.01.03
S.28.02.01.04
S.28.02.01.05
S.28.02.01.06
T.99.01.01.01
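
Each code printed above identifies one table, and several tables can belong to the same template (for example `S.28.02.01.01` to `S.28.02.01.06`). The following sketch groups table codes by template, under the assumption that the template code is everything before the last dot. `group_by_template` is a hypothetical helper, not part of Arelle or this repository.

```python
from collections import defaultdict


def group_by_template(table_codes):
    """Map each template code to the sorted list of its table codes."""
    groups = defaultdict(list)
    for code in table_codes:
        # Everything before the last dot is treated as the template code.
        template, _, _table_number = code.rpartition(".")
        groups[template].append(code)
    return {template: sorted(codes) for template, codes in sorted(groups.items())}


codes = ["S.28.02.01.02", "S.02.01.02.01", "S.28.02.01.01", "S.05.01.02.01"]
for template, members in group_by_template(codes).items():
    print(template, members)
```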


In [9]:
# create csv and pickle files
# time_stamp = datetime.now().strftime("%Y_%m_%d-%H_%M_%S")

# use verbose_labels = False if you want the row-column code as column names
# use verbose_labels = True if you want labels as column names

for table in tables:
    obj = src.generateCSV.generateCSVTables(xbrl_instance, join(XBRL_INSTANCES_PATH, subdir), 
                                            table = table, 
                                            lang = LANGUAGE,
                                            verbose_labels = False)

2020-10-14 11:37:16,327 [xbrlte:closedDefinitionNodeZeroCardinality] Closed definition node s2md_c74 does not contribute at least one structural node - qrs_240_instance.xbrl 12, 16

2020-10-14 11:37:16,442 []  ... saved output ..\data\instances\qrs_240_instance\S.01.01.02.01.csv and .pickle - 

2020-10-14 11:37:17,229 [xbrlte:closedDefinitionNodeZeroCardinality] Closed definition node s2md_c499 does not contribute at least one structural node - qrs_240_instance.xbrl 12, 16

2020-10-14 11:37:17,263 []  ... saved output ..\data\instances\qrs_240_instance\S.01.02.01.01.csv and .pickle - 

2020-10-14 11:37:18,080 []  ... saved output ..\data\instances\qrs_240_instance\S.02.01.02.01.csv and .pickle - 

2020-10-14 11:37:19,077 []  ... saved output ..\data\instances\qrs_240_instance\S.05.01.02.01.csv and .pickle - 

2020-10-14 11:37:19,880 []  ... saved output ..\data\instances\qrs_240_instance\S.05.01.02.02.csv and .pickle - 

2020-10-14 11:37:21,016 []  ... saved output ..\data\instances\qr
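
The log lines above report one `.csv` and one `.pickle` file per table. One way to confirm that nothing was skipped is to compare the table codes with the files on disk. The helper below is a sketch under that naming assumption (`<table code>.csv` and `<table code>.pickle`). `missing_outputs` is a hypothetical name, and the demo uses a temporary directory in place of the real output directory.

```python
import tempfile
from pathlib import Path


def missing_outputs(output_dir, table_codes, extensions=(".csv", ".pickle")):
    """Return the expected file names that are absent from output_dir."""
    output_dir = Path(output_dir)
    return [code + ext
            for code in table_codes
            for ext in extensions
            if not (output_dir / (code + ext)).is_file()]


with tempfile.TemporaryDirectory() as tmp:
    # Simulate a run in which one pickle file was not written.
    for file_name in ["S.01.01.02.01.csv", "S.01.01.02.01.pickle", "S.02.01.02.01.csv"]:
        (Path(tmp) / file_name).touch()
    result = missing_outputs(tmp, ["S.01.01.02.01", "S.02.01.02.01"])

print(result)
```

In the notebook, the first argument would be `join(XBRL_INSTANCES_PATH, subdir)` and the second the list of labels obtained with `table.genLabel(lang = LANGUAGE, strip = True, role = euRCcode)`.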

In [10]:
# construct one dataframe with all data from closed axis tables
df_closed_axis = pd.DataFrame()  
for table in tables:
    table_name = table.genLabel(lang = LANGUAGE,strip = True, role = euRCcode)
    if exists(join(XBRL_INSTANCES_PATH, subdir, table_name + '.pickle')):
        df = pd.read_pickle(join(XBRL_INSTANCES_PATH, subdir, table_name + '.pickle'))  # read dataframe
        if df.index.nlevels == 2:  # if 2 indexes (entity, period) --> closed axis table
            if len(df_closed_axis) == 0:  
                # no data yet --> copy dataframe
                df_closed_axis = df.copy()
            else:  
                # join to existing dataframe
                df_closed_axis = df_closed_axis.join(df)
df_closed_axis.to_pickle(join(XBRL_INSTANCES_PATH, subdir, subdir + '.pickle'))
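
The join works because every closed-axis table is indexed by `(entity, period)`, so `DataFrame.join` aligns the frames on that shared index and places their columns side by side. A toy illustration with two small stand-in frames and invented values:

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("0LFF1WMNTWG5PTIYYI38", "2019-12-31")], names=["entity", "period"])

# Two small frames standing in for two closed-axis templates (values are invented).
df_a = pd.DataFrame({"S.01.01.02.01,R0010,C0010": ["Reported"]}, index=index)
df_b = pd.DataFrame({"S.02.01.02.01,R0030,C0010": [125.0]}, index=index)

combined = df_a.join(df_b)  # left join on the shared (entity, period) index
print(combined.shape)
print(list(combined.columns))
```

`join` defaults to a left join, so an `(entity, period)` pair that is missing from the first frame would not appear in the result; passing `how = 'outer'` keeps such rows.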

## Example to read a template from the pickle files

The easiest way to access the data of a separate template is to read the corresponding pickle-file.

In [12]:
# tables[12] is S.12.01.02.01 in the sorted list printed above
t = tables[12].genLabel(lang = LANGUAGE,strip = True, role = euRCcode)
df = pd.read_pickle(join(XBRL_INSTANCES_PATH, subdir, prefix + t + ".pickle"))
df

| entity | period | S.12.01.02.01,R0010,C0020 | S.12.01.02.01,R0010,C0030 | S.12.01.02.01,R0010,C0060 | S.12.01.02.01,R0010,C0090 | S.12.01.02.01,R0010,C0100 | S.12.01.02.01,R0010,C0150 | S.12.01.02.01,R0010,C0160 | S.12.01.02.01,R0010,C0190 | S.12.01.02.01,R0010,C0200 | S.12.01.02.01,R0010,C0210 | ... | S.12.01.02.01,R0200,C0020 | S.12.01.02.01,R0200,C0030 | S.12.01.02.01,R0200,C0060 | S.12.01.02.01,R0200,C0090 | S.12.01.02.01,R0200,C0100 | S.12.01.02.01,R0200,C0150 | S.12.01.02.01,R0200,C0160 | S.12.01.02.01,R0200,C0190 | S.12.01.02.01,R0200,C0200 | S.12.01.02.01,R0200,C0210 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0LFF1WMNTWG5PTIYYI38 | 2019-12-31 | 309135200.0 | 425431900.0 | 653423900.0 | 264880500.0 | 872559700.0 | 375959200.0 | 933717100.0 | 90930554.09 | 377637400.0 | 392468316.1 | ... | 144684300.0 | 709926400.0 | 242732800.0 | 948366700.0 | 419871700.0 | 819253818.6 | 907755300.0 | 910766700.0 | 927530800.0 | 683920900.0 |
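
With `verbose_labels = False`, each column name in the output above combines the table code, the row code and the column code, separated by commas. Assuming the names are plain strings of that form, they can be split into their parts as follows (`split_datapoint` is a hypothetical helper, not part of the repository):

```python
def split_datapoint(column_name):
    """Split 'table,row,column' into a (table, row, column) tuple."""
    table, row, column = column_name.split(",")
    return table, row, column


names = ["S.12.01.02.01,R0010,C0020", "S.12.01.02.01,R0200,C0210"]
for name in names:
    print(split_datapoint(name))
```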


To load the DataFrame that combines the data of all templates with closed axes, use:

In [13]:
df = pd.read_pickle(join(XBRL_INSTANCES_PATH, subdir, subdir + ".pickle"))
df

| entity | period | S.01.01.02.01,R0010,C0010 | S.01.01.02.01,R0030,C0010 | S.01.01.02.01,R0110,C0010 | S.01.01.02.01,R0140,C0010 | S.01.01.02.01,R0150,C0010 | S.01.01.02.01,R0170,C0010 | S.01.01.02.01,R0180,C0010 | S.01.01.02.01,R0220,C0010 | S.01.01.02.01,R0290,C0010 | S.01.01.02.01,R0410,C0010 | ... | S.28.02.01.06,R0520,C0140 | S.28.02.01.06,R0520,C0150 | S.28.02.01.06,R0530,C0140 | S.28.02.01.06,R0530,C0150 | S.28.02.01.06,R0540,C0140 | S.28.02.01.06,R0540,C0150 | S.28.02.01.06,R0550,C0140 | S.28.02.01.06,R0550,C0150 | S.28.02.01.06,R0560,C0140 | S.28.02.01.06,R0560,C0150 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0LFF1WMNTWG5PTIYYI38 | 2019-12-31 | Reported | Exempted under Article 35 (6) to (8) | Exempted under Article 35 (6) to (8) | Not due annually as reported for Quarter 4 | Reported | Not reported other reason | Not reported o/a no derivative transactions | Reported | Exempted under Article 35 (6) to (8) | Reported | ... | 585446600.0 | 444118300.0 | 566830000.0 | 750487500.0 | 907076800.0 | 216390100.0 | 563403902.1 | 951522200.0 | 548739100.0 | 530605100.0 |
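
Because the combined DataFrame holds the columns of many tables, it is often useful to select the columns of a single table again. The sketch below does this by matching the table code at the start of each column name. It assumes the column names are plain `table,row,column` strings; `select_table` is a hypothetical helper, and `df_all` is a three-column stand-in for the real combined frame.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("0LFF1WMNTWG5PTIYYI38", "2019-12-31")], names=["entity", "period"])
# Stand-in for the combined frame; the real one has many more columns.
df_all = pd.DataFrame(
    {"S.01.01.02.01,R0010,C0010": ["Reported"],
     "S.28.02.01.06,R0520,C0140": [585446600.0],
     "S.28.02.01.06,R0520,C0150": [444118300.0]},
    index=index)


def select_table(frame, table_code):
    """Return only the columns whose name starts with '<table_code>,'."""
    wanted = [name for name in frame.columns if name.startswith(table_code + ",")]
    return frame[wanted]


subset = select_table(df_all, "S.28.02.01.06")
print(list(subset.columns))
```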


## Validate instance from Arelle

It should be possible to validate the instance with Arelle, that is, to run the validation rules defined in the taxonomy, using the following code. Note that this code has not been tested.

In [14]:
# modelXbrl = modelmanager.load(join(XBRL_INSTANCES_PATH, instance_name))

In [15]:
# controller = Cntlr.Cntlr(logFileName = "logToPrint")
# controller.webCache.workOffline = True
# controller.setLogCodeFilter(None)
# controller.logger.setLevel('INFO')

In [16]:
# modelmanager.validateInferDecimals = True
# modelmanager.validateCalcLB = True
# modelmanager.validate()