# Convert DNB's Additional Validation Rules to Patterns

DNB's additional validation rules are available in the 'solvency2-rules' subfolder of the repository.  
The formulas in this file use a specific syntax; this notebook converts them to a syntax that Python can interpret.  
The resulting formulas are called 'patterns'.

## Import packages

In [None]:
import pandas as pd  # dataframes
from os.path import join # some os dependent functionality
from src import Evaluator  # conversion from 'rules' to expressions for the data_patterns package

## General parameters

In [None]:
# Location and name of the file with the additional rules:
RULES_PATH = join('..', 'data', 'downloaded files')  
FILENAME_RULES = '2020-01-22 Set aanvullende controleregels Solvency II_tcm46-387021.xlsx'

In [None]:
# Location and names of files with all possible datapoints for QRS and ARS
DATAPOINTS_PATH = join('..', 'data', 'datapoints')
FILENAME_DATAPOINTS_QRS = 'QRS.csv'
FILENAME_DATAPOINTS_ARS = 'ARS.csv'

In [None]:
# Input parameters:
PARAMETERS = {'decimal': 0}
# Currently only 'decimal' is available; it specifies the tolerance used when evaluating patterns.
# decimal: 0 means tolerance = abs(1.5e-0) (= 1.5)
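
The relation between 'decimal' and the tolerance can be sketched as follows. This is only an illustration: it assumes the tolerance equals 1.5 * 10**(-decimal), which matches the value given above for decimal = 0, and the helper name `tolerance` is hypothetical and not part of the data_patterns package.

```python
def tolerance(decimal: int) -> float:
    """Hypothetical helper: tolerance implied by the 'decimal' parameter.

    Assumption: tolerance = 1.5 * 10**(-decimal).
    """
    return 1.5 * 10 ** (-decimal)


# A larger 'decimal' gives a stricter (smaller) tolerance.
for decimal in (0, 1, 2):
    print(decimal, tolerance(decimal))
```

Under this assumption, a higher value for 'decimal' makes the evaluation stricter, and a negative value would loosen it.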

In [None]:
# # We log to rules.log in the data/instances path
# logging.basicConfig(filename = join(INSTANCES_DATA_PATH, 'rules.log'),level = logging.INFO, 
#                     format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')

## Read file with all possible datapoints

We use a simplified taxonomy with all possible datapoints, located in the data/datapoints directory.  
The evaluator uses this taxonomy to generate the patterns.

In [None]:
# Load files to dataframe:
df_datapoints_qrs = pd.read_csv(join(DATAPOINTS_PATH, FILENAME_DATAPOINTS_QRS), sep=";").fillna("")
df_datapoints_ars = pd.read_csv(join(DATAPOINTS_PATH, FILENAME_DATAPOINTS_ARS), sep=";").fillna("")

In [None]:
df_datapoints_qrs.head()

In [None]:
df_datapoints_ars.head()

## Read DNBs Additional Validation Rules

DNB's additional validation rules are currently published as an Excel file on the DNB statistics website. A copy of the Excel file is included in this project.

Here we read the Excel file and perform some data cleaning.

In [None]:
df_rules = pd.read_excel(join(RULES_PATH, FILENAME_RULES), header = 1, engine='openpyxl')
df_rules.drop_duplicates(inplace=True)  # remove duplicate rows
df_rules.fillna("", inplace = True)
df_rules = df_rules.set_index('ControleRegelCode')

Comparisons written as <> " " in the formulas have to be converted to <> None.

In [None]:
df_rules['Formule'] = df_rules['Formule'].str.replace('" "','None')
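
The effect of this replacement on a single formula can be sketched with plain string operations. The formula text below is made up for illustration; the real formulas come from the 'Formule' column of the Excel file.

```python
# Hypothetical formula text, for illustration only.
formula = '{S.06.02.01.01,c0170} <> " "'

# Same literal replacement as applied to the 'Formule' column.
converted = formula.replace('" "', 'None')

print(converted)
```

Because the replacement is a literal text substitution, it changes every occurrence of a quoted single space in a formula, not only those that follow <>.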

Some rules check whether a date is filled by testing > 0; for these rules the test has to be changed to <> None.

In [None]:
list_of_rules = ['S.15.01_105',
                 'S.15.01_107',
                 'S.23.04_111',
                 'S.23.04_112',
                 'S.23.04_121',
                 'S.23.04_122',
                 'S.23.04_133',
                 'S.23.04_144',
                 'S.23.04_145', 
                 'S.30.01_105',
                 'S.30.01_106',
                 'S.30.01_117',
                 'S.30.01_118',
                 'S.30.03_102',
                 'S.30.03_103',
                 'S.36.01_106',
                 'S.36.02_106',
                 'S.36.02_108',
                 'S.36.03_104',
                 'S.10.01_115',
                 'S.15.01_106',
                 'S.15.01_108',
                 'S.23.04_127',
                 'S.23.04_128',
                 'S.23.04_137',
                 'S.23.04_148',
                 'S.23.04_149']

df_rules.loc[list_of_rules, 'Formule'] = df_rules.loc[list_of_rules, 'Formule'].str.replace("> 0",'<> None').str.replace(">0",'<> None')
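
The replacement is restricted to the listed rules because > 0 is also a valid test on numeric values. A minimal sketch of that idea in plain Python, using hypothetical rule codes and formulas rather than the real ones:

```python
# Hypothetical rule codes and formulas, for illustration only.
formulas = {
    "DATE_RULE_A": "{date_1} > 0",
    "DATE_RULE_B": "{date_2} >0",
    "AMOUNT_RULE": "{amount} > 0",
}

# Only these rules test whether a date is filled.
date_rules = ["DATE_RULE_A", "DATE_RULE_B"]

for code in date_rules:
    # Handle both spellings of the test, with and without a space.
    formulas[code] = (
        formulas[code].replace("> 0", "<> None").replace(">0", "<> None")
    )

for code, formula in formulas.items():
    print(code, "->", formula)
```

The numeric rule keeps its > 0 test, while both spellings of the date test end up as <> None.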

The Excel file contains rules for different report types. In the next step we select the rules that apply to QRS and to ARS; rules marked 'SOLVENCY' are included in both sets.

In [None]:
df_rules_qrs = df_rules[df_rules['Standaard'].isin(['SOLVENCY', 'QRS'])].copy()
df_rules_ars = df_rules[df_rules['Standaard'].isin(['SOLVENCY', 'ARS'])].copy()
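
The selection logic can be sketched without pandas. The rule codes and the helper `select_codes` below are hypothetical; only the 'Standaard' values are taken from the filter above.

```python
# Hypothetical rule codes; only the 'Standaard' values mirror the notebook.
rules = [
    {"code": "RULE_1", "Standaard": "SOLVENCY"},
    {"code": "RULE_2", "Standaard": "QRS"},
    {"code": "RULE_3", "Standaard": "ARS"},
]


def select_codes(rules, report_type):
    """Return the codes of rules marked 'SOLVENCY' or marked with report_type."""
    return [r["code"] for r in rules if r["Standaard"] in ("SOLVENCY", report_type)]


qrs_codes = select_codes(rules, "QRS")
ars_codes = select_codes(rules, "ARS")
print(qrs_codes, ars_codes)
```

A rule with any other value in 'Standaard' would end up in neither set.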

In [None]:
df_rules_qrs.head()

In [None]:
df_rules_ars.head()

## Convert the rules to patterns

The evaluator is a piece of Python code that takes the Additional Validation Rules as input and transforms them into expressions (patterns) that can be interpreted by the data_patterns package.

In [None]:
evaluator_qrs = Evaluator(df_rules_qrs, df_datapoints_qrs, PARAMETERS)

evaluator_ars = Evaluator(df_rules_ars, df_datapoints_ars, PARAMETERS)

In [None]:
evaluator_qrs.df_patterns.head()

In [None]:
evaluator_ars.df_patterns.head()

## Export patterns to rules folder

In [None]:
evaluator_qrs.df_patterns.to_excel(join('..', 'solvency2-rules', "qrs_patterns_additional_rules.xlsx"))

In [None]:
evaluator_ars.df_patterns.to_excel(join('..', 'solvency2-rules', "ars_patterns_additional_rules.xlsx"))