Oasis FM Testing Tool
==================

This notebook allows example insurance structures to be input in OED format and run against the development version of the Oasis financial engine.

## Input files 

The input file formats are defined by the Open Exposure Data (OED) specification. <br/>
Please visit https://github.com/OasisLMF/OpenDataStandards for more information.
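
Before a run it can help to confirm that a directory actually holds the expected files. The sketch below is illustrative only: it assumes the file names this notebook expects on upload (`location.csv` and `account.csv` required, `ri_info.csv` and `ri_scope.csv` optional), and `check_oed_dir` is a hypothetical helper, not part of `oasislmf` or `fm_testing_tool`.

```python
import os
import tempfile

REQUIRED_FILES = ("location.csv", "account.csv")
OPTIONAL_RI_FILES = ("ri_info.csv", "ri_scope.csv")


def check_oed_dir(src_dir):
    """Return (missing_required, ri_available) for a directory of OED files."""
    present = set(os.listdir(src_dir))
    missing_required = [f for f in REQUIRED_FILES if f not in present]
    # Reinsurance is only usable when both RI files are supplied together.
    ri_available = all(f in present for f in OPTIONAL_RI_FILES)
    return missing_required, ri_available


# Usage: a temporary directory holding only the two required (empty) files.
with tempfile.TemporaryDirectory() as tmp:
    for name in REQUIRED_FILES:
        open(os.path.join(tmp, name), "w").close()
    missing, has_ri = check_oed_dir(tmp)
    print(missing, has_ri)
```

The check only looks at file names; it does not validate the OED columns inside each file.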


## Notebook steps 

This notebook runs the following steps on the selected set of OED files, which can either be uploaded or selected from the suite of FM test cases using the drop-down menu. The built-in test cases are the same ones used in automated testing to validate the correctness of the FM module. 

### Step 1. Generate Oasis files 
1. Select input files.
2. View the OED input files.
3. Edit the OED input files (optional).
4. Generate deterministic keys data.
5. Generate a set of Oasis files (not editable).

### Step 2. View Oasis files 
6. Visualize FM tree structure and files.
7. Visualize RI tree structure and files. 

### Step 3. Generate deterministic losses 
8. Run the current ktools FM module.
9. Run the new Python-based FM module.
10. Compare loss outputs between the two modules.

## Options for test parameters 

**Keys Data:** The deterministic keys return can be edited to return multi-peril and/or multi-coverage keys.
* `number_of_subperils` - Adds `x` peril ids per location row, where peril ids are integer values in the range [1..x]
* `supported_coverage_types` - A list of integers between 1 and 4, where each value represents a coverage type. 
```
    1 = buildings
    2 = other
    3 = content
    4 = business interruption
```
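
As a rough illustration of how these two options combine, the sketch below builds one key row per location, peril id and coverage type. It is a simplified stand-in and not the `oasislmf` implementation; `expand_keys` and the tuple layout are hypothetical names chosen for this example.

```python
from itertools import product


def expand_keys(loc_ids, number_of_subperils, supported_coverage_types):
    """Build one (loc_id, peril_id, coverage_type) row per combination."""
    peril_ids = range(1, number_of_subperils + 1)  # perils are numbered 1..x
    return list(product(loc_ids, peril_ids, supported_coverage_types))


keys = expand_keys(
    loc_ids=[1, 2],
    number_of_subperils=2,
    supported_coverage_types=[1, 3, 4],
)
# 2 locations x 2 perils x 3 coverage types
print(len(keys))
```

Under this assumption the number of key rows grows multiplicatively, so increasing either option enlarges every downstream file.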

**Allocation Rules:** Losses can either be output at the contract level or back-allocated to the lowest level, which is item_id, using a command line option. There are three meaningful values: don't allocate (0), typically used when a breakdown of losses is not required in the output; allocate back to items (1) in proportion to the input (ground up) losses; or allocate back to items (2) in proportion to the losses from the prior level calculation.

Reinsurance has an additional rule (3), where the layers are applied differently to the FM tree. When set, layers can be defined throughout the hierarchy and back-allocation is in proportion to the losses from the previous level, taking layer number into account.

* `alloc_il` - set the allocation rule for insured losses
* `alloc_ri` - set the allocation rule for reinsurance losses

```
    0 = Losses are output at the contract level and not back-allocated
    1 = Losses are back-allocated to items on the basis of the input losses (e.g. ground up loss)
    2 = Losses are back-allocated to items on the basis of the prior level losses
    3 = (RI only) Losses are back-allocated by layer and level. 
```
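
The difference between rules 1 and 2 is only the basis used for the proportional split. The following is a minimal sketch of that idea, assuming a single contract and a flat set of items; `back_allocate` and the sample figures are hypothetical and the real FM modules handle many more cases (layers, multiple levels, rule 3).

```python
def back_allocate(contract_loss, item_basis):
    """Split a contract-level loss across items in proportion to item_basis.

    item_basis maps item_id -> the loss used as the allocation basis:
    ground-up losses for rule 1, prior-level losses for rule 2.
    """
    total = sum(item_basis.values())
    if total == 0:
        return {item_id: 0.0 for item_id in item_basis}
    return {
        item_id: contract_loss * basis / total
        for item_id, basis in item_basis.items()
    }


# Hypothetical contract with an insured loss of 600 across three items.
gul = {1: 500.0, 2: 300.0, 3: 200.0}          # input (ground-up) losses
prior_level = {1: 400.0, 2: 300.0, 3: 100.0}  # losses after the previous level

rule_1 = back_allocate(600.0, gul)          # {1: 300.0, 2: 180.0, 3: 120.0}
rule_2 = back_allocate(600.0, prior_level)  # {1: 300.0, 2: 225.0, 3: 75.0}
```

Both allocations sum to the same contract loss; only the share given to each item changes.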

**Output losses:** The losses are generated using the `run_exposure` function, which is equivalent to running `oasislmf exposure run --src-dir <path_to_oed_dir>` on the command line.

* `loss_factors` - Select the loss factors to test, as a list of float values between [0.0..1.0], each representing a percentage of ground up loss. For example, using `[0.45, 1.0]` will run the loss calculation twice, first assuming a 45% ground up loss (GUL) and again with 100% GUL.

* `output_level` - Set how the output losses are aggregated. Valid options are a single string from `'item', 'loc', 'pol', 'acc', 'port'`
```
    'item' = Aggregate losses by item id
    'loc'  = Aggregate losses by location number
    'pol'  = Aggregate losses by policy number
    'acc'  = Aggregate losses by account number
    'port' = Aggregate losses by portfolio
```
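
To make the two options concrete, here is a small sketch that applies each loss factor to item values and then sums the result per location, as `output_level='loc'` would group it. It assumes the ground up loss is the item's total insured value multiplied by the loss factor; the item data and helper names are hypothetical and nothing here calls `oasislmf`.

```python
from collections import defaultdict

# Hypothetical items: (item id, location number, total insured value).
items = [
    (1, "L1", 1000.0),
    (2, "L1", 500.0),
    (3, "L2", 2000.0),
]
loss_factors = [0.45, 1.0]


def ground_up_losses(items, factor):
    """Apply one loss factor to every item's total insured value."""
    return [(item_id, loc, tiv * factor) for item_id, loc, tiv in items]


def aggregate_by_location(losses):
    """Sum item losses per location number."""
    totals = defaultdict(float)
    for _item_id, loc, loss in losses:
        totals[loc] += loss
    return dict(totals)


# One aggregated result per loss factor, keyed by the factor itself.
results = {f: aggregate_by_location(ground_up_losses(items, f)) for f in loss_factors}
for factor, totals in results.items():
    print(factor, totals)
```

The other output levels follow the same pattern with a different grouping key (item id, policy number, account number or portfolio).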

## Key for FM tree diagram 

The image below is an example FM tree, deliberately containing errors, that shows every type of node. 

* **Grey boxes** - Item level nodes
* **Blue ellipse** - Valid FM node, either coverage terms (level 1), location terms (level 2) or policy terms (level 3+)
* **Orange boxes** - Valid FM node with multiple layers
* **Pink ellipse** - Either the tree's root or an FM node missing its calcrule.

<img src=https://user-images.githubusercontent.com/9889973/105785178-2c5af100-5f72-11eb-9ffd-d86a1fb84632.png width="550" style="float:left"> 


In [None]:
# Standard Python libraries
import io
import json
import os
import shutil

# 3rd party Python libraries
import fm_testing_tool.widgets as widgets
import fm_testing_tool.functions as functions
import pandas as pd
from IPython.display import Image, Javascript, Markdown, display
from oasislmf.manager import OasisManager

In [None]:
# Upload local files into the Notebook (Optional)
display(Markdown('''#### Files must use the following naming convention:
* `location.csv` - Location exposure data (required)
* `account.csv`  - Account terms and conditions (required)
* `ri_info.csv`  - Reinsurance information (optional)
* `ri_scope.csv` - Reinsurance scoping rules (optional)
'''))
widgets.file_uploader('./validation/examples/uploaded')

In [None]:
# Select a validation example to run; the default value points to the file upload location
# If a variables shows a 'None' results then the s
source_exposure = {}
widgets.select_source_dir(source_exposure, examples_dir='./validation/examples')

In [None]:
# Running this cell will execute the entire notebook (if inputs selected)
try:
    if (source_exposure['location_path'] and source_exposure['account_path']):
        display(Markdown('### Running files from "{}"'.format(os.path.basename(source_exposure['source_dir']))))
        display(Javascript("Jupyter.notebook.execute_cells_below()"))
    else:        
        raise ValueError('Error: missing OED loc/acc files or no test case selected')
except NameError:
    raise ValueError('Error: Must run setup cells 1, 2 and 3 first')

## Adjust Test Parameters 
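
The values set in the next cell can be sanity-checked against the ranges described under "Options for test parameters". The sketch below is a hypothetical helper (`validate_test_parameters` is not part of `oasislmf` or `fm_testing_tool`), and the lower bound of 1 for `number_of_subperils` is an assumption based on peril ids running from 1 to `x`.

```python
VALID_OUTPUT_LEVELS = {"item", "loc", "pol", "acc", "port"}


def validate_test_parameters(number_of_subperils, supported_coverage_types,
                             alloc_il, alloc_ri, loss_factors, output_level):
    """Return a list of problems found; an empty list means all values look valid."""
    problems = []
    if number_of_subperils < 1:
        problems.append("number_of_subperils must be at least 1")
    if not supported_coverage_types or any(
        c not in (1, 2, 3, 4) for c in supported_coverage_types
    ):
        problems.append("supported_coverage_types must be integers between 1 and 4")
    if alloc_il not in (0, 1, 2):
        problems.append("alloc_il must be 0, 1 or 2")
    if alloc_ri not in (0, 1, 2, 3):  # rule 3 is described as RI only
        problems.append("alloc_ri must be 0, 1, 2 or 3")
    if any(not 0.0 <= f <= 1.0 for f in loss_factors):
        problems.append("loss_factors must each be between 0.0 and 1.0")
    if output_level not in VALID_OUTPUT_LEVELS:
        problems.append("output_level must be one of " + ", ".join(sorted(VALID_OUTPUT_LEVELS)))
    return problems


# The notebook defaults pass; an out-of-range loss factor and level are reported.
ok = validate_test_parameters(1, [1, 3, 4], 2, 3, [1], "loc")
bad = validate_test_parameters(1, [1, 3, 4], 2, 3, [1.5], "location")
print(ok)
print(bad)
```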

In [None]:
# Keys Data
number_of_subperils = 1
supported_coverage_types = [1,3,4] # 1 = buildings, 2 = other, 3 = content, 4 = business interruption

# Allocation Rules 
alloc_il = 2
alloc_ri = 3

# Output losses 
loss_factors = [1]
output_level = 'loc' # valid options are 'item', 'loc', 'pol', 'acc', 'port'

In [None]:
# Load source files from selected test case 
location_df = functions.load_df(source_exposure['location_path'], required_file='location.csv')
account_df =  functions.load_df(source_exposure['account_path'], required_file='account.csv')
ri_info_df =  functions.load_df(source_exposure['ri_info_path'])
ri_scope_df = functions.load_df(source_exposure['ri_scope_path'])

In [None]:
# View/edit the location data.
location_grid = widgets.show_df(location_df)
location_grid

In [None]:
# View/edit the account data. 
account_grid = widgets.show_df(account_df)
account_grid

In [None]:
# View/edit the ri_info data (Optional). 
ri_info_grid = widgets.show_df(ri_info_df)
ri_info_grid

In [None]:
# View/edit the ri_scope data (Optional). 
ri_scope_grid = widgets.show_df(ri_scope_df)
ri_scope_grid

In [None]:
# Store exposure and create run dir 
run_dir = os.path.join('runs', os.path.basename(source_exposure['source_dir']))
os.makedirs(run_dir, exist_ok=True)

# Pick up any edits for required files in the grid before running the analysis
location_df = location_grid.get_changed_df()
loc_csv = os.path.join(run_dir, "location.csv")
location_df.to_csv(path_or_buf=loc_csv, encoding='utf-8', index=False)

account_df = account_grid.get_changed_df()
acc_csv = os.path.join(run_dir, "account.csv")
account_df.to_csv(path_or_buf=acc_csv, encoding='utf-8', index=False)
 
# Pick up any edits to the RI files, if they are present
if (ri_scope_df is not None) and (ri_info_df is not None):
    ri_info_df = ri_info_grid.get_changed_df()
    info_csv = os.path.join(run_dir, "ri_info.csv")
    ri_info_df.to_csv(path_or_buf=info_csv, encoding='utf-8', index=False)
    
    ri_scope_df = ri_scope_grid.get_changed_df()
    scope_csv = os.path.join(run_dir, "ri_scope.csv")
    ri_scope_df.to_csv(path_or_buf=scope_csv, encoding='utf-8', index=False)
else:
    info_csv = None
    scope_csv = None

In [None]:
# Generate keys file
keys_csv = os.path.join(run_dir, "keys.csv")
OasisManager().generate_keys_deterministic(
    oed_location_csv=loc_csv,
    keys_data_csv=keys_csv,
    supported_oed_coverage_types=supported_coverage_types,
    num_subperils=number_of_subperils,
)

keys_df = functions.load_df(keys_csv, required_file='keys.csv') 
keys_grid = widgets.show_df(keys_df)
keys_grid

# Generate Oasis files

In [None]:
%%time
# Pick up any edits in the keys data
keys_df = keys_grid.get_changed_df()
keys_df.to_csv(path_or_buf=keys_csv, encoding='utf-8', index=False)

# Start Oasis files generation
oasis_files = OasisManager().generate_files(
    oasis_files_dir=run_dir,
    oed_location_csv=loc_csv,
    oed_accounts_csv=acc_csv,
    oed_info_csv=info_csv,
    oed_scope_csv=scope_csv,
    keys_data_csv=keys_csv,
    disable_summarise_exposure=True,
    write_ri_tree=True)

print("Location rows: {}".format(len(location_df)))
print("Lookup rows: {}".format(len(keys_df)))

In [None]:

# Show FM items 
fm_file = 'items'
display(Markdown(f'### {fm_file}.csv'))
widgets.show_df(functions.load_df(oasis_files[fm_file]))

In [None]:
# Show FM coverages 
fm_file = 'coverages'
display(Markdown(f'### {fm_file}.csv'))
widgets.show_df(functions.load_df(oasis_files[fm_file]))

## View Direct Insurance files  

In [None]:
# Show FM summary map 
fm_file = 'fm_summary_map'
display(Markdown(f'### {fm_file}.csv'))
#widgets.show_df(functions.load_df(os.path.join(run_dir, 'gul_summary_map.csv')))
fm_summary_map = functions.load_df(os.path.join(run_dir, 'fm_summary_map.csv'))
widgets.show_df(fm_summary_map)

In [None]:
# Show FM fm_programme 
fm_file = 'fm_programme'
display(Markdown(f'### {fm_file}.csv'))
fm_programme = functions.load_df(oasis_files[fm_file])
widgets.show_df(fm_programme)

In [None]:
# Show FM fm_policytc 
fm_file = 'fm_policytc'
display(Markdown(f'### {fm_file}.csv'))
fm_policytc = functions.load_df(oasis_files[fm_file])
widgets.show_df(fm_policytc)

In [None]:
# Show FM fm_profile 
fm_file = 'fm_profile'
display(Markdown(f'### {fm_file}.csv'))
fm_profile = functions.load_df(oasis_files[fm_file])
widgets.show_df(fm_profile)

In [None]:
# Display FM tree
fm_tree = functions.create_fm_tree(fm_programme, fm_policytc, fm_profile, fm_summary_map)
functions.render_fm_tree(fm_tree, filename='tree.png')
display(Markdown('### FM calc Tree'))
display(Image(filename='tree.png'))
display(Markdown('### [FM calculation rules - reference doc](https://github.com/OasisLMF/ktools/blob/master/docs/md/fmprofiles.md)'))

## View Reinsurance files  

In [None]:
# Edit this value to display another inuring priority
selected_ri_layer = "1"   

# Show created RI layers
ri_dir = None
if (ri_scope_df is not None) and (ri_info_df is not None):
    with open(oasis_files['ri_layers'], 'r') as layers:
        ri_metadata = json.load(layers)
    print(json.dumps(ri_metadata, indent=4))
    
    # Select RI layer to display
    ri_dir = ri_metadata[selected_ri_layer]["directory"]
    display(Markdown(f"### Showing RI layer {selected_ri_layer}"))

In [None]:
# Show RI fm_programme 
if ri_dir:
    ri_file = 'fm_programme.csv'
    display(Markdown(f'### RI_{selected_ri_layer} - {ri_file}'))
    ri_programme = functions.load_df(os.path.join(ri_dir, ri_file))
    display(widgets.show_df(ri_programme))

In [None]:
# Show RI fm_policytc 
if ri_dir:
    ri_file = 'fm_policytc.csv'
    display(Markdown(f'### RI_{selected_ri_layer} - {ri_file}'))
    ri_policytc = functions.load_df(os.path.join(ri_dir, ri_file))
    display(widgets.show_df(ri_policytc))

In [None]:
# Show RI fm_profile 
if ri_dir:
    ri_file = 'fm_profile.csv'
    display(Markdown(f'### RI_{selected_ri_layer} - {ri_file}'))
    ri_profile = functions.load_df(os.path.join(ri_dir, ri_file))
    display(widgets.show_df(ri_profile))

In [None]:
# Show RI Tree
if ri_dir:
    tree_fp = os.path.join(ri_dir, 'fm_tree.png')
    display(Markdown(f'### RI_{selected_ri_layer} - calc Tree'))
    display(Image(filename=tree_fp))

    #ri_policytc.rename(columns={'profile_id': 'policytc_id' }, inplace=True)
    #ri_profile.rename(columns={'profile_id': 'policytc_id' }, inplace=True)
    #ri_tree = functions.create_fm_tree(ri_programme, ri_policytc, ri_profile, fm_summary_map)
    #functions.render_fm_tree(ri_tree, filename='ri_tree.png')
    #display(Image(filename='ri_tree.png'))

# Generate Losses (Ktools FM)


In [None]:
%%time
# Run Deterministic Losses FM
output_losses = os.path.join(run_dir, 'losses.csv')
OasisManager().run_exposure(
    src_dir=run_dir,
    output_level=output_level,
    output_file=output_losses,
    num_subperils=number_of_subperils,
    coverage_types=supported_coverage_types,
    loss_factor=loss_factors,
    ktools_alloc_rule_il=alloc_il,
    ktools_alloc_rule_ri=alloc_ri,
    net_ri=True,
    include_loss_factor=True,
    print_summary=False,
    fmpy=False,
)
ktools_losses = functions.load_df(output_losses)

In [None]:
# Show loss results 
if len(loss_factors) > 1:
    for i in range(len(loss_factors)):
        factor_losses = ktools_losses[ktools_losses.loss_factor_idx == i]
        display(Markdown(f'### Loss Factor {loss_factors[i]*100} %'))

        display(Markdown('**Total gul** = {}'.format(
            factor_losses.loss_gul.sum()
        )))
        display(Markdown('**Total il** = {}'.format(
            factor_losses.loss_il.sum()
        )))
        if hasattr(factor_losses, 'loss_ri'):
            display(Markdown('**Total ri ceded** = {}'.format(
                factor_losses.loss_ri.sum()
            )))
        
        display(factor_losses.drop(columns='loss_factor_idx'))
else: 
    display(ktools_losses)

# Generate Losses (Python FM)

In [None]:
%%time
# Run Deterministic Losses fmpy
output_losses = os.path.join(run_dir, 'losses.csv')
OasisManager().run_exposure(
    src_dir=run_dir,
    output_level=output_level,
    output_file=output_losses,
    num_subperils=number_of_subperils,
    coverage_types=supported_coverage_types,
    loss_factor=loss_factors,
    ktools_alloc_rule_il=alloc_il,
    ktools_alloc_rule_ri=alloc_ri,
    net_ri=True,
    include_loss_factor=True,
    print_summary=False,
    fmpy=True,
)
fmpy_losses = functions.load_df(output_losses)

In [None]:
# Show loss results 
if len(loss_factors) > 1:
    for i in range(len(loss_factors)):
        factor_losses = fmpy_losses[fmpy_losses.loss_factor_idx == i]
        display(Markdown(f'### Loss Factor {loss_factors[i]*100} %'))

        display(Markdown('**Total gul** = {}'.format(
            factor_losses.loss_gul.sum()
        )))
        display(Markdown('**Total il** = {}'.format(
            factor_losses.loss_il.sum()
        )))
        if hasattr(factor_losses, 'loss_ri'):
            display(Markdown('**Total ri ceded** = {}'.format(
                factor_losses.loss_ri.sum()
            )))
            
        display(factor_losses.drop(columns='loss_factor_idx'))
else: 
    display(fmpy_losses)

# Percentage difference between FM modules 
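
One common definition of the percentage difference between two values is the absolute difference divided by their mean. The sketch below shows that definition for a single pair of losses; `percentage_difference` is a hypothetical helper written for illustration, and returning 0 when both losses are zero is an assumption that mirrors filling undefined results with 0.

```python
def percentage_difference(a, b):
    """Symmetric percentage difference: |a - b| / ((a + b) / 2) * 100."""
    mean = (a + b) / 2
    if mean == 0:
        return 0.0  # both losses are zero, so treat them as identical
    return abs(a - b) / mean * 100


print(percentage_difference(100.0, 100.0))  # 0.0
print(percentage_difference(90.0, 110.0))   # 20.0
```

Because the mean of the two values is the denominator, the result is the same whichever module is treated as the reference.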

In [None]:
# Show output difference % 
col_filter = [col for col in fmpy_losses.columns if col in ['loss_gul', 'loss_il', 'loss_ri']]
pc_diff_df = fmpy_losses.drop(columns=col_filter)
if len(loss_factors) > 1:
    # Symmetric percentage difference: |a - b| / ((a + b) / 2) * 100
    pc_diff_df['loss_il_difference'] = (fmpy_losses['loss_il'] - ktools_losses['loss_il']).abs() / ((fmpy_losses['loss_il'] + ktools_losses['loss_il']) / 2) * 100
    if hasattr(fmpy_losses, 'loss_ri'):
        pc_diff_df['loss_ri_difference'] = (fmpy_losses['loss_ri'] - ktools_losses['loss_ri']).abs() / ((fmpy_losses['loss_ri'] + ktools_losses['loss_ri']) / 2) * 100
    
    pc_diff_df.fillna(0, inplace=True)
    
    for i in range(len(loss_factors)):
        display(Markdown(f'### Loss Factor {loss_factors[i]*100} %'))
        diff_losses = pc_diff_df[pc_diff_df.loss_factor_idx == i]
        display(diff_losses.drop(columns='loss_factor_idx'))
    
else: 
    # Symmetric percentage difference: |a - b| / ((a + b) / 2) * 100
    pc_diff_df['loss_il_difference'] = (fmpy_losses['loss_il'] - ktools_losses['loss_il']).abs() / ((fmpy_losses['loss_il'] + ktools_losses['loss_il']) / 2) * 100
    if hasattr(fmpy_losses, 'loss_ri'):  # loss_ri may be absent, so guard as in the branch above
        pc_diff_df['loss_ri_difference'] = (fmpy_losses['loss_ri'] - ktools_losses['loss_ri']).abs() / ((fmpy_losses['loss_ri'] + ktools_losses['loss_ri']) / 2) * 100
    pc_diff_df.fillna(0, inplace=True)
    display(pc_diff_df)