# Setting Up
We will first need to install the package, as Google Colab's default environment doesn't have the chainladder package pre-installed. 

Simply execute `pip install chainladder`. Colab is smart enough to know that this is not a piece of Python code and runs it in the shell instead. FYI, `pip` stands for "Package Installer for Python". When you are ready to install the package on your own machine, run this step in your terminal rather than in a Python notebook.

In [21]:
%load_ext lab_black

Other commonly used packages, such as `numpy`, `pandas`, and `matplotlib`, are already pre-installed; we just need to load them into our environment.

In [22]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import chainladder as cl

print("chainladder", cl.__version__)

chainladder 0.8.13


# Your Journey Begins

Let's begin by looking at a sample dataset, called `xyz`, which is hosted on https://raw.githubusercontent.com/casact/chainladder-python/master/chainladder/utils/data/xyz.csv.

Let's load the dataset into memory with `pandas`, then inspect its first few rows with `head()`.

In [23]:
xyz_df = pd.read_csv(
    "https://raw.githubusercontent.com/casact/chainladder-python/master/chainladder/utils/data/xyz.csv"
)
xyz_df.head()

Unnamed: 0,AccidentYear,DevelopmentYear,Incurred,Paid,Reported,Closed,Premium
0,2002,2002,12811,2318,1342,203,61183
1,2003,2003,9651,1743,1373,181,69175
2,2004,2004,16995,2221,1932,235,99322
3,2005,2005,28674,3043,2067,295,138151
4,2006,2006,27066,3531,1473,307,107578


Can you list all of the unique accident years?

In [24]:
xyz_df["AccidentYear"].unique()

array([2002, 2003, 2004, 2005, 2006, 2007, 2008, 1998, 1999, 2000, 2001])

How many are there?

In [25]:
xyz_df["AccidentYear"].nunique()

11

# Triangle Basics

Let's load the data into the `chainladder` triangle format and call it `xyz_tri`.

In [27]:
xyz_tri = cl.Triangle(
    data=xyz_df,
    origin="AccidentYear",
    development="DevelopmentYear",
    columns=["Incurred", "Paid", "Reported", "Closed", "Premium"],
    cumulative=True,
)
xyz_tri

Unnamed: 0,Triangle Summary
Valuation:,2008-12
Grain:,OYDY
Shape:,"(1, 5, 11, 11)"
Index:,[Total]
Columns:,"[Incurred, Paid, Reported, Closed, Premium]"


What does the incurred triangle look like?

In [28]:
xyz_tri["Incurred"]

Unnamed: 0,12,24,36,48,60,72,84,96,108,120,132
1998,,,11171.0,12380.0,13216.0,14067.0,14688.0,16366.0,16163.0,15835.0,15822.0
1999,,13255.0,16405.0,19639.0,22473.0,23764.0,25094.0,24795.0,25071.0,25107.0,
2000,15676.0,18749.0,21900.0,27144.0,29488.0,34458.0,36949.0,37505.0,37246.0,,
2001,11827.0,16004.0,21022.0,26578.0,34205.0,37136.0,38541.0,38798.0,,,
2002,12811.0,20370.0,26656.0,37667.0,44414.0,48701.0,48169.0,,,,
2003,9651.0,16995.0,30354.0,40594.0,44231.0,44373.0,,,,,
2004,16995.0,40180.0,58866.0,71707.0,70288.0,,,,,,
2005,28674.0,47432.0,70340.0,70655.0,,,,,,,
2006,27066.0,46783.0,48804.0,,,,,,,,
2007,19477.0,31732.0,,,,,,,,,


How about paid?

In [29]:
xyz_tri["Paid"]

Unnamed: 0,12,24,36,48,60,72,84,96,108,120,132
1998,,,6309.0,8521.0,10082.0,11620.0,13242.0,14419.0,15311.0,15764.0,15822.0
1999,,4666.0,9861.0,13971.0,18127.0,22032.0,23511.0,24146.0,24592.0,24817.0,
2000,1302.0,6513.0,12139.0,17828.0,24030.0,28853.0,33222.0,35902.0,36782.0,,
2001,1539.0,5952.0,12319.0,18609.0,24387.0,31090.0,37070.0,38519.0,,,
2002,2318.0,7932.0,13822.0,22095.0,31945.0,40629.0,44437.0,,,,
2003,1743.0,6240.0,12683.0,22892.0,34505.0,39320.0,,,,,
2004,2221.0,9898.0,25950.0,43439.0,52811.0,,,,,,
2005,3043.0,12219.0,27073.0,40026.0,,,,,,,
2006,3531.0,11778.0,22819.0,,,,,,,,
2007,3529.0,11865.0,,,,,,,,,


# Pandas-like Operations

Let's see how `.iloc[...]` and `.loc[...]` work, similarly to pandas. They take 4 parameters: [index, column, origin, development].

What if we want the AY 1998 row of the Incurred data?

In [31]:
xyz_tri.iloc[:, 0, 0, :]

Unnamed: 0,12,24,36,48,60,72,84,96,108,120,132
1998,,,11171,12380,13216,14067,14688,16366,16163,15835,15822


What if you only want the AY 1998 Incurred value at age 60?

In [32]:
xyz_tri.iloc[:, 0, 0, 4]

Unnamed: 0,60
1998,13216


How about the Incurred values at age 60 for every accident year?

In [33]:
xyz_tri.iloc[:, 0, :, 4]

Unnamed: 0,60
1998,13216.0
1999,22473.0
2000,29488.0
2001,34205.0
2002,44414.0
2003,44231.0
2004,70288.0
2005,
2006,
2007,


How do we get the latest Incurred diagonal only?

In [36]:
xyz_tri["Incurred"].latest_diagonal

Unnamed: 0,2008
1998,15822
1999,25107
2000,37246
2001,38798
2002,48169
2003,44373
2004,70288
2005,70655
2006,48804
2007,31732


Very often, we want incremental triangles instead. Let's convert the Incurred triangle to the incremental form.

In [38]:
xyz_tri["Incurred"].cum_to_incr()

Unnamed: 0,12,24,36,48,60,72,84,96,108,120,132
1998,,,11171.0,1209.0,836.0,851.0,621.0,1678.0,-203.0,-328.0,-13.0
1999,,13255.0,3150.0,3234.0,2834.0,1291.0,1330.0,-299.0,276.0,36.0,
2000,15676.0,3073.0,3151.0,5244.0,2344.0,4970.0,2491.0,556.0,-259.0,,
2001,11827.0,4177.0,5018.0,5556.0,7627.0,2931.0,1405.0,257.0,,,
2002,12811.0,7559.0,6286.0,11011.0,6747.0,4287.0,-532.0,,,,
2003,9651.0,7344.0,13359.0,10240.0,3637.0,142.0,,,,,
2004,16995.0,23185.0,18686.0,12841.0,-1419.0,,,,,,
2005,28674.0,18758.0,22908.0,315.0,,,,,,,
2006,27066.0,19717.0,2021.0,,,,,,,,
2007,19477.0,12255.0,,,,,,,,,
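
To see what the conversion does, here is a minimal hand-rolled sketch on the AY 2005 row from the cumulative Incurred triangle above. The helper name `to_incremental` is hypothetical and is not part of `chainladder`; it only mirrors the arithmetic.

```python
# Cumulative Incurred for accident year 2005 at ages 12, 24, 36, 48
# (copied from the cumulative triangle shown earlier).
cumulative = [28674.0, 47432.0, 70340.0, 70655.0]


def to_incremental(row):
    """Keep the first value, then take differences between successive ages."""
    return [row[0]] + [later - earlier for earlier, later in zip(row, row[1:])]


incremental = to_incremental(cumulative)
print(incremental)
```

The result lines up with the 2005 row of the incremental triangle above.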


We can also convert the triangle to the valuation format, which is what we often see in Schedule P.

In [39]:
xyz_tri["Incurred"].dev_to_val()

Unnamed: 0,1998,1999,2000,2001,2002,2003,2004,2005,2006,2007,2008
1998,,,11171.0,12380.0,13216.0,14067.0,14688.0,16366.0,16163.0,15835.0,15822
1999,,,13255.0,16405.0,19639.0,22473.0,23764.0,25094.0,24795.0,25071.0,25107
2000,,,15676.0,18749.0,21900.0,27144.0,29488.0,34458.0,36949.0,37505.0,37246
2001,,,,11827.0,16004.0,21022.0,26578.0,34205.0,37136.0,38541.0,38798
2002,,,,,12811.0,20370.0,26656.0,37667.0,44414.0,48701.0,48169
2003,,,,,,9651.0,16995.0,30354.0,40594.0,44231.0,44373
2004,,,,,,,16995.0,40180.0,58866.0,71707.0,70288
2005,,,,,,,,28674.0,47432.0,70340.0,70655
2006,,,,,,,,,27066.0,46783.0,48804
2007,,,,,,,,,,19477.0,31732
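
The relationship between the two layouts can be sketched by hand. The snippet below assumes annual origin and development periods (the `OYDY` grain reported in the triangle summary) and year-end valuations; `valuation_year` is a hypothetical helper, not a `chainladder` function.

```python
def valuation_year(origin_year, age_months):
    """Calendar year at whose end an accident year reaches the given age."""
    return origin_year + age_months // 12 - 1


# AY 1999 Incurred by development age, first three known values
# (copied from the development-format triangle shown earlier).
ay_1999_by_age = {24: 13255.0, 36: 16405.0, 48: 19639.0}

# Re-key the same amounts by valuation year instead of age.
ay_1999_by_valuation = {
    valuation_year(1999, age): amount for age, amount in ay_1999_by_age.items()
}
print(ay_1999_by_valuation)
```

The amounts are unchanged; only the column each amount sits in differs between the two formats.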


Another function that is often useful is the `.heatmap()` method. Let's inspect the incurred amount and see if there are trends.

In [40]:
xyz_tri["Incurred"].heatmap()

Unnamed: 0,12,24,36,48,60,72,84,96,108,120,132
1998,,,11171.0,12380.0,13216.0,14067.0,14688.0,16366.0,16163.0,15835.0,15822.0
1999,,13255.0,16405.0,19639.0,22473.0,23764.0,25094.0,24795.0,25071.0,25107.0,
2000,15676.0,18749.0,21900.0,27144.0,29488.0,34458.0,36949.0,37505.0,37246.0,,
2001,11827.0,16004.0,21022.0,26578.0,34205.0,37136.0,38541.0,38798.0,,,
2002,12811.0,20370.0,26656.0,37667.0,44414.0,48701.0,48169.0,,,,
2003,9651.0,16995.0,30354.0,40594.0,44231.0,44373.0,,,,,
2004,16995.0,40180.0,58866.0,71707.0,70288.0,,,,,,
2005,28674.0,47432.0,70340.0,70655.0,,,,,,,
2006,27066.0,46783.0,48804.0,,,,,,,,
2007,19477.0,31732.0,,,,,,,,,


# Development

How can we get the incurred link ratios?
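
As a reminder of what a link ratio is, here is a minimal sketch that computes age-to-age ratios by hand. It uses only accident years 2004 to 2007 from the Incurred triangle above, which is an illustrative choice, and the helper name `link_ratios` is hypothetical rather than the library's API.

```python
# Cumulative Incurred by accident year, ages 12, 24, 36, ... in order
# (copied from the Incurred triangle shown earlier).
triangle = {
    2004: [16995.0, 40180.0, 58866.0, 71707.0, 70288.0],
    2005: [28674.0, 47432.0, 70340.0, 70655.0],
    2006: [27066.0, 46783.0, 48804.0],
    2007: [19477.0, 31732.0],
}


def link_ratios(row):
    """Ratio of each cumulative amount to the amount one age earlier."""
    return [later / earlier for earlier, later in zip(row, row[1:])]


ratios = {ay: link_ratios(row) for ay, row in triangle.items()}
for ay, values in ratios.items():
    print(ay, [round(v, 4) for v in values])
```

Each accident year yields one fewer ratio than it has ages, so the ratio triangle is one column narrower than the loss triangle.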

We can also apply `.heatmap()` to the link ratios, to help us visualize the highs and lows.

Let's get the volume-weighted average LDFs for our Incurred triangle.

How about the CDFs?

We can also use only the latest 3 periods in the calculation of LDFs.
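
The three ideas above (volume-weighted LDFs, CDFs, and restricting the average to the latest periods) can be sketched by hand as follows. This is an illustration under stated assumptions, not the library's implementation: it uses only accident years 2004 to 2007 from the Incurred triangle, applies no tail factor, and the helper names `volume_weighted_ldfs` and `to_cdfs` are hypothetical.

```python
# Cumulative Incurred by accident year, ages 12, 24, 36, ... in order
# (copied from the Incurred triangle shown earlier).
triangle = {
    2004: [16995.0, 40180.0, 58866.0, 71707.0, 70288.0],
    2005: [28674.0, 47432.0, 70340.0, 70655.0],
    2006: [27066.0, 46783.0, 48804.0],
    2007: [19477.0, 31732.0],
}


def volume_weighted_ldfs(tri, n_periods=None):
    """Sum of the later-age column divided by the sum of the earlier-age column.

    If n_periods is given, only the most recent n_periods accident years
    that have both ages available contribute to each factor.
    """
    n_ages = max(len(row) for row in tri.values())
    ldfs = []
    for j in range(n_ages - 1):
        rows = [tri[ay] for ay in sorted(tri) if len(tri[ay]) > j + 1]
        if n_periods is not None:
            rows = rows[-n_periods:]
        ldfs.append(sum(r[j + 1] for r in rows) / sum(r[j] for r in rows))
    return ldfs


def to_cdfs(ldfs):
    """Cumulative factors: each LDF multiplied by all later LDFs (no tail)."""
    cdfs, running = [], 1.0
    for factor in reversed(ldfs):
        running *= factor
        cdfs.append(running)
    return cdfs[::-1]


ldfs_all = volume_weighted_ldfs(triangle)
ldfs_last3 = volume_weighted_ldfs(triangle, n_periods=3)
cdfs_all = to_cdfs(ldfs_all)

print("LDFs (all years):   ", [round(f, 4) for f in ldfs_all])
print("LDFs (latest 3):    ", [round(f, 4) for f in ldfs_last3])
print("CDFs (all years):   ", [round(f, 4) for f in cdfs_all])
```

Because only four accident years are used here, restricting to the latest 3 changes the 12-to-24 factor only; on the full triangle, more columns would be affected.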

# Deterministic Models

Before we can build any models, we need to use `fit_transform()`, so that the object is actually modified with our selected development pattern(s).

Set the development of the triangle to use only 3 periods.

Let's fit a chainladder model to our Incurred triangle.

How can we get the model's ultimate estimate?

How about just the IBNR?
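
For intuition, the ultimate and IBNR from a basic chainladder projection can be reproduced by hand. The sketch below assumes volume-weighted factors, no tail factor, and only accident years 2004 to 2007 of the Incurred triangle; the helper names are hypothetical, and the numbers will not match a model fitted on the full triangle.

```python
# Cumulative Incurred by accident year, ages 12, 24, 36, ... in order
# (copied from the Incurred triangle shown earlier).
triangle = {
    2004: [16995.0, 40180.0, 58866.0, 71707.0, 70288.0],
    2005: [28674.0, 47432.0, 70340.0, 70655.0],
    2006: [27066.0, 46783.0, 48804.0],
    2007: [19477.0, 31732.0],
}


def volume_weighted_ldfs(tri):
    """Sum of the later-age column divided by the sum of the earlier-age column."""
    n_ages = max(len(row) for row in tri.values())
    ldfs = []
    for j in range(n_ages - 1):
        rows = [row for row in tri.values() if len(row) > j + 1]
        ldfs.append(sum(r[j + 1] for r in rows) / sum(r[j] for r in rows))
    return ldfs


def project(tri):
    """Ultimate = latest amount times the product of the remaining LDFs."""
    ldfs = volume_weighted_ldfs(tri)
    results = {}
    for ay, row in tri.items():
        cdf = 1.0
        for factor in ldfs[len(row) - 1:]:  # factors beyond the latest age
            cdf *= factor
        latest = row[-1]
        ultimate = latest * cdf
        results[ay] = {"latest": latest, "ultimate": ultimate, "ibnr": ultimate - latest}
    return results


results = project(triangle)
for ay, r in results.items():
    print(ay, round(r["ultimate"], 1), round(r["ibnr"], 1))
```

IBNR is simply the ultimate minus the latest diagonal, so the oldest year in this sketch has zero IBNR, and a factor below 1 produces negative IBNR.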

Let's fit an Expected Loss model, with an a priori loss ratio of 90% on Premium, and get its ultimates.

Try it on the Paid triangle. Do you get the same ultimate?

How about a Bornhuetter-Ferguson model?

How about Benktander, with 2 iterations?
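
The Expected Loss, Bornhuetter-Ferguson, and Benktander estimates are closely related, which a short hand calculation for a single accident year makes visible. This is a sketch under several assumptions: it uses AY 2006 only, treats the Premium shown in the `head()` output as final, and builds the age-36 CDF from volume-weighted factors of AY 2004 and 2005 alone. The function names are hypothetical and do not represent the library's API.

```python
# AY 2006 figures taken from the outputs shown earlier.
latest = 48804.0     # latest Incurred, at age 36
premium = 107578.0   # Premium from head(); assumed unchanged at latest valuation
apriori = 0.9        # a priori loss ratio of 90%

# Age-36-to-ultimate factor from AY 2004-2005 only (36->48, then 48->60),
# with no tail factor. An illustrative choice, not a selected pattern.
cdf = ((71707.0 + 70655.0) / (58866.0 + 70340.0)) * (70288.0 / 71707.0)


def expected_loss(premium, apriori):
    """Ultimate ignores the loss experience entirely."""
    return apriori * premium


def bornhuetter_ferguson(latest, cdf, premium, apriori):
    """Latest amount plus the unreported share of the a priori ultimate."""
    return latest + (1.0 - 1.0 / cdf) * expected_loss(premium, apriori)


def benktander(latest, cdf, premium, apriori, n_iters):
    """Repeat the BF step, feeding each ultimate back in as the new prior."""
    ultimate = expected_loss(premium, apriori)
    for _ in range(n_iters):
        ultimate = latest + (1.0 - 1.0 / cdf) * ultimate
    return ultimate


el = expected_loss(premium, apriori)
bf = bornhuetter_ferguson(latest, cdf, premium, apriori)
bk2 = benktander(latest, cdf, premium, apriori, n_iters=2)
chainladder_ult = latest * cdf

print(round(el, 1), round(bf, 1), round(bk2, 1), round(chainladder_ult, 1))
```

In this sketch one Benktander iteration equals Bornhuetter-Ferguson, and more iterations move the estimate from the a priori ultimate toward the chainladder ultimate.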

How about Cape Cod?

Let's store the Cape Cod model as `cc_result`. We can also use `.to_frame()` to leave `chainladder` and go to a `DataFrame`. Let's make a bar chart over origin years to see what they look like.

# Stochastic Models

Mack's Chainladder model is available as `MackChainladder`.

There are many attributes that are available, such as `full_std_err_`, `total_process_risk_`, `total_parameter_risk_`, `mack_std_err_` and `total_mack_std_err_`.

MackChainladder also has a `summary_` attribute.

Let's make a graph that shows the Reported and IBNR amounts as stacked bars, with error bars showing the Mack standard errors.

ODP Bootstrap is also available. Let's build 10,000 sampled Incurred triangles.

We can fit a basic chainladder to all sampled triangles. We now have 10,000 simulated chainladder models, all (most) with unique LDFs.

We can use `predict()` to use the model characteristics (their unique LDFs) to predict our basic Incurred triangle.

Let's make another graph.