N_periods with asymmetrical triangles #94

johalnes · 2020-09-01T09:01:13Z

Here is my usual test code:

import chainladder as cl
import pandas as pd

data = pd.read_csv('https://raw.githubusercontent.com/casact/chainladder-python/master/chainladder/utils/data/prism.csv')
data['AccYr'] = data['AccidentDate'].str[:4]

x = cl.Triangle(data=data,
            origin='AccYr', development='PaymentDate',
            columns=['Paid', 'Incurred'],
            origin_format='%Y', development_format='%Y-%m-%d').incr_to_cum()

asymmetrical = x.grain('OYDQ')['Paid']
periods = [2]*6 + [-1]*33

annual = x.grain('OYDY')['Paid']
periods_an = [2]*6 + [-1]*3

This works for the annual triangle, but the asymmetrical one fails with "operands could not be broadcast together with shapes (1,1,40,10) (10,39)":

cl.Development(average='volume', n_periods=periods).fit(asymmetrical).ldf_
cl.Development(average='volume', n_periods=periods_an).fit(annual).ldf_
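The error message is NumPy's standard broadcasting failure. As a minimal sketch of the rule behind it (the helper name broadcast_shape is hypothetical, and this says nothing about where inside chainladder the two arrays come from), shapes are aligned from the right and each pair of dimensions must be equal or contain a 1:

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the broadcast shape of a and b, or raise ValueError.

    Hypothetical helper that mirrors NumPy's rule: dimensions are
    compared from the trailing end, missing dimensions count as 1.
    """
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x == y or y == 1:
            out.append(x)
        elif x == 1:
            out.append(y)
        else:
            raise ValueError(
                f"operands could not be broadcast together: {a} {b} ({x} vs {y})"
            )
    return tuple(reversed(out))


# Shapes that line up: trailing dimensions (10, 39) match exactly.
compatible = broadcast_shape((1, 1, 10, 39), (10, 39))

# The shapes from the report: trailing dimensions 10 and 39 clash.
try:
    broadcast_shape((1, 1, 40, 10), (10, 39))
except ValueError as err:
    message = str(err)
```

Under this rule the reported shapes fail on the very first comparison (10 against 39), so no choice of n_periods values alone could make them compatible.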

The problem I am trying to solve is one LoB whose development fluctuates very close to 1, so the later factors are just noise. Have you developed any clever ways of manually adjusting the development pattern?

I also feel that an exponentially weighted average often makes sense. Often one wants to include as much data as possible while treating the latest periods as the most relevant. Would this be a good addition to the simple, volume and regression averages that are implemented today? Or perhaps a method that calculates the weights, which could then be passed to the sample_weight parameter of the Development.fit method?
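To make the idea concrete, here is a minimal sketch in plain Python, not chainladder code. The function names, the decay parameter and the sample figures are all assumptions made for illustration, and nothing here shows how such weights would map onto sample_weight:

```python
def exponential_weights(n_origins, decay):
    """Weights for origins ordered oldest to newest; the newest gets 1.0.

    decay is a hypothetical smoothing parameter in (0, 1]; decay=1
    gives every origin the same weight.
    """
    if not 0 < decay <= 1:
        raise ValueError("decay must be in (0, 1]")
    return [decay ** (n_origins - 1 - i) for i in range(n_origins)]


def weighted_ldf(current, following, weights):
    """Weighted volume average of one age-to-age factor.

    current[i] and following[i] are cumulative amounts of origin i at
    two consecutive development ages.
    """
    num = sum(w * f for w, f in zip(weights, following))
    den = sum(w * c for w, c in zip(weights, current))
    return num / den


# Illustrative figures only, oldest origin first.
current = [1000.0, 1100.0, 1200.0, 1300.0]
following = [1500.0, 1600.0, 1850.0, 2000.0]

volume = weighted_ldf(current, following, [1.0] * 4)
ewa = weighted_ldf(current, following, exponential_weights(4, decay=0.5))
```

With decay=1 the result reduces to the ordinary volume-weighted factor, so the exponential weighting can be read as a generalisation of the existing volume average rather than a separate method.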

jbogaardt · 2020-09-01T11:56:54Z

Interesting one. Thanks again for reporting!

> The problem I am trying to solve is one LoB whose development fluctuates very close to 1, so the later factors are just noise. Have you developed any clever ways of manually adjusting the development pattern?

If I understand the question correctly, I typically handle this by including a Tail factor with an attachment_age set just before the triangle's patterns become unstable or unreliable. Alternatively, you can manually exclude individual age-to-age factors from the calculation using the drop argument of the Development estimator, but that's a bit more tedious.
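As a rough illustration of what excluding a factor does to a volume-weighted average, here is a plain-Python sketch. It is not the library's implementation; the function name, the dict layout and the figures are assumptions made for the example:

```python
def volume_ldf(current, following, drop=()):
    """Volume-weighted age-to-age factor with some origins excluded.

    current and following map an origin label to its cumulative amount
    at two consecutive ages; drop lists the origins whose link ratio
    should be left out of the average.
    """
    keep = [o for o in current if o in following and o not in drop]
    if not keep:
        raise ValueError("every origin was dropped")
    return sum(following[o] for o in keep) / sum(current[o] for o in keep)


# Illustrative figures only; 2016 carries an unusual jump.
current = {"2015": 1000.0, "2016": 1100.0, "2017": 1200.0}
following = {"2015": 1010.0, "2016": 1400.0, "2017": 1215.0}

with_all = volume_ldf(current, following)
without_2016 = volume_ldf(current, following, drop={"2016"})
```

Dropping one origin removes it from both the numerator and the denominator, which is why doing this factor by factor across a noisy triangle quickly becomes tedious.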

> I also feel that an exponentially weighted average often makes sense. Often one wants to include as much data as possible while treating the latest periods as the most relevant. Would this be a good addition to the simple, volume and regression averages that are implemented today?

I think this could work, but it is complex enough to warrant its own GitHub issue. The available (limited) choices are based on the fact that they are intended to be compatible with the MackChainladder method. In theory, an exponentially weighted average (EWA) should also be compatible, since it can fit within a weighted least squares framework. I don't believe I've seen a paper that extends Mack in this way, but I can't see a reason it wouldn't work.
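One way to see why it could fit, written in the usual weighted least squares notation for chain ladder. This is a sketch of the standard formulation as I understand it, not something the thread or the library documentation states; the decay parameter \lambda is an assumption introduced for illustration:

```latex
% C_{ik}: cumulative amount of origin i at age k, F_{ik} = C_{i,k+1} / C_{ik}
\hat{f}_k \;=\; \frac{\sum_i w_{ik}\, C_{ik}^{\alpha}\, F_{ik}}{\sum_i w_{ik}\, C_{ik}^{\alpha}},
\qquad
\operatorname{Var}\!\left(F_{ik} \mid C_{ik}\right) \;=\; \frac{\sigma_k^2}{w_{ik}\, C_{ik}^{\alpha}}

% \alpha = 0: simple average, \alpha = 1: volume average,
% \alpha = 2: regression through the origin.
% Exponential weighting with I the latest origin and 0 < \lambda \le 1:
w_{ik} \;=\; \lambda^{\,I - i}
```

Read this way, the exponential weights only change the w_{ik} terms, while the choice of \alpha still selects between the simple, volume and regression averages.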

jbogaardt · 2020-09-01T12:59:53Z

This is now fixed and will show up in chainladder==0.7.7

johalnes · 2020-09-01T20:27:00Z

@jbogaardt I had not seen the attachment_age parameter! Nice! How do you find those pages? I managed it now because I had the link, but I don't think there are any buttons for it on readthedocs?

I tried one of your examples. Following the code above, I added this pipeline:

steps = [
    ('dev', cl.Development(average='volume')),
    ('tail', cl.TailCurve('inverse_power', attachment_age=18)),
    ('model', cl.Chainladder())]
pipe = cl.Pipeline(steps=steps)
pipe.fit(asymmetrical)

Typing pipe.named_steps.tail.ldf_ gives a satisfying result! As expected, the LDFs from the dev step and the tail step are equal up to the attachment age, and from there on the tail is smoother.
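For intuition about what attaching an inverse power curve does, here is a plain-Python sketch, not the TailCurve implementation. It assumes the curve form ln(ldf - 1) = a + b * ln(age), skips any factor at or below 1 because its logarithm is undefined, and treats the attachment age itself as the first fitted age; those choices, the function names and the figures are all assumptions:

```python
import math


def fit_inverse_power(ages, ldfs):
    """Least squares fit of ln(ldf - 1) = a + b * ln(age).

    Factors at or below 1 are skipped, since ln(ldf - 1) is undefined.
    """
    pts = [(math.log(t), math.log(f - 1.0)) for t, f in zip(ages, ldfs) if f > 1.0]
    if len(pts) < 2:
        raise ValueError("need at least two factors above 1 to fit the curve")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    if sxx == 0:
        raise ValueError("need at least two distinct ages")
    b = sum((x - mean_x) * (y - mean_y) for x, y in pts) / sxx
    a = mean_y - b * mean_x
    return a, b


def attach_curve(ages, ldfs, attachment_age):
    """Keep observed factors before attachment_age, fitted values from it on."""
    a, b = fit_inverse_power(ages, ldfs)
    return [
        f if t < attachment_age else 1.0 + math.exp(a + b * math.log(t))
        for t, f in zip(ages, ldfs)
    ]


# Illustrative quarterly factors that hover around 1 at later ages.
ages = [3, 6, 9, 12, 15, 18, 21, 24, 27, 30]
ldfs = [2.10, 1.35, 1.15, 1.08, 1.04, 1.02, 0.999, 1.012, 1.003, 1.006]
smoothed = attach_curve(ages, ldfs, attachment_age=18)
```

In this sketch the first five factors are untouched and everything from age 18 onward comes from the fitted curve, which matches the behaviour described above: identical LDFs before the attachment age and a smoother pattern after it.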

But in the final model there are only quite small changes: except for the tail, named_steps.dev.ldf_ and named_steps.model.ldf_ give exactly the same results. Why would this be correct? I expected named_steps.model.ldf_ to equal the tail LDF.

jbogaardt · 2020-09-02T03:48:35Z

> How do you find those pages?

Good feedback; it's useful to know they aren't easy to get to. From the docs landing page, clicking on the Tail Estimation hyperlink takes you there. Not sure how to make them easier to find. I am open to suggestions.

> But in the final model there are only quite small changes: except for the tail, named_steps.dev.ldf_ and named_steps.model.ldf_ give exactly the same results. Why would this be correct? I expected named_steps.model.ldf_ to equal the tail LDF.

This is bug #96, which has now been resolved on master. It will be released in chainladder==0.7.7

import chainladder as cl
import pandas as pd
data = pd.read_csv('https://raw.githubusercontent.com/casact/chainladder-python/master/chainladder/utils/data/prism.csv')
data['AccYr'] = data['AccidentDate'].str[:4]

x = cl.Triangle(data=data,
            origin='AccYr', development='PaymentDate',
            columns=['Paid', 'Incurred'],
            origin_format='%Y', development_format='%Y-%m-%d').incr_to_cum()

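# Quarterly development on annual origins, as in the original report
asymmetrical = x.grain('OYDQ')['Paid']

steps = [
    ('dev', cl.Development(average='volume')),
    ('tail', cl.TailCurve('inverse_power', attachment_age=18)),
    ('model', cl.Chainladder())]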
pipe = cl.Pipeline(steps=steps)
pipe.fit(asymmetrical)

# This now works
assert pipe.named_steps.model.ldf_ == pipe.named_steps.tail.ldf_

jbogaardt added the Bug label Sep 1, 2020

jbogaardt added a commit that referenced this issue Sep 1, 2020

Fix to #94

93b1d29

jbogaardt closed this as completed Sep 1, 2020
