N_periods with asymmetrical triangles #94

Closed
johalnes opened this issue Sep 1, 2020 · 4 comments

@johalnes
Contributor

johalnes commented Sep 1, 2020

Here is my usual test code:

import chainladder as cl
import pandas as pd

data = pd.read_csv('https://raw.githubusercontent.com/casact/chainladder-python/master/chainladder/utils/data/prism.csv')
data['AccYr'] = data['AccidentDate'].str[:4]

x = cl.Triangle(data=data,
            origin='AccYr', development='PaymentDate',
            columns=['Paid', 'Incurred'],
            origin_format='%Y', development_format='%Y-%m-%d').incr_to_cum()

asymmetrical = x.grain('OYDQ')['Paid']
periods = [2]*6 + [-1]*33

annual = x.grain('OYDY')['Paid']
periods_an = [2]*6 + [-1]*3

Now this works for the annual triangle, but the asymmetrical one raises: operands could not be broadcast together with shapes (1,1,40,10) (10,39)

cl.Development(average='volume', n_periods=periods).fit(asymmetrical).ldf_
cl.Development(average='volume', n_periods=periods_an).fit(annual).ldf_
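For readers unfamiliar with the message: NumPy compares shapes axis by axis starting from the last one, and two axes are compatible only if they are equal or one of them is 1. The sketch below reproduces just the shape mismatch with placeholder arrays. It assumes nothing about chainladder's internals, which are not shown in this thread.

```python
import numpy as np

# Shapes copied from the error message; the values are placeholders.
left = np.ones((1, 1, 40, 10))
right = np.ones((10, 39))

try:
    left * right
    broadcast_ok = True
except ValueError as err:
    # The trailing axes are 10 and 39: neither equal nor 1, so NumPy refuses.
    broadcast_ok = False
    print(err)
```

The last axes (10 versus 39) already fail the rule, so NumPy never gets as far as comparing the remaining ones.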

The problem I am trying to solve is one LoB whose development fluctuates very close to 1, which results in just noise. Have you developed any clever ideas for manually adjusting the development pattern?

I also feel that an exponentially weighted average often makes sense. Often one wants to include as much data as possible while treating the latest periods as the most relevant. Would this be a good addition to the simple, volume and regression averages that are implemented today? Or perhaps a method to calculate the weights, which could then be passed to the sample_weight parameter of the Development.fit method?
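To make the idea concrete, here is a minimal NumPy sketch of an exponentially weighted volume average for a single development age. The function names and the `decay` parameter are hypothetical and are not part of chainladder; whether such weights could be passed through `sample_weight` is exactly the open question above.

```python
import numpy as np

def exp_weights(n_origins, decay=0.8):
    """Return decay**k per origin period, with k = 0 for the latest one."""
    k = np.arange(n_origins)[::-1]  # the oldest origin gets the largest k
    return decay ** k

def ew_volume_ldf(cum_from, cum_to, decay=0.8):
    """Exponentially weighted volume-average age-to-age factor."""
    w = exp_weights(len(cum_from), decay)
    return np.sum(w * cum_to) / np.sum(w * cum_from)

# Made-up cumulative amounts for four origin periods, oldest first.
cum_from = np.array([100.0, 110.0, 120.0, 130.0])
cum_to = np.array([150.0, 160.0, 186.0, 208.0])

ldf_plain = ew_volume_ldf(cum_from, cum_to, decay=1.0)   # ordinary volume average
ldf_recent = ew_volume_ldf(cum_from, cum_to, decay=0.5)  # leans on recent origins
```

With `decay=1.0` every origin period counts equally and the result is the usual volume-weighted factor. Smaller values pull the factor toward the most recent origin periods.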

@jbogaardt jbogaardt added the Bug label Sep 1, 2020
@jbogaardt
Collaborator

Interesting one. Thanks again for reporting!

The problem I am trying to solve is one LoB whose development fluctuates very close to 1, which results in just noise. Have you developed any clever ideas for manually adjusting the development pattern?

If I understand the question correctly, I typically solve for this by including a Tail factor with an attachment_age just before the patterns of the triangle become unstable or unreliable. Alternatively, you can manually drop any age-to-age factors from the calculation using the drop argument of the Development estimator, but that's a bit more tedious.
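As a rough illustration of what dropping individual age-to-age factors does to a volume-weighted average, here is a NumPy-only sketch on a made-up triangle. The `volume_ldf` helper and its positional `(origin, development)` indices are assumptions for this example; the library's own `drop` argument may identify cells differently.

```python
import numpy as np

# Made-up cumulative triangle: rows are origin periods, columns are ages.
tri = np.array([
    [100.0, 150.0, 165.0, 170.0],
    [110.0, 160.0, 180.0, np.nan],
    [120.0, 186.0, np.nan, np.nan],
    [130.0, np.nan, np.nan, np.nan],
])

def volume_ldf(tri, drop=()):
    """Volume-weighted factors, skipping the (origin, development) pairs in drop."""
    start, end = tri[:, :-1], tri[:, 1:]
    valid = ~np.isnan(start) & ~np.isnan(end)
    for origin, dev in drop:
        valid[origin, dev] = False  # exclude this single link ratio
    num = np.where(valid, end, 0.0).sum(axis=0)
    den = np.where(valid, start, 0.0).sum(axis=0)
    return num / den

ldf_all = volume_ldf(tri)
ldf_dropped = volume_ldf(tri, drop=[(2, 0)])  # ignore 186/120 in the first column
```

Only the first factor changes when the one link ratio is dropped, which is why this approach gets tedious when many cells are noisy.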

I also feel that an exponentially weighted average often makes sense. Often one wants to include as much data as possible while treating the latest periods as the most relevant. Would this be a good addition to the simple, volume and regression averages that are implemented today?

I think this could work, but it is complex enough to warrant its own GitHub issue. The available (limited) choices are based on the fact that they are intended to be compatible with the MackChainladder method. In theory, it seems an EWA should also be compatible since it can fit within a weighted least squares framework. I don't believe I've seen a paper that extends Mack in this way, but I can't see a reason it wouldn't work.
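The point-estimate part of that claim can be checked numerically. A regression through the origin with weights e_i / x_i, where e_i is an exponential weight and x_i the cumulative amount at the earlier age, should reproduce the exponentially weighted volume average. The sketch below uses made-up numbers and a hypothetical decay of 0.5. It says nothing about whether Mack's variance assumptions carry over, which the comment above leaves open.

```python
import numpy as np

x = np.array([100.0, 110.0, 120.0, 130.0])  # cumulative at the earlier age
y = np.array([150.0, 160.0, 186.0, 208.0])  # cumulative at the later age
e = 0.5 ** np.arange(len(x))[::-1]          # hypothetical exponential weights

# Weighted least squares through the origin: minimise sum(omega * (y - f*x)**2).
omega = e / x
sqrt_w = np.sqrt(omega)
solution, *_ = np.linalg.lstsq((sqrt_w * x)[:, None], sqrt_w * y, rcond=None)
f_wls = solution[0]

# Closed form of the exponentially weighted volume average.
f_closed = np.sum(e * y) / np.sum(e * x)
```

Setting all e_i to 1 in the same code gives the ordinary volume-weighted factor.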

jbogaardt added a commit that referenced this issue Sep 1, 2020
@jbogaardt
Collaborator

This is now fixed and will show up in chainladder==0.7.7

@johalnes
Contributor Author

johalnes commented Sep 1, 2020

@jbogaardt I had not seen the attachment_age parameter! Nice! How do you find those pages? I managed now by having the link, but I don't think there are any buttons for it on readthedocs?

I tried one of your examples. Continuing from the code above, I added the following pipeline:

steps = [
    ('dev', cl.Development(average='volume')),
    ('tail', cl.TailCurve('inverse_power', attachment_age=18)),
    ('model', cl.Chainladder())]
pipe = cl.Pipeline(steps=steps)
pipe.fit(asymmetrical)

And typing pipe.named_steps.tail.ldf_ gives a satisfying result! As expected, the early LDFs from the dev step and the tail step are equal; from there on, the tail is smoother.
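For intuition about why the tail looks smoother, an inverse power curve has the form ldf(t) = 1 + a * t^(-b), which can be fitted by a straight-line regression of log(ldf - 1) on log(t). The sketch below uses synthetic factors rather than the prism data, and it is not chainladder's implementation. In particular, replacing only the factors after age 18 is an assumption about what an attachment age means.

```python
import numpy as np

# Synthetic age-to-age factors: an exact inverse power curve up to age 18,
# followed by two noisy factors hovering around 1.
ages = np.array([3.0, 6.0, 9.0, 12.0, 15.0, 18.0, 21.0, 24.0])
observed = 1.0 + 2.0 * ages ** -1.5
observed[-2:] = [0.999, 1.004]

attachment_age = 18.0  # assumption: factors after this age are replaced

# Fit log(ldf - 1) = log(a) - b * log(age) on the ages up to the attachment age.
fit = ages <= attachment_age
slope, intercept = np.polyfit(np.log(ages[fit]), np.log(observed[fit] - 1.0), 1)
a, b = np.exp(intercept), -slope

# Keep the observed factors up to the attachment age, use the curve after it.
smoothed = np.where(fit, observed, 1.0 + a * ages ** -b)
```

Note that the log transform requires factors above 1, so noisy factors at or below 1 have to be excluded from the fit, as they are here.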

But in the final model there are only quite small changes: apart from the tail, pipe.named_steps.dev.ldf_ and pipe.named_steps.model.ldf_ give exactly the same results. Is this correct, and if so, why? I expected pipe.named_steps.model.ldf_ to be equal to the tail LDF.

@jbogaardt
Collaborator

How do you find those pages?

Good to know they aren't easy to get to. From the docs landing page, clicking on the Tail Estimation hyperlink takes you there. Not sure how to make them easier to find. I am open to suggestions.

But in the final model there are only quite small changes: apart from the tail, pipe.named_steps.dev.ldf_ and pipe.named_steps.model.ldf_ give exactly the same results. Is this correct, and if so, why? I expected pipe.named_steps.model.ldf_ to be equal to the tail LDF.

This is bug #96, which has now been resolved on master. It will be released in chainladder==0.7.7

import chainladder as cl
import pandas as pd
data = pd.read_csv('https://raw.githubusercontent.com/casact/chainladder-python/master/chainladder/utils/data/prism.csv')
data['AccYr'] = data['AccidentDate'].str[:4]

x = cl.Triangle(data=data,
            origin='AccYr', development='PaymentDate',
            columns=['Paid', 'Incurred'],
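            origin_format='%Y', development_format='%Y-%m-%d').incr_to_cum()

# Annual origin, quarterly development, as defined in the first post
asymmetrical = x.grain('OYDQ')['Paid']
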
steps=[
    ('dev', cl.Development(average='volume')),
    ('tail', cl.TailCurve('inverse_power', attachment_age = 18)),
    ('model', cl.Chainladder())]
pipe = cl.Pipeline(steps=steps)
pipe.fit(asymmetrical)

# This now works
assert pipe.named_steps.model.ldf_ == pipe.named_steps.tail.ldf_
