Add Friedland datasets to cl.load_sample() #730

Merged
kennethshsu merged 7 commits into main from friedland on Apr 30, 2026
Conversation

@genedan
Collaborator

@genedan genedan commented Apr 30, 2026

Closes #561. The Friedland datasets have been sitting around in the utils/data folder for quite some time, but haven't been loadable by cl.load_sample(). This PR enables that, updates the documentation, and also adds a unit test to make sure all of the supported datasets can load.

I've also added cumulative claim columns to friedland_us_industry_auto_case and friedland_xyz_case. These datasets previously mixed cumulative and incremental columns, which is incompatible with the Triangle class, so the samples now contain only cumulative columns. As a result, load_sample() won't be 1-to-1 with the text, but the old columns can still be derived and remain in the original CSVs.
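
To illustrate how cumulative and incremental columns relate, here is a minimal sketch in plain Python. The figures are made up for illustration and are not taken from the Friedland CSVs, and the helper names `to_cumulative` and `to_incremental` are hypothetical rather than part of this PR.

```python
from itertools import accumulate


def to_cumulative(incremental):
    """Running total across development periods for one origin period."""
    return list(accumulate(incremental))


def to_incremental(cumulative):
    """First differences; the first development period is unchanged."""
    previous = [0] + list(cumulative[:-1])
    return [c - p for p, c in zip(previous, cumulative)]


# Hypothetical paid claims for one accident year, not real Friedland data.
incremental_paid = [100, 60, 25, 10]
cumulative_paid = to_cumulative(incremental_paid)

print(cumulative_paid)                  # [100, 160, 185, 195]
print(to_incremental(cumulative_paid))  # [100, 60, 25, 10]
```

On a loaded triangle, chainladder's `Triangle.cum_to_incr()` and `Triangle.incr_to_cum()` should perform the same conversion, so the incremental view can be recovered after loading.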


Note

Medium Risk
Medium risk because it expands load_sample() dataset-specific parsing rules and changes/removes several sample CSVs, which could break downstream users relying on prior sample names/column schemas.

Overview
Adds support for loading the previously bundled Friedland sample datasets via cl.load_sample() by introducing dataset-specific column/origin/development mappings (including semiannual friedland_auto_freq_sev handling) and improving the invalid-key error message.
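
One way to picture dataset-specific mappings with a more helpful invalid-key error is a small lookup table, sketched below. This is an assumption about the general shape only: the dataset names, column names, and the `resolve_sample` helper are placeholders and do not reflect the actual code in this PR.

```python
# Hypothetical registry: dataset names and column names are placeholders,
# not the mappings actually added to load_sample().
SAMPLE_CONFIG = {
    "example_annual": {
        "origin": "Accident Year",
        "development": "Calendar Year",
        "columns": ["Paid Claims"],
    },
    "example_semiannual": {
        "origin": "Accident Half-Year",
        "development": "Valuation Half-Year",
        "columns": ["Closed Claim Counts"],
    },
}


def resolve_sample(key):
    """Return the parsing rules for a sample, or fail listing the valid keys."""
    try:
        return SAMPLE_CONFIG[key]
    except KeyError:
        valid = ", ".join(sorted(SAMPLE_CONFIG))
        raise ValueError(
            f"Invalid sample key {key!r}. Valid keys: {valid}"
        ) from None
```

Listing the valid keys in the message lets a user fix a typo without opening the documentation.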

Updates sample CSVs to align with Triangle expectations by adding cumulative Paid Claims columns to friedland_us_industry_auto_case and friedland_xyz_case, adjusting friedland_auto_freq_sev half-year labels, and removing three older friedland_uspp_*increasing* CSV variants.

Extends test coverage with a new unit test that iterates over all files in utils/data and asserts each can be loaded, and updates the sample-data documentation table to list the Friedland datasets and their attributes.
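
A test that loads every bundled sample could look roughly like the sketch below. It is not the test added in this PR: the helper name `find_unloadable_samples` is hypothetical, and the loader is passed in as a parameter so the sketch does not depend on chainladder being installed.

```python
from pathlib import Path


def find_unloadable_samples(data_dir, loader):
    """Try to load every CSV in data_dir by file stem.

    Returns a dict mapping each failing sample name to its error message.
    """
    failures = {}
    for csv_path in sorted(Path(data_dir).glob("*.csv")):
        try:
            loader(csv_path.stem)
        except Exception as exc:  # collect every failure instead of stopping
            failures[csv_path.stem] = f"{type(exc).__name__}: {exc}"
    return failures
```

In a real test suite, `loader` would presumably be `cl.load_sample` and `data_dir` the package's utils/data folder, with the test asserting that the returned dict is empty. Collecting failures rather than stopping at the first one reports every broken dataset in a single run.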

Reviewed by Cursor Bugbot for commit 2932f75. Bugbot is set up for automated code reviews on this repo.

@codecov

codecov Bot commented Apr 30, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.25%. Comparing base (a020797) to head (2932f75).
⚠️ Report is 8 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #730      +/-   ##
==========================================
+ Coverage   85.13%   85.25%   +0.11%     
==========================================
  Files          85       85              
  Lines        4911     4950      +39     
  Branches      630      645      +15     
==========================================
+ Hits         4181     4220      +39     
  Misses        521      521              
  Partials      209      209              
Flag Coverage Δ
unittests 85.25% <100.00%> (+0.11%) ⬆️


@kennethshsu kennethshsu merged commit 1465a69 into main Apr 30, 2026
13 checks passed
@kennethshsu kennethshsu deleted the friedland branch April 30, 2026 20:13
@henrydingliu
Collaborator

Do we need to update MANIFEST.in as well? I'm still not able to load the Friedland datasets with load_sample() locally.
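
A quick way to check whether the CSVs actually shipped with an installed copy is to list what sits in the package's utils/data folder. The sketch below is a diagnostic aid under that assumption about the layout; `bundled_sample_names` is a hypothetical helper, not a chainladder function.

```python
from pathlib import Path


def bundled_sample_names(package_dir):
    """List the CSV sample names found under <package_dir>/utils/data."""
    data_dir = Path(package_dir) / "utils" / "data"
    return sorted(p.stem for p in data_dir.glob("*.csv"))


# Example usage against an installed package (requires chainladder):
#   import chainladder as cl
#   names = bundled_sample_names(Path(cl.__file__).parent)
#   print([n for n in names if n.startswith("friedland")])
```

If the Friedland names appear in the repository checkout but not in the installed copy, the packaging configuration is a likely place to look.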


Development

Successfully merging this pull request may close these issues.

[ENH] Add Friedland datasets to chainladder.load_sample().

3 participants