In [9]:
import pandas as pd
from pathlib import Path
import sys

sys.path.insert(0, str(Path.cwd().parent))  # adds parent directory
from experiments_lib import prompt_ilec_data, set_context_window_size, get_context_window

%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


## Set up LLM parameters

In [2]:
set_context_window_size(10)

In [3]:
get_context_window()

deque([], maxlen=10)
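The repr above suggests the context window is a plain `collections.deque` with a fixed `maxlen`. The internals of `experiments_lib` are not shown here, so the following is only a minimal sketch of how such a bounded window behaves, with a hypothetical `context_window` variable standing in for whatever the library keeps internally:

```python
from collections import deque

# Hypothetical stand-in for the library's context window (an assumption,
# not the actual experiments_lib implementation).
context_window = deque(maxlen=3)

# Appending past maxlen silently discards the oldest entries.
for turn in ["q1", "a1", "q2", "a2"]:
    context_window.append(turn)

print(list(context_window))
```

With a window of size 10, as set above, the model would presumably see at most the ten most recent entries, which is worth keeping in mind when a follow-up prompt relies on an earlier answer.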

# Actuarial Intern

The intern's manager has described the following task:

> We'd like to select an appropriate expected basis for each insurance plan type. We expect that the basis will vary by issue year, but are open to other ideas. Which of the bases would provide the best fit, if we are looking at the actual-to-expected ratio on an amount basis? The following expected bases are available:

* Expected_Death_QX7580E_by_Amount
* Expected_Death_QX2001VBT_by_Amount
* Expected_Death_QX2008VBT_by_Amount
* Expected_Death_QX2008VBTLU_by_Amount
* Expected_Death_QX2015VBT_by_Amount
* Expected_Death_QX7580E_by_Policy
* Expected_Death_QX2001VBT_by_Policy
* Expected_Death_QX2008VBT_by_Policy
* Expected_Death_QX2008VBTLU_by_Policy
* Expected_Death_QX2015VBT_by_Policy
* ExpDeathQx2015VBTwMI_byPol
* ExpDeathQx2015VBTwMI_byAmt


In [4]:
print(prompt_ilec_data("How many different insurance plan types are there? List them."))

Running request...
There are 7 different insurance plan types:

- Other
- UL
- VL
- Term
- VLSG
- Perm
- ULSG


In [5]:
print(prompt_ilec_data("Calculate the actual-to-expected ratio by amount (Actual Amount / Expected Amount) for each basis and plan type.  Which basis looks like the best fit?"))

Running request...
I can do that — a couple quick clarifying questions so I compute exactly what you want:

1) By "basis" do you mean the Age_Basis column?  
2) Which expected column should I use for the denominator? Options include:
   - Expected_Death_QX2015VBT_by_Amount (QX2015 VBT)
   - ExpDeathQx2015VBTwMI_byAmt (QX2015 VBT with MI)
   - Expected_Death_QX2008VBT_by_Amount, Expected_Death_QX2001VBT_by_Amount, Expected_Death_QX7580E_by_Amount, etc.

If you don’t care, I will use Expected_Death_QX2015VBT_by_Amount (a current standard) and compute sum(Death_Claim_Amount) / sum(Expected_Death_QX2015VBT_by_Amount) grouped by Age_Basis and Insurance_Plan, then return the ratios and highlight which basis looks closest to 1.0 (best fit). Confirm or tell me which expected column to use.


In [7]:
print(prompt_ilec_data("Calculate it for all columns that end with _by_Amount", max_turns=50))

Running request...
Done — I computed Actual / Expected (by amount) for every expected column whose name ends with "_by_Amount", grouped by Age_Basis and Insurance_Plan.

What I used
- Actual amount: SUM(Death_Claim_Amount)
- Expected columns (those that end with _by_Amount):
  - Expected_Death_QX7580E_by_Amount
  - Expected_Death_QX2001VBT_by_Amount
  - Expected_Death_QX2008VBT_by_Amount
  - Expected_Death_QX2008VBTLU_by_Amount
  - Expected_Death_QX2015VBT_by_Amount
- Ratio formula per group: SUM(Death_Claim_Amount) / SUM(Expected_X_by_Amount). If the summed expected = 0 the ratio is returned as NULL.

SQL I ran (aggregated by Age_Basis, Insurance_Plan):
SELECT Age_Basis, Insurance_Plan,
  SUM(Death_Claim_Amount) AS Actual_Amount,
  SUM(Expected_Death_QX7580E_by_Amount) AS Exp_QX7580E,
  CASE WHEN SUM(Expected_Death_QX7580E_by_Amount)=0 THEN NULL ELSE SUM(Death_Claim_Amount)/SUM(Expected_Death_QX7580E_by_Amount) END AS Ratio_QX7580E,
  SUM(Expected_Death_QX2001VBT_by_Amount) AS Exp_QX2
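
The aggregation described in the response above (sum the actuals, sum each expected column, divide, and return NULL when the expected sum is zero) can also be sketched in pandas. The rows below are made-up numbers for illustration only; the column names follow the fields quoted in the response, and only two of the expected columns are included to keep the example short.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration; these are NOT values from the ILEC data.
rows = pd.DataFrame({
    "Age_Basis": ["ALB", "ALB", "ALB", "ANB"],
    "Insurance_Plan": ["Term", "Term", "Perm", "Term"],
    "Death_Claim_Amount": [100.0, 50.0, 80.0, 0.0],
    "Expected_Death_QX2008VBT_by_Amount": [120.0, 80.0, 100.0, 0.0],
    "Expected_Death_QX2015VBT_by_Amount": [90.0, 60.0, 80.0, 0.0],
})

# Same selection rule as the prompt: expected columns ending in "_by_Amount".
expected_cols = [c for c in rows.columns
                 if c.startswith("Expected_") and c.endswith("_by_Amount")]

# Sum first, divide second: the A/E ratio is a ratio of sums, not a mean of ratios.
sums = rows.groupby(["Age_Basis", "Insurance_Plan"])[
    ["Death_Claim_Amount", *expected_cols]
].sum()

ae = pd.DataFrame(index=sums.index)
for col in expected_cols:
    # Replace a zero expected sum with NaN, mirroring the CASE ... THEN NULL guard.
    expected = sums[col].replace(0, np.nan)
    name = "Ratio_" + col.replace("Expected_Death_", "").replace("_by_Amount", "")
    ae[name] = sums["Death_Claim_Amount"] / expected

print(ae)
```

One thing the suffix filter makes visible: `ExpDeathQx2015VBTwMI_byAmt` from the manager's list does not end in `_by_Amount`, which is consistent with it being absent from the five expected columns the response reports using.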

In [8]:
print(prompt_ilec_data("Provide a full CSV export by Age_Basis and Insurance_Plan", max_turns=50))

Running request...
Here is the full CSV export (aggregated by Age_Basis and Insurance_Plan). Columns:
Age_Basis, Insurance_Plan, Actual_Amount, Exp_QX7580E, Ratio_QX7580E, Exp_QX2001VBT, Ratio_QX2001VBT, Exp_QX2008VBT, Ratio_QX2008VBT, Exp_QX2008VBTLU, Ratio_QX2008VBTLU, Exp_QX2015VBT, Ratio_QX2015VBT

(Note: ratios = Actual_Amount / Exp_..., rounded to 6 decimal places; if an expected sum = 0 the ratio would be blank.)

Age_Basis,Insurance_Plan,Actual_Amount,Exp_QX7580E,Ratio_QX7580E,Exp_QX2001VBT,Ratio_QX2001VBT,Exp_QX2008VBT,Ratio_QX2008VBT,Exp_QX2008VBTLU,Ratio_QX2008VBTLU,Exp_QX2015VBT,Ratio_QX2015VBT
ALB,Other,7344706.0,13772715.842673732,0.533295,10511551.905406151,0.698692,8044714.481738072,0.913049,11563186.902724985,0.635173,6896760.409524547,1.064948
ALB,Perm,26281220633.0,41148174192.05634,0.638691,33835514258.696033,0.776673,27622562078.140617,0.951397,33056468160.877834,0.795043,24254173661.80818,1.083600
ALB,Term,24059855494.0,73924683973.85648,0.325746,48687999808.35161

In [14]:
df = pd.read_csv("actuarial_intern_results.csv")
# Drop the expected-amount columns; keep the actual amount and the A/E ratios.
df = df.loc[:, ~df.columns.str.startswith("Exp_")]

num_cols = df.select_dtypes(include="number").columns
amt_col = "Actual_Amount"
pct_cols = [c for c in num_cols if c != amt_col]

# The ratios are stored as proportions (e.g., 0.533295), so scale them to percentages.
df_disp = df.copy()
df_disp[pct_cols] = (df_disp[pct_cols] * 100).round(3)

fmt = {c: "{:.3f}%".format for c in pct_cols}
fmt[amt_col] = lambda x: f"${(x/1_000_000):,.3f}M"

# Notebook (HTML) view. A bare expression only renders when it is the last
# line of a cell, so keep the Styler in a variable for later display.
styled = df_disp.style.format(fmt)

# Plain text (e.g., console/logs):
print(df_disp.to_string(formatters=fmt))


   Age_Basis Insurance_Plan Actual_Amount Ratio_QX7580E Ratio_QX2001VBT Ratio_QX2008VBT Ratio_QX2008VBTLU Ratio_QX2015VBT
0        ALB          Other       $7.345M       53.329%         69.869%         91.305%           63.517%        106.495%
1        ALB           Perm  $26,281.221M       63.869%         77.667%         95.140%           79.504%        108.360%
2        ALB           Term  $24,059.855M       32.575%         49.434%         74.370%           44.730%         88.891%
3        ALB             UL  $21,899.409M       58.556%         74.912%         97.500%           78.322%        112.996%
4        ALB           ULSG   $5,696.525M       53.458%         63.580%         74.748%           58.741%         86.903%
5        ALB             VL   $6,020.575M       49.188%         64.002%         87.942%           66.418%        100.844%
6        ALB           VLSG   $2,329.008M       44.527%         59.343%         83.931%           59.225%         96.650%
7        ANB          Ot
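
Reading the ALB rows printed above, the selection the manager asked for can be sketched as picking, for each plan type, the basis whose A/E ratio is closest to 100%. The ratios below are copied from those rows and converted back to proportions; the ANB rows are cut off in the output, so they are left out. Treating "closest to 1.0 in aggregate" as the definition of best fit is an assumption taken from the assistant's earlier reply, not a rule stated by the manager.

```python
import pandas as pd

# ALB ratios copied from the printed table (percentages converted to proportions).
ratios = pd.DataFrame(
    {
        "Insurance_Plan": ["Other", "Perm", "Term", "UL", "ULSG", "VL", "VLSG"],
        "Ratio_QX7580E": [0.53329, 0.63869, 0.32575, 0.58556, 0.53458, 0.49188, 0.44527],
        "Ratio_QX2001VBT": [0.69869, 0.77667, 0.49434, 0.74912, 0.63580, 0.64002, 0.59343],
        "Ratio_QX2008VBT": [0.91305, 0.95140, 0.74370, 0.97500, 0.74748, 0.87942, 0.83931],
        "Ratio_QX2008VBTLU": [0.63517, 0.79504, 0.44730, 0.78322, 0.58741, 0.66418, 0.59225],
        "Ratio_QX2015VBT": [1.06495, 1.08360, 0.88891, 1.12996, 0.86903, 1.00844, 0.96650],
    }
).set_index("Insurance_Plan")

# Distance of each A/E ratio from a perfect fit of 1.0.
distance = (ratios - 1.0).abs()

# Column name of the smallest distance in each row, i.e. the best-fitting basis.
best_col = distance.idxmin(axis=1)

best = pd.DataFrame({
    "best_basis": best_col.str.replace("Ratio_", "", regex=False),
    "ae_ratio": [ratios.at[plan, col] for plan, col in best_col.items()],
})

print(best)
```

On these seven rows the rule picks QX2008VBT for Perm and UL and QX2015VBT for the other five plan types. This is an aggregate view only; checking the manager's expectation that the basis varies by issue year would need the same comparison repeated within issue-year groups.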