# Generate Contracts Store

This notebook generates the **Data Contract** file:
- `model_contract.json` - Model mathematical definition

**Note:** System prompt and language rules are in `code/report_generator/prompts.py`

**Output:** `knowledge_base/contracts/`

In [None]:
import json
import os

# Resolved relative to the notebook's working directory; created if missing.
OUTPUT_DIR = '../../knowledge_base/contracts'
os.makedirs(OUTPUT_DIR, exist_ok=True)

print(f'Output directory: {OUTPUT_DIR}')

## Model Contract

Defines the mathematical specification of the model.
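
As a rough illustration of what the `model_type` and `prediction_semantics` entries in the contract imply, the sketch below inverts the log link to turn a raw score `f(X)` into a rate, then scales it by exposure. The helper names and the numbers are hypothetical and chosen only for illustration; they are not taken from the trained model.

```python
import math


def rate_from_raw_score(raw_score: float) -> float:
    """Invert the log link: rate = exp(f(X))."""
    return math.exp(raw_score)


def expected_deaths(rate: float, policies_exposed: float) -> float:
    """Scale a rate per unit exposure by the cell's exposure."""
    return rate * policies_exposed


# Hypothetical cell: raw score f(X) = log(0.002), 1500 policies exposed.
raw_score = math.log(0.002)
policies_exposed = 1500.0

rate = rate_from_raw_score(raw_score)
deaths = expected_deaths(rate, policies_exposed)

# Both sides of log(E[Death_Count]) = log(Policies_Exposed) + f(X).
lhs = math.log(deaths)
rhs = math.log(policies_exposed) + raw_score

print(f"rate={rate:.6f}, expected deaths={deaths:.3f}")
print(f"offset identity holds: {math.isclose(lhs, rhs)}")
```

Because the rate is an exponential of an unbounded score, nothing caps it at 1, which is why the contract warns against reading it as a probability.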

In [None]:
model_contract = {
    "model_name": "LGBM Offset Poisson Mortality Model",
    "version": "1.0",
    "created_date": "2026-01",
    
    "observation_unit": {
        "definition": "policy-year-cell",
        "description": "Each row represents a unique combination of policy characteristics observed during a calendar year",
        "note": "NOT an individual policy, but an aggregated cell"
    },
    
    "model_type": {
        "algorithm": "LightGBM",
        "objective": "poisson",
        "offset": "log(Policies_Exposed)",
        "formula": "log(E[Death_Count]) = log(Policies_Exposed) + f(X)"
    },
    
    "prediction_semantics": {
        "model_output": "mortality_rate",
        "interpretation": "Expected deaths per unit exposure",
        "to_get_expected_deaths": "predicted_rate * Policies_Exposed",
        "warning": "Do NOT interpret as probability. It is a rate."
    },
    
    "features": {
        "numerical": ["Attained_Age", "Issue_Age", "Duration"],
        "categorical": [
            "Sex", "Smoker_Status", "Insurance_Plan", "Face_Amount_Band",
            "Preferred_Class", "SOA_Post_Lvl_Ind", "SOA_Antp_Lvl_TP", "SOA_Guar_Lvl_TP"
        ]
    },
    
    "target": {
        "column": "Death_Count",
        "type": "count",
        "exposure_column": "Policies_Exposed"
    },
    
    "two_stage_model": {
        "stage_1": "LGBM predicts base mortality rate (year-agnostic)",
        "stage_2": "Year factors adjust for temporal trends",
        "year_factors_file": "models/year_factors_offset.csv"
    }
}

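# Write the contract as pretty-printed JSON into the output directory
contract_path = os.path.join(OUTPUT_DIR, 'model_contract.json')
with open(contract_path, 'w', encoding='utf-8') as f:
    json.dump(model_contract, f, indent=2)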

print('model_contract.json saved!')
print(json.dumps(model_contract, indent=2))

## Verify Output
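
Beyond listing the generated files, a lightweight structural check can catch a contract that was saved with a section missing. This is a minimal sketch that assumes the top-level keys written by this notebook are the required ones. `REQUIRED_KEYS` and `missing_contract_keys` are hypothetical names and are not part of `code/report_generator/validator.py`. The demo writes to a temporary directory so it does not touch `knowledge_base/contracts/`.

```python
import json
import os
import tempfile

# Assumption: every top-level key this notebook writes is treated as required.
REQUIRED_KEYS = {
    "model_name", "version", "created_date", "observation_unit",
    "model_type", "prediction_semantics", "features", "target",
    "two_stage_model",
}


def missing_contract_keys(path: str) -> list:
    """Return the required top-level keys absent from the JSON file at path."""
    with open(path, encoding="utf-8") as fh:
        contract = json.load(fh)
    return sorted(REQUIRED_KEYS - set(contract))


# Demo on a deliberately incomplete throwaway contract.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "model_contract.json")
    with open(demo_path, "w", encoding="utf-8") as fh:
        json.dump({"model_name": "demo", "version": "1.0"}, fh)
    missing = missing_contract_keys(demo_path)

print("Missing keys:", missing)
```

Pointing `missing_contract_keys` at the saved `model_contract.json` should return an empty list if every section was written.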

In [None]:
print('Generated files:')
for f in os.listdir(OUTPUT_DIR):
    if f.endswith('.json'):
        filepath = os.path.join(OUTPUT_DIR, f)
        size = os.path.getsize(filepath)
        print(f'  ✓ {f} ({size} bytes)')

print('\n--- Prompts Location ---')
print('System prompt & template: code/report_generator/prompts.py')
print('Validator rules: code/report_generator/validator.py')