# RILA 1Y10B Price Elasticity Inference - Refactored

**Product**: FlexGuard 1Y10B (1-year term, 10% buffer)

**Architecture**: Uses `UnifiedNotebookInterface` for inference with methodology validation

**Created**: 2026-01-26

---

## Purpose

Run price elasticity inference using the refactored API:
- **Data Source**: Processed data from `00_data_pipeline.ipynb`
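- **Inference**: Bootstrap ensemble via the legacy `center_baseline()` function, with `UnifiedNotebookInterface` supplying the RILA methodology checks (`interface.run_inference()` is not yet fully implemented)
- **Validation**: Economic constraint checking (own-rate coefficient positive, competitor coefficients negative)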

**Key Benefits**:
1. Methodology-aware inference (RILA constraint rules)
2. Automatic lag-0 competitor validation
3. Product-agnostic (same code works for all products; only `product_code` changes)
4. Cleaner separation between data processing and inference

---

## Section 1: Setup

In [None]:
# Setup: Add project root to path
import sys
import os
from pathlib import Path

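# Auto-detect project root by walking up until a "src" directory is found
project_root = Path().resolve()
while not (project_root / "src").exists() and project_root != project_root.parent:
    project_root = project_root.parent

# Fail loudly instead of silently using the filesystem root
if not (project_root / "src").exists():
    raise RuntimeError("Could not locate project root: no 'src' directory in any parent folder.")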

if str(project_root) not in sys.path:
    sys.path.insert(0, str(project_root))

print(f"Project root: {project_root}")

In [None]:
# Standard imports
import pandas as pd
import numpy as np
import warnings
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta
import random

warnings.filterwarnings('ignore')
sns.set_theme(style="whitegrid", palette="deep")

# Our refactored interface
from src.notebooks import create_interface

# Legacy imports for visualization (not yet refactored)
from src.visualization.inference_plots import (
    prepare_visualization_data_pct,
    prepare_visualization_data_dollars,
    generate_price_elasticity_visualization_pct,
    generate_price_elasticity_visualization_dollars,
    save_visualization_files,
    export_csv_files
)

print("Dependencies loaded successfully")

## Section 2: Reproducibility Setup

In [None]:
# =============================================================================
# REPRODUCIBILITY: Random Seed Initialization
# =============================================================================
RANDOM_SEED = 42  # Fixed seed for reproducible bootstrap results

# Set all random seeds for full reproducibility
np.random.seed(RANDOM_SEED)
random.seed(RANDOM_SEED)

print(f"✓ Random seed initialized: {RANDOM_SEED}")
print("  All bootstrap operations will be reproducible across runs.")
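
As a minimal sketch of why the fixed seed matters, the snippet below draws bootstrap row indices twice from the same seed and confirms that the draws match. `draw_bootstrap_indices` is a hypothetical helper for illustration only, not part of `src`, and it uses NumPy's `Generator` API rather than the global seed set above.

```python
import numpy as np

def draw_bootstrap_indices(n_rows: int, seed: int) -> np.ndarray:
    """Sample n_rows row indices with replacement from a seeded generator."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, n_rows, size=n_rows)

first = draw_bootstrap_indices(100, seed=42)
second = draw_bootstrap_indices(100, seed=42)
other = draw_bootstrap_indices(100, seed=43)

print("Same seed gives same draw:", np.array_equal(first, second))
print("Different seed gives same draw:", np.array_equal(first, other))
```

Whether the legacy inference functions rely on the global NumPy state or on their own `random_state` argument is not visible in this notebook, so both are set.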

## Section 3: Create Interface and Load Data

In [None]:
# Create interface for 1Y10B product
# Note: We use "local" environment since we're loading from processed parquet files
interface = create_interface(
    product_code="1Y10B",
    environment="local",
    adapter_kwargs={"data_dir": project_root / "notebooks/rila_1y10b/outputs/datasets_1y10b"}
)

# Verify configuration
print("Product Configuration:")
print(f"  Code: {interface.product.product_code}")
print(f"  Name: {interface.product.name}")
print(f"  Type: {interface.product.product_type}")
print(f"  Buffer: {interface.product.buffer_level * 100:.0f}%")
print(f"  Term: {interface.product.term_years} year(s)")
print()
print(f"Methodology: {interface.methodology}")
print(f"Constraint rules active: {len(interface.get_constraint_rules())} rules")

In [None]:
# Load processed data from data pipeline notebook
data_path = project_root / "notebooks/rila_1y10b/outputs/datasets_1y10b/final_dataset.parquet"

if not data_path.exists():
    raise FileNotFoundError(
        f"Data file not found: {data_path}\n"
        "Please run notebooks/rila_1y10b/00_data_pipeline.ipynb first."
    )

df = pd.read_parquet(data_path)

print(f"Loaded dataset: {df.shape[0]:,} rows × {df.shape[1]:,} columns")
print(f"Date range: {df['date'].min()} to {df['date'].max()}")

## Section 4: Data Filtering and Preparation

In [None]:
# Apply business filters
# 1. Remove zero sales records
initial_count = len(df)
df_filtered = df[df["sales"] != 0].copy()
print(f"Zero sales filter: {initial_count:,} → {len(df_filtered):,} records")

# 2. Create temporal weight decay for model training
#    (assumes rows are sorted oldest to newest; the newest row gets weight 0.999)
weight_decay_factor = 0.999
df_filtered["weight"] = weight_decay_factor ** np.arange(len(df_filtered), 0, -1)

# 3. Apply date filter (keep records dated strictly after 2022-09-01)
date_filter = pd.to_datetime("2022-09-01")
mask_time = df_filtered["date"] > date_filter
df_model = df_filtered[mask_time].iloc[:-1].copy()  # Exclude last record to prevent lookahead bias

print(f"\nFinal training dataset: {len(df_model):,} records")
print(f"Date range: {df_model['date'].min()} to {df_model['date'].max()}")
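
The temporal weighting in step 2 gives row `k` (0-indexed, out of `n` rows) a weight of `0.999 ** (n - k)`, so weights shrink geometrically with age. The sketch below reproduces that formula on a toy row count and computes how many rows back the weight halves. It assumes rows are ordered oldest to newest, which this notebook does not check explicitly.

```python
import numpy as np

decay = 0.999
n = 5  # toy row count; the notebook uses len(df_filtered)

# Row k (oldest first) gets decay ** (n - k), so the newest row gets decay ** 1.
weights = decay ** np.arange(n, 0, -1)
print(weights)

# Number of rows back at which the weight is half that of the newest row.
half_life = np.log(0.5) / np.log(decay)
print(f"Half-life: {half_life:.0f} rows")
```

With a decay factor of 0.999 the half-life is roughly 693 rows, so the weighting is gentle unless the dataset spans many hundreds of records.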

## Section 5: Feature Selection and Validation

In [None]:
# Define features for inference
# These are the key drivers of RILA sales elasticity
# Note: Using features available in fixture data
features = [
    # Own rate (treatment variable) - Prudential's cap rate
    "prudential_rate_current",
    
    # Competitor rates (lagged to avoid simultaneity bias)
    "competitor_mid_t2",  # t-2 lag
    "competitor_mid_t3",  # t-3 lag
    "competitor_mid_t4",  # t-4 lag
    
    # Economic indicators
    "econ_treasury_5y_t1",  # 5-year treasury rate
    "VIXCLS",  # VIX volatility index
]

target = "sales_target_current"

print("Model Configuration:")
print(f"  Target: {target}")
print(f"  Features: {len(features)}")
for i, feat in enumerate(features, 1):
    print(f"    {i}. {feat}")

In [None]:
# Validate features using interface methodology
# This checks for lag-0 competitor features (forbidden by causal framework)
print("Validating features against RILA methodology:\n")

for feature in features:
    is_forbidden = interface._is_competitor_lag_zero(feature)
    status = "❌ FORBIDDEN" if is_forbidden else "✓ OK"
    print(f"  {feature}: {status}")

# Check if any features violate constraints
forbidden_features = [f for f in features if interface._is_competitor_lag_zero(f)]
if forbidden_features:
    raise ValueError(
        f"Lag-0 competitor features detected: {forbidden_features}\n"
        "These violate causal identification (simultaneity bias).\n"
        "Use lagged competitors (t1, t2, t3...) instead."
    )

print("\n✓ All features validated")
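
The body of `interface._is_competitor_lag_zero` is not shown in this notebook. As a rough sketch of what a name-based check might look like, the hypothetical function below flags a feature when its name starts with `competitor_` and ends in `_t0` or `_current`. Both suffix conventions are assumptions inferred from the feature names above, not the interface's actual rule.

```python
import re

# Hypothetical pattern: competitor features whose suffix marks the current period.
LAG_ZERO_PATTERN = re.compile(r"^competitor_.*_(t0|current)$")

def looks_like_competitor_lag_zero(feature: str) -> bool:
    """Return True if the name matches the assumed lag-0 competitor pattern."""
    return bool(LAG_ZERO_PATTERN.match(feature))

candidates = [
    "prudential_rate_current",  # own rate: lag-0 is allowed
    "competitor_mid_t2",        # lagged competitor: allowed
    "competitor_mid_current",   # contemporaneous competitor: flagged
    "competitor_mid_t0",        # contemporaneous competitor: flagged
]
flagged = [name for name in candidates if looks_like_competitor_lag_zero(name)]
print(flagged)
```

The own rate is deliberately not flagged: it is the treatment variable, whereas a same-period competitor rate would introduce the simultaneity bias described in the error message above.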

## Section 6: Run Inference

**Note**: As documented in `CURRENT_WORK.md`, `interface.run_inference()` is not yet fully implemented and currently returns hardcoded zeros.

For production use, this notebook calls the legacy `center_baseline()` function directly until the interface is fully implemented.
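
The internals of `center_baseline()` are not shown here. To illustrate the general idea of a bootstrap ensemble, the sketch below resamples rows with replacement, fits a weighted least-squares line to each resample, and collects the coefficients. `bootstrap_linear_ensemble` and the synthetic data are hypothetical and stand in for the real function and dataset; the legacy estimator may differ.

```python
import numpy as np

def bootstrap_linear_ensemble(X, y, sample_weight, n_estimators, seed):
    """Fit weighted least squares on bootstrap resamples; return per-fit coefficients."""
    rng = np.random.default_rng(seed)
    n_rows = X.shape[0]
    design = np.column_stack([np.ones(n_rows), X])  # prepend an intercept column
    coefs = np.empty((n_estimators, design.shape[1]))
    for b in range(n_estimators):
        idx = rng.integers(0, n_rows, size=n_rows)  # resample rows with replacement
        root_w = np.sqrt(sample_weight[idx])        # WLS via square-root weight scaling
        solution = np.linalg.lstsq(
            design[idx] * root_w[:, None], y[idx] * root_w, rcond=None
        )[0]
        coefs[b] = solution
    return coefs

# Synthetic data with known coefficients (intercept 1.0, slopes 2.0 and -0.5).
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(200, 2))
y_toy = 1.0 + 2.0 * X_toy[:, 0] - 0.5 * X_toy[:, 1] + rng.normal(scale=0.1, size=200)
w_toy = 0.999 ** np.arange(200, 0, -1)

coefs = bootstrap_linear_ensemble(X_toy, y_toy, w_toy, n_estimators=200, seed=42)
mean_coefs = coefs.mean(axis=0)
print("Mean coefficients (intercept, x0, x1):", mean_coefs.round(3))
```

The spread of `coefs` across resamples is what later yields uncertainty estimates; the ensemble mean serves as the point estimate.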

In [None]:
# Import legacy inference function (workaround for interface limitation)
from src.models.inference import center_baseline

# Prepare data for inference
X = df_model[features]
y = df_model[target]
weights = df_model["weight"]

# Run bootstrap ensemble inference
print("Running bootstrap ensemble inference...")
print(f"  Observations: {len(X):,}")
print(f"  Features: {len(features)}")
print(f"  Bootstrap samples: 1000")
print()

results = center_baseline(
    X=X,
    y=y,
    features=features,
    n_estimators=1000,
    sample_weight=weights,
    random_state=RANDOM_SEED
)

print("Inference complete!")

## Section 7: Coefficient Validation

Use the interface's methodology to validate that the coefficients meet the economic constraints.
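
The rules inside `interface.validate_coefficients()` are not reproduced in this notebook. A minimal sketch of a sign check, assuming the two rules stated in the Purpose section (own rate positive, competitor negative) and name prefixes taken from the feature list, could look like this. `check_sign_constraints` is hypothetical, and the toy coefficient values are made up for illustration rather than taken from model output.

```python
def check_sign_constraints(coefficients: dict) -> dict:
    """Classify coefficients by whether their sign matches an assumed expected sign."""
    # Assumed rules: (feature-name prefix, expected sign)
    rules = [("prudential_rate", "positive"), ("competitor_", "negative")]
    report = {"passed": [], "violated": []}
    for feature, coef in coefficients.items():
        for prefix, expected in rules:
            if not feature.startswith(prefix):
                continue
            actual = "positive" if coef > 0 else "negative" if coef < 0 else "zero"
            entry = {"feature": feature, "coefficient": coef,
                     "expected": expected, "actual": actual}
            report["passed" if actual == expected else "violated"].append(entry)
    return report

toy = {"prudential_rate_current": 0.8, "competitor_mid_t2": 0.1, "VIXCLS": -0.02}
report = check_sign_constraints(toy)
print("passed:", [e["feature"] for e in report["passed"]])
print("violated:", [e["feature"] for e in report["violated"]])
```

Features without a matching rule, such as `VIXCLS` here, are left out of both lists because no sign is asserted for them.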

In [None]:
# Extract coefficients from results
coefficients = dict(zip(features, results['coefficients']))

# Display coefficients
print("Model Coefficients:")
print("=" * 60)
for feature, coef in coefficients.items():
    # The "+" format flag keeps the sign adjacent to the number inside the padded field
    print(f"  {feature:30s}: {coef:+10.6f}")

print()

In [None]:
# Validate against RILA methodology constraints
validation = interface.validate_coefficients(coefficients)

print("Economic Constraint Validation:")
print("=" * 60)

if validation["passed"]:
    print("\n✓ PASSED Constraints:")
    for item in validation["passed"]:
        print(f"  {item['feature']:30s}: {item['coefficient']:+.4f} (expected {item['expected']})")

if validation["violated"]:
    print("\n❌ VIOLATED Constraints:")
    for item in validation["violated"]:
        print(f"  {item['feature']:30s}: {item['coefficient']:+.4f}")
        print(f"    Expected {item['expected']}, got {item['actual']}")

# Overall status
print()
if not validation["violated"]:
    print("✓ All economic constraints satisfied")
else:
    print(f"⚠ {len(validation['violated'])} constraint(s) violated")

## Section 8: Model Fit Metrics

In [None]:
# Display model fit statistics
print("Model Fit Metrics:")
print("=" * 60)
print(f"  R² Score:            {results['r2']:.4f}")
print(f"  Mean Squared Error:  {results['mse']:.2f}")
print(f"  Mean Absolute Error: {results['mae']:.2f}")
print()
print(f"  Training samples:    {len(X):,}")
print(f"  Features used:       {len(features)}")
print("  Bootstrap samples:   1000")
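
For reference, the three metrics can be computed from actual and predicted values as sketched below. These are the standard unweighted definitions; whether `center_baseline()` applies sample weights when computing `r2`, `mse`, and `mae` is not visible here, so treat `fit_metrics` as a hypothetical illustration.

```python
import numpy as np

def fit_metrics(y_true, y_pred) -> dict:
    """Unweighted R², mean squared error, and mean absolute error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    ss_res = np.sum(residuals ** 2)                   # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total variation around the mean
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mse": float(np.mean(residuals ** 2)),
        "mae": float(np.mean(np.abs(residuals))),
    }

metrics = fit_metrics([3.0, 5.0, 7.0, 9.0], [2.5, 5.5, 7.0, 9.0])
print(metrics)
```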

## Section 9: Rate Scenario Analysis

In [None]:
# Define rate scenarios for prediction
from src.models.inference import rate_adjustments

# Generate rate scenarios from -200 to +200 bps in 10 bps increments
# (linspace fixes the point count; np.arange with a fractional step can gain or lose the endpoint)
rate_scenarios = np.linspace(-0.02, 0.02, 41)  # -2% to +2% in 0.1% steps

print(f"Running predictions for {len(rate_scenarios)} rate scenarios...")

# Get predictions for each scenario
predictions_df = rate_adjustments(
    model=results['model'],
    X=X,
    features=features,
    rate_range=rate_scenarios,
    baseline_sales=df_model['sales'].iloc[-1]
)

print(f"Predictions generated for scenarios from {rate_scenarios[0]*100:.1f}% to {rate_scenarios[-1]*100:.1f}%")
print(f"\nPreview of predictions:")
print(predictions_df.head())
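
How `rate_adjustments()` builds its output is not shown in this notebook. One common way to run such a scenario sweep is to shift the own-rate column by each delta, predict, and average, as in the sketch below. `scenario_predictions`, `toy_predict`, the column names, and the toy coefficients are all hypothetical.

```python
import numpy as np
import pandas as pd

def scenario_predictions(predict, X, rate_column, rate_deltas):
    """Shift one column by each delta and record the mean prediction."""
    rows = []
    for delta in rate_deltas:
        shifted = X.copy()
        shifted[rate_column] = shifted[rate_column] + delta
        rows.append({
            "rate_change_bps": int(round(float(delta) * 10_000)),
            "mean_prediction": float(np.mean(predict(shifted))),
        })
    return pd.DataFrame(rows)

def toy_predict(frame):
    """Toy linear model standing in for a fitted estimator."""
    return 100.0 + 5_000.0 * frame["own_rate"] - 2_000.0 * frame["competitor_rate"]

X_toy = pd.DataFrame({"own_rate": [0.10, 0.12], "competitor_rate": [0.11, 0.11]})
deltas = np.linspace(-0.02, 0.02, 5)
scenarios = scenario_predictions(toy_predict, X_toy, "own_rate", deltas)
print(scenarios)
```

Only the own-rate column moves in each scenario; competitor and economic features stay at their observed values, which is what makes the sweep an own-price elasticity curve.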

## Section 10: Confidence Intervals

In [None]:
# Calculate confidence intervals for predictions
from src.models.inference import confidence_interval

print("Calculating bootstrap confidence intervals...")

ci_results = confidence_interval(
    X=X,
    y=y,
    features=features,
    rate_scenarios=rate_scenarios,
    n_bootstrap=100,  # Use 100 for speed; increase for production
    confidence_level=0.95,
    sample_weight=weights,
    random_state=RANDOM_SEED
)

print("Confidence intervals calculated")
print("\nSample confidence intervals (first 5 scenarios):")
# Bound by the scenario count: len(ci_results) would count keys if ci_results is a dict
for i in range(min(5, len(rate_scenarios))):
    scenario = rate_scenarios[i] * 100
    lower = ci_results['lower'][i]
    upper = ci_results['upper'][i]
    mean = ci_results['mean'][i]
    print(f"  Scenario {scenario:+.1f}%: [{lower:,.0f}, {upper:,.0f}] (mean: {mean:,.0f})")
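
The method used inside `confidence_interval()` is not shown here. A percentile bootstrap is one standard way to turn an ensemble of predictions into an interval, sketched below under that assumption. `percentile_interval` is hypothetical, and the input is synthetic: a matrix with one row per bootstrap fit and one column per rate scenario.

```python
import numpy as np

def percentile_interval(bootstrap_predictions, confidence_level=0.95):
    """Per-scenario percentile interval from an (n_bootstrap, n_scenarios) array."""
    alpha = 1.0 - confidence_level
    lower = np.percentile(bootstrap_predictions, 100 * alpha / 2, axis=0)
    upper = np.percentile(bootstrap_predictions, 100 * (1 - alpha / 2), axis=0)
    mean = bootstrap_predictions.mean(axis=0)
    return {"lower": lower, "upper": upper, "mean": mean}

# Synthetic ensemble: 1000 bootstrap fits for 3 scenarios.
rng = np.random.default_rng(42)
toy_preds = rng.normal(loc=[100.0, 110.0, 120.0], scale=5.0, size=(1000, 3))
ci = percentile_interval(toy_preds)

for j in range(toy_preds.shape[1]):
    print(f"Scenario {j}: [{ci['lower'][j]:,.1f}, {ci['upper'][j]:,.1f}] (mean: {ci['mean'][j]:,.1f})")
```

With only 100 resamples, as in the cell above, the 2.5th and 97.5th percentiles rest on a handful of draws each, which is why a larger `n_bootstrap` is suggested for production.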

## Section 11: Visualization

In [None]:
# Prepare visualization data
viz_data_pct = prepare_visualization_data_pct(predictions_df, ci_results)
viz_data_dollars = prepare_visualization_data_dollars(predictions_df, ci_results, df_model['sales'].iloc[-1])

# Generate plots
fig_pct = generate_price_elasticity_visualization_pct(viz_data_pct)
fig_dollars = generate_price_elasticity_visualization_dollars(viz_data_dollars)

plt.show()

## Section 12: Export Results

In [None]:
# Create output directories
output_dir = Path("outputs/inference_1y10b")
bi_dir = Path("BI_TEAM_1Y10B")

output_dir.mkdir(parents=True, exist_ok=True)
bi_dir.mkdir(parents=True, exist_ok=True)

# Save visualizations
save_visualization_files(
    fig_pct=fig_pct,
    fig_dollars=fig_dollars,
    output_dir=output_dir,
    product_code="1Y10B"
)

# Export CSV files for Tableau
export_csv_files(
    predictions_df=predictions_df,
    ci_results=ci_results,
    coefficients=coefficients,
    output_dir=bi_dir,
    product_code="1Y10B"
)

print(f"Results exported to:")
print(f"  Visualizations: {output_dir}")
print(f"  Tableau CSVs: {bi_dir}")

## Summary

### What This Notebook Does

1. **Creates interface** for 1Y10B with RILA methodology
2. **Loads processed data** from data pipeline notebook
3. **Validates features** against causal framework (no lag-0 competitors)
4. **Runs inference** using bootstrap ensemble
5. **Validates coefficients** against economic constraints
6. **Generates predictions** for rate scenarios
7. **Exports results** for business intelligence

### Architecture Benefits

| Feature | Legacy Approach | Refactored Approach |
|---------|----------------|---------------------|
| Constraint validation | Manual | Automatic (methodology) |
| Lag-0 detection | Manual checks | Built-in validation |
| Product switching | Edit hardcoded values | Change product_code |
| Economic rules | Scattered in code | Centralized in methodology |

### Next Steps

1. Run forecasting notebook: `02_time_series_forecasting_refactored.ipynb`
2. Compare results with legacy inference notebook
3. Validate mathematical equivalence
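
Step 3 could be checked with a tolerance-based comparison. The sketch below assumes both notebooks expose their coefficients as dicts keyed by feature name; `coefficients_match`, the default tolerance, and the toy values are hypothetical choices rather than an agreed acceptance criterion.

```python
import numpy as np

def coefficients_match(legacy: dict, refactored: dict, rtol: float = 1e-9) -> bool:
    """Return True if both dicts have the same keys and numerically close values."""
    if legacy.keys() != refactored.keys():
        return False
    return all(
        np.isclose(legacy[name], refactored[name], rtol=rtol, atol=0.0)
        for name in legacy
    )

legacy_toy = {"prudential_rate_current": 0.8, "competitor_mid_t2": -0.3}
same_toy = {"prudential_rate_current": 0.8 + 1e-13, "competitor_mid_t2": -0.3}
drifted_toy = {"prudential_rate_current": 0.81, "competitor_mid_t2": -0.3}

print("Same within tolerance:", coefficients_match(legacy_toy, same_toy))
print("Drifted within tolerance:", coefficients_match(legacy_toy, drifted_toy))
```

Exact equality is usually too strict for floating-point pipelines, so a small relative tolerance is used; with identical seeds and data the two code paths should agree well within it.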