# RILA Price Elasticity: Architecture Walkthrough

**Time**: ~25 minutes

This notebook walks you through the full data flow from loading data to interpreting elasticity estimates.

By the end, you'll understand:
1. How the Dependency Injection (DI) pattern works
2. How to run inference for any RILA product
3. How to interpret results
4. What the economic constraints mean

---

## 1. Creating an Interface

The `UnifiedNotebookInterface` is the main entry point for all analysis.

**Key concept**: Dependency Injection (DI) means the interface accepts *components* that can be swapped:
- `DataAdapter`: Where to get data (AWS, local, fixtures)
- `AggregationStrategy`: How to combine competitor rates
- `Methodology`: Economic constraint rules
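
To make the idea concrete, here is a minimal, self-contained sketch of dependency injection applied to the aggregation step. Everything in it (`ToyInterface`, `WeightedMean`, `SimpleMean`, the `aggregate` method, and the sample numbers) is hypothetical and only illustrates the pattern; it is not the project's actual `AggregationStrategy` API.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


class AggregationStrategy(Protocol):
    """Hypothetical contract: collapse several competitor rates into one number."""

    def aggregate(self, rates: Sequence[float], weights: Sequence[float]) -> float: ...


class WeightedMean:
    def aggregate(self, rates, weights):
        # Weight each competitor rate (e.g. by market share) and normalize
        return sum(r * w for r, w in zip(rates, weights)) / sum(weights)


class SimpleMean:
    def aggregate(self, rates, weights):
        # Deliberately ignores the weights
        return sum(rates) / len(rates)


@dataclass
class ToyInterface:
    # The strategy is passed in ("injected") rather than constructed internally
    aggregation: AggregationStrategy

    def competitor_rate(self, rates, weights):
        return self.aggregation.aggregate(rates, weights)


# Made-up rates and weights, for illustration only
rates, weights = [0.10, 0.12, 0.20], [1.0, 1.0, 2.0]
weighted = ToyInterface(WeightedMean()).competitor_rate(rates, weights)
simple = ToyInterface(SimpleMean()).competitor_rate(rates, weights)
print(f"weighted={weighted:.3f}, simple={simple:.3f}")
```

The calling code (`competitor_rate`) never changes; only the injected component does.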

Let's start with **fixture mode** (no AWS credentials needed).

In [None]:
# Setup: Add project root to path (for notebook execution)
import sys
from pathlib import Path
project_root = Path().resolve().parent.parent
if str(project_root) not in sys.path:
    sys.path.insert(0, str(project_root))

# Standard imports
import pandas as pd
import numpy as np

# Our interface
from src.notebooks import create_interface

In [None]:
# Create interface for RILA 6Y20B product
# environment="fixture" uses test data - safe to run without AWS credentials

# Get the fixtures directory (relative to project root)
fixtures_dir = project_root / "tests" / "fixtures" / "rila"

interface = create_interface(
    "6Y20B", 
    environment="fixture",
    adapter_kwargs={"fixtures_dir": str(fixtures_dir)}
)

# Explore what was configured
print("Product Configuration:")
print(f"  Code: {interface.product.product_code}")
print(f"  Type: {interface.product.product_type}")
print(f"  Buffer: {interface.product.buffer_level * 100:.0f}%")
print(f"  Term: {interface.product.term_years} years")
print()
print(f"Data Adapter: {interface.adapter.source_type}")
print(f"Aggregation Strategy: {interface.aggregation}")
print(f"Methodology: {interface.methodology}")

### What just happened?

The `create_interface` factory function:
1. Looked up `6Y20B` in the product registry
2. Created a `FixtureAdapter` (because `environment="fixture"`)
3. Selected `WeightedAggregation` (default for RILA products)
4. Loaded `RILAMethodology` (economic constraints for RILA)

**Why this matters**: You can run the *exact same analysis code* on test data or production data - just change `environment`.
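
The environment switch can be pictured as a small lookup table from environment name to adapter class. The sketch below is an assumption about how such a factory might be organized, not the real `create_interface` implementation; `ADAPTERS`, `make_adapter`, `FixtureSource`, and `AwsSource` are invented names.

```python
from dataclasses import dataclass


@dataclass
class FixtureSource:
    fixtures_dir: str
    source_type: str = "fixture"


@dataclass
class AwsSource:
    config: dict
    source_type: str = "aws"


# Hypothetical registry: environment name -> adapter class
ADAPTERS = {"fixture": FixtureSource, "aws": AwsSource}


def make_adapter(environment: str, adapter_kwargs: dict):
    """Pick the adapter class for an environment and build it from kwargs."""
    try:
        adapter_cls = ADAPTERS[environment]
    except KeyError:
        raise ValueError(
            f"Unknown environment {environment!r}; expected one of {sorted(ADAPTERS)}"
        ) from None
    return adapter_cls(**adapter_kwargs)


local = make_adapter("fixture", {"fixtures_dir": "tests/fixtures/rila"})
print(local.source_type)
```

Because the choice of adapter is isolated in one place, downstream analysis code only ever sees "an adapter" and does not need to know which one it received.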

---

## 2. Understanding the Data

Let's load the pre-processed fixture data and explore its structure.

In [None]:
# Load the fixture data directly (bypassing the adapter for exploration)
# In production, you'd use interface.load_data()
df = pd.read_parquet(fixtures_dir / "final_weekly_dataset.parquet")

print(f"Dataset shape: {df.shape[0]} rows x {df.shape[1]} columns")
print(f"\nThis is weekly data from {df['date'].min()} to {df['date'].max()}")

In [None]:
# Key column categories
target_cols = [c for c in df.columns if 'sales_target' in c][:5]
own_rate_cols = [c for c in df.columns if 'prudential_rate' in c][:5]
competitor_cols = [c for c in df.columns if 'competitor_mid' in c][:5]
economic_cols = [c for c in df.columns if 'treasury' in c or 'volatility' in c][:4]

print("Target columns (what we predict):")
for col in target_cols:
    print(f"  {col}")
    
print("\nOwn rate columns (treatment variable):")
for col in own_rate_cols:
    print(f"  {col}")
    
print("\nCompetitor rate columns (controls):")
for col in competitor_cols:
    print(f"  {col}")
    
print("\nEconomic indicator columns (controls):")
for col in economic_cols:
    print(f"  {col}")

### Feature Naming Convention

**Internal naming (used by inference):**
```
{entity}_{metric}_{time}

Examples:
- prudential_rate_t0    -> Prudential's rate at time t (current)
- prudential_rate_t1    -> Prudential's rate at time t-1
- competitor_weighted_t2 -> Competitor weighted mean at t-2
- sales_target_t0       -> Sales at time t (what we predict)
```

**Legacy naming (in fixture files):**
```
- prudential_rate_current  -> auto-converted to prudential_rate_t0
- competitor_mid_t2        -> auto-converted to competitor_weighted_t2
```

The interface automatically normalizes legacy names when you call `run_inference()`.

The lag suffix (`_t1`, `_t2`, etc.) gives the number of weeks before the current week `t` at which the value was observed; `_t0` is the current week.
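
As a rough illustration of the renaming described above, the two documented mappings can be expressed as regular-expression substitutions. This is a sketch, not the interface's actual normalizer: `normalize_feature_name` and `lag_weeks` are hypothetical helpers, and any legacy patterns beyond the two shown in this notebook are not covered.

```python
import re


def normalize_feature_name(name: str) -> str:
    """Map a legacy column name to the internal convention (two known rules only)."""
    # competitor_mid_* -> competitor_weighted_*
    name = re.sub(r"^competitor_mid(?=_|$)", "competitor_weighted", name)
    # *_current -> *_t0
    name = re.sub(r"_current$", "_t0", name)
    return name


def lag_weeks(name: str):
    """Return the lag encoded in a trailing _t<N> suffix, or None if absent."""
    match = re.search(r"_t(\d+)$", name)
    return int(match.group(1)) if match else None


for legacy in ["prudential_rate_current", "competitor_mid_t2"]:
    internal = normalize_feature_name(legacy)
    print(f"{legacy} -> {internal} (lag {lag_weeks(internal)})")
```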

---

## 3. The Causal Framework (Critical!)

Before running any model, you must understand **why we use lagged features**.

### The Problem: Simultaneity Bias

If competitor rates (`C_t`) and our sales (`Sales_t`) both respond to the same market conditions in week `t`, a regression of `Sales_t` on `C_t` picks up that shared response as a **spurious correlation** instead of the causal effect of competitor pricing.

```
    Market Conditions
          |
    +-----+-----+
    |           |
    v           v
  C_t        Sales_t    <- Both driven by market -> spurious correlation!
```

### The Solution: Use Lagged Competitors

```
  C_{t-1}     Market Conditions
    |               |
    |         +-----+
    |         |
    v         v
          Sales_t    <- C_{t-1} is predetermined, not simultaneous!
```

**CRITICAL RULE**: Never use `competitor_*_current` or `competitor_*_t0` features in the model!
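
A small simulation can illustrate the diagrams above. It is a toy model under strong assumptions: market conditions are independent from week to week, competitor rates have *no* causal effect on sales, and all numbers are synthetic. In real data, persistent market conditions would leave some correlation in the lagged series too, so treat this as intuition only.

```python
import random


def pearson(x, y):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n_weeks = 5000

# Shared driver: weekly market conditions (assumed i.i.d. here)
market = [rng.gauss(0, 1) for _ in range(n_weeks)]
# Both series react to the market; competitor never enters the sales equation
competitor = [m + rng.gauss(0, 1) for m in market]
sales = [m + rng.gauss(0, 1) for m in market]

corr_same_week = pearson(competitor, sales)        # C_t vs Sales_t
corr_lagged = pearson(competitor[:-1], sales[1:])  # C_{t-1} vs Sales_t

print(f"corr(C_t, Sales_t)     = {corr_same_week:.3f}")
print(f"corr(C_(t-1), Sales_t) = {corr_lagged:.3f}")
```

Even though competitor rates do nothing to sales in this toy world, the same-week correlation is clearly positive, while the lagged correlation is close to zero.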

In [None]:
# Let's verify our data has the forbidden lag-0 columns
# (they exist in data for completeness, but should NOT be used in modeling)

lag_0_cols = [c for c in df.columns if 'competitor' in c.lower() 
              and ('_current' in c or '_t0' in c)]

print(f"Found {len(lag_0_cols)} lag-0 competitor columns (FORBIDDEN in models):")
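for col in lag_0_cols[:5]:
    print(f"  {col}")
# Only report a remainder when more than 5 columns were found
remaining = len(lag_0_cols) - 5
if remaining > 0:
    print(f"  ... and {remaining} more")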

### Why is Own Rate (`prudential_rate_t0`) OK?

We *control* our own rate: Prudential sets the cap rate **before** observing that week's application-date sales. The contract-issue-date lag (19-76 days) creates an "identification window" during which the rate cannot be adjusted in response to those sales.

This is why the model uses:
- `prudential_rate_t0` (P_lag_0) - **OK**
- `competitor_weighted_t2`, `competitor_weighted_t3` - **OK** (lagged)
- `competitor_weighted_t0` - **FORBIDDEN**

---

## 4. Running Inference

Now let's run the elasticity model. The interface handles:
- Feature selection (or you can specify features)
- Leakage validation (rejects lag-0 competitors)
- Constrained OLS fitting
- Result packaging

In [None]:
# Define features for the model
# Note: We use lagged competitor rates (t2, t3) - never t0!
# 
# Internal naming convention (2026-01-26):
#   - _t0 for current time (not _current)
#   - competitor_weighted (not competitor_mid)
#
features = [
    "prudential_rate_t0",         # Own rate (treatment) - OK at t=0
    "competitor_weighted_t2",     # Competitor rate at t-2 (lagged - OK)
    "competitor_weighted_t3",     # Competitor rate at t-3 (lagged - OK)
    "econ_treasury_5y_t1",        # Treasury rate at t-1 (control)
]

# Verify none of these are forbidden lag-0 competitors
for f in features:
    is_forbidden = interface._is_competitor_lag_zero(f)
    status = "FORBIDDEN" if is_forbidden else "OK"
    print(f"{f}: {status}")

In [None]:
# Run inference
# Note: The interface will validate features and raise an error if lag-0 competitors are detected
# Data is automatically normalized (prudential_rate_current -> prudential_rate_t0, etc.)

results = interface.run_inference(
    data=df,
    features=features,
    config={
        "target_column": "sales_target_t0",  # Internal naming convention
        "n_bootstrap": 100,  # Fewer iterations for demo speed
    }
)

print("Inference completed!")
print(f"\nNumber of observations: {results['n_observations']}")

---

## 5. Interpreting Results

The results contain:
- `coefficients`: Feature weights from the model
- `elasticity_point`: Main estimate of price sensitivity
- `model_fit`: R-squared and other fit metrics
- `n_observations`: Number of observations used (printed in the previous cell)

In [None]:
# View coefficients
print("Model Coefficients:")
print("=" * 50)

for feature, coef in results["coefficients"].items():
    sign = "+" if coef > 0 else ""
    print(f"  {feature}: {sign}{coef:.6f}")

In [None]:
# Validate economic constraints
validation = interface.validate_coefficients(results["coefficients"])

print("\nEconomic Constraint Validation:")
print("=" * 50)

if validation["passed"]:
    print("\nPASSED constraints:")
    for item in validation["passed"]:
        print(f"  {item['feature']}: {item['coefficient']:.4f}")
        print(f"    {item['reason']}")

if validation["violated"]:
    print("\nVIOLATED constraints:")
    for item in validation["violated"]:
        print(f"  {item['feature']}: {item['coefficient']:.4f}")
        print(f"    {item['reason']}")

if validation["warnings"]:
    print("\nCONTEXT-DEPENDENT (warnings):")
    for item in validation["warnings"]:
        print(f"  {item['feature']}: {item['coefficient']:.4f}")
        print(f"    {item['reason']}")

### Understanding the Signs

| Coefficient | Expected | Interpretation |
|-------------|----------|----------------|
| `prudential_rate_t0` | **Positive** | Higher cap rate (yield) -> more sales |
| `competitor_weighted_*` | **Negative** | Higher competitor rates -> customers switch away |

**Key insight**: The cap rate is a YIELD (customer benefit), not a price (customer cost), so the expected own-rate sign is positive. This is the opposite of traditional price elasticity!
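
The sign table can be turned into a simple check. The sketch below is a hypothetical stand-in for `validate_coefficients`, not its real implementation: `EXPECTED_SIGNS`, `check_signs`, the report keys, and the coefficient values are all invented for illustration.

```python
from fnmatch import fnmatchcase

# Patterns and signs copied from the table above
EXPECTED_SIGNS = {
    "prudential_rate_t0": "positive",
    "competitor_weighted_*": "negative",
}


def check_signs(coefficients):
    """Sort features into passed / violated / unconstrained by expected sign."""
    report = {"passed": [], "violated": [], "unconstrained": []}
    for feature, coef in coefficients.items():
        expected = next(
            (sign for pattern, sign in EXPECTED_SIGNS.items()
             if fnmatchcase(feature, pattern)),
            None,
        )
        if expected is None:
            report["unconstrained"].append(feature)
            continue
        # A coefficient of exactly zero satisfies neither sign
        ok = coef > 0 if expected == "positive" else coef < 0
        report["passed" if ok else "violated"].append(feature)
    return report


# Made-up coefficients: the competitor sign is deliberately wrong
report = check_signs({
    "prudential_rate_t0": 0.8,
    "competitor_weighted_t2": 0.3,
    "econ_treasury_5y_t1": -0.1,
})
print(report)
```

Features with no matching pattern are reported separately here, since the table above does not state an expected sign for the economic controls.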

In [None]:
# Model fit metrics
print("\nModel Fit:")
print("=" * 50)
for metric, value in results["model_fit"].items():
    # Handle both numeric and string values
    if isinstance(value, (int, float)):
        print(f"  {metric}: {value:.4f}")
    else:
        print(f"  {metric}: {value}")

---

## 6. What Would Happen with Lag-0 Competitors?

Let's demonstrate the validation system by trying to use forbidden features.

In [None]:
# This SHOULD fail - we're using lag-0 competitor features
forbidden_features = [
    "prudential_rate_t0",
    "competitor_weighted_t0",  # <- FORBIDDEN: lag-0 competitor
    "competitor_weighted_t2",
]

try:
    results_bad = interface.run_inference(
        data=df,
        features=forbidden_features,
        config={"target_column": "sales_target_t0"}
    )
    print("ERROR: This should have raised an exception!")
except ValueError as e:
    print("Good! The system correctly rejected lag-0 competitors:")
    print(f"\n{e}")

The system **automatically rejects** any attempt to use lag-0 competitor features. This is a safety mechanism to prevent causal identification violations.
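
For intuition, a guard of this kind could be as small as a name-pattern check run before fitting. The sketch below is an assumption about the general shape of such a check; `find_lag_zero_competitors` and `assert_no_leakage` are hypothetical names, and the interface's real validation may use different rules and messages.

```python
import re

# Competitor columns whose suffix marks the current week (_current or _t0)
_LAG_ZERO = re.compile(r"^competitor_.*_(current|t0)$", re.IGNORECASE)


def find_lag_zero_competitors(features):
    """Return the features that look like lag-0 competitor columns."""
    return [f for f in features if _LAG_ZERO.match(f)]


def assert_no_leakage(features):
    """Raise ValueError if any lag-0 competitor feature is present."""
    leaked = find_lag_zero_competitors(features)
    if leaked:
        raise ValueError(f"Lag-0 competitor features are not allowed: {leaked}")


candidate = ["prudential_rate_t0", "competitor_weighted_t0", "competitor_weighted_t2"]
try:
    assert_no_leakage(candidate)
except ValueError as err:
    print(err)
```

Note that `prudential_rate_t0` passes: the pattern targets only competitor columns, matching the rule that the own rate is allowed at lag 0.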

---

## 7. Production Usage (AWS)

In production, you'd use AWS data instead of fixtures. Here's what that looks like (don't run this without AWS credentials):

```python
# Production example (requires AWS credentials)
from src.notebooks import create_interface

aws_config = {
    "sts_endpoint_url": "https://sts.us-east-1.amazonaws.com",
    "role_arn": "arn:aws:iam::123456789:role/DataRole",
    "xid": "user123",
    "bucket_name": "my-data-bucket"
}

interface = create_interface(
    "6Y20B",
    environment="aws",
    adapter_kwargs={"config": aws_config}
)

# Rest of the code is identical!
df = interface.load_data()
results = interface.run_inference(df)
```

The only changes are setting `environment="aws"` and providing credentials. All analysis code stays the same.

---

## 8. Summary: Key Takeaways

### What You Learned

1. **Dependency Injection**: Swap data sources without changing analysis code
2. **Causal Identification**: Lag-0 competitors are forbidden (simultaneity bias)
3. **Economic Constraints**: Own rate positive, competitor rates negative
4. **Yield vs Price**: Cap rate is a yield, so higher = more demand (opposite of traditional elasticity)
5. **Naming Convention**: Internal uses `_t0` (not `_current`) and `competitor_weighted` (not `competitor_mid`)

### Critical Rules

| Rule | Why |
|------|-----|
| No lag-0 competitors | Avoids simultaneity bias |
| Own rate coefficient > 0 | Cap rate is a yield (customer benefit) |
| Competitor coefficient < 0 | Substitution effect |
| Use fixture mode for testing | Don't need AWS to develop |

### Next Steps

1. Read `knowledge/analysis/CAUSAL_FRAMEWORK.md` for deeper causal understanding
2. Read `knowledge/integration/LESSONS_LEARNED.md` for common pitfalls
3. Try `docs/onboarding/COMMON_TASKS.md` for practical recipes

---

## Appendix: Exploring the Methodology Rules

In [None]:
# View the constraint rules for RILA products
rules = interface.get_constraint_rules()

print("RILA Methodology Constraint Rules:")
print("=" * 50)
for rule in rules:
    # constraint_type may be an enum or string
    ctype = rule.constraint_type.value if hasattr(rule.constraint_type, 'value') else rule.constraint_type
    print(f"\n{ctype}:")
    print(f"  Pattern: {rule.feature_pattern}")
    print(f"  Expected sign: {rule.expected_sign}")
    print(f"  Rationale: {rule.business_rationale}")
    print(f"  Strict: {rule.strict}")

In [None]:
# View expected coefficient signs
expected_signs = interface.get_coefficient_signs()

print("\nExpected Coefficient Signs:")
print("=" * 50)
for pattern, sign in expected_signs.items():
    print(f"  Features matching '{pattern}': {sign}")