# Event Impact Modeling

## Modeling How Events Affect Financial Inclusion Indicators

This notebook implements Task 3, Event Impact Modeling: estimating how events (policies, product launches, infrastructure investments) affect financial inclusion indicators.

### Objectives:
1. Understand the Impact Data
2. Build the Event-Indicator Association Matrix
3. Represent Event Effects Over Time
4. Combine Multiple Event Effects
5. Test Model Against Historical Data
6. Review Comparable Country Evidence
7. Refine Estimates
8. Document Methodology

In [None]:
import sys
from pathlib import Path
import importlib

# Add the project root (parent of the notebook directory) to sys.path so the `src` package is importable
sys.path.insert(0, str(Path.cwd().parent))

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Optional: Plotly for interactive visualizations
try:
    import plotly.graph_objects as go
    import plotly.express as px
    PLOTLY_AVAILABLE = True
except ImportError:
    PLOTLY_AVAILABLE = False
    print("Plotly not available. Install with: pip install plotly")

# Import and reload modules
from src.models import event_impact, association_matrix, comparable_evidence
importlib.reload(event_impact)
importlib.reload(association_matrix)
importlib.reload(comparable_evidence)

from src.models.event_impact import EventImpactModeler
from src.models.association_matrix import AssociationMatrixBuilder
from src.models.comparable_evidence import ComparableEvidence
from src.utils.config import config

# Set style
sns.set_style("whitegrid")
plt.rcParams["figure.figsize"] = (12, 6)
%matplotlib inline

print("✓ Imports successful - modules reloaded")

## Step 1: Understand the Impact Data

Load the `impact_links` sheet and join it with the events records on `parent_id` so that each impact link carries its event details.
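
The join itself can be sketched with plain pandas. This is a minimal illustration, not the code inside `EventImpactModeler.load_impact_data()`: the toy rows and every column name other than `parent_id` (`record_id`, `event_name`, `related_indicator`, `impact_magnitude`) are assumptions made for the example.

```python
import pandas as pd

# Hypothetical toy data; only `parent_id` is named in this notebook.
events = pd.DataFrame({
    "record_id": ["EVT_0001", "EVT_0002"],
    "event_name": ["Telebirr launch", "M-Pesa launch"],
    "event_date": pd.to_datetime(["2021-05-17", "2023-08-01"]),
})
impact_links = pd.DataFrame({
    "parent_id": ["EVT_0001", "EVT_0001", "EVT_0002", "EVT_9999"],
    "related_indicator": ["ACC_MM_ACCOUNT", "USG_DIGITAL_PAYMENT",
                          "ACC_MM_ACCOUNT", "ACC_OWNERSHIP"],
    "impact_magnitude": [3.0, 1.5, 2.0, 0.5],
})

# Left join keeps every link; `indicator=True` adds a `_merge` column
# that flags links whose parent event is missing.
joined = impact_links.merge(
    events,
    left_on="parent_id",
    right_on="record_id",
    how="left",
    validate="many_to_one",  # each link must point at no more than one event
    indicator=True,
)
orphans = joined[joined["_merge"] == "left_only"]
print(f"{len(joined)} links, {len(orphans)} without a matching event")
```

A left join is used here so that links with no matching event stay visible instead of silently disappearing, which makes data-quality problems easier to spot before modeling.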

In [None]:
# Initialize modeler
impact_modeler = EventImpactModeler()

# Load impact data
impact_data = impact_modeler.load_impact_data()

print(f"Impact Links: {len(impact_data['impact_links'])} records")
print(f"Events: {len(impact_data['events'])} records")
print(f"Joined Data: {len(impact_data['joined_data'])} records")

# Display impact links
if not impact_data['impact_links'].empty:
    print("\nImpact Links Preview:")
    display(impact_data['impact_links'].head())

# Display joined data
if not impact_data['joined_data'].empty:
    print("\nJoined Impact-Event Data Preview:")
    display(impact_data['joined_data'].head())

### Impact Summary

Which events affect which indicators, and by how much?

In [None]:
# Get impact summary
impact_summary = impact_modeler.get_impact_summary()

if not impact_summary.empty:
    print("Impact Summary:")
    display(impact_summary)
    
    # Summary statistics
    print("\nSummary Statistics:")
    print(f"Total event-indicator relationships: {len(impact_summary)}")
    if 'impact_direction' in impact_summary.columns:
        print("\nImpact Directions:")
        print(impact_summary['impact_direction'].value_counts())
    if 'pillar' in impact_summary.columns:
        print("\nBy Pillar:")
        print(impact_summary['pillar'].value_counts())
else:
    print("No impact summary data available")

## Step 2: Build the Event-Indicator Association Matrix

Create a matrix that summarizes which events affect which indicators and by how much.
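
One way such a matrix could be assembled is a pivot of the joined links, with events as rows and indicators as columns. The sketch below is an assumption about the general shape of the computation, not the logic of `AssociationMatrixBuilder`; the column names and the `increase`/`decrease` labels are hypothetical.

```python
import pandas as pd

# Hypothetical joined links: one row per event-indicator relationship.
links = pd.DataFrame({
    "event_name": ["Telebirr launch", "Telebirr launch", "M-Pesa launch"],
    "indicator": ["ACC_MM_ACCOUNT", "USG_CASH_SHARE", "ACC_MM_ACCOUNT"],
    "impact_direction": ["increase", "decrease", "increase"],
    "impact_magnitude": [3.0, 1.5, 2.0],
})

# Turn direction + magnitude into one signed number per link.
sign = links["impact_direction"].map({"increase": 1.0, "decrease": -1.0})
links["signed_impact"] = sign * links["impact_magnitude"]

# Rows are events, columns are indicators; 0.0 means "no documented link".
matrix = links.pivot_table(
    index="event_name",
    columns="indicator",
    values="signed_impact",
    aggfunc="sum",
    fill_value=0.0,
)
print(matrix)
```

Filling missing cells with zero treats an undocumented link as "no effect", which is a modeling choice; leaving them as `NaN` would instead distinguish "unknown" from "zero".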

In [None]:
# Build association matrix
matrix_builder = AssociationMatrixBuilder(impact_modeler)
association_matrix = matrix_builder.build_association_matrix()

if not association_matrix.empty:
    print(f"Association Matrix Shape: {association_matrix.shape}")
    print(f"Events: {len(association_matrix.index)}")
    print(f"Indicators: {len(association_matrix.columns)}")
    
    # Display matrix
    print("\nAssociation Matrix:")
    display(association_matrix)
    
    # Get summary
    summary = matrix_builder.get_matrix_summary(association_matrix)
    print("\nMatrix Summary:")
    for key, value in summary.items():
        print(f"  {key}: {value}")
else:
    print("Could not build association matrix")

### Visualize Association Matrix

In [None]:
# Create heatmap visualization
if not association_matrix.empty:
    matrix_builder.visualize_matrix(association_matrix)
    
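    # Save to reports (create the figures directory if it does not exist yet)
    save_path = config.reports_dir / "figures" / "association_matrix_heatmap.png"
    save_path.parent.mkdir(parents=True, exist_ok=True)
    matrix_builder.visualize_matrix(association_matrix, save_path=save_path)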
    print(f"\n✓ Visualization saved to {save_path}")

## Step 3: Event Effect Representation Over Time

How do events affect indicators over time? Do effects happen immediately or build gradually?
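
To make the question concrete, the sketch below gives one possible reading of the three functional forms compared in this step. The definitions are assumptions for illustration and may differ from `represent_event_effect_over_time`: "immediate" is taken as a step at the end of the lag, "gradual" as a linear ramp over the lag period, and "distributed" as a logistic S-curve centred on the lag. The helper name `effect_curve` and the `horizon_months` parameter are hypothetical.

```python
import numpy as np
import pandas as pd


def effect_curve(event_date, magnitude, lag_months, effect_type, horizon_months=36):
    """Return a monthly Series of the effect (in pp) after an event.

    The three shapes are illustrative assumptions, not the modeler's definitions.
    """
    months = np.arange(horizon_months + 1)
    if effect_type == "immediate":
        # Step function: nothing until the lag has passed, then the full effect.
        values = np.where(months >= lag_months, magnitude, 0.0)
    elif effect_type == "gradual":
        # Linear ramp from 0 at the event date to the full effect at the lag.
        ramp_length = max(int(lag_months), 1)  # avoid division by zero
        values = magnitude * np.clip(months / ramp_length, 0.0, 1.0)
    elif effect_type == "distributed":
        # Logistic S-curve that reaches half the effect exactly at the lag.
        values = magnitude / (1.0 + np.exp(-(months - lag_months)))
    else:
        raise ValueError(f"Unknown effect_type: {effect_type!r}")
    index = pd.DatetimeIndex(
        [event_date + pd.DateOffset(months=int(m)) for m in months]
    )
    return pd.Series(values, index=index, name=effect_type)


curves = {
    kind: effect_curve(pd.Timestamp("2021-05-17"), 5.0, 6, kind)
    for kind in ("immediate", "gradual", "distributed")
}
# Compare the three shapes at 0, 3, 6 and 12 months after the event.
print(pd.DataFrame(curves).iloc[[0, 3, 6, 12]].round(2))
```

All three curves converge to the same long-run magnitude; they differ only in how quickly the effect is realized, which matters most when the forecast horizon is short relative to the lag.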

In [None]:
# Example: Represent event effect over time

# Example event: Telebirr launch (May 2021)
event_date = pd.Timestamp("2021-05-17")
impact_magnitude = 5.0  # Example: 5 percentage points
lag_months = 6  # 6 month lag

# Test different effect types
effect_types = ["immediate", "gradual", "distributed"]

fig, axes = plt.subplots(1, 3, figsize=(18, 5))

for idx, effect_type in enumerate(effect_types):
    effect_series = impact_modeler.represent_event_effect_over_time(
        event_date=event_date,
        impact_magnitude=impact_magnitude,
        lag_months=lag_months,
        effect_type=effect_type
    )
    
    axes[idx].plot(effect_series.index, effect_series.values, linewidth=2)
    axes[idx].axvline(event_date, color='red', linestyle='--', alpha=0.5, label='Event Date')
    axes[idx].set_title(f"{effect_type.capitalize()} Effect")
    axes[idx].set_xlabel("Date")
    axes[idx].set_ylabel("Impact Magnitude")
    axes[idx].legend()
    axes[idx].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("\nEffect representation over time for different functional forms")

## Step 4: Combining Multiple Event Effects

How do we combine effects from multiple events?
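
The three combination rules compared in this step can be sketched as follows. The formulas are assumptions, not the implementation of `combine_multiple_event_effects`: "additive" sums the effects, "multiplicative" compounds them as percentage growth factors, and "max" keeps the largest effect at each date. Dates covered by only one series are treated as zero effect for the other.

```python
import pandas as pd


def combine_effects(effects, method="additive"):
    """Combine effect series (in pp) on a shared date index."""
    # Outer-align on dates; a date missing from one series counts as no effect.
    aligned = pd.concat(effects, axis=1).sort_index().fillna(0.0)
    if method == "additive":
        return aligned.sum(axis=1)
    if method == "multiplicative":
        # Assumed rule: compound the effects as growth factors.
        return ((1.0 + aligned / 100.0).prod(axis=1) - 1.0) * 100.0
    if method == "max":
        return aligned.max(axis=1)
    raise ValueError(f"Unknown combination method: {method!r}")


# Two hypothetical effect series that overlap on 2021-08-01.
first = pd.Series([1.0, 2.0, 3.0],
                  index=pd.date_range("2021-06-01", periods=3, freq="MS"))
second = pd.Series([2.0, 2.0],
                   index=pd.date_range("2021-08-01", periods=2, freq="MS"))

for method in ("additive", "multiplicative", "max"):
    print(method, combine_effects([first, second], method).round(2).tolist())
```

For small effects the additive and multiplicative results are nearly identical; they diverge as magnitudes grow, while "max" gives a conservative lower bound that assumes the events fully overlap in whom they reach.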

In [None]:
# Example: Combine effects from multiple events
event1_effect = impact_modeler.represent_event_effect_over_time(
    event_date=pd.Timestamp("2021-05-17"),  # Telebirr
    impact_magnitude=3.0,
    lag_months=6,
    effect_type="gradual"
)

event2_effect = impact_modeler.represent_event_effect_over_time(
    event_date=pd.Timestamp("2023-08-01"),  # M-Pesa (Ethiopia launch)
    impact_magnitude=2.0,
    lag_months=3,
    effect_type="gradual"
)

# Combine using different methods
combination_methods = ["additive", "multiplicative", "max"]

fig, axes = plt.subplots(1, 3, figsize=(18, 5))

for idx, method in enumerate(combination_methods):
    combined = impact_modeler.combine_multiple_event_effects(
        [event1_effect, event2_effect],
        combination_method=method
    )
    
    axes[idx].plot(combined.index, combined.values, linewidth=2, label='Combined')
    axes[idx].plot(event1_effect.index, event1_effect.values, '--', alpha=0.5, label='Event 1')
    axes[idx].plot(event2_effect.index, event2_effect.values, '--', alpha=0.5, label='Event 2')
    axes[idx].set_title(f"{method.capitalize()} Combination")
    axes[idx].set_xlabel("Date")
    axes[idx].set_ylabel("Impact Magnitude")
    axes[idx].legend()
    axes[idx].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("\nCombining multiple event effects using different methods")

## Step 5: Test Model Against Historical Data

Validate the model against known historical impacts.
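
The comparison behind this step can be sketched as a relative-error check. This is a minimal illustration under assumptions: the helper `relative_error_pct` is hypothetical, the predicted value of 5.0pp is invented for the example, and the 20% tolerance mirrors the threshold used in the cell below.

```python
def relative_error_pct(predicted, observed):
    """Absolute prediction error as a percentage of the observed change."""
    if observed == 0:
        raise ValueError("Relative error is undefined when the observed change is 0")
    return abs(predicted - observed) / abs(observed) * 100.0


observed_change = 4.75   # pp, from 4.7% (2021) to 9.45% (2024)
predicted_change = 5.0   # pp, hypothetical model output for illustration
tolerance_pct = 20.0     # same threshold as the validation cell

error = relative_error_pct(predicted_change, observed_change)
verdict = "within acceptable range" if error < tolerance_pct else "needs refinement"
print(f"Relative error: {error:.1f}% ({verdict})")
```

Dividing by the observed change rather than the predicted one keeps the metric anchored to the data, so an overconfident prediction cannot shrink its own error.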

In [None]:
# Validate against known historical data
# Example: Telebirr launch impact on mobile money accounts
# Observed: 4.7% (2021) to 9.45% (2024) = +4.75pp

validation_result = impact_modeler.validate_against_historical_data(
    indicator_code="ACC_MM_ACCOUNT",
    event_id="EVT_0001",  # Telebirr launch
    observed_change=4.75,
    observed_period=("2021-05-01", "2024-12-31")
)

print("Validation Result:")
for key, value in validation_result.items():
    print(f"  {key}: {value}")

if validation_result.get("validated"):
    error = validation_result.get("relative_error_pct", 0)
    print(f"\n✓ Validation completed with {error:.1f}% relative error")
    if error < 20:
        print("  Model prediction is within acceptable range")
    else:
        print("  Model prediction needs refinement")

## Step 6: Review Comparable Country Evidence

For events where Ethiopian pre/post data is insufficient, use documented impacts from similar contexts.
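
Pooling several comparable cases into one estimate can be sketched with the standard library. This is an illustration, not the code of `ComparableEvidence.estimate_impact_from_evidence`: the Kenya figure is the example value used in the cell below, while "Country A" and "Country B" are hypothetical placeholders, not real evidence.

```python
from statistics import mean, median

evidence = [
    {"country": "Kenya", "impact_magnitude": 15.0, "lag_months": 12},      # example value from this notebook
    {"country": "Country A", "impact_magnitude": 8.0, "lag_months": 18},   # hypothetical
    {"country": "Country B", "impact_magnitude": 10.0, "lag_months": 24},  # hypothetical
]


def estimate_from_evidence(records, method="median"):
    """Aggregate comparable-country records into one impact estimate."""
    if not records:
        return None
    aggregate = {"median": median, "mean": mean}[method]
    magnitudes = [r["impact_magnitude"] for r in records]
    lags = [r["lag_months"] for r in records]
    return {
        "impact_magnitude": aggregate(magnitudes),
        "lag_months": aggregate(lags),
        "range": (min(magnitudes), max(magnitudes)),
        "n_records": len(records),
    }


estimate_sketch = estimate_from_evidence(evidence, method="median")
print(estimate_sketch)
```

The median is less sensitive than the mean to a single unusually large case, which matters when one country's experience may not transfer to the Ethiopian context.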

In [None]:
# Initialize comparable evidence manager
evidence_manager = ComparableEvidence()

# Example: Add evidence from Kenya (M-Pesa launch)
evidence_manager.add_evidence(
    event_type="product_launch",
    country="Kenya",
    indicator="ACC_MM_ACCOUNT",
    impact_magnitude=15.0,  # 15 percentage points
    lag_months=12,
    source="World Bank Findex 2011-2014",
    notes="M-Pesa launch in Kenya increased mobile money accounts significantly"
)

# Get evidence for estimation
evidence = evidence_manager.get_evidence("product_launch", "ACC_MM_ACCOUNT")
print(f"Comparable Evidence Found: {len(evidence)} records")
for e in evidence:
    print(f"  - {e['country']}: {e['impact_magnitude']}pp impact, {e['lag_months']} month lag")

# Estimate impact from evidence
estimate = evidence_manager.estimate_impact_from_evidence(
    event_type="product_launch",
    indicator="ACC_MM_ACCOUNT",
    method="median"
)

print("\nEstimated Impact from Comparable Evidence:")
for key, value in estimate.items():
    print(f"  {key}: {value}")

## Step 7: Refine Estimates

Based on validation results, refine impact estimates and document reasoning.
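
The refinement rule used in this step can be sketched as a small helper. The function name `suggest_adjustment` and the example prediction of 3.0pp are hypothetical; the 20% tolerance and the `observed / predicted` factor follow the logic of the cell below.

```python
def suggest_adjustment(predicted, observed, tolerance_pct=20.0):
    """Suggest a multiplicative correction when the relative error is too large."""
    if predicted == 0 or observed == 0:
        # No meaningful ratio can be formed; leave the estimate untouched.
        return {"adjust": False, "factor": 1.0, "bias": "undetermined"}
    error_pct = abs(predicted - observed) / abs(observed) * 100.0
    if error_pct < tolerance_pct:
        return {"adjust": False, "factor": 1.0, "bias": "within tolerance"}
    factor = observed / predicted
    bias = "underestimating" if factor > 1 else "overestimating"
    return {"adjust": True, "factor": factor, "bias": bias}


# Hypothetical case: the model predicted +3.0pp but +4.75pp was observed.
result = suggest_adjustment(predicted=3.0, observed=4.75)
print(f"Adjust: {result['adjust']}, factor: {result['factor']:.2f}, model is {result['bias']}")
```

A factor derived from a single validated event is a weak basis for rescaling every estimate, so it is best treated as a prompt for review and documented alongside the reasoning for any change.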

In [None]:
# Refinement analysis
print("Refinement Analysis:")
print("\n1. Review validation results")
print("2. Identify systematic biases")
print("3. Adjust magnitude estimates if needed")
print("4. Update lag assumptions based on observed patterns")
print("5. Document reasoning for all adjustments")

# Example refinement logic
if validation_result.get("validated"):
    error = validation_result.get("relative_error_pct", 0)
    predicted = validation_result.get("predicted_impact", 0)
    observed = validation_result.get("observed_change", 0)
    
    print("\nValidation Metrics:")
    print(f"  Predicted: {predicted:.2f}pp")
    print(f"  Observed: {observed:.2f}pp")
    print(f"  Error: {error:.1f}%")
    
    if error >= 20:  # same 20% threshold as the validation step
        adjustment_factor = observed / predicted if predicted != 0 else 1.0
        print(f"\n  Suggested adjustment factor: {adjustment_factor:.2f}")
        print(f"  This suggests the model may be {'underestimating' if adjustment_factor > 1 else 'overestimating'} impacts")
    else:
        print("\n  Model predictions are within acceptable range")
        print("  No major refinements needed")

## Step 8: Document Methodology

Document the modeling approach, assumptions, and limitations.

In [None]:
# Key insights and methodology documentation
methodology_notes = """
## Event Impact Modeling Methodology

### Key Findings:
1. **Association Matrix**: Created matrix showing event-indicator relationships
2. **Effect Representation**: Implemented three functional forms (immediate, gradual, distributed)
3. **Combination Methods**: Tested additive, multiplicative, and max combination approaches
4. **Validation**: Compared predicted vs. observed impacts for known events

### Assumptions:
- Events have lagged impacts (typically 6-18 months)
- Effects are approximately linear in magnitude
- Events have independent effects (additive combination)
- Impact patterns are consistent over time

### Limitations:
- Limited historical data for many event types
- Multiple events occur simultaneously (confounding)
- Effects may not be linear at extremes
- Country-specific factors may differ

### Next Steps:
- Refine estimates based on validation results
- Incorporate comparable country evidence
- Document all impact estimates with sources
- Prepare for forecasting phase
"""

print(methodology_notes)

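# Save methodology (create the reports directory if it does not exist yet)
methodology_path = config.reports_dir / "impact_modeling_methodology.md"
methodology_path.parent.mkdir(parents=True, exist_ok=True)
with open(methodology_path, "w", encoding="utf-8") as f:
    f.write(methodology_notes)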

print(f"\n✓ Methodology documentation saved to {methodology_path}")

## Summary

This notebook has:
1. ✓ Loaded and understood impact data
2. ✓ Built event-indicator association matrix
3. ✓ Represented event effects over time
4. ✓ Combined multiple event effects
5. ✓ Validated model against historical data
6. ✓ Reviewed comparable country evidence
7. ✓ Refined estimates based on validation
8. ✓ Documented methodology

**Next**: Use these impact models for forecasting in Task 4.