# Forecasting Access and Usage

## Forecasting Account Ownership and Digital Payment Usage (2025-2027)

This notebook implements Task 4 (Forecasting Access and Usage): it generates forecasts of the Account Ownership Rate and Digital Payment Usage for 2025-2027.

### Objectives:
1. Define Forecast Targets
2. Select Forecasting Approach
3. Generate Baseline and Event-Augmented Forecasts
4. Create Scenario Analysis (Optimistic, Base, Pessimistic)
5. Quantify Uncertainty with Confidence Intervals
6. Interpret Results and Key Uncertainties

In [None]:
import sys
from pathlib import Path
import importlib

# Add the project root (parent of the notebook directory) to sys.path so the `src` package is importable
sys.path.insert(0, str(Path.cwd().parent))

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Optional: Plotly for interactive visualizations
try:
    import plotly.graph_objects as go
    import plotly.express as px
    from plotly.subplots import make_subplots
    PLOTLY_AVAILABLE = True
except ImportError:
    PLOTLY_AVAILABLE = False
    print("Plotly not available. Install with: pip install plotly")

# Import and reload modules
from src.models import forecaster
from src.analysis import eda, visualizer
importlib.reload(forecaster)
importlib.reload(eda)
importlib.reload(visualizer)

from src.models.forecaster import ForecastModeler
from src.analysis.eda import EDAAnalyzer
from src.analysis.visualizer import DataVisualizer
from src.utils.config import config

# Set style
sns.set_style("whitegrid")
plt.rcParams["figure.figsize"] = (12, 6)
%matplotlib inline

print("✓ Imports successful - modules reloaded")

## Step 1: Define Forecast Targets

**Target 1: Account Ownership Rate (Access)**
- Indicator: % of adults with account at financial institution or mobile money
- Indicator Code: `ACC_OWNERSHIP`
- Pillar: `ACCESS`

**Target 2: Digital Payment Usage**
- Indicator: % of adults who made or received digital payment
- Indicator Code: `USG_DIGITAL_PAY` (fallback proxy if this series cannot be forecast: `ACC_MM_ACCOUNT`, pillar `ACCESS`)
- Pillar: `USAGE`

In [None]:
# Define targets
targets = {
    "access": {
        "indicator_code": "ACC_OWNERSHIP",
        "pillar": "ACCESS",
        "description": "Account Ownership Rate (% of adults with account)"
    },
    "usage": {
        "indicator_code": "USG_DIGITAL_PAY",
        "pillar": "USAGE",
        "description": "Digital Payment Usage (% of adults using digital payments)"
    }
}

forecast_years = [2025, 2026, 2027]

print("Forecast Targets:")
for name, info in targets.items():
    print(f"  {name.upper()}: {info['description']}")
    print(f"    Indicator: {info['indicator_code']}, Pillar: {info['pillar']}")
print(f"\nForecast Years: {forecast_years}")

## Step 2: Load Historical Data and Generate Forecasts

Initialize the forecast modeler and generate forecasts for both targets.

In [None]:
# Initialize forecast modeler
forecast_modeler = ForecastModeler()
eda_analyzer = EDAAnalyzer()
# Named data_visualizer so it does not shadow the imported `visualizer` module
data_visualizer = DataVisualizer(eda_analyzer)

# Load data
forecast_modeler.load_data()
print("✓ Data loaded successfully")

# Generate forecasts
forecast_results = {}

for name, info in targets.items():
    print(f"\n{'='*80}")
    print(f"Forecasting {name.upper()}: {info['description']}")
    print(f"{'='*80}")
    
    try:
        result = forecast_modeler.forecast_indicator(
            indicator_code=info["indicator_code"],
            pillar=info["pillar"],
            forecast_years=forecast_years,
            include_events=True,
            model_type="linear",
            confidence_level=0.95
        )
        forecast_results[name] = result
        print(f"✓ {name} forecast completed")
        print(f"  Model RMSE: {result['model_metrics']['rmse']:.2f}")
        print(f"  Model MAE: {result['model_metrics']['mae']:.2f}")
        
        # Display forecast table
        table = forecast_modeler.generate_forecast_table(result, scenario="base")
        print(f"\nForecast Table:")
        display(table)
        
    except Exception as e:
        print(f"⚠ Could not forecast {name}: {e}")
        # Try with alternative indicator for usage
        if name == "usage":
            try:
                print("  Trying ACC_MM_ACCOUNT as proxy...")
                result = forecast_modeler.forecast_indicator(
                    indicator_code="ACC_MM_ACCOUNT",
                    pillar="ACCESS",
                    forecast_years=forecast_years,
                    include_events=True,
                    model_type="linear",
                    confidence_level=0.95
                )
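                forecast_results[name] = result
                targets[name]["indicator_code"] = "ACC_MM_ACCOUNT"
                targets[name]["pillar"] = "ACCESS"
                # Flag the proxy in the description so summaries and plot titles are not mislabeled
                targets[name]["description"] += " (proxy: ACC_MM_ACCOUNT)"
                print("✓ Used ACC_MM_ACCOUNT as proxy for usage")
                table = forecast_modeler.generate_forecast_table(result, scenario="base")
                display(table)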
            except Exception as e2:
                print(f"✗ Failed to forecast {name} with proxy: {e2}")
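
The internals of `ForecastModeler.forecast_indicator` are not shown in this notebook. As a rough illustration of what a `"linear"` model with a residual-based interval could look like, the sketch below fits a least-squares trend and attaches a band of plus or minus a normal quantile times the residual standard deviation. The function name `linear_trend_forecast` and the input values are hypothetical; the numbers are synthetic, not survey results.

```python
from statistics import NormalDist

import numpy as np
import pandas as pd


def linear_trend_forecast(years, values, forecast_years, confidence_level=0.95):
    """Fit a straight-line trend by least squares and extrapolate it.

    Hypothetical helper for illustration; not the ForecastModeler implementation.
    """
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)

    # Center the years to keep the least-squares fit well conditioned.
    center = years.mean()
    slope, intercept = np.polyfit(years - center, values, deg=1)
    residuals = values - (intercept + slope * (years - center))

    # Residual standard deviation; two parameters (slope, intercept) were estimated.
    resid_std = np.sqrt(np.sum(residuals**2) / (len(years) - 2))

    # Two-sided normal quantile (about 1.96 for a 95% level).
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)

    future = np.asarray(forecast_years, dtype=float)
    point = intercept + slope * (future - center)
    return pd.DataFrame({
        "year": list(forecast_years),
        "forecast": point,
        "lower_bound": point - z * resid_std,
        "upper_bound": point + z * resid_std,
    })


# Synthetic values for illustration only; not actual survey results.
toy_years = [2011, 2014, 2017, 2021, 2024]
toy_values = [12.0, 20.0, 31.0, 40.0, 50.0]
toy_forecast = linear_trend_forecast(toy_years, toy_values, [2025, 2026, 2027])
print(toy_forecast.round(1))
```

The output columns mirror the `year`, `forecast`, `lower_bound`, and `upper_bound` fields used later in this notebook. A band built this way has constant width and ignores uncertainty in the fitted slope, so it is likely to understate uncertainty when only a handful of observations are available.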

## Step 3: Scenario Analysis

Generate optimistic, base, and pessimistic scenarios for each forecast.

In [None]:
# Generate scenario comparison tables
for name, result in forecast_results.items():
    print(f"\n{'='*80}")
    print(f"{name.upper()}: Scenario Analysis")
    print(f"{'='*80}")
    
    scenarios = result["scenarios"]
    
    # Create scenario comparison table
    scenario_table = pd.DataFrame({
        "Year": scenarios["base"]["year"],
        "Optimistic": scenarios["optimistic"]["forecast"].round(1),
        "Base": scenarios["base"]["forecast"].round(1),
        "Pessimistic": scenarios["pessimistic"]["forecast"].round(1),
        "Range": (scenarios["optimistic"]["forecast"] - scenarios["pessimistic"]["forecast"]).round(1)
    })
    
    print("\nScenario Comparison:")
    display(scenario_table)
    
    # Summary statistics
    print(f"\nSummary:")
    print(f"  Optimistic Average: {scenarios['optimistic']['forecast'].mean():.1f}%")
    print(f"  Base Average: {scenarios['base']['forecast'].mean():.1f}%")
    print(f"  Pessimistic Average: {scenarios['pessimistic']['forecast'].mean():.1f}%")
    print(f"  Scenario Range: {scenarios['optimistic']['forecast'].mean() - scenarios['pessimistic']['forecast'].mean():.1f} percentage points")
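
How `ForecastModeler` builds `result["scenarios"]` is not shown here; the limitations at the end of this notebook only state that scenarios use simple multipliers (±20%). One possible reading, sketched below, scales the projected change from the last observed value by 1.2 (optimistic) or 0.8 (pessimistic). The helper `build_scenarios`, its `last_observed` argument, and the numbers are assumptions for illustration.

```python
import pandas as pd


def build_scenarios(base_forecast, last_observed, adjustment=0.20):
    """Scale the projected change from the last observed value up or down.

    Hypothetical helper; the real scenario logic may apply the multiplier differently.
    """
    change = base_forecast["forecast"] - last_observed
    multipliers = {
        "optimistic": 1 + adjustment,
        "base": 1.0,
        "pessimistic": 1 - adjustment,
    }
    scenarios = {}
    for label, multiplier in multipliers.items():
        scenario = base_forecast[["year"]].copy()
        # Clip to the valid range of a percentage indicator.
        scenario["forecast"] = (last_observed + multiplier * change).clip(lower=0, upper=100)
        scenarios[label] = scenario
    return scenarios


# Synthetic base forecast for illustration only; not actual results.
toy_base = pd.DataFrame({"year": [2025, 2026, 2027], "forecast": [52.0, 55.0, 58.0]})
toy_scenarios = build_scenarios(toy_base, last_observed=50.0)
for label, frame in toy_scenarios.items():
    print(label, frame["forecast"].round(1).tolist())
```

Scaling the change rather than the level keeps all three scenarios anchored to the same starting point, so the spread widens with the forecast horizon. Note that if the base forecast falls below the last observed value, this scheme would make the "optimistic" path the lower one.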

## Step 4: Visualize Forecasts

Create comprehensive visualizations showing historical data, forecasts, confidence intervals, and scenarios.

In [None]:
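# Create comprehensive forecast visualizations
if not forecast_results:
    # Guard: the matplotlib fallback would fail on plt.subplots(0, 1)
    print("⚠ No forecasts available to plot")
elif PLOTLY_AVAILABLE: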
    fig = make_subplots(
        rows=len(forecast_results), cols=1,
        subplot_titles=[targets[name]["description"] for name in forecast_results.keys()],
        vertical_spacing=0.15
    )
    
    for idx, (name, result) in enumerate(forecast_results.items(), 1):
        historical = result["historical"]
        forecast = result["forecast"]
        scenarios = result["scenarios"]
        
        # Historical data
        fig.add_trace(
            go.Scatter(
                x=historical["year"],
                y=historical["value_numeric"],
                mode="lines+markers",
                name="Historical",
                line=dict(color="blue", width=2),
                marker=dict(size=8),
                showlegend=(idx == 1)
            ),
            row=idx, col=1
        )
        
        # Confidence interval
        fig.add_trace(
            go.Scatter(
                x=forecast["year"],
                y=forecast["upper_bound"],
                mode="lines",
                name="Upper Bound (95% CI)",
                line=dict(width=0),
                showlegend=(idx == 1)
            ),
            row=idx, col=1
        )
        
        fig.add_trace(
            go.Scatter(
                x=forecast["year"],
                y=forecast["lower_bound"],
                mode="lines",
                name="Lower Bound (95% CI)",
                line=dict(width=0),
                fill="tonexty",
                fillcolor="rgba(0,100,80,0.2)",
                showlegend=(idx == 1)
            ),
            row=idx, col=1
        )
        
        # Base forecast
        fig.add_trace(
            go.Scatter(
                x=forecast["year"],
                y=forecast["forecast"],
                mode="lines+markers",
                name="Base Forecast",
                line=dict(color="green", width=2, dash="dash"),
                marker=dict(size=8),
                showlegend=(idx == 1)
            ),
            row=idx, col=1
        )
        
        # Scenarios
        fig.add_trace(
            go.Scatter(
                x=scenarios["optimistic"]["year"],
                y=scenarios["optimistic"]["forecast"],
                mode="lines",
                name="Optimistic",
                line=dict(color="lightgreen", width=1, dash="dot"),
                showlegend=(idx == 1)
            ),
            row=idx, col=1
        )
        
        fig.add_trace(
            go.Scatter(
                x=scenarios["pessimistic"]["year"],
                y=scenarios["pessimistic"]["forecast"],
                mode="lines",
                name="Pessimistic",
                line=dict(color="lightcoral", width=1, dash="dot"),
                showlegend=(idx == 1)
            ),
            row=idx, col=1
        )
        
        fig.update_xaxes(title_text="Year", row=idx, col=1)
        fig.update_yaxes(title_text="Percentage", row=idx, col=1)
    
    fig.update_layout(
        title="Forecasts: Access and Usage (2025-2027) with Scenarios",
        height=400 * len(forecast_results),
        hovermode="x unified"
    )
    fig.show()
else:
    # Matplotlib fallback
    fig, axes = plt.subplots(len(forecast_results), 1, figsize=(14, 6 * len(forecast_results)))
    if len(forecast_results) == 1:
        axes = [axes]
    
    for idx, (name, result) in enumerate(forecast_results.items()):
        historical = result["historical"]
        forecast = result["forecast"]
        scenarios = result["scenarios"]
        
        # Historical
        axes[idx].plot(historical["year"], historical["value_numeric"], 
                      "o-", label="Historical", linewidth=2, markersize=8)
        
        # Confidence interval
        axes[idx].fill_between(forecast["year"], forecast["lower_bound"], 
                              forecast["upper_bound"], alpha=0.2, color="green",
                              label="95% CI")
        
        # Forecasts
        axes[idx].plot(forecast["year"], forecast["forecast"], 
                      "--o", label="Base Forecast", linewidth=2, markersize=8)
        axes[idx].plot(scenarios["optimistic"]["year"], scenarios["optimistic"]["forecast"], 
                      ":", label="Optimistic", linewidth=1, alpha=0.7)
        axes[idx].plot(scenarios["pessimistic"]["year"], scenarios["pessimistic"]["forecast"], 
                      ":", label="Pessimistic", linewidth=1, alpha=0.7)
        
        axes[idx].set_title(targets[name]["description"])
        axes[idx].set_xlabel("Year")
        axes[idx].set_ylabel("Percentage")
        axes[idx].legend()
        axes[idx].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()

## Step 5: Interpret Results and Key Uncertainties

Generate comprehensive interpretation of forecast results, including key predictions, uncertainty quantification, and limitations.

In [None]:
# Generate interpretation
print("=" * 80)
print("FORECAST INTERPRETATION")
print("=" * 80)

for name, result in forecast_results.items():
    print(f"\n{targets[name]['description']}:")
    print("-" * 80)
    
    forecast_df = result["forecast"]
    
    # Key predictions
    avg_forecast = forecast_df["forecast"].mean()
    growth = forecast_df["forecast"].iloc[-1] - forecast_df["forecast"].iloc[0]
    
    print(f"\nKey Predictions:")
    print(f"  • Average forecast (2025-2027): {avg_forecast:.1f}%")
    print(f"  • Projected growth over the forecast window (first to last forecast year): {growth:+.1f} percentage points")
    print(f"  • 2025: {forecast_df[forecast_df['year']==2025]['forecast'].iloc[0]:.1f}%")
    print(f"  • 2027: {forecast_df[forecast_df['year']==2027]['forecast'].iloc[0]:.1f}%")
    
    # Uncertainty quantification
    avg_range = (forecast_df["upper_bound"] - forecast_df["lower_bound"]).mean()
    print(f"\nUncertainty:")
    print(f"  • Average confidence interval half-width: ±{avg_range/2:.1f} percentage points")
    print(f"  • 95% confidence intervals provided")
    
    # Scenario ranges
    scenarios = result["scenarios"]
    optimistic_avg = scenarios["optimistic"]["forecast"].mean()
    pessimistic_avg = scenarios["pessimistic"]["forecast"].mean()
    scenario_range = optimistic_avg - pessimistic_avg
    
    print(f"\nScenario Analysis:")
    print(f"  • Optimistic scenario: {optimistic_avg:.1f}% average")
    print(f"  • Base scenario: {scenarios['base']['forecast'].mean():.1f}% average")
    print(f"  • Pessimistic scenario: {pessimistic_avg:.1f}% average")
    print(f"  • Total scenario spread: {scenario_range:.1f} percentage points")

# Limitations
print("\n" + "=" * 80)
print("KEY LIMITATIONS AND UNCERTAINTIES")
print("=" * 80)
print("""
1. Sparse Historical Data:
   - Only 5 Findex survey points over 13 years (2011-2024)
   - Limited data points for robust trend estimation
   - High uncertainty in trend extrapolation

2. Event Impact Assumptions:
   - Event effects based on Task 3 modeling (may be incomplete)
   - Assumes known events and their impacts
   - Does not account for unknown future events

3. Model Limitations:
   - Linear trend assumption may not hold long-term
   - Confidence intervals based on historical residuals only
   - Does not account for structural breaks or regime changes

4. Scenario Assumptions:
   - Scenarios use simple multipliers (±20%)
   - Do not model specific policy or market changes
   - Optimistic/pessimistic bounds are illustrative

5. External Factors:
   - Does not account for macroeconomic shocks
   - Assumes stable regulatory environment
   - No explicit modeling of demographic or technological changes
""")
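
To make the first and third limitations more concrete, the sketch below computes how much wider a textbook least-squares prediction interval is than a band based on the residual standard deviation alone. The ratio is sqrt(1 + 1/n + (x0 - mean(x))^2 / Sxx), which grows as the target year x0 moves away from the sample. The survey years used here are an assumption (five points between 2011 and 2024); the notebook does not list them, so substitute the actual years from the loaded data.

```python
import numpy as np


def extrapolation_inflation(sample_years, target_years):
    """Ratio of the OLS prediction standard error to the residual standard deviation."""
    x = np.asarray(sample_years, dtype=float)
    x0 = np.asarray(target_years, dtype=float)
    n = len(x)
    sxx = np.sum((x - x.mean()) ** 2)
    return np.sqrt(1 + 1 / n + (x0 - x.mean()) ** 2 / sxx)


# Assumed survey years (five points between 2011 and 2024); adjust to the actual data.
assumed_survey_years = [2011, 2014, 2017, 2021, 2024]
target_years = [2025, 2026, 2027]
inflation = extrapolation_inflation(assumed_survey_years, target_years)
for year, factor in zip(target_years, inflation):
    print(f"{year}: prediction interval about {factor:.2f}x the residual-only width")
```

Under these assumed years the factor exceeds 1 for every forecast year and increases toward 2027, which suggests that intervals based on historical residuals only should be read as a lower bound on the true uncertainty.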