# Monthly current account balance forecasting demo

This notebook shows how to train and use the helpers defined in `core/models/current_accounts/monthly_balance/prediction.py` to generate a 12-month forecast for current-account balances. It relies on synthetic data, so the workflow runs without access to production tables. To run the model for actual reporting, replace the sample frames with real extracts or load them through the `MonthlyBalanceDataLoader`.


## 1. Imports and configuration

The forecasting utilities depend on pandas, Polars, scikit-learn, and statsmodels.  Make sure those packages (and their dependencies such as NumPy) are installed in the active environment before running the cells below.
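
A quick pre-flight check can confirm that the packages are present before the imports run. The sketch below uses only the standard library; the helper name `missing_packages` and the list of distribution names are illustrative assumptions, not part of the forecasting module.

```python
from importlib import metadata


def missing_packages(names):
    """Return the distribution names from `names` that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Distribution names as published on PyPI (assumed list for this notebook).
required = ["pandas", "polars", "scikit-learn", "statsmodels", "numpy"]
print("Missing packages:", missing_packages(required) or "none")
```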


In [None]:
from __future__ import annotations

import itertools
import math
import random
from datetime import date

import pandas as pd
import polars as pl

from core.models.current_accounts.monthly_balance.prediction import (
    MonthlyBalanceConfig,
    MonthlyBalanceForecaster,
)

random.seed(42)

## 2. Build synthetic training data

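In production, the `MonthlyBalanceDataLoader` would read data from Spark tables. For demonstration purposes we synthesise a small monthly panel covering all 12 segment combinations (`salary_flg`, `pensioner_flg`, `is_vip_or_prv`), along with corresponding FTP and key-rate series. The balance history spans 48 months; the rate series span 72 months so that their final months can serve as the forward-looking scenario in section 4.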
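
As a sanity check on the panel size, the segment grid can be enumerated on its own. This small sketch reuses the level names from the cell below; `n_months` mirrors the 48-month history used there.

```python
import itertools

salary_levels = ["NS", "S"]
pension_levels = ["NP", "P"]
vip_levels = ["mass", "priv", "vip"]

# Cartesian product of the three flags: one tuple per customer segment.
segments = list(itertools.product(salary_levels, pension_levels, vip_levels))
n_months = 48

print(len(segments))             # 12 segments
print(len(segments) * n_months)  # 576 rows in the synthetic panel
print(segments[0])               # ('NS', 'NP', 'mass')
```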


In [None]:
salary_levels = ["NS", "S"]
pension_levels = ["NP", "P"]
vip_levels = ["mass", "priv", "vip"]

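# Month-end dates. Note: pandas 2.2+ deprecates the "M" alias in favour of "ME";
# switch the alias if your pandas version emits a FutureWarning here.
actual_dates = pd.date_range("2019-01-31", periods=48, freq="M")  # balance history
rate_dates = pd.date_range("2019-01-31", periods=72, freq="M")  # history plus 24 scenario months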

def build_current_accounts() -> pl.DataFrame:
    rows = []
    for idx, ts in enumerate(actual_dates):
        # smooth trend and seasonality to keep the example realistic
        seasonal = 1 + 0.05 * math.sin(2 * math.pi * ((idx % 12) / 12))
        trend = 1 + 0.01 * idx
        for salary, pension, vip in itertools.product(
            salary_levels, pension_levels, vip_levels
        ):
            salary_factor = {"NS": 1.0, "S": 1.1}[salary]
            pension_factor = {"NP": 1.0, "P": 1.05}[pension]
            vip_factor = {"mass": 1.0, "priv": 1.2, "vip": 1.4}[vip]
            noise = 1 + random.uniform(-0.02, 0.02)
            balance = (
                1_000_000
                * salary_factor
                * pension_factor
                * vip_factor
                * trend
                * seasonal
                * noise
            )
            n_accounts = 1000 * salary_factor * vip_factor * seasonal * noise
            rows.append(
                {
                    "report_dt": ts,
                    "salary_flg": salary,
                    "pensioner_flg": pension,
                    "is_vip_or_prv": vip,
                    "balance_amt": float(balance),
                    "n_accounts": float(n_accounts),
                }
            )
    df = pl.DataFrame(rows).with_columns(pl.col("report_dt").cast(pl.Date))
    return df.sort(["report_dt", "salary_flg", "pensioner_flg", "is_vip_or_prv"])

def build_ftp_rates() -> pl.DataFrame:
    rows = []
    for idx, ts in enumerate(rate_dates):
        rows.append(
            {
                "report_dt": ts,
                "VTB_90d_ftp_rate": 0.05 + 0.005 * math.sin(idx / 3),
                "VTB_365d_ftp_rate": 0.055 + 0.004 * math.cos(idx / 4),
            }
        )
    df = pl.DataFrame(rows).with_columns(pl.col("report_dt").cast(pl.Date))
    return df.sort("report_dt")

def build_market_rates() -> pl.DataFrame:
    rows = []
    for idx, ts in enumerate(rate_dates):
        rows.append(
            {
                "report_dt": ts,
                "key_rate": 0.06 + 0.003 * math.sin(idx / 5),
            }
        )
    df = pl.DataFrame(rows).with_columns(pl.col("report_dt").cast(pl.Date))
    return df.sort("report_dt")

current_accounts = build_current_accounts()
ftp_rates = build_ftp_rates()
market_rates = build_market_rates()

current_accounts.head()

## 3. Train the forecaster

Instantiate `MonthlyBalanceForecaster` with the desired horizon and fit it on the synthetic panel.  The helper creates the feature set and trains one Lasso model per forecast month.
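
To make "one model per forecast month" concrete, the sketch below lays out a separate training set for each forecast step from a single series, using plain lagged values as features. It is an illustration of the general direct multi-step layout only: the function `direct_training_sets`, the lag count, and the toy numbers are assumptions, and the actual feature set built inside `prediction.py` is not shown here.

```python
def direct_training_sets(series, n_lags, horizon):
    """Build one (X, y) pair per forecast step h = 1..horizon.

    Each row of X holds the `n_lags` most recent observations up to time t
    (oldest first); the matching entry of y is the value observed h steps
    after t. Every step h would get its own regressor (for example a Lasso).
    """
    sets = {}
    for h in range(1, horizon + 1):
        X, y = [], []
        for t in range(n_lags - 1, len(series) - h):
            X.append(series[t - n_lags + 1 : t + 1])
            y.append(series[t + h])
        sets[h] = (X, y)
    return sets


# Toy balance series, purely illustrative.
toy = [100, 102, 105, 103, 108, 110, 112]
sets = direct_training_sets(toy, n_lags=2, horizon=3)

for h, (X, y) in sets.items():
    print(f"h={h}: {len(X)} training rows, first row {X[0]} -> {y[0]}")
```

Longer horizons leave fewer usable rows, which is one reason a 12-month horizon needs a reasonably long history.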


In [None]:
config = MonthlyBalanceConfig(horizon=12, lasso_alpha=0.01)
forecaster = MonthlyBalanceForecaster(config)
_ = forecaster.fit(current_accounts, ftp_rates, market_rates)
print("Model fitted for", config.horizon, "months ahead")

## 4. Prepare a rate scenario and forecast

For inference we provide a forward-looking FTP and key-rate scenario covering the forecast horizon.  The current-account history is reused to compute the lag features, while the scenario supplies the exogenous rate paths.
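
Before calling `predict`, it can help to confirm that the scenario really covers every month of the horizon. The following standard-library sketch assumes month-end `report_dt` values, as in the synthetic data; the helper names `month_ends` and `missing_scenario_dates` are hypothetical and not part of the forecasting module.

```python
import calendar
from datetime import date


def month_ends(start, periods):
    """Return `periods` consecutive month-end dates, starting with the month of `start`."""
    out = []
    year, month = start.year, start.month
    for _ in range(periods):
        last_day = calendar.monthrange(year, month)[1]
        out.append(date(year, month, last_day))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return out


def missing_scenario_dates(scenario_dates, forecast_start, horizon):
    """List the expected month-end dates that are absent from the scenario."""
    available = set(scenario_dates)
    return [d for d in month_ends(forecast_start, horizon) if d not in available]


full = month_ends(date(2023, 1, 31), 12)
print(missing_scenario_dates(full, date(2023, 1, 31), 12))       # []
print(missing_scenario_dates(full[:10], date(2023, 1, 31), 12))  # November and December 2023
```

With the Polars frames used in this notebook, the dates could be passed as `ftp_scenario["report_dt"].to_list()`.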


In [None]:
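# The scenario starts in the month after the last observed balance (2022-12-31).
# Note: pandas 2.2+ deprecates the "M" alias in favour of "ME".
scenario_dates = pd.date_range("2023-01-31", periods=config.horizon, freq="M")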
scenario_start = scenario_dates[0].date()
scenario_end = scenario_dates[-1].date()

ftp_scenario = ftp_rates.filter(
    (pl.col("report_dt") >= scenario_start) & (pl.col("report_dt") <= scenario_end)
)
market_scenario = market_rates.filter(
    (pl.col("report_dt") >= scenario_start) & (pl.col("report_dt") <= scenario_end)
)

forecast_df = forecaster.predict(
    current_accounts=current_accounts,
    ftp_rates=ftp_rates,
    ftp_rates_scenario=ftp_scenario,
    market_rates=market_rates,
    market_rates_scenario=market_scenario,
    forecast_start=scenario_start,
    horizon=config.horizon,
)

forecast_df.head()

## 5. Persist or post-process the output

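The result is a pandas `DataFrame` with one row per forecast month and customer segment. Save it or pivot it as needed. The first cell below sums the predicted balances by month and segment; drop columns from the grouping keys to aggregate to a coarser level.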
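
As an example of the pivoting mentioned above, the sketch below reshapes a long forecast into a wide month-by-segment table and computes a monthly portfolio total. The toy frame stands in for `forecast_df`; its column names follow the grouping cell below, and its values are made up for illustration.

```python
import pandas as pd

# Toy stand-in for forecast_df (illustrative values only).
toy = pd.DataFrame(
    {
        "report_dt": pd.to_datetime(
            ["2023-01-31", "2023-01-31", "2023-02-28", "2023-02-28"]
        ),
        "salary_flg": ["NS", "S", "NS", "S"],
        "pensioner_flg": ["NP", "NP", "NP", "NP"],
        "is_vip_or_prv": ["mass", "mass", "mass", "mass"],
        "balance_amt_pred": [100.0, 110.0, 101.0, 112.0],
    }
)

# Wide layout: one row per month, one column per segment combination.
wide = toy.pivot_table(
    index="report_dt",
    columns=["salary_flg", "pensioner_flg", "is_vip_or_prv"],
    values="balance_amt_pred",
    aggfunc="sum",
)

# Portfolio total per forecast month.
total = toy.groupby("report_dt")["balance_amt_pred"].sum()

print(wide)
print(total)
```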


In [None]:
(
    forecast_df
    .groupby(["report_dt", "salary_flg", "pensioner_flg", "is_vip_or_prv"], as_index=False)
    .agg({"balance_amt_pred": "sum"})
    .head()
)

In [None]:
# Optional: write the forecast to CSV for downstream reporting
# forecast_df.to_csv("monthly_balance_forecast.csv", index=False)
