# TimesFM 2.5 Quantile Forecasting for Inventory Planning

This notebook demonstrates using TimesFM 2.5's **quantile head** for inventory decisions:

1. **Service-Level Policies**: Target a service level by ordering at the matching quantile (P80/P90 here; the forecast output stops at P90)
2. **Newsvendor Optimization**: Choose quantile based on stockout vs holding cost ratio
3. **Calibration**: Verify quantiles match empirical frequencies
4. **Cost Metrics**: Measure stockouts, holding costs, and achieved service levels

**Data**: VN2 Inventory Planning Challenge

**Reference**: See `quantile.md` for detailed theory and references.


## 1. Setup and Data Loading


In [1]:
import sys
from pathlib import Path
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime
import warnings
warnings.filterwarnings('ignore')

# TimesFM
import torch
timesfm_path = Path("..").resolve()
if str(timesfm_path) not in sys.path:
    sys.path.insert(0, str(timesfm_path))
import timesfm

sns.set_theme(style="whitegrid")
plt.rcParams['figure.figsize'] = (14, 6)

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")


PyTorch version: 2.8.0
CUDA available: False


In [2]:
# Load VN2 data (reuse from main notebook)
DATA_DIR = Path("../../vn2inventory/data").resolve()

sales_df = pd.read_csv(DATA_DIR / "Week 0 - 2024-04-08 - Sales.csv")
master_df = pd.read_csv(DATA_DIR / "Week 0 - Master.csv")

# Convert to long format
id_cols = ["Store", "Product"]
sales_long = sales_df.melt(id_vars=id_cols, var_name="date", value_name="sales_qty")
sales_long["date"] = pd.to_datetime(sales_long["date"])
sales_long["sales_qty"] = pd.to_numeric(sales_long["sales_qty"], errors="coerce").fillna(0)
sales_long = sales_long.sort_values(["Store", "Product", "date"]).reset_index(drop=True)

print(f"Sales data shape: {sales_long.shape}")


Sales data shape: (94043, 4)


## 2. Load TimesFM 2.5 with Quantile Head

**Key parameter**: `use_continuous_quantile_head=True`


In [3]:
# Load TimesFM 2.5 with quantile head
print("Loading TimesFM 2.5 with quantile forecasting...")
model = timesfm.TimesFM_2p5_200M_torch.from_pretrained("google/timesfm-2.5-200m-pytorch")

print("Compiling with quantile head enabled...")
model.compile(
    timesfm.ForecastConfig(
        max_context=512,
        max_horizon=128,
        normalize_inputs=True,
        use_continuous_quantile_head=True,  # ← Enable quantile forecasting
        force_flip_invariance=True,
        infer_is_positive=True,
        fix_quantile_crossing=True,  # ← Ensure monotonic quantiles
    )
)
print("✓ Model loaded with quantile head enabled")


Loading TimesFM 2.5 with quantile forecasting...
Compiling with quantile head enabled...
✓ Model loaded with quantile head enabled


## 3. Generate Quantile Forecasts

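`model.forecast` returns two arrays:
- `point_forecast`: shape (N, H), one point forecast per series and horizon step
- `quantile_forecast`: shape (N, H, 10), with the last axis ordered as [mean, P10, P20, ..., P90]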
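
Because the mean sits in column 0, the column for a decile is offset by one from its position in the decile list. A minimal sketch of a lookup helper follows; `select_quantile` and `QUANTILE_LEVELS` are hypothetical names introduced for illustration, not part of the `timesfm` API, and the column layout is the one this notebook reports and asserts after forecasting.

```python
import numpy as np

# Assumed layout of the last axis: index 0 = mean, indices 1..9 = P10..P90.
QUANTILE_LEVELS = (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)

def select_quantile(quantile_forecast, level):
    """Return the (N, H) slice for a decile level such as 0.9 (hypothetical helper)."""
    matches = [i for i, q in enumerate(QUANTILE_LEVELS) if abs(q - level) < 1e-9]
    if not matches:
        raise ValueError(f"level {level} is not one of {QUANTILE_LEVELS}")
    return quantile_forecast[:, :, matches[0] + 1]  # +1 skips the mean column

# Stand-in array with the same layout as the model output: 2 series, 3 steps, 10 columns.
demo = np.arange(2 * 3 * 10, dtype=float).reshape(2, 3, 10)
p90 = select_quantile(demo, 0.9)
print(p90.shape)
```

Looking up columns by level rather than by a hard-coded index keeps the mean offset in one place.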


In [4]:
# Prepare subset (same as main notebook: 50 SKUs, 3-week horizon)
CONTEXT_LENGTH = 140
HORIZON = 3
TEST_START_WEEK = 140

sku_volumes = sales_long.groupby(["Store", "Product"])["sales_qty"].sum().reset_index()
sku_list = sku_volumes.sort_values("sales_qty", ascending=False).head(50)[["Store", "Product"]]

# Prepare inputs/actuals
inputs, actuals = [], []
for _, row in sku_list.iterrows():
    sku_data = sales_long[
        (sales_long["Store"] == row["Store"]) & 
        (sales_long["Product"] == row["Product"])
    ].sort_values("date")
    
    inputs.append(sku_data.iloc[:TEST_START_WEEK]["sales_qty"].values[-CONTEXT_LENGTH:])
    actuals.append(sku_data.iloc[TEST_START_WEEK:TEST_START_WEEK+HORIZON]["sales_qty"].values)

actuals = np.array(actuals)
print(f"Prepared {len(inputs)} SKUs for testing")


Prepared 50 SKUs for testing


In [5]:
# Generate quantile forecasts
print("Generating quantile forecasts...")
point_forecast, quantile_forecast = model.forecast(
    horizon=HORIZON,
    inputs=inputs,
)

print(f"Point forecast: {point_forecast.shape}")
print(f"Quantile forecast: {quantile_forecast.shape}")
print("\nQuantiles available: [mean, P10, P20, P30, P40, P50, P60, P70, P80, P90]")

# Runtime assertion for API consistency
assert quantile_forecast.shape[-1] == 10, "Expected last dim = 10 ([mean, P10..P90])"
print("✓ Quantile tensor format validated")


Generating quantile forecasts...
Point forecast: (50, 3)
Quantile forecast: (50, 3, 10)

Quantiles available: [mean, P10, P20, P30, P40, P50, P60, P70, P80, P90]
✓ Quantile tensor format validated


## 4. Service-Level Policies

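Order up to a forecast quantile to target a specific service level. If the quantiles are well calibrated, ordering at P90 covers demand in about 90% of individual time steps.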


In [6]:
def evaluate_service_policy(forecasts, actuals, policy_name):
    """Evaluate inventory policy: period (cycle) service %, plus excess/shortage ratios."""
    # Period (cycle) service: no stockout over the protection period (sum across horizon)
    service_period = float(np.mean(forecasts.sum(axis=1) >= actuals.sum(axis=1))) * 100
    
    # Optional diagnostic: per-timestep coverage
    service_timestep = float(np.mean(forecasts >= actuals)) * 100
    
    fc_flat = forecasts.flatten()
    act_flat = actuals.flatten()
    excess = np.sum(np.maximum(0, fc_flat - act_flat)) / np.sum(act_flat)
    shortage = np.sum(np.maximum(0, act_flat - fc_flat)) / np.sum(act_flat)
    
    return {
        "policy": policy_name,
        "service_period_%": service_period,
        "service_timestep_%": service_timestep,
        "excess_ratio": excess,
        "shortage_ratio": shortage
    }

# Extract key quantiles
q50 = quantile_forecast[:, :, 5]  # Median
q80 = quantile_forecast[:, :, 8]  # P80
q90 = quantile_forecast[:, :, 9]  # P90

# Evaluate policies
policies = [("P50 (Median)", q50), ("P80", q80), ("P90", q90)]
results = [evaluate_service_policy(fc, actuals, name) for name, fc in policies]

print("\n📦 SERVICE-LEVEL POLICY RESULTS:")
print(f"\n{'Policy':<15} {'Cycle Service %':<17} {'Per-Step %':<12} {'Excess/Demand':<16} {'Shortage/Demand'}")
print("-" * 80)
for r in results:
    print(f"{r['policy']:<15} {r['service_period_%']:>15.1f}% {r['service_timestep_%']:>10.1f}% {r['excess_ratio']:>14.3f} {r['shortage_ratio']:>16.3f}")

print("\n💡 P90 nominally targets ~90% per-step coverage; compare with the achieved rates above and check calibration.")



📦 SERVICE-LEVEL POLICY RESULTS:

Policy          Cycle Service %   Per-Step %   Excess/Demand    Shortage/Demand
--------------------------------------------------------------------------------
P50 (Median)               26.0%       42.0%          0.096            0.337
P80                        58.0%       63.3%          0.279            0.194
P90                        80.0%       74.7%          0.423            0.122

💡 P90 nominally targets ~90% per-step coverage; compare with the achieved rates above and check calibration.
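
The introduction lists calibration as a goal, and the table above shows per-step coverage falling short of the nominal level on this sample. One possible way to check all deciles at once is sketched below. `empirical_coverage` is a hypothetical helper, and the demo uses synthetic uniform demand rather than the VN2 data, so the printed numbers say nothing about TimesFM itself.

```python
import numpy as np

def empirical_coverage(quantile_forecast, actuals):
    """Fraction of (series, step) pairs with actual <= forecast, per decile.

    Assumes the last axis is [mean, P10, ..., P90]; returns {level: coverage}.
    """
    levels = np.arange(1, 10) / 10
    coverage = {}
    for i, level in enumerate(levels, start=1):  # start=1 skips the mean column
        coverage[float(level)] = float(np.mean(actuals <= quantile_forecast[:, :, i]))
    return coverage

# Synthetic stand-in data (not VN2): forecasts are the exact deciles of U(0, 1) demand.
rng = np.random.default_rng(0)
demo_actuals = rng.uniform(0.0, 1.0, size=(2000, 3))
demo_q = np.empty((2000, 3, 10))
demo_q[:, :, 0] = 0.5                      # mean column
demo_q[:, :, 1:] = np.arange(1, 10) / 10   # P10..P90 of U(0, 1)

for level, cov in empirical_coverage(demo_q, demo_actuals).items():
    print(f"P{int(round(level * 100)):<3} nominal {level:.1f}  empirical {cov:.3f}")
```

With calibrated quantiles the empirical column tracks the nominal one. Passing `quantile_forecast` and `actuals` from this notebook instead of the synthetic arrays would show where the forecast deciles under-cover or over-cover.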


## 5. Newsvendor Cost Optimization

Choose quantile based on **critical fractile** = Cu / (Cu + Co)
- Cu = cost of underage (stockout)
- Co = cost of overage (holding)
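
The fractile comes from minimizing the expected cost of ordering q units against random demand D with CDF F. This is the standard newsvendor argument, sketched here assuming continuous demand and per-unit costs:

$$
C(q) = C_o\,\mathbb{E}\big[(q - D)^+\big] + C_u\,\mathbb{E}\big[(D - q)^+\big],
\qquad
\frac{dC}{dq} = C_o\,F(q) - C_u\,\big(1 - F(q)\big) = 0
\;\Rightarrow\;
F(q^*) = \frac{C_u}{C_u + C_o}
$$

For example, Cu = 4 and Co = 1 give F(q*) = 4 / 5 = 0.8, so the order quantity is the P80 forecast.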


In [7]:
def newsvendor_policy(quantile_forecasts, actuals, cu, co):
    """Choose quantile based on cost ratio"""
    critical_fractile = cu / (cu + co)
    
    # Map to closest quantile
    quantile_levels = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
    closest_idx = np.argmin(np.abs(quantile_levels - critical_fractile))
    chosen_q = quantile_levels[closest_idx]
    
    order_qty = quantile_forecasts[:, :, closest_idx + 1]  # +1 for mean offset
    
    excess = np.maximum(0, order_qty - actuals)
    shortage = np.maximum(0, actuals - order_qty)
    
    total_cost = co * np.sum(excess) + cu * np.sum(shortage)
    service = np.mean(order_qty >= actuals) * 100  # per-timestep coverage, not cycle service
    
    return {
        "fractile": critical_fractile,
        "quantile": chosen_q,
        "total_cost": total_cost,
        "service": service
    }

# Test scenarios
scenarios = [
    ("Balanced (1:1)", 1.0, 1.0),
    ("Stockout-Averse (4:1)", 4.0, 1.0),
    ("Very High Service (9:1)", 9.0, 1.0)
]

print("\n💰 NEWSVENDOR OPTIMIZATION:")
print(f"\n{'Scenario':<25} {'Cu:Co':<10} {'Fractile':<10} {'Quantile':<10} {'Cost':<12} {'Service'}")
print("-" * 80)

for name, cu, co in scenarios:
    r = newsvendor_policy(quantile_forecast, actuals, cu, co)
    print(f"{name:<25} {cu:.0f}:{co:.0f}{'':<6} {r['fractile']:<10.2f} P{int(r['quantile']*100):<8} {r['total_cost']:<12.0f} {r['service']:>7.1f}%")

print("\n💡 Higher Cu/Co → Higher quantile → More safety stock")



💰 NEWSVENDOR OPTIMIZATION:

Scenario                  Cu:Co      Fractile   Quantile   Cost         Service
--------------------------------------------------------------------------------
Balanced (1:1)            1:1       0.50       P50       2092            42.0%
Stockout-Averse (4:1)     4:1       0.80       P80       5099            63.3%
Very High Service (9:1)   9:1       0.90       P90       7347            74.7%

💡 Higher Cu/Co → Higher quantile → More safety stock
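
The three scenarios above land exactly on a decile, but a ratio such as 3:1 gives a fractile of 0.75, which nearest-decile mapping has to round. One option is to interpolate linearly between the two adjacent decile forecasts. This is an illustrative approximation rather than something the model provides, and `interpolated_order_qty` is a hypothetical helper.

```python
import numpy as np

def interpolated_order_qty(quantile_forecast, critical_fractile):
    """Linearly interpolate between adjacent decile forecasts (illustrative assumption).

    Assumes the last axis is [mean, P10, ..., P90]. Fractiles outside [0.1, 0.9]
    are clipped to the nearest available decile rather than extrapolated.
    """
    levels = np.arange(1, 10) / 10
    f = float(np.clip(critical_fractile, levels[0], levels[-1]))
    hi = int(np.searchsorted(levels, f, side="left"))  # first decile >= f
    if hi == 0 or np.isclose(levels[hi], f):
        return quantile_forecast[:, :, hi + 1]  # +1 skips the mean column
    lo = hi - 1
    w = (f - levels[lo]) / (levels[hi] - levels[lo])
    return (1 - w) * quantile_forecast[:, :, lo + 1] + w * quantile_forecast[:, :, hi + 1]

# Stand-in forecasts: column k holds the value 10 * k for every series and step.
demo = np.tile(np.arange(10) * 10.0, (2, 3, 1))
qty = interpolated_order_qty(demo, 3.0 / (3.0 + 1.0))  # Cu:Co = 3:1 gives fractile 0.75
print(qty[0, 0])  # about 75, halfway between the P70 and P80 columns
```

Clipping means any fractile above 0.9 still orders at P90, so very high service targets are not reachable from the decile output alone.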


## 6. Summary

**Key Takeaways:**

1. ✅ TimesFM 2.5's quantile head provides P10-P90 forecasts natively
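2. ✅ Service-level policies: Order at a quantile instead of adding arbitrary buffers; on this 50-SKU test P90 reached 80.0% cycle service and 74.7% per-step coverage, below its nominal 90%, so check calibration first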
3. ✅ Newsvendor optimization: Choose quantile by cost ratio (Cu/(Cu+Co))
4. ✅ Data-driven safety stock: Quantile spread reflects the model's predictive uncertainty
5. ✅ Non-parametric: No distribution assumptions needed

**Recommended Workflow:**
1. Enable `use_continuous_quantile_head=True`
2. Generate quantile forecasts
3. Define policy: Service target → quantile; Cost ratio → newsvendor
4. Validate calibration on test data
5. Deploy and monitor

**See `quantile.md` for full theory and references!**
