# Gas Storage Optimization: LSMC (Least Squares Monte Carlo)

This notebook implements a **stochastic dynamic programming** approach to gas storage optimization using the **Bellman equation** and **Least Squares Monte Carlo (LSMC)** to estimate continuation values.

### How does this differ from the MILP approach?

| | MILP (`gas_storage_milp.ipynb`) | LSMC (this notebook) |
|:--|:--|:--|
| **Prices** | Deterministic forward curve | Stochastic: many simulated price paths |
| **Decision** | One optimal schedule for a known curve | An **optimal policy** that adapts to realized prices |
| **Output** | Exact optimal profit for one price scenario | Distribution of profits across many scenarios |
| **Method** | Linear programming (Pyomo + SCIP) | Backward induction + regression |
| **Value** | Intrinsic value only | Intrinsic + extrinsic: captures **optionality**, the value of waiting and reacting |

### Why LSMC?
A gas storage is like a **real option**: you have the *right but not the obligation* to inject or withdraw gas at any time. The value of this flexibility depends on price uncertainty. LSMC lets us estimate this value by solving the Bellman equation backwards through time, using regression to approximate conditional expectations.

### Simplifications (vs. the full MILP)
- **Monthly** time steps (12 months) instead of daily (365 days)
- **Discrete** inventory grid instead of continuous
- **Flat** injection/withdrawal rates (no piecewise curves)
- **No** BSD (minimum state-to-date) constraints

The core Bellman logic is identical; only the granularity differs.

In [None]:
import datetime as dt
import json
import numpy as np
import pandas as pd
from pathlib import Path
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

from gas_storage.gas_price_simulations import GasPriceSimulations

## 1. Storage Parameters & Price Simulations

We load the same storage ("Abraham") and time period as the MILP notebook: **April 2024 – March 2025**.

The key parameters:
- **WGV** = 200,000 MWh: maximum gas that can be stored
- **IR** = 2,100 MWh/day: maximum injection rate
- **WR** = 2,800 MWh/day: maximum withdrawal rate
- **Injection season** = April–September (can only inject)
- **Withdrawal season** = October–March (can only withdraw)

We simulate 1,000 price paths using the mean-reverting model from `gas_price_simulations.py`, then aggregate to **monthly average prices** (12 time steps).

In [None]:
# ── Load storage parameters ────────────────────────────────────────────
data_path = Path("../data/")
with open(data_path / "storages.json") as f:
    storages = json.load(f)
storage_params = storages[0]  # "Abraham" storage

# Time period (same as MILP notebook)
date_start = dt.datetime(2024, 4, 1)
date_end = dt.datetime(2025, 3, 31)

# Storage parameters
period = storage_params["TimePeriods"][1]  # Second time period
WGV = period["WGV"]                        # 200,000 MWh
IR = period["InjectionRate"]               # 2,100 MWh/day
WR = period["WithdrawalRate"]              # 2,800 MWh/day
injection_months = storage_params["InjectionSeason"]  # [4,5,6,7,8,9]

print(f"Storage: {storage_params['GasStorageName']}")
print(f"Period:  {date_start.date()} to {date_end.date()}")
print(f"WGV:     {WGV:,} MWh  |  IR: {IR:,} MWh/day  |  WR: {WR:,} MWh/day")
print(f"Injection months: {injection_months}")

# ── Generate price simulations ─────────────────────────────────────────
N_sim = 1000
simulator = GasPriceSimulations()
daily_paths = simulator.get_simulations(number_of_simulations=N_sim, date_start=date_start)
all_dates = simulator.index

# Filter to our optimization period
date_mask = (all_dates >= date_start) & (all_dates <= date_end)
daily_prices = daily_paths[:, date_mask]
period_dates = all_dates[date_mask]

# ── Aggregate to monthly prices ────────────────────────────────────────
# For each month we compute the average simulated price across its days.
# This is the price at which we trade during that month.
monthly_dates = pd.date_range(date_start, date_end, freq="MS")
T = len(monthly_dates)   # 12 time steps
monthly_prices = np.zeros((T, N_sim))
days_per_month = np.zeros(T, dtype=int)

for t, m_start in enumerate(monthly_dates):
    m_end = monthly_dates[t + 1] - pd.Timedelta(days=1) if t < T - 1 else date_end
    m_mask = (period_dates >= m_start) & (period_dates <= m_end)
    monthly_prices[t, :] = daily_prices[:, m_mask].mean(axis=1)
    days_per_month[t] = m_mask.sum()

# ── Monthly rate caps ──────────────────────────────────────────────────
monthly_inj_cap = days_per_month * IR   # MWh injectable per month
monthly_wit_cap = days_per_month * WR   # MWh withdrawable per month

print(f"\nTime steps: T = {T} months")
print(f"Sim paths:  N = {N_sim}")
print(f"\nMonthly summary:")
print(f"{'Month':>8s}  {'Days':>5s}  {'Inj cap':>10s}  {'Wit cap':>10s}  {'Avg price':>10s}")
print("-" * 50)
for t in range(T):
    d = monthly_dates[t]
    print(f"{d.strftime('%b %Y'):>8s}  {days_per_month[t]:5d}  "
          f"{monthly_inj_cap[t]:>10,.0f}  {monthly_wit_cap[t]:>10,.0f}  "
          f"{monthly_prices[t, :].mean():10.2f}")

In [None]:
# Plot simulated monthly average price paths
fig = go.Figure()
for k in range(min(50, N_sim)):
    fig.add_trace(go.Scatter(
        x=monthly_dates, y=monthly_prices[:, k],
        mode="lines", line=dict(color="lightgray", width=0.5),
        showlegend=False
    ))
fig.add_trace(go.Scatter(
    x=monthly_dates, y=monthly_prices.mean(axis=1),
    mode="lines+markers", line=dict(color="orange", width=2),
    name="Mean price"
))
fig.update_layout(
    template="plotly_dark",
    title="Monthly average gas prices (simulated, 50 paths shown)",
    xaxis_title="Month", yaxis_title="Price [EUR/MWh]",
    height=400,
)
fig.show()

## 2. The Bellman Equation for Gas Storage

*(This section mirrors the theory from `toy_gas_storage.ipynb`, adapted to our real storage problem.)*

### The idea

At every time step $t$, we observe the current **gas price** $S_t$ and **inventory level** $I_t$, and we must choose an **action** $a_t$ (inject, withdraw, or hold). The **value function** $V_t(I)$ represents the maximum expected profit we can earn from time $t$ onward, starting with inventory $I$; it also depends on the current price $S_t$, which the notation leaves implicit.

### Notation

$$
\begin{align*}
T &= 12 \;\text{(monthly time steps: Apr 2024 to Mar 2025)} \\
N &= 1{,}000 \;\text{(simulated price paths)} \\
I_{\max} &= 200{,}000 \;\text{MWh (working gas volume)} \\
\Delta I &= 20{,}000 \;\text{MWh (inventory grid step)} \\
q_{\text{in}}(t) &= \text{IR} \times \text{days}(t) \;\text{MWh (monthly injection cap, rounded down to a multiple of } \Delta I \text{)} \\
q_{\text{out}}(t) &= \text{WR} \times \text{days}(t) \;\text{MWh (monthly withdrawal cap, rounded down to a multiple of } \Delta I \text{)} \\
\mathcal{I} &= \{0,\;\Delta I,\;2\Delta I,\;\dots,\;I_{\max}\} \;\;\text{(inventory grid, 11 levels)}
\end{align*}
$$

### Action set

At time $t$ with inventory $I$:

$$
\mathcal{A}(I, t) = \begin{cases}
\{a \ge 0 \mid a \le q_{\text{in}}(t),\;\; I + a \le I_{\max},\;\; a \in \Delta I \cdot \mathbb{Z}_{\ge 0}\} & \text{if } t \in \text{injection season} \\[6pt]
\{a \le 0 \mid |a| \le q_{\text{out}}(t),\;\; I + a \ge 0,\;\; a \in \Delta I \cdot \mathbb{Z}_{\le 0}\} & \text{if } t \in \text{withdrawal season}
\end{cases}
$$

### The Bellman equation

$$
\boxed{V_t(I) = \max_{a \in \mathcal{A}(I,\,t)} \Big\{ \underbrace{-a \cdot S_t}_{\text{immediate cash flow}} \;+\; \underbrace{\mathbb{E}\big[V_{t+1}(I + a)\;\big|\;S_t\big]}_{\text{continuation value}} \Big\}}
$$

**Reading the equation:**
- **Inject** ($a > 0$): we *buy* gas → cash flow $= -a \cdot S_t < 0$ (we pay), inventory goes up.
- **Hold** ($a = 0$): no trade → cash flow $= 0$, inventory stays.
- **Withdraw** ($a < 0$): we *sell* gas → cash flow $= -a \cdot S_t > 0$ (we receive), inventory goes down.

We pick the action that maximizes the **total** of immediate cash flow **plus** the expected value of being in the resulting state tomorrow.

### Terminal condition

At $t = T$ (end of March 2025), the storage contract ends. Any remaining gas has zero value:

$$
V_T(I) = 0 \quad \forall\; I \in \mathcal{I}
$$
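
As a worked first step, and assuming prices are positive, plugging this terminal condition into the Bellman equation at the last decision step $t = T-1$ (March, a withdrawal month) leaves only the immediate cash flow, so the best action is to sell as much as the cap allows:

$$
V_{T-1}(I) = \max_{a \in \mathcal{A}(I,\,T-1)} \big\{ -a \cdot S_{T-1} + 0 \big\} = \min\big(I,\; q_{\text{out}}(T-1)\big) \cdot S_{T-1}
$$

With the March cap of $\lfloor 2{,}800 \times 31 \,/\, 20{,}000 \rfloor \times 20{,}000 = 80{,}000$ MWh, a storage that is still full at the start of March can sell only 80,000 of its 200,000 MWh. The remainder expires worthless, which gives the policy an incentive to start withdrawing earlier.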

### Why backward induction?

We cannot solve $V_0$ directly: it depends on $V_1$, which depends on $V_2$, and so on. So we solve **backwards**:

$$
V_T \;\rightarrow\; V_{T-1} \;\rightarrow\; \cdots \;\rightarrow\; V_1 \;\rightarrow\; V_0
$$

At each step, the continuation values from the next step are already known.
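
The backward sweep can be sketched on a deliberately tiny deterministic problem. Everything in the block below is an illustrative assumption (four time steps, one known price per step, three inventory levels, at most one unit moved per step), not the Abraham storage. With a single known price path no regression is needed, so this shows only the recursion itself.

```python
import numpy as np

# Toy inputs (hypothetical, chosen so the result is easy to check by hand)
prices = np.array([10.0, 12.0, 20.0, 18.0])   # one known price per step
inject_steps = {0, 1}                          # steps where only injection is allowed
n_levels = 3                                   # inventory levels 0, 1, 2 units
max_move = 1                                   # at most one unit per step

T = len(prices)
V = np.zeros((T + 1, n_levels))                # V[T, :] = 0 is the terminal condition

for t in range(T - 1, -1, -1):                 # walk backwards: T-1, ..., 0
    for i in range(n_levels):
        if t in inject_steps:
            moves = range(0, min(max_move, n_levels - 1 - i) + 1)
        else:
            moves = range(-min(max_move, i), 1)
        # Bellman: immediate cash flow -a * S_t plus the already-known next-step value
        V[t, i] = max(-a * prices[t] + V[t + 1, i + a] for a in moves)

print(V[0, 0])   # 16.0: buy at 10 and 12, sell at 20 and 18
```

Because `V[t + 1]` is filled in before `V[t]` is computed, each maximization only looks up numbers that already exist, which is the whole point of going backwards.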

## 3. LSMC ‚Äî Estimating Continuation Values

### The problem

At each time step $t$, for each next inventory level $I' = I + a$, we need to compute:

$$
\mathbb{E}\big[V_{t+1}(I')\;\big|\;S_t\big]
$$

This is the expected future value of having inventory $I'$ *tomorrow*, **conditional on today's price** $S_t$. In the toy example we could compute this exactly (two paths, simple average). With 1,000 paths, we use **regression**.

### The LSMC solution

1. From the previous backward step, we already know $V_{t+1}^{(k)}(I')$ for **each path** $k = 1, \dots, N$.
2. We **regress** these values on polynomial basis functions of today's price:

$$
V_{t+1}^{(k)}(I') \;\approx\; \beta_0 + \beta_1\,S_t^{(k)} + \beta_2\,(S_t^{(k)})^2 + \beta_3\,(S_t^{(k)})^3
$$

3. The **fitted values** from the regression give us a smooth estimate of $\mathbb{E}[V_{t+1}(I') \mid S_t]$.
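
A minimal sketch of steps 2 and 3 on synthetic data may help. The price sample, the "true" conditional mean and the noise level below are made up for illustration; only the cubic basis and the least-squares fit mirror the description above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths = 5_000

S_t = rng.uniform(15.0, 45.0, n_paths)              # today's price on each path
# Synthetic stand-in for V_{t+1}(I'): a known function of S_t plus noise
true_cond_mean = 1_000.0 * S_t + 20.0 * S_t**2
V_next = true_cond_mean + rng.normal(0.0, 3_000.0, n_paths)

# Design matrix with columns 1, S, S^2, S^3 (cubic polynomial basis)
X = np.column_stack([S_t**d for d in range(4)])
beta, *_ = np.linalg.lstsq(X, V_next, rcond=None)   # ordinary least squares
fitted = X @ beta                                   # estimate of E[V_next | S_t]

max_rel_err = np.max(np.abs(fitted - true_cond_mean) / true_cond_mean)
print(f"max relative error of fitted vs. true conditional mean: {max_rel_err:.3%}")
```

The individual `V_next` values are noisy, but the fitted curve averages that noise out across paths with similar prices, which is what makes it usable as a conditional expectation.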

### Key principle: *Decide with fitted, record with actual*

This is the heart of the Longstaff–Schwartz method:

- Use **fitted** (regression-based) continuation values to **choose** the best action $a^*_k$ for each path.
- **Record** $V_t^{(k)}(I)$ using the **actual** (pathwise) continuation values:

$$
V_t^{(k)}(I) = -a^*_k \cdot S_t^{(k)} \;+\; V_{t+1}^{(k)}(I + a^*_k) \quad \text{(actual, not fitted!)}
$$

**Why?** The regression provides a smooth estimate of expected future value that depends *only* on current information ($S_t$). This prevents **look-ahead bias**: we don't peek at future prices to decide. But we record the *true* pathwise value so that the next regression step works with unbiased data.
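
The principle can be illustrated with a single decision step. The sketch below assumes a hypothetical setup: one unit in store, one remaining opportunity to sell at the next step, and a random-walk price move. It is not the notebook's price model. It chooses between holding and selling with fitted values, records the pathwise result, and contrasts that with a choice that peeks at next step's price.

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths = 2_000
S_t = rng.uniform(15.0, 45.0, n_paths)
S_next = S_t + rng.normal(0.0, 5.0, n_paths)      # hypothetical price move

# Pathwise next-step values for inventory levels 0 and 1:
# an empty store is worth 0, one unit is worth next step's (non-negative) price.
V_next = np.vstack([np.zeros(n_paths), np.maximum(S_next, 0.0)])

# Regress each level's next-step value on a cubic in today's price
X = np.column_stack([S_t**d for d in range(4)])
fitted = np.vstack([X @ np.linalg.lstsq(X, V_next[i], rcond=None)[0] for i in range(2)])

actions = np.array([0, -1])                       # hold, or withdraw one unit
next_level = 1 + actions                          # resulting inventory level
cash = -actions[:, None] * S_t                    # immediate cash flow, shape (2, n_paths)

paths = np.arange(n_paths)
best = np.argmax(cash + fitted[next_level], axis=0)           # decide with FITTED
V_t = cash[best, paths] + V_next[next_level[best], paths]     # record with ACTUAL

# For contrast: deciding with the actual next-step values uses future information
V_peek = np.max(cash + V_next[next_level], axis=0)

print(f"mean value, fitted decisions:  {V_t.mean():.2f}")
print(f"mean value, peeking decisions: {V_peek.mean():.2f}")
```

The peeking value is never lower on any path, because it picks the better of the two outcomes after the fact; that gap is the look-ahead bias the fitted decision rule avoids.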

### Visual intuition

For a fixed next inventory level $I'$, the regression maps today's price to an estimate of tomorrow's value:

$$
S_t \;\;\xrightarrow[\text{OLS}]{\text{polynomial regression}}\;\; \hat{\mathbb{E}}\big[V_{t+1}(I')\;\big|\;S_t\big]
$$

## 4. Discretization & Setup

We discretize inventory into a grid of 11 levels and define monthly rate caps:

| Parameter | Value |
|:--|:--|
| $\Delta I$ | 20,000 MWh |
| Inventory levels | $\{0,\; 20{,}000,\; 40{,}000,\; \dots,\; 200{,}000\}$ (11 levels) |
| Monthly injection cap | $\text{IR} \times \text{days}(t)$, rounded down to multiple of $\Delta I$ |
| Monthly withdrawal cap | $\text{WR} \times \text{days}(t)$, rounded down to multiple of $\Delta I$ |

The function `get_feasible_actions(I, t)` returns all valid inventory changes at time $t$ given current inventory $I$, respecting:
1. **Seasonal constraints** (injection/withdrawal months)
2. **Rate limits** (max change per month)
3. **Inventory bounds** $[0, \text{WGV}]$
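
To see what rounding down to the grid costs, the caps can be worked out for a few month lengths. This is only the arithmetic from the table above applied to the stated IR, WR and $\Delta I$; the choice of 30, 31 and 28 days is for illustration.

```python
import numpy as np

IR, WR = 2_100, 2_800              # MWh/day, from the parameter list
delta_I = 20_000                   # MWh grid step
days = np.array([30, 31, 28])      # illustrative month lengths

inj_cap = days * IR                # raw monthly injection caps
wit_cap = days * WR                # raw monthly withdrawal caps
inj_on_grid = (inj_cap // delta_I) * delta_I
wit_on_grid = (wit_cap // delta_I) * delta_I

print(inj_cap, "->", inj_on_grid)  # [63000 65100 58800] -> [60000 60000 40000]
print(wit_cap, "->", wit_on_grid)  # [84000 86800 78400] -> [80000 80000 60000]
```

A 28-day month loses 18,400 MWh of withdrawal capacity to rounding, so the coarse grid understates flexibility; a finer $\Delta I$ would reduce this at the cost of more inventory levels.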

In [None]:
# ── Inventory grid ─────────────────────────────────────────────────────
delta_I = 20_000                                        # step size [MWh]
inventory_grid = np.arange(0, WGV + delta_I, delta_I)   # [0, 20k, ..., 200k]
n_levels = len(inventory_grid)                           # 11 levels

# ── Monthly rate caps rounded to grid ──────────────────────────────────
monthly_inj_steps = np.floor(monthly_inj_cap / delta_I).astype(int) * delta_I
monthly_wit_steps = np.floor(monthly_wit_cap / delta_I).astype(int) * delta_I

print(f"Inventory grid: {inventory_grid}")
print(f"Number of levels: {n_levels}")
print(f"Grid spacing: {delta_I:,} MWh")
print(f"\nMonthly caps (rounded to grid):")
print(f"{'Month':>8s}  {'Inj cap':>10s}  {'Wit cap':>10s}")
print("-" * 35)
for t in range(T):
    d = monthly_dates[t]
    print(f"{d.strftime('%b %Y'):>8s}  {monthly_inj_steps[t]:>10,}  {monthly_wit_steps[t]:>10,}")


def get_feasible_actions(I: float, t: int) -> list:
    """Return sorted list of feasible inventory changes at time t given inventory I.
    
    Actions are constrained by:
      1. Seasonal restrictions (injection/withdrawal months)
      2. Rate limits (monthly caps rounded to grid)
      3. Inventory bounds [0, WGV]
    
    Returns:
        Sorted list of feasible actions (multiples of delta_I).
        Positive = injection, negative = withdrawal, 0 = hold.
    """
    month = monthly_dates[t].month
    actions = [0]  # holding is always feasible
    
    if month in injection_months:
        # Injection season: can inject up to monthly cap
        max_inj = min(monthly_inj_steps[t], WGV - I)
        n_steps = int(max_inj // delta_I)
        actions += [delta_I * s for s in range(1, n_steps + 1)]
    else:
        # Withdrawal season: can withdraw up to monthly cap
        max_wit = min(monthly_wit_steps[t], I)
        n_steps = int(max_wit // delta_I)
        actions += [-delta_I * s for s in range(1, n_steps + 1)]
    
    return sorted(set(actions))


# ── Quick sanity check: show feasible actions ──────────────────────────
print("\nFeasible actions examples:")
test_cases = [
    (0, 0, "Apr, empty"),
    (0, 100_000, "Apr, half full"),
    (0, 200_000, "Apr, full"),
    (6, 200_000, "Oct, full"),
    (6, 0, "Oct, empty"),
]
for t_ex, I_ex, desc in test_cases:
    acts = get_feasible_actions(I_ex, t_ex)
    print(f"  t={t_ex} ({desc:20s}): {[f'{a:+,}' for a in acts]}")

## 5. Backward Induction with LSMC

This is the heart of the algorithm. We work **backwards** from $t = T$ to $t = 0$:

### Algorithm (for each time step $t = T{-}1, \dots, 0$):

1. **Fit regressions.** For each next inventory level $I'$, regress $V_{t+1}^{(k)}(I')$ on a polynomial of $S_t^{(k)}$:
$$\hat{\mathbb{E}}[V_{t+1}(I') \mid S_t] = \beta_0 + \beta_1 S_t + \beta_2 S_t^2 + \beta_3 S_t^3$$

2. **Choose actions.** For each current level $I$ and each path $k$, evaluate all feasible actions $a$ using **fitted** continuation values:
$$\text{total}^{(k)}(a) = -a \cdot S_t^{(k)} + \hat{\mathbb{E}}[V_{t+1}(I+a) \mid S_t^{(k)}]$$
Pick $a^*_k = \arg\max_a \;\text{total}^{(k)}(a)$.

3. **Record values.** Store the **actual** (not fitted) value:
$$V_t^{(k)}(I) = -a^*_k \cdot S_t^{(k)} + V_{t+1}^{(k)}(I + a^*_k)$$

4. **Store regression coefficients** $\beta(I')$ for later use in forward simulation.

> **Why fitted for decisions but actual for recording?**  
> The regression gives a smooth estimate of the conditional expectation that uses only information available at time $t$, which is what a decision needs. But the recorded value should be the *true* pathwise cash flow so subsequent regressions work with honest data.

In [None]:
# ══════════════════════════════════════════════════════════════════════════
#  BACKWARD INDUCTION: solving the Bellman equation via LSMC
# ══════════════════════════════════════════════════════════════════════════

poly_deg = 3  # cubic polynomial for LSMC regression

# V[t, i, k] = value of being at inventory level i at time t on path k
V = np.zeros((T + 1, n_levels, N_sim))
# V[T, :, :] = 0 (terminal condition ‚Äî already initialized)

# reg_coeffs[(t, i_next)] = polynomial coefficients for continuation value regression
reg_coeffs = {}

print(f"Running backward induction: T={T} months, {n_levels} levels, {N_sim} paths")
print(f"Polynomial degree: {poly_deg}\n")

for t in range(T - 1, -1, -1):
    S_t = monthly_prices[t, :]  # prices at time t across all paths
    X = np.column_stack([S_t**d for d in range(poly_deg + 1)])  # (N_sim, 4)
    
    # Step 1: Fit regressions for each next-inventory level
    fitted_cont = np.zeros((n_levels, N_sim))
    for i_next in range(n_levels):
        y = V[t + 1, i_next, :]  # actual values from (t+1) for this level
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        reg_coeffs[(t, i_next)] = beta
        fitted_cont[i_next, :] = X @ beta
    
    # Step 2 + 3: For each current inventory level, choose best action and record
    for i_idx in range(n_levels):
        I = inventory_grid[i_idx]
        feasible = get_feasible_actions(I, t)
        
        n_act = len(feasible)
        # Evaluate all actions using FITTED continuation (decide with fitted)
        total_fitted = np.zeros((n_act, N_sim))
        total_actual = np.zeros((n_act, N_sim))
        
        for a_idx, a in enumerate(feasible):
            I_next = I + a
            i_next_idx = int(round(I_next / delta_I))
            cf = -a * S_t  # immediate cash flow
            total_fitted[a_idx, :] = cf + fitted_cont[i_next_idx, :]
            total_actual[a_idx, :] = cf + V[t + 1, i_next_idx, :]
        
        # Pick best action per path using FITTED values
        best_a_idx = np.argmax(total_fitted, axis=0)
        # Record ACTUAL value (decide with fitted, record with actual)
        V[t, i_idx, :] = total_actual[best_a_idx, np.arange(N_sim)]
    
    if t % 3 == 0 or t == T - 1:
        print(f"  t={t:2d} ({monthly_dates[t].strftime('%b %Y'):>8s})  "
              f"E[V(0)] = {V[t, 0, :].mean():>12,.0f}  "
              f"E[V(max)] = {V[t, -1, :].mean():>12,.0f}")

print(f"\n{'='*50}")
print(f"  E[V₀(I=0)] = {V[0, 0, :].mean():>12,.0f} EUR  (std: {V[0, 0, :].std():>10,.0f})")
print(f"  This is the expected value of an empty storage today.")

## 6. Forward Simulation ‚Äî Extracting the Optimal Policy

The backward induction gave us:
- The **value function** $V_t(I)$ for every $(t, I)$ pair and path
- **Regression coefficients** $\beta(t, I')$ that approximate $\mathbb{E}[V_{t+1}(I') \mid S_t]$

Now we run a **forward simulation** to see what the optimal policy actually *does*:

1. Start with $I_0 = 0$ (empty storage)
2. At each $t$, use the stored regression coefficients to compute fitted continuation values
3. Choose the action $a^*$ that maximizes immediate cash flow + fitted continuation
4. Record the action, cash flow, and update inventory

This gives us **realized inventory paths** and **realized profits** across all simulated price scenarios.

> **Why a separate forward pass?**  
> The backward induction computes values for *every* inventory level. The forward simulation follows only the *actually-visited* states, starting from $I_0 = 0$, producing the trading strategy we'd execute in practice.

In [None]:
# ══════════════════════════════════════════════════════════════════════════
#  FORWARD SIMULATION: execute the optimal policy
# ══════════════════════════════════════════════════════════════════════════

I_paths = np.zeros((N_sim, T + 1))     # inventory trajectory per path
I_paths[:, 0] = 0                       # start with empty storage

action_paths = np.zeros((N_sim, T))     # action taken at each step
cf_paths = np.zeros((N_sim, T))         # cash flow at each step

for t in range(T):
    S_t = monthly_prices[t, :]
    X = np.column_stack([S_t**d for d in range(poly_deg + 1)])

    # Compute fitted continuation for ALL next-inventory levels using stored betas
    fitted_cont_fwd = np.zeros((n_levels, N_sim))
    for i_next in range(n_levels):
        fitted_cont_fwd[i_next, :] = X @ reg_coeffs[(t, i_next)]

    # For each inventory level currently reached by some paths, choose optimal action
    for i_idx in range(n_levels):
        I = inventory_grid[i_idx]
        mask = np.isclose(I_paths[:, t], I)
        if not mask.any():
            continue

        feasible = get_feasible_actions(I, t)
        n_act = len(feasible)

        # Evaluate all actions using fitted continuation
        total_fitted = np.zeros((n_act, N_sim))
        for a_idx, a in enumerate(feasible):
            I_next = I + a
            i_next_idx = int(round(I_next / delta_I))
            total_fitted[a_idx, :] = -a * S_t + fitted_cont_fwd[i_next_idx, :]

        best_a_idx = np.argmax(total_fitted, axis=0)

        # Record chosen action for paths currently at this inventory level
        for a_idx, a in enumerate(feasible):
            update = mask & (best_a_idx == a_idx)
            action_paths[update, t] = a

    # Update cash flows and inventory
    cf_paths[:, t] = -action_paths[:, t] * monthly_prices[t, :]
    I_paths[:, t + 1] = I_paths[:, t] + action_paths[:, t]

# ── Summary statistics ─────────────────────────────────────────────────
total_profit = cf_paths.sum(axis=1)

print(f"Forward simulation results ({N_sim} paths)")
print(f"{'='*45}")
print(f"Mean total profit:   {total_profit.mean():>12,.0f} EUR")
print(f"Std total profit:    {total_profit.std():>12,.0f} EUR")
print(f"Min total profit:    {total_profit.min():>12,.0f} EUR")
print(f"Max total profit:    {total_profit.max():>12,.0f} EUR")
print(f"Median total profit: {np.median(total_profit):>12,.0f} EUR")

In [None]:
# ── Plot results ───────────────────────────────────────────────────────
# I_paths has T + 1 points; the last one is the inventory after March, so it
# is labeled with the following month to avoid a duplicate "Mar '25" category.
x_labels = [d.strftime("%b '%y") for d in monthly_dates] + [(date_end + dt.timedelta(days=1)).strftime("%b '%y")]

fig = make_subplots(
    rows=3, cols=1, shared_xaxes=False,
    subplot_titles=[
        "Inventory paths (50 sample paths + mean)",
        "Mean monthly cash flow",
        "Total profit distribution",
    ],
    vertical_spacing=0.10,
)

# ── Row 1: Inventory paths ─────────────────────────────────────────────
for k in range(min(50, N_sim)):
    fig.add_trace(go.Scatter(
        x=x_labels, y=I_paths[k, :], mode="lines",
        line=dict(color="lightgray", width=0.5), showlegend=False
    ), row=1, col=1)
fig.add_trace(go.Scatter(
    x=x_labels, y=I_paths.mean(axis=0), mode="lines+markers",
    line=dict(color="cyan", width=3), name="Mean inventory"
), row=1, col=1)
fig.update_yaxes(title_text="Inventory [MWh]", row=1, col=1)

# ── Row 2: Mean cash flow per month ────────────────────────────────────
mean_cf = cf_paths.mean(axis=0)
colors = ["#74d576" if c >= 0 else "#ff6b6b" for c in mean_cf]
fig.add_trace(go.Bar(
    x=x_labels[:T], y=mean_cf, marker_color=colors,
    name="Mean cash flow", showlegend=False
), row=2, col=1)
fig.update_yaxes(title_text="Cash flow [EUR]", row=2, col=1)

# ── Row 3: Profit distribution ─────────────────────────────────────────
fig.add_trace(go.Histogram(
    x=total_profit, nbinsx=50,
    marker_color="orange", name="Total profit", showlegend=False
), row=3, col=1)
fig.update_xaxes(title_text="Total profit [EUR]", row=3, col=1)
fig.update_yaxes(title_text="Count", row=3, col=1)

fig.update_layout(template="plotly_dark", height=900, title="LSMC Gas Storage: Results")
fig.show()

## 7. Verification & Analysis

We sanity-check the LSMC solution by verifying:

1. **Inventory bounds**: all values in $[0, \text{WGV}]$
2. **Seasonal constraints**: no withdrawal during injection season and vice versa
3. **Rate limits**: actions never exceed monthly caps
4. **Non-negative average profit**: doing nothing earns exactly zero, so the policy should do at least as well on average
5. **Grid consistency**: inventory always on multiples of $\Delta I$
6. **Backward vs. forward consistency**: $E[V_0(0)]$ from backward ≈ mean profit from forward

We also compare the LSMC policy against a **naive strategy** (always inject/withdraw at maximum rate) to show that the optimal policy adds value.

In [None]:
# ══════════════════════════════════════════════════════════════════════════
#  VERIFICATION CHECKS
# ══════════════════════════════════════════════════════════════════════════

checks_passed = 0
total_checks = 6

# Check 1: Inventory bounds
inv_min = I_paths.min()
inv_max = I_paths.max()
ok1 = inv_min >= -1e-6 and inv_max <= WGV + 1e-6
print(f"{'✅' if ok1 else '❌'} Check 1: Inventory bounds [{inv_min:,.0f}, {inv_max:,.0f}] ⊂ [0, {WGV:,}]")
checks_passed += ok1

# Check 2: Seasonal constraints
ok2 = True
for t in range(T):
    month = monthly_dates[t].month
    if month in injection_months:
        if (action_paths[:, t] < -1e-6).any():
            ok2 = False
            break
    else:
        if (action_paths[:, t] > 1e-6).any():
            ok2 = False
            break
print(f"{'✅' if ok2 else '❌'} Check 2: Seasonal constraints respected")
checks_passed += ok2

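# Check 3: Rate limits (compared month by month and per direction,
# not against the single largest cap over the whole horizon)
max_action = np.abs(action_paths).max()
inj_ok = (action_paths <= monthly_inj_cap[None, :] + 1e-6).all()
wit_ok = (-action_paths <= monthly_wit_cap[None, :] + 1e-6).all()
ok3 = bool(inj_ok and wit_ok)
print(f"{'✅' if ok3 else '❌'} Check 3: Max action = {max_action:,.0f} MWh (within rate limits)")
checks_passed += ok3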

# Check 4: Non-negative average profit
ok4 = total_profit.mean() > 0
print(f"{'✅' if ok4 else '❌'} Check 4: Mean profit = {total_profit.mean():,.0f} EUR > 0")
checks_passed += ok4

# Check 5: Grid consistency
all_on_grid = np.all(np.abs(I_paths % delta_I) < 1e-6)
print(f"{'✅' if all_on_grid else '❌'} Check 5: All inventory values on grid multiples of {delta_I:,}")
checks_passed += all_on_grid

# Check 6: Backward vs forward consistency
backward_value = V[0, 0, :].mean()
forward_value = total_profit.mean()
pct_diff = abs(backward_value - forward_value) / abs(backward_value) * 100
ok6 = pct_diff < 5.0  # within 5%
print(f"{'✅' if ok6 else '❌'} Check 6: Backward E[V₀(0)] = {backward_value:,.0f}, "
      f"Forward mean = {forward_value:,.0f} (diff: {pct_diff:.1f}%)")
checks_passed += ok6

print(f"\n{'='*50}")
print(f"  {checks_passed}/{total_checks} checks passed")

In [None]:
# ══════════════════════════════════════════════════════════════════════════
#  COMPARISON WITH NAIVE STRATEGY
# ══════════════════════════════════════════════════════════════════════════
# Naive policy: always inject at max rate during injection season,
#               always withdraw at max rate during withdrawal season.
# This is a deterministic policy: same actions regardless of price.

I_naive = np.zeros(T + 1)
a_naive = np.zeros(T)
I_curr = 0.0

for t in range(T):
    month = monthly_dates[t].month
    if month in injection_months:
        a = min(monthly_inj_cap[t], WGV - I_curr)
        a = int(a // delta_I) * delta_I
    else:
        a = -min(monthly_wit_cap[t], I_curr)
        a = -int(abs(a) // delta_I) * delta_I
    a_naive[t] = a
    I_curr += a
    I_naive[t + 1] = I_curr

# Evaluate naive policy across all price paths
cf_naive = -a_naive[:, None] * monthly_prices   # (T, N_sim)
naive_profit = cf_naive.sum(axis=0)             # (N_sim,)

print("Comparison: LSMC optimal policy vs. naive (max inject/withdraw)")
print("=" * 60)
print(f"{'':>25s}  {'LSMC':>12s}  {'Naive':>12s}")
print("-" * 55)
print(f"{'Mean profit':>25s}  {total_profit.mean():>12,.0f}  {naive_profit.mean():>12,.0f}")
print(f"{'Std profit':>25s}  {total_profit.std():>12,.0f}  {naive_profit.std():>12,.0f}")
print(f"{'Min profit':>25s}  {total_profit.min():>12,.0f}  {naive_profit.min():>12,.0f}")
print(f"{'Max profit':>25s}  {total_profit.max():>12,.0f}  {naive_profit.max():>12,.0f}")
print(f"\n  LSMC advantage (mean): {total_profit.mean() - naive_profit.mean():>+12,.0f} EUR")

# Plot comparison
fig = go.Figure()
fig.add_trace(go.Histogram(x=total_profit, nbinsx=50, name="LSMC optimal",
                            marker_color="orange", opacity=0.7))
fig.add_trace(go.Histogram(x=naive_profit, nbinsx=50, name="Naive (max rate)",
                            marker_color="steelblue", opacity=0.7))
fig.update_layout(
    template="plotly_dark", barmode="overlay",
    title="Profit distribution: LSMC optimal vs. naive strategy",
    xaxis_title="Total profit [EUR]", yaxis_title="Count",
    height=400,
)
fig.show()

# Plot inventory comparison
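# Label the final inventory point with the month after the horizon so the categories stay unique
x_labels = [d.strftime("%b '%y") for d in monthly_dates] + [(date_end + dt.timedelta(days=1)).strftime("%b '%y")]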
fig2 = go.Figure()
fig2.add_trace(go.Scatter(
    x=x_labels, y=I_paths.mean(axis=0), mode="lines+markers",
    line=dict(color="orange", width=3), name="LSMC (mean)"
))
fig2.add_trace(go.Scatter(
    x=x_labels, y=I_naive, mode="lines+markers",
    line=dict(color="steelblue", width=3, dash="dash"), name="Naive"
))
fig2.update_layout(
    template="plotly_dark",
    title="Average inventory path: LSMC vs. naive",
    xaxis_title="Month", yaxis_title="Inventory [MWh]",
    height=400,
)
fig2.show()

In [None]:
"""
## 🔄 MILP vs LSMC Comparison

### Terminology Clarification: Intrinsic vs Extrinsic Value

The two terms are easy to mix up, so to be precise:

- **Intrinsic value** = value that can be locked in today by optimizing against the known forward curve
- **Extrinsic value** = optionality value from uncertainty/flexibility

**MILP** (deterministic) captures:
- ✅ Intrinsic value from price spreads in the forward curve
- ❌ NO extrinsic value (assumes prices are known with certainty)

**LSMC** (stochastic) captures:
- ✅ Intrinsic value from expected price spreads
- ✅ Extrinsic value from price uncertainty and optimal timing

So MILP doesn't capture the "option value" of waiting for favorable price realizations!
"""

# LSMC results from our backward/forward simulation
lsmc_expected_value = total_profit.mean()
lsmc_backward_value = V[0, 0, :].mean()

print(f"LSMC Results:")
print(f"  Mean profit (forward sim):  {lsmc_expected_value:>12,.0f} EUR")
print(f"  E[V₀(0)] (backward):         {lsmc_backward_value:>12,.0f} EUR")
print(f"  Standard deviation:          {total_profit.std():>12,.0f} EUR")
print(f"\nNote: To compare with MILP, you'd need to run gas_storage_milp.ipynb")
print(f"      with the same storage parameters and forward price curve.")

In [None]:
"""
### 🤔 Why not run MILP on each trajectory separately?

**In theory, you could run MILP with perfect foresight on each price path:**
- This would give you an **upper bound** on achievable profit
- But it's **unrealistic** - you don't know future prices when making decisions!
- LSMC solves the realistic problem: optimal policy under uncertainty

**Key differences:**
"""

print("Conceptual comparison:")
print(f"\n{'Method':<35s} {'Assumption':<40s}")
print("="*75)
print(f"{'MILP on forward curve':<35s} {'Deterministic: one known price path':<40s}")
print(f"{'LSMC (this notebook)':<35s} {'Stochastic: optimal policy under uncertainty':<40s}")
print(f"{'MILP per path (perfect foresight)':<35s} {'Clairvoyant: knows future before acting':<40s}")
print("\nExpected ordering: MILP(forward) ≤ LSMC ≤ MILP(perfect foresight)")
print(f"\nLSMC value from this notebook: {lsmc_expected_value:,.0f} EUR")
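
The two outer terms of the ordering above can be checked on a toy problem. The sketch below is an illustration under stated assumptions, not the notebook's model: the storage is a hypothetical 3-level unit that moves at most one level per month, the price paths are synthetic, and the forward curve is taken to be the sample mean of those paths. `deterministic_value` is a hypothetical helper that solves the known-price problem exactly by backward induction.

```python
import numpy as np

def deterministic_value(prices, n_levels=3, start_level=0):
    # Exact dynamic programme for a toy storage on a KNOWN price path.
    # State: inventory level 0..n_levels-1; action: move -1, 0 or +1 level.
    # Injecting one level costs the current price, withdrawing earns it.
    # Gas left in storage at the end is worth nothing.
    value = np.zeros(n_levels)
    for price in prices[::-1]:
        new_value = np.full(n_levels, -np.inf)
        for level in range(n_levels):
            for step in (-1, 0, 1):
                nxt = level + step
                if 0 <= nxt < n_levels:
                    new_value[level] = max(new_value[level], -step * price + value[nxt])
        value = new_value
    return value[start_level]

# Synthetic price paths around 30 EUR/MWh (illustrative only).
rng = np.random.default_rng(42)
n_paths, n_steps = 500, 12
shocks = rng.normal(0.0, 0.15, size=(n_paths, n_steps))
paths = 30.0 * np.exp(np.cumsum(shocks, axis=1))
forward_curve = paths.mean(axis=0)  # assumption: forward = mean of simulated paths

value_forward = deterministic_value(forward_curve)
value_foresight = np.mean([deterministic_value(p) for p in paths])

print(f"Deterministic value on forward curve: {value_forward:8.2f}")
print(f"Mean perfect-foresight value:         {value_foresight:8.2f}")
```

Because the known-price value is a maximum of functions that are linear in the prices, it is convex in the price path, so the perfect-foresight average cannot fall below the forward-curve value. A well-estimated LSMC policy would be expected to land between the two.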

In [None]:
"""
### 🎯 Why Different Approaches? Key Insights

#### 1️⃣ **Why not run MILP on each trajectory?**

**You could**, but the result is not achievable in practice.

**Problem:** This assumes **perfect foresight**: the entire price path is known before any decision is made.
- In reality, at month t, you only know prices up to t, not future prices
- MILP-per-path gives an **upper bound** (unattainable in practice)
- LSMC estimates the **best achievable value** under uncertainty

**Analogy:** MILP-per-path is like playing poker with all cards revealed. LSMC is playing with only your hand visible.

---

#### 2️⃣ **Why not use backward induction on a single forward curve (deterministic)?**

**You could**, but it adds nothing over the MILP.

```python
# Backward induction on forward curve = dynamic programming with no randomness
# At each state (t, I):
#   V(t, I) = max over actions { profit(action) + V(t+1, I_new) }
#
# No regression needed! Just recursive calculation.
# Result: the same optimum as the MILP, up to the inventory-grid discretization.
# (Both solve the same deterministic optimization)
```

**Why?** With no uncertainty, the Bellman equation collapses to standard deterministic dynamic programming:
- No need for Monte Carlo (only 1 path)
- No need for regression (continuation value is deterministic)
- Just maximize recursively: this solves the same problem as the MILP

**Benefit of LSMC:** It only pays off when there is **uncertainty** to average over.

---

#### 3️⃣ **What about injection/withdrawal rate limits?**

**The current LSMC model has NO hard constraints** on:
- ❌ Maximum injection rate per month
- ❌ Maximum withdrawal rate per month
- ❌ Inventory bounds (handled by the grid, but only at grid resolution)

BSD constraints are not modeled at all (see Simplifications at the top of this notebook).

**In MILP:**
```python
# Explicit constraints:
injection[t] <= max_injection_rate
withdrawal[t] <= max_withdrawal_rate
0 <= inventory[t] <= capacity
```

**In LSMC, you'd need to:**

**Option A: Enforce in action space** (already how inventory bounds are handled)
```python
# In get_feasible_actions():
# Only generate actions that satisfy:
#   withdrawal <= min(inventory[t], max_withdrawal_rate * Δt)
#   injection <= min(capacity - inventory[t], max_injection_rate * Δt)
```
✅ Simple, but **inventory grid spacing** limits enforcement accuracy
- If grid step = 20k MWh but max_withdrawal = 15k MWh, no non-zero withdrawal fits on the grid
- Need finer grid or interpolation

**Option B: Penalty on continuation values** (soft constraint)
```python
# In backward induction:
continuation_value[invalid_actions] = -999999  # penalty
# The max over actions then steers away from these moves
```
❌ A finite penalty discourages violations but is not a hard guarantee

**Option C: Constrained regression** (advanced)
```python
# Use constrained least squares:
from scipy.optimize import lsq_linear
# lsq_linear only bounds the regression coefficients (lb <= x <= ub),
# so the goals below would have to be translated into coefficient bounds:
# Goal: fitted_value(inventory, action) >= 0 for valid actions
#       invalid actions excluded (value -∞) rather than fitted
```
✅ More rigorous in principle, but complex to implement

**Current model:**
- Uses grid spacing = 20k MWh
- Storage rates: inj=40k, with=30k MWh/month (from storages.json)
- Actions: ±1, 2, 3 grid steps = ±20k, 40k, 60k MWh
- **Some actions violate rates!** (e.g., +60k > 40k injection limit, -40k > 30k withdrawal limit)

**What would change if we enforce strictly?**
1. Fewer valid actions: only {-1, 0, +1, +2} grid steps (withdraw at most 20k, inject at most 40k MWh)
2. Less flexibility → lower storage value
3. More accurate model, but a finer grid is needed to use the full 30k withdrawal rate
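
Option A can be sketched as a filter over candidate grid moves. This is a minimal illustration, not the notebook's `get_feasible_actions()`: the function name `feasible_steps`, its signature, and the 11-point grid are assumptions. The grid step and rate limits are the figures quoted above.

```python
def feasible_steps(inv_idx, n_grid, grid_step, max_injection, max_withdrawal, max_step=3):
    # Return the allowed inventory moves (in grid steps) from grid index inv_idx.
    steps = []
    for k in range(-max_step, max_step + 1):
        volume = k * grid_step  # positive = injection, negative = withdrawal
        if not 0 <= inv_idx + k < n_grid:
            continue  # would leave the inventory grid
        if volume > max_injection or -volume > max_withdrawal:
            continue  # would exceed the monthly rate limit
        steps.append(k)
    return steps

# Figures from the text: 20k MWh grid, 40k injection and 30k withdrawal per month.
# The 11-point grid (0 to 200k MWh) is a hypothetical choice for illustration.
print(feasible_steps(5, n_grid=11, grid_step=20_000,
                     max_injection=40_000, max_withdrawal=30_000))  # [-1, 0, 1, 2]
```

With a 20k grid, 10k of the 30k withdrawal rate stays unused, which is the grid-alignment loss described under Option A.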

---

### 📊 Summary Table

| Method | Price Knowledge | Constraints | Optionality | Value |
|--------|----------------|-------------|-------------|-------|
| **MILP (forward curve)** | Deterministic | ✅ Hard (LP) | ❌ No | Run `gas_storage_milp.ipynb` |
| **LSMC (uncertainty)** | Stochastic | ⚠️ Soft (grid) | ✅ Yes | `lsmc_expected_value` (printed above) |
| **MILP (per path, perfect foresight)** | Clairvoyant | ✅ Hard (LP) | ✅ Yes | Upper bound (unrealistic) |

**Key insight:** Under identical constraints, LSMC value ≥ deterministic MILP because it captures:
- **Intrinsic value** from price spreads (like MILP)
- **+ Extrinsic value** from optionality (waiting for favorable prices)
"""

In [None]:
# Visualize LSMC profit distribution with statistics
fig = go.Figure()

fig.add_trace(go.Histogram(
    x=total_profit, 
    nbinsx=50,
    name='LSMC profits',
    marker_color='orange',
    opacity=0.7
))

fig.add_vline(x=lsmc_expected_value, line_dash="dash", line_color="red", line_width=3,
              annotation_text=f"Mean: {lsmc_expected_value:,.0f} EUR",
              annotation_position="top right")

fig.add_vline(x=np.median(total_profit), line_dash="dot", line_color="cyan", line_width=2,
              annotation_text=f"Median: {np.median(total_profit):,.0f} EUR",
              annotation_position="bottom right")

fig.update_layout(
    template="plotly_dark",
    title="LSMC Profit Distribution Across All Simulated Price Paths",
    xaxis_title="Total Profit [EUR]",
    yaxis_title="Count",
    height=400,
    showlegend=False
)

fig.show()

print(f"\n{'='*60}")
print("LSMC captures optionality value from price uncertainty.")
print("To compare with MILP, run gas_storage_milp.ipynb separately.")
print(f"{'='*60}")