#  Deep Learning for the Valuation of Bermudan Options: Primal-Dual Bounds Estimation

---

#### Authors:  
**Georges ROLLAND / Philippe YAO**

#### Professors:  
**Gilles Pagès / Vincent Lemaire**

---


#  Introduction

## Bermudan Options

A Bermudan option is a financial derivative that allows its holder to exercise their right at specific discrete dates before the final maturity.  
It lies between:
- European options (exercise only at maturity),
- and American options (exercise at any time before maturity).

---

## Project Objective

The goal is to **compute a reliable estimate of the price** of a Bermudan put option by combining two key approaches from the following papers:

- **Deep Optimal Stopping (Becker, Cheridito, Jentzen, 2019)**:  
  Using **neural networks** to approximate the optimal stopping strategy, determine a lower bound, and construct a candidate martingale for the upper bound.

- **Dual Approach of Rogers (2002)**:  
  Building an **upper bound** on the price by optimizing over candidate martingales.

---

## General Approach

- Simulation of trajectories for the underlying asset.
- Learning the optimal stopping rules using deep learning techniques.
- Estimation of the **lower bound** via Monte Carlo simulation.
- Estimation of the **upper bound** by combining different martingales (especially the discounted European put price).
- Application of **Richardson extrapolation** to enhance numerical accuracy.
- **Construction of a confidence interval** for the estimated price.
- Robustness tests and variance reduction.

---


### Libraries

In [120]:
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from scipy.stats import norm
import pandas as pd
import time
from scipy.optimize import minimize

### Initial Parameters

In [159]:
r = 0.06    # risk-free interest rate
sigma = 0.4 # volatility
K = 100     # strike price
T = 0.5     # maturity (in years)
N = 50      # number of discrete time steps
d = 1       # dimension of the process
h = T / N   # time step size


### 1. Path Simulation and Payoff Computation

#####  Simulation under the Black-Scholes Model

The underlying asset follows the dynamics:

$$
dS_t = r S_t \, dt + \sigma S_t \, dW_t
$$

By discretization, we simulate:

$$
S_{n+1} = S_n \exp\left( \left(r - \frac{1}{2} \sigma^2\right) \Delta t + \sigma \sqrt{\Delta t} Z_n \right)
$$

where $Z_n \sim \mathcal{N}(0,1)$ are independent standard normal random variables and $\Delta t = T/N$ is the time step.
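
As a sanity check on this discretization, the sketch below simulates all paths at once with a cumulative sum of log-increments and verifies that the discounted terminal price has mean close to $S_0$, as it should under the risk-neutral dynamics. The function name `simulate_gbm_vectorized` and the sample sizes are illustrative assumptions, not part of the notebook.

```python
import numpy as np

def simulate_gbm_vectorized(S0, M, N, r, sigma, T, seed=0):
    """Simulate M geometric Brownian motion paths on N steps; returns shape (N+1, M)."""
    rng = np.random.default_rng(seed)
    dt = T / N
    Z = rng.standard_normal((N, M))
    # Exact log-increments of the Black-Scholes dynamics over one time step
    log_increments = (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * Z
    # Prepend a row of zeros so that every path starts at S0
    log_paths = np.vstack([np.zeros((1, M)), np.cumsum(log_increments, axis=0)])
    return S0 * np.exp(log_paths)

paths = simulate_gbm_vectorized(S0=100.0, M=100_000, N=10, r=0.06, sigma=0.4, T=0.5)

# Under the risk-neutral measure, exp(-r T) * S_T has expectation S0
discounted_mean = np.exp(-0.06 * 0.5) * paths[-1].mean()
print(discounted_mean)  # close to 100 up to Monte Carlo error
```

Because the scheme is exact for geometric Brownian motion, any remaining discrepancy comes from Monte Carlo noise only, not from the time step.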

---

##### 💰 Payoff of the European Put Option

The standard payoff used is:

$$
g(S) = \max(K - S, 0)
$$

---


In [162]:
# Function to simulate paths under the Black-Scholes model
def simulate_paths(S0, M, N, r, sigma, T, d=1, seed=42):
    """
    Simulates M paths of a lognormal Brownian motion over N+1 time steps.
    
    Args:
        S0: initial asset price
        M: number of simulated paths
        N: number of time steps
        r: risk-free rate
        sigma: volatility
        T: maturity (in years)
        d: dimension of the process (default=1)
        seed: random seed for reproducibility
        
    Returns:
        time_grid: np.ndarray of time points
        X: np.ndarray of simulated paths (shape (N+1, M, d))
    """
    np.random.seed(seed)
    dt = T / N
    time_grid = np.linspace(0, T, N+1)
    
    # Initialization
    X = np.zeros((N+1, M, d))
    X[0] = S0

    # Path simulation
    for n in range(N):
        dW = np.random.normal(0, np.sqrt(dt), size=(M, d))
        X[n+1] = X[n] * np.exp((r - 0.5 * sigma**2) * dt + sigma * dW)

    return time_grid, X


# 2. Payoff function for the European Put Option
def payoff(S, K):
    """
    Standard payoff function for a European Put option.
    
    Args:
        S: underlying asset prices
        K: strike price
        
    Returns:
        np.ndarray of payoffs
    """
    return np.maximum(K - S, 0)


## Optimal Stopping Problem

We consider a discrete-time optimal stopping problem where the goal is to maximize the expected reward:

$$
\sup_{\tau \in \mathcal{T}} \mathbb{E}[g(\tau, X_\tau)],
$$

where $ X = (X_n)_{n=0}^N $ is a Markov process taking values in $ \mathbb{R}^d $, $ g $ is a measurable reward function, and $ \mathcal{T} $ is the set of all $X$-stopping times taking values in $ \{0, 1, \dots, N\} $.

In our context, $ X $ follows a Black-Scholes type dynamic, and $ g(\tau, X_\tau) $ represents the discounted payoff of a Bermudan option.

To solve this problem:

- We approximate the optimal stopping decisions directly with **neural networks**, one network per exercise date.
- We use two types of estimators:
  - A **classical estimator** (lower bound) based on the learned stopping policy.
  - A **dual estimator** (upper bound) inspired by the method of Rogers (2002).

The objective is to obtain an accurate estimate of the option price along with a rigorous confidence interval.


### 2. Neural Network for Approximating Stopping Decisions


To approximate the optimal stopping rule, we construct a family of functions $f_n^\theta : \mathbb{R}^d \to \{0, 1\}$ using feedforward neural networks.  
Each $f_n^\theta$ determines at each time step $n$ whether it is optimal to stop or to continue.

The neural network approximation $F^\theta$, whose output in $(0,1)$ is thresholded at $1/2$ to obtain the hard decision, has the structure:

$$
F^\theta = \psi \circ a_I^\theta \circ \varphi_{q_{I-1}} \circ a_{I-1}^\theta \circ \cdots \circ \varphi_{q_1} \circ a_1^\theta
$$

where:
- $\psi$ is the standard logistic (Sigmoid) activation function,
- $a_i^\theta$ are affine transformations (linear layers),
- $\varphi_q$ denotes the componentwise ReLU activation functions.

The network is trained via **stochastic gradient ascent** on Monte Carlo simulations. Becker et al. rely on standard techniques such as Xavier initialization and batch normalization; the implementation in this notebook uses the Adam optimizer on standardized inputs.

At each step, we recursively approximate the optimal stopping rule, starting from the final time $N$ and moving backward to the initial time.


In [242]:
class StoppingNetwork(nn.Module):
    def __init__(self, input_dim, hidden_layers=[10, 10, 10]):
        """
        Neural network architecture to approximate stopping decisions.
        
        Args:
            input_dim: dimension of X_t (d)
            hidden_layers: list specifying the number of neurons in each hidden layer
        """
        super(StoppingNetwork, self).__init__()
        
        layers = []
        dims = [input_dim] + hidden_layers
        
        for i in range(len(hidden_layers)):
            layers.append(nn.Linear(dims[i], dims[i+1]))
            layers.append(nn.ReLU())  # ReLU activation

        layers.append(nn.Linear(hidden_layers[-1], 1))
        layers.append(nn.Sigmoid())  # Final Sigmoid activation
        
        self.model = nn.Sequential(*layers)

    def forward(self, x):
        return self.model(x).squeeze(-1)  # Output (batch_size,)

def cost_function_f_theta(model, Xn, gXn, g_tau_n1_Xtau_n1):
    """
    Compute the cost function to maximize (expectation form).
    """
    f_theta_Xn = model(Xn)
    loss = torch.mean(gXn * f_theta_Xn + g_tau_n1_Xtau_n1 * (1 - f_theta_Xn))
    return loss

def train_stopping_network(model, Xn, gXn, g_tau_n1_Xtau_n1, n_epochs=100, lr=0.001):
    """
    Train a single stopping network f_theta^n.
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    
    for epoch in range(n_epochs):
        optimizer.zero_grad()
        loss = -cost_function_f_theta(model, Xn, gXn, g_tau_n1_Xtau_n1)  # maximize
        loss.backward()
        optimizer.step()
    
    return model

def train_all_stopping_networks(X_paths, g_func, hidden_layers=[10, 10, 10], n_epochs=100, lr=0.01):
    """
    Train all stopping networks by backward induction, with normalization.

    Returns:
        stopping_networks: list of trained networks
        mean: mean used for normalization
        std: std used for normalization
    """
    N_plus_1, n_paths, d = X_paths.shape
    N = N_plus_1 - 1

    # Compute normalization parameters
    mean = np.mean(X_paths, axis=(0,1))  # shape (d,)
    std = np.std(X_paths, axis=(0,1)) + 1e-8  # to avoid division by zero

    # Convert to torch
    X_paths_torch = torch.tensor(X_paths, dtype=torch.float32)
    mean_torch = torch.tensor(mean, dtype=torch.float32)
    std_torch = torch.tensor(std, dtype=torch.float32)

    # Precompute immediate payoffs
    gX = torch.zeros((N_plus_1, n_paths))
    for n in range(N_plus_1):
        gX[n] = g_func(n, X_paths_torch[n])

    stopping_networks = [None for _ in range(N)]

    g_tau_n1_Xtau_n1 = gX[N].clone()  # final payoff

    for n in reversed(range(N)):
        print(f"Training stopping network for time step n = {n}")

        Xn = (X_paths_torch[n] - mean_torch) / std_torch  # normalization
        gXn = gX[n]

        model = StoppingNetwork(input_dim=d, hidden_layers=hidden_layers)
        trained_model = train_stopping_network(model, Xn, gXn, g_tau_n1_Xtau_n1, n_epochs=n_epochs, lr=lr)
        stopping_networks[n] = trained_model

        with torch.no_grad():
            stop_probs = trained_model(Xn)
            g_tau_n1_Xtau_n1 = gXn * stop_probs + g_tau_n1_Xtau_n1 * (1 - stop_probs)

    return stopping_networks, mean, std


### 3. Lower Bound

In order to estimate the true price $V_0$ of the Bermudan option, we compute:

- A **lower bound** $L$ using the learned stopping decisions $f^\theta_n$,
- An **upper bound** $U$ based on the dual approach, optimizing over martingales.

Once the networks $f_n^\theta$ are trained, the associated stopping time $\tau^\Theta$ provides a **lower bound** for the price:

$$
L = \mathbb{E} \left[ g(\tau^\Theta, X_{\tau^\Theta}) \right].
$$

To compute it:
- Simulate $K_L$ independent test paths $(y_n^k)_{n=0}^N$,
- Apply the learned stopping rule to each path,
- Use the Monte Carlo estimator:

$$
\hat{L} = \frac{1}{K_L} \sum_{k=1}^{K_L} g(l^k, y_{l^k}^k).
$$

where $l^k$ denotes the stopping time applied to path $k$.

By the law of large numbers, $\hat{L}$ converges to $L$ when $K_L \to \infty$.
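
The estimator above can be sketched in vectorized form once the stop/continue decisions are available as a boolean array. The helper name `lower_bound_from_decisions` and the toy arrays are hypothetical and only illustrate how the first stopping index, the reward at that index, and the standard error are obtained.

```python
import numpy as np

def lower_bound_from_decisions(stop, g):
    """
    stop: bool array (N+1, n_paths); stop[n, k] is True if the rule stops path k at step n
    g:    float array (N+1, n_paths) of rewards g(n, y_n^k)
    """
    stop = stop.copy()
    stop[-1] = True                              # exercise is forced at maturity
    stopping_times = np.argmax(stop, axis=0)     # index of the first True along time
    rewards = g[stopping_times, np.arange(g.shape[1])]
    L_hat = rewards.mean()
    std_err = rewards.std(ddof=1) / np.sqrt(rewards.size)
    return L_hat, std_err, stopping_times

# Toy example with 3 dates and 3 paths
stop = np.array([[False, False, True],
                 [True,  False, True],
                 [False, False, False]])
g = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

L_hat, std_err, tau = lower_bound_from_decisions(stop, g)
print(tau, L_hat)  # stopping times [1 2 0], rewards 4, 8, 3, mean 5.0
```

The estimate is a valid lower bound in expectation only if the test paths are independent of the paths used to train the stopping rule.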

In [246]:
def compute_lower_bound(stopping_networks, X_paths_test, payoff_func, mean, std):
    """
    Computes the lower bound using the learned stopping rules.
    """
    N_plus_1, n_paths, d = X_paths_test.shape
    N = N_plus_1 - 1

    # Normalize with the training statistics (the networks were trained on standardized inputs)
    mean_t = torch.as_tensor(mean, dtype=torch.float32)
    std_t = torch.as_tensor(std, dtype=torch.float32)
    X_paths_test_torch = torch.tensor(X_paths_test, dtype=torch.float32)
    X_paths_normalized = (X_paths_test_torch - mean_t) / std_t

    # Payoffs are evaluated on the raw test paths, which were never normalized
    gX = np.zeros((N_plus_1, n_paths))
    for n in range(N_plus_1):
        gX[n] = payoff_func(n, X_paths_test[n])

    rewards = np.zeros(n_paths)
    stopping_times = np.zeros(n_paths, dtype=int)

    for j in range(n_paths):
        for n in range(N):
            prob_stop = stopping_networks[n](X_paths_normalized[n, j].unsqueeze(0)).detach().numpy()
            if prob_stop >= 0.5:
                rewards[j] = gX[n, j]
                stopping_times[j] = n
                break
        else:
            rewards[j] = gX[N, j]
            stopping_times[j] = N

    lower_bound = np.mean(rewards)
    return lower_bound, rewards


### Optimization of the Vector $ \lambda^* $ in the Rogers Method (Upper Bound)

The goal is to estimate the value of an American option by constructing an **upper bound** on its price.  
The upper bound relies on the following dual representation:

$$
Y_0^* = \inf_{M \in H_0^1} \mathbb{E}\left[ \sup_{0 \leq t \leq T} (Z_t - M_t) \right],
$$

where:
- $ H_0^1 $ is the set of martingales $ M $ such that $ M_0 = 0 $ and $ \sup_{0 \leq t \leq T} |M_t| \in L^1 $,
- $ Z_t $ is the discounted payoff process (for example, $ Z_t = e^{-rt}(K - S_t)^+ $).

---

### Strategy:

- Build a family of candidate martingales $ (M^1, M^2, \dots, M^d) $,
- Consider linear combinations $ M_t^\lambda = \sum_{k=1}^d \lambda_k M_t^k $,
- Solve the following optimization problem:

$$
\lambda^* = \arg\min_{\lambda \in \mathbb{R}^d} \mathbb{E}\left[\sup_{0 \leq t \leq T} \left(Z_t - \sum_{k=1}^d \lambda_k M_t^k\right) \right],
$$

where $ \lambda \cdot M_t = \sum_{k=1}^d \lambda_k M_t^k $.

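This optimization is a **convex minimization** problem over $ \lambda $: for each path, the supremum of functions that are affine in $ \lambda $ is convex, and taking the expectation preserves convexity.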
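
The convexity of the sample version of this objective can be checked numerically. The sketch below uses toy random-walk martingales and an arbitrary nonnegative payoff (both hypothetical, chosen only for illustration) and compares the cost at the midpoint of two weight vectors with the average of the costs at the endpoints.

```python
import numpy as np

def dual_cost(lam, Z, M):
    """
    lam: (d,) weights
    Z:   (N+1, n_paths) discounted payoffs
    M:   (d, N+1, n_paths) candidate martingales with M[:, 0] = 0
    """
    combined = np.tensordot(lam, M, axes=1)          # shape (N+1, n_paths)
    return np.mean(np.max(Z - combined, axis=0))     # sample mean of the pathwise maximum

rng = np.random.default_rng(0)
n_steps, n_paths, d = 20, 2000, 2

# Toy data: random walks started at zero play the role of the candidate martingales
M = np.cumsum(rng.standard_normal((d, n_steps + 1, n_paths)), axis=1)
M = M - M[:, :1]                                     # enforce M_0 = 0
Z = np.maximum(1.0 - 0.1 * M[0] - 0.05 * M[1], 0.0)  # arbitrary nonnegative payoff

lam_a, lam_b = np.array([-1.0, 0.5]), np.array([2.0, -0.3])
lhs = dual_cost(0.5 * (lam_a + lam_b), Z, M)
rhs = 0.5 * (dual_cost(lam_a, Z, M) + dual_cost(lam_b, Z, M))
print(lhs, rhs)  # convexity implies lhs <= rhs
```

The objective is convex but only piecewise linear in $\lambda$, so it is not differentiable everywhere; gradient-based optimizers may stop at a point that is close to, but not exactly at, the minimizer.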


### 4. First Martingale Construction

We construct a martingale based on the trained stopping networks.

At each time step $n$, we define noisy increments:

$$
\Delta M_n^{\Theta, k} = f_n^\theta(z_n^k) g(n, z_n^k) + (1 - f_n^\theta(z_n^k)) C_n^k - C_{n-1}^k,
$$

where:
- $f_n^\theta(z_n^k)$ is the stopping probability predicted by the neural network at time $n$,
- $g(n, z_n^k)$ is the immediate reward,
- $C_n^k$ is an estimated continuation value based on $J$ additional Monte Carlo simulations starting from $z_n^k$.

The cumulative martingale along the $k$-th path is then given by:

$$
M_n^k = \sum_{m=1}^n \Delta M_m^{\Theta, k},
$$

with the convention $M_0^k = 0$.

This martingale $M_n^k$ approximates the martingale part of the Snell envelope (from its Doob-Meyer decomposition) and can be used to derive a **dual upper bound**:

$$
\hat{U} = \frac{1}{K_U} \sum_{k=1}^{K_U} \max_{0 \leq n \leq N} \left( g(n, z_n^k) - M_n^k \right).
$$

This martingale captures both immediate rewards and continuation values, refined by the trained networks.
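
Given the increments $\Delta M_n^{\Theta,k}$, the cumulative martingale and the estimator $\hat{U}$ reduce to a cumulative sum and a pathwise maximum. The sketch below assumes the increments have already been computed (the nested simulation for $C_n^k$ is not shown); the function name `dual_upper_bound` and the toy numbers are hypothetical.

```python
import numpy as np

def dual_upper_bound(g, dM):
    """
    g:  (N+1, K_U) rewards g(n, z_n^k)
    dM: (N, K_U) martingale increments Delta M_n^k for n = 1..N
    """
    # Cumulative martingale with the convention M_0 = 0
    M = np.vstack([np.zeros((1, g.shape[1])), np.cumsum(dM, axis=0)])
    pathwise_max = np.max(g - M, axis=0)
    U_hat = pathwise_max.mean()
    std_err = pathwise_max.std(ddof=1) / np.sqrt(pathwise_max.size)
    return U_hat, std_err

# Toy example with 3 dates and 2 paths
g = np.array([[1.0, 1.0],
              [3.0, 0.0],
              [2.0, 4.0]])
dM = np.array([[1.0, -1.0],
               [-0.5, 2.0]])

U_hat, std_err = dual_upper_bound(g, dM)
print(U_hat, std_err)  # pathwise maxima are 2 and 3, so U_hat = 2.5
```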


In [214]:
def construct_martingale_deep(X_paths_test, stopping_networks, g_func, r, T):
    """
    Builds the martingale based on Deep Optimal Stopping (classical martingale).

    Args:
        X_paths_test: np.ndarray (N+1, n_paths, d)
        stopping_networks: list of trained networks
        g_func: function (n, Xn) -> rewards
        r: risk-free rate
        T: maturity

    Returns:
        martingale: np.ndarray (N+1, n_paths)
    """
    N_plus_1, n_paths, d = X_paths_test.shape
    N = N_plus_1 - 1
    h = T / N

    X_paths_test_torch = torch.tensor(X_paths_test, dtype=torch.float32)

    martingale = np.zeros((N+1, n_paths))

    # First compute the value at time 0 for normalization
    discount_factor_0 = np.exp(-r * 0 * h)
    value0 = g_func(0, X_paths_test_torch[0]).numpy() * discount_factor_0

    for n in range(N+1):
        discount_factor = np.exp(-r * n * h)
        if n == N:
            values = g_func(N, X_paths_test_torch[N]).numpy() * discount_factor
        else:
            stop_probs = stopping_networks[n](X_paths_test_torch[n]).detach().numpy()
            immediate_rewards = g_func(n, X_paths_test_torch[n]).numpy() * discount_factor
            continuation_values = np.zeros(n_paths)

            for k in range(n_paths):
                continuation_values[k] = immediate_rewards[k] * stop_probs[k]
        
            values = continuation_values

        # Centering immediately
        martingale[n] = values - value0

    return martingale

#### 5. Second martingale
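
The helper `european_put_price` is called in this notebook, but its definition does not appear in this section. The sketch below is one possible implementation of the Black-Scholes put price, assuming the signature `(t, S, K, T, r, sigma)` inferred from the calls and assuming the function returns the price at time `t` (not discounted back to time 0). The payoff is returned directly at maturity to avoid a division by zero; `normal_cdf` is a hypothetical helper built on `math.erf` and is equivalent to `scipy.stats.norm.cdf`.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def normal_cdf(x):
    """Standard normal CDF, applied elementwise."""
    return 0.5 * (1.0 + _erf(np.asarray(x, dtype=float) / math.sqrt(2.0)))

def european_put_price(t, S, K, T, r, sigma):
    """Black-Scholes price at time t of a European put with strike K and maturity T."""
    S = np.asarray(S, dtype=float)
    tau = T - t                                   # time to maturity
    if tau <= 1e-12:
        return np.maximum(K - S, 0.0)             # at maturity the price is the payoff
    sqrt_tau = np.sqrt(tau)
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt_tau)
    d2 = d1 - sigma * sqrt_tau
    return K * np.exp(-r * tau) * normal_cdf(-d2) - S * normal_cdf(-d1)

price_at_0 = float(european_put_price(0.0, 100.0, 100.0, 0.5, 0.06, 0.4))
price_at_T = european_put_price(0.5, np.array([90.0, 110.0]), 100.0, 0.5, 0.06, 0.4)
print(price_at_0, price_at_T)
```

With this convention, the process $e^{-rt} P(t, S_t)$ is a martingale under the risk-neutral measure, which is what makes it a valid candidate for the dual bound.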

In [211]:
def construct_martingale_rogers(X_paths_test, r, T, N, K, sigma):
    """
    Constructs the candidate martingale based on the discounted European put price.
    
    Returns:
        martingale: np.ndarray of shape (N+1, n_paths), the centered martingale satisfying M_0 = 0
    """
    times = np.linspace(0, T, N+1)
    n_paths = X_paths_test.shape[1]

    martingale = np.zeros((N+1, n_paths))

    # First compute M0 separately (the discount factor e^{-r t} equals 1 at t = 0)
    S0 = X_paths_test[0, :, 0]
    price0 = np.exp(-r * times[0]) * european_put_price(times[0], S0, K, T, r, sigma)

    for n in range(N+1):
        S_n = X_paths_test[n, :, 0]  # underlying asset prices at time step n
        # Discounted price e^{-r t_n} P(t_n, S_n), assuming european_put_price returns the time-t_n price
        price = np.exp(-r * times[n]) * european_put_price(times[n], S_n, K, T, r, sigma)
        martingale[n] = price - price0  # Direct centering at each step

    return martingale

#### Optimization

In [225]:
from scipy.optimize import minimize

def optimize_lambda(payoffs, martingales):
    """
    Robustly optimizes the linear combination of candidate martingales.

    Args:
        payoffs: np.ndarray (N+1, n_paths) - discounted payoff paths
        martingales: np.ndarray (d, N+1, n_paths) - d candidate martingales

    Returns:
        lambda_opt: optimal vector of lambdas
    """
    N_plus_1, n_paths = payoffs.shape
    d = martingales.shape[0]

    def cost(lambda_vect):
        combined = np.tensordot(lambda_vect, martingales, axes=1)  # shape (N+1, n_paths)
        eta = np.max(payoffs - combined, axis=0)
        return np.mean(eta)

    # Initialization: small random noise around zero
    np.random.seed(0)
    initial_lambda = 0.01 * np.random.randn(d)

    # Robust optimization settings
    result = minimize(
        cost,
        initial_lambda,
        method='BFGS',
        options={
            'gtol': 0.1,       # relaxed tolerance on gradient (was 1e-6)
            'maxiter': 500,    # allow more iterations
            'disp': False,      # silent mode
            'eps': 1e-5         # numerical step for finite differences
        }
    )

    if not result.success:
        print(f"⚠️ Optimization failed: {result.message}")
        print("⚙️ Using the initial guess as fallback solution.")
        return initial_lambda

    return result.x


#### Upper bound

In [244]:
def compute_upper_bound(X_paths_test, stopping_networks, r, T, N, K, sigma, mean, std):
    """
    Computes the upper bound with martingale construction and optimization.
    """
    h = T / N
    times = np.linspace(0, T, N+1)
    n_paths = X_paths_test.shape[1]

    X_paths_test_torch = torch.tensor(X_paths_test, dtype=torch.float32)
    mean_t = torch.as_tensor(mean, dtype=torch.float32)
    std_t = torch.as_tensor(std, dtype=torch.float32)

    # Normalize for the networks
    X_paths_normalized = (X_paths_test_torch - mean_t) / std_t

    # Deep martingale: the networks receive normalized inputs, so the payoff
    # passed along first maps them back to prices
    def g_denormalized(n, Xn):
        return torch.clamp(K - (Xn * std_t + mean_t)[:, 0], min=0.0)

    martingale_deep = construct_martingale_deep(X_paths_normalized.numpy(), stopping_networks, g_denormalized, r, T)

    # Rogers martingale
    martingale_rogers = construct_martingale_rogers(X_paths_test, r, T, N, K, sigma)

    # Discounted payoff on the raw test paths, which were never normalized
    S_test = X_paths_test[:, :, 0]
    discount_factors = np.exp(-r * times[:, None])
    payoffs = np.maximum(K - S_test, 0) * discount_factors  # shape (N+1, n_paths)

    # Optimize lambda
    martingales = np.stack([martingale_deep, martingale_rogers], axis=0)  # (2, N+1, n_paths)
    lambda_star = optimize_lambda(payoffs, martingales)

    # Combine martingales
    combined_martingale = np.tensordot(lambda_star, martingales, axes=1)  # (N+1, n_paths)

    # Eta values
    eta = np.max(payoffs - combined_martingale, axis=0)

    upper_bound = np.mean(eta)
    std_dev = np.std(eta)
    
    return upper_bound, std_dev, eta


#### Confidence Interval

In [234]:
# Richardson extrapolation
def richardson_extrapolation(V_coarse, V_fine):
    """
    Richardson extrapolation for improved upper bound estimation.
    """
    return 2 * V_fine - V_coarse

# Payoff wrapper for training: undiscounted put payoff (the time index n is accepted but unused)
payoff_func = lambda n, X: payoff(X[:, 0], K)

# Function to compute point estimate and global confidence interval
def compute_interval_and_estimate(lower_bound, upper_bound,
                                  rewards_lower, values_upper,
                                  KL, KU):
    """
    Compute point estimate and 95% global confidence interval based on empirical variances.

    Args:
        lower_bound: float, estimated lower bound (mean)
        upper_bound: float, estimated upper bound (mean)
        rewards_lower: np.ndarray, realizations of g(l^k, y^k_{l^k})
        values_upper: np.ndarray, realizations of max_n (g(n, z_n^k) - M_n^k)
        KL: int, number of samples for lower bound
        KU: int, number of samples for upper bound

    Returns:
        dict containing the lower bound, upper bound, point estimate, confidence interval, and gap.
    """
    z_score = 1.96  # 95% confidence level (normal quantile)

    # Empirical (unbiased) variances following the article
    lower_variance = np.sum((rewards_lower - lower_bound)**2) / (KL - 1)
    upper_variance = np.sum((values_upper - upper_bound)**2) / (KU - 1)

    # Final standard errors (for the CLT scaling)
    lower_se = np.sqrt(lower_variance / KL)
    upper_se = np.sqrt(upper_variance / KU)

    # Point estimate
    point_estimate = (lower_bound + upper_bound) / 2

    # Global confidence interval
    ci_lower = lower_bound - z_score * lower_se
    ci_upper = upper_bound + z_score * upper_se

    return {
        "Lower Bound": lower_bound,
        "Upper Bound": upper_bound,
        "Lower Std Error": lower_se,
        "Upper Std Error": upper_se,
        "Point Estimate": point_estimate,
        "Confidence Interval 95%": (ci_lower, ci_upper),
        "Gap": upper_bound - lower_bound
    }


##### First numerical results 

In [None]:
# True American option prices (from Rogers, 2002)
true_american_prices = {
    80: 21.6059,
    85: 18.0374,
    90: 14.9187,
    95: 12.2314,
    100: 9.9458,
    105: 8.0281,
    110: 6.4352,
    115: 5.1265,
    120: 4.0611
}

# Global parameters
M_train = 300      # number of training paths
M_test_L = 5000    # number of testing paths for lower bound
M_test_U = 5000    # number of testing paths for upper bound
N = 50             # number of time steps
T = 0.5            # maturity
r = 0.06           # risk-free interest rate
sigma = 0.4        # volatility
K = 100            # strike price
d = 1              # dimension of the process

# List of initial spot prices
S0_list = np.arange(80, 125, 5)
results = []

# Main loop over initial prices
for S0 in S0_list:
    print(f"Processing S0 = {S0}")

    start_time = time.time()

    # Fine simulation (2N time steps); distinct seeds keep training and test paths independent
    times_train_fine, X_train_fine = simulate_paths(S0, M_train, 2*N, r, sigma, T, seed=1)
    stopping_networks_fine, mean_fine, scale_fine = train_all_stopping_networks(X_train_fine, payoff_func, n_epochs=100, lr=0.001)

    times_test_fine, X_test_fine = simulate_paths(S0, M_test_U, 2*N, r, sigma, T, seed=2)
    V_fine, std_fine, eta_fine = compute_upper_bound(X_test_fine, stopping_networks_fine, r, T, 2*N, K, sigma, mean_fine, scale_fine)

    # Coarse reduction (every two time steps), using the networks attached to the retained dates
    X_test_coarse = X_test_fine[::2]
    V_coarse, std_coarse, eta_coarse = compute_upper_bound(X_test_coarse, stopping_networks_fine[::2], r, T, N, K, sigma, mean_fine, scale_fine)

    # Richardson extrapolation, applied to the mean and path by path to the dual values
    V_richardson = richardson_extrapolation(V_coarse, V_fine)
    eta_richardson = richardson_extrapolation(eta_coarse, eta_fine)

    # Lower bound estimation
    times_train, X_train = simulate_paths(S0, M_train, N, r, sigma, T, seed=3)
    stopping_networks, mean_train, scale_train = train_all_stopping_networks(X_train, payoff_func, n_epochs=100, lr=0.001)

    times_test, X_test = simulate_paths(S0, M_test_L, N, r, sigma, T, seed=4)
    lower_bound, rewards_lower = compute_lower_bound(stopping_networks, X_test, payoff_func, mean_train, scale_train)
    lower_std = np.std(rewards_lower)

    # European option price (Black-Scholes closed formula)
    european_price = european_put_price(0, S0, K, T, r, sigma)

    # Confidence interval and point estimate computation
    stats = compute_interval_and_estimate(lower_bound, V_richardson, rewards_lower, eta_richardson, KL=M_test_L, KU=M_test_U)

    end_time = time.time()
    execution_time = end_time - start_time

    # Store results
    result = {
        "S0": S0,
        "American Put Price": true_american_prices[S0],
        "European Put Price": european_price,
        "Lower Bound": stats["Lower Bound"],
        "Upper Bound": stats["Upper Bound"],
        "Point Estimate": stats["Point Estimate"],
        "Confidence Interval 95%": stats["Confidence Interval 95%"],
        "Gap": stats["Gap"],
        "Standard Error (Lower Bound)": lower_std / np.sqrt(M_test_L),
        "Standard Error (Upper Bound)": std_fine / np.sqrt(M_test_U),
        "Execution Time (s)": execution_time
    }

    results.append(result)

# Final conversion into a DataFrame
df_results = pd.DataFrame(results)
df_results_display = df_results.round(4)
print(df_results_display)

# (Optional) Export results to CSV
# df_results.to_csv("deep_optimal_stopping_results.csv", index=False)


Processing S0 = 80
Training stopping network for time step n = 99
Training stopping network for time step n = 98
Training stopping network for time step n = 97
Training stopping network for time step n = 96
Training stopping network for time step n = 95
Training stopping network for time step n = 94
Training stopping network for time step n = 93
Training stopping network for time step n = 92
Training stopping network for time step n = 91
Training stopping network for time step n = 90
Training stopping network for time step n = 89
Training stopping network for time step n = 88
Training stopping network for time step n = 87
Training stopping network for time step n = 86
Training stopping network for time step n = 85
Training stopping network for time step n = 84
Training stopping network for time step n = 83
Training stopping network for time step n = 82
Training stopping network for time step n = 81
Training stopping network for time step n = 80
Training stopping network for time step n