# Stock / Trading Signals - Stochastic Forecast Model

This notebook builds a **more realistic** short-term forecast by:
1. Modeling **log-returns** instead of raw prices,
2. Fitting an **AutoReg** process,
3. Adding **bootstrapped residual shocks** to simulate market variability.

The result is a single scenario-style forecast path that fluctuates like the historical series, rather than the smooth, near-straight line that a deterministic AR point forecast settles into.
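
In symbols, the three steps can be written roughly as follows. The notation ($P_t$ for the closing price, $r_t$ for the log-return, $c$ and $\phi_i$ for the AR intercept and coefficients, $p$ for the lag order, $T$ for the last observed day) is introduced here for illustration and does not appear in the notebook's code.

```latex
\begin{aligned}
r_t &= \log P_t - \log P_{t-1} \\
r_t &= c + \sum_{i=1}^{p} \phi_i \, r_{t-i} + \varepsilon_t \\
\tilde r_{T+h} &= \hat c + \sum_{i=1}^{p} \hat\phi_i \, \tilde r_{T+h-i} + \varepsilon^{*}_{h},
  \qquad \tilde r_t = r_t \ \text{for } t \le T \\
\tilde P_{T+h} &= P_T \exp\!\Big( \sum_{k=1}^{h} \tilde r_{T+k} \Big)
\end{aligned}
```

Here $\varepsilon^{*}_{h}$ is a residual drawn with replacement from the fitted model's residuals. The function defined later in this notebook additionally scales each drawn residual by 0.9 and clips each simulated return to the 1st-99th percentile range of the historical returns, so it is a damped variant of the plain recursion above.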

In [7]:
import numpy as np
import pandas as pd
import yfinance as yf
import plotly.graph_objects as go
import plotly.io as pio
from IPython.display import display
from statsmodels.tsa.ar_model import AutoReg

# Renderer for VS Code notebooks; use e.g. 'notebook' or 'browser' in other environments.
pio.renderers.default = 'vscode'

## Load real market data
Change `symbol` to try a stock, a currency pair, or a metals future (Yahoo Finance tickers):
- Stock: `AAPL`, `MSFT`
- Currency: `EURUSD=X`, `GBPUSD=X`
- Metals futures: `GC=F` (gold), `SI=F` (silver)

In [2]:
symbol = 'AAPL'
period = '2y'
horizon = 30  # trading days

df = yf.Ticker(symbol).history(period=period, interval='1d')
series = df['Close'].dropna().astype(float)
series.tail()

| Date | Close |
|---|---|
| 2026-02-11 00:00:00-05:00 | 275.5 |
| 2026-02-12 00:00:00-05:00 | 261.730011 |
| 2026-02-13 00:00:00-05:00 | 255.779999 |
| 2026-02-17 00:00:00-05:00 | 263.880005 |
| 2026-02-18 00:00:00-05:00 | 264.350006 |
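
Because the forecaster works on log-returns rather than prices, it may help to see the round trip between the two. The sketch below uses a handful of made-up prices (hypothetical values, not market data) to show that differencing log-prices and then cumulatively summing and exponentiating recovers the original series.

```python
import numpy as np

# Hypothetical closing prices, made up for illustration (not market data).
prices = np.array([100.0, 101.5, 99.8, 102.3, 103.1])

# Log-returns: r_t = log(P_t) - log(P_{t-1}); one element fewer than prices.
log_returns = np.diff(np.log(prices))

# Cumulative sum of returns, exponentiated and scaled by the first price,
# rebuilds every price after the first one.
rebuilt = prices[0] * np.exp(np.cumsum(log_returns))

print(np.allclose(rebuilt, prices[1:]))
```

The same identity is what lets simulated returns be turned back into a price path, starting from the last observed close instead of the first.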


## Stochastic AR bootstrap forecaster

In [3]:
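def stochastic_ar_bootstrap_forecast(price_series, steps=30, seed=42):
    """Simulate one future price path from an AR model fitted to log-returns.

    Returns an array of `steps` simulated prices that follow the last observed price.
    """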
    prices = np.asarray(price_series, dtype=float)
    if np.any(prices <= 0):
        raise ValueError('Prices must be positive')

    log_returns = np.diff(np.log(prices))
    if len(log_returns) < 40:
        raise ValueError('Need at least 40 return points')

    # Lag-order heuristic: about one lag per 15 observations, bounded to [2, 8].
    lags = max(2, min(8, len(log_returns) // 15))
    model = AutoReg(log_returns, lags=lags, old_names=False).fit()

    residuals = np.asarray(model.resid, dtype=float)
    residuals = residuals[np.isfinite(residuals)]
    # Resample only from the most recent residuals (up to 252, about one trading year).
    recent_residuals = residuals[-min(252, residuals.size):]

    low_q, high_q = np.quantile(log_returns, [0.01, 0.99])
    rng = np.random.default_rng(seed)

    params = np.asarray(model.params, dtype=float)
    intercept = params[0]
    coefs = params[1:]

    history = log_returns.tolist()
    simulated_returns = []

    for _ in range(steps):
        lag_values = np.array([history[-i] for i in range(1, lags + 1)], dtype=float)
        mean_return = float(intercept + np.dot(coefs, lag_values))
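        # Draw one past residual at random (bootstrap) and damp it by 10%.
        shock = float(rng.choice(recent_residuals)) * 0.9
        # Keep the simulated return inside the historical 1st-99th percentile range.
        next_return = float(np.clip(mean_return + shock, low_q, high_q))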

        history.append(next_return)
        simulated_returns.append(next_return)

    future_log_prices = np.log(prices[-1]) + np.cumsum(np.asarray(simulated_returns, dtype=float))
    return np.exp(future_log_prices)
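
To try the forecaster without a network call, a synthetic price series can stand in for the downloaded data. The sketch below is one way to do that, assuming a plain geometric random walk is an acceptable stand-in; `make_synthetic_prices` and its parameters are hypothetical names introduced here, not part of the notebook.

```python
import numpy as np

def make_synthetic_prices(n=300, start=100.0, daily_vol=0.015, seed=0):
    """Geometric random walk: a hypothetical stand-in for a real close series."""
    rng = np.random.default_rng(seed)
    # n prices need n - 1 returns; zero drift and constant volatility are assumptions.
    log_returns = rng.normal(loc=0.0, scale=daily_vol, size=n - 1)
    log_prices = np.log(start) + np.concatenate([[0.0], np.cumsum(log_returns)])
    return np.exp(log_prices)

synthetic = make_synthetic_prices()

# In the notebook, this array could then be passed to the forecaster, e.g.
# stochastic_ar_bootstrap_forecast(synthetic, steps=30, seed=1)
```

With the default `n=300` the series yields 299 returns, comfortably above the 40-return minimum the forecaster enforces.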

In [8]:
forecast_values = stochastic_ar_bootstrap_forecast(series.values, steps=horizon, seed=123)
# Business days only (weekends skipped); exchange holidays are not excluded.
future_dates = pd.bdate_range(series.index[-1] + pd.Timedelta(days=1), periods=horizon)

fig = go.Figure()
fig.add_trace(go.Scatter(x=series.index[-120:], y=series.values[-120:], mode='lines', name='Historical'))
fig.add_trace(go.Scatter(
    x=[series.index[-1], *future_dates],
    y=[series.values[-1], *forecast_values],
    mode='lines',
    name='Stochastic Forecast',
    line=dict(dash='dash')
))
fig.update_layout(title=f'{symbol} - Stochastic Forecast', xaxis_title='Date', yaxis_title='Price')
display(fig)
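
The plot above shows one path for one seed. A fan of paths can be summarized by taking quantiles across seeds at each forecast step. The helper below is a minimal sketch of that idea; `scenario_bands` is a hypothetical name, and it assumes the forecaster is callable as `forecaster(prices, steps=..., seed=...)` and returns an array of length `steps`, which matches the signature of `stochastic_ar_bootstrap_forecast`.

```python
import numpy as np

def scenario_bands(forecaster, prices, steps=30, n_paths=200,
                   quantiles=(0.05, 0.50, 0.95)):
    """Run `forecaster` once per seed and take per-step quantiles across paths.

    Assumes forecaster(prices, steps=..., seed=...) returns `steps` prices.
    """
    # One row per simulated path, one column per forecast step.
    paths = np.vstack([
        np.asarray(forecaster(prices, steps=steps, seed=s), dtype=float)
        for s in range(n_paths)
    ])
    # bands has shape (len(quantiles), steps).
    bands = np.quantile(paths, quantiles, axis=0)
    return paths, bands
```

In this notebook it could be called as `paths, (low, median, high) = scenario_bands(stochastic_ar_bootstrap_forecast, series.values, steps=horizon)`, and `low` and `high` plotted against `future_dates` as a shaded band. The 5%/95% levels and the 200 paths are arbitrary illustrative defaults.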

## Notes
- This forecast is a **single plausible scenario path**, not a point prediction of future prices.
- Re-run with different `seed` values to generate multiple scenarios; quantiles across many paths give uncertainty bands.
- For production use, backtest the forecast errors with walk-forward validation.
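
Walk-forward validation can be sketched as follows: fit on data up to an origin, forecast the next `steps` days, compare with what actually happened, then move the origin forward. `walk_forward_errors` and its parameters are hypothetical names introduced for illustration, and the forecaster is assumed to accept `(prices, steps=..., seed=...)` as `stochastic_ar_bootstrap_forecast` does.

```python
import numpy as np

def walk_forward_errors(forecaster, prices, steps=5, n_folds=10,
                        min_train=60, seed=0):
    """Absolute percentage errors per forecast step over expanding windows.

    Assumes forecaster(train_prices, steps=..., seed=...) returns `steps` prices.
    """
    prices = np.asarray(prices, dtype=float)
    # Latest origin that still leaves `steps` actual prices after it.
    last_origin = len(prices) - steps
    if last_origin < min_train:
        raise ValueError('Not enough data for walk-forward validation')

    # Evenly spaced forecast origins between min_train and last_origin.
    origins = np.unique(
        np.linspace(min_train, last_origin, n_folds).round().astype(int)
    )

    errors = []
    for end in origins:
        train = prices[:end]                # data known at the forecast origin
        actual = prices[end:end + steps]    # what actually happened next
        predicted = np.asarray(forecaster(train, steps=steps, seed=seed), dtype=float)
        errors.append(np.abs(predicted - actual) / actual)

    return np.vstack(errors)  # shape: (number of origins, steps)
```

Averaging the result over rows, `walk_forward_errors(...).mean(axis=0)`, gives a mean absolute percentage error for each horizon. Since each call to the notebook's forecaster produces one random path, these errors are noisy; scoring the median of many seeded paths per origin would likely give a steadier picture.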