# FX AI Backtest and Metrics

This notebook builds a simple backtest for the models defined in `app.py`.

Approach:
- use the same data preparation and training logic as in the Streamlit application;
- take classifier predictions (SVC by default) on the test segment;
- build a strategy: position = signal (BUY=+1, HOLD=0, SELL=-1), held for one step (one daily bar);
- calculate daily strategy returns, the equity curve, Sharpe, max drawdown, and hit rate;
- compare with a simple buy&hold over the same period.

This backtest is intentionally simplified (no commissions, slippage, etc.), but it allows a quick evaluation of signal behavior.

In [None]:
import os
import sys
import numpy as np
import pandas as pd
import plotly.graph_objects as go

sys.path.append(os.path.abspath("."))

from app import (
    load_price_data,
    add_features,
    add_targets,
    train_models,
    detect_patterns,
)

## 1. Instrument Configuration

Set base parameters for the backtest. If desired, they can be changed and the notebook re-run.

In [None]:
ticker = "EURUSD=X"
instrument_name = "EUR/USD"
years = 5
interval = "1d"

horizon = 7
lower_q = 0.33
upper_q = 0.66

config = {
    "ticker": ticker,
    "instrument_name": instrument_name,
    "years": years,
    "interval": interval,
    "horizon": horizon,
    "lower_q": lower_q,
    "upper_q": upper_q,
}
config

## 2. Data Preparation and Model Training

This cell repeats the same pipeline as `app.py`: data loading, features, targets, and model training.
The parts of interest here are the classifiers and the train/test split.

In [None]:
df_raw = load_price_data(ticker, years=years, interval=interval)
if df_raw is None or df_raw.empty:
    raise RuntimeError(f"Failed to load data for ticker {ticker!r}")

df_full = add_features(df_raw)
df_model = add_targets(
    df_full.copy(),
    horizon=horizon,
    lower_q=lower_q,
    upper_q=upper_q,
)
model_data = train_models(df_model)
split = model_data["split"]
metrics = model_data.get("metrics")
df_model.tail()

## 3. Building a Simple Strategy Based on SVC Signals

The SVC predicts one of three classes:
- `-1` — SELL,
- `0` — HOLD,
- `1` — BUY.

The backtest runs on the test segment (the last 20% of the data). At each step the position equals the signal, and the strategy return is the signal multiplied by the price return over the next step.
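To make the alignment concrete, here is a minimal sketch on made-up prices and signals (the numbers are illustrative and do not come from `app.py`). `pct_change().shift(-1)` places the return from day t to day t+1 on row t, so a signal known at the close of day t is multiplied by the move that follows it. The last row has no next-day return and stays NaN.

```python
import pandas as pd

# Hypothetical closing prices and signals, for illustration only.
toy = pd.DataFrame(
    {
        "Close": [100.0, 110.0, 99.0, 108.9],
        "signal": [1.0, -1.0, 0.0, 1.0],  # BUY, SELL, HOLD, BUY
    }
)

# Return realized over the NEXT bar, stored on the current row.
toy["next_ret"] = toy["Close"].pct_change().shift(-1)

# Position equals the signal, so the strategy return is signal * next_ret.
toy["strategy_ret"] = toy["signal"] * toy["next_ret"]

print(toy)
```

The BUY on the first row earns the +10% move, the SELL on the second row earns the -10% move with the sign flipped, and the HOLD row contributes zero.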

In [None]:
df_back = df_model.iloc[split:].copy()
svc_pred = model_data.get("svc_pred_test")
if svc_pred is None:
    raise RuntimeError("No SVC predictions in model_data")

svc_pred = np.asarray(svc_pred, dtype=float)
n = min(len(df_back), len(svc_pred))
df_back = df_back.iloc[-n:].copy()
signals_svc = svc_pred[-n:]

df_back["signal_svc"] = signals_svc

close = df_back["Close"].astype(float)
next_ret = close.pct_change().shift(-1)
df_back["next_ret"] = next_ret
df_back["strategy_ret_svc"] = df_back["signal_svc"] * df_back["next_ret"]

df_back[["Close", "signal_svc", "next_ret", "strategy_ret_svc"]].tail(10)

## 4. Calculating Metrics: Return, Sharpe, Max Drawdown, Hit Rate

This section calculates:
- cumulative return of the strategy and of buy&hold;
- Sharpe ratio annualized from daily returns (≈ sqrt(252) * mean / std);
- max drawdown of the equity curve;
- hit rate (fraction of non-HOLD days on which the signal matches the sign of the next move).
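The definitions above can be checked by hand on a tiny series. The sketch below uses two hypothetical helpers, `summarize_returns` and `hit_rate` (neither is part of `app.py`), and made-up numbers. It assumes simple per-period returns, 252 trading periods per year, and a zero risk-free rate.

```python
import numpy as np
import pandas as pd


def summarize_returns(returns: pd.Series, periods_per_year: int = 252) -> dict:
    """Summarize a series of per-period simple returns (hypothetical helper)."""
    returns = returns.fillna(0.0)
    equity = (1.0 + returns).cumprod()

    std = returns.std()  # sample standard deviation (ddof=1)
    sharpe = float(np.sqrt(periods_per_year) * returns.mean() / std) if std > 0 else 0.0

    # Drawdown: relative distance of equity from its running peak.
    drawdown = equity / equity.cummax() - 1.0

    return {
        "cum_ret": float(equity.iloc[-1] - 1.0),
        "sharpe": sharpe,
        "max_dd": float(drawdown.min()),
    }


def hit_rate(signals: pd.Series, next_ret: pd.Series) -> float:
    """Share of non-HOLD bars where the signal sign matches the next move."""
    mask = next_ret.notna() & (signals != 0)
    if not mask.any():
        return 0.0
    return float((np.sign(signals[mask]) == np.sign(next_ret[mask])).mean())


# Made-up daily returns and signals, for illustration only.
toy_returns = pd.Series([0.10, -0.10, 0.10])
stats = summarize_returns(toy_returns)
print(stats)

toy_signals = pd.Series([1.0, -1.0, 0.0, 1.0])
toy_next_ret = pd.Series([0.10, 0.10, -0.10, np.nan])
print(hit_rate(toy_signals, toy_next_ret))
```

On the toy returns the equity path is 1.1, 0.99, 1.089, so the cumulative return is 8.9% and the max drawdown is -10%. Of the two tradable bars in the hit-rate example, one is correct, which gives 0.5.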

In [None]:
ret_strat = df_back["strategy_ret_svc"].fillna(0.0)
equity_strat = (1.0 + ret_strat).cumprod()

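# Build buy&hold from the same next-bar returns so that both equity curves
# share one time convention (value on row t includes the move from t to t+1).
equity_bh = (1.0 + df_back["next_ret"].fillna(0.0)).cumprod()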

cum_ret_strat = float(equity_strat.iloc[-1] - 1.0)
cum_ret_bh = float(equity_bh.iloc[-1] - 1.0)

mean_daily = float(ret_strat.mean())
std_daily = float(ret_strat.std())
if std_daily > 0:
    sharpe = float(np.sqrt(252.0) * mean_daily / std_daily)
else:
    sharpe = 0.0

running_max = equity_strat.cummax()
drawdown = equity_strat / running_max - 1.0
max_dd = float(drawdown.min())

mask_valid = df_back["next_ret"].notna() & (df_back["signal_svc"] != 0)
hits = np.sign(df_back.loc[mask_valid, "signal_svc"]) == np.sign(df_back.loc[mask_valid, "next_ret"])
if len(hits) > 0:
    hit_rate = float(hits.mean())
else:
    hit_rate = 0.0

metrics_table = pd.DataFrame(
    [
        {
            "Metric": "Cumulative return (strategy)",
            "Value": cum_ret_strat,
        },
        {
            "Metric": "Cumulative return (buy&hold)",
            "Value": cum_ret_bh,
        },
        {
            "Metric": "Sharpe (strategy, annualized)",
            "Value": sharpe,
        },
        {
            "Metric": "Max drawdown (strategy)",
            "Value": max_dd,
        },
        {
            "Metric": "Hit rate (direction, strategy)",
            "Value": hit_rate,
        },
    ]
)
metrics_table

## 5. Adding Strategies Based on LGBM and Hybrid Signals

`model_data` already contains predictions from other classifiers:
- `class_pred_test` — LGBM signals (after threshold and smoothing);
- `hybrid_class_pred_test` — Hybrid LGBM+LSTM (if LSTM is available).

The cell below builds the same simple strategy for each of them and compares the metrics with SVC.

In [None]:
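def build_strategy_from_signals(df_base: pd.DataFrame, signals_raw: np.ndarray, column_name: str):
    """Attach signals to the tail of df_base and compute one-step strategy returns.

    Returns None when signals_raw is None (e.g. the hybrid model is unavailable).
    """
    if signals_raw is None:
        return None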
    sig = np.asarray(signals_raw, dtype=float)
    n_local = min(len(df_base), len(sig))
    df_loc = df_base.iloc[-n_local:].copy()
    df_loc[column_name] = sig[-n_local:]
    close_loc = df_loc["Close"].astype(float)
    next_ret_loc = close_loc.pct_change().shift(-1)
    df_loc[f"{column_name}_ret"] = df_loc[column_name] * next_ret_loc
    return df_loc

lgbm_pred = model_data.get("class_pred_test")
hybrid_pred = model_data.get("hybrid_class_pred_test")

df_lgbm = build_strategy_from_signals(df_back, lgbm_pred, "signal_lgbm")
df_hybrid = build_strategy_from_signals(df_back, hybrid_pred, "signal_hybrid")

# Keep the per-step returns of every strategy; metrics are computed from them.
strategies_ret = {"svc": ret_strat}
if df_lgbm is not None:
    strategies_ret["lgbm"] = df_lgbm["signal_lgbm_ret"].fillna(0.0)
if df_hybrid is not None:
    strategies_ret["hybrid"] = df_hybrid["signal_hybrid_ret"].fillna(0.0)

# Equity curves are reused by the plotting cell.
strategies_equity = {name: (1.0 + r).cumprod() for name, r in strategies_ret.items()}

metrics_rows = []
for name, eq in strategies_equity.items():
    # Use the stored returns: eq.pct_change() would drop the first step's return.
    ret_seq = strategies_ret[name]
    cum_ret = float(eq.iloc[-1] - 1.0)
    mean_d = float(ret_seq.mean())
    std_d = float(ret_seq.std())
    sharpe_local = float(np.sqrt(252.0) * mean_d / std_d) if std_d > 0 else 0.0
    run_max = eq.cummax()
    dd = eq / run_max - 1.0
    max_dd_local = float(dd.min())
    metrics_rows.append(
        {
            "Strategy": name,
            "Cumulative return": cum_ret,
            "Sharpe": sharpe_local,
            "Max drawdown": max_dd_local,
        }
    )

metrics_multi = pd.DataFrame(metrics_rows)
metrics_multi

## 6. Visualization of PnL for SVC / LGBM / Hybrid vs buy&hold Strategies

Now let's compare several strategies on one chart: SVC, LGBM, Hybrid and buy&hold.

In [None]:
fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=equity_strat.index,
        y=equity_strat.values,
        mode="lines",
        name="Strategy (SVC)",
        line=dict(color="blue", width=2),
    )
)
if "lgbm" in strategies_equity:
    eq_lgbm = strategies_equity["lgbm"]
    fig.add_trace(
        go.Scatter(
            x=eq_lgbm.index,
            y=eq_lgbm.values,
            mode="lines",
            name="Strategy (LGBM)",
            line=dict(color="green", width=2, dash="dot"),
        )
    )
if "hybrid" in strategies_equity:
    eq_hybrid = strategies_equity["hybrid"]
    fig.add_trace(
        go.Scatter(
            x=eq_hybrid.index,
            y=eq_hybrid.values,
            mode="lines",
            name="Strategy (Hybrid)",
            line=dict(color="orange", width=2, dash="dash"),
        )
    )
fig.add_trace(
    go.Scatter(
        x=equity_bh.index,
        y=equity_bh.values,
        mode="lines",
        name="Buy & Hold",
        line=dict(color="gray", width=2, dash="dashdot"),
    )
)
fig.update_layout(
    title=f"Strategies PnL (SVC / LGBM / Hybrid) vs Buy&Hold ({instrument_name}, test segment)",
    xaxis_title="Date",
    yaxis_title="Equity (normalized)",
    hovermode="x unified",
    template="plotly_white",
    legend=dict(orientation="h"),
)
fig.show()

## 7. Recommendations and Typical Mistakes of Such a Backtest

1. **Absence of Transaction Costs**  
   The current calculation does not include commissions, spreads, or slippage. In reality, FX spreads and broker commissions can significantly reduce returns, especially if the strategy trades frequently.

2. **Fixed Holding Horizon**  
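   The backtest assumes each signal is held for exactly one step, while the targets are built with `horizon = 7`, so the holding period does not match the prediction horizon. In practice, signals can have different "lifespans", and more complex exit logic (take profit, stop loss, trailing stop, etc.) may be required.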

3. **SVC as the Main Classifier**  
   The project has several models (LGBM, LSTM, Hybrid). This notebook uses SVC as the main signal model: sections 5 and 6 add LGBM and Hybrid equity curves, but hit rate is computed only for SVC. For a more complete analysis, it is worth computing the full set of metrics for every signal variant and comparing the results.

4. **Risk of Overfitting to History**  
   Model parameters, the horizon, and the classification thresholds were selected on historical data. When the market regime changes, quality may deteriorate. It is useful to:
   - check the model on several non-overlapping time periods;
   - regularly retrain models and track rolling metrics.

5. **Signal Frequency and Volatility Filtering**  
   Here, every SVC signal is traded without filtering. The project already estimates volatility with ATR(14); a logical next step would be to:
   - avoid trading during extremely high volatility (reduce position size or filter signals out entirely);
   - adapt the classifier confidence threshold to the volatility level.

6. **Implicit Dependence on a Specific Test Period**  
   We use the last 20% of the data as the test window. If the market was anomalous during this period, the metrics may not be representative. For a more reliable assessment, it is worth implementing rolling evaluation or time-series cross-validation (e.g., walk-forward).
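
As one possible reading of the walk-forward idea in item 6, the sketch below generates expanding-window train/test index ranges. `walk_forward_splits` and its parameters are hypothetical and not part of `app.py`. How each fold would be wired into `train_models` depends on code that is not shown in this notebook.

```python
from typing import Iterator, Tuple


def walk_forward_splits(
    n_samples: int, n_folds: int = 4, min_train_frac: float = 0.5
) -> Iterator[Tuple[range, range]]:
    """Yield (train_idx, test_idx) pairs with an expanding training window.

    Hypothetical helper: the first `min_train_frac` of the data is always
    used for training, and the remainder is cut into `n_folds` consecutive
    test blocks. Each fold trains on everything before its test block.
    """
    first_test = int(n_samples * min_train_frac)
    fold_size = (n_samples - first_test) // n_folds
    if fold_size < 1:
        raise ValueError("Not enough samples for the requested number of folds")
    for k in range(n_folds):
        start = first_test + k * fold_size
        # The last fold absorbs any remainder so no rows are dropped.
        stop = n_samples if k == n_folds - 1 else start + fold_size
        yield range(0, start), range(start, stop)


for train_idx, test_idx in walk_forward_splits(n_samples=1000, n_folds=4):
    print(f"train [0, {train_idx.stop}) -> test [{test_idx.start}, {test_idx.stop})")
```

Metrics computed per fold show how stable the signals are across periods, rather than on a single 20% window. With a multi-step target, it may also be worth leaving a gap of `horizon` rows between the training and test ranges so that training labels do not overlap the test period.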

This notebook provides a starting point for assessing signal quality on historical data. It can be extended by adding commissions, testing different models and threshold strategies, and building reports across multiple tickers.
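
As a first step toward adding commissions, the sketch below charges a proportional cost whenever the position changes. `apply_costs` and the 1 basis point cost are assumptions made for illustration, not values from the project. The position is assumed flat before the first bar, and closing the final position is not charged.

```python
import pandas as pd


def apply_costs(signals: pd.Series, gross_ret: pd.Series, cost_per_turn: float) -> pd.Series:
    """Subtract a proportional cost each time the position changes.

    Hypothetical helper. `cost_per_turn` is the cost, as a fraction of
    notional, of changing the position by one unit (e.g. 0 -> +1).
    Flipping +1 -> -1 is a two-unit change and is charged twice.
    """
    # Size of the position change on each bar; the first bar opens from flat.
    turnover = signals.diff().abs()
    turnover.iloc[0] = abs(signals.iloc[0])
    return gross_ret.fillna(0.0) - cost_per_turn * turnover


# Made-up signals and gross strategy returns, for illustration only.
signals = pd.Series([1.0, 1.0, -1.0, 0.0])
gross = pd.Series([0.010, -0.005, 0.002, 0.0])

net = apply_costs(signals, gross, cost_per_turn=0.0001)  # 1 basis point, an assumed value
print(net)
```

In this notebook the same helper could be applied to `df_back["signal_svc"]` and `df_back["strategy_ret_svc"]` before building the equity curve, which makes the effect of trading frequency on net returns visible.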