# Walk-Forward Validation Runner and Analysis

This notebook runs a full walk-forward validation pipeline and analyzes the aggregated out-of-sample results. Because every prediction is made on data the model did not see during training, and the folds span different market regimes, the aggregate gives a more robust estimate of the strategy's performance than a single train/test split.
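
To make the splitting idea concrete, here is a minimal sketch of an expanding-window walk-forward split. It is an illustration only: `expanding_window_splits` is a hypothetical helper, and the way `TrainingPipeline.run_walk_forward` actually partitions the data is not shown in this notebook and may differ (for example, it could use a rolling window instead).

```python
def expanding_window_splits(n_samples, n_splits):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Hypothetical helper: each fold trains on everything before the test
    block and tests on the next block of equal size.
    """
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("Not enough samples for the requested number of splits.")
    first_test_start = n_samples - n_splits * test_size
    for k in range(n_splits):
        test_start = first_test_start + k * test_size
        train = list(range(0, test_start))
        test = list(range(test_start, test_start + test_size))
        yield train, test


# Toy example: 12 chronologically ordered observations, 5 folds.
folds = list(expanding_window_splits(n_samples=12, n_splits=5))
for i, (train, test) in enumerate(folds, start=1):
    print(f"Fold {i}: train 0..{train[-1]}, test {test[0]}..{test[-1]}")
```

The key property is that every training index precedes every test index within a fold, so no fold is evaluated on data older than what it was trained on.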

In [None]:
import sys
import os

# Get the absolute path of the project's root directory.
# This assumes the notebook is run from the 'notebooks' folder:
# os.getcwd() returns that folder, and joining '..' moves one level up to the project root.
project_root = os.path.abspath(os.path.join(os.getcwd(), '..'))

# Add the project root to the Python path
if project_root not in sys.path:
    sys.path.append(project_root)

# --- Imports and Setup ---
import yaml
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report

# Import custom project modules
from src.training_pipeline import TrainingPipeline
from src.backtester import VectorizedBacktester
from src.utils import plot_confusion_matrix, plot_roc_curve

# --- Load Configuration ---
# Construct the absolute path to the config file using project_root
config_path = os.path.join(project_root, 'configs', 'config.yaml')
print(f"Loading configuration from: {config_path}")
with open(config_path, 'r') as file:
    config = yaml.safe_load(file)
print("Configuration loaded successfully.")

In [None]:
# Create an instance of the training pipeline, passing the project_root
pipeline = TrainingPipeline(config=config, project_root=project_root)

# Run the entire walk-forward validation process.
# WARNING: This can take a significant amount of time, because the model
# is trained and evaluated once per split (5 times with n_splits=5).
wf_results = pipeline.run_walk_forward(n_splits=5)

print("\n\n✅ --- Walk-Forward Validation Finished! --- ✅")
print("Aggregated results are now available in the 'wf_results' variable.")

## 1. Aggregated Statistical Performance Analysis

Here we analyze the combined performance across all out-of-sample folds. This gives us a single, robust view of the model's predictive power over time.
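
As a rough illustration of what "combined performance" means here, the sketch below pools per-fold labels and probabilities into single arrays before thresholding. The `fold_outputs` values are made up for the example, and the internal layout of `wf_results` is an assumption; only the keys used in the next cell come from this notebook.

```python
import numpy as np

# Hypothetical per-fold outputs: (true labels, predicted probabilities),
# listed in chronological order. Values are invented for illustration.
fold_outputs = [
    (np.array([1, 0, 1]), np.array([0.8, 0.3, 0.4])),
    (np.array([0, 1, 1]), np.array([0.6, 0.7, 0.9])),
]

# Pool all out-of-sample observations into one array per quantity.
agg_true = np.concatenate([y for y, _ in fold_outputs])
agg_proba = np.concatenate([p for _, p in fold_outputs])

# Apply the same 0.5 decision threshold used in the analysis cell.
agg_pred = (agg_proba > 0.5).astype(int)

accuracy = float((agg_pred == agg_true).mean())
print(f"Pooled out-of-sample accuracy: {accuracy:.3f}")
```

Pooling weights each observation equally, so larger folds contribute more to the aggregate metrics than smaller ones.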

In [None]:
print("Analyzing aggregated out-of-sample results...")

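# Extract the combined results from all folds.
# np.asarray guards against list-like inputs, which do not support
# the element-wise comparison and .astype used below.
true_labels = np.asarray(wf_results['true_labels'])
pred_probas = np.asarray(wf_results['pred_probas'])
binary_preds = (pred_probas > 0.5).astype(int)  # 0.5 decision threshold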

# --- Display Overall Metrics ---
print("\n--- Overall Classification Report (All Out-of-Sample Folds) ---")
print(classification_report(true_labels, binary_preds))

# --- Plot Overall Evaluation Graphs ---
plot_confusion_matrix(true_labels, binary_preds)
plot_roc_curve(true_labels, pred_probas, "Walk-Forward Aggregated")

## 2. Aggregated Backtest Performance

This is the final and most important test. We combine the out-of-sample signals from all folds into a single continuous timeline and run one backtest over it. This approximates how the strategy would have performed if the model had been retrained periodically and traded only on data it had not yet seen.
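
The stitching step can be sketched as follows, assuming each fold yields a DataFrame with a `close` price and a 0/1 `signal` column. This is a simplified stand-in, not the logic of `VectorizedBacktester`: the column name `close`, the one-bar execution lag, and the way costs are charged are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical out-of-sample slices from two consecutive folds.
fold_1 = pd.DataFrame(
    {"close": [100.0, 110.0, 121.0], "signal": [1, 1, 0]},
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
)
fold_2 = pd.DataFrame(
    {"close": [121.0, 133.1], "signal": [1, 0]},
    index=pd.to_datetime(["2024-01-04", "2024-01-05"]),
)

# Stitch the folds into one continuous timeline.
timeline = pd.concat([fold_1, fold_2]).sort_index()
if not timeline.index.is_unique:
    raise ValueError("Out-of-sample folds overlap in time.")

# Assumed cost model: commission plus slippage per unit of position change.
cost_rate = 0.001 + 0.001

# Hold the previous bar's signal during the current bar to avoid look-ahead.
position = timeline["signal"].shift(1).fillna(0)
asset_return = timeline["close"].pct_change().fillna(0.0)
trade_cost = position.diff().abs().fillna(0.0) * cost_rate
net_return = position * asset_return - trade_cost

equity = (1.0 + net_return).cumprod()
print(equity)
```

Checking that the stitched index is unique is a cheap safeguard: overlapping folds would double-count bars and inflate or distort the equity curve.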

In [None]:
# Extract the combined backtest dataframe from the results
backtest_df = wf_results['backtest_df']

print(f"Aggregated backtest will run on {len(backtest_df)} signals.")
print(f"Covering the period from {backtest_df.index.min().date()} to {backtest_df.index.max().date()}.")

# Instantiate and run the backtester on the combined out-of-sample signals
backtester = VectorizedBacktester(
    price_data=backtest_df,
    signals=backtest_df['signal'],
    config=config
)

# Run the backtest with commission and slippage both set to 0.001
portfolio = backtester.run(commission=0.001, slippage=0.001)