# Backtesting Engine Tutorial

This notebook is a hands-on tutorial for the Backtesting Engine. It moves from setup and basic usage through strategy creation and execution simulation to performance analysis and validation.

## Table of Contents
1. [Installation and Setup](#installation)
2. [Basic Concepts](#concepts)
3. [Data Management](#data)
4. [Creating Strategies](#strategies)
5. [Execution Simulation](#execution)
6. [Running Backtests](#backtests)
7. [Performance Analysis](#analysis)
8. [Advanced Features](#advanced)
9. [Best Practices](#best-practices)

## 1. Installation and Setup {#installation}

First, let's install the necessary dependencies and set up our environment.

In [None]:
# Install dependencies (run this in your terminal)
# pip install -e .

# Import necessary libraries
import sys
import os
import logging
from datetime import datetime, timedelta
from decimal import Decimal
from pathlib import Path

# Set up logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

# Add the project root to Python path
project_root = Path().absolute().parent
if str(project_root) not in sys.path:
    sys.path.append(str(project_root))

print(f"Project root: {project_root}")
print("Setup complete!")

## 2. Basic Concepts {#concepts}

The backtesting engine follows an event-driven architecture with these key components:

- **Events**: Market data, signals, orders, and fills
- **Portfolio**: Position tracking and P&L calculation
- **Strategies**: Trading logic and signal generation
- **Broker**: Order execution with realistic slippage and commissions
- **Engine**: Orchestrates the entire backtesting process
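
To make the event flow concrete, here is a minimal, self-contained sketch of an event-driven loop. The `Market`, `Signal`, `Order`, and `Fill` classes and the `run_event_loop` function are hypothetical stand-ins written for this illustration. They are not the engine's own `MarketEvent`, `SignalEvent`, `OrderEvent`, and `FillEvent` classes, which likely carry more fields.

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical, simplified event types (assumption: not the engine's real classes).
@dataclass
class Market:
    symbol: str
    price: float

@dataclass
class Signal:
    symbol: str
    side: str

@dataclass
class Order:
    symbol: str
    side: str
    quantity: int

@dataclass
class Fill:
    symbol: str
    side: str
    quantity: int
    price: float

def run_event_loop(prices, threshold=100.0, quantity=10):
    """Feed one bar at a time and drain the queue before the next bar."""
    queue = deque()
    last_price = {}
    fills = []
    for symbol, price in prices:
        queue.append(Market(symbol, price))
        while queue:
            event = queue.popleft()
            if isinstance(event, Market):
                last_price[event.symbol] = event.price
                # Toy strategy: buy whenever the price is below the threshold.
                if event.price < threshold:
                    queue.append(Signal(event.symbol, "BUY"))
            elif isinstance(event, Signal):
                # Toy portfolio: turn every signal into a fixed-size order.
                queue.append(Order(event.symbol, event.side, quantity))
            elif isinstance(event, Order):
                # Toy broker: fill at the last seen price, no slippage or fees.
                queue.append(Fill(event.symbol, event.side, event.quantity,
                                  last_price[event.symbol]))
            elif isinstance(event, Fill):
                fills.append(event)
    return fills

fills = run_event_loop([("AAPL", 101.0), ("AAPL", 99.0), ("AAPL", 98.5)])
print(fills)
```

Draining the queue before reading the next bar keeps each decision based only on data available at that time, which is the main reason to prefer an event-driven design over a vectorized one.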

In [None]:
# Import core components
from backtesting_engine import BacktestEngine
from backtesting_engine.core.events import MarketEvent, SignalEvent, OrderEvent, FillEvent
from backtesting_engine.core.portfolio import Portfolio
from backtesting_engine.core.data_handler import DataConfig, CSVDataHandler

print("Core components imported successfully!")

# Let's examine the event types
print("\nEvent Types:")
print(f"MarketEvent: {MarketEvent.__doc__}")
print(f"SignalEvent: {SignalEvent.__doc__}")

## 3. Data Management {#data}

The engine supports multiple data sources: CSV files, databases, and APIs. Let's start by creating some sample data.

In [None]:
import pandas as pd
import numpy as np

def create_sample_data(symbol, start_date, end_date, initial_price=100):
    """Create realistic sample market data."""
    dates = pd.date_range(start_date, end_date, freq='D')
    
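    # Generate i.i.d. normal returns; autocorrelation is layered on below.
    # Seed from the symbol's characters: built-in hash() on strings is salted
    # per process, so it would not give reproducible data across runs.
    np.random.seed(42 + sum(ord(c) for c in symbol) % 1000)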
    returns = np.random.normal(0.0005, 0.02, len(dates))
    
    # Add momentum and mean reversion effects
    for i in range(1, len(returns)):
        returns[i] += 0.1 * returns[i-1]  # Momentum
        if i > 20:
            returns[i] -= 0.05 * np.mean(returns[i-20:i])  # Mean reversion
    
    # Calculate prices
    prices = initial_price * np.exp(np.cumsum(returns))
    
    # Generate OHLCV data
    df = pd.DataFrame(index=dates)
    df['close'] = prices
    df['open'] = df['close'].shift(1) * (1 + np.random.normal(0, 0.003, len(dates)))
    df['high'] = np.maximum(df['open'], df['close']) * (1 + np.abs(np.random.normal(0, 0.008, len(dates))))
    df['low'] = np.minimum(df['open'], df['close']) * (1 - np.abs(np.random.normal(0, 0.008, len(dates))))
    df['volume'] = np.random.lognormal(15, 0.5, len(dates)).astype(int)
    df['adj_close'] = df['close']
    
    return df.dropna()

# Create sample data for multiple symbols
symbols = ['AAPL', 'MSFT', 'GOOGL']
start_date = '2020-01-01'
end_date = '2023-12-31'

data_dir = Path('data')
data_dir.mkdir(exist_ok=True)

for symbol in symbols:
    df = create_sample_data(symbol, start_date, end_date)
    file_path = data_dir / f'{symbol}.csv'
    df.to_csv(file_path)
    print(f"Created data for {symbol}: {len(df)} records")

# Display sample data
sample_df = pd.read_csv(data_dir / 'AAPL.csv', index_col=0, parse_dates=True)
print("\nSample AAPL data:")
print(sample_df.head())
print(f"\nData shape: {sample_df.shape}")
print(f"Date range: {sample_df.index[0]} to {sample_df.index[-1]}")

In [None]:
# Set up data handler
data_config = DataConfig(
    source_type='csv',
    path_or_connection='data',
    symbols=symbols,
    start_date=datetime(2020, 1, 1),
    end_date=datetime(2023, 12, 31),
    frequency='daily',
    validate_data=True,
    handle_missing='forward_fill'
)

print(f"Data configuration created for symbols: {data_config.symbols}")
print(f"Source type: {data_config.source_type}")
print(f"Date range: {data_config.start_date} to {data_config.end_date}")
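
The `handle_missing='forward_fill'` setting suggests that gaps are filled with the last known value. The sketch below shows that idea on a plain list. The `forward_fill` helper is hypothetical and written for this illustration; how the engine's data handler treats leading gaps or long runs of missing bars is not shown in this tutorial.

```python
def forward_fill(values):
    """Replace each None with the most recent non-None value.

    Leading None entries stay None because there is nothing to carry forward.
    """
    filled = []
    last_seen = None
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled

closes = [None, 100.0, None, None, 101.5, None]
filled_closes = forward_fill(closes)
print(filled_closes)  # [None, 100.0, 100.0, 100.0, 101.5, 101.5]
```

Forward filling never uses future values, so it does not introduce look-ahead bias, but long filled runs produce artificial zero returns and are worth flagging during validation.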

## 4. Creating Strategies {#strategies}

Let's implement a simple mean reversion strategy and understand how strategies work.

In [None]:
from backtesting_engine.strategies.mean_reversion import MeanReversionStrategy
from backtesting_engine.strategies.base import BaseStrategy

# Create a mean reversion strategy
strategy = MeanReversionStrategy(
    strategy_id="mean_reversion_tutorial",
    symbols=symbols,
    lookback_period=20,
    std_dev_multiplier=2.0,
    rsi_period=14,
    rsi_oversold=30.0,
    rsi_overbought=70.0,
    position_size=0.05,  # 5% of portfolio per position
    stop_loss=0.05,      # 5% stop loss
    take_profit=0.10     # 10% take profit
)

print(f"Strategy created: {strategy}")
print(f"Strategy parameters: {strategy.parameters}")
print(f"Trading symbols: {strategy.symbols}")
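
The parameter names `lookback_period` and `std_dev_multiplier` suggest a band around a rolling mean. The sketch below shows one common way such a rule can work. It is an assumption made for illustration, not the implementation of `MeanReversionStrategy`, and it leaves out the RSI filter, stop loss, and take profit. The `band_signal` function is hypothetical.

```python
from statistics import mean, pstdev

def band_signal(prices, lookback_period=20, std_dev_multiplier=2.0):
    """Return 'BUY', 'SELL', or None from the latest price versus its rolling band."""
    if len(prices) < lookback_period:
        return None  # Not enough history yet
    window = prices[-lookback_period:]
    center = mean(window)
    spread = pstdev(window)
    if spread == 0:
        return None  # Flat window: the band has no width
    latest = prices[-1]
    if latest < center - std_dev_multiplier * spread:
        return "BUY"   # Unusually low relative to recent history
    if latest > center + std_dev_multiplier * spread:
        return "SELL"  # Unusually high relative to recent history
    return None

flat_then_drop = [100.0] * 19 + [90.0]
flat_then_spike = [100.0] * 19 + [110.0]
choppy = [100.0, 101.0] * 10

print(band_signal(flat_then_drop))   # BUY
print(band_signal(flat_then_spike))  # SELL
print(band_signal(choppy))           # None
```

A wider multiplier trades less often but only on larger deviations; a shorter lookback makes the band react faster to recent prices.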

### Custom Strategy Example

Let's create a simple custom strategy to understand the framework.

In [None]:
from typing import List, Optional
from backtesting_engine.core.events import SignalEvent, OrderSide

class SimpleMomentumStrategy(BaseStrategy):
    """Simple momentum strategy for educational purposes."""
    
    def __init__(self, strategy_id: str, symbols: List[str], lookback_days: int = 10):
        parameters = {'lookback_days': lookback_days}
        super().__init__(strategy_id, symbols, parameters)
        self.lookback_days = lookback_days
    
    def generate_signals(self, market_data, portfolio):
        """Generate signals based on simple momentum."""
        signals = []
        
        # Update indicators
        self.update_indicators(market_data)
        
        for symbol in self.symbols:
            if symbol not in market_data.data:
                continue
                
            # Check if we have enough price history: a return over
            # lookback_days bars needs lookback_days + 1 prices
            if len(self.price_history[symbol]) <= self.lookback_days:
                continue
            
            current_price = self.price_history[symbol][-1]
            past_price = self.price_history[symbol][-self.lookback_days - 1]
            
            # Calculate return
            momentum = (current_price - past_price) / past_price
            
            current_position = self.get_position_size(symbol, portfolio)
            
            # Generate signals
            if momentum > 0.05 and current_position == 0:  # Strong positive momentum
                signal = SignalEvent(
                    timestamp=market_data.timestamp,
                    strategy_id=self.strategy_id,
                    symbol=symbol,
                    signal_type=OrderSide.BUY,
                    strength=min(1.0, momentum * 10),  # Cap at 100%
                    target_percent=0.03,  # 3% position
                    metadata={'momentum': momentum}
                )
                signals.append(signal)
                
            elif momentum < -0.03 and current_position > 0:  # Exit on negative momentum
                signal = SignalEvent(
                    timestamp=market_data.timestamp,
                    strategy_id=self.strategy_id,
                    symbol=symbol,
                    signal_type=OrderSide.SELL,
                    strength=1.0,
                    target_percent=0.0,  # Close position
                    metadata={'momentum': momentum, 'exit_reason': 'negative_momentum'}
                )
                signals.append(signal)
        
        return signals

# Create custom strategy instance
momentum_strategy = SimpleMomentumStrategy(
    strategy_id="simple_momentum",
    symbols=['AAPL'],
    lookback_days=15
)

print(f"Custom strategy created: {momentum_strategy}")

## 5. Execution Simulation {#execution}

The execution system simulates realistic trading conditions with slippage, commissions, and partial fills.

In [None]:
from backtesting_engine.execution.broker import SimulatedBroker
from backtesting_engine.execution.slippage import LinearSlippageModel, SquareRootSlippageModel
from backtesting_engine.execution.commissions import PercentageCommissionModel, InteractiveBrokersCommissionModel

# Create different execution models

# 1. Conservative execution (low costs)
conservative_slippage = LinearSlippageModel(
    base_rate=Decimal('0.0005'),  # 0.05% base slippage
    size_impact=Decimal('0.005'),
    volatility_impact=Decimal('0.002')
)

conservative_commission = PercentageCommissionModel(
    commission_rate=Decimal('0.0005'),  # 0.05% commission
    min_commission=Decimal('1.0')
)

conservative_broker = SimulatedBroker(
    slippage_model=conservative_slippage,
    commission_model=conservative_commission,
    partial_fill_probability=0.05
)

# 2. Realistic execution (moderate costs)
realistic_slippage = SquareRootSlippageModel(
    impact_coefficient=Decimal('0.1'),
    volatility_scaling=Decimal('1.0')
)

realistic_commission = InteractiveBrokersCommissionModel(account_type="pro")

realistic_broker = SimulatedBroker(
    slippage_model=realistic_slippage,
    commission_model=realistic_commission,
    partial_fill_probability=0.15,
    reject_probability=0.001
)

print("Execution models created:")
print(f"Conservative broker: {conservative_broker}")
print(f"Realistic broker: {realistic_broker}")
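
To give a feel for what these cost models compute, here is a rough sketch of a linear slippage rule, a square-root impact rule, and a percentage commission with a minimum fee. The formulas are common textbook forms and are assumptions, not the engine's actual `LinearSlippageModel`, `SquareRootSlippageModel`, or `PercentageCommissionModel` code. The function names and the `avg_daily_volume` and `daily_volatility` inputs are hypothetical.

```python
from decimal import Decimal

def linear_slippage(price, quantity, avg_daily_volume,
                    base_rate=Decimal("0.0005"), size_impact=Decimal("0.005")):
    """Per-share slippage that grows linearly with the order's share of daily volume."""
    participation = Decimal(quantity) / Decimal(avg_daily_volume)
    return price * (base_rate + size_impact * participation)

def square_root_slippage(price, quantity, avg_daily_volume, daily_volatility,
                         impact_coefficient=Decimal("0.1")):
    """Per-share slippage that grows with the square root of participation."""
    participation = Decimal(quantity) / Decimal(avg_daily_volume)
    return price * impact_coefficient * daily_volatility * participation.sqrt()

def percentage_commission(price, quantity,
                          commission_rate=Decimal("0.0005"),
                          min_commission=Decimal("1.0")):
    """Commission as a fraction of trade value, floored at a minimum fee."""
    return max(price * quantity * commission_rate, min_commission)

price = Decimal("100")
linear_cost = linear_slippage(price, 10_000, 1_000_000)
sqrt_cost = square_root_slippage(price, 10_000, 1_000_000, Decimal("0.02"))
large_fee = percentage_commission(price, 10_000)
small_fee = percentage_commission(price, 5)

print(f"Linear slippage per share:      {linear_cost}")
print(f"Square-root slippage per share: {sqrt_cost}")
print(f"Commission on 10,000 shares:    {large_fee}")
print(f"Commission on 5 shares:         {small_fee}")
```

Under the square-root rule, quadrupling the order size only doubles the per-share impact, which is why it is often preferred for larger orders; the minimum fee matters most for small orders.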

## 6. Running Backtests {#backtests}

Now let's put everything together and run a complete backtest.

In [None]:
# Create the backtesting engine
engine = BacktestEngine(
    start_date=datetime(2020, 1, 1),
    end_date=datetime(2023, 12, 31),
    initial_capital=Decimal('1000000'),  # $1M starting capital
    commission=Decimal('0.001'),
    margin_requirement=Decimal('0.5'),
    max_leverage=Decimal('2.0')
)

print(f"Backtesting engine created: {engine}")

# Add data handler
data_handler = CSVDataHandler(data_config)
engine.add_data_handler(data_handler)
print("Data handler added")

# Add broker
engine.set_broker(realistic_broker)
print("Broker configured")

# Add strategy
engine.add_strategy(strategy.strategy_id, strategy)
print("Strategy added")

print("\nBacktest configuration complete!")

In [None]:
# Run the backtest
print("Starting backtest...")
print("This may take a few moments...")

try:
    # Clear any previous state
    engine.portfolio.holdings.clear()
    engine.portfolio.positions.clear()
    
    # Run backtest
    results = engine.run()
    
    print("✅ Backtest completed successfully!")
    print(f"📈 Total Return: {results.metrics['total_return']:.2%}")
    print(f"📊 Sharpe Ratio: {results.metrics['sharpe_ratio']:.2f}")
    print(f"📉 Max Drawdown: {results.metrics['max_drawdown']:.2%}")
    
except Exception as e:
    print(f"❌ Backtest failed: {str(e)}")
    print("\n🔧 This is expected in demo mode - check data files exist and are properly formatted.")

## 7. Performance Analysis {#analysis}

Let's analyze the backtest results in detail.

In [None]:
# Get broker statistics
broker_stats = realistic_broker.get_statistics()

print("EXECUTION STATISTICS")
print("="*50)
print(f"Total Orders: {broker_stats['total_orders']}")
print(f"Filled Orders: {broker_stats['filled_orders']}")
print(f"Fill Rate: {broker_stats['fill_rate']:.2%}")
print(f"Rejection Rate: {broker_stats['rejection_rate']:.2%}")
print(f"Average Commission: ${broker_stats['avg_commission']:.2f}")
print(f"Average Slippage: ${broker_stats['avg_slippage']:.4f}")
print(f"Total Commission Paid: ${broker_stats['total_commission']:.2f}")
print(f"Total Slippage Cost: ${broker_stats['total_slippage']:.2f}")

# Portfolio analysis
portfolio = results.portfolio
print("\nPORTFOLIO ANALYSIS")
print("="*50)
print(f"Initial Capital: ${portfolio.initial_capital:,.2f}")
print(f"Final Equity: ${portfolio.calculate_total_equity():,.2f}")
print(f"Cash Position: ${portfolio.cash:,.2f}")
print(f"Total Return: {(portfolio.calculate_total_equity() / portfolio.initial_capital - 1):.2%}")
print(f"Realized P&L: ${portfolio.realized_pnl:,.2f}")
print(f"Unrealized P&L: ${portfolio.calculate_unrealized_pnl():,.2f}")

# Position analysis
if portfolio.positions:
    print("\nCURRENT POSITIONS")
    print("-"*50)
    for symbol, position in portfolio.positions.items():
        print(f"{symbol}: {position.quantity:+d} shares @ ${position.avg_price:.2f}")
        print(f"  Market Value: ${position.market_value:,.2f}")
        print(f"  Unrealized P&L: ${position.unrealized_pnl:+,.2f}")
else:
    print("\nNo current positions")

In [None]:
# Analyze trades
if results.completed_trades:
    trades_data = []
    for trade in results.completed_trades:
        trades_data.append({
            'timestamp': trade.timestamp,
            'symbol': trade.symbol,
            'side': trade.side.value,
            'quantity': trade.quantity,
            'price': float(trade.fill_price),
            'commission': float(trade.commission),
            'slippage': float(trade.slippage)
        })
    
    trades_df = pd.DataFrame(trades_data)
    
    print("TRADE ANALYSIS")
    print("="*50)
    print(f"Total Trades: {len(trades_df)}")
    print(f"Buy Orders: {len(trades_df[trades_df['side'] == 'BUY'])}")
    print(f"Sell Orders: {len(trades_df[trades_df['side'] == 'SELL'])}")
    print("\nTrade Volume by Symbol:")
    print(trades_df.groupby('symbol')['quantity'].sum())
    print(f"\nAverage Trade Size: {trades_df['quantity'].mean():.0f} shares")
    print(f"Average Trade Value: ${(trades_df['price'] * trades_df['quantity']).mean():,.2f}")
    
    print("\nRecent Trades:")
    print(trades_df.tail(10).to_string(index=False))
else:
    print("No trades executed")
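
The headline metrics reported earlier (total return, Sharpe ratio, max drawdown) can all be derived from an equity curve. The sketch below shows standard definitions on a plain list of equity values. It is an illustration under stated assumptions: the engine's own conventions (annualization factor, risk-free rate, sample versus population deviation) are not shown in this tutorial and may differ. All function names here are hypothetical.

```python
import math
from statistics import mean, stdev

def total_return(equity):
    """Fractional gain from the first to the last equity value."""
    return equity[-1] / equity[0] - 1

def periodic_returns(equity):
    """Simple return between each pair of consecutive equity values."""
    return [curr / prev - 1 for prev, curr in zip(equity, equity[1:])]

def sharpe_ratio(equity, periods_per_year=252, risk_free_rate=0.0):
    """Annualized mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free_rate / periods_per_year
              for r in periodic_returns(equity)]
    volatility = stdev(excess)
    if volatility == 0:
        return float("nan")  # Undefined when returns never vary
    return mean(excess) / volatility * math.sqrt(periods_per_year)

def max_drawdown(equity):
    """Largest peak-to-trough decline, as a negative fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1)
    return worst

equity_curve = [100.0, 110.0, 99.0, 120.0, 108.0]
print(f"Total Return: {total_return(equity_curve):.2%}")
print(f"Sharpe Ratio: {sharpe_ratio(equity_curve):.2f}")
print(f"Max Drawdown: {max_drawdown(equity_curve):.2%}")
```

In this toy curve the total return is positive while the drawdown reaches about 10% twice, which is the kind of detail a return figure alone hides.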

## 8. Advanced Features {#advanced}

Let's explore some advanced features like parameter optimization and comparison studies.

In [None]:
# Parameter sensitivity analysis
def run_parameter_study():
    """Run a simple parameter sensitivity study."""
    parameters_to_test = [
        {'lookback_period': 10, 'std_dev_multiplier': 1.5, 'position_size': 0.03},
        {'lookback_period': 20, 'std_dev_multiplier': 2.0, 'position_size': 0.05},
        {'lookback_period': 30, 'std_dev_multiplier': 2.5, 'position_size': 0.07},
    ]
    
    results_summary = []
    
    for i, params in enumerate(parameters_to_test):
        print(f"\nTesting parameter set {i+1}: {params}")
        
        try:
            # Create new strategy with different parameters
            test_strategy = MeanReversionStrategy(
                strategy_id=f"mean_reversion_test_{i+1}",
                symbols=symbols,
                **params
            )
            
            # Create new engine
            test_engine = BacktestEngine(
                start_date=datetime(2020, 1, 1),
                end_date=datetime(2023, 12, 31),
                initial_capital=Decimal('1000000')
            )
            
            # Set up components (simplified for speed)
            test_engine.add_data_handler(CSVDataHandler(data_config))
            test_engine.set_broker(conservative_broker)  # Use the lower-cost broker
            test_engine.add_strategy(test_strategy.strategy_id, test_strategy)
            
            # Run backtest
            test_results = test_engine.run()
            
            # Calculate metrics
            final_value = test_results.portfolio.calculate_total_equity()
            total_return = (final_value / test_results.portfolio.initial_capital) - 1
            
            results_summary.append({
                'parameters': params,
                'final_value': float(final_value),
                'total_return': float(total_return),
                'num_trades': len(test_results.completed_trades)
            })
            
            print(f"  Total Return: {total_return:.2%}")
            print(f"  Final Value: ${final_value:,.2f}")
            print(f"  Number of Trades: {len(test_results.completed_trades)}")
            
        except Exception as e:
            print(f"  ❌ Test failed: {e}")
            results_summary.append({
                'parameters': params,
                'final_value': None,
                'total_return': None,
                'num_trades': None,
                'error': str(e)
            })
    
    return results_summary

# Note: the full study runs one backtest per parameter set and may take a while,
# so this cell only defines it.
print("Parameter study defined (not executed).")
print("(This is a simplified version for demonstration)")

# To run the full study, uncomment the line below:
# param_results = run_parameter_study()
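
The study above uses three hand-picked parameter sets. A fuller sensitivity analysis usually sweeps every combination of a few ranges. The sketch below builds such a grid with `itertools.product`; the `build_parameter_grid` helper is hypothetical, and the resulting dictionaries have the same shape as the entries of `parameters_to_test`.

```python
from itertools import product

def build_parameter_grid(param_ranges):
    """Return one dict per combination of the given parameter values."""
    names = list(param_ranges)
    value_lists = [param_ranges[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]

grid = build_parameter_grid({
    'lookback_period': [10, 20, 30],
    'std_dev_multiplier': [1.5, 2.0, 2.5],
    'position_size': [0.03, 0.05],
})

print(f"Parameter sets to test: {len(grid)}")  # 3 * 3 * 2 = 18
print(f"First set: {grid[0]}")
```

The grid grows multiplicatively with each added parameter, and so does the risk of fitting noise, so it is worth keeping ranges coarse and confirming the best set on data that was not used for the sweep.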

### Strategy Comparison

Let's compare different strategies side by side.

In [None]:
def compare_strategies():
    """Compare different trading strategies."""
    strategies_to_test = [
        {
            'name': 'Conservative Mean Reversion',
            'strategy': MeanReversionStrategy(
                strategy_id="conservative_mr",
                symbols=['AAPL'],
                lookback_period=30,
                std_dev_multiplier=2.5,
                position_size=0.03
            )
        },
        {
            'name': 'Aggressive Mean Reversion',
            'strategy': MeanReversionStrategy(
                strategy_id="aggressive_mr",
                symbols=['AAPL'],
                lookback_period=10,
                std_dev_multiplier=1.5,
                position_size=0.08
            )
        },
        {
            'name': 'Simple Momentum',
            'strategy': SimpleMomentumStrategy(
                strategy_id="simple_momentum",
                symbols=['AAPL'],
                lookback_days=15
            )
        }
    ]
    
    comparison_results = []
    
    for strategy_config in strategies_to_test:
        print(f"\nTesting {strategy_config['name']}...")
        
        # This would run each strategy and collect results
        # For the tutorial, we'll simulate results
        simulated_result = {
            'name': strategy_config['name'],
            'total_return': np.random.normal(0.15, 0.1),  # Simulated return
            'volatility': np.random.normal(0.2, 0.05),    # Simulated volatility
            'max_drawdown': np.random.normal(-0.08, 0.03), # Simulated drawdown
            'num_trades': np.random.randint(50, 200)       # Simulated trade count
        }
        
        comparison_results.append(simulated_result)
        print(f"  Total Return: {simulated_result['total_return']:.2%}")
        print(f"  Volatility: {simulated_result['volatility']:.2%}")
        print(f"  Max Drawdown: {simulated_result['max_drawdown']:.2%}")
    
    return comparison_results

print("Strategy Comparison Framework:")
comparison_results = compare_strategies()

# Create comparison table
comparison_df = pd.DataFrame(comparison_results)
print("\nStrategy Comparison Results (Simulated):")
print(comparison_df.round(4))

## 9. Best Practices {#best-practices}

Here are some best practices for using the backtesting engine effectively.

### 1. Data Quality

- Always validate your data before backtesting
- Handle missing data appropriately
- Be aware of survivorship bias
- Include transaction costs and slippage
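
As a starting point for data validation, the sketch below runs a few basic sanity checks on OHLCV bars represented as dictionaries. The `find_invalid_bars` helper and the specific checks are assumptions chosen for illustration; they are not what the engine's `validate_data=True` option does, which this tutorial does not describe.

```python
def find_invalid_bars(bars):
    """Return (index, reason) pairs for bars that fail basic OHLCV checks."""
    required = ("open", "high", "low", "close", "volume")
    problems = []
    for i, bar in enumerate(bars):
        missing = [key for key in required if bar.get(key) is None]
        if missing:
            problems.append((i, "missing fields: " + ", ".join(missing)))
            continue  # Remaining checks need every field
        if bar["low"] <= 0:
            problems.append((i, "non-positive price"))
        if bar["high"] < max(bar["open"], bar["close"]):
            problems.append((i, "high below open/close"))
        if bar["low"] > min(bar["open"], bar["close"]):
            problems.append((i, "low above open/close"))
        if bar["volume"] < 0:
            problems.append((i, "negative volume"))
    return problems

bars = [
    {"open": 100.0, "high": 102.0, "low": 99.0, "close": 101.0, "volume": 1000},
    {"open": 101.0, "high": 100.5, "low": 99.5, "close": 100.0, "volume": 1200},
    {"open": 100.0, "high": 101.0, "low": 99.0, "close": None, "volume": 900},
]

problems = find_invalid_bars(bars)
for index, reason in problems:
    print(f"Bar {index}: {reason}")
```

Returning the full list of problems, rather than raising on the first one, makes it easier to judge whether a data file needs a small repair or should be discarded.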

### 2. Strategy Development

- Start simple and add complexity gradually
- Use out-of-sample testing
- Avoid over-optimization
- Consider regime changes

### 3. Risk Management

- Always include position sizing
- Use stop-losses appropriately
- Monitor correlation between positions
- Stress test your strategies
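
Two of these points, position sizing and stop-losses, reduce to simple arithmetic. The sketch below converts a target portfolio percentage into a share count and checks a long position against stop-loss and take-profit thresholds. The defaults mirror the `stop_loss=0.05` and `take_profit=0.10` values used earlier, but both functions are hypothetical and do not reflect how the engine sizes orders or triggers exits.

```python
import math

def shares_for_target(equity, target_percent, price):
    """Whole shares that bring a position up to target_percent of equity."""
    if price <= 0:
        raise ValueError("price must be positive")
    return math.floor(equity * target_percent / price)

def exit_reason(entry_price, current_price, stop_loss=0.05, take_profit=0.10):
    """Return 'stop_loss', 'take_profit', or None for a long position."""
    change = current_price / entry_price - 1
    if change <= -stop_loss:
        return "stop_loss"
    if change >= take_profit:
        return "take_profit"
    return None

print(shares_for_target(1_000_000, 0.05, 187.50))  # 266
print(exit_reason(100.0, 94.0))    # stop_loss
print(exit_reason(100.0, 111.0))   # take_profit
print(exit_reason(100.0, 103.0))   # None
```

Rounding the share count down keeps the position at or below its target weight, so sizing alone never pushes the portfolio past its intended exposure.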

### 4. Performance Analysis

- Look beyond just returns
- Analyze drawdowns carefully
- Consider risk-adjusted metrics
- Understand your strategy's behavior in different market conditions

In [None]:
# Example of comprehensive strategy validation
def validate_strategy(strategy, data_config, validation_periods):
    """Comprehensive strategy validation framework."""
    
    validation_results = {}
    
    for period_name, (start_date, end_date) in validation_periods.items():
        print(f"\nValidating strategy for {period_name}: {start_date} to {end_date}")
        
        # Create period-specific configuration
        period_config = DataConfig(
            source_type=data_config.source_type,
            path_or_connection=data_config.path_or_connection,
            symbols=data_config.symbols,
            start_date=start_date,
            end_date=end_date,
            frequency=data_config.frequency
        )
        
        # A real run would build an engine from period_config here and
        # collect its performance metrics; period_config is otherwise unused.
        
        # Simulated results for demonstration
        validation_results[period_name] = {
            'total_return': np.random.normal(0.1, 0.05),
            'sharpe_ratio': np.random.normal(1.2, 0.3),
            'max_drawdown': np.random.normal(-0.1, 0.03),
            'win_rate': np.random.normal(0.6, 0.1)
        }
        
        print(f"  Period performance (simulated): {validation_results[period_name]}")
    
    return validation_results

# Define validation periods
validation_periods = {
    'Bull Market': (datetime(2020, 4, 1), datetime(2021, 3, 31)),
    'Bear Market': (datetime(2022, 1, 1), datetime(2022, 12, 31)),
    'Recovery': (datetime(2023, 1, 1), datetime(2023, 12, 31))
}

print("Strategy Validation Framework:")
validation_results = validate_strategy(strategy, data_config, validation_periods)

# Analyze consistency across periods
print("\nConsistency Analysis:")
metrics_df = pd.DataFrame(validation_results).T
print(metrics_df.round(3))

print("\nMetric Stability (lower is better):")
for metric in metrics_df.columns:
    std_dev = metrics_df[metric].std()
    mean_val = metrics_df[metric].mean()
    cv = std_dev / abs(mean_val) if mean_val != 0 else float('inf')
    print(f"  {metric}: CV = {cv:.3f}")

## Conclusion

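This tutorial covered the essential features of the Backtesting Engine:

- Setting up the environment and importing the core components
- Generating sample data and configuring a data handler
- Using the built-in mean reversion strategy and writing a custom one
- Simulating execution with slippage and commission models
- Running a backtest and analyzing execution, portfolio, and trade statistics
- Structuring parameter studies, strategy comparisons, and multi-period validation
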
### Next Steps

- Experiment with different strategy parameters
- Try implementing your own custom strategies
- Explore different execution models
- Use real market data with the API data handler
- Implement walk-forward analysis for robust validation
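
For the walk-forward item, the core building block is a generator of rolling train/test date windows. The sketch below is one simple scheme, with a fixed-length training window that rolls forward by the test length. The `walk_forward_windows` function and the window lengths are assumptions for illustration; the engine may offer its own tooling for this.

```python
from datetime import date, timedelta

def walk_forward_windows(start, end, train_days, test_days):
    """Yield (train_start, train_end, test_start, test_end) date tuples.

    Each test window starts the day after its training window ends, and the
    scheme rolls forward by test_days so test windows never overlap.
    """
    train_start = start
    while True:
        train_end = train_start + timedelta(days=train_days - 1)
        test_start = train_end + timedelta(days=1)
        test_end = test_start + timedelta(days=test_days - 1)
        if test_end > end:
            break  # Not enough data left for a full test window
        yield train_start, train_end, test_start, test_end
        train_start += timedelta(days=test_days)

windows = list(walk_forward_windows(
    date(2020, 1, 1), date(2023, 12, 31), train_days=365, test_days=90
))

print(f"Number of windows: {len(windows)}")
for train_start, train_end, test_start, test_end in windows[:3]:
    print(f"train {train_start} to {train_end} | test {test_start} to {test_end}")
```

Parameters would be chosen on each training window and evaluated only on the test window that follows it, so every reported result comes from data the optimization never saw.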

### Resources

- Check the `examples/` directory for more detailed examples
- Read the documentation for advanced configuration options
- Use the CLI tool for automated backtesting workflows
- Explore the test suite for implementation details