# Algorithmic Trading Strategy Development

This notebook demonstrates how to develop, backtest, and analyze algorithmic trading strategies using our custom framework.

## Table of Contents
1. [Setup and Data Loading](#setup)
2. [Strategy Implementation](#strategy)
3. [Backtesting](#backtesting)
4. [Performance Analysis](#analysis)
5. [Strategy Comparison](#comparison)

## 1. Setup and Data Loading {#setup}

In [None]:
# Import required libraries
import sys
import os
sys.path.append('../src')

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta
import warnings
warnings.filterwarnings('ignore')

# Import our custom modules
from data.data_loader import DataLoader, load_single_stock
from backtesting.engine import BacktestEngine
from backtesting.strategy_base import StrategyBase
from strategies.moving_average import MovingAverageCrossover, ExponentialMovingAverageCrossover, TripleMovingAverageCrossover
from utils.metrics import PerformanceAnalyzer, quick_performance_summary
from config import config

# Set up plotting
plt.style.use('seaborn-v0_8')
plt.rcParams['figure.figsize'] = (12, 8)

print("✓ All modules imported successfully")
print(f"✓ Initial capital: ${config.initial_capital:,.2f}")
print(f"✓ Commission: {config.commission:.3%}")
print(f"✓ Slippage: {config.slippage:.3%}")

### Load Market Data

In [None]:
# Define parameters
SYMBOLS = ['AAPL', 'MSFT', 'GOOGL', 'AMZN', 'TSLA']
START_DATE = '2020-01-01'
END_DATE = '2023-12-31'

print(f"Loading data for {len(SYMBOLS)} symbols from {START_DATE} to {END_DATE}...")

# Load data
loader = DataLoader()
data = loader.fetch_stock_data(SYMBOLS, START_DATE, END_DATE)

# Display data summary
print(f"\n📊 Data Summary:")
for symbol, df in data.items():
    print(f"  {symbol}: {len(df)} records, {df.index[0].strftime('%Y-%m-%d')} to {df.index[-1].strftime('%Y-%m-%d')}")

# Create combined dataset for multi-asset strategies
combined_data = pd.concat(data.values(), ignore_index=False)
combined_data = combined_data.sort_values(['date', 'symbol']) if 'date' in combined_data.columns else combined_data.sort_index()

print(f"\n✓ Combined dataset: {len(combined_data)} total records")

### Visualize Sample Data
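
The cell in this section reads `close`, `returns`, `sma_20`, and `sma_50` from each DataFrame. The notebook does not show how `DataLoader` produces those columns, so the following is only a sketch of how they could be derived from closing prices with pandas. The helper name `add_basic_indicators` and the synthetic prices are hypothetical and not part of the framework.

```python
import pandas as pd


def add_basic_indicators(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with daily returns and two simple moving averages.

    Hypothetical helper: assumes df has a 'close' column.
    """
    out = df.copy()
    out["returns"] = out["close"].pct_change()              # simple daily returns
    out["sma_20"] = out["close"].rolling(window=20).mean()  # 20-bar simple moving average
    out["sma_50"] = out["close"].rolling(window=50).mean()  # 50-bar simple moving average
    return out


# Synthetic closing prices 100, 101, ..., 159 on business days
prices = pd.DataFrame(
    {"close": [100.0 + i for i in range(60)]},
    index=pd.date_range("2020-01-01", periods=60, freq="B"),
)
enriched = add_basic_indicators(prices)
print(enriched[["close", "sma_20", "sma_50"]].tail(3))
```

The first 19 values of `sma_20` (and the first 49 of `sma_50`) are NaN because the rolling window is not yet full; matplotlib simply leaves those points out of the line.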

In [None]:
# Plot price data for selected symbols
fig, axes = plt.subplots(2, 3, figsize=(18, 12))
axes = axes.flatten()

for i, (symbol, df) in enumerate(data.items()):
    if i < len(axes):
        ax = axes[i]
        
        # Plot price and moving averages
        ax.plot(df.index, df['close'], label='Close Price', linewidth=1)
        ax.plot(df.index, df['sma_20'], label='SMA 20', alpha=0.7)
        ax.plot(df.index, df['sma_50'], label='SMA 50', alpha=0.7)
        
        ax.set_title(f'{symbol} - Price and Moving Averages')
        ax.set_ylabel('Price ($)')
        ax.legend()
        ax.grid(True, alpha=0.3)

# Remove any unused subplots (works for any number of symbols up to the grid size)
for unused_ax in axes[len(data):]:
    fig.delaxes(unused_ax)

plt.tight_layout()
plt.show()

# Display basic statistics
print("\n📈 Price Statistics:")
stats_data = []
for symbol, df in data.items():
    total_return = (df['close'].iloc[-1] / df['close'].iloc[0] - 1) * 100
    volatility = df['returns'].std() * np.sqrt(252) * 100
    max_price = df['close'].max()
    min_price = df['close'].min()
    
    stats_data.append({
        'Symbol': symbol,
        'Total Return': f"{total_return:.1f}%",
        'Annualized Volatility': f"{volatility:.1f}%",
        'Max Price': f"${max_price:.2f}",
        'Min Price': f"${min_price:.2f}",
        'Current Price': f"${df['close'].iloc[-1]:.2f}"
    })

stats_df = pd.DataFrame(stats_data)
print(stats_df.to_string(index=False))

## 2. Strategy Implementation {#strategy}

This section configures three moving average crossover strategies: simple (SMA), exponential (EMA), and triple moving average.
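
To make the crossover idea concrete, here is a minimal sketch of how a simple moving average crossover could generate signals. It is not the framework's `MovingAverageCrossover` implementation, which is not shown in this notebook; the function name `sma_crossover_signals` and the toy price series are hypothetical.

```python
import pandas as pd


def sma_crossover_signals(close: pd.Series, short_window: int = 20,
                          long_window: int = 50) -> pd.DataFrame:
    """Hypothetical helper: long while the short SMA is above the long SMA."""
    short_ma = close.rolling(short_window).mean()
    long_ma = close.rolling(long_window).mean()
    # 1 while the short average is above the long one, else 0.
    # During warm-up the averages are NaN, the comparison is False, so position is 0.
    position = (short_ma > long_ma).astype(int)
    # +1 on the bar where the position switches on (buy), -1 where it switches off (sell)
    signal = position.diff().fillna(0).astype(int)
    return pd.DataFrame({
        "close": close,
        "short_ma": short_ma,
        "long_ma": long_ma,
        "position": position,
        "signal": signal,
    })


# Toy series: falls, rallies, then falls again
toy_close = pd.Series([10, 9, 8, 7, 6, 7, 8, 9, 10, 9, 8, 7], dtype=float)
signals = sma_crossover_signals(toy_close, short_window=2, long_window=4)
print(signals[signals["signal"] != 0])
```

On this toy series the sketch buys at index 6 and sells at index 10. Because each signal uses that bar's close, a backtest would typically execute on the following bar to avoid look-ahead bias.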

In [None]:
# Create strategy instances
strategies = [
    MovingAverageCrossover(
        short_window=20, 
        long_window=50, 
        position_size_pct=0.2
    ),
    ExponentialMovingAverageCrossover(
        short_span=12, 
        long_span=26, 
        position_size_pct=0.2
    ),
    TripleMovingAverageCrossover(
        fast_window=10,
        medium_window=20,
        slow_window=50,
        position_size_pct=0.25
    )
]

print("📋 Strategy Configuration:")
for strategy in strategies:
    info = strategy.get_strategy_info()
    print(f"\n{info['name']}:")
    for param, value in info['parameters'].items():
        print(f"  {param}: {value}")

## 3. Backtesting {#backtesting}

Run a backtest for each strategy over the same data and date range, applying the commission and slippage set in `config`.
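
The engine takes `commission` and `slippage` as fractions. How `BacktestEngine` applies them is not shown here, so the following sketch illustrates one common convention only: slippage moves the fill price against the trader, and commission is charged on the traded notional. The function `fill_cost` and its default rates are assumptions for illustration, not the values in `config`.

```python
def fill_cost(price: float, quantity: int, side: str,
              commission: float = 0.001, slippage: float = 0.0005) -> dict:
    """Hypothetical cost model for a single fill.

    commission and slippage are fractions (0.001 == 0.1%); the defaults are illustrative.
    """
    if side not in ("BUY", "SELL"):
        raise ValueError(f"side must be 'BUY' or 'SELL', got {side!r}")
    direction = 1 if side == "BUY" else -1
    # Buys fill slightly above the quoted price, sells slightly below
    fill_price = price * (1 + direction * slippage)
    notional = fill_price * quantity
    fee = notional * commission
    # Cash leaves the account on a buy and enters on a sell; the fee always reduces cash
    cash_change = -direction * notional - fee
    return {"fill_price": fill_price, "fee": fee, "cash_change": cash_change}


# Round trip at an unchanged quoted price: buy 10 shares at $100, then sell them at $100
buy = fill_cost(100.0, 10, "BUY")
sell = fill_cost(100.0, 10, "SELL")
round_trip_loss = -(buy["cash_change"] + sell["cash_change"])
print(f"Cost of a flat round trip on $1,000: ${round_trip_loss:.2f}")
```

Under these assumed rates, a round trip with no price movement loses about 0.3% of the traded value, which is why strategies that trade frequently are more sensitive to cost settings.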

In [None]:
# Initialize backtest engine
engine = BacktestEngine(
    initial_capital=config.initial_capital,
    commission=config.commission,
    slippage=config.slippage
)

print("🚀 Starting backtests...\n")

# Run backtests for all strategies
results = engine.run_multiple_backtests(
    strategies=strategies,
    data=combined_data,
    start_date=START_DATE,
    end_date=END_DATE,
    benchmark_symbol='SPY'  # Note: SPY is not in SYMBOLS, so the engine must source benchmark data itself
)

print("\n✅ All backtests completed!")

### Results Summary

In [None]:
# Display results summary
summary_df = engine.get_results_summary()
print("📊 Backtest Results Summary:")
print(summary_df.to_string(index=False))

# Find best performing strategy
if not summary_df.empty:
    # Convert percentage strings back to float for comparison
    summary_df['Total Return Numeric'] = summary_df['Total Return'].str.rstrip('%').astype(float)
    best_strategy = summary_df.loc[summary_df['Total Return Numeric'].idxmax(), 'Strategy']
    print(f"\n🏆 Best performing strategy: {best_strategy}")

## 4. Performance Analysis {#analysis}

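This section plots each strategy's portfolio value, drawdown, daily return distribution, and risk-return profile, then prints detailed metrics.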
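
As a reference for the metrics reported in this section, here is a minimal sketch of how they can be computed from a series of portfolio values. It is not the `PerformanceAnalyzer` implementation, which is not shown in this notebook. The function name `summarize_equity_curve`, the 252 trading days per year, and the zero risk-free rate are assumptions.

```python
import numpy as np
import pandas as pd


def summarize_equity_curve(values: pd.Series, periods_per_year: int = 252,
                           risk_free_rate: float = 0.0) -> dict:
    """Hypothetical helper: basic return and risk metrics from portfolio values."""
    returns = values.pct_change().dropna()
    total_return = values.iloc[-1] / values.iloc[0] - 1
    years = len(returns) / periods_per_year
    annualized_return = (1 + total_return) ** (1 / years) - 1 if years > 0 else float("nan")
    volatility = returns.std() * np.sqrt(periods_per_year)
    excess_return = returns.mean() * periods_per_year - risk_free_rate
    sharpe_ratio = excess_return / volatility if volatility > 0 else float("nan")
    # Drawdown is measured against the running peak, so it is always <= 0
    running_peak = values.cummax()
    max_drawdown = ((values - running_peak) / running_peak).min()
    return {
        "total_return": float(total_return),
        "annualized_return": float(annualized_return),
        "volatility": float(volatility),
        "sharpe_ratio": float(sharpe_ratio),
        "max_drawdown": float(max_drawdown),
    }


# Toy equity curve: up 10%, down to 99, then up to 120
curve = pd.Series([100.0, 110.0, 99.0, 120.0])
stats = summarize_equity_curve(curve)
print(f"Total return: {stats['total_return']:.1%}")
print(f"Max drawdown: {stats['max_drawdown']:.1%}")
```

In this sketch the maximum drawdown is reported as a negative number (the fall from 110 to 99 is -10%), the same sign convention as the drawdown plot in the next cell.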

In [None]:
# Plot portfolio value evolution
fig, axes = plt.subplots(2, 2, figsize=(16, 12))

# Portfolio value over time
ax1 = axes[0, 0]
for strategy_name, result in results.items():
    if result and 'portfolio_history' in result:
        portfolio_df = result['portfolio_history']
        ax1.plot(portfolio_df['timestamp'], portfolio_df['total_value'], 
                label=strategy_name, linewidth=2)

ax1.set_title('Portfolio Value Over Time')
ax1.set_ylabel('Portfolio Value ($)')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Drawdown analysis
ax2 = axes[0, 1]

for strategy_name, result in results.items():
    if result and 'portfolio_history' in result:
        portfolio_df = result['portfolio_history']
        portfolio_values = portfolio_df['total_value']
        peak = portfolio_values.cummax()
        drawdown = (portfolio_values - peak) / peak * 100
        
        # Use timestamps on the x-axis so this panel lines up with the portfolio value plot
        ax2.fill_between(portfolio_df['timestamp'], drawdown, 0, 
                        alpha=0.3, label=f'{strategy_name} DD')

ax2.set_title('Drawdown Analysis')
ax2.set_ylabel('Drawdown (%)')
ax2.legend()
ax2.grid(True, alpha=0.3)

# Returns distribution
ax3 = axes[1, 0]
for strategy_name, result in results.items():
    if result and 'portfolio_history' in result:
        portfolio_df = result['portfolio_history']
        returns = portfolio_df['total_value'].pct_change().dropna() * 100
        ax3.hist(returns, bins=50, alpha=0.6, label=strategy_name, density=True)

ax3.set_title('Daily Returns Distribution')
ax3.set_xlabel('Daily Return (%)')
ax3.set_ylabel('Density')
ax3.legend()
ax3.grid(True, alpha=0.3)

# Risk-Return scatter
ax4 = axes[1, 1]
risk_return_data = []

for strategy_name, result in results.items():
    if result and 'annualized_return' in result:
        risk_return_data.append({
            'Strategy': strategy_name,
            'Return': result['annualized_return'] * 100,
            'Risk': result['volatility'] * 100,
            'Sharpe': result['sharpe_ratio']
        })

if risk_return_data:
    rr_df = pd.DataFrame(risk_return_data)
    scatter = ax4.scatter(rr_df['Risk'], rr_df['Return'], 
                         c=rr_df['Sharpe'], s=100, cmap='viridis')
    
    for i, row in rr_df.iterrows():
        ax4.annotate(row['Strategy'], (row['Risk'], row['Return']), 
                    xytext=(5, 5), textcoords='offset points', fontsize=8)
    
    plt.colorbar(scatter, ax=ax4, label='Sharpe Ratio')
    ax4.set_title('Risk-Return Profile')
    ax4.set_xlabel('Volatility (%)')
    ax4.set_ylabel('Annualized Return (%)')
    ax4.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

### Detailed Performance Metrics

In [None]:
# Generate detailed performance reports
print("📈 Detailed Performance Analysis:\n")

for strategy_name, result in results.items():
    if result and 'portfolio_history' in result:
        print(f"{'='*50}")
        print(f"Strategy: {strategy_name}")
        print(f"{'='*50}")
        
        # Generate comprehensive report
        portfolio_values = result['portfolio_history']['total_value']
        trades_df = result.get('trades', pd.DataFrame())
        
        analyzer = PerformanceAnalyzer()
        report = analyzer.generate_performance_report(portfolio_values, trades_df)
        
        # Display key metrics
        print(f"\n📊 Return Metrics:")
        returns = report.get('returns', {})
        print(f"  Total Return: {returns.get('total_return', 0):.2%}")
        print(f"  Annualized Return: {returns.get('annualized_return', 0):.2%}")
        print(f"  Volatility: {returns.get('volatility', 0):.2%}")
        
        print(f"\n⚠️  Risk Metrics:")
        risk = report.get('risk', {})
        print(f"  Sharpe Ratio: {risk.get('sharpe_ratio', 0):.3f}")
        print(f"  Sortino Ratio: {risk.get('sortino_ratio', 0):.3f}")
        print(f"  Max Drawdown: {risk.get('max_drawdown', 0):.2%}")
        print(f"  VaR (95%): {risk.get('var_95', 0):.2%}")
        
        if 'trades' in report and report['trades']:
            print(f"\n💼 Trade Statistics:")
            trades = report['trades']
            print(f"  Total Trades: {trades.get('total_trades', 0)}")
            print(f"  Win Rate: {trades.get('win_rate', 0):.1%}")
            print(f"  Profit Factor: {trades.get('profit_factor', 0):.2f}")
            print(f"  Average Trade: ${trades.get('avg_trade', 0):.2f}")
        
        print("\n")

## 5. Strategy Comparison {#comparison}

Compare the strategies on return, risk-adjusted return, and drawdown, then rank them with a simple weighted score.
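
A weighted score is sensitive to the scale of each metric. One alternative, sketched here under stated assumptions and not used by the cell below, is to min-max normalise each metric to [0, 1] before weighting. The function `composite_score`, the column names, and the sample numbers are all hypothetical.

```python
import pandas as pd


def composite_score(metrics: pd.DataFrame, weights: dict) -> pd.Series:
    """Hypothetical helper: weighted sum of min-max normalised metrics.

    weights maps a column name to (weight, higher_is_better).
    """
    score = pd.Series(0.0, index=metrics.index)
    for column, (weight, higher_is_better) in weights.items():
        col = metrics[column].astype(float)
        spread = col.max() - col.min()
        if spread > 0:
            normalised = (col - col.min()) / spread
        else:
            # Every strategy ties on this metric, so give each a neutral 0.5
            normalised = pd.Series(0.5, index=col.index)
        if not higher_is_better:
            normalised = 1.0 - normalised
        score = score + weight * normalised
    return score


# Made-up metrics for three strategies; drawdown is given as a positive magnitude
metrics = pd.DataFrame(
    {
        "total_return": [0.30, 0.10, 0.20],
        "sharpe_ratio": [0.8, 1.2, 1.0],
        "max_drawdown_abs": [0.25, 0.10, 0.15],
    },
    index=["A", "B", "C"],
)
weights = {
    "total_return": (0.4, True),
    "sharpe_ratio": (0.4, True),
    "max_drawdown_abs": (0.2, False),
}
scores = composite_score(metrics, weights)
print(scores.sort_values(ascending=False))
```

With these made-up numbers, strategy B ranks first: it has the lowest total return but the best Sharpe ratio and the smallest drawdown.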

In [None]:
# Create comparison metrics table
comparison_data = []
best_overall = None  # Set below once strategies are ranked; stays None if there are no results

for strategy_name, result in results.items():
    if result and 'portfolio_history' in result:
        comparison_data.append({
            'Strategy': strategy_name,
            'Total Return': f"{result.get('total_return', 0):.2%}",
            'Annualized Return': f"{result.get('annualized_return', 0):.2%}",
            'Volatility': f"{result.get('volatility', 0):.2%}",
            'Sharpe Ratio': f"{result.get('sharpe_ratio', 0):.3f}",
            'Max Drawdown': f"{result.get('max_drawdown', 0):.2%}",
            'Total Trades': result.get('total_trades', 0),
            'Final Value': f"${result.get('final_value', 0):,.2f}"
        })

if comparison_data:
    comparison_df = pd.DataFrame(comparison_data)
    print("🏆 Strategy Comparison Table:")
    print(comparison_df.to_string(index=False))
    
    # Ranking analysis
    print("\n🥇 Strategy Rankings:")
    
    # Convert formatted strings back to numbers for ranking
    numeric_df = comparison_df.copy()
    numeric_df['Total Return Numeric'] = numeric_df['Total Return'].str.rstrip('%').astype(float)
    numeric_df['Sharpe Numeric'] = numeric_df['Sharpe Ratio'].astype(float)
    # Use the drawdown magnitude so ranking works whether drawdown is reported as negative or positive
    numeric_df['Max DD Numeric'] = numeric_df['Max Drawdown'].str.rstrip('%').astype(float).abs()
    
    print(f"  Best Total Return: {numeric_df.loc[numeric_df['Total Return Numeric'].idxmax(), 'Strategy']}")
    print(f"  Best Sharpe Ratio: {numeric_df.loc[numeric_df['Sharpe Numeric'].idxmax(), 'Strategy']}")
    print(f"  Lowest Drawdown: {numeric_df.loc[numeric_df['Max DD Numeric'].idxmin(), 'Strategy']}")
    
    # Overall score (simple weighted sum)
    numeric_df['Score'] = (
        numeric_df['Total Return Numeric'] * 0.4 +
        numeric_df['Sharpe Numeric'] * 10 * 0.4 +  # Scale Sharpe to a similar range
        numeric_df['Max DD Numeric'] * -0.2  # Penalize larger drawdowns
    )
    
    best_overall = numeric_df.loc[numeric_df['Score'].idxmax(), 'Strategy']
    print(f"  Best Overall (Weighted Score): {best_overall}")

### Trade Analysis

In [None]:
# Analyze trades for the best performing strategy
if results and best_overall in results:
    best_result = results[best_overall]
    
    if 'trades' in best_result and not best_result['trades'].empty:
        trades_df = best_result['trades']
        
        print(f"📋 Trade Analysis for {best_overall}:")
        print(f"\nFirst 10 trades:")
        print(trades_df.head(10).to_string(index=False))
        
        # Plot trade timeline
        fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(14, 10))
        
        # Trade timeline
        buy_trades = trades_df[trades_df['action'] == 'BUY']
        sell_trades = trades_df[trades_df['action'] == 'SELL']
        
        ax1.scatter(buy_trades['timestamp'], buy_trades['price'], 
                   color='green', marker='^', s=50, label='Buy', alpha=0.7)
        ax1.scatter(sell_trades['timestamp'], sell_trades['price'], 
                   color='red', marker='v', s=50, label='Sell', alpha=0.7)
        
        ax1.set_title(f'Trade Timeline - {best_overall}')
        ax1.set_ylabel('Price ($)')
        ax1.legend()
        ax1.grid(True, alpha=0.3)
        
        # Portfolio value with trade markers
        portfolio_df = best_result['portfolio_history']
        ax2.plot(portfolio_df['timestamp'], portfolio_df['total_value'], 
                label='Portfolio Value', linewidth=2)
        
        # Mark trade dates on portfolio chart
        for _, trade in trades_df.iterrows():
            # Find closest portfolio value date
            closest_idx = (portfolio_df['timestamp'] - trade['timestamp']).abs().idxmin()
            portfolio_value = portfolio_df.loc[closest_idx, 'total_value']
            
            color = 'green' if trade['action'] == 'BUY' else 'red'
            marker = '^' if trade['action'] == 'BUY' else 'v'
            ax2.scatter(trade['timestamp'], portfolio_value, 
                       color=color, marker=marker, s=30, alpha=0.6)
        
        ax2.set_title('Portfolio Value with Trade Markers')
        ax2.set_ylabel('Portfolio Value ($)')
        ax2.set_xlabel('Date')
        ax2.grid(True, alpha=0.3)
        
        plt.tight_layout()
        plt.show()
        
        # Trade statistics by symbol
        if 'symbol' in trades_df.columns:
            print(f"\n📊 Trades by Symbol:")
            symbol_stats = trades_df.groupby('symbol').agg({
                'action': 'count',
                'quantity': 'sum',
                'price': ['mean', 'min', 'max']
            }).round(2)
            
            symbol_stats.columns = ['Total Trades', 'Total Quantity', 'Avg Price', 'Min Price', 'Max Price']
            print(symbol_stats.to_string())

## Conclusion

This notebook demonstrated:

1. **Data Loading**: How to fetch and prepare financial data for backtesting
2. **Strategy Implementation**: Creating different moving average crossover strategies
3. **Backtesting**: Running systematic backtests with realistic transaction costs
4. **Performance Analysis**: Comprehensive evaluation of strategy performance
5. **Comparison**: Comparing multiple strategies across various metrics

### Key Takeaways:

- Different moving average strategies can produce significantly different results
- Risk-adjusted metrics (like Sharpe ratio) are crucial for strategy evaluation
- Transaction costs and slippage have a material impact on performance
- Drawdown analysis helps understand the risk profile of strategies

### Next Steps:

1. **Parameter Optimization**: Use the optimization notebook to find optimal parameters
2. **Additional Strategies**: Implement momentum and mean reversion strategies
3. **Risk Management**: Add stop-loss and position sizing rules
4. **Walk-Forward Analysis**: Test strategy robustness over different time periods
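
As a starting point for the walk-forward step, the sketch below generates rolling train/test index windows: parameters would be fitted on each training window and evaluated only on the test window that follows it. The function `walk_forward_windows` and the window sizes are hypothetical and not part of the framework.

```python
def walk_forward_windows(n_obs: int, train_size: int, test_size: int):
    """Hypothetical helper: yield (train_start, train_end, test_start, test_end).

    Ranges are half-open, so they can be used directly with df.iloc[start:end].
    """
    start = 0
    while start + train_size + test_size <= n_obs:
        train_end = start + train_size
        test_end = train_end + test_size
        yield (start, train_end, train_end, test_end)
        start += test_size  # roll forward by one test block so test windows never overlap


# Example: 1000 bars, 500-bar training window, 125-bar test window
windows = list(walk_forward_windows(n_obs=1000, train_size=500, test_size=125))
for train_start, train_end, test_start, test_end in windows:
    print(f"train [{train_start}:{train_end})  test [{test_start}:{test_end})")
```

With these sizes the sketch yields four windows, and together the test windows cover bars 500 through 999 exactly once.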