# Strategy Optimization & Parameter Tuning

**Purpose**: Systematically optimize trading strategy parameters to maximize the Sharpe ratio.

**Approach**:
1. Grid search over parameter space
2. Evaluate performance metrics for each combination
3. Identify optimal parameters
4. Validate on out-of-sample data (if enough data is available)

**Goal**: Achieve Sharpe ratio > 0.5 (Phase 1C success criterion)
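
For reference, the Sharpe ratio is mean excess return divided by the standard deviation of returns. The sketch below is a minimal illustration, not the implementation inside `PerformanceMetrics`, which this notebook does not show. The function name, the per-period risk-free rate, and the optional annualization factor are all assumptions.

```python
import numpy as np

def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=1):
    """Sharpe ratio of a series of per-period returns.

    Assumptions: `returns` are simple per-period returns and
    `risk_free_rate` is expressed per period. `periods_per_year=1`
    means no annualization is applied.
    """
    excess = np.asarray(returns, dtype=float) - risk_free_rate
    if excess.size < 2:
        return float("nan")  # sample std needs at least two points
    std = excess.std(ddof=1)  # sample standard deviation
    if np.isclose(std, 0.0):
        return float("nan")  # undefined when returns never vary
    return float(excess.mean() / std * np.sqrt(periods_per_year))

# Made-up per-trade returns, for illustration only
example_returns = [0.02, -0.01, 0.03, 0.00, 0.01]
print(f"Sharpe: {sharpe_ratio(example_returns):.3f}")
```

Whether the 0.5 target refers to an annualized or per-period figure depends on how `PerformanceMetrics` computes it, so compare like with like.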

In [None]:
import sys
sys.path.append('..')

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from itertools import product
from datetime import datetime, timedelta

# Import utils
from utils import (
    get_engine,
    load_market_data,
    get_active_tickers,
    MeanReversionStrategy,
    Backtest,
    run_multi_ticker_backtest,
    PerformanceMetrics
)
from utils.visualization import plot_strategy_summary

sns.set_style('darkgrid')
plt.rcParams['figure.figsize'] = (14, 7)

%matplotlib inline

# Suppress warnings for cleaner output
import warnings
warnings.filterwarnings('ignore')

## Load Market Data

In [None]:
engine = get_engine()

# Load active tickers with sufficient data and price movement
active = get_active_tickers(engine, min_snapshots=50, min_price_range=0.01)
print(f"Found {len(active)} active tickers")
print("\nTop 5 by price range:")
print(active[['ticker', 'snapshot_count', 'price_range', 'std_yes_prob']].head())

# Load data for active tickers
df = load_market_data(engine, min_snapshots=50)
df = df[df['ticker'].isin(active['ticker'])]

print(f"\nLoaded {len(df):,} snapshots from {df['ticker'].nunique()} tickers")

## Parameter Grid Definition

Define ranges for each strategy parameter to test.
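
To make the roles of `window` and `std_threshold` concrete, here is a hedged sketch of a rolling z-score signal, a common way to express mean reversion. It is an assumption about how such a strategy typically works, not the actual logic of `MeanReversionStrategy`; the function name and the toy price series are hypothetical.

```python
import pandas as pd

def zscore_signals(prices, window=20, std_threshold=2.0):
    """Hypothetical mean-reversion signal: +1 buy, -1 sell, 0 hold."""
    rolling_mean = prices.rolling(window).mean()
    rolling_std = prices.rolling(window).std()
    # A zero rolling std would divide by zero; treat it as "no signal"
    zscore = (prices - rolling_mean) / rolling_std.replace(0, float("nan"))
    signals = pd.Series(0, index=prices.index)
    signals[zscore < -std_threshold] = 1   # price far below its recent mean
    signals[zscore > std_threshold] = -1   # price far above its recent mean
    return signals

# Toy price series with one sharp dip at index 6
prices = pd.Series([0.50, 0.51, 0.49, 0.50, 0.51, 0.50, 0.30, 0.50, 0.70])
signals = zscore_signals(prices, window=5, std_threshold=1.5)
print(signals.tolist())
```

In this formulation a larger `window` smooths the reference mean, while a larger `std_threshold` demands a bigger deviation before trading, so it tends to produce fewer trades.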

In [None]:
# Mean Reversion Strategy Parameters
param_grid = {
    'window': [10, 15, 20, 25, 30],          # Rolling window size
    'std_threshold': [1.0, 1.5, 2.0, 2.5],   # Standard deviation threshold
    'position_size': [0.05, 0.10, 0.15]      # Position size (fraction of capital)
}

# Generate all combinations
param_combinations = list(product(
    param_grid['window'],
    param_grid['std_threshold'],
    param_grid['position_size']
))

print(f"Testing {len(param_combinations)} parameter combinations")
print(f"Sample combination: window={param_combinations[0][0]}, std_threshold={param_combinations[0][1]}, position_size={param_combinations[0][2]}")

## Grid Search Execution

In [None]:
results = []

print("Running grid search...\n")
print(f"{'Progress':<12} | {'Sharpe':<8} | {'Return %':<10} | {'Win Rate %':<12} | {'Trades':<8}")
print("-" * 70)

for i, (window, std_threshold, position_size) in enumerate(param_combinations):
    # Create strategy with these parameters
    strategy = MeanReversionStrategy(
        window=window,
        std_threshold=std_threshold
    )
    
    # Run backtest
    result = run_multi_ticker_backtest(
        df=df,
        strategy=strategy,
        initial_capital=10000,
        position_size=position_size
    )
    
    # Calculate metrics
    if len(result.trades) > 0:
        metrics = PerformanceMetrics.calculate_all(result)
        
        results.append({
            'window': window,
            'std_threshold': std_threshold,
            'position_size': position_size,
            'sharpe_ratio': metrics['sharpe_ratio'],
            'total_return_pct': metrics['total_return_pct'],
            'win_rate_pct': metrics['win_rate_pct'],
            'total_trades': metrics['total_trades'],
            'max_drawdown_pct': metrics['max_drawdown_pct'],
            'profit_factor': metrics['profit_factor']
        })
        
        # Print progress every 10 iterations, and whenever a combination beats the 0.5 target
        if (i + 1) % 10 == 0 or metrics['sharpe_ratio'] > 0.5:
            progress = f"{i+1}/{len(param_combinations)}"
            marker = "✅" if metrics['sharpe_ratio'] > 0.5 else "  "
            print(f"{progress:<12} | {metrics['sharpe_ratio']:<8.3f} | {metrics['total_return_pct']:<10.2f} | {metrics['win_rate_pct']:<12.1f} | {metrics['total_trades']:<8} {marker}")

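results_df = pd.DataFrame(results)
# Combinations that produced no trades are skipped above, so report both counts
print(f"\nGrid search complete. {len(results_df)} of {len(param_combinations)} configurations produced trades.")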

## Best Parameters

In [None]:
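# Sort by Sharpe ratio (stop with a clear message if no combination traded)
if results_df.empty:
    raise ValueError("No parameter combination produced trades; collect more data before optimizing.")
results_df_sorted = results_df.sort_values('sharpe_ratio', ascending=False)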

print("\n" + "="*70)
print("TOP 10 PARAMETER COMBINATIONS (by Sharpe Ratio)")
print("="*70)
print(results_df_sorted[[
    'window', 'std_threshold', 'position_size',
    'sharpe_ratio', 'total_return_pct', 'win_rate_pct', 'total_trades'
]].head(10).to_string(index=False))

# Best configuration
best = results_df_sorted.iloc[0]
print("\n" + "="*70)
print("BEST CONFIGURATION")
print("="*70)
print(f"Window:           {best['window']:.0f}")
print(f"Std Threshold:    {best['std_threshold']:.1f}")
print(f"Position Size:    {best['position_size']:.2f}")
print(f"Sharpe Ratio:     {best['sharpe_ratio']:.3f}")
print(f"Total Return:     {best['total_return_pct']:.2f}%")
print(f"Win Rate:         {best['win_rate_pct']:.1f}%")
print(f"Total Trades:     {best['total_trades']:.0f}")
print(f"Max Drawdown:     {best['max_drawdown_pct']:.2f}%")
print(f"Profit Factor:    {best['profit_factor']:.2f}")
print("="*70)

if best['sharpe_ratio'] > 0.5:
    print("\n✅ PHASE 1C SUCCESS CRITERION MET!")
    print(f"Sharpe Ratio: {best['sharpe_ratio']:.3f} > 0.5")
    print("\nReady to proceed to Phase 2 (Real-time monitoring)")
else:
    print("\n❌ Phase 1C criterion not met yet.")
    print(f"Sharpe Ratio: {best['sharpe_ratio']:.3f} <= 0.5")
    print("\nRecommendations:")
    print("1. Collect more data (let the poller run longer)")
    print("2. Focus on specific market types")
    print("3. Try alternative strategies (momentum, arbitrage)")
    print("4. Implement more sophisticated risk management")

## Parameter Sensitivity Analysis

In [None]:
# Analyze impact of each parameter on Sharpe ratio
fig, axes = plt.subplots(1, 3, figsize=(18, 5))

# 1. Window size effect
window_effect = results_df.groupby('window')['sharpe_ratio'].agg(['mean', 'std', 'max'])
axes[0].errorbar(window_effect.index, window_effect['mean'], yerr=window_effect['std'], 
                 marker='o', capsize=5, linewidth=2)
axes[0].plot(window_effect.index, window_effect['max'], '--', alpha=0.5, label='Max')
axes[0].axhline(y=0.5, color='r', linestyle='--', alpha=0.5, label='Target (0.5)')
axes[0].set_xlabel('Window Size')
axes[0].set_ylabel('Sharpe Ratio')
axes[0].set_title('Effect of Window Size')
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# 2. Std threshold effect
std_effect = results_df.groupby('std_threshold')['sharpe_ratio'].agg(['mean', 'std', 'max'])
axes[1].errorbar(std_effect.index, std_effect['mean'], yerr=std_effect['std'],
                 marker='o', capsize=5, linewidth=2)
axes[1].plot(std_effect.index, std_effect['max'], '--', alpha=0.5, label='Max')
axes[1].axhline(y=0.5, color='r', linestyle='--', alpha=0.5, label='Target (0.5)')
axes[1].set_xlabel('Std Deviation Threshold')
axes[1].set_ylabel('Sharpe Ratio')
axes[1].set_title('Effect of Std Threshold')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

# 3. Position size effect
pos_effect = results_df.groupby('position_size')['sharpe_ratio'].agg(['mean', 'std', 'max'])
axes[2].errorbar(pos_effect.index, pos_effect['mean'], yerr=pos_effect['std'],
                 marker='o', capsize=5, linewidth=2)
axes[2].plot(pos_effect.index, pos_effect['max'], '--', alpha=0.5, label='Max')
axes[2].axhline(y=0.5, color='r', linestyle='--', alpha=0.5, label='Target (0.5)')
axes[2].set_xlabel('Position Size')
axes[2].set_ylabel('Sharpe Ratio')
axes[2].set_title('Effect of Position Size')
axes[2].legend()
axes[2].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## Heatmap: Window vs Std Threshold

In [None]:
# Create pivot table for heatmap
# Average Sharpe across all position sizes for each window/std combination
pivot = results_df.pivot_table(
    values='sharpe_ratio',
    index='std_threshold',
    columns='window',
    aggfunc='mean'
)

plt.figure(figsize=(10, 6))
sns.heatmap(pivot, annot=True, fmt='.3f', cmap='RdYlGn', center=0.5,
            cbar_kws={'label': 'Sharpe Ratio'})
plt.title('Parameter Heatmap: Window Size vs Std Threshold\n(averaged over position sizes)', 
          fontweight='bold')
plt.xlabel('Window Size')
plt.ylabel('Std Deviation Threshold')
plt.tight_layout()
plt.show()

## Validate Best Strategy

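Re-run the backtest with the best parameters for a detailed breakdown. This uses the same data as the grid search, so it is an in-sample check and does not replace the out-of-sample validation in step 4 of the approach.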
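
Step 4 of the approach calls for out-of-sample validation. One common way to do that is a chronological split: tune on the earlier snapshots and evaluate the chosen parameters on the later ones. The sketch below assumes a `timestamp` column, which is a hypothetical name since the schema returned by `load_market_data` is not shown here, and it runs on a synthetic frame.

```python
import pandas as pd

def chronological_split(df, time_col="timestamp", train_frac=0.75):
    """Split rows by time so every test row is later than every train row.

    `time_col` is a hypothetical column name; adjust it to the real schema.
    """
    times = pd.Series(df[time_col].unique()).sort_values().reset_index(drop=True)
    # Clamp the index so train_frac=1.0 does not run past the end
    cutoff_idx = min(int(len(times) * train_frac), len(times) - 1)
    cutoff = times.iloc[cutoff_idx]
    train = df[df[time_col] < cutoff]
    test = df[df[time_col] >= cutoff]
    return train, test

# Synthetic snapshots, for illustration only
snapshots = pd.DataFrame({
    "ticker": ["A", "B"] * 4,
    "timestamp": pd.date_range("2024-01-01", periods=8, freq="min"),
    "price": [0.50, 0.40, 0.52, 0.41, 0.49, 0.43, 0.51, 0.42],
})
train, test = chronological_split(snapshots, train_frac=0.75)
print(f"train rows: {len(train)}, test rows: {len(test)}")
```

With a split like this, the grid search would run on `train` only and the best configuration would then be backtested once on `test`. A Sharpe ratio that holds up there is more convincing than the best of many in-sample trials.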

In [None]:
# Create strategy with best parameters
best_strategy = MeanReversionStrategy(
    window=int(best['window']),
    std_threshold=best['std_threshold']
)

# Run backtest with best parameters
best_result = run_multi_ticker_backtest(
    df=df,
    strategy=best_strategy,
    initial_capital=10000,
    position_size=best['position_size']
)

# Calculate comprehensive metrics
best_metrics = PerformanceMetrics.calculate_all(best_result)
PerformanceMetrics.print_metrics(best_metrics, "OPTIMIZED STRATEGY PERFORMANCE")

In [None]:
# Visualize best strategy performance
if len(best_result.trades) > 0:
    fig = plot_strategy_summary(best_result, figsize=(18, 12))
    plt.suptitle('Optimized Strategy Performance', fontsize=16, fontweight='bold', y=1.00)
    plt.show()
else:
    print("No trades generated with best parameters.")

## Save Results

In [None]:
import json
from pathlib import Path

# Make sure the output directory exists before writing
data_dir = Path('../data')
data_dir.mkdir(parents=True, exist_ok=True)

# Save optimization results
results_df_sorted.to_csv(data_dir / 'optimization_results.csv', index=False)
print("✅ Optimization results saved to data/optimization_results.csv")

# Save best parameters (cast NumPy scalars to plain Python types for JSON)
best_params = {
    'window': int(best['window']),
    'std_threshold': float(best['std_threshold']),
    'position_size': float(best['position_size']),
    'sharpe_ratio': float(best['sharpe_ratio']),
    'total_return_pct': float(best['total_return_pct']),
    'optimized_date': datetime.now().isoformat()
}

with open(data_dir / 'best_parameters.json', 'w') as f:
    json.dump(best_params, f, indent=2)

print("✅ Best parameters saved to data/best_parameters.json")

## Next Steps

### If Sharpe > 0.5 ✅
**Phase 1C COMPLETE!**

Proceed to:
1. Phase 2: Real-time WebSocket integration
2. Live strategy monitoring dashboard  
3. Validate performance on live data

See `docs/phase-2-architecture.md` for implementation plan.

### If Sharpe ≤ 0.5 ❌
**More work needed:**

1. **Data Collection**
   - Let the poller run for several more hours/days
   - Focus on liquid markets with price movement
   - Wait for more market resolutions

2. **Strategy Alternatives**
   - Momentum strategy (buy rising, sell falling)
   - Arbitrage between yes/no prices
   - Volume-weighted strategies
   - Event-driven approaches

3. **Risk Management**
   - Implement stop-loss levels
   - Volatility-based position sizing
   - Maximum loss per trade limits
   - Diversification across market types

4. **Market Selection**
   - Filter for specific categories
   - Focus on high-volume markets
   - Exclude low-probability edge cases
   - Time-based filtering (avoid trading close to market resolution)
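
As one illustration of the risk-management ideas above, volatility-based position sizing can be sketched as scaling a base size by the ratio of a target volatility to recently realized volatility. The function name, the default values, and the cap are all assumptions chosen to line up with the `position_size` grid in this notebook; none of this exists in `utils`.

```python
import numpy as np

def volatility_scaled_size(recent_returns, base_size=0.10,
                           target_vol=0.02, max_size=0.15):
    """Hypothetical sizing rule: shrink positions when volatility is high.

    Returns a fraction of capital in the range [0, max_size].
    """
    returns = np.asarray(recent_returns, dtype=float)
    if returns.size < 2:
        return base_size  # not enough history to estimate volatility
    realized_vol = returns.std(ddof=1)
    if np.isclose(realized_vol, 0.0):
        return max_size  # no observed volatility: fall back to the cap
    scaled = base_size * target_vol / realized_vol
    return float(np.clip(scaled, 0.0, max_size))

# Made-up return histories, for illustration only
calm = [0.01, -0.01, 0.01, -0.01]
volatile = [0.04, -0.04, 0.04, -0.04]
print(f"calm market size:     {volatility_scaled_size(calm):.3f}")
print(f"volatile market size: {volatility_scaled_size(volatile):.3f}")
```

The cap keeps a quiet market from producing an oversized position, which matters in markets where prices can jump at resolution.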

### Remember
**Sharpe > 0.5 is the gate to Phase 3 automation.**  
Don't proceed without proven profitability!