# Strategy Optimizer User Guide

This notebook provides a comprehensive guide to using the `StrategyOptimizer` module from the `algoshort` package for parameter tuning and walk-forward analysis.

## Table of Contents

1. [Setup and Installation](#1-setup-and-installation)
2. [Understanding Optimization Concepts](#2-understanding-optimization-concepts)
3. [The get_equity() Function](#3-the-get_equity-function)
4. [StrategyOptimizer Class](#4-strategyoptimizer-class)
5. [Grid Search Optimization](#5-grid-search-optimization)
6. [Rolling Walk-Forward Analysis](#6-rolling-walk-forward-analysis)
7. [Sensitivity Analysis](#7-sensitivity-analysis)
8. [Comparing Signals](#8-comparing-signals)
9. [Complete Workflow Integration](#9-complete-workflow-integration)
10. [Best Practices and Tips](#10-best-practices-and-tips)

## 1. Setup and Installation

First, let's import the required modules and set up logging.

In [None]:
# Standard imports
import pandas as pd
import numpy as np
import logging
import tempfile
import json
import os

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

# Import optimizer modules
from algoshort.optimizer import (
    StrategyOptimizer,
    get_equity,
    _worker_evaluate,
    MIN_SEGMENT_SIZE,
    MIN_SEGMENT_ROWS,
    MAX_GRID_COMBINATIONS,
    MAX_PARAM_VALUES,
)

print("Imports successful!")
print(f"\nModule Constants:")
print(f"  MIN_SEGMENT_SIZE: {MIN_SEGMENT_SIZE} bars")
print(f"  MIN_SEGMENT_ROWS: {MIN_SEGMENT_ROWS} rows")
print(f"  MAX_GRID_COMBINATIONS: {MAX_GRID_COMBINATIONS:,}")
print(f"  MAX_PARAM_VALUES: {MAX_PARAM_VALUES:,}")

## 2. Understanding Optimization Concepts

### What is Strategy Optimization?

Strategy optimization is the process of finding the best parameter values for a trading strategy. The `StrategyOptimizer` provides:

| Method | Description | Use Case |
|--------|-------------|----------|
| **Grid Search** | Tests all combinations of parameter values | Finding optimal parameters on a single data segment |
| **Walk-Forward** | Rolling optimization with in-sample/out-of-sample splits | Avoiding overfitting, realistic backtesting |
| **Sensitivity Analysis** | Tests parameter neighborhood around best values | Validating parameter robustness |

### Walk-Forward Analysis Explained

Walk-forward analysis splits data into segments:

```
Data: [====================================================================================]

Segment 1: [IN-SAMPLE (optimize)] [OUT-OF-SAMPLE (validate)]
Segment 2:            [IN-SAMPLE (optimize)] [OUT-OF-SAMPLE (validate)]
Segment 3:                        [IN-SAMPLE (optimize)] [OUT-OF-SAMPLE (validate)]
Segment 4:                                    [IN-SAMPLE (optimize)] [OUT-OF-SAMPLE (validate)]
```

- **In-Sample (IS)**: Used to find best parameters via grid search
- **Out-of-Sample (OOS)**: Used to validate performance with those parameters
- This reduces the risk of overfitting, because parameters are never scored on the same data that was used to select them
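
The rolling split above can be sketched in a few lines. This is only an illustration under a simplifying assumption: the data is cut into `n_segments + 1` equal blocks, block `k` is used for optimization and block `k + 1` for validation. The helper name `walk_forward_splits` is hypothetical, and the boundaries that `StrategyOptimizer` actually uses may differ (for example, the windows in the diagram overlap).

```python
from typing import List, Tuple


def walk_forward_splits(n_rows: int, n_segments: int) -> List[Tuple[range, range]]:
    """Return (in_sample, out_of_sample) row ranges for each segment.

    Hypothetical scheme: cut the data into n_segments + 1 equal blocks,
    optimize on block k, then validate on block k + 1.
    """
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    block = n_rows // (n_segments + 1)
    if block < 1:
        raise ValueError("not enough rows for the requested number of segments")

    splits = []
    for k in range(n_segments):
        is_start = k * block
        oos_start = is_start + block
        # The last out-of-sample block absorbs any leftover rows.
        oos_end = n_rows if k == n_segments - 1 else oos_start + block
        splits.append((range(is_start, oos_start), range(oos_start, oos_end)))
    return splits


splits = walk_forward_splits(n_rows=500, n_segments=4)
for k, (is_rows, oos_rows) in enumerate(splits, start=1):
    print(f"Segment {k}: IS rows {is_rows.start}-{is_rows.stop - 1}, "
          f"OOS rows {oos_rows.start}-{oos_rows.stop - 1}")
```

In every segment the out-of-sample rows start where the in-sample rows end, so no row is used for both selection and validation within that segment.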

### Key Metrics

The optimizer tracks several metrics from the equity function:

| Metric | Description |
|--------|-------------|
| `convex` | Equity using convex risk adjustment (aggressive) |
| `concave` | Equity using concave risk adjustment (conservative) |
| `constant` | Equity using constant risk percentage |
| `equal_weight` | Equity using equal weight allocation |

### Parameter Stability

The optimizer calculates **Coefficient of Variation (CV)** for each parameter:

```
CV = std(parameter_values) / mean(parameter_values)
```

- **Low CV** (< 0.2): Parameters are stable across segments (good)
- **Moderate CV** (0.2 to 0.5): Parameters drift somewhat; inspect the per-segment values
- **High CV** (> 0.5): Parameters vary significantly (potential overfitting)
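
As a minimal sketch of the formula above, the CV of one parameter can be computed from its best value in each segment. The helper names are hypothetical, and the use of the population standard deviation (`pstdev`) is an assumption; the optimizer may use the sample standard deviation instead, which gives slightly larger values.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """CV = std / mean. Returns NaN when the CV is undefined."""
    values = list(values)
    if len(values) < 2:
        return float("nan")  # one segment says nothing about stability
    avg = mean(values)
    if avg == 0:
        return float("nan")  # avoid division by zero
    return pstdev(values) / avg


def stability_label(cv):
    """Map a CV to the bands used in this guide."""
    if cv != cv:  # NaN check without extra imports
        return "N/A"
    if cv < 0.2:
        return "Stable"
    if cv < 0.5:
        return "Moderate"
    return "Unstable"


# Made-up best values per segment, for illustration only
windows = [14, 14, 16, 14]
multipliers = [1.5, 2.5, 1.5, 2.5]

for name, values in [("window", windows), ("multiplier", multipliers)]:
    cv = coefficient_of_variation(values)
    print(f"{name}: CV={cv:.4f} ({stability_label(cv)})")
```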

## 3. The get_equity() Function

The `get_equity()` function processes a data segment and returns equity metrics. It's the core function used by the optimizer.

### Function Signature

```python
get_equity(
    segment_df: pd.DataFrame,      # OHLC data
    segment_idx: int = 0,          # Segment index for logging
    config_path: str = '...',      # Path to config JSON
    price_col: str = 'close',      # Price column name
    stop_method: str = 'atr',      # Stop-loss method
    inplace: bool = False,         # Modify input DataFrame?
    save_output: bool = False,     # Save to Excel?
    **stop_kwargs                  # Stop-loss parameters
) -> Dict[str, Any]
```

### Available Stop-Loss Methods

| Method | Description | Key Parameters |
|--------|-------------|----------------|
| `atr` | ATR-based trailing stop | `window`, `multiplier` |
| `fixed_percentage` | Fixed % below/above entry | `percentage` |
| `breakout_channel` | Donchian channel stop | `window` |
| `moving_average` | MA-based stop | `window`, `ma_type` |
| `volatility_std` | Standard deviation stop | `window`, `multiplier` |
| `support_resistance` | S/R level stop | `window` |
| `classified_pivot` | Pivot point stop | `swing_window` |
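
To make the `window` and `multiplier` parameters of the `atr` method concrete, here is a textbook-style sketch of a long-side ATR stop level. It is not the `algoshort` implementation: the function name `atr_stop_long` is hypothetical, the ATR is a simple moving average of the true range (other smoothing choices exist), and the trailing (ratchet) logic is left out.

```python
import pandas as pd


def atr_stop_long(df: pd.DataFrame, window: int = 14, multiplier: float = 2.0) -> pd.Series:
    """Illustrative long-side stop level: close - multiplier * ATR."""
    prev_close = df["close"].shift(1)
    # True range: the largest of the three candidate ranges for each bar
    true_range = pd.concat(
        [
            df["high"] - df["low"],
            (df["high"] - prev_close).abs(),
            (df["low"] - prev_close).abs(),
        ],
        axis=1,
    ).max(axis=1)
    atr = true_range.rolling(window).mean()  # NaN until `window` bars are available
    return df["close"] - multiplier * atr


# Tiny hand-made example; every bar has a true range of 2.0
demo = pd.DataFrame({
    "high":  [101.0, 102.0, 103.0, 104.0, 105.0],
    "low":   [99.0, 100.0, 101.0, 102.0, 103.0],
    "close": [100.0, 101.0, 102.0, 103.0, 104.0],
})
stops = atr_stop_long(demo, window=3, multiplier=2.0)
print(stops)
```

A larger `window` smooths the volatility estimate, while a larger `multiplier` places the stop further from the price.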

In [None]:
# Create sample OHLC data for demonstrations
np.random.seed(42)
n_days = 500  # About 2 years of trading days

# Generate realistic price data
dates = pd.date_range('2022-01-01', periods=n_days, freq='B')
returns = np.random.normal(0.0003, 0.015, n_days)  # Daily returns
close = 100 * np.exp(np.cumsum(returns))

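# Create OHLC DataFrame. High and low are derived from open and close,
# so that low <= open, close <= high holds on every bar.
open_ = close * (1 + np.random.uniform(-0.005, 0.005, n_days))
high = np.maximum(open_, close) * (1 + np.abs(np.random.normal(0, 0.01, n_days)))
low = np.minimum(open_, close) * (1 - np.abs(np.random.normal(0, 0.01, n_days)))
df_sample = pd.DataFrame({
    'date': dates,
    'open': open_,
    'high': high,
    'low': low,
    'close': close,
    'volume': np.random.randint(1000000, 5000000, n_days),
})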

# Add benchmark for relative calculations
benchmark_returns = np.random.normal(0.0002, 0.012, n_days)
df_sample['rclose'] = 100 * np.exp(np.cumsum(benchmark_returns))

print(f"Sample data shape: {df_sample.shape}")
print(f"Date range: {df_sample['date'].iloc[0]} to {df_sample['date'].iloc[-1]}")
print(f"\nPrice range: ${df_sample['close'].min():.2f} - ${df_sample['close'].max():.2f}")
df_sample.head()

In [None]:
# Create a temporary config file for demonstrations
config = {
    "regimes": {
        "floor_ceiling": {
            "lvl": 1,
            "vlty_n": 63,
            "threshold": 1.5,
            "dgt": 3,
            "d_vol": 1.0,
            "dist_pct": 0.05,
            "retrace_pct": 0.05,
            "r_vol": 1.0
        }
    }
}

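# Save config to a temp file. NamedTemporaryFile creates the file atomically,
# unlike the deprecated tempfile.mktemp, which only returns a name.
with tempfile.NamedTemporaryFile('w', suffix='.json', delete=False) as f:
    json.dump(config, f, indent=2)
    config_path = f.name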

print(f"Config file created: {config_path}")
print(f"\nConfig contents:")
print(json.dumps(config, indent=2))

### Creating a Custom Equity Function

For faster demonstrations, let's create a simplified equity function that doesn't require the full regime calculation pipeline:

In [None]:
def mock_equity_func(
    segment_df: pd.DataFrame,
    segment_idx: int = 0,
    config_path: str = '',
    price_col: str = 'close',
    stop_method: str = 'atr',
    **stop_kwargs
) -> dict:
    """
    Simplified equity function for demonstration.
    
    In production, use the actual get_equity() function.
    """
    # Extract parameters
    window = stop_kwargs.get('window', 14)
    multiplier = stop_kwargs.get('multiplier', 2.0)
    
    # Calculate a simple performance metric based on parameters
    # (In reality, this would run the full strategy simulation)
    # A local generator keeps results reproducible without resetting NumPy's global seed
    rng = np.random.default_rng(segment_idx + int(window * 100 + multiplier * 10))

    # Simulate that certain parameter combinations work better
    base_return = 0.15  # 15% base return

    # Optimal window around 14, optimal multiplier around 2.0
    window_penalty = abs(window - 14) * 0.01
    mult_penalty = abs(multiplier - 2.0) * 0.05

    noise = rng.normal(0, 0.02)
    total_return = base_return - window_penalty - mult_penalty + noise
    
    # Calculate equity values
    initial_capital = 100000
    
    return {
        'convex': initial_capital * (1 + total_return * 1.2),
        'concave': initial_capital * (1 + total_return * 0.8),
        'constant': initial_capital * (1 + total_return),
        'equal_weight': initial_capital * (1 + total_return * 0.9),
        'segment_idx': segment_idx,
        'rows_processed': len(segment_df),
        'stop_method': stop_method,
        'window': window,
        'multiplier': multiplier,
    }

# Test the mock function
result = mock_equity_func(
    segment_df=df_sample.iloc[:100],
    segment_idx=0,
    config_path=config_path,
    stop_method='atr',
    window=14,
    multiplier=2.0
)

print("Mock equity function result:")
for key, value in result.items():
    if isinstance(value, float):
        print(f"  {key}: {value:,.2f}")
    else:
        print(f"  {key}: {value}")

## 4. StrategyOptimizer Class

### Initialization

```python
StrategyOptimizer(
    data: pd.DataFrame,                    # Full historical OHLC data
    equity_func: Callable[..., Dict],      # Function to compute metrics
    config_path: str                        # Path to config file
)
```

### Input Validation

The optimizer validates:
- `data` must be a non-empty DataFrame
- `equity_func` must be callable
- `config_path` must point to an existing file

In [None]:
# Create optimizer instance
optimizer = StrategyOptimizer(
    data=df_sample,
    equity_func=mock_equity_func,
    config_path=config_path
)

print(f"Optimizer initialized successfully!")
print(f"  Data rows: {len(optimizer.data)}")
print(f"  Config path: {optimizer.config_path}")

In [None]:
# Validation examples
print("Input validation demonstrations:\n")

# Example 1: Empty DataFrame
try:
    bad_optimizer = StrategyOptimizer(
        data=pd.DataFrame(),  # Empty!
        equity_func=mock_equity_func,
        config_path=config_path
    )
except ValueError as e:
    print(f"1. Empty DataFrame: {e}")

# Example 2: Non-callable equity_func
try:
    bad_optimizer = StrategyOptimizer(
        data=df_sample,
        equity_func="not_a_function",  # Not callable!
        config_path=config_path
    )
except TypeError as e:
    print(f"2. Non-callable: {e}")

# Example 3: Invalid config path
try:
    bad_optimizer = StrategyOptimizer(
        data=df_sample,
        equity_func=mock_equity_func,
        config_path="nonexistent_file.json"  # Doesn't exist!
    )
except (ValueError, FileNotFoundError) as e:  # exact exception type may vary by version
    print(f"3. Invalid config: {e}")

## 5. Grid Search Optimization

Grid search tests all combinations of parameter values to find the best performing set.

### Method Signature

```python
optimizer.run_grid_search(
    segment_data: pd.DataFrame,           # Data segment to evaluate
    param_grid: Dict[str, Iterable],      # Parameter combinations
    segment_idx: int,                      # Segment index
    stop_method: str = 'atr',              # Stop-loss method
    price_col: str = 'close',              # Price column
    n_jobs: int = 1,                       # Parallel jobs (-1 for all CPUs)
    backend: str = 'loky',                 # Joblib backend
) -> pd.DataFrame
```
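
"All combinations" here means the Cartesian product of the value lists in `param_grid`. The sketch below shows that expansion with `itertools.product`; the helper name `expand_grid` is hypothetical, and this is not necessarily how `run_grid_search` enumerates the grid internally.

```python
from itertools import product

param_grid = {
    "window": [10, 14, 20],
    "multiplier": [1.5, 2.0, 2.5],
}


def expand_grid(grid):
    """Yield one {name: value} dict per combination in the grid."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


combos = list(expand_grid(param_grid))
print(len(combos))  # 3 * 3 = 9
print(combos[0])    # {'window': 10, 'multiplier': 1.5}
```

Because the count is the product of the list lengths, adding one more parameter with five values multiplies the run time by five.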

In [None]:
# Define parameter grid
param_grid = {
    'window': [10, 12, 14, 16, 18, 20],
    'multiplier': [1.5, 1.75, 2.0, 2.25, 2.5]
}

n_combinations = 1
for values in param_grid.values():
    n_combinations *= len(values)

print(f"Parameter Grid:")
for param, values in param_grid.items():
    print(f"  {param}: {values}")
print(f"\nTotal combinations: {n_combinations}")

In [None]:
# Run grid search
grid_results = optimizer.run_grid_search(
    segment_data=df_sample,
    param_grid=param_grid,
    segment_idx=0,
    stop_method='atr',
    price_col='close',
    n_jobs=1  # Use 1 for notebook stability; use -1 in production
)

print(f"Grid search results: {len(grid_results)} combinations tested")
print(f"\nColumns: {list(grid_results.columns)}")
grid_results.head(10)

In [None]:
# Find best parameters
best_idx = grid_results['convex'].idxmax()
best_row = grid_results.loc[best_idx]

print("Best Parameters (by convex equity):")
print(f"  Window: {best_row['window']}")
print(f"  Multiplier: {best_row['multiplier']}")
print(f"\nPerformance:")
print(f"  Convex Equity: ${best_row['convex']:,.2f}")
print(f"  Constant Equity: ${best_row['constant']:,.2f}")
print(f"  Equal Weight Equity: ${best_row['equal_weight']:,.2f}")

In [None]:
# Visualize grid search results as heatmap
import matplotlib.pyplot as plt

# Pivot results for heatmap
pivot = grid_results.pivot(index='window', columns='multiplier', values='convex')

fig, ax = plt.subplots(figsize=(10, 6))
im = ax.imshow(pivot.values, cmap='RdYlGn', aspect='auto')

# Set ticks
ax.set_xticks(range(len(pivot.columns)))
ax.set_xticklabels(pivot.columns)
ax.set_yticks(range(len(pivot.index)))
ax.set_yticklabels(pivot.index)

# Labels
ax.set_xlabel('Multiplier')
ax.set_ylabel('Window')
ax.set_title('Grid Search Results: Convex Equity by Parameters')

# Add colorbar
cbar = plt.colorbar(im, ax=ax)
cbar.set_label('Convex Equity ($)')

# Annotate values
for i in range(len(pivot.index)):
    for j in range(len(pivot.columns)):
        value = pivot.iloc[i, j]
        text = ax.text(j, i, f'{value/1000:.1f}k', ha='center', va='center', fontsize=8)

plt.tight_layout()
plt.show()

### Grid Search Safety Limits

The optimizer has built-in limits to prevent memory issues:

In [None]:
# Example: Exceeding combination limit
try:
    large_grid = {
        'param1': list(range(100)),
        'param2': list(range(100)),
        'param3': list(range(10))
    }  # 100 * 100 * 10 = 100,000 combinations
    
    optimizer.run_grid_search(
        segment_data=df_sample,
        param_grid=large_grid,
        segment_idx=0
    )
except RuntimeError as e:
    print(f"Combination limit exceeded: {e}")

# Example: Too many values per parameter
try:
    too_many_values = {
        'window': list(range(1, 1500))  # 1499 values > MAX_PARAM_VALUES
    }
    
    optimizer.run_grid_search(
        segment_data=df_sample,
        param_grid=too_many_values,
        segment_idx=0
    )
except RuntimeError as e:
    print(f"\nParameter values limit exceeded: {e}")

# Example: Empty parameter values
try:
    empty_values = {
        'window': []  # Empty list!
    }
    
    optimizer.run_grid_search(
        segment_data=df_sample,
        param_grid=empty_values,
        segment_idx=0
    )
except ValueError as e:
    print(f"\nEmpty values error: {e}")

## 6. Rolling Walk-Forward Analysis

Walk-forward analysis is a widely used approach to strategy validation. It reduces the risk of overfitting by using separate data for optimization and validation.

### Method Signature

```python
optimizer.rolling_walk_forward(
    stop_method: str,                      # Stop-loss method
    param_grid: Dict[str, Iterable],       # Parameter grid
    close_col: str = 'close',              # Close price column
    n_segments: int = 4,                   # Number of walk-forward segments
    n_jobs: int = 1,                       # Parallel jobs
    verbose: bool = False,                 # Print detailed output
    opt_metric: str = 'convex'             # Metric to optimize
) -> Tuple[pd.DataFrame, Dict, List]
```

### Returns

1. **oos_df**: DataFrame with out-of-sample results for each segment
2. **stability**: Dict with parameter stability metrics (CV values)
3. **history**: List of dicts with per-segment best parameters
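
One way to use `history` is to look for the parameter set that wins most often across segments. The sketch below assumes each entry has the `segment`, `is_metric`, and `params` keys shown in this guide; the helper name `most_frequent_params` and the example values are made up for illustration.

```python
from collections import Counter

# Hypothetical history, for illustration only
example_history = [
    {"segment": 1, "is_metric": 118500.0, "params": {"window": 14, "multiplier": 2.0}},
    {"segment": 2, "is_metric": 116200.0, "params": {"window": 14, "multiplier": 2.0}},
    {"segment": 3, "is_metric": 117900.0, "params": {"window": 16, "multiplier": 2.5}},
    {"segment": 4, "is_metric": 119100.0, "params": {"window": 14, "multiplier": 2.0}},
]


def most_frequent_params(history):
    """Return (params, count) for the parameter set chosen in the most segments."""
    if not history:
        raise ValueError("history is empty")
    # Dicts are not hashable, so count a sorted tuple of (name, value) pairs instead
    counts = Counter(tuple(sorted(entry["params"].items())) for entry in history)
    best, count = counts.most_common(1)[0]
    return dict(best), count


consensus, n_wins = most_frequent_params(example_history)
print(f"{consensus} chosen in {n_wins} of {len(example_history)} segments")
```

If no parameter set repeats, the count is 1 for every entry, which is itself a sign of unstable parameters.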

In [None]:
# Run walk-forward optimization
oos_df, stability, history = optimizer.rolling_walk_forward(
    stop_method='atr',
    param_grid={
        'window': [10, 12, 14, 16, 18, 20],
        'multiplier': [1.5, 2.0, 2.5]
    },
    close_col='close',
    n_segments=4,
    n_jobs=1,
    verbose=True,
    opt_metric='convex'
)

print(f"\nWalk-forward complete!")
print(f"  OOS results: {len(oos_df)} segments")
print(f"  Valid segments: {stability.get('n_segments_valid', 0)}")

In [None]:
# View out-of-sample results
print("Out-of-Sample Results:")
oos_df[['segment', 'convex', 'constant', 'equal_weight', 'window', 'multiplier']]

In [None]:
# View parameter stability
print("Parameter Stability (Coefficient of Variation):")
print("="*50)
for key, value in stability.items():
    if key == 'n_segments_valid':
        print(f"  Valid segments: {value}")
    elif '_cv' in key:
        param_name = key.replace('_cv', '')
        if pd.isna(value):
            print(f"  {param_name}: N/A")
        else:
            stability_rating = "Stable" if value < 0.2 else "Moderate" if value < 0.5 else "Unstable"
            print(f"  {param_name}: {value:.4f} ({stability_rating})")

In [None]:
# View optimization history
print("Optimization History (Best Parameters per Segment):")
print("="*60)
for entry in history:
    print(f"\nSegment {entry['segment']}:")
    print(f"  In-Sample Metric: ${entry['is_metric']:,.2f}")
    print(f"  Parameters: {entry['params']}")

In [None]:
# Visualize walk-forward results
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Plot 1: OOS Performance by Segment
ax1 = axes[0, 0]
segments = oos_df['segment']
width = 0.25
x = np.arange(len(segments))

ax1.bar(x - width, oos_df['convex']/1000, width, label='Convex', color='green')
ax1.bar(x, oos_df['constant']/1000, width, label='Constant', color='blue')
ax1.bar(x + width, oos_df['equal_weight']/1000, width, label='Equal Weight', color='orange')

ax1.axhline(y=100, color='gray', linestyle='--', alpha=0.5, label='Initial Capital')
ax1.set_xlabel('Segment')
ax1.set_ylabel('Equity ($K)')
ax1.set_title('Out-of-Sample Performance by Segment')
ax1.set_xticks(x)
ax1.set_xticklabels(segments)
ax1.legend()
ax1.grid(True, alpha=0.3)

# Plot 2: Parameter Values Across Segments
ax2 = axes[0, 1]
windows = [h['params']['window'] for h in history]
multipliers = [h['params']['multiplier'] for h in history]
segments_hist = [h['segment'] for h in history]

ax2_twin = ax2.twinx()
ax2.plot(segments_hist, windows, 'b-o', label='Window', linewidth=2, markersize=8)
ax2_twin.plot(segments_hist, multipliers, 'r-s', label='Multiplier', linewidth=2, markersize=8)

ax2.set_xlabel('Segment')
ax2.set_ylabel('Window', color='blue')
ax2_twin.set_ylabel('Multiplier', color='red')
ax2.set_title('Best Parameters per Segment')
ax2.legend(loc='upper left')
ax2_twin.legend(loc='upper right')
ax2.grid(True, alpha=0.3)

# Plot 3: IS vs OOS Performance
ax3 = axes[1, 0]
is_metrics = [h['is_metric'] for h in history]
oos_metrics = oos_df['convex'].values

ax3.bar(x - 0.2, np.array(is_metrics)/1000, 0.4, label='In-Sample', color='lightblue')
ax3.bar(x + 0.2, oos_metrics/1000, 0.4, label='Out-of-Sample', color='darkblue')
ax3.axhline(y=100, color='gray', linestyle='--', alpha=0.5)
ax3.set_xlabel('Segment')
ax3.set_ylabel('Convex Equity ($K)')
ax3.set_title('In-Sample vs Out-of-Sample Performance')
ax3.set_xticks(x)
ax3.set_xticklabels(segments)
ax3.legend()
ax3.grid(True, alpha=0.3)

# Plot 4: Stability Indicators
ax4 = axes[1, 1]
cv_values = {k.replace('_cv', ''): v for k, v in stability.items() if '_cv' in k}
params = list(cv_values.keys())
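cvs = [0 if pd.isna(cv_values[p]) else cv_values[p] for p in params]

# Gray marks parameters whose CV is undefined (NaN), so they are not mistaken for stable ones
colors = ['gray' if pd.isna(cv_values[p]) else 'green' if cv_values[p] < 0.2 else 'orange' if cv_values[p] < 0.5 else 'red' for p in params]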
ax4.barh(params, cvs, color=colors)
ax4.axvline(x=0.2, color='green', linestyle='--', alpha=0.5, label='Stable (<0.2)')
ax4.axvline(x=0.5, color='red', linestyle='--', alpha=0.5, label='Unstable (>0.5)')
ax4.set_xlabel('Coefficient of Variation')
ax4.set_title('Parameter Stability (CV)')
ax4.legend()
ax4.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## 7. Sensitivity Analysis

Sensitivity analysis tests how robust your optimal parameters are by evaluating performance in the neighborhood of the best values.

### Method Signature

```python
optimizer.sensitivity_analysis(
    stop_method: str,                       # Stop-loss method
    best_params: Dict[str, Any],            # Best parameter values
    close_col: str = 'close',               # Close price column
    variance: float = 0.20,                 # Fraction to vary (+/- 20%)
    opt_metric: str = 'convex',             # Metric to evaluate
    extra_grids: Optional[Dict] = None      # Additional grid values
) -> Tuple[float, pd.DataFrame]
```
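
To illustrate what a neighborhood of +/- `variance` around the best values can look like, the sketch below builds a small grid around each parameter. The helper `neighborhood_grid`, its `steps` argument, and the rounding rules are assumptions made for this example; the grid that `sensitivity_analysis` builds internally may be different.

```python
def neighborhood_grid(best_params, variance=0.20, steps=5):
    """Hypothetical helper: evenly spaced values within +/- variance of each best value."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    grid = {}
    for name, value in best_params.items():
        low, high = value * (1 - variance), value * (1 + variance)
        points = [low + i * (high - low) / (steps - 1) for i in range(steps)]
        if isinstance(value, int):
            # Integer parameters (e.g. a window length) stay integers; duplicates are dropped
            points = sorted({max(1, round(p)) for p in points})
        else:
            points = [round(p, 4) for p in points]
        grid[name] = points
    return grid


grid = neighborhood_grid({"window": 14, "multiplier": 2.0}, variance=0.20)
for name, values in grid.items():
    print(f"{name}: {values}")
```

Values read back from a DataFrame are often NumPy types rather than Python `int`, so a production version would need a broader integer check.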

### Returns

1. **plateau_ratio_pct**: (avg performance / peak performance) * 100
   - High ratio (> 80%): Parameters are robust
   - Low ratio (< 60%): Performance is sensitive to parameter changes
2. **results_df**: Full grid search results around the best parameters
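
The plateau ratio defined above is simple enough to recompute by hand from the results. The sketch below follows the stated definition (average divided by peak, times 100); the function names and the example equity values are hypothetical.

```python
def plateau_ratio_pct(metric_values):
    """(average performance / peak performance) * 100 over a neighborhood of results."""
    values = [float(v) for v in metric_values]
    if not values:
        raise ValueError("no results to summarize")
    peak = max(values)
    if peak == 0:
        return float("nan")
    return (sum(values) / len(values)) / peak * 100.0


def robustness_label(ratio_pct):
    """Bands used in this guide: > 80 robust, > 60 moderate, otherwise sensitive."""
    if ratio_pct > 80:
        return "ROBUST"
    if ratio_pct > 60:
        return "MODERATE"
    return "SENSITIVE"


# Made-up convex equity values around a best parameter set
neighborhood = [110000, 112000, 115000, 111000, 109000]
ratio = plateau_ratio_pct(neighborhood)
print(f"Plateau ratio: {ratio:.2f}% ({robustness_label(ratio)})")
```

With the notebook's own results, the same function could be applied to `sens_results['convex']`.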

In [None]:
# Get best parameters from walk-forward
if history:
    best_params = history[-1]['params']
else:
    best_params = {'window': 14, 'multiplier': 2.0}

print(f"Best parameters for sensitivity analysis: {best_params}")

In [None]:
# Run sensitivity analysis
plateau_ratio, sens_results = optimizer.sensitivity_analysis(
    stop_method='atr',
    best_params=best_params,
    close_col='close',
    variance=0.20,  # +/- 20%
    opt_metric='convex'
)

print(f"Sensitivity Analysis Results:")
print(f"  Plateau Ratio: {plateau_ratio:.2f}%")
print(f"  Combinations tested: {len(sens_results)}")

if plateau_ratio > 80:
    print(f"\n  Interpretation: ROBUST - Performance degrades gracefully")
elif plateau_ratio > 60:
    print(f"\n  Interpretation: MODERATE - Some sensitivity to parameters")
else:
    print(f"\n  Interpretation: SENSITIVE - Performance highly dependent on exact parameters")

In [None]:
# View sensitivity results
print("Sensitivity Grid Results:")
sens_results[['window', 'multiplier', 'convex', 'constant']]

In [None]:
# Visualize sensitivity
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Plot 1: Performance by parameter value
ax1 = axes[0]

# One line per multiplier value, plotted against window
for mult in sens_results['multiplier'].unique():
    mask = sens_results['multiplier'] == mult
    ax1.plot(sens_results[mask]['window'], sens_results[mask]['convex']/1000,
             'o-', label=f'mult={mult}', markersize=8, linewidth=2)

ax1.axhline(y=sens_results['convex'].max()/1000, color='green', linestyle='--', 
            alpha=0.5, label='Peak')
ax1.axhline(y=sens_results['convex'].mean()/1000, color='orange', linestyle='--',
            alpha=0.5, label='Average')

ax1.set_xlabel('Window')
ax1.set_ylabel('Convex Equity ($K)')
ax1.set_title('Performance Sensitivity by Window')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Plot 2: Distribution of performance
ax2 = axes[1]
ax2.hist(sens_results['convex']/1000, bins=10, edgecolor='black', alpha=0.7)
ax2.axvline(x=sens_results['convex'].max()/1000, color='green', linestyle='--',
            linewidth=2, label=f'Peak: ${sens_results["convex"].max()/1000:.1f}K')
ax2.axvline(x=sens_results['convex'].mean()/1000, color='orange', linestyle='--',
            linewidth=2, label=f'Mean: ${sens_results["convex"].mean()/1000:.1f}K')

ax2.set_xlabel('Convex Equity ($K)')
ax2.set_ylabel('Count')
ax2.set_title(f'Performance Distribution (Plateau: {plateau_ratio:.1f}%)')
ax2.legend()
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## 8. Comparing Signals

The `compare_signals()` method runs walk-forward optimization and summarizes results.

**Note**: The method currently evaluates a single signal and reports summary statistics. Multi-signal comparison is planned for future versions.

In [None]:
# Compare signals
comparison = optimizer.compare_signals(
    signals='rrg',  # Signal name (used for labeling)
    stop_method='atr',
    param_grid={'window': [10, 14, 20], 'multiplier': [1.5, 2.0, 2.5]},
    n_segments=3,
    n_jobs=1
)

print("Signal Comparison Results:")
comparison

## 9. Complete Workflow Integration

Here's how to integrate the optimizer with the full `algoshort` pipeline.

In [None]:
# Import all required modules
try:
    from algoshort.yfinance_handler import YFinanceDataHandler
    from algoshort.stop_loss import StopLossCalculator
    from algoshort.position_sizing import PositionSizing
    print("All modules imported successfully!")
    FULL_INTEGRATION = True
except ImportError as e:
    print(f"Some modules not available: {e}")
    print("Showing example workflow structure instead.")
    FULL_INTEGRATION = False

In [None]:
# Complete optimization workflow
print("="*60)
print("COMPLETE OPTIMIZATION WORKFLOW")
print("="*60)

# Step 1: Define parameter space
print("\n1. DEFINE PARAMETER SPACE")
param_grid = {
    'window': [10, 12, 14, 16, 18, 20],
    'multiplier': [1.5, 1.75, 2.0, 2.25, 2.5]
}
print(f"   Parameters: {list(param_grid.keys())}")
print(f"   Total combinations: {len(param_grid['window']) * len(param_grid['multiplier'])}")

# Step 2: Run walk-forward optimization
print("\n2. RUN WALK-FORWARD OPTIMIZATION")
oos_df, stability, history = optimizer.rolling_walk_forward(
    stop_method='atr',
    param_grid=param_grid,
    n_segments=4,
    opt_metric='convex',
    n_jobs=1
)
print(f"   Valid segments: {stability.get('n_segments_valid', 0)}")
print(f"   Mean OOS Convex: ${oos_df['convex'].mean():,.2f}")

# Step 3: Analyze parameter stability
print("\n3. ANALYZE PARAMETER STABILITY")
for key, value in stability.items():
    if '_cv' in key:
        param = key.replace('_cv', '')
        if pd.isna(value):
            print(f"   {param} CV: N/A")
        else:
            status = "STABLE" if value < 0.2 else "MODERATE" if value < 0.5 else "UNSTABLE"
            print(f"   {param} CV: {value:.4f} ({status})")

# Step 4: Choose the parameters to carry forward
print("\n4. DETERMINE BEST PARAMETERS")
# Use the most recent segment's parameters; fall back to defaults if no segment was valid
if history:
    best_params = history[-1]['params']
else:
    best_params = {'window': 14, 'multiplier': 2.0}
print(f"   Best parameters: {best_params}")

# Step 5: Run sensitivity analysis
print("\n5. RUN SENSITIVITY ANALYSIS")
plateau_ratio, sens_results = optimizer.sensitivity_analysis(
    stop_method='atr',
    best_params=best_params,
    variance=0.20
)
print(f"   Plateau ratio: {plateau_ratio:.2f}%")
robustness = "HIGH" if plateau_ratio > 80 else "MODERATE" if plateau_ratio > 60 else "LOW"
print(f"   Robustness: {robustness}")

# Step 6: Final recommendation
print("\n" + "="*60)
print("OPTIMIZATION SUMMARY")
print("="*60)
print(f"\nRecommended Parameters:")
for param, value in best_params.items():
    print(f"  {param}: {value}")

print(f"\nExpected Performance (OOS):")
print(f"  Mean Convex Equity: ${oos_df['convex'].mean():,.2f}")
print(f"  Mean Return: {(oos_df['convex'].mean() / 100000 - 1) * 100:.2f}%")

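print(f"\nConfidence Assessment:")
# Report only the CV entries; the stability dict also holds non-CV keys such as n_segments_valid
cv_summary = {k.replace('_cv', ''): round(v, 4) for k, v in stability.items() if '_cv' in k and not pd.isna(v)}
print(f"  Parameter Stability (CV): {cv_summary if cv_summary else 'N/A'}")
print(f"  Robustness (Plateau): {plateau_ratio:.1f}%")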

## 10. Best Practices and Tips

### Data Requirements

| Requirement | Minimum | Recommended |
|-------------|---------|-------------|
| Total rows | 100 | 500+ |
| Rows per segment | 30 | 50+ |
| Segments | 2 | 4-6 |

### Parameter Grid Guidelines

```python
# GOOD: Reasonable grid
param_grid = {
    'window': [10, 14, 20, 30],        # 4 values
    'multiplier': [1.5, 2.0, 2.5, 3.0]  # 4 values
}  # 16 combinations

# BAD: Too large
param_grid = {
    'window': list(range(5, 100)),     # 95 values!
    'multiplier': np.arange(0.5, 5, 0.1)  # 45 values!
}  # 4,275 combinations - too many!
```

### Interpreting Results

| Metric | Good | Concerning |
|--------|------|------------|
| Parameter CV | < 0.2 | > 0.5 |
| Plateau Ratio | > 80% | < 60% |
| IS vs OOS Gap | < 20% | > 50% |
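
The guide does not define the IS vs OOS gap precisely. One possible definition, used here purely as an assumption, is the share of the in-sample return that is lost out of sample. The function name and the `initial_capital` default of 100,000 (matching the mock equity function in this notebook) are hypothetical.

```python
def is_oos_gap_pct(is_equity, oos_equity, initial_capital=100000.0):
    """Percentage of the in-sample return that disappears out of sample.

    Assumed definition: (IS return - OOS return) / IS return * 100.
    """
    is_return = is_equity / initial_capital - 1.0
    oos_return = oos_equity / initial_capital - 1.0
    if is_return <= 0:
        return float("nan")  # the gap is not meaningful without an in-sample gain
    return (is_return - oos_return) / is_return * 100.0


# Made-up numbers: 20% in-sample return versus 15% out-of-sample return
gap = is_oos_gap_pct(is_equity=120000.0, oos_equity=115000.0)
print(f"IS vs OOS gap: {gap:.1f}%")
```

A negative gap means the out-of-sample return exceeded the in-sample one, which can happen by chance on short segments.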

### Common Pitfalls

1. **Overfitting**: Using too many parameters or segments
2. **Data Snooping**: Testing too many parameter combinations
3. **Survivorship Bias**: Not including delisted securities
4. **Look-Ahead Bias**: Using future data in calculations

In [None]:
# Quick reference: Full optimization workflow
print("""
QUICK REFERENCE: Strategy Optimization Workflow
================================================

# 1. Setup
from algoshort.optimizer import StrategyOptimizer, get_equity

optimizer = StrategyOptimizer(
    data=df,
    equity_func=get_equity,  # or custom function
    config_path='config.json'
)

# 2. Walk-Forward Optimization
oos_df, stability, history = optimizer.rolling_walk_forward(
    stop_method='atr',
    param_grid={'window': [10, 14, 20], 'multiplier': [1.5, 2.0, 2.5]},
    n_segments=4,
    opt_metric='convex'
)

# 3. Check Stability
print(f"Parameter stability: {stability}")

# 4. Sensitivity Analysis
best_params = history[-1]['params']
plateau, sens_df = optimizer.sensitivity_analysis(
    stop_method='atr',
    best_params=best_params,
    variance=0.20
)
print(f"Plateau ratio: {plateau:.1f}%")

# 5. Use optimized parameters in production
print(f"Recommended: {best_params}")
""")

In [None]:
# Cleanup: Remove temporary config file
if os.path.exists(config_path):
    os.unlink(config_path)
    print(f"Temporary config file removed: {config_path}")

---

## Summary

This guide covered:

1. **Setup**: Import and understand the optimizer module
2. **Concepts**: Walk-forward analysis, parameter stability, sensitivity
3. **get_equity()**: The core equity calculation function
4. **StrategyOptimizer**: Class initialization and validation
5. **Grid Search**: Finding optimal parameters on a single segment
6. **Walk-Forward**: Robust optimization with IS/OOS splits
7. **Sensitivity**: Testing parameter robustness
8. **Signal Comparison**: Summarizing walk-forward results for a signal
9. **Integration**: Complete workflow with other modules
10. **Best Practices**: Guidelines for effective optimization

### Key Takeaways

- Use **walk-forward analysis** to reduce the risk of overfitting
- Check **parameter stability** (CV) across segments
- Validate with **sensitivity analysis** before production
- Keep parameter grids **reasonable** (< 1000 combinations)
- Compare **IS vs OOS performance** to detect overfitting

For questions or issues, refer to the test suite at `tests/test_optimizer.py`.