# Hyperparameter Tuning - Fraud Detection Model

This notebook demonstrates hyperparameter tuning strategies for the fraud detection
model using the project's `HyperparameterTuner` and `ExperimentTracker` classes.

We cover three approaches:
1. **Grid Search** — exhaustive search over a discrete parameter grid
2. **Random Search** — sampling from parameter distributions (including callable lambdas)
3. **SageMaker Automatic Model Tuning** — Bayesian optimization via managed tuning jobs

All trials are automatically logged to the ExperimentTracker for reproducibility.

**Requirements covered:** 3.1 (Grid search), 3.2 (Random search), 3.3 (SageMaker Automatic Model Tuning), 3.4 (Log all parameter combinations and metrics)

## 1. Setup and Imports

In [None]:
import sys
import io
import random

import boto3
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from xgboost import XGBClassifier

# Add project src to path
sys.path.insert(0, '../src')
from experiment_tracking import ExperimentTracker
from hyperparameter_tuning import HyperparameterTuner

sns.set_theme(style='whitegrid')
%matplotlib inline

## 2. Load Data from S3

Load the pre-split training and test sets from the `fraud-detection-data` bucket and
separate the features from the target column for the tuning experiments.

In [None]:
BUCKET_NAME = 'fraud-detection-data'
DATA_PREFIX = 'processed'

s3_client = boto3.client('s3')


def load_parquet_from_s3(bucket: str, key: str) -> pd.DataFrame:
    """Load a Parquet file from S3 into a pandas DataFrame."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return pd.read_parquet(io.BytesIO(response['Body'].read()))


train_df = load_parquet_from_s3(BUCKET_NAME, f'{DATA_PREFIX}/train.parquet')
test_df = load_parquet_from_s3(BUCKET_NAME, f'{DATA_PREFIX}/test.parquet')

# Separate features and target
TARGET = 'Class'
FEATURES = [c for c in train_df.columns if c != TARGET]

X_train = train_df[FEATURES]
y_train = train_df[TARGET]
X_test = test_df[FEATURES]
y_test = test_df[TARGET]

print(f'Training set:  {X_train.shape[0]:,} rows, {X_train.shape[1]} features')
print(f'Test set:      {X_test.shape[0]:,} rows, {X_test.shape[1]} features')

## 3. Initialize ExperimentTracker and HyperparameterTuner

The `HyperparameterTuner` accepts an optional `ExperimentTracker` instance.
When provided, every trial is automatically logged with its parameters and metrics
(Requirement 3.4).
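
The optional-tracker pattern described above can be illustrated with a small stand-alone sketch. Everything in it is hypothetical: `InMemoryTracker`, its `log_trial` method, and `run_trials` are stand-ins invented for illustration and are not the project's `ExperimentTracker` or `HyperparameterTuner` API.

```python
from typing import Any, Callable, Dict, List, Optional


class InMemoryTracker:
    """Hypothetical stand-in for a tracker: keeps every trial in a list."""

    def __init__(self) -> None:
        self.records: List[Dict[str, Any]] = []

    def log_trial(self, experiment_name: str, params: dict, metrics: dict) -> None:
        # Copy the dicts so later mutation by the caller cannot alter the log.
        self.records.append(
            {'experiment': experiment_name, 'params': dict(params), 'metrics': dict(metrics)}
        )


def run_trials(
    candidates: List[dict],
    evaluate: Callable[[dict], dict],
    experiment_name: str,
    tracker: Optional[InMemoryTracker] = None,
) -> List[dict]:
    """Evaluate each candidate; log it only when a tracker was supplied."""
    results = []
    for params in candidates:
        metrics = evaluate(params)
        if tracker is not None:
            tracker.log_trial(experiment_name, params, metrics)
        results.append({'params': params, 'metrics': metrics})
    return results


# Toy evaluation function used only to exercise the pattern.
def toy_evaluate(params: dict) -> dict:
    return {'accuracy': 1.0 / params['max_depth']}


demo_candidates = [{'max_depth': 3}, {'max_depth': 5}]
demo_tracker = InMemoryTracker()
tracked = run_trials(demo_candidates, toy_evaluate, 'demo', tracker=demo_tracker)
untracked = run_trials(demo_candidates, toy_evaluate, 'demo')

print(len(demo_tracker.records))  # one record per tracked trial
```

Keeping the tracker optional means the same search code can run in a quick local check without any logging backend.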

In [None]:
tracker = ExperimentTracker(region_name='us-east-1')
tuner = HyperparameterTuner(tracker=tracker)

print('ExperimentTracker and HyperparameterTuner initialized.')

## 4. Grid Search

Grid search evaluates every combination in the parameter grid. This is thorough, but
the number of trials is the product of the list lengths, so the cost grows quickly as
parameters or values are added. Use it when the search space is small and you want
full coverage.

**Requirement 3.1**: Support grid search hyperparameter tuning with configurable parameter ranges
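
As a rough illustration of what "every combination" means, the sketch below enumerates a grid with `itertools.product`. The helper `iter_grid` and the name `demo_grid` are assumptions made for this example; how the project's `grid_search` enumerates combinations internally is not shown in this notebook.

```python
from itertools import product

# Same three tunable parameters as the grid used in this section.
demo_grid = {
    'max_depth': [3, 5, 7],
    'learning_rate': [0.01, 0.1, 0.2],
    'n_estimators': [50, 100, 150],
}


def iter_grid(grid):
    """Yield one dict per point in the Cartesian product of the grid values."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


combos = list(iter_grid(demo_grid))
print(len(combos))  # 3 * 3 * 3 = 27
print(combos[0])    # {'max_depth': 3, 'learning_rate': 0.01, 'n_estimators': 50}
```

Single-value entries such as `'eval_metric': ['logloss']` multiply the count by 1, so they fix a setting without adding trials.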

In [None]:
param_grid = {
    'max_depth': [3, 5, 7],
    'learning_rate': [0.01, 0.1, 0.2],
    'n_estimators': [50, 100, 150],
    'use_label_encoder': [False],
    'eval_metric': ['logloss'],
}

total_combos = 1
for v in param_grid.values():
    total_combos *= len(v)
print(f'Grid search will evaluate {total_combos} parameter combinations.')

In [None]:
grid_results = tuner.grid_search(
    model_class=XGBClassifier,
    param_grid=param_grid,
    X_train=X_train,
    y_train=y_train,
    X_test=X_test,
    y_test=y_test,
    experiment_name='xgboost-grid-search',
    scoring='accuracy',
)

print(f'Grid search complete — {len(grid_results["all_results"])} trials evaluated.')

### 4.1 Grid Search Results

In [None]:
# Build a results DataFrame for easy inspection
grid_rows = []
for trial in grid_results['all_results']:
    row = {**trial['params'], **trial['metrics'], 'score': trial['score']}
    grid_rows.append(row)

grid_df = pd.DataFrame(grid_rows).sort_values('score', ascending=False)
grid_df.head(10)

In [None]:
print('=== Best Hyperparameters (Grid Search) ===')
print(f'  Score (accuracy): {grid_results["best_score"]:.4f}')
for param, value in grid_results['best_params'].items():
    print(f'  {param}: {value}')

In [None]:
# Visualize grid search: accuracy by max_depth and learning_rate
fig, axes = plt.subplots(1, 3, figsize=(18, 5))

for ax, n_est in zip(axes, param_grid['n_estimators']):
    subset = grid_df[grid_df['n_estimators'] == n_est]
    pivot = subset.pivot_table(
        index='max_depth', columns='learning_rate', values='accuracy'
    )
    sns.heatmap(pivot, annot=True, fmt='.4f', cmap='YlGnBu', ax=ax)
    ax.set_title(f'n_estimators = {n_est}')

plt.suptitle('Grid Search — Accuracy by max_depth and learning_rate', y=1.02)
plt.tight_layout()
plt.show()

## 5. Random Search

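Random search draws each trial's parameters independently from the given distributions.
For a fixed trial budget it is often more efficient than grid search in high-dimensional
spaces, because every trial tests a new value of each parameter instead of revisiting
the same few grid points.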

Distributions can be:
- **Lists** — a random element is chosen uniformly
- **Callables** (e.g. `lambda: random.uniform(0.01, 0.3)`) — called each iteration

**Requirement 3.2**: Support random search hyperparameter tuning with configurable parameter distributions
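
The two kinds of distribution listed above could be sampled along the following lines. This is a minimal sketch, not the project's `random_search` implementation: `sample_params`, `make_distributions`, and `draw_trials` are hypothetical helpers, and a dedicated `random.Random` instance is used so the draws are repeatable.

```python
import random


def sample_params(distributions, rng):
    """Draw one value per parameter: call callables, pick uniformly from lists."""
    sampled = {}
    for name, dist in distributions.items():
        if callable(dist):
            sampled[name] = dist()
        else:
            sampled[name] = rng.choice(dist)
    return sampled


def make_distributions(rng):
    """Build distributions whose lambdas draw from the supplied generator."""
    return {
        'max_depth': [3, 4, 5, 6, 7, 8, 9, 10],
        'learning_rate': lambda: rng.uniform(0.01, 0.3),
        'n_estimators': lambda: rng.randint(50, 200),
    }


def draw_trials(seed, n_iter):
    """Return n_iter sampled parameter dicts for a given seed."""
    rng = random.Random(seed)
    distributions = make_distributions(rng)
    return [sample_params(distributions, rng) for _ in range(n_iter)]


trials = draw_trials(seed=42, n_iter=3)
for params in trials:
    print(params)
```

Because both the list choice and the lambdas share one seeded generator, calling `draw_trials` again with the same seed yields the same parameter sets.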

In [None]:
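# Seed the module-level RNG so the lambdas below draw a repeatable sequence.
# This assumes the tuner samples via Python's `random` module and does not reseed it.
random.seed(42)

param_distributions = {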
    'max_depth': [3, 4, 5, 6, 7, 8, 9, 10],
    'learning_rate': lambda: random.uniform(0.01, 0.3),
    'n_estimators': lambda: random.randint(50, 200),
    'subsample': lambda: random.uniform(0.5, 1.0),
    'colsample_bytree': lambda: random.uniform(0.5, 1.0),
    'use_label_encoder': [False],
    'eval_metric': ['logloss'],
}

N_ITER = 15
print(f'Random search will sample {N_ITER} parameter combinations.')

In [None]:
random_results = tuner.random_search(
    model_class=XGBClassifier,
    param_distributions=param_distributions,
    n_iter=N_ITER,
    X_train=X_train,
    y_train=y_train,
    X_test=X_test,
    y_test=y_test,
    experiment_name='xgboost-random-search',
    scoring='accuracy',
)

print(f'Random search complete — {len(random_results["all_results"])} trials evaluated.')

### 5.1 Random Search Results

In [None]:
random_rows = []
for trial in random_results['all_results']:
    row = {**trial['params'], **trial['metrics'], 'score': trial['score']}
    random_rows.append(row)

random_df = pd.DataFrame(random_rows).sort_values('score', ascending=False)
random_df.head(10)

In [None]:
print('=== Best Hyperparameters (Random Search) ===')
print(f'  Score (accuracy): {random_results["best_score"]:.4f}')
for param, value in random_results['best_params'].items():
    if isinstance(value, float):
        print(f'  {param}: {value:.4f}')
    else:
        print(f'  {param}: {value}')

In [None]:
# Visualize random search: scatter of learning_rate vs accuracy
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

axes[0].scatter(random_df['learning_rate'], random_df['accuracy'], c='teal', alpha=0.7)
axes[0].set_xlabel('learning_rate')
axes[0].set_ylabel('Accuracy')
axes[0].set_title('Random Search — Learning Rate vs Accuracy')

axes[1].scatter(random_df['max_depth'], random_df['accuracy'], c='darkorange', alpha=0.7)
axes[1].set_xlabel('max_depth')
axes[1].set_ylabel('Accuracy')
axes[1].set_title('Random Search — Max Depth vs Accuracy')

plt.tight_layout()
plt.show()

## 6. SageMaker Automatic Model Tuning (Bayesian Optimization)

For large-scale tuning, SageMaker Automatic Model Tuning uses Bayesian optimization:
it fits a model to the results of completed training jobs and uses that model to choose
the next hyperparameter combinations to try. Training jobs run in parallel on dedicated
instances, which shortens wall-clock time for expensive models.

**Requirement 3.3**: Support Bayesian optimization hyperparameter tuning using SageMaker Automatic Model Tuning

> **Note**: This section requires a SageMaker execution role and S3 training data.
> It will not run in a local-only environment.
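
How the project's `tuner.bayesian_optimization` wraps SageMaker is not shown in this notebook. As a hedged sketch, assuming it delegates to the SageMaker Python SDK, an equivalent direct call might look like the function below. The function name `launch_tuning_job` is hypothetical, and the Parquet content type assumes the built-in XGBoost container is reading the Parquet files used elsewhere in this notebook.

```python
def launch_tuning_job(estimator, hyperparameter_ranges, train_s3, validation_s3,
                      max_jobs=20, max_parallel_jobs=5):
    """Start a Bayesian tuning job with the SageMaker SDK and return the tuner."""
    # Imported lazily so this function can be defined without the SDK installed.
    from sagemaker.inputs import TrainingInput
    from sagemaker.tuner import HyperparameterTuner as SageMakerTuner

    sm_tuner = SageMakerTuner(
        estimator=estimator,
        objective_metric_name='validation:auc',
        hyperparameter_ranges=hyperparameter_ranges,
        objective_type='Maximize',   # higher AUC is better
        strategy='Bayesian',
        max_jobs=max_jobs,
        max_parallel_jobs=max_parallel_jobs,
    )

    # Channel names 'train' and 'validation' are what the built-in XGBoost
    # algorithm expects; the content type is an assumption about the data format.
    inputs = {
        'train': TrainingInput(train_s3, content_type='application/x-parquet'),
        'validation': TrainingInput(validation_s3, content_type='application/x-parquet'),
    }

    sm_tuner.fit(inputs)
    sm_tuner.wait()  # block until all training jobs have finished
    return sm_tuner
```

After the job finishes, `sm_tuner.best_training_job()` returns the name of the best training job. A lower `max_parallel_jobs` gives the Bayesian strategy more completed results to learn from before it picks the next candidates, at the cost of a longer total run.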

In [None]:
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.tuner import (
    IntegerParameter,
    ContinuousParameter,
    HyperparameterTuner as SageMakerTuner,
)

sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()
region = sagemaker_session.boto_region_name

print(f'SageMaker session region: {region}')
print(f'Execution role: {role}')

In [None]:
# Configure the XGBoost estimator for SageMaker training
xgb_image_uri = sagemaker.image_uris.retrieve('xgboost', region, version='1.5-1')

xgb_estimator = Estimator(
    image_uri=xgb_image_uri,
    role=role,
    instance_count=1,
    instance_type='ml.m5.xlarge',
    output_path=f's3://{BUCKET_NAME}/models/tuning-output',
    sagemaker_session=sagemaker_session,
)

# Set static hyperparameters
xgb_estimator.set_hyperparameters(
    objective='binary:logistic',
    eval_metric='auc',
)

print('XGBoost estimator configured.')

In [None]:
# Define hyperparameter ranges for Bayesian optimization
hyperparameter_ranges = {
    'max_depth': IntegerParameter(3, 10),
    'eta': ContinuousParameter(0.01, 0.3),
    'subsample': ContinuousParameter(0.5, 1.0),
    'colsample_bytree': ContinuousParameter(0.5, 1.0),
    'num_round': IntegerParameter(50, 200),
}

print('Hyperparameter ranges:')
for name, param_range in hyperparameter_ranges.items():
    print(f'  {name}: {param_range}')

In [None]:
# Launch Bayesian optimization via the project's HyperparameterTuner
TRAIN_DATA_S3 = f's3://{BUCKET_NAME}/{DATA_PREFIX}/train.parquet'
VALIDATION_DATA_S3 = f's3://{BUCKET_NAME}/{DATA_PREFIX}/validation.parquet'

bayesian_results = tuner.bayesian_optimization(
    estimator=xgb_estimator,
    objective_metric_name='validation:auc',
    hyperparameter_ranges=hyperparameter_ranges,
    max_jobs=20,
    max_parallel_jobs=5,
    train_data_s3=TRAIN_DATA_S3,
    validation_data_s3=VALIDATION_DATA_S3,
)

print('Bayesian optimization complete.')
print(f'  Best training job: {bayesian_results["best_training_job"]}')
print(f'  Tuning job name:   {bayesian_results["tuning_job_name"]}')

### 6.1 Retrieve Best Hyperparameters from SageMaker Tuning

In [None]:
print('=== Best Hyperparameters (SageMaker Bayesian Optimization) ===')
for param, value in bayesian_results['best_params'].items():
    print(f'  {param}: {value}')

## 7. Compare Tuning Methods

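Compare the best result from each local method side by side to decide which
configuration to promote to production. The SageMaker tuning job is left out of the
table because it optimizes `validation:auc`, which is not directly comparable with the
accuracy scores used by grid search and random search.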

In [None]:
comparison = {
    'Grid Search': {
        'best_score': grid_results['best_score'],
        'best_params': grid_results['best_params'],
        'trials': len(grid_results['all_results']),
    },
    'Random Search': {
        'best_score': random_results['best_score'],
        'best_params': random_results['best_params'],
        'trials': len(random_results['all_results']),
    },
}

print(f'{"Method":<20} {"Best Score":<15} {"Trials"}')
print('-' * 50)
for method, info in comparison.items():
    print(f'{method:<20} {info["best_score"]:<15.4f} {info["trials"]}')

# Determine overall winner between local methods
winner = max(comparison, key=lambda m: comparison[m]['best_score'])
print(f'\nBest local method: {winner} (score: {comparison[winner]["best_score"]:.4f})')

In [None]:
# Bar chart comparing best scores
methods = list(comparison.keys())
scores = [comparison[m]['best_score'] for m in methods]

plt.figure(figsize=(8, 5))
bars = plt.bar(methods, scores, color=['steelblue', 'darkorange'])
plt.ylabel('Best Accuracy')
plt.title('Tuning Method Comparison — Best Accuracy')
for bar, score in zip(bars, scores):
    plt.text(bar.get_x() + bar.get_width() / 2, bar.get_height(),
             f'{score:.4f}', ha='center', va='bottom')
plt.ylim(min(scores) - 0.01, max(scores) + 0.01)
plt.tight_layout()
plt.show()
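
The comparison above ranks configurations by accuracy. If fraud cases are rare, as is typical for this kind of data, accuracy alone can hide poor fraud detection. The sketch below uses made-up labels (not results from this notebook) to show a trivial "never fraud" baseline beating a model that catches most fraud. The helper names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def recall(y_true, y_pred):
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Synthetic example: 1,000 transactions, 5 of them fraudulent.
y_true = [1] * 5 + [0] * 995

# Baseline that labels every transaction as legitimate.
y_baseline = [0] * 1000

# Model that catches 4 of 5 fraud cases and raises 10 false alarms.
y_model = [1, 1, 1, 1, 0] + [1] * 10 + [0] * 985

print(f'baseline: accuracy={accuracy(y_true, y_baseline):.3f}, recall={recall(y_true, y_baseline):.3f}')
print(f'model:    accuracy={accuracy(y_true, y_model):.3f}, recall={recall(y_true, y_model):.3f}')
```

In this toy case the baseline scores 0.995 accuracy with zero recall, while the model scores 0.989 accuracy with 0.8 recall, so it may be worth checking recall, precision, or AUC before promoting a configuration.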

## 8. Query Experiment History

Use the ExperimentTracker to retrieve logged experiments and inspect past tuning runs.

**Requirement 3.4**: Log all parameter combinations and their performance metrics

In [None]:
# Query all grid search experiments
grid_experiments = tracker.query_experiments(
    experiment_name='xgboost-grid-search'
)

print(f'Found {len(grid_experiments)} grid search experiments in tracker.')
if grid_experiments:
    print(f'Sample experiment keys: {list(grid_experiments[0].keys())}')

In [None]:
# Query random search experiments
random_experiments = tracker.query_experiments(
    experiment_name='xgboost-random-search'
)

print(f'Found {len(random_experiments)} random search experiments in tracker.')

## Summary

This notebook demonstrated three hyperparameter tuning strategies:

1. **Grid Search** — exhaustive evaluation of all parameter combinations, best for small search spaces.
2. **Random Search** — efficient sampling with callable distributions (lambdas), better for high-dimensional spaces.
3. **SageMaker Automatic Model Tuning** — Bayesian optimization with parallel training jobs for large-scale tuning.

All trials were logged to the ExperimentTracker for full reproducibility. The best
hyperparameters from any method can be promoted to production using the
`ProductionIntegrator` (see notebook 05).

Next steps: compare algorithms (notebook 03) or promote the winning configuration to production (notebook 05).