diff --git a/examples/optuna/README.md b/examples/optuna/README.md new file mode 100644 index 00000000..f045a969 --- /dev/null +++ b/examples/optuna/README.md @@ -0,0 +1,262 @@ +# Optuna Sampler Examples + +This directory contains comprehensive examples demonstrating each Optuna sampler available in Hyperactive. Each example shows the sampler's behavior, characteristics, and best use cases. + +## Quick Start + +Run any example directly: +```bash +python tpe_sampler_example.py +python random_sampler_example.py +# ... etc +``` + +## Sampler Overview + +| Sampler | Type | Best For | Characteristics | +|---------|------|----------|----------------| +| [TPESampler](tpe_sampler_example.py) | Bayesian | General use | Default choice, good balance | +| [RandomSampler](random_sampler_example.py) | Random | Baselines, noisy objectives | Simple, parallel-friendly | +| [CmaEsSampler](cmaes_sampler_example.py) | Evolution Strategy | Continuous optimization | Learns parameter correlations | +| [GPSampler](gp_sampler_example.py) | Bayesian | Expensive evaluations | Uncertainty quantification | +| [GridSampler](grid_sampler_example.py) | Exhaustive | Small spaces | Systematic coverage | +| [NSGAIISampler](nsga_ii_sampler_example.py) | Multi-objective | 2 objectives | Pareto optimization | +| [NSGAIIISampler](nsga_iii_sampler_example.py) | Multi-objective | 3+ objectives | Many-objective problems | +| [QMCSampler](qmc_sampler_example.py) | Quasi-random | Space exploration | Low-discrepancy sequences | + +## Detailed Examples + +### 1. TPESampler - Tree-structured Parzen Estimator +**File:** `tpe_sampler_example.py` + +The default and most popular choice. Uses Bayesian optimization to model good vs bad parameter regions. + +```python +from hyperactive.opt.optuna import TPESampler + +optimizer = TPESampler( + param_space=param_space, + n_trials=50, + random_state=42, + n_startup_trials=10, # Random trials before TPE + initialize={"warm_start": [good_params]} +) +``` + +**Best for:** General hyperparameter optimization, mixed parameter types + +### 2. RandomSampler - Pure Random Search +**File:** `random_sampler_example.py` + +Simple random sampling, surprisingly effective and good baseline. + +```python +from hyperactive.opt.optuna import RandomSampler + +optimizer = RandomSampler( + param_space=param_space, + n_trials=30, + random_state=42 +) +``` + +**Best for:** Baselines, noisy objectives, high-dimensional spaces + +### 3. CmaEsSampler - Covariance Matrix Adaptation +**File:** `cmaes_sampler_example.py` + +Evolution strategy that adapts search distribution shape. Requires `pip install cmaes`. + +```python +from hyperactive.opt.optuna import CmaEsSampler + +optimizer = CmaEsSampler( + param_space=continuous_params, # Only continuous! + n_trials=40, + sigma0=0.2, # Initial step size + random_state=42 +) +``` + +**Best for:** Continuous optimization, parameter correlations + +### 4. GPSampler - Gaussian Process Optimization +**File:** `gp_sampler_example.py` + +Bayesian optimization with uncertainty quantification. + +```python +from hyperactive.opt.optuna import GPSampler + +optimizer = GPSampler( + param_space=param_space, + n_trials=25, + n_startup_trials=8, + deterministic_objective=False +) +``` + +**Best for:** Expensive evaluations, uncertainty-aware optimization + +### 5. GridSampler - Exhaustive Grid Search +**File:** `grid_sampler_example.py` + +Systematic evaluation of discrete parameter grids. 
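+
+Because the grid is evaluated exhaustively, `n_trials` should equal the number
+of grid combinations. A quick way to compute it from the discrete grids
+(a minimal sketch; `param_space` here refers to the dictionary defined in the
+snippet below):
+
+```python
+total_combinations = 1
+for values in param_space.values():
+    total_combinations *= len(values)  # 5 * 2 * 2 = 20 for the grid below
+```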
+ +```python +from hyperactive.opt.optuna import GridSampler + +param_space = { + "n_neighbors": [1, 3, 5, 7, 11], # Discrete values only + "weights": ["uniform", "distance"], + "metric": ["euclidean", "manhattan"] +} + +optimizer = GridSampler( + param_space=param_space, + n_trials=total_combinations +) +``` + +**Best for:** Small discrete spaces, exhaustive analysis + +### 6. NSGAIISampler - Multi-objective (2 objectives) +**File:** `nsga_ii_sampler_example.py` + +Multi-objective optimization for two conflicting objectives. + +```python +from hyperactive.opt.optuna import NSGAIISampler + +# Multi-objective experiment returning [obj1, obj2] +optimizer = NSGAIISampler( + param_space=param_space, + n_trials=50, + population_size=20, + mutation_prob=0.1, + crossover_prob=0.9 +) +``` + +**Best for:** Trade-off analysis, accuracy vs complexity + +### 7. NSGAIIISampler - Many-objective (3+ objectives) +**File:** `nsga_iii_sampler_example.py` + +Many-objective optimization using reference points. + +```python +from hyperactive.opt.optuna import NSGAIIISampler + +# Many-objective experiment returning [obj1, obj2, obj3, obj4] +optimizer = NSGAIIISampler( + param_space=param_space, + n_trials=60, + population_size=24 +) +``` + +**Best for:** 3+ objectives, complex trade-offs + +### 8. QMCSampler - Quasi-Monte Carlo +**File:** `qmc_sampler_example.py` + +Low-discrepancy sequences for uniform space filling. + +```python +from hyperactive.opt.optuna import QMCSampler + +optimizer = QMCSampler( + param_space=param_space, + n_trials=32, # Power of 2 recommended + qmc_type="sobol", + scramble=True +) +``` + +**Best for:** Space exploration, design of experiments + +## Common Features + +All samplers support: +- **Random state:** `random_state=42` for reproducibility +- **Early stopping:** `early_stopping=10` stop after N trials without improvement +- **Max score:** `max_score=0.99` stop when target reached +- **Warm start:** `initialize={"warm_start": [points]}` initial good points + +## Choosing the Right Sampler + +### Quick Decision Tree + +1. **Multiple objectives?** + - 2 objectives → NSGAIISampler + - 3+ objectives → NSGAIIISampler + +2. **Single objective:** + - Need baseline/comparison → RandomSampler + - Small discrete space → GridSampler + - Expensive evaluations → GPSampler + - Only continuous params → CmaEsSampler + - Space exploration → QMCSampler + - General case → **TPESampler** (recommended) + +### Computational Budget + +- **Low budget (< 50 trials):** RandomSampler, QMCSampler +- **Medium budget (50-200 trials):** TPESampler, GPSampler +- **High budget (200+ trials):** CmaEsSampler, GridSampler + +### Parameter Types + +- **Mixed types:** TPESampler, GPSampler, RandomSampler +- **Continuous only:** CmaEsSampler +- **Discrete only:** GridSampler + +## Advanced Usage + +### Combining Samplers + +```python +# Phase 1: Initial exploration +qmc_optimizer = QMCSampler(n_trials=20, ...) +initial_results = qmc_optimizer.run() + +# Phase 2: Focused optimization +tpe_optimizer = TPESampler( + n_trials=30, + initialize={"warm_start": [initial_results]} +) +final_results = tpe_optimizer.run() +``` + +### Multi-objective Analysis + +```python +# For multi-objective problems, you'll typically get multiple solutions +# along the Pareto front. Choose based on your preferences: + +solutions = nsga_ii_optimizer.run() +# In practice, you'd analyze the trade-off curve +``` + +## Dependencies + +Most samplers work out of the box. 
Additional dependencies: +- **CmaEsSampler:** `pip install cmaes` +- All others: Only require `optuna` (included with Hyperactive) + +## Performance Tips + +1. **Start with TPESampler** for general problems +2. **Use RandomSampler** as baseline comparison +3. **Powers of 2** for QMCSampler trials (32, 64, 128, etc.) +4. **Warm start** with good initial points when available +5. **Early stopping** to avoid wasted evaluations +6. **Random state** for reproducible experiments + +## Further Reading + +- [Optuna Documentation](https://optuna.readthedocs.io/) +- [Hyperactive Documentation](https://hyperactive.readthedocs.io/) +- [Bayesian Optimization Review](https://arxiv.org/abs/1807.02811) +- [Multi-objective Optimization Survey](https://arxiv.org/abs/1909.04109) diff --git a/examples/optuna/cmaes_sampler_example.py b/examples/optuna/cmaes_sampler_example.py new file mode 100644 index 00000000..7e83019f --- /dev/null +++ b/examples/optuna/cmaes_sampler_example.py @@ -0,0 +1,165 @@ +""" +CmaEsSampler Example - Covariance Matrix Adaptation Evolution Strategy + +CMA-ES is a powerful evolution strategy particularly effective for continuous +optimization problems. It adapts both the mean and covariance matrix of a +multivariate normal distribution to efficiently explore the parameter space. + +Characteristics: +- Excellent for continuous parameter optimization +- Adapts search distribution shape and orientation +- Self-adaptive step size control +- Handles ill-conditioned problems well +- Does not work with categorical parameters +- Requires 'cmaes' package: pip install cmaes + +Note: This example includes a fallback if 'cmaes' package is not installed. +""" + +import numpy as np +from sklearn.datasets import make_regression +from sklearn.neural_network import MLPRegressor +from sklearn.model_selection import cross_val_score +from sklearn.metrics import mean_squared_error + +from hyperactive.experiment.integrations import SklearnCvExperiment +from hyperactive.opt.optuna import CmaEsSampler + + +def cmaes_theory(): + """Explain CMA-ES algorithm theory.""" + # CMA-ES Algorithm Theory: + # 1. Maintains a multivariate normal distribution N(μ, σ²C) + # - μ: mean vector (center of search) + # - σ: step size (global scaling) + # - C: covariance matrix (shape and orientation) + # + # 2. In each generation: + # - Sample λ offspring from N(μ, σ²C) + # - Evaluate all offspring + # - Select μ best solutions + # - Update μ, σ, and C based on selected solutions + # + # 3. Adaptive features: + # - Covariance matrix learns correlations between parameters + # - Step size adapts to local landscape + # - Handles rotated/scaled problems efficiently + + +def main(): + # === CmaEsSampler Example === + # Covariance Matrix Adaptation Evolution Strategy + + # Check if cmaes is available + try: + import cmaes + + cmaes_available = True + print(" CMA-ES package is available") + except ImportError: + cmaes_available = False + print("⚠ CMA-ES package not available. 
Install with: pip install cmaes") + print(" This example will demonstrate the interface but may fail at runtime.") + print() + + cmaes_theory() + + # Create a continuous optimization problem + X, y = make_regression(n_samples=200, n_features=10, noise=0.1, random_state=42) + print( + f"Dataset: Synthetic regression ({X.shape[0]} samples, {X.shape[1]} features)" + ) + + # Create experiment - neural network with continuous parameters + estimator = MLPRegressor(random_state=42, max_iter=1000) + experiment = SklearnCvExperiment( + estimator=estimator, X=X, y=y, cv=3, scoring="neg_mean_squared_error" + ) + + # Define search space - ONLY continuous parameters (CMA-ES limitation) + param_space = { + "alpha": (1e-6, 1e-1), # L2 regularization + "learning_rate_init": (1e-4, 1e-1), # Initial learning rate + "beta_1": (0.8, 0.99), # Adam beta1 parameter + "beta_2": (0.9, 0.999), # Adam beta2 parameter + "epsilon": (1e-9, 1e-6), # Adam epsilon parameter + # Note: No categorical parameters - CMA-ES doesn't support them + } + + # Search Space (Continuous parameters only): + # for param, space in param_space.items(): + # print(f" {param}: {space}") + # Note: CMA-ES only works with continuous parameters + # For mixed parameter types, consider TPESampler or GPSampler + + # Configure CmaEsSampler + optimizer = CmaEsSampler( + param_space=param_space, + n_trials=40, + random_state=42, + experiment=experiment, + sigma0=0.2, # Initial step size (exploration vs exploitation) + n_startup_trials=5, # Random trials before CMA-ES starts + ) + + # CmaEsSampler Configuration: + # n_trials: configured above + # sigma0: initial step size + # n_startup_trials: random trials before CMA-ES starts + # Adaptive covariance matrix will be learned during optimization + + if not cmaes_available: + print("⚠ Skipping optimization due to missing 'cmaes' package") + print("Install with: pip install cmaes") + return None, None + + # Run optimization + # Running CMA-ES optimization... 
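+    # How run() works here, assuming the base adapter is used (see
+    # _BaseOptunaAdapter._run in this PR): it creates an Optuna study with
+    # direction="maximize" and the underlying Optuna sampler, optimizes the
+    # experiment as the objective, and stores the outcome in best_params_ /
+    # best_score_ before returning the best parameters.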
+ try: + best_params = optimizer.run() + + # Results + print("\n=== Results ===") + print(f"Best parameters: {best_params}") + print(f"Best score: {optimizer.best_score_:.4f}") + print() + + except ImportError as e: + print(f"CMA-ES failed: {e}") + print("Install the required package: pip install cmaes") + return None, None + + # CMA-ES Behavior Analysis: + # Evolution of search distribution: + # Initial: Spherical distribution (σ₀ * I) + # Early trials: Random exploration to gather information + # Mid-trials: Covariance matrix learns parameter correlations + # Later trials: Focused search along principal component directions + + # Adaptive Properties: + # Step size (σ) adapts to local topology + # Covariance matrix (C) learns parameter interactions + # Mean vector (μ) tracks promising regions + # Handles ill-conditioned and rotated problems + + # Best Use Cases: + # Continuous optimization problems + # Parameters with potential correlations + # Non-convex, multimodal functions + # When gradient information is unavailable + # Medium-dimensional problems (2-40 parameters) + + # Limitations: + # Only continuous parameters (no categorical/discrete) + # Requires additional 'cmaes' package + # Can be slower than TPE for simple problems + # Memory usage grows with parameter dimension + + if cmaes_available: + return best_params, optimizer.best_score_ + else: + return None, None + + +if __name__ == "__main__": + best_params, best_score = main() diff --git a/examples/optuna/gp_sampler_example.py b/examples/optuna/gp_sampler_example.py new file mode 100644 index 00000000..f7c2df1f --- /dev/null +++ b/examples/optuna/gp_sampler_example.py @@ -0,0 +1,162 @@ +""" +GPSampler Example - Gaussian Process Bayesian Optimization + +The GPSampler uses Gaussian Processes to model the objective function and +select promising parameter configurations. It's particularly effective for +expensive function evaluations and provides uncertainty estimates. + +Characteristics: +- Bayesian optimization with Gaussian Process surrogate model +- Balances exploration (high uncertainty) and exploitation (high mean) +- Works well with mixed parameter types +- Provides uncertainty quantification +- Efficient for expensive objective functions +- Can handle constraints and noisy observations +""" + +import numpy as np +from sklearn.datasets import load_breast_cancer +from sklearn.svm import SVC +from sklearn.model_selection import cross_val_score + +from hyperactive.experiment.integrations import SklearnCvExperiment +from hyperactive.opt.optuna import GPSampler + + +def gaussian_process_theory(): + """Explain Gaussian Process theory for optimization.""" + # Gaussian Process Bayesian Optimization: + # + # 1. Surrogate Model: + # - GP models f(x) ~ N(μ(x), σ²(x)) + # - μ(x): predicted mean (expected objective value) + # - σ²(x): predicted variance (uncertainty estimate) + # + # 2. Acquisition Function: + # - Balances exploration vs exploitation + # - Common choices: Expected Improvement (EI), Upper Confidence Bound (UCB) + # - Selects next point to evaluate: x_next = argmax acquisition(x) + # + # 3. Iterative Process: + # - Fit GP to observed data (x_i, f(x_i)) + # - Optimize acquisition function to find x_next + # - Evaluate f(x_next) + # - Update dataset and repeat + # + # 4. 
Key Advantages: + # - Uncertainty-aware: explores uncertain regions + # - Sample efficient: good for expensive evaluations + # - Principled: grounded in Bayesian inference + + +def main(): + # === GPSampler Example === + # Gaussian Process Bayesian Optimization + + gaussian_process_theory() + + # Load dataset - classification problem + X, y = load_breast_cancer(return_X_y=True) + print( + f"Dataset: Breast cancer classification ({X.shape[0]} samples, {X.shape[1]} features)" + ) + + # Create experiment + estimator = SVC(random_state=42) + experiment = SklearnCvExperiment(estimator=estimator, X=X, y=y, cv=5) + + # Define search space - mixed parameter types + param_space = { + "C": (0.01, 100), # Continuous - regularization + "gamma": (1e-6, 1e2), # Continuous - RBF parameter + "kernel": ["rbf", "poly", "sigmoid"], # Categorical + "degree": (2, 5), # Integer - polynomial degree + "coef0": (0.0, 1.0), # Continuous - kernel coefficient + } + + # Search Space (Mixed parameter types): + # for param, space in param_space.items(): + # print(f" {param}: {space}") + + # Configure GPSampler + optimizer = GPSampler( + param_space=param_space, + n_trials=25, # Fewer trials - GP is sample efficient + random_state=42, + experiment=experiment, + n_startup_trials=8, # Random initialization before GP modeling + deterministic_objective=False, # Set True if objective is noise-free + ) + + # GPSampler Configuration: + # n_trials: configured above + # n_startup_trials: random initialization + # deterministic_objective: configures noise handling + # Acquisition function: Expected Improvement (default) + + # Run optimization + # Running GP-based optimization... + best_params = optimizer.run() + + # Results + print("\n=== Results ===") + print(f"Best parameters: {best_params}") + print(f"Best score: {optimizer.best_score_:.4f}") + print() + + # GP Optimization Phases: + # + # Phase 1 (Trials 1-8): Random Exploration + # Random sampling for initial GP training data + # Builds diverse set of observations + # No model assumptions yet + + # Phase 2 (Trials 9-25): GP-guided Search + # GP model learns from observed data + # Acquisition function balances: + # - Exploitation: areas with high predicted performance + # - Exploration: areas with high uncertainty + # Sequential decision making with uncertainty + + # GP Model Characteristics: + # Handles mixed parameter types (continuous, discrete, categorical) + # Provides uncertainty estimates for all predictions + # Automatically balances exploration vs exploitation + # Sample efficient - good for expensive evaluations + # Can incorporate prior knowledge through mean/kernel functions + + # Acquisition Function Behavior: + # High mean + low variance → exploitation + # Low mean + high variance → exploration + # Balanced trade-off prevents premature convergence + # Adapts exploration strategy based on observed data + + # Best Use Cases: + # Expensive objective function evaluations + # Small to medium parameter spaces (< 20 dimensions) + # When uncertainty quantification is valuable + # Mixed parameter types (continuous + categorical) + # Noisy objective functions (with appropriate kernel) + + # Limitations: + # Computational cost grows with number of observations + # Hyperparameter tuning for GP kernel + # May struggle in very high dimensions + # Assumes some smoothness in objective function + + # Comparison with TPESampler: + # GPSampler advantages: + # + Principled uncertainty quantification + # + Better for expensive evaluations + # + Can handle constraints naturally + # + 
# TPESampler advantages: + # + Faster computation + # + Better scalability to high dimensions + # + More robust hyperparameter defaults + + return best_params, optimizer.best_score_ + + +if __name__ == "__main__": + best_params, best_score = main() diff --git a/examples/optuna/grid_sampler_example.py b/examples/optuna/grid_sampler_example.py new file mode 100644 index 00000000..b604f231 --- /dev/null +++ b/examples/optuna/grid_sampler_example.py @@ -0,0 +1,189 @@ +""" +GridSampler Example - Exhaustive Grid Search + +The GridSampler performs exhaustive search over a discretized parameter grid. +It systematically evaluates every combination of specified parameter values, +ensuring complete coverage but potentially requiring many evaluations. + +Characteristics: +- Exhaustive search over predefined parameter grids +- Systematic and reproducible exploration +- Guarantees finding the best combination within the grid +- No learning or adaptation +- Best for small, discrete parameter spaces +- Interpretable and deterministic results +""" + +import numpy as np +from sklearn.datasets import load_iris +from sklearn.neighbors import KNeighborsClassifier +from sklearn.model_selection import cross_val_score + +from hyperactive.experiment.integrations import SklearnCvExperiment +from hyperactive.opt.optuna import GridSampler + + +def grid_search_theory(): + """Explain grid search methodology.""" + # Grid Search Methodology: + # + # 1. Parameter Discretization: + # - Each continuous parameter divided into discrete levels + # - Categorical parameters use all specified values + # - Creates n₁ × n₂ × ... × nₖ total combinations + # + # 2. Systematic Evaluation: + # - Every combination evaluated exactly once + # - No randomness or learning involved + # - Order of evaluation is deterministic + # + # 3. Optimality Guarantees: + # - Finds global optimum within the discrete grid + # - Quality depends on grid resolution + # - May miss optimal values between grid points + # + # 4. Computational Complexity: + # - Exponential growth with number of parameters + # - Curse of dimensionality for many parameters + # - Embarrassingly parallel + + +def demonstrate_curse_of_dimensionality(): + """Show how grid search scales with dimensions.""" + # Grid Search Scaling (Curse of Dimensionality): + # + # scenarios = [ + # (2, 5, "2 parameters × 5 values each"), + # (3, 5, "3 parameters × 5 values each"), + # (4, 5, "4 parameters × 5 values each"), + # (5, 10, "5 parameters × 10 values each"), + # (10, 3, "10 parameters × 3 values each"), + # ] + # + # for n_params, n_values, description in scenarios: + # total_combinations = n_values ** n_params + # print(f" {description}: {total_combinations:,} combinations") + # + # → Grid search works best with small parameter spaces! 
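+
+
+# A runnable sketch of the scaling illustrated in the comments above; the
+# scenarios are purely illustrative and not tied to the search space in main().
+# It is not called by default; run it alongside demonstrate_curse_of_dimensionality()
+# to see the numbers.
+def demonstrate_grid_scaling():
+    """Print the number of grid combinations for a few hypothetical scenarios."""
+    scenarios = [
+        (2, 5, "2 parameters x 5 values each"),
+        (3, 5, "3 parameters x 5 values each"),
+        (4, 5, "4 parameters x 5 values each"),
+        (5, 10, "5 parameters x 10 values each"),
+        (10, 3, "10 parameters x 3 values each"),
+    ]
+    for n_params, n_values, description in scenarios:
+        total_combinations = n_values**n_params  # full factorial grid size
+        print(f"  {description}: {total_combinations:,} combinations")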
+ + +def main(): + # === GridSampler Example === + # Exhaustive Grid Search + + grid_search_theory() + demonstrate_curse_of_dimensionality() + + # Load dataset - simple classification + X, y = load_iris(return_X_y=True) + print(f"Dataset: Iris classification ({X.shape[0]} samples, {X.shape[1]} features)") + + # Create experiment + estimator = KNeighborsClassifier() + experiment = SklearnCvExperiment(estimator=estimator, X=X, y=y, cv=5) + + # Define search space - DISCRETE values only for grid search + param_space = { + "n_neighbors": [1, 3, 5, 7, 11, 15, 21], # 7 values + "weights": ["uniform", "distance"], # 2 values + "metric": ["euclidean", "manhattan", "minkowski"], # 3 values + "p": [1, 2], # Only relevant for minkowski metric # 2 values + } + + # Total combinations: 7 × 2 × 3 × 2 = 84 combinations + total_combinations = 1 + for param, values in param_space.items(): + total_combinations *= len(values) + + # Search Space (Discrete grids only): + # for param, values in param_space.items(): + # print(f" {param}: {values} ({len(values)} values)") + # Total combinations: calculated above + + # Configure GridSampler + optimizer = GridSampler( + param_space=param_space, + n_trials=total_combinations, # Will evaluate all combinations + random_state=42, # For deterministic ordering + experiment=experiment, + ) + + # GridSampler Configuration: + # n_trials: matches total combinations + # search_space: automatically derived from param_space + # Systematic evaluation of every combination + + # Run optimization + # Running exhaustive grid search... + best_params = optimizer.run() + + # Results + print("\n=== Results ===") + print(f"Best parameters: {best_params}") + print(f"Best score: {optimizer.best_score_:.4f}") + print() + + # Grid Search Characteristics: + # + # Exhaustive Coverage: + # - Evaluated all parameter combinations + # - Guaranteed to find best configuration within grid + # - No risk of missing good regions + + # Reproducibility: + # - Same grid → same results every time + # - Deterministic evaluation order + # - No randomness or hyperparameters + + # Interpretability: + # - Easy to understand methodology + # - Clear relationship between grid density and accuracy + # - Results easily visualized and analyzed + + # Grid Design Considerations: + # + # Parameter Value Selection: + # Include reasonable ranges for each parameter + # Use domain knowledge to choose meaningful values + # Consider logarithmic spacing for scale-sensitive parameters + # Start coarse, then refine around promising regions + # + # Computational Budget: + # Balance grid density with available compute + # Consider parallel evaluation to speed up + # Use coarse grids for initial exploration + # + # Best Use Cases: + # Small parameter spaces (< 6 parameters) + # Discrete/categorical parameters + # When exhaustive evaluation is feasible + # Baseline comparison for other methods + # When interpretability is crucial + # Parallel computing environments + # + # Limitations: + # Exponential scaling with parameter count + # May miss optimal values between grid points + # Inefficient for continuous parameters + # No adaptive learning or focusing + # Can waste evaluations in clearly bad regions + # + # Grid Search vs Other Methods: + # + # vs Random Search: + # + Systematic coverage guarantee + # + Reproducible results + # - Exponential scaling + # - Less efficient in high dimensions + # + # vs Bayesian Optimization: + # + No assumptions about objective function + # + Guaranteed to find grid optimum + # - Much less sample 
efficient + # - No learning from previous evaluations + + return best_params, optimizer.best_score_ + + +if __name__ == "__main__": + best_params, best_score = main() diff --git a/examples/optuna/nsga_ii_sampler_example.py b/examples/optuna/nsga_ii_sampler_example.py new file mode 100644 index 00000000..f866a193 --- /dev/null +++ b/examples/optuna/nsga_ii_sampler_example.py @@ -0,0 +1,212 @@ +""" +NSGAIISampler Example - Multi-objective Optimization with NSGA-II + +NSGA-II (Non-dominated Sorting Genetic Algorithm II) is designed for +multi-objective optimization problems where you want to optimize multiple +conflicting objectives simultaneously. It finds a Pareto front of solutions. + +Characteristics: +- Multi-objective evolutionary algorithm +- Finds Pareto-optimal solutions (non-dominated set) +- Balances multiple conflicting objectives +- Population-based search with selection pressure +- Elitist approach preserving best solutions +- Crowding distance for diversity preservation + +Note: For demonstration, we'll create a multi-objective problem from +a single-objective one by optimizing both performance and model complexity. +""" + +import numpy as np +from sklearn.datasets import load_digits +from sklearn.ensemble import RandomForestClassifier +from sklearn.model_selection import cross_val_score + +from hyperactive.experiment.integrations import SklearnCvExperiment +from hyperactive.opt.optuna import NSGAIISampler + + +class MultiObjectiveExperiment: + """Multi-objective experiment: maximize accuracy, minimize complexity.""" + + def __init__(self, X, y): + self.X = X + self.y = y + + def __call__(self, **params): + # Create model with parameters + model = RandomForestClassifier(random_state=42, **params) + + # Objective 1: Maximize accuracy (we'll return negative for minimization) + scores = cross_val_score(model, self.X, self.y, cv=3) + accuracy = np.mean(scores) + + # Objective 2: Minimize model complexity (number of parameters) + # For Random Forest: roughly n_estimators × max_depth × n_features + complexity = ( + params["n_estimators"] * params.get("max_depth", 10) * self.X.shape[1] + ) + + # NSGA-II minimizes objectives, so we return both as minimization + # Note: This is a simplified multi-objective setup for demonstration + return [-accuracy, complexity / 10000] # Scale complexity for better balance + + +def nsga_ii_theory(): + """Explain NSGA-II algorithm theory.""" + # NSGA-II Algorithm (Multi-objective Optimization): + # + # 1. Core Concepts: + # - Pareto Dominance: Solution A dominates B if A is better in all objectives + # - Pareto Front: Set of non-dominated solutions + # - Trade-offs: Improving one objective may worsen another + # + # 2. NSGA-II Process: + # - Initialize population randomly + # - For each generation: + # a) Fast non-dominated sorting (rank solutions by dominance) + # b) Crowding distance calculation (preserve diversity) + # c) Selection based on rank and crowding distance + # d) Crossover and mutation to create offspring + # + # 3. Selection Criteria: + # - Primary: Non-domination rank (prefer better fronts) + # - Secondary: Crowding distance (prefer diverse solutions) + # - Elitist: Best solutions always survive + # + # 4. 
Output: + # - Set of Pareto-optimal solutions + # - User chooses final solution based on preferences + + +def main(): + # === NSGAIISampler Example === + # Multi-objective Optimization with NSGA-II + + nsga_ii_theory() + + # Load dataset + X, y = load_digits(return_X_y=True) + print(f"Dataset: Handwritten digits ({X.shape[0]} samples, {X.shape[1]} features)") + + # Create multi-objective experiment + experiment = MultiObjectiveExperiment(X, y) + + # Multi-objective Problem: + # Objective 1: Maximize classification accuracy + # Objective 2: Minimize model complexity + # → Trade-off between performance and simplicity + + # Define search space + param_space = { + "n_estimators": (10, 200), # Number of trees + "max_depth": (1, 20), # Tree depth (complexity) + "min_samples_split": (2, 20), # Minimum samples to split + "min_samples_leaf": (1, 10), # Minimum samples per leaf + "max_features": ["sqrt", "log2", None], # Feature sampling + } + + # Search Space: + # for param, space in param_space.items(): + # print(f" {param}: {space}") + + # Configure NSGAIISampler + optimizer = NSGAIISampler( + param_space=param_space, + n_trials=50, # Population evolves over multiple generations + random_state=42, + experiment=experiment, + population_size=20, # Population size for genetic algorithm + mutation_prob=0.1, # Mutation probability + crossover_prob=0.9, # Crossover probability + ) + + # NSGAIISampler Configuration: + # n_trials: configured above + # population_size: for genetic algorithm + # mutation_prob: mutation probability + # crossover_prob: crossover probability + # Selection: Non-dominated sorting + crowding distance + + # Note: This example demonstrates the interface. + # In practice, NSGA-II returns multiple Pareto-optimal solutions. + # For single-objective problems, consider TPE or GP samplers instead. + + # Run optimization + # Running NSGA-II multi-objective optimization... 
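+    # Note (assuming the base adapter's study setup is used): a true
+    # multi-objective Optuna study would be created with
+    # directions=["minimize", "minimize"], whereas _BaseOptunaAdapter._run in
+    # this PR creates a single-objective study with direction="maximize". That
+    # mismatch is why the call below is wrapped in try/except and treated as an
+    # interface demonstration.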
+ + try: + best_params = optimizer.run() + + # Results + print("\n=== Results ===") + print(f"Best parameters: {best_params}") + print(f"Best score: {optimizer.best_score_:.4f}") + print() + + # NSGA-II typically returns multiple solutions along Pareto front: + # High accuracy, high complexity models + # Medium accuracy, medium complexity models + # Lower accuracy, low complexity models + # User selects based on preferences/constraints + + except Exception as e: + print(f"Multi-objective optimization example: {e}") + print("Note: This demonstrates the interface for multi-objective problems.") + return None, None + + # NSGA-II Evolution Process: + # + # Generation 1: Random initialization + # Diverse population across parameter space + # Wide range of accuracy/complexity trade-offs + + # Generations 2-N: Evolutionary improvement + # Non-dominated sorting identifies best fronts + # Crowding distance maintains solution diversity + # Crossover combines good solutions + # Mutation explores new parameter regions + + # Final Population: Pareto front approximation + # Multiple non-dominated solutions + # Represents optimal trade-offs + # User chooses based on domain requirements + + # Key Advantages: + # Handles multiple conflicting objectives naturally + # Finds diverse set of optimal trade-offs + # No need to specify objective weights a priori + # Provides insight into objective relationships + # Robust to objective scaling differences + + # Best Use Cases: + # True multi-objective problems (accuracy vs speed, cost vs quality) + # When trade-offs between objectives are important + # Robustness analysis with multiple criteria + # When single objective formulation is unclear + + # Limitations: + # More complex than single-objective methods + # Requires more evaluations (population-based) + # May be overkill for single-objective problems + # Final solution selection still required + + # When to Use NSGA-II vs Single-objective Methods: + # Use NSGA-II when: + # Multiple objectives genuinely conflict + # Trade-off analysis is valuable + # Objective weights are unknown + # + # Use TPE/GP when: + # Single clear objective + # Computational budget is limited + # Faster convergence needed + + if "best_params" in locals(): + return best_params, optimizer.best_score_ + else: + return None, None + + +if __name__ == "__main__": + best_params, best_score = main() diff --git a/examples/optuna/nsga_iii_sampler_example.py b/examples/optuna/nsga_iii_sampler_example.py new file mode 100644 index 00000000..d8c0387f --- /dev/null +++ b/examples/optuna/nsga_iii_sampler_example.py @@ -0,0 +1,238 @@ +""" +NSGAIIISampler Example - Many-objective Optimization with NSGA-III + +NSGA-III is an extension of NSGA-II specifically designed for many-objective +optimization problems (typically 3+ objectives). It uses reference points +to maintain diversity and selection pressure in high-dimensional objective spaces. + +Characteristics: +- Many-objective evolutionary algorithm (3+ objectives) +- Reference point-based selection mechanism +- Better performance than NSGA-II for many objectives +- Maintains diversity through structured reference points +- Elitist approach with improved selection pressure +- Population-based search with normalization + +Note: For demonstration, we'll create a many-objective problem optimizing +accuracy, complexity, training time, and model interpretability. 
+""" + +import numpy as np +from sklearn.datasets import load_breast_cancer +from sklearn.tree import DecisionTreeClassifier +from sklearn.model_selection import cross_val_score +import time + +from hyperactive.experiment.integrations import SklearnCvExperiment +from hyperactive.opt.optuna import NSGAIIISampler + + +class ManyObjectiveExperiment: + """Many-objective experiment: optimize multiple conflicting goals.""" + + def __init__(self, X, y): + self.X = X + self.y = y + + def __call__(self, **params): + # Create model with parameters + model = DecisionTreeClassifier(random_state=42, **params) + + # Objective 1: Maximize accuracy (return negative for minimization) + start_time = time.time() + scores = cross_val_score(model, self.X, self.y, cv=3) + training_time = time.time() - start_time + accuracy = np.mean(scores) + + # Objective 2: Minimize model complexity (tree depth) + complexity = params.get("max_depth", 20) + + # Objective 3: Minimize training time + time_objective = training_time + + # Objective 4: Maximize interpretability (minimize tree size) + # Approximate tree size based on parameters + max_leaf_nodes = params.get("max_leaf_nodes", 100) + interpretability = max_leaf_nodes / 100.0 # Normalized + + # Return all objectives for minimization (negative accuracy for maximization) + return [ + -accuracy, # Minimize negative accuracy (maximize accuracy) + complexity / 20.0, # Minimize complexity (normalized) + time_objective, # Minimize training time + interpretability, # Minimize tree size (maximize interpretability) + ] + + +def nsga_iii_theory(): + """Explain NSGA-III algorithm theory.""" + # NSGA-III Algorithm (Many-objective Optimization): + # + # 1. Many-objective Challenge: + # - With 3+ objectives, most solutions become non-dominated + # - Traditional Pareto ranking loses selection pressure + # - Crowding distance becomes less effective + # - Need structured diversity preservation + # + # 2. NSGA-III Innovations: + # - Reference points on normalized hyperplane + # - Associate solutions with reference points + # - Select solutions to maintain balanced distribution + # - Adaptive normalization for different objective scales + # + # 3. Reference Point Strategy: + # - Systematic placement on unit simplex + # - Each reference point guides search direction + # - Solutions clustered around reference points + # - Maintains diversity across objective space + # + # 4. 
Selection Mechanism: + # - Non-dominated sorting (like NSGA-II) + # - Reference point association + # - Niche count balancing + # - Preserve solutions near each reference point + + +def main(): + # === NSGAIIISampler Example === + # Many-objective Optimization with NSGA-III + + nsga_iii_theory() + + # Load dataset + X, y = load_breast_cancer(return_X_y=True) + print( + f"Dataset: Breast cancer classification ({X.shape[0]} samples, {X.shape[1]} features)" + ) + + # Create many-objective experiment + experiment = ManyObjectiveExperiment(X, y) + + # Many-objective Problem (4 objectives): + # Objective 1: Maximize classification accuracy + # Objective 2: Minimize model complexity (tree depth) + # Objective 3: Minimize training time + # Objective 4: Maximize interpretability (smaller trees) + # → Complex trade-offs between multiple conflicting goals + + # Define search space + param_space = { + "max_depth": (1, 20), # Tree depth + "min_samples_split": (2, 50), # Minimum samples to split + "min_samples_leaf": (1, 20), # Minimum samples per leaf + "max_leaf_nodes": (10, 200), # Maximum leaf nodes + "criterion": ["gini", "entropy"], # Split criterion + } + + # Search Space: + # for param, space in param_space.items(): + # print(f" {param}: {space}") + + # Configure NSGAIIISampler + optimizer = NSGAIIISampler( + param_space=param_space, + n_trials=60, # More trials needed for many objectives + random_state=42, + experiment=experiment, + population_size=24, # Larger population for many objectives + mutation_prob=0.1, # Mutation probability + crossover_prob=0.9, # Crossover probability + ) + + # NSGAIIISampler Configuration: + # n_trials: configured above + # population_size: larger for many objectives + # mutation_prob: mutation probability + # crossover_prob: crossover probability + # Selection: Reference point-based diversity preservation + + # Note: NSGA-III is designed for 3+ objectives. + # For 2 objectives, NSGA-II is typically preferred. + # This example demonstrates the interface for many-objective problems. + + # Run optimization + # Running NSGA-III many-objective optimization... 
+ + try: + best_params = optimizer.run() + + # Results + print("\n=== Results ===") + print(f"Best parameters: {best_params}") + print(f"Best score: {optimizer.best_score_:.4f}") + print() + + # NSGA-III produces a diverse set of solutions across 4D Pareto front: + # High accuracy, complex, slower models + # Balanced accuracy/complexity trade-offs + # Fast, simple, interpretable models + # Various combinations optimizing different objectives + + except Exception as e: + print(f"Many-objective optimization example: {e}") + print("Note: This demonstrates the interface for many-objective problems.") + return None, None + + # NSGA-III vs NSGA-II for Many Objectives: + # + # NSGA-II Limitations (3+ objectives): + # Most solutions become non-dominated + # Crowding distance loses effectiveness + # Selection pressure decreases + # Uneven distribution in objective space + + # NSGA-III Advantages: + # Reference points guide search directions + # Maintains diversity across all objectives + # Better selection pressure in many objectives + # Structured exploration of objective space + # Adaptive normalization handles different scales + + # Reference Point Mechanism: + # Systematic placement on normalized hyperplane + # Each point represents a different objective priority + # Solutions associated with nearest reference points + # Selection maintains balance across all points + # Prevents clustering in limited objective regions + + # Many-objective Problem Characteristics: + # + # Challenges: + # Exponential growth of non-dominated solutions + # Difficulty visualizing high-dimensional trade-offs + # User preference articulation becomes complex + # Increased computational requirements + + # Best Use Cases: + # Engineering design with multiple constraints + # Multi-criteria decision making (3+ criteria) + # Resource allocation problems + # System optimization with conflicting requirements + # When objective interactions are complex + # + # Algorithm Selection Guide: + # + # Use NSGA-III when: + # 3 or more objectives + # Objectives are truly conflicting + # Comprehensive trade-off analysis needed + # Reference point guidance is beneficial + # + # Use NSGA-II when: + # 2 objectives + # Simpler Pareto front structure + # Established performance for bi-objective problems + # + # Use single-objective methods when: + # Can formulate as weighted combination + # Clear primary objective with constraints + # Computational efficiency is critical + + if "best_params" in locals(): + return best_params, optimizer.best_score_ + else: + return None, None + + +if __name__ == "__main__": + best_params, best_score = main() diff --git a/examples/optuna/qmc_sampler_example.py b/examples/optuna/qmc_sampler_example.py new file mode 100644 index 00000000..ef4ef560 --- /dev/null +++ b/examples/optuna/qmc_sampler_example.py @@ -0,0 +1,214 @@ +""" +QMCSampler Example - Quasi-Monte Carlo Sampling + +The QMCSampler uses Quasi-Monte Carlo sequences (like Sobol or Halton) +to generate low-discrepancy samples. These sequences provide better +coverage of the parameter space compared to purely random sampling. 
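+
+For intuition, a Sobol low-discrepancy sequence can be generated directly with
+SciPy (an illustrative sketch, independent of Hyperactive):
+
+    from scipy.stats import qmc
+
+    sobol = qmc.Sobol(d=2, scramble=True, seed=42)
+    points = sobol.random(8)  # 8 points spread evenly over the unit square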
+ +Characteristics: +- Low-discrepancy sequences for uniform space filling +- Better convergence than random sampling +- Deterministic sequence generation +- Excellent space coverage properties +- No learning from previous evaluations +- Good baseline for comparison with adaptive methods + +QMC sequences are particularly effective for: +- Integration and sampling problems +- Initial design of experiments +- Baseline optimization comparisons +""" + +import numpy as np +from sklearn.datasets import load_wine +from sklearn.linear_model import LogisticRegression +from sklearn.model_selection import cross_val_score + +from hyperactive.experiment.integrations import SklearnCvExperiment +from hyperactive.opt.optuna import QMCSampler + + +def qmc_theory(): + """Explain Quasi-Monte Carlo theory.""" + # Quasi-Monte Carlo (QMC) Theory: + # + # 1. Low-Discrepancy Sequences: + # - Deterministic sequences that fill space uniformly + # - Better distribution than random sampling + # - Minimize gaps and clusters in parameter space + # - Convergence rate O(log^d(N)/N) vs O(1/√N) for random + # + # 2. Common QMC Sequences: + # - Sobol: Based on binary representations + # - Halton: Based on prime number bases + # - Faure: Generalization of Halton sequences + # - Each has different strengths for different dimensions + # + # 3. Space-Filling Properties: + # - Stratification: Even coverage of parameter regions + # - Low discrepancy: Uniform distribution approximation + # - Correlation breaking: Reduces clustering + # + # 4. Advantages over Random Sampling: + # - Better convergence for integration + # - More uniform exploration + # - Reproducible sequences + # - No unlucky clustering + + +def demonstrate_space_filling(): + """Demonstrate space-filling properties conceptually.""" + # Space-Filling Comparison: + # + # Random Sampling: + # Can have clusters and gaps + # Uneven coverage especially with few samples + # Variance in coverage quality + # Some regions may be under-explored + # + # Quasi-Monte Carlo (QMC): + # Systematic space filling + # Even coverage with fewer samples + # Consistent coverage quality + # All regions explored proportionally + # + # Grid Sampling: + # Perfect regular coverage + # Exponential scaling with dimensions + # May miss optimal points between grid lines + # + # → QMC provides balanced approach between random and grid + + +def main(): + # === QMCSampler Example === + # Quasi-Monte Carlo Low-Discrepancy Sampling + + qmc_theory() + demonstrate_space_filling() + + # Load dataset + X, y = load_wine(return_X_y=True) + print(f"Dataset: Wine classification ({X.shape[0]} samples, {X.shape[1]} features)") + + # Create experiment + estimator = LogisticRegression(random_state=42, max_iter=1000) + experiment = SklearnCvExperiment(estimator=estimator, X=X, y=y, cv=5) + + # Define search space + param_space = { + "C": (0.001, 100), # Regularization strength + "l1_ratio": (0.0, 1.0), # Elastic net mixing parameter + "solver": ["liblinear", "saga"], # Solver algorithm + "penalty": ["l1", "l2", "elasticnet"], # Regularization type + } + + # Search Space: + # C: (0.001, 100) - Regularization strength + # l1_ratio: (0.0, 1.0) - Elastic net mixing parameter + # solver: ['liblinear', 'saga'] - Solver algorithm + # penalty: ['l1', 'l2', 'elasticnet'] - Regularization type + + # Configure QMCSampler + optimizer = QMCSampler( + param_space=param_space, + n_trials=32, # Power of 2 often works well for QMC + random_state=42, + experiment=experiment, + qmc_type="sobol", # Sobol or Halton sequences + 
scramble=True, # Randomized QMC (Owen scrambling) + ) + + # QMCSampler Configuration: + # n_trials: 32 (power of 2 for better QMC properties) + # qmc_type: 'sobol' sequence + # scramble: True (randomized QMC) + # Deterministic low-discrepancy sampling + + # QMC Sequence Types: + # Sobol: Excellent for moderate dimensions, binary-based + # Halton: Good for low dimensions, prime-based + # Scrambling: Adds randomization while preserving uniformity + + # Run optimization + # Running QMC sampling optimization... + best_params = optimizer.run() + + # Results + print("\n=== Results ===") + print(f"Best parameters: {best_params}") + print(f"Best score: {optimizer.best_score_:.4f}") + print() + + # QMC behavior analysis: + # + # QMC Sampling Analysis: + # + # Sequence Properties: + # Deterministic generation (reproducible with same seed) + # Low-discrepancy (uniform distribution approximation) + # Space-filling (systematic coverage of parameter space) + # Stratification (even coverage of all regions) + # No clustering or large gaps + # + # Sobol Sequence Characteristics: + # Binary-based construction + # Good equidistribution properties + # Effective for dimensions up to ~40 + # Popular choice for QMC sampling + # + # Scrambling Benefits (when enabled): + # Breaks regularity patterns + # Provides Monte Carlo error estimates + # Maintains low-discrepancy properties + # Reduces correlation artifacts + # + # QMC vs Other Sampling Methods: + # + # vs Pure Random Sampling: + # + Better space coverage with fewer samples + # + More consistent performance + # + Faster convergence for integration-like problems + # - No true randomness (if needed for some applications) + # + # vs Grid Search: + # + Works well in higher dimensions + # + No exponential scaling + # + Covers continuous spaces naturally + # - No systematic guarantee of finding grid optimum + # + # vs Adaptive Methods (TPE, GP): + # + No assumptions about objective function + # + Embarrassingly parallel + # + Consistent performance regardless of function type + # - No learning from previous evaluations + # - May waste evaluations in clearly suboptimal regions + # + # Best Use Cases: + # Design of experiments (DoE) + # Initial exploration phase + # Baseline for comparing adaptive methods + # Integration and sampling problems + # When function evaluations are parallelizable + # Robustness testing across parameter space + # + # Implementation Considerations: + # Use powers of 2 for n_trials with Sobol sequences + # Consider scrambling for better statistical properties + # Choose sequence type based on dimensionality: + # - Sobol: Good general choice + # - Halton: Better for low dimensions (< 6) + # QMC works best with transformed uniform parameters + # + # Practical Recommendations: + # 1. Use QMC for initial exploration (first 20-50 evaluations) + # 2. Switch to adaptive methods (TPE/GP) for focused search + # 3. Use for sensitivity analysis across full parameter space + # 4. Good choice when unsure about objective function properties + # 5. Ideal for parallel evaluation scenarios + + return best_params, optimizer.best_score_ + + +if __name__ == "__main__": + best_params, best_score = main() diff --git a/examples/optuna/random_sampler_example.py b/examples/optuna/random_sampler_example.py new file mode 100644 index 00000000..1f1919af --- /dev/null +++ b/examples/optuna/random_sampler_example.py @@ -0,0 +1,119 @@ +""" +RandomSampler Example - Random Search + +The RandomSampler performs pure random sampling from the parameter space. 
+It serves as a baseline and is surprisingly effective for many problems, +especially when the parameter space is high-dimensional or when you have +limited computational budget. + +Characteristics: +- No learning from previous trials +- Uniform sampling from parameter distributions +- Excellent baseline for comparison +- Works well in high-dimensional spaces +- Embarrassingly parallel +- Good when objective function is noisy +""" + +import numpy as np +from sklearn.datasets import load_digits +from sklearn.svm import SVC +from sklearn.model_selection import cross_val_score + +from hyperactive.experiment.integrations import SklearnCvExperiment +from hyperactive.opt.optuna import RandomSampler + + +def objective_function_analysis(): + """Demonstrate when random sampling is effective.""" + # When Random Sampling Works Well: + # 1. High-dimensional parameter spaces (curse of dimensionality) + # 2. Noisy objective functions + # 3. Limited computational budget + # 4. As a baseline for comparison + # 5. When parallel evaluation is important + # 6. Uniform exploration is desired + + +def main(): + # === RandomSampler Example === + # Pure Random Search - Uniform Parameter Space Exploration + + objective_function_analysis() + + # Load dataset - using digits for a more challenging problem + X, y = load_digits(return_X_y=True) + print(f"Dataset: Handwritten digits ({X.shape[0]} samples, {X.shape[1]} features)") + + # Create experiment + estimator = SVC(random_state=42) + experiment = SklearnCvExperiment(estimator=estimator, X=X, y=y, cv=3) + + # Define search space - SVM hyperparameters + param_space = { + "C": (0.001, 1000), # Regularization - log scale would be better + "gamma": (1e-6, 1e2), # RBF kernel parameter + "kernel": ["rbf", "poly", "sigmoid"], # Kernel type + "degree": (2, 5), # Polynomial degree (only for poly kernel) + "coef0": (0.0, 10.0), # Independent term (poly/sigmoid) + } + + # Search Space: + # for param, space in param_space.items(): + # print(f" {param}: {space}") + + # Configure RandomSampler + optimizer = RandomSampler( + param_space=param_space, + n_trials=30, # More trials to show random behavior + random_state=42, # For reproducible random sampling + experiment=experiment, + ) + + # RandomSampler Configuration: + # n_trials: configured above + # random_state: set for reproducibility + # No learning parameters - pure random sampling + + # Run optimization + # Running random search... 
+ best_params = optimizer.run() + + # Results + print("\n=== Results ===") + print(f"Best parameters: {best_params}") + print(f"Best score: {optimizer.best_score_:.4f}") + print() + + # Analysis of Random Sampling behavior: + # Each trial is independent - no learning from history + # Uniform coverage of parameter space + # No convergence issues or local optima concerns + # Embarrassingly parallel - can run trials simultaneously + # Works equally well for continuous, discrete, and categorical parameters + + # Comparison with Other Methods: + # vs Grid Search: + # + Better coverage in high dimensions + # + More efficient for continuous parameters + # - No systematic coverage guarantee + # + # vs Bayesian Optimization (TPE, GP): + # + No assumptions about objective function smoothness + # + Works well with noisy objectives + # + No risk of model misspecification + # - No exploitation of promising regions + # - May waste trials on clearly bad regions + + # Practical Usage: + # Use as baseline to validate more sophisticated methods + # Good first choice when objective is very noisy + # Ideal for parallel optimization setups + # Consider for high-dimensional problems (>10 parameters) + # Use with log-uniform distributions for scale-sensitive parameters + + return best_params, optimizer.best_score_ + + +if __name__ == "__main__": + best_params, best_score = main() diff --git a/examples/optuna/tpe_sampler_example.py b/examples/optuna/tpe_sampler_example.py new file mode 100644 index 00000000..8232a441 --- /dev/null +++ b/examples/optuna/tpe_sampler_example.py @@ -0,0 +1,100 @@ +""" +TPESampler Example - Tree-structured Parzen Estimator + +The TPESampler is Optuna's default and most popular Bayesian optimization algorithm. +It uses a Tree-structured Parzen Estimator to model the relationship between +hyperparameters and objective values, making it efficient at finding optimal regions. 
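+
+In short, TPE fits two densities over the search space, l(x) for the trials
+with the best objective values and g(x) for the remaining trials, and proposes
+the candidate that maximizes the ratio l(x) / g(x), which is equivalent to
+maximizing expected improvement under the TPE model (Bergstra et al., 2011).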
+ +Characteristics: +- Bayesian optimization approach +- Good balance of exploration vs exploitation +- Works well with mixed parameter types (continuous, discrete, categorical) +- Efficient for moderate-dimensional problems +- Default choice for most hyperparameter optimization tasks +""" + +import numpy as np +from sklearn.datasets import load_wine +from sklearn.ensemble import RandomForestClassifier +from sklearn.model_selection import cross_val_score + +from hyperactive.experiment.integrations import SklearnCvExperiment +from hyperactive.opt.optuna import TPESampler + + +def main(): + # === TPESampler Example === + # Tree-structured Parzen Estimator - Bayesian Optimization + + # Load dataset + X, y = load_wine(return_X_y=True) + print(f"Dataset: Wine classification ({X.shape[0]} samples, {X.shape[1]} features)") + + # Create experiment + estimator = RandomForestClassifier(random_state=42) + experiment = SklearnCvExperiment(estimator=estimator, X=X, y=y, cv=3) + + # Define search space + param_space = { + "n_estimators": (10, 200), # Continuous integer + "max_depth": (1, 20), # Continuous integer + "min_samples_split": (2, 20), # Continuous integer + "min_samples_leaf": (1, 10), # Continuous integer + "max_features": ["sqrt", "log2", None], # Categorical + "bootstrap": [True, False], # Categorical boolean + } + + # Search Space: + # for param, space in param_space.items(): + # print(f" {param}: {space}") + + # Configure TPESampler with warm start + warm_start_points = [ + {"n_estimators": 100, "max_depth": 10, "min_samples_split": 2, + "min_samples_leaf": 1, "max_features": "sqrt", "bootstrap": True} + ] + + optimizer = TPESampler( + param_space=param_space, + n_trials=50, + random_state=42, + initialize={"warm_start": warm_start_points}, + experiment=experiment, + n_startup_trials=10, # Random trials before TPE kicks in + n_ei_candidates=24 # Number of candidates for expected improvement + ) + + # TPESampler Configuration: + # n_trials: configured above + # n_startup_trials: random exploration phase + # n_ei_candidates: number of expected improvement candidates + # warm_start: initial point(s) provided + + # Run optimization + # Running optimization... + best_params = optimizer.run() + + # Results + print("\n=== Results ===") + print(f"Best parameters: {best_params}") + print(f"Best score: {optimizer.best_score_:.4f}") + print() + + # TPE Behavior Analysis: + # - First 10 trials: Random exploration (n_startup_trials) + # - Trials 11-50: TPE-guided exploration based on past results + # - TPE builds probabilistic models of good vs bad parameter regions + # - Balances exploration of uncertain areas with exploitation of promising regions + + # Parameter Space Exploration: + # TPESampler effectively explores the joint parameter space by: + # 1. Modeling P(x|y) - probability of parameters given objective values + # 2. Using separate models for 'good' and 'bad' performing regions + # 3. Selecting next points to maximize expected improvement + # 4. 
Handling mixed parameter types (continuous, discrete, categorical) + + return best_params, optimizer.best_score_ + + +if __name__ == "__main__": + best_params, best_score = main() diff --git a/pyproject.toml b/pyproject.toml index e1661c10..4da54201 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -62,9 +62,12 @@ test = [ "flake8", "pytest-cov", "pathos", + "torch", + "tf_keras", ] all_extras = [ "hyperactive[integrations]", + "optuna<5", ] diff --git a/src/hyperactive/opt/_adapters/__init__.py b/src/hyperactive/opt/_adapters/__init__.py index d279503a..6e40d407 100644 --- a/src/hyperactive/opt/_adapters/__init__.py +++ b/src/hyperactive/opt/_adapters/__init__.py @@ -1,2 +1,7 @@ """Adapters for individual packages.""" # copyright: hyperactive developers, MIT License (see LICENSE file) + +from ._base_optuna_adapter import _BaseOptunaAdapter +from ._gfo import _BaseGFOadapter + +__all__ = ["_BaseOptunaAdapter", "_BaseGFOadapter"] diff --git a/src/hyperactive/opt/_adapters/_base_optuna_adapter.py b/src/hyperactive/opt/_adapters/_base_optuna_adapter.py new file mode 100644 index 00000000..fc8becd2 --- /dev/null +++ b/src/hyperactive/opt/_adapters/_base_optuna_adapter.py @@ -0,0 +1,213 @@ +"""Base adapter for Optuna optimizers.""" +# copyright: hyperactive developers, MIT License (see LICENSE file) + +from hyperactive.base import BaseOptimizer + + +class _BaseOptunaAdapter(BaseOptimizer): + """Base adapter for Optuna optimizers.""" + + _tags = { + "python_dependencies": ["optuna"], + "info:name": "Optuna-based optimizer", + } + + def __init__( + self, + param_space=None, + n_trials=100, + initialize=None, + random_state=None, + early_stopping=None, + max_score=None, + experiment=None, + **optimizer_kwargs, + ): + self.param_space = param_space + self.n_trials = n_trials + self.initialize = initialize + self.random_state = random_state + self.early_stopping = early_stopping + self.max_score = max_score + self.experiment = experiment + self.optimizer_kwargs = optimizer_kwargs + super().__init__() + + def _get_optimizer(self): + """Get the Optuna optimizer to use. + + This method should be implemented by subclasses to return + the specific optimizer class and its initialization parameters. + + Returns + ------- + optimizer + The Optuna optimizer instance + """ + raise NotImplementedError("Subclasses must implement _get_optimizer") + + def _convert_param_space(self, param_space): + """Convert parameter space to Optuna format. + + Parameters + ---------- + param_space : dict + The parameter space to convert + + Returns + ------- + dict + The converted parameter space + """ + return param_space + + def _suggest_params(self, trial, param_space): + """Suggest parameters using Optuna trial. 
+ + Parameters + ---------- + trial : optuna.Trial + The Optuna trial object + param_space : dict + The parameter space + + Returns + ------- + dict + The suggested parameters + """ + import optuna + + params = {} + for key, space in param_space.items(): + if isinstance(space, optuna.distributions.BaseDistribution): + # Optuna distribution object; Trial._suggest expects (name, distribution) + params[key] = trial._suggest(key, space) + elif isinstance(space, tuple) and len(space) == 2: + # Tuples are treated as ranges (low, high) + low, high = space + if isinstance(low, int) and isinstance(high, int): + params[key] = trial.suggest_int(key, low, high) + else: + params[key] = trial.suggest_float(key, low, high, log=False) + elif isinstance(space, list): + # Lists are treated as categorical choices + params[key] = trial.suggest_categorical(key, space) + else: + raise ValueError(f"Invalid parameter space for key '{key}': {space}") + return params + + def _objective(self, trial): + """Objective function for Optuna optimization. + + Parameters + ---------- + trial : optuna.Trial + The Optuna trial object + + Returns + ------- + float + The objective value + """ + params = self._suggest_params(trial, self.param_space) + score = self.experiment(**params) + + # Stop the study once the max_score threshold is reached + if self.max_score is not None and score >= self.max_score: + trial.study.stop() + + return score + + def _setup_initial_positions(self, study): + """Set up initial starting positions if provided. + + Parameters + ---------- + study : optuna.Study + The Optuna study object + """ + if self.initialize is not None: + if isinstance(self.initialize, dict) and "warm_start" in self.initialize: + warm_start_points = self.initialize["warm_start"] + if isinstance(warm_start_points, list): + # Enqueue warm-start points so Optuna evaluates them first; + # enqueue_trial avoids distribution conflicts with later suggest calls + for point in warm_start_points: + study.enqueue_trial(point) + + def _run(self, experiment, param_space, n_trials, **kwargs): + """Run the Optuna optimization.
+ + Parameters + ---------- + experiment : callable + The experiment to optimize + param_space : dict + The parameter space + n_trials : int + Number of trials + **kwargs + Additional parameters + + Returns + ------- + dict + The best parameters found + """ + import optuna + + # Create the Optuna sampler (seeded in _get_optimizer if random_state is set) + optimizer = self._get_optimizer() + + # Create study + study = optuna.create_study( + direction="maximize", # Assuming we want to maximize scores + sampler=optimizer, + ) + + # Set up initial positions + self._setup_initial_positions(study) + + # Set up early stopping callback + callbacks = [] + if self.early_stopping is not None: + + def early_stopping_callback(study, trial): + # Stop once `early_stopping` trials have passed without improvement + if trial.number - study.best_trial.number >= self.early_stopping: + study.stop() + + callbacks.append(early_stopping_callback) + + # Run optimization + study.optimize( + self._objective, + n_trials=n_trials, + callbacks=callbacks if callbacks else None, + ) + + self.best_score_ = study.best_value + self.best_params_ = study.best_params + return study.best_params + + @classmethod + def get_test_params(cls, parameter_set="default"): + """Return testing parameter settings for the optimizer.""" + from sklearn.datasets import load_iris + from sklearn.svm import SVC + + from hyperactive.experiment.integrations import SklearnCvExperiment + + X, y = load_iris(return_X_y=True) + sklearn_exp = SklearnCvExperiment(estimator=SVC(), X=X, y=y) + + param_space = { + "C": (0.01, 10), + "gamma": (0.0001, 10), + } + + return [ + { + "param_space": param_space, + "n_trials": 10, + "experiment": sklearn_exp, + } + ] diff --git a/src/hyperactive/opt/optuna/__init__.py b/src/hyperactive/opt/optuna/__init__.py new file mode 100644 index 00000000..346c1eb4 --- /dev/null +++ b/src/hyperactive/opt/optuna/__init__.py @@ -0,0 +1,22 @@ +"""Individual Optuna optimization algorithms.""" +# copyright: hyperactive developers, MIT License (see LICENSE file) + +from ._cmaes_optimizer import CmaEsOptimizer +from ._gp_optimizer import GPOptimizer +from ._grid_optimizer import GridOptimizer +from ._nsga_ii_optimizer import NSGAIIOptimizer +from ._nsga_iii_optimizer import NSGAIIIOptimizer +from ._qmc_optimizer import QMCOptimizer +from ._random_optimizer import RandomOptimizer +from ._tpe_optimizer import TPEOptimizer + +__all__ = [ + "TPEOptimizer", + "RandomOptimizer", + "CmaEsOptimizer", + "GPOptimizer", + "GridOptimizer", + "NSGAIIOptimizer", + "NSGAIIIOptimizer", + "QMCOptimizer", +] diff --git a/src/hyperactive/opt/optuna/_cmaes_optimizer.py b/src/hyperactive/opt/optuna/_cmaes_optimizer.py new file mode 100644 index 00000000..bc03c0e2 --- /dev/null +++ b/src/hyperactive/opt/optuna/_cmaes_optimizer.py @@ -0,0 +1,183 @@ +"""CMA-ES (Covariance Matrix Adaptation Evolution Strategy) optimizer.""" +# copyright: hyperactive developers, MIT License (see LICENSE file) + +from .._adapters._base_optuna_adapter import _BaseOptunaAdapter + + +class CmaEsOptimizer(_BaseOptunaAdapter): + """CMA-ES (Covariance Matrix Adaptation Evolution Strategy) optimizer. + + Parameters + ---------- + param_space : dict[str, tuple or list or optuna distributions] + The search space to explore. Dictionary with parameter names + as keys and either tuples/lists of (low, high) or + optuna distribution objects as values. + n_trials : int, default=100 + Number of optimization trials. + initialize : dict[str, int], default=None + The method to generate initial positions.
A dictionary with + the following key literals and the corresponding value type: + {"grid": int, "vertices": int, "random": int, "warm_start": list[dict]} + random_state : None, int, default=None + If None, create a new random state. If int, create a new random state + seeded with the value. + early_stopping : int, default=None + Number of trials after which to stop if no improvement. + max_score : float, default=None + Maximum score threshold. Stop optimization when reached. + x0 : dict, default=None + Initial parameter values for CMA-ES. + sigma0 : float, default=1.0 + Initial standard deviation for CMA-ES. + n_startup_trials : int, default=1 + Number of startup trials for CMA-ES. + experiment : BaseExperiment, optional + The experiment to optimize parameters for. + Optional, can be passed later via ``set_params``. + + Examples + -------- + Basic usage of CmaEsOptimizer with a scikit-learn experiment: + + >>> from hyperactive.experiment.integrations import SklearnCvExperiment + >>> from hyperactive.opt.optuna import CmaEsOptimizer + >>> from sklearn.datasets import load_iris + >>> from sklearn.svm import SVC + >>> X, y = load_iris(return_X_y=True) + >>> sklearn_exp = SklearnCvExperiment(estimator=SVC(), X=X, y=y) + >>> param_space = { + ... "C": (0.01, 10), + ... "gamma": (0.0001, 10), + ... } + >>> optimizer = CmaEsOptimizer( + ... param_space=param_space, n_trials=50, experiment=sklearn_exp + ... ) + >>> best_params = optimizer.run() + """ + + _tags = { + "info:name": "CMA-ES Optimizer", + "info:local_vs_global": "global", + "info:explore_vs_exploit": "mixed", + "info:compute": "high", + "python_dependencies": ["optuna", "cmaes"], + } + + def __init__( + self, + param_space=None, + n_trials=100, + initialize=None, + random_state=None, + early_stopping=None, + max_score=None, + x0=None, + sigma0=1.0, + n_startup_trials=1, + experiment=None, + ): + self.x0 = x0 + self.sigma0 = sigma0 + self.n_startup_trials = n_startup_trials + + super().__init__( + param_space=param_space, + n_trials=n_trials, + initialize=initialize, + random_state=random_state, + early_stopping=early_stopping, + max_score=max_score, + experiment=experiment, + ) + + def _get_optimizer(self): + """Get the CMA-ES optimizer. + + Returns + ------- + optimizer + The Optuna CmaEsOptimizer instance + """ + import optuna + + try: + import cmaes # noqa: F401 + except ImportError: + raise ImportError( + "CmaEsOptimizer requires the 'cmaes' package. 
" + "Install it with: pip install cmaes" + ) + + optimizer_kwargs = { + "sigma0": self.sigma0, + "n_startup_trials": self.n_startup_trials, + } + + if self.x0 is not None: + optimizer_kwargs["x0"] = self.x0 + + if self.random_state is not None: + optimizer_kwargs["seed"] = self.random_state + + return optuna.samplers.CmaEsSampler(**optimizer_kwargs) + + @classmethod + def get_test_params(cls, parameter_set="default"): + """Return testing parameter settings for the optimizer.""" + from sklearn.datasets import make_regression + from sklearn.neural_network import MLPRegressor + + from hyperactive.experiment.integrations import SklearnCvExperiment + + # Test case 1: Basic continuous parameters (from base) + params = super().get_test_params(parameter_set) + params[0].update( + { + "sigma0": 0.5, + "n_startup_trials": 1, + } + ) + + # Test case 2: Neural network with continuous parameters only + # (CMA-ES specific - only continuous parameters allowed) + X, y = make_regression(n_samples=50, n_features=5, noise=0.1, random_state=42) + mlp_exp = SklearnCvExperiment( + estimator=MLPRegressor(random_state=42, max_iter=100), X=X, y=y, cv=3 + ) + + continuous_param_space = { + "alpha": (1e-5, 1e-1), # L2 regularization (continuous) + "learning_rate_init": (1e-4, 1e-1), # Learning rate (continuous) + "beta_1": (0.8, 0.99), # Adam beta1 (continuous) + "beta_2": (0.9, 0.999), # Adam beta2 (continuous) + # Note: No categorical parameters - CMA-ES doesn't support them + } + + params.append( + { + "param_space": continuous_param_space, + "n_trials": 8, # Smaller for faster testing + "experiment": mlp_exp, + "sigma0": 0.3, # Different sigma for diversity + "n_startup_trials": 2, # More startup trials + } + ) + + # Test case 3: High-dimensional continuous space (CMA-ES strength) + high_dim_continuous = { + f"x{i}": (-1.0, 1.0) + for i in range(6) # 6D continuous optimization + } + + params.append( + { + "param_space": high_dim_continuous, + "n_trials": 12, + "experiment": mlp_exp, + "sigma0": 0.7, # Larger initial spread + "n_startup_trials": 3, + } + ) + + return params diff --git a/src/hyperactive/opt/optuna/_gp_optimizer.py b/src/hyperactive/opt/optuna/_gp_optimizer.py new file mode 100644 index 00000000..c38df65f --- /dev/null +++ b/src/hyperactive/opt/optuna/_gp_optimizer.py @@ -0,0 +1,120 @@ +"""Gaussian Process optimizer.""" +# copyright: hyperactive developers, MIT License (see LICENSE file) + +from .._adapters._base_optuna_adapter import _BaseOptunaAdapter + + +class GPOptimizer(_BaseOptunaAdapter): + """Gaussian Process-based Bayesian optimizer. + + Parameters + ---------- + param_space : dict[str, tuple or list or optuna distributions] + The search space to explore. Dictionary with parameter names + as keys and either tuples/lists of (low, high) or + optuna distribution objects as values. + n_trials : int, default=100 + Number of optimization trials. + initialize : dict[str, int], default=None + The method to generate initial positions. A dictionary with + the following key literals and the corresponding value type: + {"grid": int, "vertices": int, "random": int, "warm_start": list[dict]} + random_state : None, int, default=None + If None, create a new random state. If int, create a new random state + seeded with the value. + early_stopping : int, default=None + Number of trials after which to stop if no improvement. + max_score : float, default=None + Maximum score threshold. Stop optimization when reached. + n_startup_trials : int, default=10 + Number of startup trials for GP. 
+ deterministic_objective : bool, default=False + Whether the objective function is deterministic. + experiment : BaseExperiment, optional + The experiment to optimize parameters for. + Optional, can be passed later via ``set_params``. + + Examples + -------- + Basic usage of GPOptimizer with a scikit-learn experiment: + + >>> from hyperactive.experiment.integrations import SklearnCvExperiment + >>> from hyperactive.opt.optuna import GPOptimizer + >>> from sklearn.datasets import load_iris + >>> from sklearn.svm import SVC + >>> X, y = load_iris(return_X_y=True) + >>> sklearn_exp = SklearnCvExperiment(estimator=SVC(), X=X, y=y) + >>> param_space = { + ... "C": (0.01, 10), + ... "gamma": (0.0001, 10), + ... } + >>> optimizer = GPOptimizer( + ... param_space=param_space, n_trials=50, experiment=sklearn_exp + ... ) + >>> best_params = optimizer.run() + """ + + _tags = { + "info:name": "Gaussian Process Optimizer", + "info:local_vs_global": "global", + "info:explore_vs_exploit": "exploit", + "info:compute": "high", + "python_dependencies": ["optuna"], + } + + def __init__( + self, + param_space=None, + n_trials=100, + initialize=None, + random_state=None, + early_stopping=None, + max_score=None, + n_startup_trials=10, + deterministic_objective=False, + experiment=None, + ): + self.n_startup_trials = n_startup_trials + self.deterministic_objective = deterministic_objective + + super().__init__( + param_space=param_space, + n_trials=n_trials, + initialize=initialize, + random_state=random_state, + early_stopping=early_stopping, + max_score=max_score, + experiment=experiment, + ) + + def _get_optimizer(self): + """Get the GP optimizer. + + Returns + ------- + optimizer + The Optuna GPOptimizer instance + """ + import optuna + + optimizer_kwargs = { + "n_startup_trials": self.n_startup_trials, + "deterministic_objective": self.deterministic_objective, + } + + if self.random_state is not None: + optimizer_kwargs["seed"] = self.random_state + + return optuna.samplers.GPSampler(**optimizer_kwargs) + + @classmethod + def get_test_params(cls, parameter_set="default"): + """Return testing parameter settings for the optimizer.""" + params = super().get_test_params(parameter_set) + params[0].update( + { + "n_startup_trials": 5, + "deterministic_objective": True, + } + ) + return params diff --git a/src/hyperactive/opt/optuna/_grid_optimizer.py b/src/hyperactive/opt/optuna/_grid_optimizer.py new file mode 100644 index 00000000..57495a40 --- /dev/null +++ b/src/hyperactive/opt/optuna/_grid_optimizer.py @@ -0,0 +1,166 @@ +"""Grid optimizer.""" +# copyright: hyperactive developers, MIT License (see LICENSE file) + +from .._adapters._base_optuna_adapter import _BaseOptunaAdapter + + +class GridOptimizer(_BaseOptunaAdapter): + """Grid search optimizer. + + Parameters + ---------- + param_space : dict[str, tuple or list or optuna distributions] + The search space to explore. Dictionary with parameter names + as keys and either tuples/lists of (low, high) or + optuna distribution objects as values. + n_trials : int, default=100 + Number of optimization trials. + initialize : dict[str, int], default=None + The method to generate initial positions. A dictionary with + the following key literals and the corresponding value type: + {"grid": int, "vertices": int, "random": int, "warm_start": list[dict]} + random_state : None, int, default=None + If None, create a new random state. If int, create a new random state + seeded with the value. 
+ early_stopping : int, default=None + Number of trials after which to stop if no improvement. + max_score : float, default=None + Maximum score threshold. Stop optimization when reached. + search_space : dict, default=None + Explicit search space for grid search. + experiment : BaseExperiment, optional + The experiment to optimize parameters for. + Optional, can be passed later via ``set_params``. + + Examples + -------- + Basic usage of GridOptimizer with a scikit-learn experiment: + + >>> from hyperactive.experiment.integrations import SklearnCvExperiment + >>> from hyperactive.opt.optuna import GridOptimizer + >>> from sklearn.datasets import load_iris + >>> from sklearn.svm import SVC + >>> X, y = load_iris(return_X_y=True) + >>> sklearn_exp = SklearnCvExperiment(estimator=SVC(), X=X, y=y) + >>> param_space = { + ... "C": [0.01, 0.1, 1, 10], + ... "gamma": [0.0001, 0.01, 0.1, 1], + ... } + >>> optimizer = GridOptimizer( + ... param_space=param_space, n_trials=50, experiment=sklearn_exp + ... ) + >>> best_params = optimizer.run() + """ + + _tags = { + "info:name": "Grid Optimizer", + "info:local_vs_global": "global", + "info:explore_vs_exploit": "explore", + "info:compute": "low", + "python_dependencies": ["optuna"], + } + + def __init__( + self, + param_space=None, + n_trials=100, + initialize=None, + random_state=None, + early_stopping=None, + max_score=None, + search_space=None, + experiment=None, + ): + self.search_space = search_space + + super().__init__( + param_space=param_space, + n_trials=n_trials, + initialize=initialize, + random_state=random_state, + early_stopping=early_stopping, + max_score=max_score, + experiment=experiment, + ) + + def _get_optimizer(self): + """Get the Grid optimizer. + + Returns + ------- + optimizer + The Optuna GridOptimizer instance + """ + import optuna + + # Convert param_space to Optuna search space format if needed + search_space = self.search_space + if search_space is None and self.param_space is not None: + search_space = {} + for key, space in self.param_space.items(): + if isinstance(space, list): + search_space[key] = space + elif isinstance(space, (tuple,)) and len(space) == 2: + # Convert range to discrete list for grid search + low, high = space + if isinstance(low, int) and isinstance(high, int): + search_space[key] = list(range(low, high + 1)) + else: + # Create a reasonable grid for continuous spaces + import numpy as np + + search_space[key] = np.linspace(low, high, 10).tolist() + + return optuna.samplers.GridSampler(search_space) + + @classmethod + def get_test_params(cls, parameter_set="default"): + """Return testing parameter settings for the optimizer.""" + from sklearn.datasets import load_iris + from sklearn.neighbors import KNeighborsClassifier + from sklearn.svm import SVC + + from hyperactive.experiment.integrations import SklearnCvExperiment + + X, y = load_iris(return_X_y=True) + + # Test case 1: Basic continuous parameters (converted to discrete) + svm_exp = SklearnCvExperiment(estimator=SVC(), X=X, y=y) + param_space_1 = { + "C": [0.01, 0.1, 1, 10], + "gamma": [0.0001, 0.01, 0.1, 1], + } + + # Test case 2: Mixed categorical and discrete parameters + knn_exp = SklearnCvExperiment(estimator=KNeighborsClassifier(), X=X, y=y) + param_space_2 = { + "n_neighbors": [1, 3, 5, 7], # Discrete integers + "weights": ["uniform", "distance"], # Categorical + "metric": ["euclidean", "manhattan"], # Categorical + "p": [1, 2], # Discrete for minkowski + } + + # Test case 3: Small exhaustive grid (tests complete enumeration) + 
param_space_3 = { + "C": [0.1, 1], # 2 values + "kernel": ["rbf", "linear"], # 2 values + } + # Total: 2 x 2 = 4 combinations, n_trials should cover all + + return [ + { + "param_space": param_space_1, + "n_trials": 10, + "experiment": svm_exp, + }, + { + "param_space": param_space_2, + "n_trials": 15, + "experiment": knn_exp, + }, + { + "param_space": param_space_3, + "n_trials": 4, # Exact number for exhaustive search + "experiment": svm_exp, + }, + ] diff --git a/src/hyperactive/opt/optuna/_nsga_ii_optimizer.py b/src/hyperactive/opt/optuna/_nsga_ii_optimizer.py new file mode 100644 index 00000000..a6ff9857 --- /dev/null +++ b/src/hyperactive/opt/optuna/_nsga_ii_optimizer.py @@ -0,0 +1,158 @@ +"""NSGA-II multi-objective optimizer.""" +# copyright: hyperactive developers, MIT License (see LICENSE file) + +from .._adapters._base_optuna_adapter import _BaseOptunaAdapter + + +class NSGAIIOptimizer(_BaseOptunaAdapter): + """NSGA-II multi-objective optimizer. + + Parameters + ---------- + param_space : dict[str, tuple or list or optuna distributions] + The search space to explore. Dictionary with parameter names + as keys and either tuples/lists of (low, high) or + optuna distribution objects as values. + n_trials : int, default=100 + Number of optimization trials. + initialize : dict[str, int], default=None + The method to generate initial positions. A dictionary with + the following key literals and the corresponding value type: + {"grid": int, "vertices": int, "random": int, "warm_start": list[dict]} + random_state : None, int, default=None + If None, create a new random state. If int, create a new random state + seeded with the value. + early_stopping : int, default=None + Number of trials after which to stop if no improvement. + max_score : float, default=None + Maximum score threshold. Stop optimization when reached. + population_size : int, default=50 + Population size for NSGA-II. + mutation_prob : float, default=0.1 + Mutation probability for NSGA-II. + crossover_prob : float, default=0.9 + Crossover probability for NSGA-II. + experiment : BaseExperiment, optional + The experiment to optimize parameters for. + Optional, can be passed later via ``set_params``. + + Examples + -------- + Basic usage of NSGAIIOptimizer with a scikit-learn experiment: + + >>> from hyperactive.experiment.integrations import SklearnCvExperiment + >>> from hyperactive.opt.optuna import NSGAIIOptimizer + >>> from sklearn.datasets import load_iris + >>> from sklearn.svm import SVC + >>> X, y = load_iris(return_X_y=True) + >>> sklearn_exp = SklearnCvExperiment(estimator=SVC(), X=X, y=y) + >>> param_space = { + ... "C": (0.01, 10), + ... "gamma": (0.0001, 10), + ... } + >>> optimizer = NSGAIIOptimizer( + ... param_space=param_space, n_trials=50, experiment=sklearn_exp + ... 
) + >>> best_params = optimizer.run() + """ + + _tags = { + "info:name": "NSGA-II Optimizer", + "info:local_vs_global": "global", + "info:explore_vs_exploit": "mixed", + "info:compute": "high", + "python_dependencies": ["optuna"], + } + + def __init__( + self, + param_space=None, + n_trials=100, + initialize=None, + random_state=None, + early_stopping=None, + max_score=None, + population_size=50, + mutation_prob=0.1, + crossover_prob=0.9, + experiment=None, + ): + self.population_size = population_size + self.mutation_prob = mutation_prob + self.crossover_prob = crossover_prob + + super().__init__( + param_space=param_space, + n_trials=n_trials, + initialize=initialize, + random_state=random_state, + early_stopping=early_stopping, + max_score=max_score, + experiment=experiment, + ) + + def _get_optimizer(self): + """Get the NSGA-II optimizer. + + Returns + ------- + optimizer + The Optuna NSGAIIOptimizer instance + """ + import optuna + + optimizer_kwargs = { + "population_size": self.population_size, + "mutation_prob": self.mutation_prob, + "crossover_prob": self.crossover_prob, + } + + if self.random_state is not None: + optimizer_kwargs["seed"] = self.random_state + + return optuna.samplers.NSGAIISampler(**optimizer_kwargs) + + @classmethod + def get_test_params(cls, parameter_set="default"): + """Return testing parameter settings for the optimizer.""" + from sklearn.datasets import load_iris + from sklearn.ensemble import RandomForestClassifier + + from hyperactive.experiment.integrations import SklearnCvExperiment + + # Test case 1: Basic single-objective (inherits from base) + params = super().get_test_params(parameter_set) + params[0].update( + { + "population_size": 20, + "mutation_prob": 0.2, + "crossover_prob": 0.8, + } + ) + + # Test case 2: Multi-objective with mixed parameter types + X, y = load_iris(return_X_y=True) + rf_exp = SklearnCvExperiment( + estimator=RandomForestClassifier(random_state=42), X=X, y=y + ) + + mixed_param_space = { + "n_estimators": (10, 50), # Continuous integer + "max_depth": [3, 5, 7, None], # Mixed discrete/None + "criterion": ["gini", "entropy"], # Categorical + "min_samples_split": (2, 10), # Continuous integer + "bootstrap": [True, False], # Boolean categorical + } + + params.append( + { + "param_space": mixed_param_space, + "n_trials": 15, # Smaller for faster testing + "experiment": rf_exp, + "population_size": 8, # Smaller population for testing + "mutation_prob": 0.1, + "crossover_prob": 0.9, + } + ) + + return params diff --git a/src/hyperactive/opt/optuna/_nsga_iii_optimizer.py b/src/hyperactive/opt/optuna/_nsga_iii_optimizer.py new file mode 100644 index 00000000..25df6e73 --- /dev/null +++ b/src/hyperactive/opt/optuna/_nsga_iii_optimizer.py @@ -0,0 +1,126 @@ +"""NSGA-III multi-objective optimizer.""" +# copyright: hyperactive developers, MIT License (see LICENSE file) + +from .._adapters._base_optuna_adapter import _BaseOptunaAdapter + + +class NSGAIIIOptimizer(_BaseOptunaAdapter): + """NSGA-III multi-objective optimizer. + + Parameters + ---------- + param_space : dict[str, tuple or list or optuna distributions] + The search space to explore. Dictionary with parameter names + as keys and either tuples/lists of (low, high) or + optuna distribution objects as values. + n_trials : int, default=100 + Number of optimization trials. + initialize : dict[str, int], default=None + The method to generate initial positions. 
A dictionary with + the following key literals and the corresponding value type: + {"grid": int, "vertices": int, "random": int, "warm_start": list[dict]} + random_state : None, int, default=None + If None, create a new random state. If int, create a new random state + seeded with the value. + early_stopping : int, default=None + Number of trials after which to stop if no improvement. + max_score : float, default=None + Maximum score threshold. Stop optimization when reached. + population_size : int, default=50 + Population size for NSGA-III. + mutation_prob : float, default=0.1 + Mutation probability for NSGA-III. + crossover_prob : float, default=0.9 + Crossover probability for NSGA-III. + experiment : BaseExperiment, optional + The experiment to optimize parameters for. + Optional, can be passed later via ``set_params``. + + Examples + -------- + Basic usage of NSGAIIIOptimizer with a scikit-learn experiment: + + >>> from hyperactive.experiment.integrations import SklearnCvExperiment + >>> from hyperactive.opt.optuna import NSGAIIIOptimizer + >>> from sklearn.datasets import load_iris + >>> from sklearn.svm import SVC + >>> X, y = load_iris(return_X_y=True) + >>> sklearn_exp = SklearnCvExperiment(estimator=SVC(), X=X, y=y) + >>> param_space = { + ... "C": (0.01, 10), + ... "gamma": (0.0001, 10), + ... } + >>> optimizer = NSGAIIIOptimizer( + ... param_space=param_space, n_trials=50, experiment=sklearn_exp + ... ) + >>> best_params = optimizer.run() + """ + + _tags = { + "info:name": "NSGA-III Optimizer", + "info:local_vs_global": "global", + "info:explore_vs_exploit": "mixed", + "info:compute": "high", + "python_dependencies": ["optuna"], + } + + def __init__( + self, + param_space=None, + n_trials=100, + initialize=None, + random_state=None, + early_stopping=None, + max_score=None, + population_size=50, + mutation_prob=0.1, + crossover_prob=0.9, + experiment=None, + ): + self.population_size = population_size + self.mutation_prob = mutation_prob + self.crossover_prob = crossover_prob + + super().__init__( + param_space=param_space, + n_trials=n_trials, + initialize=initialize, + random_state=random_state, + early_stopping=early_stopping, + max_score=max_score, + experiment=experiment, + ) + + def _get_optimizer(self): + """Get the NSGA-III optimizer. + + Returns + ------- + optimizer + The Optuna NSGAIIIOptimizer instance + """ + import optuna + + optimizer_kwargs = { + "population_size": self.population_size, + "mutation_prob": self.mutation_prob, + "crossover_prob": self.crossover_prob, + } + + if self.random_state is not None: + optimizer_kwargs["seed"] = self.random_state + + return optuna.samplers.NSGAIIISampler(**optimizer_kwargs) + + @classmethod + def get_test_params(cls, parameter_set="default"): + """Return testing parameter settings for the optimizer.""" + params = super().get_test_params(parameter_set) + params[0].update( + { + "population_size": 20, + "mutation_prob": 0.2, + "crossover_prob": 0.8, + } + ) + return params diff --git a/src/hyperactive/opt/optuna/_qmc_optimizer.py b/src/hyperactive/opt/optuna/_qmc_optimizer.py new file mode 100644 index 00000000..b025dff8 --- /dev/null +++ b/src/hyperactive/opt/optuna/_qmc_optimizer.py @@ -0,0 +1,163 @@ +"""Quasi-Monte Carlo optimizer.""" +# copyright: hyperactive developers, MIT License (see LICENSE file) + +from .._adapters._base_optuna_adapter import _BaseOptunaAdapter + + +class QMCOptimizer(_BaseOptunaAdapter): + """Quasi-Monte Carlo optimizer. 
+ + Parameters + ---------- + param_space : dict[str, tuple or list or optuna distributions] + The search space to explore. Dictionary with parameter names + as keys and either tuples/lists of (low, high) or + optuna distribution objects as values. + n_trials : int, default=100 + Number of optimization trials. + initialize : dict[str, int], default=None + The method to generate initial positions. A dictionary with + the following key literals and the corresponding value type: + {"grid": int, "vertices": int, "random": int, "warm_start": list[dict]} + random_state : None, int, default=None + If None, create a new random state. If int, create a new random state + seeded with the value. + early_stopping : int, default=None + Number of trials after which to stop if no improvement. + max_score : float, default=None + Maximum score threshold. Stop optimization when reached. + qmc_type : str, default="sobol" + Type of QMC sequence. Options: "sobol", "halton". + scramble : bool, default=True + Whether to scramble the QMC sequence. + experiment : BaseExperiment, optional + The experiment to optimize parameters for. + Optional, can be passed later via ``set_params``. + + Examples + -------- + Basic usage of QMCOptimizer with a scikit-learn experiment: + + >>> from hyperactive.experiment.integrations import SklearnCvExperiment + >>> from hyperactive.opt.optuna import QMCOptimizer + >>> from sklearn.datasets import load_iris + >>> from sklearn.svm import SVC + >>> X, y = load_iris(return_X_y=True) + >>> sklearn_exp = SklearnCvExperiment(estimator=SVC(), X=X, y=y) + >>> param_space = { + ... "C": (0.01, 10), + ... "gamma": (0.0001, 10), + ... } + >>> optimizer = QMCOptimizer( + ... param_space=param_space, n_trials=50, experiment=sklearn_exp + ... ) + >>> best_params = optimizer.run() + """ + + _tags = { + "info:name": "Quasi-Monte Carlo Optimizer", + "info:local_vs_global": "global", + "info:explore_vs_exploit": "explore", + "info:compute": "low", + "python_dependencies": ["optuna"], + } + + def __init__( + self, + param_space=None, + n_trials=100, + initialize=None, + random_state=None, + early_stopping=None, + max_score=None, + qmc_type="sobol", + scramble=True, + experiment=None, + ): + self.qmc_type = qmc_type + self.scramble = scramble + + super().__init__( + param_space=param_space, + n_trials=n_trials, + initialize=initialize, + random_state=random_state, + early_stopping=early_stopping, + max_score=max_score, + experiment=experiment, + ) + + def _get_optimizer(self): + """Get the QMC optimizer. 
+ + Returns + ------- + optimizer + The Optuna QMCOptimizer instance + """ + import optuna + + optimizer_kwargs = { + "qmc_type": self.qmc_type, + "scramble": self.scramble, + } + + if self.random_state is not None: + optimizer_kwargs["seed"] = self.random_state + + return optuna.samplers.QMCSampler(**optimizer_kwargs) + + @classmethod + def get_test_params(cls, parameter_set="default"): + """Return testing parameter settings for the optimizer.""" + from sklearn.datasets import load_iris + from sklearn.linear_model import LogisticRegression + + from hyperactive.experiment.integrations import SklearnCvExperiment + + # Test case 1: Halton sequence without scrambling + params = super().get_test_params(parameter_set) + params[0].update( + { + "qmc_type": "halton", + "scramble": False, + } + ) + + # Test case 2: Sobol sequence with scrambling + X, y = load_iris(return_X_y=True) + lr_exp = SklearnCvExperiment( + estimator=LogisticRegression(random_state=42, max_iter=1000), X=X, y=y + ) + + mixed_param_space = { + "C": (0.01, 100), # Continuous + "penalty": [ + "l1", + "l2", + ], # Categorical - removed elasticnet to avoid solver conflicts + "solver": ["liblinear", "saga"], # Categorical + } + + params.append( + { + "param_space": mixed_param_space, + "n_trials": 16, # Power of 2 for better QMC properties + "experiment": lr_exp, + "qmc_type": "sobol", # Different sequence type + "scramble": True, # With scrambling for randomization + } + ) + + # Test case 3: Different sampler configuration with same experiment + params.append( + { + "param_space": mixed_param_space, + "n_trials": 8, # Power of 2, good for QMC + "experiment": lr_exp, + "qmc_type": "halton", # Different QMC type + "scramble": False, + } + ) + + return params diff --git a/src/hyperactive/opt/optuna/_random_optimizer.py b/src/hyperactive/opt/optuna/_random_optimizer.py new file mode 100644 index 00000000..9502b5b2 --- /dev/null +++ b/src/hyperactive/opt/optuna/_random_optimizer.py @@ -0,0 +1,95 @@ +"""Random optimizer.""" +# copyright: hyperactive developers, MIT License (see LICENSE file) + +from .._adapters._base_optuna_adapter import _BaseOptunaAdapter + + +class RandomOptimizer(_BaseOptunaAdapter): + """Random optimizer. + + Parameters + ---------- + param_space : dict[str, tuple or list or optuna distributions] + The search space to explore. Dictionary with parameter names + as keys and either tuples/lists of (low, high) or + optuna distribution objects as values. + n_trials : int, default=100 + Number of optimization trials. + initialize : dict[str, int], default=None + The method to generate initial positions. A dictionary with + the following key literals and the corresponding value type: + {"grid": int, "vertices": int, "random": int, "warm_start": list[dict]} + random_state : None, int, default=None + If None, create a new random state. If int, create a new random state + seeded with the value. + early_stopping : int, default=None + Number of trials after which to stop if no improvement. + max_score : float, default=None + Maximum score threshold. Stop optimization when reached. + experiment : BaseExperiment, optional + The experiment to optimize parameters for. + Optional, can be passed later via ``set_params``. 
+ + Examples + -------- + Basic usage of RandomOptimizer with a scikit-learn experiment: + + >>> from hyperactive.experiment.integrations import SklearnCvExperiment + >>> from hyperactive.opt.optuna import RandomOptimizer + >>> from sklearn.datasets import load_iris + >>> from sklearn.svm import SVC + >>> X, y = load_iris(return_X_y=True) + >>> sklearn_exp = SklearnCvExperiment(estimator=SVC(), X=X, y=y) + >>> param_space = { + ... "C": (0.01, 10), + ... "gamma": (0.0001, 10), + ... } + >>> optimizer = RandomOptimizer( + ... param_space=param_space, n_trials=50, experiment=sklearn_exp + ... ) + >>> best_params = optimizer.run() + """ + + _tags = { + "info:name": "Random Optimizer", + "info:local_vs_global": "global", + "info:explore_vs_exploit": "explore", + "info:compute": "low", + "python_dependencies": ["optuna"], + } + + def __init__( + self, + param_space=None, + n_trials=100, + initialize=None, + random_state=None, + early_stopping=None, + max_score=None, + experiment=None, + ): + super().__init__( + param_space=param_space, + n_trials=n_trials, + initialize=initialize, + random_state=random_state, + early_stopping=early_stopping, + max_score=max_score, + experiment=experiment, + ) + + def _get_optimizer(self): + """Get the Random optimizer. + + Returns + ------- + optimizer + The Optuna RandomOptimizer instance + """ + import optuna + + optimizer_kwargs = {} + if self.random_state is not None: + optimizer_kwargs["seed"] = self.random_state + + return optuna.samplers.RandomSampler(**optimizer_kwargs) diff --git a/src/hyperactive/opt/optuna/_tpe_optimizer.py b/src/hyperactive/opt/optuna/_tpe_optimizer.py new file mode 100644 index 00000000..7d9a6f27 --- /dev/null +++ b/src/hyperactive/opt/optuna/_tpe_optimizer.py @@ -0,0 +1,191 @@ +"""TPE (Tree-structured Parzen Estimator) optimizer.""" +# copyright: hyperactive developers, MIT License (see LICENSE file) + +from .._adapters._base_optuna_adapter import _BaseOptunaAdapter + + +class TPEOptimizer(_BaseOptunaAdapter): + """Tree-structured Parzen Estimator optimizer. + + Parameters + ---------- + param_space : dict[str, tuple or list or optuna distributions] + The search space to explore. Dictionary with parameter names + as keys and either tuples/lists of (low, high) or + optuna distribution objects as values. + n_trials : int, default=100 + Number of optimization trials. + initialize : dict[str, int], default=None + The method to generate initial positions. A dictionary with + the following key literals and the corresponding value type: + {"grid": int, "vertices": int, "random": int, "warm_start": list[dict]} + random_state : None, int, default=None + If None, create a new random state. If int, create a new random state + seeded with the value. + early_stopping : int, default=None + Number of trials after which to stop if no improvement. + max_score : float, default=None + Maximum score threshold. Stop optimization when reached. + n_startup_trials : int, default=10 + Number of startup trials for TPE. + n_ei_candidates : int, default=24 + Number of candidates for expected improvement. + weights : callable, default=None + Weight function for TPE. + experiment : BaseExperiment, optional + The experiment to optimize parameters for. + Optional, can be passed later via ``set_params``. 
+ + Examples + -------- + Basic usage of TPEOptimizer with a scikit-learn experiment: + + >>> from hyperactive.experiment.integrations import SklearnCvExperiment + >>> from hyperactive.opt.optuna import TPEOptimizer + >>> from sklearn.datasets import load_iris + >>> from sklearn.svm import SVC + >>> X, y = load_iris(return_X_y=True) + >>> sklearn_exp = SklearnCvExperiment(estimator=SVC(), X=X, y=y) + >>> param_space = { + ... "C": (0.01, 10), + ... "gamma": (0.0001, 10), + ... } + >>> optimizer = TPEOptimizer( + ... param_space=param_space, n_trials=50, experiment=sklearn_exp + ... ) + >>> best_params = optimizer.run() + """ + + _tags = { + "info:name": "Tree-structured Parzen Estimator", + "info:local_vs_global": "global", + "info:explore_vs_exploit": "exploit", + "info:compute": "middle", + "python_dependencies": ["optuna"], + } + + def __init__( + self, + param_space=None, + n_trials=100, + initialize=None, + random_state=None, + early_stopping=None, + max_score=None, + n_startup_trials=10, + n_ei_candidates=24, + weights=None, + experiment=None, + ): + self.n_startup_trials = n_startup_trials + self.n_ei_candidates = n_ei_candidates + self.weights = weights + + super().__init__( + param_space=param_space, + n_trials=n_trials, + initialize=initialize, + random_state=random_state, + early_stopping=early_stopping, + max_score=max_score, + experiment=experiment, + ) + + def _get_optimizer(self): + """Get the TPE optimizer. + + Returns + ------- + optimizer + The Optuna TPEOptimizer instance + """ + import optuna + + optimizer_kwargs = { + "n_startup_trials": self.n_startup_trials, + "n_ei_candidates": self.n_ei_candidates, + } + + if self.weights is not None: + optimizer_kwargs["weights"] = self.weights + + if self.random_state is not None: + optimizer_kwargs["seed"] = self.random_state + + return optuna.samplers.TPESampler(**optimizer_kwargs) + + @classmethod + def get_test_params(cls, parameter_set="default"): + """Return testing parameter settings for the optimizer.""" + from sklearn.datasets import load_wine + from sklearn.ensemble import RandomForestClassifier + from sklearn.svm import SVC + + from hyperactive.experiment.integrations import SklearnCvExperiment + + # Test case 1: Basic TPE with standard parameters + params = super().get_test_params(parameter_set) + params[0].update( + { + "n_startup_trials": 5, + "n_ei_candidates": 12, + } + ) + + # Test case 2: Mixed parameter types with warm start + X, y = load_wine(return_X_y=True) + rf_exp = SklearnCvExperiment( + estimator=RandomForestClassifier(random_state=42), X=X, y=y + ) + + mixed_param_space = { + "n_estimators": (10, 100), # Continuous integer + "max_depth": [3, 5, 7, 10, None], # Mixed discrete/None + "criterion": ["gini", "entropy"], # Categorical + "min_samples_split": (2, 20), # Continuous integer + "bootstrap": [True, False], # Boolean + } + + # Warm start with known good configuration + warm_start_points = [ + { + "n_estimators": 50, + "max_depth": 5, + "criterion": "gini", + "min_samples_split": 2, + "bootstrap": True, + } + ] + + params.append( + { + "param_space": mixed_param_space, + "n_trials": 20, + "experiment": rf_exp, + "n_startup_trials": 3, # Fewer random trials before TPE + "n_ei_candidates": 24, # More EI candidates for better optimization + "initialize": {"warm_start": warm_start_points}, + } + ) + + # Test case 3: High-dimensional continuous space (TPE strength) + svm_exp = SklearnCvExperiment(estimator=SVC(), X=X, y=y) + high_dim_space = { + "C": (0.01, 100), + "gamma": (1e-6, 1e2), + "coef0": (0.0, 
10.0), + "degree": (2, 5), + "tol": (1e-5, 1e-2), + } + + params.append( + { + "param_space": high_dim_space, + "n_trials": 25, + "experiment": svm_exp, + "n_startup_trials": 8, # More startup for exploration + "n_ei_candidates": 32, # More candidates for complex space + } + ) + + return params
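
For reviewers who want to exercise the new classes end to end, here is a minimal smoke-test sketch (not part of the patch). It assumes `optuna` and scikit-learn are installed and reuses the `SklearnCvExperiment` setup from the optimizer doctests; the explicit `FloatDistribution` entry additionally exercises the distribution-object branch of `_BaseOptunaAdapter._suggest_params`, while the tuple and list entries exercise the range and categorical branches.

```python
"""Smoke-test sketch for the Optuna adapters added in this patch."""

import optuna
from sklearn.datasets import load_iris
from sklearn.svm import SVC

from hyperactive.experiment.integrations import SklearnCvExperiment
from hyperactive.opt.optuna import RandomOptimizer, TPEOptimizer

X, y = load_iris(return_X_y=True)
experiment = SklearnCvExperiment(estimator=SVC(), X=X, y=y)

# The adapter maps tuples to suggest_int/suggest_float, lists to
# suggest_categorical, and passes Optuna distribution objects through.
param_space = {
    "C": optuna.distributions.FloatDistribution(0.01, 10, log=True),
    "gamma": (0.0001, 10),          # float range
    "kernel": ["rbf", "linear"],    # categorical choices
}

for optimizer_cls in (RandomOptimizer, TPEOptimizer):
    opt = optimizer_cls(
        param_space=param_space,
        n_trials=20,
        early_stopping=10,  # stop after 10 trials without improvement
        random_state=42,
        experiment=experiment,
    )
    best_params = opt.run()
    print(optimizer_cls.__name__, best_params, opt.best_score_)
```

The same loop can be pointed at any of the other adapters in `hyperactive.opt.optuna`, subject to their parameter-space restrictions (for example, `CmaEsOptimizer` expects continuous parameters only and requires the `cmaes` package).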