# Coding with AI

## My Old Implementation

Goal: use AI to create an sklearn-compatible stepwise tuner from my book *Effective Boost*.

In [2]:
# predict income > 50k
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
adult = fetch_openml('adult', version=2, as_frame=True)
X = adult['data']
y = adult['target'] == '>50K'

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y)


In [3]:
X

| | age | workclass | fnlwgt | education | education-num | marital-status | occupation | relationship | race | sex | capital-gain | capital-loss | hours-per-week | native-country |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 25 | Private | 226802 | 11th | 7 | Never-married | Machine-op-inspct | Own-child | Black | Male | 0 | 0 | 40 | United-States |
| 1 | 38 | Private | 89814 | HS-grad | 9 | Married-civ-spouse | Farming-fishing | Husband | White | Male | 0 | 0 | 50 | United-States |
| 2 | 28 | Local-gov | 336951 | Assoc-acdm | 12 | Married-civ-spouse | Protective-serv | Husband | White | Male | 0 | 0 | 40 | United-States |
| 3 | 44 | Private | 160323 | Some-college | 10 | Married-civ-spouse | Machine-op-inspct | Husband | Black | Male | 7688 | 0 | 40 | United-States |
| 4 | 18 | | 103497 | Some-college | 10 | Never-married | | Own-child | White | Female | 0 | 0 | 30 | United-States |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 48837 | 27 | Private | 257302 | Assoc-acdm | 12 | Married-civ-spouse | Tech-support | Wife | White | Female | 0 | 0 | 38 | United-States |
| 48838 | 40 | Private | 154374 | HS-grad | 9 | Married-civ-spouse | Machine-op-inspct | Husband | White | Male | 0 | 0 | 40 | United-States |
| 48839 | 58 | Private | 151910 | HS-grad | 9 | Widowed | Adm-clerical | Unmarried | White | Female | 0 | 0 | 40 | United-States |
| 48840 | 22 | Private | 201490 | HS-grad | 9 | Never-married | Adm-clerical | Own-child | White | Male | 0 | 0 | 20 | United-States |


In [4]:
y

0        False
1        False
2         True
3         True
4        False
         ...  
48837    False
48838     True
48839    False
48840    False
48841     True
Name: class, Length: 48842, dtype: bool

In [5]:
from hyperopt import hp, fmin, Trials, STATUS_OK, tpe
from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score

params = {'random_state': 42}

rounds = [{'max_depth': hp.quniform('max_depth', 1, 8, 1),
           'min_child_weight': hp.loguniform('min_child_weight', -2, 3)},
          {'subsample': hp.uniform('subsample', 0.5, 1),
           'colsample_bytree': hp.uniform('colsample_bytree', 0.5, 1)},
          {'reg_alpha': hp.uniform('reg_alpha', 0, 10),
           'reg_lambda': hp.uniform('reg_lambda', 1, 10)},
          {'learning_rate': hp.loguniform('learning_rate', -7, 0)}]

def hyperparam_tuning(space, X_train, y_train, X_test, y_test, early_stopping_rounds=100, metric=None):
    int_vals = ['max_depth', 'reg_alpha']
    space = {k: int(v) if k in int_vals else v for k, v in space.items()}
    model = XGBClassifier(**space, early_stopping_rounds=early_stopping_rounds, 
                          enable_categorical=True)
    evaluation = [(X_train, y_train), (X_test, y_test)]
    model.fit(X_train, y_train, eval_set=evaluation,
              verbose=False)
    y_pred = model.predict(X_test)
    score = metric(y_test, y_pred)
    return {'loss': -score, 'status': STATUS_OK, 'model': model}


all_trials = []
for round in rounds:
    params = {**params, **round}
    trials = Trials()
    best = fmin(lambda space: hyperparam_tuning(space, X_train, y_train, X_test, y_test,
                                                early_stopping_rounds=150, metric=accuracy_score),
                 space=params, algo=tpe.suggest, max_evals=20, trials=trials)
    params = {**params, **best}
    all_trials.append(trials)

100%|██████████████████████████████████████████| 20/20 [00:08<00:00,  2.39trial/s, best loss: -0.8758496437638195]
100%|██████████████████████████████████████████| 20/20 [00:06<00:00,  3.24trial/s, best loss: -0.8751126033903858]
100%|██████████████████████████████████████████| 20/20 [00:06<00:00,  3.20trial/s, best loss: -0.8747850298910819]
100%|███████████████████████████████████████████| 20/20 [00:06<00:00,  3.14trial/s, best loss: -0.873884202767996]


In [6]:
print(params)

{'random_state': 42, 'max_depth': np.float64(3.0), 'min_child_weight': np.float64(7.346203204899075), 'subsample': np.float64(0.9028448078107045), 'colsample_bytree': np.float64(0.8209367760801355), 'reg_alpha': np.float64(1.9720667812489268), 'reg_lambda': np.float64(5.315937314850144), 'learning_rate': np.float64(0.3857861955230564)}


Complaints

- Global variables
- Not reusable
- Odd interface
- Doesn't work with other models or regressors

## Prompt

Create a sklearn tool to optimize the hyperparameters of a model
using the hyperopt library in a stepwise manner
It should have a constructor that receives the model, the hyperparameter
space as a sequence of dictionaries and the number of iterations (per sequence)
It should have a fit method that receives the training data and the
validation data. This should loop through the sequences of hyperparameter 
spaces and optimize them in a stepwise manner. 
After training it should have an attribute best_params_ that contains
the best hyperparameters found.

### Expected Interface

```python
rounds = [{'max_depth': hp.quniform('max_depth', 1, 8, 1),
           'min_child_weight': hp.loguniform('min_child_weight', -2, 3)},
            {'subsample': hp.uniform('subsample', 0.5, 1),
                'colsample_bytree': hp.uniform('colsample_bytree', 0.5, 1)},
            {'reg_alpha': hp.uniform('reg_alpha', 0, 10),
                'reg_lambda': hp.uniform('reg_lambda', 1, 10)},
                {'learning_rate': hp.loguniform('learning_rate', -7, 0)}]

sw = StepwiseOptimizer(xgb.XGBClassifier(), rounds)
sw.fit(X_train, y_train, X_test, y_test)
print(sw.best_params_)
```



## Claude

Issues:

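- Had to comment out the code that sets the random seed to get it to work. (Recent hyperopt releases appear to expect a `np.random.Generator`, such as `np.random.default_rng(seed)`, for `rstate` rather than a `RandomState`.)

- Uses `hp.choice`, but we prefer `hp.quniform` for integer hyperparameters: it keeps the values ordered, so the search is more efficient.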
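
Because `hp.quniform` yields floats (the old implementation reported `max_depth` as `np.float64(3.0)`), each generated class needs a cast before the values reach the model. The following is a minimal sketch of that cast with the integer parameter names passed in rather than hardcoded. `cast_int_params` and `int_names` are hypothetical names chosen for illustration; they are not part of hyperopt or of any generated code.

```python
def cast_int_params(params, int_names=("max_depth",)):
    """Return a copy of params with the named entries cast to int."""
    return {k: int(v) if k in int_names else v for k, v in params.items()}


# Values shaped like the ones hyperopt reported in the old implementation
sampled = {"max_depth": 3.0, "reg_alpha": 1.9720667812489268}
cleaned = cast_int_params(sampled)
print(cleaned)  # max_depth becomes the int 3; reg_alpha is left untouched
```

Leaving `reg_alpha` out of `int_names` keeps it continuous. A hardcoded list that includes `reg_alpha` would truncate a value such as 1.97 to 1.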



In [7]:
import numpy as np
from sklearn.base import BaseEstimator, MetaEstimatorMixin
from sklearn.model_selection import cross_val_score
from hyperopt import fmin, tpe, space_eval, Trials

class StepwiseHyperoptOptimizer(BaseEstimator, MetaEstimatorMixin):
    def __init__(self, model, param_space_sequence, max_evals_per_step=100, cv=5, 
                 scoring='neg_mean_squared_error', random_state=42):
        self.model = model
        self.param_space_sequence = param_space_sequence
        self.max_evals_per_step = max_evals_per_step
        self.cv = cv
        self.scoring = scoring
        self.random_state = random_state
        self.best_params_ = {}
        self.best_score_ = None

    def clean_int_params(self, params):
        int_vals = ['max_depth', 'reg_alpha']
        return {k: int(v) if k in int_vals else v for k, v in params.items()}

    def objective(self, params):
        # I added this
        params = self.clean_int_params(params)
        # END 
        current_params = {**self.best_params_, **params}
        self.model.set_params(**current_params)
        score = cross_val_score(self.model, self.X, self.y, cv=self.cv, scoring=self.scoring, n_jobs=-1)
        return -np.mean(score)

    def fit(self, X, y):
        self.X = X
        self.y = y

        for step, param_space in enumerate(self.param_space_sequence):
            print(f"Optimizing step {step + 1}/{len(self.param_space_sequence)}")
            
            trials = Trials()
            best = fmin(
                fn=self.objective,
                space=param_space,
                algo=tpe.suggest,
                max_evals=self.max_evals_per_step,
                trials=trials,
                #rstate=np.random.RandomState(self.random_state)
            )

            step_best_params = space_eval(param_space, best)
            # I added this
            step_best_params = self.clean_int_params(step_best_params)
            # END 
            self.best_params_.update(step_best_params)
            self.best_score_ = -min(trials.losses())
            
            print(f"Best parameters after step {step + 1}: {self.best_params_}")
            print(f"Best score after step {step + 1}: {self.best_score_}")

        # Fit the model with the best parameters
        self.model.set_params(**self.best_params_)
        self.model.fit(X, y)

        return self

    def predict(self, X):
        return self.model.predict(X)

    def score(self, X, y):
        return self.model.score(X, y)

In [8]:
# My testcase
import xgboost as xgb

rounds = [{'max_depth': hp.quniform('max_depth', 1, 8, 1),
           'min_child_weight': hp.loguniform('min_child_weight', -2, 3)},
            {'subsample': hp.uniform('subsample', 0.5, 1),
                'colsample_bytree': hp.uniform('colsample_bytree', 0.5, 1)},
            {'reg_alpha': hp.uniform('reg_alpha', 0, 10),
                'reg_lambda': hp.uniform('reg_lambda', 1, 10)},
                {'learning_rate': hp.loguniform('learning_rate', -7, 0)}]

sw = StepwiseHyperoptOptimizer(xgb.XGBClassifier(), rounds)
sw.fit(X_train, y_train, X_test, y_test)
print(sw.best_params_)

TypeError: StepwiseHyperoptOptimizer.fit() takes 3 positional arguments but 5 were given
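
The error message counts `self`, so "3 positional arguments but 5 were given" means `fit(X, y)` was called with four values. One possible way to catch this kind of mismatch before launching a long search is to bind the intended call against the method signature with the standard library's `inspect` module. This is only a sketch; `accepts_call` and the two stub classes are made up for illustration.

```python
import inspect


def accepts_call(func, *args, **kwargs):
    """Return True if func could be called with these arguments.

    Only the signature is checked; func itself is never executed.
    """
    try:
        inspect.signature(func).bind(*args, **kwargs)
    except TypeError:
        return False
    return True


class TwoArgFit:
    # Stand-in for a generated class whose fit takes (X, y) only
    def fit(self, X, y):
        return self


class FourArgFit:
    # Stand-in for the interface requested in the prompt
    def fit(self, X_train, y_train, X_val=None, y_val=None):
        return self


call_args = ("X_train", "y_train", "X_test", "y_test")  # placeholder values
print(accepts_call(TwoArgFit().fit, *call_args))   # False
print(accepts_call(FourArgFit().fit, *call_args))  # True
```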

In [9]:
# Interface for .fit expects 2 positional arguments, but 4 were given
import xgboost as xgb

rounds = [{'max_depth': hp.quniform('max_depth', 1, 8, 1),
           'min_child_weight': hp.loguniform('min_child_weight', -2, 3)},
            {'subsample': hp.uniform('subsample', 0.5, 1),
                'colsample_bytree': hp.uniform('colsample_bytree', 0.5, 1)},
            {'reg_alpha': hp.uniform('reg_alpha', 0, 10),
                'reg_lambda': hp.uniform('reg_lambda', 1, 10)},
                {'learning_rate': hp.loguniform('learning_rate', -7, 0)}]

sw = StepwiseHyperoptOptimizer(xgb.XGBClassifier(enable_categorical=True), rounds,
                               max_evals_per_step=20, scoring='accuracy')
sw.fit(X_train, y_train)#, X_test, y_test)
print(sw.best_params_)

Optimizing step 1/4
100%|██████████████████████████████████████████| 20/20 [00:09<00:00,  2.18trial/s, best loss: -0.8734404342429455]
Best parameters after step 1: {'max_depth': 3, 'min_child_weight': 0.25573212985900134}
Best score after step 1: 0.8734404342429455
Optimizing step 2/4
100%|██████████████████████████████████████████| 20/20 [00:06<00:00,  2.88trial/s, best loss: -0.8736861382145946]
Best parameters after step 2: {'max_depth': 3, 'min_child_weight': 0.25573212985900134, 'colsample_bytree': 0.8321794604619939, 'subsample': 0.9877339928336194}
Best score after step 2: 0.8736861382145946
Optimizing step 3/4
100%|███████████████████████████████████████████| 20/20 [00:06<00:00,  3.03trial/s, best loss: -0.873167437695894]
Best parameters after step 3: {'max_depth': 3, 'min_child_weight': 0.25573212985900134, 'colsample_bytree': 0.8321794604619939, 'subsample': 0.9877339928336194, 'reg_alpha': 1, 'reg_lambda': 4.325234011477646}
Best score after step 3: 0.873167437695894
Optim

### Testcase that Claude Gave Me

I'm going to use mine instead.

In [None]:
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from hyperopt import hp

# Create a sample dataset
X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the sequence of hyperparameter spaces
param_space_sequence = [
    {
        'n_estimators': hp.choice('n_estimators', [10, 50, 100, 200]),
        'max_depth': hp.choice('max_depth', [None, 10, 20, 30])
    },
    {
        'min_samples_split': hp.uniform('min_samples_split', 0.1, 1.0),
        'min_samples_leaf': hp.uniform('min_samples_leaf', 0.1, 0.5)
    },
    {
        'max_features': hp.choice('max_features', ['sqrt', 'log2', None])
    }
]

# Create and fit the optimizer
optimizer = StepwiseHyperoptOptimizer(RandomForestClassifier(random_state=42), 
                                      param_space_sequence, 
                                      max_evals_per_step=30)
optimizer.fit(X_train, y_train)

# Print the best parameters and score
print("Best parameters:", optimizer.best_params_)
print("Best cross-validation score:", optimizer.best_score_)

# Evaluate on the test set
test_score = optimizer.score(X_test, y_test)
print("Test score:", test_score)

## ChatGPT 4o

In [13]:
import numpy as np
from hyperopt import hp, fmin, tpe, Trials
from hyperopt.pyll.base import scope
from sklearn.model_selection import cross_val_score

class StepwiseHyperoptOptimizerGPT:
    def __init__(self, model, hyperparameter_spaces, iterations_per_space=50):
        """
        :param model: The model to optimize (e.g., a scikit-learn estimator)
        :param hyperparameter_spaces: A list of hyperparameter spaces to optimize over, step by step
        :param iterations_per_space: Number of trials per hyperparameter space
        """
        self.model = model
        self.hyperparameter_spaces = hyperparameter_spaces
        self.iterations_per_space = iterations_per_space
        self.best_params_ = None

    def clean_int_params(self, params):
        int_vals = ['max_depth', 'reg_alpha']
        return {k: int(v) if k in int_vals else v for k, v in params.items()}

    def fit(self, X_train, y_train, X_val=None, y_val=None):
        """
        Optimize the hyperparameters step by step using the provided hyperparameter spaces.
        
        :param X_train: Training data
        :param y_train: Training labels
        :param X_val: Validation data (optional)
        :param y_val: Validation labels (optional)
        """
        best_params = {}

        for space in self.hyperparameter_spaces:
            def objective(params):
                 # I added this
                params = self.clean_int_params(params)
                # END 
                model = self.model.set_params(**{**best_params, **params})
                score = -np.mean(cross_val_score(model, X_train, y_train, cv=3, scoring='neg_mean_squared_error'))
                return score

            trials = Trials()
            best_params_step = fmin(fn=objective, 
                                    space=space, 
                                    algo=tpe.suggest, 
                                    max_evals=self.iterations_per_space, 
                                    trials=trials)
            
            # Update the best parameters with the new step's best found parameters
            best_params_step = self.clean_int_params(best_params_step)
            best_params.update(best_params_step)

        self.best_params_ = best_params
        # Re-fit the model with the final best parameters on the full training data
        self.model.set_params(**self.best_params_)
        self.model.fit(X_train, y_train)

    def predict(self, X):
        """
        Make predictions using the model with the best-found hyperparameters.
        
        :param X: Data to predict
        :return: Model predictions
        """
        return self.model.predict(X)


In [14]:
# Try my testcase

# This version's .fit accepts the validation data, though it never uses it
import xgboost as xgb

rounds = [{'max_depth': hp.quniform('max_depth', 1, 8, 1),
           'min_child_weight': hp.loguniform('min_child_weight', -2, 3)},
            {'subsample': hp.uniform('subsample', 0.5, 1),
                'colsample_bytree': hp.uniform('colsample_bytree', 0.5, 1)},
            {'reg_alpha': hp.uniform('reg_alpha', 0, 10),
                'reg_lambda': hp.uniform('reg_lambda', 1, 10)},
                {'learning_rate': hp.loguniform('learning_rate', -7, 0)}]

sw = StepwiseHyperoptOptimizerGPT(xgb.XGBClassifier(enable_categorical=True), rounds,
                               iterations_per_space=20)#, scoring='accuracy')
sw.fit(X_train, y_train, X_test, y_test)
print(sw.best_params_)

100%|██████████████████████████████████████████| 20/20 [00:11<00:00,  1.73trial/s, best loss: 0.12653221934198103]
 30%|█████████████▏                              | 6/20 [00:02<00:06,  2.32trial/s, best loss: 0.1273785537236524]

Exception ignored on calling ctypes callback function
: 
<bound method DataIter._next_wrapper of <xgboost.data.SingleBatchInternalIter object at 0x126b8ca40>>
Traceback (most recent call last):

  File "/Users/matt/Dropbox/work/courses/ms-courses/professionalpython/.venv/lib/python3.12/site-packages/xgboost/core.py", line 582, in _next_wrapper

def _next_wrapper(self, this: None) -> int:  # pylint: disable=unused-argument
KeyboardInterrupt
: 


 35%|███████████████▍                            | 7/20 [00:03<00:05,  2.55trial/s, best loss: 0.1273785537236524]

1 fits failed out of a total of 3.
The score on these train-test partitions for these parameters will be set to nan.
If these failures are not expected, you can try to debug them by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
1 fits failed with the following error:
Traceback (most recent call last):
  File "/Users/matt/Dropbox/work/courses/ms-courses/professionalpython/.venv/lib/python3.12/site-packages/sklearn/model_selection/_validation.py", line 888, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/Users/matt/Dropbox/work/courses/ms-courses/professionalpython/.venv/lib/python3.12/site-packages/xgboost/core.py", line 726, in inner_f
    return func(**kwargs)
           ^^^^^^^^^^^^^^
  File "/Users/matt/Dropbox/work/courses/ms-courses/professionalpython/.venv/lib/python3.12/site-packages/xgboost/sklearn.py", line 1512, in fit
    train_dmatrix, eva

100%|██████████████████████████████████████████| 20/20 [00:09<00:00,  2.16trial/s, best loss: 0.12721472226017308]
 25%|███████████                                 | 5/20 [00:02<00:08,  1.87trial/s, best loss: 0.1283340032582519]


KeyboardInterrupt: 

## GPT Didn't Have Scoring
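
The positive losses near 0.127 in the interrupted run come from the hardcoded `neg_mean_squared_error` scoring. For 0/1 labels and 0/1 predictions, mean squared error is the misclassification rate, so a loss of about 0.127 corresponds to an accuracy of about 0.873. A small check of that identity in plain Python, using made-up toy labels:

```python
# Toy data, invented for illustration: 2 of the 8 predictions are wrong
y_true = [0, 1, 1, 0, 1, 0, 0, 1]
y_pred = [0, 1, 0, 0, 1, 1, 0, 1]

n = len(y_true)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n

# Each wrong prediction contributes (+/-1) ** 2 == 1 and each correct one 0,
# so for binary labels the MSE equals the error rate, 1 - accuracy.
print(accuracy, mse, 1 - accuracy)  # 0.75 0.25 0.25
```

hyperopt's `fmin` minimizes its objective, which is why the accuracy runs report negative numbers: the objective returns the negated score.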



In [15]:
import numpy as np
from hyperopt import hp, fmin, tpe, Trials
from hyperopt.pyll.base import scope
from sklearn.model_selection import cross_val_score

class StepwiseHyperoptOptimizerGPT:
    def __init__(self, model, hyperparameter_spaces, iterations_per_space=50, scoring=None):
        """
        :param model: The model to optimize (e.g., a scikit-learn estimator)
        :param hyperparameter_spaces: A list of hyperparameter spaces to optimize over, step by step
        :param iterations_per_space: Number of trials per hyperparameter space
        """
        self.model = model
        self.hyperparameter_spaces = hyperparameter_spaces
        self.iterations_per_space = iterations_per_space
        self.scoring = scoring if scoring is not None else 'neg_mean_squared_error'  # Default scoring method

        self.best_params_ = None

    def clean_int_params(self, params):
        int_vals = ['max_depth', 'reg_alpha']
        return {k: int(v) if k in int_vals else v for k, v in params.items()}

    def fit(self, X_train, y_train, X_val=None, y_val=None):
        """
        Optimize the hyperparameters step by step using the provided hyperparameter spaces.
        
        :param X_train: Training data
        :param y_train: Training labels
        :param X_val: Validation data (optional)
        :param y_val: Validation labels (optional)
        """
        best_params = {}

        for space in self.hyperparameter_spaces:
            def objective(params):
                 # I added this
                params = self.clean_int_params(params)
                # END 
                model = self.model.set_params(**{**best_params, **params})
                score = -np.mean(cross_val_score(model, X_train, y_train, cv=3, scoring=self.scoring))
                return score

            trials = Trials()
            best_params_step = fmin(fn=objective, 
                                    space=space, 
                                    algo=tpe.suggest, 
                                    max_evals=self.iterations_per_space, 
                                    trials=trials)
            
            # Update the best parameters with the new step's best found parameters
            best_params_step = self.clean_int_params(best_params_step)
            best_params.update(best_params_step)

        self.best_params_ = best_params
        # Re-fit the model with the final best parameters on the full training data
        self.model.set_params(**self.best_params_)
        self.model.fit(X_train, y_train)

    def predict(self, X):
        """
        Make predictions using the model with the best-found hyperparameters.
        
        :param X: Data to predict
        :return: Model predictions
        """
        return self.model.predict(X)


In [16]:
# Try my testcase

# This version's .fit accepts the validation data, though it never uses it
import xgboost as xgb

rounds = [{'max_depth': hp.quniform('max_depth', 1, 8, 1),
           'min_child_weight': hp.loguniform('min_child_weight', -2, 3)},
            {'subsample': hp.uniform('subsample', 0.5, 1),
                'colsample_bytree': hp.uniform('colsample_bytree', 0.5, 1)},
            {'reg_alpha': hp.uniform('reg_alpha', 0, 10),
                'reg_lambda': hp.uniform('reg_lambda', 1, 10)},
                {'learning_rate': hp.loguniform('learning_rate', -7, 0)}]

sw = StepwiseHyperoptOptimizerGPT(xgb.XGBClassifier(enable_categorical=True), rounds,
                               iterations_per_space=20, scoring='accuracy')
sw.fit(X_train, y_train, X_test, y_test)
print(sw.best_params_)

100%|██████████████████████████████████████████| 20/20 [00:11<00:00,  1.69trial/s, best loss: -0.8737407541027228]
100%|██████████████████████████████████████████| 20/20 [00:11<00:00,  1.74trial/s, best loss: -0.8722938996053401]
100%|██████████████████████████████████████████| 20/20 [00:11<00:00,  1.75trial/s, best loss: -0.8730036399514738]
100%|██████████████████████████████████████████| 20/20 [00:10<00:00,  1.83trial/s, best loss: -0.8725122595812956]
{'max_depth': 4, 'min_child_weight': np.float64(0.3256402691705849), 'colsample_bytree': np.float64(0.6669097302598538), 'subsample': np.float64(0.9603003785973454), 'reg_alpha': 0, 'reg_lambda': np.float64(4.876612812920501), 'learning_rate': np.float64(0.26909073508143416)}


### GPT Testcase

In [None]:
from sklearn.ensemble import RandomForestRegressor
from sklearn.datasets import make_regression
from hyperopt import hp

# Generate sample data
X_train, y_train = make_regression(n_samples=100, n_features=20)
X_val, y_val = make_regression(n_samples=50, n_features=20)

# Define model
model = RandomForestRegressor()

# Define hyperparameter spaces to explore
hyperparameter_spaces = [
    {
        'n_estimators': scope.int(hp.quniform('n_estimators', 50, 200, 1)),
        'max_depth': scope.int(hp.quniform('max_depth', 5, 20, 1)),
    },
    {
        'min_samples_split': scope.int(hp.quniform('min_samples_split', 2, 10, 1)),
        'min_samples_leaf': scope.int(hp.quniform('min_samples_leaf', 1, 5, 1)),
    }
]

# Create the stepwise optimizer
optimizer = StepwiseHyperoptOptimizer(model, hyperparameter_spaces, iterations_per_space=20)

# Optimize the model
optimizer.fit(X_train, y_train)

# Access the best hyperparameters found
print("Best hyperparameters:", optimizer.best_params_)

# Make predictions on validation data
predictions = optimizer.predict(X_val)


## VSCode Copilot

The following was generated after pasting the prompt and the sample interface into the `vsw.py` docstring.



In [17]:


from hyperopt import hp, fmin, tpe, Trials
from sklearn.model_selection import cross_val_score
import numpy as np

class StepwiseOptimizerCopilot:

    def __init__(self, model, rounds, n_iter=100):
        self.model = model
        self.rounds = rounds
        self.n_iter = n_iter
        self.best_params_ = None

    def fit(self, X_train, y_train, X_test, y_test):
        for round in self.rounds:
            space = round
            trials = Trials()
            best = fmin(self.objective, space, algo=tpe.suggest, max_evals=self.n_iter, trials=trials)
            self.best_params_ = best

    def objective(self, params):
        # I added this
        params = self.clean_int_params(params)
        # END
        self.model.set_params(**params)
        return -np.mean(cross_val_score(self.model, X_train, y_train, cv=3, n_jobs=-1, scoring='accuracy'))
    
    def clean_int_params(self, params):
        int_vals = ['max_depth', 'reg_alpha']
        return {k: int(v) if k in int_vals else v for k, v in params.items()}

In [18]:
# Try my testcase

# This version's .fit takes all four arguments, though it ignores them
import xgboost as xgb

rounds = [{'max_depth': hp.quniform('max_depth', 1, 8, 1),
           'min_child_weight': hp.loguniform('min_child_weight', -2, 3)},
            {'subsample': hp.uniform('subsample', 0.5, 1),
                'colsample_bytree': hp.uniform('colsample_bytree', 0.5, 1)},
            {'reg_alpha': hp.uniform('reg_alpha', 0, 10),
                'reg_lambda': hp.uniform('reg_lambda', 1, 10)},
                {'learning_rate': hp.loguniform('learning_rate', -7, 0)}]

sw = StepwiseOptimizerCopilot(xgb.XGBClassifier(enable_categorical=True), rounds,
                               n_iter=20)#, scoring='accuracy')
sw.fit(X_train, y_train, X_test, y_test)
print(sw.best_params_)

100%|██████████████████████████████████████████| 20/20 [00:09<00:00,  2.15trial/s, best loss: -0.8730036690154618]
100%|██████████████████████████████████████████| 20/20 [00:10<00:00,  1.99trial/s, best loss: -0.8682809476192492]
100%|███████████████████████████████████████████| 20/20 [00:09<00:00,  2.16trial/s, best loss: -0.871229258904306]
100%|██████████████████████████████████████████| 20/20 [00:09<00:00,  2.04trial/s, best loss: -0.8723757773303266]
{'learning_rate': np.float64(0.17141649601970488)}


### Commentary

Appears to work (after I added `.clean_int_params()`). However:

- The arguments to `fit()` are ignored; `objective()` reads the global `X_train` and `y_train` instead. It only works because my global variables happen to have those names.

- The scoring function is hardcoded to `accuracy`. I would prefer to pass it as an argument to the constructor.

- `best_params_` has a logic error: each round overwrites it instead of updating it, so only the last round's parameters survive.

- Does not subclass `BaseEstimator` and `TransformerMixin`, so it can't be used in a pipeline.
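
A dependency-free sketch of one way to avoid the globals and the overwritten `best_params_`: the objective is a closure over the arguments of `fit()`, each trial merges the winners from earlier rounds, and `best_params_` is updated rather than reassigned. To keep it runnable without hyperopt, a plain random search stands in for `fmin`, each round maps a parameter name to a `(low, high)` range instead of an `hp` expression, and the caller supplies a `loss_fn`. All of these names are assumptions made for illustration, not Copilot's output or the hyperopt API.

```python
import random


class StepwiseSearchSketch:
    """Stepwise tuner skeleton; random search stands in for hyperopt.fmin."""

    def __init__(self, loss_fn, rounds, n_iter=20, seed=0):
        self.loss_fn = loss_fn  # hypothetical: loss_fn(params, X, y) -> float
        self.rounds = rounds    # list of {name: (low, high)} dicts
        self.n_iter = n_iter
        self.seed = seed

    def fit(self, X, y):
        rng = random.Random(self.seed)
        self.best_params_ = {}

        for space in self.rounds:
            def objective(params):
                # Closure over X and y, so no globals are involved.
                # Winners from earlier rounds are merged into every trial.
                return self.loss_fn({**self.best_params_, **params}, X, y)

            best, best_loss = None, float("inf")
            for _ in range(self.n_iter):
                trial = {k: rng.uniform(lo, hi) for k, (lo, hi) in space.items()}
                loss = objective(trial)
                if loss < best_loss:
                    best, best_loss = trial, loss

            # update(), not assignment, so earlier rounds survive
            self.best_params_.update(best or {})
            self.best_loss_ = best_loss
        return self


def toy_loss(params, X, y):
    # Made-up objective, minimized at a=2, b=-1; X and y are unused here
    return (params.get("a", 0) - 2) ** 2 + (params.get("b", 0) + 1) ** 2


rounds = [{"a": (0, 4)}, {"b": (-3, 3)}]
opt = StepwiseSearchSketch(toy_loss, rounds, n_iter=50).fit(X=None, y=None)
print(opt.best_params_)  # contains both "a" and "b"
```

Setting `best_params_` inside `fit()` rather than in `__init__` also follows the scikit-learn convention that trailing-underscore attributes are learned during fitting.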

In [None]:


from hyperopt import hp, fmin, tpe, Trials
from sklearn.model_selection import cross_val_score
from sklearn.base import BaseEstimator, TransformerMixin
import numpy as np

class StepwiseOptimizerCopilot(BaseEstimator, TransformerMixin):
    def __init__(self, model, rounds, n_iter=100, scoring=None):
        self.model = model
        self.rounds = rounds
        self.n_iter = n_iter
        self.scoring = scoring if scoring is not None else 'accuracy'
        self.best_params_ = {}

    def fit(self, X_train, y_train, X_test, y_test):
        for round in self.rounds:
            space = round
            trials = Trials()
            best = fmin(self.objective, space, algo=tpe.suggest, max_evals=self.n_iter, trials=trials)
            # I added this
            print(best)
            # END
            self.best_params_.update(best)

    def objective(self, params):
        # I added this
        params = self.clean_int_params(params)
        # END
        self.model.set_params(**params)
        return -np.mean(cross_val_score(self.model, X_train, y_train, cv=3, n_jobs=-1, scoring=self.scoring))
    
    def clean_int_params(self, params):
        int_vals = ['max_depth', 'reg_alpha']
        return {k: int(v) if k in int_vals else v for k, v in params.items()}

In [None]:
# Try my testcase

# This version's .fit takes all four arguments, though it ignores them
import xgboost as xgb

rounds = [{'max_depth': hp.quniform('max_depth', 1, 8, 1),
           'min_child_weight': hp.loguniform('min_child_weight', -2, 3)},
            {'subsample': hp.uniform('subsample', 0.5, 1),
                'colsample_bytree': hp.uniform('colsample_bytree', 0.5, 1)},
            {'reg_alpha': hp.uniform('reg_alpha', 0, 10),
                'reg_lambda': hp.uniform('reg_lambda', 1, 10)},
                {'learning_rate': hp.loguniform('learning_rate', -7, 0)}]

sw = StepwiseOptimizerCopilot(xgb.XGBClassifier(enable_categorical=True), rounds,
                               n_iter=20)#, scoring='accuracy')
sw.fit(X_train, y_train, X_test, y_test)
print(sw.best_params_)

## How Long to Write?

(From ChatGPT 4o)

The time it would take for the average programmer to write the `StepwiseHyperoptOptimizer` class can vary based on several factors, such as familiarity with the required libraries (`scikit-learn`, `hyperopt`), experience in Python, and proficiency in writing custom optimization tools.

Here’s a rough estimate based on different programmer skill levels:

### 1. **Experienced Data Scientist or Machine Learning Engineer** (Familiar with `hyperopt`, `scikit-learn`, and optimization techniques):
   - **Time estimate**: 1 to 2 hours
   - **Reasoning**: An experienced programmer in this field will likely have prior knowledge of both `hyperopt` and `scikit-learn`, allowing them to quickly write the class. They may spend some time thinking through the stepwise nature of the hyperparameter spaces and ensuring proper parameter passing and scoring.

### 2. **Intermediate Python Developer with Basic Machine Learning Experience**:
   - **Time estimate**: 2 to 4 hours
   - **Reasoning**: An intermediate developer may need some time to look up the `hyperopt` API, understand how to integrate it with `scikit-learn`, and write an effective stepwise optimization method. Debugging and testing could take longer if they are less familiar with hyperparameter tuning techniques.

### 3. **Junior Programmer or Developer New to Machine Learning**:
   - **Time estimate**: 4 to 8 hours
   - **Reasoning**: A junior developer or someone less familiar with the libraries and optimization techniques would need additional time to research, learn how `hyperopt` and `scikit-learn` work together, and figure out the best way to structure the stepwise optimization. They may spend significant time debugging and testing, ensuring that the optimizer works as intended.

### Key Time Considerations:
- **Understanding of `hyperopt`**: For someone new to the `hyperopt` library, there could be some learning overhead, as it’s not as commonly used as other hyperparameter optimization libraries like `GridSearchCV`.
- **Familiarity with `scikit-learn`**: The more comfortable a programmer is with scikit-learn’s models and cross-validation methods, the faster they’ll implement this class.
- **Error handling and edge cases**: Ensuring robust handling of edge cases (e.g., invalid hyperparameters, failed model fits) may extend development time, especially for less experienced programmers.

In summary, the time required could range from as little as 1 hour for someone experienced to as much as a full day for someone new to the domain.

(From Claude 3.5 Sonnet)

To estimate how long it would take the average programmer to write this class, we need to consider several factors:

1. Experience level: Let's assume we're talking about an average programmer with some experience in Python and machine learning, but not necessarily an expert in hyperparameter optimization or the hyperopt library.

2. Familiarity with the concepts: The programmer would need to understand hyperparameter optimization, sklearn's API, and the basics of the hyperopt library.

3. Requirements analysis and design: Time needed to understand the requirements and design the class structure.

4. Implementation: Actual coding time.

5. Testing and debugging: Time to write test cases and fix any issues.

Given these considerations, here's a rough estimate:

1. Understanding the requirements and researching necessary concepts: 2-4 hours
2. Designing the class structure: 1-2 hours
3. Implementing the basic structure and methods: 3-5 hours
4. Implementing the stepwise optimization logic: 2-3 hours
5. Testing and debugging: 2-4 hours
6. Refining and optimizing the code: 1-2 hours

In total, this sums up to approximately 11-20 hours of work.

However, this estimate can vary significantly based on the individual programmer's:

- Prior experience with similar tasks
- Familiarity with the libraries used (sklearn, hyperopt)
- Understanding of machine learning concepts and hyperparameter optimization
- Debugging skills
- Attention to detail for edge cases and error handling

An experienced machine learning engineer familiar with hyperopt might complete this in 4-8 hours, while a beginner programmer new to these concepts might take 30-40 hours or more.

It's also worth noting that this estimate is for creating a working version. Additional time would be needed for more comprehensive testing, documentation, and potential refactoring for production-ready code.



In [21]:
30*60 /10

180.0

## Convert to Project

- Use `uv init --package --lib sk-stepwise` to create a new project. (`--lib` makes a library. `--package` makes it a distributable.)
- Run `uv add hyperopt scikit-learn numpy distutils` to add dependencies.
- Edit `sk-stepwise/src/sk_stepwise/__init__.py` and add the class. (I used the Claude version.)
- Install as editable with `uv pip install -e .` (Make sure you are in the directory with `pyproject.toml`.)


In [22]:
# predict income > 50k

from hyperopt import hp, fmin, Trials, STATUS_OK, tpe
import sk_stepwise as sw
import xgboost as xgb
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split

adult = fetch_openml('adult', version=2, as_frame=True)
X = adult['data']
y = adult['target'] == '>50K'

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y)

rounds = [{'max_depth': hp.quniform('max_depth', 1, 8, 1),
           'min_child_weight': hp.loguniform('min_child_weight', -2, 3)},
            {'subsample': hp.uniform('subsample', 0.5, 1),
                'colsample_bytree': hp.uniform('colsample_bytree', 0.5, 1)},
            {'reg_alpha': hp.uniform('reg_alpha', 0, 10),
                'reg_lambda': hp.uniform('reg_lambda', 1, 10)},
                {'learning_rate': hp.loguniform('learning_rate', -7, 0)}]

swopt = sw.StepwiseHyperoptOptimizer(xgb.XGBClassifier(enable_categorical=True), rounds)
swopt.fit(X_train, y_train)
print(swopt.best_params_)

Optimizing step 1/4
 67%|███████████████████████████▍             | 67/100 [00:24<00:12,  2.71trial/s, best loss: 0.12628665863277572]


KeyboardInterrupt: 

## Summary

- AI can make us much more productive.