

# Hyperparameter Optimization Project

## Fall 2021 - Team 2


The goal is to optimize the hyperparameters ``C`` and ``solver`` of a logistic regression classifier using Optuna.
In the conventional Optuna workflow, you define an ``objective`` function
that takes a ``trial`` and calls the ``suggest_*`` methods of ``trial`` to sample the hyperparameters.
The ``Tune`` class below instead drives the study step by step with ``study.ask()`` and ``study.tell()``.



In [134]:
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import optuna

In [55]:
# building a Python class for project

class Tune:
    search_space_C = None
    search_space_solver = None
    study = None
    trial = None
    objective = None
    performance = None
    C = None
    solver = None
    
    def create_study(self, C, solver):
        self.search_space_C = C
        self.search_space_solver = solver
        
        # This method should be generalized later to support other search spaces.
        self.study = optuna.create_study(direction="maximize")
        self.trial = self.study.ask()
        # suggest_float(..., log=True) replaces the deprecated suggest_loguniform.
        self.C = self.trial.suggest_float("C", self.search_space_C[0], self.search_space_C[1], log=True)
        # Pass an ordered list; a set has no stable ordering across runs.
        self.solver = self.trial.suggest_categorical("solver", list(self.search_space_solver))
        print("Hyperparameter:", self.C)
        print("Solver:", self.solver)
    
    def update_study(self):
        # Evaluate the current trial's parameters and report the score to that
        # same trial before asking for the next one.
        self.get_performance()
        self.study.tell(self.trial, self.performance)
        self.trial = self.study.ask()
        self.C = self.trial.suggest_float("C", self.search_space_C[0], self.search_space_C[1], log=True)
        self.solver = self.trial.suggest_categorical("solver", list(self.search_space_solver))
        # insert flask call to store current performance and values
        return self.C, self.solver
    
    def update_study_multiple(self, n):
        # Run n consecutive evaluate/tell/ask rounds.
        for _ in range(n):
            self.update_study()
        # flask call, retrieve best study
    
    def set_objective(self, objective):
        self.objective = objective
        print("Objective has been set.")
    
    def get_performance(self):
        self.performance = self.objective(self.C, self.solver)
        print("Current Model Performance:", self.performance)

In [135]:
# user defines data and split

X, y = make_classification(n_features=10)
X_train, X_test, y_train, y_test = train_test_split(X, y)

In [136]:
# user defines objective function

def objective(C, solver):
    clf = LogisticRegression(C=C, solver=solver)
    clf.fit(X_train, y_train)
    val_accuracy = clf.score(X_test, y_test)
    return val_accuracy

In [137]:
tune = Tune()

In [138]:
tune.create_study([1e-7, 10.0], ["lbfgs", "saga"])

[I 2021-12-02 18:16:06,535] A new study created in memory with name: no-name-82c6938a-743e-452d-8cc3-1b1b27850905


Hyperparameter: 3.091858591608382e-06
Solver: lbfgs


In [139]:
tune.set_objective(objective)

Objective has been set.


In [182]:
tune.update_study()

Current Model Performance: 0.4


(0.00015550332306482872, 'lbfgs')