# OPTUNA IN MACHINE LEARNING

#### Terms in Optuna

1. Study: A study in Optuna is an optimization session that encompasses multiple trials. Its goal is to optimize the objective function; in Optuna's terminology, the whole process of training and searching for the best hyperparameters is called a study.
2. Trial: A trial is a single iteration of the optimization process in which one specific set of hyperparameters is evaluated. Each trial runs the objective function once with its own set of hyperparameter values.
3. Trial Parameters: The specific hyperparameter values chosen during a trial. Each trial evaluates one combination of hyperparameters to see how it affects the objective function (the sampler may occasionally suggest the same combination more than once). E.g. different learning rates and batch sizes in each trial.
4. Objective Function: The function to be optimized (minimized or maximized) during the hyperparameter search. It takes hyperparameters as input and returns a value (e.g. accuracy or loss) that Optuna tries to optimize. E.g. in a classification task, the objective function could be the cross-entropy loss, which Optuna seeks to minimize.
5. Sampler: A sampler is the algorithm that suggests which hyperparameters should be evaluated next. Optuna uses the Tree-structured Parzen Estimator (TPE), a Bayesian optimization method, by default, but it also supports other samplers such as random sampling, grid sampling, and custom samplers.

In [11]:
# !uv pip install optuna

In [1]:
import optuna
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import pandas as pd

url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv"
columns = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI',
           'DiabetesPedigreeFunction', 'Age', 'Outcome']

df = pd.read_csv(url, names=columns)
df.head()

Unnamed: 0,Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome
0,6,148,72,35,0,33.6,0.627,50,1
1,1,85,66,29,0,26.6,0.351,31,0
2,8,183,64,0,0,23.3,0.672,32,1
3,1,89,66,23,94,28.1,0.167,21,0
4,0,137,40,35,168,43.1,2.288,33,1


In [2]:
import numpy as np

# Missing Value Imputation
cols_with_missing_vals = ['Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI']
df[cols_with_missing_vals] = df[cols_with_missing_vals].replace(0,np.nan)

df.fillna(df.mean(), inplace=True)
print(df.isnull().sum())

Pregnancies                 0
Glucose                     0
BloodPressure               0
SkinThickness               0
Insulin                     0
BMI                         0
DiabetesPedigreeFunction    0
Age                         0
Outcome                     0
dtype: int64


In [3]:
X = df.drop('Outcome', axis=1)
y = df['Outcome']

X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.3, random_state=42)

# Optional scaling
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

print(f'Training set shape: {X_train.shape}')
print(f'Test set shape: {X_test.shape}')

Training set shape: (537, 8)
Test set shape: (231, 8)


In [5]:
# Now we will define an objective function for the workflow of optuna

from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier
# Define the objective function

def objective(trial):
    # Suggest values of the hyperparameters
    n_estimators = trial.suggest_int('n_estimators', 50, 200)
    max_depth = trial.suggest_int('max_depth', 3, 20)
    # Here the search space will be in the Range [50,200] for n_estimators
    # and range of [3,20] for max_depth

    # Create the RandomForestClassifier with suggested hyperparameters
    model = RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        random_state=42
    )
    # Perform 3-fold cross-validation and calculate accuracy
    score = cross_val_score(model, X_train, y_train, cv=3, scoring='accuracy').mean()
    return score

In [6]:
# Create a study object and optimize the objective function
study = optuna.create_study(direction='maximize', sampler=optuna.samplers.TPESampler())
# We want to maximize accuracy, so the direction of the study is 'maximize'
study.optimize(objective, n_trials=100) # Running 100 trials to find the optimal hyperparameters

# Print the best result
print(f'Best trial accuracy: {study.best_trial.value}')
print(f'Best trial Hyperparameters: {study.best_trial.params}')

from sklearn.metrics import accuracy_score

# Train a RandomForestClassifier using the best hyperparameters from Optuna
best_model = RandomForestClassifier(**study.best_trial.params, random_state=42)

best_model.fit(X_train, y_train)

y_pred = best_model.predict(X_test)

test_accuracy = accuracy_score(y_test, y_pred)

print(f'Test accuracy with best hyperparameters: {test_accuracy:.2f}')

[I 2025-07-02 18:09:34,177] A new study created in memory with name: no-name-ff348ff6-4448-4c70-8572-50bcfebb1d4f
[I 2025-07-02 18:09:34,487] Trial 0 finished with value: 0.7783985102420856 and parameters: {'n_estimators': 106, 'max_depth': 18}. Best is trial 0 with value: 0.7783985102420856.
[I 2025-07-02 18:09:34,709] Trial 1 finished with value: 0.7690875232774674 and parameters: {'n_estimators': 59, 'max_depth': 13}. Best is trial 0 with value: 0.7783985102420856.
[I 2025-07-02 18:09:35,257] Trial 2 finished with value: 0.756052141527002 and parameters: {'n_estimators': 173, 'max_depth': 4}. Best is trial 0 with value: 0.7783985102420856.
[I 2025-07-02 18:09:35,666] Trial 3 finished with value: 0.7653631284916201 and parameters: {'n_estimators': 130, 'max_depth': 11}. Best is trial 0 with value: 0.7783985102420856.
[I 2025-07-02 18:09:36,264] Trial 4 finished with value: 0.7709497206703911 and parameters: {'n_estimators': 200, 'max_depth': 10}. Best is trial 0 with value: 0.7783985

Best trial accuracy: 0.7839851024208566
Best trial Hyperparameters: {'n_estimators': 73, 'max_depth': 18}
Test accuracy with best hyperparameters: 0.76


In [14]:
# Now we will use RandomSampler

# Create a study object and optimize the objective function
study = optuna.create_study(direction='maximize', sampler=optuna.samplers.RandomSampler())
# We want to maximize accuracy, so the direction of the study is 'maximize'
study.optimize(objective, n_trials=100) # Running 100 trials to find the optimal hyperparameters

# Print the best result
print(f'Best trial accuracy: {study.best_trial.value}')
print(f'Best trial Hyperparameters: {study.best_trial.params}')

from sklearn.metrics import accuracy_score

# Train a RandomForestClassifier using the best hyperparameters from Optuna
best_model = RandomForestClassifier(**study.best_trial.params, random_state=42)

best_model.fit(X_train, y_train)

y_pred = best_model.predict(X_test)

test_accuracy = accuracy_score(y_test, y_pred)

print(f'Test accuracy with best hyperparameters: {test_accuracy:.2f}')

[I 2025-07-02 18:01:46,900] A new study created in memory with name: no-name-0a892328-555f-46ba-a887-a6036f217a80
[I 2025-07-02 18:01:47,366] Trial 0 finished with value: 0.7672253258845437 and parameters: {'n_estimators': 109, 'max_depth': 5}. Best is trial 0 with value: 0.7672253258845437.
[I 2025-07-02 18:01:47,665] Trial 1 finished with value: 0.7746741154562384 and parameters: {'n_estimators': 72, 'max_depth': 5}. Best is trial 1 with value: 0.7746741154562384.
[I 2025-07-02 18:01:48,205] Trial 2 finished with value: 0.7728119180633147 and parameters: {'n_estimators': 116, 'max_depth': 8}. Best is trial 1 with value: 0.7746741154562384.
[I 2025-07-02 18:01:48,474] Trial 3 finished with value: 0.7728119180633147 and parameters: {'n_estimators': 58, 'max_depth': 8}. Best is trial 1 with value: 0.7746741154562384.
[I 2025-07-02 18:01:48,933] Trial 4 finished with value: 0.7690875232774674 and parameters: {'n_estimators': 98, 'max_depth': 14}. Best is trial 1 with value: 0.77467411545

Best trial accuracy: 0.7858472998137803
Best trial Hyperparameters: {'n_estimators': 121, 'max_depth': 15}
Test accuracy with best hyperparameters: 0.76


In [15]:
# Now we will use Grid Search

# We have to explicitly mention our search space first

search_space = {
    'n_estimators': [50, 100, 150, 200],
    'max_depth': [5, 10, 15, 20]
}
# Create a study object and optimize the objective function
study = optuna.create_study(direction='maximize', sampler=optuna.samplers.GridSampler(search_space))
# We want to maximize accuracy, so the direction of the study is 'maximize'
study.optimize(objective, n_trials=100) # At most 100 trials; GridSampler stops the study once all 16 grid combinations have been evaluated

# Print the best result
print(f'Best trial accuracy: {study.best_trial.value}')
print(f'Best trial Hyperparameters: {study.best_trial.params}')

from sklearn.metrics import accuracy_score

# Train a RandomForestClassifier using the best hyperparameters from Optuna
best_model = RandomForestClassifier(**study.best_trial.params, random_state=42)

best_model.fit(X_train, y_train)

y_pred = best_model.predict(X_test)

test_accuracy = accuracy_score(y_test, y_pred)

print(f'Test accuracy with best hyperparameters: {test_accuracy:.2f}')

[I 2025-07-02 18:04:47,419] A new study created in memory with name: no-name-f2b3e42d-d08b-4678-97e2-377437c3e93d
[I 2025-07-02 18:04:47,731] Trial 0 finished with value: 0.7690875232774674 and parameters: {'n_estimators': 100, 'max_depth': 5}. Best is trial 0 with value: 0.7690875232774674.
[I 2025-07-02 18:04:48,290] Trial 1 finished with value: 0.7672253258845437 and parameters: {'n_estimators': 150, 'max_depth': 10}. Best is trial 0 with value: 0.7690875232774674.
[I 2025-07-02 18:04:48,497] Trial 2 finished with value: 0.7728119180633147 and parameters: {'n_estimators': 50, 'max_depth': 15}. Best is trial 2 with value: 0.7728119180633147.
[I 2025-07-02 18:04:48,832] Trial 3 finished with value: 0.7653631284916201 and parameters: {'n_estimators': 100, 'max_depth': 15}. Best is trial 2 with value: 0.7728119180633147.
[I 2025-07-02 18:04:49,232] Trial 4 finished with value: 0.7690875232774674 and parameters: {'n_estimators': 100, 'max_depth': 20}. Best is trial 2 with value: 0.772811

Best trial accuracy: 0.7746741154562384
Best trial Hyperparameters: {'n_estimators': 50, 'max_depth': 5}
Test accuracy with best hyperparameters: 0.74


#### Optuna Visualization

In [7]:
from optuna.visualization import plot_optimization_history, plot_parallel_coordinate, plot_slice, plot_contour, plot_param_importances

In [18]:
# !uv pip install matplotlib plotly

Using Python 3.10.12 environment at: PyTorch
Resolved 13 packages in 183ms
Prepared 2 packages in 2.16s
Installed 2 packages in 25ms
 + narwhals==1.45.0
 + plotly==6.2.0


In [8]:
# 1. Optimization history - trial number vs accuracy graph

plot_optimization_history(study).show()

In [9]:
# 2. Parallel Coordinate Plot - Shows the hotspots

plot_parallel_coordinate(study).show()

In [10]:
# 3. Slice Plot - Individually plots objective function vs each hyperparameter

plot_slice(study).show()

In [11]:
# 4. Contour plot - Dense colour shows the best possible location of the objective function
plot_contour(study).show()

In [12]:
# 5. Importance plot - gives the importance of each hyperparameter

plot_param_importances(study).show()

##### Define-by-run and dynamic search spaces

Optuna's define-by-run API builds the search space while the objective function runs, which makes dynamic (conditional) search spaces possible. We can treat the choice of model/ML algorithm as a hyperparameter itself, so a single study tells us which algorithm suits our problem best and also gives the best hyperparameters for that algorithm. Each algorithm gets its own search space, depending on which one the trial selects. Optuna also supports distributed optimization to speed up the search.

In [14]:
# Optimizing multiple ML models

from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import  SVC

In [15]:
def objective(trial):
    classifier_name = trial.suggest_categorical('classifier', ['SVM', 'RandomForest', 'GradientBoosting'])
    if classifier_name=='SVM':
        # SVM hyperparams
        c = trial.suggest_float('C', 0.1, 100, log=True)
        kernel = trial.suggest_categorical('kernel', ['linear', 'rbf', 'poly', 'sigmoid'])
        gamma = trial.suggest_categorical('gamma', ['scale', 'auto'])
        model = SVC(C=c, kernel=kernel, gamma=gamma, random_state=42)
    elif classifier_name=='RandomForest':
        n_estimators = trial.suggest_int('n_estimators', 50, 300)
        max_depth = trial.suggest_int('max_depth', 3, 20)
        min_samples_split = trial.suggest_int('min_samples_split', 2, 10)
        min_samples_leaf = trial.suggest_int('min_samples_leaf', 1, 10)
        bootstrap = trial.suggest_categorical('bootstrap', [True, False])
        model = RandomForestClassifier(
            n_estimators=n_estimators,
            max_depth=max_depth,
            min_samples_split=min_samples_split,
            min_samples_leaf=min_samples_leaf,
            bootstrap=bootstrap,
            random_state=42
        )
    elif classifier_name == 'GradientBoosting':
        n_estimators = trial.suggest_int('n_estimators', 50, 300)
        learning_rate = trial.suggest_float('learning_rate', 0.01, 0.3, log=True)
        max_depth = trial.suggest_int('max_depth', 3, 20)
        min_samples_split = trial.suggest_int('min_samples_split', 2, 10)
        min_samples_leaf = trial.suggest_int('min_samples_leaf', 1, 10)
        model = GradientBoostingClassifier(
            n_estimators=n_estimators,
            learning_rate=learning_rate,
            max_depth=max_depth,
            min_samples_split=min_samples_split,
            min_samples_leaf=min_samples_leaf,
            random_state=42
        )

    # Perform cross-validation and return the mean accuracy
    score = cross_val_score(model, X_train, y_train, cv=3,scoring='accuracy').mean()
    return score

In [17]:
# Create a study object and optimize the objective function
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)

# Print the best result
print(f'Best trial accuracy: {study.best_trial.value}')
print(f'Best trial Hyperparameters: {study.best_trial.params}')

[I 2025-07-02 18:57:23,974] A new study created in memory with name: no-name-b89aef68-5b6f-4a7e-85cd-e4d3999cc84f
[I 2025-07-02 18:57:24,689] Trial 0 finished with value: 0.7653631284916201 and parameters: {'classifier': 'GradientBoosting', 'n_estimators': 258, 'learning_rate': 0.011472826268146236, 'max_depth': 3, 'min_samples_split': 9, 'min_samples_leaf': 5}. Best is trial 0 with value: 0.7653631284916201.
[I 2025-07-02 18:57:25,267] Trial 1 finished with value: 0.7709497206703911 and parameters: {'classifier': 'RandomForest', 'n_estimators': 244, 'max_depth': 20, 'min_samples_split': 4, 'min_samples_leaf': 7, 'bootstrap': False}. Best is trial 1 with value: 0.7709497206703911.
[I 2025-07-02 18:57:25,317] Trial 2 finished with value: 0.7858472998137801 and parameters: {'classifier': 'SVM', 'C': 18.041504378844213, 'kernel': 'linear', 'gamma': 'auto'}. Best is trial 2 with value: 0.7858472998137801.
[I 2025-07-02 18:57:25,555] Trial 3 finished with value: 0.756052141527002 and parame

Best trial accuracy: 0.7895716945996275
Best trial Hyperparameters: {'classifier': 'SVM', 'C': 0.13163973212477242, 'kernel': 'linear', 'gamma': 'auto'}


In [19]:
from sklearn.metrics import accuracy_score

params = study.best_trial.params
model_class = params['classifier']  # e.g., 'SVM'

# Remove 'classifier' and pass the rest as kwargs
model_kwargs = {k: v for k, v in params.items() if k != 'classifier'}
model_kwargs['random_state'] = 42

# Map the classifier name stored in the study to the actual class
classifier_map = {
    'SVM': SVC,
    'RandomForest': RandomForestClassifier,
    'GradientBoosting': GradientBoostingClassifier,
}

# Train using the best hyperparameters from Optuna
best_model = classifier_map[model_class](**model_kwargs)

best_model.fit(X_train, y_train)

y_pred = best_model.predict(X_test)

test_accuracy = accuracy_score(y_test, y_pred)

print(f'Test accuracy with best hyperparameters: {test_accuracy:.2f}')

Test accuracy with best hyperparameters: 0.74


In [20]:
# We can also create dataframe of the study

study.trials_dataframe()

Unnamed: 0,number,value,datetime_start,datetime_complete,duration,params_C,params_bootstrap,params_classifier,params_gamma,params_kernel,params_learning_rate,params_max_depth,params_min_samples_leaf,params_min_samples_split,params_n_estimators,state
0,0,0.765363,2025-07-02 18:57:23.975780,2025-07-02 18:57:24.689564,0 days 00:00:00.713784,,,GradientBoosting,,,0.011473,3.0,5.0,9.0,258.0,COMPLETE
1,1,0.770950,2025-07-02 18:57:24.690158,2025-07-02 18:57:25.267910,0 days 00:00:00.577752,,False,RandomForest,,,,20.0,7.0,4.0,244.0,COMPLETE
2,2,0.785847,2025-07-02 18:57:25.268676,2025-07-02 18:57:25.317121,0 days 00:00:00.048445,18.041504,,SVM,auto,linear,,,,,,COMPLETE
3,3,0.756052,2025-07-02 18:57:25.317748,2025-07-02 18:57:25.555334,0 days 00:00:00.237586,,True,RandomForest,,,,5.0,5.0,4.0,87.0,COMPLETE
4,4,0.757914,2025-07-02 18:57:25.555879,2025-07-02 18:57:25.955644,0 days 00:00:00.399765,,True,RandomForest,,,,6.0,6.0,2.0,142.0,COMPLETE
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,95,0.746741,2025-07-02 18:57:45.613228,2025-07-02 18:57:45.639297,0 days 00:00:00.026069,0.147560,,SVM,auto,rbf,,,,,,COMPLETE
96,96,0.789572,2025-07-02 18:57:45.639946,2025-07-02 18:57:45.661181,0 days 00:00:00.021235,0.139860,,SVM,auto,linear,,,,,,COMPLETE
97,97,0.713222,2025-07-02 18:57:45.661782,2025-07-02 18:57:45.684368,0 days 00:00:00.022586,0.142585,,SVM,auto,poly,,,,,,COMPLETE
98,98,0.756052,2025-07-02 18:57:45.685024,2025-07-02 18:57:46.471904,0 days 00:00:00.786880,,,GradientBoosting,,,0.122128,5.0,4.0,8.0,147.0,COMPLETE


In [21]:
study.trials_dataframe()['params_classifier'].value_counts()

params_classifier
SVM                 78
RandomForest        12
GradientBoosting    10
Name: count, dtype: int64

In [22]:
study.trials_dataframe().groupby('params_classifier')['value'].mean()

params_classifier
GradientBoosting    0.744134
RandomForest        0.761794
SVM                 0.772120
Name: value, dtype: float64

In [1]:
import sys

print(sys.executable)

/home/ml02/Downloads/PyTorch/PyTorch/bin/python


# OPTUNA IN DEEP LEARNING

#### How to train a neural network model in optuna

Optuna creates a study object, which contains trials. Each trial is passed to the objective function. The objective function defines the search space, model initialization, parameter initialization, training loop, and evaluation loop, and returns the objective metric.

In [1]:
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt
from torch.utils.data import Dataset, DataLoader
from sklearn.model_selection import train_test_split

In [3]:
torch.manual_seed(42)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device:{device}")

Using device:cuda


In [4]:
df = pd.read_csv("/home/ml02/Downloads/PyTorch/archive/fashion-mnist_train.csv")
df.shape

(60000, 785)

In [5]:
X = df.iloc[:,1:].values
y = df.iloc[:, 0].values

In [6]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [7]:
X_train = X_train/255.0
X_test = X_test/255.0

In [8]:
class CustomDataset(Dataset):
    def __init__(self, features, labels):
        self.features = torch.tensor(features, dtype=torch.float32)
        self.labels = torch.tensor(labels, dtype=torch.long)
    def __len__(self):
        return len(self.features)
    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]

In [9]:
train_dataset = CustomDataset(X_train, y_train)
test_dataset = CustomDataset(X_test, y_test)

In [10]:
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True, pin_memory=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False, pin_memory=True)

In [11]:
len(train_loader)

1500

In [12]:
# NB - The architecture below has a drawback: every hidden layer uses the same number of neurons
class MyNN(nn.Module):
    def __init__(self, input_dim, output_dim, num_hidden_layers, neurons_per_layer):
        super().__init__()
        layers = [] # This will store the layers dynamically in every trial
        for i in range(num_hidden_layers):
            layers.append(nn.Linear(input_dim, neurons_per_layer))
            layers.append(nn.BatchNorm1d(neurons_per_layer))
            layers.append(nn.ReLU())
            layers.append(nn.Dropout(0.3))
            input_dim = neurons_per_layer
        layers.append(nn.Linear(neurons_per_layer, output_dim))
        self.model = nn.Sequential(*layers) # Unpacking the array here as sequential takes input one by one

    def forward(self, x):
        return self.model(x)

In [13]:
# Now we will make our example objective function
# and will only focus on two things -
## 1. Number of hidden layers
## 2. Number of neurons per layer

def objective(trial):
    # next hyperparameter values from the search space
    num_hidden_layers = trial.suggest_int("num_hidden_layers", 1,5)
    neurons_per_layer = trial.suggest_int("neurons_per_layer", 8, 128, step=8)

    # model init
    input_dim = 784
    output_dim = 10

    model = MyNN(input_dim, output_dim, num_hidden_layers, neurons_per_layer)
    model.to(device)

    # param init
    lr = 0.1  # learning rate (fixed in this first experiment)
    epochs = 100

    # loss function and optimizer
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=lr, weight_decay=1e-4)

    # training loop - We will not print values of loss here
    model.train()
    for epoch in range(epochs):
        for batch_features, batch_labels in train_loader:
            batch_features, batch_labels = batch_features.to(device), batch_labels.to(device)
            outputs = model(batch_features) # Forward pass
            loss = criterion(outputs, batch_labels)
            # back pass
            optimizer.zero_grad()
            loss.backward()
            # update grads
            optimizer.step()

    # evaluation
    model.eval()
    total, correct = 0, 0
    with torch.no_grad():
        for batch_features, batch_labels in test_loader:
            batch_features, batch_labels = batch_features.to(device), batch_labels.to(device)
            outputs = model(batch_features)
            _, predicted = torch.max(outputs, 1)
            total += batch_labels.shape[0]
            correct = correct + (predicted == batch_labels).sum().item()
        accuracy = correct/total
    return accuracy

In [14]:
# Now we will create an Optuna study
import optuna

study = optuna.create_study(direction='maximize')

[I 2025-07-07 19:23:33,061] A new study created in memory with name: no-name-51296a41-b765-4c0e-86c1-31c67aedc18c


In [15]:
study.optimize(objective, n_trials=10)

[I 2025-07-07 19:26:56,750] Trial 0 finished with value: 0.7855833333333333 and parameters: {'num_hidden_layers': 2, 'neurons_per_layer': 8}. Best is trial 0 with value: 0.7855833333333333.
[I 2025-07-07 19:31:44,820] Trial 1 finished with value: 0.8934166666666666 and parameters: {'num_hidden_layers': 4, 'neurons_per_layer': 112}. Best is trial 1 with value: 0.8934166666666666.
[I 2025-07-07 19:36:37,260] Trial 2 finished with value: 0.8946666666666667 and parameters: {'num_hidden_layers': 4, 'neurons_per_layer': 112}. Best is trial 2 with value: 0.8946666666666667.
[I 2025-07-07 19:42:21,163] Trial 3 finished with value: 0.8860833333333333 and parameters: {'num_hidden_layers': 5, 'neurons_per_layer': 120}. Best is trial 2 with value: 0.8946666666666667.
[I 2025-07-07 19:46:21,853] Trial 4 finished with value: 0.8656666666666667 and parameters: {'num_hidden_layers': 3, 'neurons_per_layer': 32}. Best is trial 2 with value: 0.8946666666666667.
[I 2025-07-07 19:49:22,589] Trial 5 finishe

In [16]:
study.best_value

0.8946666666666667

In [17]:
study.best_params

{'num_hidden_layers': 4, 'neurons_per_layer': 112}

#### Now we will extend the objective function and add a few more hyperparameters to the search

In [1]:
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt
from torch.utils.data import Dataset, DataLoader
from sklearn.model_selection import train_test_split

In [2]:
torch.manual_seed(42)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device:{device}")

Using device:cuda


In [3]:
df = pd.read_csv("/home/ml02/Downloads/PyTorch/archive/fashion-mnist_train.csv")
df.shape

(60000, 785)

In [4]:
X = df.iloc[:,1:].values
y = df.iloc[:, 0].values

In [5]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [6]:
X_train = X_train/255.0
X_test = X_test/255.0

In [7]:
class CustomDataset(Dataset):
    def __init__(self, features, labels):
        self.features = torch.tensor(features, dtype=torch.float32)
        self.labels = torch.tensor(labels, dtype=torch.long)
    def __len__(self):
        return len(self.features)
    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]

In [8]:
train_dataset = CustomDataset(X_train, y_train)
test_dataset = CustomDataset(X_test, y_test)

In [9]:
# NB - The architecture below has a drawback: every hidden layer uses the same number of neurons
class MyNN(nn.Module):
    def __init__(self, input_dim, output_dim, num_hidden_layers, neurons_per_layer, dropout):
        super().__init__()
        layers = [] # This will store the layers dynamically in every trial
        for i in range(num_hidden_layers):
            layers.append(nn.Linear(input_dim, neurons_per_layer))
            layers.append(nn.BatchNorm1d(neurons_per_layer))
            layers.append(nn.ReLU())
            layers.append(nn.Dropout(dropout))
            input_dim = neurons_per_layer
        layers.append(nn.Linear(neurons_per_layer, output_dim))
        self.model = nn.Sequential(*layers) # Unpacking the array here as sequential takes input one by one

    def forward(self, x):
        return self.model(x)

In [10]:
# Now we extend the objective function to tune more hyperparameters:
# number of hidden layers, neurons per layer, epochs, learning rate,
# dropout rate, batch size, optimizer, and weight decay

def objective(trial):
    # next hyperparameter values from the search space
    num_hidden_layers = trial.suggest_int("num_hidden_layers", 1,5)
    neurons_per_layer = trial.suggest_int("neurons_per_layer", 8, 128, step=8)
    epochs = trial.suggest_int("epochs", 10, 50, step=10)
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True) # The values will be in logarithmic scale
    dropout_rate = trial.suggest_float("dropout", 0.1, 0.5, step=0.1)
    batch_size = trial.suggest_categorical("batch_size", [16,32,64,128])
    optimizer_name = trial.suggest_categorical("optimizer", ['Adam', 'SGD', 'RMSprop'])
    weight_decay = trial.suggest_float("weight_decay", 1e-5, 1e-3, log=True)

    # model init
    input_dim = 784
    output_dim = 10
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, pin_memory=True)
    test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False, pin_memory=True)
    model = MyNN(input_dim, output_dim, num_hidden_layers, neurons_per_layer, dropout_rate)
    model.to(device)

    # optimizer selection
    criterion = nn.CrossEntropyLoss()
    if optimizer_name == 'Adam':
        optimizer = optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)
    elif optimizer_name == 'SGD':
        optimizer = optim.SGD(model.parameters(), lr=lr, weight_decay=weight_decay)
    else:
        optimizer = optim.RMSprop(model.parameters(), lr=lr, weight_decay=weight_decay)

    # training loop - We will not print values of loss here
    model.train()
    for epoch in range(epochs):
        for batch_features, batch_labels in train_loader:
            batch_features, batch_labels = batch_features.to(device), batch_labels.to(device)
            outputs = model(batch_features) # Forward pass
            loss = criterion(outputs, batch_labels)
            # back pass
            optimizer.zero_grad()
            loss.backward()
            # update grads
            optimizer.step()

    # evaluation
    model.eval()
    total, correct = 0, 0
    with torch.no_grad():
        for batch_features, batch_labels in test_loader:
            batch_features, batch_labels = batch_features.to(device), batch_labels.to(device)
            outputs = model(batch_features)
            _, predicted = torch.max(outputs, 1)
            total += batch_labels.shape[0]
            correct = correct + (predicted == batch_labels).sum().item()
        accuracy = correct/total
    return accuracy
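
The `log=True` flag used for `lr` and `weight_decay` above means values are drawn uniformly in log space, so each decade of the range is roughly equally likely. The following is a standard-library illustration of that idea, not Optuna's implementation; the helper name and seed are assumptions.

```python
import math
import random


def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)  # arbitrary seed for repeatability
samples = [sample_log_uniform(rng, 1e-5, 1e-1) for _ in range(1000)]

# Count how many samples fall in each decade: [1e-5, 1e-4), ..., [1e-2, 1e-1].
decade_counts = [0, 0, 0, 0]
for value in samples:
    decade = min(3, max(0, int(math.floor(math.log10(value))) + 5))
    decade_counts[decade] += 1

print(decade_counts)
```

A plain uniform draw over the same range would put about 90% of the samples in the top decade, which is why learning rates are usually searched on a log scale.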

In [11]:
import optuna

study = optuna.create_study(direction='maximize')

[I 2025-07-07 20:42:02,607] A new study created in memory with name: no-name-01d1e774-3b13-426e-a4e6-6cef166a8263


In [12]:
study.optimize(objective, n_trials=10)

[I 2025-07-07 20:42:29,247] Trial 0 finished with value: 0.8781666666666667 and parameters: {'num_hidden_layers': 5, 'neurons_per_layer': 112, 'epochs': 10, 'lr': 0.00023473466577846877, 'dropout': 0.30000000000000004, 'batch_size': 128, 'optimizer': 'RMSprop', 'weight_decay': 3.8709483389315404e-05}. Best is trial 0 with value: 0.8781666666666667.
[I 2025-07-07 20:43:19,660] Trial 1 finished with value: 0.8655833333333334 and parameters: {'num_hidden_layers': 3, 'neurons_per_layer': 32, 'epochs': 30, 'lr': 0.0006317252980147981, 'dropout': 0.30000000000000004, 'batch_size': 128, 'optimizer': 'RMSprop', 'weight_decay': 5.373048924186291e-05}. Best is trial 0 with value: 0.8781666666666667.
[I 2025-07-07 20:44:20,216] Trial 2 finished with value: 0.8813333333333333 and parameters: {'num_hidden_layers': 4, 'neurons_per_layer': 112, 'epochs': 30, 'lr': 0.00013226849854707735, 'dropout': 0.2, 'batch_size': 128, 'optimizer': 'RMSprop', 'weight_decay': 0.0001327606301060182}. Best is trial 2

In [13]:
study.best_value

0.8813333333333333

In [14]:
study.best_params

{'num_hidden_layers': 4,
 'neurons_per_layer': 112,
 'epochs': 30,
 'lr': 0.00013226849854707735,
 'dropout': 0.2,
 'batch_size': 128,
 'optimizer': 'RMSprop',
 'weight_decay': 0.0001327606301060182}

# Integrating MLflow with Optuna to track trials

In [None]:
# !uv pip install mlflow

In [10]:
# !uv pip install optuna-integration

Using Python 3.10.12 environment at: PyTorch
Resolved 14 packages in 1.55s
Prepared 1 package in 180ms
Installed 1 package in 3ms
 + optuna-integration==4.4.0


In [1]:
import torch
import pandas as pd
import numpy as np
import torch.optim as optim
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from sklearn.model_selection import train_test_split

In [2]:
torch.manual_seed(42)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
df = pd.read_csv("/home/ml02/Downloads/PyTorch/archive/fashion-mnist_train.csv")
X = df.iloc[:,1:].values
y = df.iloc[:, 0].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train = X_train/255.0
X_test = X_test/255.0

In [3]:
class CustomDataset(Dataset):
    def __init__(self, features, labels):
        self.features = torch.tensor(features, dtype=torch.float32)
        self.labels = torch.tensor(labels, dtype=torch.long)
    def __len__(self):
        return len(self.features)
    def __getitem__(self, item):
        return self.features[item], self.labels[item]

train_dataset = CustomDataset(X_train, y_train)
test_dataset = CustomDataset(X_test, y_test)

In [4]:
class MyNN(nn.Module):
    def __init__(self, input_dim, output_dim, num_hidden_layers, neurons_per_layer, dropout):
        super().__init__()
        layers = []
        # Each hidden block: Linear -> BatchNorm -> ReLU -> Dropout
        for _ in range(num_hidden_layers):
            layers.append(nn.Linear(input_dim, neurons_per_layer))
            layers.append(nn.BatchNorm1d(neurons_per_layer))
            layers.append(nn.ReLU())
            layers.append(nn.Dropout(dropout))
            input_dim = neurons_per_layer  # the next block takes this block's output
        # Output layer produces raw logits (CrossEntropyLoss applies softmax internally)
        layers.append(nn.Linear(neurons_per_layer, output_dim))
        self.model = nn.Sequential(*layers)

    def forward(self, x):
        return self.model(x)
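
To get a feel for how much the architecture hyperparameters change the model size, the trainable parameters of `MyNN` can be counted by hand. The helper below is a rough sketch in plain Python (`count_trainable_params` is a made-up name, not part of PyTorch); it assumes at least one hidden layer and counts only weights and biases, not BatchNorm running statistics. The ranges used in the example (1 to 5 hidden layers, 8 to 128 neurons) are the ones searched in the objective function of this notebook.

```python
def count_trainable_params(input_dim, output_dim, num_hidden_layers, neurons_per_layer):
    """Count weights and biases of a Linear -> BatchNorm1d -> ReLU -> Dropout stack."""
    total = 0
    in_features = input_dim
    for _ in range(num_hidden_layers):
        # Linear layer: weight matrix plus bias vector
        total += in_features * neurons_per_layer + neurons_per_layer
        # BatchNorm1d: one learnable scale and one shift per feature
        total += 2 * neurons_per_layer
        in_features = neurons_per_layer
    # Output Linear layer
    total += in_features * output_dim + output_dim
    return total


smallest = count_trainable_params(784, 10, num_hidden_layers=1, neurons_per_layer=8)
largest = count_trainable_params(784, 10, num_hidden_layers=5, neurons_per_layer=128)
print(smallest, largest)
```

For the two extremes of the search space this gives 6,386 and 169,098 trainable parameters, so the largest candidate is roughly 26 times bigger than the smallest.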

You don't need to write `with mlflow.start_run():` yourself when using Optuna with `MLflowCallback`. The callback from Optuna's MLflow integration wraps each trial in an MLflow run behind the scenes, so
```
with mlflow.start_run():
    ...
```
is done automatically for each trial.

You just need to:

1. Create your `MLflowCallback`.
2. Decorate your objective function with `@mlflc.track_in_mlflow()`, so that anything logged inside the objective lands in the same run as the trial.
3. Run the study with `callbacks=[mlflc]`, and each trial's parameters and objective value get logged.
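
To make the idea of "one run per trial" concrete, here is a toy stand-in written in plain Python. It is not how Optuna or MLflow are implemented: `ToyTracker`, its `track` decorator, and the dictionary-shaped `trial` are hypothetical names invented for this sketch. It only shows the pattern of a decorator opening a run around the objective, then recording the parameters and the returned value.

```python
import functools
from contextlib import contextmanager


class ToyTracker:
    """Hypothetical stand-in for an experiment tracker (NOT the MLflow API)."""

    def __init__(self):
        self.runs = []  # finished runs, in the order they ended

    @contextmanager
    def start_run(self, name):
        run = {"name": name, "params": {}, "metrics": {}}
        try:
            yield run
        finally:
            # The run is closed and stored even if the objective raises.
            self.runs.append(run)

    def track(self, objective):
        """Decorator: open one run per trial and log its params and result."""

        @functools.wraps(objective)
        def wrapper(trial):
            with self.start_run(name=str(trial["number"])) as run:
                value = objective(trial)
                run["params"].update(trial["params"])
                run["metrics"]["Accuracy"] = value
                return value

        return wrapper


tracker = ToyTracker()


@tracker.track
def objective(trial):
    # Pretend the score depends only on how close lr is to 0.001.
    return 1.0 - abs(trial["params"]["lr"] - 0.001)


# Stand-in for study.optimize: call the objective once per "trial".
for number, lr in enumerate([0.1, 0.001]):
    objective({"number": number, "params": {"lr": lr}})

print([run["name"] for run in tracker.runs])
```

The objective itself never mentions the tracker, which is the point of the pattern: the logging is added from the outside.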

In [5]:
import optuna
import mlflow
from optuna.integration.mlflow import MLflowCallback

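# Not needed here: the callback is given the tracking URI directly, and the
# MLflow experiment is named after the Optuna study.
# mlflow.set_tracking_uri("http://localhost:5000")
# mlflow.set_experiment("Optuna Hyperparameter Search 1")

mlflc = MLflowCallback(
    tracking_uri="http://localhost:5000",
    metric_name="Accuracy",
)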

@mlflc.track_in_mlflow()
def objective(trial):
    num_hidden_layers = trial.suggest_int("num_hidden_layers", 1,5)
    neurons_per_layer = trial.suggest_int("neurons_per_layer", 8, 128, step=8)
    epochs = trial.suggest_int("epochs", 10, 100, step=10)
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    dropout_rate = trial.suggest_float("dropout", 0.1, 0.5, step=0.1)
    batch_size = trial.suggest_categorical("batch_size", [16,32,64,128])
    # Optional: Add label smoothing as a hyperparameter
    label_smoothing = trial.suggest_float("label_smoothing", 0.0, 0.2, step=0.05)
    optimizer_name = trial.suggest_categorical("optimizer", ['Adam', 'SGD', 'RMSprop', 'Rprop', 'LBFGS'])
    weight_decay = trial.suggest_float("weight_decay", 1e-5, 1e-3, log=True)

    # model init
    input_dim = 784
    output_dim = 10
    train_loader = DataLoader(train_dataset, batch_size, shuffle=True, pin_memory=True)
    test_loader = DataLoader(test_dataset, batch_size, shuffle=False, pin_memory=True)
    model = MyNN(input_dim, output_dim, num_hidden_layers, neurons_per_layer, dropout_rate)
    model.to(device)

    # loss function
    criterion = nn.CrossEntropyLoss(label_smoothing=label_smoothing)

    # optimizer selection
    # Note: weight_decay is sampled for every trial but only applied to
    # Adam, SGD and RMSprop; Rprop and LBFGS do not take that argument.
    if optimizer_name == 'Adam':
        optimizer = optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)
    elif optimizer_name == 'SGD':
        optimizer = optim.SGD(model.parameters(), lr=lr, weight_decay=weight_decay)
    elif optimizer_name == 'Rprop':
        optimizer = optim.Rprop(model.parameters(), lr=lr, etas=(0.5, 1.2), step_sizes=(1e-10, 100))
    elif optimizer_name == 'LBFGS':
        optimizer = optim.LBFGS(model.parameters(), lr=lr, max_iter=15, tolerance_grad=1e-10, tolerance_change=1e-12)
    else:
        optimizer = optim.RMSprop(model.parameters(), lr=lr, weight_decay=weight_decay)

    # training loop
    model.train()
    for epoch in range(epochs):
        for batch_features, batch_labels in train_loader:
            batch_features, batch_labels = batch_features.to(device), batch_labels.to(device)
            if optimizer_name == 'LBFGS':
                def closure():
                    optimizer.zero_grad()
                    outputs = model(batch_features)
                    loss = criterion(outputs, batch_labels)
                    loss.backward()
                    return loss
                optimizer.step(closure)
            else:
                # forward pass
                outputs = model(batch_features)
                loss = criterion(outputs, batch_labels)
                # backward pass
                optimizer.zero_grad()
                loss.backward()
                # update parameters
                optimizer.step()

    # evaluation
    model.eval()
    total, correct = 0, 0
    with torch.no_grad():
        for batch_features, batch_labels in test_loader:
            batch_features, batch_labels = batch_features.to(device), batch_labels.to(device)
            outputs = model(batch_features)
            _, predicted = torch.max(outputs, 1)
            total += batch_features.shape[0]
            correct += (predicted == batch_labels).sum().item()
        accuracy = correct/total
    return accuracy
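
The `log=True` flag on `lr` and `weight_decay` asks Optuna to sample those values from the log domain, so each order of magnitude between the bounds gets comparable attention. The snippet below is a plain-Python illustration of that idea only; `sample_log_uniform` is a hypothetical helper and not Optuna's sampler (the default TPE sampler is more involved than an independent uniform draw).

```python
import math
import random


def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Same bounds as the "lr" suggestion in the objective: 1e-5 to 1e-1.
samples = [sample_log_uniform(rng, 1e-5, 1e-1) for _ in range(10_000)]

# 1e-3 is the geometric midpoint of the range, so about half of the
# draws should fall below it.
below_midpoint = sum(s < 1e-3 for s in samples) / len(samples)
print(f"share of samples below 1e-3: {below_midpoint:.2f}")
```

With a plain uniform draw on the same interval, only about 1% of the samples would fall below 1e-3, which is why learning rates are usually searched on a log scale.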



In [6]:
study = optuna.create_study(study_name="First_Optuna_Hyperparameter_Search", direction="maximize")

[I 2025-07-08 16:43:02,642] A new study created in memory with name: First_Optuna_Hyperparameter_Search


In [7]:
study.optimize(objective, n_trials=50, callbacks=[mlflc])

2025/07/08 16:43:04 INFO mlflow.tracking.fluent: Experiment with name 'First_Optuna_Hyperparameter_Search' does not exist. Creating a new experiment.
[I 2025-07-08 16:43:35,218] Trial 0 finished with value: 0.8689166666666667 and parameters: {'num_hidden_layers': 4, 'neurons_per_layer': 32, 'epochs': 20, 'lr': 0.00029765816732581007, 'dropout': 0.2, 'batch_size': 128, 'label_smoothing': 0.0, 'optimizer': 'Adam', 'weight_decay': 0.00015518267720959756}. Best is trial 0 with value: 0.8689166666666667.


🏃 View run 0 at: http://localhost:5000/#/experiments/605049891641191207/runs/9fe98f09c0294489886a9688412e9abe
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 16:46:54,570] Trial 1 finished with value: 0.54875 and parameters: {'num_hidden_layers': 5, 'neurons_per_layer': 16, 'epochs': 60, 'lr': 2.2137897447501633e-05, 'dropout': 0.2, 'batch_size': 32, 'label_smoothing': 0.1, 'optimizer': 'SGD', 'weight_decay': 0.0003129368585252753}. Best is trial 0 with value: 0.8689166666666667.


🏃 View run 1 at: http://localhost:5000/#/experiments/605049891641191207/runs/9fe0ca6b018542cc9f7b1d2acfdd006b
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 16:52:11,478] Trial 2 finished with value: 0.293 and parameters: {'num_hidden_layers': 5, 'neurons_per_layer': 112, 'epochs': 60, 'lr': 0.00011238180034710855, 'dropout': 0.30000000000000004, 'batch_size': 64, 'label_smoothing': 0.15000000000000002, 'optimizer': 'Rprop', 'weight_decay': 0.00041725843676110476}. Best is trial 0 with value: 0.8689166666666667.


🏃 View run 2 at: http://localhost:5000/#/experiments/605049891641191207/runs/0be3d0170da749c085460d8a50fad7f2
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 16:57:07,554] Trial 3 finished with value: 0.6896666666666667 and parameters: {'num_hidden_layers': 3, 'neurons_per_layer': 112, 'epochs': 70, 'lr': 2.1844212736528202e-05, 'dropout': 0.2, 'batch_size': 64, 'label_smoothing': 0.2, 'optimizer': 'Rprop', 'weight_decay': 0.00022698801855214538}. Best is trial 0 with value: 0.8689166666666667.


🏃 View run 3 at: http://localhost:5000/#/experiments/605049891641191207/runs/0c76399f8a844e7a877abf5a05c3855d
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 16:59:13,378] Trial 4 finished with value: 0.08291666666666667 and parameters: {'num_hidden_layers': 4, 'neurons_per_layer': 72, 'epochs': 50, 'lr': 8.08874004743569e-05, 'dropout': 0.1, 'batch_size': 128, 'label_smoothing': 0.0, 'optimizer': 'LBFGS', 'weight_decay': 7.050602797711133e-05}. Best is trial 0 with value: 0.8689166666666667.


🏃 View run 4 at: http://localhost:5000/#/experiments/605049891641191207/runs/15109c18a5b247b289a35161cdf3adf6
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 17:01:32,987] Trial 5 finished with value: 0.7405 and parameters: {'num_hidden_layers': 1, 'neurons_per_layer': 128, 'epochs': 50, 'lr': 0.00909798546500459, 'dropout': 0.5, 'batch_size': 32, 'label_smoothing': 0.05, 'optimizer': 'Rprop', 'weight_decay': 3.074898863635081e-05}. Best is trial 0 with value: 0.8689166666666667.


🏃 View run 5 at: http://localhost:5000/#/experiments/605049891641191207/runs/74a04140f23548d88925557a34dfad19
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 17:10:43,236] Trial 6 finished with value: 0.13533333333333333 and parameters: {'num_hidden_layers': 5, 'neurons_per_layer': 96, 'epochs': 100, 'lr': 0.0002638130616943808, 'dropout': 0.5, 'batch_size': 64, 'label_smoothing': 0.1, 'optimizer': 'LBFGS', 'weight_decay': 3.644954514412465e-05}. Best is trial 0 with value: 0.8689166666666667.


🏃 View run 6 at: http://localhost:5000/#/experiments/605049891641191207/runs/46570a1a02ff4632bd12c5edaead7b19
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 17:18:37,343] Trial 7 finished with value: 0.8521666666666666 and parameters: {'num_hidden_layers': 2, 'neurons_per_layer': 120, 'epochs': 100, 'lr': 0.00804114520064228, 'dropout': 0.1, 'batch_size': 16, 'label_smoothing': 0.1, 'optimizer': 'Adam', 'weight_decay': 3.9634779944739436e-05}. Best is trial 0 with value: 0.8689166666666667.


🏃 View run 7 at: http://localhost:5000/#/experiments/605049891641191207/runs/c01e2640c0e04e3585ffc3d7195d73b1
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 17:20:51,705] Trial 8 finished with value: 0.7461666666666666 and parameters: {'num_hidden_layers': 5, 'neurons_per_layer': 120, 'epochs': 40, 'lr': 0.00769937646396465, 'dropout': 0.5, 'batch_size': 32, 'label_smoothing': 0.05, 'optimizer': 'Adam', 'weight_decay': 0.0003267975039695507}. Best is trial 0 with value: 0.8689166666666667.


🏃 View run 8 at: http://localhost:5000/#/experiments/605049891641191207/runs/0e75c4c583894d81bb9caef1def736d3
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 17:27:24,268] Trial 9 finished with value: 0.18275 and parameters: {'num_hidden_layers': 3, 'neurons_per_layer': 48, 'epochs': 100, 'lr': 0.00019638649712595333, 'dropout': 0.30000000000000004, 'batch_size': 64, 'label_smoothing': 0.0, 'optimizer': 'LBFGS', 'weight_decay': 0.0003986844491414797}. Best is trial 0 with value: 0.8689166666666667.


🏃 View run 9 at: http://localhost:5000/#/experiments/605049891641191207/runs/546358e0e96d40d0875e87b3e6444ab8
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 17:27:47,145] Trial 10 finished with value: 0.24416666666666667 and parameters: {'num_hidden_layers': 4, 'neurons_per_layer': 8, 'epochs': 10, 'lr': 0.06571152942331564, 'dropout': 0.4, 'batch_size': 128, 'label_smoothing': 0.0, 'optimizer': 'RMSprop', 'weight_decay': 0.00012320470992863993}. Best is trial 0 with value: 0.8689166666666667.


🏃 View run 10 at: http://localhost:5000/#/experiments/605049891641191207/runs/f2801415d75f46e5a051d2202f44eb9c
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 17:29:06,501] Trial 11 finished with value: 0.86975 and parameters: {'num_hidden_layers': 1, 'neurons_per_layer': 40, 'epochs': 20, 'lr': 0.0017170117999339696, 'dropout': 0.1, 'batch_size': 16, 'label_smoothing': 0.2, 'optimizer': 'Adam', 'weight_decay': 1.3761536117335797e-05}. Best is trial 11 with value: 0.86975.


🏃 View run 11 at: http://localhost:5000/#/experiments/605049891641191207/runs/ceb211b767b44aea957e89ee951eaf40
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 17:29:50,093] Trial 12 finished with value: 0.869 and parameters: {'num_hidden_layers': 1, 'neurons_per_layer': 40, 'epochs': 10, 'lr': 0.0011785169177637837, 'dropout': 0.2, 'batch_size': 16, 'label_smoothing': 0.2, 'optimizer': 'Adam', 'weight_decay': 1.0581158941699855e-05}. Best is trial 11 with value: 0.86975.


🏃 View run 12 at: http://localhost:5000/#/experiments/605049891641191207/runs/185450e5af2048d1a82a02bf696a537a
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 17:31:50,486] Trial 13 finished with value: 0.8774166666666666 and parameters: {'num_hidden_layers': 1, 'neurons_per_layer': 56, 'epochs': 30, 'lr': 0.0013684121933584452, 'dropout': 0.1, 'batch_size': 16, 'label_smoothing': 0.2, 'optimizer': 'Adam', 'weight_decay': 1.2817062586687998e-05}. Best is trial 13 with value: 0.8774166666666666.


🏃 View run 13 at: http://localhost:5000/#/experiments/605049891641191207/runs/f514e2284b324805ae5c7180ebd1b82a
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 17:34:15,957] Trial 14 finished with value: 0.8855 and parameters: {'num_hidden_layers': 2, 'neurons_per_layer': 64, 'epochs': 30, 'lr': 0.0015772473893209706, 'dropout': 0.1, 'batch_size': 16, 'label_smoothing': 0.15000000000000002, 'optimizer': 'Adam', 'weight_decay': 1.1390864828940765e-05}. Best is trial 14 with value: 0.8855.


🏃 View run 14 at: http://localhost:5000/#/experiments/605049891641191207/runs/0bd7ebe3d3784b9ea2d9632c7bae432d
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 17:36:22,855] Trial 15 finished with value: 0.8844166666666666 and parameters: {'num_hidden_layers': 2, 'neurons_per_layer': 72, 'epochs': 30, 'lr': 0.00278282083135261, 'dropout': 0.1, 'batch_size': 16, 'label_smoothing': 0.15000000000000002, 'optimizer': 'SGD', 'weight_decay': 1.8804736545934407e-05}. Best is trial 14 with value: 0.8855.


🏃 View run 15 at: http://localhost:5000/#/experiments/605049891641191207/runs/49dcf08714174308a5dbbfcada01cab8
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 17:38:29,927] Trial 16 finished with value: 0.8618333333333333 and parameters: {'num_hidden_layers': 2, 'neurons_per_layer': 80, 'epochs': 30, 'lr': 0.03148645689563906, 'dropout': 0.1, 'batch_size': 16, 'label_smoothing': 0.15000000000000002, 'optimizer': 'SGD', 'weight_decay': 0.0009441176684661442}. Best is trial 14 with value: 0.8855.


🏃 View run 16 at: http://localhost:5000/#/experiments/605049891641191207/runs/42e39fe679964c7bb7a2c956d20beb43
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 17:41:12,510] Trial 17 finished with value: 0.8751666666666666 and parameters: {'num_hidden_layers': 2, 'neurons_per_layer': 88, 'epochs': 40, 'lr': 0.003081882651463598, 'dropout': 0.4, 'batch_size': 16, 'label_smoothing': 0.15000000000000002, 'optimizer': 'SGD', 'weight_decay': 1.8589970084200084e-05}. Best is trial 14 with value: 0.8855.


🏃 View run 17 at: http://localhost:5000/#/experiments/605049891641191207/runs/2d987e01eda54e549ae04e0a93f778f5
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 17:47:11,365] Trial 18 finished with value: 0.887 and parameters: {'num_hidden_layers': 2, 'neurons_per_layer': 64, 'epochs': 80, 'lr': 0.0005627287653936512, 'dropout': 0.2, 'batch_size': 16, 'label_smoothing': 0.15000000000000002, 'optimizer': 'RMSprop', 'weight_decay': 2.2164795599592697e-05}. Best is trial 18 with value: 0.887.


🏃 View run 18 at: http://localhost:5000/#/experiments/605049891641191207/runs/66499b559f8842129ca15311db71f3a5
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 17:53:20,585] Trial 19 finished with value: 0.8816666666666667 and parameters: {'num_hidden_layers': 3, 'neurons_per_layer': 56, 'epochs': 70, 'lr': 0.0005184481708970104, 'dropout': 0.2, 'batch_size': 16, 'label_smoothing': 0.15000000000000002, 'optimizer': 'RMSprop', 'weight_decay': 6.182359486670447e-05}. Best is trial 18 with value: 0.887.


🏃 View run 19 at: http://localhost:5000/#/experiments/605049891641191207/runs/70033196f063478ba7b040b3f4f56d18
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 18:00:16,568] Trial 20 finished with value: 0.8915833333333333 and parameters: {'num_hidden_layers': 2, 'neurons_per_layer': 96, 'epochs': 80, 'lr': 5.176769310304068e-05, 'dropout': 0.30000000000000004, 'batch_size': 16, 'label_smoothing': 0.05, 'optimizer': 'RMSprop', 'weight_decay': 2.0280768981843973e-05}. Best is trial 20 with value: 0.8915833333333333.


🏃 View run 20 at: http://localhost:5000/#/experiments/605049891641191207/runs/b48c6f1e570a4145b2d8129fdbe93d08
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 18:06:54,556] Trial 21 finished with value: 0.8901666666666667 and parameters: {'num_hidden_layers': 2, 'neurons_per_layer': 96, 'epochs': 80, 'lr': 5.961025496874527e-05, 'dropout': 0.30000000000000004, 'batch_size': 16, 'label_smoothing': 0.05, 'optimizer': 'RMSprop', 'weight_decay': 2.3003030653660054e-05}. Best is trial 20 with value: 0.8915833333333333.


🏃 View run 21 at: http://localhost:5000/#/experiments/605049891641191207/runs/8f82f3daaccf4582a612906c2cb8817a
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 18:13:17,201] Trial 22 finished with value: 0.8901666666666667 and parameters: {'num_hidden_layers': 2, 'neurons_per_layer': 96, 'epochs': 80, 'lr': 4.8589147564376325e-05, 'dropout': 0.30000000000000004, 'batch_size': 16, 'label_smoothing': 0.05, 'optimizer': 'RMSprop', 'weight_decay': 2.452139772512496e-05}. Best is trial 20 with value: 0.8915833333333333.


🏃 View run 22 at: http://localhost:5000/#/experiments/605049891641191207/runs/6d2befe07b4d4237bf34de6f6fe88935
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 18:23:05,533] Trial 23 finished with value: 0.8705833333333334 and parameters: {'num_hidden_layers': 3, 'neurons_per_layer': 96, 'epochs': 80, 'lr': 1.0979404664296007e-05, 'dropout': 0.4, 'batch_size': 16, 'label_smoothing': 0.05, 'optimizer': 'RMSprop', 'weight_decay': 4.644267783132921e-05}. Best is trial 20 with value: 0.8915833333333333.


🏃 View run 23 at: http://localhost:5000/#/experiments/605049891641191207/runs/dc9794282940429fa16df42ae8d36443
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 18:33:09,105] Trial 24 finished with value: 0.8886666666666667 and parameters: {'num_hidden_layers': 2, 'neurons_per_layer': 96, 'epochs': 90, 'lr': 4.659387648519733e-05, 'dropout': 0.30000000000000004, 'batch_size': 16, 'label_smoothing': 0.05, 'optimizer': 'RMSprop', 'weight_decay': 2.580785962340548e-05}. Best is trial 20 with value: 0.8915833333333333.


🏃 View run 24 at: http://localhost:5000/#/experiments/605049891641191207/runs/8b5716ff2e4c4ffea77bc460d6371844
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 18:40:33,716] Trial 25 finished with value: 0.8916666666666667 and parameters: {'num_hidden_layers': 3, 'neurons_per_layer': 104, 'epochs': 80, 'lr': 6.464194037292499e-05, 'dropout': 0.30000000000000004, 'batch_size': 16, 'label_smoothing': 0.05, 'optimizer': 'RMSprop', 'weight_decay': 6.595346303851223e-05}. Best is trial 25 with value: 0.8916666666666667.


🏃 View run 25 at: http://localhost:5000/#/experiments/605049891641191207/runs/cf04425faa874d7b9df8aac1515d383f
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 18:48:50,672] Trial 26 finished with value: 0.8790833333333333 and parameters: {'num_hidden_layers': 3, 'neurons_per_layer': 104, 'epochs': 90, 'lr': 1.2941137299710918e-05, 'dropout': 0.4, 'batch_size': 16, 'label_smoothing': 0.05, 'optimizer': 'RMSprop', 'weight_decay': 7.344479260994156e-05}. Best is trial 25 with value: 0.8916666666666667.


🏃 View run 26 at: http://localhost:5000/#/experiments/605049891641191207/runs/eb77f701cfc644f89fc0ea78e3ad07c9
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 18:50:45,166] Trial 27 finished with value: 0.8859166666666667 and parameters: {'num_hidden_layers': 3, 'neurons_per_layer': 80, 'epochs': 70, 'lr': 9.974189045697116e-05, 'dropout': 0.30000000000000004, 'batch_size': 128, 'label_smoothing': 0.1, 'optimizer': 'RMSprop', 'weight_decay': 5.440261769495814e-05}. Best is trial 25 with value: 0.8916666666666667.


🏃 View run 27 at: http://localhost:5000/#/experiments/605049891641191207/runs/e66c2abe67fc4f98ab4be76e5a66f202
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 18:55:33,364] Trial 28 finished with value: 0.887 and parameters: {'num_hidden_layers': 4, 'neurons_per_layer': 112, 'epochs': 90, 'lr': 3.4663651344011756e-05, 'dropout': 0.4, 'batch_size': 32, 'label_smoothing': 0.05, 'optimizer': 'RMSprop', 'weight_decay': 0.00010505420803425447}. Best is trial 25 with value: 0.8916666666666667.


🏃 View run 28 at: http://localhost:5000/#/experiments/605049891641191207/runs/2b508ae13a09460bbf07a65af62b687b
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 18:57:30,769] Trial 29 finished with value: 0.88625 and parameters: {'num_hidden_layers': 3, 'neurons_per_layer': 88, 'epochs': 80, 'lr': 0.00013116147889143774, 'dropout': 0.30000000000000004, 'batch_size': 128, 'label_smoothing': 0.0, 'optimizer': 'RMSprop', 'weight_decay': 0.0001969917988468579}. Best is trial 25 with value: 0.8916666666666667.


🏃 View run 29 at: http://localhost:5000/#/experiments/605049891641191207/runs/2d8935a7a0584ef99012aa2c8f025df9
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 19:04:37,034] Trial 30 finished with value: 0.8931666666666667 and parameters: {'num_hidden_layers': 4, 'neurons_per_layer': 104, 'epochs': 70, 'lr': 0.00048349713123405476, 'dropout': 0.30000000000000004, 'batch_size': 16, 'label_smoothing': 0.1, 'optimizer': 'RMSprop', 'weight_decay': 1.5893199256397625e-05}. Best is trial 30 with value: 0.8931666666666667.


🏃 View run 30 at: http://localhost:5000/#/experiments/605049891641191207/runs/5a61bbd601744656b78b9a1d5afab5cc
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 19:11:32,768] Trial 31 finished with value: 0.89275 and parameters: {'num_hidden_layers': 4, 'neurons_per_layer': 104, 'epochs': 70, 'lr': 0.0005954989933975906, 'dropout': 0.30000000000000004, 'batch_size': 16, 'label_smoothing': 0.1, 'optimizer': 'RMSprop', 'weight_decay': 1.6612789881027457e-05}. Best is trial 30 with value: 0.8931666666666667.


🏃 View run 31 at: http://localhost:5000/#/experiments/605049891641191207/runs/edfb73109e0947ac9a0f18263cfbca72
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 19:19:47,268] Trial 32 finished with value: 0.8924166666666666 and parameters: {'num_hidden_layers': 4, 'neurons_per_layer': 128, 'epochs': 60, 'lr': 0.00047185955116869677, 'dropout': 0.30000000000000004, 'batch_size': 16, 'label_smoothing': 0.1, 'optimizer': 'RMSprop', 'weight_decay': 1.3919659343137671e-05}. Best is trial 30 with value: 0.8931666666666667.


🏃 View run 32 at: http://localhost:5000/#/experiments/605049891641191207/runs/417ec97d480e4bcab7d34f66dc78baf2
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 19:26:55,533] Trial 33 finished with value: 0.8909166666666667 and parameters: {'num_hidden_layers': 4, 'neurons_per_layer': 128, 'epochs': 60, 'lr': 0.0005795216043981217, 'dropout': 0.4, 'batch_size': 16, 'label_smoothing': 0.1, 'optimizer': 'RMSprop', 'weight_decay': 1.4351035041219301e-05}. Best is trial 30 with value: 0.8931666666666667.


🏃 View run 33 at: http://localhost:5000/#/experiments/605049891641191207/runs/8730ab8f56244227a858638a8997cdb3
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 19:32:51,589] Trial 34 finished with value: 0.8950833333333333 and parameters: {'num_hidden_layers': 4, 'neurons_per_layer': 120, 'epochs': 60, 'lr': 0.00035248257627717983, 'dropout': 0.2, 'batch_size': 16, 'label_smoothing': 0.1, 'optimizer': 'RMSprop', 'weight_decay': 1.5814593102356403e-05}. Best is trial 34 with value: 0.8950833333333333.


🏃 View run 34 at: http://localhost:5000/#/experiments/605049891641191207/runs/b44c65a9a2024b6c9732ca792fb47df2
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 19:37:50,090] Trial 35 finished with value: 0.6998333333333333 and parameters: {'num_hidden_layers': 4, 'neurons_per_layer': 120, 'epochs': 60, 'lr': 0.0003961720516819966, 'dropout': 0.2, 'batch_size': 64, 'label_smoothing': 0.1, 'optimizer': 'Rprop', 'weight_decay': 1.0027826000388541e-05}. Best is trial 34 with value: 0.8950833333333333.


🏃 View run 35 at: http://localhost:5000/#/experiments/605049891641191207/runs/9b4c6dd2f1af42fd8d4ca10b944b0857
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 19:42:21,794] Trial 36 finished with value: 0.8941666666666667 and parameters: {'num_hidden_layers': 5, 'neurons_per_layer': 128, 'epochs': 70, 'lr': 0.00016844514917520388, 'dropout': 0.2, 'batch_size': 32, 'label_smoothing': 0.1, 'optimizer': 'RMSprop', 'weight_decay': 1.5984129765075968e-05}. Best is trial 34 with value: 0.8950833333333333.


🏃 View run 36 at: http://localhost:5000/#/experiments/605049891641191207/runs/d9386e2d890d46ccb4aef97e32140aa0
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 19:50:50,899] Trial 37 finished with value: 0.0865 and parameters: {'num_hidden_layers': 5, 'neurons_per_layer': 112, 'epochs': 70, 'lr': 0.00018444059455403626, 'dropout': 0.2, 'batch_size': 32, 'label_smoothing': 0.1, 'optimizer': 'LBFGS', 'weight_decay': 3.067813250869572e-05}. Best is trial 34 with value: 0.8950833333333333.


🏃 View run 37 at: http://localhost:5000/#/experiments/605049891641191207/runs/2a09dbef0e574761ac4c212d1e713d4b
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 19:54:17,959] Trial 38 finished with value: 0.8941666666666667 and parameters: {'num_hidden_layers': 5, 'neurons_per_layer': 120, 'epochs': 50, 'lr': 0.0008131188074351917, 'dropout': 0.2, 'batch_size': 32, 'label_smoothing': 0.1, 'optimizer': 'RMSprop', 'weight_decay': 1.614542703364417e-05}. Best is trial 34 with value: 0.8950833333333333.


🏃 View run 38 at: http://localhost:5000/#/experiments/605049891641191207/runs/29272f0217e94347a449a3c64b4a63b7
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 19:58:17,198] Trial 39 finished with value: 0.893 and parameters: {'num_hidden_layers': 5, 'neurons_per_layer': 120, 'epochs': 50, 'lr': 0.0002576570726809196, 'dropout': 0.2, 'batch_size': 32, 'label_smoothing': 0.1, 'optimizer': 'RMSprop', 'weight_decay': 3.687530934774476e-05}. Best is trial 34 with value: 0.8950833333333333.


🏃 View run 39 at: http://localhost:5000/#/experiments/605049891641191207/runs/29bd70c08cb447b6861e0f0cf643d6c0
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 20:05:53,245] Trial 40 finished with value: 0.587 and parameters: {'num_hidden_layers': 5, 'neurons_per_layer': 128, 'epochs': 50, 'lr': 0.0008655706311511185, 'dropout': 0.2, 'batch_size': 32, 'label_smoothing': 0.1, 'optimizer': 'Rprop', 'weight_decay': 3.097691031873351e-05}. Best is trial 34 with value: 0.8950833333333333.


🏃 View run 40 at: http://localhost:5000/#/experiments/605049891641191207/runs/012e3dd7a8484b4e9182b47710986a74
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 20:09:52,653] Trial 41 finished with value: 0.8930833333333333 and parameters: {'num_hidden_layers': 5, 'neurons_per_layer': 120, 'epochs': 50, 'lr': 0.00024660555221178606, 'dropout': 0.2, 'batch_size': 32, 'label_smoothing': 0.1, 'optimizer': 'RMSprop', 'weight_decay': 1.7109059735635917e-05}. Best is trial 34 with value: 0.8950833333333333.


🏃 View run 41 at: http://localhost:5000/#/experiments/605049891641191207/runs/a3dbf7c214b34d1fb778a815ce0f1156
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 20:13:29,405] Trial 42 finished with value: 0.8971666666666667 and parameters: {'num_hidden_layers': 5, 'neurons_per_layer': 120, 'epochs': 40, 'lr': 0.00014945226423368488, 'dropout': 0.2, 'batch_size': 32, 'label_smoothing': 0.1, 'optimizer': 'RMSprop', 'weight_decay': 1.5509457304206577e-05}. Best is trial 42 with value: 0.8971666666666667.


🏃 View run 42 at: http://localhost:5000/#/experiments/605049891641191207/runs/beaff062381d4bfebcaaa8f770417336
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 20:16:19,075] Trial 43 finished with value: 0.8923333333333333 and parameters: {'num_hidden_layers': 5, 'neurons_per_layer': 112, 'epochs': 40, 'lr': 0.0001415159153234608, 'dropout': 0.2, 'batch_size': 32, 'label_smoothing': 0.1, 'optimizer': 'RMSprop', 'weight_decay': 1.559495434050298e-05}. Best is trial 42 with value: 0.8971666666666667.


🏃 View run 43 at: http://localhost:5000/#/experiments/605049891641191207/runs/fb84bc3c0fd2421a8205bb6cabc84d25
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 20:22:00,530] Trial 44 finished with value: 0.06958333333333333 and parameters: {'num_hidden_layers': 5, 'neurons_per_layer': 128, 'epochs': 40, 'lr': 0.0003450981617636907, 'dropout': 0.2, 'batch_size': 32, 'label_smoothing': 0.1, 'optimizer': 'LBFGS', 'weight_decay': 2.9814413998873164e-05}. Best is trial 42 with value: 0.8971666666666667.


🏃 View run 44 at: http://localhost:5000/#/experiments/605049891641191207/runs/b21fff757a2a476db03d0765688f93db
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 20:28:42,762] Trial 45 finished with value: 0.6165 and parameters: {'num_hidden_layers': 5, 'neurons_per_layer': 120, 'epochs': 60, 'lr': 0.0008174042926429062, 'dropout': 0.2, 'batch_size': 32, 'label_smoothing': 0.1, 'optimizer': 'Rprop', 'weight_decay': 1.1697761389796636e-05}. Best is trial 42 with value: 0.8971666666666667.


🏃 View run 45 at: http://localhost:5000/#/experiments/605049891641191207/runs/360a250438d547d8b9f524566d0d4856
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 20:32:14,963] Trial 46 finished with value: 0.87925 and parameters: {'num_hidden_layers': 4, 'neurons_per_layer': 112, 'epochs': 50, 'lr': 0.0036509124522917475, 'dropout': 0.1, 'batch_size': 32, 'label_smoothing': 0.15000000000000002, 'optimizer': 'RMSprop', 'weight_decay': 4.566406379928586e-05}. Best is trial 42 with value: 0.8971666666666667.


🏃 View run 46 at: http://localhost:5000/#/experiments/605049891641191207/runs/96f08250dcd143d3bcc5fe71434d4a33
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 20:37:36,360] Trial 47 finished with value: 0.8936666666666667 and parameters: {'num_hidden_layers': 5, 'neurons_per_layer': 120, 'epochs': 70, 'lr': 0.00016189532097820434, 'dropout': 0.2, 'batch_size': 32, 'label_smoothing': 0.1, 'optimizer': 'RMSprop', 'weight_decay': 2.6443695882545585e-05}. Best is trial 42 with value: 0.8971666666666667.


🏃 View run 47 at: http://localhost:5000/#/experiments/605049891641191207/runs/54a80e5e9b7347bd8f672cac36ff7ee1
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 20:40:12,774] Trial 48 finished with value: 0.70375 and parameters: {'num_hidden_layers': 5, 'neurons_per_layer': 120, 'epochs': 40, 'lr': 2.5281333333923215e-05, 'dropout': 0.2, 'batch_size': 32, 'label_smoothing': 0.15000000000000002, 'optimizer': 'SGD', 'weight_decay': 1.001254052520279e-05}. Best is trial 42 with value: 0.8971666666666667.


🏃 View run 48 at: http://localhost:5000/#/experiments/605049891641191207/runs/0b9be09607304427b2885ee598834e97
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


[I 2025-07-08 20:49:22,340] Trial 49 finished with value: 0.08033333333333334 and parameters: {'num_hidden_layers': 5, 'neurons_per_layer': 24, 'epochs': 60, 'lr': 0.00016214771607804684, 'dropout': 0.1, 'batch_size': 32, 'label_smoothing': 0.1, 'optimizer': 'LBFGS', 'weight_decay': 1.2878101760084487e-05}. Best is trial 42 with value: 0.8971666666666667.


🏃 View run 49 at: http://localhost:5000/#/experiments/605049891641191207/runs/10b8b39b657f4a09ab683bce2add4c98
🧪 View experiment at: http://localhost:5000/#/experiments/605049891641191207


In [8]:
study.best_value

0.8971666666666667

In [9]:
study.best_params

{'num_hidden_layers': 5,
 'neurons_per_layer': 120,
 'epochs': 40,
 'lr': 0.00014945226423368488,
 'dropout': 0.2,
 'batch_size': 32,
 'label_smoothing': 0.1,
 'optimizer': 'RMSprop',
 'weight_decay': 1.5509457304206577e-05}
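
The log for trials 34 to 49 suggests that the optimizer choice drives most of the spread in the objective value: the RMSprop trials land between roughly 0.88 and 0.90, the Rprop trials between 0.59 and 0.70, and the LBFGS trials stay below 0.09. The block below is a minimal sketch for checking that grouping. It uses values copied by hand from the log above; in the notebook the same pairs could probably be built from `study.trials` (via `t.params["optimizer"]` and `t.value`), assuming the `study` object is still in memory. The helper name `summarize_by_optimizer` is hypothetical and not part of Optuna.

```python
from collections import defaultdict
from statistics import mean

# (optimizer, objective value) pairs copied from the log for trials 34 to 49.
# In the notebook these could instead come from something like:
#   [(t.params["optimizer"], t.value) for t in study.trials if t.value is not None]
records = [
    ("RMSprop", 0.8950833333333333),   # trial 34
    ("Rprop",   0.6998333333333333),   # trial 35
    ("RMSprop", 0.8941666666666667),   # trial 36
    ("LBFGS",   0.0865),               # trial 37
    ("RMSprop", 0.8941666666666667),   # trial 38
    ("RMSprop", 0.893),                # trial 39
    ("Rprop",   0.587),                # trial 40
    ("RMSprop", 0.8930833333333333),   # trial 41
    ("RMSprop", 0.8971666666666667),   # trial 42
    ("RMSprop", 0.8923333333333333),   # trial 43
    ("LBFGS",   0.06958333333333333),  # trial 44
    ("Rprop",   0.6165),               # trial 45
    ("RMSprop", 0.87925),              # trial 46
    ("RMSprop", 0.8936666666666667),   # trial 47
    ("SGD",     0.70375),              # trial 48
    ("LBFGS",   0.08033333333333334),  # trial 49
]


def summarize_by_optimizer(pairs):
    """Group objective values by optimizer; return (count, mean, best) per group."""
    groups = defaultdict(list)
    for name, value in pairs:
        groups[name].append(value)
    # "best" is the maximum because the study above keeps the highest value as best.
    return {name: (len(vals), mean(vals), max(vals)) for name, vals in groups.items()}


summary = summarize_by_optimizer(records)

# Print groups from highest to lowest mean objective value.
for name, (count, avg, best) in sorted(summary.items(), key=lambda kv: -kv[1][1]):
    print(f"{name:8s} n={count}  mean={avg:.4f}  best={best:.4f}")
```

If the pattern holds across the full study, dropping LBFGS and Rprop from the `optimizer` choices would likely save a good share of the search time, since those trials took several minutes each without coming close to the best value.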