# Hyper-Parameter Tuning

This notebook tunes the model's hyperparameters. The fastai.vision.learner module of fastai provides a convenient way to create and fine-tune convolutional neural network (CNN) models. Its vision_learner function constructs a Learner object that bundles the model architecture, the data, the training configuration, and other elements. We can specify a pre-trained model architecture and fine-tune it on the dataset. vision_learner supports a wide range of CNN architectures.

This notebook includes the following implementations:

1. Random Search optimization algorithm - Run 1
2. Random Search optimization algorithm - Run 2
3. Hyperparameter Optimization with Optuna's Successive Halving Pruner

In [1]:
import warnings
warnings.filterwarnings("ignore")

In [2]:
!pip install optuna-integration

Collecting optuna-integration
  Downloading optuna_integration-3.6.0-py3-none-any.whl.metadata (10 kB)
Downloading optuna_integration-3.6.0-py3-none-any.whl (93 kB)
Installing collected packages: optuna-integration
Successfully installed optuna-integration-3.6.0


In [3]:
!pip install optuna lightgbm



In [4]:
!pip install --upgrade optuna



# Random Search optimization algorithm - Run 1

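The optimization algorithm used below is Optuna's default sampler, the Tree-structured Parzen Estimator (TPE). By default, TPESampler draws its first 10 trials (n_startup_trials = 10) by random sampling, so with n_trials = 10 this run effectively behaves like a random search.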
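
To make the sampling step concrete, here is a minimal sketch of drawing random configurations from the same search space using only the Python standard library. It is an illustration, not Optuna's implementation: the names sample_config, ARCHS, and the seed value 0 are assumptions introduced for this example, and the ranges are copied from the objective function in this notebook.

```python
import math
import random

# Search space mirroring the ranges used by the objective in this notebook.
ARCHS = ["resnet18", "resnet34", "resnet50"]
WD_LOW, WD_HIGH = 1e-6, 1e-2
EPOCHS_LOW, EPOCHS_HIGH = 5, 15
BATCH_SIZES = [32, 64]
DROPS = [0.2, 0.4]


def sample_config(rng):
    """Draw one configuration uniformly at random from the search space."""
    # Log-uniform sampling: draw the exponent uniformly, then exponentiate,
    # so every order of magnitude between 1e-6 and 1e-2 is equally likely.
    log_wd = rng.uniform(math.log(WD_LOW), math.log(WD_HIGH))
    return {
        "arch": rng.choice(ARCHS),
        "wd": math.exp(log_wd),
        "epochs": rng.randint(EPOCHS_LOW, EPOCHS_HIGH),  # inclusive on both ends
        "bs": rng.choice(BATCH_SIZES),
        "drop": rng.choice(DROPS),
    }


rng = random.Random(0)  # hypothetical seed, fixed so the draws are repeatable
configs = [sample_config(rng) for _ in range(10)]
for i, cfg in enumerate(configs):
    print(i, cfg)
```

Sampling the weight decay on a log scale matters here because the range spans four orders of magnitude; a plain uniform draw would place almost every sample above 1e-3.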

For n_trials = 10, the accuracy score and hyperparameters of each trial are as follows:

| Trial No. | Accuracy   | Architecture | Weight Decay | Epochs | Batch Size | Drop |
|-----------|------------|--------------|--------------|--------|------------|------|
| 0         | 0.9011     | ResNet34     | 0.00023      | 8      | 64         | 0.4  |
| 1         | 0.9413     | ResNet18     | 0.0091       | 15     | 64         | 0.2  |
| 2         | 0.9343     | ResNet18     | 0.0065       | 5      | 32         | 0.4  |
| 3         | 0.9080     | ResNet34     | 0.00063      | 5      | 32         | 0.4  |
| 4         | 0.9019     | ResNet34     | 0.00092      | 6      | 64         | 0.2  |
| 5         | 0.9527     | ResNet50     | 0.000055     | 7      | 64         | 0.4  |
| 6         | 0.9220     | ResNet34     | 0.0090       | 15     | 64         | 0.2  |
| 7         | 0.9404     | ResNet18     | 0.0035       | 11     | 64         | 0.4  |
| 8         | 0.9212     | ResNet18     | 0.00021      | 5      | 64         | 0.4  |
| 9         | 0.9203     | ResNet34     | 0.00074      | 13     | 32         | 0.4  |


**The best accuracy score is 0.9527 with these hyperparameters:**
- Architecture: ResNet50
- Weight Decay: 5.527e-5
- Epochs: 7
- Batch Size: 64
- Drop: 0.4
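
As a quick way to read the Run 1 table, the sketch below groups the recorded accuracies by architecture and picks the best trial. It uses only the standard library; the variable names (run1, by_arch, summary) are introduced for this example, and the numbers are copied from the table above. With a single ResNet50 trial, the per-architecture means are only a rough indication.

```python
from collections import defaultdict

# (trial number, accuracy, architecture), copied from the Run 1 table.
run1 = [
    (0, 0.9011, "ResNet34"),
    (1, 0.9413, "ResNet18"),
    (2, 0.9343, "ResNet18"),
    (3, 0.9080, "ResNet34"),
    (4, 0.9019, "ResNet34"),
    (5, 0.9527, "ResNet50"),
    (6, 0.9220, "ResNet34"),
    (7, 0.9404, "ResNet18"),
    (8, 0.9212, "ResNet18"),
    (9, 0.9203, "ResNet34"),
]

# Collect the accuracies observed for each architecture.
by_arch = defaultdict(list)
for _, acc, arch in run1:
    by_arch[arch].append(acc)

# Mean accuracy per architecture and the single best trial overall.
summary = {arch: sum(accs) / len(accs) for arch, accs in by_arch.items()}
best_trial = max(run1, key=lambda row: row[1])

for arch in sorted(summary):
    print(f"{arch}: n={len(by_arch[arch])}, mean accuracy={summary[arch]:.4f}")
print("best trial:", best_trial)
```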

In [5]:
import optuna
from fastai.vision.all import *

# objective function for hyperparameter optimization
def objective(trial):
    path = Path('/kaggle/input/brain-tumor-mri-classification-dataset/Brain_Tumor_MRI_Image_Dataset/Training')
    
    # hyperparameters to optimize
    arch = trial.suggest_categorical('arch', ['resnet18', 'resnet34', 'resnet50'])
    wd = trial.suggest_float('wd', 1e-6, 1e-2, log=True)  # replaces the deprecated suggest_loguniform
    epochs = trial.suggest_int('epochs', 5, 15)
    bs = trial.suggest_categorical('bs', [32, 64])
    # NOTE: drop is sampled but never passed to vision_learner, so it does not affect training
    drop = trial.suggest_categorical('drop', [0.2, 0.4])
    
    # DataBlock to prepare the data
    dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),
                       get_items=get_image_files,
                       get_y=parent_label,
                       splitter=RandomSplitter(valid_pct=0.2, seed=42))
    
    dls = dblock.dataloaders(path, bs=bs) # Create DataLoaders
    
    # learner with specified architecture, metrics, weight decay, and data
    learn = vision_learner(dls, arch, metrics=accuracy, wd=wd)
    
    # Fine-tune the model
    learn.fine_tune(epochs, base_lr=0.001, cbs=[MixedPrecision()])
    
    return float(learn.validate()[1]) # validation accuracy of the trained model

# Optuna study object for hyperparameter optimization
study = optuna.create_study(direction="maximize")

# Optimize the objective function by running multiple trials
study.optimize(objective, n_trials=10)

trial = study.best_trial # best trial from the study

print("Accuracy: {}".format(trial.value))
print("Best hyperparameters: {}".format(trial.params))

[I 2024-05-24 21:50:09,794] A new study created in memory with name: no-name-10ac8b0a-751b-4f7e-b013-29bab511d22f


model.safetensors:   0%|          | 0.00/87.3M [00:00<?, ?B/s]

epoch,train_loss,valid_loss,accuracy,time
0,1.096213,0.387679,0.865149,00:14


epoch,train_loss,valid_loss,accuracy,time
0,0.568723,0.322058,0.892294,00:16
1,0.490151,0.291415,0.892294,00:15
2,0.417919,0.286482,0.898424,00:16
3,0.361479,0.278917,0.884413,00:16
4,0.310722,0.246992,0.90718,00:16
5,0.267015,0.246769,0.904553,00:16
6,0.226637,0.251126,0.898424,00:16
7,0.210725,0.248565,0.901926,00:16


[I 2024-05-24 21:52:42,748] Trial 0 finished with value: 0.9010508060455322 and parameters: {'arch': 'resnet34', 'wd': 0.00023340994424944704, 'epochs': 8, 'bs': 64, 'drop': 0.4}. Best is trial 0 with value: 0.9010508060455322.


model.safetensors:   0%|          | 0.00/46.8M [00:00<?, ?B/s]

epoch,train_loss,valid_loss,accuracy,time
0,0.930997,0.300741,0.89317,00:09


epoch,train_loss,valid_loss,accuracy,time
0,0.438969,0.262434,0.903678,00:10
1,0.389904,0.233369,0.918564,00:11
2,0.346242,0.217925,0.91944,00:11
3,0.301022,0.217099,0.917688,00:10
4,0.255345,0.200015,0.920315,00:11
5,0.222467,0.182874,0.930823,00:11
6,0.190116,0.173722,0.930823,00:11
7,0.150652,0.166545,0.934326,00:11
8,0.122943,0.173149,0.934326,00:11
9,0.121059,0.168385,0.929947,00:11


[I 2024-05-24 21:55:42,252] Trial 1 finished with value: 0.9413309693336487 and parameters: {'arch': 'resnet18', 'wd': 0.009064248812822551, 'epochs': 15, 'bs': 64, 'drop': 0.2}. Best is trial 1 with value: 0.9413309693336487.


epoch,train_loss,valid_loss,accuracy,time
0,0.770679,0.276929,0.894921,00:10


epoch,train_loss,valid_loss,accuracy,time
0,0.436824,0.254958,0.914186,00:12
1,0.360165,0.212044,0.922067,00:12
2,0.310777,0.203068,0.911559,00:12
3,0.258737,0.172688,0.936077,00:12
4,0.235185,0.161799,0.934326,00:12


[I 2024-05-24 21:56:56,249] Trial 2 finished with value: 0.9343257546424866 and parameters: {'arch': 'resnet18', 'wd': 0.006544389818968023, 'epochs': 5, 'bs': 32, 'drop': 0.4}. Best is trial 1 with value: 0.9413309693336487.


epoch,train_loss,valid_loss,accuracy,time
0,0.947751,0.383075,0.863398,00:13


epoch,train_loss,valid_loss,accuracy,time
0,0.545927,0.322183,0.887916,00:17
1,0.442133,0.286373,0.894046,00:17
2,0.404942,0.258899,0.902802,00:17
3,0.322297,0.254092,0.904553,00:17
4,0.2968,0.253844,0.909807,00:17


[I 2024-05-24 21:58:42,859] Trial 3 finished with value: 0.9080560207366943 and parameters: {'arch': 'resnet34', 'wd': 0.0006280450180845069, 'epochs': 5, 'bs': 32, 'drop': 0.4}. Best is trial 1 with value: 0.9413309693336487.


epoch,train_loss,valid_loss,accuracy,time
0,1.133325,0.413726,0.849387,00:12


epoch,train_loss,valid_loss,accuracy,time
0,0.537261,0.325078,0.886165,00:16
1,0.463902,0.291431,0.887916,00:16
2,0.401973,0.273693,0.885289,00:16
3,0.321756,0.254494,0.901051,00:16
4,0.28691,0.253121,0.900175,00:16
5,0.270712,0.244473,0.903678,00:16


[I 2024-05-24 22:00:36,999] Trial 4 finished with value: 0.9019264578819275 and parameters: {'arch': 'resnet34', 'wd': 0.0009150842875866279, 'epochs': 6, 'bs': 64, 'drop': 0.2}. Best is trial 1 with value: 0.9413309693336487.


model.safetensors:   0%|          | 0.00/102M [00:00<?, ?B/s]

epoch,train_loss,valid_loss,accuracy,time
0,0.826989,0.357276,0.862522,00:21


epoch,train_loss,valid_loss,accuracy,time
0,0.320762,0.251823,0.92732,00:25
1,0.28476,0.241625,0.930823,00:26
2,0.22857,0.245083,0.934326,00:26
3,0.191758,0.222438,0.942207,00:25
4,0.144681,0.17278,0.950963,00:26
5,0.109777,0.19892,0.954466,00:25
6,0.090422,0.216,0.951839,00:26


[I 2024-05-24 22:04:07,216] Trial 5 finished with value: 0.9527145624160767 and parameters: {'arch': 'resnet50', 'wd': 5.527401639726816e-05, 'epochs': 7, 'bs': 64, 'drop': 0.4}. Best is trial 5 with value: 0.9527145624160767.


epoch,train_loss,valid_loss,accuracy,time
0,1.098644,0.400075,0.85289,00:13


epoch,train_loss,valid_loss,accuracy,time
0,0.549288,0.335254,0.881786,00:16
1,0.506567,0.30829,0.888792,00:16
2,0.456383,0.285498,0.891419,00:16
3,0.401892,0.273,0.900175,00:17
4,0.343611,0.281282,0.890543,00:16
5,0.282301,0.236686,0.908056,00:16
6,0.248601,0.246622,0.910683,00:15
7,0.221422,0.240875,0.911559,00:16
8,0.193505,0.214496,0.922942,00:15
9,0.180286,0.205274,0.928196,00:15


[I 2024-05-24 22:08:28,633] Trial 6 finished with value: 0.9220665693283081 and parameters: {'arch': 'resnet34', 'wd': 0.008951501361629699, 'epochs': 15, 'bs': 64, 'drop': 0.2}. Best is trial 5 with value: 0.9527145624160767.


epoch,train_loss,valid_loss,accuracy,time
0,0.905936,0.306733,0.894921,00:09


epoch,train_loss,valid_loss,accuracy,time
0,0.434151,0.277744,0.901051,00:10
1,0.390968,0.23619,0.915937,00:10
2,0.350471,0.223331,0.917688,00:10
3,0.292414,0.195645,0.918564,00:10
4,0.250865,0.187537,0.926445,00:10
5,0.208969,0.178643,0.92732,00:10
6,0.176541,0.175607,0.93345,00:10
7,0.151058,0.172329,0.93345,00:10
8,0.139511,0.16651,0.936953,00:10
9,0.124001,0.164608,0.935201,00:10


[I 2024-05-24 22:10:36,906] Trial 7 finished with value: 0.9404553174972534 and parameters: {'arch': 'resnet18', 'wd': 0.003537091430333875, 'epochs': 11, 'bs': 64, 'drop': 0.4}. Best is trial 5 with value: 0.9527145624160767.


epoch,train_loss,valid_loss,accuracy,time
0,0.941354,0.277954,0.89317,00:09


epoch,train_loss,valid_loss,accuracy,time
0,0.441243,0.266596,0.901051,00:10
1,0.376036,0.217937,0.908932,00:10
2,0.306597,0.209907,0.914186,00:10
3,0.250445,0.195011,0.926445,00:10
4,0.224316,0.202344,0.922942,00:10


[I 2024-05-24 22:11:41,417] Trial 8 finished with value: 0.9211909174919128 and parameters: {'arch': 'resnet18', 'wd': 0.00020852879448736512, 'epochs': 5, 'bs': 64, 'drop': 0.4}. Best is trial 5 with value: 0.9527145624160767.


epoch,train_loss,valid_loss,accuracy,time
0,0.91082,0.359393,0.868652,00:13


epoch,train_loss,valid_loss,accuracy,time
0,0.537367,0.309434,0.897548,00:17
1,0.480948,0.295927,0.88704,00:17
2,0.435401,0.249373,0.912434,00:17
3,0.376859,0.250839,0.903678,00:17
4,0.316053,0.25313,0.906305,00:17
5,0.278214,0.218644,0.918564,00:17
6,0.260835,0.218007,0.908056,00:17
7,0.231998,0.208978,0.921191,00:17
8,0.186753,0.204182,0.922067,00:17
9,0.184746,0.203614,0.917688,00:17


[I 2024-05-24 22:15:43,586] Trial 9 finished with value: 0.9203152656555176 and parameters: {'arch': 'resnet34', 'wd': 0.0007385132863360378, 'epochs': 13, 'bs': 32, 'drop': 0.4}. Best is trial 5 with value: 0.9527145624160767.


Accuracy: 0.9527145624160767
Best hyperparameters: {'arch': 'resnet50', 'wd': 5.527401639726816e-05, 'epochs': 7, 'bs': 64, 'drop': 0.4}


# Random Search optimization algorithm - Run 2


For n_trials = 10, the accuracy score and hyperparameters of each trial are as follows:

| Trial | Accuracy   | Architecture | Weight Decay | Epochs | Batch Size | Drop |
|-------|------------|--------------|--------------|--------|------------|------|
| 0     | 0.9177     | ResNet34     | 0.000248     | 6      | 32         | 0.2  |
| 1     | 0.9492     | ResNet50     | 0.002137     | 7      | 64         | 0.4  |
| 2     | 0.9518     | ResNet50     | 0.000004     | 8      | 64         | 0.2  |
| 3     | 0.9046     | ResNet34     | 0.000269     | 6      | 64         | 0.2  |
| 4     | 0.9378     | ResNet50     | 0.000058     | 6      | 32         | 0.2  |
| 5     | 0.9352     | ResNet18     | 0.000154     | 7      | 32         | 0.2  |
| 6     | 0.9063     | ResNet34     | 0.000006     | 5      | 64         | 0.4  |
| 7     | 0.9553     | ResNet50     | 0.000004     | 13     | 64         | 0.2  |
| 8     | 0.9238     | ResNet34     | 0.000824     | 13     | 64         | 0.4  |
| 9     | 0.9361     | ResNet18     | 0.000018     | 8      | 32         | 0.2  |


**The best accuracy score is 0.9553 with these hyperparameters:**
- Architecture: ResNet50
- Weight Decay: 3.662e-6
- Epochs: 13
- Batch Size: 64
- Drop: 0.2

In [6]:
import optuna
from fastai.vision.all import *
from optuna.samplers import TPESampler

def objective(trial):
    path = Path('/kaggle/input/brain-tumor-mri-classification-dataset/Brain_Tumor_MRI_Image_Dataset/Training')
    
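    arch = trial.suggest_categorical('arch', ['resnet18', 'resnet34', 'resnet50'])
    wd = trial.suggest_float('wd', 1e-6, 1e-2, log=True)  # replaces the deprecated suggest_loguniform
    epochs = trial.suggest_int('epochs', 5, 15)
    bs = trial.suggest_categorical('bs', [32, 64])
    # NOTE: drop is sampled but never passed to vision_learner, so it does not affect training
    drop = trial.suggest_categorical('drop', [0.2, 0.4])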

    dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),
                       get_items=get_image_files,
                       get_y=parent_label,
                       splitter=RandomSplitter(valid_pct=0.2, seed=42))
    
    dls = dblock.dataloaders(path, bs=bs)

    learn = vision_learner(dls, arch, metrics=accuracy, wd=wd)
    learn.fine_tune(epochs, base_lr=0.001, cbs=[MixedPrecision()])

    return float(learn.validate()[1])

sampler = TPESampler(seed=42)  # TPE sampler with a fixed seed so the sampled configurations are reproducible
study = optuna.create_study(direction="maximize", sampler=sampler)
study.optimize(objective, n_trials=10)

trial = study.best_trial

print("Accuracy: {}".format(trial.value))
print("Best hyperparameters: {}".format(trial.params))

[I 2024-05-24 22:15:43,603] A new study created in memory with name: no-name-b9e79e9e-2be7-4f61-82ff-92e3b89629f2


epoch,train_loss,valid_loss,accuracy,time
0,0.861309,0.381287,0.870403,00:13


epoch,train_loss,valid_loss,accuracy,time
0,0.560308,0.326959,0.875657,00:17
1,0.489167,0.303019,0.886165,00:17
2,0.382597,0.276688,0.898424,00:17
3,0.365501,0.247717,0.904553,00:18
4,0.277556,0.245371,0.908932,00:17
5,0.282006,0.237996,0.915937,00:17


[I 2024-05-24 22:17:45,930] Trial 0 finished with value: 0.917688250541687 and parameters: {'arch': 'resnet34', 'wd': 0.0002481040974867811, 'epochs': 6, 'bs': 32, 'drop': 0.2}. Best is trial 0 with value: 0.917688250541687.


epoch,train_loss,valid_loss,accuracy,time
0,0.788204,0.412232,0.862522,00:21


epoch,train_loss,valid_loss,accuracy,time
0,0.351873,0.26215,0.925569,00:26
1,0.299193,0.231609,0.93345,00:26
2,0.243308,0.262405,0.929072,00:26
3,0.191433,0.175836,0.944834,00:26
4,0.133285,0.168542,0.948336,00:26
5,0.105654,0.23054,0.949212,00:26
6,0.096123,0.158684,0.95359,00:26


[I 2024-05-24 22:21:15,615] Trial 1 finished with value: 0.9492118954658508 and parameters: {'arch': 'resnet50', 'wd': 0.002136832907235877, 'epochs': 7, 'bs': 64, 'drop': 0.4}. Best is trial 1 with value: 0.9492118954658508.


epoch,train_loss,valid_loss,accuracy,time
0,0.823435,0.355779,0.880035,00:21


epoch,train_loss,valid_loss,accuracy,time
0,0.334317,0.203773,0.931699,00:25
1,0.280687,0.215644,0.921191,00:25
2,0.238456,0.439207,0.925569,00:25
3,0.182599,0.181663,0.937828,00:25
4,0.150487,0.165346,0.942207,00:25
5,0.111429,0.183122,0.948336,00:25
6,0.089262,0.157622,0.952715,00:25
7,0.090734,0.156967,0.95359,00:25


[I 2024-05-24 22:25:07,888] Trial 2 finished with value: 0.9518388509750366 and parameters: {'arch': 'resnet50', 'wd': 3.6138942712165278e-06, 'epochs': 8, 'bs': 64, 'drop': 0.2}. Best is trial 2 with value: 0.9518388509750366.


epoch,train_loss,valid_loss,accuracy,time
0,1.102986,0.427587,0.853765,00:12


epoch,train_loss,valid_loss,accuracy,time
0,0.5414,0.328082,0.880911,00:15
1,0.483583,0.297755,0.885289,00:15
2,0.401491,0.263839,0.89317,00:15
3,0.330477,0.252028,0.903678,00:15
4,0.283069,0.249856,0.908932,00:15
5,0.260506,0.250313,0.911559,00:15


[I 2024-05-24 22:26:59,027] Trial 3 finished with value: 0.9045534133911133 and parameters: {'arch': 'resnet34', 'wd': 0.000269264691008618, 'epochs': 6, 'bs': 64, 'drop': 0.2}. Best is trial 2 with value: 0.9518388509750366.


epoch,train_loss,valid_loss,accuracy,time
0,0.667771,0.379004,0.904553,00:22


epoch,train_loss,valid_loss,accuracy,time
0,0.373117,0.274419,0.920315,00:27
1,0.327938,0.585635,0.930823,00:27
2,0.262243,0.608685,0.923818,00:27
3,0.216059,0.221038,0.944834,00:27
4,0.167353,4.16435,0.93345,00:27
5,0.139341,0.361042,0.941331,00:27


[I 2024-05-24 22:30:13,466] Trial 4 finished with value: 0.9378283619880676 and parameters: {'arch': 'resnet50', 'wd': 5.762487216478604e-05, 'epochs': 6, 'bs': 32, 'drop': 0.2}. Best is trial 2 with value: 0.9518388509750366.


epoch,train_loss,valid_loss,accuracy,time
0,0.751824,0.272709,0.906305,00:09


epoch,train_loss,valid_loss,accuracy,time
0,0.453656,0.238539,0.918564,00:11
1,0.366793,0.219829,0.918564,00:11
2,0.30212,0.191546,0.925569,00:11
3,0.285857,0.202364,0.924694,00:11
4,0.228826,0.183159,0.929072,00:11
5,0.196506,0.176833,0.937828,00:11
6,0.171231,0.180764,0.934326,00:11


[I 2024-05-24 22:31:44,996] Trial 5 finished with value: 0.9352014064788818 and parameters: {'arch': 'resnet18', 'wd': 0.00015375920235481777, 'epochs': 7, 'bs': 32, 'drop': 0.2}. Best is trial 2 with value: 0.9518388509750366.


epoch,train_loss,valid_loss,accuracy,time
0,1.10196,0.398535,0.851138,00:12


epoch,train_loss,valid_loss,accuracy,time
0,0.530839,0.325091,0.887916,00:15
1,0.482721,0.300519,0.889667,00:15
2,0.39503,0.281887,0.898424,00:15
3,0.338371,0.259621,0.905429,00:15
4,0.298018,0.26009,0.905429,00:15


[I 2024-05-24 22:33:18,873] Trial 6 finished with value: 0.9063047170639038 and parameters: {'arch': 'resnet34', 'wd': 6.0803901902966035e-06, 'epochs': 5, 'bs': 64, 'drop': 0.4}. Best is trial 2 with value: 0.9518388509750366.


epoch,train_loss,valid_loss,accuracy,time
0,0.772641,0.359026,0.883538,00:20


epoch,train_loss,valid_loss,accuracy,time
0,0.361652,0.214777,0.930823,00:25
1,0.274222,0.192497,0.942207,00:25
2,0.240062,0.185538,0.932574,00:25
3,0.205064,0.181526,0.938704,00:25
4,0.17514,0.360629,0.940455,00:25
5,0.142931,0.150462,0.947461,00:26
6,0.112472,0.153896,0.949212,00:26
7,0.101038,0.140357,0.951839,00:26
8,0.081824,0.135689,0.952715,00:26
9,0.070126,0.14805,0.948336,00:26


[I 2024-05-24 22:39:23,135] Trial 7 finished with value: 0.9553415179252625 and parameters: {'arch': 'resnet50', 'wd': 3.6618192203924288e-06, 'epochs': 13, 'bs': 64, 'drop': 0.2}. Best is trial 7 with value: 0.9553415179252625.


epoch,train_loss,valid_loss,accuracy,time
0,1.072078,0.42,0.844133,00:13


epoch,train_loss,valid_loss,accuracy,time
0,0.544761,0.344332,0.871278,00:16
1,0.517602,0.307067,0.890543,00:16
2,0.436178,0.276744,0.901051,00:16
3,0.380206,0.267414,0.901051,00:16
4,0.334762,0.257378,0.901926,00:16
5,0.26878,0.237314,0.91944,00:16
6,0.241955,0.232983,0.911559,00:16
7,0.218444,0.22377,0.917688,00:16
8,0.195902,0.21774,0.91944,00:16
9,0.184507,0.222164,0.914186,00:16


[I 2024-05-24 22:43:13,933] Trial 8 finished with value: 0.9238178730010986 and parameters: {'arch': 'resnet34', 'wd': 0.0008241925264876454, 'epochs': 13, 'bs': 64, 'drop': 0.4}. Best is trial 7 with value: 0.9553415179252625.


epoch,train_loss,valid_loss,accuracy,time
0,0.754382,0.30697,0.89317,00:10


epoch,train_loss,valid_loss,accuracy,time
0,0.449796,0.255285,0.911559,00:12
1,0.400212,0.207009,0.91944,00:12
2,0.325846,0.204704,0.91944,00:12
3,0.275052,0.171263,0.931699,00:12
4,0.235291,0.1693,0.935201,00:12
5,0.199903,0.156644,0.941331,00:12
6,0.197643,0.150987,0.93958,00:12
7,0.164178,0.148437,0.936077,00:12


[I 2024-05-24 22:45:05,289] Trial 9 finished with value: 0.9360770583152771 and parameters: {'arch': 'resnet18', 'wd': 1.753594952976443e-05, 'epochs': 8, 'bs': 32, 'drop': 0.2}. Best is trial 7 with value: 0.9553415179252625.


Accuracy: 0.9553415179252625
Best hyperparameters: {'arch': 'resnet50', 'wd': 3.6618192203924288e-06, 'epochs': 13, 'bs': 64, 'drop': 0.2}


# Hyperparameter Optimization with Optuna's Successive Halving Pruner

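Adding SuccessiveHalvingPruner is intended to improve efficiency during optimization. By stopping trials with low intermediate validation accuracy early, successive halving (SHA) concentrates resources on trials that are more likely to perform well, which can substantially reduce total training time for computationally expensive models. Note that pruning only takes effect when the objective reports intermediate values with trial.report() and checks trial.should_prune(); the objective below does neither, so all 10 trials ran to completion.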

For n_trials = 10, the accuracy scores and hyperparameters are:

| Trial | Accuracy | Architecture | Weight Decay | Epochs | Batch Size | Drop   |
|-------|----------|--------------|--------------|--------|------------|--------|
| 0     | 0.9623   | ResNet50     | 2.057e-06    | 8      | 32         | 0.2888 |
| 1     | 0.9518   | ResNet50     | 0.001421     | 7      | 32         | 0.2707 |
| 2     | 0.9807   | ResNet34     | 3.744e-06    | 9      | 32         | 0.3918 |
| 3     | 0.9641   | ResNet18     | 2.667e-05    | 7      | 64         | 0.3196 |
| 4     | 0.9711   | ResNet34     | 0.005883     | 8      | 64         | 0.2000 |
| 5     | 0.9650   | ResNet18     | 5.694e-06    | 6      | 64         | 0.2266 |
| 6     | 0.9737   | ResNet18     | 1.813e-05    | 6      | 64         | 0.3732 |
| 7     | 0.9851   | ResNet34     | 0.004016     | 13     | 32         | 0.2680 |
| 8     | 0.9667   | ResNet50     | 0.008095     | 15     | 32         | 0.3013 |
| 9     | 0.9632   | ResNet50     | 0.0006995    | 6      | 32         | 0.3595 |


**The best accuracy score is 0.9851 with these hyperparameters:**

- Architecture: ResNet34
- Weight Decay: 0.004016
- Epochs: 13
- Batch Size: 32
- Drop: 0.268

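SHA identifies and eliminates underperforming configurations early, which prevents further resource expenditure on trials unlikely to yield good results, so it can be advantageous over a simple random search. In this run, however, every trial in the log finished and none was pruned, so the higher accuracies cannot be attributed to the pruner. This run also passes torchvision model functions as arch (the log shows a download from download.pytorch.org), whereas Runs 1 and 2 pass string names and download model.safetensors weights; this difference in pretrained weights may explain part of the gap.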

In [8]:
from fastai.vision.all import *
import optuna
from optuna.pruners import SuccessiveHalvingPruner

# objective function for Optuna
def objective(trial):
    path = Path('/kaggle/input/brain-tumor-mri-classification-dataset/Brain_Tumor_MRI_Image_Dataset/Training')

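    # hyperparameters
    # Function objects select torchvision models; Optuna warns about non-primitive categorical choices.
    arch = trial.suggest_categorical('arch', [resnet18, resnet34, resnet50])
    wd = trial.suggest_float('wd', 1e-6, 1e-2, log=True)  # replaces the deprecated suggest_loguniform
    epochs = trial.suggest_int('epochs', 5, 15)
    bs = trial.suggest_categorical('bs', [32, 64])
    # NOTE: drop is sampled but never passed to vision_learner, so it does not affect training
    drop = trial.suggest_float('drop', 0.2, 0.4)

    # DataBlock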
    dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),
                       get_items=get_image_files,
                       get_y=parent_label,
                       splitter=RandomSplitter(valid_pct=0.2, seed=42))
    
    dls = dblock.dataloaders(path, bs=bs)
    
    # learner
    learn = vision_learner(dls, arch, metrics=[accuracy], wd=wd)
    
    # Train the model
    learn.fine_tune(epochs, base_lr=0.001, cbs=[MixedPrecision()])
    
    # validation accuracy
    accuracy_metric = float(learn.validate()[1])
    
    return accuracy_metric

# Optuna study with the Successive Halving pruner.
# The pruner only acts if the objective calls trial.report() and trial.should_prune().
study = optuna.create_study(direction='maximize', pruner=SuccessiveHalvingPruner())

# Optimize the objective function
study.optimize(objective, n_trials=10)

# best trial
trial = study.best_trial

print("Best Accuracy: {}".format(trial.value))
print("Best hyperparameters: {}".format(trial.params))

[I 2024-05-24 23:18:32,835] A new study created in memory with name: no-name-3b9d03b9-fb10-4d8b-834d-1b4f896e975e


epoch,train_loss,valid_loss,accuracy,time
0,0.599265,0.300957,0.8993,00:23


epoch,train_loss,valid_loss,accuracy,time
0,0.264046,0.18743,0.922942,00:28
1,0.177461,0.192091,0.932574,00:28
2,0.111791,0.161029,0.949212,00:28
3,0.053418,0.128679,0.950088,00:28
4,0.039403,0.142595,0.951839,00:28
5,0.01671,0.131251,0.958844,00:28
6,0.020159,0.11207,0.965849,00:28
7,0.008556,0.110309,0.963222,00:28


[I 2024-05-24 23:22:49,077] Trial 0 finished with value: 0.9623467326164246 and parameters: {'arch': <function resnet50 at 0x7a55d092f880>, 'wd': 2.056715875355261e-06, 'epochs': 8, 'bs': 32, 'drop': 0.2887557735470722}. Best is trial 0 with value: 0.9623467326164246.


epoch,train_loss,valid_loss,accuracy,time
0,0.638909,0.292574,0.915061,00:22


epoch,train_loss,valid_loss,accuracy,time
0,0.245941,0.205356,0.936953,00:28
1,0.183602,0.176353,0.943082,00:28
2,0.093307,0.156542,0.956217,00:28
3,0.046877,0.133005,0.960595,00:28
4,0.024073,0.149629,0.955342,00:28
5,0.021273,0.133934,0.957093,00:28
6,0.01173,0.134178,0.957093,00:28


[I 2024-05-24 23:26:35,485] Trial 1 finished with value: 0.9518388509750366 and parameters: {'arch': <function resnet50 at 0x7a55d092f880>, 'wd': 0.001421022419179072, 'epochs': 7, 'bs': 32, 'drop': 0.27067631749319276}. Best is trial 0 with value: 0.9623467326164246.
Downloading: "https://download.pytorch.org/models/resnet34-b627a593.pth" to /root/.cache/torch/hub/checkpoints/resnet34-b627a593.pth
100%|██████████| 83.3M/83.3M [00:01<00:00, 82.3MB/s]


epoch,train_loss,valid_loss,accuracy,time
0,0.707973,0.257959,0.910683,00:14


epoch,train_loss,valid_loss,accuracy,time
0,0.330547,0.151822,0.951839,00:18
1,0.17898,0.119798,0.951839,00:17
2,0.111872,0.11392,0.960595,00:17
3,0.064642,0.081061,0.976357,00:18
4,0.051511,0.100059,0.971979,00:17
5,0.026336,0.060369,0.983362,00:17
6,0.009389,0.068532,0.97986,00:17
7,0.007585,0.069661,0.978984,00:17
8,0.004051,0.067763,0.980736,00:17


[I 2024-05-24 23:29:36,291] Trial 2 finished with value: 0.9807355403900146 and parameters: {'arch': <function resnet34 at 0x7a55d092f6d0>, 'wd': 3.743608702148963e-06, 'epochs': 9, 'bs': 32, 'drop': 0.391776451145661}. Best is trial 2 with value: 0.9807355403900146.


epoch,train_loss,valid_loss,accuracy,time
0,0.932631,0.280745,0.909807,00:09


epoch,train_loss,valid_loss,accuracy,time
0,0.344489,0.164103,0.943082,00:11
1,0.204429,0.138138,0.943958,00:11
2,0.107068,0.112685,0.964098,00:11
3,0.048349,0.085158,0.963222,00:11
4,0.023066,0.088834,0.962347,00:11
5,0.014993,0.089301,0.964098,00:11
6,0.008996,0.093573,0.962347,00:11


[I 2024-05-24 23:31:07,263] Trial 3 finished with value: 0.9640980958938599 and parameters: {'arch': <function resnet18 at 0x7a55d092f520>, 'wd': 2.666590746016712e-05, 'epochs': 7, 'bs': 64, 'drop': 0.3196090658228747}. Best is trial 2 with value: 0.9807355403900146.


epoch,train_loss,valid_loss,accuracy,time
0,0.901441,0.251823,0.90718,00:12


epoch,train_loss,valid_loss,accuracy,time
0,0.349033,0.162082,0.948336,00:16
1,0.192588,0.135106,0.956217,00:16
2,0.092922,0.108112,0.963222,00:16
3,0.056528,0.101595,0.968476,00:16
4,0.0295,0.0981,0.972855,00:16
5,0.011884,0.093614,0.972855,00:16
6,0.005482,0.095259,0.970228,00:16
7,0.003129,0.106637,0.971103,00:16


[I 2024-05-24 23:33:36,021] Trial 4 finished with value: 0.971103310585022 and parameters: {'arch': <function resnet34 at 0x7a55d092f6d0>, 'wd': 0.005882789424928583, 'epochs': 8, 'bs': 64, 'drop': 0.20001891036926267}. Best is trial 2 with value: 0.9807355403900146.


epoch,train_loss,valid_loss,accuracy,time
0,0.909503,0.293536,0.904553,00:09


epoch,train_loss,valid_loss,accuracy,time
0,0.35884,0.165944,0.937828,00:11
1,0.204989,0.119504,0.957968,00:11
2,0.092436,0.118193,0.962347,00:11
3,0.041803,0.119817,0.95972,00:11
4,0.024639,0.108614,0.964098,00:11
5,0.012232,0.106503,0.965849,00:11


[I 2024-05-24 23:34:56,095] Trial 5 finished with value: 0.9649737477302551 and parameters: {'arch': <function resnet18 at 0x7a55d092f520>, 'wd': 5.693699927575653e-06, 'epochs': 6, 'bs': 64, 'drop': 0.2265821977897189}. Best is trial 2 with value: 0.9807355403900146.


epoch,train_loss,valid_loss,accuracy,time
0,0.911686,0.281321,0.8993,00:09


epoch,train_loss,valid_loss,accuracy,time
0,0.361514,0.165636,0.945709,00:11
1,0.184928,0.141123,0.95359,00:11
2,0.08621,0.109074,0.960595,00:10
3,0.043293,0.094568,0.970228,00:10
4,0.023215,0.077831,0.972855,00:10
5,0.011812,0.07752,0.974606,00:10


[I 2024-05-24 23:36:13,018] Trial 6 finished with value: 0.9737303256988525 and parameters: {'arch': <function resnet18 at 0x7a55d092f520>, 'wd': 1.812866118696492e-05, 'epochs': 6, 'bs': 64, 'drop': 0.37324777487924343}. Best is trial 2 with value: 0.9807355403900146.


epoch,train_loss,valid_loss,accuracy,time
0,0.721485,0.26701,0.91331,00:13


epoch,train_loss,valid_loss,accuracy,time
0,0.305417,0.163025,0.943958,00:17
1,0.174053,0.099729,0.964098,00:17
2,0.104068,0.135178,0.956217,00:17
3,0.079441,0.131653,0.960595,00:17
4,0.077399,0.149306,0.961471,00:17
5,0.063717,0.177402,0.964974,00:17
6,0.031494,0.084892,0.97986,00:18
7,0.033331,0.077717,0.97373,00:18
8,0.018448,0.069668,0.983362,00:18
9,0.021099,0.06196,0.982487,00:17


[I 2024-05-24 23:40:21,627] Trial 7 finished with value: 0.9851138591766357 and parameters: {'arch': <function resnet34 at 0x7a55d092f6d0>, 'wd': 0.004016134981130656, 'epochs': 13, 'bs': 32, 'drop': 0.2679709587877348}. Best is trial 7 with value: 0.9851138591766357.


epoch,train_loss,valid_loss,accuracy,time
0,0.620788,0.288412,0.895797,00:22


epoch,train_loss,valid_loss,accuracy,time
0,0.301431,0.195068,0.93345,00:28
1,0.145843,0.167635,0.943958,00:28
2,0.085029,0.167764,0.947461,00:28
3,0.083989,0.15637,0.952715,00:28
4,0.054462,0.123673,0.958844,00:27
5,0.03449,0.124303,0.966725,00:27
6,0.034632,0.127395,0.960595,00:27
7,0.023373,0.12262,0.965849,00:27
8,0.017231,0.114999,0.971103,00:27
9,0.012671,0.110276,0.970228,00:27


[I 2024-05-24 23:47:49,546] Trial 8 finished with value: 0.9667250514030457 and parameters: {'arch': <function resnet50 at 0x7a55d092f880>, 'wd': 0.008095083336439402, 'epochs': 15, 'bs': 32, 'drop': 0.30129588154296516}. Best is trial 7 with value: 0.9851138591766357.


epoch,train_loss,valid_loss,accuracy,time
0,0.643114,0.311658,0.901051,00:22


epoch,train_loss,valid_loss,accuracy,time
0,0.260481,0.190513,0.932574,00:27
1,0.164268,0.152902,0.950963,00:27
2,0.076222,0.153643,0.95359,00:27
3,0.045928,0.119837,0.966725,00:27
4,0.019451,0.116562,0.965849,00:27
5,0.01966,0.117123,0.964098,00:27


[I 2024-05-24 23:51:04,891] Trial 9 finished with value: 0.9632224440574646 and parameters: {'arch': <function resnet50 at 0x7a55d092f880>, 'wd': 0.0006994648882504557, 'epochs': 6, 'bs': 32, 'drop': 0.3595360164107316}. Best is trial 7 with value: 0.9851138591766357.


Best Accuracy: 0.9851138591766357
Best hyperparameters: {'arch': <function resnet34 at 0x7a55d092f6d0>, 'wd': 0.004016134981130656, 'epochs': 13, 'bs': 32, 'drop': 0.2679709587877348}
