# A Quick Guide to Hyperparameter Tuning in Deep Learning
Deep learning models, such as neural networks, have achieved remarkable success in various applications, from image classification to natural language processing. However, getting the best performance out of these models often requires careful tuning of hyperparameters: key settings that govern how the learning process unfolds. Unlike model parameters (like weights), which the model learns automatically during training, hyperparameters must be set before training starts. Let’s explore what hyperparameter tuning is and why it’s crucial.

## What Are Hyperparameters?
Hyperparameters control different aspects of the model's training process. Some common examples include:

- Learning Rate: Controls how much to adjust the model’s weights with respect to the loss gradient. A small learning rate may lead to slow convergence, while a large one can overshoot the optimal values.
- Batch Size: Determines how many samples are processed before updating the model's weights. Smaller batch sizes make the model update weights more frequently, while larger batches provide more stable estimates of the gradient.
- Number of Layers and Neurons: Defines the depth of the neural network and how many neurons are present in each layer.
- Dropout Rate: The fraction of neurons that are randomly "dropped" during training. Dropout is a regularization technique that helps prevent overfitting.
- Optimizer: The algorithm used to minimize the loss function, e.g., SGD, Adam, RMSprop.
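
To make the learning-rate trade-off concrete, here is a minimal sketch that runs plain gradient descent on the toy function f(w) = w². It is not part of the notebook further down; the function name `gradient_descent`, the starting point, and the three rates are illustrative assumptions.

```python
def gradient_descent(learning_rate, steps=20, w=5.0):
    """Minimise f(w) = w**2 with plain gradient descent and return the final w."""
    for _ in range(steps):
        grad = 2.0 * w                 # derivative of w**2
        w = w - learning_rate * grad   # one weight update
    return w


# Hypothetical learning rates chosen only to show the three regimes.
small = gradient_descent(0.01)  # moves toward 0, but slowly
good = gradient_descent(0.3)    # converges quickly
large = gradient_descent(1.1)   # each step overshoots and the iterate grows

print(f"lr=0.01 -> w={small:.4f}")
print(f"lr=0.3  -> w={good:.2e}")
print(f"lr=1.1  -> w={large:.1f}")
```

After the same 20 steps, the smallest rate is still far from the minimum at 0, the middle rate is essentially there, and the largest rate has moved further away than where it started.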

## Why Hyperparameter Tuning Matters
Even with the best model architecture, poor hyperparameter choices can lead to suboptimal performance or even failure to converge. Tuning the right hyperparameters can significantly improve model accuracy, speed up convergence, and help avoid overfitting.

## Popular Hyperparameter Tuning Methods
- Grid Search: This method exhaustively evaluates every combination in a manually specified grid of hyperparameter values. Though simple, it is computationally expensive, because the number of combinations multiplies with every hyperparameter added.

- Random Search: Instead of trying all combinations, random search samples a fixed number of random combinations. It can often find good solutions faster than grid search.

- Bayesian Optimization: This method models the performance of hyperparameter settings as a probabilistic function and iteratively chooses the next combination to evaluate based on past results. It often needs fewer trials than grid or random search, which matters most when each training run is expensive.

- Hyperband: A resource-efficient method that dynamically allocates more resources to promising hyperparameter configurations while early-stopping less effective ones.
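
The difference between the first two methods can be sketched in a few lines of plain Python. The `validation_score` function below is a hypothetical stand-in for "train a model and return its validation accuracy"; its formula, the search ranges, and the helper names are assumptions made only for illustration.

```python
import itertools
import random


def validation_score(learning_rate, units):
    """Hypothetical stand-in for training a model; peaks at lr=0.01, units=64."""
    return 1.0 - 100.0 * (learning_rate - 0.01) ** 2 - ((units - 64) / 128) ** 2


def grid_search(learning_rates, unit_options):
    """Evaluate every combination in the grid and return the best one."""
    candidates = itertools.product(learning_rates, unit_options)
    return max(candidates, key=lambda c: validation_score(*c))


def random_search(n_trials, seed=0):
    """Evaluate n_trials randomly sampled combinations and return the best one."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        # Sample the learning rate on a log scale, the layer width in steps of 8.
        candidate = (10 ** rng.uniform(-4, -1), rng.randrange(8, 129, 8))
        if best is None or validation_score(*candidate) > validation_score(*best):
            best = candidate
    return best


best_grid = grid_search([0.001, 0.01, 0.1], [16, 64, 128])  # 3 x 3 = 9 runs
best_random = random_search(n_trials=9)                     # also 9 runs
print("grid:", best_grid)
print("random:", best_random)
```

Both searches spend the same budget of nine evaluations here. The grid can only ever return one of its nine predefined points, while random search draws each value from the full range.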

## Best Practices for Hyperparameter Tuning
- Start Simple: Begin with default hyperparameters and gradually refine them based on model performance.
- Use Cross-Validation: To ensure that your tuning process generalizes well to unseen data, use cross-validation to validate the performance.
- Monitor Overfitting: Use techniques like early stopping, dropout, or L2 regularization to prevent overfitting during tuning.
- Automation: Frameworks like Keras Tuner, Optuna, and Hyperopt can help automate the hyperparameter tuning process, saving time and computational resources.
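
As a rough illustration of the cross-validation advice, the sketch below splits sample indices into k folds and averages a score over them. The helper names `k_fold_indices` and `cross_validate` are hypothetical, and the scoring function passed in at the end is a placeholder rather than a real model.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


def cross_validate(score_fn, n_samples, k=5):
    """Average score_fn(train_idx, val_idx) over all k folds."""
    scores = [score_fn(tr, va) for tr, va in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


folds = list(k_fold_indices(10, 3))
for train_idx, val_idx in folds:
    print("train:", train_idx, "val:", val_idx)

# Placeholder score: the size of the validation fold. A real score_fn would
# train on train_idx and return a metric measured on val_idx.
mean_val_size = cross_validate(lambda train_idx, val_idx: len(val_idx), n_samples=10, k=3)
```

In practice the indices are usually shuffled first, and each hyperparameter configuration is scored by its average over the folds rather than by a single split.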
 
## Conclusion
Hyperparameter tuning is a critical step in maximizing the performance of deep learning models. By carefully selecting and optimizing the right hyperparameters, you can significantly enhance your model’s accuracy and efficiency, leading to better and faster results in real-world applications.



# Learn How to Do Hyperparameter Tuning in a Neural Network
The goal here is not to build a perfect model but to learn **how to** do hyperparameter tuning using Keras.

In [1]:
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')



import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

/kaggle/input/pima-indians-diabetes-database/diabetes.csv


In [2]:
df = pd.read_csv("/kaggle/input/pima-indians-diabetes-database/diabetes.csv")

In [3]:
df.head()

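|   | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI | DiabetesPedigreeFunction | Age | Outcome |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 148 | 72 | 35 | 0 | 33.6 | 0.627 | 50 | 1 |
| 1 | 1 | 85 | 66 | 29 | 0 | 26.6 | 0.351 | 31 | 0 |
| 2 | 8 | 183 | 64 | 0 | 0 | 23.3 | 0.672 | 32 | 1 |
| 3 | 1 | 89 | 66 | 23 | 94 | 28.1 | 0.167 | 21 | 0 |
| 4 | 0 | 137 | 40 | 35 | 168 | 43.1 | 2.288 | 33 | 1 |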


In [4]:
df.corr()['Outcome']

Pregnancies                 0.221898
Glucose                     0.466581
BloodPressure               0.065068
SkinThickness               0.074752
Insulin                     0.130548
BMI                         0.292695
DiabetesPedigreeFunction    0.173844
Age                         0.238356
Outcome                     1.000000
Name: Outcome, dtype: float64

In [5]:
X = df.iloc[:,:-1].values
y = df.iloc[:,-1].values

In [6]:
X

array([[  6.   , 148.   ,  72.   , ...,  33.6  ,   0.627,  50.   ],
       [  1.   ,  85.   ,  66.   , ...,  26.6  ,   0.351,  31.   ],
       [  8.   , 183.   ,  64.   , ...,  23.3  ,   0.672,  32.   ],
       ...,
       [  5.   , 121.   ,  72.   , ...,  26.2  ,   0.245,  30.   ],
       [  1.   , 126.   ,  60.   , ...,  30.1  ,   0.349,  47.   ],
       [  1.   ,  93.   ,  70.   , ...,  30.4  ,   0.315,  23.   ]])

In [8]:
# scaling data
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()

In [9]:
X = scaler.fit_transform(X)

In [10]:
X.shape

(768, 8)

In [11]:
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test  = train_test_split(X,y,test_size = 0.2,random_state=1)
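
One caveat about the cells above: `scaler.fit_transform(X)` is called on all 768 rows before the split, so the test rows contribute to the mean and standard deviation used for scaling. A common alternative is to split first and fit the scaler on the training rows only. The sketch below shows the idea in plain Python on made-up numbers; `fit_scaler`, `transform`, and the toy rows are hypothetical and are not part of the notebook.

```python
import statistics


def fit_scaler(rows):
    """Compute per-column mean and population std from the training rows only."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def transform(rows, means, stds):
    """Standardise rows with statistics that were computed elsewhere."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Toy data, invented for illustration.
train_rows = [[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]]
test_rows = [[7.0, 70.0]]

means, stds = fit_scaler(train_rows)                  # statistics from train only
train_scaled = transform(train_rows, means, stds)
test_scaled = transform(test_rows, means, stds)       # reuse the train statistics
print(means, stds)
print(test_scaled)
```

With scikit-learn the same pattern is `scaler.fit_transform(X_train)` followed by `scaler.transform(X_test)`, applied after `train_test_split`.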

In [12]:
import tensorflow
from tensorflow import keras
from keras import Sequential
from keras.layers import Dense,Dropout

In [13]:
model = Sequential()

model.add(Dense(32,activation = 'relu',input_dim=8))
model.add(Dense(1,activation = 'sigmoid'))


In [14]:
model.compile(optimizer = 'Adam',loss = 'binary_crossentropy',metrics = ['accuracy'])

In [15]:
model.fit(X_train,y_train,batch_size = 32,epochs=100,validation_data=(X_test,y_test))

Epoch 1/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 12ms/step - accuracy: 0.4050 - loss: 0.7903 - val_accuracy: 0.5779 - val_loss: 0.6998
Epoch 2/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.6274 - loss: 0.6769 - val_accuracy: 0.6623 - val_loss: 0.6413
Epoch 3/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.6878 - loss: 0.6360 - val_accuracy: 0.7662 - val_loss: 0.6007
Epoch 4/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7167 - loss: 0.6042 - val_accuracy: 0.7792 - val_loss: 0.5726
Epoch 5/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7773 - loss: 0.5721 - val_accuracy: 0.7987 - val_loss: 0.5503
Epoch 6/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7680 - loss: 0.5423 - val_accuracy: 0.8117 - val_loss: 0.5318
Epoch 7/100
...

<keras.src.callbacks.history.History at 0x7cab93676c80>
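
The log above prints `val_loss` after every epoch, which is the signal early stopping relies on: training ends once the validation loss has not improved for a set number of epochs. Below is a minimal sketch of that rule in plain Python. The function name `early_stopping_epoch` and the second loss sequence are hypothetical; this is not the Keras callback itself, only the logic it is built around.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None if it never does."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# The six complete val_loss values visible in the log above: still improving.
logged = [0.6998, 0.6413, 0.6007, 0.5726, 0.5503, 0.5318]
# A made-up sequence where the loss stalls after the second epoch.
stalled = [0.60, 0.55, 0.56, 0.57, 0.50]

print(early_stopping_epoch(logged))   # None
print(early_stopping_epoch(stalled))  # 4
```

In the stalled sequence the loss would have improved again at epoch 5, which shows the trade-off behind `patience`: a small value saves compute but can stop a run too early.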

In [16]:
# 1. how to select an appropriate optimizer
# 2. how to select the number of nodes in a hidden layer
# 3. how to select the number of layers
# 4. all of the above in one model

In [17]:
import keras_tuner as kt

# 1. How to Select an Appropriate Optimizer:

In [18]:
def build_model(hp):
    model = Sequential()

    model.add(Dense(32,activation='relu',input_dim=8))
    model.add(Dense(1,activation='sigmoid'))
    
    optimizer = hp.Choice('optimizer',values = ['adam','sgd','rmsprop','adadelta'])
    model.compile(optimizer=optimizer,loss = 'binary_crossentropy',metrics=['accuracy'])
    return model

In [19]:
tuner = kt.RandomSearch(build_model,objective='val_accuracy',max_trials=5)

In [20]:
tuner.search(X_train,y_train,epochs=5,validation_data = (X_test,y_test))

Trial 4 Complete [00h 00m 02s]
val_accuracy: 0.701298713684082

Best val_accuracy So Far: 0.798701286315918
Total elapsed time: 00h 00m 08s


In [21]:
tuner.get_best_hyperparameters()[0].values

{'optimizer': 'adam'}

In [22]:
model = tuner.get_best_models(num_models=1)[0]

In [23]:
model.summary()

In [24]:
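# initial_epoch only offsets the epoch counter; the search above used epochs=5,
# so initial_epoch=5 would continue the numbering at "Epoch 6"
model.fit(X_train,y_train,batch_size=32,epochs=100,initial_epoch=6,validation_data=(X_test,y_test))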

Epoch 7/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 12ms/step - accuracy: 0.7795 - loss: 0.4953 - val_accuracy: 0.7922 - val_loss: 0.4756
Epoch 8/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7663 - loss: 0.5028 - val_accuracy: 0.7922 - val_loss: 0.4696
Epoch 9/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7621 - loss: 0.4897 - val_accuracy: 0.7857 - val_loss: 0.4660
Epoch 10/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7911 - loss: 0.4504 - val_accuracy: 0.7857 - val_loss: 0.4645
Epoch 11/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7776 - loss: 0.4709 - val_accuracy: 0.7922 - val_loss: 0.4636
Epoch 12/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7850 - loss: 0.4738 - val_accuracy: 0.7922 - val_loss: 0.4617
Epoch 13/100
...

<keras.src.callbacks.history.History at 0x7cabe69a5960>

# 2. How to Select the Number of Nodes in a Layer:

In [25]:
def build_model(hp):
    model = Sequential()

    units = hp.Int('units',min_value=8,max_value = 128,step=8)
    model.add(Dense(units=units,activation='relu',input_dim=8))
    model.add(Dense(1,activation='sigmoid'))

    model.compile(optimizer='rmsprop',loss = 'binary_crossentropy',metrics=['accuracy'])
    return model   
    

In [26]:
tuner = kt.RandomSearch(build_model,
                        objective='val_accuracy',
                        max_trials=5,
                        directory='mydir',
                       project_name='Vijay Shah')

In [27]:
tuner.search(X_train,y_train,epochs=5,validation_data=(X_test,y_test))

Trial 5 Complete [00h 00m 02s]
val_accuracy: 0.6363636255264282

Best val_accuracy So Far: 0.798701286315918
Total elapsed time: 00h 00m 10s


In [28]:
tuner.get_best_hyperparameters()[0].values

{'units': 128}

In [29]:
model = tuner.get_best_models(num_models=1)[0]

In [30]:
model.fit(X_train,y_train,batch_size=32,epochs=100,initial_epoch=6)

Epoch 7/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7731 - loss: 0.4649
Epoch 8/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.7587 - loss: 0.4982
Epoch 9/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.7687 - loss: 0.4698
Epoch 10/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.8167 - loss: 0.4151
Epoch 11/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.7942 - loss: 0.4313
Epoch 12/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.7795 - loss: 0.4595
Epoch 13/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.7631 - loss: 0.4505
Epoch 14/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.7551 - loss: 0.4764
Epoch 15/100
...

<keras.src.callbacks.history.History at 0x7cab781eb850>

# 3. How to Select the Number of Layers:

In [31]:
def build_model(hp):
    model = Sequential()

    model.add(Dense(72,activation='relu',input_dim=8))
    for i in range(hp.Int('num_layers',min_value=1,max_value=10)):
        model.add(Dense(72,activation='relu'))

    model.add(Dense(1,activation='sigmoid'))
    model.compile(optimizer = 'rmsprop',loss = 'binary_crossentropy',metrics=['accuracy'])
    return model

In [32]:
tuner = kt.RandomSearch(build_model,
                       objective='val_accuracy',
                       max_trials=3,
                       directory='mydir',
                       project_name='num_layers')

In [33]:
tuner.search(X_train,y_train,epochs=5,validation_data=(X_test,y_test))

Trial 3 Complete [00h 00m 02s]
val_accuracy: 0.8116883039474487

Best val_accuracy So Far: 0.8311688303947449
Total elapsed time: 00h 00m 09s


In [34]:
tuner.get_best_hyperparameters()[0].values

{'num_layers': 8}

In [35]:
model = tuner.get_best_models(num_models=1)[0]

In [36]:
model.fit(X_train,y_train,epochs=100, initial_epoch=6,validation_data=(X_test,y_test))

Epoch 7/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 2s 15ms/step - accuracy: 0.7798 - loss: 0.4689 - val_accuracy: 0.7922 - val_loss: 0.5199
Epoch 8/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7925 - loss: 0.4577 - val_accuracy: 0.8052 - val_loss: 0.4952
Epoch 9/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7818 - loss: 0.4597 - val_accuracy: 0.7143 - val_loss: 0.5564
Epoch 10/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7939 - loss: 0.4279 - val_accuracy: 0.7987 - val_loss: 0.4910
Epoch 11/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8234 - loss: 0.4357 - val_accuracy: 0.8052 - val_loss: 0.4754
Epoch 12/100
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7918 - loss: 0.3943 - val_accuracy: 0.7857 - val_loss: 0.5111
Epoch 13/100
...

<keras.src.callbacks.history.History at 0x7cab5b2f2260>

# 4. Build the Complete Model:

In [37]:
def build_model(hp):
    model = Sequential()

    # Tune the number of hidden layers, then the width and activation of each one
    for i in range(hp.Int('num_layer',min_value=1,max_value=10)):
        units = hp.Int('units' + str(i),min_value=8,max_value=128,step=8)
        activation = hp.Choice('activation' + str(i),values=['relu','tanh','sigmoid'])
        if i == 0:
            # First hidden layer: declares the input size and is followed by a tuned dropout
            model.add(Dense(units,activation=activation,input_dim=8))
            model.add(Dropout(hp.Choice('dropout' + str(i),values=[0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9])))
        else:
            model.add(Dense(units,activation=activation))

    # Single sigmoid output for binary classification; the optimizer is tuned as well
    model.add(Dense(1,activation='sigmoid'))
    model.compile(optimizer=hp.Choice('optimizer',values=['rmsprop','adam','sgd','nadam','adadelta']),
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model

In [38]:
tuner = kt.RandomSearch(build_model,
                         objective='val_accuracy',
                         max_trials=3,
                         directory= 'mydir',
                         project_name = 'final1'
                        )

In [39]:
tuner.search(X_train,y_train,epochs=5,validation_data=(X_test,y_test))

Trial 3 Complete [00h 00m 02s]
val_accuracy: 0.6428571343421936

Best val_accuracy So Far: 0.798701286315918
Total elapsed time: 00h 00m 08s


In [40]:
tuner.get_best_hyperparameters()[0].values

{'num_layer': 3,
 'units0': 48,
 'activation0': 'relu',
 'dropout0': 0.2,
 'optimizer': 'rmsprop',
 'units1': 8,
 'activation1': 'relu',
 'units2': 8,
 'activation2': 'relu'}

In [41]:
model = tuner.get_best_models(num_models=1)[0]

In [42]:
model.fit(X_train,y_train,epochs=200,initial_epoch=5,validation_data=(X_test,y_test))

Epoch 6/200
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 13ms/step - accuracy: 0.7000 - loss: 0.6553 - val_accuracy: 0.7987 - val_loss: 0.6240
Epoch 7/200
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7883 - loss: 0.6337 - val_accuracy: 0.8052 - val_loss: 0.6174
Epoch 8/200
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7628 - loss: 0.6463 - val_accuracy: 0.7987 - val_loss: 0.6133
Epoch 9/200
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7770 - loss: 0.6206 - val_accuracy: 0.7857 - val_loss: 0.6068
Epoch 10/200
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7752 - loss: 0.6216 - val_accuracy: 0.7922 - val_loss: 0.6020
Epoch 11/200
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7724 - loss: 0.6139 - val_accuracy: 0.7857 - val_loss: 0.5972
Epoch 12/200
...

<keras.src.callbacks.history.History at 0x7cab5ae3a440>