# Hyperparameter Tuning

Hyperparameters are one of the basic building blocks of a neural network, and among the hardest to get right. This notebook accompanies my article on Analytics Vidhya (@analytics_vidhya) and serves as a hands-on guide to the problem.

# Solution

The solution discussed here is simple. Here are the steps:

BASIC:
- Loading libraries
- Loading the dataset & preprocessing - MNIST
- Creating a model fn & quick evaluation

Tuned:
- Creating h_param_dict
- Fitting to the estimator
- Finding the best model & training
- Evaluation

# Basic

In [1]:
# loading libraries
import tensorflow
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dropout, Dense
from tensorflow.keras.optimizers import Adam

In [2]:
# is_gpu_available() is deprecated (see the notice in the output below)
print(tensorflow.test.is_gpu_available())
print(tensorflow.test.is_built_with_cuda())

Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
False
True


In [3]:
# Loading Dataset and preprocessing
from tensorflow.keras.datasets import mnist

# loading dataset
(train_images, train_labels),(test_images, test_labels) = mnist.load_data()

# conversion + rescaling of pixel values from [0, 255] to [0, 1]
train_data = train_images.astype("float32") / 255
test_data = test_images.astype("float32") / 255

print(train_data.shape , test_data.shape)

(60000, 28, 28) (10000, 28, 28)
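
To make the rescaling step concrete, here is a minimal plain-Python sketch of what dividing by 255 does to pixel intensities. The `raw_pixels` values are made up for illustration and are not taken from MNIST.

```python
# Illustration of the "conversion + rescaling" step without NumPy.
# `raw_pixels` is a hypothetical row of 8-bit intensities, not MNIST data.
raw_pixels = [0, 51, 102, 255]

# Dividing by 255 maps the integer range 0..255 onto the float range 0.0..1.0.
scaled = [p / 255 for p in raw_pixels]

print(scaled)
```

Keeping the inputs in a small, fixed range usually makes gradient-based training better behaved than feeding raw 0..255 values.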


In [4]:
# creating model fn
def create_model(hidden_layer_one = 784, hidden_layer_two = 256, dropout = 0.2,  lr_rate = 0.01):
    
    # initializing a sequential model + flattening the 28x28 input
    model = Sequential()
    model.add(Flatten())
    
    # creating our first FC layer - Dense => ReLU => Dropout
    model.add(Dense(hidden_layer_one, activation = 'relu'))
    model.add(Dropout(dropout))
    
    # creating our second FC layer - Dense => ReLU => Dropout
    model.add(Dense(hidden_layer_two, activation = 'relu'))
    model.add(Dropout(dropout))
    
    # adding a softmax layer on top (one output per digit class)
    model.add(Dense(10, activation = 'softmax'))
    
    # compiling the model
    model.compile(optimizer= Adam(learning_rate = lr_rate ),
                  loss = "sparse_categorical_crossentropy",
                  metrics=['accuracy'])
    
    # returning the compiled model
    return model
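
As a rough sanity check on how large this network is, the sketch below counts the trainable parameters of the three Dense layers in plain Python. The helper names `dense_params` and `count_params` are hypothetical and introduced only for this illustration. It assumes the 28x28 MNIST input is flattened to 784 features and that Flatten and Dropout add no trainable parameters.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def count_params(n_inputs=28 * 28, hidden_layer_one=784,
                 hidden_layer_two=256, n_classes=10):
    """Sum the parameters of Dense(hl1) -> Dense(hl2) -> Dense(n_classes)."""
    sizes = [n_inputs, hidden_layer_one, hidden_layer_two, n_classes]
    return sum(dense_params(a, b) for a, b in zip(sizes, sizes[1:]))


default_total = count_params()  # defaults of create_model
small_total = count_params(hidden_layer_one=256, hidden_layer_two=128)

print(default_total)  # 818970
print(small_total)    # 235146
```

The hidden layer sizes therefore change the parameter count by a factor of more than three across the values tried later, which is one reason they are worth tuning.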
    

In [5]:
# evaluation - Basic

print("fetching model...")
model = create_model()

print("completed...")

print("training model")
h = model.fit(x = train_data, y= train_labels,
             validation_data = (test_data, test_labels),
             batch_size = 8,
             epochs = 20)

# make predictions on the test set and evaluate it
print("evaluating network...")
accuracy = model.evaluate(test_data, test_labels)[1]

print("accuracy: {:.2f}%".format(accuracy * 100))

fetching model...
completed...
training model
Train on 60000 samples, validate on 10000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
evaluating network...


accuracy: 76.50%


**Baseline accuracy: 76.50%**
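
For a sense of the training cost behind this baseline, the following sketch works out how many gradient updates the `fit` call above performs. It assumes the numbers used in that call (60000 training samples, `batch_size = 8`, `epochs = 20`); the helper `count_updates` is a hypothetical name.

```python
import math


def count_updates(n_samples, batch_size, epochs):
    """Return (steps per epoch, total gradient updates)."""
    # The last batch of an epoch may be smaller, hence the ceiling.
    steps_per_epoch = math.ceil(n_samples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


steps, total_updates = count_updates(60000, 8, 20)
print(steps, total_updates)  # 7500 150000
```

Batch size and epoch count together decide how many updates the optimizer gets, which is why both appear in the search space below.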

# Tuned

In [7]:
# import tensorflow and fix the random seed for better reproducibility
import tensorflow as tf
tf.random.set_seed(42)

# import the necessary packages
# note: newer TensorFlow releases no longer ship this wrapper module;
# the SciKeras package provides a KerasClassifier replacement
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import RandomizedSearchCV


In [15]:
# creating h-param dict
# define the candidate values of the hyperparameter search space

HL1 = [256, 512, 784] # units in the first hidden layer
HL2 = [128, 256, 512] # units in the second hidden layer
LR = [1e-2, 1e-3, 1e-4] # learning rate (defined but not added to the grid below,
                        # so the search keeps create_model's default lr_rate)
DROP = [0.3, 0.4, 0.5] # dropout rate
BATCH_SZ = [4, 8, 16, 32] # batch size
EPOCHS = [10, 20, 30, 40] # epochs

# create a dictionary from the hyperparameter grid
grid = dict(
    hidden_layer_one = HL1,
    hidden_layer_two = HL2,
    dropout = DROP,
    batch_size = BATCH_SZ,
    epochs = EPOCHS)
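
To see why a random search is attractive here, the sketch below enumerates the grid above in plain Python and draws a handful of configurations from it. It is only an illustration of the idea, not scikit-learn's implementation. It assumes the defaults of `RandomizedSearchCV` (`n_iter=10`, `refit=True`) together with `cv=3`.

```python
import itertools
import random

# Same values as the grid dictionary above.
grid = {
    "hidden_layer_one": [256, 512, 784],
    "hidden_layer_two": [128, 256, 512],
    "dropout": [0.3, 0.4, 0.5],
    "batch_size": [4, 8, 16, 32],
    "epochs": [10, 20, 30, 40],
}

# Exhaustive grid: every combination of one value per hyperparameter.
all_combos = list(itertools.product(*grid.values()))
n_combos = len(all_combos)  # 3 * 3 * 3 * 4 * 4 = 432

# Random search: sample a few combinations without replacement.
n_iter, cv = 10, 3
rng = random.Random(42)
candidates = [dict(zip(grid, combo)) for combo in rng.sample(all_combos, n_iter)]

# One fit per candidate per fold, plus one final refit on all training data.
n_fits = n_iter * cv + 1

print(n_combos, n_fits)  # 432 31
```

An exhaustive grid search with the same 3-fold setup would need 432 * 3 = 1296 fits, so sampling ten candidates trades coverage for a much shorter run.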


In [26]:
model = KerasClassifier(build_fn=create_model, verbose=0)


# initialize a random search with 3-fold cross-validation and then
# start the hyperparameter search process
print("[INFO] performing random search...")

searcher = RandomizedSearchCV(estimator=model, n_jobs=1, cv=3,
                             param_distributions=grid, scoring="accuracy")

# run the search; the extra verbose argument is forwarded to the Keras fit call
searchResults = searcher.fit(train_data, train_labels, verbose = 10)

[INFO] performing random search...
Train on 40000 samples
Epoch 1/20
...
Epoch 20/20
[per-epoch progress lines of the remaining fits omitted; the captured log is truncated]


In [35]:
# summarize the random search results
best_score = searchResults.best_score_
best_params = searchResults.best_params_

print("[INFO] best score is {:.2f} using {}".format(best_score,best_params))

[INFO] best score is 0.94 using {'hidden_layer_two': 128, 'hidden_layer_one': 512, 'epochs': 40, 'dropout': 0.4, 'batch_size': 32}
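
The best score reported here is a cross-validated one: each candidate is trained on two thirds of the training data and scored on the remaining third, three times over, and the fold scores are averaged. The sketch below shows that splitting in plain Python. It uses simple contiguous folds as an assumption (scikit-learn may shuffle or stratify differently), and the `fold_scores` values are made up for illustration.

```python
def k_fold_splits(n_samples, k):
    """Return a list of (train_indices, validation_indices), one pair per fold."""
    base, extra = divmod(n_samples, k)
    splits, start = [], 0
    for fold in range(k):
        # Spread any remainder over the first `extra` folds.
        size = base + (1 if fold < extra else 0)
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        splits.append((train_idx, val_idx))
        start += size
    return splits


splits = k_fold_splits(60000, 3)
sizes = [(len(train), len(val)) for train, val in splits]
print(sizes)  # [(40000, 20000), (40000, 20000), (40000, 20000)]

# Hypothetical per-fold accuracies for one candidate; the search reports their mean.
fold_scores = [0.93, 0.94, 0.95]
mean_score = sum(fold_scores) / len(fold_scores)
```

The 40000 training samples per fold match the "Train on 40000 samples" lines in the search log above.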


In [48]:
# grabbing the best model
best_model = searchResults.best_estimator_
# checking the accuracy
accuracy = best_model.score(test_data,test_labels)
print("accuracy: {:.2f}%".format(accuracy * 100))

accuracy: 94.83%


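A huge leap: test accuracy rises from 76.50% for the baseline to 94.83% for the tuned model.

For a faster search, however, consider scikit-learn's successive-halving variant, `HalvingRandomSearchCV`: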

```
# explicitly require this experimental feature
from sklearn.experimental import enable_halving_search_cv # noqa

# now you can import normally from model_selection
from sklearn.model_selection import HalvingRandomSearchCV
```
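
The idea behind a halving search is to score many candidates on a small budget, keep only the best fraction, and repeat with a larger budget. The sketch below illustrates that schedule in plain Python. It is a simplified illustration under stated assumptions, not scikit-learn's implementation: `successive_halving`, `toy_score`, and the budget numbers are hypothetical, and the toy score ignores the budget so that the result is deterministic.

```python
import math


def successive_halving(candidates, score_fn, min_resources, max_resources, factor=3):
    """Keep the top 1/factor candidates each round while growing the budget."""
    survivors = list(candidates)
    resources = min_resources
    history = []  # (number of candidates, resources per candidate) for each round
    while True:
        ranked = sorted(survivors, key=lambda c: score_fn(c, resources), reverse=True)
        history.append((len(ranked), resources))
        # Stop when one candidate is left or the budget cannot grow any further.
        if len(ranked) <= 1 or resources * factor > max_resources:
            return ranked[0], history
        survivors = ranked[:math.ceil(len(ranked) / factor)]
        resources *= factor


def toy_score(candidate, resources):
    """Stand-in for a cross-validated score; peaks at candidate 20."""
    return -abs(candidate - 20)


best, history = successive_halving(range(27), toy_score,
                                   min_resources=2000, max_resources=60000)
print(best)     # 20
print(history)  # [(27, 2000), (9, 6000), (3, 18000), (1, 54000)]
```

Most candidates are discarded after training on only a small share of the data, which is where the speed-up over a plain random search comes from.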
