# Optimizing Neural Network Hyperparameters

We return to an image recognition problem on the MNIST dataset of handwritten digits (images of 28 x 28 pixels).
The MNIST dataset has a training set of 60,000 images and a test set of 10,000 images.
The digits have been size-normalized and centered in a fixed-size image.
It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal effort on preprocessing and formatting.

In this notebook you will follow four different approaches to tune the hyperparameters:
- "Trial and Error" approach
- "Grid Search" Hyperparameter optimization with Scikit-Learn wrapper
- "Random Search" Hyperparameter optimization with Scikit-Learn wrapper
- Bayesian Optimization using Hyperopt

### Data Preparation

In [5]:
from tensorflow.keras.datasets import mnist
import numpy as np

(train_images, train_labels), (test_images, test_labels) = mnist.load_data() # train_x, train_y, test_x, test_y
# x values are the 28x28 pixels
# y values are the labels, the digit that is being represented in the image. We have 10 labels. Digits from 0 to 9.


In [6]:
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype('float32') / 255
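
To see what the reshape and rescale step does, here is a minimal plain-Python sketch on a toy 2 x 3 "image". The helper `flatten_and_scale` and the toy pixel values are illustrative assumptions, not part of the notebook.

```python
def flatten_and_scale(image, max_value=255):
    """Flatten a 2-D list of pixel intensities row by row and scale to [0, 1]."""
    return [pixel / max_value for row in image for pixel in row]

# Toy 2x3 image with intensities in 0..255 (made-up values).
toy_image = [[0, 51, 102],
             [153, 204, 255]]

flat = flatten_and_scale(toy_image)
print(len(flat))          # 6 values, just as a 28x28 image becomes 784 values
print(flat[0], flat[-1])  # smallest and largest intensity after scaling
```

The same idea applies to MNIST: each 28 x 28 image becomes one row of 784 values between 0 and 1.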

In [7]:
# We need to encode the categorical labels in one-hot format

from tensorflow.keras.utils import to_categorical

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
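
As a rough illustration of what one-hot encoding produces, the sketch below encodes a few toy labels in plain Python. The helpers `one_hot` and `decode` are hypothetical stand-ins written for this example; in the notebook, `to_categorical` does the encoding and `np.argmax` the decoding.

```python
def one_hot(labels, num_classes):
    """Return one row per label, with a 1.0 at the label's index and 0.0 elsewhere."""
    encoded = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        encoded.append(row)
    return encoded

def decode(encoded):
    """Recover the integer label as the position of the largest entry."""
    return [row.index(max(row)) for row in encoded]

toy_labels = [3, 0, 9]  # made-up digit labels
encoded = one_hot(toy_labels, num_classes=10)
print(encoded[0])
print(decode(encoded))
```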

In [8]:
# We split the training set into training and validation set
# 30% validation, 70% training

split_size = int(train_images.shape[0]*0.7)

train_images, val_images = train_images[:split_size], train_images[split_size:]
train_labels, val_labels = train_labels[:split_size], train_labels[split_size:]
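
A quick sanity check of the split sizes, sketched in plain Python. The helper name `holdout_split` is an assumption for illustration; it mirrors the slicing above, which keeps the samples in their stored order and does not shuffle them.

```python
def holdout_split(items, train_fraction):
    """Split a sequence into a leading training part and a trailing validation part."""
    split_at = int(len(items) * train_fraction)
    return items[:split_at], items[split_at:]

samples = list(range(60000))  # stand-in for the 60,000 training images
train_part, val_part = holdout_split(samples, 0.7)
print(len(train_part), len(val_part))  # 42000 18000
```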

In [9]:
# Fix random seed for reproducibility
seed = 42
np.random.seed(seed)

### Building the Model

Build a neural network with three layers (input, hidden, and output):
- Dense layer with 50 hidden units and an appropriate activation function;
- Dense layer with 10 output units and an appropriate activation function.

First we define some useful parameters:

In [10]:
# define vars
input_num_units = 784 # 28x28
hidden_num_units = 50
output_num_units = 10

epochs = 5
batch_size = 128

**Import the Keras packages that you think you may need.**

In [11]:
# Import Keras modules
from tensorflow.keras import models
from tensorflow.keras import layers
from tensorflow.keras import optimizers

**Create the model as described above.**

In [12]:
# Create a Sequential Model
model = models.Sequential()
model.add(layers.Dense(units=hidden_num_units, activation='relu', input_shape=(input_num_units,)))
model.add(layers.Dense(units=output_num_units, activation='softmax'))

# Print a summary of your model
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 50)                39250     
                                                                 
 dense_1 (Dense)             (None, 10)                510       
                                                                 
Total params: 39,760
Trainable params: 39,760
Non-trainable params: 0
_________________________________________________________________
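
The parameter counts in the summary can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The helper `dense_param_count` below is an illustrative assumption, not a Keras function.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden = dense_param_count(784, 50)  # hidden layer fed by the 784 pixels
output = dense_param_count(50, 10)   # output layer fed by the 50 hidden units
print(hidden, output, hidden + output)  # 39250 510 39760
```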


In [13]:
# or Create a functional model
from tensorflow.keras import Input
from tensorflow.keras import Model

inputs = Input(shape=(input_num_units,))
x = layers.Dense(hidden_num_units, activation='relu')(inputs)
outputs = layers.Dense(output_num_units, activation='softmax')(x)
func_model = Model(inputs=inputs, outputs=outputs)

# Print a summary of your model
# Here the input layer is shown in the summary
func_model.summary()

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 784)]             0         
                                                                 
 dense_2 (Dense)             (None, 50)                39250     
                                                                 
 dense_3 (Dense)             (None, 10)                510       
                                                                 
Total params: 39,760
Trainable params: 39,760
Non-trainable params: 0
_________________________________________________________________


**Configure the model with an optimizer and an appropriate loss function.**

In [10]:
# Compile the model with necessary attributes
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
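
To make the loss concrete, here is a minimal plain-Python sketch of softmax followed by categorical cross-entropy for a single sample. The function names, the toy logits, and the `eps` clipping value are assumptions for illustration; Keras computes the same quantity internally on batches.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot_label, probabilities, eps=1e-7):
    """Loss for one sample: -sum(y_i * log(p_i)), with p clipped away from 0."""
    return -sum(y * math.log(max(p, eps))
                for y, p in zip(one_hot_label, probabilities))

probs = softmax([2.0, 1.0, 0.1])  # made-up scores for a 3-class toy problem
label = [1.0, 0.0, 0.0]           # true class is index 0
loss = categorical_crossentropy(label, probs)
wrong_loss = categorical_crossentropy([0.0, 0.0, 1.0], probs)
print(loss, wrong_loss)
```

With a one-hot label, only the probability assigned to the true class contributes, so the loss is small when that probability is high and large when it is low.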

**Train the model (also with the validation set).**

In [11]:
# Train the model
trained_model = model.fit(train_images, train_labels, epochs=epochs, 
                          batch_size=batch_size, validation_data=(val_images, val_labels))

Epoch 1/5
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: 'arguments' object has no attribute 'posonlyargs'
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


### Model Evaluation

**Test the model: get predictions and evaluate the model.**

In [12]:
from sklearn.metrics import confusion_matrix,ConfusionMatrixDisplay

pred = np.argmax(model.predict(test_images), axis=-1)
label = np.argmax(test_labels, axis = 1)

confusion = confusion_matrix(label, pred)
print(confusion)

[[ 967    0    1    3    1    1    2    2    2    1]
 [   0 1116    4    0    1    1    4    2    7    0]
 [   3    3  981    8    5    2    4    7   19    0]
 [   1    0    9  958    0    8    0    9   17    8]
 [   1    0    6    0  949    0    2    2    3   19]
 [   5    1    0   19    6  823   11    3   20    4]
 [   8    3    1    2    6    9  920    2    7    0]
 [   0    8   15    6    3    1    0  972    3   20]
 [   4    3    2    9    6    4    5    9  927    5]
 [   5    6    1   12   18    2    0   12    5  948]]


In [13]:
test_loss, test_acc = model.evaluate(test_images, test_labels) 



In [14]:
print('test_acc:', test_acc)

test_acc: 0.9560999870300293
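
The reported accuracy can be recovered from the confusion matrix printed above: correct predictions sit on the diagonal. The sketch below copies that matrix and also computes per-digit recall; the helper names are assumptions for illustration.

```python
# Confusion matrix copied from the output above (rows: true digit, columns: predicted digit).
confusion = [
    [967, 0, 1, 3, 1, 1, 2, 2, 2, 1],
    [0, 1116, 4, 0, 1, 1, 4, 2, 7, 0],
    [3, 3, 981, 8, 5, 2, 4, 7, 19, 0],
    [1, 0, 9, 958, 0, 8, 0, 9, 17, 8],
    [1, 0, 6, 0, 949, 0, 2, 2, 3, 19],
    [5, 1, 0, 19, 6, 823, 11, 3, 20, 4],
    [8, 3, 1, 2, 6, 9, 920, 2, 7, 0],
    [0, 8, 15, 6, 3, 1, 0, 972, 3, 20],
    [4, 3, 2, 9, 6, 4, 5, 9, 927, 5],
    [5, 6, 1, 12, 18, 2, 0, 12, 5, 948],
]

def accuracy_from_confusion(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def per_class_recall(matrix):
    """For each true digit, the fraction of its samples predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]

accuracy = accuracy_from_confusion(confusion)
recalls = per_class_recall(confusion)
worst_digit = min(range(len(recalls)), key=lambda d: recalls[d])
print(accuracy)     # 0.9561
print(worst_digit)  # 5
```

Per-class recall shows where the errors concentrate, which a single accuracy number hides.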


Let's try to improve it by tuning some Hyperparameters.

### Hyperparameter Optimization - Trial and Error

Some important parameters to look out for when optimizing neural networks are:
- Type of architecture
- Number of layers
- Number of neurons per layer
- Regularization parameters
- Learning rate
- Type of optimization/backpropagation technique
- Dropout rate
- Weight sharing

Now repeat all the previous steps (build, train, test, etc.) while tuning the following parameters:
1. Make the model "wide": increase the number of neurons in the hidden layer;
2. Make the model "deep": increase the number of hidden layers, with 50 neurons each;
3. Add dropout to deal with overfitting;
4. Increase the number of epochs to 50;
5. Make the model both "wide" and "deep": more hidden layers, each with more than 50 neurons.

After every step, analyse your results and draw some conclusions.

### 1. Make the model "wide": increase number of neurons in the hidden layer.

**Define the new variables.**

In [15]:
input_num_units = 784
hidden_num_units = 500
output_num_units = 10
epochs = 5
batch_size = 128

**Build the network.**

In [16]:
model_1  = models.Sequential()
model_1.add(layers.Dense(units=hidden_num_units, activation='relu', input_shape=(input_num_units,)))
model_1.add(layers.Dense(units=output_num_units, activation='softmax'))

**Configure the network.**

In [17]:
model_1.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

**Train the network.**

In [18]:
trained_model_1 = model_1.fit(train_images, train_labels, epochs=epochs, batch_size=batch_size, validation_data=(val_images, val_labels))

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


Considerations: what can you notice from these results? Is your model performing better than before?

**Results**: the model should perform a lot better than before.

### 2. Make the model "deep": Increase the number of hidden layers.

**Define the new variables.**

In [19]:
input_num_units = 784
hidden1_num_units = 50
hidden2_num_units = 50
hidden3_num_units = 50
hidden4_num_units = 50
hidden5_num_units = 50
output_num_units = 10

epochs = 5
batch_size = 128

**Build the network.**

In [20]:
model_2 = models.Sequential()

model_2.add(layers.Dense(units=hidden1_num_units, input_shape=(input_num_units,), activation='relu'))
model_2.add(layers.Dense(units=hidden2_num_units, activation='relu'))
model_2.add(layers.Dense(units=hidden3_num_units, activation='relu'))
model_2.add(layers.Dense(units=hidden4_num_units, activation='relu'))
model_2.add(layers.Dense(units=hidden5_num_units, activation='relu'))

model_2.add(layers.Dense(units=output_num_units, activation='softmax'))

model_2.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_6 (Dense)              (None, 50)                39250     
_________________________________________________________________
dense_7 (Dense)              (None, 50)                2550      
_________________________________________________________________
dense_8 (Dense)              (None, 50)                2550      
_________________________________________________________________
dense_9 (Dense)              (None, 50)                2550      
_________________________________________________________________
dense_10 (Dense)             (None, 50)                2550      
_________________________________________________________________
dense_11 (Dense)             (None, 10)                510       
Total params: 49,960
Trainable params: 49,960
Non-trainable params: 0
__________________________________________________

**Configure the network.**

In [21]:
model_2.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

**Train the network.**

In [22]:
trained_model_2 = model_2.fit(train_images, train_labels, epochs=epochs, 
                              batch_size=batch_size, validation_data=(val_images, val_labels))

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


Considerations: what can you notice from these results? Is your model performing better than before?

**Results**: the model performs slightly better than the original model but worse than the "wide" model from step 1. This may be due to mild overfitting. To deal with this, we will use the dropout technique.

### 3. Dropout

<img src="images/dropout.png" width='500'/>

**Define the new variables; remember to also define `dropout_ratio`.**

In [19]:
input_num_units = 784
hidden1_num_units = 50
hidden2_num_units = 50
hidden3_num_units = 50
hidden4_num_units = 50
hidden5_num_units = 50
output_num_units = 10

epochs = 5
batch_size = 128

dropout_ratio = 0.2
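
To make `dropout_ratio` concrete, here is a minimal plain-Python sketch of "inverted" dropout as applied during training: each activation is zeroed with probability `rate`, and the survivors are scaled by `1 / (1 - rate)` so the expected sum stays the same. The helper `inverted_dropout` and the toy activations are assumptions for illustration; in the notebook, `layers.Dropout` handles this.

```python
import random

def inverted_dropout(activations, rate, rng):
    """Zero each activation with probability `rate`; scale the rest by 1 / (1 - rate)."""
    keep_prob = 1.0 - rate
    return [a / keep_prob if rng.random() >= rate else 0.0 for a in activations]

rng = random.Random(42)   # seeded generator so the sketch is repeatable
activations = [1.0] * 10  # made-up layer output
dropped = inverted_dropout(activations, rate=0.2, rng=rng)
unchanged = inverted_dropout(activations, rate=0.0, rng=rng)
print(dropped)
```

At inference time dropout is switched off, which is why the scaling is applied during training instead.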

**Build the network.**

In [20]:
model_3 = models.Sequential()

model_3.add(layers.Dense(units=hidden1_num_units, input_shape=(input_num_units,), activation='relu'))
model_3.add(layers.Dropout(dropout_ratio))
model_3.add(layers.Dense(units=hidden2_num_units, activation='relu'))
model_3.add(layers.Dropout(dropout_ratio))
model_3.add(layers.Dense(units=hidden3_num_units, activation='relu'))
model_3.add(layers.Dropout(dropout_ratio))
model_3.add(layers.Dense(units=hidden4_num_units, activation='relu'))
model_3.add(layers.Dropout(dropout_ratio))
model_3.add(layers.Dense(units=hidden5_num_units, activation='relu'))
model_3.add(layers.Dropout(dropout_ratio))

model_3.add(layers.Dense(output_num_units, activation='softmax'))

model_3.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_6 (Dense)              (None, 50)                39250     
_________________________________________________________________
dropout (Dropout)            (None, 50)                0         
_________________________________________________________________
dense_7 (Dense)              (None, 50)                2550      
_________________________________________________________________
dropout_1 (Dropout)          (None, 50)                0         
_________________________________________________________________
dense_8 (Dense)              (None, 50)                2550      
_________________________________________________________________
dropout_2 (Dropout)          (None, 50)                0         
_________________________________________________________________
dense_9 (Dense)              (None, 50)               

**Configure the network.**

In [21]:
model_3.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

**Train the network.**

In [22]:
trained_model_3 = model_3.fit(train_images, train_labels, epochs=epochs, batch_size=batch_size, validation_data=(val_images, val_labels))

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


Considerations: what can you notice from these results? Is your model improving?

**Results**: There seems to be a problem: the model is not improving. One possible reason is that we are not training for enough epochs. Let's try increasing the number of training epochs.

### 4. Increase training Epochs to 50.

This will take a while.

**Define the new variables.**

In [23]:
input_num_units = 784
hidden1_num_units = 50
hidden2_num_units = 50
hidden3_num_units = 50
hidden4_num_units = 50
hidden5_num_units = 50
output_num_units = 10

epochs = 50
batch_size = 128

**Build the network.**

In [24]:
model_4 = models.Sequential()

model_4.add(layers.Dense(units=hidden1_num_units, input_shape=(input_num_units, ), activation='relu'))
model_4.add(layers.Dropout(0.2))
model_4.add(layers.Dense(units=hidden2_num_units, activation='relu'))
model_4.add(layers.Dropout(0.2))
model_4.add(layers.Dense(units=hidden3_num_units, activation='relu'))
model_4.add(layers.Dropout(0.2))
model_4.add(layers.Dense(units=hidden4_num_units, activation='relu'))
model_4.add(layers.Dropout(0.2))
model_4.add(layers.Dense(units=hidden5_num_units, activation='relu'))
model_4.add(layers.Dropout(0.2))

model_4.add(layers.Dense(units=output_num_units, activation='softmax'))

model_4.summary()

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_12 (Dense)             (None, 50)                39250     
_________________________________________________________________
dropout_5 (Dropout)          (None, 50)                0         
_________________________________________________________________
dense_13 (Dense)             (None, 50)                2550      
_________________________________________________________________
dropout_6 (Dropout)          (None, 50)                0         
_________________________________________________________________
dense_14 (Dense)             (None, 50)                2550      
_________________________________________________________________
dropout_7 (Dropout)          (None, 50)                0         
_________________________________________________________________
dense_15 (Dense)             (None, 50)               

**Configure the network.**

In [25]:
model_4.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

**Train the network.**

In [26]:
trained_model_4 = model_4.fit(train_images, train_labels, epochs=epochs, 
                              batch_size=batch_size, validation_data=(val_images, val_labels))

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/

Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


Considerations: what can you notice from these results? Has the accuracy increased?

**Results**: This looks better: the accuracy has increased.
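
With many epochs it helps to check at which epoch the validation accuracy peaked. The sketch below does this for a list of per-epoch values. The helper `best_epoch` is hypothetical, and the numbers in `toy_history` are made-up placeholders, not results from this notebook. On a real run you would pass something like `trained_model_4.history['val_accuracy']`, assuming that key name, which depends on the Keras version and on the metric name used in `compile`.

```python
def best_epoch(val_accuracy):
    """Return (1-based epoch number, value) of the first maximum in the list."""
    best_index = max(range(len(val_accuracy)), key=lambda i: val_accuracy[i])
    return best_index + 1, val_accuracy[best_index]

# Toy placeholder values for illustration only.
toy_history = {"val_accuracy": [0.90, 0.93, 0.95, 0.94, 0.95]}
epoch, score = best_epoch(toy_history["val_accuracy"])
print(epoch, score)  # 3 0.95
```

If the peak comes well before the last epoch, the extra epochs are not helping on the validation set.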

### 5. Make the model "wide" and "deep": more hidden layers, each with more than 50 neurons.

**Define the new variables.**

In [27]:
input_num_units = 784
hidden1_num_units = 500
hidden2_num_units = 500
hidden3_num_units = 500
hidden4_num_units = 500
hidden5_num_units = 500
output_num_units = 10

epochs = 25
batch_size = 128

**Build the network.**

In [28]:
model_5 = models.Sequential()

model_5.add(layers.Dense(units=hidden1_num_units, input_shape=(input_num_units,), activation='relu'))
model_5.add(layers.Dropout(0.2))
model_5.add(layers.Dense(units=hidden2_num_units,  activation='relu'))
model_5.add(layers.Dropout(0.2))
model_5.add(layers.Dense(units=hidden3_num_units,  activation='relu'))
model_5.add(layers.Dropout(0.2))
model_5.add(layers.Dense(units=hidden4_num_units,  activation='relu'))
model_5.add(layers.Dropout(0.2))
model_5.add(layers.Dense(units=hidden5_num_units, activation='relu'))
model_5.add(layers.Dropout(0.2))

model_5.add(layers.Dense(units=output_num_units,  activation='softmax'))

**Configure the network.**

In [29]:
model_5.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

**Train the network. Use 25 epochs if 50 takes too long.**

In [30]:
trained_model_5 = model_5.fit(train_images, train_labels, epochs=epochs, 
                              batch_size=batch_size, validation_data=(val_images, val_labels))

Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/

Considerations: what do you think about your final model?
It seems that we have reached our final model. Let's evaluate it on the test set.

**Make some predictions and Evaluate the network.**

In [31]:
from sklearn.metrics import confusion_matrix

pred = np.argmax(model_5.predict(test_images), axis=-1)
label = np.argmax(test_labels, axis = 1)

confusion = confusion_matrix(label, pred)
print(confusion)

[[ 974    0    1    1    0    0    2    1    1    0]
 [   0 1126    2    1    0    2    1    0    3    0]
 [   3    0 1014    1    2    0    3    6    3    0]
 [   1    0    2  999    0    2    0    3    3    0]
 [   1    2    1    1  960    0    7    2    1    7]
 [   3    0    0    3    1  877    2    1    3    2]
 [   4    4    0    1    1    3  943    0    2    0]
 [   0    4    8    4    0    0    0 1007    3    2]
 [   5    1    4    3    2    1    1    2  953    2]
 [   3    6    0    6   12    2    2    4    2  972]]


In [32]:
test_loss_final, test_acc_final = model_5.evaluate(test_images, test_labels) 



In [33]:
print('test_acc:', test_acc_final)

test_acc: 0.9825000166893005


**Results**: We finally have a very good model: test accuracy is about 98.3%, up from about 95.6% for the initial model.

### Hyperparameter Optimization - Grid Search

Instead of proceeding by trial and error, we can use grid search to try combinations of the hyperparameters we want to tune. Grid search trains a model for every possible combination of the candidate values, so the full search can take a long time.

Grid search is a simple and naive, yet effective, way to automate hyperparameter tuning:

* First, you define the set of hyperparameters you want to tune, e.g. $\{neurons, learning\_rate, regularization\_strength\}$.

* For each hyperparameter, you then define a set of possible values, e.g. $neurons = \{50, 100, 250, 500\}$.

* Then, you train a model for every possible combination of these hyperparameter values and afterwards select the combination that works best (e.g. in terms of accuracy on your validation set).
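
The steps above can be sketched in plain Python to see how quickly the number of training runs grows. The helper `expand_grid`, the `learning_rate` candidates, and the choice of 5 cross-validation folds are assumptions for illustration (5 folds is the default in recent scikit-learn versions); `GridSearchCV` performs this enumeration for you.

```python
from itertools import product

def expand_grid(param_grid):
    """Return one dict per combination of the candidate values."""
    names = list(param_grid)
    value_lists = [param_grid[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]

# Hypothetical grid: the neurons values come from the text, the learning rates are made up.
toy_grid = {"neurons": [50, 100, 250, 500], "learning_rate": [0.001, 0.01]}
combos = expand_grid(toy_grid)

cv_folds = 5  # assumed number of cross-validation folds
print(len(combos))             # 8 combinations
print(len(combos) * cv_folds)  # 40 training runs before the final refit
```

Adding one more hyperparameter with three candidate values would triple both numbers, which is why the exercise below tunes only `neurons`.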

To do this, use a Keras `Sequential()` model as part of the Scikit-Learn workflow via the wrappers.
Check out how this [workflow](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html?highlight=gridsearch#sklearn.model_selection.GridSearchCV) works.
**Note**: to keep things simple at the beginning, it is enough to focus on a single hyperparameter, `neurons`.

Please note that without a GPU it is extremely time-consuming to tune all the hyperparameters in one shot with an appropriate number of epochs. The goal of this example is therefore to understand how to use `GridSearchCV` with a Keras model; you will probably not obtain an excellent model.

For this reason, tune only the number of neurons in the hidden layers (more than one hidden layer) with just 5-10 epochs.

**Import GridSearchCV and KerasClassifier.**

In [34]:
import numpy
from sklearn.model_selection import GridSearchCV
from keras.wrappers.scikit_learn import KerasClassifier
from tensorflow.keras.constraints import MaxNorm

In [35]:
input_num_units = 784
output_num_units = 10

**Create a function called `create_model` that builds the Keras model to be wrapped by `KerasClassifier`, with the number of hidden units given by a parameter (you can call it `neurons`). The function should also compile the model.**

In [36]:
# Function to create model, required for KerasClassifier
def create_model(neurons=1):
  
    model = models.Sequential()  
    model.add(layers.Dense(units=neurons, input_shape=(input_num_units,), activation='relu'))
    model.add(layers.Dropout(0.2))
    model.add(layers.Dense(units=neurons, activation='relu'))
    model.add(layers.Dropout(0.2))
    model.add(layers.Dense(units=neurons, activation='relu'))
    model.add(layers.Dropout(0.2))
    model.add(layers.Dense(units=neurons, activation='relu'))
    model.add(layers.Dropout(0.2))
    model.add(layers.Dense(units=neurons, activation='relu'))
    model.add(layers.Dropout(0.2))
    model.add(layers.Dense(units=output_num_units, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

**Create a model wrapper.**

In [37]:
clf = KerasClassifier(build_fn=create_model, epochs=5, batch_size=32, verbose=1)

**Create a dictionary of parameters grid for the number of neurons in the hidden layer.**

In [38]:
neurons = [50, 100, 250, 500]
param_grid = dict(neurons=neurons) 

In [39]:
param_grid

{'neurons': [50, 100, 250, 500]}

**Grid Search: use `GridSearchCV` with the model you have obtained from the wrapper as `estimator` and the dictionary you have just created as `param_grid`.**

In [40]:
grid = GridSearchCV(estimator=clf, param_grid=param_grid, n_jobs=None)

**Fit the grid search and call the result `grid_result`.**

In [41]:
grid_result = grid.fit(train_images, train_labels, validation_data = (val_images, val_labels), epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Epoch 1/5
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: 'arguments' object has no attribute 'posonlyargs'
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: 'arguments' object has no attribute 'posonlyargs'
Please report this to the TensorFlow team. When filing the bug, set the verbosity 

**Let's print some results. Fill in the #TO DOs with the best_score and best_params that you got after fitting.**

In [42]:
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_)) #TO DO, #TO DO
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))
    

Best: 0.967857 using {'neurons': 250}
0.950333 (0.002999) with: {'neurons': 50}
0.962119 (0.002014) with: {'neurons': 100}
0.967857 (0.002405) with: {'neurons': 250}
0.964714 (0.006065) with: {'neurons': 500}
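
When reading results like these, it can help to compare the gap between the top two settings with the spread across folds. The sketch below does this with the numbers printed above, copied by hand into a hypothetical `results` list, so it runs without refitting anything.

```python
# (neurons, mean_test_score, std_test_score), copied from the output above
results = [
    (50, 0.950333, 0.002999),
    (100, 0.962119, 0.002014),
    (250, 0.967857, 0.002405),
    (500, 0.964714, 0.006065),
]

# Rank the settings by mean cross-validation score, best first
ranked = sorted(results, key=lambda row: row[1], reverse=True)
best, runner_up = ranked[0], ranked[1]
gap = best[1] - runner_up[1]

print("best:", best[0], "runner-up:", runner_up[0])
print("gap between them: %.6f" % gap)
# A gap smaller than the fold-to-fold standard deviation means the ranking is not clear-cut
print("gap smaller than runner-up std:", gap < runner_up[2])
```

Here 250 and 500 neurons are separated by less than one standard deviation of the 500-neuron runs, so the preference for 250 should be read as mild evidence rather than a decisive win.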


### Hyperparameters Optimization - Random Search

Similarly, we can use the Scikit-Learn workflow to perform [random search](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RandomizedSearchCV.html). Random search is very similar to grid search. The only difference is that, instead of providing specific values for every hyperparameter, you define a range (or distribution) for each hyperparameter, and the values are then sampled randomly from it. The number of parameter settings that are tried is given by `n_iter`.
![alt text](https://i.stack.imgur.com/cIDuR.png "")

The figure above illustrates the difference in hyperparameter space exploration between grid search and random search. Assume you have 2 hyperparameters with 3 values each. Running a grid search means training $3^2=9$ different models, but in the end you have only tried out 3 values for each hyperparameter. With random search, on the other hand, after training 9 models you will have tried out 9 different values for each hyperparameter, which often leads to good results much faster.
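
The counting argument can be checked with a few lines of plain Python. This is only a sketch: the two hyperparameters (`learning_rate`, `dropout`), their grid values and their ranges are made up for illustration and are not used elsewhere in this notebook.

```python
import itertools
import random

# Hypothetical grid: two hyperparameters with three values each
grid_values = {
    "learning_rate": [0.001, 0.01, 0.1],
    "dropout": [0.1, 0.3, 0.5],
}

# Grid search: every combination, 3 * 3 = 9 settings
grid_trials = [
    dict(zip(grid_values, combo))
    for combo in itertools.product(*grid_values.values())
]

# Random search: also 9 settings, each value drawn from a continuous range
rng = random.Random(42)
random_trials = [
    {
        "learning_rate": 10 ** rng.uniform(-3, -1),  # log-uniform in [0.001, 0.1]
        "dropout": rng.uniform(0.1, 0.5),
    }
    for _ in range(9)
]

def distinct_values(trials, name):
    """Count how many different values of one hyperparameter were tried."""
    return len({trial[name] for trial in trials})

print(len(grid_trials), distinct_values(grid_trials, "learning_rate"))      # 9 3
print(len(random_trials), distinct_values(random_trials, "learning_rate"))  # 9 9
```

Both strategies train 9 models, but the grid only ever probes 3 learning rates, while the random draws probe 9.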

To get a deeper understanding of random search and why it is more efficient than grid search, you should definitely check out this paper: http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf.

**Import RandomizedSearchCV.**

In [43]:
from sklearn.model_selection import RandomizedSearchCV


**Random Search: use `RandomizedSearchCV` with the model you have obtained from the wrapper as estimator and the dictionary you have created for GridSearch as param_distributions.**

In [44]:
# We reuse the estimator wrapper and the param_grid defined for GridSearch
print(param_grid)
random = RandomizedSearchCV(estimator=clf, param_distributions=param_grid, n_iter=2, n_jobs=None)


{'neurons': [50, 100, 250, 500]}
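
Note that `param_grid` holds a plain list rather than a distribution. When all parameters are given as lists, scikit-learn samples settings without replacement, so `n_iter=2` evaluates two different neuron counts out of the four. The sketch below mimics that behaviour with the standard library only; `sample_settings` is a hypothetical helper, not part of scikit-learn.

```python
import itertools
import random

param_grid = {"neurons": [50, 100, 250, 500]}  # same dictionary as above

def sample_settings(param_lists, n_iter, seed=None):
    """Hypothetical helper: draw n_iter distinct settings from lists of values."""
    rng = random.Random(seed)
    names = sorted(param_lists)
    # Enumerate every combination, then pick n_iter of them without replacement
    all_settings = [
        dict(zip(names, combo))
        for combo in itertools.product(*(param_lists[name] for name in names))
    ]
    return rng.sample(all_settings, n_iter)

settings = sample_settings(param_grid, n_iter=2, seed=42)
print(settings)
```

To sample from a true range instead, `RandomizedSearchCV` also accepts distribution objects that provide an `rvs` method, such as those in `scipy.stats`.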


**Fit the random search and call your result `random_result`.**


In [45]:
random_result = random.fit(train_images, train_labels, validation_data = (val_images, val_labels), epochs=1)



**Let's print some results. Fill in the #TO DOs with the best_score and best_params that you got after fitting.**

In [46]:
print("Best: %f using %s" % (random_result.best_score_, random_result.best_params_)) #TO DO, #TO DO
means = random_result.cv_results_['mean_test_score']
stds = random_result.cv_results_['std_test_score']
params = random_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))
    

Best: 0.946262 using {'neurons': 500}
0.937071 (0.006708) with: {'neurons': 100}
0.946262 (0.003667) with: {'neurons': 500}


### Hyperparameters Optimization - Bayesian Optimization

Manual tuning (Trial and Error), grid search and random search are considered brute-force methods. 
Grid search exhaustively evaluates every combination of the hyperparameter values defined in a grid. 
Random search chooses the hyperparameter values at random. Grid and
random search usually perform better than manual tuning, but they do not take past evaluations into account.
Hence, they can lose significant time evaluating hyperparameters in the same unpromising region.
Furthermore, the objective functions to minimize are computationally expensive, and running
them each time with a new set of parameters is time-consuming. 
Therefore, Bergstra et al. proposed a model-based method called Sequential Model-Based Optimization (SMBO).
The main idea of this approach is to build a surrogate model of the objective function using a
probabilistic model. This surrogate model is based on Bayes' theorem. At each new iteration, the surrogate is updated with the latest result and becomes more confident about which new guess can lead to improvements.
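
The loop below is a deliberately tiny sketch of the sequential model-based idea, not Hyperopt's actual implementation. It assumes a single categorical hyperparameter and a made-up `toy_objective` standing in for an expensive training run; the names `suggest` and `run_search` are hypothetical. The surrogate is just a pair of smoothed frequency tables, one for the best past trials and one for the rest, loosely in the spirit of TPE.

```python
import random

def toy_objective(units):
    # Hypothetical stand-in for an expensive training run (lower is better)
    return abs(units - 250) / 500

def suggest(history, choices, rng, gamma=0.25):
    """Pick the next value to try from past (value, loss) pairs."""
    if len(history) < 3:
        # Too little evidence so far: explore at random
        return rng.choice(choices)
    ranked = sorted(history, key=lambda pair: pair[1])
    n_good = max(1, int(gamma * len(ranked)))
    good = [value for value, _ in ranked[:n_good]]
    bad = [value for value, _ in ranked[n_good:]]

    def density(value, group):
        # Smoothed relative frequency (add-one smoothing keeps it non-zero)
        return (group.count(value) + 1) / (len(group) + len(choices))

    # Prefer values that are frequent among good trials and rare among bad ones
    return max(choices, key=lambda v: density(v, good) / density(v, bad))

def run_search(choices, n_trials=12, seed=0):
    rng = random.Random(seed)
    history = []
    for _ in range(n_trials):
        value = suggest(history, choices, rng)
        # Every new evaluation is fed back into the history used by suggest
        history.append((value, toy_objective(value)))
    return min(history, key=lambda pair: pair[1]), history

best, history = run_search([50, 100, 250, 500])
print("best (value, loss):", best)
```

The point to notice is that each call to `suggest` depends on all earlier results, which is what grid search and random search lack.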

For this we will use the package [hyperopt](http://hyperopt.github.io/hyperopt/) and its Tree-structured Parzen Estimator algorithm (`tpe.suggest`).

````pip install hyperopt````


**The data for the optimizer should be returned from a function ````get_data````**

In [48]:
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
import numpy as np


def get_data():
    (train_images, train_labels), (test_images, test_labels) = mnist.load_data() 
    train_images = train_images.reshape((60000, 28 * 28))
    train_images = train_images.astype('float32') / 255

    test_images = test_images.reshape((10000, 28 * 28))
    test_images = test_images.astype('float32') / 255
    # We need to encode the categorical labels in one-hot format
    train_labels = to_categorical(train_labels)
    test_labels = to_categorical(test_labels)

    return train_images, train_labels, test_images, test_labels


**Use Hyperopt for hyperparameter tuning**

In [50]:
import numpy as np
from hyperopt import Trials, STATUS_OK, tpe, fmin, hp
# Import Keras modules
from tensorflow.keras import models
from tensorflow.keras import layers
from tensorflow.keras import optimizers

# Function to create model
def f_nn(params):
    train_images,train_labels, test_images, test_labels = get_data()
  
    model = models.Sequential()  
    # Notice here how we define the number of neurons per layer
    model.add(layers.Dense(units=params['units_1'], input_shape=(28*28,), activation='relu'))
    model.add(layers.Dropout(0.2))
    model.add(layers.Dense(units=params['units_2'], activation='relu'))
    model.add(layers.Dropout(0.2))
    model.add(layers.Dense(units=params['units_3'], activation='relu'))
    model.add(layers.Dropout(0.2))
    model.add(layers.Dense(units=params['units_4'], activation='relu'))
    model.add(layers.Dropout(0.2))
    model.add(layers.Dense(units=params['units_5'], activation='relu'))
    model.add(layers.Dropout(0.2))
    model.add(layers.Dense(units=10, activation='softmax'))
    # model_compile
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    # model fitting
    result = model.fit(train_images, train_labels, epochs=2, 
              batch_size=128, validation_split=0.3)
    
    # Get the lowest validation loss of the training epochs
    validation_loss = np.amin(result.history['val_loss']) 
    print('Best validation loss of epoch:', validation_loss)
    
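    # Evaluate the trained model on the test set
    score = model.evaluate(test_images, test_labels, verbose=0)
    accuracy = score[1]
    # fmin minimizes 'loss', so we return the negative accuracy
    return {'loss': -accuracy, 'status': STATUS_OK, 'model': model, 'params': params}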

In [51]:
space = {'units_1':hp.choice('units_1', [50,100,250,500]),
         'units_2':hp.choice('units_2', [50,100,250,500]),
         'units_3':hp.choice('units_3', [50,100,250,500]),
         'units_4':hp.choice('units_4', [50,100,250,500]),
         'units_5':hp.choice('units_5', [50,100,250,500])}
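
One detail worth knowing about this search space: for parameters defined with `hp.choice`, the dictionary returned by `fmin` contains the index of the selected option, not the option itself. Hyperopt provides `space_eval(space, best)` to translate such a result back into actual values. The sketch below shows the same mapping by hand in plain Python; the `best` dictionary here is a made-up example, not a result from this notebook.

```python
# Options used for every layer in the search space above
unit_choices = [50, 100, 250, 500]

# Hypothetical fmin result (illustrative values only): hp.choice reports indices
best = {'units_1': 1, 'units_2': 3, 'units_3': 0, 'units_4': 2, 'units_5': 1}

# Map each index back to the number of units it stands for
best_units = {name: unit_choices[index] for name, index in best.items()}
print(best_units)
# {'units_1': 100, 'units_2': 500, 'units_3': 50, 'units_4': 250, 'units_5': 100}
```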

In [52]:
trials = Trials()
best = fmin( 
    fn=f_nn,
    space=space,
    algo=tpe.suggest,
    max_evals=5,
    trials=trials,
    rstate = np.random.default_rng(42))

Epoch 1/2
[training progress output truncated]
 - 3s 9ms/step - loss: 0.2488 - accuracy: 0.9315 - val_loss: 0.1548 - val_accuracy: 0.9563
Best validation loss of epoch: 0.1547917276620865

In [53]:
best_model = trials.results[np.argmin([r['loss'] for r in trials.results])]['model']
best_model.summary()

Model: "sequential_38"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_222 (Dense)            (None, 500)               392500    
_________________________________________________________________
dropout_180 (Dropout)        (None, 500)               0         
_________________________________________________________________
dense_223 (Dense)            (None, 250)               125250    
_________________________________________________________________
dropout_181 (Dropout)        (None, 250)               0         
_________________________________________________________________
dense_224 (Dense)            (None, 100)               25100     
_________________________________________________________________
dropout_182 (Dropout)        (None, 100)               0         
_________________________________________________________________
dense_225 (Dense)            (None, 100)             

In [57]:
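from sklearn.metrics import confusion_matrix

# Reload the data and compare predicted digits with the true digits on the test set
train_images, train_labels, test_images, test_labels = get_data()
pred = np.argmax(best_model.predict(test_images), axis=-1)
label = np.argmax(test_labels, axis=1)

# Rows are true digits, columns are predicted digits
confusion = confusion_matrix(label, pred)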
print(confusion)

[[ 970    0    0    0    1    1    3    1    3    1]
 [   0 1125    4    0    0    1    1    0    3    1]
 [  13    2  981    7    7    0    5    7    9    1]
 [   0    0    4  973    0    6    0    8   10    9]
 [   2    1    1    0  950    0    6    0    1   21]
 [   4    0    0   18    1  851    8    0    5    5]
 [   6    2    0    0    4    8  934    0    4    0]
 [   3    8    9    1    0    0    0  972    1   34]
 [   5    2    1    6   14    2    3    3  927   11]
 [   4    3    1    9   14    2    1    2    4  969]]
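
To read the matrix more easily, the short sketch below recomputes the overall accuracy and the per-digit recall from the numbers printed above, assuming (as in scikit-learn's `confusion_matrix`) that rows are true digits and columns are predicted digits. The matrix is copied by hand into a hypothetical `cm` list, so no model is needed to run it.

```python
# Confusion matrix copied from the output above (rows: true digit, columns: predicted digit)
cm = [
    [970, 0, 0, 0, 1, 1, 3, 1, 3, 1],
    [0, 1125, 4, 0, 0, 1, 1, 0, 3, 1],
    [13, 2, 981, 7, 7, 0, 5, 7, 9, 1],
    [0, 0, 4, 973, 0, 6, 0, 8, 10, 9],
    [2, 1, 1, 0, 950, 0, 6, 0, 1, 21],
    [4, 0, 0, 18, 1, 851, 8, 0, 5, 5],
    [6, 2, 0, 0, 4, 8, 934, 0, 4, 0],
    [3, 8, 9, 1, 0, 0, 0, 972, 1, 34],
    [5, 2, 1, 6, 14, 2, 3, 3, 927, 11],
    [4, 3, 1, 9, 14, 2, 1, 2, 4, 969],
]

digits = range(10)
total = sum(sum(row) for row in cm)
correct = sum(cm[d][d] for d in digits)
accuracy = correct / total

# Recall per digit: correctly predicted share of all images of that digit
recall = [cm[d][d] / sum(cm[d]) for d in digits]
hardest = min(digits, key=lambda d: recall[d])

# Most frequent mistake: the largest off-diagonal entry
true_digit, predicted_digit = max(
    ((t, p) for t in digits for p in digits if t != p),
    key=lambda pair: cm[pair[0]][pair[1]],
)

print("accuracy: %.4f" % accuracy)  # accuracy: 0.9652
print("lowest recall: digit %d (%.4f)" % (hardest, recall[hardest]))
print("most common confusion: true %d predicted as %d" % (true_digit, predicted_digit))
```

For this matrix the weakest class is the digit 7, which is most often mistaken for a 9 (34 test images).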
