# Implementing a multi-layer ANN on the CIFAR-10 dataset using Keras Sequential and fine-tuning hyperparameters

Importing the necessary modules:

In [None]:
import numpy as np

import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras import datasets

Loading the dataset and scaling the pixel values to the range [0, 1]:

In [None]:
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()
train_images, test_images = train_images / 255.0, test_images / 255.0

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz


Exploring the dataset:

In [None]:
train_images.shape, train_labels.shape

((50000, 32, 32, 3), (50000, 1))

In [None]:
test_images.shape, test_labels.shape

((10000, 32, 32, 3), (10000, 1))

Finding the unique values in the training labels:

In [None]:
np.unique(train_labels)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint8)

There are 10 unique label values, which means there are 10 classes. The output layer of the neural network will therefore have 10 neurons, one per class.
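
To make the role of the 10 output neurons concrete, here is a minimal pure-Python sketch of what a softmax output layer and the sparse categorical cross-entropy loss compute for a single image. The logit values are made up for illustration, and the helper names are hypothetical; in the actual model, Keras performs these computations internally.

```python
import math

def softmax(logits):
    """Turn raw output-layer scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(true_class, probs):
    """Loss for one sample whose label is an integer class index."""
    return -math.log(probs[true_class])

# Hypothetical scores from the 10 output neurons for one image.
logits = [0.1, -1.2, 0.3, 2.5, 0.0, 1.1, -0.4, 0.2, -2.0, 0.6]
probs = softmax(logits)

# The predicted class is the index of the largest probability (like np.argmax).
predicted_class = probs.index(max(probs))
loss = sparse_categorical_crossentropy(3, probs)

print("number of probabilities:", len(probs))
print("predicted class:", predicted_class)
print("loss if the true class is 3:", loss)
```

Because the labels are integer class indices and not one-hot vectors, the "sparse" variant of the loss is the one that applies here.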

In [None]:
# Reset Keras state and fix the random seeds for reproducibility.
keras.backend.clear_session()
np.random.seed(42)
tf.random.set_seed(42)

def build_model(n_hidden=20, n_neurons=100, learning_rate=5e-5):
    """Build and compile an MLP classifier for 32x32x3 images."""
    model = tf.keras.Sequential()
    # Flatten each 32x32x3 image into a vector of 3072 values.
    model.add(tf.keras.layers.Flatten(input_shape=[32, 32, 3]))
    for _ in range(n_hidden):
        model.add(tf.keras.layers.Dense(n_neurons,
                                        activation="swish",
                                        kernel_initializer="he_normal"))
    # One output neuron per class; softmax yields class probabilities.
    model.add(tf.keras.layers.Dense(10, activation="softmax"))
    # Use the learning_rate argument so that it can be tuned.
    optimizer = tf.keras.optimizers.Nadam(learning_rate=learning_rate)
    model.compile(loss="sparse_categorical_crossentropy",
                  optimizer=optimizer,
                  metrics=["accuracy"])
    return model

model = build_model()
model.fit(train_images, train_labels, epochs=10,
          validation_data=(test_images, test_labels),
          callbacks=[keras.callbacks.EarlyStopping(patience=10)])

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f056855a210>
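
For a sense of the size of this network, the number of trainable parameters can be worked out by hand: each Dense layer has (inputs × units) weights plus one bias per unit. The sketch below does that arithmetic for the default configuration (3072 flattened inputs, 20 hidden layers of 100 neurons, 10 outputs). The helper `count_dense_params` is hypothetical and not part of Keras; in Keras itself, `model.summary()` reports per-layer parameter counts.

```python
def count_dense_params(n_inputs, n_hidden, n_neurons, n_outputs):
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    fan_in = n_inputs
    for _ in range(n_hidden):
        total += fan_in * n_neurons + n_neurons  # weights + biases
        fan_in = n_neurons
    total += fan_in * n_outputs + n_outputs      # output layer
    return total

n_inputs = 32 * 32 * 3  # size of one flattened CIFAR-10 image
default_params = count_dense_params(n_inputs, n_hidden=20, n_neurons=100, n_outputs=10)
print(default_params)  # 500210
```

Most of the parameters (307,300 of them) sit in the first hidden layer, because it connects all 3072 input values to 100 neurons.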

In [None]:
model.evaluate(test_images, test_labels)



[1.5005723237991333, 0.4643000066280365]
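
`model.evaluate` returns the loss followed by the metrics passed to `compile`, so the output above is a cross-entropy loss of about 1.50 and an accuracy of about 46.4%. One way to put those numbers in context is to compare them with a model that guesses uniformly over the 10 classes. The sketch below computes that baseline, assuming balanced classes (CIFAR-10 has the same number of images in each class).

```python
import math

n_classes = 10

# A uniform guess assigns probability 1/10 to every class.
chance_accuracy = 1 / n_classes
chance_loss = -math.log(1 / n_classes)  # equals ln(10)

# Values reported by model.evaluate in the cell above.
reported_loss, reported_accuracy = 1.5005723237991333, 0.4643000066280365

print("chance loss:", round(chance_loss, 4), "reported loss:", round(reported_loss, 4))
print("chance accuracy:", chance_accuracy, "reported accuracy:", round(reported_accuracy, 4))
```

The reported loss is below ln(10) ≈ 2.303 and the accuracy is well above 10%, so the network has learned something, although it is far from a strong CIFAR-10 classifier.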

In [None]:
predictions = model.predict(test_images)
predictions = np.array([np.argmax(x) for x in predictions])
print('The original test target value is: ', test_labels[0])
print('The predicted test value is: ', predictions[0])

The original test target value is:  [3]
The predicted test value is:  3


The prediction for the first test image is correct.
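
A single matching sample is only a spot check. The sketch below shows one way to compare every prediction with its label, using short made-up lists in place of `predictions` and `test_labels`. The labels are wrapped in one-element lists to mirror the `(N, 1)` shape of `test_labels`, so they are flattened before the comparison. The function name `accuracy` is a hypothetical helper.

```python
def accuracy(true_labels, predicted):
    """Fraction of predictions that match the labels.

    true_labels: list of one-element lists, like an (N, 1) label array.
    predicted:   flat list of predicted class indices.
    """
    flat_true = [row[0] for row in true_labels]
    correct = sum(t == p for t, p in zip(flat_true, predicted))
    return correct / len(predicted)

# Made-up stand-ins for test_labels and predictions.
sample_labels = [[3], [8], [8], [0], [6]]
sample_predictions = [3, 8, 1, 0, 4]

acc = accuracy(sample_labels, sample_predictions)
print(acc)  # 0.6
```

With NumPy arrays the same comparison can be written as `np.mean(predictions == test_labels.ravel())`. The `ravel()` matters: comparing a `(N,)` array with a `(N, 1)` array directly would broadcast to an `(N, N)` result.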

### Fine-tuning hyperparameters:

The hyperparameters tuned are:

1. Number of hidden layers: 0 to 29
2. Number of neurons per hidden layer: 1 to 199
3. Learning rate: sampled log-uniformly between 3e-4 and 5e-3
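
The search code in this notebook draws the learning rate from `reciprocal(3e-4, 5e-3)`, which is SciPy's log-uniform distribution: values are uniform on a logarithmic scale, so each order of magnitude is equally likely. As a rough illustration of that idea, here is a standard-library sketch; `sample_log_uniform` is a hypothetical helper and not the SciPy implementation.

```python
import math
import random

def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

rng = random.Random(42)
low, high = 3e-4, 5e-3
samples = [sample_log_uniform(low, high, rng) for _ in range(10_000)]

# On a log scale the midpoint of the range is the geometric mean,
# so about half of the samples should fall below it.
geometric_midpoint = math.sqrt(low * high)
share_below = sum(s < geometric_midpoint for s in samples) / len(samples)

print("geometric midpoint:", geometric_midpoint)
print("share of samples below it:", share_below)
```

Sampling this way is a common choice for learning rates, since a plain uniform draw over the same interval would spend most of its samples near the upper end of the range.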

In [None]:
from scipy.stats import reciprocal
from sklearn.model_selection import RandomizedSearchCV

param_distribs = {
    "n_hidden": list(range(30)),
    "n_neurons": np.arange(1, 200),
    "learning_rate": reciprocal(3e-4, 5e-3),  # log-uniform distribution
}

# CIFAR-10 is a classification task, so wrap the model as a classifier.
keras_clf = keras.wrappers.scikit_learn.KerasClassifier(build_model)
# n_iter=1 samples a single random combination; cv=2 trains it on two folds.
rnd_search_cv = RandomizedSearchCV(keras_clf, param_distribs, n_iter=1, cv=2, verbose=2)
rnd_search_cv.fit(train_images, train_labels)
rnd_search_cv.best_params_



Fitting 2 folds for each of 1 candidates, totalling 2 fits
[CV] END learning_rate=0.002352378152341783, n_hidden=28, n_neurons=21; total time=  22.5s
[CV] END learning_rate=0.002352378152341783, n_hidden=28, n_neurons=21; total time=  21.0s


{'learning_rate': 0.002352378152341783, 'n_hidden': 28, 'n_neurons': 21}

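The search returns a learning rate of about 0.00235, 28 hidden layers, and 21 neurons per hidden layer. Since `n_iter=1`, only this one randomly sampled combination was evaluated, so it is the best of the candidates tried and not necessarily the best in the search space.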
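
Conceptually, a randomized search draws `n_iter` random combinations, scores each one, and keeps the best. The sketch below mimics that loop in plain Python. It is not how `RandomizedSearchCV` is implemented: the `toy_score` function is a made-up stand-in for cross-validated training, and its values are not real results. The sampling ranges follow `param_distribs` in the cell above.

```python
import math
import random

def sample_params(rng):
    """Draw one random hyperparameter combination."""
    return {
        "n_hidden": rng.randrange(30),        # 0..29, like list(range(30))
        "n_neurons": rng.randrange(1, 200),   # 1..199, like np.arange(1, 200)
        "learning_rate": math.exp(rng.uniform(math.log(3e-4), math.log(5e-3))),
    }

def toy_score(params):
    """Made-up score (higher is better); stands in for a cross-validated metric."""
    return -abs(params["n_hidden"] - 3) - abs(params["n_neurons"] - 100) / 50

def random_search(n_iter, seed=0):
    """Sample n_iter candidates and return the best one and the number tried."""
    rng = random.Random(seed)
    candidates = [sample_params(rng) for _ in range(n_iter)]
    return max(candidates, key=toy_score), len(candidates)

best_one, tried_one = random_search(n_iter=1)
best_many, tried_many = random_search(n_iter=50)

print("n_iter=1 :", best_one, "score:", toy_score(best_one))
print("n_iter=50:", best_many, "score:", toy_score(best_many))
```

With a single iteration the "best" candidate is simply the only one drawn. Raising `n_iter` gives the search more chances to land on a good combination, at the cost of one extra round of cross-validated training per candidate.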

In [None]:
best_params = rnd_search_cv.best_params_

Now we rebuild the model with the best parameters found in the previous step.

In [None]:
def build_model(n_hidden, n_neurons, learning_rate):
    """Build and compile an MLP classifier for 32x32x3 images."""
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Flatten(input_shape=[32, 32, 3]))
    for _ in range(n_hidden):
        model.add(tf.keras.layers.Dense(n_neurons,
                                        activation="swish",
                                        kernel_initializer="he_normal"))
    model.add(tf.keras.layers.Dense(10, activation="softmax"))
    # Pass the tuned learning rate to the optimizer.
    optimizer = tf.keras.optimizers.Nadam(learning_rate=learning_rate)
    model.compile(loss="sparse_categorical_crossentropy",
                  optimizer=optimizer,
                  metrics=["accuracy"])
    return model

model = build_model(best_params['n_hidden'], best_params['n_neurons'], best_params['learning_rate'])
model.fit(train_images, train_labels, epochs=10,
          validation_data=(test_images, test_labels),
          callbacks=[keras.callbacks.EarlyStopping(patience=10)])

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f055d505310>
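
Both training runs pass `EarlyStopping(patience=10)` while training for only 10 epochs. As a rough illustration of the patience rule (stop once the monitored value has gone `patience` consecutive epochs without improving), the sketch below applies it to a made-up list of validation losses. It is a simplified model of the callback that ignores options such as `min_delta` and `restore_best_weights`, and `epochs_run` is a hypothetical helper.

```python
def epochs_run(val_losses, patience):
    """Return how many epochs run before the patience rule stops training."""
    best = float("inf")
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch  # training stops after this epoch
    return len(val_losses)    # the rule never triggered

# Made-up validation losses for 10 epochs.
val_losses = [2.0, 1.8, 1.7, 1.75, 1.72, 1.71, 1.74, 1.73, 1.76, 1.78]

print(epochs_run(val_losses, patience=3))   # 6
print(epochs_run(val_losses, patience=10))  # 10
```

With `patience` equal to the number of epochs, the stopping rule cannot trigger, so both runs in this notebook always train for the full 10 epochs. A smaller `patience` or a larger `epochs` value would be needed for early stopping to have an effect.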

In [None]:
model.evaluate(test_images, test_labels)



[1.8433630466461182, 0.28940001130104065]

In [None]:
predictions = model.predict(test_images)
predictions = np.array([np.argmax(x) for x in predictions])
print('The original test target value is: ', test_labels[0])
print('The predicted test value is: ', predictions[0])
print('The original test target value is: ', test_labels[1])
print('The predicted test value is: ', predictions[1])

The original test target value is:  [3]
The predicted test value is:  7
The original test target value is:  [8]
The predicted test value is:  8
