# Intermediate Neural Network in TensorFlow

In this notebook, we improve on our [introductory shallow net](https://github.com/jonkrohn/DLTFpT/blob/master/notebooks/shallow_net_in_tensorflow.ipynb) by incorporating the theory we've covered since.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/jonkrohn/DLTFpT/blob/master/notebooks/intermediate_net_in_tensorflow.ipynb)

#### Load dependencies

In [1]:
import tensorflow
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD
from matplotlib import pyplot as plt

#### Load data

In [2]:
(X_train, y_train), (X_valid, y_valid) = mnist.load_data()

#### Preprocess data

In [3]:
X_train = X_train.reshape(60000, 784).astype('float32')
X_valid = X_valid.reshape(10000, 784).astype('float32')

In [4]:
X_train /= 255
X_valid /= 255
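
As a rough illustration of what the two preprocessing steps above do, the sketch below applies them to a small synthetic stand-in for the MNIST array. The `fake_images` array and its size of five images are assumptions made for demonstration only; the real `X_train` holds 60,000 images.

```python
import numpy as np

# Hypothetical stand-in for MNIST: 5 images of 28x28 pixels with values in [0, 255]
rng = np.random.default_rng(42)
fake_images = rng.integers(0, 256, size=(5, 28, 28)).astype(np.uint8)

# Flatten each 28x28 image into a 784-element vector and convert to float32
flat = fake_images.reshape(5, 784).astype('float32')

# Scale pixel intensities from [0, 255] down to [0, 1]
flat /= 255

print(flat.shape, flat.dtype)
print(flat.min() >= 0.0, flat.max() <= 1.0)
```

Converting to `float32` before dividing matters: in-place division of an integer array by 255 would raise an error in NumPy rather than produce fractional values.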

In [5]:
n_classes = 10
y_train = to_categorical(y_train, n_classes)
y_valid = to_categorical(y_valid, n_classes)
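
To make the one-hot encoding step concrete, here is a minimal NumPy sketch of the kind of transformation `to_categorical` performs. The helper name `one_hot` and the three example labels are hypothetical; this is not the Keras implementation, only an approximation of its effect on integer labels.

```python
import numpy as np

def one_hot(labels, n_classes):
    """Return a (len(labels), n_classes) float32 array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, n_classes), dtype='float32')
    # Set the column matching each label to 1
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Hypothetical digit labels
example = one_hot([5, 0, 4], 10)
print(example)
```

Each row has a 1 in the position of its digit and 0 elsewhere, which is the target shape the ten-neuron softmax output layer is compared against.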

#### Design neural network architecture

In [6]:
model = Sequential()

# hidden layers:
model.add(Dense(64, activation='relu', input_shape=(784,)))
# add a second hidden layer to experiment with a deeper network
model.add(Dense(64, activation='relu'))

# output layer:
model.add(Dense(10, activation='softmax'))


In [7]:
model.summary()

In [8]:
# number of weights in the second hidden layer: one per connection (arrow)
# from each of the 64 neurons in the first hidden layer to each of its 64 neurons
(64*64)

4096

In [9]:
# add 64 bias terms: each of the 64 neurons in this layer has its own bias
(64*64)+64

4160
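
The same weights-plus-biases arithmetic extends to every dense layer. The sketch below applies it to the layer sizes used in this notebook (784 inputs, two hidden layers of 64, 10 outputs); the variable names are illustrative.

```python
# Layer sizes from the architecture above: 784 inputs -> 64 -> 64 -> 10 outputs
layer_sizes = [784, 64, 64, 10]

total = 0
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    n_params = n_in * n_out + n_out  # weights plus one bias per neuron
    total += n_params
    print(f"{n_in} -> {n_out}: {n_params} parameters")

print("total:", total)
```

The per-layer counts come to 50,240, 4,160 and 650, for 55,050 parameters in total, which should agree with what `model.summary()` reports for this architecture.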

#### Configure model

In [10]:
# N.B.: the learning rate is an order of magnitude larger than in the shallow net
# We learned that cross-entropy cost generally works better than quadratic cost
# (mean squared error) for classification, so we use it here.
model.compile(loss='categorical_crossentropy', optimizer=SGD(learning_rate=0.1), metrics=['accuracy'])
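
To give a feel for why cross-entropy is preferred here, the following plain-Python sketch compares the two costs on a hypothetical three-class example. The function names, the `eps` clipping value, and the probability vectors are assumptions for illustration; Keras computes its losses internally and over batches, so treat this only as a simplified picture.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot label and a predicted distribution."""
    # Clip probabilities away from zero so log() is always defined
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

def mean_squared_error(y_true, y_pred):
    """Quadratic cost averaged over the output neurons."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [0.0, 0.0, 1.0]               # hypothetical one-hot label: class 2
confident_right = [0.05, 0.05, 0.90]   # most probability on the correct class
confident_wrong = [0.90, 0.05, 0.05]   # most probability on a wrong class

for name, y_pred in [("right", confident_right), ("wrong", confident_wrong)]:
    ce = categorical_crossentropy(y_true, y_pred)
    mse = mean_squared_error(y_true, y_pred)
    print(f"{name}: cross-entropy={ce:.3f}, mse={mse:.3f}")
```

Cross-entropy grows without bound as the probability assigned to the true class approaches zero, whereas quadratic cost on probabilities stays below 1, so confidently wrong predictions are penalized far more heavily under cross-entropy.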

#### Train!

In [11]:
# N.B.: the number of epochs is an order of magnitude smaller than in the shallow net
# Note: batch size and number of epochs are hyperparameters too,
# alongside the learning rate.
model.fit(X_train, y_train, batch_size=128, epochs=20, verbose=1, validation_data=(X_valid, y_valid))

Epoch 1/20
469/469 ━━━━━━━━━━━━━━━━━━━━ 1s 900us/step - accuracy: 0.7738 - loss: 0.7925 - val_accuracy: 0.9227 - val_loss: 0.2587
Epoch 2/20
469/469 ━━━━━━━━━━━━━━━━━━━━ 0s 741us/step - accuracy: 0.9305 - loss: 0.2357 - val_accuracy: 0.9455 - val_loss: 0.1906
Epoch 3/20
469/469 ━━━━━━━━━━━━━━━━━━━━ 0s 743us/step - accuracy: 0.9482 - loss: 0.1776 - val_accuracy: 0.9524 - val_loss: 0.1576
Epoch 4/20
469/469 ━━━━━━━━━━━━━━━━━━━━ 0s 747us/step - accuracy: 0.9585 - loss: 0.1419 - val_accuracy: 0.9549 - val_loss: 0.1522
Epoch 5/20
469/469 ━━━━━━━━━━━━━━━━━━━━ 0s 723us/step - accuracy: 0.9637 - loss: 0.1230 - val_accuracy: 0.9611 - val_loss: 0.1304
Epoch 6/20
469/469 ━━━━━━━━━━━━━━━━━━━━ 0s 723us/step - accuracy: 0.9688 - loss: 0.1090 - val_accuracy: 0.9642 - val_loss: 0.1172
Epoch 7/20
[output truncated]

<keras.src.callbacks.history.History at 0x17fd41390>
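
The `469/469` counter in the training log is the number of mini-batches per epoch. A small sketch of the arithmetic, using the training-set size, batch size, and epoch count from this notebook (the variable names are illustrative):

```python
import math

n_train = 60000   # training images, from the reshape step
batch_size = 128  # value passed to model.fit()
epochs = 20       # value passed to model.fit()

# The final batch of each epoch is smaller when batch_size does not divide n_train
steps_per_epoch = math.ceil(n_train / batch_size)
last_batch = n_train - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch, total_updates)
```

This gives 469 steps per epoch, with a final batch of 96 images, and 9,380 weight updates over the full 20 epochs. Changing the batch size therefore changes how many gradient steps each epoch contains, which is one reason it interacts with the learning rate.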

#### Performing inference

In [12]:
valid_0 = X_valid[0].reshape(1, 784)

In [13]:
model.predict(valid_0)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 19ms/step


array([[2.3899322e-06, 1.7875396e-07, 4.7974285e-05, 1.5992101e-04,
        3.9024964e-12, 2.9564990e-08, 1.6198105e-13, 9.9976045e-01,
        1.2823992e-07, 2.8827661e-05]], dtype=float32)

In [14]:
# The predict_classes() method no longer exists in recent TensorFlow releases.
# Instead, take the argmax of the predicted probabilities:
import numpy as np
np.argmax(model.predict(valid_0), axis=-1)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step


array([7])
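
The same argmax idea extends from a single prediction to a batch, which is roughly how an accuracy figure can be computed by hand. The sketch below uses small hypothetical arrays (`probs` and `labels`, three inputs over four classes) in place of real model output and one-hot validation labels.

```python
import numpy as np

# Hypothetical softmax outputs for three inputs over four classes
probs = np.array([[0.1, 0.7, 0.1, 0.1],
                  [0.6, 0.2, 0.1, 0.1],
                  [0.2, 0.2, 0.5, 0.1]])

# Hypothetical one-hot labels for the same three inputs
labels = np.array([[0, 1, 0, 0],
                   [0, 0, 0, 1],
                   [0, 0, 1, 0]])

predicted = np.argmax(probs, axis=-1)   # most probable class per row
actual = np.argmax(labels, axis=-1)     # recover integer labels from one-hot rows
accuracy = np.mean(predicted == actual)

print(predicted, actual, accuracy)
```

With the notebook's own objects, the analogous call would pass `model.predict(X_valid)` and `y_valid` in place of the hypothetical arrays.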