<a href="https://colab.research.google.com/github/bhadaur1/Chollet/blob/master/Chollet_Chap2_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Mathematical building blocks of neural networks

In [5]:
from tensorflow import keras
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

**Load the MNIST dataset**

The problem we’re trying to solve here is to classify grayscale images of handwritten digits (28 × 28 pixels) into their 10 categories (0 through 9). We’ll use the MNIST dataset, a classic in the machine-learning community, which has been around almost as long as the field itself and has been intensively studied. It’s a set of 60,000 training images, plus 10,000 test images, assembled by the National Institute of Standards and Technology (the NIST in MNIST) in the 1980s. 

In [15]:
from tensorflow.keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

In [9]:
display(train_images.shape)

display(len(train_labels))

display(train_labels)

(60000, 28, 28)

60000

array([5, 0, 4, ..., 5, 6, 8], dtype=uint8)

In [10]:
display(test_images.shape)

display(len(test_labels))

display(test_labels)

(10000, 28, 28)

10000

array([7, 2, 1, ..., 4, 5, 6], dtype=uint8)

**Define a bare-bones model architecture**

`input_shape` is set to `(28 * 28,)`, so each 28 × 28 image must be flattened into a 784-element vector before it is fed to the network.

In [11]:
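from tensorflow.keras import models
from tensorflow.keras import layers

network = models.Sequential()
# Hidden layer: 512 units, takes flattened 784-pixel images
network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,)))
# Output layer: 10 class probabilities that sum to 1
network.add(layers.Dense(10, activation='softmax'))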
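
To make the two `Dense` layers concrete, here is a minimal NumPy sketch of the forward pass they describe. The weights are hypothetical random values (Keras uses its own initializers and learns the real values during training), and the input batch is a random stand-in, not MNIST data.

```python
import numpy as np

def relu(x):
    # Element-wise max(x, 0)
    return np.maximum(x, 0.0)

def softmax(x):
    # Subtract the row-wise max for numerical stability before exponentiating
    shifted = x - x.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)

# Hypothetical random weights with the same shapes as the two Dense layers
W1 = rng.normal(scale=0.01, size=(28 * 28, 512))
b1 = np.zeros(512)
W2 = rng.normal(scale=0.01, size=(512, 10))
b2 = np.zeros(10)

# Stand-in batch of 4 flattened "images" with values in [0, 1)
batch = rng.random((4, 28 * 28))

hidden = relu(batch @ W1 + b1)      # shape (4, 512)
probs = softmax(hidden @ W2 + b2)   # shape (4, 10)

n_params = W1.size + b1.size + W2.size + b2.size
print(hidden.shape, probs.shape, n_params)
print(probs.sum(axis=1))            # each row sums to approximately 1
```

Each row of `probs` holds 10 scores that sum to 1, one per digit class. The parameter count here (784 × 512 + 512 + 512 × 10 + 10) should match what `network.summary()` reports for the model above.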

**Compile the model**

Compilation needs three things:
1. Loss function
1. Optimizer
1. Metrics to monitor during training and testing

In [12]:
network.compile(optimizer='rmsprop',
                loss='categorical_crossentropy',
                metrics=['accuracy'])
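
As a rough illustration of what the `categorical_crossentropy` loss measures, the sketch below computes it by hand in NumPy for a tiny three-class batch. The helper name, the `eps` clipping value, and the example probabilities are assumptions made for illustration; Keras's own implementation may differ in details.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0); eps is an illustrative choice
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    # Per-sample loss: -sum_k y_true[k] * log(y_pred[k]), averaged over the batch
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# One-hot targets for two samples (classes 2 and 0)
y_true = np.array([[0, 0, 1], [1, 0, 0]], dtype="float32")

# Hypothetical predictions: one confident and correct, one nearly uniform
confident = np.array([[0.05, 0.05, 0.90], [0.80, 0.10, 0.10]])
unsure = np.array([[0.4, 0.3, 0.3], [0.3, 0.4, 0.3]])

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident, loss_unsure)
```

The loss is lower when more probability is placed on the correct class, which is the signal the optimizer uses to adjust the weights.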

**Preprocess and prepare the training and test data**

Before training, we’ll preprocess the data by *reshaping it into the shape the network expects and scaling it so that all values are in the [0, 1] interval.* Previously, our training images, for instance, were stored in an array of shape (60000, 28, 28) of type uint8 with values in the [0, 255] interval. We transform it into a float32 array of shape (60000, 28 * 28) with values between 0 and 1.

In [16]:
# Flatten each 28 x 28 image into a 784-vector, then scale uint8 [0, 255] to float32 [0, 1]
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype('float32') / 255.0

test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype('float32') / 255.0
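
The effect of the reshape and scaling steps can be checked on a small synthetic array without downloading MNIST. The `fake_images` array below is a random stand-in, and using `-1` so NumPy infers the column count is a stylistic alternative to hard-coding the sample count, not what the cell above does.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a small slice of MNIST: 5 images, 28 x 28, uint8 in [0, 255]
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# -1 lets NumPy infer the 28 * 28 = 784 columns
flat = fake_images.reshape(len(fake_images), -1)
scaled = flat.astype("float32") / 255.0

print(fake_images.shape, fake_images.dtype)
print(scaled.shape, scaled.dtype, scaled.min(), scaled.max())
```

Reshaping only changes how the pixels are indexed: row `i` of `flat` contains the same values as `fake_images[i]` read row by row.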

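**Prepare the labels**

The `categorical_crossentropy` loss expects one-hot encoded labels, so the integer labels are converted with `to_categorical`.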

In [17]:
from tensorflow.keras.utils import to_categorical

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
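
For intuition, one-hot encoding can be sketched in plain NumPy. The `one_hot` helper below is hypothetical and only approximates what `to_categorical` does for this use case; the sample labels are the first three training labels shown earlier.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(num_classes, dtype="float32")[labels]

labels = np.array([5, 0, 4], dtype="uint8")
encoded = one_hot(labels)

print(encoded)
print(encoded.argmax(axis=1))  # recovers the original integer labels
```

Each label becomes a length-10 vector with a single 1 at the index of its class, which matches the 10-way softmax output of the network.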

**Finally, fit the model and evaluate it on the test data**

In [18]:
network.fit(x=train_images, y=train_labels, batch_size=128, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7f4d27d2bc50>
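
The `batch_size` and `epochs` arguments determine how many weight updates training performs. The arithmetic below is a small sketch that assumes the final, smaller batch of each epoch is kept rather than dropped, which is my understanding of the default when fitting on NumPy arrays.

```python
import math

num_samples = 60_000
batch_size = 128
epochs = 5

# Number of batches needed to cover every sample once
steps_per_epoch = math.ceil(num_samples / batch_size)
# Samples left over for the final, partial batch
last_batch_size = num_samples - (steps_per_epoch - 1) * batch_size
# One gradient update per batch
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch_size, total_updates)
```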

In [20]:
test_loss, test_acc = network.evaluate(x=test_images, y=test_labels)
print(f"test_acc: {test_acc}")

test_acc: 0.9800000190734863
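
The accuracy metric compares the most probable predicted class with the true class for each sample. The following is a minimal NumPy sketch of that calculation; the `accuracy` helper and the probability values are made up for illustration and are not outputs of the trained network.

```python
import numpy as np

def accuracy(y_true_one_hot, y_prob):
    # The predicted class is the index of the highest probability
    true_classes = y_true_one_hot.argmax(axis=1)
    pred_classes = y_prob.argmax(axis=1)
    return float(np.mean(true_classes == pred_classes))

# Four samples, three classes; the last prediction is wrong
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]], dtype="float32")
y_prob = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
    [0.5, 0.3, 0.2],
])

acc = accuracy(y_true, y_prob)
print(acc)
```

On the real model, the same idea applies to the array returned by `network.predict(test_images)`, where each row holds the 10 class probabilities for one test image.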
