In [15]:
%config Completer.use_jedi = False

# Chapter 2: The mathematical building blocks of Neural Networks
> Notes on the book Deep Learning with Python, 2nd edition

- toc: true 
- badges: true
- comments: true
- categories: [deep_learning, python, tensorflow, book]

## Required packages

In [57]:
import numpy as np
import tensorflow as tf

## A first look at a Neural Network

### MNIST dataset

Task: classify grayscale images of handwritten digits (28 × 28 pixels) into their 10 categories (0 through 9).

In [3]:
from tensorflow.keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


Training data:

In [4]:
train_images.shape

(60000, 28, 28)

In [10]:
len(train_labels)

60000

In [11]:
train_labels

array([5, 0, 4, ..., 5, 6, 8], dtype=uint8)

Test data:

In [12]:
test_images.shape

(10000, 28, 28)

In [13]:
len(test_labels)

10000

In [14]:
test_labels

array([7, 2, 1, ..., 4, 5, 6], dtype=uint8)

### Define and compile the model

Define a basic multi-layer network:

In [16]:
model = tf.keras.Sequential(
    [
        tf.keras.layers.Dense(512, activation="relu"), 
        tf.keras.layers.Dense(10, activation="softmax")
    ]
)
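
To make the two layers less opaque, here is a minimal NumPy sketch of what a `Dense` layer with `relu` followed by a `Dense` layer with `softmax` computes. The weights are random stand-ins (the real ones are learned during training), and the helper names `dense_relu` and `dense_softmax` are hypothetical, not part of Keras.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_relu(x, w, b):
    """Affine transform followed by ReLU: max(x @ w + b, 0)."""
    return np.maximum(x @ w + b, 0.0)

def dense_softmax(x, w, b):
    """Affine transform followed by a row-wise softmax."""
    logits = x @ w + b
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

# Two fake flattened "images" with 784 features each (assumed input size)
x = rng.random((2, 784))

# Randomly initialised parameters standing in for the learned weights
w1, b1 = rng.normal(scale=0.01, size=(784, 512)), np.zeros(512)
w2, b2 = rng.normal(scale=0.01, size=(512, 10)), np.zeros(10)

hidden = dense_relu(x, w1, b1)          # shape (2, 512), non-negative
probs = dense_softmax(hidden, w2, b2)   # shape (2, 10), each row sums to 1

print(hidden.shape, probs.shape)
```

Each row of `probs` is a probability distribution over the 10 digit classes, which is why the last layer uses `softmax`.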

Compile the model by specifying the optimization algorithm, the loss function and the metrics to track:

In [17]:
model.compile(
    optimizer="rmsprop", 
    loss="sparse_categorical_crossentropy", 
    metrics=["accuracy"]
)
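
As a rough illustration of the loss chosen above, the following sketch computes a sparse categorical cross-entropy by hand: the mean negative log-probability that the model assigns to the true class, with labels given as integers rather than one-hot vectors. It is a simplified stand-in, not the Keras implementation; the function name and the `eps` clipping value are assumptions.

```python
import numpy as np

def sparse_categorical_crossentropy(probs, labels, eps=1e-7):
    """Mean negative log-probability assigned to the true class.

    probs:  (batch, num_classes) array of predicted class probabilities
    labels: (batch,) array of integer class indices
    """
    probs = np.clip(probs, eps, 1.0)  # avoid log(0)
    # Pick, for each sample, the probability of its true class
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(-np.log(true_class_probs).mean())

# Hypothetical predictions for two samples over three classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
labels = np.array([0, 2])

loss = sparse_categorical_crossentropy(probs, labels)
print(loss)
```

The loss shrinks toward 0 as the probability assigned to the correct class approaches 1.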

### Pre-process the data as expected by the model

Transform the features from a `uint8` array of shape `(60000, 28, 28)` with values in `[0, 255]` to a `float32` array of shape `(60000, 28 * 28)` with values in `[0, 1]`.

In [19]:
train_images.shape

(60000, 28, 28)

In [18]:
train_images.dtype

dtype('uint8')

In [23]:
train_images = train_images.reshape((60000, 28*28))
train_images = train_images.astype("float32") / 255

In [24]:
train_images.shape

(60000, 784)

In [25]:
train_images.dtype

dtype('float32')

In [26]:
test_images.shape

(10000, 28, 28)

In [27]:
test_images.dtype

dtype('uint8')

In [28]:
test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype("float32") / 255

In [29]:
test_images.shape

(10000, 784)

In [30]:
test_images.dtype

dtype('float32')
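
Since the same two steps are applied to both the training and the test images, they could be wrapped in a small helper. This is only a sketch: `flatten_and_scale` is a hypothetical name, and the random array stands in for the real MNIST images.

```python
import numpy as np

def flatten_and_scale(images):
    """Flatten (n, height, width) uint8 images to (n, height * width) float32 in [0, 1]."""
    n = images.shape[0]
    flat = images.reshape((n, -1))        # -1 lets NumPy infer height * width
    return flat.astype("float32") / 255   # rescale from [0, 255] to [0, 1]

# Fake batch of four 28 x 28 grayscale images
fake_images = np.random.randint(0, 256, size=(4, 28, 28), dtype=np.uint8)
prepared = flatten_and_scale(fake_images)

print(prepared.shape, prepared.dtype)
```

Using `-1` in `reshape` avoids hard-coding the number of samples, so the same function works for arrays of 60000 or 10000 images.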

### Fit the model

In [54]:
model.fit(train_images, train_labels, epochs=20, batch_size=128)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x14a481ee0>

### Predict with the model

Select the first 10 images of the test set.

In [40]:
test_digits = test_images[0:10]
test_digits.shape

(10, 784)

Compute predictions for the first 10 images:

In [35]:
predictions = model.predict(test_digits)

Class probabilities for the first test image:

In [36]:
predictions[0]

array([0., 0., 0., 0., 0., 0., 0., 1., 0., 0.], dtype=float32)

Pick the class with the highest probability:

In [37]:
predictions[0].argmax()

7

Check the true label of the first test image:

In [39]:
test_labels[0]

7
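
The same argmax step can be applied to a whole batch at once by reducing along the class axis, and comparing the result with the true labels gives the accuracy. The sketch below uses small made-up arrays rather than the real model output.

```python
import numpy as np

# Hypothetical softmax outputs for 3 samples over 4 classes
predictions = np.array([[0.1, 0.7, 0.1, 0.1],
                        [0.6, 0.2, 0.1, 0.1],
                        [0.2, 0.2, 0.5, 0.1]], dtype="float32")
true_labels = np.array([1, 0, 3])

# axis=1 picks the most probable class for each row (sample)
predicted_labels = predictions.argmax(axis=1)

# Fraction of samples where the predicted class matches the true label
accuracy = (predicted_labels == true_labels).mean()

print(predicted_labels, accuracy)
```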

### Evaluate the model on test data

In [55]:
test_loss, test_acc = model.evaluate(test_images, test_labels)



In [56]:
print(f"Test accuracy: {test_acc:.4}\nTest loss: {test_loss:.6}")

Test accuracy: 0.9514
Test loss: 18.4538


## Data representation for neural networks: Tensors

Tensors are the basic data structures used in machine learning. Tensors are multi-dimensional arrays. In the context of tensors, a dimension is also called an axis.

In deep learning, you'll generally manipulate tensors with ranks 0 to 4, although you may go up to 5 if you process video data.
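
To give a feel for these ranks, the sketch below builds a few empty batches and prints their number of axes. The sizes are arbitrary and the axis orderings shown (for example, channels last) are one common convention, assumed here for illustration.

```python
import numpy as np

# Hypothetical small batches; the sizes are illustrative only
example_batches = {
    "vector data (samples, features)": np.zeros((4, 10)),
    "sequence data (samples, timesteps, features)": np.zeros((4, 20, 10)),
    "images (samples, height, width, channels)": np.zeros((4, 28, 28, 1)),
    "video (samples, frames, height, width, channels)": np.zeros((4, 8, 28, 28, 1)),
}

# Map each description to the rank (number of axes) of its batch
ranks = {name: batch.ndim for name, batch in example_batches.items()}

for name, rank in ranks.items():
    print(f"rank {rank}: {name}")
```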

### Scalars (rank-0 tensors, 0D tensors)

In [58]:
x = np.array(12)

In [59]:
x.ndim

0

### Vectors (rank-1 tensors, 1D tensors)

In [60]:
x = np.array([12, 3, 6, 14, 7])

In [62]:
x.ndim

1

### Matrices (rank-2 tensors, 2D tensors)

In [63]:
x = np.array(
    [
        [5, 78, 2, 34, 0],
        [6, 79, 3, 35, 1],
        [7, 80, 4, 36, 2]
    ]
)

In [64]:
x.ndim

2

### Rank-3 and higher rank tensors

If you pack such matrices in a new array, you obtain a rank-3 tensor (or 3D tensor):

In [65]:
x = np.array(
    [
        [
            [5, 78, 2, 34, 0],
            [6, 79, 3, 35, 1],
            [7, 80, 4, 36, 2]
        ],
        [
            [5, 78, 2, 34, 0],
            [6, 79, 3, 35, 1],
            [7, 80, 4, 36, 2]
        ],
        [
            [5, 78, 2, 34, 0],
            [6, 79, 3, 35, 1],
            [7, 80, 4, 36, 2]
        ]
    ]
)

In [66]:
x.ndim

3
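
An equivalent way to pack matrices into a rank-3 tensor, without writing out the nested lists, is `np.stack`, which adds a new leading axis. A minimal sketch:

```python
import numpy as np

matrix = np.array([[5, 78, 2, 34, 0],
                   [6, 79, 3, 35, 1],
                   [7, 80, 4, 36, 2]])

# Stack three copies of the rank-2 matrix along a new first axis
stacked = np.stack([matrix, matrix, matrix])

print(stacked.ndim)   # 3
print(stacked.shape)  # (3, 3, 5)
```

Stacking rank-3 tensors in the same way would give a rank-4 tensor, and so on.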

### Tensor key attributes

1. Number of axes (rank)

In [67]:
train_images.ndim

2

2. Shape

In [68]:
train_images.shape

(60000, 784)

3. Data type

In [69]:
train_images.dtype

dtype('float32')
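
The three attributes can also be collected in one call. The helper below is a hypothetical convenience function, and the array is a smaller stand-in with the same layout as the flattened `train_images`.

```python
import numpy as np

def describe_tensor(x):
    """Return the rank, shape and data type of an array-like object."""
    x = np.asarray(x)
    return {"ndim": x.ndim, "shape": x.shape, "dtype": str(x.dtype)}

# Stand-in for the flattened images: 100 samples of 784 float32 features
sample = np.zeros((100, 784), dtype="float32")
info = describe_tensor(sample)

print(info)
```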

### Manipulating tensors in NumPy