# Chapter 2. The Mathematical Building Blocks of Neural Networks

## 2.1. A first look at a neural network

In [1]:
# import libraries
from tensorflow.keras.datasets import mnist
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np

In [2]:
# reference: https://bic-berkeley.github.io/psych-214-fall-2016/printing_floating.html
np.set_printoptions(precision=8)

### Import Dataset

In [3]:
# load MNIST dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()


In [4]:
# check shape
print(train_images.shape)
print(test_images.shape)

(60000, 28, 28)
(10000, 28, 28)


### Build Model

In [5]:
# build neural network
model = keras.Sequential([
    layers.Dense(512, activation="relu"),
    layers.Dense(10, activation="softmax")
])
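
To make the two layers above less opaque, here is a minimal NumPy sketch of what a `Dense` layer with `relu` followed by a `Dense` layer with `softmax` computes. The weights are random and the batch has only two samples; the variable names (`W1`, `b1`, `W2`, `b2`) are assumptions chosen for illustration, not how Keras stores or initializes its parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # element-wise max(x, 0)
    return np.maximum(x, 0.0)

def softmax(x):
    # subtract the row maximum for numerical stability, then normalize each row
    shifted = x - x.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical batch: 2 samples with 784 features each
x = rng.random((2, 784)).astype("float32")

# Random stand-ins for the learned parameters of the two layers
W1 = rng.normal(scale=0.05, size=(784, 512)).astype("float32")
b1 = np.zeros(512, dtype="float32")
W2 = rng.normal(scale=0.05, size=(512, 10)).astype("float32")
b2 = np.zeros(10, dtype="float32")

hidden = relu(x @ W1 + b1)          # shape (2, 512), all values >= 0
probs = softmax(hidden @ W2 + b2)   # shape (2, 10), each row sums to 1

print(hidden.shape, probs.shape)
print(probs.sum(axis=1))
```

Each row of `probs` can be read as a probability distribution over the 10 digit classes, which is why the last layer has 10 units.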


In [6]:
# compile model: set the optimizer, loss function, and metrics
model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
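
As a rough illustration of what the `sparse_categorical_crossentropy` loss measures, the sketch below computes the per-sample loss from integer labels and predicted probabilities using NumPy. The function and the sample values are assumptions written for this example; the Keras implementation differs in details such as clipping and how it averages over a batch.

```python
import numpy as np

def sparse_categorical_crossentropy(labels, probs, eps=1e-7):
    # labels: integer class ids, shape (n,)
    # probs: predicted probabilities, shape (n, num_classes)
    probs = np.clip(probs, eps, 1.0)                 # avoid log(0)
    picked = probs[np.arange(len(labels)), labels]   # probability of the true class
    return -np.log(picked)

# Hypothetical predictions for two samples
labels = np.array([7, 2])
probs = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],  # confident and correct
    [0.1] * 10,                                          # uniform guess
])

losses = sparse_categorical_crossentropy(labels, probs)
print(losses)
```

The loss is near zero when the model puts all its probability on the true class and grows as that probability shrinks. "Sparse" refers to the labels being plain integers rather than one-hot vectors.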

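**Note**: each input image is 28 x 28 = 784 pixels, so we need to reshape every image into a flat vector of 784 values before passing it to the `Dense` layers (512 is the number of units in the first layer, not the input size)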

### Feature Engineering

In [7]:
# flatten each image to a 784-value vector and scale pixel values to [0, 1]
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype("float32") / 255

test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype("float32") / 255
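
The effect of the reshape and scaling steps can be checked on a small array without downloading the dataset. The sketch below uses three random 28 x 28 arrays as a hypothetical stand-in for MNIST images.

```python
import numpy as np

# Hypothetical stand-in for MNIST: 3 "images" of 28 x 28 uint8 pixels
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(3, 28, 28), dtype=np.uint8)

flat = images.reshape((len(images), 28 * 28))   # (3, 28, 28) -> (3, 784)
scaled = flat.astype("float32") / 255           # values now lie in [0, 1]

print(images.shape, "->", scaled.shape)
print(scaled.dtype, scaled.min(), scaled.max())
```

The reshape is row-major, so the first 28 values of each flat vector are the first row of the original image. No pixel data is lost or reordered beyond this flattening.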

### Train Model

In [8]:
# train the model
model.fit(train_images, train_labels, epochs=5, batch_size=128)


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f6e953e1c10>

In [9]:
# make prediction with model
test_digits = test_images[0:10]
predictions = model.predict(test_digits)
predictions[0]

array([8.8546548e-09, 1.3109719e-11, 3.4533289e-07, 3.5175533e-05,
       4.9776541e-12, 1.2499955e-08, 1.2581968e-13, 9.9996328e-01,
       6.2665464e-08, 1.0738448e-06], dtype=float32)

In [10]:
# from the prediction, get the index of the highest probability
print(predictions[0].argmax())

7


In [11]:
# print prediction value
print(predictions[0][7])

# check test label
print(test_labels[0])

0.9999633
7
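
The same `argmax` comparison can be applied to a whole batch to compute accuracy by hand. The arrays below are small hypothetical values chosen for illustration, not output from the model above.

```python
import numpy as np

# Hypothetical predictions for 4 samples over 3 classes (not model output)
predictions = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
labels = np.array([1, 0, 2, 1])

predicted_classes = predictions.argmax(axis=1)    # most probable class per row
accuracy = (predicted_classes == labels).mean()   # fraction of rows predicted correctly

print(predicted_classes)
print(accuracy)
```

Here three of the four rows match their label, so the accuracy is 0.75. This is the same quantity the `"accuracy"` metric reports, computed over many more samples.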


### Evaluate Model On Test Dataset

In [12]:
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f"test_acc: {test_acc}")

test_acc: 0.9796000123023987


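**Note**: the test accuracy (about 97.96%) should be compared with the training accuracy reported during `fit`. When training accuracy is higher than test accuracy, the model is overfitting: it performs worse on new data than on the data it was trained on. This does not mean the model is poor in absolute terms, only that it generalizes slightly less well than its training score suggests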
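
One simple way to quantify the comparison in the note above is the difference between the two accuracies. In the sketch below, `test_acc` is the value printed by `model.evaluate` earlier, while `train_acc` is a hypothetical placeholder that should be replaced with the accuracy reported in the last epoch of `fit`.

```python
def generalization_gap(train_acc, test_acc):
    # a positive gap means the model does better on data it has already seen
    return train_acc - test_acc

test_acc = 0.9796    # rounded value printed by model.evaluate above
train_acc = 0.989    # hypothetical placeholder, not a measured result

gap = generalization_gap(train_acc, test_acc)
print(f"gap: {gap:.4f}")
```

A small positive gap is common and is not a problem by itself. A large or growing gap across epochs is the stronger sign of overfitting.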