# Machine Learning Series: Simple Neural Network Pipeline 1

- &copy;2024 Madhava Pandiyan CN (MPtheRoboticist)

## 0. Installations

- **Jupyter Notebook Version:** 7.0.6
- **IPython Version:** 8.12.3
- **TensorFlow Version:** 2.3.0

## 1. Import  

In [1]:
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.utils import to_categorical

2024-08-11 00:00:03.279314: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/ros/noetic/lib:/opt/ros/noetic/lib/x86_64-linux-gnu
2024-08-11 00:00:03.279354: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.


## 2. Load Data

From **[1]**

- **x_train:** _uint8_ NumPy array of grayscale image data with shape (60000, 28, 28), containing the training data. Pixel values range from 0 to 255.

- **y_train:** _uint8_ NumPy array of digit labels (integers in range 0-9) with shape (60000,) for the training data.

- **x_test:** _uint8_ NumPy array of grayscale image data with shape (10000, 28, 28), containing the test data. Pixel values range from 0 to 255.

- **y_test:** _uint8_ NumPy array of digit labels (integers in range 0-9) with shape (10000,) for the test data.


In [2]:
# Load the MNIST data into the following variables
'''
x_train: 60000 grayscale images each made of 28x28 pixels.
y_train: label for all 60000 images split among 10 classes.

x_test: 10000 grayscale images each made of 28x28 pixels.
y_test: label for all 10000 images split among 10 classes.
'''
(x_train, y_train), (x_test, y_test) = mnist.load_data()

In [3]:
# Checking the size of each variable
print("Size of x_train: ", x_train.shape)
print("Size of y_train: ", y_train.shape)
print("Size of x_test:\t ", x_test.shape)
print("Size of y_test:\t ", y_test.shape)

Size of x_train:  (60000, 28, 28)
Size of y_train:  (60000,)
Size of x_test:	  (10000, 28, 28)
Size of y_test:	  (10000,)


In [4]:
# Checking the value of y_train before pre-processing.
print(y_train)

[5 0 4 ... 5 6 8]


## 3. Pre-processing the data

- **x_train_:** _x_train_ cast to _float32_ and rescaled from the range 0 to 255 to the range 0 to 1.
- **x_test_:** _x_test_ cast to _float32_ and rescaled from the range 0 to 255 to the range 0 to 1.
- **y_train_:** _y_train_ converted from integer encoding to _one-hot encoding_.
- **y_test_:** _y_test_ converted from integer encoding to _one-hot encoding_.

In [5]:
x_train_ = x_train.astype('float32') / 255
x_test_ = x_test.astype('float32') / 255

# to_categorical converts a vector of integer class labels
# into a binary class matrix, i.e. the one-hot encoding
# of each label.
# 10 is the number of classes (digits 0 to 9).
y_train_ = to_categorical(y_train, 10)
y_test_ = to_categorical(y_test, 10)

In [6]:
# Checking the size of each variable
print("Size of x_train after preprocessing:\t ", x_train_.shape)
print("Size of y_train after preprocessing:\t ", y_train_.shape)
print("Size of x_test after preprocessing:\t ", x_test_.shape)
print("Size of y_test after preprocessing:\t ", y_test_.shape)

Size of x_train after preprocessing:	  (60000, 28, 28)
Size of y_train after preprocessing:	  (60000, 10)
Size of x_test after preprocessing:	  (10000, 28, 28)
Size of y_test after preprocessing:	  (10000, 10)


In [7]:
# Checking the value of y_train after pre-processing.
print(y_train_)

[[0. 0. 0. ... 0. 0. 0.]
 [1. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 1. 0.]]
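
The truncated printout above hides the column that carries the 1 in most rows. As a rough illustration of what one-hot encoding does, here is a minimal NumPy sketch. The helper `one_hot` and the array `sample_labels` are hypothetical names introduced for this example; it is only assumed to mirror the behaviour of `to_categorical`, not to reproduce its implementation.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Row i gets a 1 in the column given by labels[i]
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# First three labels printed by cell [4]
sample_labels = np.array([5, 0, 4])
sample_encoded = one_hot(sample_labels, 10)
print(sample_encoded)

# argmax along the class axis recovers the integer labels
recovered = sample_encoded.argmax(axis=1)
print(recovered)
```

The `argmax` round trip is also how predicted class indices are usually read back from a softmax output.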


## 4. Neural Network Architecture

**Sequential [2]:**
- Note: can be imported either from **tensorflow.keras.models.Sequential** or from **tensorflow.keras.Sequential**.
- **Sequential** groups a linear stack of layers into a **Model**.

**Flatten [3]:**
- **Flatten():** Flattens the input. Does not affect the batch size.
- Here, since each input has shape (28, 28), the output has shape (None, 784), where None stands for the batch size.
- 28 x 28 = 784
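
The shape change described above can be sketched with a plain NumPy reshape. This is an illustration under the assumption of a hypothetical batch of 32 images, not the Keras implementation of **Flatten**.

```python
import numpy as np

# Hypothetical batch of 32 grayscale images, each 28 x 28 pixels
batch = np.zeros((32, 28, 28), dtype=np.float32)

# Keep the batch axis and merge the remaining axes into one
flat = batch.reshape(batch.shape[0], -1)
print(flat.shape)  # (32, 784)
```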

**Dense [4]:**
- **Dense(units, activation):** A fully connected layer that computes output = activation(dot(input, kernel) + bias).
- **units:** Positive integer, dimensionality of the output space.
- **activation:** Activation function to use. Check [5] for list of available in-built activation functions.
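
To make the role of **units** concrete, the following is a minimal NumPy sketch of a dense layer's forward pass. The function `dense`, the random weights, and the batch of 4 inputs are assumptions made for illustration; Keras initialises and stores its weights differently.

```python
import numpy as np

def dense(inputs, kernel, bias, activation=None):
    """Forward pass of a fully connected layer: activation(inputs @ kernel + bias)."""
    outputs = inputs @ kernel + bias
    return activation(outputs) if activation is not None else outputs

rng = np.random.default_rng(0)
inputs = rng.random((4, 784))                     # hypothetical batch of 4 flattened images
kernel = rng.standard_normal((784, 128)) * 0.01   # one column of weights per unit
bias = np.zeros(128)                              # one bias per unit

hidden = dense(inputs, kernel, bias, activation=lambda z: np.maximum(z, 0.0))
print(hidden.shape)               # (4, 128): units sets the size of the last axis
print(kernel.size + bias.size)    # trainable parameters in this layer
```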

**ReLU [6]:**
- ReLU --> Rectified Linear Unit
- The Rectified Linear Unit (ReLU for short) is a piecewise-linear, and therefore non-linear, activation function. It helps mitigate the vanishing gradient problem and has become increasingly popular in recent years. In short, it keeps positive input values unchanged and sets negative input values to zero.

                        f(x) = max{0, x}
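
A minimal NumPy sketch of the formula above, together with its derivative. The names `relu` and `relu_grad` are hypothetical helpers for this example; the derivative at exactly 0 is set to 0 here by convention.

```python
import numpy as np

def relu(x):
    """Element-wise max(0, x)."""
    return np.maximum(x, 0.0)

def relu_grad(x):
    """Derivative of ReLU: 1 where x > 0, otherwise 0."""
    return (x > 0).astype(x.dtype)

values = np.array([-2.0, -0.5, 0.0, 1.5])
activated = relu(values)
gradients = relu_grad(values)
print(activated)   # negative entries become 0, positive entries pass through
print(gradients)
```

Because the derivative is exactly 1 for every positive input, gradients passing through active units are not shrunk, which is the usual argument for why ReLU eases the vanishing gradient problem.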

**Softmax [7]:**
- The elements of the output vector are in range [0, 1] and sum to 1.

                        f(x_i) = exp(x_i) / sum_j exp(x_j)
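
The formula can be sketched in NumPy as follows. Subtracting the row maximum before exponentiating is a common numerical-stability step that does not change the result; it is an addition made for this example, and the `logits` values are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the maximum subtracted for numerical stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=-1, keepdims=True)

logits = np.array([[2.0, 1.0, 0.1],
                   [1000.0, 1000.0, 1000.0]])  # large values would overflow a naive exp
probs = softmax(logits)
print(probs)
print(probs.sum(axis=1))  # each row sums to 1
```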

In [8]:
model = Sequential([
    Flatten(input_shape=(28, 28)),
    Dense(128, activation='relu'),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])

2024-08-11 00:00:42.920147: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2024-08-11 00:00:43.365623: E tensorflow/stream_executor/cuda/cuda_driver.cc:314] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2024-08-11 00:00:43.365657: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (madhava-pandiyan): /proc/driver/nvidia/version does not exist
2024-08-11 00:00:43.366117: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-08-11 00:00:43.542383: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2394465000 Hz
2024-08-11 00:00:43.543462: I ten
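
As a rough check on the size of the model defined in cell [8], the number of trainable parameters can be worked out by hand: each **Dense** layer has one weight per input-output pair plus one bias per unit, and **Flatten** has none. The list `layer_sizes` below is a hypothetical helper for this calculation; calling `model.summary()` on the model should report the same figures.

```python
# Layer widths: 784 flattened pixels -> 128 -> 64 -> 10 classes
layer_sizes = [784, 128, 64, 10]

# weights (n_in * n_out) plus biases (n_out) for each Dense layer
params_per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
total_params = sum(params_per_layer)

print(params_per_layer)  # [100480, 8256, 650]
print(total_params)      # 109386
```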

## 5. Compile the architecture

**Compile [2]:**
- Configures the model for training.
- Arguments: Optimizer, Loss, Metrics.

**Optimizer:**
- Check [8] for list of in-built optimizers.

**Loss:**
- Check [9] for list of in-built loss functions.

**Adam:**

- Following [10], the Adam optimizer algorithm is as follows:

<img src="https://miro.medium.com/v2/resize:fit:720/format:webp/1*zfdW5zAyQxge85gA_mFPYg.png" alt="adam_optimizer" width="600"/>
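
For readers who prefer code to the figure, here is a minimal NumPy sketch of one Adam update in its standard textbook form. The function `adam_step` and the toy objective are hypothetical; the default hyperparameters shown are the commonly quoted ones and may differ slightly from the Keras defaults.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update and return the new (param, m, v)."""
    m = beta1 * m + (1.0 - beta1) * grad        # running mean of gradients
    v = beta2 * v + (1.0 - beta2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1.0 - beta1 ** t)              # bias correction (t starts at 1)
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy problem: minimise f(x) = (x - 3)^2, whose gradient is 2 * (x - 3)
x = np.array([0.0])
m = np.zeros_like(x)
v = np.zeros_like(x)
for t in range(1, 501):
    grad = 2.0 * (x - 3.0)
    x, m, v = adam_step(x, grad, m, v, t, lr=0.1)

print(x)  # should end up close to the minimiser at 3
```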

**Categorical Cross entropy:**

- Following [11], categorical cross-entropy is defined as follows:

<img src="https://miro.medium.com/v2/resize:fit:720/format:webp/0*PSuYoaQICXefd6qA.png" alt="categorical_cross_entropy" width="400"/>
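
A minimal NumPy sketch of the same loss, assuming one-hot targets and predicted probabilities that sum to 1 per row. The function name, the clipping constant `eps`, and the sample arrays are assumptions for illustration; the loss is averaged over the batch.

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean over the batch of -sum_i y_true_i * log(y_pred_i)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)   # avoid log(0)
    per_sample = -np.sum(y_true * np.log(y_pred), axis=-1)
    return per_sample.mean()

# Two hypothetical samples with three classes
y_true = np.array([[0.0, 0.0, 1.0],
                   [1.0, 0.0, 0.0]])
y_pred = np.array([[0.1, 0.2, 0.7],
                   [0.8, 0.1, 0.1]])

loss = categorical_cross_entropy(y_true, y_pred)
print(loss)  # equals -(log(0.7) + log(0.8)) / 2
```

With one-hot targets only the probability assigned to the true class contributes, so the loss shrinks towards 0 as that probability approaches 1.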

In [9]:
model.compile(optimizer='adam', 
              loss='categorical_crossentropy')

## 6. Train the model

**Fit [2]:**

- Trains the model for a fixed number of epochs (dataset iterations).
- **Batch size:** Number of samples per gradient update.
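
The two settings above fix how many gradient updates a run performs. A short arithmetic sketch, using the values from this notebook (60000 training samples, a batch size of 32, and 20 epochs):

```python
import math

num_samples = 60000   # size of the MNIST training set
batch_size = 32
epochs = 20

# The last batch of an epoch may be smaller, hence the ceiling
steps_per_epoch = math.ceil(num_samples / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 1875
print(total_updates)    # 37500
```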

In [10]:
model.fit(x_train_, y_train_, epochs=20, batch_size=32)

2024-08-11 00:01:10.109086: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 188160000 exceeds 10% of free system memory.


Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<tensorflow.python.keras.callbacks.History at 0x7f972c095040>

## 7. Evaluate

**Evaluate [2]:**

- Returns the loss value & metrics values for the model in test mode.
- **Note:** Because no metrics were passed to _compile_, _evaluate_ returns only the loss value here.
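
If a classification accuracy is wanted alongside the loss, it can be computed from predicted probabilities and one-hot labels. The sketch below uses small hypothetical arrays and a hypothetical helper `accuracy`; with this notebook's model, the probabilities would presumably come from `model.predict(x_test_)`, and passing `metrics=['accuracy']` to `compile` is the usual Keras route.

```python
import numpy as np

def accuracy(y_true_one_hot, y_prob):
    """Fraction of rows where the most probable class matches the true class."""
    predicted = y_prob.argmax(axis=1)
    actual = y_true_one_hot.argmax(axis=1)
    return float(np.mean(predicted == actual))

# Four hypothetical samples with three classes; true labels are 0, 1, 2, 1
y_true = np.eye(3)[[0, 1, 2, 1]]
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.3, 0.4, 0.3],   # predicted class 1, true class 2
                   [0.2, 0.5, 0.3]])

acc = accuracy(y_true, y_prob)
print(acc)  # 0.75
```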

In [11]:
test_loss = model.evaluate(x_test_, y_test_, batch_size=32)



In [12]:
print("Test loss: ", test_loss)

Test loss:  0.12531176209449768


## 8. References

[1] - **tf.keras.datasets.mnist.load_data** (https://www.tensorflow.org/api_docs/python/tf/keras/datasets/mnist/load_data)

[2] - **tf.keras.Sequential** (https://www.tensorflow.org/api_docs/python/tf/keras/Sequential)

[3] - **tf.keras.layers.Flatten** (https://www.tensorflow.org/api_docs/python/tf/keras/layers/Flatten)

[4] - **tf.keras.layers.Dense** (https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense)

[5] - **tf.keras.activations** (https://www.tensorflow.org/api_docs/python/tf/keras/activations)

[6] - **What is ReLU-function (Rectified Linear Unit)?** (https://databasecamp.de/en/ml/relu-en)

[7] - **tf.keras.activations.softmax** (https://www.tensorflow.org/api_docs/python/tf/keras/activations/softmax)

[8] - **tf.keras.optimizers** (https://www.tensorflow.org/api_docs/python/tf/keras/optimizers)

[9] - **tf.keras.losses** (https://www.tensorflow.org/api_docs/python/tf/keras/losses)

[10] - **Everything you need to know about Adam Optimizer** (https://medium.com/@nishantnikhil/adam-optimizer-notes-ddac4fd7218)

[11] - **Categorical cross-entropy loss — The most important loss function** (https://neuralthreads.medium.com/categorical-cross-entropy-loss-the-most-important-loss-function-d3792151d05b)