# TensorFlow Keras Tutorial - Neural Network (Part 1)

**What is Keras?** Keras is a high-level, user-friendly API that simplifies building deep neural networks without requiring you to implement low-level network details. Keras originally supported several backends, including *TensorFlow* and *Theano*; this tutorial uses the version bundled with TensorFlow as `tf.keras`. This tutorial series is designed to take you from a beginner to an intermediate level in understanding Keras.

## In this part, we will cover:

- Loading the MNIST Digit dataset
- Basic preprocessing of image data
- Training a simple neural network
- Validating our trained model
- Implementing early stopping when reaching desired accuracy or loss levels


In [3]:
pip install tensorflow


# Step 1 - Importing libraries

In [1]:
# Import necessary libraries
import matplotlib.pyplot as plt  # Library for creating visualizations
import tensorflow as tf  # Open-source machine learning framework
import numpy as np  # Numerical computing library

# Display visualizations directly in the notebook
%matplotlib inline



# Step 2 - Importing Dataset


Here we load the MNIST dataset, which ships with TensorFlow.

> `mnist = tf.keras.datasets.mnist`<br>
This returns the dataset module. Several other datasets are bundled with Keras in the same way.

> Calling the `load_data` function on this module returns the data already split into train and test sets, each in the form (features, target).

In [None]:
# Load the MNIST dataset using TensorFlow's Keras API
mnist = tf.keras.datasets.mnist

# Load and unpack the training and testing data
(x_train, y_train), (x_test, y_test) = mnist.load_data()


## Overview of the dataset


The MNIST dataset consists of images, each having a resolution of 28x28 pixels. The dataset is divided into training and test sets, containing 60000 and 10000 images respectively.

- The shape `(60000, 28, 28)` indicates that the training data contains 60000 images, each with dimensions 28x28.

- The shape `(60000,)` is that of the training labels: a one-dimensional array holding 60000 labels, one for each image.

In summary, the dataset comprises images of handwritten digits, and each image is associated with a corresponding label.
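
To make these shapes concrete, here is a minimal NumPy sketch. It uses small stand-in arrays (hypothetical data, not the real MNIST arrays) with the same layout, so it runs without downloading anything:

```python
import numpy as np

# Stand-in arrays with the same layout as MNIST, but only 6 samples (hypothetical data)
images = np.zeros((6, 28, 28), dtype=np.uint8)
labels = np.array([5, 0, 4, 1, 9, 2], dtype=np.uint8)

print(images.shape)     # (6, 28, 28): 6 images of 28x28 pixels
print(labels.shape)     # (6,): a 1-D array with one label per image
print(images[0].shape)  # (28, 28): a single image

# A column layout of shape (6, 1) requires an explicit reshape
labels_column = labels.reshape(-1, 1)
print(labels_column.shape)  # (6, 1)
```

The image at index `i` and the label at index `i` belong together, which is why the first dimension of both arrays has the same length.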


In [None]:
# Print the shapes of the loaded data and labels
print(f'Shape of the training data: {x_train.shape}')
print(f'Shape of the training target: {y_train.shape}')
print(f'Shape of the test data: {x_test.shape}')
print(f'Shape of the test target: {y_test.shape}')

In [None]:
# Print the training labels
print(y_train)


In [None]:
# Let's plot the first image in the training data and look at its corresponding target (y) variable.

# Display the first training image using a grayscale colormap
plt.imshow(x_train[0], cmap='gray')

# Print the corresponding target (label) for the first training image
print(f'Target variable is {y_train[0]}')


# Step 3 - Preprocess - Data Normalization

The pixel values in the image data range from 0 to 255, representing grayscale intensity. To better suit the neural network's input requirements, we'll scale these values to a range of 0 to 1. This scaling process is known as **Normalization**.

> Note: While normalization is beneficial for training neural networks, it's not mandatory. You can opt to skip these lines and observe the impact on the final output.
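
As a small illustration of what this scaling does, the sketch below applies it to a tiny hypothetical array of 8-bit intensities instead of a full image. The array name and values are made up for the example:

```python
import numpy as np

# Hypothetical 2x3 "image" holding 8-bit grayscale intensities
pixels = np.array([[0, 51, 102], [153, 204, 255]], dtype=np.uint8)

# Dividing by 255.0 promotes the integers to floats in the range [0, 1]
scaled = pixels / 255.0

print(pixels.dtype, pixels.min(), pixels.max())
print(scaled.dtype, scaled.min(), scaled.max())
```

Note that the division also changes the data type from unsigned 8-bit integers to floating point.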


In [None]:
# Set a wider print line width so that each row of the array prints on one line
np.set_printoptions(linewidth=200)

# Print the pixel values of the first training image
print(x_train[0])


In [None]:
# Normalizing the data
# Each element of the nested list/array is divided by 255 to normalize the pixel values between 0 and 1.
# This normalization is commonly performed to improve the training stability of neural networks.
x_train = x_train / 255
x_test = x_test / 255


In [None]:
# Print the pixel values of the first training image after normalization
print(x_train[0])


# Step 4 - Modelling

## Types of Models in TensorFlow

There are two primary ways to build models with Keras in TensorFlow:

1. **Sequential** - Discussed in this tutorial
2. **Functional** - An arbitrary graph of layers

## Models

- `tf.keras.models.Sequential()`: This class lets you create a linear stack of layers, resulting in a sequential neural network.

- `tf.keras.Model()`: This class lets you construct an arbitrary graph of layers, as long as there are no cycles.
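
For comparison, here is a rough sketch of the graph-style approach, which Keras exposes as the functional API through `tf.keras.Input` and `tf.keras.Model`. It builds the same stack of layers that this tutorial later builds with `Sequential`; the variable names are chosen for illustration only:

```python
import tensorflow as tf

# Functional API: wire layers together explicitly, then wrap them in a Model
inputs = tf.keras.Input(shape=(28, 28))
x = tf.keras.layers.Flatten()(inputs)
x = tf.keras.layers.Dense(128, activation='relu')(x)
x = tf.keras.layers.Dense(64, activation='relu')(x)
outputs = tf.keras.layers.Dense(10, activation='softmax')(x)

functional_model = tf.keras.Model(inputs=inputs, outputs=outputs)
functional_model.summary()
```

For a plain stack of layers like this one, `Sequential` is shorter; the functional style becomes useful once a model needs branches, multiple inputs, or multiple outputs.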

## Flatten Layer

- `tf.keras.layers.Flatten()`: This layer is used to flatten the input. For input of shape `(batch_size, height, width)`, the output is reshaped to `(batch_size, height * width)`.
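
The reshaping that a flatten step performs can be sketched with plain NumPy. This is only an illustration of the shape change on a hypothetical batch, not how the Keras layer is implemented:

```python
import numpy as np

# Hypothetical batch of 4 images, each 28x28
batch = np.arange(4 * 28 * 28).reshape(4, 28, 28)

# Keep the batch dimension and merge the remaining ones into a single axis
flat = batch.reshape(batch.shape[0], -1)

print(batch.shape)  # (4, 28, 28)
print(flat.shape)   # (4, 784)
```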

## Dense Layer

- `tf.keras.layers.Dense()`: Represents a normal dense layer in a neural network, where each node is connected to every node in the next layer.

    - **units**: Corresponds to the number of nodes in the layer.
    - **activation**: An element-wise activation function.
    
        - **relu**: Converts negative values to 0 while keeping positive values unchanged.
        - **softmax**: Converts the raw outputs into probabilities that each lie between 0 and 1 and together sum to 1; the largest input receives the highest probability.

In the example below, we have three dense layers with 128, 64, and 10 nodes respectively. The layer with 10 nodes is the output layer, with one node per digit class. We use the *softmax* activation function for this final layer so that its outputs form a probability distribution over the 10 digits; the class with the highest probability is the model's prediction.
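
A minimal NumPy sketch of the two activations, written by hand rather than with TensorFlow's built-in versions. The input values are hypothetical. Note that softmax yields probabilities that sum to 1, with the largest input receiving the largest probability:

```python
import numpy as np

def relu(x):
    # Negative entries become 0, positive entries pass through unchanged
    return np.maximum(x, 0)

def softmax(x):
    # Subtract the maximum for numerical stability, then normalize the exponentials
    exps = np.exp(x - np.max(x))
    return exps / exps.sum()

scores = np.array([-1.0, 0.0, 2.0])  # hypothetical raw layer outputs

print(relu(scores))
probs = softmax(scores)
print(probs, probs.sum())
```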

## Compiling the Model

The `model.compile()` function configures the optimizer, loss, and metrics for training.

- **optimizer**: Updates the parameters of the neural network.
- **loss**: Measures the error in our model.
- **metrics**: Used for evaluating the model's performance. Metrics are only reported and do not influence training; the loss, by contrast, measures the model's error during training and is the quantity the optimizer minimizes.
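
To give a feel for what a loss such as `sparse_categorical_crossentropy` measures, the sketch below computes the same quantity by hand with NumPy for a tiny hypothetical batch. It is an illustration of the formula (mean negative log-probability of the true class), not the Keras implementation:

```python
import numpy as np

# Hypothetical softmax outputs for 2 samples over 3 classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
labels = np.array([0, 2])  # integer class labels, in the same style as y_train

# Pick the probability the model assigned to the true class of each sample
true_class_probs = probs[np.arange(len(labels)), labels]

# Cross-entropy is the mean negative log of those probabilities
loss = -np.log(true_class_probs).mean()

print(true_class_probs)
print(loss)
```

The closer the probability of the true class is to 1, the closer its contribution to the loss is to 0.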


In [None]:
# Creating the architecture of the model
# A Sequential model is created, and layers are added to define the neural network architecture.
model = tf.keras.models.Sequential()

# Flattening layer: Reshapes the input data into a 1D array before passing it to the neural network.
model.add(tf.keras.layers.Flatten())

# Dense layers: Fully connected layers with specified number of units and activation functions.
# Here, we add two hidden layers with 128 and 64 units respectively, using ReLU activation.
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(64, activation=tf.nn.relu))

# Output layer: Final layer with 10 units (one for each digit) and softmax activation for classification.
model.add(tf.keras.layers.Dense(10, activation=tf.nn.softmax))

# Compiling the model
# The model is compiled with an optimizer, a loss function, and evaluation metrics.
model.compile(
    optimizer='adam',  # 'adam' optimizer for adaptive learning rates
    loss='sparse_categorical_crossentropy',  # Loss function for multi-class classification
    metrics=['accuracy']  # Metric to monitor during training and evaluation
)


# Step 5 - Training

## The `model.fit` Method for Training

- **x_train**: Training data or features.
- **y_train**: Target labels.
- **epochs**: Number of times the entire dataset is fed into the model.

During the training process, you can observe the loss and accuracy calculated based on the training data. However, determining the number of epochs is a trial-and-error process. It depends on various factors, such as the dataset size and the complexity of classification. With experience, you will develop an intuition for estimating the appropriate number of epochs required for a specific model and dataset.
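
As a rough piece of arithmetic on what one epoch means in practice: `model.fit` processes the data in batches, and its default batch size is 32 when none is given. The sketch below works out how many weight updates that implies for the 60000 MNIST training images:

```python
import math

num_samples = 60000  # size of the MNIST training set
batch_size = 32      # default batch size of model.fit when none is specified
epochs = 3

# One update per batch; the last batch may be smaller, hence the ceiling
steps_per_epoch = math.ceil(num_samples / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 1875
print(total_updates)    # 5625
```

This is also the step count that the training progress bar shows for each epoch.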


In [None]:
# Training the model
# The model is trained using the training data (x_train) and labels (y_train) for a specified number of epochs.
# An epoch represents one complete pass through the training dataset.
model.fit(x_train, y_train, epochs=3)


# Step 6 - Validation

## Assessing Model Accuracy on New Data

During training, we measured the model's accuracy on the training data, reaching approximately 97%. Now, let's examine how well the model performs on new, unseen data; here, the held-out test set plays that role. It's common for this validation accuracy to be slightly lower than the training accuracy.

> Validation serves a critical purpose in detecting **overfitting**. Overfitting occurs when a model excels on training data but struggles on new, unseen data. A large gap between training and validation accuracy is a sign of overfitting.

Strategies to mitigate overfitting are available and will be explored later in this course.

*Remember, while a modest decrease in validation accuracy is expected, a significant drop signals a need for further investigation.*
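
For intuition about the accuracy figure itself, it can be computed by hand from predicted class probabilities: take the most probable class for each sample and compare it with the true label. The sketch below does this with NumPy on hypothetical values; the helper name `accuracy` is made up for the example:

```python
import numpy as np

def accuracy(probabilities, labels):
    # Predicted class = index of the highest probability in each row
    predictions = np.argmax(probabilities, axis=1)
    return float(np.mean(predictions == labels))

# Hypothetical predicted probabilities for 4 samples over 3 classes
probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.7, 0.1],
                  [0.3, 0.4, 0.3],
                  [0.1, 0.2, 0.7]])
labels = np.array([0, 1, 2, 2])

print(accuracy(probs, labels))  # 0.75
```

Computing this number separately on training data and on unseen data gives the two values whose gap is discussed above.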



In [None]:
# Evaluating the trained model on the test data
# The model's performance is evaluated using the test data (x_test) and test labels (y_test).

# Evaluate the model and store the validation loss and accuracy
val_loss, val_acc = model.evaluate(x_test, y_test)

# Print the validation loss and accuracy
print(f'Validation loss: {val_loss}')
print(f'Validation accuracy: {val_acc}')


# Step 7 - Stopping at Reaching Target Accuracy

## Early Stopping for Desired Accuracy

Consider a scenario where your goal is to achieve a model accuracy of 95%. However, you're uncertain about the number of epochs required to attain this accuracy. You might set a high number of epochs, but if the model reaches the target early on, continuing training wastes time and could lead to overfitting. To address this, you can implement a mechanism that automatically stops training once a target is reached. The callback below uses a loss threshold of 0.05 as its target; the same pattern works for an accuracy threshold such as 95%.



In [None]:
# Callback class which checks on the logs when the epoch ends
class myCallback(tf.keras.callbacks.Callback):
    # This method is called at the end of each epoch
    def on_epoch_end(self, epoch, logs=None):
        # Avoid a mutable default argument and guard against a missing 'loss' entry
        loss = (logs or {}).get('loss')
        # Check if the loss is below a certain threshold
        if loss is not None and loss < 0.05:
            print("\nReached minimal loss, so cancelling training!")
            self.model.stop_training = True

# Create an instance of the callback class
callbacks = myCallback()


In [None]:
# Creating the architecture of the model
# A Sequential model is created, and layers are added to define the neural network architecture.
model = tf.keras.models.Sequential()

# Flattening layer: Reshapes the input data into a 1D array before passing it to the neural network.
model.add(tf.keras.layers.Flatten())

# Dense layers: Fully connected layers with specified number of units and activation functions.
# Here, we add two hidden layers with 128 units each, using ReLU activation.
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))

# Output layer: Final layer with 10 units (one for each digit) and softmax activation for classification.
model.add(tf.keras.layers.Dense(10, activation=tf.nn.softmax))

# Compiling the model
# The model is compiled with an optimizer, a loss function, and evaluation metrics.
model.compile(
    optimizer='adam',  # 'adam' optimizer for adaptive learning rates
    loss='sparse_categorical_crossentropy',  # Loss function for multi-class classification
    metrics=['accuracy']  # Metric to monitor during training and evaluation
)

# Training the model with the custom callback applied
model.fit(x_train, y_train, epochs=50, callbacks=[callbacks])


## Early Stopping for Target Loss

It's important to note that even though we've set the number of epochs to 50, training can finish sooner: as soon as the loss reported at the end of an epoch drops below 0.05, training halts. Alternatively, you can monitor the `accuracy` metric instead of `loss` for early stopping. Feel free to experiment with both approaches.
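
One possible sketch of the accuracy-based variant is shown below. The class name `AccuracyThresholdCallback` and its `target` parameter are hypothetical names chosen for this example; the sketch assumes the model was compiled with `metrics=['accuracy']`, so that the logs contain an `accuracy` entry:

```python
import tensorflow as tf

class AccuracyThresholdCallback(tf.keras.callbacks.Callback):
    """Stops training once training accuracy reaches a target (hypothetical helper)."""

    def __init__(self, target=0.95):
        super().__init__()
        self.target = target

    def on_epoch_end(self, epoch, logs=None):
        # 'accuracy' is only present if the model was compiled with metrics=['accuracy']
        accuracy = (logs or {}).get('accuracy')
        if accuracy is not None and accuracy >= self.target:
            print(f"\nReached {self.target:.0%} accuracy, so stopping training.")
            self.model.stop_training = True

accuracy_callback = AccuracyThresholdCallback(target=0.95)
# Usage: model.fit(x_train, y_train, epochs=50, callbacks=[accuracy_callback])
```

Because the check runs at the end of each epoch, training always completes the epoch in which the target is first reached.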



**In the next tutorial, we will explore the advantages of a Simple Convolutional Neural Network (CNN) over the basic neural network we've discussed here. Stay tuned for deeper insights!**