# Introduction to Deep Learning with TensorFlow and Keras


## Introduction to Deep Learning

Deep Learning is a subset of machine learning that employs algorithms known as neural networks to learn from and make decisions based on vast amounts of data. It is renowned for its ability to process large and complex datasets, with applications ranging from image and speech recognition to natural language processing and autonomous vehicles.


## Setting Up the Environment

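Ensure Python is installed on your system. Then install TensorFlow, which ships with Keras as `tf.keras`, by running the following command in a terminal. To run it from a notebook code cell instead, prefix the command with `!`.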


```bash
pip install tensorflow
```

Now, import the necessary libraries.

```python
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
```

## Basic Concepts in Deep Learning

### Neural Networks

Neural Networks are computational models inspired by the human brain's structure. They consist of layers of nodes or "neurons," interconnected to form a network.
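As a rough illustration of what one layer of neurons computes, here is a minimal pure-Python sketch. The function name `dense_forward` and the weights, biases, and inputs are made up for this example; a real Keras layer learns its weights during training and uses optimized tensor operations.

```python
def dense_forward(inputs, weights, biases):
    """Compute one fully connected layer: ReLU(weighted sum + bias) per neuron."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        # Weighted sum of the inputs plus this neuron's bias
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        # ReLU activation: negative sums are clipped to zero
        outputs.append(max(0.0, z))
    return outputs


# Hypothetical layer with 3 inputs and 2 neurons (values chosen for illustration)
x = [1.0, 2.0, 3.0]
W = [
    [0.5, -1.0, 0.5],   # weights of neuron 0
    [1.0, 1.0, -1.0],   # weights of neuron 1
]
b = [0.5, -1.0]

layer_output = dense_forward(x, W, b)
print(layer_output)  # [0.5, 0.0]
```

Stacking several such layers, each feeding its outputs to the next, gives the layered network described above.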


### Activation Functions

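An activation function transforms the weighted sum of a node's inputs into the node's output, introducing the non-linearity that lets a network model complex relationships. Common examples include ReLU (Rectified Linear Unit), Sigmoid, and Softmax.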
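The three activation functions named above can be sketched in a few lines of plain Python. These are illustrative re-implementations using only the `math` module, not the Keras versions, and the input values are arbitrary.

```python
import math


def relu(z):
    # Passes positive values through and maps negatives to zero
    return max(0.0, z)


def sigmoid(z):
    # Squashes any real number into the interval (0, 1)
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    # Subtracting the maximum keeps exp() from overflowing on large inputs
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    # The results are positive and sum to 1, so they can be read as probabilities
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(relu(-3.0), relu(2.5))  # 0.0 2.5
print(sigmoid(0.0))           # 0.5
print(probs, sum(probs))
```

ReLU is typically used in hidden layers, while softmax is used on the output layer of a multi-class classifier because its outputs form a probability distribution over the classes.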

### Loss Functions

Loss functions measure how well the model's predictions match the target data during training. Common loss functions include Mean Squared Error for regression tasks and Cross-Entropy for classification tasks.
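To make the two loss functions concrete, here is a minimal pure-Python sketch. The helper names and the sample predictions are assumptions made for illustration; Keras provides its own batched implementations.

```python
import math


def mean_squared_error(y_true, y_pred):
    # Average of the squared differences between targets and predictions
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_entropy(true_dist, pred_dist, eps=1e-12):
    # Penalizes low predicted probability on the true class;
    # eps guards against log(0)
    return -sum(t * math.log(max(p, eps)) for t, p in zip(true_dist, pred_dist))


# Regression: errors of 1 and 2 give (1 + 4) / 2 = 2.5
mse = mean_squared_error([3.0, 5.0], [2.0, 7.0])

# Classification: the true class is index 1 (one-hot encoded)
confident_right = cross_entropy([0, 1, 0], [0.1, 0.8, 0.1])
confident_wrong = cross_entropy([0, 1, 0], [0.7, 0.2, 0.1])

print(mse)  # 2.5
print(confident_right < confident_wrong)  # True
```

The lower the loss, the closer the predictions are to the targets, which is why training tries to minimize it.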

### Optimizers

Optimizers are algorithms that adjust the weights of the network to minimize the loss function. Examples include SGD (Stochastic Gradient Descent) and Adam.
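The core idea behind these optimizers is the gradient descent update: move each weight a small step against the gradient of the loss. The sketch below applies that update to a toy one-parameter loss, `(w - 3)^2`, chosen only for illustration. It is plain gradient descent rather than SGD or Adam, which add mini-batch sampling and adaptive step sizes on top of the same update.

```python
def loss(w):
    # Toy loss with its minimum at w = 3
    return (w - 3.0) ** 2


def gradient(w):
    # Derivative of the toy loss with respect to w
    return 2.0 * (w - 3.0)


w = 0.0              # initial weight
learning_rate = 0.1  # step size

for step in range(100):
    # Step against the gradient to reduce the loss
    w -= learning_rate * gradient(w)

print(round(w, 6), loss(w) < 1e-12)  # 3.0 True
```

A learning rate that is too large makes the updates overshoot the minimum, while one that is too small makes convergence slow.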

<img src="./imgs/NN_learning.png" alt="drawing" width="900"/>

[This video](https://www.youtube.com/watch?v=aircAruvnKk&ab_channel=3Blue1Brown) does an exceptional job of introducing how neural networks learn.

## Creating a Simple Deep Learning Model

## I. Data Preparation

In this example, we will work with the MNIST Dataset. 

The MNIST dataset, short for Modified National Institute of Standards and Technology dataset, is one of the most iconic datasets in the field of machine learning and deep learning. Comprising a collection of 70,000 handwritten digits, it is split into a training set of 60,000 examples and a test set of 10,000 examples. Each image is grayscale, 28x28 pixels, and labeled with the digit it represents, ranging from 0 to 9. MNIST serves as a benchmark dataset for evaluating the performance of algorithms in the domain of image recognition. Since its release, it has become a standard dataset for beginners and researchers alike to test and benchmark their machine learning and deep learning models. The simplicity of MNIST allows for quick testing of concepts or algorithms, making it an excellent starting point for anyone new to the field.

<img src="./imgs/mnist.webp" alt="drawing" width="450"/>

Load and preprocess the MNIST dataset:

```python
# Load the 60,000 training and 10,000 test images with their labels
mnist = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Normalize the pixel values of the images from [0, 255] to [0, 1]
train_images = train_images / 255.0
test_images = test_images / 255.0
```

## II. Modeling

Creating a model in `keras` involves several steps:

1. **Create your network architecture** using the `Sequential` class.
2. **Compile your model** by specifying the loss function, optimizer, and metrics.
3. **Fit your model** to the training data.


### 1. Model Initialization Steps

1. **Instantiate Model**: We start by creating a sequential model with `Sequential()`. A sequential model is a linear stack of layers in which each layer feeds the next. Not every neural network has this shape, but for simple feed-forward networks the `Sequential` class is one of the simplest ways to build one. Layers can be added one at a time or, as in the cell that follows, passed as a list to the constructor.
2. **Add Layers**: We build the network from `Flatten()` and `Dense()` layers.
   - **Input layer**: `Flatten` reshapes each 28x28 image into a vector of 784 values. Its `input_shape` describes a single example; the number of examples (the batch dimension) is handled automatically.
   - **First hidden layer**: We specify the number of nodes and use the `ReLU` activation function. The ideal number of nodes is found largely by trial and error; one common starting point is the number of input features, while the model here uses a smaller layer of 128 nodes.
   - **Additional hidden layers**: Any further `Dense` layer automatically takes the output of the previous layer as its input. Because the `Sequential()` class handles layer connections, we would simply add another layer with a specified number of nodes and `ReLU` activation. The model here uses a single hidden layer.
   - **Output layer**: We set the number of nodes equal to the number of classes (10 digits) and use the softmax activation function, as this is a multi-class classification problem.


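```python
model = keras.Sequential([
    # Reshape each 28x28 image into a vector of 784 values
    keras.layers.Flatten(input_shape=(28, 28)),
    # Hidden layer with 128 nodes and ReLU activation
    keras.layers.Dense(128, activation='relu'),
    # Output layer: one node per digit class, softmax probabilities
    keras.layers.Dense(10, activation='softmax')
])
```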
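To get a feel for the size of this model, the number of trainable parameters can be worked out by hand: a `Dense` layer has one weight per input-unit pair plus one bias per unit, and `Flatten` has none. The helper `dense_params` below is a hypothetical name used only for this sketch.

```python
def dense_params(n_inputs, n_units):
    # One weight per (input, unit) pair plus one bias per unit
    return n_inputs * n_units + n_units


flatten_outputs = 28 * 28                            # 784 values per image
hidden_params = dense_params(flatten_outputs, 128)   # 784 * 128 + 128
output_params = dense_params(128, 10)                # 128 * 10 + 10
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)  # 100480 1290 101770
```

Calling `model.summary()` on the model should report the same per-layer counts.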

### 2. Model Compilation Steps

Compiling our model in Keras requires:

1. **Choosing an optimizer**: We typically default to `Adam` because it works well across a wide range of tasks. Adam stands for Adaptive Moment Estimation and combines ideas from the AdaGrad and RMSProp algorithms, which helps it handle sparse gradients and noisy problems. It is computationally efficient and requires little memory. Because it adapts the step size of each parameter during training, it tends to converge quickly and suits most problems without much tuning of the learning rate.
2. **Specifying the loss function**: Gradient descent needs a differentiable loss, so this choice is crucial for neural networks. We use `sparse_categorical_crossentropy` because our problem is multi-class classification (each image is assigned to one of 10 digits) and our labels are integers from 0 to 9 rather than one-hot vectors. This function compares the distribution of the predictions (the outputs of the softmax function in our model) with the true distribution.
3. **Defining metrics**: Metrics like `accuracy` give us a performance indicator to monitor during training. They do not affect training directly but provide insight into model performance and can suggest adjustments to the learning rate or other parameters. In classification tasks, accuracy is a common and intuitive metric: the fraction of examples the model classifies correctly. It is particularly useful on balanced datasets where each class has a roughly equal number of instances. On imbalanced datasets, or when other aspects of the model's performance are critical, complement accuracy with other metrics.
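Two points from the list above can be checked with a small pure-Python sketch: the "sparse" loss takes an integer label but gives the same value as cross-entropy on the equivalent one-hot vector, and accuracy can look good on an imbalanced dataset even for a useless model. The function names, probabilities, and the 95/5 dataset are all made up for illustration.

```python
import math


def sparse_categorical_crossentropy(label, probs):
    # The label is an integer class index, so only that class's probability matters
    return -math.log(probs[label])


def categorical_crossentropy(one_hot, probs):
    # The label is a one-hot vector; terms for the other classes are zero
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))


probs = [0.1, 0.7, 0.2]  # hypothetical softmax output for one example
sparse_loss = sparse_categorical_crossentropy(1, probs)
one_hot_loss = categorical_crossentropy([0, 1, 0], probs)
print(abs(sparse_loss - one_hot_loss) < 1e-12)  # True


def accuracy(labels, predictions):
    # Fraction of predictions that match the labels
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Hypothetical imbalanced dataset: 95 examples of class 0, 5 of class 1
labels = [0] * 95 + [1] * 5
always_zero = [0] * 100  # a "model" that never predicts class 1
acc = accuracy(labels, always_zero)
print(acc)  # 0.95
```

The second result is why accuracy alone can mislead on imbalanced data: the predictor scores 95% while never finding a single class 1 example.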

```python
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```

### 3. Model Training Steps

- **Iterations per Training Epoch**: Consider a hypothetical dataset of 100,000 rows. A 20% validation split leaves 80,000 rows for training; with a batch size of 32, that gives 80,000 / 32 = 2,500 iterations per epoch.
- **Epochs**: Training a neural network involves multiple epochs; in each epoch, the entire training set is passed through the model once. A common starting range is 5-10 epochs. If you train for more epochs to improve accuracy, consider adjusting the learning rate as well.
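The iteration arithmetic can be written as a small helper. The function name `steps_per_epoch` is hypothetical; the last batch may be smaller than the rest, so the count is rounded up.

```python
import math


def steps_per_epoch(n_rows, validation_split, batch_size):
    # Rows held out for validation are not used for weight updates
    n_val = int(n_rows * validation_split)
    n_train = n_rows - n_val
    # A final partial batch still counts as one iteration
    return math.ceil(n_train / batch_size)


hypothetical_steps = steps_per_epoch(100_000, 0.2, 32)
mnist_steps = steps_per_epoch(60_000, 0.2, 64)

print(hypothetical_steps)  # 2500
print(mnist_steps)         # 750
```

The second call uses the settings of this notebook's `fit` call: 60,000 MNIST training images, `validation_split=0.2`, and `batch_size=64`.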

**Interpreting Output**:

- **Loss and Accuracy**: After each epoch, Keras reports the average training loss and accuracy, which show how the model's learning is progressing.
- **Validation Loss and Accuracy**: Calculated at the end of each epoch, these metrics help gauge the model's generalization ability and highlight potential overfitting if the training accuracy significantly surpasses validation accuracy.
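The object returned by `model.fit` stores these per-epoch values in its `history` attribute, a dictionary with keys such as `accuracy`, `val_accuracy`, `loss`, and `val_loss`. The sketch below shows one way to inspect such a dictionary for signs of overfitting. The helper names are hypothetical and the numbers are made up for illustration; they are not results from this notebook.

```python
def best_epoch(history):
    """Return the 1-based epoch with the lowest validation loss."""
    val_losses = history["val_loss"]
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1


def accuracy_gaps(history):
    """Training accuracy minus validation accuracy for each epoch."""
    return [
        round(train - val, 4)
        for train, val in zip(history["accuracy"], history["val_accuracy"])
    ]


# Made-up values shaped like a Keras history dictionary
example_history = {
    "accuracy":     [0.90, 0.95, 0.97, 0.99],
    "val_accuracy": [0.91, 0.94, 0.95, 0.95],
    "val_loss":     [0.30, 0.20, 0.18, 0.21],
}

print(best_epoch(example_history))     # 3
print(accuracy_gaps(example_history))  # the gap widens from epoch to epoch
```

In this made-up run, validation loss starts rising after epoch 3 while training accuracy keeps improving, which is the overfitting pattern described above.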


```python
hist = model.fit(train_images, train_labels, batch_size=64, epochs=10, validation_split=0.2)
```

```text
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
```


### 4. Evaluating the Model

Test the model's performance:

```python
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
print('\nTest accuracy:', test_acc)
```

```text
313/313 - 0s - loss: 0.0753 - accuracy: 0.9774 - 122ms/epoch - 390us/step

Test accuracy: 0.977400004863739
```
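The numbers in this output can be read with a little arithmetic. The sketch below assumes the Keras default batch size of 32 for `evaluate`, since the call above does not set `batch_size`, and uses the reported accuracy rounded to four decimals.

```python
import math

n_test = 10_000          # size of the MNIST test set
default_batch_size = 32  # Keras default when batch_size is not given

# Number of evaluation batches, matching the "313/313" in the output
steps = math.ceil(n_test / default_batch_size)

test_acc = 0.9774
n_correct = round(n_test * test_acc)
n_wrong = n_test - n_correct

print(steps)               # 313
print(n_correct, n_wrong)  # 9774 226
```

In other words, an accuracy of about 97.7% means the model misclassified roughly 226 of the 10,000 test images.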


## Conclusion

This notebook introduced the basics of training a deep learning model using TensorFlow and Keras, from setting up the environment and understanding key concepts to building, training, and evaluating a simple model. For further exploration, consider diving into more complex models, experimenting with different datasets, and exploring the extensive features of TensorFlow and Keras.