<a href="https://colab.research.google.com/github/Jetsukda/Deep-Learning-with-Python/blob/main/2.%20Before%20we%20begin%3A%20the%20mathematical%20building%20blocks%20of%20neural%20networks.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 2.1 A first look at a neural network

>Note on **classes** and **labels**

- In machine learning, a **category** in a classification problem is called a ***class***.
- **Data points** are called ***samples***.
- The **class associated** with a specific sample is called a ***label***.

Let's look at a concrete example of a neural network that uses the Python library Keras to learn to classify handwritten digits. Unless you already have experience with Keras or similar libraries, you won't understand everything about this first example right away.

**Loading the MNIST dataset in Keras**

In [None]:
from tensorflow.keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()


- `train_images` and `train_labels` form the **training set**.
- `test_images` and `test_labels` form the **test set**.
- The images are encoded as NumPy arrays, and the labels are an array of digits, ranging from 0 to 9.
- The images and labels have a one-to-one correspondence.

In [None]:
# Shape is (number of images, rows, cols): each image is 28x28 pixels.
train_images.shape

(60000, 28, 28)

In [None]:
# 60,000 images.
len(train_images)

60000

In [None]:
# labels range from 0 to 9.
train_labels

array([5, 0, 4, ..., 5, 6, 8], dtype=uint8)

And here's the test data:

In [None]:
# Shape is (number of images, rows, cols): each image is 28x28 pixels.
test_images.shape

(10000, 28, 28)

In [None]:
# 10,000 images.
len(test_images)

10000

In [None]:
# labels range from 0 to 9.
test_labels

array([7, 2, 1, ..., 4, 5, 6], dtype=uint8)

**The workflow will be as follows**:
- First, we'll feed the neural network the training data, `train_images`, `train_labels`.
- The network will then learn to associate images and labels.
- Finally, we'll ask the network to produce predictions for `test_images`, and we'll verify whether these predictions match the labels from `test_labels`.

**The network architecture**

In [None]:
from tensorflow.keras import models
from tensorflow.keras import layers

In [None]:
network = models.Sequential()
network.add(layers.Dense(512, activation="relu", input_shape=(28*28,)))
network.add(layers.Dense(10, activation="softmax"))

The core building block of a neural network is the **layer**, a **data-processing** module that you can think of as a **filter for data**. Some data goes in, and it comes out in a more useful form.

Specifically, layers extract **representations** out of the data fed into them; hopefully, these representations are more meaningful for the problem at hand.

Most of deep learning consists of chaining together simple layers that will implement a form of progressive **data distillation**.
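
To make the idea of chained layers concrete, here is a minimal NumPy sketch of data flowing through two stacked layers with the same shapes as the network defined above. The weights are random placeholders rather than learned values, the helper name `dense_relu` is illustrative (it is not a Keras API), and the sketch stops at the raw scores of the second layer, before its softmax.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

def dense_relu(x, w, b):
    """Affine transform followed by ReLU: max(x @ w + b, 0)."""
    return np.maximum(x @ w + b, 0.0)

# Random placeholder parameters with the shapes used by the network above.
w1, b1 = rng.normal(scale=0.05, size=(28 * 28, 512)), np.zeros(512)
w2, b2 = rng.normal(scale=0.05, size=(512, 10)), np.zeros(10)

batch = rng.random((4, 28 * 28))       # 4 fake flattened images in [0, 1)
hidden = dense_relu(batch, w1, b1)     # first layer: 784 -> 512 features
raw_scores = hidden @ w2 + b2          # second layer before its softmax

n_params = w1.size + b1.size + w2.size + b2.size
print(hidden.shape, raw_scores.shape)  # (4, 512) (4, 10)
print(n_params)                        # 407050
```

Each layer is just a function from one array to another, which is why layers can be chained: the output of the first becomes the input of the second.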

A deep learning model is like a sieve for data processing, made of a succession of increasingly refined data filters: the layers.
- **Dense layers**: Our network consists of a sequence of two `Dense` layers, which are densely connected (also called **fully connected**) neural layers.
    - The last layer is a **10-way softmax** layer, which means it will return an array of 10 probability scores (summing to 1).

        \begin{equation}
        \sigma(z)_i = \frac{\exp(z_i)}{\sum_{j} \exp(z_j)}
        \end{equation}
        <p align="center">
        <img src="https://upload.wikimedia.org/wikipedia/commons/thumb/8/88/Logistic-curve.svg/1200px-Logistic-curve.svg.png" width="700" >
        <figcaption align="center">Fig. 1. The logistic curve, which softmax generalizes to more than two classes</figcaption>
        </p>
        
- To make the network ready for training, we need to pick three more things, as part of the **compilation** step:
    - **Loss function** - How the network will be able to measure its performance on the training data, and thus how it will be able to steer itself in the right direction.
    - **Optimizer** - The mechanism through which the network will update itself based on the data it sees and its **loss function**.
    - **Metrics to monitor during training and testing** - Here, we'll only care about accuracy (the fraction of the images that were correctly classified).
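
The softmax formula above can be checked with a few lines of NumPy. This is a minimal sketch, not the Keras implementation: the function name `softmax` and the ten input scores are made up for illustration, and subtracting the maximum before exponentiating is a common numerical-stability trick that does not change the result.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    shifted = z - np.max(z, axis=-1, keepdims=True)  # avoids overflow in exp
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

# Ten arbitrary raw scores, one per digit class (illustrative values).
scores = np.array([2.0, 1.0, 0.1, 0.0, -1.0, 3.0, 0.5, 0.2, -0.5, 1.5])
probs = softmax(scores)

print(probs.round(3))   # ten positive values
print(probs.sum())      # approximately 1.0
print(probs.argmax())   # 5, the index of the largest score
```

The largest raw score always receives the largest probability, so taking `argmax` of the output gives the predicted class.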


**The compilation step**

In [None]:
network.compile(optimizer="rmsprop", loss="categorical_crossentropy", metrics=["accuracy"])
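
As a rough illustration of what the `categorical_crossentropy` loss and the `accuracy` metric measure, here is a simplified NumPy sketch. It is not the Keras implementation: the function names, the `eps` clipping value, and the three-class toy data are assumptions chosen to keep the example small.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # guard against log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

def accuracy(y_true, y_pred):
    """Fraction of samples whose most probable class matches the label."""
    return float(np.mean(y_true.argmax(axis=1) == y_pred.argmax(axis=1)))

# Toy data: 3 samples, 3 classes, one-hot labels and predicted probabilities.
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype="float32")
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.3, 0.4, 0.3]], dtype="float32")

loss = categorical_crossentropy(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print(round(loss, 4))  # about 0.5946
print(acc)             # 2 of 3 samples are classified correctly
```

The loss penalizes the third sample heavily because the true class received a probability of only 0.3, while accuracy only records that the prediction was wrong.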

**Preparing the image data**

In [None]:
train_images = train_images.reshape(60000, 28*28)
train_images = train_images.astype("float32")/255

test_images = test_images.reshape(10000, 28*28)
test_images = test_images.astype("float32")/255
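
The effect of the reshape and rescale steps can be seen on a small synthetic array. This sketch uses five random `uint8` "images" as a stand-in for MNIST (an assumption, so that it runs without downloading the dataset); the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for MNIST: 5 random 28x28 images with pixel values in [0, 255].
images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

flat = images.reshape(len(images), 28 * 28)   # (5, 28, 28) -> (5, 784)
scaled = flat.astype("float32") / 255         # values now lie in [0, 1]

print(images.shape, images.dtype)  # (5, 28, 28) uint8
print(scaled.shape, scaled.dtype)  # (5, 784) float32
print(scaled.min() >= 0.0, scaled.max() <= 1.0)  # True True
```

Flattening matches the `input_shape=(28*28,)` expected by the first layer, and the division maps the integer range [0, 255] to floats in [0, 1].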

We also need to categorically encode the labels, a step that's explained in chapter 3.

**Preparing the labels**

In [None]:
from tensorflow.keras.utils import to_categorical

In [None]:
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
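
Categorical encoding here means one-hot encoding: each integer label becomes a vector of zeros with a single 1 at the label's index. The sketch below reproduces that idea in plain NumPy for a handful of made-up labels; it assumes 10 classes and is meant as an approximation of what `to_categorical` returns, not as its implementation.

```python
import numpy as np

labels = np.array([5, 0, 4, 1, 9], dtype=np.uint8)  # illustrative labels
num_classes = 10                                    # digits 0 through 9

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes, dtype="float32")[labels]

print(one_hot.shape)  # (5, 10)
print(one_hot[0])     # a single 1.0 at index 5

# argmax recovers the original integer labels.
recovered = one_hot.argmax(axis=1)
```

This layout matches the 10-way softmax output, so the loss can compare the predicted probabilities with the encoded label position by position.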

We're now ready to train the network, which in Keras is done via a call to the network's `fit` method: we fit the model to its training data.

In [29]:
network.fit(train_images, train_labels, epochs=5, batch_size=128)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7f7fa25e4a58>
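
The `epochs=5` and `batch_size=128` arguments determine how many weight updates happen during training. The arithmetic below is a small sketch that assumes one update per mini-batch and that the final, smaller batch of each epoch is kept rather than dropped.

```python
import math

num_samples = 60000   # size of the training set
batch_size = 128
epochs = 5

steps_per_epoch = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 469
print(last_batch_size)  # 96
print(total_updates)    # 2345
```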

# 2.2 Data representations for neural networks

- Multidimensional NumPy arrays are also called **tensors**: a tensor is a container for numbers.
- In general, all current machine learning systems use tensors as their **basic data structure**.
- Tensors are a generalization of matrices to an arbitrary number of dimensions.
    - In the context of tensors, a **dimension** is often called an **axis**.

## 2.2.1 Scalars (0D Tensors)
- A tensor that contains only one number is called a **scalar** (or scalar tensor, or 0D tensor).
- In NumPy, a `float32` or `float64` number is a scalar tensor (or scalar array).
- A scalar tensor has 0 axes (`ndim == 0`).
- The number of axes of a tensor is also called its **rank**.

In [30]:
import numpy as np

In [31]:
# Here's a NumPy scalar
x = np.array(12)

In [32]:
x

array(12)

In [33]:
x.ndim

0
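
The same check works for a NumPy floating-point scalar, not only for a 0D array. In this short sketch the value 12.5 is arbitrary:

```python
import numpy as np

a = np.float32(12.5)   # a NumPy float32 scalar
b = np.array(12.5)     # a 0D array holding a float64

print(a.ndim, a.shape)  # 0 ()
print(b.ndim, b.shape)  # 0 ()
print(b.dtype)          # float64
```

An empty shape tuple `()` is another way of saying that the tensor has no axes.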

## 2.2.2 Vectors (1D tensors)

- An array of numbers is called a ***vector*** or ***1D tensor***.
- A 1D tensor has exactly one axis.

In [34]:
x = np.array([12, 3, 6, 14])

In [35]:
x

array([12,  3,  6, 14])

In [36]:
x.ndim

1

- This vector has four entries and so is called a ***4-dimensional vector***.
- Don't confuse a **4D vector** with a **4D tensor**!
    - A 4D vector has only one axis and has four dimensions along that axis.
    - A 4D tensor has four axes (and may have any number of dimensions along each axis).
- **Dimensionality** can denote either **the number of entries along a specific axis** or **the number of axes in a tensor**.
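
The two meanings of "dimension" can be told apart with `shape` and `ndim`. The sketch below builds a vector with `n` entries and, separately, a tensor with `n` axes; the size of 2 along each axis of the tensor is an arbitrary choice made for illustration.

```python
import numpy as np

vector = np.array([12, 3, 6, 14])
n = len(vector)                # number of entries along the single axis

tensor = np.zeros((2,) * n)    # a tensor with n axes, 2 entries per axis

print(vector.ndim, vector.shape)  # 1 (4,)
print(tensor.ndim, tensor.shape)  # 4 (2, 2, 2, 2)
```

The vector's "dimension" shows up in `shape`, while the tensor's shows up in `ndim`, which is why talking about the number of axes (the rank) is less ambiguous.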

## 2.2.3 Matrices (2D tensors)