# Ch 2 - Mathematical Building Blocks of Neural Networks

Understanding deep learning requires familiarity with many simple mathematical concepts: tensors, tensor operations, differentiation, gradient descent, and so on.

## 2.1 A First Look at a Neural Network



The problem we’re trying to solve here is to classify grayscale images of handwritten digits (28 × 28 pixels) into their 10 categories (0 through 9). We’ll use the MNIST dataset, a classic in the machine-learning community, which has been around almost as long as the field itself and has been intensively studied. It’s a set of 60,000 training images, plus 10,000 test images, assembled by the National Institute of Standards and Technology (the NIST in MNIST) in the 1980s. You can think of “solving” MNIST as the “Hello World” of deep learning—it’s what you do to verify that your algorithms are working as expected.

Note on classes and labels:
- In machine learning, a category in a classification problem is called a class. Data points are called samples. The class associated with a specific sample is called a label.

![MNIST](Images/02_01.jpg)



In [None]:
from keras.datasets import mnist

# Downloads MNIST on first use, then returns (training set), (test set) as NumPy arrays.
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

*train_images* and *train_labels* form the training set, the data that the model will learn from.

The model will then be tested on the test set, *test_images* and *test_labels*.

The images are encoded as NumPy arrays, and the labels are an array of digits, ranging from 0 to 9. The images and labels have a one-to-one correspondence: the label at index *i* belongs to the image at index *i*.
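To make this encoding concrete, here is a minimal sketch that uses small stand-in arrays instead of the real dataset, so it runs without downloading anything. The names `fake_images` and `fake_labels` are hypothetical. The 28 × 28 shape and the `uint8` dtype are meant to mirror what `mnist.load_data()` returns, with 6 samples instead of 60,000.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for train_images / train_labels (6 samples, not 60,000).
fake_images = rng.integers(0, 256, size=(6, 28, 28), dtype=np.uint8)
fake_labels = rng.integers(0, 10, size=(6,), dtype=np.uint8)

# One label per image: the first axis of both arrays indexes the samples.
assert fake_images.shape[0] == fake_labels.shape[0]

# Sample i is the pair (fake_images[i], fake_labels[i]).
i = 3
image, label = fake_images[i], fake_labels[i]
print(image.shape, image.dtype)  # (28, 28) uint8
print(int(label))                # some digit between 0 and 9
```

The pixel values here are random noise, not digits; only the array layout is the point of the sketch.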

#### Training Data:

In [None]:
train_images.shape

In [None]:
len(train_labels)

In [None]:
train_labels

#### Testing Data:

In [None]:
test_images.shape

In [None]:
len(test_labels)

In [None]:
test_labels

The workflow will be as follows: First, we’ll feed the neural network the training data, *train_images* and *train_labels*. The network will then learn to associate images and labels. Finally, we’ll ask the network to produce predictions for *test_images*, and we’ll verify whether these predictions match the labels from *test_labels*.
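The final verification step can be sketched as follows. This is an illustration under stated assumptions, not the notebook's own evaluation code: `predicted_probs` is a hypothetical model output (one row of 10 class probabilities per image) and `true_labels` plays the role of *test_labels*.

```python
import numpy as np

# Hypothetical model output: one row of 10 class probabilities per test image.
predicted_probs = np.array([
    [0.05, 0.05, 0.60, 0.05, 0.05, 0.05, 0.05, 0.05, 0.03, 0.02],  # favors class 2
    [0.90, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.02, 0.01],  # favors class 0
    [0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.05, 0.05, 0.20],  # favors class 9
])
# Hypothetical ground truth, playing the role of test_labels.
true_labels = np.array([2, 0, 4])

# The predicted class is the index of the largest probability in each row.
predicted_labels = predicted_probs.argmax(axis=1)

# Accuracy is the fraction of predictions that match the labels.
accuracy = float(np.mean(predicted_labels == true_labels))
print(predicted_labels)  # [2 0 9]
print(accuracy)          # 2 of the 3 predictions are correct
```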

#### The Network Architecture

In [None]:
from keras import models
from keras import layers

In [None]:
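# A Sequential model is a linear stack of layers.
network = models.Sequential()
# Hidden layer: 512 units with ReLU activation; expects flattened 28 * 28 = 784-value inputs.
network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,)))
# Output layer: 10 softmax units, one probability per digit class (0 through 9).
network.add(layers.Dense(10, activation='softmax'))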
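To show what these two layers compute, here is a rough NumPy sketch of the forward pass. It makes several assumptions that are not part of the notebook: the images are flattened to 784 values and scaled to [0, 1] beforehand, the weights are drawn from a simple normal distribution rather than Keras's own initializer, and the names `W1`, `b1`, `W2`, `b2` are hypothetical. The network is untrained, so the output probabilities are meaningless; the sketch only illustrates the shapes and the arithmetic.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Element-wise max(x, 0).
    return np.maximum(x, 0.0)

def softmax(x):
    # Subtract each row's maximum first for numerical stability.
    shifted = x - x.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Hypothetical batch of 4 images, flattened to 784 values and scaled to [0, 1].
images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
x = images.reshape(4, 28 * 28).astype("float32") / 255

# Randomly initialized, untrained parameters (illustrative values only).
W1 = rng.normal(0.0, 0.05, size=(28 * 28, 512))
b1 = np.zeros(512)
W2 = rng.normal(0.0, 0.05, size=(512, 10))
b2 = np.zeros(10)

hidden = relu(x @ W1 + b1)         # like Dense(512, activation='relu')
probs = softmax(hidden @ W2 + b2)  # like Dense(10, activation='softmax')

n_params = W1.size + b1.size + W2.size + b2.size
print(probs.shape)  # (4, 10)
print(n_params)     # 407050
```

Each row of `probs` sums to 1, so it can be read as a probability distribution over the 10 digit classes.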

## 2.2 Data Representations for Neural Networks

### 2.2.1 Scalars (0D Tensors)

### 2.2.2 Vectors (1D Tensors)

### 2.2.3 Matrices (2D Tensors)

### 2.2.4 3D Tensors and Higher-Dimensional Tensors

### 2.2.5 Key Attributes

### 2.2.6 Manipulating Tensors in NumPy

### 2.2.7 The Notion of Data Batches

### 2.2.8 Real-World Examples of Data Tensors

### 2.2.9 Vector Data

### 2.2.10 Timeseries Data or Sequence Data

### 2.2.11 Image Data

### 2.2.12 Video Data

## 2.3 The Gears of Neural Networks: Tensor Operations

### 2.3.1 Element-Wise Operations

### 2.3.2 Broadcasting

### 2.3.3 Tensor Dot

### 2.3.4 Tensor Reshaping

### 2.3.5 Geometric Interpretation of Tensor Operations

### 2.3.6 A Geometric Interpretation of Deep Learning

## 2.4 The Engine of Neural Networks: Gradient-Based Optimization

### 2.4.1 What's a Derivative?

### 2.4.2 Derivative of a Tensor Operation: The Gradient

### 2.4.3 Stochastic Gradient Descent

### 2.4.4 Chaining Derivatives: The Backpropagation Algorithm

## 2.5 Looking Back at Our First Example