## What is PyTorch?

### Machine learning framework

* Deep learning primitives
* NN layer types
* Activation & Loss functions
* Optimizers

### Research prototyping

* Models are Python
* Autograd and "eager mode"

### Production deployment

* TorchScript
* TorchServe
* Quantization

### Tensors

* Tensors can be thought of as multi-dimensional arrays with extra capabilities.
* Computation on tensors runs in compiled C++ code, not in the Python interpreter!
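
To make the "multi-dimensional array" idea concrete, here is a minimal sketch. The variable name `m` and the values are illustrative only.

```python
import torch

# A 2-D tensor built from nested Python lists (2 rows, 3 columns)
m = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

print(m.ndim)    # number of dimensions
print(m.shape)   # size along each dimension
print(m[1, 2])   # indexing works like a nested array: row 1, column 2
print(m.sum())   # the reduction runs in compiled code, not in a Python loop
```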


**PyTorch is Open Source**


## Tensors

In [63]:
# Import the PyTorch library
import torch

### Creating Tensors

In [64]:
# Create a 5x3 tensor, and fill it with zeros
x = torch.zeros(5, 3)
print(x)
print(x.dtype)

tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])
torch.float32


In [65]:
# Create a 5x3 tensor, fill it with ones, and specify the type as 16-bit integer
x = torch.ones(5, 3, dtype=torch.int16)
print(x)

tensor([[1, 1, 1],
        [1, 1, 1],
        [1, 1, 1],
        [1, 1, 1],
        [1, 1, 1]], dtype=torch.int16)
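
A tensor's dtype can also be changed after creation. The sketch below is one way to do it, using hypothetical variable names (`ints`, `floats`, `mixed`). It also shows that combining an integer tensor with a floating-point tensor produces a floating-point result.

```python
import torch

ints = torch.ones(5, 3, dtype=torch.int16)

# .to() returns a new tensor with the requested dtype; `ints` is unchanged
floats = ints.to(torch.float32)

# int16 + float32 is promoted to float32
mixed = ints + torch.zeros(5, 3)

print(ints.dtype, floats.dtype, mixed.dtype)
```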


### Seeding

In [66]:
# Learning weights are usually initialized randomly.
# Setting the seed manually beforehand makes those random values reproducible.
torch.manual_seed(1337)
t1 = torch.rand(5, 3)

torch.manual_seed(666)
t2 = torch.rand(5, 3)

In [67]:
print(f't1 \n {t1}')
print(f't2 \n {t2}')

t1 
 tensor([[0.0783, 0.4956, 0.6231],
        [0.4224, 0.2004, 0.0287],
        [0.5851, 0.6967, 0.1761],
        [0.2595, 0.7086, 0.5809],
        [0.0574, 0.7669, 0.8778]])
t2 
 tensor([[0.3119, 0.2701, 0.1118],
        [0.1012, 0.1877, 0.0181],
        [0.3317, 0.0846, 0.5732],
        [0.0079, 0.2520, 0.5518],
        [0.8785, 0.5281, 0.4961]])
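
The point of seeding is reproducibility: the same seed gives the same "random" values. A small sketch with illustrative variable names:

```python
import torch

torch.manual_seed(1337)
a = torch.rand(5, 3)

torch.manual_seed(1337)      # reset the generator to the same state
b = torch.rand(5, 3)         # same values as `a`

c = torch.rand(5, 3)         # drawn without reseeding, so the values differ

print(torch.equal(a, b))
print(torch.equal(b, c))
```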


### Tensor Operations

In [68]:
# Print out operations between two tensors
print(f't1 + t2 \n{t1 + t2}')
print(f't1 - t2 \n{t1 - t2}')
print(f't1 * t2 \n{t1 * t2}')
print(f't1 / t2 \n{t1 / t2}')

t1 + t2 
tensor([[0.3902, 0.7658, 0.7349],
        [0.5236, 0.3881, 0.0467],
        [0.9168, 0.7813, 0.7494],
        [0.2674, 0.9606, 1.1327],
        [0.9358, 1.2950, 1.3739]])
t1 - t2 
tensor([[-0.2336,  0.2255,  0.5113],
        [ 0.3212,  0.0127,  0.0106],
        [ 0.2535,  0.6121, -0.3971],
        [ 0.2516,  0.4565,  0.0292],
        [-0.8211,  0.2387,  0.3818]])
t1 * t2 
tensor([[0.0244, 0.1339, 0.0697],
        [0.0427, 0.0376, 0.0005],
        [0.1941, 0.0590, 0.1010],
        [0.0021, 0.1786, 0.3205],
        [0.0504, 0.4050, 0.4355]])
t1 / t2 
tensor([[ 0.2511,  1.8347,  5.5743],
        [ 4.1747,  1.0678,  1.5857],
        [ 1.7642,  8.2331,  0.3073],
        [32.8512,  2.8115,  1.0529],
        [ 0.0653,  1.4520,  1.7696]])


In [69]:
# Create a third tensor of a different size than the others
torch.manual_seed(420)
t3 = torch.rand(7, 7)
print(t3)

tensor([[0.8054, 0.1990, 0.9759, 0.1028, 0.3475, 0.1554, 0.8856],
        [0.6876, 0.2506, 0.1133, 0.2105, 0.4035, 0.2448, 0.8644],
        [0.2896, 0.1729, 0.3458, 0.0117, 0.2572, 0.2272, 0.6076],
        [0.9066, 0.5540, 0.2086, 0.7058, 0.2871, 0.2633, 0.4042],
        [0.2391, 0.5550, 0.9059, 0.5682, 0.8020, 0.0656, 0.1067],
        [0.4335, 0.5005, 0.8121, 0.0603, 0.7086, 0.0708, 0.5807],
        [0.8304, 0.5690, 0.6596, 0.8179, 0.9947, 0.1862, 0.6638]])


In [70]:
# This won't work: shapes (5, 3) and (7, 7) are incompatible for element-wise addition
print(f't1 + t3 \n {t1 + t3}')

RuntimeError: The size of tensor a (3) must match the size of tensor b (7) at non-singleton dimension 1
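
The phrase "non-singleton dimension" in the error hints at PyTorch's broadcasting rule: shapes do not have to be identical, but each pair of dimensions must either match or have size 1 in one of the tensors. A minimal sketch with made-up tensors (`t`, `row`, `col`):

```python
import torch

t = torch.ones(5, 3)

row = torch.tensor([[10.0, 20.0, 30.0]])   # shape (1, 3): stretched across the 5 rows
col = torch.arange(5.0).reshape(5, 1)      # shape (5, 1): stretched across the 3 columns

print((t + row).shape)
print((t + col).shape)

# (5, 3) and (7, 7) disagree in both dimensions and neither size is 1,
# so the addition raises a RuntimeError.
try:
    t + torch.ones(7, 7)
    raised = False
except RuntimeError as err:
    raised = True
    print("cannot broadcast:", err)
```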

In [None]:
ones = torch.ones(2, 3)
print(ones)

twos = torch.ones(2, 3) * 2
print(twos)

threes = torch.ones(2, 3) * 3
print(threes)

fours = threes + 1
print(fours)
print(fours.shape)

tensor([[1., 1., 1.],
        [1., 1., 1.]])
tensor([[2., 2., 2.],
        [2., 2., 2.]])
tensor([[3., 3., 3.],
        [3., 3., 3.]])
tensor([[4., 4., 4.],
        [4., 4., 4.]])
torch.Size([2, 3])


In [None]:
# Here are some of the other math operations available

# Create a random matrix with values in the range [-1, 1)
r = (torch.rand(2, 2) - 0.5) * 2
print('r')
print(r)


r
tensor([[ 0.2681, -0.9185],
        [-0.1245,  0.2122]])
Absolute value of r
tensor([[0.2681, 0.9185],
        [0.1245, 0.2122]])
Inverse sine (asin) of r
tensor([[ 0.2714, -1.1642],
        [-0.1248,  0.2138]])
Determinant of r
tensor(-0.0574)
Singular value decomposition of r
torch.return_types.svd(
U=tensor([[-0.9701,  0.2428],
        [ 0.2428,  0.9701]]),
S=tensor([0.9862, 0.0582]),
V=tensor([[-0.2944, -0.9557],
        [ 0.9557, -0.2944]]))
Standard deviation and average of r
(tensor(0.5468), tensor(-0.1407))
Max value of r
tensor(0.2681)


#### Absolute Value

In [None]:
print(torch.abs(r))

tensor([[0.2681, 0.9185],
        [0.1245, 0.2122]])


#### Inverse Sine (asin)

In [None]:

print('Inverse sine (asin) of r')
print(torch.asin(r))


#### Determinant

In [None]:

print('Determinant of r')
print(torch.det(r))


#### Singular value decomposition (svd)

In [None]:

print('Singular value decomposition of r')
print(torch.svd(r))
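
One way to sanity-check a decomposition is to multiply the factors back together. The sketch below assumes a PyTorch version that provides `torch.linalg.svd`, which newer releases recommend over `torch.svd`. Unlike `torch.svd`, it returns the already-transposed factor `Vh` rather than `V`. The matrix `a` is a stand-in for `r`.

```python
import torch

torch.manual_seed(0)
a = (torch.rand(2, 2) - 0.5) * 2

# U and Vh are orthogonal matrices; S holds the singular values
U, S, Vh = torch.linalg.svd(a)

# Rebuild the matrix: a ≈ U · diag(S) · Vh
reconstructed = U @ torch.diag(S) @ Vh

print(S)
print(torch.allclose(reconstructed, a, atol=1e-5))
```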


#### Standard deviation and average

In [None]:

# torch.std_mean returns a tuple in the order (standard deviation, mean)
print('Standard deviation and average of r')
print(torch.std_mean(r))
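
Because `torch.std_mean` returns the standard deviation first and the mean second, unpacking the tuple into named variables avoids mixing them up. A short sketch on a hand-checkable tensor (`v` is illustrative):

```python
import torch

v = torch.tensor([1.0, 2.0, 3.0, 4.0])

# Order matters: standard deviation first, then mean
std, mean = torch.std_mean(v)

print(f"mean = {mean.item():.4f}")
print(f"std  = {std.item():.4f}")   # sample standard deviation (divides by n - 1)
```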


#### Max value

In [None]:

print('Max value of r')
print(torch.max(r))

See [the official documentation](https://pytorch.org/docs/stable/torch.html) for a complete list of `torch` functions.

### Models

In [None]:
import torch                        # for all things PyTorch
import torch.nn as nn               # for torch.nn.Module, the parent class for PyTorch models
import torch.nn.functional as F     # for the activation function

## LeNet-5

LeNet-5 is one of the earliest convolutional neural nets. It was built to read small images (32x32) of handwritten digits and correctly classify which digit was represented in the image.

![A diagram of the LeNet-5 neural network](images/mnist.png "Figure: LeNet-5")

* Layer C1 is a convolutional layer, meaning that it scans the input image for features it learned during training. It outputs a map of where it saw each of its learned features in the image. This “activation map” is downsampled in layer S2.

* Layer C3 is another convolutional layer, this time scanning the downsampled activation map from S2 for combinations of features. It also outputs an activation map describing the spatial locations of these feature combinations, which is downsampled in layer S4.

* Finally, the fully-connected layers at the end, F5, F6, and OUTPUT, are a classifier that takes the final activation map, and classifies it into one of ten bins representing the 10 digits.
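
To see how the image shrinks on its way through these layers, the spatial sizes can be worked out by hand. The sketch below assumes 3x3 convolutions without padding and 2x2 pooling, matching the layer sizes used in the class in this section. The helper names `conv_out` and `pool_out` are hypothetical.

```python
def conv_out(size, kernel):
    """Output width of an unpadded, stride-1 convolution."""
    return size - kernel + 1

def pool_out(size, window):
    """Output width of non-overlapping pooling (rounds down)."""
    return size // window

size = 32                      # input image is 32x32
c1 = conv_out(size, 3)         # C1: convolution
s2 = pool_out(c1, 2)           # S2: downsampling
c3 = conv_out(s2, 3)           # C3: convolution
s4 = pool_out(c3, 2)           # S4: downsampling

# 16 feature maps of s4 x s4 values feed the first fully connected layer
flat_features = 16 * s4 * s4

print(c1, s2, c3, s4, flat_features)
```

Under these assumptions the final value is the `16 * 6 * 6` input size of the first fully connected layer.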

### LeNet-5 Class

This is how we can express LeNet in code:

In [None]:
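class LeNet(nn.Module):      # extend nn.Module, the base class for PyTorch models

    def __init__(self):
        super().__init__()   # required so nn.Module can register the layers below
        # 1 input image channel (black & white), 6 output channels, 3x3 square convolution
        self.conv1 = nn.Conv2d(1, 6, 3)
        # 6 input channels, 16 output channels, 3x3 square convolution
        self.conv2 = nn.Conv2d(6, 16, 3)
        # fully connected layers; 6x6 is the spatial size left after the second pooling step
        self.fc1 = nn.Linear(16 * 6 * 6, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # max pooling over a 2x2 window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # a single number is enough when the window is square
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        # flatten everything except the batch dimension
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]     # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features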
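
A quick way to check a model like this is to push a dummy image through it and look at the output shape. So that the snippet runs on its own, the sketch below rebuilds the same layer sizes as an `nn.Sequential` stack. This is an illustrative equivalent, not the class above, and its weights are random and untrained.

```python
import torch
import torch.nn as nn

# Same layer sizes as the LeNet class, written as a sequential stack (illustrative)
net = nn.Sequential(
    nn.Conv2d(1, 6, 3), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(6, 16, 3), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),                       # keep the batch dimension, flatten the rest
    nn.Linear(16 * 6 * 6, 120), nn.ReLU(),
    nn.Linear(120, 84), nn.ReLU(),
    nn.Linear(84, 10),
)

# A stand-in for one 32x32 black & white image: (batch, channels, height, width)
batch = torch.rand(1, 1, 32, 32)

out = net(batch)
print(out.shape)   # one score per digit class for each image in the batch
```

The leading dimension of the input is the batch size. PyTorch layers expect it even when only a single image is passed.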
