# Live Coding 1 🎥

## Deep Learning Basics with PyTorch

What we're going to cover:

- [torch.Tensor](https://pytorch.org/docs/stable/tensors.html): High-dimensional arrays, similar to NumPy arrays.
 
- [nn Modules](https://pytorch.org/docs/stable/nn.html):
  - [nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.html): Base class for all neural network modules.
  - [nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html): Applies a linear transformation to the incoming data.
  - [nn.Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html): Applies a 2D convolution over an input signal.

- [Activation Functions](https://pytorch.org/docs/stable/nn.html#non-linear-activations):
  - [nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html): Rectified Linear Unit.
  - [nn.Sigmoid](https://pytorch.org/docs/stable/generated/torch.nn.Sigmoid.html): Sigmoid function.
  - [nn.Tanh](https://pytorch.org/docs/stable/generated/torch.nn.Tanh.html): Hyperbolic tangent function.

- [Optim](https://pytorch.org/docs/stable/optim.html): Optimization algorithms.
  - [torch.optim.SGD](https://pytorch.org/docs/stable/optim.html#torch.optim.SGD): Stochastic Gradient Descent.
  - [torch.optim.Adam](https://pytorch.org/docs/stable/optim.html#torch.optim.Adam): Adam algorithm.

- [Loss Functions](https://pytorch.org/docs/stable/nn.html#loss-functions):
  - [nn.CrossEntropyLoss](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html): Combines `nn.LogSoftmax` and `nn.NLLLoss`.
  - [nn.MSELoss](https://pytorch.org/docs/stable/generated/torch.nn.MSELoss.html): Mean squared error loss.
  - [nn.BCELoss](https://pytorch.org/docs/stable/generated/torch.nn.BCELoss.html): Binary cross entropy loss.

- [Data Loading and Processing](https://pytorch.org/docs/stable/data.html):
  - [torch.utils.data.DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader): Data loader.
  - [torchvision.datasets](https://pytorch.org/vision/stable/datasets.html): Standard datasets.
  - [torchvision.transforms](https://pytorch.org/vision/stable/transforms.html): Common image transformations.

- [Training Loop](https://pytorch.org/tutorials/beginner/basics/optimization_tutorial.html): Putting it all together to train a model.
- [Saving and Loading Models](https://pytorch.org/tutorials/beginner/saving_loading_models.html): Persisting and restoring models.

In [65]:
import torch

## Tensor

In [66]:
s = torch.tensor(1)
s

tensor(1)

In [67]:
s.ndim

0

In [68]:
s.shape

torch.Size([])

In [69]:
vec = torch.Tensor([1, 2, 4])
vec

tensor([1., 2., 4.])

In [70]:
vec.ndim

1

In [71]:
vec.shape

torch.Size([3])
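
The cells above use both `torch.tensor` (lowercase) and `torch.Tensor` (uppercase), which is why `s` prints as `tensor(1)` while `vec` prints with decimal points. A minimal sketch of the difference, assuming the default dtype has not been changed (the names `ints`, `floats`, and `explicit` are illustrative only):

```python
import torch

# torch.tensor infers the dtype from the data it is given.
ints = torch.tensor([1, 2, 4])

# torch.Tensor is the legacy constructor and always uses the default
# floating-point dtype (float32 unless it has been changed).
floats = torch.Tensor([1, 2, 4])

# Passing dtype explicitly keeps the intent visible.
explicit = torch.tensor([1, 2, 4], dtype=torch.float32)

print(ints.dtype)                      # torch.int64
print(floats.dtype)                    # torch.float32
print(torch.equal(floats, explicit))   # True
```

Layers such as `nn.Linear` expect floating-point inputs, so an integer tensor usually needs an explicit `dtype` or a call to `.float()` first.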

In [72]:
matrix = torch.Tensor([[1, 2, 3],
                       [3, 4, 4]])
matrix

tensor([[1., 2., 3.],
        [3., 4., 4.]])

In [73]:
matrix.shape

torch.Size([2, 3])

In [74]:
matrix.ndim

2

In [75]:
tensor = torch.Tensor([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]).reshape(2, 3, 2)
tensor

tensor([[[ 1.,  2.],
         [ 3.,  4.],
         [ 5.,  6.]],

        [[ 7.,  8.],
         [ 9., 10.],
         [11., 12.]]])

In [76]:
tensor.ndim

3

In [77]:
tensor.shape

torch.Size([2, 3, 2])

In [78]:
tensor[1][1][1]

tensor(10.)

## Tensor Operations 

In [79]:
a = torch.Tensor([1, 2, 3])
b = torch.Tensor([3, -2, 5])

In [80]:
a + b 

tensor([4., 0., 8.])

In [81]:
a - b 

tensor([-2.,  4., -2.])

In [82]:
a * b  # element-wise product

tensor([ 3., -4., 15.])

In [83]:
a / b

tensor([ 0.3333, -1.0000,  0.6000])

In [84]:
torch.dot(a, b)

tensor(14.)

In [85]:
m1 = torch.Tensor([[1, 2, 3],
                   [3, 4, 4]])

m2 = torch.Tensor([[1, 2, 3],
                  [3, 4, 4]])

In [86]:
m1

tensor([[1., 2., 3.],
        [3., 4., 4.]])

In [87]:
m1.t()

tensor([[1., 3.],
        [2., 4.],
        [3., 4.]])

In [88]:
m1.shape

torch.Size([2, 3])

In [89]:
m1.t().shape

torch.Size([3, 2])

In [90]:
torch.mm(m1, m2.t())

tensor([[14., 23.],
        [23., 41.]])

In [91]:
m1 @ m2.t()

tensor([[14., 23.],
        [23., 41.]])

In [92]:
tensor = torch.Tensor([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]).reshape(2, 3, 2)
tensor

tensor([[[ 1.,  2.],
         [ 3.,  4.],
         [ 5.,  6.]],

        [[ 7.,  8.],
         [ 9., 10.],
         [11., 12.]]])

In [93]:
tensor.shape

torch.Size([2, 3, 2])

In [94]:
m1.shape

torch.Size([2, 3])

In [95]:
res = tensor @ m1
res

tensor([[[ 7., 10., 11.],
         [15., 22., 25.],
         [23., 34., 39.]],

        [[31., 46., 53.],
         [39., 58., 67.],
         [47., 70., 81.]]])

In [96]:
res.shape

torch.Size([2, 3, 3])
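
In `tensor @ m1`, the 2D matrix is broadcast across the leading (batch) dimension, so each 3x2 slice is multiplied by the same 2x3 matrix. The sketch below rebuilds the same values under illustrative names (`batch`, `m`, `looped`) and checks the batched result against an explicit loop:

```python
import torch

batch = torch.arange(1., 13.).reshape(2, 3, 2)   # two 3x2 matrices
m = torch.tensor([[1., 2., 3.],
                  [3., 4., 4.]])                  # one 2x3 matrix

# Batched matmul: (2, 3, 2) @ (2, 3) -> (2, 3, 3)
batched = batch @ m

# Same computation, one 3x2 slice at a time.
looped = torch.stack([batch[i] @ m for i in range(batch.shape[0])])

print(batched.shape)                     # torch.Size([2, 3, 3])
print(torch.allclose(batched, looped))   # True
```

Only the last two dimensions take part in the matrix product; every dimension before them is treated as a batch dimension.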

## Other Operations

In [99]:
tensor = torch.Tensor([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
tensor

tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.])

In [100]:
tensor.shape

torch.Size([12])

In [102]:
tensor.reshape(3, 4) # 3 x 4 = 12 

tensor([[ 1.,  2.,  3.,  4.],
        [ 5.,  6.,  7.,  8.],
        [ 9., 10., 11., 12.]])

In [104]:
tensor.reshape(3, 2, 2)

tensor([[[ 1.,  2.],
         [ 3.,  4.]],

        [[ 5.,  6.],
         [ 7.,  8.]],

        [[ 9., 10.],
         [11., 12.]]])

In [105]:
m1

tensor([[1., 2., 3.],
        [3., 4., 4.]])

In [106]:
m1.reshape(6)

tensor([1., 2., 3., 3., 4., 4.])

In [109]:
m1.reshape(3, 1, 2)

tensor([[[1., 2.]],

        [[3., 3.]],

        [[4., 4.]]])

In [110]:
m1

tensor([[1., 2., 3.],
        [3., 4., 4.]])

In [111]:
m1.flatten()

tensor([1., 2., 3., 3., 4., 4.])
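
`reshape` only works when the new shape has the same number of elements as the original. As a small sketch (variable names are illustrative), one dimension can be given as `-1` so that PyTorch infers it, and a mismatched shape raises an error:

```python
import torch

m = torch.tensor([[1., 2., 3.],
                  [3., 4., 4.]])

# -1 asks PyTorch to infer that dimension from the element count (6 here).
cols = m.reshape(-1, 2)   # shape (3, 2)
flat = m.reshape(-1)      # shape (6,), same values as m.flatten()

print(cols.shape)
print(torch.equal(flat, m.flatten()))   # True

# 4 * 2 = 8 does not match the 6 elements in m, so this fails.
try:
    m.reshape(4, 2)
except RuntimeError as err:
    print("cannot reshape:", err)
```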

### unsqueeze

In [112]:
vec = torch.Tensor([1, 2, 3])
vec

tensor([1., 2., 3.])

In [113]:
vec.t()

tensor([1., 2., 3.])

In [114]:
vec.shape

torch.Size([3])

In [115]:
vec.ndim

1

In [118]:
vecs = vec.unsqueeze(dim=0)
vecs

tensor([[1., 2., 3.]])

In [120]:
vec

tensor([1., 2., 3.])

In [119]:
vecs

tensor([[1., 2., 3.]])

In [121]:
vecs.shape

torch.Size([1, 3])

In [122]:
vec.shape

torch.Size([3])

In [123]:
vecs

tensor([[1., 2., 3.]])

In [124]:
vecs.t()

tensor([[1.],
        [2.],
        [3.]])

In [125]:
vec.t()

tensor([1., 2., 3.])
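
As the cells above show, `.t()` leaves a 1D tensor unchanged, so a column vector needs an extra axis first. A short sketch of the usual options (the names `row`, `col`, `also_row`, and `back` are illustrative):

```python
import torch

vec = torch.tensor([1., 2., 3.])

row = vec.unsqueeze(dim=0)    # shape (1, 3): new axis in front
col = vec.unsqueeze(dim=1)    # shape (3, 1): new axis at the end
also_row = vec[None, :]       # indexing with None also inserts an axis
back = col.squeeze(dim=1)     # squeeze removes a size-1 axis again

print(row.shape, col.shape)
print(torch.equal(row.t(), col))   # True
print(torch.equal(back, vec))      # True
```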

## Tensor Creation

In [131]:
torch.zeros(6, 3)

tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])

In [132]:
torch.ones(2, 3)

tensor([[1., 1., 1.],
        [1., 1., 1.]])

In [135]:
torch.full((3, 2), 55)

tensor([[55, 55],
        [55, 55],
        [55, 55]])

In [137]:
rd = torch.randn(3, 2)
rd

tensor([[ 0.3583,  1.0882],
        [-2.4244, -0.7644],
        [-0.0251,  0.5025]])
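
Two details of the creation functions above are easy to miss: `torch.full` takes its dtype from the fill value (hence the integer output for `55`), and `torch.randn` returns different values on every call unless the random seed is fixed. A minimal sketch, where the seed value `0` and the variable names are arbitrary choices for illustration:

```python
import torch

# Fixing the seed makes the random draws reproducible.
torch.manual_seed(0)
first = torch.randn(3, 2)
torch.manual_seed(0)
second = torch.randn(3, 2)
print(torch.equal(first, second))   # True

filled_int = torch.full((3, 2), 55)       # integer fill value -> int64
filled_float = torch.full((3, 2), 55.0)   # float fill value -> default float dtype
print(filled_int.dtype, filled_float.dtype)

# *_like functions copy the shape and dtype of an existing tensor.
like = torch.zeros_like(first)
print(like.shape)
```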

## nn Module

In [138]:
import torch.nn as nn

### nn.Linear

In [146]:
lin1 = nn.Linear(
    in_features=4,
    out_features=5
)

In [147]:
input = torch.Tensor([1, 2, 3, 4])
input

tensor([1., 2., 3., 4.])

In [152]:
res1 = lin1(input)
res1

tensor([ 0.6527, -0.0510,  0.6981,  3.9205, -0.6515], grad_fn=<ViewBackward0>)

In [153]:
res1.shape

torch.Size([5])

In [149]:
A = lin1.weight
A

Parameter containing:
tensor([[ 0.2994,  0.3813,  0.3518, -0.2679],
        [ 0.1156, -0.4725, -0.3124,  0.4607],
        [-0.4372, -0.0886, -0.1835,  0.4572],
        [ 0.4710,  0.1227,  0.4908,  0.3300],
        [ 0.0223, -0.0151, -0.3965,  0.2584]], requires_grad=True)

In [150]:
b = lin1.bias
b

Parameter containing:
tensor([-0.3931, -0.1271,  0.0340,  0.4120, -0.4876], requires_grad=True)

In [151]:
lin2 = nn.Linear(
    in_features=5,
    out_features=3
)

In [156]:
res2 = lin2(res1)
res2

tensor([-1.1904,  0.6140,  0.3755], grad_fn=<ViewBackward0>)

In [155]:
res1.shape

torch.Size([5])

In [157]:
res2.shape

torch.Size([3])

In [158]:
lin3 = nn.Linear(3, 2)

In [159]:
res3 = lin3(res2)
res3

tensor([ 0.1413, -0.2351], grad_fn=<ViewBackward0>)
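
`nn.Linear` stores a weight matrix of shape `(out_features, in_features)` and a bias of shape `(out_features,)`, and computes `x @ weight.T + bias`. The sketch below checks this by hand on a freshly created layer (the seed and the names `layer`, `x`, and `manual` are illustrative, so the values differ from the outputs above):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)   # arbitrary seed, only for reproducibility
layer = nn.Linear(in_features=4, out_features=5)
x = torch.tensor([1., 2., 3., 4.])

print(layer.weight.shape)   # torch.Size([5, 4])
print(layer.bias.shape)     # torch.Size([5])

# Reproduce the layer's forward pass by hand.
manual = x @ layer.weight.t() + layer.bias
print(torch.allclose(layer(x), manual, atol=1e-6))   # True
```

This is also why chaining layers requires each layer's `in_features` to match the previous layer's `out_features`.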

### nn.Sequential

In [160]:
input

tensor([1., 2., 3., 4.])

In [161]:
model = nn.Sequential(
    nn.Linear(4, 5),
    nn.Linear(5, 3),
    nn.Linear(3, 2)
)

In [162]:
model

Sequential(
  (0): Linear(in_features=4, out_features=5, bias=True)
  (1): Linear(in_features=5, out_features=3, bias=True)
  (2): Linear(in_features=3, out_features=2, bias=True)
)

In [163]:
model(input)

tensor([-1.2047, -0.3935], grad_fn=<ViewBackward0>)
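
A stack of `nn.Linear` layers with nothing in between is still a single linear (affine) map, which is one way to motivate the activation functions in the next section. The following sketch, using a smaller two-layer stack and illustrative names, collapses the layers into one weight matrix and one bias and compares the results:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)   # arbitrary seed, only for reproducibility
l1, l2 = nn.Linear(4, 5), nn.Linear(5, 2)
stack = nn.Sequential(l1, l2)
x = torch.randn(3, 4)   # a batch of 3 inputs

with torch.no_grad():
    # l2(l1(x)) = x @ (W2 @ W1).T + (W2 @ b1 + b2)
    W = l2.weight @ l1.weight            # (2, 5) @ (5, 4) -> (2, 4)
    b = l2.weight @ l1.bias + l2.bias    # (2,)
    collapsed = x @ W.t() + b
    stacked = stack(x)

print(torch.allclose(stacked, collapsed, atol=1e-5))   # True
```

Inserting a non-linearity such as `nn.ReLU()` between the layers breaks this equivalence, which is what lets deeper networks represent more than a single linear map.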

## Activation Functions 

$$ \text{ReLU}(x) = \max(0, x)$$

In [165]:
relu = nn.ReLU()

In [167]:
input = torch.Tensor([1, 2, -3, 4])
input

tensor([ 1.,  2., -3.,  4.])

In [168]:
relu(input)

tensor([1., 2., 0., 4.])

$$ \text{Sigmoid}(x) = \frac{1}{1 + e^{-x}} \in (0, 1)$$

In [170]:
sigmoid = nn.Sigmoid()

In [171]:
sigmoid(input)

tensor([0.7311, 0.8808, 0.0474, 0.9820])

In [198]:
model = nn.Sequential(
    nn.Linear(4, 5),
    nn.ReLU(),
    nn.Linear(5, 3),
    nn.ReLU(),
    nn.Linear(3, 2),
    nn.ReLU()
)

In [199]:
input = torch.randn(1, 4)
input

tensor([[ 0.2212, -0.0319,  1.3592, -0.6210]])

In [200]:
model(input)

tensor([[0.2481, 0.0000]], grad_fn=<ReluBackward0>)

In [202]:
def sigmoid(input: torch.Tensor):
    return 1 / (1 + torch.exp(- input))

In [203]:
input = torch.randn(1, 4)
input

tensor([[ 0.0329, -0.0703,  0.4665,  0.1911]])

In [204]:
sigmoid(input)

tensor([[0.5082, 0.4824, 0.6146, 0.5476]])
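
To confirm that a hand-written sigmoid matches the built-in one, the two can be compared directly. A small sketch, where `manual_sigmoid` is an illustrative name chosen so that it does not shadow anything defined above:

```python
import torch

def manual_sigmoid(x: torch.Tensor) -> torch.Tensor:
    """Sigmoid written out from its definition: 1 / (1 + e^(-x))."""
    return 1 / (1 + torch.exp(-x))

x = torch.tensor([1., 2., -3., 4.])

print(manual_sigmoid(x))
print(torch.allclose(manual_sigmoid(x), torch.sigmoid(x), atol=1e-6))   # True
```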

## Custom nn.Module Classes

In [209]:
class NeuralNetwork(nn.Module):

    def __init__(self):
        super(NeuralNetwork, self).__init__()

        self.lin1 = nn.Linear(4, 5)
        self.lin2 = nn.Linear(5, 3)
        self.lin3 = nn.Linear(3, 2)

    def forward(self, x: torch.Tensor):
        res1 = self.lin1(x)

        res2 = self.lin2(res1)

        res3 = self.lin3(res2)

        return res3

In [210]:
model = NeuralNetwork()

In [211]:
input = torch.randn(1, 4)
input

tensor([[ 0.0965, -0.7289, -0.7876,  1.3844]])

In [212]:
model(input)

tensor([[ 0.8351, -0.2027]], grad_fn=<AddmmBackward0>)

In [221]:
class NeuralNetworkV2(nn.Module):

    def __init__(self):
        super(NeuralNetworkV2, self).__init__()
    
        self.mlp = nn.Sequential(
            nn.Linear(4, 5),
            nn.ReLU(),
            nn.Linear(5, 3),
            nn.ReLU(),
            nn.Linear(3, 2),
            nn.ReLU()
        )

        self.fc = nn.Linear(2, 1)

    def forward(self, x: torch.Tensor):
        
        res1 = self.mlp(x)

        res2 = self.fc(res1)

        return res2

In [222]:
model2 = NeuralNetworkV2()

In [223]:
model2(input)

tensor([[0.2383]], grad_fn=<AddmmBackward0>)
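
Any `nn.Module` registers the parameters of its sub-modules automatically, so they can be listed and counted. The sketch below uses an `nn.Sequential` with the same layer sizes as `NeuralNetworkV2` (4 → 5 → 3 → 2 → 1) as a stand-in, so it runs on its own:

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(4, 5), nn.ReLU(),
    nn.Linear(5, 3), nn.ReLU(),
    nn.Linear(3, 2), nn.ReLU(),
    nn.Linear(2, 1),
)

# Each Linear layer contributes a weight matrix and a bias vector.
for name, param in model.named_parameters():
    print(name, tuple(param.shape))

# (4*5 + 5) + (5*3 + 3) + (3*2 + 2) + (2*1 + 1) = 54
total = sum(p.numel() for p in model.parameters())
print(total)   # 54
```

These are the tensors an optimizer receives when it is given `model.parameters()`.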

In [224]:
class CosActivation(nn.Module):

    def __init__(self):
        super(CosActivation, self).__init__()

    def forward(self, x: torch.Tensor):
        return torch.cos(x)

In [225]:
cosact = CosActivation()

In [226]:
input

tensor([[ 0.0965, -0.7289, -0.7876,  1.3844]])

In [228]:
cosact(input)

tensor([[0.9953, 0.7459, 0.7055, 0.1853]])

In [230]:
model = nn.Sequential(
    nn.Linear(4, 5),
    CosActivation(),
    nn.Linear(5, 1),
    CosActivation()
)

In [233]:
model(input)

tensor([[0.9978]], grad_fn=<CosBackward0>)

In [234]:
model

Sequential(
  (0): Linear(in_features=4, out_features=5, bias=True)
  (1): CosActivation()
  (2): Linear(in_features=5, out_features=1, bias=True)
  (3): CosActivation()
)

In [238]:
tanh = nn.Tanh()

In [239]:
input

tensor([[ 0.0965, -0.7289, -0.7876,  1.3844]])

In [240]:
tanh(input)

tensor([[ 0.0962, -0.6224, -0.6571,  0.8819]])
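
Tanh is closely related to the sigmoid: it is a rescaled and shifted version, $\tanh(x) = 2\,\text{Sigmoid}(2x) - 1$, with outputs in $(-1, 1)$. A quick numerical check of this identity, reusing the input values printed above under an illustrative name `x`:

```python
import torch

x = torch.tensor([[0.0965, -0.7289, -0.7876, 1.3844]])

tanh_builtin = torch.tanh(x)
# Same function expressed through the sigmoid.
tanh_via_sigmoid = 2 * torch.sigmoid(2 * x) - 1

print(tanh_builtin)
print(torch.allclose(tanh_builtin, tanh_via_sigmoid, atol=1e-6))   # True
```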