
In [2]:
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
import time

## **Tensors and NumPy Arrays**

In [4]:
# NumPy vs PyTorch: the same call exists in both libraries
n = np.linspace(0,1,5)
t = torch.linspace(0,1,5)

In [15]:
# Reshaping
torch.arange(48)

tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
        18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
        36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47])

In [16]:
t = torch.arange(48).reshape(3,4,4)
t

tensor([[[ 0,  1,  2,  3],
         [ 4,  5,  6,  7],
         [ 8,  9, 10, 11],
         [12, 13, 14, 15]],

        [[16, 17, 18, 19],
         [20, 21, 22, 23],
         [24, 25, 26, 27],
         [28, 29, 30, 31]],

        [[32, 33, 34, 35],
         [36, 37, 38, 39],
         [40, 41, 42, 43],
         [44, 45, 46, 47]]])

## **General Broadcasting Rules**

In [4]:
a = np.array([1,2])
b = np.array([3,4])
a*b

array([3, 8])

When operating on two arrays, NumPy compares their shapes element-wise, starting with the trailing (rightmost) dimension and working its way left. PyTorch follows the same rules.

Two dimensions are compatible when:
1. they are equal, or
2. one of them is 1

For example, these two shapes are compatible:

Shape 1: (1,6,4,1,7,2)

Shape 2: (5,6,1,3,1,2)

Result: (5,6,4,3,7,2)
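To make the rule concrete, here is a minimal pure-Python sketch of the shape check. The helper name `broadcast_shape` is made up for this illustration and is not part of NumPy or PyTorch. It also assumes the standard convention that a shape with fewer dimensions is padded with 1s on the left.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible.

    Hypothetical helper for illustration only.
    """
    result = []
    # Walk both shapes from the trailing dimension to the left.
    # A missing leading dimension is treated as size 1.
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for dim_a, dim_b in pairs:
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"incompatible dimensions: {dim_a} and {dim_b}")
    return tuple(reversed(result))


print(broadcast_shape((1, 6, 4, 1, 7, 2), (5, 6, 1, 3, 1, 2)))  # (5, 6, 4, 3, 7, 2)
print(broadcast_shape((256, 256, 3), (3,)))                     # (256, 256, 3)
```

Wherever one side has size 1, the other side's size wins. A pair such as 4 and 3 would raise an error.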

In [5]:
a = torch.ones((6,5))
b = torch.arange(5).reshape((1,5))
a+b

tensor([[1., 2., 3., 4., 5.],
        [1., 2., 3., 4., 5.],
        [1., 2., 3., 4., 5.],
        [1., 2., 3., 4., 5.],
        [1., 2., 3., 4., 5.],
        [1., 2., 3., 4., 5.]])

In [4]:
# Scaling each color channel of an image: shape (256,256,3) times shape (3,)
Image = torch.randn((256,256,3))
Scale = torch.tensor([0.5,1.5,1])

Result = Image*Scale
Result

tensor([[[ 1.5260e-01,  2.8262e-01, -5.1125e-01],
         [ 2.1666e-01, -5.3111e-02,  5.3918e-01],
         [-5.7866e-01,  1.0442e+00,  1.3581e+00],
         ...,
         [ 6.0683e-01, -1.0575e+00,  7.9136e-01],
         [-6.3966e-01, -1.9801e+00, -7.4135e-01],
         [-2.5494e-01, -1.7491e+00,  4.0271e-03]],

        [[-2.8846e-02, -1.9216e+00, -2.5000e-01],
         [ 4.5896e-01, -1.9274e-01, -1.7969e+00],
         [ 7.6994e-01, -1.8882e-01, -8.0327e-01],
         ...,
         [-2.5037e-01, -6.3744e-01,  1.5131e+00],
         [ 6.2855e-01,  4.2123e-01,  1.8648e+00],
         [ 2.1093e-01,  1.5544e+00, -2.1796e-01]],

        [[ 3.3062e-01, -9.5185e-01,  1.5588e+00],
         [ 8.5951e-01, -2.9838e-01,  1.6548e-01],
         [ 8.5340e-01, -8.2403e-01,  4.5089e-01],
         ...,
         [-2.3632e-02,  7.1747e-01,  4.6145e-01],
         [-2.9514e-01,  1.5115e+00,  7.9702e-01],
         [ 2.8225e-01, -1.0264e+00, -4.6705e-01]],

        ...

In [5]:
# Example: An array of 2 256x256 images with color channels
# Images (4d array): 2 x 256 x 256 x 3
# Scales (4d array): 2 x 1 x 1 x 3
# Results (4d array): 2 x 256 x 256 x 3

Images = torch.randn((2,256,256,3))
Scales = torch.tensor([0.5,1.5,1,1.5,1,0.5]).reshape(2,1,1,3)
Results = Images*Scales  # each image is scaled by its own per-channel factors

## **Operations Across Dimensions**

In [7]:
# Simple operations can be done on 1-dimensional tensors:

t = torch.tensor([0.5,1,3,4])
torch.mean(t), torch.std(t), torch.max(t), torch.min(t)

(tensor(2.1250), tensor(1.6520), tensor(4.), tensor(0.5000))

In [9]:
# The same operations also work on higher-dimensional tensors, for example a 2D tensor
t = torch.arange(20, dtype=torch.float64).reshape(5,4)
t

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.],
        [12., 13., 14., 15.],
        [16., 17., 18., 19.]], dtype=torch.float64)

In [10]:
torch.mean(t, axis=0)

tensor([ 8.,  9., 10., 11.], dtype=torch.float64)
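The dimension passed to `torch.mean` is the one that gets collapsed. As a small sketch on the same 5x4 tensor, the example below compares `dim=0` with `dim=1`. `dim` is PyTorch's own name for the argument, and `axis` is accepted as a NumPy-style alias. The `keepdim=True` call is an extra not used elsewhere in this notebook.

```python
import torch

t = torch.arange(20, dtype=torch.float64).reshape(5, 4)

# dim=0 collapses the 5 rows, leaving one mean per column
col_means = torch.mean(t, dim=0)
print(col_means.tolist())  # [8.0, 9.0, 10.0, 11.0]

# dim=1 collapses the 4 columns, leaving one mean per row
row_means = torch.mean(t, dim=1)
print(row_means.tolist())  # [1.5, 5.5, 9.5, 13.5, 17.5]

# keepdim=True keeps the collapsed dimension with size 1,
# so the result can be broadcast against the original tensor
row_means_kept = torch.mean(t, dim=1, keepdim=True)
print(row_means_kept.shape)  # torch.Size([5, 1])
```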

In [13]:
# Even higher-dimensional tensors, e.g. a batch of 5 RGB images of size 256x256

t = torch.randn(5,256,256,3)

In [17]:
# Taking the mean across batch (size 5)
torch.mean(t, axis=0).shape

torch.Size([256, 256, 3])

In [18]:
# Taking the mean across color channels (RGB)
torch.mean(t, axis=-1).shape
# NOTE: -1 means the last dimension in the tensor, -2 means second to last...

torch.Size([5, 256, 256])

In [19]:
# Useful: torch.max along a dimension returns the maximum over the color channel values
# AND the indices where those maxima occur

values, indices = torch.max(t,axis=-1)

In [22]:
values.shape

torch.Size([5, 256, 256])

In [21]:
indices

tensor([[[0, 1, 2,  ..., 2, 2, 0],
         [0, 2, 0,  ..., 2, 1, 1],
         [0, 1, 1,  ..., 2, 0, 2],
         ...,
         [2, 2, 0,  ..., 1, 0, 1],
         [2, 2, 0,  ..., 0, 0, 2],
         [2, 0, 2,  ..., 0, 1, 0]],

        [[0, 2, 2,  ..., 1, 2, 2],
         [1, 0, 0,  ..., 1, 1, 0],
         [2, 1, 0,  ..., 0, 1, 2],
         ...,
         [2, 1, 1,  ..., 2, 1, 1],
         [2, 0, 2,  ..., 1, 1, 2],
         [0, 2, 0,  ..., 2, 2, 1]],

        [[1, 2, 1,  ..., 1, 1, 1],
         [2, 2, 1,  ..., 0, 1, 2],
         [0, 1, 0,  ..., 1, 0, 2],
         ...,
         [1, 1, 2,  ..., 2, 2, 1],
         [2, 1, 1,  ..., 1, 0, 0],
         [1, 0, 1,  ..., 1, 0, 1]],

        [[0, 2, 1,  ..., 0, 2, 0],
         [2, 1, 0,  ..., 0, 1, 1],
         [2, 2, 1,  ..., 2, 0, 0],
         ...,
         [0, 1, 2,  ..., 2, 2, 0],
         [0, 2, 1,  ..., 2, 1, 2],
         [2, 2, 1,  ..., 0, 2, 1]],

        [[2, 0, 2,  ..., 1, 1, 2],
         [0, 2, 1,  ..., 2, 2, 1],
         ...
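The random output above is hard to check by eye, so here is a small sketch with hand-picked values showing how `values` and `indices` relate. The 2x3 tensor is made up for this illustration. `gather` uses the indices to pick the maxima back out of the original tensor.

```python
import torch

# Two "pixels" with three channel values each (made-up numbers)
t = torch.tensor([[1., 5., 2.],
                  [7., 3., 4.]])

values, indices = torch.max(t, dim=-1)
print(values.tolist())   # [5.0, 7.0]
print(indices.tolist())  # [1, 0]

# indices tells us where along the last dimension each maximum sits,
# so gathering at those positions reproduces values
picked = t.gather(-1, indices.unsqueeze(-1)).squeeze(-1)
print(torch.equal(picked, values))  # True
```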

## **Difference between NumPy and PyTorch**

In [23]:
# PyTorch automatically computes the gradients of operations applied to a tensor
# (for tensors created with requires_grad=True)

x = torch.tensor([[5.,8.],[4.,6.]], requires_grad=True)
y = x.pow(3).sum()
y

tensor(917., grad_fn=<SumBackward0>)

In [24]:
# Compute the gradient:
y.backward() # backpropagate from y to compute dy/dx
x.grad # the gradient dy/dx, stored on x

tensor([[ 75., 192.],
        [ 48., 108.]])

In [25]:
# Compare with the analytic derivative: d/dx of sum(x^3) is 3x^2
3*x**2

tensor([[ 75., 192.],
        [ 48., 108.]], grad_fn=<MulBackward0>)

How does changing the weights (x) affect the output (y)? The gradients tracked by PyTorch answer this: each entry of x.grad is the rate at which y changes when the corresponding entry of x is nudged by a small amount.
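One way to see what these gradient values mean is to nudge each entry of x by hand and watch y change. The sketch below does this in plain Python with central differences, using the same numbers as above. The helper names `f` and `numerical_gradient` are made up for this illustration. This is a sanity check, not how PyTorch computes gradients internally.

```python
def f(values):
    """Same function as above: the sum of cubes."""
    return sum(v ** 3 for v in values)


def numerical_gradient(func, values, eps=1e-4):
    """Approximate d func / d values[i] with central differences."""
    grads = []
    for i in range(len(values)):
        up = list(values)
        down = list(values)
        up[i] += eps
        down[i] -= eps
        grads.append((func(up) - func(down)) / (2 * eps))
    return grads


x_flat = [5.0, 8.0, 4.0, 6.0]          # the entries of x, flattened
approx = numerical_gradient(f, x_flat)
exact = [3 * v ** 2 for v in x_flat]   # [75.0, 192.0, 48.0, 108.0]

for a, e in zip(approx, exact):
    print(round(a, 4), e)
```

The approximations agree with the values in x.grad up to a small numerical error.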

### Additionally, PyTorch can perform matrix multiplication faster, especially when running on a GPU

Using PyTorch:

In [58]:
A = torch.randn((1000,1000)).cuda() # .cuda() moves the tensor to the GPU,
B = torch.randn((1000,1000)).cuda() # which makes the difference even bigger

In [59]:
t1 = time.perf_counter()
torch.matmul(A,B)
t2 = time.perf_counter()
print(t2-t1)

0.0006265770002755744


Using NumPy:

In [60]:
A = np.random.randn(1000,1000) # NumPy defaults to float64 (torch.randn gives float32)
B = np.random.randn(1000,1000)

In [61]:
t1 = time.perf_counter()
A@B
t2 = time.perf_counter()
print(t2-t1)

0.05138796600022033
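The two timings above should be read with some care. CUDA operations are launched asynchronously, so stopping the clock right after `torch.matmul` may measure mostly the launch rather than the computation. The PyTorch tensors are also float32 while the NumPy arrays are float64. Below is a rough sketch of a fairer comparison. The helper names `time_torch_matmul` and `time_numpy_matmul` are made up for this illustration, and the sketch falls back to the CPU when no GPU is available.

```python
import time

import numpy as np
import torch


def time_torch_matmul(n=1000, repeats=10, device=None):
    """Average seconds per n x n float32 matmul in PyTorch (hypothetical helper)."""
    if device is None:
        device = "cuda" if torch.cuda.is_available() else "cpu"
    on_gpu = torch.device(device).type == "cuda"
    A = torch.randn((n, n), device=device)
    B = torch.randn((n, n), device=device)
    torch.matmul(A, B)                # warm-up run, not timed
    if on_gpu:
        torch.cuda.synchronize()      # wait for pending GPU work before starting
    t1 = time.perf_counter()
    for _ in range(repeats):
        torch.matmul(A, B)
    if on_gpu:
        torch.cuda.synchronize()      # wait until the GPU has actually finished
    t2 = time.perf_counter()
    return (t2 - t1) / repeats


def time_numpy_matmul(n=1000, repeats=10):
    """Average seconds per n x n float32 matmul in NumPy (hypothetical helper)."""
    A = np.random.randn(n, n).astype(np.float32)  # match PyTorch's default dtype
    B = np.random.randn(n, n).astype(np.float32)
    A @ B                             # warm-up run, not timed
    t1 = time.perf_counter()
    for _ in range(repeats):
        A @ B
    t2 = time.perf_counter()
    return (t2 - t1) / repeats


print("PyTorch:", time_torch_matmul(n=500, repeats=5))
print("NumPy:  ", time_numpy_matmul(n=500, repeats=5))
```

The actual numbers depend heavily on the hardware, so none are shown here.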


However, nothing is free. PyTorch can take up more memory than NumPy: tracking gradients means keeping intermediate results around for the backward pass, and tensors moved to the GPU occupy GPU memory as well.

In short: PyTorch gives you faster operations, at the cost of higher memory use.
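To check how much memory the raw data of an array or tensor occupies, the following sketch compares the two. It only counts the element data, not per-object overhead, autograd bookkeeping, or memory used by the GPU runtime. Note that the default dtypes differ, which matters more for the raw size than the library does.

```python
import numpy as np
import torch

a = np.zeros((1000, 1000))     # NumPy default dtype: float64 (8 bytes per element)
t = torch.zeros((1000, 1000))  # PyTorch default dtype: float32 (4 bytes per element)

numpy_bytes = a.nbytes
torch_bytes = t.element_size() * t.nelement()
print(numpy_bytes)  # 8000000
print(torch_bytes)  # 4000000

# With matching dtypes the raw data takes the same space
a32 = a.astype(np.float32)
print(a32.nbytes == torch_bytes)  # True
```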