# 00. PyTorch Fundamentals

Resource notebook: https://www.learnpytorch.io/00-pytorch-fundamentals/

If you have a question: https://github.com/mrdbourke/pytorch-deep-learning/discussions

In [1]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.4.0


In [2]:
!nvidia-smi

Fri Jul 26 03:14:07 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.70                 Driver Version: 560.70         CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce RTX 2060      WDDM  |   00000000:01:00.0  On |                  N/A |
| N/A   69C    P0             24W /   80W |    1824MiB /   6144MiB |      9%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

## Introduction to Tensors

### Creating Tensors

PyTorch tensors are created using `torch.tensor()`. Documentation: https://pytorch.org/docs/stable/tensors.html

In [3]:
# Scalar
scalar = torch.tensor(7)
scalar

tensor(7)

Get the number of dimensions

In [4]:
scalar.ndim  # Here the dimension is 0

0

Get the tensor back as a Python int

In [5]:
scalar.item()

7

In [6]:
scalar.shape

torch.Size([])

In [7]:
# Vector
vector = torch.tensor([1, 2, 3, 4, 5])
vector

tensor([1, 2, 3, 4, 5])

In [8]:
vector.ndim

1

In [9]:
vector.shape

torch.Size([5])

In [10]:
# MATRIX
MATRIX = torch.tensor([[1,2],
                       [3,4]])
MATRIX

tensor([[1, 2],
        [3, 4]])

In [11]:
MATRIX.ndim

2

In [12]:
MATRIX.shape

torch.Size([2, 2])

In [13]:
MATRIX[0]

tensor([1, 2])

In [14]:
MATRIX[1]

tensor([3, 4])

In [15]:
# TENSOR
TENSOR = torch.tensor([[[1,2,3],
                        [4,5,6],
                        [7,8,9]]])
TENSOR

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [16]:
TENSOR.ndim

3

In [17]:
TENSOR.shape

torch.Size([1, 3, 3])
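
As a quick recap of the cells above, the sketch below prints `ndim` and `shape` side by side for the four tensors. The `examples` dictionary is a hypothetical helper added only for illustration; it is not part of the original notebook.

```python
import torch

# Hypothetical helper dict (not in the notebook): the same four tensors as above.
examples = {
    "scalar": torch.tensor(7),
    "vector": torch.tensor([1, 2, 3, 4, 5]),
    "MATRIX": torch.tensor([[1, 2], [3, 4]]),
    "TENSOR": torch.tensor([[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]),
}

for name, t in examples.items():
    # ndim is always the number of entries in shape
    print(f"{name:>6}: ndim={t.ndim}, shape={tuple(t.shape)}")
```

A handy rule of thumb: the number of opening square brackets at the start of the printed tensor matches `ndim`.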

### Random tensors


Why random tensors?
Random tensors are important because many neural networks learn by starting with tensors full of random numbers and then adjusting those random numbers to better represent the data.

`Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers -> ....`


Torch random tensors — https://pytorch.org/docs/stable/generated/torch.rand.html

In [18]:
# Create a random Tensor of size (3,4)
random_tensor = torch.rand(3,4)
random_tensor

tensor([[0.7321, 0.2951, 0.1870, 0.2228],
        [0.6234, 0.1095, 0.8023, 0.3097],
        [0.4184, 0.6193, 0.4543, 0.2422]])

In [19]:
random_tensor.ndim

2

In [20]:
random_tensor1 = torch.rand(1,10,10)
random_tensor1

tensor([[[4.3396e-01, 7.9041e-01, 2.1471e-01, 8.5110e-01, 5.1757e-01,
          7.0966e-01, 8.8520e-01, 9.1851e-03, 9.1606e-01, 2.7355e-01],
         [9.2171e-01, 8.9553e-01, 3.7225e-01, 3.1633e-01, 8.6061e-01,
          5.8490e-01, 7.0734e-01, 6.2798e-01, 3.5427e-02, 4.8310e-01],
         [3.1074e-01, 5.3273e-01, 8.0227e-02, 2.3559e-01, 7.7160e-01,
          2.0782e-01, 2.0592e-01, 4.8041e-01, 3.2458e-01, 7.2269e-01],
         [9.0525e-01, 3.6679e-01, 5.1672e-01, 7.2491e-04, 7.2613e-01,
          7.5330e-01, 5.8603e-01, 3.1638e-01, 3.6637e-01, 4.8880e-01],
         [8.4685e-01, 2.9329e-01, 8.5564e-01, 4.6222e-01, 9.9438e-01,
          5.8080e-01, 9.9739e-01, 7.1093e-02, 9.3276e-01, 2.4454e-02],
         [6.8351e-01, 1.0705e-01, 4.7978e-01, 2.6708e-01, 2.1989e-01,
          3.8960e-01, 3.3552e-02, 2.6794e-02, 5.4780e-01, 6.2815e-01],
         [8.9816e-01, 8.0556e-01, 7.9739e-01, 8.1022e-02, 1.6180e-01,
          4.7095e-01, 2.2615e-01, 8.4481e-01, 5.3238e-01, 1.4817e-01],
         [3.3

In [21]:
random_tensor1.ndim

3

In [22]:
# create a random image tensor
random_image = torch.rand(size = (224,224,3)) # 224*224 is the height and width of the image and 3 is number of colour channels.
random_image

tensor([[[0.6211, 0.3794, 0.9431],
         [0.8770, 0.5937, 0.2930],
         [0.2141, 0.2615, 0.2168],
         ...,
         [0.0974, 0.7568, 0.7583],
         [0.7039, 0.3746, 0.4803],
         [0.9881, 0.7511, 0.4480]],

        [[0.4965, 0.6123, 0.3493],
         [0.8497, 0.4556, 0.5724],
         [0.8613, 0.4687, 0.6528],
         ...,
         [0.3083, 0.4904, 0.7180],
         [0.7699, 0.8317, 0.8199],
         [0.4525, 0.8840, 0.9061]],

        [[0.1574, 0.9924, 0.8202],
         [0.7456, 0.7644, 0.2544],
         [0.4838, 0.3403, 0.3302],
         ...,
         [0.7689, 0.4766, 0.8564],
         [0.5958, 0.9789, 0.7124],
         [0.0962, 0.6236, 0.1098]],

        ...,

        [[0.4803, 0.3042, 0.0118],
         [0.2624, 0.1307, 0.9448],
         [0.7208, 0.4117, 0.9322],
         ...,
         [0.0844, 0.6836, 0.2838],
         [0.4491, 0.9760, 0.8202],
         [0.2131, 0.4953, 0.6704]],

        [[0.8541, 0.6910, 0.3836],
         [0.4714, 0.9752, 0.8164],
         [0.

In [23]:
random_image.shape, random_image.ndim

(torch.Size([224, 224, 3]), 3)

## Zeros and Ones

In [24]:
zeros = torch.zeros(size=(3,4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [25]:
ones = torch.ones(size=(3,4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [26]:
ones.dtype,zeros.dtype

(torch.float32, torch.float32)

## Creating a range of tensors and tensors-like

In [27]:
# torch.arange()
one_to_ten = torch.arange(1,11)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [28]:
step_tensor  = torch.arange(start=1,end=1002,step=100)
step_tensor

tensor([   1,  101,  201,  301,  401,  501,  601,  701,  801,  901, 1001])

In [29]:
# Creating Tensors Like
ten_zeros = torch.zeros_like(input = one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

## Tensor Datatypes

**Note:** Tensor datatype is one of the 3 big errors you will run into with PyTorch and deep learning:

1. Tensors not the right datatype
2. Tensors not the right shape
3. Tensors not on the right device

In [30]:
float_32_tensor = torch.tensor([3.0,6.0,9.0],
                               dtype = torch.float32, # What is the datatype of the tensor? (eg. float32, float16 or float64, etc) Default is float32
                               device = "cuda",  # What device is your tensor on?
                               requires_grad = True) # Whether or not to track gradients with this tensor's operations

float_32_tensor

tensor([3., 6., 9.], device='cuda:0', requires_grad=True)

In [31]:
float_32_tensor.dtype

torch.float32

In [32]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], device='cuda:0', dtype=torch.float16,
       grad_fn=<ToCopyBackward0>)

In [33]:
float_32_tensor * float_16_tensor

tensor([ 9., 36., 81.], device='cuda:0', grad_fn=<MulBackward0>)

In [34]:
int_32_tensor = torch.tensor([1,2,3],
                             dtype = torch.int32,
                             device = "cuda",
                             requires_grad = False)

int_32_tensor

tensor([1, 2, 3], device='cuda:0', dtype=torch.int32)

In [35]:
float_32_tensor * int_32_tensor

tensor([ 3., 12., 27.], device='cuda:0', grad_fn=<MulBackward0>)
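
The two multiplications above succeed because PyTorch promotes mixed dtypes to a common type before computing. The sketch below uses `torch.result_type` to show which dtype an operation would produce. It runs on the CPU so it works without a GPU; the variable names are chosen for this illustration only.

```python
import torch

f32 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float32)
f16 = f32.type(torch.float16)
i32 = torch.tensor([1, 2, 3], dtype=torch.int32)

# result_type reports the dtype that type promotion would pick
print(torch.result_type(f32, f16))  # torch.float32
print(torch.result_type(f32, i32))  # torch.float32

# The actual product carries the promoted dtype
product = f32 * i32
print(product, product.dtype)
```

Not every operation is this forgiving, so checking `tensor.dtype` remains a good first step when debugging.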

### Getting information from tensors (Tensor Attributes)

1. Tensors not the right datatype - To get the datatype of the tensor you can use `tensor.dtype`
2. Tensors not the right shape - to get the shape of the tensor you can use `tensor.shape`
3. Tensors not on the right device - to get the device of the tensor you can use `tensor.device`

In [36]:
print(float_32_tensor.dtype)

print(int_32_tensor.dtype)

print(float_32_tensor.shape)

print(int_32_tensor.shape)

print(float_32_tensor.device)

print(int_32_tensor.device)


torch.float32
torch.int32
torch.Size([3])
torch.Size([3])
cuda:0
cuda:0


In [37]:
some_tensor = torch.rand(4,5)
some_tensor

tensor([[0.4751, 0.4115, 0.1705, 0.0481, 0.3471],
        [0.4849, 0.8767, 0.8949, 0.4659, 0.1045],
        [0.9348, 0.3080, 0.5424, 0.9775, 0.5438],
        [0.6038, 0.8251, 0.8767, 0.5685, 0.9427]])

In [38]:
print(f"Datatype of the tensor: {some_tensor.dtype}")
print(f"Shape of the tensor : {some_tensor.shape}")
print(f"Shape of the tensor : {some_tensor.size()}")
print(f"Device tensor is stored on : {some_tensor.device}")

Datatype of the tensor: torch.float32
Shape of the tensor : torch.Size([4, 5])
Shape of the tensor : torch.Size([4, 5])
Device tensor is stored on : cpu


### Manipulating Tensors (Tensor operations)

Tensor Operations include:
1. Addition
2. Subtraction
3. Multiplication (Element-wise)
4. Division
5. Matrix Multiplication (Dot Product)


In [39]:
# Create a tensor and add 10 to it

tensor = torch.tensor([1,2,3])
tensor = tensor + 10
tensor

tensor([11, 12, 13])

In [40]:
# other way

tensor = torch.tensor([1,2,3])
tensor = torch.add(tensor,5)

In [41]:
# Create a tensor and subtract 5 from it
tensor = torch.tensor([10,20,30])
tensor = tensor - 5
tensor


tensor([ 5, 15, 25])

In [42]:
# Multiplying a tensor

tensor = torch.tensor([1,2,3])
tensor = tensor * 5
tensor

tensor([ 5, 10, 15])

In [43]:
# other way

tensor = torch.tensor([1,2,3])
tensor = torch.mul(tensor,5)
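
Division is in the list of operations above but has no cell of its own. A minimal sketch, assuming the same style as the addition and multiplication cells:

```python
import torch

tensor = torch.tensor([10, 20, 30])

# The / operator performs true division, so an integer tensor gives a float result
divided = tensor / 10
print(divided, divided.dtype)

# Function forms, mirroring torch.add and torch.mul
divided_fn = torch.div(tensor, 10)
subtracted = torch.sub(tensor, 5)
print(divided_fn, subtracted)
```

Note that division changes the dtype from `int64` to `float32`, which ties back to the datatype errors mentioned earlier.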

### Matrix Multiplication

There are 2 main ways of performing multiplication in neural networks and deep learning.

1. Element wise multiplication
2. Matrix multiplication (dot product)

More information on multiplying matrices - https://www.mathsisfun.com/algebra/matrix-multiplying.html

There are two main rules that matrix multiplication needs to satisfy:
1. The inner dimensions must match:
* `(3,2) @ (2,3)` will work
* `(3,2) @ (3,2)` will not work
* `(2,3) @ (3,2)` will work
* `(2,3) @ (2,3)` will not work
* The numbers above are dimensions of the matrix

2. The resulting matrix has the shape of the outer dimensions
* `(3,2) @ (2,3)` will result in a matrix of dimension `(3,3)`
* `(2,3) @ (3,2)` will result in a matrix of dimension `(2,2)`
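
The two rules can be checked directly on random tensors. This is a small sketch; the names `A`, `B` and `mismatch_failed` are made up for the example.

```python
import torch

A = torch.rand(3, 2)
B = torch.rand(2, 3)

# Rule 1: inner dimensions match (2 and 2), so both products work
# Rule 2: the result takes the outer dimensions
print((A @ B).shape)  # torch.Size([3, 3])
print((B @ A).shape)  # torch.Size([2, 2])

# (3,2) @ (3,2): inner dimensions 2 and 3 differ, so PyTorch raises an error
mismatch_failed = False
try:
    A @ A
except RuntimeError as err:
    mismatch_failed = True
    print(f"Shape error: {err}")
```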

Website showing how matrix multiplication is performed - http://matrixmultiplication.xyz/

In [44]:
# element wise multiplication

tensor = torch.tensor([2,3,4])

print(tensor, " * ", tensor)
print("Equals : ", tensor * tensor)

tensor([2, 3, 4])  *  tensor([2, 3, 4])
Equals :  tensor([ 4,  9, 16])


In [45]:
# Matrix Multiplication

matrix_mul = torch.matmul(tensor, tensor)
matrix_mul

tensor(29)
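
For two 1-D tensors, `torch.matmul` returns the dot product. The sketch below computes the same value by hand to show where 29 comes from (2*2 + 3*3 + 4*4).

```python
import torch

tensor = torch.tensor([2, 3, 4])

# Dot product by hand: multiply matching elements and add them up
manual = 0
for i in range(len(tensor)):
    manual += tensor[i] * tensor[i]

print(manual)                        # tensor(29)
print(torch.matmul(tensor, tensor))  # tensor(29)
```

The loop gives the same answer, but `torch.matmul` is vectorised and is generally much faster on larger tensors.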

## One of the most common errors in deep learning: shape errors

In [46]:
# shapes for matrix multiplication

tensor_A = torch.tensor([[1,2],
                         [3,4],
                         [5,6]])

tensor_B = torch.tensor([[7,10],
                         [8,11],
                         [9,12]])

#torch.matmul(tensor_A, tensor_B)  ## This will throw an error because the inner dimensions (2 and 3) don't match.

#torch.mm(tensor_A, tensor_B) ## torch.mm is the 2-D-only counterpart of torch.matmul (no broadcasting). This will also throw an error

To fix our tensor shape issues, we can manipulate the shape of one of our tensors using a **transpose**.


A **transpose** switches the axes or dimensions of a given tensor.

In [47]:
tensor_B, tensor_B.shape

(tensor([[ 7, 10],
         [ 8, 11],
         [ 9, 12]]),
 torch.Size([3, 2]))

In [48]:
tensor_B.T, tensor_B.T.shape

(tensor([[ 7,  8,  9],
         [10, 11, 12]]),
 torch.Size([2, 3]))

In [49]:
# Matrix multiplication of tensor_A and tensor_B works once we take the transpose of tensor_B

print(f"Original shapes --> tensor_A = {tensor_A.shape} , tensor_B = {tensor_B.shape}")
print(f"New shapes --> tensor_A = {tensor_A.shape} , tensor_B.T = {tensor_B.T.shape}")
print(f"Multiplying {tensor_A.shape} @ {tensor_B.T.shape} (Same shapes as above)  <-- Inner dimensions must match here.")
print(f"Output : \n")
output = torch.matmul(tensor_A,tensor_B.T)
print(output)
print(f"\nOutput shape is : {output.shape}")

torch.mm(tensor_A,tensor_B.T)  ## This is another way to perform the same operation, using torch.mm.

Original shapes --> tensor_A = torch.Size([3, 2]) , tensor_B = torch.Size([3, 2])
New shapes --> tensor_A = torch.Size([3, 2]) , tensor_B.T = torch.Size([2, 3])
Multiplying torch.Size([3, 2]) @ torch.Size([2, 3]) (Same shapes as above)  <-- Inner dimensions must match here.
Output : 

tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

Output shape is : torch.Size([3, 3])


tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

### Tensor Aggregation

Finding the min, max, mean, sum, etc of tensors

In [50]:
x = torch.arange(0,100,10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

#### Find Min

In [51]:
torch.min(x), x.min()

(tensor(0), tensor(0))

#### Find Max

In [52]:
torch.max(x), x.max()

(tensor(90), tensor(90))

#### Find Average/Mean

`Note`: The torch.mean()/.mean() function requires a floating-point (or complex) datatype such as float32; on an integer tensor it throws an error.

In [53]:
torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()

(tensor(45.), tensor(45.))
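
To make the note above concrete, this sketch shows the error on an integer tensor and two ways around it. The `dtype` keyword of `torch.mean` casts the input before averaging; `int_mean_failed` is a made-up flag for the example.

```python
import torch

x = torch.arange(0, 100, 10)  # dtype is int64 by default

# mean() cannot infer an output dtype for integer input
int_mean_failed = False
try:
    x.mean()
except RuntimeError as err:
    int_mean_failed = True
    print(f"mean() on {x.dtype} failed: {err}")

# Option 1: convert first, then take the mean
mean_a = x.float().mean()

# Option 2: let torch.mean cast the input via the dtype keyword
mean_b = torch.mean(x, dtype=torch.float32)

print(mean_a, mean_b)
```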

#### Find the sum

In [54]:
torch.sum(x), x.sum()

(tensor(450), tensor(450))

### Positional min and max

`torch.argmin()/.argmin()` and `torch.argmax()/.argmax()` function return the position/index of the min/max of the tensor.

In [55]:
torch.argmin(x), x.argmin()

(tensor(0), tensor(0))

In [56]:
torch.argmax(x),x.argmax()

(tensor(9), tensor(9))
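
The index returned by `argmin`/`argmax` can be used to look the value up again, which is a quick way to confirm it points at the right element:

```python
import torch

x = torch.arange(0, 100, 10)

min_idx = x.argmin()
max_idx = x.argmax()

# Indexing with the returned positions recovers the min and max values
print(f"min at index {min_idx.item()}: {x[min_idx]}")
print(f"max at index {max_idx.item()}: {x[max_idx]}")
```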

## Reshaping, stacking, squeezing and unsqueezing tensors

* Reshaping - reshapes an input tensor to a defined shape
* View - Return a view of an input tensor of certain shape but keep the same memory as the original tensor
* Stacking - combine multiple tensors on top of each other (vstack) or side by side (hstack)
* Squeeze - removes all `1` dimensions from a tensor
* Unsqueeze - add a `1` dimension to a target tensor
* Permute - Return a view of the input with dimensions permuted (swapped) in a certain way

#### Reshape

In [57]:
import torch

x = torch.arange(1.,11.,1)
x, x.shape

(tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]), torch.Size([10]))

In [58]:
reshaped = torch.reshape(x,(1,10))
reshaped, reshaped.shape

(tensor([[ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]),
 torch.Size([1, 10]))

In [59]:
reshaped = x.reshape(10,1)  # this is the same function as above just a different way of calling the method.
reshaped, reshaped.shape

(tensor([[ 1.],
         [ 2.],
         [ 3.],
         [ 4.],
         [ 5.],
         [ 6.],
         [ 7.],
         [ 8.],
         [ 9.],
         [10.]]),
 torch.Size([10, 1]))

In [60]:
reshaped = x.reshape(5,2)
reshaped, reshaped.shape

(tensor([[ 1.,  2.],
         [ 3.,  4.],
         [ 5.,  6.],
         [ 7.,  8.],
         [ 9., 10.]]),
 torch.Size([5, 2]))

In [61]:
reshaped = x.reshape(2,5)
reshaped, reshaped.shape

(tensor([[ 1.,  2.,  3.,  4.,  5.],
         [ 6.,  7.,  8.,  9., 10.]]),
 torch.Size([2, 5]))

#### View

In [62]:
# changing the view

z= x.view(1,10)
z, z.shape

(tensor([[ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]),
 torch.Size([1, 10]))

Changing `z` changes `x` because the view of a tensor occupies the same memory as the original tensor

In [63]:
x

tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

In [64]:
z[:,0] = 5
z , x

(tensor([[ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]),
 tensor([ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]))
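
If an independent copy is wanted instead of a view, one option is to call `.clone()` first. The sketch below compares the two using `data_ptr()`, which returns the memory address of a tensor's first element. The variable names are chosen for this illustration.

```python
import torch

x = torch.arange(1., 11.)

view_of_x = x.view(2, 5)          # shares memory with x
copy_of_x = x.clone().view(2, 5)  # clone() allocates new memory first

print(view_of_x.data_ptr() == x.data_ptr())  # True
print(copy_of_x.data_ptr() == x.data_ptr())  # False

# Changing x shows up in the view but not in the copy
x[0] = 100.
print(view_of_x[0, 0])  # tensor(100.)
print(copy_of_x[0, 0])  # tensor(1.)
```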

#### Stack

In [65]:
# stacking the tensors on top of each other

x_stacked = torch.stack([x,x,x,x,x],dim = 0)
x_stacked

tensor([[ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
        [ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
        [ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
        [ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
        [ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]])

In [66]:
# stacking the tensors on top of each other on the other dimension

x_stacked = torch.stack([x,x,x,x,x],dim = 1)
x_stacked

tensor([[ 5.,  5.,  5.,  5.,  5.],
        [ 2.,  2.,  2.,  2.,  2.],
        [ 3.,  3.,  3.,  3.,  3.],
        [ 4.,  4.,  4.,  4.,  4.],
        [ 5.,  5.,  5.,  5.,  5.],
        [ 6.,  6.,  6.,  6.,  6.],
        [ 7.,  7.,  7.,  7.,  7.],
        [ 8.,  8.,  8.,  8.,  8.],
        [ 9.,  9.,  9.,  9.,  9.],
        [10., 10., 10., 10., 10.]])
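
The list at the top of this section also mentions `vstack` and `hstack`. A short sketch of how they compare with `torch.stack` on 1-D inputs (a small tensor is used here to keep the shapes easy to read):

```python
import torch

x = torch.tensor([1., 2., 3.])

# torch.stack always adds a new dimension
print(torch.stack([x, x], dim=0).shape)  # torch.Size([2, 3])
print(torch.stack([x, x], dim=1).shape)  # torch.Size([3, 2])

# vstack places 1-D tensors on top of each other as rows
print(torch.vstack([x, x]).shape)        # torch.Size([2, 3])

# hstack joins 1-D tensors end to end, without adding a dimension
print(torch.hstack([x, x]).shape)        # torch.Size([6])
```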

#### Squeeze

In [67]:
print(x, x.shape)
x = x.reshape(1,1,10)
x, x.shape

tensor([ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]) torch.Size([10])


(tensor([[[ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]]),
 torch.Size([1, 1, 10]))

In [68]:
squeezed_x = torch.squeeze(x)

print(squeezed_x, squeezed_x.shape)

tensor([ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]) torch.Size([10])


In [69]:
# torch.squeeze to remove all the single dimensions from a target tensor

x = x.reshape(1,1,10)

print(f"previous tensor : {x}")
print(f"shape of previous tensor : {x.shape}\n")

squeezed_x = torch.squeeze(x)
print(f"squeezed tensor : {squeezed_x}")
print(f"shape of squeezed tensor : {squeezed_x.shape}")

previous tensor : tensor([[[ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]])
shape of previous tensor : torch.Size([1, 1, 10])

squeezed tensor : tensor([ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])
shape of squeezed tensor : torch.Size([10])
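
`torch.squeeze` also accepts a `dim` argument for removing just one size-1 dimension rather than all of them. A minimal sketch:

```python
import torch

x = torch.arange(1., 11.).reshape(1, 1, 10)

# No dim given: every size-1 dimension is removed
print(torch.squeeze(x).shape)         # torch.Size([10])

# dim=0: only the first size-1 dimension is removed
print(torch.squeeze(x, dim=0).shape)  # torch.Size([1, 10])

# dim=2 has size 10, so nothing is removed
print(torch.squeeze(x, dim=2).shape)  # torch.Size([1, 1, 10])
```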


#### Unsqueeze

In [70]:
# torch.unsqueeze adds a single dimension to the target tensor on the specified dim (dimension)

print(f"previous tensor : {x}")
print(f"shape of previous tensor : {x.shape}\n")

unsqueezed_x = torch.unsqueeze(x,dim=0) # adding a single dimension at the zeroth dimension
print(f"unsqueezed tensor : {unsqueezed_x}")
print(f"shape of unsqueezed tensor : {unsqueezed_x.shape}")

previous tensor : tensor([[[ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]])
shape of previous tensor : torch.Size([1, 1, 10])

unsqueezed tensor : tensor([[[[ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]]])
shape of unsqueezed tensor : torch.Size([1, 1, 1, 10])


In [71]:
# torch.unsqueeze adds a single dimension to the target tensor on the specified dim (dimension)

print(f"previous tensor : {x}")
print(f"shape of previous tensor : {x.shape}\n")

unsqueezed_x = torch.unsqueeze(x,dim=3) # adding a single dimension at third dimension
print(f"unsqueezed tensor : {unsqueezed_x}")
print(f"shape of unsqueezed tensor : {unsqueezed_x.shape}")

previous tensor : tensor([[[ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]])
shape of previous tensor : torch.Size([1, 1, 10])

unsqueezed tensor : tensor([[[[ 5.],
          [ 2.],
          [ 3.],
          [ 4.],
          [ 5.],
          [ 6.],
          [ 7.],
          [ 8.],
          [ 9.],
          [10.]]]])
shape of unsqueezed tensor : torch.Size([1, 1, 10, 1])


#### Permute

In [72]:
# torch.permute - rearranges the dimensions of the target tensor to the dimensions specified in specific order
# this creates a view of the original tensor. so it uses the same memory.

x = torch.rand(224,224,3) # height , width , colour_channels

# permute the original tensor to rearrange the axis (or dim) order
permuted_x = x.permute(2,0,1) # colour_channels, height, width
# Shifts the axis from 0->1, 1->2, 2->0

print(f"shape of previous tensor : {x.shape}")

print(f"shape of permuted tensor : {permuted_x.shape}")

shape of previous tensor : torch.Size([224, 224, 3])
shape of permuted tensor : torch.Size([3, 224, 224])


In [73]:
x[0,2,1]

tensor(0.3182)

In [74]:
permuted_x[1,0,2]

tensor(0.3182)
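
The two lookups above return the same number because `permute` returns a view: index `[h, w, c]` in the original corresponds to `[c, h, w]` in the permuted tensor. The sketch below writes a value through the original and reads it back through the view (the value 42.0 is arbitrary).

```python
import torch

x = torch.rand(224, 224, 3)       # height, width, colour_channels
permuted_x = x.permute(2, 0, 1)   # colour_channels, height, width

# Write through the original tensor...
x[0, 2, 1] = 42.0

# ...and the change is visible through the permuted view
print(permuted_x[1, 0, 2])  # tensor(42.)

# The view reuses the same storage with reordered strides
print(permuted_x.data_ptr() == x.data_ptr())  # True
```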

## Indexing - selecting data from tensors

Indexing data in PyTorch works the same way as indexing data with NumPy

In [75]:
# Create a tensor

import torch
x = torch.arange(1,17).reshape(1,4,4)
x

tensor([[[ 1,  2,  3,  4],
         [ 5,  6,  7,  8],
         [ 9, 10, 11, 12],
         [13, 14, 15, 16]]])

In [76]:
# indexing on zeroth dimension
x[0]

tensor([[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12],
        [13, 14, 15, 16]])

In [77]:
# indexing on first dimension
x[0][0]

tensor([1, 2, 3, 4])

In [78]:
# indexing on the second dimension
x[0][0][1]

tensor(2)

In [79]:
# how to get the number 9?
x[0][2][0]

tensor(9)

In [80]:
x[:,:,2]

tensor([[ 3,  7, 11, 15]])
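
Chained brackets such as `x[0][2][0]` can also be written with commas, and `:` selects everything along a dimension. A few more lookups on the same tensor, as a sketch:

```python
import torch

x = torch.arange(1, 17).reshape(1, 4, 4)

# Comma form of x[0][2][0]
print(x[0, 2, 0])  # tensor(9)

# All of dim 0, but only index 1 of dims 1 and 2
print(x[:, 1, 1])  # tensor([6])

# First row of the inner matrix
print(x[0, 0, :])  # tensor([1, 2, 3, 4])

# Last column of the inner matrix
print(x[0, :, 3])  # tensor([ 4,  8, 12, 16])
```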

## PyTorch tensors and NumPy

NumPy is a popular scientific Python numerical computing library.

Because of this, PyTorch has the ability to interact with NumPy.

* Data is in NumPy format -> want to convert it to PyTorch Tensors : `torch.from_numpy(ndarray)`
* Data is in PyTorch Tensor format -> want to convert it to NumPy : `torch.Tensor.numpy()`


In [81]:
# NumPy array to PyTorch tensor
import torch
import numpy as np

numpy_arr = np.arange(1,101,4,dtype=np.int16)
print(numpy_arr, type(numpy_arr), numpy_arr.dtype)

# note: when converting, PyTorch retains the original dtype of the NumPy array (int16 here) unless specified otherwise, as done below with .type(torch.float32)
torch_tensor = torch.from_numpy(numpy_arr).type(torch.float32)
print(torch_tensor, type(torch_tensor),torch_tensor.dtype)

[ 1  5  9 13 17 21 25 29 33 37 41 45 49 53 57 61 65 69 73 77 81 85 89 93
 97] <class 'numpy.ndarray'> int16
tensor([ 1.,  5.,  9., 13., 17., 21., 25., 29., 33., 37., 41., 45., 49., 53.,
        57., 61., 65., 69., 73., 77., 81., 85., 89., 93., 97.]) <class 'torch.Tensor'> torch.float32


In [82]:
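# What if we modify the contents of the numpy array? What will happen to the tensor?
# Note: `numpy_arr + 1` creates a new array, and `.type(torch.float32)` above already copied the data, so the tensor stays the same.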

numpy_arr = numpy_arr + 1
print(numpy_arr)
print(torch_tensor)

[ 2  6 10 14 18 22 26 30 34 38 42 46 50 54 58 62 66 70 74 78 82 86 90 94
 98]
tensor([ 1.,  5.,  9., 13., 17., 21., 25., 29., 33., 37., 41., 45., 49., 53.,
        57., 61., 65., 69., 73., 77., 81., 85., 89., 93., 97.])


In [83]:
# pytorch tensor to numpy array

new_numpy_arr = torch.Tensor.numpy(torch_tensor)
print(new_numpy_arr, type(new_numpy_arr), new_numpy_arr.dtype)
# pytorch also retains the dtype of the original tensor while creating a new numpy array unless specified otherwise


[ 1.  5.  9. 13. 17. 21. 25. 29. 33. 37. 41. 45. 49. 53. 57. 61. 65. 69.
 73. 77. 81. 85. 89. 93. 97.] <class 'numpy.ndarray'> float32


In [84]:
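# what happens to the numpy array if we change the original tensor?
# Note: `torch_tensor + 2` creates a new tensor, so the array created earlier keeps the old values.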

torch_tensor = torch_tensor + 2
print(torch_tensor)
print(new_numpy_arr)

tensor([ 3.,  7., 11., 15., 19., 23., 27., 31., 35., 39., 43., 47., 51., 55.,
        59., 63., 67., 71., 75., 79., 83., 87., 91., 95., 99.])
[ 1.  5.  9. 13. 17. 21. 25. 29. 33. 37. 41. 45. 49. 53. 57. 61. 65. 69.
 73. 77. 81. 85. 89. 93. 97.]
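
In the cells above nothing changes on the other side because each reassignment (`+ 1`, `+ 2`) builds a new object. Without such a copy, `torch.from_numpy` shares memory with the array, so an in-place change is visible in both. A small sketch with made-up names:

```python
import numpy as np
import torch

arr = np.arange(1, 6, dtype=np.float32)

shared = torch.from_numpy(arr)                          # shares memory with arr
converted = torch.from_numpy(arr).type(torch.float64)   # dtype change makes a copy

arr += 1  # in-place update of the NumPy array
print(shared)     # tensor([2., 3., 4., 5., 6.])
print(converted)  # tensor([1., 2., 3., 4., 5.], dtype=torch.float64)

arr = arr + 1  # rebinds the name to a new array; the tensor is untouched
print(shared)     # tensor([2., 3., 4., 5., 6.])
```

The same sharing applies in the other direction: for a CPU tensor, `tensor.numpy()` returns an array backed by the tensor's memory.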


## Reproducibility (trying to take the random out of random)

In short how a neural network learns:

`start with random numbers —> tensor operations —> update random numbers to try and make them better representations of the data —> again —> again —> again..`

To reduce the randomness in neural networks and PyTorch comes the concept of a **random seed**.

Essentially what the random seed does is "flavour" the randomness.

In [85]:
import torch

# create random tensors

random_tensor_A = torch.rand(3,4)
random_tensor_B = torch.rand(3,4)
print(random_tensor_A)
print(random_tensor_B)
random_tensor_A == random_tensor_B

tensor([[0.0655, 0.9338, 0.8605, 0.1228],
        [0.4696, 0.0552, 0.2301, 0.3960],
        [0.3597, 0.9647, 0.5911, 0.0189]])
tensor([[0.2044, 0.0526, 0.8519, 0.7331],
        [0.1641, 0.1946, 0.8829, 0.5153],
        [0.0631, 0.6547, 0.7178, 0.8623]])


tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])

In [86]:
# let's make some random but reproducible tensors

import torch


# set the random seed
RANDOM_SEED = 42
torch.manual_seed(RANDOM_SEED)
random_tensor_C = torch.rand(3,4)

torch.manual_seed(RANDOM_SEED)
random_tensor_D = torch.rand(3,4)

print(random_tensor_C)
print(random_tensor_D)
# both tensors should be the same because we set the random seed to the same value
# this is useful for reproducibility in machine learning experiments
random_tensor_C == random_tensor_D

tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])


tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
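
`torch.manual_seed` resets the global random state, which is why it is called before each `torch.rand` above. An alternative, sketched here as one possible approach rather than the notebook's method, is to pass a dedicated `torch.Generator` to each call so the global state is left alone.

```python
import torch

# Two separate generators seeded with the same value
gen_1 = torch.Generator().manual_seed(42)
gen_2 = torch.Generator().manual_seed(42)

tensor_1 = torch.rand(3, 4, generator=gen_1)
tensor_2 = torch.rand(3, 4, generator=gen_2)

# Same seed, same sequence of random numbers
print(torch.equal(tensor_1, tensor_2))  # True

# Drawing again from gen_1 continues its sequence, so the values differ
tensor_3 = torch.rand(3, 4, generator=gen_1)
print(torch.equal(tensor_1, tensor_3))  # False
```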

Extra resources for reproducibility:

* https://pytorch.org/docs/stable/notes/randomness.html
* https://en.wikipedia.org/wiki/Random_seed

## Running tensors and PyTorch objects on the GPUs (and making faster computations)
GPUs = faster computation on numbers, thanks to CUDA + NVIDIA hardware + PyTorch working behind the scenes to make everything hunky dory (good).


### 1. Getting a GPU
1. Easiest - Use Google Colab for a free GPU (options to upgrade as well)
2. Use your own GPU - takes a little bit of setup and requires the investment of purchasing a GPU. There are lots of options; see online resources for which one to get
3. Use cloud computing - GCP, AWS, Azure, these services allow you to rent computers on the cloud and access them

For options 2 and 3, setting up PyTorch with GPU drivers (CUDA) takes a little work; refer to the PyTorch setup documentation.

In [87]:
!nvidia-smi

Fri Jul 26 03:14:10 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.70                 Driver Version: 560.70         CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce RTX 2060      WDDM  |   00000000:01:00.0  On |                  N/A |
| N/A   69C    P0             24W /   80W |    1930MiB /   6144MiB |     10%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

### 2. Check for GPU access with PyTorch

In [88]:
torch.cuda.is_available()

True

Since PyTorch can run compute on either the GPU or the CPU, it's best practice to set up device-agnostic code: https://pytorch.org/docs/stable/notes/cuda.html

E.g. run on the GPU if available, else default to the CPU.

In [89]:
# setup device agnostic code

device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [90]:
torch.cuda.device_count()

1

### 3. Putting tensors (and models) on the GPU
The reason we want our tensors/models on the GPU is because using a GPU results in faster computations.

In [91]:
# Create a tensor (default device is CPU)
tensor = torch.tensor([1,2,3], device="cpu")

# Tensor not on GPU
print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [92]:
# Move Tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)

print(tensor_on_gpu, tensor_on_gpu.device)

tensor([1, 2, 3], device='cuda:0') cuda:0


### 4. Moving tensors back to the CPU

In [93]:
# Move tensor from GPU to CPU
tensor_on_cpu = tensor_on_gpu.to("cpu")
print(tensor_on_cpu, tensor_on_cpu.device)

tensor([1, 2, 3]) cpu
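
One common reason to move a tensor back to the CPU is converting it to NumPy: NumPy works on host memory, and calling `.numpy()` directly on a CUDA tensor raises a `TypeError`. The sketch below is device-agnostic, so it also runs on a machine without a GPU; `tensor_on_device` and `array` are names made up for the example.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
tensor_on_device = torch.tensor([1, 2, 3], device=device)

# .cpu() returns a CPU copy (or the same tensor if it is already on the CPU),
# after which .numpy() is safe to call
array = tensor_on_device.cpu().numpy()
print(array, type(array))

# The original tensor stays on its device
print(tensor_on_device.device)
```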


# Exercise

1. Documentation reading - A big part of deep learning (and learning to code in general) is getting familiar with the documentation of a certain framework you're using. We'll be using the PyTorch documentation a lot throughout the rest of this course. So I'd recommend spending 10-minutes reading the following (it's okay if you don't get some things for now, the focus is not yet full understanding, it's awareness). See the documentation on torch.Tensor and for torch.cuda.

2. Create a random tensor with shape (7, 7).

In [94]:
random_tensor = torch.rand(7,7)
random_tensor

tensor([[0.8694, 0.5677, 0.7411, 0.4294, 0.8854, 0.5739, 0.2666],
        [0.6274, 0.2696, 0.4414, 0.2969, 0.8317, 0.1053, 0.2695],
        [0.3588, 0.1994, 0.5472, 0.0062, 0.9516, 0.0753, 0.8860],
        [0.5832, 0.3376, 0.8090, 0.5779, 0.9040, 0.5547, 0.3423],
        [0.6343, 0.3644, 0.7104, 0.9464, 0.7890, 0.2814, 0.7886],
        [0.5895, 0.7539, 0.1952, 0.0050, 0.3068, 0.1165, 0.9103],
        [0.6440, 0.7071, 0.6581, 0.4913, 0.8913, 0.1447, 0.5315]])

3. Perform a matrix multiplication on the tensor from 2 with another random tensor with shape (1, 7) (hint: you may have to transpose the second tensor).

In [95]:
new_tensor = torch.rand(1,7)
new_tensor

tensor([[0.1587, 0.6542, 0.3278, 0.6532, 0.3958, 0.9147, 0.2036]])

In [96]:
print(new_tensor.shape, random_tensor.shape)

torch.Size([1, 7]) torch.Size([7, 7])


In [97]:
new_tensor = new_tensor.reshape(7,1)
print(new_tensor, new_tensor.shape)

tensor([[0.1587],
        [0.6542],
        [0.3278],
        [0.6532],
        [0.3958],
        [0.9147],
        [0.2036]]) torch.Size([7, 1])


In [98]:
random_tensor.matmul(new_tensor)

tensor([[1.9625],
        [1.0950],
        [0.9967],
        [1.8910],
        [1.9205],
        [1.0674],
        [1.6949]])

4. Set the random seed to 0 and do exercises 2 & 3 over again.

In [99]:
RANDOM_SEED = 0
torch.manual_seed(RANDOM_SEED)
random_tensor = torch.rand(7,7)
print(random_tensor)
print(random_tensor.shape)

#torch.manual_seed(RANDOM_SEED)
new_tensor = torch.rand(1,7)
print(new_tensor)
print(new_tensor.shape)

random_tensor.mm(new_tensor.T)


tensor([[0.4963, 0.7682, 0.0885, 0.1320, 0.3074, 0.6341, 0.4901],
        [0.8964, 0.4556, 0.6323, 0.3489, 0.4017, 0.0223, 0.1689],
        [0.2939, 0.5185, 0.6977, 0.8000, 0.1610, 0.2823, 0.6816],
        [0.9152, 0.3971, 0.8742, 0.4194, 0.5529, 0.9527, 0.0362],
        [0.1852, 0.3734, 0.3051, 0.9320, 0.1759, 0.2698, 0.1507],
        [0.0317, 0.2081, 0.9298, 0.7231, 0.7423, 0.5263, 0.2437],
        [0.5846, 0.0332, 0.1387, 0.2422, 0.8155, 0.7932, 0.2783]])
torch.Size([7, 7])
tensor([[0.4820, 0.8198, 0.9971, 0.6984, 0.5675, 0.8352, 0.2056]])
torch.Size([1, 7])


tensor([[1.8542],
        [1.9611],
        [2.2884],
        [3.0481],
        [1.7067],
        [2.5290],
        [1.7989]])

5. Speaking of random seeds, we saw how to set it with torch.manual_seed() but is there a GPU equivalent? (hint: you'll need to look into the documentation for torch.cuda for this one). If there is, set the GPU random seed to 1234.

In [100]:
GPU_RANDOM_SEED = 1234
torch.cuda.manual_seed(GPU_RANDOM_SEED)  # GPU equivalent of torch.manual_seed()

6. Create two random tensors of shape (2, 3) and send them both to the GPU (you'll need access to a GPU for this). Set torch.manual_seed(1234) when creating the tensors (this doesn't have to be the GPU random seed).

In [101]:
device = "cuda" if torch.cuda.is_available() else "cpu"

torch.cuda.manual_seed(GPU_RANDOM_SEED)
random_tensor_A = torch.rand((2,3),device=device)

#torch.cuda.manual_seed(GPU_RANDOM_SEED)
random_tensor_B = torch.rand((2,3), device=device)

print(random_tensor_A, random_tensor_A.device)
print(random_tensor_B, random_tensor_B.device)

tensor([[0.1272, 0.8167, 0.5440],
        [0.6601, 0.2721, 0.9737]], device='cuda:0') cuda:0
tensor([[0.6208, 0.0276, 0.3255],
        [0.1114, 0.6812, 0.3608]], device='cuda:0') cuda:0


7. Perform a matrix multiplication on the tensors you created in 6 (again, you may have to adjust the shapes of one of the tensors).

In [102]:
matrix_multiplication = random_tensor_A.mm(random_tensor_B.T)
matrix_multiplication

tensor([[0.2786, 0.7668],
        [0.7343, 0.6102]], device='cuda:0')

8. Find the maximum and minimum values of the output of 7.

In [103]:
print(torch.min(matrix_multiplication))
print(torch.max(matrix_multiplication))

tensor(0.2786, device='cuda:0')
tensor(0.7668, device='cuda:0')


9. Find the maximum and minimum index values of the output of 7.

In [104]:
print(torch.argmin(matrix_multiplication))
print(torch.argmax(matrix_multiplication))

tensor(0, device='cuda:0')
tensor(1, device='cuda:0')
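
Without a `dim` argument, `argmin` and `argmax` return an index into the flattened tensor. A hedged sketch of turning that flat index back into a (row, column) pair with `divmod`, using the values printed in the output of exercise 7 (on the CPU, so no GPU is needed):

```python
import torch

# Values copied from the output of exercise 7 above
matrix = torch.tensor([[0.2786, 0.7668],
                       [0.7343, 0.6102]])

flat_idx = torch.argmax(matrix).item()  # position in the flattened tensor

# Row = flat index // number of columns, column = the remainder
row, col = divmod(flat_idx, matrix.shape[1])
print(flat_idx, (row, col), matrix[row, col])
```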


10. Make a random tensor with shape (1, 1, 1, 10) and then create a new tensor with all the 1 dimensions removed to be left with a tensor of shape (10). Set the seed to 7 when you create it and print out the first tensor and its shape as well as the second tensor and its shape.

In [105]:
RANDOM_SEED = 7
torch.manual_seed(RANDOM_SEED)
tensor = torch.rand(1,1,1,10)

print(tensor, tensor.shape)

squeezed_tensor = tensor.squeeze()
print(squeezed_tensor, squeezed_tensor.shape)

tensor([[[[0.5349, 0.1988, 0.6592, 0.6569, 0.2328, 0.4251, 0.2071, 0.6297,
           0.3653, 0.8513]]]]) torch.Size([1, 1, 1, 10])
tensor([0.5349, 0.1988, 0.6592, 0.6569, 0.2328, 0.4251, 0.2071, 0.6297, 0.3653,
        0.8513]) torch.Size([10])
