<a href="https://colab.research.google.com/github/tfranke0814/pytorch-deep-learning/blob/main/00_pytorch_fundamentals_code.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## 00. PyTorch Fundamentals

Resource notebook: https://www.learnpytorch.io/00_pytorch_fundamentals/

If you have a question: https://github.com/mrdbourke/pytorch-deep-learning/discussions

In [3]:
!nvidia-smi

Tue Jun  3 23:39:05 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   55C    P8             13W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

In [4]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.6.0+cu124


## Intro to Tensors
Dan Fleisch : https://www.youtube.com/watch?v=f5liqUk0ZTw

### Creating tensors

https://docs.pytorch.org/docs/stable/tensors.html

In [5]:
# Scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [6]:
scalar.ndim

0

In [7]:
# Return tensor as Python int
scalar.item()

7

In [8]:
# Vector
vector = torch.tensor([7, 7])
vector

tensor([7, 7])

In [9]:
vector.ndim

1

In [10]:
vector.shape

torch.Size([2])

In [11]:
# MATRIX
MATRIX = torch.tensor([[7, 8],
                       [9, 10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [12]:
MATRIX.ndim

2

In [13]:
MATRIX[0]
MATRIX[1]

tensor([ 9, 10])

In [14]:
MATRIX.shape

torch.Size([2, 2])

In [15]:
# TENSOR
TENSOR = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]]])
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])

In [16]:
TENSOR.ndim

3

In [17]:
TENSOR.shape

torch.Size([1, 3, 3])

In [18]:
TENSOR2 = torch.tensor([[[1,2,3],
                         [4,5,6],
                         [7,8,9]],
                          [[9,8,7],
                         [6,5,4],
                         [3,2,1]]])
TENSOR2

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]],

        [[9, 8, 7],
         [6, 5, 4],
         [3, 2, 1]]])

In [19]:
TENSOR2.ndim

3

In [20]:
TENSOR2.shape

torch.Size([2, 3, 3])

In [21]:
# ERROR: the nested lists in TENSOR3 are ragged (2 rows vs. 3 rows), so torch.tensor() raises a ValueError

# TENSOR3 = torch.tensor([[[1,2,3],
#                          [4,5,6]],
#                           [[9,8,7],
#                          [6,5,4],
#                          [3,2,1]]])
# TENSOR3

TENSOR4 = torch.tensor([[[[1,2,3, 4, 3],
                         [4,5,6, 3, 7]],
                          [[9,8,7, 5, 5],
                         [6,5,4, 6, 7]], [[1,2,3, 4, 3],
                         [4,5,6, 3, 7]],
                          [[9,8,7, 5, 5],
                         [6,5,4, 6, 7]]], [[[1,2,3, 4, 3],
                         [4,5,6, 3, 7]],
                          [[9,8,7, 5, 5],
                         [6,5,4, 6, 7]], [[1,2,3, 4, 3],
                         [4,5,6, 3, 7]],
                          [[9,8,7, 5, 5],
                         [6,5,4, 6, 7]]]])
print(TENSOR4)
print(TENSOR4.ndim)
print(TENSOR4.shape)

tensor([[[[1, 2, 3, 4, 3],
          [4, 5, 6, 3, 7]],

         [[9, 8, 7, 5, 5],
          [6, 5, 4, 6, 7]],

         [[1, 2, 3, 4, 3],
          [4, 5, 6, 3, 7]],

         [[9, 8, 7, 5, 5],
          [6, 5, 4, 6, 7]]],


        [[[1, 2, 3, 4, 3],
          [4, 5, 6, 3, 7]],

         [[9, 8, 7, 5, 5],
          [6, 5, 4, 6, 7]],

         [[1, 2, 3, 4, 3],
          [4, 5, 6, 3, 7]],

         [[9, 8, 7, 5, 5],
          [6, 5, 4, 6, 7]]]])
4
torch.Size([2, 4, 2, 5])


### Random tensors

Why random tensors?

Many neural networks start with tensors full of random numbers and then adjust those numbers to better represent the data.


In [22]:
# Create a random tensor of size (2, 3, 4); the commented-out lines show other sizes
# random_tensor = torch.rand(3, 4)
# random_tensor = torch.rand(1, 3, 4)
random_tensor = torch.rand(2, 3, 4)
random_tensor

tensor([[[0.4809, 0.9034, 0.3470, 0.4118],
         [0.6197, 0.3665, 0.1770, 0.2639],
         [0.3673, 0.2941, 0.2363, 0.4554]],

        [[0.2211, 0.3978, 0.5339, 0.4757],
         [0.8792, 0.8056, 0.7889, 0.5603],
         [0.7081, 0.8305, 0.5843, 0.5438]]])

In [23]:
random_tensor.ndim

3

In [24]:
# Create a random tensor with similar shape to an image tensor
random_image_size_tensor = torch.rand(size=(224, 224, 3)) # height, width, color channels (R, G, B)
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

### Zeros and Ones

In [25]:
# Create a tensor of all zeros
zeros = torch.zeros(2, 3)
print(zeros)

# Create a tensor of all ones
ones = torch.ones(2, 3)
print(ones)

tensor([[0., 0., 0.],
        [0., 0., 0.]])
tensor([[1., 1., 1.],
        [1., 1., 1.]])


In [26]:
ones.dtype

torch.float32

In [27]:
random_tensor.dtype

torch.float32

### Creating a range and tensors-like

In [28]:
# torch.range() is deprecated -> use torch.arange()
one_to_ten = torch.arange(start=1, end=11, step=1)
print(one_to_ten)

# Create tensors with the same shape as an existing tensor
tens_zeroes = torch.zeros_like(input=one_to_ten)
print(tens_zeroes)

tens_ones = torch.ones_like(input=one_to_ten)
print(tens_ones)

tens_rand = torch.rand_like(input=one_to_ten.float())
print(tens_rand)

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
tensor([0.8353, 0.4836, 0.6017, 0.9326, 0.2709, 0.1934, 0.5641, 0.6609, 0.7638,
        0.2068])


1:56:26  - https://www.youtube.com/watch?v=LyJtbe__2i0&t=1h56m24s

### Tensor datatypes

**Note:** Tensor datatypes are one of the 3 big sources of errors you'll encounter:
1. Tensors not the right datatype
2. Tensors not the right shape
3. Tensors not on the right device
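The three error types above can be reproduced in a few lines. The sketch below is only an illustration: the variable names are made up for this example, and the device case runs only when a CUDA GPU is available.

```python
import torch

# 1. Datatype: torch.mean() is not defined for integer (int64) tensors.
ints = torch.arange(0, 10)
dtype_error = None
try:
    ints.mean()
except RuntimeError as err:
    dtype_error = err
mean_ok = ints.float().mean()  # cast to float32 first

# 2. Shape: the inner dimensions of (3, 2) @ (3, 2) do not match.
a = torch.ones(3, 2)
shape_error = None
try:
    a @ a
except RuntimeError as err:
    shape_error = err
product_ok = a @ a.T  # (3, 2) @ (2, 3) -> (3, 3)

# 3. Device: mixing a CPU tensor and a GPU tensor in one operation fails.
device_error = None
if torch.cuda.is_available():
    cpu_t = torch.ones(3)
    gpu_t = torch.ones(3, device="cuda")
    try:
        cpu_t + gpu_t
    except RuntimeError as err:
        device_error = err
    sum_ok = cpu_t.to("cuda") + gpu_t  # move both to the same device

print(dtype_error)
print(shape_error)
print(device_error)
```

In each case the fix is to inspect the tensor (`.dtype`, `.shape`, `.device`) and convert it before the operation.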

In [29]:
# Float 32 tensor
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=None, # What datatype is the tensor (e.g. float32 or float16); None infers it from the data
                               device=None, # What device is the tensor on. Default is "cpu", can change to "cuda"
                               requires_grad=False) # Whether or not to track gradients of tensors
float_32_tensor.dtype

torch.float32

In [30]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [31]:
(float_16_tensor * float_32_tensor).dtype

torch.float32

In [32]:
int_32_tensor = torch.tensor([3, 6, 9], dtype=torch.int32)
int_32_tensor

tensor([3, 6, 9], dtype=torch.int32)

In [33]:
int_32_tensor * float_32_tensor

tensor([ 9., 36., 81.])
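The two multiplications above work because PyTorch promotes mixed dtypes to a common type. As a small sketch (the tensor names here are chosen for this example), `torch.result_type()` and `torch.promote_types()` report the dtype an operation would produce, and `.to()` makes the conversion explicit:

```python
import torch

f16 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float16)
f32 = torch.tensor([3.0, 6.0, 9.0])
i32 = torch.tensor([3, 6, 9], dtype=torch.int32)

# Ask PyTorch which dtype an operation between two tensors would produce.
f16_f32 = torch.result_type(f16, f32)   # float16 with float32
i32_f32 = torch.result_type(i32, f32)   # int32 with float32
int_pair = torch.promote_types(torch.int32, torch.int64)

print(f16_f32, i32_f32, int_pair)

# An explicit cast documents the intent instead of relying on promotion.
i32_as_float = i32.to(torch.float32)
product = i32_as_float * f32
print(product, product.dtype)
```

Promotion does not cover every operation (for example, `torch.mean()` still rejects integer tensors), so checking `.dtype` remains a useful habit.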

### Getting information from tensors (tensor attributes)

1. Tensors not the right datatype - to get the datatype of a tensor, use `tensor.dtype`
2. Tensors not the right shape - to get the shape of a tensor, use `tensor.shape`
3. Tensors not on the right device - to get the device of a tensor, use `tensor.device`

In [34]:
# Create a tensor
some_tensor = torch.rand(3, 4) #, device="cuda")
print(some_tensor)

# Find out details about some tensor
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Shape of tensor: {some_tensor.shape}")
print(f"Device of tensor: {some_tensor.device}")
some_tensor.shape, some_tensor.size, some_tensor.size()

tensor([[0.2218, 0.1332, 0.6608, 0.3596],
        [0.5916, 0.3246, 0.8337, 0.1040],
        [0.1032, 0.1565, 0.3196, 0.2512]])
Datatype of tensor: torch.float32
Shape of tensor: torch.Size([3, 4])
Device of tensor: cpu


(torch.Size([3, 4]), <function Tensor.size>, torch.Size([3, 4]))

### Manipulating Tensors (tensor operations)

Tensor operations include:
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix Multiplication

In [35]:
# Addition
tensor = torch.tensor([1, 2, 3])
tensor + 10

tensor([11, 12, 13])

In [36]:
# Multiplication
tensor * 10

tensor([10, 20, 30])

In [37]:
# Subtraction
tensor - 10

tensor([-9, -8, -7])

In [38]:
# PyTorch built-in functions (only the last expression is displayed)
torch.mul(tensor, 10)
torch.add(tensor, 10)
torch.sub(tensor, 10)

tensor([-9, -8, -7])

### Matrix Multiplication

There are two main ways to perform multiplication in neural networks and deep learning:

1. Element-wise multiplication
2. Matrix multiplication (dot product) - prefer `torch.matmul()` over `for` loops, because the vectorized implementation is much faster

There are two main rules for matrix multiplication:
1. The **inner dimensions** must match:
    * `(3, 2) @ (3, 2)` won't work
    * `(2, 3) @ (3, 2)` will work
    * `(3, 2) @ (2, 3)` will work
2. The resulting matrix has the shape of the **outer dimensions**:
    * `(2, 3) @ (3, 2)` -> `(2, 2)`
    * `(3, 2) @ (2, 3)` -> `(3, 3)`
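The two rules can be checked directly. The sketch below uses small example matrices (`A` and `B` are names chosen for this illustration) and compares `torch.matmul()` with a hand-written loop that computes each entry as a row-by-column sum of products:

```python
import torch

A = torch.arange(6.0).reshape(2, 3)  # shape (2, 3)
B = torch.arange(6.0).reshape(3, 2)  # shape (3, 2)

# Rule 2: the result takes the outer dimensions.
print((A @ B).shape)  # (2, 3) @ (3, 2) -> (2, 2)
print((B @ A).shape)  # (3, 2) @ (2, 3) -> (3, 3)

# Manual version: entry (i, j) is the sum of row i of A times column j of B.
manual = torch.zeros(2, 2)
for i in range(2):
    for j in range(2):
        manual[i, j] = (A[i, :] * B[:, j]).sum()

# The loop and the vectorized call agree.
print(torch.allclose(manual, torch.matmul(A, B)))

# Rule 1: mismatched inner dimensions raise a RuntimeError.
try:
    A @ A  # (2, 3) @ (2, 3)
    inner_mismatch_raised = False
except RuntimeError:
    inner_mismatch_raised = True
print(inner_mismatch_raised)
```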

In [39]:
# Element-wise
print(tensor, "*",  tensor, "=", tensor * tensor)

# Matrix
torch.matmul(tensor, tensor)

tensor([1, 2, 3]) * tensor([1, 2, 3]) = tensor([1, 4, 9])


tensor(14)

In [40]:
tensor @ tensor

tensor(14)

### Dealing with tensor shape errors
A **transpose** switches the axes (dimensions) of a given tensor.

In [41]:
# Shapes for matrix multiplication
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]])

tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]])

# torch.mm(tensor_A, tensor_B) # torch.mm only handles 2-D matrices (no broadcasting); for 2-D inputs it gives the same result as torch.matmul

# ERROR: torch.matmul(tensor_A, tensor_B) fails because the inner dimensions don't match: (3, 2) @ (3, 2)

tensor_B.T # Transpose
torch.mm(tensor_A, tensor_B.T), torch.matmul(tensor_A, tensor_B.T).shape

(tensor([[ 27,  30,  33],
         [ 61,  68,  75],
         [ 95, 106, 117]]),
 torch.Size([3, 3]))

## Finding the min, max, mean, sum, etc. (tensor aggregation)

In [42]:
x = torch.arange(0, 100, 10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [43]:
print(torch.min(x), x.min())
print(torch.max(x), x.max())
print(torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()) # torch.mean() needs a float or complex dtype, not int64 (long)
print(torch.sum(x), x.sum())

tensor(0) tensor(0)
tensor(90) tensor(90)
tensor(45.) tensor(45.)
tensor(450) tensor(450)


## Finding the positional min and max
 - `argmin()`
 - `argmax()`

In [44]:
print(f"Positional Min: {torch.argmin(x), x.argmin()}")
print(f"Min: {x[torch.argmin(x)], x.min()}")
print(f"Positional Max : {torch.argmax(x), x.argmax()}")
print(f"Max : {x[torch.argmax(x)], x.max()}")

Positional Min: (tensor(0), tensor(0))
Min: (tensor(0), tensor(0))
Positional Max : (tensor(9), tensor(9))
Max : (tensor(90), tensor(90))
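The calls above reduce a 1-D tensor to a single value. On tensors with more dimensions, the same functions accept a `dim` argument. A minimal sketch, using a made-up 2 x 3 tensor named `scores`:

```python
import torch

scores = torch.tensor([[1.0, 9.0, 3.0],
                       [7.0, 2.0, 8.0]])

# Without dim, argmax() indexes into the flattened tensor.
flat_index = scores.argmax()

# With dim=1, one result per row; with dim=0, one result per column.
row_best = scores.argmax(dim=1)
col_max = scores.max(dim=0)   # returns both values and indices
row_sums = scores.sum(dim=1)

print(flat_index)                        # tensor(1)
print(row_best)                          # tensor([1, 2])
print(col_max.values, col_max.indices)   # tensor([7., 9., 8.]) tensor([1, 0, 1])
print(row_sums)                          # tensor([13., 17.])
```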


## Reshaping, stacking, squeezing, and unsqueezing tensor

* Reshaping - Reshapes an input tensor to a defined shape
* View - Return a view of an input tensor of certain shape but keep the same memory as the original tensor
* Stacking - Combine multiple tensors on top of each other (vstack) or side by side (hstack)
* Squeeze - Removes all dimensions of size `1` from a tensor
* Unsqueeze - Adds a dimension of size `1` to a target tensor
* Permute - Return a view of the input with dimensions permuted (swapped) in a certain way
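One detail about stacking is easy to miss: for 1-D inputs, `torch.stack()` always creates a new dimension, while `torch.hstack()` concatenates along the existing one. The sketch below (with an example tensor `x` chosen for illustration) compares the resulting shapes:

```python
import torch

x = torch.arange(1.0, 4.0)  # shape (3,)

stacked_rows = torch.stack([x, x], dim=0)  # new leading dim  -> shape (2, 3)
stacked_cols = torch.stack([x, x], dim=1)  # new trailing dim -> shape (3, 2)

v = torch.vstack([x, x])  # same result as stack(dim=0) for 1-D inputs -> (2, 3)
h = torch.hstack([x, x])  # concatenates 1-D inputs end to end -> (6,)

print(stacked_rows.shape, stacked_cols.shape)
print(v.shape, h.shape)
```

So for 1-D tensors, "side by side" columns come from `torch.stack(..., dim=1)`, not from `torch.hstack()`.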

In [45]:
x = torch.arange(1., 11.)
x, x.shape

(tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]), torch.Size([10]))

In [46]:
x_reshaped = x.reshape(2, 5)
x_reshaped, x_reshaped.shape

(tensor([[ 1.,  2.,  3.,  4.,  5.],
         [ 6.,  7.,  8.,  9., 10.]]),
 torch.Size([2, 5]))

In [47]:
x = torch.arange(1., 10.)
# Change the view
z = x.view(1, 9)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [48]:
# Changing z changes x (because a view of a tensor shares the same memory as the original input)
z[:, 0] = 5
z, x

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))

In [49]:
# Stack tensors along a new dimension: dim=0 stacks them as rows, dim=1 as columns
x_stackedv = torch.stack([x, x, x, x], dim=0)
x_stackedh = torch.stack([x, x, x, x], dim=1)
x_stackedv, x_stackedv.shape, x_stackedh, x_stackedh.shape

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 torch.Size([4, 9]),
 tensor([[5., 5., 5., 5.],
         [2., 2., 2., 2.],
         [3., 3., 3., 3.],
         [4., 4., 4., 4.],
         [5., 5., 5., 5.],
         [6., 6., 6., 6.],
         [7., 7., 7., 7.],
         [8., 8., 8., 8.],
         [9., 9., 9., 9.]]),
 torch.Size([9, 4]))

In [50]:
# Squeeze and Unsqueeze
print(f"Previous tensor: {x_reshaped}")
print(f"Previous shape: {x_reshaped.shape}")
print(x_reshaped.unsqueeze(dim=1).unsqueeze(dim=0))
print(x_reshaped.unsqueeze(dim=0).unsqueeze(dim=1))
print(x_reshaped.unsqueeze(dim=0).unsqueeze(dim=1).squeeze())

Previous tensor: tensor([[ 1.,  2.,  3.,  4.,  5.],
        [ 6.,  7.,  8.,  9., 10.]])
Previous shape: torch.Size([2, 5])
tensor([[[[ 1.,  2.,  3.,  4.,  5.]],

         [[ 6.,  7.,  8.,  9., 10.]]]])
tensor([[[[ 1.,  2.,  3.,  4.,  5.],
          [ 6.,  7.,  8.,  9., 10.]]]])
tensor([[ 1.,  2.,  3.,  4.,  5.],
        [ 6.,  7.,  8.,  9., 10.]])


In [51]:
# torch.squeeze() - removes all dimensions of size 1
x_reshaped = x.reshape(1, 9)
x_squeezed = x_reshaped.squeeze()
x_reshaped, x_reshaped.shape, x_squeezed, x_squeezed.shape

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 torch.Size([1, 9]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]),
 torch.Size([9]))

In [52]:
# torch.unsqueeze() - adds a dimension of size 1 to a target tensor at a specific dim (dimension)
print(f"Previous tensor: {x_squeezed}")
print(f"Previous shape: {x_squeezed.shape}")
# Add an extra dimension with unsqueeze()
print(f"New tensor: {x_squeezed.unsqueeze(dim=0)}")
print(f"New shape: {x_squeezed.unsqueeze(dim=0).shape}")
print(f"New tensor: {x_squeezed.unsqueeze(dim=1)}")
print(f"New shape: {x_squeezed.unsqueeze(dim=1).shape}")

Previous tensor: tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
Previous shape: torch.Size([9])
New tensor: tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]])
New shape: torch.Size([1, 9])
New tensor: tensor([[5.],
        [2.],
        [3.],
        [4.],
        [5.],
        [6.],
        [7.],
        [8.],
        [9.]])
New shape: torch.Size([9, 1])


In [53]:
# torch.permute - rearranges the dimensions of a target tensor in a specified order
tensor = torch.randn(2,3,5)
print(f"Tensor: {tensor}")
print(f"Tensor Shape: {tensor.shape}")

new_tensor = tensor.permute(2, 0, 1)
print(f"\nNew Tensor: {new_tensor}")
print(f"New Tensor Shape: {new_tensor.shape}")
print(f"Floop the Tensor: {x_reshaped.permute(1, 0)}")

# Images
x_original = torch.rand(size=(224, 224, 3)) # [height, width, color_channels]
x_permuted = x_original.permute(2, 0, 1) # [color_channels, height, width]
print(f"\nImages\nOriginal shape: {x_original.shape}")
print(f"Permuted shape: {x_permuted.shape}")

Tensor: tensor([[[-0.3983,  0.7081, -0.3357, -0.7670,  0.2310],
         [ 0.5862,  1.0539,  0.2348, -1.0195, -0.6762],
         [ 1.0192, -1.1358,  1.8083,  1.0717,  1.0052]],

        [[ 0.2603, -0.7928,  0.9412, -0.8172,  1.6802],
         [-0.3237, -1.2111, -0.7030, -0.2812, -0.8508],
         [ 1.3521,  0.4953,  0.6987,  0.7350, -2.4614]]])
Tensor Shape: torch.Size([2, 3, 5])

New Tensor: tensor([[[-0.3983,  0.5862,  1.0192],
         [ 0.2603, -0.3237,  1.3521]],

        [[ 0.7081,  1.0539, -1.1358],
         [-0.7928, -1.2111,  0.4953]],

        [[-0.3357,  0.2348,  1.8083],
         [ 0.9412, -0.7030,  0.6987]],

        [[-0.7670, -1.0195,  1.0717],
         [-0.8172, -0.2812,  0.7350]],

        [[ 0.2310, -0.6762,  1.0052],
         [ 1.6802, -0.8508, -2.4614]]])
New Tensor Shape: torch.Size([5, 2, 3])
Floop the Tensor: tensor([[5.],
        [2.],
        [3.],
        [4.],
        [5.],
        [6.],
        [7.],
        [8.],
        [9.]])

Images
Original shape: torch.Size([224, 224, 3])
Permuted shape: torch.Size([3, 224, 224])

In [54]:
x_original[0, 0, 0] = 83457
x_original[0, 0, 0], x_permuted[0, 0, 0], x_original[0, 0, 1], x_permuted[1, 0, 0]

(tensor(83457.), tensor(83457.), tensor(0.3309), tensor(0.3309))

## Indexing

Indexing in PyTorch is similar to indexing in NumPy.

In [55]:
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [56]:
x[0], x[0][0], x[0, 0]

(tensor([[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]),
 tensor([1, 2, 3]),
 tensor([1, 2, 3]))

In [57]:
np.array(x[0])[0, 0], np.array(x[0])[0][0]

(np.int64(1), np.int64(1))

In [58]:
np.array(x)[0, 0, 1], np.array(x)[0][0][1]

(np.int64(2), np.int64(2))

3:31:21 - https://www.youtube.com/watch?v=LyJtbe__2i0&t=3h31m21s

In [59]:
# Use ":" to get all of a target dimension
x[:, :, 1], x[:, 1, 1], x[0, 1, 1], x[0, 0, :], x[0, 2, 2], x[0, :, 2]

(tensor([[2, 5, 8]]),
 tensor([5]),
 tensor(5),
 tensor([1, 2, 3]),
 tensor(9),
 tensor([3, 6, 9]))

## PyTorch tensors & NumPy

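NumPy is a popular numerical library, and PyTorch can interact with it:
* Data in NumPy, want it in a PyTorch tensor -> `torch.from_numpy(ndarray)`
    * The tensor shares memory with the array, so in-place changes to one **do** affect the other. Reassigning the name (`array = array + 1`) creates a new array and leaves the tensor unchanged
* PyTorch tensor -> NumPy -> `torch.Tensor.numpy()`
    * For CPU tensors the array shares memory with the tensor, so changing one **does** affect the other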
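It is worth verifying the memory behaviour directly, because `torch.from_numpy()` returns a tensor that shares memory with the source array, while `torch.tensor()` copies the data. A short sketch (the names `shared` and `copied` are chosen for this example):

```python
import numpy as np
import torch

array = np.arange(1.0, 4.0)        # array([1., 2., 3.])
shared = torch.from_numpy(array)   # shares memory with `array`
copied = torch.tensor(array)       # independent copy of the data

# In-place change: visible through the shared tensor, not through the copy.
array += 1
print(shared)   # tensor([2., 3., 4.], dtype=torch.float64)
print(copied)   # tensor([1., 2., 3.], dtype=torch.float64)

# Rebinding the name creates a new array; the shared tensor keeps the old buffer.
array = array + 1
print(array)    # [3. 4. 5.]
print(shared)   # tensor([2., 3., 4.], dtype=torch.float64)
```

If an independent tensor is needed, `torch.tensor(array)` or `torch.from_numpy(array).clone()` avoids the shared buffer.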

In [60]:
# Numpy array to tensor
array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)
array, tensor, array.dtype, tensor.dtype

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64),
 dtype('float64'),
 torch.float64)

In [61]:
# Warning: different default datatype (NumPy defaults to float64, PyTorch to float32)
torch.arange(1.0, 8.0).dtype

torch.float32

In [62]:
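# This expression builds a new array and discards the result, so `array` and `tensor` are both unchanged
# (an in-place change such as `array += 1` would also change `tensor`, because they share memory)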
array - array + 1
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [63]:
# Tensor to NumPy array
tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor, tensor.dtype, numpy_tensor.dtype

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32),
 torch.float32,
 dtype('float32'))

In [64]:
# Changing the tensor in place changes the array (they share memory)
tensor += 1
tensor, numpy_tensor

(tensor([2., 2., 2., 2., 2., 2., 2.]),
 array([2., 2., 2., 2., 2., 2., 2.], dtype=float32))

In [65]:
numpy_tensor += 1
tensor, numpy_tensor

(tensor([3., 3., 3., 3., 3., 3., 3.]),
 array([3., 3., 3., 3., 3., 3., 3.], dtype=float32))

## PyTorch Reproducibility (Taking the random out of random)

Brief NN Summary:

 `Start with random numbers -> tensor operations -> update random numbers to try to make them better represent the data -> again -> again -> again -> ...`

To reduce the randomness in neural networks and PyTorch, there is the concept of a **random seed**.

Essentially what the random seed does is "flavour" the randomness.
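A seed fixes the starting point of the pseudorandom sequence, not every individual call. The sketch below shows that consecutive calls after one `torch.manual_seed()` still differ, and that a separate `torch.Generator` (used here as one possible alternative to the global seed) gives a reproducible stream without touching global state:

```python
import torch

torch.manual_seed(42)
first = torch.rand(2)
second = torch.rand(2)   # continues the same sequence, so it differs from `first`

torch.manual_seed(42)    # reset the global generator
replay = torch.rand(2)   # same values as `first`

print(torch.equal(first, replay))   # True
print(torch.equal(first, second))   # False

# Local generators: two generators with the same seed produce the same values.
gen_a = torch.Generator().manual_seed(123)
gen_b = torch.Generator().manual_seed(123)
local_a = torch.rand(2, generator=gen_a)
local_b = torch.rand(2, generator=gen_b)
print(torch.equal(local_a, local_b))  # True
```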

In [66]:
# Pseudorandom

random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)

print(random_tensor_A)
print(random_tensor_B)
print(random_tensor_A == random_tensor_B)

tensor([[0.3659, 0.9994, 0.9359, 0.2054],
        [0.2839, 0.7332, 0.0950, 0.5987],
        [0.0668, 0.7772, 0.0035, 0.3130]])
tensor([[0.5770, 0.6730, 0.7942, 0.0403],
        [0.7620, 0.6988, 0.9629, 0.7225],
        [0.8128, 0.7970, 0.8744, 0.5077]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [67]:
# Random, but reproducible
# Set random seed
RANDOM_SEED = 42
torch.manual_seed(RANDOM_SEED) # manual_seed() resets the generator; call it again before each torch.rand() you want to reproduce
random_tensor_C = torch.rand(3, 4)
torch.manual_seed(RANDOM_SEED)
random_tensor_D = torch.rand(3, 4)

print(random_tensor_C)
print(random_tensor_D)
print(random_tensor_C == random_tensor_D)

tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])


## Accessing a GPU in PyTorch (Faster Computations)
CUDA + NVIDIA + PyTorch <3

### 1. Getting a GPU

1. Get a cloud GPU from Google Colab for free (or upgrade)
2. Use your own GPU - takes some setup and requires purchasing one
3. Use cloud computing - GCP, AWS, Azure

For options 2 and 3, PyTorch and the GPU driver (CUDA) take some setup. See the documentation.

In [68]:
!nvidia-smi

Tue Jun  3 23:39:10 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   55C    P8             13W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

### 2. Check for GPU access with PyTorch

In [72]:
# Check for GPU access with PyTorch
torch.cuda.is_available()

True

Because PyTorch can run computations on either the GPU or the CPU, it's best practice to set up device-agnostic code: https://pytorch.org/docs/stable/notes/cuda.html#best-practices

In [70]:
# Setup device agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'
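With the device string in hand, the same code can run with or without a GPU. The following is a minimal sketch under that assumption; the small `nn.Linear` model and the tensor sizes are placeholders chosen for illustration:

```python
import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Create data directly on the target device and move the model there too.
x = torch.rand(4, 3, device=device)
model = nn.Linear(in_features=3, out_features=2).to(device)

# Inputs and model parameters live on the same device, so this works on CPU or GPU.
y = model(x)
print(y.device)

# NumPy only works with CPU memory, so detach from autograd and copy back first.
result = y.detach().cpu().numpy()
print(result.shape)  # (4, 2)
```

Keeping the device in a single variable means only one line decides where the computation happens.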

In [71]:
# count number of devices
torch.cuda.device_count()

1

### 3. Putting tensors (and models) on the GPU

Running computations on a GPU is usually faster than on a CPU, especially for large tensors and models.

In [73]:
# Create a tensor (default on the CPU)
tensor = torch.tensor([1, 2, 3])

# Tensor not on GPU
print(tensor, tensor.device)

# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3]) cpu


tensor([1, 2, 3], device='cuda:0')

### 4. Moving tensor back to the CPU

In [75]:
# If tensor is on GPU, can't transform it to NumPy
# tensor_on_gpu.numpy()

## TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

# Can set it back to CPU
tensor_on_gpu.cpu().numpy()

array([1, 2, 3])