<a href="https://colab.research.google.com/github/JunHL96/PyTorch/blob/main/00_pytorch_fundamentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## 00. PyTorch Fundamentals

Resource notebook: https://www.learnpytorch.io/00_pytorch_fundamentals/

In [153]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.4.1+cu121


### Device-Agnostic Code:

In [154]:
# Setup device-agnostic code
if torch.cuda.is_available():
    device = "cuda" # NVIDIA GPU
elif torch.backends.mps.is_available():
    device = "mps" # Apple GPU
else:
    device = "cpu" # Defaults to CPU if NVIDIA GPU/Apple GPU aren't available

print(f"Using device: {device}")

Using device: cpu


## Introduction to Tensors

### What are tensors?

A tensor is a multi-dimensional matrix containing elements of a single data type.

### Creating tensors

PyTorch tensors are created using `torch.tensor()`: https://pytorch.org/docs/stable/tensors.html

In [155]:
# scalar
scalar = torch.tensor(7, device = device)
scalar

tensor(7)

In [156]:
scalar.ndim

0

In [157]:
# Get tensor back as Python int
scalar.item()

7

In [158]:
# Vector (magnitude, direction)
vector = torch.tensor([7, 7], device = device)
vector

tensor([7, 7])

In [159]:
vector.ndim

1

In [160]:
vector.shape

torch.Size([2])

<details>

### `.shape`
- **Returns the dimensions of the tensor (or array)** as a tuple.
- Each element of the tuple (an ordered, immutable collection of items) represents the size of the tensor in that dimension.

Example:
```python
vector = torch.tensor([7, 7])
print(vector.shape)  # Output: torch.Size([2])
```

### `.ndim`
- **Returns the number of dimensions (rank) of the tensor.**
- This is an integer value representing how many dimensions the tensor has.

Example:
```python
vector = torch.tensor([7, 7])
print(vector.ndim)  # Output: 1
```

### Explanation:
- `vector = torch.tensor([7, 7])` creates a 1D tensor (vector) with 2 elements.
- `.shape` gives `torch.Size([2])` because the vector has 2 elements in one dimension.
- `.ndim` gives `1` because the tensor is 1-dimensional.


| Attribute  | Description                               | Example Output            |
|------------|-------------------------------------------|---------------------------|
| `.shape`   | Tuple representing the size of each dimension | `torch.Size([2])`   |
| `.ndim`    | Integer representing the number of dimensions | `1`                       |

</details>

## Matrix

In [161]:
# MATRIX
MATRIX = torch.tensor([[7, 8],
                      [9, 10]], device = device)

In [162]:
MATRIX.ndim

2

In [163]:
MATRIX[0] # index on 0th axis

tensor([7, 8])

In [164]:
MATRIX[1] # index 1 on the 0th axis (the second row)

tensor([ 9, 10])

In [165]:
MATRIX.shape

torch.Size([2, 2])

## Tensor

In [166]:
# TENSOR
TENSOR = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]]], device = device)
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])

In [167]:
TENSOR.ndim

3

In [168]:
TENSOR.shape # torch.Size([1, 3, 3]): the outer dimension holds one 3x3 matrix

torch.Size([1, 3, 3])

### Image Representation
https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/00-pytorch-different-tensor-dimensions.png

In [169]:
TENSOR[0]

tensor([[1, 2, 3],
        [3, 6, 9],
        [2, 4, 5]])

## Random Tensors

### Why Random Tensors?
Random tensors are important because many neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.

`Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers`

### Documentation of torch.rand
https://pytorch.org/docs/stable/generated/torch.rand.html

In [170]:
# Create a random tensor of size (3, 4)

random_tensor = torch.rand(3, 4, device=device)
#random_tensor = torch.rand(5, 10, 10, device="mps")
random_tensor

tensor([[0.2704, 0.8974, 0.3574, 0.7363],
        [0.8192, 0.9043, 0.5150, 0.0697],
        [0.8362, 0.0492, 0.8988, 0.0420]])

In [171]:
random_tensor.ndim

2

In [172]:
# Create a random tensor with similar shape to an image tensor
random_image_size_tensor = torch.rand(size=(224, 224, 3), device=device) # height, width, color channels (R, G, B)
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

### Image Representation
https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/00-tensor-shape-example-of-image.png

### Zeroes and Ones Tensors

In [173]:
# Create a tensor of all zeros
zeros = torch.zeros(size=(3, 4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [174]:
# Create a tensor of all ones
ones = torch.ones(size=(3, 4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [175]:
random_tensor.dtype # check data type of tensor

torch.float32

## Creating a range of values and tensors-like

In [176]:
# Use torch.arange()
one_to_ten = torch.arange(start=1, end=11, step = 1, device=device)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [177]:
# Creating tensors-like
ten_zeros = torch.zeros_like(input=one_to_ten) # the zeros_like function creates a new tensor with same shape as the input
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
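The `*_like` functions copy more than the shape. As a minimal sketch (the variable names here are illustrative, not from the notebook), the new tensor also inherits the input's dtype and device unless you override them:

```python
import torch

# Illustrative names; any existing tensor can serve as the template.
source = torch.arange(start=1, end=11, step=1)               # int64, shape [10]
zeros_copy = torch.zeros_like(source)                        # same shape, dtype and device
ones_float = torch.ones_like(source, dtype=torch.float32)    # same shape, dtype overridden

print(zeros_copy.shape, zeros_copy.dtype, zeros_copy.device)
print(ones_float.dtype)
```

This explains why `ten_zeros` above prints integers rather than floats: it takes its dtype from `one_to_ten`.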

## Tensor Datatypes

**Note:** Tensor datatypes are one of the 3 big sources of errors you'll run into with PyTorch and deep learning:
1. Tensors are not the right datatype
2. Tensors are not the right shape
3. Tensors are not on the right device (such as cpu, cuda, mps, etc)

In [178]:
# Float 32 tensor
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=None,  # what datatype is the tensor (e.g. float32, float16)
                               device=device, # What device is your tensor on
                               requires_grad=False) # whether or not to track gradients for operations on this tensor
float_32_tensor

tensor([3., 6., 9.])

In [179]:
float_32_tensor.dtype

torch.float32

In [180]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [181]:
int_32_tensor = torch.tensor([3, 6, 9], dtype=torch.int32, device=device)
int_32_tensor

tensor([3, 6, 9], dtype=torch.int32)
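When tensors of different datatypes meet in one operation, PyTorch promotes the result to a common dtype. The sketch below recreates the three tensors on the CPU to stay self-contained, and uses `.to(dtype)` in place of `.type(dtype)`; treat it as an illustration of type promotion rather than a complete account of the rules:

```python
import torch

float_32 = torch.tensor([3.0, 6.0, 9.0])               # default dtype: float32
float_16 = float_32.to(torch.float16)                  # lower-precision copy
int_32 = torch.tensor([3, 6, 9], dtype=torch.int32)

mixed_float = float_16 * float_32   # float16 * float32
mixed_int = int_32 * float_32       # int32 * float32

print(mixed_float.dtype)
print(mixed_int.dtype)
```

Both products come back as `torch.float32`. Not every operation promotes silently, so checking `tensor.dtype` is still the first step when a datatype error appears.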

### Getting Information from Tensors

1. Tensors are not the right datatype - to get the datatype of a tensor, use `tensor.dtype`
2. Tensors are not the right shape - to get shape from a tensor, can use `tensor.shape`
3. Tensors are not on the right device - to get device from a tensor, can use `tensor.device`

In [182]:
# Create a tensor to get information from
test_tensor1 = torch.rand(3, 4, dtype=torch.float64)
test_tensor2 = torch.rand(3, 4, device=device)

In [183]:
# Find out information
print(test_tensor1)
print(f"Datatype of tensor1: {test_tensor1.dtype}")
print(f"Datatype of tensor2: {test_tensor2.dtype}")
print(f"Shape of tensor: {test_tensor1.shape}")
print(f"Device tensor1 is on: {test_tensor1.device}")
print(f"Device tensor2 is on: {test_tensor2.device}")

tensor([[0.2231, 0.4155, 0.9709, 0.4749],
        [0.6553, 0.0638, 0.7042, 0.6727],
        [0.2406, 0.2365, 0.7004, 0.9602]], dtype=torch.float64)
Datatype of tensor1: torch.float64
Datatype of tensor2: torch.float32
Shape of tensor: torch.Size([3, 4])
Device tensor1 is on: cpu
Device tensor2 is on: cpu


### Manipulating Tensors (Tensor Operations)

Tensor operations include:
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix Multiplication

In [184]:
# Create a tensor
tensor = torch.tensor([1, 2, 3], device=device)
tensor + 10

tensor([11, 12, 13])

In [185]:
# Multiply tensor by 10
tensor * 10

tensor([10, 20, 30])

In [186]:
# Subtract by 10
tensor - 10

tensor([-9, -8, -7])

In [187]:
# Try out Pytorch in-built functions
torch.mul(tensor, 10) # Generally, you want to use Python operators (e.g. `*`) instead

tensor([10, 20, 30])

In [188]:
torch.add(tensor, 10)

tensor([11, 12, 13])
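The operations above return new tensors and leave the original untouched, which is why `tensor` still holds `[1, 2, 3]` after each cell. A small sketch of the difference between an out-of-place operation and its in-place variant (the variable names are illustrative):

```python
import torch

values = torch.tensor([1, 2, 3])

shifted = values + 10     # out-of-place: returns a new tensor
print(values, shifted)    # `values` is unchanged

values.add_(10)           # trailing underscore marks the in-place variant
print(values)             # `values` itself has now changed
```

To keep the result of an out-of-place operation, reassign it, for example `values = values + 10`.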

### Matrix Multiplication

Two main ways of performing multiplication in neural networks and deep learning:

1. Element-wise multiplication
2. Matrix multiplication (dot-product)

The main two rules for matrix multiplication to remember are:

* The inner dimensions must match:
  * `(3, 2) @ (3, 2)` won't work
  * `(2, 3) @ (3, 2)` will work
  * `(3, 2) @ (2, 3)` will work
* The resulting matrix has the shape of the outer dimensions:
  * `(2, 3) @ (3, 2)` -> `(2, 2)`
  * `(3, 2) @ (2, 3)` -> `(3, 3)`

```
Note: "@" in Python is the symbol for matrix multiplication.
```
```
Resource: You can see all of the rules for matrix multiplication using torch.matmul() in the PyTorch documentation.
```
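The two rules can be checked in code before multiplying. The helper `can_matmul` below is a hypothetical function written for this note, not part of PyTorch, and it only covers the 2D case:

```python
import torch

def can_matmul(a: torch.Tensor, b: torch.Tensor) -> bool:
    """Hypothetical helper: True if two 2D tensors have matching inner dimensions."""
    return a.ndim == 2 and b.ndim == 2 and a.shape[1] == b.shape[0]

a = torch.rand(3, 2)
b = torch.rand(3, 2)

print(can_matmul(a, b))      # inner dimensions are 2 and 3
print(can_matmul(a, b.T))    # inner dimensions are 2 and 2
result = a @ b.T
print(result.shape)          # outer dimensions: 3 and 3

# Multiplying incompatible shapes raises a RuntimeError
try:
    a @ b
except RuntimeError as error:
    print(f"Shape error: {error}")
```

`torch.matmul` also handles 1D and batched inputs with broadcasting, which this simple check deliberately ignores.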

In [189]:
# Element-wise multiplication
print(tensor, "*", tensor)
print(f"Equals: {tensor * tensor}")

tensor([1, 2, 3]) * tensor([1, 2, 3])
Equals: tensor([1, 4, 9])


In [190]:
# Matrix Multiplication

tensor = tensor.float() # currently, MPS device only supports float32
torch.matmul(tensor, tensor)

tensor(14.)

In [191]:
%%time
value = 0
for i in range(len(tensor)):
    value += tensor[i] * tensor[i]
print(value)
# Matrix multiplication (here, a dot product) by hand
# Avoid Python for loops over tensor elements: they are much slower than PyTorch's vectorized operations

tensor(14.)
CPU times: user 1.76 ms, sys: 0 ns, total: 1.76 ms
Wall time: 1.88 ms


In [192]:
%%time
torch.matmul(tensor, tensor)

# we see that the in-built function is much more optimized

CPU times: user 559 µs, sys: 0 ns, total: 559 µs
Wall time: 875 µs


tensor(14.)

In [193]:
torch.matmul(torch.rand(3, 2, device=device), torch.rand(2, 3, device=device))

tensor([[0.8143, 0.4428, 0.7491],
        [0.8480, 0.5171, 0.7614],
        [0.9724, 0.5834, 0.8762]])

In [194]:
torch.matmul(torch.rand(2, 10, device=device), torch.rand(10, 3, device=device))

# remember: resulting matrix has the shape of the outer dimensions

tensor([[2.7050, 1.2753, 2.6037],
        [2.0490, 0.7975, 1.6406]])

### One of the most common errors in Deep Learning: Shape Errors

In [195]:
# Shapes for matrix multiplication
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]])

tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]])


# torch.mm(tensor_A, tensor_B) # torch.mm behaves like torch.matmul for 2D tensors (it does not broadcast)
#torch.matmul(tensor_A, tensor_B) # uncomment this to get an error because shapes are incompatible

In [196]:
# Check tensor sizes to make sure they are compatible for matrix multiplication
tensor_A.shape, tensor_B.shape

(torch.Size([3, 2]), torch.Size([3, 2]))

To fix our tensor shape issues, we can manipulate the shape of one of our tensors using a transpose.

A **transpose** switches the axes or dimensions of a given tensor.

In [197]:
tensor_B

tensor([[ 7, 10],
        [ 8, 11],
        [ 9, 12]])

In [198]:
tensor_B.T, tensor_B.T.shape

(tensor([[ 7,  8,  9],
         [10, 11, 12]]),
 torch.Size([2, 3]))

In [199]:
# transpose switches the dimensions from 3x2 to 2x3
tensor_B.T, tensor_B.T.shape

(tensor([[ 7,  8,  9],
         [10, 11, 12]]),
 torch.Size([2, 3]))

In [200]:
# The matrix multiplication operation works when tensor_B is transposed
print(f"Original shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape}\n")
print(f"New shapes: tensor_A = {tensor_A.shape} (same shape as above), tensor_B.T = {tensor_B.T.shape}\n")
print(f"Multiplying: {tensor_A.shape} @ {tensor_B.T.shape} <- inner dimensions must match\n")
print("Output:")
output = torch.matmul(tensor_A, tensor_B.T)
print(output)
print(f"\nOutput shape: {output.shape}")

Original shapes: tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([3, 2])

New shapes: tensor_A = torch.Size([3, 2]) (same shape as above), tensor_B.T = torch.Size([2, 3])

Multiplying: torch.Size([3, 2]) @ torch.Size([2, 3]) <- inner dimensions must match

Output:
tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

Output shape: torch.Size([3, 3])
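Each entry of the output is the dot product of one row of `tensor_A` with one column of `tensor_B.T`. As a worked check, the sketch below recomputes two entries by hand and compares them with the result of `@`:

```python
import torch

tensor_A = torch.tensor([[1, 2], [3, 4], [5, 6]])
tensor_B = torch.tensor([[7, 10], [8, 11], [9, 12]])

output = tensor_A @ tensor_B.T

# Row 0 of tensor_A is [1, 2]; column 0 of tensor_B.T is [7, 10]
top_left = 1 * 7 + 2 * 10        # 27
# Row 2 of tensor_A is [5, 6]; column 2 of tensor_B.T is [9, 12]
bottom_right = 5 * 9 + 6 * 12    # 117

print(output[0, 0].item() == top_left)
print(output[2, 2].item() == bottom_right)
```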


## Finding the min, max, mean, sum, etc (tensor aggregation)

In [201]:
# Create a tensor
x = torch.arange(1, 100, 10) # torch.arange(start, end, step)
x, x.dtype


(tensor([ 1, 11, 21, 31, 41, 51, 61, 71, 81, 91]), torch.int64)

In [202]:
# Find the min, note that both torch.min(x) and x.min() are functionally equivalent
torch.min(x), x.min()

(tensor(1), tensor(1))

In [203]:
# Find the max
torch.max(x), x.max()

(tensor(91), tensor(91))

In [204]:
# Find the mean
#torch.mean(x), x.mean() # This will result in an error because x is an int64 tensor

# torch.mean() requires a floating-point (or complex) dtype, so convert the tensor first
torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()

(tensor(46.), tensor(46.))
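To see the datatype error without breaking the notebook, the call can be wrapped in `try`/`except`. This sketch assumes the same `x` as above and uses `x.float()`, a shorthand for converting to `torch.float32`:

```python
import torch

x = torch.arange(1, 100, 10)    # int64 tensor

# mean() on an integer tensor raises a RuntimeError
try:
    x.mean()
except RuntimeError as error:
    print(f"mean() on int64 failed: {error}")

mean_value = x.float().mean()   # convert to float32, then take the mean
print(mean_value, mean_value.dtype)
```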

In [205]:
# Find the sum
torch.sum(x), x.sum()

(tensor(460), tensor(460))

## Finding the positional min and max

In [206]:
# Find the position in tensor that has the minimum value with argmin()
x.argmin() # This returns index position of target tensor where the minimum value occurs

tensor(0)

In [207]:
x[0] # This returns the actual minimum value

tensor(1)

In [208]:
# Find the position in tensor that has the maximum value with argmax()
x.argmax()

tensor(9)

In [209]:
x[9] # This returns the actual maximum value

tensor(91)
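On tensors with more than one dimension, `argmax()` and `argmin()` accept a `dim` argument. The example tensor `scores` below is made up for illustration:

```python
import torch

scores = torch.tensor([[1, 9, 3],
                       [7, 2, 8]])

flat_index = scores.argmax()       # position in the flattened tensor
per_row = scores.argmax(dim=1)     # column index of the max in each row
per_col = scores.argmax(dim=0)     # row index of the max in each column

print(flat_index)   # tensor(1)
print(per_row)      # tensor([1, 2])
print(per_col)      # tensor([1, 0, 1])
```

Without `dim`, the index refers to the flattened tensor, so it cannot be used directly as a row and column position.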

## Reshaping, stacking, squeezing and unsqueezing tensors

* Reshaping - reshapes an input tensor into a defined shape
* View - return a view of an input tensor of a certain shape but keep the same memory as the original tensor
* Stacking - combine multiple tensors on top of each other (vstack) or side by side (hstack)
* Squeeze - removes all `1` dimensions from a tensor
* Unsqueeze - adds a `1` dimension to a target tensor
* Permute - return a view of the input with dimensions permuted (swapped) in a certain way

In [210]:
# Create a tensor
x = torch.arange(1., 10.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [211]:
# Add an extra dimension to the tensor

# x.reshape(dim1, dim2, ..., dimN) # Dimension must have the same number of elements as the original tensor, same number as what torch.Size gives
#x_reshaped = x.reshape(1, 7) # Error occurs because reshaping to (1, 7) is invalid as the total number of elements must remain the same.
#x_reshaped = x.reshape(2, 9) # Error occurs because this would require 18 elements, but we only have 9.
x_reshaped = x.reshape(9, 1)
x_reshaped, x_reshaped.shape

(tensor([[1.],
         [2.],
         [3.],
         [4.],
         [5.],
         [6.],
         [7.],
         [8.],
         [9.]]),
 torch.Size([9, 1]))

In [212]:
# Change the view
z = x.view(1, 9)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [213]:
# Changing z changes x (because a view of a tensor shares the same memory as the original tensor)
z[:, 0] = 5 # This changes the first element of z to 5
z, x

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))
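`view()` always shares memory but only works when the requested shape is compatible with the tensor's memory layout, while `reshape()` returns a view when it can and a copy otherwise. A minimal sketch of the difference, using a transposed (non-contiguous) tensor as the assumed test case:

```python
import torch

base = torch.arange(6).reshape(2, 3)
transposed = base.T                  # non-contiguous view of `base`

# view() cannot flatten a non-contiguous tensor
try:
    transposed.view(6)
except RuntimeError as error:
    print(f"view failed: {error}")

flat = transposed.reshape(6)         # works, but has to copy the data here
flat[0] = 100
print(base[0, 0])                    # still 0: `flat` does not share memory with `base`
```

If code relies on changes propagating back to the original tensor, `view()` makes that requirement explicit by failing instead of silently copying.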

In [214]:
# Stack tensors on top of each other

# Stack tensors on top of each other along the first dimension (rows), creating a 2D tensor.
x_stacked = torch.stack([x, x, x, x], dim=0)

# Stack tensors side by side along the second dimension (columns), creating a 2D tensor.
#x_stacked = torch.stack([x, x, x, x], dim=1)

# Attempt to stack tensors along the third dimension, but this will result in an error for 1D tensors.
#x_stacked = torch.stack([x, x, x, x], dim=2)

x_stacked


tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.]])
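Comparing output shapes is a quick way to tell the stacking functions apart. The sketch below uses a fresh 1D tensor of 9 elements; note that for 1D inputs `torch.hstack` joins the tensors end to end rather than creating columns:

```python
import torch

x = torch.arange(1., 10.)                    # shape [9]

stack_rows = torch.stack([x, x, x, x], dim=0)   # adds a new leading dimension
stack_cols = torch.stack([x, x, x, x], dim=1)   # adds a new trailing dimension
v_stacked = torch.vstack([x, x, x, x])          # stacks rows vertically
h_stacked = torch.hstack([x, x, x, x])          # 1D inputs are joined end to end
concatenated = torch.cat([x, x, x, x], dim=0)   # joins along an existing dimension

for name, t in [("stack dim=0", stack_rows), ("stack dim=1", stack_cols),
                ("vstack", v_stacked), ("hstack", h_stacked), ("cat", concatenated)]:
    print(f"{name}: {t.shape}")
```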

In [215]:
# torch.squeeze(input, dim=None) - removes all single dimensions from a target tensor
print(f"Previous tensor: {x_reshaped}")
print(f"Previous shape: {x_reshaped.shape}")

# Remove extra dimensions from x_reshaped
x_squeezed = x_reshaped.squeeze()
print(f"\nNew tensor: {x_squeezed}")
print(f"New shape: {x_squeezed.shape}")

Previous tensor: tensor([[5.],
        [2.],
        [3.],
        [4.],
        [5.],
        [6.],
        [7.],
        [8.],
        [9.]])
Previous shape: torch.Size([9, 1])

New tensor: tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
New shape: torch.Size([9])


In [216]:
# torch.unsqueeze(input, dim) - adds a single dimension to a target tensor at a specific dim
print(f"Previous target: {x_squeezed}")
print(f"Previous shape: {x_squeezed.shape}")

# Add an extra dimension with unsqueeze
x_unsqueezed = x_squeezed.unsqueeze(dim=0)
print(f"\nNew tensor: {x_unsqueezed}")
print(f"New shape: {x_unsqueezed.shape}")

Previous target: tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
Previous shape: torch.Size([9])

New tensor: tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]])
New shape: torch.Size([1, 9])


In [217]:
# torch.unsqueeze(input, dim) - adds a single dimension to a target tensor at a specific dim
print(f"Previous target: {x_squeezed}")
print(f"Previous shape: {x_squeezed.shape}")

# Add an extra dimension with unsqueeze
x_unsqueezed = x_squeezed.unsqueeze(dim=1)
print(f"\nNew tensor: \n{x_unsqueezed}")
print(f"New shape: {x_unsqueezed.shape}")

Previous target: tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
Previous shape: torch.Size([9])

New tensor: 
tensor([[5.],
        [2.],
        [3.],
        [4.],
        [5.],
        [6.],
        [7.],
        [8.],
        [9.]])
New shape: torch.Size([9, 1])


In [218]:
# torch.permute - rearranges the dimensions of a target tensor in a specified order
x_original = torch.rand(size=(224, 224, 3)) # [height, width, color_channels]

# Permute the original tensor to rearrange the axis order
x_permuted = x_original.permute(2, 0, 1) # shifts axis 0->1, 1->2, 2->0

print(f"Previous shape: {x_original.shape}")
print(f"New shape: {x_permuted.shape}")

Previous shape: torch.Size([224, 224, 3])
New shape: torch.Size([3, 224, 224])
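Because `permute()` returns a view, the permuted tensor shares memory with the original. The sketch below uses a tensor of zeros (an assumption made so the change is easy to spot) and illustrative variable names:

```python
import torch

image_hwc = torch.zeros(size=(224, 224, 3))   # [height, width, color_channels]
image_chw = image_hwc.permute(2, 0, 1)        # [color_channels, height, width]

image_hwc[0, 0, 0] = 0.5                      # change the original
print(image_chw[0, 0, 0])                     # the permuted view sees the change
print(image_chw.shape)
```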


Remember the purpose of reshaping, stacking, squeezing, and unsqueezing tensors: they help us fix shape and dimension issues, which are among the most common errors in deep learning and neural networks.

## Indexing (selecting data from tensors)

Indexing with PyTorch is similar to indexing with NumPy.

In [219]:
# Create a tensor
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [220]:
# Let's index on our new tensor
x[0], x[0].shape # Indexing on the first dimension (batch dimension)

(tensor([[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]),
 torch.Size([3, 3]))

In [221]:
# Let's index on the middle bracket (dim=1)
x[0][0], x[0][0].shape # Indexing on the second dimension

(tensor([1, 2, 3]), torch.Size([3]))

In [222]:
# Let's index on the most inner bracket (last dimension)
x[0][0][0], x[0][0][0].shape # Indexing on the third dimension


(tensor(1), torch.Size([]))

In [223]:
# Play around with the indexing!
x[0][2][2], x[0][2][2].shape # note: x[1][0][0] would raise an IndexError because dimension 0 only has size 1

(tensor(9), torch.Size([]))

In [224]:
# You can also use ":" to select "all" of a target dimension
x[:, 0]

tensor([[1, 2, 3]])

In [225]:
# Get all values across the 0th and 1st dimensions, but only index 1 of the 2nd dimension
x[:, :, 1]

tensor([[2, 5, 8]])

In [226]:
# Get all values across the 0th dimension, but only the 1 index value of 1st and 2nd dimension
x[:, 1, 1] # retrieves all elements from 0th dimension, grabs the element at index 1 in 1st dimension, grabs the element at index 1 in 2nd dimension

# Note that this is very similar to x[0][1][1] except we have an extra dimension [ ]

tensor([5])

In [228]:
# Get index 0 of 0th and 1st dimension and all values of 2nd dimension
x[0, 0, :] # equivalent to x[0][0]

tensor([1, 2, 3])

In [229]:
# Index on x to return 9
print(x[:, 2, 2])

# Index on x to return 3, 6, 9
print(x[:, :, 2])

tensor([9])
tensor([[3, 6, 9]])
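The indexing styles used above differ mainly in which dimensions they keep. As a short sketch on the same `x`, an integer index drops a dimension while `:` keeps it:

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

chained = x[0][1][1]       # three separate indexing steps
combined = x[0, 1, 1]      # one step, same element
kept_dim = x[:, 1, 1]      # ":" keeps the 0th dimension
last_column = x[0, :, 2]   # integer index drops the 0th dimension

print(chained, combined)                  # tensor(5) tensor(5)
print(kept_dim, kept_dim.shape)           # tensor([5]) torch.Size([1])
print(last_column, last_column.shape)     # tensor([3, 6, 9]) torch.Size([3])
```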


## PyTorch Tensors & NumPy

NumPy is a popular Python library for scientific and numerical computing.

Because of this, PyTorch has functionality to interact with it.

* NumPy Data -> PyTorch tensor: `torch.from_numpy(ndarray)`
* PyTorch tensor -> NumPy Data: `torch.Tensor.numpy()`

In [230]:
# NumPy array to tensor

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array).type(torch.float32)
array, tensor # warning: NumPy's default datatype is float64, while PyTorch's default datatype is float32

(array([1., 2., 3., 4., 5., 6., 7.]), tensor([1., 2., 3., 4., 5., 6., 7.]))

In [231]:
# Change the value of array. What will this do to `tensor`?

array = array + 1
array, tensor

(array([2., 3., 4., 5., 6., 7., 8.]), tensor([1., 2., 3., 4., 5., 6., 7.]))

In [232]:
# Tensor to NumPy array

tensor = torch.ones(7)
numpy_tensor = tensor.numpy() # Recall the warning above! It's better to convert to float64 here
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

In [233]:
# Change the tensor, what happens to `numpy_tensor`?
tensor = tensor + 1
tensor, numpy_tensor

(tensor([2., 2., 2., 2., 2., 2., 2.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
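The cells above show no change because `array = array + 1` and `tensor = tensor + 1` create new objects, and `.type(torch.float32)` had already copied the data. With in-place operations and no dtype conversion, the NumPy array and the tensor do share memory. A minimal sketch with illustrative names:

```python
import numpy as np
import torch

array = np.arange(1.0, 4.0)          # float64 array [1., 2., 3.]
shared = torch.from_numpy(array)     # shares memory with `array`, keeps float64

array += 1                           # in-place change to the array
print(shared)                        # the tensor sees the change

ones = torch.ones(3)
ones_np = ones.numpy()               # shares memory with `ones`
ones.add_(1)                         # in-place change to the tensor
print(ones_np)                       # the array sees the change
```

If independent copies are needed, `torch.tensor(array)` or `tensor.clone()` are options that copy the data.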

## Reproducibility (trying to take random out of random)

### In short how a neural network learns:

`start with random numbers -> tensor operations -> update random numbers to try and make them better representations of the data -> again (repeat)`

To reduce the randomness in neural networks and PyTorch, we then use the concept of a **random seed**.

Essentially what the random seed does is set the initial random number generator to a specific state, so that if you run the same code multiple times, it will produce the same random numbers.

### Resource
https://pytorch.org/docs/stable/notes/randomness.html

In [234]:
torch.rand(3, 3)

tensor([[0.5726, 0.8087, 0.5344],
        [0.2857, 0.4369, 0.2036],
        [0.1960, 0.0249, 0.4506]])

In [235]:
# Create two random tensors
random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)

print(random_tensor_A)
print(random_tensor_B)
print(random_tensor_A == random_tensor_B) # You likely won't ever get "True" here

tensor([[0.5684, 0.1908, 0.2372, 0.0534],
        [0.2162, 0.8481, 0.7626, 0.4362],
        [0.4284, 0.6492, 0.5029, 0.7402]])
tensor([[0.0624, 0.1453, 0.0558, 0.8247],
        [0.5892, 0.8912, 0.0722, 0.6052],
        [0.5551, 0.9665, 0.6146, 0.7366]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [236]:
# Let's make some random but reproducible tensors

# Set the random seed
RANDOM_SEED = 42 # An arbitrary number
torch.manual_seed(RANDOM_SEED) # Each torch.rand() call advances the generator, so reseed before every call you want to repeat

random_tensor_C = torch.rand(3, 4)

torch.manual_seed(RANDOM_SEED) # Without this line, tensor C != tensor D

random_tensor_D = torch.rand(3, 4)

print(random_tensor_C)
print(random_tensor_D)
print(random_tensor_C == random_tensor_D)

tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
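An alternative to reseeding the global generator is a dedicated `torch.Generator` object passed to `torch.rand()`. This is a sketch of that approach, not something the notebook itself uses; it also shows that consecutive draws differ unless the seed is reset:

```python
import torch

generator = torch.Generator().manual_seed(42)
first = torch.rand(3, 4, generator=generator)

generator.manual_seed(42)                         # reset to the same state
second = torch.rand(3, 4, generator=generator)    # repeats `first`
third = torch.rand(3, 4, generator=generator)     # generator has advanced, so this differs

print(torch.equal(first, second))
print(torch.equal(second, third))
```

A separate generator keeps the reproducible part of the code isolated from any other code that draws random numbers from the global state.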


## Running tensors and PyTorch objects on the GPUs (and making faster computations)

Computing on tensors is generally much faster on GPUs (graphics processing units, typically from NVIDIA) than on CPUs (central processing units).

MPS stands for "Metal Performance Shaders", Apple's GPU backend for Apple silicon (M1, M1 Pro, M2, etc.).

It is advised to perform training on the fastest piece of hardware you have available, which will generally be: NVIDIA GPU ("cuda") > MPS device ("mps") > CPU ("cpu").

### 1. Getting a GPU

1. Easiest - Use Google Colab for a free GPU (options to upgrade as well)
2. Use your own GPU - takes a little bit of setup and requires the investment of purchasing a GPU. There are many options.
* https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/
* https://www.quora.com/Why-are-GPUs-well-suited-to-deep-learning
3. Use cloud computing - GCP, AWS, Azure. These services allow you to rent computers on the cloud and access them.

For setting up PyTorch + GPU Drivers (CUDA), refer to: https://pytorch.org/get-started/locally/

For Google Colab, we "change runtime type" to GPU.

Use the `nvidia-smi` command to check your NVIDIA GPU.

### 2. Check for GPU with PyTorch

In [237]:
# Setup Device-Agnostic Code

if torch.cuda.is_available():
    device = "cuda" # NVIDIA GPU
elif torch.backends.mps.is_available():
    device = "mps" # Apple GPU
else:
    device = "cpu" # Defaults to CPU if NVIDIA GPU/Apple GPU aren't available

print(f"Using device: {device}")

Using device: cpu


In [238]:
# Count number of devices
torch.mps.device_count()
# torch.cuda.device_count()

0

### 3. Putting tensors (and models) on the GPU
We want our tensors/models on the GPU because using a GPU results in faster computations.

In [239]:
# Create a tensor (default on the CPU)
tensor = torch.tensor([1, 2, 3])

# Tensor not on GPU
print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [240]:
# Move tensor to GPU if available
tensor_on_gpu = tensor.to(device)
tensor_on_gpu # mps:0 refers to index 0 of MPS device

tensor([1, 2, 3])

### 4. Moving tensors back to CPU
This is important because there are some operations that are only supported on the CPU, such as NumPy operations.

In [241]:
# If tensor is on GPU, can't transform it to NumPy
#tensor_on_gpu.numpy()  # Raises an error

# To fix the GPU tensor with NumPy issue, we can first set it to the CPU
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu


array([1, 2, 3])
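The move-then-convert step can be wrapped in a small function so it works regardless of where the tensor lives. `to_numpy` is a hypothetical helper written for this note; the `detach()` call is an added assumption so that it also handles tensors that track gradients:

```python
import torch

def to_numpy(tensor: torch.Tensor):
    """Hypothetical helper: detach from the graph, move to CPU, convert to NumPy."""
    return tensor.detach().cpu().numpy()

# Falls back to the CPU when no NVIDIA GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"

tensor_on_device = torch.tensor([1, 2, 3], device=device)
array = to_numpy(tensor_on_device)
print(array, type(array))
```

Calling `.cpu()` on a tensor that is already on the CPU is harmless, so the same code path works on every device.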

In [242]:
tensor_on_gpu

tensor([1, 2, 3])