<a href="https://colab.research.google.com/github/JunHL96/PyTorch/blob/main/00_pytorch_fundamentals_notes.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## 00. PyTorch Fundamentals

Resource notebook: https://www.learnpytorch.io/00_pytorch_fundamentals/

In [None]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.5.0


### Device-Agnostic Code:

In [None]:
# Setup device-agnostic code
if torch.cuda.is_available():
    device = "cuda" # NVIDIA GPU
elif torch.backends.mps.is_available():
    device = "mps" # Apple GPU
else:
    device = "cpu" # Defaults to CPU if NVIDIA GPU/Apple GPU aren't available

print(f"Using device: {device}")

Using device: mps


## What is PyTorch?

[PyTorch](https://pytorch.org/) is an open source machine learning and deep learning framework.

## What can PyTorch be used for?

PyTorch allows you to manipulate and process data and write machine learning algorithms using Python code.

## Who uses PyTorch?

Many of the world's largest technology companies such as [Meta (Facebook)](https://ai.facebook.com/blog/pytorch-builds-the-future-of-ai-and-machine-learning-at-facebook/), Tesla and Microsoft as well as artificial intelligence research companies such as [OpenAI use PyTorch](https://openai.com/blog/openai-pytorch/) to power research and bring machine learning to their products.

![pytorch being used across industry and research](https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/00-pytorch-being-used-across-research-and-industry.png)

For example, Andrej Karpathy (head of AI at Tesla) has given several talks ([PyTorch DevCon 2019](https://youtu.be/oBklltKXtDE), [Tesla AI Day 2021](https://youtu.be/j0z4FweCy4M?t=2904)) about how Tesla uses PyTorch to power their self-driving computer vision models.

PyTorch is also used in other industries such as agriculture to [power computer vision on tractors](https://medium.com/pytorch/ai-for-ag-production-machine-learning-for-agriculture-e8cfdb9849a1).

## Why use PyTorch?

Machine learning researchers love using PyTorch. And as of February 2022, PyTorch is the [most used deep learning framework on Papers With Code](https://paperswithcode.com/trends), a website for tracking machine learning research papers and the code repositories attached with them.

PyTorch also helps take care of many things such as GPU acceleration (making your code run faster) behind the scenes.

So you can focus on manipulating data and writing algorithms and PyTorch will make sure it runs fast.

And if companies such as Tesla and Meta (Facebook) use it to build models they deploy to power hundreds of applications, drive thousands of cars and deliver content to billions of people, it's clearly capable on the development front too.

## What we're going to cover in this module

This course is broken down into different sections (notebooks).

Each notebook covers important ideas and concepts within PyTorch.

Subsequent notebooks build upon knowledge from the previous one (numbering starts at 00, 01, 02 and goes to whatever it ends up going to).

This notebook deals with the basic building block of machine learning and deep learning, the tensor.

Specifically, we're going to cover:

| **Topic** | **Contents** |
| ----- | ----- |
| **Introduction to tensors** | Tensors are the basic building block of all of machine learning and deep learning. |
| **Creating tensors** | Tensors can represent almost any kind of data (images, words, tables of numbers). |
| **Getting information from tensors** | If you can put information into a tensor, you'll want to get it out too. |
| **Manipulating tensors** | Machine learning algorithms (like neural networks) involve manipulating tensors in many different ways such as adding, multiplying, combining. |
| **Dealing with tensor shapes** | One of the most common issues in machine learning is dealing with shape mismatches (trying to mix wrong shaped tensors with other tensors). |
| **Indexing on tensors** | If you've indexed on a Python list or NumPy array, it's very similar with tensors, except they can have far more dimensions. |
| **Mixing PyTorch tensors and NumPy** | PyTorch works with tensors ([`torch.Tensor`](https://pytorch.org/docs/stable/tensors.html)) and NumPy works with arrays ([`np.ndarray`](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html)); sometimes you'll want to mix and match them. |
| **Reproducibility** | Machine learning is very experimental and since it uses a lot of *randomness* to work, sometimes you'll want that *randomness* to not be so random. |
| **Running tensors on GPU** | GPUs (Graphics Processing Units) make your code faster, PyTorch makes it easy to run your code on GPUs. |

## Where can you get help?

All of the materials for this course [live on GitHub](https://github.com/mrdbourke/pytorch-deep-learning).

And if you run into trouble, you can ask a question on the [Discussions page](https://github.com/mrdbourke/pytorch-deep-learning/discussions) there too.

There's also the [PyTorch developer forums](https://discuss.pytorch.org/), a very helpful place for all things PyTorch.

## Introduction to Tensors

### What are tensors?

A tensor is a multi-dimensional matrix containing elements of a single data type.

### Creating tensors

PyTorch tensors are created using [`torch.tensor()`](https://pytorch.org/docs/stable/tensors.html).


For example, you could represent an image as a tensor with shape `[3, 224, 224]` which would mean `[colour_channels, height, width]`, as in the image has `3` colour channels (red, green, blue), a height of `224` pixels and a width of `224` pixels.

![example of going from an input image to a tensor representation of the image, image gets broken down into 3 colour channels as well as numbers to represent the height and width](https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/00-tensor-shape-example-of-image.png)

In tensor-speak (the language used to describe tensors), the tensor would have three dimensions, one for `colour_channels`, `height` and `width`.
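
As a quick check of the shape convention described above, the sketch below builds a random tensor with that shape on the CPU. The name `image_tensor` is illustrative only and does not come from the original notebook.

```python
import torch

# Hypothetical stand-in for an image: [colour_channels, height, width]
image_tensor = torch.rand(3, 224, 224)

print(image_tensor.shape)  # torch.Size([3, 224, 224])
print(image_tensor.ndim)   # 3
```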


In [None]:
# scalar
scalar = torch.tensor(7, device = device)
scalar

tensor(7, device='mps:0')

In [None]:
scalar.ndim

0

In [None]:
# Get tensor back as Python int
scalar.item()

7

In [None]:
# Vector (magnitude, direction)
vector = torch.tensor([7, 7], device = device)
vector

tensor([7, 7], device='mps:0')

In [None]:
vector.ndim

1

In [None]:
vector.shape

torch.Size([2])

<details>

### `.shape`
- **Returns the dimensions of the tensor (or array)** as a tuple.
- Each element of the tuple (an ordered, immutable collection of items) represents the size of the tensor in that dimension.

Example:
```python
vector = torch.tensor([7, 7])
print(vector.shape)  # Output: torch.Size([2])
```

### `.ndim`
- **Returns the number of dimensions (rank) of the tensor.**
- This is an integer value representing how many dimensions the tensor has.

Example:
```python
vector = torch.tensor([7, 7])
print(vector.ndim)  # Output: 1
```

### Explanation:
- `vector = torch.tensor([7, 7])` creates a 1D tensor (vector) with 2 elements.
- `.shape` gives `torch.Size([2])` because the vector has 2 elements in its single dimension.
- `.ndim` gives `1` because the tensor is 1-dimensional.


| Attribute  | Description                               | Example Output            |
|------------|-------------------------------------------|---------------------------|
| `.shape`   | Tuple representing the size of each dimension | `torch.Size([2])`   |
| `.ndim`    | Integer representing the number of dimensions | `1`                       |

</details>

## Matrix

In [None]:
# MATRIX
MATRIX = torch.tensor([[7, 8],
                      [9, 10]], device = device)

In [None]:
MATRIX.ndim

2

In [None]:
MATRIX[0] # first row: index 0 along the 0th axis

tensor([7, 8], device='mps:0')

In [None]:
MATRIX[1] # second row: index 1 along the 0th axis (not the 1st axis)

tensor([ 9, 10], device='mps:0')

In [None]:
MATRIX.shape

torch.Size([2, 2])

## Tensor

In [None]:
# TENSOR
TENSOR = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]]], device = device)
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]], device='mps:0')

In [None]:
TENSOR.ndim

3

In [None]:
TENSOR.shape # the result is torch.Size([1, 3, 3]): the outermost dimension holds one 3x3 matrix

torch.Size([1, 3, 3])
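
To connect `.ndim` and `.shape`, here is a minimal sketch run on the CPU. It rebuilds the same values under the illustrative name `t` and checks that `ndim` is the length of `shape`, and that the element count is the product of the sizes.

```python
import torch

t = torch.tensor([[[1, 2, 3],
                   [3, 6, 9],
                   [2, 4, 5]]])

print(t.ndim == len(t.shape))  # True: ndim counts the entries in shape
print(t.numel())               # 9, because 1 * 3 * 3 = 9
```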

### Image Representation
![example of different tensor dimensions](https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/00-pytorch-different-tensor-dimensions.png)

In [None]:
TENSOR[0]

tensor([[1, 2, 3],
        [3, 6, 9],
        [2, 4, 5]], device='mps:0')

### Summary so far

| Name | What is it? | Number of dimensions | Lower or upper (usually/example) |
| ----- | ----- | ----- | ----- |
| **scalar** | a single number | 0 | Lower (`a`) |
| **vector** | a number with direction (e.g. wind speed with direction) but can also have many other numbers | 1 | Lower (`y`) |
| **matrix** | a 2-dimensional array of numbers | 2 | Upper (`Q`) |
| **tensor** | an n-dimensional array of numbers | can be any number, a 0-dimension tensor is a scalar, a 1-dimension tensor is a vector | Upper (`X`) |

![scalar vector matrix tensor and what they look like](https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/00-scalar-vector-matrix-tensor.png)

## Random Tensors

### Why Random Tensors?
Random tensors are important because many neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.

`Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers`
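
The loop above can be sketched with a single random number. This is a toy illustration under stated assumptions, not how a real network is trained: the data follows `y = 2 * x`, the step size `learning_rate` is a hypothetical choice, and the gradient of the mean squared error is written out by hand.

```python
import torch

torch.manual_seed(0)  # arbitrary seed so the sketch is repeatable

# Toy data following y = 2 * x (an assumption made for this example)
x = torch.linspace(0, 1, 11)
y = 2 * x

w = torch.rand(1)      # start with a random number
learning_rate = 0.5    # hypothetical step size

for _ in range(50):
    error = w * x - y                  # look at data
    gradient = 2 * (error * x).mean()  # slope of the mean squared error w.r.t. w
    w = w - learning_rate * gradient   # update the random number

print(w)  # should end up close to 2
```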

### Documentation of torch.rand
https://pytorch.org/docs/stable/generated/torch.rand.html

In [None]:
# Create a random tensor of size (3, 4)

random_tensor = torch.rand(3, 4, device=device)
#random_tensor = torch.rand(5, 10, 10, device=device)
random_tensor

tensor([[0.7467, 0.0097, 0.0881, 0.1962],
        [0.6511, 0.3817, 0.8581, 0.2633],
        [0.6535, 0.2196, 0.4270, 0.3614]], device='mps:0')

In [None]:
random_tensor.ndim

2

In [None]:
# Create a random tensor with similar shape to an image tensor
random_image_size_tensor = torch.rand(size=(224, 224, 3), device=device) # height, width, color channels (R, G, B)
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

### Zeroes and Ones Tensors

In [None]:
# Create a tensor of all zeros
zeros = torch.zeros(size=(3, 4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
# Create a tensor of all ones
ones = torch.ones(size=(3, 4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [None]:
random_tensor.dtype # check data type of tensor

torch.float32

## Creating a range of tensors and tensors-like

In [None]:
# Use torch.arange()
one_to_ten = torch.arange(start=1, end=11, step = 1, device=device)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10], device='mps:0')

In [None]:
# Creating tensors-like
ten_zeros = torch.zeros_like(input=one_to_ten) # zeros_like creates a new tensor of zeros with the same shape, dtype and device as the input
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0], device='mps:0')
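
A small sketch of what `torch.zeros_like()` inherits from its input, run on the CPU. The variable names are illustrative; the `dtype` keyword overrides only the datatype.

```python
import torch

source = torch.arange(1, 11)                               # int64 on the CPU
same_as_source = torch.zeros_like(source)                  # inherits shape, dtype and device
as_float = torch.zeros_like(source, dtype=torch.float32)   # same shape, different dtype

print(same_as_source.shape, same_as_source.dtype)
print(as_float.dtype)
```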

## Tensor Datatypes

**Note:** Tensor datatypes are one of the 3 big sources of errors you'll run into with PyTorch and deep learning:
1. Tensors are not the right datatype
2. Tensors are not the right shape
3. Tensors are not on the right device (such as cpu, cuda, mps, etc)

In [None]:
# Float 32 tensor
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=None,  # datatype of the tensor (e.g. float32, float16); None infers it from the data (float32 here)
                               device=device, # which device the tensor lives on
                               requires_grad=False) # whether or not to track gradients for operations on this tensor
float_32_tensor

tensor([3., 6., 9.], device='mps:0')

In [None]:
float_32_tensor.dtype

torch.float32

In [None]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], device='mps:0', dtype=torch.float16)

In [None]:
int_32_tensor = torch.tensor([3, 6, 9], dtype=torch.int32, device=device)
int_32_tensor

tensor([3, 6, 9], device='mps:0', dtype=torch.int32)
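
When tensors of different datatypes meet in one operation, PyTorch promotes the result to a common dtype. The sketch below, run on the CPU with illustrative names, uses `torch.result_type()` to ask which dtype an arithmetic operation would produce without relying on any particular device.

```python
import torch

f32 = torch.tensor([3.0, 6.0, 9.0])                # float32 by default
f16 = f32.type(torch.float16)
i32 = torch.tensor([3, 6, 9], dtype=torch.int32)

# torch.result_type reports the dtype an arithmetic operation would produce
print(torch.result_type(f32, f16))  # the wider float wins
print(torch.result_type(f16, i32))  # float beats integer
print((f32 * i32).dtype)
```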

### Getting Information from Tensors

1. Tensors are not the right datatype - to get the datatype of a tensor, use `tensor.dtype`
2. Tensors are not the right shape - to get shape from a tensor, can use `tensor.shape`
3. Tensors are not on the right device - to get device from a tensor, can use `tensor.device`

In [None]:
# Create a tensor to get information from
test_tensor1 = torch.rand(3, 4, dtype=torch.float64)
test_tensor2 = torch.rand(3, 4, device=device)

In [None]:
# Find out information
print(test_tensor1)
print(f"Datatype of tensor1: {test_tensor1.dtype}")
print(f"Datatype of tensor2: {test_tensor2.dtype}")
print(f"Shape of tensor: {test_tensor1.shape}")
print(f"Device tensor1 is on: {test_tensor1.device}")
print(f"Device tensor2 is on: {test_tensor2.device}")

tensor([[0.7209, 0.8423, 0.2945, 0.9707],
        [0.9395, 0.1181, 0.4804, 0.5916],
        [0.8457, 0.7156, 0.0367, 0.4763]], dtype=torch.float64)
Datatype of tensor1: torch.float64
Datatype of tensor2: torch.float32
Shape of tensor: torch.Size([3, 4])
Device tensor1 is on: cpu
Device tensor2 is on: mps:0


### Manipulating Tensors (Tensor Operations)

Tensor operations include:
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix Multiplication

In [None]:
# Create a tensor
tensor = torch.tensor([1, 2, 3], device=device)
tensor + 10

tensor([11, 12, 13], device='mps:0')

In [None]:
# Multiply tensor by 10
tensor * 10

tensor([10, 20, 30], device='mps:0')

In [None]:
# Subtract by 10
tensor - 10

tensor([-9, -8, -7], device='mps:0')

In [None]:
# Try out Pytorch in-built functions
torch.mul(tensor, 10) # Generally, prefer the Python operators (e.g. tensor * 10) over these functions

tensor([10, 20, 30], device='mps:0')

In [None]:
torch.add(tensor, 10)

tensor([11, 12, 13], device='mps:0')

### Matrix Multiplication

Two main ways of performing multiplication in neural networks and deep learning:

1. Element-wise multiplication
2. Matrix multiplication (dot-product)

The main two rules for matrix multiplication to remember are:

* The **inner dimensions** must match:
  * `(3, 2) @ (3, 2)` won't work
  * `(2, 3) @ (3, 2)` will work
  * `(3, 2) @ (2, 3)` will work
* The resulting matrix has the shape of the **outer dimensions**:
  * `(2, 3) @ (3, 2)` -> `(2, 2)`
  * `(3, 2) @ (2, 3)` -> `(3, 3)`

> **Note:** `@` in Python is the operator for matrix multiplication.

> **Resource:** You can see all of the rules for matrix multiplication using `torch.matmul()` in the [PyTorch documentation](https://pytorch.org/docs/stable/generated/torch.matmul.html).
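
The two rules can be checked directly. This is a minimal CPU sketch with illustrative names (`a`, `b`, `mismatch_raised`); it multiplies compatible shapes and catches the error raised when the inner dimensions differ.

```python
import torch

a = torch.rand(2, 3)
b = torch.rand(3, 2)

print((a @ b).shape)  # torch.Size([2, 2])
print((b @ a).shape)  # torch.Size([3, 3])

# (3, 2) @ (3, 2): the inner dimensions (2 and 3) differ, so this fails
try:
    b @ b
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print(f"Shape error: {err}")
```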

In [None]:
# Element-wise multiplication
print(tensor, "*", tensor)
print(f"Equals: {tensor * tensor}")

tensor([1, 2, 3], device='mps:0') * tensor([1, 2, 3], device='mps:0')
Equals: tensor([1, 4, 9], device='mps:0')


In [None]:
# Matrix Multiplication

tensor = tensor.float() # cast to float32: matmul on integer tensors was not supported on the MPS device here
torch.matmul(tensor, tensor)

tensor(14., device='mps:0')

In [None]:
%%time
value = 0
for i in range(len(tensor)):
    value += tensor[i] * tensor[i]
print(value)
# Matrix multiplication by hand
# Avoid Python for loops over tensor elements: they are far slower than PyTorch's vectorized operations

tensor(14., device='mps:0')
CPU times: user 9.93 ms, sys: 2.27 ms, total: 12.2 ms
Wall time: 12.9 ms


In [None]:
%%time
torch.matmul(tensor, tensor)

# we see that the in-built function is much more optimized

CPU times: user 194 μs, sys: 59 μs, total: 253 μs
Wall time: 131 μs


tensor(14., device='mps:0')

In [None]:
torch.matmul(torch.rand(3, 2, device=device), torch.rand(2, 3, device=device))

tensor([[0.6396, 0.5466, 0.5008],
        [1.0496, 0.9113, 0.8535],
        [0.9927, 0.8652, 0.8144]], device='mps:0')

In [None]:
torch.matmul(torch.rand(2, 10, device=device), torch.rand(10, 3, device=device))

# remember: resulting matrix has the shape of the outer dimensions

tensor([[1.7733, 2.7598, 1.7407],
        [1.6163, 2.2521, 2.3059]], device='mps:0')

### One of the most common errors in Deep Learning: Shape Errors

In [None]:
# Shapes for matrix multiplication
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]])

tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]])


# torch.mm(tensor_A, tensor_B) # torch.mm is matrix multiplication for 2-D tensors only (no broadcasting); torch.matmul is more general
#torch.matmul(tensor_A, tensor_B) # uncomment this to get an error because the shapes are incompatible

In [None]:
# Check tensor sizes to make sure they are compatible for matrix multiplication
tensor_A.shape, tensor_B.shape

(torch.Size([3, 2]), torch.Size([3, 2]))

To fix our tensor shape issues, we can manipulate the shape of one of our tensors using a transpose.

A **transpose** switches the axes or dimensions of a given tensor.

In [None]:
tensor_B

tensor([[ 7, 10],
        [ 8, 11],
        [ 9, 12]])

In [None]:
# transpose switches the dimensions from 3x2 to 2x3
tensor_B.T, tensor_B.T.shape

(tensor([[ 7,  8,  9],
         [10, 11, 12]]),
 torch.Size([2, 3]))

In [None]:
# The matrix multiplication operation works when tensor_B is transposed
print(f"Original shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape}\n")
print(f"New shapes: tensor_A = {tensor_A.shape} (same shape as above), tensor_B.T = {tensor_B.T.shape}\n")
print(f"Multiplying: {tensor_A.shape} @ {tensor_B.T.shape} <- inner dimensions must match\n")
print("Output:")
output = torch.matmul(tensor_A, tensor_B.T)
print(output)
print(f"\nOutput shape: {output.shape}")

Original shapes: tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([3, 2])

New shapes: tensor_A = torch.Size([3, 2]) (same shape as above), tensor_B.T = torch.Size([2, 3])

Multiplying: torch.Size([3, 2]) @ torch.Size([2, 3]) <- inner dimensions must match

Output:
tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

Output shape: torch.Size([3, 3])


## Finding the min, max, mean, sum, etc (tensor aggregation)

In [None]:
# Create a tensor
x = torch.arange(1, 100, 10) # torch.arange(start, end, step)
x, x.dtype


(tensor([ 1, 11, 21, 31, 41, 51, 61, 71, 81, 91]), torch.int64)

In [None]:
# Find the min, note that torch.min(x) and x.min() are functionally equivalent
torch.min(x), x.min()

(tensor(1), tensor(1))

In [None]:
# Find the max
torch.max(x), x.max()

(tensor(91), tensor(91))

In [None]:
# Find the mean
#torch.mean(x), x.mean() # This raises an error because x has an integer dtype (int64)

# torch.mean() requires a floating point (or complex) dtype, so cast to float32 first
torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()

(tensor(46.), tensor(46.))
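
The dtype requirement of `mean()` can be seen in a short CPU sketch. The variable names are illustrative. Besides casting first, `mean()` also accepts a `dtype` keyword that performs the cast as part of the call.

```python
import torch

x = torch.arange(1, 100, 10)  # int64

try:
    x.mean()
    int_mean_failed = False
except RuntimeError as err:
    int_mean_failed = True
    print(f"mean() on int64 failed: {err}")

mean_via_cast = x.float().mean()              # cast, then average
mean_via_dtype = x.mean(dtype=torch.float32)  # cast inside the call
print(mean_via_cast, mean_via_dtype)
```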

In [None]:
# Find the sum
torch.sum(x), x.sum()

(tensor(460), tensor(460))

## Finding the positional min and max

In [None]:
# Find the position in tensor that has the minimum value with argmin()
x.argmin() # This returns index position of target tensor where the minimum value occurs

tensor(0)

In [None]:
x[0] # This returns the actual minimum value

tensor(1)

In [None]:
# Find the position in tensor that has the maximum value with argmax()
x.argmax()

tensor(9)

In [None]:
x[9] # This returns the actual maximum value

tensor(91)

## Reshaping, stacking, squeezing and unsqueezing tensors
Often you'll want to reshape or change the dimensions of your tensors without actually changing the values inside them.

To do so, some popular methods are:

| Method | One-line description |
| ----- | ----- |
| [`torch.reshape(input, shape)`](https://pytorch.org/docs/stable/generated/torch.reshape.html#torch.reshape) | Reshapes `input` to `shape` (if compatible), can also use `torch.Tensor.reshape()`. |
| [`Tensor.view(shape)`](https://pytorch.org/docs/stable/generated/torch.Tensor.view.html) | Returns a view of the original tensor in a different `shape` but shares the same data as the original tensor. |
| [`torch.stack(tensors, dim=0)`](https://pytorch.org/docs/1.9.1/generated/torch.stack.html) | Concatenates a sequence of `tensors` along a new dimension (`dim`), all `tensors` must be same size. |
| [`torch.squeeze(input)`](https://pytorch.org/docs/stable/generated/torch.squeeze.html) | Squeezes `input` to remove all the dimensions of size `1`. |
| [`torch.unsqueeze(input, dim)`](https://pytorch.org/docs/1.9.1/generated/torch.unsqueeze.html) | Returns `input` with a dimension of size `1` added at `dim`. |
| [`torch.permute(input, dims)`](https://pytorch.org/docs/stable/generated/torch.permute.html) | Returns a *view* of the original `input` with its dimensions permuted (rearranged) to `dims`. |

Why do any of these?

Because deep learning models (neural networks) are all about manipulating tensors in some way. And because of the rules of matrix multiplication, if you've got shape mismatches, you'll run into errors. These methods help you make sure the right elements of your tensors are mixing with the right elements of other tensors.

Let's try them out.

First, we'll create a tensor.

In [None]:
# Create a tensor
x = torch.arange(1., 10.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [None]:
# Add an extra dimension to the tensor

# x.reshape(dim1, dim2, ..., dimN): the new shape must hold the same total number of elements as the original tensor
#x_reshaped = x.reshape(1, 7) # Error: (1, 7) holds 7 elements, but x has 9.
#x_reshaped = x.reshape(2, 9) # Error: (2, 9) would require 18 elements, but x has only 9.
x_reshaped = x.reshape(9, 1)
x_reshaped, x_reshaped.shape

(tensor([[1.],
         [2.],
         [3.],
         [4.],
         [5.],
         [6.],
         [7.],
         [8.],
         [9.]]),
 torch.Size([9, 1]))

In [None]:
# Change the view
z = x.view(1, 9)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
# Changing z changes x (because a view of a tensor shares the same memory as the original tensor)
z[:, 0] = 5 # This changes the first element of z to 5
z, x

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))
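
To make the memory sharing explicit, the sketch below (CPU, illustrative names) compares a view with an independent copy made by `clone()`. `data_ptr()` returns the address of a tensor's first element, so equal values mean the same underlying storage.

```python
import torch

x = torch.arange(1., 10.)
z = x.view(1, 9)  # shares memory with x
c = x.clone()     # independent copy

print(z.data_ptr() == x.data_ptr())  # True
print(c.data_ptr() == x.data_ptr())  # False

z[:, 0] = 5       # writes through to x, but not to the clone
print(x[0], c[0])
```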

In [None]:
# Stack tensors on top of each other along the first dimension (rows), creating a 2D tensor.
x_stacked = torch.stack([x, x, x, x], dim=0)

# Stack tensors side by side along the second dimension (columns), creating a 2D tensor.
#x_stacked = torch.stack([x, x, x, x], dim=1)

# Attempt to stack tensors along the third dimension, but this will result in an error for 1D tensors.
#x_stacked = torch.stack([x, x, x, x], dim=2)

x_stacked


tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.]])
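
The effect of `dim` on the stacked shape can be compared side by side. This is a CPU sketch with illustrative names; the exception type caught for the out-of-range `dim` is kept deliberately broad.

```python
import torch

x = torch.arange(1., 10.)

rows = torch.stack([x, x, x, x], dim=0)  # each x becomes a row
cols = torch.stack([x, x, x, x], dim=1)  # each x becomes a column
print(rows.shape)  # torch.Size([4, 9])
print(cols.shape)  # torch.Size([9, 4])

# Stacking 1D tensors yields a 2D result, so dim=2 is out of range
try:
    torch.stack([x, x, x, x], dim=2)
    out_of_range_raised = False
except (IndexError, RuntimeError) as err:
    out_of_range_raised = True
    print(f"Stack error: {err}")
```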

In [None]:
# torch.squeeze(input, dim=None) - removes all dimensions of size 1 from a target tensor
print(f"Previous tensor: {x_reshaped}")
print(f"Previous shape: {x_reshaped.shape}")

# Remove extra dimensions from x_reshaped
x_squeezed = x_reshaped.squeeze()
print(f"\nNew tensor: {x_squeezed}")
print(f"New shape: {x_squeezed.shape}")

Previous tensor: tensor([[5.],
        [2.],
        [3.],
        [4.],
        [5.],
        [6.],
        [7.],
        [8.],
        [9.]])
Previous shape: torch.Size([9, 1])

New tensor: tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
New shape: torch.Size([9])


In [None]:
# torch.unsqueeze(input, dim) - adds a single dimension to a target tensor at a specific dim
print(f"Previous target: {x_squeezed}")
print(f"Previous shape: {x_squeezed.shape}")

# Add an extra dimension with unsqueeze
x_unsqueezed = x_squeezed.unsqueeze(dim=0)
print(f"\nNew tensor: {x_unsqueezed}")
print(f"New shape: {x_unsqueezed.shape}")

Previous target: tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
Previous shape: torch.Size([9])

New tensor: tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]])
New shape: torch.Size([1, 9])


In [None]:
# torch.unsqueeze(input, dim) - adds a single dimension to a target tensor at a specific dim
print(f"Previous target: {x_squeezed}")
print(f"Previous shape: {x_squeezed.shape}")

# Add an extra dimension with unsqueeze
x_unsqueezed = x_squeezed.unsqueeze(dim=1)
print(f"\nNew tensor: \n{x_unsqueezed}")
print(f"New shape: {x_unsqueezed.shape}")

Previous target: tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
Previous shape: torch.Size([9])

New tensor: 
tensor([[5.],
        [2.],
        [3.],
        [4.],
        [5.],
        [6.],
        [7.],
        [8.],
        [9.]])
New shape: torch.Size([9, 1])


In [None]:
# torch.permute - rearranges the dimensions of a target tensor in a specified order
x_original = torch.rand(size=(224, 224, 3)) # [height, width, color_channels]

# Permute the original tensor to rearrange the axis order
x_permuted = x_original.permute(2, 0, 1) # shifts axis 0->1, 1->2, 2->0

print(f"Previous shape: {x_original.shape}")
print(f"New shape: {x_permuted.shape}")

Previous shape: torch.Size([224, 224, 3])
New shape: torch.Size([3, 224, 224])
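
Because `permute()` returns a view, the permuted tensor shares memory with the original. A minimal CPU sketch of that behaviour follows; the value `42.0` is an arbitrary sentinel.

```python
import torch

x_original = torch.rand(224, 224, 3)      # [height, width, colour_channels]
x_permuted = x_original.permute(2, 0, 1)  # [colour_channels, height, width]

x_original[0, 0, 0] = 42.0                # write to the original...
print(x_permuted[0, 0, 0])                # ...and the view sees the same value
```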


Remember the purpose of reshaping, stacking, squeezing, and unsqueezing tensors: they help us fix shape and dimension issues, which are among the most common errors in deep learning and neural networks.

## Indexing (selecting data from tensors)

Indexing with PyTorch is similar to indexing with NumPy.

In [None]:
# Create a tensor
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [None]:
# Let's index on our new tensor
x[0], x[0].shape # Indexing on the first dimension (batch dimension)

(tensor([[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]),
 torch.Size([3, 3]))

In [None]:
# Let's index on the middle bracket (dim=1)
x[0][0], x[0][0].shape # Indexing on the second dimension

(tensor([1, 2, 3]), torch.Size([3]))

In [None]:
# Let's index on the most inner bracket (last dimension)
x[0][0][0], x[0][0][0].shape # Indexing on the third dimension


(tensor(1), torch.Size([]))

In [None]:
# Play around with the indexing!
x[0][2][2], x[0][2][2].shape # note that x[1][0][0] raises an error here because index 1 is out of bounds for the 0th dimension (size 1)

(tensor(9), torch.Size([]))

In [None]:
# You can also use ":" to select "all" of a target dimension
x[:, 0]

tensor([[1, 2, 3]])

In [None]:
# Get all values across the 0th and 1st dimensions, but only index 1 of the 2nd dimension
x[:, :, 1]

tensor([[2, 5, 8]])

In [None]:
# Get all values across the 0th dimension, but only the 1 index value of 1st and 2nd dimension
x[:, 1, 1] # retrieves all elements from 0th dimension, grabs the element at index 1 in 1st dimension, grabs the element at index 1 in 2nd dimension

# Note that this is very similar to x[0][1][1] except we have an extra dimension [ ]

tensor([5])

In [None]:
# Get index 0 of 0th and 1st dimension and all values of 2nd dimension
x[0, 0, :] # equivalent to x[0][0]

tensor([1, 2, 3])

In [None]:
# Index on x to return 9
print(x[:, 2, 2])

# Index on x to return 3, 6, 9
print(x[:, :, 2])

tensor([9])
tensor([[3, 6, 9]])
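
The difference between indexing with `:` and with an integer shows up in the resulting shape: `:` keeps the dimension, an integer removes it. A short CPU sketch:

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

print(x[:, 1, 1].shape)  # torch.Size([1]): the 0th dimension is kept
print(x[0, 1, 1].shape)  # torch.Size([]): every dimension is indexed away
print(torch.equal(x[0][0], x[0, 0, :]))  # True: chained and comma indexing agree
```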


## PyTorch Tensors & NumPy

NumPy is a popular scientific Python numerical computing library.

And because of this, PyTorch has functionality to interact with it.

* NumPy Data -> PyTorch tensor: `torch.from_numpy(ndarray)`
* PyTorch tensor -> NumPy Data: `torch.Tensor.numpy()`

In [None]:
# NumPy array to tensor

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array).type(torch.float32)
array, tensor # warning: NumPy's default datatype is float64, while PyTorch's default datatype is float32

(array([1., 2., 3., 4., 5., 6., 7.]), tensor([1., 2., 3., 4., 5., 6., 7.]))

In [None]:
# Change the value of array. What will this do to `tensor`?
# Nothing: `array + 1` creates a new array, and `tensor` is already a separate float32 copy

array = array + 1
array, tensor

(array([2., 3., 4., 5., 6., 7., 8.]), tensor([1., 2., 3., 4., 5., 6., 7.]))
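
By contrast, `torch.from_numpy()` on its own shares memory with the array, so an in-place change to the array is visible in the tensor. The sketch below uses illustrative names (`shared`, `copied`) to compare that with a tensor copied by a dtype conversion.

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
shared = torch.from_numpy(array)          # float64, shares memory with array
copied = torch.from_numpy(array).float()  # float32 copy, independent of array

array += 1  # in-place update of the NumPy buffer
print(shared[0], copied[0])
```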

In [None]:
# Tensor to NumPy array

tensor = torch.ones(7)
numpy_tensor = tensor.numpy() # the array keeps the tensor's dtype (float32), not NumPy's default float64
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

In [None]:
# Change the tensor. What happens to `numpy_tensor`?
# Nothing: `tensor + 1` creates a new tensor, so `numpy_tensor` keeps the old values
tensor = tensor + 1
tensor, numpy_tensor

(tensor([2., 2., 2., 2., 2., 2., 2.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
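
For a CPU tensor, `Tensor.numpy()` returns an array backed by the same memory. The sketch below, with the illustrative name `numpy_view`, separates an in-place update (visible in the array) from rebinding the name to a new tensor (not visible).

```python
import torch

tensor = torch.ones(7)
numpy_view = tensor.numpy()

tensor.add_(1)       # in-place: numpy_view sees the change
print(numpy_view)

tensor = tensor + 1  # rebinding: a new tensor is created, numpy_view is untouched
print(tensor, numpy_view)
```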

## Reproducibility (trying to take random out of random)

### In short how a neural network learns:

`start with random numbers -> tensor operations -> update random numbers to try and make them better representations of the data -> again (repeat)`

To reduce the randomness in neural networks and PyTorch, we use the concept of a **random seed**.

Essentially what the random seed does is set the initial random number generator to a specific state, so that if you run the same code multiple times, it will produce the same random numbers.

### Resource
https://pytorch.org/docs/stable/notes/randomness.html

In [None]:
torch.rand(3, 3)

tensor([[0.5551, 0.9699, 0.7323],
        [0.7531, 0.6240, 0.0389],
        [0.6641, 0.7219, 0.1748]])

In [None]:
# Create two random tensors
random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)

print(random_tensor_A)
print(random_tensor_B)
print(random_tensor_A == random_tensor_B) # You likely won't ever get "True" here

tensor([[0.9904, 0.2649, 0.0933, 0.2652],
        [0.1414, 0.6662, 0.9613, 0.7012],
        [0.9322, 0.0388, 0.1791, 0.5544]])
tensor([[0.0441, 0.5297, 0.9194, 0.4380],
        [0.1000, 0.7825, 0.6678, 0.1498],
        [0.6538, 0.3127, 0.9382, 0.3729]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [None]:
# Let's make some random but reproducible tensors

# Set the random seed
RANDOM_SEED = 42 # An arbitrary number
torch.manual_seed(RANDOM_SEED) # sets the generator state; every random call afterwards advances it

random_tensor_C = torch.rand(3, 4)

torch.manual_seed(RANDOM_SEED) # Without this line, tensor C != tensor D

random_tensor_D = torch.rand(3, 4)

print(random_tensor_C)
print(random_tensor_D)
print(random_tensor_C == random_tensor_D)

tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
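
The seed fixes the starting state of the generator, not each individual call. In this CPU sketch with illustrative names, two consecutive draws after seeding differ from each other, while re-seeding reproduces the whole sequence.

```python
import torch

torch.manual_seed(42)
first = torch.rand(2)
second = torch.rand(2)        # the generator has advanced, so this differs from `first`

torch.manual_seed(42)         # rewind to the same starting state
first_again = torch.rand(2)
second_again = torch.rand(2)

print(torch.equal(first, first_again), torch.equal(second, second_again))
```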


## Running tensors and PyTorch objects on the GPUs (and making faster computations)

Computing on tensors generally happens much faster on GPUs (graphics processing units, typically from NVIDIA) than on CPUs (central processing units).

MPS stands for "Metal Performance Shaders", Apple's GPU compute framework; PyTorch's `"mps"` device uses it to run on Apple GPUs (M1, M1 Pro, M2, etc.).

It is advised to perform training on the fastest piece of hardware you have available, which will generally be: NVIDIA GPU ("cuda") > MPS device ("mps") > CPU ("cpu").

### 1. Getting a GPU


| **Method** | **Difficulty to setup** | **Pros** | **Cons** | **How to setup** |
| ----- | ----- | ----- | ----- | ----- |
| Google Colab | Easy | Free to use, almost zero setup required, can share work with others as easy as a link | Doesn't save your data outputs, limited compute, subject to timeouts | [Follow the Google Colab Guide](https://colab.research.google.com/notebooks/gpu.ipynb) |
| Use your own | Medium | Run everything locally on your own machine | GPUs aren't free, require upfront cost | Follow the [PyTorch installation guidelines](https://pytorch.org/get-started/locally/) |
| Cloud computing (AWS, GCP, Azure) | Medium-Hard | Small upfront cost, access to almost infinite compute | Can get expensive if running continually, takes some time to setup right | Follow the [PyTorch installation guidelines](https://pytorch.org/get-started/cloud-partners/) |

There are more options for using GPUs but the above three will suffice for now.

Personally, I use a combination of Google Colab and my own personal computer for small scale experiments (and creating this course) and go to cloud resources when I need more compute power.

> **Resource:** If you're looking to purchase a GPU of your own but not sure what to get, [Tim Dettmers has an excellent guide](https://timdettmers.com/2020/09/07/which-gpu-for-deep-learning/).

To check if you've got access to an NVIDIA GPU, you can run `!nvidia-smi`, where the `!` (also called bang) means "run this on the command line".

### 2. Check for GPU with PyTorch

In [None]:
# Setup Device-Agnostic Code

if torch.cuda.is_available():
    device = "cuda" # NVIDIA GPU
elif torch.backends.mps.is_available():
    device = "mps" # Apple GPU
else:
    device = "cpu" # Defaults to CPU if NVIDIA GPU/Apple GPU aren't available

print(f"Using device: {device}")

Using device: mps


In [None]:
# Count number of devices
torch.mps.device_count()
# torch.cuda.device_count()

1

### 3. Putting tensors (and models) on the GPU
The reason we want our tensors/models on the GPU is because using a GPU results in faster computations.

In [None]:
# Create a tensor (default on the CPU)
tensor = torch.tensor([1, 2, 3])

# Tensor not on GPU
print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [None]:
# Move tensor to GPU if available
tensor_on_gpu = tensor.to(device)
tensor_on_gpu # mps:0 refers to index 0 of MPS device

tensor([1, 2, 3], device='mps:0')

### 4. Moving tensors back to CPU
This is important because there are some operations that are only supported on the CPU, such as NumPy operations.

In [None]:
# If tensor is on GPU, can't transform it to NumPy
#tensor_on_gpu.numpy()  # Raises an error

# To fix the GPU tensor with NumPy issue, we can first set it to the CPU
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu


array([1, 2, 3])

In [None]:
tensor_on_gpu

tensor([1, 2, 3], device='mps:0')
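
One way to avoid the device error is to wrap the conversion in a small function. `to_numpy` below is a hypothetical helper, not part of PyTorch or the original notebook. It also calls `detach()`, which is only needed for tensors that track gradients, and it falls back to the CPU when no GPU is present.

```python
import torch

def to_numpy(t: torch.Tensor):
    """Hypothetical helper: return a NumPy array for a tensor on any device."""
    return t.detach().cpu().numpy()

# Same device selection as used earlier in this notebook
if torch.cuda.is_available():
    device = "cuda"
elif torch.backends.mps.is_available():
    device = "mps"
else:
    device = "cpu"

on_device = torch.tensor([1, 2, 3], device=device)
as_array = to_numpy(on_device)
print(as_array)
```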