# PyTorch Module 00

| [Video](https://www.youtube.com/watch?v=Z_ikDlimN6A) |
[PyTorch](https://pytorch.org/docs/stable/index.html) |
[Repository](https://github.com/mrdbourke/pytorch-deep-learning/blob/main/00_pytorch_fundamentals.ipynb) |


In [2]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

1.13.1


# PyTorch Basics and Fundamentals (tensors and tensor operations)

"A torch.Tensor is a multi-dimensional matrix containing elements of a single data type."

Tensors are the fundamental building blocks of neural networks.

https://pytorch.org/docs/stable/tensors.html

Inputs -> Numerical Encoding (tensor) -> Learning -> Representation Outputs (tensor) -> Outputs

## Introduction to Tensors

### Creating Tensors

In [3]:
# scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [4]:
scalar.ndim

0

In [5]:
scalar.item()

7

#### Vector

In [6]:
vector = torch.tensor([7,7])
vector

tensor([7, 7])

**Dimension/Rank**:
- 0: Scalar
- 1: Vector
- 2: Matrix
- 3: Cube (3-Tensor)
- n: n-Tensor

In [7]:
vector.ndim

1

**Shape (size along each dimension)**

In [8]:
vector.shape

torch.Size([2])
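
`shape` reports the size along each dimension rather than a count of elements; the total count comes from `numel()`. A minimal sketch of the difference, using an illustrative tensor (`example` is a name introduced here, not part of the notebook):

```python
import torch

# Hypothetical tensor for illustration: 2 rows, 3 columns
example = torch.tensor([[1, 2, 3],
                        [4, 5, 6]])

ndim = example.ndim        # number of dimensions
shape = example.shape      # size along each dimension
total = example.numel()    # total number of elements (2 * 3)

print(ndim, shape, total)
```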

#### MATRIX

In [9]:
MATRIX = torch.tensor([[7,8],
                       [9,10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [10]:
MATRIX.ndim

2

In [11]:
MATRIX[0]

tensor([7, 8])

In [12]:
MATRIX.shape

torch.Size([2, 2])

#### TENSOR

An n-dimensional array of numbers (here n > 2).

In [13]:

TENSOR = torch.tensor([[[1,2,3],
                        [3,6,5],
                        [3,6,8]]])

TENSOR

tensor([[[1, 2, 3],
         [3, 6, 5],
         [3, 6, 8]]])

In [14]:
TENSOR.ndim

3

In [15]:
TENSOR.shape

torch.Size([1, 3, 3])

In [16]:
TENSOR[0]

tensor([[1, 2, 3],
        [3, 6, 5],
        [3, 6, 8]])

| Name | What is it? | Number of dimensions | Lower or upper (usually/example) |
| ----- | ----- | ----- | ----- |
| **scalar** | a single number | 0 | Lower (`a`) | 
| **vector** | a number with direction (e.g. wind speed with direction) but can also have many other numbers | 1 | Lower (`y`) |
| **matrix** | a 2-dimensional array of numbers | 2 | Upper (`Q`) |
| **tensor** | an n-dimensional array of numbers | can be any number, a 0-dimension tensor is a scalar, a 1-dimension tensor is a vector | Upper (`X`) | 

#### Random tensors

The way many neural networks learn is to start with a tensor full of random numbers

`Start with random numbers -> look at data -> update random numbers -> look -> update`

**torch.rand()**

In [17]:
random_tensor = torch.rand(3,4)
random_tensor

tensor([[0.0433, 0.1208, 0.1568, 0.4083],
        [0.5378, 0.7873, 0.5815, 0.3775],
        [0.3308, 0.5742, 0.7561, 0.3904]])

In [18]:
random_tensor.ndim

2

**Create random tensor with similar shape to an image tensor**

Represent height, width, color channels (R, G, B)

In [19]:
random_image_size_tensor = torch.rand(size=(224,224,3))
print(random_image_size_tensor.shape, random_image_size_tensor.ndim)

torch.Size([224, 224, 3]) 3


#### Zeros and ones

In [20]:
# Create a tensor of all zeros
zeros = torch.zeros(size=(3,4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [21]:
zeros*random_tensor

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [22]:
# Create a tensor of all ones
ones = torch.ones(size=(3,4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [23]:
ones.dtype

torch.float32

#### Creating a range and tensors-like

**torch.arange()**

In [24]:
# one_to_ten = torch.arange(start=0,end=1000, step=77)
one_to_ten = torch.arange(1,11)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [25]:
ten_zeros = torch.zeros_like(input=one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [26]:
ten_ones = torch.ones_like(input=one_to_ten)
ten_ones

tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

### Tensor datatypes

In [27]:
float_32_tensor = torch.tensor([3.0 ,6.0, 9.0],
                               dtype=None, # e.g. dtype=torch.float16; None falls back to the default (float32 here)
                               device=None,
                               requires_grad=False)

In [28]:
float_32_tensor.dtype

torch.float32

In [29]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor.dtype

torch.float16

In [30]:
float_16_tensor * float_32_tensor # mixed dtypes work here: type promotion gives float32

tensor([ 9., 36., 81.])

In [31]:
int_32_tensor = torch.tensor([3, 6, 9], dtype=torch.int32)
int_32_tensor

tensor([3, 6, 9], dtype=torch.int32)

In [32]:
float_32_tensor*int_32_tensor

tensor([ 9., 36., 81.])
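
The two mixed-dtype multiplications above succeed because PyTorch promotes the operands to a common dtype before computing. A small sketch of how to check the promoted dtype in advance; the variable names are local to this example:

```python
import torch

float_32 = torch.tensor([3.0, 6.0, 9.0])                 # default float32
float_16 = float_32.type(torch.float16)
int_32 = torch.tensor([3, 6, 9], dtype=torch.int32)

# Ask PyTorch which dtype a mixed operation would produce
print(torch.promote_types(torch.float16, torch.float32))  # torch.float32
print(torch.result_type(float_32, int_32))                # torch.float32

mixed_float = float_16 * float_32
mixed_int = float_32 * int_32
print(mixed_float.dtype, mixed_int.dtype)
```

Not every operation promotes this way, so checking `dtype` explicitly is still a good habit when an error mentions mismatched types.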

**Tensor Attributes**

In [33]:
some_tensor = torch.rand(3,4)

print(some_tensor)
print(f"dtype: {some_tensor.dtype}")
print(f"shape: {some_tensor.shape}")
print(f"size: {some_tensor.size()}")
print(f"device: {some_tensor.device}")

tensor([[0.1137, 0.2806, 0.7238, 0.3555],
        [0.1968, 0.5964, 0.5514, 0.5438],
        [0.2169, 0.5143, 0.4983, 0.7512]])
dtype: torch.float32
shape: torch.Size([3, 4])
size: torch.Size([3, 4])
device: cpu


## Manipulating Tensors (tensor operations)

- Addition
- Subtraction
- Multiplication
- Division
- Matrix multiplication

In [34]:
tensor = torch.tensor([1,2,3])
tensor + 10

tensor([11, 12, 13])

In [35]:
tensor - 10

tensor([-9, -8, -7])

In [36]:
tensor * 10

tensor([10, 20, 30])

In [37]:
tensor / 10

tensor([0.1000, 0.2000, 0.3000])

**PyTorch built-in functions**

In [38]:
torch.mul(tensor, 10)

tensor([10, 20, 30])

In [39]:
torch.add(tensor, 3)

tensor([4, 5, 6])

### Matrix multiplication (is all you need)

One of the most common operations in machine learning and deep learning algorithms (like neural networks) is matrix multiplication.

PyTorch implements matrix multiplication in the `torch.matmul()` function.

The two main rules to remember for matrix multiplication are:

1. The **inner dimensions** must match:
   * `(3, 2) @ (3, 2)` won't work
   * `(2, 3) @ (3, 2)` will work
   * `(3, 2) @ (2, 3)` will work
2. The resulting matrix has the shape of the **outer dimensions**:
   * `(2, 3) @ (3, 2)` -> `(2, 2)`
   * `(3, 2) @ (2, 3)` -> `(3, 3)`

Note: "`@`" in Python is the symbol for matrix multiplication.

Resource: You can see all of the rules for matrix multiplication using torch.matmul() in the PyTorch documentation.
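
The two shape rules can be checked directly. A minimal sketch with random matrices; the names `a` and `b` are illustrative only:

```python
import torch

a = torch.rand(2, 3)
b = torch.rand(3, 2)

print((a @ b).shape)   # inner dims 3 and 3 match -> torch.Size([2, 2])
print((b @ a).shape)   # inner dims 2 and 2 match -> torch.Size([3, 3])

# (3, 2) @ (3, 2): inner dimensions 2 and 3 do not match
try:
    b @ b
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print(f"RuntimeError: {err}")
```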

Let's create a tensor and perform element-wise multiplication and matrix multiplication on it

In [40]:
import torch
tensor = torch.tensor([1, 2, 3])
tensor.shape

torch.Size([3])

The difference between element-wise multiplication and matrix multiplication is that matrix multiplication also adds the products together.

For our `tensor` variable with values `[1, 2, 3]`:

| Operation | Calculation | Code |
| ----- | ----- | ----- |
| **Element-wise multiplication** | `[1*1, 2*2, 3*3]` = `[1, 4, 9]` | `tensor * tensor` |
| **Matrix multiplication** | `[1*1 + 2*2 + 3*3]` = `[14]` | `tensor.matmul(tensor)` |
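
The two rows of the table can be reproduced in code. The sketch below also writes the dot product out by hand to show where the addition happens; `manual` is a name introduced for this example:

```python
import torch

tensor = torch.tensor([1, 2, 3])

elementwise = tensor * tensor         # multiply matching positions
dot = torch.matmul(tensor, tensor)    # multiply matching positions, then sum

# The same dot product written out by hand
manual = 0
for i in range(len(tensor)):
    manual += tensor[i] * tensor[i]

print(elementwise)   # tensor([1, 4, 9])
print(dot, manual)   # tensor(14) tensor(14)
```

For two 1-dimensional tensors `torch.matmul()` returns a 0-dimensional tensor, so the result prints as `tensor(14)`.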


In [41]:
tensor1 = torch.tensor([[1,2],
                        [3,4],
                        [5,6]])

tensor2 = torch.tensor([[7,10],
                        [8,11],
                        [9,12]])



In [42]:
torch.matmul(tensor1, tensor2)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

In [None]:
tensor1.shape, tensor2.shape

### Transpose

Useful to allow matrix multiplication

In [None]:
tensor2, tensor2.shape

In [None]:
tensor2.T, tensor2.T.shape

In [None]:
torch.matmul(tensor1, tensor2.T), torch.matmul(tensor1, tensor2.T).shape

In [None]:
# The operation works when tensor2 is transposed
print(f"Original shapes: tensor1 = {tensor1.shape}, tensor2 = {tensor2.shape}\n")
print(f"New shapes: tensor1 = {tensor1.shape} (same as above), tensor2.T = {tensor2.T.shape}\n")
print(f"Multiplying: {tensor1.shape} * {tensor2.T.shape} <- inner dimensions match\n")
print("Output:\n")
output = torch.matmul(tensor1, tensor2.T)
print(output) 
print(f"\nOutput shape: {output.shape}")

Neural networks are full of matrix multiplications and dot products.

The [`torch.nn.Linear()`](https://pytorch.org/docs/1.9.1/generated/torch.nn.Linear.html) module (we'll see this in action later on), also known as a feed-forward layer or fully connected layer, implements a matrix multiplication between an input `x` and a weights matrix `A`.

$$
y = x\cdot{A^T} + b
$$

Where:
* `x` is the input to the layer (deep learning is a stack of layers like `torch.nn.Linear()` and others on top of each other).
* `A` is the weights matrix created by the layer, this starts out as random numbers that get adjusted as a neural network learns to better represent patterns in the data (notice the "`T`", that's because the weights matrix gets transposed).
  * **Note:** You might also often see `W` or another letter like `X` used to showcase the weights matrix.
* `b` is the bias term used to slightly offset the weights and inputs.
* `y` is the output (a manipulation of the input in the hopes to discover patterns in it).

This is a linear function (you may have seen something like $y = mx+b$ in high school or elsewhere), and can be used to draw a straight line!
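
To connect the formula to code, the sketch below builds a small `torch.nn.Linear` layer and recomputes its output by hand from the layer's `weight` and `bias` attributes. The layer sizes and the seed are arbitrary choices for illustration:

```python
import torch

torch.manual_seed(42)  # arbitrary seed so the random weights are repeatable

# in_features must match the inner dimension of the input
linear = torch.nn.Linear(in_features=2, out_features=6)

x = torch.tensor([[1., 2.],
                  [3., 4.],
                  [5., 6.]])        # shape (3, 2)

y = linear(x)                       # shape (3, 6)

# The same computation written out: y = x @ A^T + b
manual = x @ linear.weight.T + linear.bias

print(y.shape)
print(torch.allclose(y, manual))
```

`linear.weight` has shape `(6, 2)`, which is why it is transposed before multiplying with the `(3, 2)` input.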



### Aggregation

In [None]:
x = torch.arange(0,100,10)
x

In [None]:
torch.min(x), x.min()

In [None]:
torch.max(x), x.max()

In [None]:
torch.mean(x)  # raises a RuntimeError: mean() needs a floating point (or complex) dtype, and x is int64

In [None]:
x.dtype

**Convert integers to float32 before calculating the mean**

In [None]:
torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()
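
A short sketch of why the conversion is needed: `torch.mean()` rejects integer tensors, so the failing call is wrapped in `try`/`except` here to keep the example runnable. `mean_failed` and `mean_value` are names introduced for illustration:

```python
import torch

x = torch.arange(0, 100, 10)      # int64 by default

try:
    torch.mean(x)                 # integer input is rejected
    mean_failed = False
except RuntimeError as err:
    mean_failed = True
    print(f"RuntimeError: {err}")

mean_value = x.float().mean()     # .float() is shorthand for .type(torch.float32)
print(mean_value)                 # tensor(45.)
```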

In [None]:
torch.sum(x), x.sum()

**Positional min and max**

In [None]:
torch.argmax(x), x.argmax()

In [None]:
torch.argmin(x), x.argmin()

### Reshaping, stacking, squeezing and unsqueezing

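* Reshape - return a tensor with the same data in a new shape (the number of elements must stay the same)
* View - return a view of the input tensor in a certain shape that shares the same memory as the original tensor
* Stacking - combine several tensors along a new dimension with `torch.stack()`; `torch.vstack()` and `torch.hstack()` stack vertically and horizontally
* Squeeze - remove all dimensions of size `1`
* Unsqueeze - add a dimension of size `1` at a given position
* Permute - return a view with dimensions permuted (swapped) in a certain order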

In [None]:
import torch
x = torch.arange(1.,10.)
x, x.shape

In [None]:
x_reshaped = x.reshape(9,1)
x_reshaped, x_reshaped.shape

In [None]:
x_reshaped = x.reshape(1,9)
x_reshaped, x_reshaped.shape

In [None]:
x_reshaped = x.reshape(3,3)
x_reshaped, x_reshaped.shape

In [None]:
x = torch.arange(1.,11.)
x, x.shape
x_reshaped = x.reshape(5,2)
x_reshaped, x_reshaped.shape

**view()**

In [None]:
x = torch.arange(1.,10.)
z = x.view(1,9)
z, z.shape

**Changing z changes x (z shares memory with input)**

In [None]:
z[:,0] = 5
z, x
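
One way to confirm the memory sharing is to compare the addresses reported by `data_ptr()`. The sketch below also shows `clone()` as a way to get an independent copy; the name `c` is introduced for this example:

```python
import torch

x = torch.arange(1., 10.)
z = x.view(1, 9)            # view: same underlying memory as x
c = x.clone().view(1, 9)    # clone first to get an independent copy

print(z.data_ptr() == x.data_ptr())   # True: same memory
print(c.data_ptr() == x.data_ptr())   # False: separate memory

z[:, 0] = 5
print(x[0], c[0, 0])                  # tensor(5.) tensor(1.)
```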

**Stacks**

In [None]:
x_stacked = torch.stack([x,x,x,x], dim=1)
x_stacked

In [None]:
# Stack tensors on top
x_stacked = torch.stack([x,x,x,x]) # default dim=0
x_stacked
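
The effect of `dim` is easiest to see in the resulting shapes. A minimal sketch that also compares `torch.stack()` with `torch.vstack()` and `torch.hstack()` on a 1-dimensional input:

```python
import torch

x = torch.arange(1., 10.)                      # shape (9,)

print(torch.stack([x, x, x, x], dim=0).shape)  # torch.Size([4, 9])
print(torch.stack([x, x, x, x], dim=1).shape)  # torch.Size([9, 4])

# For 1-dimensional inputs, vstack builds rows and hstack joins end to end
print(torch.vstack([x, x]).shape)              # torch.Size([2, 9])
print(torch.hstack([x, x]).shape)              # torch.Size([18])
```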

**Squeeze and Unsqueeze**

In [None]:
x_reshaped = x.reshape(1,9)
x_reshaped

In [None]:
print(f"Previous tensor: {x_reshaped}")
print(f"Previous shape: {x_reshaped.shape}")

x_squeezed = x_reshaped.squeeze()

print(f"\nNew tensor: {x_squeezed}")
print(f"New shape: {x_squeezed.shape}")

In [None]:
print(f"Previous tensor: {x_squeezed}")
print(f"Previous shape: {x_squeezed.shape}")

x_unsqueezed = x_squeezed.unsqueeze(dim=0)

print(f"\nNew tensor: {x_unsqueezed}")
print(f"New shape: {x_unsqueezed.shape}")

**Permute**

In [None]:
x_original = torch.rand(size=(224,224,3))

x_permuted = x_original.permute(2,0,1) # shifts 2->0, 0->1, 1->2

print(f"Previous shape: {x_original.shape}")
print(f"New shape: {x_permuted.shape}")

In [None]:
x_original.shape
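
Because `permute()` returns a view, the permuted tensor shares memory with the original. A small sketch that demonstrates this; the value `42.0` is an arbitrary marker chosen for the example:

```python
import torch

x_original = torch.rand(size=(224, 224, 3))   # (height, width, color channels)
x_permuted = x_original.permute(2, 0, 1)      # (color channels, height, width)

# Writing through the original is visible in the permuted view
x_original[0, 0, 0] = 42.0
print(x_permuted[0, 0, 0])                    # tensor(42.)

print(x_permuted.is_contiguous())             # False: only the strides changed
contiguous_copy = x_permuted.contiguous()     # copies the data into a new layout
```

Calling `.contiguous()` is one way to get a tensor with its own memory when an operation requires a contiguous layout.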

### Indexing (selecting data from tensors)

Indexing with PyTorch is similar to indexing with NumPy.

In [None]:
# Create a tensor
import torch
x = torch.arange(1,10).reshape(1,3,3)
x, x.shape

In [None]:
x[0]

In [None]:
x[0,0], x[0][0]

In [None]:
x[0,1,0], x[0][1][0]

In [None]:
x[:,0]

In [None]:
x[:,:,1]

In [None]:
x[:,1,1]

In [None]:
x[0,0,:]

In [None]:
x[:,:,2], x[:,:,2].squeeze()
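
Since the cells above were saved without their outputs, here is a sketch of what several of the same indexing expressions return for this `x`:

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)
# x is tensor([[[1, 2, 3],
#               [4, 5, 6],
#               [7, 8, 9]]])

print(x[0, 1, 0])    # tensor(4): row 1, column 0 of the only 3x3 block
print(x[:, 0])       # tensor([[1, 2, 3]]): first row of every block
print(x[:, :, 1])    # tensor([[2, 5, 8]]): middle column of every block
print(x[:, 1, 1])    # tensor([5]): centre value of every block
print(x[0, 0, :])    # tensor([1, 2, 3]): same values as x[0, 0]
```

A `:` keeps the whole dimension, which is why the results that use it retain an outer dimension of size 1.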

## NumPy
- Convert NumPy data to a PyTorch tensor: `torch.from_numpy(ndarray)`
- Convert a tensor to NumPy: `torch.Tensor.numpy()`

In [None]:
import torch
import numpy as np

array = np.arange(1.0,8.0)
tensor = torch.from_numpy(array)  # NumPy defaults to float64, and from_numpy() keeps that dtype
# tensor = torch.from_numpy(array).type(torch.float32)  # append .type(torch.float32) to convert to float32
array, tensor

In [None]:
array.dtype, tensor.dtype

In [None]:
tensor = torch.from_numpy(array).type(torch.float32)
tensor.dtype

In [None]:
tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor
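
One detail worth knowing about both conversions: for CPU tensors they share memory with the source instead of copying it, so in-place changes show up on both sides. A minimal sketch, with `clone()` used to get an independent copy; `ones_array` and `copied` are names introduced here:

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)            # float64 by default
tensor = torch.from_numpy(array)       # keeps float64 and shares memory

array += 1                             # in-place change to the NumPy array
print(tensor[0])                       # the tensor sees the change

ones = torch.ones(7)
ones_array = ones.numpy()              # also shares memory
ones.add_(1)                           # in-place change to the tensor
print(ones_array[0])

copied = torch.from_numpy(array).clone()   # clone() gives an independent tensor
array += 1
print(copied[0])                       # unchanged by the second update
```

Reassigning with `array = array + 1` creates a new array instead, so a tensor made from the old one would not change.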

## Reproducibility

To make the randomness in neural networks repeatable, set a **random seed**.


In [None]:
random_tensor_A = torch.rand(3,4)
random_tensor_B = torch.rand(3,4)

print(random_tensor_A)
print(random_tensor_B)
print(random_tensor_B == random_tensor_A)

In [None]:
RANDOM_SEED = 42
torch.manual_seed(RANDOM_SEED)
random_tensor_C = torch.rand(3,4)
torch.manual_seed(RANDOM_SEED)
random_tensor_D = torch.rand(3,4)

print(random_tensor_C)
print(random_tensor_D)
print(random_tensor_C == random_tensor_D)
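
The seed has to be reset before each `torch.rand()` call because every call advances the random number generator. An alternative sketch, assuming you prefer not to touch the global seed, uses separate `torch.Generator` objects; the names `gen_a`, `gen_b` and `SEED` are illustrative:

```python
import torch

SEED = 42  # arbitrary value

gen_a = torch.Generator().manual_seed(SEED)
gen_b = torch.Generator().manual_seed(SEED)

tensor_a = torch.rand(3, 4, generator=gen_a)
tensor_b = torch.rand(3, 4, generator=gen_b)
tensor_c = torch.rand(3, 4, generator=gen_a)   # second draw from gen_a

print(torch.equal(tensor_a, tensor_b))   # True: same seed, same first draw
print(torch.equal(tensor_a, tensor_c))   # False: the generator has moved on
```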

## Running tensors on GPUs

In [43]:
tensor.device

device(type='cpu')

In [44]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

In [45]:
tensor_on_device = tensor.to(device)  # avoid the name `input`, which shadows a Python built-in

In [46]:
print(tensor_on_device, tensor_on_device.device)

tensor([1, 2, 3], device='cuda:0') cuda:0


In [189]:
!nvidia-smi

Sun Dec 18 11:35:53 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.85.02    Driver Version: 510.85.02    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   50C    P0    N/A /  N/A |    355MiB /  2048MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [190]:
torch.cuda.is_available()

True

In [191]:
 # Count number of devices
torch.cuda.device_count()

1

## Putting tensors and models on the GPU

In [192]:
tensor = torch.tensor([1,2,3])
print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [195]:
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')

In [196]:
new_tensor = torch.tensor([2,3,4], device=device)

In [197]:
new_tensor

tensor([2, 3, 4], device='cuda:0')

In [None]:
# A tensor on the GPU must be copied to the CPU before converting to NumPy

In [198]:
tensor_back_to_cpu = new_tensor.cpu().numpy()
tensor_back_to_cpu

array([2, 3, 4])
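
The steps in this section can be combined into a device-agnostic sketch that runs whether or not a GPU is present. The names `doubled` and `array` are introduced here for illustration:

```python
import torch

# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tensor = torch.tensor([1, 2, 3], device=device)
doubled = tensor * 2                 # the result stays on the same device

# .cpu() copies a GPU tensor to the CPU and returns the same tensor if it is already there
array = doubled.cpu().numpy()
print(device, array)
```

Calling `.cpu()` unconditionally keeps the conversion code identical on both kinds of machine.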