<a href="https://colab.research.google.com/github/robmonday/pytorch-sandbox/blob/main/00__pytorch_fundamentals_video.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## 00. PyTorch Fundamentals

Resource notebook: https://www.learnpytorch.io/00_pytorch_fundamentals


In [112]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.0.1+cu118


## Introduction to Tensors

### Creating tensors
https://pytorch.org/docs/stable/tensors.html

In [113]:
# scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [114]:
scalar.ndim

0

In [115]:
# Get tensor back as Python int
scalar.item()

7

In [116]:
# Vector
vector = torch.tensor([7,7])
vector

tensor([7, 7])

In [117]:
vector.ndim
vector.shape

torch.Size([2])

In [118]:
# Matrix
MATRIX = torch.tensor([[7,8],
                       [9, 10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [119]:
MATRIX.ndim
MATRIX.shape

torch.Size([2, 2])

In [120]:
MATRIX[0]
MATRIX[1]

tensor([ 9, 10])

In [121]:
# TENSOR
TENSOR = torch.tensor([[[1,2,3],
                        [3,6,9],
                        [2,5,4]]])
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 5, 4]]])

In [122]:
TENSOR.ndim
TENSOR.shape

torch.Size([1, 3, 3])

In [123]:
TENSOR[0][0]

tensor([1, 2, 3])

### Random tensors

Random tensors are important because many neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.

```Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers```

In [124]:
# Create a random tensor of size (3,4)
random_tensor = torch.rand(3,4)
random_tensor

tensor([[0.0990, 0.8946, 0.0093, 0.8467],
        [0.2245, 0.6247, 0.5927, 0.6790],
        [0.9858, 0.2836, 0.4916, 0.1429]])

In [125]:
random_tensor.ndim

2

In [126]:
# Create a random tensor with similar shape to an image tensor
random_image_size_tensor = torch.rand(size=(224,224,3)) # height, width, color channels (RGB) # size doesn't have to be a named parameter
random_image_size_tensor.shape, random_image_size_tensor.ndim


(torch.Size([224, 224, 3]), 3)

### Zeros and ones

In [127]:
# Create a tensor of all zeros
zeros = torch.zeros(3,4)
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [128]:
zeros * random_tensor

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [129]:
# Create a tensor of all ones
ones = torch.ones(3,4)
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [130]:
# getting datatype of your tensor values ...float32 is the default
ones.dtype

torch.float32

### Creating a range of tensors and tensors-like

In [131]:
# Use torch.arange()
one_to_ten = torch.arange(1,11) # start, end
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [132]:
example = torch.arange(start=0,end=1000, step=77) #start, end, step
example

tensor([  0,  77, 154, 231, 308, 385, 462, 539, 616, 693, 770, 847, 924])

In [133]:
# Create a tensor of zeros with the same shape and dtype as another tensor
ten_zeros = torch.zeros_like(input=one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

## Tensor datatypes

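Why use anything other than 32-bit?  
16-bit is half precision: each value takes half the memory and is often faster to compute on hardware that supports it, at the cost of accuracy.  
64-bit is double precision: each value takes twice the memory and is usually slower to compute, in exchange for more accuracy. Search for "precision in computing" for background.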
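
The trade-off above can be seen directly by storing the same numbers at three precisions. This is a minimal sketch; the sample values are arbitrary and chosen only for illustration.

```python
import torch

# The same (arbitrary) values stored at three precisions.
values = [0.1, 1000.1, 3.14159265358979]
half = torch.tensor(values, dtype=torch.float16)
single = torch.tensor(values, dtype=torch.float32)
double = torch.tensor(values, dtype=torch.float64)

# Memory used per element, in bytes.
print(half.element_size(), single.element_size(), double.element_size())

# Near 1000, float16 can only represent multiples of 0.5, so 1000.1 is rounded.
print(half[1].item(), single[1].item(), double[1].item())
```

Speed differences depend heavily on the hardware, so this sketch only demonstrates memory use and rounding, not timing.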

In [134]:
float_32_tensor = torch.tensor([3.0, 6.0, 9.0], dtype=None, # data type, default dtype is float32
                                                device=None, # what device is your tensor on? Default is "cpu", but can be changed to "cuda" for gpu
                                                requires_grad=False) # do you want to track gradients?
float_32_tensor
float_32_tensor.dtype

torch.float32

The 3 big errors you'll run into with PyTorch and deep learning:
1. Tensors not right dtype
1. Tensors not right shape
1. Tensors not right device
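
When debugging one of these three errors, it can help to compare the relevant attributes of two tensors side by side. The helper below is a hypothetical illustration (`mismatches` is not part of PyTorch) and only a sketch: differing shapes are not always an error, since broadcasting and matrix multiplication allow some combinations.

```python
import torch

def mismatches(a, b):
    """Hypothetical helper: report which of dtype, shape, and device differ."""
    report = []
    if a.dtype != b.dtype:
        report.append(f"dtype: {a.dtype} vs {b.dtype}")
    if a.shape != b.shape:
        report.append(f"shape: {tuple(a.shape)} vs {tuple(b.shape)}")
    if a.device != b.device:
        report.append(f"device: {a.device} vs {b.device}")
    return report

a = torch.rand(3, 4)                        # float32, on the CPU
b = torch.ones(4, 3, dtype=torch.float64)   # different dtype and shape
print(mismatches(a, b))
```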

In [135]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [136]:
product = float_16_tensor * float_32_tensor
product.dtype # float16 * float32 is promoted to float32

torch.float32

In [137]:
int_32_tensor = torch.tensor([3,6,9], dtype=torch.int32)
int_32_tensor

tensor([3, 6, 9], dtype=torch.int32)

In [138]:
product2 = float_32_tensor * int_32_tensor
product2.dtype

torch.float32
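
The two products above work because PyTorch promotes mixed dtypes to a common type. As a small sketch, `torch.result_type` and `torch.promote_types` report that common type without performing the operation; the tensor names here are new and used only for this example.

```python
import torch

f16 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float16)
f32 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float32)
i32 = torch.tensor([3, 6, 9], dtype=torch.int32)

# dtype that f16 * f32 would produce
print(torch.result_type(f16, f32))

# promotion rule expressed on dtypes alone
print(torch.promote_types(torch.int32, torch.float32))

# an integer tensor combined with a float tensor takes the float dtype
print((f16 * i32).dtype)
```

Promotion hides many dtype mismatches, but not all operations promote automatically, so checking `dtype` remains a useful habit.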

### Getting info from tensors

* `tensor.dtype`
* `tensor.shape`
* `tensor.device`


In [139]:
# Create a tensor
some_tensor = torch.rand(size=(3,4), dtype=torch.float64)
some_tensor.dtype

torch.float64

In [140]:
# Get details about some_tensor
print(some_tensor)
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Shape of tensor: {some_tensor.shape}")
print(f"Device of tensor: {some_tensor.device}")

tensor([[0.7191, 0.8540, 0.7064, 0.1163],
        [0.3299, 0.5859, 0.9259, 0.0156],
        [0.2426, 0.0645, 0.5195, 0.3776]], dtype=torch.float64)
Datatype of tensor: torch.float64
Shape of tensor: torch.Size([3, 4])
Device of tensor: cpu
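
Since the device is one of the three attributes worth checking, here is a short sketch of a common idiom for choosing a device and moving a tensor onto it. The variable names `device` and `on_device` are illustrative, and the code falls back to the CPU when no GPU is available.

```python
import torch

# Use the GPU if PyTorch can see one, otherwise stay on the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

some_tensor = torch.rand(size=(3, 4), dtype=torch.float64)
on_device = some_tensor.to(device)  # dtype and shape are unchanged by the move

print(on_device.device, on_device.dtype, on_device.shape)
```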


### Manipulating Tensors (tensor operations)

Tensor operations include:
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix Multiplication

In [141]:
# create a tensor and add 10 to it
tensor = torch.tensor([1,2,3])
tensor + 10
# torch.add(tensor,10)

tensor([11, 12, 13])

In [142]:
# multiply tensor by 10
tensor * 10
# torch.mul(tensor, 10)  # function form of the * operator; both give the same result

tensor([10, 20, 30])

In [143]:
# subtract 10 from tensor
tensor - 10

tensor([-9, -8, -7])

### Matrix multiplication

Matrix multiplication is one of the most common operations in neural networks.

[More Detail about Matrix Multiplication](http://mathisfun.com/algebra/matrix-multiplying.html)

Two main ways of performing multiplication in neural nets and deep learning:

1. Element-wise multiplication
1. Matrix multiplication (for two 1-D tensors, this is the dot product)

Matrix multiplication must satisfy two main rules to avoid errors:  
1. The **inner dimensions** must match:
* `(2,3) @ (2,3)` won't work
* `(2,3) @ (3,2)` will work
* `(3,2) @ (2,3)` will work
* `(3,2) @ (3,2)` won't work

1. The resulting matrix has the shape of the **outer dimensions**
* `(2,3) @ (3,2)` -> `(2,2)`
* `(3,2) @ (2,3)` -> `(3,3)`
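
The two rules above can be written as a small shape check. `matmul_shape` is a hypothetical helper that covers only the 2-D case, and is a sketch rather than how PyTorch validates shapes internally.

```python
import torch

def matmul_shape(shape_a, shape_b):
    """Hypothetical helper for 2-D shapes: return the output shape or raise."""
    rows_a, cols_a = shape_a
    rows_b, cols_b = shape_b
    if cols_a != rows_b:  # rule 1: inner dimensions must match
        raise ValueError(f"inner dimensions differ: {cols_a} vs {rows_b}")
    return (rows_a, cols_b)  # rule 2: result takes the outer dimensions

a = torch.rand(2, 3)
b = torch.rand(3, 2)
print(matmul_shape(a.shape, b.shape), tuple((a @ b).shape))
```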

In [144]:
# Element wise multiplication
print(tensor, "*", tensor)
print("equals: ", tensor*tensor)

tensor([1, 2, 3]) * tensor([1, 2, 3])
equals:  tensor([1, 4, 9])


In [145]:
# Matrix multiplication
torch.matmul(tensor, tensor) # function form
tensor @ tensor # operator form; equivalent to torch.matmul

tensor(14)

In [146]:
%%time
value = 0
for i in range(len(tensor)):
  value += tensor[i]*tensor[i]
value

CPU times: user 265 µs, sys: 0 ns, total: 265 µs
Wall time: 273 µs


tensor(14)

In [147]:
%%time
torch.matmul(tensor, tensor)

CPU times: user 79 µs, sys: 0 ns, total: 79 µs
Wall time: 83.4 µs


tensor(14)

### One of the most common errors in deep learning:  shape errors

[Practice matrix multiplication](https://matrixmultiplication.xyz)

In [148]:
# Shapes for matrix multiplication

tensor_A = torch.tensor([[1,2],
                         [3,4],
                         [5,6]])

tensor_B = torch.tensor([[7,10],
                         [8,11],
                         [9,12]])

# torch.mm(tensor_A, tensor_B) # error; torch.mm is the 2-D-only form of matrix multiplication (no broadcasting)
# torch.matmul(tensor_A, tensor_B) # error
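
To see the shape error without stopping the notebook, the failing call can be wrapped in `try`/`except`. This sketch recreates the two tensors so that it runs on its own; the `batch` tensor is an extra example added to show a 3-D input that `torch.matmul` accepts.

```python
import torch

tensor_A = torch.tensor([[1, 2], [3, 4], [5, 6]])
tensor_B = torch.tensor([[7, 10], [8, 11], [9, 12]])

try:
    torch.matmul(tensor_A, tensor_B)  # (3, 2) @ (3, 2): inner dimensions 2 and 3 differ
except RuntimeError as err:
    print("matmul failed:", err)

# torch.matmul also handles batched inputs by broadcasting over leading dimensions.
batch = torch.rand(4, 3, 2)
print(torch.matmul(batch, tensor_B.T.float()).shape)
```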

To fix tensor shape issues, we can manipulate shape using **transpose**

A **transpose** switches the axes or dimensions of a given tensor

In [149]:
tensor_B.T

tensor([[ 7,  8,  9],
        [10, 11, 12]])

In [150]:
print("started with", tensor_B.shape)
print("transposed to", tensor_B.T.shape)

started with torch.Size([3, 2])
transposed to torch.Size([2, 3])


In [151]:
torch.matmul(tensor_A, tensor_B.T)

tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

In [152]:
# The matrix multiplication operation works when tensor_B is transposed
print(f"Original shapes: tensor_A={tensor_A.shape} tensor_B={tensor_B.shape}")
print(f"New shapes: tensor_A={tensor_A.shape} tensor_B.T={tensor_B.T.shape}")
print(f"Multiplying: {tensor_A.shape} @ {tensor_B.T.shape} <-- inner dimensions must match")
print("Output:\n")
output = torch.matmul(tensor_A, tensor_B.T)
print(output)
print(f"\n Output shape: {output.shape}")


Original shapes: tensor_A=torch.Size([3, 2]) tensor_B=torch.Size([3, 2])
New shapes: tensor_A=torch.Size([3, 2]) tensor_B.T=torch.Size([2, 3])
Multiplying: torch.Size([3, 2]) @ torch.Size([2, 3]) <-- inner dimensions must match
Output:

tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

 Output shape: torch.Size([3, 3])


### Tensor Aggregation:  finding the min, max, mean, sum, etc

In [153]:
# Create a tensor
x = torch.arange(1,100,10)
print(x.dtype)
x

torch.int64


tensor([ 1, 11, 21, 31, 41, 51, 61, 71, 81, 91])

In [154]:
# Find the min
torch.min(x), x.min() # both method and function versions are available; using either is fine

(tensor(1), tensor(1))

In [155]:
# Find the max
torch.max(x), x.max()

(tensor(91), tensor(91))

In [156]:
# Find the mean
# torch.mean(x) # error because mean doesn't accept long (int64) datatype
torch.mean(x.type(torch.float32))

tensor(46.)

### Finding the positional min and max

In [157]:
# find the position in tensor that has the minimum value --> returns index position of target tensor where min value occurs
x.argmin()

tensor(0)

In [158]:
x.argmax() # get the index
x[9] # get the value

tensor(91)

## Reshaping, stacking, squeezing, and unsqueezing tensors

* Reshaping - reshapes an input tensor to a defined shape
* View - returns a view of an input tensor with a certain shape while sharing the same memory as the original tensor
* Stacking - combines multiple tensors on top of each other or side by side (stack, vstack, and hstack)
* Squeeze - removes all dimensions of size 1 from a tensor
* Unsqueeze - adds a dimension of size 1 to a target tensor
* Permute - returns a view of the input with its dimensions permuted (swapped) in a certain way



In [159]:
# Let's create a tensor
import torch
x = torch.arange(1., 10.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [165]:
# Reshape: just remember you can't change the number of elements
x_reshaped = x.reshape(1,1,9) # Add two extra dimensions of size 1
x_reshaped = x.reshape(9,1) # Column of shape (9, 1)
x_reshaped = x.reshape(3,3)
x_reshaped

tensor([[1., 2., 3.],
        [4., 5., 6.],
        [7., 8., 9.]])

In [167]:
# View: returns a new view of the same data with a different shape; no data is copied
z = x.view(1,9)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [168]:
# Changing z changes x as well (because a view of a tensor shares same memory as the original tensor)
z[:,0] = 5
z, x

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))
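
The shared memory can be confirmed with `data_ptr()`, and avoided with `clone()` when an independent copy is wanted. A minimal sketch, using a fresh `x` and a new name `independent`, so that it does not rely on earlier cells:

```python
import torch

x = torch.arange(1., 10.)
z = x.view(1, 9)

# Both tensors point at the same underlying storage.
print(z.data_ptr() == x.data_ptr())

# clone() copies the data first, so edits to the copy leave x alone.
independent = x.clone().view(1, 9)
independent[:, 0] = 5
print(x[0].item(), independent[0, 0].item())
```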

In [176]:
# Stack tensors on top of each other
x_stacked = torch.stack([x,x,x,x], dim=0)
x_stacked
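# vstack and hstack are related, but they concatenate instead of adding a new dimension; for 1-D inputs like x, vstack gives the same (4, 9) result and hstack gives one 1-D tensor of 36 elements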

tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.]])
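
The difference between `stack`, `vstack`, and `hstack` is easiest to see by comparing output shapes. A short sketch on a fresh 1-D tensor:

```python
import torch

x = torch.arange(1., 10.)  # shape (9,)

print(torch.stack([x, x], dim=0).shape)  # new leading dimension
print(torch.stack([x, x], dim=1).shape)  # new trailing dimension
print(torch.vstack([x, x]).shape)        # rows stacked vertically
print(torch.hstack([x, x]).shape)        # 1-D inputs are joined end to end
```

For 1-D inputs, `torch.stack(..., dim=1)` is the call that places the tensors side by side as columns; `hstack` keeps the result 1-D.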

In [181]:
# Squeeze
# "squeezes" out ALL extraneous dimensions
l = torch.tensor([[[1,2,3,4]]])
l, l.squeeze()

(tensor([[[1, 2, 3, 4]]]), tensor([1, 2, 3, 4]))

In [186]:
# Unsqueeze
"""Returns a new tensor with a dimension of size one inserted at the specified position"""
m = torch.tensor([5,6,7,8])
m.unsqueeze(dim=1)

tensor([[5],
        [6],
        [7],
        [8]])

In [194]:
# Permute - rearranges the dimensions of a target tensor in a specified order
"""Returns a view of the original tensor input with its dimensions permuted."""
x = torch.randn(2, 3, 5)
x.size()
torch.permute(x, (2, 0, 1)).size()

torch.Size([5, 2, 3])

In [193]:
# permute can be helpful in handling images
image = torch.rand(size=(224,224,3)) # (height, width, color channels)

# switching color channels to be listed as 0th dimension
permuted_image = image.permute(2,0,1)
# permuted_image.shape
permuted_image

tensor([[[0.3027, 0.8892, 0.8592,  ..., 0.1477, 0.7128, 0.3171],
         [0.3214, 0.6682, 0.9338,  ..., 0.2402, 0.4036, 0.3994],
         [0.5903, 0.7127, 0.7236,  ..., 0.8543, 0.7674, 0.0883],
         ...,
         [0.9321, 0.2452, 0.0852,  ..., 0.2593, 0.6858, 0.4689],
         [0.7959, 0.6744, 0.9363,  ..., 0.6474, 0.3059, 0.1746],
         [0.2042, 0.9951, 0.9884,  ..., 0.5393, 0.2484, 0.9395]],

        [[0.6614, 0.7534, 0.9824,  ..., 0.3621, 0.9970, 0.3749],
         [0.3983, 0.2267, 0.4191,  ..., 0.6828, 0.1049, 0.3678],
         [0.0573, 0.5118, 0.2143,  ..., 0.9816, 0.4606, 0.1696],
         ...,
         [0.9061, 0.0766, 0.9054,  ..., 0.4929, 0.4377, 0.8641],
         [0.1599, 0.6122, 0.1554,  ..., 0.5130, 0.8877, 0.3412],
         [0.3423, 0.2034, 0.5278,  ..., 0.4863, 0.1106, 0.2179]],

        [[0.7776, 0.1507, 0.9512,  ..., 0.3005, 0.1571, 0.4445],
         [0.6138, 0.1962, 0.1948,  ..., 0.8964, 0.9632, 0.8585],
         [0.4283, 0.7585, 0.7768,  ..., 0.3321, 0.4208, 0.

## Indexing (selecting data from tensors)

Indexing with PyTorch is similar to indexing with NumPy

In [198]:
# Create a tensor
import torch
x = torch.arange(1,10).reshape(1,3,3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [199]:
# Let's index on our new tensor
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [212]:
# Let's index on the middle bracket (dim=1)
x[0][0]
# x[0,0] # equivalent syntax to above


tensor([1, 2, 3])

In [215]:
# Let's index on the innermost bracket
x[0][0][0]
x[0][1][1]
x[0][2][2]

tensor(9)

In [226]:
# You can also use ":" to select "all" of a target dimension
# indexing with ':' keeps that dimension in the result, while an integer index removes it
x[:,:,1]

tensor([[2, 5, 8]])

In [224]:
# Get all values of 0th dimension, but only the 1 index values of 1st and 2nd dimension
x[:,1,1]

tensor([5])

In [225]:
# Get index 0 of 0th and 1st dimension, and all values of 2nd dimension
x[0,0,:]

tensor([1, 2, 3])

In [230]:
x

# Index on x to return 9
x[0,2,2]

# Index on x to return 3,6,9
x[0,:,2]

tensor([3, 6, 9])

## PyTorch tensors & NumPy

NumPy is a popular Python library for scientific numerical computing.
Because of this, PyTorch has functionality to interact with it:

* Data in NumPy, want in PyTorch tensor -> `torch.from_numpy(ndarray)`
* PyTorch tensor -> NumPy -> `torch.Tensor.numpy()`

In [242]:
# NumPy array to tensor
import torch
import numpy as np

array = np.arange(1.0,8.0) # create a NumPy Array
tensor = torch.from_numpy(array) # convert to PyTorch Tensor
array, tensor

# NumPy's default float dtype is float64, while PyTorch's default is float32;
# torch.from_numpy keeps the array's dtype, so this tensor is float64

# tensor.type(torch.float32).dtype # although you can cast it back to float32

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [244]:
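# Change the value of the NumPy array...what will happen to the tensor?
array = array + 1
array, tensor
# array + 1 builds a NEW array and rebinds the name, so the tensor is unchanged here;
# the tensor still shares memory with the original array (re-running this cell adds 1 to array again each time)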

(array([3., 4., 5., 6., 7., 8., 9.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))
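
`torch.from_numpy` shares memory with the source array, so whether the tensor changes depends on whether the array is modified in place or replaced. The sketch below contrasts the two cases; `independent` is a new name used only for this example.

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)

array += 1                 # in-place update: the shared buffer changes
print(tensor[0].item())    # the tensor sees the new value

array = array + 1          # rebinding: a new array is created
print(array[0], tensor[0].item())  # the tensor still shows the old buffer

# clone() gives a tensor with its own memory.
independent = torch.from_numpy(array).clone()
array += 1
print(array[0], independent[0].item())
```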

In [249]:
# Tensor to NumPy array
tensor = torch.ones(7)
array = tensor.numpy()
tensor, array
# Notice: since tensor default dtype is float32, this is carried over to NumPy upon conversion

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

## Reproducibility:  trying to take the random out of random

In short, how a neural network learns:
```start with random numbers -> tensor operations -> update random numbers to try and make them better representations of the data -> again -> again -> again ...```
In reality, computers generate pseudorandom numbers, not truly random ones.
To make that randomness reproducible, we use a **random seed**, which 'flavors' the randomness: the same seed produces the same sequence of "random" numbers.


In [254]:
import torch

# Create two random tensors
random_tensor_A = torch.rand(3,4)
random_tensor_B = torch.rand(3,4)

print(random_tensor_A)
print(random_tensor_B)
print(random_tensor_A == random_tensor_B) # notice none of the random numbers are equal

tensor([[0.6621, 0.2603, 0.6178, 0.0320],
        [0.2337, 0.6517, 0.8606, 0.9311],
        [0.4574, 0.8489, 0.2774, 0.0192]])
tensor([[0.6461, 0.6426, 0.0509, 0.5797],
        [0.8141, 0.9596, 0.7169, 0.7272],
        [0.8338, 0.5266, 0.7403, 0.9103]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])
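
By contrast, setting a seed with `torch.manual_seed` before each call makes the two tensors identical. A minimal sketch; the seed value 42 and the names `RANDOM_SEED`, `random_tensor_C`, and `random_tensor_D` are arbitrary choices for this example.

```python
import torch

RANDOM_SEED = 42  # arbitrary value; any integer works

# Reset the seed before each call so both draws start from the same state.
torch.manual_seed(RANDOM_SEED)
random_tensor_C = torch.rand(3, 4)

torch.manual_seed(RANDOM_SEED)
random_tensor_D = torch.rand(3, 4)

print(random_tensor_C)
print(random_tensor_D)
print(torch.equal(random_tensor_C, random_tensor_D))
```

Without the second `torch.manual_seed` call, `random_tensor_D` would continue the sequence from where `random_tensor_C` left off and the two tensors would differ.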
