<a href="https://colab.research.google.com/github/Aryan95614/Pytorch-Lessons/blob/RUNTHISFILE/Pytorch_Tutorial_1_Fundamentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# Make sure PyTorch is installed
import torch 

print(torch.__version__)

1.13.0+cu116


Tensors are the fundamental building block of machine learning.

Their job is to represent data in a numerical way.

For example, you could represent an image as a tensor with shape [3, 224, 224] which would mean [colour_channels, height, width], as in the image has 3 colour channels (red, green, blue), a height of 224 pixels and a width of 224 pixels.

You can tell the number of dimensions a tensor in PyTorch has by the number of square brackets on the outside ([) and you only need to count one side.
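As a quick check of the shape example above, here is a minimal sketch that builds a random stand-in for an image (the values are random numbers, not real image data, and the names `image_like` and `nested` are made up for illustration) and reads off its dimensions.

```python
import torch

# A random stand-in for an RGB image: [colour_channels, height, width]
image_like = torch.rand(3, 224, 224)

print(image_like.ndim)   # number of dimensions
print(image_like.shape)  # size of each dimension

# Counting the opening square brackets of a nested list gives the same answer
nested = torch.tensor([[[1, 2], [3, 4]]])  # three opening brackets
print(nested.ndim)
```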


In [None]:
# Create a scalar -> a zero-dimensional tensor
Scalar = torch.tensor(7)
print(Scalar.ndim) # 0: a scalar has no dimensions (no square brackets)
print(Scalar.item()) # 7: get the plain Python number back out of the tensor

A vector is a single dimension tensor but can contain many numbers.

As in, you could have a vector [3, 2] to describe [bedrooms, bathrooms] in your house. Or you could have [3, 2, 2] to describe [bedrooms, bathrooms, car_parks] in your house.

The important trend here is that a vector is flexible in what it can represent (the same with tensors).

In [None]:
# Create a vector -> flexible in what it can represent
Vector = torch.tensor([7, 7])
print(Vector.shape) # torch.Size([2]): one dimension holding two elements

A matrix has one more dimension than a vector and is just as flexible in what it can represent.


In [None]:
# Matrix
MATRIX = torch.tensor([[7, 8], 
                       [9, 10]])
print(MATRIX.ndim)  # 2: a matrix is two-dimensional, so unlike a vector it holds rows and columns of values
print(MATRIX.shape) # torch.Size([2, 2]): two rows (sub-lists) with two elements each

In [None]:
#Tensor
Tensor = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]]])
print(Tensor.ndim)  # 3: three opening square brackets, so three dimensions
print(Tensor.shape) # torch.Size([1, 3, 3]): read from the outermost bracket inwards (1 block of 3 rows with 3 elements each)

We've established tensors represent some form of data.

And machine learning models such as neural networks manipulate and seek patterns within tensors.

But when building machine learning models with PyTorch, it's rare you'll create tensors by hand (like what we've been doing).

Instead, a machine learning model often starts out with large random tensors of numbers and adjusts these random numbers as it works through data to better represent it.

In essence:

Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers...

As a data scientist, you can define how the machine learning model starts (initialization), looks at data (representation) and updates (optimization) its random numbers.
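To make the loop above concrete, here is a minimal sketch, not how PyTorch models are normally trained. It assumes made-up toy data following y = 2x, a single random number as the "model", and a hand-written update rule (the gradient of the mean squared error); the names `weight`, `learning_rate` and `start_weight` are illustrative only.

```python
import torch

torch.manual_seed(0)

# Toy data following y = 2 * x (an assumption made up for this sketch)
x = torch.tensor([1.0, 2.0, 3.0, 4.0])
y = 2 * x

# 1. Start with a random number
weight = torch.rand(1)
start_weight = weight.clone()

learning_rate = 0.05
for step in range(50):
    # 2. Look at data: how wrong is the current guess?
    error = weight * x - y
    # 3. Update the number in the direction that reduces the mean squared error
    gradient = (2 * error * x).mean()
    weight = weight - learning_rate * gradient

print(f"Started at {start_weight.item():.3f}, ended at {weight.item():.3f}")
```

The random starting value gets pulled towards 2, the number that actually describes the toy data.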

In [None]:
# We can generate tensors that are already filled with values

# Create a random tensor of size (3, 4)
random_tensor = torch.rand(size=(3, 4))
print(random_tensor)
print(random_tensor.dtype)

# Create tensors filled entirely with 0s or entirely with 1s
zeros = torch.zeros(size=(3, 4))
print(zeros)
print(zeros.dtype)

ones = torch.ones(size=(3, 4))
print(ones)
print(ones.dtype , "\n")

# Create a range of values from 0 up to (but not including) 10
zero_to_ten = torch.arange(start=0, end=10, step=1)
print(zero_to_ten)

# Create a tensor of zeros with the same shape as another tensor
ten_zeros = torch.zeros_like(input=zero_to_ten) # will have the same shape
print(ten_zeros)


## Tensor datatypes

There are many different [tensor datatypes available in PyTorch](https://pytorch.org/docs/stable/tensors.html#data-types).

Some are specific for CPU and some are better for GPU.

Getting to know which is which can take some time.

Generally if you see torch.cuda anywhere, the tensor is being used for GPU (since Nvidia GPUs use a computing toolkit called CUDA).

The most common type (and generally the default) is torch.float32 or torch.float.

This is referred to as "32-bit floating point".

But there's also 16-bit floating point (torch.float16 or torch.half) and 64-bit floating point (torch.float64 or torch.double).
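As a small sketch of the three floating point types just mentioned, the snippet below prints how many bytes each one uses per element and checks that the short names refer to the same datatypes. The loop variable names are illustrative.

```python
import torch

for dtype in (torch.float16, torch.float32, torch.float64):
    t = torch.tensor([3.0, 6.0, 9.0], dtype=dtype)
    print(dtype, "->", t.element_size(), "bytes per element")

# The short names are aliases for the same datatypes
print(torch.half == torch.float16)
print(torch.float == torch.float32)
print(torch.double == torch.float64)
```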


In [None]:
# Default datatype for floating point tensors is float32
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=None, # defaults to None, which infers the datatype from the data (torch.float32 for Python floats)
                               device=None, # defaults to None, which uses the current default device (the CPU unless changed)
                               requires_grad=False) # if True, operations performed on the tensor are recorded

print(float_32_tensor.shape)
print(float_32_tensor.dtype)
print(float_32_tensor.device) # cpu (PyTorch likes calculations between tensors to be on the same device)


float_16_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=torch.float16) # torch.half would also work
print(float_16_tensor.dtype)

# Getting information from tensors

Once you've created tensors (or someone else or a PyTorch module has created them for you), you might want to get some information from them.

We've seen these before but three of the most common attributes you'll want to find out about tensors are:

- shape - what shape is the tensor? (some operations require specific shape rules)
- dtype - what datatype are the elements within the tensor stored in?
- device - what device is the tensor stored on? (usually GPU or CPU)

Let's create a random tensor and find out details about it.

In [None]:
# Create a tensor
some_tensor = torch.rand(3, 4)

# Find out details about it
print(some_tensor)
print(f"Shape of tensor: {some_tensor.shape}")
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Device tensor is stored on: {some_tensor.device}") # will default to CPU


# Note: When you run into issues in PyTorch, it's very often
# to do with one of the three attributes above.
# So when the error messages show up, sing yourself a
# little song called "what, what, where":

# "what shape are my tensors? what datatype are they and where are they stored?
# what shape, what datatype, where where where"



## Manipulating tensors (tensor operations)

In deep learning, data (images, text, video, audio, protein structures, etc) gets represented as tensors.

A model learns by investigating those tensors and performing a series of operations (could be 1,000,000s+) on tensors to create a representation of the patterns in the input data.

These operations are often a wonderful dance between:

- Addition
- Subtraction
- Multiplication (element-wise)
- Division
- Matrix multiplication

And that's it. Sure, there are a few more here and there, but these are the basic building blocks of neural networks.

Stacking these building blocks in the right way, you can create the most sophisticated of neural networks (just like lego!).

In [None]:
# Create a tensor of values and add a number to it
tensor = torch.tensor([1, 2, 3])
print(tensor + 10) # tensor([11, 12, 13]) -> the result is not stored unless you assign it to a variable

# Subtract and reassign
tensor = tensor - 10
print(tensor) # tensor([-9, -8, -7])

# Add and reassign
tensor = tensor + 10
print(tensor) #tensor([1, 2, 3]) -> remember we reassigned this


PyTorch also has a bunch of built-in functions like torch.mul() (short for multiplication) and torch.add() to perform basic operations.
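As a small sketch of those two functions, the snippet below compares them with the `+` and `*` operators. The variable names `values`, `added` and `multiplied` are made up for illustration.

```python
import torch

values = torch.tensor([1, 2, 3])

added = torch.add(values, 10)       # same result as values + 10
multiplied = torch.mul(values, 10)  # same result as values * 10

print(added)
print(multiplied)
print(values)  # the original values are untouched because nothing was reassigned
```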

In [None]:
# Can also use torch functions (torch.multiply is an alias for torch.mul)
tensor = torch.multiply(torch.tensor([1, 2, 3]), 10)

# The function returns a new tensor, which we assigned back to the name `tensor`
print(tensor) # tensor([10, 20, 30])

## [Matrix Multiplication](https://www.mathsisfun.com/algebra/matrix-multiplying.html)

One of the most common operations in machine learning and deep learning algorithms (like neural networks) is matrix multiplication.

PyTorch implements matrix multiplication functionality in the torch.matmul() method.

The main two rules for matrix multiplication to remember are:

The inner dimensions must match:
**Notice that the number of columns in the first matrix has to equal the number of rows in the second, because each row of the first is multiplied with each column of the second**
- (3, 2) @ (3, 2) won't work
- (2, 3) @ (3, 2) will work
- (3, 2) @ (2, 3) will work

The resulting matrix has the shape of the outer dimensions:

- (2, 3) @ (3, 2) -> (2, 2)
- (3, 2) @ (2, 3) -> (3, 3)
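The two rules can be checked with a short sketch. The tensors `a` and `b` below are random placeholders, so only their shapes matter here.

```python
import torch

a = torch.rand(2, 3)
b = torch.rand(3, 2)

print((a @ b).shape)  # inner dimensions (3 and 3) match -> outer dimensions (2, 2)
print((b @ a).shape)  # inner dimensions (2 and 2) match -> outer dimensions (3, 3)

try:
    b @ b  # (3, 2) @ (3, 2): inner dimensions 2 and 3 do not match
except RuntimeError as error:
    print(f"Shape mismatch: {error}")
```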

In [None]:
#%%time
'''
CPU times: user 4.36 ms, sys: 0 ns, total: 4.36 ms
Wall time: 7.08 ms
'''
# Just initializing
tensor = torch.tensor([1, 2, 3])
print(tensor.shape)

# Element-wise multiplication
print(tensor * tensor) # tensor([1, 4, 9])


# Matrix multiplication: Method 1
print(torch.matmul(tensor, tensor)) # tensor(14)

# Matrix multiplication: Method 2
print(tensor @ tensor) # tensor(14) -> the @ operator gives the same result as torch.matmul()


## One of the most common errors in deep learning (shape errors)

Because much of deep learning is multiplying and performing operations on matrices and matrices have a strict rule about what shapes and sizes can be combined, one of the most common errors you'll run into in deep learning is shape mismatches.

In [None]:
# Shapes need to line up: the inner dimensions must match
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]], dtype=torch.float32)

tensor_B = torch.tensor([[7, 10],
                         [8, 11], 
                         [9, 12]], dtype=torch.float32)

try:
  torch.matmul(tensor_A, tensor_B) # (this will error)
except RuntimeError as e:
  print(" Extremely common mistake it is, you have to remember to look at the bolded text in the section above") # Yoda Voice

 Extremely common mistake it is, you have to remember to look at the bolded text in the section above


## One way to fix shape mismatches is with a transpose (switching the dimensions of a given tensor).

You can perform transposes in PyTorch using either:

- torch.transpose(input, dim0, dim1) - where input is the desired tensor to transpose and dim0 and dim1 are the dimensions to be swapped.
- tensor.T - where tensor is the desired tensor to transpose.

Let's try the **latter**.

In [None]:
# View tensor_A and tensor_B
print(tensor_A)
print(tensor_B)
print()

# To fix the issue, let's view tensor_A and tensor_B.T
print(tensor_A)
print(tensor_B.T)

# For a visual walkthrough, visit http://matrixmultiplication.xyz/

# torch.mm multiplies two 2D matrices (no broadcasting); for 2D inputs it gives the same result as torch.matmul
print("This is a shortcut to it\n" , torch.mm(tensor_A, tensor_B.T))
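The other option, torch.transpose(), can be sketched the same way. This example redefines tensor_B with the same values so it runs on its own; `transposed` is just an illustrative name.

```python
import torch

tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]], dtype=torch.float32)

# Swap dimension 0 (rows) with dimension 1 (columns)
transposed = torch.transpose(tensor_B, 0, 1)

print(transposed.shape)
print(torch.equal(transposed, tensor_B.T))  # both approaches give the same tensor
```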

Neural networks are full of matrix multiplications and dot products.

The torch.nn.Linear() module (we'll see this in action later on), also known as a feed-forward layer or fully connected layer, implements a matrix multiplication between an input x and a weights matrix A.

$$y = x \cdot A^T + b$$

Where:

- x is the input to the layer (deep learning is a stack of layers like torch.nn.Linear() and others on top of each other).
- A is the weights matrix created by the layer. It starts out as random numbers that get adjusted as a neural network learns to better represent patterns in the data (notice the "T", that's because the weights matrix gets transposed).
- Note: **You might also often see W or another letter like X used to showcase the weights matrix.**
- b is the bias term used to slightly offset the weights and inputs.
- y is the output (a manipulation of the input in the hopes to discover patterns in it).

This is a linear function (you may have seen something like $y = mx + b$ in high school or elsewhere), and can be used to draw a straight line!

In [None]:
# Since the linear layer starts with a random weights matrix, let's make it reproducible (more on this later)
torch.manual_seed(42)

# This uses matrix multiplication
linear = torch.nn.Linear(in_features=2, # in_features = matches the inner (last) dimension of the input
                         out_features=6) # out_features = size of the last dimension of the output
x = tensor_A
print(tensor_A)
output = linear(x)
print(f"Input shape: {x.shape}\n")
print(f"Output:\n{output}\n\nOutput shape: {output.shape}")
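To connect the layer back to the formula, here is a minimal sketch that rebuilds the same computation by hand from the layer's own `weight` and `bias` attributes. The input is redefined with the same values as tensor_A so the snippet runs on its own; `manual_output` is an illustrative name.

```python
import torch

torch.manual_seed(42)
linear = torch.nn.Linear(in_features=2, out_features=6)

x = torch.tensor([[1, 2],
                  [3, 4],
                  [5, 6]], dtype=torch.float32)

# linear.weight has shape (out_features, in_features), hence the transpose
manual_output = x @ linear.weight.T + linear.bias

print(manual_output.shape)
print(torch.allclose(manual_output, linear(x)))  # the layer does the same calculation
```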


## Finding the min, max, mean, sum, etc (aggregation)

Now we've seen a few ways to manipulate tensors, let's run through a few ways to aggregate them (go from more values to fewer values).

First we'll create a tensor and then find the max, min, mean and sum of it.

In [None]:
# Create a tensor
x = torch.arange(0, 100, 10) # start, end, step
print(x)

# Built-in methods called on the tensor itself
print(f"Minimum: {x.min()}")
print(f"Maximum: {x.max()}")
# print(f"Mean: {x.mean()}") # this will error because x has an integer datatype
print(f"Mean: {x.type(torch.float32).mean()}") # mean() needs a float datatype, so convert first
print(f"Sum: {x.sum()}")

# Built-in torch functions that take the tensor as an argument
print(torch.max(x), torch.min(x), torch.mean(x.type(torch.float32)), torch.sum(x))
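The commented-out line above can be explored safely with a try/except. This is a small sketch; the exact wording of the error message may differ between PyTorch versions.

```python
import torch

x = torch.arange(0, 100, 10)  # integers in, so the datatype is torch.int64
print(x.dtype)

try:
    x.mean()
except RuntimeError as error:
    print(f"mean() failed: {error}")

# Converting first makes it work; x.float() is shorthand for x.type(torch.float32)
print(x.float().mean())
```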

## Positional min/max

You can also find the index of a tensor where the max or minimum occurs with torch.argmax() and torch.argmin() respectively.

This is helpful in case you just want the position where the highest (or lowest) value is and not the actual value itself (we'll see this in a later section when using the softmax activation function).

In [None]:
# Create a tensor
tensor = torch.arange(10, 100, 10)
print(f"Tensor: {tensor}")

# Returns index of max and min values
print(f"Index where max value occurs: {tensor.argmax()}") # argmax is commonly applied to softmax outputs to pick the predicted class
print(f"Index where min value occurs: {tensor.argmin()}\n\n")

"""

The softmax function is often used as the last activation function of a neural
network to normalize the output of a network to a probability distribution 
over predicted output classes, based on Luce's choice axiom.


That is, softmax is used as the activation function for multi-class 
classification problems where class membership is required on more than two class labels.
"""
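As a rough preview of how softmax and argmax fit together, here is a minimal sketch. The `logits` values are made-up scores for three hypothetical classes, not the output of a real model.

```python
import torch

# Hypothetical raw scores for three classes (made up for illustration)
logits = torch.tensor([1.0, 3.0, 2.0])

probabilities = torch.softmax(logits, dim=0)
print(probabilities)        # non-negative values
print(probabilities.sum())  # that add up to (approximately) 1

# argmax gives the position of the most likely class, not its probability
predicted_class = probabilities.argmax()
print(predicted_class)
```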

Note: Different datatypes can be confusing to begin with. But think of it like this: the lower the number (e.g. 32, 16, 8), the less precisely a computer stores the value. And with less storage needed, this generally results in faster computation and a smaller overall model. Mobile-based neural networks often operate with 8-bit integers, which are smaller and faster to run but less accurate than their float32 counterparts. For more on this, I'd read up about [precision in computing](https://www.wikiwand.com/en/Precision_(computer_science)).
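To see the precision trade-off in action, the sketch below stores the same number in three floating point types and measures how far each stored value lands from the original. The number 1000.1 is an arbitrary choice for illustration.

```python
import torch

value = 1000.1

as_float16 = torch.tensor(value, dtype=torch.float16).item()
as_float32 = torch.tensor(value, dtype=torch.float32).item()
as_float64 = torch.tensor(value, dtype=torch.float64).item()

# Fewer bits -> larger rounding error when storing the same number
print(abs(as_float16 - value))
print(abs(as_float32 - value))
print(abs(as_float64 - value))
```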


Exercise: So far we've covered a fair few tensor methods but there's a bunch more in the [torch.Tensor documentation](https://pytorch.org/docs/stable/tensors.html). I'd recommend spending 10 minutes scrolling through and looking into any that catch your eye. Click on them and then write them out in code yourself to see what happens.

Deep learning models (neural networks) are all about manipulating tensors in some way, and because of the rules of matrix multiplication, if you've got shape mismatches, you'll run into errors. Methods such as reshape, view, stack, squeeze and permute help you make sure the right elements of your tensors are mixing with the right elements of other tensors.

Let's try them out.

In [None]:
# Create a tensor
import torch
x = torch.arange(1., 8.)
print(x)
print(x.shape)

# Add an extra dimension
x_reshaped = x.reshape(7, 1)
print(x_reshaped)
print(x_reshaped.shape)

# view() returns a new tensor with a different shape that shares
# the same memory as x; x itself keeps its original shape
z = x.view(1, 7)
print(z)
print(x)

# Stack tensors on top of each other
x_stacked = torch.stack([x, x, x, x], dim=0) # try changing dim to dim=1 and see what happens
print("\n", x_stacked) # four copies of x stacked along a new dimension -> shape (4, 7)


print(f"Previous tensor: {x_reshaped}")
print(f"Previous shape: {x_reshaped.shape}")

# Remove extra dimension from x_reshaped
x_squeezed = x_reshaped.squeeze()
print(f"\nNew tensor: {x_squeezed}")
print(f"New shape: {x_squeezed.shape}")


# Create tensor with specific shape
x_original = torch.rand(size=(224, 224, 3))

# Permute the original tensor to rearrange the axis order
x_permuted = x_original.permute(2, 0, 1) # shifts axis 0->1, 1->2, 2->0

print(f"Previous shape: {x_original.shape}")
print(f"New shape: {x_permuted.shape}")

'''Note: Because permuting returns a view
(shares the same data as the original),
the values in the permuted tensor will be
the same as the original tensor and if you
change the values in the view, it will change
the values of the original.'''
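The note about views can be verified with a short sketch. The value 728218.0 is an arbitrary number chosen so it cannot be mistaken for one of the random values between 0 and 1.

```python
import torch

x_original = torch.rand(size=(224, 224, 3))
x_permuted = x_original.permute(2, 0, 1)

# Write through the view...
x_permuted[0, 0, 0] = 728218.0

# ...and the original sees the same value, because both share one block of memory
print(x_original[0, 0, 0])
print(x_original.data_ptr() == x_permuted.data_ptr())
```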


## Indexing (selecting data from tensors)
Sometimes you'll want to select specific data from tensors (for example, only the first column or second row).

To do so, you can use indexing.

If you've ever done indexing on Python lists or NumPy arrays, indexing in PyTorch with tensors is very similar.

In [None]:
# Create a tensor 
import torch
x = torch.arange(1, 10).reshape(1, 3, 3)
print(x)
print(x.shape) # torch.Size([1, 3, 3])

# Let's index bracket by bracket
print(f"First square bracket:\n{x[0]}") 
print(f"Second square bracket: {x[0][0]}")  #tensor([1, 2, 3])
print(f"Third square bracket: {x[0][0][0]}")# 1
print()

# Get all values of 0th dimension and the 0 index of 1st dimension
print(x[:, 0])

# Get all values of 0th & 1st dimensions but only index 1 of 2nd dimension
print(x[:, :, 1])

# Get all values of the 0 dimension but only the 1 index value of the 1st and 2nd dimension
print(x[:, 1, 1])

# Get index 0 of 0th and 1st dimension and all values of 2nd dimension 
print(x[0, 0, :]) # same as x[0][0]

## PyTorch tensors & NumPy

Since NumPy is a popular Python numerical computing library, PyTorch has functionality to interact with it nicely.

The two main methods you'll want to use for NumPy to PyTorch (and back again) are:

- torch.from_numpy(ndarray) - NumPy array -> PyTorch tensor.
- torch.Tensor.numpy() - PyTorch tensor -> NumPy array.

Let's try them out.

Note: By default, NumPy arrays of floats are created with the datatype float64 and if you convert one to a PyTorch tensor, it'll keep the same datatype.

However, many PyTorch calculations default to using float32.

So if you want to convert your NumPy array (float64) -> PyTorch tensor (float64) -> PyTorch tensor (float32), you can use:

**tensor = torch.from_numpy(array).type(torch.float32).**

In [None]:
# NumPy array to tensor
import torch
import numpy as np
array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array).type(torch.float32)
print(array)
print(tensor)
print()

# Tensor to NumPy array
tensor = torch.ones(7) # create a tensor of ones with dtype=float32
numpy_tensor = tensor.numpy() # will be dtype=float32 unless changed
print(tensor)
print(numpy_tensor)
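The datatype behaviour described in the note can be checked directly. This is a small sketch that uses the same kind of array and tensor as the cell above.

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
print(array.dtype)                    # NumPy's default float type
print(torch.from_numpy(array).dtype)  # the tensor keeps it

tensor = torch.ones(7)
print(tensor.dtype)          # PyTorch's default float type
print(tensor.numpy().dtype)  # the array keeps it
```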


## Reproducibility (trying to take the random out of random)
As you learn more about neural networks and machine learning, you'll start to discover how much randomness plays a part.

Well, pseudorandomness that is. Because after all, as they're designed, computers are fundamentally deterministic (each step is predictable), so the randomness they create is simulated randomness (though there is debate on this too, but since I'm not a computer scientist, I'll let you find out more yourself).

How does this relate to neural networks and deep learning then?

We've discussed neural networks start with random numbers to describe patterns in data (these numbers are poor descriptions) and try to improve those random numbers using tensor operations (and a few other things we haven't discussed yet) to better describe patterns in data.

In short:
```
start with random numbers -> tensor operations -> try to make better (again and again and again)
```

Although randomness is nice and powerful, sometimes you'd like there to be a little less randomness.

Why?

So you can perform repeatable experiments.

For example, you create an algorithm capable of achieving X performance.

And then your friend tries it out to verify you're not crazy.

How could they do such a thing?

That's where reproducibility comes in.

In other words, can you get the same (or very similar) results on your computer running the same code as I get on mine?

Let's see a brief example of reproducibility in PyTorch.

We'll start by creating two random tensors. Since they're random, you'd expect them to be different, right?

In [None]:
import torch

# Create two random tensors
random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)

print(f"Tensor A:\n{random_tensor_A}\n")
print(f"Tensor B:\n{random_tensor_B}\n")
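print("Does Tensor A equal Tensor B? (anywhere)")
print(random_tensor_A == random_tensor_B) # element-wise comparison, printed so it shows even when it isn't the last line of the cell

# Use torch.clone(random_tensor_A) to make an independent copy of a tensor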

import torch

# Set the random seed
RANDOM_SEED=42 # try changing this to different values and see what happens to the numbers below
torch.manual_seed(seed=RANDOM_SEED) 
random_tensor_C = torch.rand(3, 4)

# The seed has to be reset before each rand() call that should reproduce the same numbers
# Without this, tensor_D would be different to tensor_C
torch.random.manual_seed(seed=RANDOM_SEED) # try commenting this line out and seeing what happens
random_tensor_D = torch.rand(3, 4)

print(f"Tensor C:\n{random_tensor_C}\n")
print(f"Tensor D:\n{random_tensor_D}\n")
print("Does Tensor C equal Tensor D? (anywhere)")
print(random_tensor_C == random_tensor_D)
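The element-wise comparison above prints a whole tensor of True/False values. For a single yes/no answer, one option is torch.equal(), sketched below with a hypothetical helper `seeded_rand` (not part of PyTorch) that resets the seed before every draw.

```python
import torch

def seeded_rand(seed):
    """Hypothetical helper: reset the seed, then draw a (3, 4) random tensor."""
    torch.manual_seed(seed)
    return torch.rand(3, 4)

first = seeded_rand(42)
second = seeded_rand(42)
different_seed = seeded_rand(7)

print(torch.equal(first, second))          # same seed, same numbers
print(torch.equal(first, different_seed))  # different seed, different numbers
```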