<a href="https://www.kaggle.com/code/sachinkoirla/00-pytorch-fundamentals?scriptVersionId=199136831" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# What is PyTorch?
PyTorch is a Python-based scientific computing package serving two broad purposes:
- A replacement for NumPy to use the power of GPUs
- A deep learning research platform that provides maximum flexibility and speed

### Who uses PyTorch?
- Facebook
- Twitter
- Salesforce
- Tesla
- Uber


In [2]:
import torch
import numpy as np
import pandas as pd

In [3]:
# use the GPU ('cuda') if one is available, otherwise fall back to the CPU
device=('cuda' if torch.cuda.is_available() else 'cpu')
device

'cuda'

# 1. Introduction to Tensor
A tensor is a generalization of vectors and matrices and is easily understood as a multidimensional array. <br> Tensors are the basic building blocks of PyTorch.
For more information, visit the official [PyTorch documentation](https://pytorch.org/docs/stable/tensors.html).

## 1.1 Creating Tensor
We can create a tensor in PyTorch using the `torch.tensor()` function. <br>
First, we will create a scalar, which is a 0-dimensional tensor.

In [4]:
#scalar
scalar=torch.tensor(42)
print(scalar)
print(scalar.shape)
print(scalar.ndim)
print(scalar.dtype)

tensor(42)
torch.Size([])
0
torch.int64


<p >This shows that even though the scalar holds a single number, it is still of type torch.Tensor. <br>
We checked the number of dimensions of the tensor with the ndim attribute.<br>
What if we want to retrieve the number from the tensor, i.e. get 42 instead of tensor(42)? <br>
To do so, we can use the item() method. </p>

In [5]:
scalar.item()

42
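Note that `.item()` only works on tensors that hold exactly one element. The sketch below (variable names are illustrative, not from the original notebook) contrasts it with `.tolist()`, which handles any number of elements:

```python
import torch

single = torch.tensor(42)
several = torch.tensor([1, 2, 3])

value = single.item()      # plain Python int
values = several.tolist()  # plain Python list of ints

# .item() on a multi-element tensor raises an error
# (the exact exception type has differed between PyTorch versions)
try:
    several.item()
    item_failed = False
except (RuntimeError, ValueError):
    item_failed = True

print(value, values, item_failed)
```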

In [6]:
#vector
# a vector is a 1-D tensor: an ordered set of scalar values
vector=torch.tensor([1,2])
print(vector)
print(vector.shape)
print(vector.ndim)

tensor([1, 2])
torch.Size([2])
1


The vector above contains two numbers but has only a single dimension. How? <br>
Trick: You can tell how many dimensions a PyTorch tensor has by counting the square brackets on the outside ```([)```, and you only need to count one side.


In [7]:
#MATRIX
# a matrix is a 2-D tensor: a set of vectors

MATRIX=torch.tensor([[1,2],
                     [3,4]
                    ])
print(MATRIX)
print(MATRIX.shape)
print(MATRIX.ndim)

tensor([[1, 2],
        [3, 4]])
torch.Size([2, 2])
2


In [8]:
#TENSOR
# here "tensor" means a 3-D tensor: a set of matrices
TENSOR = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]],
                       ])


tensor=torch.tensor([[[1,2],
                      [5,6],
                      [9,1]]])
print(TENSOR)
print(tensor)
print(TENSOR.shape, tensor.shape)
print(TENSOR.ndim, tensor.ndim)

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])
tensor([[[1, 2],
         [5, 6],
         [9, 1]]])
torch.Size([1, 3, 3]) torch.Size([1, 3, 2])
3 3


In [9]:
# accessing the elements of the tensor
print(f'{TENSOR[0][0]}') # first row of the first (and only) matrix
print(f'{TENSOR[0][0][0]}') # first element of that row
print(f'{TENSOR[0][0][0:2]}') # first two elements of that row
print(" ")
print(f'{TENSOR[0][1]}') # second row of the first matrix
print(f'{TENSOR[0][1][0]}') # first element of the second row


tensor([1, 2, 3])
1
tensor([1, 2])
 
tensor([3, 6, 9])
3


![example of different tensor dimensions](https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/00-pytorch-different-tensor-dimensions.png)

## 1.2 Random Tensor
While building machine learning models, we rarely create tensors by hand.<br>
Instead, we initialize tensors with random values and then update them during training.<br>
* What we do is:<br>
    ```Start with random values --> learn from data --> update values --> repeat.```

We can create random tensors using the `torch.rand()` method. <br>

In [10]:
#creating random tensor
random_tensor=torch.rand(10,3)

In [11]:
random_image_size_tensor=torch.rand(size=(3,224,224)) # or torch.rand(3,224,224)

In [12]:
random_image_size_tensor.shape

torch.Size([3, 224, 224])

## 1.3 Creating tensors of zeros and ones

Sometimes we want to create a tensor filled with zeros or ones. This comes up, for example, when building masks or initializing some parameters of a neural network (such as biases). <br>

In [13]:

zeroes=torch.zeros(size=(2,2))
zeroes

tensor([[0., 0.],
        [0., 0.]])

In [14]:
ones=torch.ones(size=(2,2))
ones

tensor([[1., 1.],
        [1., 1.]])

## 1.4 Creating a range and tensors like another tensor
Sometimes you might want a range of numbers, such as 1 to 10 or 0 to 100.

You can use ```torch.arange(start, end, step)``` to do so.

Where:

* ```start``` = start of range (e.g. 0)
* ```end``` = end of range, not included in the result (e.g. 10)
* ```step``` = spacing between consecutive values (e.g. 1)
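Because ```end``` is exclusive, a range meant to include 10 needs ```end=11```. A minimal sketch (the variable names are illustrative):

```python
import torch

evens = torch.arange(0, 10, 2)     # 10 itself is not included
one_to_ten = torch.arange(1, 11)   # pass end=11 to include 10; step defaults to 1

print(evens)
print(one_to_ten)
```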

In [15]:
# Tensor range
# note: despite its name, one_to_ten holds tensor([0, 2, 4, 6, 8])
one_to_ten=torch.arange(0,10,2) # start, end (exclusive), step

Sometimes you might want one tensor of a certain type with the same shape as another tensor.

For example, a tensor of all zeros with the same shape as a previous tensor.

To do so you can use ```torch.zeros_like(input)``` or ```torch.ones_like(input)``` which return a tensor filled with zeros or ones in the same shape as the input respectively.

In [16]:
#tensor like
print(torch.ones_like(one_to_ten))
print(torch.zeros_like(one_to_ten))
print(torch.ones_like(torch.tensor([1,1,1,1])))
print(torch.ones_like(torch.arange(1,10)))

tensor([1, 1, 1, 1, 1])
tensor([0, 0, 0, 0, 0])
tensor([1, 1, 1, 1])
tensor([1, 1, 1, 1, 1, 1, 1, 1, 1])


## 1.5 Tensor datatypes
There are different datatypes in PyTorch. Some of the most commonly used ones are:
- torch.float: 32-bit floating-point
- torch.double: 64-bit double-precision floating-point
- torch.int: 32-bit integer (signed)
- torch.long: 64-bit integer (signed)

For more information, visit the official [PyTorch Tensor datatypes documentation](https://pytorch.org/docs/stable/tensors.html).

Datatypes are important in PyTorch because they determine what kind of data a tensor can hold and how precisely. Datatype mismatches are also one of the most common sources of errors when working with PyTorch tensors, so it's important to keep track of the datatypes of your tensors.
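As a hedged illustration of such a datatype error, the sketch below multiplies an integer matrix by a float matrix. It assumes the default dtypes (`torch.int64` for Python ints, `torch.float32` for `torch.rand`); the variable names are illustrative:

```python
import torch

ints = torch.tensor([[1, 2], [3, 4]])   # inferred as torch.int64
floats = torch.rand(2, 2)               # torch.float32 by default

try:
    torch.matmul(ints, floats)          # operands have different dtypes
    mismatch_error = None
except RuntimeError as err:
    mismatch_error = err

# One possible fix: cast one operand so both share a dtype
product = torch.matmul(ints.to(floats.dtype), floats)

print(mismatch_error)
print(product.dtype)
```

Printing `tensor.dtype` for both operands before the failing call is usually the quickest way to spot this kind of problem.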



In [17]:
float32_tensor=torch.tensor([3,1,2],
                            dtype=torch.float32,
                            device=None,
                            requires_grad=False) # This will be covered in later sections
print(float32_tensor.dtype)

torch.float32


In [18]:
# changing the tensor's dtype
float16_tensor=float32_tensor.type(torch.float16)
float16_tensor.dtype

torch.float16

In [19]:
(float16_tensor*float32_tensor).dtype

torch.float32
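The `float32` result above comes from type promotion: for element-wise operations, PyTorch picks a dtype that can represent both operands. A small sketch using `torch.promote_types` and `torch.result_type` to inspect this (the tensors here are illustrative):

```python
import torch

half = torch.ones(3, dtype=torch.float16)
single = torch.ones(3, dtype=torch.float32)
ints = torch.ones(3, dtype=torch.int32)

# dtype that a pair of dtypes promotes to
promoted = torch.promote_types(torch.float16, torch.float32)

# dtype an element-wise operation on these two tensors would produce
mixed = torch.result_type(ints, single)

print(promoted, mixed)
print((half * single).dtype, (ints + single).dtype)
```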

# 2. Getting information from tensors
Once you've created a tensor (or a collection of tensors), you might want to get information from them. <br>
Some of the common information you might want to extract includes:
- Shape
- Rank
- Axis or Dimension
- Total number of elements
- Datatype of elements
- Index of the largest or smallest element

To do so, you can use the following attributes and methods:
- ```.shape```: returns the size of each dimension of the tensor
- ```.ndim```: returns the number of tensor dimensions
- ```.size()```: returns the same information as ```.shape``` (use ```.numel()``` for the total number of elements)
- ```.dtype```: returns the datatype of the elements in the tensor
- ```.argmax()```: returns the index of the maximum value in the tensor
- ```.argmin()```: returns the index of the minimum value in the tensor

Of all this information, the most important pieces are the shape, dtype and device of the tensor. Check these three before performing any operation on a tensor, as they are the most common sources of errors.
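A short sketch of how these calls differ on a small hand-written tensor (the values are chosen only for illustration):

```python
import torch

t = torch.tensor([[1.0, 9.0, 3.0],
                  [7.0, 2.0, 8.0]])

print(t.shape, t.size())   # both describe the same 2 x 3 shape
print(t.size(0))           # length of dimension 0
print(t.numel())           # total number of elements
print(t.argmax())          # index into the flattened tensor
print(t.argmax(dim=1))     # index of the maximum within each row
```

Without a `dim` argument, `argmax()` and `argmin()` treat the tensor as if it were flattened, which is easy to overlook on multi-dimensional data.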

In [61]:
#create some random tensor 
torch.manual_seed(42)
random_tensor =  torch.rand(size=(3,3))
print(random_tensor)
print(f"\n1. Shape of the tensor is {random_tensor.shape}")
print(f"2. Dimension of the tensor is {random_tensor.ndim}")
print(f"3. Size of the tensor is {random_tensor.size()}")
print(f"4. Datatype of the tensor is {random_tensor.dtype}")
print(f"5. Device of the tensor is {random_tensor.device}")
print(f"6. maximum value of the tensor is {random_tensor.max()}")
print(f"7. minimum value of the tensor is {random_tensor.min()}")


tensor([[0.8823, 0.9150, 0.3829],
        [0.9593, 0.3904, 0.6009],
        [0.2566, 0.7936, 0.9408]])

1. Shape of the tensor is torch.Size([3, 3])
2. Dimension of the tensor is 2
3. Size of the tensor is torch.Size([3, 3])
4. Datatype of the tensor is torch.float32
5. Device of the tensor is cpu
6. maximum value of the tensor is 0.9593056440353394
7. minimum value of the tensor is 0.2565724849700928


# 3. Manipulating Tensors
Once you've created a tensor, you might want to manipulate it in some way. <br>
Some of the common tensor operations include:
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix Multiplication

In [69]:
tensor=torch.tensor([9,1,2],dtype=torch.int32)
vectorr=torch.ones(size=(1,2,2))

## 3.1 Basic Operations

In [None]:
#addition
print(f"original tensor is {tensor}\n")
print(" Tensor after addition : \n  ", tensor+22) #adds 22 to each element

original tensor is tensor([9, 1, 2], dtype=torch.int32)

 Tensor after addition : 
   tensor([31, 23, 24], dtype=torch.int32)


In [None]:
#subtraction 
print(f"original tensor is {tensor}\n")
print(" Tensor after subtraction : \n  ", tensor-9) #subtracts 9 from each element

original tensor is tensor([9, 1, 2], dtype=torch.int32)

 Tensor after subtraction : 
   tensor([ 0, -8, -7], dtype=torch.int32)


In [81]:
# multiplication (element-wise)
print(f"original tensor is {tensor}\n")
print(" Tensor after element-wise multiplication : \n  ", tensor*2) # multiplies each element by 2

original tensor is tensor([9, 1, 2], dtype=torch.int32)

 Tensor after element-wise multiplication : 
   tensor([18,  2,  4], dtype=torch.int32)


In [82]:
# torch built-in functions for these operations
# we can also use torch.add(tensor,2) instead of tensor+2 and so on
print(torch.add(tensor,2))
print(torch.sub(tensor,2))
print(torch.mul(tensor,2))
print(torch.div(tensor,2))

tensor([11,  3,  4], dtype=torch.int32)
tensor([ 7, -1,  0], dtype=torch.int32)
tensor([18,  2,  4], dtype=torch.int32)
tensor([4.5000, 0.5000, 1.0000])
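Note that `torch.div` on an integer tensor performs true division and returns floats, as the last output line shows. If integer results are wanted, the `rounding_mode` argument can be used; a minimal sketch:

```python
import torch

tensor = torch.tensor([9, 1, 2], dtype=torch.int32)

true_div = torch.div(tensor, 2)                          # float result
floor_div = torch.div(tensor, 2, rounding_mode="floor")  # keeps the integer dtype

print(true_div, floor_div)
```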


## 3.2 Matrix Multiplication
If you need a refresher on matrix multiplication, see [Matrix Multiplication](https://www.mathsisfun.com/algebra/matrix-multiplying.html). <br>

We can use the `torch.matmul()` function or the `@` operator to perform matrix multiplication. <br>
The fundamental rule of matrix multiplication is:
* The number of columns in the first matrix must be equal to the number of rows in the second matrix. [i.e (m x n) * (n x p)]
* The result will have the same number of rows as the first matrix and the same number of columns as the second matrix. [i.e (m x n) * (n x p) = (m x p)]
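The shape rule can be checked directly. In this sketch, `a` plays the role of the (m x n) matrix and `b` the (n x p) matrix, with m=2, n=3, p=4 chosen arbitrarily:

```python
import torch

a = torch.rand(2, 3)   # m x n
b = torch.rand(3, 4)   # n x p

c = a @ b              # equivalent to torch.matmul(a, b)
print(c.shape)         # m x p

# Note: a * b is element-wise multiplication, not matrix multiplication,
# and would fail here because shapes (2, 3) and (3, 4) do not broadcast.
```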

In [92]:
# Matrix multiplication
mat = torch.tensor([[1,0,0],
                    [0,1,0],
                    [0,0,1]])
print("original matrix is \n",mat)
print("\nMatrix after multiplication \n", torch.matmul(mat,mat)) # or mat @ mat


original matrix is 
 tensor([[1, 0, 0],
        [0, 1, 0],
        [0, 0, 1]])

Matrix after multiplication 
 tensor([[1, 0, 0],
        [0, 1, 0],
        [0, 0, 1]])


### Common error in matrix multiplication 
The most common error in matrix multiplication is the shape mismatch error. 

In [6]:
#demonstrating the shape mismatch error
torch.manual_seed(42)
mat1 =  torch.rand(size=(3,3))
mat2 =  torch.rand(size=(2,3))

print("Matrix 1 shape is \n",mat1.shape)
print("\nMatrix 2 shape is \n",mat2.shape)
print("\nMatrix after multiplication \n", torch.matmul(mat1,mat2)) # This will throw error

Matrix 1 shape is 
 torch.Size([3, 3])

Matrix 2 shape is 
 torch.Size([2, 3])


RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x3 and 2x3)

To avoid this error, you should always check the shape of the tensors before performing matrix multiplication and transpose the tensors if necessary.

In [8]:
print("\nMatrix after multiplication \n", torch.matmul(mat1,mat2.T)) # This will work as we are taking transpose of the second matrix making it 3x2 


Matrix after multiplication 
 tensor([[1.1999, 1.5702],
        [0.8494, 1.5010],
        [1.3343, 1.3708]])


## 3.3 Finding the min, max, mean, sum

In [13]:
torch.manual_seed(42)
x=torch.rand(10)
x


tensor([0.8823, 0.9150, 0.3829, 0.9593, 0.3904, 0.6009, 0.2566, 0.7936, 0.9408,
        0.1332])

In [None]:
# find the min, max, mean and sum
print(f"Mean : {x.mean()}") # mean() needs a floating-point dtype; x is already float32
print(f"Sum  : {x.sum()}")
print(f"Max  : {x.max()}")
print(f"Min  : {x.min()}")

# Alternatively, torch.min(x), torch.max(x) and so on can be used

Mean : 0.6254957318305969
Sum  : 6.25495719909668
Max  : 0.9593056440353394
Min  : 0.13318592309951782


### Finding the Positional min and max
Sometimes you might want to find the position of the minimum or maximum value of a tensor. We can do this using the `torch.argmin()` and `torch.argmax()` methods.

In [21]:
torch.manual_seed(42)
xx=torch.rand([4])
xx

tensor([0.8823, 0.9150, 0.3829, 0.9593])

In [22]:
print(xx.argmin()) # returns the index of the minimum value
print(xx.argmax()) # returns the index of the maximum value

tensor(2)
tensor(3)


### Change tensor datatype 
Sometimes you might want to change the datatype of a tensor, for example to match the dtype another operation expects or to reduce memory use.
We can do this using the `.to()` method of a tensor, which can also move the tensor to another device such as the GPU.
The default floating-point datatype in PyTorch is `torch.float32`.
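The sketch below shows `.to()` used for a dtype change, a device move, and both at once. The `device` string falls back to the CPU when no GPU is present, and the values are illustrative:

```python
import torch

t = torch.tensor([1.7, -1.7, 0.2])

as_int = t.to(torch.int32)          # dtype change; the fractional part is dropped
as_half = t.to(torch.float16)       # lower-precision float

device = "cuda" if torch.cuda.is_available() else "cpu"
moved = t.to(device)                                # device move, dtype unchanged
both = t.to(device=device, dtype=torch.float64)     # device and dtype together

print(as_int, as_half.dtype, moved.device, both.dtype)
print(t.dtype)                      # .to() returns a new tensor; t itself is unchanged
```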

In [25]:
t = torch.tensor([0, 1, 0, 2, 0, 3], dtype=torch.float32) 
t.dtype

torch.float32

In [31]:
tt =  t.to(torch.int32)
print(tt.dtype)

#alternative
ttt =  t.type(torch.int64)
ttt.dtype

torch.int32


torch.int64

## 3.4 Reshaping, Stacking, Squeezing, Unsqueezing, Permute, View
| Method | Description |
| --- | --- |
| ```.reshape()``` | Returns a tensor with the same data in a new shape; it is a view when possible and a copy otherwise |
| ```.view()``` | Returns a view of the original tensor in a different shape that shares the same data as the original tensor |
| ```.squeeze()``` | Removes all dimensions of size 1 from the input. |
| ```.unsqueeze()``` | Returns the input with a dimension of size 1 added at position dim. |
| ```.permute()``` | Returns a view of the original tensor with its dimensions permuted. |
| ```torch.stack()``` | Concatenates a sequence of tensors along a new dimension. |
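The practical difference between `.view()` and `.reshape()` shows up on non-contiguous tensors. In this sketch (names are illustrative), a transposed tensor cannot be viewed as flat, while `reshape` silently copies:

```python
import torch

base = torch.arange(6).reshape(2, 3)
transposed = base.t()                 # non-contiguous view of base

try:
    transposed.view(6)                # view needs a compatible memory layout
    view_failed = False
except RuntimeError:
    view_failed = True

flat = transposed.reshape(6)          # reshape falls back to making a copy here
flat[0] = 100                         # base is untouched because flat is a copy

shared = base.view(3, 2)              # contiguous input: the view shares memory
shared[0, 0] = -1                     # this also changes base[0, 0]

print(view_failed, flat, base)
```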


In [37]:
y=torch.arange(0.0,9.0)
y,y.shape

(tensor([0., 1., 2., 3., 4., 5., 6., 7., 8.]), torch.Size([9]))

In [37]:
#reshape
y_reshaped=y.reshape(1,9)  # one row , 9 columns
y_reshaped_again=y.reshape(9,1) # 9 rows , one column


In [39]:
#view
y_view=y.view(3,3)  # like reshape, but the memory is always shared, i.e. if we change y_view, y will also change

y_view[-1]= -9    # sets the last row of y_view, so the last three elements of y also become -9

In [None]:
#stack
# dim=0: the matrices are stacked on top of each other -> shape (2, 3, 3)
# dim=1: a new dimension is created between rows and columns (row-wise stack) -> shape (3, 2, 3)
# dim=2: corresponding elements are paired along a new last dimension -> shape (3, 3, 2)

print(y_view)
y_stacked_0 = torch.stack([y_view,-y_view],dim=0)
print("\n")
print(y_stacked_0,y_stacked_0.shape,y_stacked_0.ndim)

y_stacked_1 = torch.stack([y_view,-y_view],dim=1)
print("\n")
print(y_stacked_1,y_stacked_1.shape,y_stacked_1.ndim)

y_stacked_2 = torch.stack([y_view,-y_view],dim=2)
print("\n")
print(y_stacked_2,y_stacked_2.shape,y_stacked_2.ndim)



tensor([[ 0.,  1.,  2.],
        [ 3.,  4.,  5.],
        [-9., -9., -9.]])


tensor([[[ 0., -0.],
         [ 1., -1.],
         [ 2., -2.]],

        [[ 3., -3.],
         [ 4., -4.],
         [ 5., -5.]],

        [[-9.,  9.],
         [-9.,  9.],
         [-9.,  9.]]]) torch.Size([3, 3, 2]) 3
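To compare the effect of each `dim` without reading full tensor printouts, it can help to look only at the resulting shapes. A minimal sketch with two illustrative 3 x 3 tensors, plus `torch.cat` for contrast:

```python
import torch

a = torch.zeros(3, 3)
b = torch.ones(3, 3)

# torch.stack always adds one new dimension, at the position given by dim
stacked_shapes = {dim: tuple(torch.stack([a, b], dim=dim).shape) for dim in (0, 1, 2)}
print(stacked_shapes)

# torch.cat joins along an existing dimension instead of adding one
cat_shape = tuple(torch.cat([a, b], dim=0).shape)
print(cat_shape)
```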


In [40]:
#squeeze --> removes dimensions of size 1

print(f'original tensor: {y_reshaped}')
print(f'original tensor shape:{y_reshaped.shape}')

print(f'\nsqueezed tensor: {y_reshaped.squeeze()}')
print(f'squeezed tensor shape:{y_reshaped.squeeze().shape}')


original tensor: tensor([[ 0.,  1.,  2.,  3.,  4.,  5., -9., -9., -9.]])
original tensor shape:torch.Size([1, 9])

squeezed tensor: tensor([ 0.,  1.,  2.,  3.,  4.,  5., -9., -9., -9.])
squeezed tensor shape:torch.Size([9])


In [60]:
#unsqueeze
print(f'original tensor:{y}')
print(f'original tensor shape:{y.shape}')
print(f'\nunsqueezed tensor [dim=0]:{y.unsqueeze(dim=0)}')
print(f'unsqueezed tensor shape:{y.unsqueeze(dim=0).shape}')
print(f'\nunsqueezed tensor [dim=1]:{y.unsqueeze(dim=1)}')
print(f'unsqueezed tensor shape:{y.unsqueeze(dim=1).shape}')

# for dim = 0, a new dimension of size 1 is added at the start
# for dim = 1, a new dimension of size 1 is added at index 1 (the end, for this 1-D tensor)

original tensor:tensor([ 0.,  1.,  2.,  3.,  4.,  5., -9., -9., -9.])
original tensor shape:torch.Size([9])

unsqueezed tensor [dim=0]:tensor([[ 0.,  1.,  2.,  3.,  4.,  5., -9., -9., -9.]])
unsqueezed tensor shape:torch.Size([1, 9])

unsqueezed tensor [dim=1]:tensor([[ 0.],
        [ 1.],
        [ 2.],
        [ 3.],
        [ 4.],
        [ 5.],
        [-9.],
        [-9.],
        [-9.]])
unsqueezed tensor shape:torch.Size([9, 1])


In [42]:
#permute --> rearranges the dimensions and, like view, shares memory with the original
img=torch.rand(size=(224,224,3))
print(img.shape)
new_img=img.permute(2,0,1) # shape becomes (3, 224, 224); a change in new_img will change img

torch.Size([224, 224, 3])
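The memory sharing between a tensor and its permuted version can be verified on a small example. This sketch uses a tiny illustrative "image" instead of the 224 x 224 one:

```python
import torch

img = torch.zeros(4, 5, 3)          # height, width, channels
chw = img.permute(2, 0, 1)          # channels, height, width

chw[0, 0, 0] = 1.0                  # write through the permuted view
print(chw.shape)
print(img[0, 0, 0])                 # the original tensor sees the change

independent = chw.clone()           # clone() gives a copy that no longer shares memory
independent[0, 0, 0] = 5.0
print(img[0, 0, 0])                 # unchanged by the write to the clone
```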


## 3.5 Indexing

In [61]:
num=torch.arange(1,10).reshape(1,3,3)
num, num.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [44]:
num[0][1][2] # plain chained indexing: matrix 0, row 1, column 2

tensor(6)

In [None]:
num[:,:,1] # all matrices, all rows, column at index 1 (the second column)

tensor([[2, 5, 8]])

In [None]:
num[:,0,0] #all dim, 0th row, 0th column

tensor([1])

In [None]:
num[0,0,:] #0th dim, 0th row, all columns

tensor([1, 2, 3])

In [None]:
num[:,:,2] # all matrices, all rows, column at index 2 (the third column)

tensor([[3, 6, 9]])

### PyTorch tensors and NumPy

In [49]:
n=np.arange(1.0,8.0)
t=torch.from_numpy(n) # NumPy float arrays default to float64, and from_numpy keeps that dtype

print(t)

nn=torch.ones(7)
nump=nn.numpy()

print(nn,nn.dtype)

tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64)
tensor([1., 1., 1., 1., 1., 1., 1.]) torch.float32
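One detail that is easy to miss: `torch.from_numpy()` shares memory with the NumPy array, whereas `torch.tensor()` copies the data. A small sketch with illustrative names:

```python
import numpy as np
import torch

arr = np.arange(1.0, 4.0)
shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # independent copy of the data

arr[0] = 100.0                   # in-place change to the NumPy array

print(shared)                    # reflects the change
print(copied)                    # keeps the original values

# cast explicitly if float32 is needed downstream
as_float32 = torch.from_numpy(arr).to(torch.float32)
print(as_float32.dtype)
```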


### Reproducibility

In [50]:
random_seed=42
torch.manual_seed(random_seed) # set the seed right before creating a random tensor
rand_a=torch.rand(3,4)

torch.manual_seed(random_seed) # reset the seed so the next random tensor repeats the same values
rand_b=torch.rand(3,4)

In [51]:
print(rand_a==rand_b)

tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
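As an alternative to resetting the global seed, a `torch.Generator` can carry its own seed and be passed to `torch.rand`. The sketch below is one possible pattern, not the approach used in the cells above:

```python
import torch

gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)

rand_a = torch.rand(3, 4, generator=gen_a)
rand_b = torch.rand(3, 4, generator=gen_b)
rand_c = torch.rand(3, 4, generator=gen_a)   # second draw from gen_a: different values

print(torch.equal(rand_a, rand_b))   # same seed, same first draw
print(torch.equal(rand_a, rand_c))   # the generator state has advanced
```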


# 4. Creating tensors on the GPU
PyTorch tensors can live on either a CPU or a GPU. As deep learning models require a lot of computation, it's common to run them on GPUs. GPUs contain hundreds to thousands of cores optimized for performing many floating-point computations in parallel, making them well suited to training deep learning models.<br>
To work with GPUs in PyTorch, you can use the `torch.cuda` package. <br>

#### How to check if a GPU is available?
You can check if a GPU is available using the `torch.cuda.is_available()` function. <br>
If a GPU is available, you can create a tensor directly on it by passing `device='cuda'` to the creation function, or move an existing tensor there with its `.cuda()` or `.to()` method. <br>
If you want to move a tensor from the GPU to the CPU, you can use the `.cpu()` method.
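These pieces are often combined into device-agnostic code that runs with or without a GPU. A minimal sketch, with illustrative variable names:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

cpu_tensor = torch.rand(2, 2)                     # created on the CPU by default
on_device = cpu_tensor.to(device)                 # returns a tensor on `device`
created_there = torch.rand(2, 2, device=device)   # allocated directly on `device`

# .cpu() makes the NumPy conversion safe whichever device was picked
back = created_there.cpu().numpy()

print(cpu_tensor.device, on_device.device, back.shape)
```

Note that `.to(device)` does not move `cpu_tensor` in place; the result has to be assigned to a variable to be used.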


In [62]:
torch.cuda.is_available()

True

In [66]:
!nvidia-smi 


Sun Nov 10 12:20:17 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 537.13                 Driver Version: 537.13       CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  NVIDIA GeForce GTX 1650 Ti   WDDM  | 00000000:01:00.0  On |                  N/A |
| N/A   48C    P0              18W /  50W |    890MiB /  4096MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [52]:
tens=torch.rand([5,5],device='cpu')
tens,tens.device

(tensor([[0.8694, 0.5677, 0.7411, 0.4294, 0.8854],
         [0.5739, 0.2666, 0.6274, 0.2696, 0.4414],
         [0.2969, 0.8317, 0.1053, 0.2695, 0.3588],
         [0.1994, 0.5472, 0.0062, 0.9516, 0.0753],
         [0.8860, 0.5832, 0.3376, 0.8090, 0.5779]]),
 device(type='cpu'))

In [53]:
#changing from cpu to gpu
tens.to(device).device #make sure cuda is enabled first

device(type='cuda', index=0)

### Moving tensor back to CPU


In [54]:
# a tensor on the GPU can't be converted to NumPy directly,
# so first move it to the CPU and then convert it
tens.numpy() # works only because tens is on the CPU; it would fail for a GPU tensor
tens.cpu().numpy() # this is the correct approach for any device

array([[0.86940444, 0.5677153 , 0.74109405, 0.4294045 , 0.8854429 ],
       [0.57390445, 0.26658005, 0.62744915, 0.26963168, 0.44136357],
       [0.29692084, 0.8316855 , 0.10531491, 0.26949483, 0.35881263],
       [0.19936377, 0.54719156, 0.00616044, 0.95155454, 0.07526588],
       [0.8860137 , 0.5832096 , 0.33764774, 0.808975  , 0.5779254 ]],
      dtype=float32)