[Website: learnpytorch.io](https://www.learnpytorch.io/)

In [None]:
import torch
import numpy as np

print(torch.__version__)
print(np.__version__)

2.9.0+cpu
2.0.2


__PyTorch Main Components__

1. **Tensors** - N-dimensional arrays that serve as PyTorch’s fundamental data structure. They support automatic differentiation, hardware acceleration, and provide a comprehensive API for mathematical operations. Like NumPy arrays, but GPU-accelerated.

2. **Autograd** - PyTorch’s automatic differentiation engine that tracks operations performed on tensors and builds a computational graph dynamically to be able to compute gradients.

3. **Neural Network API** - A modular framework for building neural networks with pre-defined layers, activation functions, and loss functions. The `nn.Module` base class provides a clean interface for creating custom network architectures with parameter management.

4. **DataLoaders** - Tools for efficient data handling that provide features like batching, shuffling, and parallel data loading. They abstract away the complexities of data preprocessing and iteration, allowing for optimized training loops.
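
As a rough illustration of how the four components fit together, here is a minimal sketch of one pass over a toy dataset. The data (`y = 2x + 1`), the batch size and the learning rate are assumptions made up for this example, not values from the course material.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# 1. Tensors: toy inputs and targets (hypothetical data following y = 2x + 1)
x = torch.linspace(0, 1, 16).unsqueeze(1)   # shape (16, 1)
y = 2 * x + 1

# 4. DataLoader: batches and shuffles the dataset
loader = DataLoader(TensorDataset(x, y), batch_size=4, shuffle=True)

# 3. Neural network API: one linear layer, a loss function and an optimizer
model = nn.Linear(in_features=1, out_features=1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for xb, yb in loader:
    loss = loss_fn(model(xb), yb)
    optimizer.zero_grad()
    loss.backward()      # 2. Autograd: gradients of the loss w.r.t. the parameters
    optimizer.step()

n_batches = len(loader)
has_grad = model.weight.grad is not None
print(n_batches, has_grad)
```

Sixteen samples with a batch size of 4 give four batches, and after `loss.backward()` every parameter of the model carries a `.grad` tensor.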

[__Tensors__](https://docs.pytorch.org/tutorials/beginner/introyt/tensors_deeper_tutorial.html)


Tensors are the fundamental building block of machine learning.

Their job is to represent data in a numerical way.

For example, you could represent an image as a tensor with shape [3, 224, 224] which would mean [colour_channels, height, width], as in the image has 3 colour channels (red, green, blue), a height of 224 pixels and a width of 224 pixels.

![image1](https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/00-tensor-shape-example-of-image.png)

The tensor would have three dimensions, one each for colour_channels, height and width.
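
A quick way to check this is to build a stand-in tensor of that shape. The sketch below uses random values rather than real pixel data, which is an assumption made purely for illustration.

```python
import torch

# A stand-in for an RGB image: random values, not real pixel data
image = torch.rand(3, 224, 224)  # [colour_channels, height, width]

print(image.ndim)      # number of dimensions
print(image.shape)     # size along each dimension
print(image.numel())   # total number of values: 3 * 224 * 224
```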

In [None]:
# A scalar is a single number; in tensor-speak it is a zero-dimensional tensor.

scalar = torch.tensor(5)

print(scalar)
print(scalar.dtype) # the value lives in a torch.Tensor; its dtype is inferred as torch.int64
print(scalar.shape) # a scalar has an empty shape: torch.Size([])
print(scalar.ndim) # a scalar has 0 dimensions
print(scalar.item()) # get the Python number out of the tensor (works only for tensors with exactly one element)

tensor(5)
torch.int64
torch.Size([])
0
5


In [None]:
# A vector is a one-dimensional tensor but can contain many numbers.

vector = torch.tensor([7,7])

print(vector)
print(vector.dtype)
print(vector.shape)
print(vector.ndim)

tensor([7, 7])
torch.int64
torch.Size([2])
1


In [None]:
# Matrix: a two-dimensional tensor (rows and columns)


Matrix = torch.tensor([[1,2],[1,2]])


print(Matrix)
print(Matrix.dtype)
print(Matrix.shape)
print(Matrix.ndim)

tensor([[1, 2],
        [1, 2]])
torch.int64
torch.Size([2, 2])
2


In [None]:
tensor = torch.tensor([[1,2,3],
                      [4,5,6],
                      [7,8,9]])


print(tensor)
print(tensor.dtype)
print(tensor.shape)
print(tensor.ndim)
print('\n')


# Wrapping the same values in one more pair of brackets adds a leading dimension of size 1
tensor = torch.tensor([[[1,2,3],
                      [4,5,6],
                      [7,8,9]]])


print(tensor)
print(tensor.dtype)
print(tensor.shape)
print(tensor.ndim)

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
torch.int64
torch.Size([3, 3])
2


tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])
torch.int64
torch.Size([1, 3, 3])
3


![](https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/00-pytorch-different-tensor-dimensions.png)

### **Tensors Initialization**

We've established tensors represent some form of data.

And machine learning models such as neural networks manipulate and seek patterns within tensors.

But when building machine learning models with PyTorch, it's rare you'll create tensors by hand (like what we've been doing).

Instead, a machine learning model often starts out with large random tensors of numbers and adjusts these random numbers as it works through data to better represent it.

In essence:

`Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers...`
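
The loop above can be sketched with a single learnable number. This is a minimal, hand-rolled gradient descent under assumed toy data (`y = 3 * x`) and an assumed learning rate of 0.01. It is not how a full model is usually trained, but it follows the same cycle.

```python
import torch

torch.manual_seed(0)

# Hypothetical data: the "pattern" to discover is y = 3 * x
x = torch.tensor([1.0, 2.0, 3.0, 4.0])
y = 3 * x

# Start with a random number
w = torch.rand(1, requires_grad=True)
initial_error = ((w * x - y) ** 2).mean().item()

for _ in range(100):
    loss = ((w * x - y) ** 2).mean()   # look at data
    loss.backward()                     # work out how w should change
    with torch.no_grad():
        w -= 0.01 * w.grad              # update the random number
    w.grad.zero_()                      # clear the gradient for the next step

final_error = ((w * x - y) ** 2).mean().item()
print(w.item(), initial_error, final_error)
```

After the loop, `w` sits very close to 3 and the error is far smaller than at the start.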

**Random**

In [None]:
random_tensor = torch.rand(3,4) # Uniform distribution [0, 1)

print(random_tensor.dtype)
print(random_tensor)
print('\n')



random_tensor = torch.randn(3,4) # Normal distribution (mean=0, std=1)

print(random_tensor.dtype)
print(random_tensor)

torch.float32
tensor([[0.9686, 0.8343, 0.6405, 0.5475],
        [0.1293, 0.8296, 0.8670, 0.4522],
        [0.9253, 0.0679, 0.1862, 0.7701]])


torch.float32
tensor([[-0.7272, -0.0519,  0.5817,  0.0115],
        [ 1.2805,  0.9161, -0.9852,  0.2031],
        [ 0.4616,  0.3559,  0.0850,  0.8072]])


**Zeros & Ones**

In [None]:
zeros = torch.zeros(2,2)
ones = torch.ones(2,2)

print(zeros)
print(ones)

tensor([[0., 0.],
        [0., 0.]])
tensor([[1., 1.],
        [1., 1.]])


In [None]:
# Range

range_tensor = torch.arange(start=0,end=10,step=2)
print(range_tensor)

# *_like functions: create a tensor with the same shape and dtype as another tensor

zeros_like = torch.zeros_like(input=range_tensor)
ones_like = torch.ones_like(input=range_tensor)

print(zeros_like)
print(ones_like)

tensor([0, 2, 4, 6, 8])
tensor([0, 0, 0, 0, 0])
tensor([1, 1, 1, 1, 1])


**Constant value**

In [None]:
constant  = torch.full(size=(2,2),fill_value=5)
print(constant)

tensor([[5, 5],
        [5, 5]])


**Tensor for neural network weights**

In [None]:
import torch.nn as nn

'''
xavier_uniform_ : It's a weight initialization method designed to keep the variance of activations stable as they flow forward and backward through a network

Best for : Tanh, Sigmoid, Linear (no activation)
Not ideal for: ReLU / LeakyReLU → use Kaiming (He) instead

'''

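tensor = torch.empty(size = (128,256))
tensor1 = nn.init.xavier_uniform_(tensor) # in-place: fills `tensor` and returns the same tensor
print(tensor1, '\n')

# Re-initialises the same tensor in place, this time scaled by the recommended gain for tanh (5/3)
tensor2 = nn.init.xavier_uniform_(tensor, gain=nn.init.calculate_gain("tanh"))
print(tensor2)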

tensor([[ 0.0644,  0.0502, -0.0176,  ..., -0.0169, -0.0115, -0.0548],
        [ 0.0575, -0.1050,  0.1206,  ..., -0.0751,  0.0376,  0.0694],
        [ 0.0492,  0.0062,  0.0015,  ..., -0.0639,  0.0874, -0.1211],
        ...,
        [ 0.1195, -0.0909, -0.0271,  ..., -0.0724, -0.1009,  0.0855],
        [-0.0762,  0.0769,  0.1233,  ...,  0.1108,  0.0545,  0.0496],
        [-0.0509,  0.1181,  0.0071,  ..., -0.0215, -0.0201, -0.0846]]) 

tensor([[ 0.1640, -0.1141, -0.1349,  ..., -0.1657,  0.1073,  0.1758],
        [ 0.1129, -0.0097,  0.1992,  ...,  0.0474,  0.0726,  0.1032],
        [-0.0558,  0.0454, -0.0380,  ..., -0.0821, -0.1730,  0.1064],
        ...,
        [ 0.1176,  0.0189,  0.0362,  ..., -0.1163,  0.1451, -0.1208],
        [-0.0481,  0.1588,  0.1573,  ..., -0.1779, -0.1925, -0.0299],
        [ 0.1935,  0.0190, -0.0414,  ..., -0.0389, -0.0669,  0.0923]])
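
To make the initialisation less of a black box, the sketch below checks the sampling range. Xavier uniform draws from U(-a, a) with a = gain * sqrt(6 / (fan_in + fan_out)); for a (128, 256) tensor with gain 1 that gives a = 0.125. The Kaiming call at the end is the ReLU alternative mentioned in the notes; the seed and variable names are assumptions for this example.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)

w = torch.empty(128, 256)          # fan_out = 128, fan_in = 256
nn.init.xavier_uniform_(w)

# Bound of the uniform distribution for gain = 1
bound = math.sqrt(6 / (256 + 128))
print(bound)                                   # 0.125
print(w.abs().max().item() <= bound)           # every value lies inside the bound

# Kaiming (He) initialisation, the usual choice in front of ReLU
w_relu = torch.empty(128, 256)
nn.init.kaiming_uniform_(w_relu, nonlinearity="relu")
print(w_relu.abs().max().item())
```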


### **Tensor Data Type**



PyTorch supports several tensor datatypes, which differ mainly in numerical precision. Precision is the amount of detail used to describe a number.

In [None]:
tensor_float_32 = torch.tensor([1,5,6],
                               dtype=torch.float32,
                               device=None)


tensor_float_16 = torch.tensor([1,5,6],
                               dtype=torch.float16,
                               device=None)

print(tensor_float_32.dtype, tensor_float_32)
print(tensor_float_16.dtype, tensor_float_16)

torch.float32 tensor([1., 5., 6.])
torch.float16 tensor([1., 5., 6.], dtype=torch.float16)
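
One way to see what "precision" means in practice is to ask PyTorch about each floating-point type with `torch.finfo`, and to store a value such as 0.1 that cannot be represented exactly. The choice of 0.1 is only an illustrative assumption.

```python
import torch

# Bits per element, smallest representable step near 1.0, and largest value
for dtype in (torch.float16, torch.float32, torch.float64):
    info = torch.finfo(dtype)
    print(dtype, info.bits, info.eps, info.max)

# 0.1 cannot be stored exactly; fewer bits means a larger rounding error
err16 = abs(torch.tensor(0.1, dtype=torch.float16).item() - 0.1)
err32 = abs(torch.tensor(0.1, dtype=torch.float32).item() - 0.1)
print(err16, err32)
```

The float16 error is several orders of magnitude larger than the float32 error, which is the trade-off for using half the memory.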


Aside from shape issues (tensor shapes don't match up), two of the other most common issues you'll come across in PyTorch are datatype and device issues.

- one of your tensors is torch.float32 and the other is torch.float16 (PyTorch often likes tensors to be the same format)
- one of your tensors is on the CPU and the other is on the GPU (PyTorch likes calculations between tensors to be on the same device).
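
A minimal sketch of how both issues are commonly resolved with `Tensor.to()`. The device selection line is an assumption about the environment; it falls back to the CPU when no GPU is available.

```python
import torch

a = torch.tensor([1, 2, 3], dtype=torch.float32)
b = torch.tensor([1, 2, 3], dtype=torch.float16)

# Datatype: cast one operand so both match before combining them
b32 = b.to(torch.float32)
print(a @ b32)                     # the dot product now works

# Device: pick whichever device is available and move both tensors to it
device = "cuda" if torch.cuda.is_available() else "cpu"
a_dev, b_dev = a.to(device), b32.to(device)
print((a_dev @ b_dev).device)
```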

### **Tensors Operations**

In [None]:
# Vector operations

tensor = torch.tensor([1,2,3])

print(tensor + 10, torch.add(tensor,10))
print(tensor * 10, torch.mul(tensor,10))
print(tensor - 10, torch.sub(tensor,10))

tensor([11, 12, 13]) tensor([11, 12, 13])
tensor([10, 20, 30]) tensor([10, 20, 30])
tensor([-9, -8, -7]) tensor([-9, -8, -7])


Matrix operations: PyTorch implements matrix multiplication in the [torch.matmul()](https://docs.pytorch.org/docs/stable/generated/torch.matmul.html) function.


In [None]:
'''
The inner dimensions must match:

    (3, 2) @ (3, 2) won't work
    (2, 3) @ (3, 2) will work
    (3, 2) @ (2, 3) will work


The resulting matrix has the shape of the outer dimensions:

    (2, 3) @ (3, 2) -> (2, 2)
    (3, 2) @ (2, 3) -> (3, 3)

"@" in Python is the symbol for matrix multiplication.
'''

tensor = torch.tensor([1,2,3])

# Element wise multiplication
print(tensor * tensor)
print(torch.mul(tensor,tensor))


# Dot product (inner product): for two 1-D tensors, matrix multiplication returns their dot product
print(tensor @ tensor)
print(torch.matmul(tensor,tensor))


# The same calculation written as a plain Python loop
print(sum([(i*i) for i in tensor]))

tensor([1, 4, 9])
tensor([1, 4, 9])
tensor(14)
tensor(14)
tensor(14)
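
The shape rules quoted in the cell above can be checked directly. The sketch below uses tensors of ones, an arbitrary choice, because only the shapes matter here.

```python
import torch

A = torch.ones(2, 3)
B = torch.ones(3, 2)

print((A @ B).shape)   # torch.Size([2, 2])
print((B @ A).shape)   # torch.Size([3, 3])

# (3, 2) @ (3, 2) fails because the inner dimensions (2 and 3) differ
try:
    B @ B
    failed = False
except RuntimeError:
    failed = True
print(failed)          # True

# Transposing one operand makes the inner dimensions line up
print((B @ B.T).shape)  # torch.Size([3, 3])
```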


Neural networks are full of matrix multiplications and dot products. The `torch.nn.Linear()` module, also known as a feed-forward layer or fully connected layer, computes `y = x @ A.T + b`: a matrix multiplication between an input x and the transpose of a weight matrix A, plus a bias vector b.

In [None]:
torch.manual_seed(42)

linear = nn.Linear(in_features=2,
                   out_features=6)

tensor = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]], dtype=torch.float32)

output = linear(tensor)


print(tensor.shape)
print(output.shape, '\n')

print(output)

torch.Size([3, 2])
torch.Size([3, 6]) 

tensor([[2.2368, 1.2292, 0.4714, 0.3864, 0.1309, 0.9838],
        [4.4919, 2.1970, 0.4469, 0.5285, 0.3401, 2.4777],
        [6.7469, 3.1648, 0.4224, 0.6705, 0.5493, 3.9716]],
       grad_fn=<AddmmBackward0>)
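
To connect the layer back to plain matrix multiplication, this sketch rebuilds the output by hand from the layer's own `weight` and `bias` parameters and compares it with what the layer returns. It reuses the same seed and input as the cell above.

```python
import torch
import torch.nn as nn

torch.manual_seed(42)
linear = nn.Linear(in_features=2, out_features=6)
x = torch.tensor([[1, 2], [3, 4], [5, 6]], dtype=torch.float32)

# weight has shape (out_features, in_features), so it is transposed before multiplying
manual = x @ linear.weight.T + linear.bias

print(linear.weight.shape, linear.bias.shape)
print(torch.allclose(linear(x), manual))   # True
```

The `(3, 2) @ (2, 6)` product explains why the output has shape `(3, 6)`.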


In [None]:
# Finding index of the max and min value in the tensor

tensor = torch.tensor([1,6,2,5,0])

print(torch.argmax(tensor), torch.max(tensor))
print(torch.argmin(tensor), torch.min(tensor))

tensor(1) tensor(6)
tensor(4) tensor(0)


In [None]:
# Testing with different tensor data type

tensor1 = torch.tensor([1,2,3],dtype=torch.float32)
tensor2 = torch.tensor([1,2,3],dtype=torch.float16)

print(tensor1)
print('\n')
print(tensor2)

print(tensor1  @ tensor2) # raises the RuntimeError below: both operands must have the same dtype (error kept on purpose)

tensor([1., 2., 3.])


tensor([1., 2., 3.], dtype=torch.float16)


RuntimeError: dot : expected both vectors to have same dtype, but found Float and Half

### **Dimension manipulation of your tensors**

__Reshaping, stacking, squeezing and unsqueezing__


1. `torch.reshape(input, shape)` - Reshapes `input` to `shape` (if compatible); also available as `torch.Tensor.reshape()`.
2. `Tensor.view(shape)` - Returns a view of the original tensor in a different shape; the view shares the same data as the original tensor.
3. `torch.stack(tensors, dim=0)` - Concatenates a sequence of tensors along a new dimension (`dim`); all tensors must be the same size.
4. `torch.squeeze(input)` - Removes all dimensions of size 1 from `input`.
5. `torch.unsqueeze(input, dim)` - Returns `input` with a dimension of size 1 added at `dim`.
6. `torch.permute(input, dims)` - Returns a view of the original `input` with its dimensions permuted (rearranged) to `dims`.
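
"Shares the same data" is worth seeing once. In this small sketch, writing through a view changes the original tensor, while `reshape()` on a non-contiguous (permuted) tensor has to fall back to a copy. The values are arbitrary.

```python
import torch

x = torch.arange(6)          # tensor([0, 1, 2, 3, 4, 5])
v = x.view(2, 3)             # same storage, different shape

v[0, 0] = 100                # writing through the view...
print(x[0])                  # ...changes the original: tensor(100)
print(x.data_ptr() == v.data_ptr())   # True

# A permuted tensor is a view too, but it is no longer contiguous in memory
p = v.permute(1, 0)
print(p.is_contiguous())     # False
r = p.reshape(6)             # reshape() copies when a view is not possible
print(r.data_ptr() == x.data_ptr())   # False
```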

In [None]:
tensor = torch.arange(1,12,2)

tensor, tensor.ndim, tensor.shape

(tensor([ 1,  3,  5,  7,  9, 11]), 1, torch.Size([6]))

In [None]:
reshape_tensor_1 = tensor.reshape(1, 6) # added one extra dimension  (row, column)
reshape_tensor_2 = tensor.reshape(3, 2)


print(reshape_tensor_1, reshape_tensor_1.ndim, reshape_tensor_1.shape)
print(reshape_tensor_2, reshape_tensor_2.ndim, reshape_tensor_2.shape)

tensor([[ 1,  3,  5,  7,  9, 11]]) 2 torch.Size([1, 6])
tensor([[ 1,  3],
        [ 5,  7],
        [ 9, 11]]) 2 torch.Size([3, 2])


In [None]:
# Stacking

print(tensor, '\n')
print(torch.stack([tensor,tensor],dim=0), '\n')
print(torch.stack([tensor,tensor],dim=1))

tensor([ 1,  3,  5,  7,  9, 11]) 

tensor([[ 1,  3,  5,  7,  9, 11],
        [ 1,  3,  5,  7,  9, 11]]) 

tensor([[ 1,  1],
        [ 3,  3],
        [ 5,  5],
        [ 7,  7],
        [ 9,  9],
        [11, 11]])


In [None]:
# Squeeze and unsqueeze

print(reshape_tensor_1, reshape_tensor_1.ndim, reshape_tensor_1.shape)
squeeze_tensor = torch.squeeze(reshape_tensor_1)
print('\n')
print(squeeze_tensor, squeeze_tensor.ndim, squeeze_tensor.shape)
print('\n')
un_squeeze_tensor = torch.unsqueeze(squeeze_tensor, dim=1)
print(un_squeeze_tensor, un_squeeze_tensor.ndim, un_squeeze_tensor.shape)

tensor([[ 1,  3,  5,  7,  9, 11]]) 2 torch.Size([1, 6])


tensor([ 1,  3,  5,  7,  9, 11]) 1 torch.Size([6])


tensor([[ 1],
        [ 3],
        [ 5],
        [ 7],
        [ 9],
        [11]]) 2 torch.Size([6, 1])


In [None]:
# Rearrange the order of axes values

tensor = torch.rand(size=(5,5,3))

print(tensor.size(), tensor.ndim)

permuted_tensor = tensor.permute(2,0,1) # moves axis 0->1, 1->2, 2->0

print(permuted_tensor.size(), permuted_tensor.ndim)

torch.Size([5, 5, 3]) 3
torch.Size([3, 5, 5]) 3


**Passing a NumPy array into PyTorch**

In [None]:
array = np.arange(0,10)
print(array, array.dtype)

array_tensor = torch.from_numpy(array)
print(array_tensor, array_tensor.dtype)

array_tensor = torch.from_numpy(array).type(torch.float32)
print(array_tensor, array_tensor.dtype)

[0 1 2 3 4 5 6 7 8 9] int64
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) torch.int64
tensor([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.]) torch.float32


In [None]:
tensor = torch.ones(7)
print(tensor, tensor.numpy())

tensor([1., 1., 1., 1., 1., 1., 1.]) [1. 1. 1. 1. 1. 1. 1.]
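
One detail that is easy to miss: `torch.from_numpy()` and `Tensor.numpy()` (for CPU tensors) share memory with their source rather than copying it. The sketch below shows the effect and contrasts it with `torch.tensor()`, which copies. The values are arbitrary.

```python
import numpy as np
import torch

array = np.arange(3)
shared = torch.from_numpy(array)   # shares memory with `array`
copied = torch.tensor(array)       # makes an independent copy

array[0] = 99
print(shared)   # first element reflects the change
print(copied)   # unchanged

# The same holds in the other direction for CPU tensors
t = torch.ones(3)
n = t.numpy()
t.add_(1)       # in-place update of the tensor
print(n)        # the NumPy array is updated as well
```

If the two objects need to stay independent, copying explicitly (for example with `torch.tensor(array)` or `tensor.clone()`) avoids surprises.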


------

In [None]:
# Create a random tensor with shape (7, 7)

tensor = torch.rand(7,7)
print(tensor.size())


# Perform a matrix multiplication on the above tensor with another random tensor with shape (1, 7)

tensor1 = torch.rand(1,7)
print(tensor1.shape)

print(tensor.matmul(tensor1.T))
print('\n')

# Set the random seed to 0 and do above again.

torch.manual_seed(0)

tensor2 = torch.rand(7,7)
tensor3 = torch.rand(1,7)

print(tensor2.matmul(tensor3.T))




torch.Size([7, 7])
torch.Size([1, 7])
tensor([[1.2239],
        [2.0847],
        [1.9002],
        [0.9408],
        [1.5213],
        [1.3606],
        [0.8780]])


tensor([[1.8542],
        [1.9611],
        [2.2884],
        [3.0481],
        [1.7067],
        [2.5290],
        [1.7989]])
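
The effect of the seed is easier to see when the same draw is repeated. In this short sketch, resetting the seed restarts the random sequence, so the first draw after each reset is identical. The shapes are arbitrary.

```python
import torch

torch.manual_seed(0)
a = torch.rand(2, 3)
b = torch.rand(2, 3)      # a second draw advances the generator

torch.manual_seed(0)      # resetting the seed restarts the sequence
c = torch.rand(2, 3)

print(torch.equal(a, c))  # True
print(torch.equal(a, b))  # False
```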


In [None]:
tensor = torch.rand(size=(1,1,1,10))
print(tensor)
print(tensor.ndim)

tensor([[[[0.5995, 0.0652, 0.5460, 0.1872, 0.0340, 0.9442, 0.8802, 0.0012,
           0.5936, 0.4158]]]])
4


In [None]:
torch.manual_seed(7)
random_tensor = torch.rand(10) # random values in [0, 1), not ones
random_tensor

tensor([0.5349, 0.1988, 0.6592, 0.6569, 0.2328, 0.4251, 0.2071, 0.6297, 0.3653,
        0.8513])

-----


1. [PyTorch internals](https://blog.ezyang.com/2019/05/pytorch-internals/)
2. [Working Class Deep Learner](https://marksaroufim.substack.com/p/working-class-deep-learner)
3. [Strides in PyTorch](https://chitrarth.substack.com/p/understanding-strides-in-pytorch)


__Notes from the above blogs :)__

**Tensor** & **Stride**

The tensor is the central data structure in PyTorch. You probably have a pretty good idea about what a tensor intuitively represents: it's an n-dimensional data structure containing some sort of scalar type, e.g., floats, ints, et cetera. We can think of a tensor as consisting of some data, and then some metadata describing the size of the tensor, the type of the elements it contains (dtype), and what device the tensor lives on (CPU memory? CUDA memory?).
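
The metadata mentioned above can be read straight off a tensor. The sketch below also prints the stride, which the heading names but the note does not describe: it is the number of elements to skip in memory to move one step along each dimension. That explanation and the example values are additions for illustration, not taken from the notes.

```python
import torch

t = torch.arange(6, dtype=torch.float32).reshape(2, 3)

print(t.shape)      # torch.Size([2, 3])
print(t.dtype)      # torch.float32
print(t.device)     # cpu
print(t.stride())   # (3, 1): next row is 3 elements away, next column is 1

# Transposing swaps the strides without moving any data
tt = t.T
print(tt.stride())                      # (1, 3)
print(tt.data_ptr() == t.data_ptr())    # True
print(tt.is_contiguous())               # False
```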


