# **PyTorch Fundamentals**
   Notebook created by Ganesh_9124,
   CSE with major in AI, IIITDM Kancheepuram.
   
   PyTorch Tutorial:

   https://youtu.be/V_xro1bcAuA?si=tSCoh7FyckmJAcfz

In [None]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt


In [None]:
print(torch.__version__)

2.3.0+cu121


**Creating Tensors**

PyTorch tensors are created using torch.tensor().

https://pytorch.org/docs/stable/tensors.html

https://pytorch.org/docs/stable/torch.html#tensor-creation-ops

In [None]:
# scalar tensor
scalar = torch.tensor(7)

In [None]:
# dimension of scalar tensor
scalar.ndim

0

In [None]:
scalar.item()

7

In [None]:
# vector
vector = torch.tensor([9, 8])

In [None]:
# number of dimensions = number of nested square-bracket levels
vector.ndim

1

In [None]:
vector[0]

tensor(9)

In [None]:
vector.shape

torch.Size([2])

In [None]:
vector.size()

torch.Size([2])

In [None]:
# matrix
matrix = torch.tensor([[9,5],
                       [8,5]])

In [None]:
# dimension of matrix
matrix.ndim

2

In [None]:
matrix.shape

torch.Size([2, 2])

In [None]:
# tensor

tensor = torch.tensor([[[2,2,3],
                        [6,5,4],
                        [7,6,5]]], dtype=torch.float)
tensor

tensor([[[2., 2., 3.],
         [6., 5., 4.],
         [7., 6., 5.]]])

In [None]:
# dimension of tensor
tensor.ndim

3

In [None]:
# shape of tensor
tensor.shape

torch.Size([1, 3, 3])

In [None]:
# check 0th dimension of tensor
tensor[0]

tensor([[2., 2., 3.],
        [6., 5., 4.],
        [7., 6., 5.]])

In [None]:
tensor[0][2]

tensor([7., 6., 5.])

tensor.T

Returns a view of this tensor with its dimensions reversed.

Warning:

The use of Tensor.T() on tensors of dimension other than 2 to reverse their shape is deprecated and it will throw an error in a future release. Consider mT to transpose batches of matrices or x.permute(*torch.arange(x.ndim - 1, -1, -1)) to reverse the dimensions of a tensor.

In [None]:
tensor.T

tensor([[[2.],
         [6.],
         [7.]],

        [[2.],
         [5.],
         [6.]],

        [[3.],
         [4.],
         [5.]]])

tensor.H

Returns a view of a matrix (2-D tensor) conjugated and transposed.

tensor.H is only supported on matrices (2-D tensors). For a real-valued matrix the result is the same as tensor.T.

In [None]:
tensor_2d = torch.tensor([[9,0],
                         [9,-8]])

In [None]:
tensor_2d.H

tensor([[ 9,  9],
        [ 0, -8]])

tensor.mT

Returns a view of this tensor with its last two dimensions transposed.

In [None]:
tensor.mT

tensor([[[2., 6., 7.],
         [2., 5., 6.],
         [3., 4., 5.]]])

In [None]:
tensor_2d.mH

tensor([[ 9,  9],
        [ 0, -8]])
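The difference between these attributes is easier to see on a complex matrix and on a batch of matrices. The following is a minimal sketch; the names `m` and `batch` and their values are made up for illustration.

```python
import torch

# A complex matrix makes the conjugation done by .H visible.
m = torch.tensor([[1 + 2j, 3 - 1j],
                  [0 + 1j, 4 + 0j]])

t = m.T  # swaps the two dimensions only
h = m.H  # swaps the two dimensions and conjugates every entry

# A batch of two 3x4 matrices (illustrative values).
batch = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4)

# .mT transposes only the last two dimensions; the batch dimension stays first.
batch_mT = batch.mT

# permute reverses every dimension explicitly.
batch_reversed = batch.permute(2, 1, 0)

print(t[0, 1].item(), h[0, 1].item())
print(batch_mT.shape, batch_reversed.shape)
```

For batches of matrices, `.mT` is usually what is wanted, since it leaves the batch dimension in place.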

In [None]:
tensor.histogram

<function Tensor.histogram>

In [None]:
hist = torch.histogram(tensor, bins=1)

torch.zeros method:

Returns a tensor filled with the scalar value 0, with the shape defined by the variable argument 'size'.

Parameters:

size (int...) – a sequence of integers defining the shape of the output tensor. Can be a variable number of arguments or a collection such as a list or tuple.

Keyword Arguments
out (Tensor, optional) – the output tensor.

dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_dtype()).

layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.

device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_device()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.

requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
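A minimal sketch of how these arguments combine, assuming the global default dtype has not been changed from float32. The variable names are illustrative only.

```python
import torch

# Shape passed as separate ints; dtype falls back to the global default.
z_default = torch.zeros(2, 3)

# Shape passed as a tuple, with an explicit integer dtype.
z_int = torch.zeros((2, 3), dtype=torch.int64)

# requires_grad=True asks autograd to record operations on the tensor.
# This only works for floating point (or complex) dtypes.
z_grad = torch.zeros(2, 3, requires_grad=True)

print(z_default.dtype, z_int.dtype, z_grad.requires_grad)
```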




In [None]:
# torch.zeros method
z = torch.zeros((3,3,3,3), dtype=torch.uint32, device=torch.device('cpu'))

In [None]:
z

tensor([[[[0, 0, 0],
          [0, 0, 0],
          [0, 0, 0]],

         [[0, 0, 0],
          [0, 0, 0],
          [0, 0, 0]],

         [[0, 0, 0],
          [0, 0, 0],
          [0, 0, 0]]],


        [[[0, 0, 0],
          [0, 0, 0],
          [0, 0, 0]],

         [[0, 0, 0],
          [0, 0, 0],
          [0, 0, 0]],

         [[0, 0, 0],
          [0, 0, 0],
          [0, 0, 0]]],


        [[[0, 0, 0],
          [0, 0, 0],
          [0, 0, 0]],

         [[0, 0, 0],
          [0, 0, 0],
          [0, 0, 0]],

         [[0, 0, 0],
          [0, 0, 0],
          [0, 0, 0]]]], dtype=torch.uint32)

**torch.as_tensor method**

Converts data to a tensor, sharing data and preserving autograd history if possible.

If data is already a tensor with the requested dtype and device, then data itself is returned.

If data is a tensor with a different dtype or device, then it is copied as if using data.to(dtype=dtype, device=device).

If data is a NumPy array (an ndarray) with the same dtype and device, then a tensor is constructed using torch.from_numpy().

Parameters
data (array_like) – Initial data for the tensor. Can be a list, tuple, NumPy ndarray, scalar, and other types.

dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, infers data type from data.

device (torch.device, optional) – the device of the constructed tensor. If None and data is a tensor then the device of data is used. If None and data is not a tensor then the result tensor is constructed on the current device.

In [None]:
# torch.as_tensor
x = np.array(([7, 8],
             [8, 6]))
x

array([[7, 8],
       [8, 6]])

In [None]:
tensor_x = torch.as_tensor(x, dtype=torch.uint32, device=torch.device('cpu'))

In [None]:
tensor_x

tensor([[7, 8],
        [8, 6]], dtype=torch.uint32)
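Whether torch.as_tensor shares memory with a NumPy array depends on whether a conversion is needed. A small sketch of both cases (the names `arr`, `shared` and `copied` are illustrative):

```python
import numpy as np
import torch

arr = np.array([[7, 8], [8, 6]], dtype=np.int64)

# Same dtype, CPU device: built via torch.from_numpy, so memory is shared.
shared = torch.as_tensor(arr)

# A different dtype is requested, so the data has to be copied.
copied = torch.as_tensor(arr, dtype=torch.float32)

# Change the NumPy array in place and check which tensor follows it.
arr[0, 0] = 100

print(shared[0, 0].item(), copied[0, 0].item())
```

In the notebook cell above, requesting `torch.uint32` from an int64 array falls into the second case, so `tensor_x` is a copy.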

In [None]:
torch.is_storage(tensor)

False

## **Random Tensors**

Why random tensors?

Random tensors are important because many neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.

start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers.
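The loop above can be sketched with a single random weight. This is a toy illustration only: the data (y = 3 * x), the learning rate 0.5 and the 200 steps are assumptions chosen for this example, not part of the tutorial.

```python
import torch

torch.manual_seed(0)  # only so this sketch is repeatable

# Toy data that follows y = 3 * x (assumed for illustration).
x = torch.linspace(0, 1, 20)
y = 3 * x

# Start with a random number.
w = torch.rand(1, requires_grad=True)

for _ in range(200):
    loss = ((w * x - y) ** 2).mean()  # look at data
    loss.backward()                   # work out how w should change
    with torch.no_grad():
        w -= 0.5 * w.grad             # update the random number
    w.grad.zero_()                    # reset the gradient for the next step

print(w.item())  # ends up close to 3
```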


PyTorch documentation for random tensors:
https://pytorch.org/docs/stable/generated/torch.rand.html

In [None]:
# Create a random tensor of size (5, 4, 3, 2)
rand_tensor = torch.rand(5, 4, 3, 2)

In [None]:
rand_tensor.ndim

4

In [None]:
# create random tensors with shapes similar to an image
image1 = torch.rand(size=(3, 224, 224))  # (colour channels, height, width)
image2 = torch.rand(size=(224, 224, 3))  # (height, width, colour channels)

In [None]:
image1.shape, image1.ndim, image2.shape, image2.ndim

(torch.Size([3, 224, 224]), 3, torch.Size([224, 224, 3]), 3)

In [None]:
image1

tensor([[[0.9893, 0.0938, 0.6154,  ..., 0.0817, 0.8142, 0.1298],
         [0.9113, 0.5382, 0.1674,  ..., 0.3086, 0.5476, 0.5346],
         [0.9322, 0.7949, 0.3265,  ..., 0.4962, 0.1806, 0.8164],
         ...,
         [0.7860, 0.3505, 0.7933,  ..., 0.8355, 0.9254, 0.4924],
         [0.4578, 0.9537, 0.5314,  ..., 0.1159, 0.1701, 0.4894],
         [0.2395, 0.7356, 0.4717,  ..., 0.6006, 0.0707, 0.5732]],

        [[0.1416, 0.6566, 0.5223,  ..., 0.8790, 0.7079, 0.2502],
         [0.4459, 0.2855, 0.1827,  ..., 0.4766, 0.5729, 0.6563],
         [0.8821, 0.1592, 0.3436,  ..., 0.3975, 0.7371, 0.0565],
         ...,
         [0.8098, 0.6389, 0.0011,  ..., 0.7819, 0.9218, 0.2838],
         [0.3993, 0.8323, 0.7835,  ..., 0.9164, 0.9480, 0.5273],
         [0.7883, 0.0326, 0.3730,  ..., 0.3335, 0.8728, 0.5170]],

        [[0.9547, 0.7270, 0.8984,  ..., 0.5051, 0.3692, 0.9946],
         [0.2348, 0.6451, 0.8160,  ..., 0.6415, 0.3500, 0.3214],
         [0.9560, 0.2446, 0.8983,  ..., 0.8296, 0.0183, 0.

In [None]:
image2

tensor([[[0.5407, 0.6097, 0.9378],
         [0.0956, 0.1911, 0.8020],
         [0.9089, 0.5124, 0.2325],
         ...,
         [0.6590, 0.7860, 0.6492],
         [0.3229, 0.6812, 0.4135],
         [0.4715, 0.2782, 0.4924]],

        [[0.3739, 0.1682, 0.1948],
         [0.0337, 0.1122, 0.4151],
         [0.3901, 0.8614, 0.9977],
         ...,
         [0.6316, 0.1789, 0.7463],
         [0.7430, 0.9277, 0.7131],
         [0.9532, 0.8188, 0.7295]],

        [[0.5899, 0.2755, 0.9320],
         [0.9790, 0.9686, 0.7002],
         [0.9337, 0.5349, 0.7609],
         ...,
         [0.0605, 0.0945, 0.1822],
         [0.5073, 0.5187, 0.9923],
         [0.2759, 0.9752, 0.7792]],

        ...,

        [[0.2711, 0.1812, 0.6551],
         [0.4948, 0.1566, 0.7999],
         [0.3898, 0.0239, 0.6250],
         ...,
         [0.2765, 0.3602, 0.3950],
         [0.2340, 0.2297, 0.4759],
         [0.4695, 0.5005, 0.1560]],

        [[0.9595, 0.3909, 0.8861],
         [0.6207, 0.4153, 0.9388],
         [0.

## **Zeros and Ones**

In [None]:
# create tensors filled with zeros and ones
zeros = torch.zeros(3,4)

In [None]:
zeros.ndim, zeros.dtype, zeros.shape

(2, torch.float32, torch.Size([3, 4]))

In [None]:
ones = torch.ones(3, 4)
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [None]:
ones.ndim, ones.dtype, ones.shape

(2, torch.float32, torch.Size([3, 4]))

### Creating a range of values and tensors-like

torch.range is deprecated and will be removed in a future release because its behavior is inconsistent with Python's range builtin. Instead, use torch.arange, which produces values in [start, end).

In [None]:
# use torch.range
torch.range(0, 3)

tensor([0., 1., 2., 3.])

### torch.arange
Returns a 1-D tensor of size ceil((end - start) / step) with values from the interval [start, end), taken with common difference step beginning from start.

A non-integer step is subject to floating point rounding errors when comparing against end; to avoid inconsistency, subtract a small epsilon from end in such cases.

Only tensors of floating point and complex dtype can require gradients.

for more info:
https://pytorch.org/docs/stable/generated/torch.arange.html
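A quick check of the size rule, using the same start, end and step as the cell further down. The variable names are illustrative.

```python
import math
import torch

start, end, step = 2, 34, 2.9

values = torch.arange(start, end, step)

# The number of elements follows ceil((end - start) / step).
expected_len = math.ceil((end - start) / step)
print(len(values), expected_len)

# With integer arguments the result is an int64 tensor and end is excluded.
ints = torch.arange(0, 10, 3)
print(ints, ints.dtype)
```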

In [None]:
one_to_three = torch.arange(0,3)
one_to_three

tensor([0, 1, 2])

In [None]:
torch.arange(start = 2,
             end = 34,
             step = 2.9,
             dtype = torch.float,
             layout = torch.strided,
             requires_grad = True)

tensor([ 2.0000,  4.9000,  7.8000, 10.7000, 13.6000, 16.5000, 19.4000, 22.3000,
        25.2000, 28.1000, 31.0000, 33.9000], requires_grad=True)

In [None]:
# create a tensor of zeros with the same shape and dtype as another tensor
three_zeros = torch.zeros_like(input = one_to_three)
three_zeros

tensor([0, 0, 0])

In [None]:
three_zeros.ndim, three_zeros.dtype, three_zeros.shape

(1, torch.int64, torch.Size([3]))

### Tensor Datatypes


https://pytorch.org/docs/stable/tensors.html

Precision in computer science:


https://en.wikipedia.org/wiki/Precision_(computer_science)

Note:

Tensor datatypes are one of the three big sources of errors you'll run into with PyTorch and deep learning:



1.   Tensors not the right shape
2.   Tensors not the right datatype
3.   Tensors not on the right device



In [None]:
float_32_tensor = torch.tensor([1, 3, 4, 5],
                            dtype=torch.float32,
                            device=None,
                            requires_grad=False)
float_32_tensor

tensor([1., 3., 4., 5.])

In [None]:
float_32_tensor.dtype, float_32_tensor.device

(torch.float32, device(type='cpu'))

In [None]:
# convert float32 tensor to float16 using the Tensor.type() method
float_16_tensor = float_32_tensor.type(torch.float16)

In [None]:
float_16_tensor.ndim, float_16_tensor.shape, float_16_tensor.dtype

(1, torch.Size([4]), torch.float16)

In [None]:
float_16_tensor * float_32_tensor

tensor([ 1.,  9., 16., 25.])
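The multiplication above works even though the dtypes differ because PyTorch promotes mixed dtypes to a common one. A small sketch of that behaviour (variable names are illustrative):

```python
import torch

f32 = torch.tensor([1.0, 3.0, 4.0, 5.0], dtype=torch.float32)
f16 = f32.to(torch.float16)
i64 = torch.tensor([1, 2, 3, 4])

# The dtype that a float16/float32 mix is promoted to.
promoted = torch.promote_types(torch.float16, torch.float32)

product = f16 * f32  # float16 * float32
mixed = i64 * f32    # int64 * float32

print(promoted, product.dtype, mixed.dtype)
```

Not every operation promotes silently, so checking `.dtype` is a reasonable first step when a datatype error shows up.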

### Manipulating Tensors

Tensor Operations:

1. Addition
2. Subtraction
3. Multiplication (element wise)
4. Division
5. Matrix Multiplication

In [None]:
tensor = torch.tensor([[4, 6, 7],
                      [6, 5, 3]])


In [None]:
print(f"tensor + 10: {tensor + 10}")
print(f"tensor - 10: {tensor - 10}")
print(f"tensor * 10: {tensor * 10}")
print(f"tensor / 10: {tensor / 10}")

# try with torch built-in functions

print(f"tensor + 10: {torch.add(tensor, 10)}")
print(f"tensor - 10: {torch.sub(tensor, 10)}")
print(f"tensor * 10: {torch.mul(tensor, 10)}")
print(f"tensor / 10: {torch.div(tensor, 10)}")

tensor + 10: tensor([[14, 16, 17],
        [16, 15, 13]])
tensor - 10: tensor([[-6, -4, -3],
        [-4, -5, -7]])
tensor * 10: tensor([[40, 60, 70],
        [60, 50, 30]])
tensor / 10: tensor([[0.4000, 0.6000, 0.7000],
        [0.6000, 0.5000, 0.3000]])
tensor + 10: tensor([[14, 16, 17],
        [16, 15, 13]])
tensor - 10: tensor([[-6, -4, -3],
        [-4, -5, -7]])
tensor * 10: tensor([[40, 60, 70],
        [60, 50, 30]])
tensor / 10: tensor([[0.4000, 0.6000, 0.7000],
        [0.6000, 0.5000, 0.3000]])
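Note that dividing an integer tensor gives a floating point result, while the other three operations keep the integer dtype. A short sketch, using a divisor of 4 (chosen here only so the results are easy to read):

```python
import torch

t = torch.tensor([[4, 6, 7],
                  [6, 5, 3]])

added = t + 10     # stays int64
true_div = t / 4   # true division: promoted to float32

# Floor division keeps the integer dtype.
floor_div = torch.div(t, 4, rounding_mode="floor")

print(added.dtype, true_div.dtype, floor_div.dtype)
print(floor_div)
```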



### Matrix Multiplication

Behaviour depends on the dimensionality of the tensors:

1. If both tensors are 1-D, the dot product is returned.
2. If both arguments are 2-D, the matrix-matrix product is returned.

for more info:
  https://pytorch.org/docs/stable/generated/torch.matmul.html


**2 ways of multiplying tensors**

1. element-wise multiplication
2. matrix multiplication, i.e. the **dot product** of rows and columns

more info about matrix multiplication:

https://www.mathsisfun.com/algebra/matrix-multiplying.html

There are two main rules that performing matrix multiplication needs to satisfy:

1. The **Inner dimension** must match
2. The resulting matrix has the shape of **outer dimensions**
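The two rules can be checked directly. This is a minimal sketch with made-up shapes; `a` and `b` are illustrative names.

```python
import torch

a = torch.rand(3, 2)
b = torch.rand(3, 2)

# (3, 2) @ (3, 2): the inner dimensions are 2 and 3, so this raises an error.
try:
    torch.matmul(a, b)
    inner_mismatch_raised = False
except RuntimeError:
    inner_mismatch_raised = True

# Transposing b gives (3, 2) @ (2, 3): the inner dimensions match (2 and 2)
# and the result takes the outer dimensions (3, 3).
out = a @ b.T

# Element-wise multiplication needs equal (or broadcastable) shapes instead.
elementwise = a * b

print(inner_mismatch_raised, out.shape, elementwise.shape)
```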

In [None]:
# product of two 1-D tensors (dot product)
tensor1 = torch.rand(3)
tensor2 = torch.rand(3)
output=torch.matmul(tensor1, tensor2)

In [None]:
tensor1, tensor2, output

(tensor([0.7668, 0.6960, 0.6499]),
 tensor([0.4535, 0.7089, 0.0487]),
 tensor(0.8728))

In [None]:
output.shape, tensor1.ndim, tensor2.ndim

(torch.Size([]), 1, 1)

In [None]:
# matrix * vector
tensor1 = torch.rand(3, 4)
tensor2 = torch.rand(4)
output = torch.matmul(tensor1, tensor2)

In [None]:
output, tensor1, tensor2

(tensor([1.4366, 1.3501, 1.0724]),
 tensor([[3.1591e-01, 5.7112e-01, 8.8295e-01, 3.0575e-02],
         [8.2529e-01, 3.7634e-01, 2.6853e-01, 9.4193e-01],
         [8.7076e-04, 4.7717e-01, 7.1494e-01, 3.3466e-01]]),
 tensor([0.7083, 0.7993, 0.8478, 0.2517]))

In [None]:
output.size(), output.ndim, output.shape

(torch.Size([3]), 1, torch.Size([3]))

In [None]:
# matrix * matrix
tensor1 = torch.rand(3,4)
tensor2 = torch.rand(4,3)
output = torch.matmul(tensor1, tensor2)


In [None]:
output.size(), output.ndim

(torch.Size([3, 3]), 2)

In [None]:
# batched matrix * broadcasted vector
tensor1 = torch.rand(10, 3, 4)
tensor2 = torch.rand(4)
output = torch.matmul(tensor1, tensor2)

In [None]:
output, tensor1, tensor2

(tensor([[0.8796, 0.3629, 0.3455],
         [0.2460, 1.2472, 1.0858],
         [0.7728, 0.8481, 1.1930],
         [0.6768, 1.0506, 0.9284],
         [0.7682, 0.8107, 0.2163],
         [0.1731, 0.9723, 0.8366],
         [0.6547, 0.7234, 1.0279],
         [0.8206, 1.0522, 1.0745],
         [0.3377, 0.7175, 1.0502],
         [1.1729, 1.1037, 0.3091]]),
 tensor([[[0.9947, 0.1732, 0.5261, 0.8841],
          [0.2091, 0.8389, 0.1130, 0.5396],
          [0.1198, 0.4726, 0.0669, 0.7487]],
 
         [[0.1173, 0.7848, 0.1411, 0.1501],
          [0.5478, 0.9496, 0.9471, 0.9742],
          [0.9416, 0.9864, 0.8647, 0.4621]],
 
         [[0.7509, 0.1979, 0.6246, 0.3973],
          [0.6599, 0.4719, 0.6464, 0.5554],
          [0.7435, 0.9601, 0.8964, 0.8313]],
 
         [[0.3357, 0.6505, 0.4349, 0.6825],
          [0.1287, 0.5856, 0.8712, 0.8528],
          [0.5371, 0.6197, 0.9609, 0.0039]],
 
         [[0.1853, 0.3139, 0.7879, 0.2074],
          [0.1856, 0.2388, 0.6334, 0.7734],
          [0.0884, 0

In [None]:
output.size(), output.ndim

(torch.Size([10, 3]), 2)

In [None]:
# batched matrix * batched matrix
tensor1 = torch.rand(10, 3, 4)
tensor2 = torch.rand(10, 4, 9)
output = torch.matmul(tensor1, tensor2)


In [None]:
tensor1, tensor2, output

(tensor([[[9.6781e-01, 8.8600e-01, 6.4530e-01, 3.6524e-01],
          [7.1362e-01, 2.6894e-01, 2.9505e-01, 1.6717e-02],
          [2.3501e-01, 9.7408e-01, 4.6710e-01, 5.2316e-01]],
 
         [[2.9543e-01, 3.4276e-01, 1.6614e-01, 2.8369e-01],
          [9.2087e-01, 5.6718e-01, 8.1795e-01, 5.0940e-01],
          [8.1884e-01, 9.3374e-01, 6.5887e-01, 5.4845e-01]],
 
         [[9.7374e-01, 4.7369e-01, 4.1694e-01, 1.0023e-01],
          [3.4378e-01, 1.8310e-01, 6.1334e-01, 6.6771e-01],
          [1.9421e-01, 3.6067e-01, 4.8334e-01, 3.4284e-01]],
 
         [[2.6886e-01, 1.4792e-01, 4.2807e-01, 2.7412e-04],
          [9.1087e-01, 2.4938e-01, 2.4760e-01, 7.7178e-03],
          [7.7926e-01, 9.1451e-01, 9.2214e-01, 6.7909e-01]],
 
         [[8.7517e-01, 6.3678e-01, 6.7019e-01, 3.2862e-02],
          [9.1105e-01, 5.2197e-01, 9.3558e-01, 7.3185e-01],
          [3.1961e-01, 5.3644e-01, 7.2312e-01, 3.9074e-01]],
 
         [[3.4820e-01, 3.1431e-01, 3.8379e-01, 5.1350e-01],
          [8.7292e-02, 3.

In [None]:
output.size(), output.ndim

(torch.Size([10, 3, 9]), 3)

In [None]:
# batched matrix * broadcasted matrix
tensor1 = torch.rand(10, 3, 4)
tensor2 = torch.rand(4, 5)
output = torch.matmul(tensor1, tensor2)

In [None]:
output.size(), output.ndim

(torch.Size([10, 3, 5]), 3)

In [None]:
tensor = torch.rand(3)

In [None]:
%%time
value=0
for i in range(len(tensor)):
  value += tensor[i] * tensor[i]
print(value)

tensor(0.8475)
CPU times: user 2.14 ms, sys: 0 ns, total: 2.14 ms
Wall time: 2.19 ms


In [None]:
%%time
torch.matmul(tensor, tensor)

CPU times: user 101 µs, sys: 0 ns, total: 101 µs
Wall time: 108 µs


tensor(0.8475)

### Finding the min, max, mean, sum etc.(tensor aggregation)

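**Note on mean()**: calling mean() on an integer tensor raises "mean(): could not infer output dtype. Input dtype must be either a floating point or complex dtype. Got: Long", so convert the tensor to a floating point dtype first.

**max():**

Returns the maximum value of all elements in the input tensor.

torch.max(input, dim, keepdim=False, *, out=None)
Returns a tuple (values, indices).

values: maximum value of each row of the input tensor in the given dimension dim
indices: index location of each maximum value found (argmax)

If keepdim is True, the output tensors have the same size as input except in the dimension dim, where they have size 1. Otherwise dim is squeezed, so the outputs have one fewer dimension than input.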

more info: https://pytorch.org/docs/stable/generated/torch.max.html


**min()**

Returns the minimum value of all elements in the input tensor.

torch.min(input, dim, keepdim=False, *, out=None)

for more info:
https://pytorch.org/docs/stable/generated/torch.min.html


**mean()**:

Returns the mean value of all elements in the input tensor. Input must be floating point or complex.

torch.mean(input, *, dtype=None)

dtype: desired data type of the returned tensor. If specified, the input tensor is cast to dtype before the operation is performed, which is useful for preventing data type overflows.

torch.mean(input, dim, keepdim=False, *, dtype=None, out=None)

Returns the mean value of each row of the input tensor in the given dimension dim. If dim is a list of dimensions, reduce over all of them.

for more info:
https://pytorch.org/docs/stable/generated/torch.mean.html


**sum()**:

torch.sum(input, *, dtype=None)

Returns the sum of all elements in the input tensor.

torch.sum(input, dim, keepdim=False, *, dtype=None)

Returns the sum of each row of the input tensor in the given dimension dim. If dim is a list of dimensions, reduce over all of them.

for more info:
https://pytorch.org/docs/stable/generated/torch.sum.html


**maximum()**:

torch.maximum(input, other, *, out=None)

Computes the element-wise maximum of input and other.

for more info:
https://pytorch.org/docs/stable/generated/torch.maximum.html
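A compact sketch of the points above: the integer mean() error, and how dim and keepdim change the output shape. The tensors `x` and `m` are illustrative.

```python
import torch

x = torch.arange(0, 100, 10)  # int64 tensor

# mean() on an integer tensor raises a RuntimeError.
try:
    x.mean()
    int_mean_failed = False
except RuntimeError:
    int_mean_failed = True

# Converting to a floating point dtype first fixes it.
mean_value = x.to(torch.float32).mean()

m = torch.tensor([[1.0, 5.0, 2.0],
                  [4.0, 0.0, 6.0]])

row_sums = m.sum(dim=1)                     # shape (2,)
row_sums_kept = m.sum(dim=1, keepdim=True)  # shape (2, 1)
values, indices = m.max(dim=1)              # per-row maximum and its position

print(int_mean_failed, mean_value)
print(row_sums, row_sums_kept.shape, values, indices)
```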

In [None]:
# create a tensor
x = torch.arange(0, 100, 10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [None]:
# find min
torch.min(x), x.min()

(tensor(0), tensor(0))

In [None]:
x = torch.rand(3, 4)
x

tensor([[0.5001, 0.2382, 0.8887, 0.2070],
        [0.0876, 0.0529, 0.5525, 0.3324],
        [0.2345, 0.8970, 0.8083, 0.4582]])

In [None]:
# find max

values, indices = torch.max(x, dim=1, keepdim=False)

In [None]:
values, indices

(tensor([0.8887, 0.5525, 0.8970]), tensor([2, 2, 1]))

In [None]:
indices.shape, indices.ndim

(torch.Size([3]), 1)

In [None]:
values , indices = torch.max(x, dim=1, keepdim=True)

In [None]:
values, indices

(tensor([[0.8887],
         [0.5525],
         [0.8970]]),
 tensor([[2],
         [2],
         [1]]))

In [None]:
indices.shape, indices.ndim

(torch.Size([3, 1]), 2)

In [None]:
x.dtype

torch.float32

In [None]:
# find mean
torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()

(tensor(0.4381), tensor(0.4381))

In [None]:
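# find sum: torch.sum adds up all elements, while Python's built-in sum
# adds the rows together and so returns one sum per column
torch.sum(x), sum(x)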

(tensor(5.2573), tensor([0.8221, 1.1880, 2.2495, 0.9976]))

In [None]:


# find element-wise maximum of two tensors
a = torch.tensor([[1, 7, 3],
                 [5, 4, 8]])
b = torch.tensor([[0, 9, 0],
                 [0, 5, 9]])
torch.maximum(a, b)

tensor([[1, 9, 3],
        [5, 5, 9]])

### Finding the Positional min and max

**argmin():**

torch.argmin(input, dim=None, keepdim=False) -> LongTensor

Returns the indices of the minimum value(s) of the flattened tensor or along a dimension.

When dim is given, this is the second value returned by torch.min().

If there are multiple minimal values, the index of the first minimal value is returned.

for more info:
https://pytorch.org/docs/stable/generated/torch.argmin.html


**argmax():**

torch.argmax(input, dim=None, keepdim=False)

Returns the indices of the maximum value(s) of the flattened tensor or along a dimension.

for more info:
https://pytorch.org/docs/stable/generated/torch.argmax.html
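Without dim, the returned index refers to the flattened tensor. A small sketch of turning it back into a (row, column) position for a 2-D tensor; the tensor `t` and the use of `divmod` are illustrative choices.

```python
import torch

t = torch.tensor([[3.0, 1.0, 2.0],
                  [0.0, 5.0, 4.0]])

flat_min = torch.argmin(t).item()  # index into the flattened tensor
flat_max = torch.argmax(t).item()

# Convert the flat index into (row, column) using the number of columns.
n_cols = t.shape[1]
min_pos = divmod(flat_min, n_cols)
max_pos = divmod(flat_max, n_cols)

# With dim=1, argmin matches the indices returned by torch.min(t, dim=1).
col_of_row_min = torch.argmin(t, dim=1)
same_as_min = torch.equal(col_of_row_min, torch.min(t, dim=1).indices)

print(flat_min, min_pos, flat_max, max_pos, col_of_row_min, same_as_min)
```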

In [None]:
test_tensor = torch.rand(3, 5)
test_tensor

tensor([[0.0729, 0.5799, 0.0668, 0.0902, 0.9651],
        [0.6544, 0.8429, 0.0388, 0.7771, 0.1881],
        [0.6486, 0.1491, 0.5511, 0.1620, 0.0111]])

In [None]:
# positional minimal
torch.argmin(test_tensor)

tensor(14)

In [None]:
torch.argmin(test_tensor, dim=1)

tensor([2, 2, 4])

In [None]:
torch.argmin(test_tensor, dim=1, keepdim=True)

tensor([[2],
        [2],
        [4]])

In [None]:
# positional maximal
torch.argmax(test_tensor)

tensor(4)

In [None]:
torch.argmax(test_tensor, dim=1)

tensor([4, 1, 0])

In [None]:
torch.argmax(test_tensor, dim=1, keepdim=True)

tensor([[4],
        [1],
        [0]])

### Reshaping, Stacking, Squeezing and Unsqueezing tensors

**Reshaping:**

  Returns a tensor with the same data and number of elements as input, but with the specified shape.

  When possible, the returned tensor will be a view of input. Otherwise, it will be a copy.

  torch.reshape(input, shape), where shape is a tuple of ints.

  for more info:
  https://pytorch.org/docs/stable/generated/torch.reshape.html


**Tensor.view():**

  Returns a new tensor with the same data as the self tensor but with a different shape.

  The new view size must be compatible with the original size and stride: each new view dimension must either be a subspace of an original dimension, or only span across original dimensions.

  for more info:
  https://pytorch.org/docs/stable/generated/torch.Tensor.view.html#torch.Tensor.view
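  A minimal sketch of where view and reshape differ, assuming a transposed (non-contiguous) tensor as the input. The names are illustrative.

```python
import torch

t = torch.arange(6).reshape(2, 3)
tt = t.T  # non-contiguous view of t with shape (3, 2)

# view cannot flatten tt, because its strides are not compatible.
try:
    tt.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True

# reshape falls back to copying the data when a view is not possible.
flat = tt.reshape(6)
shares_memory = flat.data_ptr() == t.data_ptr()

print(view_failed, flat, shares_memory)
```

  When sharing memory matters, it is worth checking rather than assuming that reshape returned a view.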


**Stacking:**

  torch.stack(tensors, dim=0, *, out=None)

  Concatenates a sequence of tensors along a new dimension.

  All tensors need to be of the same size.

  for more info:
  https://pytorch.org/docs/stable/generated/torch.stack.html

  torch.vstack(tensors, *, out=None)

  Stack tensors in sequence vertically (row wise).

  This is equivalent to concatenation along the first axis after all 1-D tensors have been reshaped by torch.atleast_2d().

  for more info:

  https://pytorch.org/docs/stable/generated/torch.vstack.html


  torch.hstack(tensors, *, out=None)

  Stack tensors in sequence horizontally (column wise).

  This is equivalent to concatenation along the first axis for 1-D tensors and along the second axis for all other tensors.
  
  for more info:

  https://pytorch.org/docs/stable/generated/torch.hstack.html
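  The shapes produced by the three functions can be compared side by side. A small sketch with illustrative tensors `a`, `b` and `vec`:

```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(2, 3)

stacked = torch.stack([a, b], dim=0)  # new leading dimension: (2, 2, 3)
v = torch.vstack([a, b])              # rows appended: (4, 3)
h = torch.hstack([a, b])              # columns appended: (2, 6)

# For 2-D inputs, vstack and hstack match concatenation along dim 0 and dim 1.
same_v = torch.equal(v, torch.cat([a, b], dim=0))
same_h = torch.equal(h, torch.cat([a, b], dim=1))

# 1-D inputs: vstack first treats each one as a row, hstack joins end to end.
vec = torch.arange(3)
vec_v = torch.vstack([vec, vec])  # (2, 3)
vec_h = torch.hstack([vec, vec])  # (6,)

print(stacked.shape, v.shape, h.shape, vec_v.shape, vec_h.shape)
```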


**Squeezing:**

  torch.squeeze(input, dim=None) -> Tensor

  Returns a tensor with all specified dimensions of input of size 1 removed.

  When dim is given, only the specified dimension is squeezed (removed), and only if it has size 1.

  The returned tensor shares its storage with the input tensor, so changing the contents of one changes the contents of the other.

  for more info:
  https://pytorch.org/docs/stable/generated/torch.squeeze.html

**Unsqueezing:**

  torch.unsqueeze(input, dim) -> Tensor

  Returns a new tensor with a dimension of size one inserted at the specified position.

  The returned tensor shares the same underlying data with the input tensor.

  A dim value within the range [-input.ndim - 1, input.ndim + 1) can be used.
  A negative dim corresponds to unsqueeze() applied at dim = dim + input.ndim + 1.

  for more info:

  https://pytorch.org/docs/stable/generated/torch.unsqueeze.html
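  A short sketch of how dim affects squeeze and unsqueeze, and of the shared storage. The tensor `x` is illustrative.

```python
import torch

x = torch.zeros(1, 3, 1)

all_squeezed = x.squeeze()     # every size-1 dimension removed: (3,)
first_only = x.squeeze(dim=0)  # only dimension 0 removed: (3, 1)
unchanged = x.squeeze(dim=1)   # dimension 1 has size 3, so nothing is removed

# For a 1-D tensor, dim=-1 maps to -1 + 1 + 1 = 1, giving shape (3, 1).
restored = all_squeezed.unsqueeze(dim=-1)

# The squeezed tensor shares storage with x, so this write is visible in x.
all_squeezed[0] = 7.0

print(all_squeezed.shape, first_only.shape, unchanged.shape, restored.shape)
print(x[0, 0, 0].item())
```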

**Permute:**

  Returns a view of the original tensor input with its dimensions permuted.

  torch.permute(input, dims) -> Tensor

  for more info:

  https://pytorch.org/docs/stable/generated/torch.permute.html
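  A common use of permute is switching an image between height-width-channels and channels-height-width layouts. A minimal sketch with a random tensor standing in for an image (the names are illustrative):

```python
import torch

# A random "image" in (height, width, colour channels) layout.
image_hwc = torch.rand(224, 224, 3)

# Move the channel dimension to the front: (colour channels, height, width).
image_chw = image_hwc.permute(2, 0, 1)

# permute returns a view, so a write to the original shows up in the result.
image_hwc[0, 0, 0] = 0.5

print(image_chw.shape, image_chw[0, 0, 0].item(), image_chw.is_contiguous())
```

  If a contiguous copy is needed afterwards, calling `.contiguous()` on the permuted tensor creates one.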

In [None]:
# create a tensor and reshape it
x = torch.rand(1, 4, 5)
x

tensor([[[0.1727, 0.5525, 0.9398, 0.4633, 0.4456],
         [0.9252, 0.6192, 0.3356, 0.1219, 0.1718],
         [0.3946, 0.2747, 0.7708, 0.5302, 0.3122],
         [0.7000, 0.8813, 0.2388, 0.2446, 0.2226]]])

In [None]:
x.shape, x.ndim

(torch.Size([1, 4, 5]), 3)

In [None]:
x_reshaped = torch.reshape(x, (5, 2, 2))

In [None]:
x_reshaped

tensor([[[0.1727, 0.5525],
         [0.9398, 0.4633]],

        [[0.4456, 0.9252],
         [0.6192, 0.3356]],

        [[0.1219, 0.1718],
         [0.3946, 0.2747]],

        [[0.7708, 0.5302],
         [0.3122, 0.7000]],

        [[0.8813, 0.2388],
         [0.2446, 0.2226]]])

In [None]:
x_reshaped.shape, x_reshaped.ndim

(torch.Size([5, 2, 2]), 3)

In [None]:
# change the view
z = x.view(5, 4)
z, z.shape, z.ndim

(tensor([[0.1727, 0.5525, 0.9398, 0.4633],
         [0.4456, 0.9252, 0.6192, 0.3356],
         [0.1219, 0.1718, 0.3946, 0.2747],
         [0.7708, 0.5302, 0.3122, 0.7000],
         [0.8813, 0.2388, 0.2446, 0.2226]]),
 torch.Size([5, 4]),
 2)

In [None]:
# changing z changes x, because a view shares the same memory as the original tensor
z[0,3] = 5
z, x

(tensor([[0.1727, 0.5525, 0.9398, 5.0000],
         [0.4456, 0.9252, 0.6192, 0.3356],
         [0.1219, 0.1718, 0.3946, 0.2747],
         [0.7708, 0.5302, 0.3122, 0.7000],
         [0.8813, 0.2388, 0.2446, 0.2226]]),
 tensor([[[0.1727, 0.5525, 0.9398, 5.0000, 0.4456],
          [0.9252, 0.6192, 0.3356, 0.1219, 0.1718],
          [0.3946, 0.2747, 0.7708, 0.5302, 0.3122],
          [0.7000, 0.8813, 0.2388, 0.2446, 0.2226]]]))

In [None]:
x = torch.rand(2,3)
x

tensor([[0.7309, 0.3233, 0.0093],
        [0.5604, 0.6835, 0.4252]])

In [None]:
x.shape, x.ndim

(torch.Size([2, 3]), 2)

In [None]:
# stack tensors on top of each other
x_stacked_0 = torch.stack([x, x], dim=0)
x_stacked_0

tensor([[[0.7309, 0.3233, 0.0093],
         [0.5604, 0.6835, 0.4252]],

        [[0.7309, 0.3233, 0.0093],
         [0.5604, 0.6835, 0.4252]]])

In [None]:
x_stacked_0.shape, x_stacked_0.ndim

(torch.Size([2, 2, 3]), 3)

In [None]:
x_stacked_1 = torch.stack([x, x], dim=1)
x_stacked_1

tensor([[[0.7309, 0.3233, 0.0093],
         [0.7309, 0.3233, 0.0093]],

        [[0.5604, 0.6835, 0.4252],
         [0.5604, 0.6835, 0.4252]]])

In [None]:
x_stacked_1.shape, x_stacked_1.ndim

(torch.Size([2, 2, 3]), 3)

In [None]:
# stacking along dim=1 again (same result as x_stacked_1), shown next to x for comparison
x_stacked_2 = torch.stack([x, x], dim=1)
x_stacked_2, x

(tensor([[[0.7309, 0.3233, 0.0093],
          [0.7309, 0.3233, 0.0093]],
 
         [[0.5604, 0.6835, 0.4252],
          [0.5604, 0.6835, 0.4252]]]),
 tensor([[0.7309, 0.3233, 0.0093],
         [0.5604, 0.6835, 0.4252]]))

In [None]:
x_vstacked = torch.vstack([x, x])
x_vstacked

tensor([[0.7309, 0.3233, 0.0093],
        [0.5604, 0.6835, 0.4252],
        [0.7309, 0.3233, 0.0093],
        [0.5604, 0.6835, 0.4252]])

In [None]:

x_vstacked.shape, x_vstacked.ndim

(torch.Size([4, 3]), 2)

In [None]:
x_hstacked = torch.hstack([x, x])
x_hstacked

tensor([[0.7309, 0.3233, 0.0093, 0.7309, 0.3233, 0.0093],
        [0.5604, 0.6835, 0.4252, 0.5604, 0.6835, 0.4252]])

In [None]:
# torch.squeeze() removes all dimensions of size 1 from the target tensor
x = torch.zeros(1, 1, 10, 3, 3)
x.shape

torch.Size([1, 1, 10, 3, 3])

In [None]:
x

tensor([[[[[0., 0., 0.],
           [0., 0., 0.],
           [0., 0., 0.]],

          [[0., 0., 0.],
           [0., 0., 0.],
           [0., 0., 0.]],

          [[0., 0., 0.],
           [0., 0., 0.],
           [0., 0., 0.]],

          [[0., 0., 0.],
           [0., 0., 0.],
           [0., 0., 0.]],

          [[0., 0., 0.],
           [0., 0., 0.],
           [0., 0., 0.]],

          [[0., 0., 0.],
           [0., 0., 0.],
           [0., 0., 0.]],

          [[0., 0., 0.],
           [0., 0., 0.],
           [0., 0., 0.]],

          [[0., 0., 0.],
           [0., 0., 0.],
           [0., 0., 0.]],

          [[0., 0., 0.],
           [0., 0., 0.],
           [0., 0., 0.]],

          [[0., 0., 0.],
           [0., 0., 0.],
           [0., 0., 0.]]]]])

In [None]:
y = torch.squeeze(x)

In [None]:
y.shape,  y.ndim

(torch.Size([10, 3, 3]), 3)

In [None]:
y

tensor([[[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]]])

In [None]:
z = torch.unsqueeze(y, dim=-2)
z, z.shape

(tensor([[[[0., 0., 0.]],
 
          [[0., 0., 0.]],
 
          [[0., 0., 0.]]],
 
 
         [[[0., 0., 0.]],
 
          [[0., 0., 0.]],
 
          [[0., 0., 0.]]],
 
 
         [[[0., 0., 0.]],
 
          [[0., 0., 0.]],
 
          [[0., 0., 0.]]],
 
 
         [[[0., 0., 0.]],
 
          [[0., 0., 0.]],
 
          [[0., 0., 0.]]],
 
 
         [[[0., 0., 0.]],
 
          [[0., 0., 0.]],
 
          [[0., 0., 0.]]],
 
 
         [[[0., 0., 0.]],
 
          [[0., 0., 0.]],
 
          [[0., 0., 0.]]],
 
 
         [[[0., 0., 0.]],
 
          [[0., 0., 0.]],
 
          [[0., 0., 0.]]],
 
 
         [[[0., 0., 0.]],
 
          [[0., 0., 0.]],
 
          [[0., 0., 0.]]],
 
 
         [[[0., 0., 0.]],
 
          [[0., 0., 0.]],
 
          [[0., 0., 0.]]],
 
 
         [[[0., 0., 0.]],
 
          [[0., 0., 0.]],
 
          [[0., 0., 0.]]]]),
 torch.Size([10, 3, 1, 3]))

In [None]:
# permute

x = torch.rand(3, 3, 5)
x

tensor([[[0.0556, 0.7549, 0.3412, 0.8188, 0.3506],
         [0.7011, 0.2496, 0.3483, 0.9460, 0.3936],
         [0.5409, 0.7627, 0.4547, 0.9900, 0.0525]],

        [[0.0370, 0.5104, 0.2480, 0.0374, 0.2683],
         [0.7353, 0.7326, 0.0114, 0.0411, 0.9471],
         [0.7980, 0.3853, 0.8810, 0.1806, 0.7791]],

        [[0.6558, 0.8534, 0.4340, 0.3923, 0.7651],
         [0.3787, 0.4250, 0.3235, 0.8145, 0.5282],
         [0.4986, 0.9169, 0.8671, 0.7472, 0.7102]]])

In [None]:
x_permuted = torch.permute(x, (2, 0, 1))
x_permuted

tensor([[[0.0556, 0.7011, 0.5409],
         [0.0370, 0.7353, 0.7980],
         [0.6558, 0.3787, 0.4986]],

        [[0.7549, 0.2496, 0.7627],
         [0.5104, 0.7326, 0.3853],
         [0.8534, 0.4250, 0.9169]],

        [[0.3412, 0.3483, 0.4547],
         [0.2480, 0.0114, 0.8810],
         [0.4340, 0.3235, 0.8671]],

        [[0.8188, 0.9460, 0.9900],
         [0.0374, 0.0411, 0.1806],
         [0.3923, 0.8145, 0.7472]],

        [[0.3506, 0.3936, 0.0525],
         [0.2683, 0.9471, 0.7791],
         [0.7651, 0.5282, 0.7102]]])

### Indexing(selecting data from tensors)

Indexing in torch is similar to numpy indexing.

In [None]:
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape, x.ndim

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]),
 3)

In [None]:
x[0], x[0][0][2]

(tensor([[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]),
 tensor(3))

In [None]:


# get all values of 0th and 1st dimension but only index 1 of 2nd dimension
x[:, :, 1]

tensor([[2, 5, 8]])
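A few more indexing patterns on the same tensor, as a small sketch:

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

middle_row = x[:, 1, :]   # index 1 of the 1st dimension, everything else kept
last_column = x[0, :, 2]  # index 0 of the 0th dimension, index 2 of the 2nd
centre = x[0, 1, 1]       # a single element

# Chained brackets select the same element as comma-separated indices.
same_centre = x[0][1][1]

print(middle_row, last_column, centre, same_centre)
```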

### PyTorch Tensors and NumPy Arrays

NumPy is a popular scientific Python library for numerical computing, and PyTorch has functionality to interact with it.

* Data in NumPy, want in a PyTorch tensor -> torch.from_numpy(ndarray)
* PyTorch tensor, want in NumPy -> torch.Tensor.numpy()
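Both conversions share memory with the original object on CPU, which is easy to miss. A minimal sketch (variable names are illustrative):

```python
import numpy as np
import torch

array = np.arange(1.0, 4.0)       # float64 array: [1., 2., 3.]
tensor = torch.from_numpy(array)  # shares memory with array, dtype float64

# An in-place change to the array is visible in the tensor.
array[0] = 99.0

# tensor + 1 allocates new memory, so the result is independent of array.
new_tensor = tensor + 1

# Tensor.numpy() also shares memory, so a write to the tensor shows up in it.
back = tensor.numpy()
tensor[1] = -1.0

print(tensor, new_tensor, back)
```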

In [None]:
array = np.arange(1.0, 8.0)
# warning: when converting from NumPy to PyTorch, the tensor keeps NumPy's default dtype (float64) unless another dtype is specified
tensor = torch.from_numpy(array)
tensor, array

(tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64),
 array([1., 2., 3., 4., 5., 6., 7.]))

In [None]:
tensor = tensor+1
tensor, array

(tensor([2., 3., 4., 5., 6., 7., 8.], dtype=torch.float64),
 array([1., 2., 3., 4., 5., 6., 7.]))

In [None]:
tensor = tensor.type(torch.float32)
tensor

tensor([2., 3., 4., 5., 6., 7., 8.])

In [None]:
# tensor was rebound to new tensors above (tensor + 1, then .type()), so changing it no longer affects the NumPy array
tensor[0] =  91
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([91.,  3.,  4.,  5.,  6.,  7.,  8.]))
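The array and tensor above are independent only because `tensor + 1` and `.type()` each produced a new tensor. A tensor that comes straight from `torch.from_numpy` shares memory with its source array. A minimal sketch of the difference (variable names here are illustrative):

```python
import numpy as np
import torch

array = np.arange(1.0, 4.0)

# from_numpy shares memory with the source array
shared = torch.from_numpy(array)

# torch.tensor copies the data into a new tensor
copied = torch.tensor(array)

# in-place change to the array: visible in `shared`, not in `copied`
array[0] = 91.0
print(shared, copied)
```

Use `torch.tensor(array)` or `torch.from_numpy(array).clone()` when the tensor must not change with the array.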

In [None]:
# tensor to numpy array

tensor = torch.rand([3,4])
array = tensor.numpy()
tensor, array

(tensor([[0.1291, 0.6586, 0.3787, 0.9410],
         [0.2091, 0.2975, 0.7614, 0.7680],
         [0.5391, 0.0396, 0.7088, 0.1158]]),
 array([[0.12914288, 0.6586039 , 0.37874007, 0.94100773],
        [0.20909625, 0.29749846, 0.761369  , 0.76796705],
        [0.53909034, 0.03963125, 0.708786  , 0.11575621]], dtype=float32))
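The conversion in this direction also avoids a copy for CPU tensors: the array returned by `Tensor.numpy()` shares memory with the tensor and keeps its dtype (float32 here). A short sketch under that assumption, with illustrative names:

```python
import numpy as np
import torch

t = torch.ones(3)                    # float32 by default
view_array = t.numpy()               # shares memory with t
copy_array = t.clone().numpy()       # clone first for an independent array

# in-place update of the tensor shows up in view_array only
t.add_(1)
print(view_array, copy_array)
```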

### Reproducibility
In short, how a neural network learns:

start with random numbers -> tensor operations -> update the random numbers to try to make them better representations of the data -> again -> again -> again...


A random seed is a number or vector used to initialise a pseudorandom number generator.

for more info:
https://en.wikipedia.org/wiki/Random_seed

torch.manual_seed(seed) sets the seed for generating random numbers on all devices and returns a torch.Generator object.

for more info:
https://pytorch.org/docs/stable/generated/torch.manual_seed.html

In [None]:


# random but reproducible tensors

random_seed = 24
torch.manual_seed(random_seed)

tensor_1= torch.rand(3,4)

tensor_2 = torch.rand(3,4)

tensor_1, tensor_2

(tensor([[0.7644, 0.3751, 0.0751, 0.5308],
         [0.9660, 0.2770, 0.3372, 0.8910],
         [0.4304, 0.3090, 0.3993, 0.5183]]),
 tensor([[0.0927, 0.3571, 0.9848, 0.3928],
         [0.9554, 0.3048, 0.2989, 0.3510],
         [0.0529, 0.1988, 0.8022, 0.1249]]))

In [None]:
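random_seed = 32

# seed, then draw 12 values
torch.manual_seed(random_seed)
tensor_3 = torch.rand(3,4)

# re-seed so the next call restarts the same random stream
torch.manual_seed(random_seed)
# shape (3, 3): takes the first 9 values of that same stream
tensor_4 = torch.rand(3,3)

tensor_3, tensor_4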

(tensor([[0.8757, 0.2721, 0.4141, 0.7857],
         [0.1130, 0.5793, 0.6481, 0.0229],
         [0.5874, 0.3254, 0.9485, 0.5219]]),
 tensor([[0.8757, 0.2721, 0.4141],
         [0.7857, 0.1130, 0.5793],
         [0.6481, 0.0229, 0.5874]]))
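Re-seeding the global generator before every call works, but it changes state for all later random calls too. As an alternative sketch (not the notebook's method), a local `torch.Generator` can be seeded and passed to `torch.rand` through its `generator` argument; the names `gen_a` and `gen_b` are illustrative.

```python
import torch

# two independent generators with the same seed
gen_a = torch.Generator().manual_seed(24)
gen_b = torch.Generator().manual_seed(24)

first = torch.rand(3, 4, generator=gen_a)
second = torch.rand(3, 4, generator=gen_b)

# same seed and same call sequence give identical tensors
print(torch.equal(first, second))

# drawing again from gen_a continues its stream, so the values move on
third = torch.rand(3, 4, generator=gen_a)
```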

### Running tensors and PyTorch objects on GPUs (for faster computations)

### 1. Getting a GPU

1. Easiest: use Google Colab for a free GPU (with options to upgrade).
2. Use your own GPU: this takes some setup and requires buying a GPU. There are many options; see this post for guidance on which to get: https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/
3. Use cloud computing: GCP, AWS, and Azure let you rent computers in the cloud and access them remotely.

For options 2 and 3, PyTorch and the GPU drivers (CUDA) take a little setting up; refer to the documentation: https://pytorch.org/get-started/locally/

In [None]:
!nvidia-smi

/bin/bash: line 1: nvidia-smi: command not found


### 2. Check for GPU access with PyTorch

CUDA Semantics Documentation:
https://pytorch.org/docs/stable/notes/cuda.html

In [None]:
# check GPU access with PyTorch

torch.cuda.is_available()

False

Since PyTorch can run computations on either the GPU or the CPU, it is best practice to set up device-agnostic code: https://pytorch.org/docs/stable/notes/cuda.html#best-practices

In [None]:
# set up device-agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cpu'

In [None]:
# count the number of available CUDA devices
torch.cuda.device_count()

0
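When the count is non-zero, it can help to see which devices PyTorch found. A small sketch using `torch.cuda.get_device_name`; on a CPU-only machine like the run above, the list is simply empty (`gpu_names` is an illustrative name).

```python
import torch

# one name per visible CUDA device; empty list when no GPU is available
gpu_names = [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())]
print(gpu_names if gpu_names else "no CUDA devices visible")
```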

### 3. Putting Tensors (and models) on the GPU
We put tensors and models on the GPU because GPUs typically run the large, parallel tensor operations used in deep learning much faster than a CPU.

In [None]:
# create tensor (default on CPU)
tensor = torch.tensor([1,2,3])
print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [None]:
# move tensor to GPU if available
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3])
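The same device-agnostic pattern applies to models and to tensors created directly on the target device. A minimal sketch, assuming a small `nn.Linear` layer as a stand-in model (the layer and its sizes are illustrative, not part of the notebook):

```python
import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# create the tensor directly on the target device instead of moving it later
inputs = torch.rand(2, 3, device=device)

# stand-in model; .to(device) moves its parameters to the target device
model = nn.Linear(in_features=3, out_features=1).to(device)

# inputs and parameters must be on the same device for the forward pass
outputs = model(inputs)
print(outputs.shape, outputs.device)
```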

### 4. Moving tensors back to the CPU

NumPy works only with data in CPU memory, so calling .numpy() on a GPU tensor raises:

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

To fix this, copy the tensor to the CPU first, then convert it to a NumPy array.

In [None]:
# if the tensor is on the GPU, this raises the TypeError above (on a CPU-only machine, as in this run, it succeeds)
tensor_on_gpu_numpy = tensor_on_gpu.numpy()

In [None]:
tensor_on_gpu_numpy = tensor_on_gpu.cpu().numpy()
tensor_on_gpu_numpy

array([1, 2, 3])
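A small helper can make this conversion safe wherever the tensor lives. The function `to_numpy` below is hypothetical (not a PyTorch API); it also calls `.detach()`, since tensors that track gradients cannot be converted to NumPy directly.

```python
import torch

def to_numpy(t: torch.Tensor):
    """Return a NumPy array for a tensor on any device (hypothetical helper)."""
    # detach: drop gradient tracking; cpu: copy to host memory if needed
    return t.detach().cpu().numpy()

device = "cuda" if torch.cuda.is_available() else "cpu"
weights = torch.tensor([1.0, 2.0, 3.0], device=device, requires_grad=True)
weights_np = to_numpy(weights)
print(weights_np)
```

On a CPU tensor, `.cpu()` returns the tensor itself, so the helper adds no extra copy in that case.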