PyTorch Tensors

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F

In [8]:
# PyTorch names the GPU device "cuda"; "gpu" is not a valid device string
device = "cuda" if torch.cuda.is_available() else "cpu"

In [9]:
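# Creating tensors; the default dtype is torch.float32
zeros_t = torch.zeros(size=(3, 4), device=device)
ones_t = torch.ones((2, 3))  # no device given, so this one is created on the CPU
# torch.empty allocates memory without initializing it, so its values are arbitrary
empty_t = torch.empty(4, 3, device=device)
# torch.rand samples uniformly from the interval [0, 1)
rand_t = torch.rand((100, 100), device=device)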

In [10]:
zeros_t.shape

torch.Size([3, 4])

In [11]:
zeros_t

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [12]:
zeros_t.dtype

torch.float32

In [13]:
ones_t

tensor([[1., 1., 1.],
        [1., 1., 1.]])

In [14]:
ones_t.dtype

torch.float32
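The outputs above show that tensors default to float32. As a minimal sketch (the variable names here are illustrative and not part of the original notebook), the dtype can be chosen at creation time with the dtype argument, or changed afterwards with .to():

```python
import torch

# Request a dtype explicitly instead of relying on the float32 default
int_ones = torch.ones((2, 3), dtype=torch.int16)
double_zeros = torch.zeros(3, 4, dtype=torch.float64)

# Convert an existing float tensor; float-to-int conversion drops the fractional part
scaled = torch.rand(2, 3) * 20
as_int = scaled.to(torch.int32)

print(int_ones.dtype, double_zeros.dtype, as_int.dtype)
```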

In [20]:
torch.min(rand_t), torch.max(rand_t)

(tensor(1.2100e-05), tensor(0.9994))

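Once a random seed is set, the random number generator produces the same sequence of numbers every time, counted from the point where the seed was set. Below, the same values are drawn after each call to torch.manual_seed(1729), so rand_tensor_1 and rand_tensor_3 hold identical numbers arranged in different shapes.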

In [22]:
torch.manual_seed(1729)
rand_tensor_1 = torch.rand(3,4)
rand_tensor_2 = torch.rand(4,3)

torch.manual_seed(1729)
rand_tensor_3 = torch.rand(4,3)
rand_tensor_4 = torch.rand(3,4)

In [25]:
rand_tensor_1, rand_tensor_2

(tensor([[0.3126, 0.3791, 0.3087, 0.0736],
         [0.4216, 0.0691, 0.2332, 0.4047],
         [0.2162, 0.9927, 0.4128, 0.5938]]),
 tensor([[0.6128, 0.1519, 0.0453],
         [0.5035, 0.9978, 0.3884],
         [0.6929, 0.1703, 0.1384],
         [0.4759, 0.7481, 0.0361]]))

In [26]:
rand_tensor_3, rand_tensor_4

(tensor([[0.3126, 0.3791, 0.3087],
         [0.0736, 0.4216, 0.0691],
         [0.2332, 0.4047, 0.2162],
         [0.9927, 0.4128, 0.5938]]),
 tensor([[0.6128, 0.1519, 0.0453, 0.5035],
         [0.9978, 0.3884, 0.6929, 0.1703],
         [0.1384, 0.4759, 0.7481, 0.0361]]))

In [29]:
rand_tensor_1.flatten(), rand_tensor_3.flatten()

(tensor([0.3126, 0.3791, 0.3087, 0.0736, 0.4216, 0.0691, 0.2332, 0.4047, 0.2162,
         0.9927, 0.4128, 0.5938]),
 tensor([0.3126, 0.3791, 0.3087, 0.0736, 0.4216, 0.0691, 0.2332, 0.4047, 0.2162,
         0.9927, 0.4128, 0.5938]))
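The comparison above is done by eye. A small sketch of the same check done programmatically with torch.equal, assuming CPU tensors as in the outputs shown (the short variable names are illustrative only):

```python
import torch

torch.manual_seed(1729)
a = torch.rand(3, 4)   # first 12 values of the stream
b = torch.rand(4, 3)   # next 12 values

torch.manual_seed(1729)  # reset the generator to the same starting point
c = torch.rand(4, 3)   # first 12 values again, laid out as 4x3
d = torch.rand(3, 4)   # next 12 values again, laid out as 3x4

# The shapes differ, but the flattened values agree element for element
same_first = torch.equal(a.flatten(), c.flatten())
same_second = torch.equal(b.flatten(), d.flatten())
print(same_first, same_second)
```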

Most tensor operations are intuitive: arithmetic operators act elementwise. When the shapes of the operands do not match, PyTorch uses broadcasting to expand the smaller tensor to a compatible shape where that is possible.

In [31]:
ones_t = torch.ones(3,4)
twos_t = ones_t * 2
fours_t = twos_t**2
halfs_t = fours_t/8
ones_t, twos_t, fours_t, halfs_t

(tensor([[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]),
 tensor([[2., 2., 2., 2.],
         [2., 2., 2., 2.],
         [2., 2., 2., 2.]]),
 tensor([[4., 4., 4., 4.],
         [4., 4., 4., 4.],
         [4., 4., 4., 4.]]),
 tensor([[0.5000, 0.5000, 0.5000, 0.5000],
         [0.5000, 0.5000, 0.5000, 0.5000],
         [0.5000, 0.5000, 0.5000, 0.5000]]))

In [32]:
ones = torch.ones(3,4)
twos = ones*2
threes = ones + twos

In [33]:
ones, twos, threes

(tensor([[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]),
 tensor([[2., 2., 2., 2.],
         [2., 2., 2., 2.],
         [2., 2., 2., 2.]]),
 tensor([[3., 3., 3., 3.],
         [3., 3., 3., 3.],
         [3., 3., 3., 3.]]))

In [35]:
ones_1d_tensor = ones[0]  # first row of ones, shape (4,)

In [36]:
ones.shape, ones_1d_tensor.shape

(torch.Size([3, 4]), torch.Size([4]))

In [37]:
ones + ones_1d_tensor

tensor([[2., 2., 2., 2.],
        [2., 2., 2., 2.],
        [2., 2., 2., 2.]])

In [42]:
var1 = ones[0].reshape(1, -1)
var2 = ones[0].reshape(4, -1)

In [44]:
ones.shape, var1.shape, ones + var1

(torch.Size([3, 4]),
 torch.Size([1, 4]),
 tensor([[2., 2., 2., 2.],
         [2., 2., 2., 2.],
         [2., 2., 2., 2.]]))

In [46]:
ones.shape, var2.shape

(torch.Size([3, 4]), torch.Size([4, 1]))

In [45]:
ones + var2

RuntimeError: ignored

So a tensor of shape (4,) can be broadcast and added to a tensor of shape (3, 4), and so can a tensor of shape (1, 4). A tensor of shape (4, 1) cannot, even though it holds the same four elements: its leading dimension is 4, which neither matches 3 nor equals 1.

This illustrates the broadcasting rule, which compares shapes starting from the last dimension and working backwards. For each pair of dimensions, one of three things must hold: the sizes are equal; one of them is 1, in which case that dimension is stretched (conceptually copied) to match the other; or the dimension is missing from one tensor, which is treated the same as a size of 1. The rule applies to elementwise operations such as addition and elementwise multiplication with *. Matrix multiplication is different: it requires the inner dimensions to agree, as in (m, p) times (p, n).
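The rule can be checked without building the tensors. The sketch below uses torch.broadcast_shapes (available in recent PyTorch releases) to compute the broadcast result shape, then repeats the failing case and contrasts it with matrix multiplication. The names col, failed, and product are illustrative and do not come from the notebook.

```python
import torch

ones = torch.ones(3, 4)

# Compatible with (3, 4): a missing leading dimension, or a leading dimension of 1
shape_a = torch.broadcast_shapes((3, 4), (4,))
shape_b = torch.broadcast_shapes((3, 4), (1, 4))

# (4, 1) is not compatible: the trailing pair 4 vs 1 is fine, but 3 vs 4 is not
try:
    ones + torch.ones(4, 1)
    failed = False
except RuntimeError:
    failed = True

# (3, 1) is compatible: the size-1 column is stretched across the 4 columns
col = torch.ones(3, 1)
result = ones + col

# Matrix multiplication follows the (m, p) @ (p, n) rule instead
product = ones @ torch.ones(4, 2)

print(shape_a, shape_b, failed, result.shape, product.shape)
```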