PyTorch Tensors

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F

In [8]:
# PyTorch names the GPU backend "cuda"; "gpu" is not a valid device string.
device = "cuda" if torch.cuda.is_available() else "cpu"

In [9]:
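# Creating tensors with factory functions; the default dtype is torch.float32
zeros_t = torch.zeros(size = (3,4), device = device)
ones_t = torch.ones((2,3))                          # no device given, so this one lives on the CPU
empty_t = torch.empty(4,3, device = device)         # uninitialized memory: contents are arbitrary
rand_t = torch.rand((100,100), device = device)     # uniform samples from [0, 1)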

In [10]:
zeros_t.shape

torch.Size([3, 4])

In [11]:
zeros_t

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [12]:
zeros_t.dtype

torch.float32

In [13]:
ones_t

tensor([[1., 1., 1.],
        [1., 1., 1.]])

In [14]:
ones_t.dtype

torch.float32
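
The cells above rely on the default dtype. As a small sketch (the variable names here are only illustrative), the default can be inspected with `torch.get_default_dtype()`, overridden per tensor with the `dtype` argument, or changed afterwards with `.to(...)`:

```python
import torch

# Factory functions fall back to the global default floating-point dtype.
default_dtype = torch.get_default_dtype()

zeros_default = torch.zeros(3, 4)                  # uses the default dtype
zeros_int = torch.zeros(3, 4, dtype=torch.int64)   # explicit override at creation
zeros_half = zeros_default.to(torch.float16)       # convert an existing tensor (returns a copy)

print(default_dtype, zeros_default.dtype, zeros_int.dtype, zeros_half.dtype)
```

Note that `.to(...)` does not modify `zeros_default`; it returns a new tensor with the requested dtype.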

In [20]:
torch.min(rand_t), torch.max(rand_t)

(tensor(1.2100e-05), tensor(0.9994))

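Once a random seed is set, the random number generator produces the same sequence of numbers every time, counted from the point where the seed was set. In the cells below, re-seeding with the same value yields the same values in the same order, even though they are arranged into tensors of different shapes.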

In [22]:
torch.manual_seed(1729)
rand_tensor_1 = torch.rand(3,4)
rand_tensor_2 = torch.rand(4,3)

torch.manual_seed(1729)
rand_tensor_3 = torch.rand(4,3)
rand_tensor_4 = torch.rand(3,4)

In [25]:
rand_tensor_1, rand_tensor_2

(tensor([[0.3126, 0.3791, 0.3087, 0.0736],
         [0.4216, 0.0691, 0.2332, 0.4047],
         [0.2162, 0.9927, 0.4128, 0.5938]]),
 tensor([[0.6128, 0.1519, 0.0453],
         [0.5035, 0.9978, 0.3884],
         [0.6929, 0.1703, 0.1384],
         [0.4759, 0.7481, 0.0361]]))

In [26]:
rand_tensor_3, rand_tensor_4

(tensor([[0.3126, 0.3791, 0.3087],
         [0.0736, 0.4216, 0.0691],
         [0.2332, 0.4047, 0.2162],
         [0.9927, 0.4128, 0.5938]]),
 tensor([[0.6128, 0.1519, 0.0453, 0.5035],
         [0.9978, 0.3884, 0.6929, 0.1703],
         [0.1384, 0.4759, 0.7481, 0.0361]]))

In [29]:
rand_tensor_1.flatten(), rand_tensor_3.flatten()

(tensor([0.3126, 0.3791, 0.3087, 0.0736, 0.4216, 0.0691, 0.2332, 0.4047, 0.2162,
         0.9927, 0.4128, 0.5938]),
 tensor([0.3126, 0.3791, 0.3087, 0.0736, 0.4216, 0.0691, 0.2332, 0.4047, 0.2162,
         0.9927, 0.4128, 0.5938]))
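
Rather than comparing the printed values by eye, the same check can be done programmatically. This is a minimal sketch that repeats the experiment with fresh variable names (`first`, `second` are illustrative) and compares the flattened tensors with `torch.equal`:

```python
import torch

torch.manual_seed(1729)
first = torch.rand(3, 4)

torch.manual_seed(1729)   # re-seed: the generator restarts the same sequence
second = torch.rand(4, 3)

# Same seed, same stream of 12 numbers; only the shape they fill differs.
same_values = torch.equal(first.flatten(), second.flatten())
print(same_values)
```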

Most tensor operations are intuitive: arithmetic operators act element-wise. When the shapes of the two tensors do not match, PyTorch tries to broadcast them to a common shape, and raises an error if the shapes are incompatible.

In [31]:
ones_t = torch.ones(3,4)
twos_t = ones_t * 2
fours_t = twos_t**2
halfs_t = fours_t/8
ones_t, twos_t, fours_t, halfs_t

(tensor([[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]),
 tensor([[2., 2., 2., 2.],
         [2., 2., 2., 2.],
         [2., 2., 2., 2.]]),
 tensor([[4., 4., 4., 4.],
         [4., 4., 4., 4.],
         [4., 4., 4., 4.]]),
 tensor([[0.5000, 0.5000, 0.5000, 0.5000],
         [0.5000, 0.5000, 0.5000, 0.5000],
         [0.5000, 0.5000, 0.5000, 0.5000]]))

In [32]:
ones = torch.ones(3,4)
twos = ones*2
threes = ones + twos

In [33]:
ones, twos, threes

(tensor([[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]),
 tensor([[2., 2., 2., 2.],
         [2., 2., 2., 2.],
         [2., 2., 2., 2.]]),
 tensor([[3., 3., 3., 3.],
         [3., 3., 3., 3.],
         [3., 3., 3., 3.]]))

In [35]:
# Despite the name, this is the first row of `ones`: a 1-D tensor of ones with shape (4,)
twos_1d_tensor = ones[0]

In [36]:
ones.shape, twos_1d_tensor.shape

(torch.Size([3, 4]), torch.Size([4]))

In [37]:
ones + twos_1d_tensor

tensor([[2., 2., 2., 2.],
        [2., 2., 2., 2.],
        [2., 2., 2., 2.]])

In [42]:
var1 = ones[0].reshape(1, -1)
var2 = ones[0].reshape(4, -1)

In [44]:
ones.shape, var1.shape, ones + var1

(torch.Size([3, 4]),
 torch.Size([1, 4]),
 tensor([[2., 2., 2., 2.],
         [2., 2., 2., 2.],
         [2., 2., 2., 2.]]))

In [46]:
ones.shape, var2.shape

(torch.Size([3, 4]), torch.Size([4, 1]))

In [45]:
ones + var2

RuntimeError: ignored

So a tensor of shape (4,) can be broadcast and added to a tensor of shape (3, 4), and so can a tensor of shape (1, 4). A tensor of shape (4, 1), however, cannot, even though it holds the same number of elements: broadcasting is decided by shape, not by element count. The rule compares the two shapes starting from the last dimension and working backwards. For an element-wise operation such as addition, each pair of dimensions must either be equal, or one of them must be 1 (in which case that tensor is virtually repeated along the dimension), or the dimension must be missing from one tensor (which is treated the same as size 1). For (3, 4) and (4, 1), the last dimensions 4 and 1 are compatible, but the leading dimensions 3 and 4 are not, hence the error. Matrix multiplication is different: the last two dimensions must follow the (m x p), (p x n) rule, and only the leading batch dimensions are broadcast, which the `torch.matmul` cell below illustrates.
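
The rule can be tried out on shapes alone with `torch.broadcast_shapes`, without building any tensors. The sketch below (variable names are illustrative) also shows a column shape that does work against (3, 4), namely (3, 1):

```python
import torch

ones = torch.ones(3, 4)

# Apply the broadcasting rule to shapes only.
print(torch.broadcast_shapes((3, 4), (4,)))    # torch.Size([3, 4])
print(torch.broadcast_shapes((3, 4), (1, 4)))  # torch.Size([3, 4])

# (4, 1) against (3, 4): trailing 4 vs 1 is fine, but leading 3 vs 4 is not.
try:
    ones + torch.ones(4, 1)
    broadcast_failed = False
except RuntimeError:
    broadcast_failed = True

# (3, 1) against (3, 4): the single column is repeated across all 4 columns.
col = torch.arange(3.0).reshape(3, 1)   # [[0.], [1.], [2.]]
result = ones + col                     # rows of 1s, 2s and 3s
print(broadcast_failed, result.shape)
```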

In [48]:
batch_size = 100
channels = 5
rand1 = torch.rand(batch_size, channels, 4,5)
rand2 = torch.rand(5,2)
rand3 = torch.matmul(rand1, rand2)
rand4 = rand1 @ rand2

In [49]:
rand1.shape, rand2.shape, rand3.shape, rand4.shape

(torch.Size([100, 5, 4, 5]),
 torch.Size([5, 2]),
 torch.Size([100, 5, 4, 2]),
 torch.Size([100, 5, 4, 2]))
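
One way to read the shapes above: `torch.matmul` treats the leading dimensions (100, 5) as batch dimensions and multiplies every (4, 5) slice by the same (5, 2) matrix. A small sketch to check this on one arbitrarily chosen slice (the names `batch`, `weight` and the indices are illustrative):

```python
import torch

batch = torch.rand(100, 5, 4, 5)   # a 100 x 5 stack of (4, 5) matrices
weight = torch.rand(5, 2)          # a single (5, 2) matrix

out = batch @ weight               # weight is broadcast over the batch dimensions

# Multiply one slice by hand and compare it with the batched result.
single = batch[7, 3] @ weight
matches = torch.allclose(out[7, 3], single)
print(out.shape, matches)
```

`torch.allclose` is used instead of exact equality because the batched and the single product may differ by floating-point rounding.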

PyTorch provides many built-in mathematical functions worth exploring. These functions have the added benefit of supporting automatic differentiation (autograd) wherever the operation is differentiable.
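
As a minimal sketch of what autograd support means in practice, the example below differentiates `sum(sin(x)^2)` and compares the result with the analytic derivative `sin(2x)`. The function and the input values are chosen only for illustration:

```python
import torch

# requires_grad=True asks PyTorch to record operations on x.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

y = (torch.sin(x) ** 2).sum()   # scalar built from differentiable functions
y.backward()                    # fills x.grad with dy/dx

# Analytic derivative: d/dx sin(x)^2 = 2 sin(x) cos(x) = sin(2x)
expected = torch.sin(2 * x.detach())
grad_ok = torch.allclose(x.grad, expected)
print(grad_ok)
```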

Building Models

Here is an example of the LeNet model.

In [63]:
nn.Conv2d??

In [78]:
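class LeNet(nn.Module):
  def __init__(self):
    super(LeNet, self).__init__()
    # Layers with learnable parameters are created once here so their weights are registered.
    self.cnn1 = nn.Conv2d(in_channels = 1, out_channels = 6, kernel_size = 5)
    self.cnn2 = nn.Conv2d(in_channels = 6, out_channels = 16, kernel_size = 5)
    self.fc1 = nn.Linear(16*5*5, 120)  # 16 channels of 5x5 remain after the second pool (32x32 input)
    self.fc2 = nn.Linear(120, 84)
    self.fc3 = nn.Linear(84, 10)

  def forward(self, x):
    # x: (N, 1, 32, 32)
    h = F.relu(self.cnn1(x))      # (N, 6, 28, 28)
    h = F.max_pool2d(h, (2,2))    # (N, 6, 14, 14)

    h = F.relu(self.cnn2(h))      # (N, 16, 10, 10)
    h = F.max_pool2d(h, 2)        # (N, 16, 5, 5)

    h = h.reshape(h.shape[0], -1) # flatten to (N, 400)
    h = self.fc1(h)
    h = F.relu(h)

    h = self.fc2(h)
    h = F.relu(h)

    h = self.fc3(h)               # raw scores for 10 classes, no softmax applied
    return h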




Generally, layers with learnable parameters, such as convolution or linear layers, are used as layer objects (`nn.Module` subclasses). Stateless operations such as relu and max_pool2d are available both as layer objects (`nn.ReLU`, `nn.MaxPool2d`) and as plain functions in `torch.nn.functional`. The same network can therefore also be defined using only layer objects, as the `LeNetS` class further below does.
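
To make the distinction concrete, here is a small sketch (the input size and variable names are arbitrary) checking that the layer and functional forms of relu and max pooling give identical results, and that only the convolution layer carries parameters:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(2, 3, 8, 8)

relu_layer = nn.ReLU()
pool_layer = nn.MaxPool2d(kernel_size=2)

# Layer objects and their functional counterparts compute the same thing.
same_relu = torch.equal(relu_layer(x), F.relu(x))
same_pool = torch.equal(pool_layer(x), F.max_pool2d(x, 2))

# A convolution owns weights and biases; ReLU owns nothing.
n_conv_params = sum(p.numel() for p in nn.Conv2d(1, 6, 5).parameters())  # 6*1*5*5 + 6
n_relu_params = sum(p.numel() for p in relu_layer.parameters())
print(same_relu, same_pool, n_conv_params, n_relu_params)
```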

In [79]:
lenet = LeNet()

In [80]:
test_input = torch.rand(size = (128, 1, 32, 32))
out = lenet(test_input)

In [81]:
out.shape

torch.Size([128, 10])

In [82]:
out

tensor([[ 0.0593, -0.0600, -0.0168,  ...,  0.0660, -0.0722, -0.0179],
        [ 0.0577, -0.0571, -0.0183,  ...,  0.0688, -0.0727, -0.0216],
        [ 0.0564, -0.0623, -0.0171,  ...,  0.0693, -0.0772, -0.0187],
        ...,
        [ 0.0581, -0.0612, -0.0165,  ...,  0.0679, -0.0728, -0.0176],
        [ 0.0565, -0.0597, -0.0179,  ...,  0.0661, -0.0751, -0.0208],
        [ 0.0558, -0.0601, -0.0192,  ...,  0.0637, -0.0739, -0.0178]],
       grad_fn=<AddmmBackward0>)
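
The `16*5*5` input size of `fc1` follows from the 32x32 input. The sketch below traces the shape after each step, using standalone layers with the same settings as the model above (the `shapes` list is only for illustration). A 5x5 convolution without padding shrinks each side by 4, and a 2x2 max pool halves it:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)

x = torch.rand(1, 1, 32, 32)
shapes = []

h = F.relu(conv1(x))
shapes.append(tuple(h.shape))   # 32 - 5 + 1 = 28
h = F.max_pool2d(h, 2)
shapes.append(tuple(h.shape))   # 28 / 2 = 14
h = F.relu(conv2(h))
shapes.append(tuple(h.shape))   # 14 - 5 + 1 = 10
h = F.max_pool2d(h, 2)
shapes.append(tuple(h.shape))   # 10 / 2 = 5
h = h.reshape(h.shape[0], -1)
shapes.append(tuple(h.shape))   # 16 * 5 * 5 = 400

for s in shapes:
    print(s)
```

With a different input size, for example 28x28, the flattened size changes and `fc1` would need a different number of input features.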

In [85]:
class LeNetS(nn.Module):
  def __init__(self):
    super(LeNetS, self).__init__()
    self.cnn1 = nn.Conv2d(in_channels = 1, out_channels = 6, kernel_size = 5)
    self.cnn2 = nn.Conv2d(in_channels = 6, out_channels = 16, kernel_size = 5)
    self.fc1 = nn.Linear(16*5*5, 120)
    self.fc2 = nn.Linear(120, 84)
    self.fc3 = nn.Linear(84, 10)
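    # Stateless layers hold no parameters, so a single instance can be reused at several positions.
    self.pool_layer = nn.MaxPool2d(kernel_size = 2)
    self.relu = nn.ReLU()
    self.flatten = nn.Flatten()
    # Chain the layers in the order they are applied.
    self.lenet_layer = nn.Sequential(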
        self.cnn1,
        self.relu,
        self.pool_layer,
        self.cnn2,
        self.relu,
        self.pool_layer,
        self.flatten,
        self.fc1,
        self.relu,
        self.fc2,
        self.relu,
        self.fc3
    )

  def forward(self, input_image):
    return self.lenet_layer(input_image)


In [86]:
lenet = LeNetS()
input_images = torch.rand(128, 1, 32, 32)
out = lenet(input_images)

In [87]:
out.shape

torch.Size([128, 10])

In [88]:
out

tensor([[-0.0412, -0.0822, -0.0123,  ...,  0.0042, -0.0949,  0.0609],
        [-0.0401, -0.0873, -0.0146,  ...,  0.0009, -0.0951,  0.0584],
        [-0.0393, -0.0884, -0.0120,  ...,  0.0016, -0.0946,  0.0620],
        ...,
        [-0.0364, -0.0899, -0.0121,  ...,  0.0043, -0.0943,  0.0604],
        [-0.0415, -0.0849, -0.0117,  ...,  0.0055, -0.0924,  0.0539],
        [-0.0366, -0.0862, -0.0126,  ...,  0.0036, -0.0937,  0.0598]],
       grad_fn=<AddmmBackward0>)
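
In `LeNetS` each layer is reachable both as an attribute and through `lenet_layer`. A leaner variant, shown here only as a sketch and not as the notebook's own approach, creates the layers inline in `nn.Sequential`, so each one is registered once. The name `lenet_seq` is hypothetical:

```python
import torch
import torch.nn as nn

# Layers are created inline, so every weight appears once in state_dict().
lenet_seq = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * 5 * 5, 120), nn.ReLU(),
    nn.Linear(120, 84), nn.ReLU(),
    nn.Linear(84, 10),
)

out = lenet_seq(torch.rand(128, 1, 32, 32))

# Count learnable parameters: two conv layers plus three linear layers.
n_params = sum(p.numel() for p in lenet_seq.parameters())
n_state_entries = len(lenet_seq.state_dict())   # one weight and one bias per parametric layer
print(out.shape, n_params, n_state_entries)
```

The trade-off is that the layers are then addressed by position (`lenet_seq[0]`) rather than by name.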