<a href="https://colab.research.google.com/github/vinodkumarreddy/Pytorch-learning/blob/main/Pytorch_Tensors.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

The first section covers how to create a tensor; we already know the corresponding functions. He then discusses random seeds: setting a seed makes the sequence of generated random numbers deterministic. After that he moves on to shapes and to functions such as torch.ones_like, which create a tensor with the same shape as an existing one.
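To make the point about seeds concrete, here is a minimal sketch. The seed value 42 and the variable names are arbitrary choices for illustration, not something taken from the tutorial.

```python
import torch

# Re-seeding the global generator replays the same random sequence.
torch.manual_seed(42)
first = torch.rand(size=(3, 2))

torch.manual_seed(42)
second = torch.rand(size=(3, 2))

# Without re-seeding, the generator simply continues the sequence.
third = torch.rand(size=(3, 2))

print(torch.equal(first, second))  # True
print(torch.equal(second, third))  # False
```

Seeding once at the top of a notebook is usually enough to make a full run reproducible on the same machine and PyTorch version.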

In [None]:
import torch
import torchvision

In [None]:
a_t = torch.rand(size = (3,2))

In [None]:
b_t = torch.ones_like(a_t)

In [None]:
a_t.shape, b_t.shape

(torch.Size([3, 2]), torch.Size([3, 2]))

He then moves on to converting standard Python data structures, such as lists of lists, into tensors using the torch.tensor function.

Next come data types and type conversions. By default, floating-point tensors are created as float32. We can choose a different dtype by passing it when creating the tensor, or we can cast an existing tensor with the .to method.

In [None]:
a = torch.ones((3,2))

In [None]:
a.dtype

torch.float32

In [None]:
a.to(torch.int32).dtype

torch.int32
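The other route mentioned above, passing the dtype at creation time, might look like the sketch below. It assumes the default dtype has not been changed from float32; the variable names are only for illustration.

```python
import torch

# Option 1: ask for the dtype when the tensor is created.
a = torch.ones((3, 2), dtype=torch.int16)

# Option 2: cast an existing tensor. .to returns a new tensor and
# leaves the original untouched.
b = torch.ones((3, 2))
c = b.to(torch.float64)

print(a.dtype, b.dtype, c.dtype)  # torch.int16 torch.float32 torch.float64
```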

In [None]:
a = [[1,1], [2,1]]

In [None]:
a

[[1, 1], [2, 1]]

In [None]:
a_t = torch.tensor(a)

In [None]:
a_t.dtype

torch.int64

One case where the dtype is inferred from the data, rather than defaulting to float32, is when we convert an existing Python object to a tensor: integer data becomes int64, while data containing floats becomes float32.

In [None]:
a = [[1,1.1], [2.1,1.2]]

In [None]:
torch.tensor(a).dtype

torch.float32
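A few more cases of the inference rule, as a small sketch (again assuming the default floating-point dtype is still float32). An explicit dtype argument overrides the inference.

```python
import torch

ints = torch.tensor([1, 2, 3])            # all Python ints
mixed = torch.tensor([1, 2.5])            # one float promotes the whole tensor
bools = torch.tensor([True, False])       # Python bools
forced = torch.tensor([1, 2, 3], dtype=torch.float32)  # explicit dtype wins

print(ints.dtype, mixed.dtype, bools.dtype, forced.dtype)
# torch.int64 torch.float32 torch.bool torch.float32
```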

Generic math operations on tensors

In [None]:
ones = torch.ones((3,4))
twos = ones*2
threes = twos + ones
fours = twos**2
sqrts = twos**0.5

In [None]:
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [None]:
twos

tensor([[2., 2., 2., 2.],
        [2., 2., 2., 2.],
        [2., 2., 2., 2.]])

In [None]:
threes

tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]])

In [None]:
fours

tensor([[4., 4., 4., 4.],
        [4., 4., 4., 4.],
        [4., 4., 4., 4.]])

In [None]:
sqrts

tensor([[1.4142, 1.4142, 1.4142, 1.4142],
        [1.4142, 1.4142, 1.4142, 1.4142],
        [1.4142, 1.4142, 1.4142, 1.4142]])

Generally, any operation with a scalar is broadcast across the whole tensor. Similarly, arithmetic operations between two tensors are applied element-wise, and broadcasting applies in that case as well when the shapes are compatible.

In [None]:
# Double broadcasting test
a = torch.ones(size = (3,1))
b = torch.rand((1,2))
a, b

(tensor([[1.],
         [1.],
         [1.]]),
 tensor([[0.0298, 0.9041]]))

In [None]:
a*b

tensor([[0.0298, 0.9041],
        [0.0298, 0.9041],
        [0.0298, 0.9041]])
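The rule behind the result above can be checked without doing any arithmetic. The sketch below uses torch.broadcast_shapes to compute the broadcast shape, and also shows a pair of shapes that is not compatible; the flag name incompatible_failed is just for illustration.

```python
import torch

a = torch.ones(size=(3, 1))
b = torch.rand((1, 2))

# Shapes are compared from the trailing dimension backwards; each pair of
# sizes must be equal, or one of them must be 1.
print(torch.broadcast_shapes(a.shape, b.shape))  # torch.Size([3, 2])
print((a * b).shape)                             # torch.Size([3, 2])

# (3, 2) and (3,) are not compatible: the trailing sizes are 2 and 3.
try:
    torch.ones(3, 2) + torch.ones(3)
    incompatible_failed = False
except RuntimeError:
    incompatible_failed = True
print(incompatible_failed)  # True
```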

PyTorch also supports many other math operations. I still need to check whether every operation supports gradient calculation.
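One way to start on that check is to ask autograd for the gradient of a few operations directly. This is only a sketch: grad_of is a helper made up for this note, and the three operations are a small sample, not a complete survey.

```python
import torch

x = torch.tensor([-1.5, 0.5, 3.0], requires_grad=True)

def grad_of(fn):
    """Return d(sum(fn(x)))/dx for the tensor x defined above."""
    (grad,) = torch.autograd.grad(fn(x).sum(), x)
    return grad

print(grad_of(torch.abs))                        # tensor([-1.,  1.,  1.])
print(grad_of(lambda t: torch.clamp(t, -2, 2)))  # tensor([1., 1., 0.])
print(grad_of(torch.floor))                      # tensor([0., 0., 0.])
```

So abs and clamp pass useful gradients, while floor is accepted by autograd but its gradient is zero everywhere, which means nothing upstream of it would learn.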

In [1]:
import torch

Testing out various functions available in pytorch

In [2]:
a_t = torch.rand(size = (2,2)) * 10 - 5

In [5]:
a_t

tensor([[-1.3697,  4.4144],
        [-4.0055, -2.3486]])

In [3]:
torch.abs(a_t)

tensor([[1.3697, 4.4144],
        [4.0055, 2.3486]])

In [6]:
torch.ceil(a_t)

tensor([[-1.,  5.],
        [-4., -2.]])

In [7]:
torch.floor(a_t)

tensor([[-2.,  4.],
        [-5., -3.]])

In [8]:
torch.clamp(a_t, -2, 2)

tensor([[-1.3697,  2.0000],
        [-2.0000, -2.0000]])

In [9]:
import math

In [10]:
math.pi

3.141592653589793

In [11]:
angle_t = torch.tensor([math.pi/4, math.pi/3, math.pi/2, math.pi])

In [12]:
torch.sin(angle_t)

tensor([ 7.0711e-01,  8.6603e-01,  1.0000e+00, -8.7423e-08])

In [14]:
torch.cos(angle_t)

tensor([ 7.0711e-01,  5.0000e-01, -4.3711e-08, -1.0000e+00])

In [15]:
torch.asin(torch.sin(angle_t))

tensor([ 7.8540e-01,  1.0472e+00,  1.5708e+00, -8.7423e-08])

In [16]:
angle_t

tensor([0.7854, 1.0472, 1.5708, 3.1416])

In [17]:
d = torch.tensor([[1., 2.], [3., 4.]])
e = torch.ones(1, 2)  # many comparison ops support broadcasting!
print(torch.eq(d, e))

tensor([[ True, False],
        [False, False]])


In [18]:
d == e

tensor([[ True, False],
        [False, False]])

In [19]:
d >= e

tensor([[True, True],
        [True, True]])

In [20]:
torch.max??

In [21]:
a_t = torch.rand(size = (2,3,2))

In [22]:
a_t

tensor([[[0.0660, 0.9148],
         [0.6369, 0.0686],
         [0.9092, 0.6599]],

        [[0.0113, 0.7694],
         [0.4297, 0.9270],
         [0.5566, 0.8728]]])

In [24]:
torch.max(a_t)

tensor(0.9270)

In [25]:
torch.max(input = a_t, dim = 0)

torch.return_types.max(
values=tensor([[0.0660, 0.9148],
        [0.6369, 0.9270],
        [0.9092, 0.8728]]),
indices=tensor([[0, 0],
        [0, 1],
        [0, 1]]))

In [26]:
torch.max(input = a_t, dim = 1)

torch.return_types.max(
values=tensor([[0.9092, 0.9148],
        [0.5566, 0.9270]]),
indices=tensor([[2, 0],
        [2, 1]]))

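A typical use of max along a dimension is converting predicted values into the corresponding class: the indices returned along the class dimension are the predicted classes.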

In [27]:
outputs = torch.rand(size = (100, 20))

In [29]:
output_class_probs, output_classes = torch.max(outputs, dim = 1)

In [31]:
output_classes.shape

torch.Size([100])
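If only the class indices are needed, torch.argmax gives the same result as the indices from torch.max. A small sketch, using random scores as a stand-in for real model outputs (they are not actual probabilities):

```python
import torch

torch.manual_seed(0)
outputs = torch.rand(size=(100, 20))  # stand-in scores: 100 samples, 20 classes

# torch.max along dim=1 returns (values, indices); the indices are the
# predicted classes.
scores, classes = torch.max(outputs, dim=1)

# torch.argmax returns only the indices.
classes_argmax = torch.argmax(outputs, dim=1)

print(classes.shape)                         # torch.Size([100])
print(torch.equal(classes, classes_argmax))  # True
```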

In [32]:
torch.mean(a_t, dim = 0).shape

torch.Size([3, 2])

In [33]:
torch.mean??

In [34]:
torch.std(a_t)

tensor(0.3494)

In [35]:
torch.prod(a_t)

tensor(2.6755e-06)

Assigning a tensor to another variable does not copy the tensor; it only copies the reference. To copy a tensor we need to call the clone method explicitly. Clone also preserves the autograd settings of the original: if the original has requires_grad turned on, the clone stays in the computation graph, and gradients are propagated back to the original during backward(). There is also the detach method, which returns the tensor values without attaching them to the computation graph. Does it create a copy?

In [38]:
a = torch.rand(size = (3,4), requires_grad = True)

In [39]:
a

tensor([[0.3308, 0.2392, 0.7425, 0.6823],
        [0.3806, 0.0097, 0.3490, 0.1767],
        [0.9397, 0.1434, 0.4187, 0.0665]], requires_grad=True)

In [40]:
b = a.detach()

In [42]:
assert b is not a

In [44]:
id(a)

139525899083328

In [45]:
id(b)

139525901405856

In [47]:
b[0][1] = 1

In [48]:
a

tensor([[0.3308, 1.0000, 0.7425, 0.6823],
        [0.3806, 0.0097, 0.3490, 0.1767],
        [0.9397, 0.1434, 0.4187, 0.0665]], requires_grad=True)

It seems detach does not copy the contents: the new tensor shares memory with the original, so an in-place change to one shows up in the other. To get a tensor with no link to the original, it is better to detach first and then clone.
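The difference can be checked by comparing memory addresses with data_ptr. A minimal sketch; the names view, copy and tracked are just for illustration.

```python
import torch

a = torch.rand(size=(3, 4), requires_grad=True)

view = a.detach()          # shares storage with a, no autograd history
copy = a.detach().clone()  # independent storage, no autograd history
tracked = a.clone()        # independent storage, still part of the graph

print(view.data_ptr() == a.data_ptr())  # True: same memory
print(copy.data_ptr() == a.data_ptr())  # False: separate memory

copy[0, 1] = 1.0  # modifies only the copy, a is unchanged
print(view.requires_grad, copy.requires_grad, tracked.requires_grad)
# False False True
```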

In [49]:
# PyTorch names the GPU device "cuda"; "gpu" is not a valid device string
"cuda" if torch.cuda.is_available() else "cpu"

'cpu'
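The string is normally passed on to torch.device and then used when creating or moving tensors. A short sketch of that pattern, which runs on both CPU-only and GPU machines:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.ones((3, 2))
x = x.to(device)                       # effectively a no-op on a CPU-only machine
y = torch.rand((3, 2), device=device)  # or create the tensor on the device directly

# Both operands must live on the same device for this to work.
print((x + y).device)
```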

Shape editing for tensors

In [51]:
a = torch.rand(size = (3, 32, 32))

In [52]:
a.shape

torch.Size([3, 32, 32])

In [54]:
a.unsqueeze(0).shape

torch.Size([1, 3, 32, 32])

In [55]:
a.unsqueeze(1).shape

torch.Size([3, 1, 32, 32])

In [56]:
a.unsqueeze(2).shape

torch.Size([3, 32, 1, 32])

In [57]:
a = torch.rand(size = (1, 20))

In [58]:
a.shape

torch.Size([1, 20])

In [59]:
a.squeeze().shape

torch.Size([20])
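squeeze also accepts a dimension, which is safer than squeezing everything when a size-1 dimension should survive. A small sketch, including the common round trip of adding and removing a batch dimension (img and batch are illustrative names):

```python
import torch

a = torch.rand(size=(1, 20))

print(a.squeeze(0).shape)  # torch.Size([20])
print(a.squeeze(1).shape)  # torch.Size([1, 20]); dim 1 has size 20, so nothing changes

# Add a batch dimension to a single image, then drop it again.
img = torch.rand(size=(3, 32, 32))
batch = img.unsqueeze(0)
print(batch.shape)             # torch.Size([1, 3, 32, 32])
print(batch.squeeze(0).shape)  # torch.Size([3, 32, 32])
```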

In [60]:
# Creating a linear regression model using whatever we have learnt till now.

### Creating the synthetic data

In [108]:
import torch

In [109]:
W = [1, 2, 3]
b = [4]

In [110]:
W_t = torch.tensor(W, dtype = torch.float32)
b_t = torch.tensor(b, dtype = torch.float32)

In [111]:
W_t.shape, b_t.shape

(torch.Size([3]), torch.Size([1]))

In [112]:
W_t = W_t.reshape(3, 1)

In [113]:
X_t = torch.rand(size = (100, 3)) * 10 - 5

In [114]:
X_t.dtype

torch.float32

In [115]:
coeff = torch.randint(low = 0, high = 100, size = (100,3))

In [116]:
X_t = X_t * coeff

In [117]:
mean = 0
std = 0.5

In [118]:
ones = torch.ones(size = (100, 1))

In [119]:
errors = torch.normal(ones*mean, ones*std)

In [120]:
y_t = X_t @ W_t + b_t + errors

In [121]:
y_t.shape

torch.Size([100, 1])

In [122]:
X_t.shape, y_t.shape

(torch.Size([100, 3]), torch.Size([100, 1]))

In [123]:
mean = X_t.mean(dim = 0)
std = torch.std(X_t, dim = 0)

In [124]:
X_t.shape, mean.shape, std.shape

(torch.Size([100, 3]), torch.Size([3]), torch.Size([3]))

In [125]:
X_t_norm = (X_t - mean)/std

In [140]:
pW = torch.rand(size = (3,1), requires_grad = True)
pb = torch.rand(size = (1,), requires_grad = True)

In [127]:
y_pred = X_t @ pW + pb
loss = torch.sum((y_t - y_pred)**2)/y_t.shape[0]
loss.backward()

In [128]:
pW.grad

tensor([[ -50654.1094],
        [ -82741.4141],
        [-128655.4375]])

In [130]:
pW.grad.zero_()

tensor([[0.],
        [0.],
        [0.]])

In [132]:
pW.grad

tensor([[0.],
        [0.],
        [0.]])

In [144]:
epochs = 1000
lr = 0.01
for epoch in range(epochs):
  y_pred = X_t_norm @ pW + pb
  loss = torch.sum((y_t - y_pred)**2)/y_t.shape[0]
  loss.backward()
  with torch.no_grad():
    pW -= lr*pW.grad
    pb -= lr*pb.grad
  print(f"Epoch - {epoch + 1}, loss - {loss.item()}")
  # print(f"Gradient of pW - {pW.grad}")
  # print(f"Gradient of pb - {pb.grad}")
  # print(f"pW - {pW}")
  # print(f"pb - {pb}")
  # print("*"*100)
  # print("*"*100)
  pW.grad.zero_()
  pb.grad.zero_()

Epoch - 1, loss - 116796.8984375
Epoch - 2, loss - 111769.3125
Epoch - 3, loss - 106958.5625
Epoch - 4, loss - 102355.296875
Epoch - 5, loss - 97950.546875
Epoch - 6, loss - 93735.75
Epoch - 7, loss - 89702.65625
Epoch - 8, loss - 85843.4765625
Epoch - 9, loss - 82150.65625
Epoch - 10, loss - 78617.03125
Epoch - 11, loss - 75235.6953125
Epoch - 12, loss - 72000.09375
Epoch - 13, loss - 68903.9375
Epoch - 14, loss - 65941.203125
Epoch - 15, loss - 63106.11328125
Epoch - 16, loss - 60393.171875
Epoch - 17, loss - 57797.1015625
Epoch - 18, loss - 55312.859375
Epoch - 19, loss - 52935.60546875
Epoch - 20, loss - 50660.73828125
Epoch - 21, loss - 48483.8515625
Epoch - 22, loss - 46400.69921875
Epoch - 23, loss - 44407.23046875
Epoch - 24, loss - 42499.578125
Epoch - 25, loss - 40674.0625
Epoch - 26, loss - 38927.11328125
Epoch - 27, loss - 37255.36328125
Epoch - 28, loss - 35655.55078125
Epoch - 29, loss - 34124.578125
Epoch - 30, loss - 32659.494140625
Epoch - 31, loss - 31257.4296875
...

In [145]:
pW

tensor([[176.0548],
        [352.5266],
        [497.5968]], requires_grad=True)

In [147]:
pb

tensor([31.6278], requires_grad=True)

In [149]:
W_t

tensor([[1.],
        [2.],
        [3.]])

In [151]:
W_t * std.reshape(-1, 1)

tensor([[176.0001],
        [352.5093],
        [497.6636]])
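The match between pW and W_t * std follows from the normalization. Since X_norm = (X - mean) / std, we have X @ W + b = X_norm @ (std * W) + (mean @ W + b), so the model trained on normalized inputs learns std * W as its weights and mean @ W + b as its bias. The sketch below reruns the experiment end to end with a fixed seed and undoes the normalization to recover the original parameters. The names W_hat and b_hat, the seed, and the zero initialization are choices made for this sketch.

```python
import torch

torch.manual_seed(0)

# Synthetic data, following the same recipe as the cells above.
W_true = torch.tensor([[1.0], [2.0], [3.0]])
b_true = torch.tensor([4.0])
X = (torch.rand(size=(100, 3)) * 10 - 5) * torch.randint(low=0, high=100, size=(100, 3))
y = X @ W_true + b_true + 0.5 * torch.randn(100, 1)

# Normalize each feature column.
mean = X.mean(dim=0)
std = X.std(dim=0)
X_norm = (X - mean) / std

pW = torch.zeros(size=(3, 1), requires_grad=True)
pb = torch.zeros(size=(1,), requires_grad=True)

# Plain gradient descent on the mean squared error.
lr = 0.01
for _ in range(1000):
    loss = torch.mean((y - (X_norm @ pW + pb)) ** 2)
    loss.backward()
    with torch.no_grad():
        pW -= lr * pW.grad
        pb -= lr * pb.grad
        pW.grad.zero_()
        pb.grad.zero_()

# Undo the normalization:
#   X_norm @ pW + pb = X @ (pW / std) + (pb - mean @ (pW / std))
with torch.no_grad():
    W_hat = pW / std.reshape(-1, 1)
    b_hat = pb - mean @ W_hat

print(W_hat.flatten())  # should be close to [1, 2, 3]
print(b_hat)            # should be close to [4], up to noise
```

The recovered bias is noisier than the weights because it absorbs both the noise in the targets and any small error in the weights multiplied by the feature means.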