## Tensor
-----

In PyTorch, a tensor is a multi-dimensional array that can be used for various mathematical operations. It is similar to NumPy's ndarray, but with additional capabilities that make it suitable for deep learning tasks and GPU acceleration.

A tensor can have different dimensions: 0-dimensional tensor (scalar), 1-dimensional tensor (vector), 2-dimensional tensor (matrix), and higher-dimensional tensors. Each dimension is referred to as an axis.
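
As a small sketch of these dimensions (the variable names here are illustrative, not part of the notebook), `ndim` reports the number of axes and `shape` the size along each axis:

```python
import torch

scalar = torch.tensor(3.14)              # 0-dimensional tensor
vector = torch.tensor([1.0, 2.0, 3.0])   # 1-dimensional tensor
matrix = torch.ones(2, 3)                # 2-dimensional tensor
cube = torch.zeros(2, 3, 4)              # 3-dimensional tensor

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    # ndim is the number of axes, shape lists the size of each axis
    print(name, t.ndim, tuple(t.shape))
```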

**PyTorch tensors have several important properties:**

- **Data type:** Tensors can store data of different types, such as float, double, int, long, and others. The data type is specified during tensor creation and affects the precision and memory usage of the tensor.

- **Device:** Tensors can be stored and processed on different devices, such as CPU or GPU. By default, tensors are created on the CPU, but they can be easily moved to the GPU for accelerated computations.

- **Autograd support:** PyTorch tensors have built-in support for automatic differentiation, which is crucial for training neural networks. The autograd package keeps track of operations performed on tensors and allows for automatic computation of gradients.
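
The three properties above can be inspected directly on a tensor. The following is a minimal sketch, assuming a GPU may or may not be present; the names `device` and `t` are chosen for illustration only:

```python
import torch

# Pick a GPU when one is available; otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

t = torch.ones(2, 2, dtype=torch.float64)   # explicit data type
t = t.to(device)                            # move to the chosen device
t.requires_grad_(True)                      # ask autograd to track operations on t

print(t.dtype, t.device, t.requires_grad)
```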

In [1]:
import torch

In [2]:
x = torch.empty(3) # 1-D tensor; torch.empty leaves the values uninitialized
print(x)

tensor([9.2755e-39, 1.0561e-38, 1.0653e-38])


In [3]:
x = torch.empty(2,3) #2d Tensor
print(x)

tensor([[-1.3635e-05,  6.8944e-43, -1.3635e-05],
        [ 6.8944e-43, -2.1595e-05,  6.8944e-43]])


In [4]:
x = torch.empty(2,2, 2) #3d Tensor
print(x)

tensor([[[-2.1772e-05,  6.8944e-43],
         [-2.1772e-05,  6.8944e-43]],

        [[-2.1764e-05,  6.8944e-43],
         [-2.1764e-05,  6.8944e-43]]])


In [5]:
x = torch.rand(2) # random values drawn uniformly from [0, 1)
print(x)

tensor([0.8155, 0.2555])


In [6]:
x = torch.zeros(2,2)
# a 2x2 matrix in which every value is 0
print(x)

tensor([[0., 0.],
        [0., 0.]])


In [7]:
x = torch.ones(2,2)
# a 2x2 matrix in which every value is 1
print(x)

tensor([[1., 1.],
        [1., 1.]])


In [8]:
x = torch.ones(2,2)
print(x.dtype)

torch.float32


In [9]:
x = torch.ones(2,2, dtype=torch.int) # other options include torch.double and torch.float16
print(x.dtype)

torch.int32


In [10]:
print(x.size())

torch.Size([2, 2])


In [11]:
x = torch.tensor([2.5, 0.1]) # another way to create a tensor: directly from a Python list
x

tensor([2.5000, 0.1000])

In [12]:
x = torch.rand(2,2)
y = torch.rand(2,2)

In [13]:
print(x)

tensor([[0.4818, 0.9015],
        [0.0117, 0.3570]])


In [14]:
print(y)

tensor([[0.0231, 0.3410],
        [0.6202, 0.0633]])


In [15]:
z= x+y
z

tensor([[0.5049, 1.2425],
        [0.6319, 0.4202]])

In [16]:
z = torch.add(x,y)
z

tensor([[0.5049, 1.2425],
        [0.6319, 0.4202]])

In [17]:
y.add_(x) # the trailing underscore marks an in-place operation: y itself is modified
y

tensor([[0.5049, 1.2425],
        [0.6319, 0.4202]])

In [18]:
z= x-y
z

tensor([[-0.0231, -0.3410],
        [-0.6202, -0.0633]])

In [19]:
z = torch.sub(x,y)
z

tensor([[-0.0231, -0.3410],
        [-0.6202, -0.0633]])

In [20]:
z = torch.mul(x,y)
z

tensor([[0.2432, 1.1201],
        [0.0074, 0.1500]])

In [21]:
z= x/y
z

tensor([[0.9543, 0.7256],
        [0.0186, 0.8495]])

In [22]:
#Slicing
x= torch.rand(5,3)
x

tensor([[0.0567, 0.2834, 0.6829],
        [0.4570, 0.8197, 0.6981],
        [0.5545, 0.1712, 0.5145],
        [0.9089, 0.6889, 0.5640],
        [0.4665, 0.1205, 0.0314]])

In [23]:
print(x[:,0]) # all rows, first column

tensor([0.0567, 0.4570, 0.5545, 0.9089, 0.4665])


In [24]:
print(x[1,0]) # element at row 1, column 0

tensor(0.4570)


In [25]:
print(x[1,1].item()) # .item() returns the Python number held by a one-element tensor

0.8197047114372253


In [26]:
#Reshaping Tensor
x=torch.rand(4,4)
x

tensor([[0.0478, 0.8703, 0.4500, 0.4190],
        [0.2832, 0.3381, 0.1201, 0.5882],
        [0.0512, 0.9625, 0.3585, 0.0110],
        [0.5243, 0.5050, 0.7169, 0.1568]])

In [27]:
y = x.view(16)
y

tensor([0.0478, 0.8703, 0.4500, 0.4190, 0.2832, 0.3381, 0.1201, 0.5882, 0.0512,
        0.9625, 0.3585, 0.0110, 0.5243, 0.5050, 0.7169, 0.1568])

In [28]:
y = x.view(-1,8) # -1 lets PyTorch infer that dimension (here 2)
y

tensor([[0.0478, 0.8703, 0.4500, 0.4190, 0.2832, 0.3381, 0.1201, 0.5882],
        [0.0512, 0.9625, 0.3585, 0.0110, 0.5243, 0.5050, 0.7169, 0.1568]])

In [29]:
y.size()

torch.Size([2, 8])
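
A point worth illustrating about `view`: the result shares storage with the original tensor, and it only works when the requested shape is compatible with the tensor's memory layout. The sketch below uses `torch.arange` instead of random values so the numbers are predictable; `reshape` is shown as the fallback that copies when it has to.

```python
import torch

x = torch.arange(16.0).view(4, 4)

flat = x.view(-1)        # -1 is inferred as 16
pairs = x.view(-1, 8)    # -1 is inferred as 2

# A view shares storage with the original tensor,
# so writing through the view is visible in x.
flat[0] = 100.0
print(x[0, 0])

# After a transpose the memory is no longer contiguous, so xt.view(16)
# would raise a RuntimeError; reshape() copies the data when needed.
xt = x.t()
reshaped = xt.reshape(16)
print(reshaped.shape)
```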

In [30]:
#Converting a Torch tensor to a NumPy array
import numpy as np

In [31]:
a = torch.ones(5) # a is a Tensor
a

tensor([1., 1., 1., 1., 1.])

In [32]:
b = a.numpy() #Here b is a numpy array
b

array([1., 1., 1., 1., 1.], dtype=float32)

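***On the CPU, the tensor and the NumPy array share the same underlying memory, so an in-place change to one also changes the other. A tensor on the GPU cannot be converted directly: it must first be copied to the CPU (for example with .cpu()), and that copy has its own memory. Keep this in mind whenever you update the tensor in place.***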

In [33]:
a.add_(1)
print(a)
print(b)

tensor([2., 2., 2., 2., 2.])
[2. 2. 2. 2. 2.]


***As the output shows, we only added 1 to the tensor a, yet both a and b changed. b is not an independent copy of a: because the tensor lives on the CPU, both objects refer to the same memory location.***

In [34]:
a

tensor([2., 2., 2., 2., 2.])

In [35]:
b

array([2., 2., 2., 2., 2.], dtype=float32)
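
The conversion in the other direction behaves the same way. As a minimal sketch (the names `arr`, `shared`, and `copied` are illustrative), `torch.from_numpy` shares memory with the array, while `torch.tensor` copies the data and gives an independent tensor:

```python
import numpy as np
import torch

arr = np.ones(3, dtype=np.float32)

shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # copies the data

arr += 1                         # in-place update of the NumPy array

print(shared)   # reflects the update
print(copied)   # unchanged
```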

In [36]:
#requires_grad=True tells autograd to track operations on x so gradients can be computed
x = torch.ones(5, requires_grad=True)
x

tensor([1., 1., 1., 1., 1.], requires_grad=True)

## Torch AutoGrad
-----
In PyTorch, the torch.autograd module provides automatic differentiation capabilities, which is a fundamental component of deep learning frameworks. Automatic differentiation allows you to compute gradients of tensors with respect to other tensors, enabling efficient and convenient implementation of gradient-based optimization algorithms, such as backpropagation.

Before PyTorch 0.4.0, a tensor had to be wrapped in a torch.autograd.Variable to take part in automatic differentiation. Since version 0.4.0, Variable and Tensor are merged: any tensor created with requires_grad=True tracks the operations applied to it. When you perform operations on such tensors, PyTorch builds a computational graph, which is a record of the operations applied to them.

The computational graph is used to perform backpropagation and compute gradients. Gradients are essentially derivatives, and they represent the rate of change of a tensor with respect to other tensors in the graph. The gradients are computed using the chain rule of calculus, which allows efficient calculation of gradients for complex computational graphs.

By utilizing torch.autograd, you can calculate gradients automatically without explicitly deriving and implementing them yourself. This feature greatly simplifies the process of training neural networks, as you can focus on designing the network architecture and defining the forward pass, while PyTorch computes the gradients during the backward pass and an optimizer uses them to update the network's parameters.
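
To make the chain rule concrete, here is a small sketch that compares the gradient autograd produces with one derived by hand. The function y = (3x + 1)^2 and the names `u` and `manual` are chosen purely for illustration:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)

# Forward pass: y = (3x + 1)^2, built from two elementary steps.
u = 3 * x + 1
y = u ** 2

# Backward pass: autograd walks the recorded graph from y back to x.
y.backward()

# Chain rule by hand: dy/dx = dy/du * du/dx = 2u * 3
manual = 2 * u.item() * 3

print(x.grad.item(), manual)
```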

In [37]:
import torch
import torchvision
import torch.nn as nn
import numpy as np
import torchvision.transforms as transforms

import warnings as wr
wr.filterwarnings('ignore')

In [38]:
x = torch.tensor(1.,requires_grad = True)
w = torch.tensor(2.,requires_grad = True)
b = torch.tensor(3.,requires_grad = True)

**Build a computational graph:**
y = w * x + b; with w = 2 and b = 3 this is y = 2 * x + 3

In [39]:
#Compute gradients
y = w * x + b
y.backward()

In [40]:
print(x.grad) #x.grad = dy/dx = w = 2
print(w.grad) #w.grad = dy/dw = x = 1
print(b.grad) #b.grad = dy/db = 1

tensor(2.)
tensor(1.)
tensor(1.)
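
These values can be checked by hand. With $x = 1$, $w = 2$, $b = 3$ and $y = wx + b$, the partial derivatives are:

$$
\frac{\partial y}{\partial x} = w = 2, \qquad
\frac{\partial y}{\partial w} = x = 1, \qquad
\frac{\partial y}{\partial b} = 1
$$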


In [41]:
# Build a 2 layer computational graph
x = torch.tensor(1., requires_grad = True)
w1 = torch.tensor(2., requires_grad = True)
b1 = torch.tensor(3., requires_grad = True)

w2 = torch.tensor(4., requires_grad = True)
b2 = torch.tensor(5., requires_grad = True)

In [42]:
layer_1 = w1 * x + b1 # the first layer uses its own bias b1
layer_2 = w2 * layer_1 + b2 # the second layer uses its own bias b2

layer_2.backward()
print(x.grad, w1.grad, w2.grad, b1.grad, b2.grad)

tensor(8.) tensor(4.) tensor(5.) tensor(4.) tensor(1.)


#### Train a Neural Network

In [43]:
#Create tensors of shape (10,5) and (10,2)
feat_dim = 5 
x = torch.randn(10, feat_dim)
y = torch.randn(10,2)

print('x.shape: {}'.format(x.shape))
print('y: {}, y.shape: {}'.format(y, y.shape))

x.shape: torch.Size([10, 5])
y: tensor([[ 0.4620,  1.3333],
        [ 0.2411,  0.4739],
        [ 0.3680,  1.1487],
        [ 0.1500,  0.4983],
        [-0.2761,  0.5017],
        [-0.2714, -0.3496],
        [-0.3796, -0.0810],
        [-1.3155, -0.1073],
        [-1.3341,  0.6571],
        [-1.3604,  1.4133]]), y.shape: torch.Size([10, 2])


In [44]:
# Build a fully connected layer

linear = nn.Linear(feat_dim, 2)
print('w: {} w.shape: {}'.format(linear.weight, linear.weight.shape))
print('b: {} b.shape: {}'.format(linear.bias, linear.bias.shape))

w: Parameter containing:
tensor([[ 0.3820,  0.1718, -0.1179,  0.1841, -0.2726],
        [-0.1399,  0.1207, -0.3922, -0.0857, -0.3800]], requires_grad=True) w.shape: torch.Size([2, 5])
b: Parameter containing:
tensor([ 0.2675, -0.1812], requires_grad=True) b.shape: torch.Size([2])


In [45]:
# Build loss function and optimizer
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(linear.parameters(), lr = 0.01)

In [46]:
# Forward Pass
pred = linear(x)
print('pred.shape: {}'.format(pred.shape))

pred.shape: torch.Size([10, 2])


In [47]:
# Compute Loss:
loss = criterion(pred, y)
print('loss:', loss.item())

loss: 1.40336012840271


In [48]:
# Backward pass
loss.backward()

In [49]:
# Print out the gradients
print('dL/dw: ', linear.weight.grad)
print('dL/db: ', linear.bias.grad)

dL/dw:  tensor([[ 0.3819,  0.3409, -0.1182,  0.3733, -0.3276],
        [-0.4214,  0.0787, -0.6373, -0.4869, -0.6445]])
dL/db:  tensor([ 0.9001, -0.7659])


In [50]:
# 1-step gradient descent
optimizer.step()

In [51]:
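# Performing gradient descent at the low level
# w = w - alpha * (dL/dw) # update the weights
# b = b - alpha * (dL/db) # update the bias
# Note: optimizer.step() above has already applied this update once,
# so running these lines as well applies the same gradients a second time.
linear.weight.data = linear.weight.data - (0.01*linear.weight.grad.data)
linear.bias.data = linear.bias.data - (0.01*linear.bias.grad.data)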
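
Plain SGD, with no momentum or weight decay, performs exactly the update written in the comments above. The following sketch checks that claim on a fresh layer; `layer`, `expected_w`, and `expected_b` are illustrative names, and the seed is fixed only to make the run repeatable:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lr = 0.01

layer = nn.Linear(5, 2)
x = torch.randn(10, 5)
y = torch.randn(10, 2)

loss = nn.MSELoss()(layer(x), y)
loss.backward()

# Result of one gradient descent step, computed by hand from the gradients.
expected_w = layer.weight.detach() - lr * layer.weight.grad
expected_b = layer.bias.detach() - lr * layer.bias.grad

# Let the optimizer apply the step to the actual parameters.
optimizer = torch.optim.SGD(layer.parameters(), lr=lr)
optimizer.step()

print(torch.allclose(layer.weight, expected_w))
print(torch.allclose(layer.bias, expected_b))
```

In practice only one of the two forms is used per step; the optimizer is the usual choice because it generalizes to other update rules.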

In [52]:
# Print the loss after the gradient descent step
pred = linear(x) # recompute the predictions with the updated parameters
loss = criterion(pred, y)
print('Loss after 1-step optimization: ', loss.item())
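
The individual steps above are normally repeated in a loop. Below is a minimal sketch of such a loop under the same setup (random data, one linear layer, MSE loss, SGD with a learning rate of 0.01). The step count of 100, the fixed seed, and the names `model` and `losses` are assumptions made for illustration. Note the call to `zero_grad()`: gradients accumulate across `backward()` calls, so they are cleared before each new backward pass.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

x = torch.randn(10, 5)
y = torch.randn(10, 2)

model = nn.Linear(5, 2)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

losses = []
for step in range(100):
    optimizer.zero_grad()          # clear gradients left over from the previous step
    pred = model(x)                # forward pass with the current parameters
    loss = criterion(pred, y)      # compare predictions with targets
    loss.backward()                # compute fresh gradients
    optimizer.step()               # update weights and bias
    losses.append(loss.item())

print('first loss: {:.4f}, last loss: {:.4f}'.format(losses[0], losses[-1]))
```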
