## Tensors in PyTorch

Tensors are a specialized data structure that is very similar to arrays and matrices. In PyTorch, we use tensors to encode the inputs and outputs of a model, as well as the model's parameters.

In [1]:
import torch
import numpy as np

### Tensor Initialization

Directly from data

Tensors can be created directly from data. The data type is inferred automatically.

In [2]:
data = [[1, 2], [3, 4]]
x_data = torch.tensor(data)
print(x_data)

tensor([[1, 2],
        [3, 4]])
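As a small illustration of how the data type is chosen when a tensor is built from Python data, the sketch below assumes PyTorch's default settings (the default floating-point type has not been changed). The variable names `ints`, `floats`, and `forced` are made up for this example.

```python
import torch

# Integer data is stored as 64-bit integers by default.
ints = torch.tensor([[1, 2], [3, 4]])

# Floating-point data is stored as 32-bit floats by default.
floats = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

# The dtype argument overrides the inferred type.
forced = torch.tensor([[1, 2], [3, 4]], dtype=torch.float64)

print(ints.dtype, floats.dtype, forced.dtype)
```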


From a NumPy array

In [6]:
np_array = np.array(data)
x_np = torch.from_numpy(np_array)
x_np

tensor([[1, 2],
        [3, 4]])

From another tensor

In [7]:
x_ones = torch.ones_like(x_data)  # retains the properties (shape, dtype) of x_data
print(f"Ones tensor:\n {x_ones} \n")

x_rand = torch.rand_like(x_data, dtype=torch.float)  # overrides the dtype of x_data
print(f"Random Tensor: \n {x_rand} \n")

Ones tensor:
 tensor([[1, 1],
        [1, 1]]) 

Random Tensor: 
 tensor([[0.3176, 0.1556],
        [0.7995, 0.6487]]) 
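To check which properties the `*_like` functions carry over, the following sketch rebuilds `x_data` and inspects the results. It is only a quick verification, not part of the original notebook.

```python
import torch

x_data = torch.tensor([[1, 2], [3, 4]])

# ones_like keeps both the shape and the integer dtype of x_data.
x_ones = torch.ones_like(x_data)

# rand_like keeps the shape, but the dtype is overridden to float32 here.
x_rand = torch.rand_like(x_data, dtype=torch.float)

print(x_ones.shape, x_ones.dtype)
print(x_rand.shape, x_rand.dtype)
```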



With random or constant values


In [8]:
shape = (2, 3,)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor}")

Random Tensor: 
 tensor([[0.0334, 0.2433, 0.8228],
        [0.6627, 0.2783, 0.7356]]) 

Ones Tensor: 
 tensor([[1., 1., 1.],
        [1., 1., 1.]]) 

Zeros Tensor: 
 tensor([[0., 0., 0.],
        [0., 0., 0.]])


### Tensor Attributes

In [9]:
tensor = torch.rand(3,4)

print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")


Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu
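Beyond `shape`, `dtype`, and `device`, a few related accessors are often handy. The sketch below is an optional aside and uses the same 3x4 shape as the cell above.

```python
import torch

tensor = torch.rand(3, 4)

n_dims = tensor.ndim         # number of dimensions
n_elements = tensor.numel()  # total number of elements
n_rows = tensor.size(0)      # length of the first dimension

print(n_dims, n_elements, n_rows)
```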


### Tensor Operations

In [10]:
# We can move our tensor to the GPU if available

if torch.cuda.is_available():
    tensor = tensor.to('cuda')
    print(f"Device tensor is stored on: {tensor.device}")
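A common way to write this so that the same code runs with or without a GPU is to pick a device once and reuse it. The sketch below is one possible pattern; the names `device` and `created` are chosen for this example.

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tensor = torch.rand(3, 4)
tensor = tensor.to(device)  # .to() returns a tensor on the target device

# Tensors can also be created on the target device directly.
created = torch.ones(3, 4, device=device)

print(tensor.device, created.device)
```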

Standard NumPy-like indexing and slicing

In [11]:
tensor = torch.ones(4,4)
tensor[:,1] = 0
print(tensor)

tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
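A few more indexing patterns, sketched on a tensor with distinct values so that each selection is easy to read. The tensor built with `torch.arange` is an assumption for illustration and does not appear in the cell above.

```python
import torch

# A 4x4 tensor holding 0..15, row by row.
tensor = torch.arange(16).reshape(4, 4)

first_row = tensor[0]        # row 0
first_col = tensor[:, 0]     # column 0
last_col = tensor[..., -1]   # last column
block = tensor[1:3, 1:3]     # 2x2 block in the middle

print(first_row)
print(first_col)
print(last_col)
print(block)
```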


### Joining tensors
You can use `torch.cat` to concatenate a sequence of tensors along a given dimension.


In [12]:
t1 = torch.cat([tensor, tensor, tensor], dim=1)
print(t1)

tensor([[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.]])
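The effect of the `dim` argument is easiest to see from the resulting shapes. The sketch below also shows `torch.stack`, a related function that joins tensors along a new dimension instead of an existing one.

```python
import torch

tensor = torch.ones(4, 4)

# Concatenating along an existing dimension grows that dimension.
rows = torch.cat([tensor, tensor, tensor], dim=0)
cols = torch.cat([tensor, tensor, tensor], dim=1)

# Stacking inserts a new dimension at the given position.
stacked = torch.stack([tensor, tensor, tensor], dim=0)

print(rows.shape, cols.shape, stacked.shape)
```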


Multiplying tensors

The following code computes the element-wise product of a tensor with itself.

In [13]:
print(f"tensor.mul(tensor) \n {tensor.mul(tensor)} \n")


tensor.mul(tensor) 
 tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 



Matrix multiplication

The following code computes the matrix product of a tensor and its transpose.

In [14]:
print(f"tensor.matmul(tensor.T) \n {tensor.matmul(tensor.T)} \n")

tensor.matmul(tensor.T) 
 tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]]) 
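Both kinds of multiplication also have operator forms: `*` for the element-wise product and `@` for the matrix product. The sketch below rebuilds the same 4x4 tensor and checks that the operators agree with the method calls.

```python
import torch

tensor = torch.ones(4, 4)
tensor[:, 1] = 0

elementwise = tensor * tensor  # same as tensor.mul(tensor)
matrix = tensor @ tensor.T     # same as tensor.matmul(tensor.T)

print(torch.equal(elementwise, tensor.mul(tensor)))
print(torch.equal(matrix, tensor.matmul(tensor.T)))
```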



In-place operations

Operations that have a `_` suffix are in-place. For example, `x.copy_(y)` and `x.t_()` will change `x`.

In [15]:
print(tensor, "\n")
tensor.add_(5)
print(tensor)

tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 

tensor([[6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.]])
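To contrast the two variants, the sketch below applies `add` and `add_` to the same tensor and compares the memory addresses reported by `data_ptr()`. The names `x`, `y`, and `z` are chosen for this example.

```python
import torch

x = torch.ones(2, 2)

# Out-of-place: returns a new tensor and leaves x untouched.
y = x.add(5)
x_unchanged = torch.equal(x, torch.ones(2, 2))

# In-place: modifies the storage of x and returns a tensor backed by it.
z = x.add_(5)

print(x_unchanged)
print(y.data_ptr() == x.data_ptr())
print(z.data_ptr() == x.data_ptr())
```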


### Bridge with NumPy array

Tensors on the CPU and NumPy arrays can share their underlying memory locations, and changing one will change the other.

### Tensor to NumPy array

In [16]:
t = torch.ones(5)
print(f"t: {t}")
n = t.numpy()
print(f"n: {n}")

t: tensor([1., 1., 1., 1., 1.])
n: [1. 1. 1. 1. 1.]


A change in the tensor is reflected in the NumPy array.

In [17]:
t.add_(1)
print(f"t: {t}")
print(f"n: {n}")

t: tensor([2., 2., 2., 2., 2.])
n: [2. 2. 2. 2. 2.]


### NumPy array to Tensor

In [18]:
n = np.ones(5)
t = torch.from_numpy(n)

Changes in the NumPy array are reflected in the tensor.

In [19]:
np.add(n, 1, out=n)
print(f"t: {t}")
print(f"n: {n}")

t: tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
n: [2. 2. 2. 2. 2.]
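When shared memory is not wanted, `torch.tensor` can be used instead of `torch.from_numpy`, since it copies the data. The sketch below contrasts the two. The names `shared` and `copied` are chosen for this example.

```python
import numpy as np
import torch

n = np.ones(5)

shared = torch.from_numpy(n)  # shares memory with n
copied = torch.tensor(n)      # makes an independent copy of n

# Modify the NumPy array in place.
np.add(n, 1, out=n)

print(f"shared: {shared}")  # follows the change
print(f"copied: {copied}")  # keeps the original values
```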


## A Gentle Introduction to `torch.autograd`

`torch.autograd` is PyTorch's automatic differentiation engine that powers neural network training. 

### Background

Neural networks (NNs) are a collection of nested functions that are executed on some input data. These functions are defined by parameters (consisting of weights and biases), which in PyTorch are stored in tensors.

Training a NN happens in two steps:

**Forward Propagation**: In forward prop, the NN makes its best guess about the correct output. It runs the input through each of its functions to make this guess.

**Backward Propagation**: In backprop, the NN adjusts its parameters in proportion to the error in its guess. It does this by traversing backwards from the output, collecting the derivatives of the error with respect to the parameters of the functions (gradients), and optimizing the parameters using gradient descent.
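The two steps can be sketched on the smallest possible "network": a single weight `w` that multiplies its input. The values of `x`, `y`, `w`, and the learning rate `lr` below are arbitrary choices for illustration, and the squared error is an assumed loss.

```python
import torch

x = torch.tensor(2.0)                      # input
y = torch.tensor(6.0)                      # target output
w = torch.tensor(1.0, requires_grad=True)  # the only parameter

# Forward propagation: make a guess and measure the error.
prediction = w * x
loss = (prediction - y) ** 2

# Backward propagation: d(loss)/dw = 2 * (w * x - y) * x = -16
loss.backward()

# Gradient descent: move the parameter against the gradient.
lr = 0.1
with torch.no_grad():
    w -= lr * w.grad

print(w.grad.item(), w.item())
```

After this single step `w` has moved from 1.0 towards 3.0, the value at which the prediction would match the target.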

### Usage in PyTorch

Let's take a look at a single training step. For this example, we load a pre-trained resnet18 model from `torchvision`.

In [20]:
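import torch, torchvision

# Load a pretrained ResNet-18 (newer torchvision releases prefer the `weights=` argument)
model = torchvision.models.resnet18(pretrained=True)
data = torch.rand(1, 3, 64, 64)  # one random 3-channel 64x64 image
labels = torch.rand(1, 1000)  # random labels, one value per output class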

Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /home/prabal/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
100.0%


In [21]:
data

tensor([[[[0.2269, 0.9585, 0.6781,  ..., 0.5887, 0.1503, 0.5902],
          [0.6165, 0.5536, 0.0039,  ..., 0.0595, 0.5201, 0.6334],
          [0.5404, 0.5632, 0.6701,  ..., 0.7347, 0.2987, 0.1757],
          ...,
          [0.8135, 0.2199, 0.6072,  ..., 0.8578, 0.8923, 0.4277],
          [0.1527, 0.3398, 0.5439,  ..., 0.3220, 0.2942, 0.5270],
          [0.1938, 0.2255, 0.6412,  ..., 0.1814, 0.7501, 0.9767]],

         [[0.5612, 0.1894, 0.0931,  ..., 0.3328, 0.9829, 0.0264],
          [0.8432, 0.4211, 0.9694,  ..., 0.4303, 0.6843, 0.9156],
          [0.0419, 0.1815, 0.1262,  ..., 0.4536, 0.8830, 0.6208],
          ...,
          [0.4107, 0.2438, 0.5095,  ..., 0.9527, 0.2785, 0.7902],
          [0.0691, 0.0721, 0.9970,  ..., 0.2509, 0.0963, 0.3182],
          [0.0069, 0.2391, 0.2387,  ..., 0.2372, 0.5071, 0.9100]],

         [[0.5803, 0.9349, 0.7477,  ..., 0.9230, 0.9951, 0.4979],
          [0.6587, 0.9618, 0.3332,  ..., 0.1051, 0.5893, 0.7079],
          [0.5834, 0.4824, 0.3527,  ..., 0

In [22]:
prediction = model(data)  # forward pass

In [24]:
loss = (prediction - labels).sum()
loss.backward()  # backward pass

In [25]:
optim = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

In [26]:
optim.step()  # gradient descent: update each parameter using its .grad
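When the step is repeated, the gradients need to be reset before each backward pass, because `backward()` accumulates into `.grad` rather than overwriting it. The sketch below shows one way to arrange the loop. To avoid downloading weights it uses a small `torch.nn.Linear` layer as a stand-in for resnet18, and the batch size and layer sizes are arbitrary assumptions.

```python
import torch

torch.manual_seed(0)

# Stand-in model (an assumption for this sketch, not the resnet18 used above).
model = torch.nn.Linear(4, 2)
data = torch.rand(8, 4)
labels = torch.rand(8, 2)
optim = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

before = model.weight.detach().clone()

for _ in range(3):
    optim.zero_grad()                   # clear gradients from the previous step
    prediction = model(data)            # forward pass
    loss = (prediction - labels).sum()  # same toy loss as above
    loss.backward()                     # backward pass
    optim.step()                        # gradient descent

weights_changed = not torch.equal(before, model.weight.detach())
print(weights_changed)
```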

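### Differentiation in Autograd

Let's take a look at how autograd collects gradients. We create two tensors `a` and `b` with `requires_grad=True`, which signals to autograd that every operation on them should be tracked.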

In [28]:
import torch

a = torch.tensor([2., 3.], requires_grad=True)
b = torch.tensor([6., 4.], requires_grad=True)


We create another tensor `Q` from `a` and `b`:

$Q = 3a^3 - b^2$

In [29]:
Q = 3*a**3 - b**2

Let us assume `a` and `b` to be parameters of an NN, and `Q` to be the error. In NN training, we want the gradients of the error w.r.t. the parameters, i.e.

$\frac{\partial Q}{\partial a} = 9a^2$
<br/>
$\frac{\partial Q}{\partial b} = -2b$
<br/>
When we call `.backward()` on `Q`, autograd calculates these gradients and stores them in the respective tensors' `.grad` attribute.

We need to explicitly pass a `gradient` argument in `Q.backward()` because `Q` is a vector. `gradient` is a tensor of the same shape as `Q`, and it represents the gradient of `Q` w.r.t. itself, i.e.

$\frac{\partial Q}{\partial Q} = 1$

In [31]:
print(a.grad)  # no gradient yet: backward() has not been called

None


In [32]:
external_grad = torch.tensor([1., 1.])
Q.backward(gradient=external_grad)

In [35]:
a.grad

tensor([36., 81.])
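The gradients can be checked against the analytic derivatives $9a^2$ and $-2b$. The sketch below repeats the computation in a self-contained form. It also shows an alternative to passing `gradient`: reducing `Q` to a scalar with `.sum()` before calling `backward()`. The tensors `a2` and `b2` are fresh copies introduced for this example.

```python
import torch

a = torch.tensor([2., 3.], requires_grad=True)
b = torch.tensor([6., 4.], requires_grad=True)

Q = 3 * a**3 - b**2
Q.backward(gradient=torch.ones_like(Q))

# Compare autograd's results with the hand-derived formulas.
print(torch.allclose(a.grad, 9 * a.detach()**2))
print(torch.allclose(b.grad, -2 * b.detach()))

# Alternative: aggregate Q into a scalar, then no gradient argument is needed.
a2 = a.detach().clone().requires_grad_(True)
b2 = b.detach().clone().requires_grad_(True)
(3 * a2**3 - b2**2).sum().backward()
print(torch.equal(a2.grad, a.grad), torch.equal(b2.grad, b.grad))
```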