PyTorch Basics: Tensors & Gradients

In [1]:
import torch

Tensors

At its core, PyTorch is a library for processing tensors. A tensor is a number, vector, matrix or any n-dimensional array. Let's create a tensor with a single number:

In [2]:
# Number
t1 = torch.tensor(4.)
t1

tensor(4.)

4. is a shorthand for 4.0. It is used to indicate to Python (and PyTorch) that you want to create a floating point number. We can verify this by checking the dtype attribute of our tensor:

In [3]:
t1.dtype

torch.float32
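To see why the decimal point matters, here is a small sketch comparing an integer literal with a float literal. The names t_int, t_float and t_explicit are made up for this illustration, and the sketch assumes PyTorch's default dtype has not been changed.

```python
import torch

# An integer literal (no decimal point) gives an integer tensor.
t_int = torch.tensor(4)

# A float literal gives a floating point tensor.
t_float = torch.tensor(4.)

# The dtype can also be requested explicitly.
t_explicit = torch.tensor(4, dtype=torch.float32)

print(t_int.dtype)       # torch.int64
print(t_float.dtype)     # torch.float32
print(t_explicit.dtype)  # torch.float32
```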

Let's try creating slightly more complex tensors:

In [5]:
# Vector
t2 = torch.tensor([1.,2,3,4])
t2

tensor([1., 2., 3., 4.])

In [6]:
# Matrix
t3 = torch.tensor([[5.,6], [7,8], [9,10]])
t3


tensor([[ 5.,  6.],
        [ 7.,  8.],
        [ 9., 10.]])

In [11]:
# 3-d array
t4 = torch.tensor([[[11, 12, 13],
                    [13, 14, 15]],
                   [[15, 16, 17],
                    [17, 18, 19]]])
t4

tensor([[[11, 12, 13],
         [13, 14, 15]],

        [[15, 16, 17],
         [17, 18, 19]]])

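Tensors can have any number of dimensions, and a different length along each dimension, as long as every element along a given dimension has the same length (rows of unequal length are not allowed). We can inspect the length along each dimension using the .shape property of a tensor.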

In [16]:
print(t1)
t1.shape

tensor(4.)


torch.Size([])

In [17]:
print(t2)
t2.shape

tensor([1., 2., 3., 4.])


torch.Size([4])

In [18]:
print(t3)
t3.shape

tensor([[ 5.,  6.],
        [ 7.,  8.],
        [ 9., 10.]])


torch.Size([3, 2])

In [19]:
print(t4)
t4.shape

tensor([[[11, 12, 13],
         [13, 14, 15]],

        [[15, 16, 17],
         [17, 18, 19]]])


torch.Size([2, 2, 3])
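Besides .shape, a tensor also reports its number of dimensions and its total number of elements. The sketch below rebuilds t4 so that it runs on its own, and also shows what happens when the rows do not all have the same length. The ragged_failed flag is a made-up name used only for this illustration.

```python
import torch

t4 = torch.tensor([[[11, 12, 13],
                    [13, 14, 15]],
                   [[15, 16, 17],
                    [17, 18, 19]]])

print(t4.ndim)     # 3  (number of dimensions)
print(t4.shape)    # torch.Size([2, 2, 3])
print(t4.numel())  # 12 (2 * 2 * 3 elements in total)

# Rows of unequal length cannot be packed into a tensor.
ragged_failed = False
try:
    torch.tensor([[5., 6, 11], [7, 8], [9, 10]])
except ValueError as err:
    ragged_failed = True
    print('ValueError:', err)
```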

Tensor operations and gradients

We can combine tensors with the usual arithmetic operations. Let's look at an example:

In [21]:
# Create tensors
x = torch.tensor(3.)
w = torch.tensor(4., requires_grad=True)
b = torch.tensor(5., requires_grad=True)

We've created three tensors, x, w and b, all of them numbers. w and b have an additional parameter, requires_grad, set to True. We'll see what it does in just a moment.

Let's create a new tensor y by combining these tensors:

In [23]:
# Arithmetic operations
y = w*x+b
y

tensor(17., grad_fn=<AddBackward0>)

As expected, y is a tensor with the value 3 * 4 + 5 = 17. What makes PyTorch special is that we can automatically compute the derivative of y w.r.t. the tensors that have requires_grad set to True, i.e. w and b. To compute the derivatives, we can call the .backward method on our result y.

In [24]:
# Compute derivatives
y.backward()

The derivatives of y w.r.t. the input tensors are stored in the .grad property of the respective tensors.

In [26]:
# Display gradients
print('dy/dx:', x.grad)
print('dy/dw:', w.grad)
print('dy/db:', b.grad)

dy/dx: None
dy/dw: tensor(3.)
dy/db: tensor(1.)


As expected, dy/dw has the same value as x, i.e. 3, and dy/db has the value 1. Note that x.grad is None, because x doesn't have requires_grad set to True.
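One way to convince ourselves of these values is to compare them with a numerical estimate. The sketch below is only a sanity check, not something PyTorch requires: it approximates each derivative with a central difference on plain Python floats and prints it next to the autograd result. The helper f, the step size h and the names ending in 0 or _t are assumptions made for this illustration.

```python
import torch

def f(w, x, b):
    """Same expression as above: y = w * x + b."""
    return w * x + b

x0, w0, b0 = 3.0, 4.0, 5.0
h = 1e-6  # small step for the finite-difference estimate

# Central differences on plain Python floats.
num_dw = (f(w0 + h, x0, b0) - f(w0 - h, x0, b0)) / (2 * h)
num_db = (f(w0, x0, b0 + h) - f(w0, x0, b0 - h)) / (2 * h)

# The same derivatives from autograd, on fresh tensors.
w_t = torch.tensor(w0, requires_grad=True)
b_t = torch.tensor(b0, requires_grad=True)
y_t = f(w_t, torch.tensor(x0), b_t)
y_t.backward()

# The numerical estimates should be close to 3 and 1 respectively.
print('dy/dw:', num_dw, 'vs', w_t.grad.item())
print('dy/db:', num_db, 'vs', b_t.grad.item())
```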

The "grad" in w.grad stands for gradient, which is another term for derivative, used mainly when dealing with matrices.
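To illustrate the vector case, here is a minimal sketch in which the weights form a vector rather than a single number. The names x_vec, w_vec, b_s and y_s are made up for this example. Since .backward is called here on a single number, the vectors are combined with a dot product first.

```python
import torch

x_vec = torch.tensor([1., 2., 3.])
w_vec = torch.tensor([4., 5., 6.], requires_grad=True)
b_s = torch.tensor(7., requires_grad=True)

# y = w . x + b is a single number: 4*1 + 5*2 + 6*3 + 7 = 39
y_s = torch.dot(w_vec, x_vec) + b_s
y_s.backward()

# The gradient holds one partial derivative per element of w_vec,
# and here it equals x_vec.
print(w_vec.grad)  # tensor([1., 2., 3.])
print(b_s.grad)    # tensor(1.)
```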

Interoperability with NumPy

NumPy is a popular open-source library used for mathematical and scientific computing in Python. It enables efficient operations on large multi-dimensional arrays, and has a large ecosystem of supporting libraries:

    Matplotlib for plotting and visualization
    OpenCV for image and video processing
    Pandas for file I/O and data analysis

Instead of reinventing the wheel, PyTorch interoperates well with NumPy to leverage its existing ecosystem of tools and libraries.

# "Gradient" is the usual term when working with vectors and matrices; "derivative" is used when dealing with single numbers.

In [None]:
# Here's how we create an array in NumPy:
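The cell above is empty in the source, so what follows is only a sketch of what such a round trip can look like, not the original cell. It creates a NumPy array, converts it to a tensor with torch.from_numpy, and converts it back with .numpy(). The names arr, t_from_np and arr_back are made up for this illustration.

```python
import numpy as np
import torch

# Create an array in NumPy. One float entry makes the whole array float64.
arr = np.array([[1, 2], [3, 4.]])

# Convert the array to a tensor. from_numpy keeps the dtype and
# shares memory with the original array instead of copying it.
t_from_np = torch.from_numpy(arr)
print(arr.dtype, t_from_np.dtype)  # float64 torch.float64

# Because the memory is shared, a change to the array shows up in the tensor.
arr[0, 0] = 100
print(t_from_np)

# Convert the tensor back to a NumPy array.
arr_back = t_from_np.numpy()
print(type(arr_back))
```

If an independent copy is wanted instead, torch.tensor(arr) copies the data.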

