# PyTorch Basics: Tensors and Gradients

---

In [1]:
import torch

# Tensors

At its core, PyTorch is a library for processing tensors. A tensor is a number, vector, matrix, or any n-dimensional array. Let's create a tensor with a single number.

In [3]:
# Number
t1 = torch.tensor(5.0)
t1

tensor(5.)

In the output, 5. is shorthand for 5.0. The decimal point tells Python (and PyTorch) that the value is a floating-point number rather than an integer. We can verify this by checking the dtype attribute of our tensor.

In [4]:
t1.dtype

torch.float32
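
As a small sketch of how the dtype follows from the way the value is written (the variable names here are illustrative and not part of the notebook above, and the float32 result assumes PyTorch's default dtype has not been changed):

```python
import torch

# An integer literal produces an integer tensor.
t_int = torch.tensor(5)
# A trailing dot makes the literal a float, so the tensor is floating-point.
t_float = torch.tensor(5.)
# The dtype can also be requested explicitly.
t_explicit = torch.tensor(5, dtype=torch.float32)

print(t_int.dtype, t_float.dtype, t_explicit.dtype)
```

Writing the decimal point is the lighter-weight option when the values are typed by hand; the dtype argument is handy when the data comes from elsewhere.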

Let's try to create more complex tensors.

In [5]:
# Vector
t2 = torch.tensor([1., 2, 3, 4])
t2

tensor([1., 2., 3., 4.])

In [6]:
# Matrix
t3 = torch.tensor([[5., 6], 
                   [7, 8], 
                   [9, 10]])
t3

tensor([[ 5.,  6.],
        [ 7.,  8.],
        [ 9., 10.]])

In [7]:
# 3-dimensional array     
t4 = torch.tensor([
    [[11, 12, 13], 
     [13, 14, 15]], 
    [[15, 16, 17], 
     [17, 18, 19.]]])
t4                      # Two 2*3 matrices put into an array

tensor([[[11., 12., 13.],
         [13., 14., 15.]],

        [[15., 16., 17.],
         [17., 18., 19.]]])

Tensors can have any number of dimensions and different lengths along each dimension. We can inspect the length along each dimension using the .shape property of a tensor.

In [21]:
print(t1)
t1.shape

tensor(5.)


torch.Size([])

In [31]:
print(t2)
print('\n')
print('t2 is a vector containing 4 elements')
t2.shape

tensor([1., 2., 3., 4.])


t2 is a vector containing 4 elements


torch.Size([4])

In [30]:
print(t3)
print('\n')
print("t3 is a matrix of 3 rows and 2 columns")
t3.shape

tensor([[ 5.,  6.],
        [ 7.,  8.],
        [ 9., 10.]])


t3 is a matrix of 3 rows and 2 columns


torch.Size([3, 2])

In [32]:
print(t4)  
print('\n')
print('t4 is a tensor with 3 dimensions and each dimension might have a different length')
t4.shape

tensor([[[11., 12., 13.],
         [13., 14., 15.]],

        [[15., 16., 17.],
         [17., 18., 19.]]])


t4 is a tensor with 3 dimensions and each dimension might have a different length


torch.Size([2, 2, 3])

To check the length along each dimension by hand, step inside the outermost bracket and count its elements: that is the length along the first dimension. Then step one bracket further in and count again for the second dimension, and so on.
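
The counting rule can be checked against .shape directly. The sketch below reuses the values of t4 as a plain nested list (the names nested, lengths, and t are illustrative) and counts elements one bracket level at a time:

```python
import torch

nested = [
    [[11, 12, 13],
     [13, 14, 15]],
    [[15, 16, 17],
     [17, 18, 19.]]]

# Count elements at each nesting level, from the outermost bracket inward.
lengths = (len(nested), len(nested[0]), len(nested[0][0]))

t = torch.tensor(nested)
print(lengths)          # lengths counted by hand
print(tuple(t.shape))   # lengths reported by PyTorch
print(t.ndim)           # number of dimensions
```

Both tuples should agree, and t.ndim gives the number of bracket levels.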

Note that it's not possible to create a tensor with an improper shape, that is, one whose rows (or other nested lists) have inconsistent lengths.

In [33]:
# Matrix
t5 = torch.tensor([[5., 6, 11], 
                   [7, 8], 
                   [9, 10]])
t5

ValueError: expected sequence of length 3 at dim 1 (got 2)

A ValueError is thrown because the lengths of the rows [5., 6, 11] and [7, 8] don't match.
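
If the input might be ragged, the error can be caught rather than left to stop the program. This is a minimal sketch; the names rows and created are illustrative:

```python
import torch

# The first row has 3 elements, the others have 2.
rows = [[5., 6, 11],
        [7, 8],
        [9, 10]]

try:
    torch.tensor(rows)
    created = True
except ValueError as err:
    created = False
    print("Could not create tensor:", err)
```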

---

# Tensor Operations and Gradients

We can combine tensors with the usual arithmetic operations. Let's look at an example:

In [34]:
# Create tensors.
x = torch.tensor(3.)
w = torch.tensor(4., requires_grad=True)
b = torch.tensor(5., requires_grad=True)
x, w, b

(tensor(3.), tensor(4., requires_grad=True), tensor(5., requires_grad=True))

We've created three tensors: x, w, and b, all numbers. w and b have an additional parameter requires_grad set to True. We'll see what it does in just a moment.

Let's create a new tensor y by combining these tensors.

In [35]:
# Arithmetic operations
y = w * x + b
y

tensor(17., grad_fn=<AddBackward0>)

As expected, y is a tensor with the value 3 * 4 + 5 = 17. What makes PyTorch especially useful is that it can automatically compute the derivative of y with respect to the tensors that have requires_grad set to True, i.e., w and b. This feature of PyTorch is called autograd (automatic gradients).

To compute the derivatives, we can invoke the .backward method on our result y.

In [36]:
# Compute derivatives
y.backward()

The derivatives of y with respect to the input tensors are stored in the .grad property of the respective tensors.

In [37]:
# Display gradients
print('dy/dx:', x.grad)
print('dy/dw:', w.grad)
print('dy/db:', b.grad)

dy/dx: None
dy/dw: tensor(3.)
dy/db: tensor(1.)


As expected, dy/dw has the same value as x, i.e., 3, and dy/db has the value 1. Note that x.grad is None because x doesn't have requires_grad set to True.

The "grad" in w.grad is short for gradient, which is another term for derivative. The term gradient is primarily used while dealing with vectors and matrices.
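
To see why the word gradient is used for vectors, here is a sketch of the same computation with vector inputs. The values are made up for illustration, and the variables are redefined so that the snippet is self-contained. Since y is the dot product of w and x plus b, each partial derivative dy/dw[i] equals x[i]:

```python
import torch

x = torch.tensor([1., 2., 3.])
w = torch.tensor([4., 5., 6.], requires_grad=True)
b = torch.tensor(7., requires_grad=True)

# y is a single number: the dot product of w and x, plus b.
y = (w * x).sum() + b
y.backward()

print(w.grad)  # one partial derivative per element of w
print(b.grad)  # derivative of y with respect to b
```

Here w.grad has the same shape as w and holds the same values as x.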

---

# Tensor Functions

Apart from arithmetic operations, the torch module also contains many functions for creating and manipulating tensors. Let's look at some examples.

In [38]:
# Create a tensor with a fixed value for every element.
# The fill value is written as 42. (a float) so that the result is a float tensor.
t6 = torch.full((3, 2), 42.)
t6

tensor([[42., 42.],
        [42., 42.],
        [42., 42.]])
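
torch.full is one of several creation functions. The sketch below shows a few others that are used in the same way; the variable names are illustrative:

```python
import torch

zeros = torch.zeros(3, 2)          # 3x2 tensor filled with 0.
ones = torch.ones(3, 2)            # 3x2 tensor filled with 1.
steps = torch.arange(0, 10, 2)     # values 0, 2, 4, 6, 8
filled = torch.full((3, 2), 42.)   # a float fill value gives a float tensor

print(zeros.shape, ones.shape, steps, filled.dtype)
```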

In [39]:
# Concatenate two tensors with compatible shapes
t7 = torch.cat((t3, t6))
t7

tensor([[ 5.,  6.],
        [ 7.,  8.],
        [ 9., 10.],
        [42., 42.],
        [42., 42.],
        [42., 42.]])
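
By default torch.cat joins tensors along the first dimension, which is why the rows of t6 appear below those of t3. As a sketch, the dim argument selects a different dimension, and torch.stack adds a new one. The tensors a and b below stand in for t3 and t6:

```python
import torch

a = torch.tensor([[5., 6], [7, 8], [9, 10]])
b = torch.full((3, 2), 42.)

rows = torch.cat((a, b), dim=0)   # append rows: shape (6, 2)
cols = torch.cat((a, b), dim=1)   # place side by side: shape (3, 4)
stacked = torch.stack((a, b))     # new leading dimension: shape (2, 3, 2)

print(rows.shape, cols.shape, stacked.shape)
```

"Compatible shapes" means that the tensors agree in every dimension except the one being joined.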

In [40]:
# Compute the sin of each element
t8 = torch.sin(t7)
t8

tensor([[-0.9589, -0.2794],
        [ 0.6570,  0.9894],
        [ 0.4121, -0.5440],
        [-0.9165, -0.9165],
        [-0.9165, -0.9165],
        [-0.9165, -0.9165]])

In [41]:
# Change the shape of a tensor
t9 = t8.reshape(3, 2, 2)
t9

tensor([[[-0.9589, -0.2794],
         [ 0.6570,  0.9894]],

        [[ 0.4121, -0.5440],
         [-0.9165, -0.9165]],

        [[-0.9165, -0.9165],
         [-0.9165, -0.9165]]])
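
A reshape only works when the new shape holds exactly the same number of elements as the old one. The sketch below uses a made-up 12-element tensor to show a valid reshape, a length inferred with -1, and a failing case:

```python
import torch

t = torch.arange(12.)          # 12 elements: 0., 1., ..., 11.

a = t.reshape(3, 2, 2)         # 3 * 2 * 2 = 12, so this works
b = t.reshape(4, -1)           # -1 lets PyTorch infer the missing length (3)

try:
    t.reshape(5, 2)            # 5 * 2 = 10, which does not match 12
    mismatch_allowed = True
except RuntimeError as err:
    mismatch_allowed = False
    print("Reshape failed:", err)
```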

In [42]:
# Get the total number of elements in a tensor
torch.numel(t9)

12

In [43]:
# Get the index of the largest element, counted in the flattened tensor
torch.argmax(t9)

tensor(3)

In [44]:
# Get the index of the smallest element, counted in the flattened tensor
torch.argmin(t9)

tensor(0)
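
Without a dim argument, argmax and argmin return a position in the flattened tensor. Below is a sketch of two ways to get a more readable answer, on a small example tensor. NumPy's unravel_index is used as one possible way to convert the flat index; the variable names are illustrative:

```python
import numpy as np
import torch

t = torch.sin(torch.tensor([[5., 6], [7, 8], [9, 10]]))  # shape (3, 2)

flat_index = torch.argmax(t)        # index into the flattened tensor
# Convert the flat index into a (row, column) position.
position = np.unravel_index(int(flat_index), tuple(t.shape))
per_row = torch.argmax(t, dim=1)    # index of the largest value in each row

print(int(flat_index), position, per_row)
```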

In [45]:
# Get the max element in a tensor
torch.max(t9)

tensor(0.9894)

In [46]:
# Get the min element in a tensor
torch.min(t9)

tensor(-0.9589)

In [49]:
# Returns the mean of all the elements in an input tensor
torch.mean(t9)

tensor(-0.4353)

In [50]:
# Compute the norm of a tensor (by default, the square root of the sum of the squared elements)
torch.norm(t9)

tensor(2.8132)
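
As a sketch of what the default norm computes, the example below compares torch.norm with a manual calculation on a small made-up tensor. It also calls torch.linalg.vector_norm, which is assumed to be available (it exists in recent PyTorch releases):

```python
import torch

t = torch.tensor([[3., 4.], [0., 12.]])

manual = t.pow(2).sum().sqrt()            # square, sum, then square root
builtin = torch.norm(t)                   # default norm over all elements
flattened = torch.linalg.vector_norm(t)   # treats the tensor as one long vector

print(manual, builtin, flattened)
```

All three values should agree, since 9 + 16 + 0 + 144 = 169 and the square root of 169 is 13.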

In [54]:
# Sort the elements along the last dimension (the default) in ascending order; returns the sorted values and their original indices
torch.sort(t9)

torch.return_types.sort(
values=tensor([[[-0.9589, -0.2794],
         [ 0.6570,  0.9894]],

        [[-0.5440,  0.4121],
         [-0.9165, -0.9165]],

        [[-0.9165, -0.9165],
         [-0.9165, -0.9165]]]),
indices=tensor([[[0, 1],
         [0, 1]],

        [[1, 0],
         [0, 1]],

        [[0, 1],
         [0, 1]]]))
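
The result of torch.sort can be unpacked into its two parts. The sketch below uses a small made-up tensor, sorts each row in descending order, and then uses the returned indices to rebuild the sorted values from the original tensor:

```python
import torch

t = torch.tensor([[3., 1., 2.],
                  [9., 7., 8.]])

# Sort each row (dim=1) from largest to smallest.
values, indices = torch.sort(t, dim=1, descending=True)

# The indices say where each sorted value came from in the original row.
rebuilt = torch.gather(t, 1, indices)

print(values)
print(indices)
```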

In [56]:
# Compute t7 >= t8 element-wise.
torch.ge(t7, t8)

tensor([[True, True],
        [True, True],
        [True, True],
        [True, True],
        [True, True],
        [True, True]])

In [58]:
# Compute the element-wise maximum of t7 and t8.
torch.max(t7, t8)

tensor([[ 5.,  6.],
        [ 7.,  8.],
        [ 9., 10.],
        [42., 42.],
        [42., 42.],
        [42., 42.]])
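
Element-wise comparisons are easier to follow on a small example where the answer differs by position. In this sketch the tensors a and b are made up for illustration:

```python
import torch

a = torch.tensor([1., 5., 3.])
b = torch.tensor([2., 5., 0.])

mask = torch.ge(a, b)        # same result as a >= b
larger = torch.max(a, b)     # element-wise maximum of the two tensors
count = mask.sum().item()    # number of positions where a >= b

print(mask, larger, count)
```

The boolean mask can be summed because True counts as 1 and False as 0.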

---

# Interoperability with NumPy

NumPy is a popular open-source library used for mathematical and scientific computing in Python. It enables efficient operations on large multi-dimensional arrays and has a vast ecosystem of supporting libraries, including:

- Pandas for file I/O and data analysis
- Matplotlib for plotting and visualization
- OpenCV for image and video processing

If you're interested in learning more about NumPy and other data science libraries in Python, check out this tutorial series: https://jovian.ai/aakashns/python-numerical-computing-with-numpy

Instead of reinventing the wheel, PyTorch interoperates well with NumPy to leverage its existing ecosystem of tools and libraries.

Here's how we create an array in NumPy:

In [59]:
import numpy as np

x = np.array([[1, 2], [3, 4.]])
x

array([[1., 2.],
       [3., 4.]])

We can convert a NumPy array to a PyTorch tensor using torch.from_numpy.

In [60]:
# Convert the numpy array to a torch tensor.
y = torch.from_numpy(x)
y

tensor([[1., 2.],
        [3., 4.]], dtype=torch.float64)

Let's verify that the NumPy array and the torch tensor have the same data type.

In [61]:
x.dtype, y.dtype

(dtype('float64'), torch.float64)

We can convert a PyTorch tensor to a NumPy array using the .numpy method of a tensor.

In [63]:
# Convert a torch tensor to a numpy array
z = y.numpy()
z

array([[1., 2.],
       [3., 4.]])
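
One detail to keep in mind is that torch.from_numpy does not copy the data: the tensor and the array share the same memory. The sketch below contrasts it with torch.tensor, which makes a copy. The names arr, shared, and copied are illustrative:

```python
import numpy as np
import torch

arr = np.array([[1., 2.], [3., 4.]])

shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # makes an independent copy

arr[0, 0] = 100.                 # modify the NumPy array in place

print(shared[0, 0])  # reflects the change made to arr
print(copied[0, 0])  # still holds the original value
```

Sharing memory avoids copying large datasets, but it also means in-place changes on one side are visible on the other.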

The interoperability between PyTorch and NumPy is essential because most datasets you'll work with will likely be read and preprocessed as NumPy arrays.

You might wonder why we need a library like PyTorch at all since NumPy already provides data structures and utilities for working with multi-dimensional numeric data. There are two main reasons:

1. Autograd: The ability to automatically compute gradients for tensor operations is essential for training deep learning models.

2. GPU support: While working with massive datasets and large models, PyTorch tensor operations can be performed efficiently using a Graphics Processing Unit (GPU). Computations that might typically take hours can be completed within minutes using GPUs.
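
As a minimal sketch of the GPU point, the example below picks a GPU when one is available and falls back to the CPU otherwise, so it runs on either kind of machine. The tensors and names are illustrative:

```python
import torch

# Use a CUDA GPU if PyTorch can see one, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.tensor([[1., 2.], [3., 4.]], device=device)
w = torch.ones(2, 2, device=device)

y = x @ w            # matrix product, computed on the chosen device
result = y.cpu()     # bring the result back to the CPU, e.g. before calling .numpy()

print(device, result)
```

Tensors that take part in the same operation need to live on the same device, which is why both x and w are created with device=device.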