In [1]:
import torch

## Tensors

At its core, PyTorch is a library for processing tensors. A tensor is a number, vector, matrix, or any n-dimensional array. Let's create a tensor with a single number.

In [2]:
# Number

t1 = torch.tensor(4.)
t1

tensor(4.)

In [3]:
t1.dtype

torch.float32

In [4]:
# Vector
t2 = torch.tensor([1., 2., 3., 4.])
t2

tensor([1., 2., 3., 4.])

In [5]:
# Matrix
t3 = torch.tensor([[5.,6.], [7.,8.],[9.,10.]])

In [6]:
t3.shape

torch.Size([3, 2])

In [7]:
# 3-d array
t4 = torch.tensor([[[11.,12,13],[14.,15,16]],[[17.,18,19],[20.,21,22]]])
t4

tensor([[[11., 12., 13.],
         [14., 15., 16.]],

        [[17., 18., 19.],
         [20., 21., 22.]]])

In [8]:
t4.shape

torch.Size([2, 2, 3])

Note: all elements of a tensor must share the same data type; if they differ, PyTorch implicitly converts them to a common type. The n-dimensional array must also have a regular shape, i.e., every row at a given level must have the same length.
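The note above can be checked with a minimal sketch. The names `mixed` and `ragged_error` are only illustrative, and the exact exception type raised for an irregular shape is assumed here to be `ValueError` or `TypeError`.

```python
import torch

# Mixing Python ints and floats: every element is converted to float32
mixed = torch.tensor([1, 2.5, 3])
print(mixed.dtype)  # torch.float32

# Rows of different lengths do not form a regular shape, so creation fails
ragged_error = None
try:
    torch.tensor([[1., 2.], [3.]])
except (ValueError, TypeError) as err:
    ragged_error = err

print(ragged_error)
```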

## Tensor operations and gradients

We can combine tensors with the usual arithmetic operations. Let's look at an example:

In [9]:
# Create tensors.
x = torch.tensor(3.)
w = torch.tensor(4., requires_grad=True)
b = torch.tensor(5., requires_grad=True)
x, w, b

(tensor(3.), tensor(4., requires_grad=True), tensor(5., requires_grad=True))

In [10]:
# Arithmetic operations
y = w * x + b
y

tensor(17., grad_fn=<AddBackward0>)

As expected, `y` is a tensor with the value `3 * 4 + 5 = 17`. A key feature of PyTorch is that it can automatically compute the derivative of `y` with respect to the tensors that have `requires_grad` set to `True`, i.e., `w` and `b`. This feature is called _autograd_ (automatic gradients).

To compute the derivatives, we can invoke the `.backward` method on our result `y`.

In [11]:
# Compute derivatives
y.backward()

The derivatives of `y` with respect to the input tensors are stored in the `.grad` property of the respective tensors.

In [12]:
# Display gradients
print('dy/dx:', x.grad)
print('dy/dw:', w.grad)
print('dy/db:', b.grad)

dy/dx: None
dy/dw: tensor(3.)
dy/db: tensor(1.)


As expected, `dy/dw` has the same value as `x`, i.e., `3`, and `dy/db` has the value `1`. Note that `x.grad` is `None` because `x` doesn't have `requires_grad` set to `True`. 
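As a cross-check of the values above, the same derivatives can be requested directly with `torch.autograd.grad`, which returns them as a tuple rather than storing them in `.grad`. This is a minimal sketch that rebuilds the same `x`, `w`, and `b`; the names `dy_dw` and `dy_db` are only illustrative.

```python
import torch

x = torch.tensor(3.)
w = torch.tensor(4., requires_grad=True)
b = torch.tensor(5., requires_grad=True)
y = w * x + b

# Returns (dy/dw, dy/db) without writing to w.grad or b.grad
dy_dw, dy_db = torch.autograd.grad(y, (w, b))

print(dy_dw, dy_db)  # tensor(3.) tensor(1.)
print(w.grad)        # None, because nothing was accumulated
```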

## Tensor functions

Apart from arithmetic operations, the `torch` module also contains many functions for creating and manipulating tensors. Let's look at some examples.

In [13]:
# Create a tensor with a fixed value for every element
t6 = torch.full((3, 2), 42)
t6

tensor([[42, 42],
        [42, 42],
        [42, 42]])

In [14]:
# Concatenate two tensors with compatible shapes
t7 = torch.cat((t3, t6))
t7

tensor([[ 5.,  6.],
        [ 7.,  8.],
        [ 9., 10.],
        [42., 42.],
        [42., 42.],
        [42., 42.]])

In [15]:
# Compute the sin of each element
t8 = torch.sin(t7)
t8

tensor([[-0.9589, -0.2794],
        [ 0.6570,  0.9894],
        [ 0.4121, -0.5440],
        [-0.9165, -0.9165],
        [-0.9165, -0.9165],
        [-0.9165, -0.9165]])

In [16]:
t8.shape

torch.Size([6, 2])

In [17]:
# Change the shape of a tensor
t9 = t8.reshape(3, 2, 2)
t9

tensor([[[-0.9589, -0.2794],
         [ 0.6570,  0.9894]],

        [[ 0.4121, -0.5440],
         [-0.9165, -0.9165]],

        [[-0.9165, -0.9165],
         [-0.9165, -0.9165]]])

## Interoperability with NumPy

In [18]:
import numpy as np

In [19]:
x = np.array([[1, 2], [3, 4.]])
x

array([[1., 2.],
       [3., 4.]])

We can convert a NumPy array to a PyTorch tensor using `torch.from_numpy`.

In [20]:
# Shares memory with the NumPy array (no copy is made)
y = torch.from_numpy(x)
y

tensor([[1., 2.],
        [3., 4.]], dtype=torch.float64)

In [21]:
# Creates a copy of the np array and stores it as a new tensor
y_2 = torch.tensor(x)
y_2

tensor([[1., 2.],
        [3., 4.]], dtype=torch.float64)
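The difference between sharing memory and copying becomes visible once the NumPy array is modified in place. The following is a minimal sketch with illustrative names (`arr`, `shared`, `copied`) so that it does not disturb the variables used in this notebook.

```python
import numpy as np
import torch

arr = np.array([[1., 2.], [3., 4.]])
shared = torch.from_numpy(arr)  # shares memory with arr
copied = torch.tensor(arr)      # independent copy of the data

arr[0, 0] = 100.                # modify the NumPy array in place

print(shared[0, 0].item())  # 100.0, the change is visible through the shared tensor
print(copied[0, 0].item())  # 1.0, the copy is unaffected
```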

In [22]:
x.dtype, y.dtype

(dtype('float64'), torch.float64)

In [23]:
# Convert a torch tensor to a numpy array
z = y.numpy()
z

array([[1., 2.],
       [3., 4.]])

The interoperability between PyTorch and NumPy is essential because most datasets we work with will likely be read and preprocessed as NumPy arrays.

Why do we need a library like PyTorch at all, given that NumPy already provides data structures and utilities for working with multi-dimensional numeric data? There are two main reasons:

1. **Autograd**: The ability to automatically compute gradients for tensor operations is essential for training deep learning models.
2. **GPU support**: While working with massive datasets and large models, PyTorch tensor operations can be performed efficiently using a Graphics Processing Unit (GPU). Computations that might typically take hours can be completed within minutes using GPUs.

We'll leverage both these features of PyTorch extensively.

Note: NumPy itself does not run array operations on a GPU.
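As a rough illustration of the GPU support mentioned above, the sketch below moves two tensors to a GPU when one is available and falls back to the CPU otherwise. The names `device`, `m1`, `m2`, and `product` are only illustrative.

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

m1 = torch.rand(3, 2).to(device)
m2 = torch.rand(2, 4).to(device)
product = m1 @ m2  # runs on whichever device holds m1 and m2

# A tensor must be on the CPU before it can be converted to a NumPy array
product_np = product.cpu().numpy()
print(product.device, product_np.shape)
```

The same code runs unchanged on machines without a GPU, which is why the device is selected at run time instead of being hard-coded.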

In [24]:
x_1 = torch.tensor([[3.],[4.]])
w_1 = torch.tensor([[4.,3],[3.,4]], requires_grad=True)
b_1 = torch.tensor([[5.], [6.]], requires_grad=True)

In [25]:
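# `*` multiplies element-wise with broadcasting, so y_1 has shape (2, 2);
# a matrix-vector product would be written as `w_1 @ x_1 + b_1`
y_1 = w_1 * x_1 + b_1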

In [26]:
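# y_1 is not a scalar, so calling backward() without a `gradient` argument
# raises a RuntimeError; the call is therefore left commented out
# y_1.backward()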
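Calling `backward()` without arguments only works on a scalar. For a non-scalar result such as `y_1`, one common approach is to reduce it to a scalar first (here with `sum()`), or equivalently to pass a tensor of ones as the `gradient` argument. The sketch below rebuilds the same tensors so that it is self-contained.

```python
import torch

x_1 = torch.tensor([[3.], [4.]])
w_1 = torch.tensor([[4., 3.], [3., 4.]], requires_grad=True)
b_1 = torch.tensor([[5.], [6.]], requires_grad=True)

y_1 = w_1 * x_1 + b_1  # element-wise with broadcasting, shape (2, 2)

# Reduce to a scalar before calling backward();
# y_1.backward(torch.ones_like(y_1)) would give the same gradients
y_1.sum().backward()

print(w_1.grad)  # values [[3., 3.], [4., 4.]]: each row of w_1 is scaled by one entry of x_1
print(b_1.grad)  # values [[2.], [2.]]: each entry of b_1 is broadcast into two outputs
```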

In [27]:
# Jacobian of a function's output with respect to its input
def exp_reducer(x):
    return x.exp().sum(dim=1)

inputs = torch.rand(2, 2)
torch.autograd.functional.jacobian(exp_reducer, inputs)

tensor([[[1.2546, 1.5370],
         [0.0000, 0.0000]],

        [[0.0000, 0.0000],
         [1.0802, 2.3908]]])
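The zero blocks in the output above follow from the definition of `exp_reducer`: output `i` only depends on row `i` of the input, so the entry `[i, j, k]` of the Jacobian is `exp(inputs[i, k])` when `i == j` and `0` otherwise. The sketch below rebuilds that structure by hand and compares it with the result of `jacobian`; the names `jac` and `expected` are only illustrative.

```python
import torch

def exp_reducer(x):
    return x.exp().sum(dim=1)

inputs = torch.rand(2, 2)
jac = torch.autograd.functional.jacobian(exp_reducer, inputs)

# The shape is the output shape (2,) followed by the input shape (2, 2)
print(jac.shape)  # torch.Size([2, 2, 2])

# Build the expected Jacobian by hand: block [i, i, :] holds exp(inputs[i, :])
expected = torch.zeros(2, 2, 2)
for i in range(2):
    expected[i, i] = inputs[i].exp()

print(torch.allclose(jac, expected))  # True
```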

In [28]:
z_1 = torch.rand(2, 2)
z_1.exp().sum(dim=1)

tensor([3.2079, 4.2718])

In [29]:
tensor = torch.tensor([[[11.,12,13],[14.,15,16]],[[17.,18,19],[20.,21,22]]])
three_d_array = np.array([[[11.,12,13],[14.,15,16]],[[17.,18,19],[20.,21,22]]])

In [30]:
a = tensor[0][0]
b = three_d_array[0][0]

In [31]:
print(a, b)

tensor([11., 12., 13.]) [11. 12. 13.]


In [32]:
a.dtype, b.dtype

(torch.float32, dtype('float64'))

In [33]:
a.shape

torch.Size([3])

In [34]:
b.shape  # a 1-D array; every vector and matrix is a tensor, but not every tensor is a vector or a matrix

(3,)

In [35]:
type(a)

torch.Tensor

In [36]:
type(b)

numpy.ndarray

In [37]:
t_20 = torch.tensor([[1., -1.], [1., 1.]], requires_grad=True)
out = t_20.pow(2).sum()
out.backward()
t_20.grad

tensor([[ 2., -2.],
        [ 2.,  2.]])