# PyTorch

PyTorch is an open-source, optimized tensor library for deep learning on GPUs and CPUs. It is a deep learning framework that provides a full stack to preprocess data, build models, and deploy them in the cloud. PyTorch leverages <b>CUDA</b> (Compute Unified Device Architecture), an API (Application Programming Interface) developed by Nvidia for general-purpose computing on GPUs (Graphics Processing Units). When a GPU is available, CUDA lets PyTorch run computations on it, which speeds up model training.

In [53]:
import torch
import numpy as np
torch.__version__

'2.0.0+cpu'

## What is a Tensor?

Tensors are multi-dimensional arrays used to represent data.

A scalar is a single number; in tensor terms it is a zero-dimensional tensor.

In [54]:
scalar = torch.tensor(1)
scalar

tensor(1)

Dimensions of a tensor

In [55]:
scalar.ndim

0

Converting a single-element tensor to a Python number

In [56]:
scalar.item()

1

A vector is a one-dimensional tensor.

In [57]:
# Vector
vector = torch.tensor([7, 7])
vector

tensor([7, 7])

In [58]:
vector.ndim

1

A matrix is a 2 dimensional tensor.

In [59]:
# Matrix
M = torch.tensor([[7, 8],
                  [9, 10]])
M

tensor([[ 7,  8],
        [ 9, 10]])

Creating a tensor with more than two dimensions

In [60]:
TENSOR = torch.tensor([[[1,2,3],
                        [4,5,6],
                        [7,8,9]]])

In [61]:
TENSOR.shape

torch.Size([1, 3, 3])
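The relationship between `ndim` and `shape` for the scalar, vector, matrix and tensor above can be checked side by side. The following is a minimal sketch; the dictionary and its keys are names chosen for illustration only.

```python
import torch

# One example per rank: scalar, vector, matrix, and a 3-dimensional tensor
examples = {
    "scalar": torch.tensor(1),
    "vector": torch.tensor([7, 7]),
    "matrix": torch.tensor([[7, 8], [9, 10]]),
    "tensor": torch.tensor([[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]),
}

for name, t in examples.items():
    # ndim is the number of dimensions; it always equals len(shape)
    print(name, t.ndim, tuple(t.shape))
```

Counting the opening square brackets at the start of a tensor literal gives the same number as `ndim`.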

Random Tensors

In [62]:
random_tensor = torch.rand(3,4)
random_tensor

tensor([[0.9846, 0.2143, 0.0532, 0.5845],
        [0.8787, 0.9650, 0.8950, 0.4201],
        [0.5838, 0.5235, 0.3100, 0.6545]])

Random tensor with similar shape to an image tensor. $1^{st}$ dimension is the color channels, $2^{nd}$ dimension is the height and $3^{rd}$ dimension is the width.

In [63]:
rand_image_tensor = torch.rand(size = (3,224,224))
rand_image_tensor

tensor([[[0.3655, 0.5782, 0.0167,  ..., 0.1865, 0.9490, 0.8858],
         [0.7656, 0.8645, 0.9024,  ..., 0.6482, 0.5608, 0.7447],
         [0.9062, 0.2410, 0.2361,  ..., 0.5147, 0.3051, 0.6544],
         ...,
         [0.6222, 0.8142, 0.9643,  ..., 0.9518, 0.1033, 0.7665],
         [0.8218, 0.3474, 0.2778,  ..., 0.7294, 0.9846, 0.3166],
         [0.9428, 0.2639, 0.8355,  ..., 0.3922, 0.1428, 0.1607]],

        [[0.1120, 0.4808, 0.2404,  ..., 0.8720, 0.2212, 0.0922],
         [0.2276, 0.9472, 0.9695,  ..., 0.5663, 0.3563, 0.5173],
         [0.0052, 0.2633, 0.9020,  ..., 0.1512, 0.7038, 0.8533],
         ...,
         [0.5751, 0.7068, 0.1659,  ..., 0.8836, 0.5984, 0.8448],
         [0.8209, 0.6645, 0.1052,  ..., 0.7706, 0.3336, 0.5566],
         [0.4155, 0.5257, 0.0267,  ..., 0.4119, 0.5015, 0.8692]],

        [[0.3669, 0.1200, 0.3952,  ..., 0.8025, 0.1914, 0.1781],
         [0.3584, 0.2715, 0.5946,  ..., 0.2086, 0.5107, 0.2603],
         [0.0972, 0.7708, 0.0934,  ..., 0.2782, 0.0499, 0.

Zeros and ones

In [64]:
zero_tensor = torch.zeros(size = (3,4))
zero_tensor

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [65]:
one_tensor = torch.ones(size = (3,4))
one_tensor

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

The default datatype for floating-point tensors in PyTorch is float32.

In [66]:
one_tensor.dtype

torch.float32

Creating a tensor from a range of values, and tensors with the same shape as another tensor ("tensors-like")

In [67]:
range_tensor = torch.arange(0,10)
range_tensor

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [68]:
# creating tensors-like
torch.zeros_like(range_tensor)

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
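As a small extension of the two cells above, the sketch below shows that `torch.arange` accepts a step and that the `*_like` functions copy both shape and dtype from their input unless a dtype is passed explicitly. The variable names are illustrative.

```python
import torch

# arange takes an optional step, like Python's range
evens = torch.arange(0, 10, 2)
print(evens)  # tensor([0, 2, 4, 6, 8])

# *_like functions copy the shape and dtype of the input tensor
ones = torch.ones_like(evens)
print(ones.shape, ones.dtype)  # torch.Size([5]) torch.int64

# The dtype can be overridden while keeping the shape
float_zeros = torch.zeros_like(evens, dtype=torch.float32)
print(float_zeros.dtype)  # torch.float32
```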

The three most important parameters when creating tensors are <b>dtype, device</b> and <b>requires_grad</b>.

1. The dtype argument sets the datatype of the tensor's values. Some legacy tensor types are device-specific (for example torch.cuda.FloatTensor); generally, if you see torch.cuda anywhere, the tensor is being used on a GPU (since Nvidia GPUs use a computing toolkit called CUDA).
The number of bits in a datatype determines the precision of the value. The higher the bit width (8, 16, 32), the more detail, and hence the more data, used to express a number.
This matters in deep learning and numerical computing because so many operations are performed: the more detail each value carries, the more compute each operation needs.
So lower-precision datatypes are generally faster to compute on but can sacrifice some performance on evaluation metrics like accuracy.

2. The device argument refers to the device the tensor is stored on. If one tensor is on the CPU and another is on the GPU, performing a computation on them together raises an error.

3. The requires_grad argument controls whether gradients are tracked for operations on the tensor.
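To make the third parameter concrete, here is a minimal sketch of what gradient tracking does. The function being differentiated (a sum of squares) is an arbitrary choice for illustration.

```python
import torch

# requires_grad=True asks autograd to record operations on x
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# y = sum(x^2), so dy/dx = 2x
y = (x ** 2).sum()
y.backward()

print(x.grad)  # tensor([2., 4., 6.])

# Without requires_grad, no computation graph is recorded
z = torch.tensor([1.0, 2.0, 3.0])
print((z ** 2).sum().requires_grad)  # False
```

Only floating-point (and complex) tensors can require gradients, which is one reason model weights are stored as floats.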

In [69]:
x_tensor = torch.tensor([1.0, 2.0, 3.0],
                        dtype=torch.float32,
                        device='cpu',
                        requires_grad=False)
print('Datatype of the tensor is ', x_tensor.dtype)
print('Device the tensor is saved on is ', x_tensor.device)

Datatype of the tensor is  torch.float32
Device the tensor is saved on is  cpu


In [70]:
# converting dtype of a tensor
x16_tensor = x_tensor.type(torch.float16)
x16_tensor.dtype

torch.float16
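The precision trade-off described earlier can be observed directly. The sketch below stores the same value in three floating-point dtypes and compares the storage size and rounding error of each; the value 1/3 and the `errors` dictionary are illustrative choices.

```python
import torch

value = 1 / 3  # Python float (64-bit)

errors = {}
for dtype in (torch.float16, torch.float32, torch.float64):
    t = torch.tensor(value, dtype=dtype)
    # element_size() is the number of bytes per element;
    # the error is measured against the original Python float
    errors[dtype] = abs(t.item() - value)
    print(dtype, t.element_size(), errors[dtype])
```

The float16 tensor uses a quarter of the memory of float64 but carries the largest rounding error of the three.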

## Basic operations

In [71]:
# element wise multiplication
torch.mul(x16_tensor,10)
x16_tensor * 10

tensor([10., 20., 30.], dtype=torch.float16)

In [72]:
torch.subtract(x16_tensor,2)
x16_tensor - 2

tensor([-1.,  0.,  1.], dtype=torch.float16)

In [73]:
torch.add(x16_tensor,3)
x16_tensor + 3

tensor([4., 5., 6.], dtype=torch.float16)

In [74]:
torch.divide(x16_tensor,2)
x16_tensor/2

tensor([0.5000, 1.0000, 1.5000], dtype=torch.float16)

In [75]:
rand_tensor = torch.rand(size = (3,3))
# you can use @ to perform matrix multiplication
rand_tensor @ rand_tensor
# or the equivalent torch.matmul function
torch.matmul(rand_tensor, rand_tensor)
# torch.mm also multiplies matrices, but only accepts 2-D tensors (no broadcasting)
torch.mm(rand_tensor, rand_tensor)

tensor([[0.3061, 0.3351, 0.6323],
        [0.4728, 0.4213, 0.3691],
        [0.5618, 1.2554, 1.1009]])
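Matrix multiplication only works when the inner dimensions of the two operands match, which the square matrix above hides. The following sketch uses non-square tensors (shapes chosen for illustration) to show the rule and the most common fix, a transpose.

```python
import torch

a = torch.rand(3, 2)
b = torch.rand(2, 4)

# Inner dimensions match (2 and 2), so the result has the outer dimensions (3, 4)
product = a @ b
print(product.shape)  # torch.Size([3, 4])

# Mismatched inner dimensions, (3, 2) @ (3, 2), raise a RuntimeError
try:
    a @ a
except RuntimeError as err:
    print("shape error:", err)

# Transposing one operand makes the inner dimensions line up: (3, 2) @ (2, 3)
gram = a @ a.T
print(gram.shape)  # torch.Size([3, 3])

# Element-wise multiplication needs equal (or broadcastable) shapes instead
elementwise = a * a
print(elementwise.shape)  # torch.Size([3, 2])
```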

PyTorch provides many other functions and methods, most of which closely mirror their NumPy counterparts.

## PyTorch and NumPy

Transform a NumPy array to a PyTorch tensor

In [76]:
arr = np.arange(1.0,10.0)
arr_tensor = torch.from_numpy(arr)
arr, arr_tensor

(array([1., 2., 3., 4., 5., 6., 7., 8., 9.]),
 tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.], dtype=torch.float64))

In [77]:
arr.dtype, arr_tensor.dtype

(dtype('float64'), torch.float64)

Transform a PyTorch tensor to a NumPy array

In [78]:
tensor_numpy = arr_tensor.numpy()
tensor_numpy

array([1., 2., 3., 4., 5., 6., 7., 8., 9.])

In [79]:
tensor_numpy.dtype

dtype('float64')
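One detail worth knowing about these conversions is that `torch.from_numpy` shares memory with the source array, whereas `torch.tensor` copies the data. The sketch below illustrates the difference; the variable names are illustrative.

```python
import numpy as np
import torch

arr = np.arange(1.0, 4.0)        # float64 array holding 1, 2, 3
shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # makes an independent copy

arr[0] = 100.0                   # in-place change to the NumPy array

print(shared[0].item())  # 100.0 (the change is visible through the tensor)
print(copied[0].item())  # 1.0   (the copy is unaffected)

# NumPy defaults to float64; convert when PyTorch's default float32 is wanted
as_float32 = torch.from_numpy(arr).type(torch.float32)
print(as_float32.dtype)  # torch.float32
```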

Setting a random seed. In a Jupyter notebook, set the seed again in every cell that needs reproducible random values, because each random call advances the generator state.

In [80]:
torch.manual_seed(42)

<torch._C.Generator at 0x1f0916c5090>
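The effect of reseeding can be verified with a short experiment. This is a minimal sketch; the seed value 42 is the one used above, and the tensor names are illustrative.

```python
import torch

torch.manual_seed(42)
first = torch.rand(3)

# Without reseeding, the generator has advanced, so the values differ
second = torch.rand(3)

# Reseeding restores the generator state, so the same values come out again
torch.manual_seed(42)
third = torch.rand(3)

print(torch.equal(first, second))  # False
print(torch.equal(first, third))   # True
```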

## Accessing a GPU

In [81]:
# configuration of the GPU
!nvidia-smi

'nvidia-smi' is not recognized as an internal or external command,
operable program or batch file.


Check for GPU access

In [82]:
# if False it means GPU is not available
torch.cuda.is_available()

False

You will not always have access to a GPU, so it is good practice to write device-agnostic code as below.

In [83]:
# setup device agnostic code
device = 'cuda' if torch.cuda.is_available() else 'cpu'
device

'cpu'

In [84]:
# count number of devices
torch.cuda.device_count()

0

### Putting Tensors and Models on the GPUs

In [85]:
cpu_tensor = torch.tensor([1,2,3], device='cpu')
cpu_tensor.device

device(type='cpu')

In [86]:
# `to` returns a tensor on the target device; it does not modify the original in place.
# If a GPU is available, the output shows cuda with the index of the GPU used.
tensor_on_device = cpu_tensor.to(device)
tensor_on_device

tensor([1, 2, 3])
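The same `to` call works for models. The sketch below moves both a tensor and a small model to whichever device is available; `nn.Linear` is used here only as a stand-in model and is not part of the original notebook.

```python
import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Tensor.to returns a tensor on the target device, so keep the result
inputs = torch.rand(4, 2).to(device)

# Module.to moves the module's parameters in place and returns the module itself
model = nn.Linear(in_features=2, out_features=1).to(device)

print(inputs.device, next(model.parameters()).device)

# Inputs and parameters are on the same device, so the forward pass works
outputs = model(inputs)
print(outputs.shape)  # torch.Size([4, 1])
```

Keeping the device in a single variable means the same code runs unchanged on machines with and without a GPU.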

A tensor on the GPU cannot be converted directly to a NumPy array. It has to be moved to the CPU first and then converted.

In [87]:
cpu_tensor.cpu().numpy()

array([1, 2, 3], dtype=int64)
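A small wrapper can make this conversion safe for any tensor. The helper below, `to_numpy`, is a hypothetical name introduced for illustration; it also calls `detach()`, because a tensor that requires gradients cannot be converted to NumPy directly either.

```python
import torch

def to_numpy(t: torch.Tensor):
    """Hypothetical helper: return a NumPy array for a tensor on any device."""
    # detach() drops gradient tracking, cpu() moves the data to host memory
    return t.detach().cpu().numpy()

device = "cuda" if torch.cuda.is_available() else "cpu"
weights = torch.tensor([1.0, 2.0, 3.0], device=device, requires_grad=True)

# Calling weights.numpy() directly would raise an error here,
# because the tensor requires grad (and may also live on the GPU)
print(to_numpy(weights))  # [1. 2. 3.]
```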

## Preparing and converting dataset

The <b>nn</b> module of torch provides all the building blocks for neural networks.

In [88]:
from torch import nn

In [89]:
# subclass nn.Module, the base class for all neural network modules
class LinearRegression(nn.Module):
    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        self.weights = nn.Parameter(torch.randn(1,requires_grad=True,dtype=torch.float))
        self.bias = nn.Parameter(torch.randn(1,requires_grad=True,dtype=torch.float))

    def forward(self,x: torch.Tensor):
        return x*self.weights + self.bias

In [90]:
torch.manual_seed(42)
la = LinearRegression()
list(la.parameters())

[Parameter containing:
 tensor([0.3367], requires_grad=True),
 Parameter containing:
 tensor([0.1288], requires_grad=True)]

In [91]:
# dictionary mapping each parameter name to its tensor
la.state_dict()

OrderedDict([('weights', tensor([0.3367])), ('bias', tensor([0.1288]))])
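With the parameters initialised, the model can already produce (untrained) predictions. The following sketch repeats the class definition so it runs on its own and passes a few made-up inputs through `forward`; the input values and the names `model`, `X` and `predictions` are assumptions for illustration.

```python
import torch
from torch import nn

class LinearRegression(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        # nn.Parameter sets requires_grad=True by default
        self.weights = nn.Parameter(torch.randn(1))
        self.bias = nn.Parameter(torch.randn(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.weights + self.bias

torch.manual_seed(42)
model = LinearRegression()

# Hypothetical inputs: five evenly spaced values as a column of shape (5, 1)
X = torch.linspace(0, 1, steps=5).unsqueeze(dim=1)

# inference_mode disables gradient tracking, which is all we need for predictions
with torch.inference_mode():
    predictions = model(X)

print(predictions.shape)  # torch.Size([5, 1])
```

Calling `model(X)` rather than `model.forward(X)` is the usual convention, since it also runs any hooks registered on the module.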