# Chapter 12 - Parallelizing Neural Network Training With PyTorch

## PyTorch And Training Performance

### What is PyTorch?

PyTorch is a scalable and multiplatform programming interface for implementing and running machine learning algorithms, including convenience wrappers for deep learning.

PyTorch is built around a computation graph composed of a set of nodes. Each node represents an operation that may have zero or more inputs or outputs. PyTorch provides an imperative programming environment that evaluates operations, executes computations, and returns concrete values immediately.
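As a minimal sketch of what immediate evaluation looks like in practice, the snippet below computes a value and then its gradient. The variable names `x` and `y` and the chosen function are illustrative assumptions, not part of the chapter.

```python
import torch

# A scalar tensor whose operations autograd should record
x = torch.tensor(2.0, requires_grad=True)

# The expression is evaluated right away and returns a concrete value,
# while PyTorch records the operations so they can be differentiated
y = x ** 2 + 3 * x
print(y.item())       # 10.0

# Backpropagate through the recorded operations: dy/dx = 2x + 3
y.backward()
print(x.grad.item())  # 7.0
```

The value of `y` is available as soon as the line runs; no separate graph compilation or session step is needed.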

Mathematically, tensors can be understood as a generalization of scalars, vectors, matrices, and so on. More concretely, a scalar can be defined as a rank-0 tensor, a vector as a rank-1 tensor, a matrix as a rank-2 tensor, and matrices stacked in a third dimension as a rank-3 tensor. Tensors in PyTorch are similar to NumPy's arrays, except that tensors are optimized for automatic differentiation and can run on GPUs.


![Alt text](../images/29.png)
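The rank terminology can be checked directly in code. The following sketch builds one tensor of each rank and inspects its number of dimensions; the variable names are hypothetical and chosen only for illustration.

```python
import torch

scalar = torch.tensor(3.0)           # rank 0: a single value
vector = torch.tensor([1.0, 2.0])    # rank 1
matrix = torch.ones(2, 3)            # rank 2
cube = torch.ones(4, 2, 3)           # rank 3: four 2x3 matrices stacked

# ndim reports the rank, shape reports the size along each dimension
for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(name, t.ndim, tuple(t.shape))
```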

## First Steps With PyTorch

### Creating Tensors In PyTorch

First, we can create a tensor from a list using the torch.tensor function, or from a NumPy array using the torch.from_numpy function, as follows:

In [3]:
import torch
import numpy as np
np.set_printoptions(precision=3)

a = [1, 2, 3]
b = np.array([4, 5, 6], dtype=np.int32)
t_a = torch.tensor(a)
t_b = torch.from_numpy(b)
print(t_a)
print(t_b)

tensor([1, 2, 3])
tensor([4, 5, 6], dtype=torch.int32)
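One practical difference between the two functions is worth a quick sketch: torch.from_numpy shares memory with the source array, whereas torch.tensor copies the data. The names `arr`, `shared`, and `copied` are illustrative.

```python
import numpy as np
import torch

arr = np.array([4, 5, 6], dtype=np.int32)
shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # independent copy of the data

arr[0] = 100                     # modify the NumPy array in place

print(shared.tolist())   # [100, 5, 6]: the change is visible
print(copied.tolist())   # [4, 5, 6]: the copy is unaffected
```

If later code mutates the NumPy array, torch.tensor is the safer choice; torch.from_numpy avoids the copy when sharing is acceptable.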


Create tensors filled with 1s:

In [4]:
t_ones = torch.ones((2, 3))
t_ones.shape
print(t_ones)

tensor([[1., 1., 1.],
        [1., 1., 1.]])


Create a tensor of random values drawn uniformly from [0, 1):

In [5]:
rand_tensor = torch.rand(2, 3)
print(rand_tensor)

tensor([[0.6960, 0.0407, 0.1444],
        [0.5592, 0.8668, 0.3437]])


### Manipulating The Data Type and Shape Of a Tensor

We can change the data type of a tensor with its to() method:

In [6]:
t_a_new = t_a.to(torch.int64)
print(t_a_new.dtype)

torch.int64
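As a further sketch of data type conversion, the example below converts between integer and floating-point types. The tensors here are made up for illustration.

```python
import torch

t_int = torch.tensor([1, 2, 3])       # Python ints default to int64
t_float = t_int.to(torch.float32)     # returns a new tensor
print(t_int.dtype, t_float.dtype)     # torch.int64 torch.float32

# Converting floats to integers truncates toward zero
t_trunc = torch.tensor([1.7, -1.7]).to(torch.int64)
print(t_trunc.tolist())               # [1, -1]
```

Note that to() does not modify the original tensor, so the result has to be assigned to a variable.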


Certain operations require that the input tensors have a certain number of dimensions (that is, rank) associated with a certain number of elements (shape). Thus, we might need to change the shape of a tensor, add a new dimension, or squeeze an unnecessary dimension. PyTorch provides useful functions (or operations) to achieve this, such as torch.transpose(), torch.reshape(), and torch.squeeze(). Let’s take a look at some examples:

Transposing a tensor:

In [7]:
t = torch.rand(3, 5)
t_tr = torch.transpose(t, 0, 1)
print(t.shape, ' --> ', t_tr.shape)

torch.Size([3, 5])  -->  torch.Size([5, 3])


Reshaping a tensor (for example, from a 1D vector to a 2D array):

In [8]:
t = torch.zeros(30)
t_reshape = t.reshape(5, 6)
print(t_reshape.shape)

torch.Size([5, 6])


Removing the unnecessary dimensions (dimensions that have size 1, which are not needed):

In [12]:
t = torch.zeros(1, 2, 1, 4, 1)
t_sqz = torch.squeeze(t, 2)
print(t.shape, ' --> ', t_sqz.shape)

torch.Size([1, 2, 1, 4, 1])  -->  torch.Size([1, 2, 4, 1])
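Two related cases may help here: calling torch.squeeze() without a dimension, and adding a dimension back with torch.unsqueeze(). This is a small sketch with illustrative variable names.

```python
import torch

t = torch.zeros(1, 2, 1, 4, 1)

# Without a dim argument, squeeze() drops every size-1 dimension
t_all = torch.squeeze(t)
print(tuple(t_all.shape))   # (2, 4)

# unsqueeze() does the reverse: it inserts a size-1 dimension at the given index
t_new = torch.unsqueeze(t_all, 0)
print(tuple(t_new.shape))   # (1, 2, 4)
```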


### Applying Mathematical Operations To Tensors

Let’s instantiate two random tensors, one with a uniform distribution in the range [–1, 1) and the other with a standard normal distribution:

In [14]:
torch.manual_seed(1)
t1 = 2 * torch.rand(5, 2) - 1
t2 = torch.normal(mean=0, std=1, size=(5, 2))

Now, to compute the element-wise product of t1 and t2, we can use the following:


In [17]:
t3 = torch.multiply(t1, t2)
print(t3)

tensor([[ 0.4426, -0.3114],
        [ 0.0660, -0.5970],
        [ 1.1249,  0.0150],
        [ 0.1569,  0.7107],
        [-0.0451, -0.0352]])
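For reference, the `*` operator performs the same element-wise multiplication as torch.multiply(). The sketch below uses small hand-written tensors (an assumption made so the result is easy to verify by eye) rather than the random ones above.

```python
import torch

a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.tensor([[10.0, 20.0], [30.0, 40.0]])

prod_fn = torch.multiply(a, b)   # function form
prod_op = a * b                  # operator form

print(prod_fn.tolist())                 # [[10.0, 40.0], [90.0, 160.0]]
print(torch.equal(prod_fn, prod_op))    # True
```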


To compute the mean, sum, and standard deviation along a certain axis (or axes), we can use torch.mean(), torch.sum(), and torch.std(). For example, the mean of each column in t1 can be computed as follows:

In [18]:
t4 = torch.mean(t1, axis=0)
print(t4)

tensor([-0.1373,  0.2028])
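The same pattern applies to torch.sum() and torch.std(). The following sketch uses a small fixed tensor, chosen here only so the results can be checked by hand, and PyTorch's native `dim` keyword.

```python
import torch

t = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])

col_sum = torch.sum(t, dim=0)   # reduce over rows: one value per column
row_sum = torch.sum(t, dim=1)   # reduce over columns: one value per row
col_std = torch.std(t, dim=0)   # sample standard deviation per column

print(col_sum.tolist())   # [9.0, 12.0]
print(row_sum.tolist())   # [3.0, 7.0, 11.0]
print(col_std.tolist())   # [2.0, 2.0]
```

By default torch.std() computes the sample standard deviation, dividing by n - 1, which is why each column here gives 2.0.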


### Split, Stack and Concatenate Tensors

Assume that we have a single tensor and want to split it into two or more tensors. For this, PyTorch provides the convenient torch.chunk() function, which divides an input tensor into a tuple of equally sized tensors. We specify the desired number of splits as an integer with the *chunks* argument, and the dimension to split along with the *dim* argument. If the size of the input tensor along that dimension is not divisible by the number of splits, the last chunk is smaller, and fewer chunks than requested may be returned.

Alternatively, we can provide the desired sizes in a list using the torch.split() function.

Examples:


Providing the number of splits:

In [24]:
torch.manual_seed(1)
t = torch.rand(6)
print(t)

t_splits = torch.chunk(t, 3)
[item.numpy() for item in t_splits]

tensor([0.7576, 0.2793, 0.4031, 0.7347, 0.0293, 0.7999])


[array([0.758, 0.279], dtype=float32),
 array([0.403, 0.735], dtype=float32),
 array([0.029, 0.8  ], dtype=float32)]
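To see how torch.chunk() behaves when the size is not a multiple of the number of chunks, and how the *dim* argument works on a 2D tensor, consider this small sketch. The tensors are built with torch.arange purely for illustration.

```python
import torch

t = torch.arange(7)

# 7 is not divisible by 3: the chunk size is ceil(7 / 3) = 3,
# so the last chunk is shorter
sizes = [len(c) for c in torch.chunk(t, 3)]
print(sizes)    # [3, 3, 1]

# Chunking along dim=1 of a 2D tensor splits its columns
m = torch.arange(12).reshape(2, 6)
cols = torch.chunk(m, 3, dim=1)
print([tuple(c.shape) for c in cols])   # [(2, 2), (2, 2), (2, 2)]
```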

Providing the sizes of different splits:

In [27]:
torch.manual_seed(1)
t = torch.rand(5)
print(t)

t_splits = torch.split(t, split_size_or_sections=[3,2])
[item.numpy() for item in t_splits]

tensor([0.7576, 0.2793, 0.4031, 0.7347, 0.0293])


[array([0.758, 0.279, 0.403], dtype=float32),
 array([0.735, 0.029], dtype=float32)]
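torch.split() also accepts a single integer instead of a list. In that case the integer is the size of each piece rather than the number of pieces, as this brief sketch with an illustrative tensor shows.

```python
import torch

t = torch.arange(5)

# An integer gives the size of each piece; the last piece holds the remainder
parts = torch.split(t, 2)
print([p.tolist() for p in parts])   # [[0, 1], [2, 3], [4]]
```

This is the main difference from torch.chunk(), whose integer argument is the number of chunks.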

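Sometimes, we are working with multiple tensors and need to concatenate or stack them to create a single tensor. In this case, PyTorch functions such as torch.stack() and torch.cat() come in handy. torch.cat() joins tensors along an existing dimension, whereas torch.stack() joins them along a new dimension.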

In [28]:
A = torch.ones(3)
B = torch.zeros(2)
C = torch.cat([A, B], axis=0)
print(C)

tensor([1., 1., 1., 0., 0.])


In [29]:
A = torch.ones(3)
B = torch.zeros(3)
S = torch.stack([A, B], axis=1)
print(S)

tensor([[1., 0.],
        [1., 0.],
        [1., 0.]])
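The difference between the two functions is easiest to see in the resulting shapes. The sketch below reuses the tensors A and B from the stacking example and uses PyTorch's native `dim` keyword.

```python
import torch

A = torch.ones(3)
B = torch.zeros(3)

# cat joins along an existing dimension: two length-3 vectors give one length-6 vector
cat_shape = tuple(torch.cat([A, B], dim=0).shape)
print(cat_shape)       # (6,)

# stack creates a new dimension; dim sets where it is inserted
stack0_shape = tuple(torch.stack([A, B], dim=0).shape)
stack1_shape = tuple(torch.stack([A, B], dim=1).shape)
print(stack0_shape)    # (2, 3)
print(stack1_shape)    # (3, 2)
```

Because torch.stack() adds a dimension, all input tensors must have the same shape, while torch.cat() only requires matching sizes in the dimensions that are not being joined.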
