The figure below describes a typical workflow along with the important modules associated with each step.
<img src='./figs/pytorch.png'>

### 1. Data loading and handling

The **torch.utils.data** module provides two important classes: Dataset and DataLoader.

* Dataset is an abstract class that represents a collection of samples; it is subclassed primarily to wrap custom datasets.
* DataLoader wraps a Dataset and handles batching and shuffling. It is especially useful when you have a large dataset and want to load data in the background so that it’s ready and waiting for the training loop.
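
As a rough illustration of how these two classes fit together, the sketch below defines a toy dataset and iterates over it in batches. `SquaresDataset` and its contents are hypothetical, made up for this example; only `Dataset` and `DataLoader` come from PyTorch.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class SquaresDataset(Dataset):
    """Hypothetical toy dataset: each sample pairs a number x with x squared."""

    def __init__(self, n):
        self.x = torch.arange(n, dtype=torch.float32)
        self.y = self.x ** 2

    def __len__(self):
        # Number of samples in the dataset
        return len(self.x)

    def __getitem__(self, idx):
        # Return one (input, target) pair
        return self.x[idx], self.y[idx]


dataset = SquaresDataset(10)

# batch_size groups samples into batches; num_workers=0 loads in the main process
loader = DataLoader(dataset, batch_size=4, shuffle=False, num_workers=0)

batches = [(xb, yb) for xb, yb in loader]
for xb, yb in batches:
    print(xb, yb)
```

Setting `num_workers` to a value above 0 asks the DataLoader to prepare batches in background worker processes.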

If multiple GPUs or machines are available, we can also use torch.nn.DataParallel (multiple GPUs on one machine) and torch.distributed (multiple processes or machines) to parallelize the work.

### 2. Building neural network

The **torch.nn** module is used for creating neural networks. It provides the common building blocks, such as fully connected layers, convolutional layers, activation functions, and loss functions.

The **torch.optim** module provides optimization algorithms that update the weights and biases. Automatic differentiation, which is required during the backward pass, is handled by the **torch.autograd** module.
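
To show how the three modules cooperate, here is a minimal sketch of a training loop on made-up data (`y = 2x + 1`). The data, the learning rate, and the number of steps are assumptions chosen only for illustration.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Toy data following y = 2x + 1 (an assumption for this example)
x = torch.linspace(-1, 1, 32).unsqueeze(1)
y = 2 * x + 1

model = nn.Linear(1, 1)                             # torch.nn: the layer
loss_fn = nn.MSELoss()                              # torch.nn: the loss function
optimizer = optim.SGD(model.parameters(), lr=0.1)   # torch.optim: the update rule

initial_loss = loss_fn(model(x), y).item()

for _ in range(200):
    optimizer.zero_grad()          # clear gradients from the previous step
    loss = loss_fn(model(x), y)    # forward pass
    loss.backward()                # torch.autograd computes the gradients
    optimizer.step()               # update the weight and bias

final_loss = loss_fn(model(x), y).item()
print(initial_loss, final_loss)
```

The call to `loss.backward()` is where autograd does its work; the optimizer only consumes the gradients it leaves on each parameter.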

### 3. Model Inference & Compatibility

After the model has been trained, it can be used to predict output for test cases or even new datasets. This process is referred to as **model inference**.
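
A minimal sketch of inference is given below. The network is untrained and its layer sizes are arbitrary assumptions; the point is the pattern of switching to evaluation mode and disabling gradient tracking.

```python
import torch
import torch.nn as nn

# Stand-in for a trained model (the architecture is an assumption for this example)
model = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(8, 2),
)

model.eval()  # layers such as Dropout switch to their inference behaviour

new_samples = torch.randn(3, 4)

with torch.no_grad():  # no gradients are needed for prediction
    logits = model(new_samples)
    predictions = logits.argmax(dim=1)

print(predictions)
```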

PyTorch also provides **TorchScript**, which can be used to run models independently of the Python runtime. It can be thought of as a virtual machine whose instructions are mainly specific to tensors.
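
One way to obtain a TorchScript model is `torch.jit.script`. The sketch below compiles a small module, saves it to an in-memory buffer, and loads it back; `TinyNet` is a hypothetical module made up for this example.

```python
import io

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical module used only to demonstrate scripting."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(3, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.fc(x))


model = TinyNet().eval()
scripted = torch.jit.script(model)   # compile the module to TorchScript

# Save to and load from an in-memory buffer (a file path works as well)
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)
buffer.seek(0)
restored = torch.jit.load(buffer)

x = torch.ones(2, 3)
print(torch.allclose(model(x), restored(x)))
```

The saved file contains both the code and the weights, so it can be loaded without the original Python class definition.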

We can also convert a model trained with PyTorch into formats like **ONNX**, which allows the model to be used in other DL frameworks such as MXNet, CNTK, and Caffe2. ONNX models can also be converted to TensorFlow.

#### Introduction to Tensors

The core data structure used in PyTorch is the Tensor, a multi-dimensional array that generalizes vectors and matrices. A scalar value is represented by a 0-dimensional Tensor, a vector (a single row or column) by a 1-D Tensor, a matrix by a 2-D Tensor, and so on.
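
The number of dimensions can be inspected with `ndim`. The values in this short sketch are arbitrary.

```python
import torch

scalar = torch.tensor(3.14)                   # 0-D tensor
vector = torch.tensor([1., 2., 3.])           # 1-D tensor
matrix = torch.tensor([[1., 2.], [3., 4.]])   # 2-D tensor
cube = torch.zeros(2, 3, 4)                   # 3-D tensor

for name, t in [("scalar", scalar), ("vector", vector), ("matrix", matrix), ("cube", cube)]:
    print(name, t.ndim, tuple(t.shape))
```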

<img src='./figs/tensor.png'>

#### Creating tensors

In [3]:
import torch

In [10]:
a = torch.ones(3, 3)
print(a)

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])


In [11]:
b = torch.zeros(3, 3)
print(b)

tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])


In [12]:
c = torch.tensor([[1, 1, 1], [2, 2, 2]])
print(c)

tensor([[1, 1, 1],
        [2, 2, 2]])


In [13]:
a.shape, b.shape, c.shape

(torch.Size([3, 3]), torch.Size([3, 3]), torch.Size([2, 3]))

#### Accessing an element in tensors

In [16]:
c[1, 1]

tensor(2)

In [17]:
c[:, 2]

tensor([1, 2])

#### Specifying data type of elements

Whenever we create a tensor, PyTorch infers a data type for its elements that can represent <U>all the elements of the tensor</U>.

In [19]:
int_tensor = torch.tensor([[1,2,3],[4,5,6]])
print(int_tensor.dtype)

# torch.int64

# What if we changed any one element to a floating point number?
float_tensor = torch.tensor([[1,2,3],[4.,5,6]])
print(float_tensor.dtype)

# torch.float32

print(float_tensor)

# tensor([[1., 2., 3.],
#        [4., 5., 6.]])


# This can be overridden as follows
int_tensor = torch.tensor([[1,2,3],[4.,5,6]], dtype=torch.int32)
print(int_tensor.dtype)

# torch.int32

print(int_tensor)

# tensor([[1, 2, 3],
#        [4, 5, 6]], dtype=torch.int32)


torch.int64
torch.float32
tensor([[1., 2., 3.],
        [4., 5., 6.]])
torch.int32
tensor([[1, 2, 3],
        [4, 5, 6]], dtype=torch.int32)
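
The data type of an existing tensor can also be changed after creation. The following sketch shows a few conversions and what happens when integer and floating point tensors are mixed; the variable names are made up for this example.

```python
import torch

ints = torch.tensor([1, 2, 3])                 # inferred as torch.int64
as_float = ints.to(torch.float32)              # explicit conversion
as_double = ints.double()                      # shorthand for torch.float64

# Converting floats to integers drops the fractional part (truncation toward zero)
truncated = torch.tensor([1.9, -1.9]).to(torch.int64)

# Mixing integer and floating point tensors promotes the result to floating point
mixed = ints + torch.tensor([0.5, 0.5, 0.5])

print(as_float.dtype, as_double.dtype, truncated, mixed.dtype)
```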


#### Tensor to/from NumPy Array

In [20]:
import numpy as np

In [22]:
f_np = c.numpy()
print(f_np)

[[1 1 1]
 [2 2 2]]


In [23]:
d = np.array([[1, 1, 1, 1], [2, 2, 2, 2]])
d_tensor = torch.from_numpy(d)
print(d_tensor)

tensor([[1, 1, 1, 1],
        [2, 2, 2, 2]])
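
One detail worth knowing is that `torch.from_numpy` and `Tensor.numpy()` share memory with the original object on the CPU, whereas `torch.tensor` makes a copy. A small sketch with made-up values:

```python
import numpy as np
import torch

arr = np.array([1, 2, 3])
shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # independent copy

arr[0] = 100                     # modify the NumPy array in place
print(shared)                    # reflects the change
print(copied)                    # unchanged

t = torch.ones(2)
view = t.numpy()                 # NumPy view of the tensor's memory
t.add_(1)                        # in-place update of the tensor
print(view)                      # reflects the change
```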


#### Arithmetic Operations on Tensors

In [27]:
tensor1 = torch.tensor([[1, 2, 3], [4, 5, 6]])
tensor2 = torch.tensor([[-1, -1, -1], [-2, -2, -2]])

In [28]:
print(tensor1 + tensor2)
print(torch.add(tensor1, tensor2))

tensor([[0, 1, 2],
        [2, 3, 4]])
tensor([[0, 1, 2],
        [2, 3, 4]])


In [29]:
print(tensor1 - tensor2)
print(torch.sub(tensor1, tensor2))

tensor([[2, 3, 4],
        [6, 7, 8]])
tensor([[2, 3, 4],
        [6, 7, 8]])


In [33]:
print(tensor1*2)
print(tensor1*tensor2) # elementwise 

tensor([[ 2,  4,  6],
        [ 8, 10, 12]])
tensor([[ -1,  -2,  -3],
        [ -8, -10, -12]])


In [31]:
tensor3 = torch.tensor([[1, 1], [0, 0], [1, 2]])
print(torch.mm(tensor1, tensor3))

tensor([[ 4,  7],
        [10, 16]])


In [34]:
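# Note: the output below comes from an older PyTorch release, where `/` on
# integer tensors truncated the result. Current releases perform true
# division and return a floating point tensor instead.
print(tensor1/2)
print(tensor1/tensor2) # elementwise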

tensor([[0, 1, 1],
        [2, 2, 3]])
tensor([[-1, -2, -3],
        [-2, -2, -3]])
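
In current PyTorch releases the rounding behaviour of division is chosen explicitly. The sketch below assumes a version in which `torch.div` accepts the `rounding_mode` argument, and reuses the two tensors from above.

```python
import torch

tensor1 = torch.tensor([[1, 2, 3], [4, 5, 6]])
tensor2 = torch.tensor([[-1, -1, -1], [-2, -2, -2]])

# True division: the result is a floating point tensor
true_div = tensor1 / tensor2

# Truncation toward zero: matches the integer output shown above
trunc_div = torch.div(tensor1, tensor2, rounding_mode='trunc')

# Floor division: rounds toward negative infinity, so 5 / -2 gives -3
floor_div = torch.div(tensor1, tensor2, rounding_mode='floor')

print(true_div)
print(trunc_div)
print(floor_div)
```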


#### CPU vs. GPU Tensor

PyTorch has different implementations of Tensor for the CPU and the GPU. Any tensor can be moved to a GPU, provided a CUDA-capable device with a compatible driver is available, in order to perform massively parallel, fast computations. All operations performed on such a tensor are carried out using the GPU-specific routines that come with PyTorch.


In [35]:
# Use the GPU only when a working CUDA device and driver are present.
# Hardcoding device='cuda' fails otherwise; on the machine used for this
# notebook it raised:
#   AssertionError: The NVIDIA driver on your system is too old (found version 9020).
device = 'cuda' if torch.cuda.is_available() else 'cpu'

tensor_cpu = torch.tensor([[1., 2.], [3., 4.], [5., 6.]], device='cpu')
tensor_gpu = torch.tensor([[1., 2.], [3., 4.], [5., 6.]], device=device)

In [38]:
# This uses CPU RAM
tensor_cpu = tensor_cpu * 5

# This uses GPU RAM when device is 'cuda'
# Focus on GPU RAM consumption
tensor_gpu = tensor_gpu * 5

In [39]:
# Move GPU tensor to CPU
tensor_gpu_cpu = tensor_gpu.to(device='cpu')

# Move CPU tensor to GPU (it stays on the CPU if CUDA is unavailable)
tensor_cpu_gpu = tensor_cpu.to(device=device)
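
A device-agnostic sketch of the same idea is shown below. It falls back to the CPU when CUDA is unavailable, so it should run on any machine; the tensor values are arbitrary.

```python
import torch

# Pick the GPU when it is usable, otherwise stay on the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

a = torch.ones(2, 2)                   # created on the CPU by default
b = torch.ones(2, 2, device=device)    # created directly on the chosen device

print(a.device, b.device)

# Operands must live on the same device, so move `a` before combining
result = a.to(device) + b

# .numpy() only works on CPU tensors, so move the result back first
result_np = result.cpu().numpy()
print(result_np)
```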