# PyTorch
## Part 1 - Introduction to PyTorch framework
**Author:** Shashi Kiran Chilukuri


![title](img/pytorch.jpeg)

Many frameworks are available for building deep learning models, for example TensorFlow, PyTorch, Caffe, MXNet, and Keras. Each has its own architectural style, with its own pros and cons. This three-part article series walks through one popular and increasingly adopted framework, PyTorch. The series covers:
   * Part 1: PyTorch framework
   * Part 2: Tensor Operations
   * Part 3: General steps to create neural network with PyTorch

# Part 1: Introduction to the PyTorch framework

## So, what is PyTorch?
 ![title](img/what is Pytorch.png)
 
 Now let's compare it with the other most popular framework, TensorFlow, to see how it stands out.
 
## Comparing with the other most popular framework: TensorFlow
 ![title](img/comparing with TensorFlow.png)
 
 Based on the comparison above, PyTorch comes out ahead of TensorFlow in some areas.

## Core Components of PyTorch
Before we jump into the core components, let's take a quick, high-level look at what an artificial neural network is and the general steps involved in creating one.

<img src="img/ANN.png" width="500">

Artificial neural networks (ANNs) are loosely inspired by the neural structure of the human brain. Just as a human brain processes information, an ANN receives, processes, and outputs information. An ANN has at least three layers (input, hidden, and output), and each layer can have any number of nodes. The picture below shows 2 input nodes in the input layer, 3 hidden nodes in the hidden layer, and 1 output node in the output layer (plus 1 bias unit each for the input and hidden layers).
<img src="img/Simple multi-layer neural network.PNG" width="500">
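
To make the layer sizes concrete, here is a minimal sketch of one forward pass through a 2-3-1 network written with plain tensors. The variable names (`W1`, `b1`, `W2`, `b2`), the random weights, and the choice of a sigmoid activation are assumptions made for illustration; the picture does not specify them.

```python
import torch

torch.manual_seed(0)  # make the random weights reproducible

# Hypothetical parameters for a 2-3-1 network (names are illustrative only).
W1 = torch.randn(2, 3)   # input layer (2 nodes) -> hidden layer (3 nodes)
b1 = torch.zeros(3)      # bias feeding the hidden layer
W2 = torch.randn(3, 1)   # hidden layer (3 nodes) -> output layer (1 node)
b2 = torch.zeros(1)      # bias feeding the output layer

x = torch.tensor([[0.5, -1.0]])       # one sample with 2 input features
hidden = torch.sigmoid(x @ W1 + b1)   # assumed sigmoid activation, shape (1, 3)
y_pred = hidden @ W2 + b2             # network output, shape (1, 1)

print(hidden.shape, y_pred.shape)
```

The weight matrix shapes mirror the node counts: 2 inputs by 3 hidden nodes, then 3 hidden nodes by 1 output node.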

Training an ANN is an iterative process: we train the network until the difference between the predicted and actual outcomes is minimal. We first receive the inputs, process them in the hidden layers, and predict the outcome (Y-Pred in the picture above). This process is called forward propagation, or feedforward.

 <img src="img/nn training.PNG" width="700">

Next, we compare the predicted outcome (Y-Pred) with the actual outcome (Y) to check how well the ANN model performed. We then calculate the error function and its gradient. Backpropagation is the process of spreading the error back to each of the weights using the chain rule. Finally, we update the weights and rerun the whole process until the model produces better output. The picture above shows this neural network training cycle.
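
The training cycle can be sketched in a few lines of plain Python before bringing PyTorch into it. The example below assumes a deliberately tiny model with a single weight (`y_pred = w * x`), toy data, mean squared error as the error function, and a hand-picked learning rate; none of these come from the pictures above, they are only there to show the loop.

```python
# Toy data: the true relationship is y = 2 * x (assumed for illustration).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w = 0.0                # single weight, starting from an arbitrary guess
learning_rate = 0.05   # assumed step size

losses = []
for epoch in range(50):
    # Forward propagation: predict with the current weight.
    y_preds = [w * x for x in xs]
    # Error function: mean squared error between predicted and actual.
    loss = sum((p - y) ** 2 for p, y in zip(y_preds, ys)) / len(xs)
    losses.append(loss)
    # Backpropagation: the chain rule gives dLoss/dw = mean(2 * (p - y) * x).
    grad = sum(2 * (p - y) * x for p, y, x in zip(y_preds, ys, xs)) / len(xs)
    # Weight update: take a step against the gradient.
    w -= learning_rate * grad

print(round(w, 3), losses[0] > losses[-1])
```

Each pass through the loop is one turn of the cycle: forward propagation, error calculation, gradient calculation, and weight update.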

Now, back to the core components of the PyTorch framework.

### Tensors
![title](img/data dimensions.png)

* A tensor, as noted above, is a multi-dimensional matrix containing elements of a single data type. Tensors are part of PyTorch's `torch` package.

> `torch` package provides a flexible N-dimensional array or Tensor, which supports basic routines for indexing, slicing, transposing, type-casting, resizing, sharing storage and cloning. The Tensor also supports mathematical operations like max, min, sum, statistical distributions like uniform, normal and multinomial, and BLAS operations like dot product, matrix-vector multiplication, matrix-matrix multiplication, matrix-vector product and matrix product.

* So why do we need PyTorch tensors when we have NumPy ndarrays? One good reason is that PyTorch can use the power of GPUs to accelerate numerical calculations. Otherwise, the two are conceptually very similar. A CPU tensor and a NumPy array converted from one another (for example with `torch.from_numpy()` or `Tensor.numpy()`) share their underlying memory, so changing one changes the other. We can convert a tensor to an array and vice versa very easily.
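
The memory sharing described above can be seen in a short sketch. It assumes NumPy is installed alongside PyTorch and that everything stays on the CPU for the shared-memory part; the final lines move the tensor to a GPU only if one happens to be available.

```python
import numpy as np
import torch

arr = np.array([1.0, 2.0, 3.0])
t = torch.from_numpy(arr)   # CPU tensor that shares memory with arr

arr[0] = 10.0               # an in-place change to the array...
print(t)                    # ...is visible through the tensor

back = t.numpy()            # back to NumPy, again sharing memory
t[1] = 20.0                 # an in-place change to the tensor...
print(arr, back)            # ...is visible through both arrays

copy = torch.tensor(arr)    # torch.tensor() copies the data instead
arr[2] = 30.0
print(copy)                 # the copy does not see the new value

# Move the tensor to a GPU when one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
t_dev = t.to(device)
print(t_dev.device)
```

Note that a tensor living on a GPU no longer shares memory with the NumPy array, since the data has been copied to the device.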

### Computational Graphs

Computational graphs are another core component of the PyTorch deep learning framework. A computational graph describes the sequence of operations that occur between variables. In a neural network, this chain of computations produces the output prediction. Neural networks depend heavily on the computational graph during training: it helps calculate the gradient (the partial derivatives of the error function with respect to the weights) by applying the chain rule efficiently.

PyTorch uses a dynamic computational graph (as opposed to a conventional static computational graph) to calculate the gradient during backpropagation. Here are the differences between static and dynamic computational graphs:
 <img src="img/static vs dynamic.png">
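
One practical consequence of a dynamic graph is that ordinary Python control flow can change the graph from one call to the next. The sketch below is a made-up example (the `forward` function and the threshold of 10 are hypothetical): the number of multiplications recorded depends on the input value.

```python
import torch

def forward(x, w):
    # Plain Python control flow decides how many operations get recorded,
    # so the graph can differ for every input.
    h = x * w
    steps = 1
    while h.abs().item() < 10:
        h = h * w
        steps += 1
    return h, steps

w = torch.tensor(2.0, requires_grad=True)

out_small, steps_small = forward(torch.tensor(1.0), w)  # several multiplications
out_large, steps_large = forward(torch.tensor(6.0), w)  # a single multiplication
print(steps_small, steps_large)

# Backpropagate through the graph that was actually built for the small input.
out_small.backward()
print(w.grad)
```

With a static graph, the loop would have to be expressed with graph-level control flow constructs defined before any data is seen.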

Here is an example of a computational graph (from PyTorch.org) showing how backpropagation works using PyTorch's autograd package. This brings us to the next topic: **Autograd**.
 <img src="img/dynamic_graph.gif">
 
### Autograd

PyTorch's autograd package provides reverse-mode automatic differentiation (backpropagation) for all operations on tensors to calculate the gradient. Because PyTorch follows a "define-by-run" dynamic approach, the graph is built as the computation runs, and backpropagation follows whatever computation was actually executed, as opposed to a static graph structure specified in advance.

If you are interested, take a look at [PyTorch.org article on Autograd](https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html#sphx-glr-beginner-blitz-autograd-tutorial-py) on how we can set an attribute to a Tensor and calculate the gradients automatically.
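
As a minimal sketch of what that looks like in practice, the example below marks a tensor with `requires_grad=True`, builds a small expression, and asks autograd for the derivative. The function `y = x**2 + 2*x` is an arbitrary choice for illustration.

```python
import torch

# Setting requires_grad=True tells autograd to track operations on x.
x = torch.tensor(3.0, requires_grad=True)

y = x ** 2 + 2 * x   # the graph is recorded as this line runs
print(y.grad_fn)     # the node that produced y in the recorded graph

y.backward()         # reverse-mode differentiation: dy/dx = 2x + 2
print(x.grad)        # tensor(8.) for x = 3
```

The gradient lands in `x.grad`, which is where an optimizer would later read it to update a weight.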

### PyTorch Packages

Here are some of the PyTorch packages used in creating an ANN:
<img src="img/pytorch_packages.PNG">

Let's incorporate the PyTorch concepts above into our neural network training cycle. Here are the steps:
1. Convert the input data into torch tensors
2. Create a model with a sequence of input, hidden, and output layers
3. Perform forward propagation to generate the predicted output
4. Compare the predicted output with the actual output
5. Calculate the error (the difference between predicted and actual) and then perform gradient descent using backpropagation (with PyTorch's autograd) to update the weights, repeating until the model produces better output
6. Check the performance of the model with different metrics
7. Once the model is performing well, we can validate the model with test data
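
The steps above can be sketched as a short training script. Everything specific here is an assumption made for illustration: the synthetic data, the 2-3-1 architecture with a tanh activation, mean squared error as the loss, plain SGD with a learning rate of 0.1, and 200 epochs. A real project would pick these to suit its data.

```python
import torch
from torch import nn

torch.manual_seed(0)  # reproducible toy data and initial weights

# Step 1: input data as torch tensors (synthetic data, assumed for illustration).
X = torch.rand(64, 2)
y = X[:, :1] + 2 * X[:, 1:]          # target: y = x1 + 2 * x2, shape (64, 1)
X_train, y_train = X[:48], y[:48]
X_test, y_test = X[48:], y[48:]

# Step 2: a model with input, hidden and output layers (2 -> 3 -> 1).
model = nn.Sequential(nn.Linear(2, 3), nn.Tanh(), nn.Linear(3, 1))
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for epoch in range(200):
    y_pred = model(X_train)           # Step 3: forward propagation
    loss = loss_fn(y_pred, y_train)   # Steps 4-5: compare and compute the error
    optimizer.zero_grad()             # clear gradients left from the previous step
    loss.backward()                   # Step 5: backpropagation via autograd
    optimizer.step()                  # Step 5: gradient descent weight update
    losses.append(loss.item())        # Step 6: track a performance metric

# Step 7: validate on held-out test data without recording a graph.
with torch.no_grad():
    test_loss = loss_fn(model(X_test), y_test).item()

print(f"train loss: {losses[0]:.4f} -> {losses[-1]:.4f}, test loss: {test_loss:.4f}")
```

Calling `optimizer.zero_grad()` before `loss.backward()` matters because PyTorch accumulates gradients across backward passes by default.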

So far we have gone through a theoretical overview of the PyTorch framework. Now let's install it and take this further.

## Installation
 * For installation, go to https://pytorch.org/
 * It offers options for installing locally or in the cloud
 ![title](img/installation.PNG)

**Installing PyTorch on Windows**
 * As a prerequisite, you need access to a computer with Python, NumPy, Matplotlib, and Jupyter Notebook installed
 * Once you select your preferences, you will get the command to run
 * Note: CUDA is NVIDIA’s parallel computing platform. If your computer doesn’t have an NVIDIA graphics chip, you can select ‘None’ in the preferences
 
 <img src="img/installation step1.PNG" width="500"> 
 * Once the PyTorch installation is complete, the installer will ask whether to proceed with torchvision. Say yes to continue…
 <img src="img/installation step2.PNG" width="500"> 
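
Once the installation finishes, a quick sanity check along these lines can confirm that PyTorch imports correctly. This is only a suggested check, not part of the official installer output; the CUDA line is expected to print `False` on machines without an NVIDIA GPU.

```python
import torch

print(torch.__version__)           # installed PyTorch version
print(torch.cuda.is_available())   # True only if a usable CUDA GPU is present

x = torch.rand(2, 3)               # small random tensor as a smoke test
print(x)
```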

## Summary
We have looked at what PyTorch is, its core components, and how to install it. In the next post, we will look at torch tensors, their operations, and how to create a feedforward neural network with PyTorch.

**Sources and Interesting articles:**

* [About Torch package](https://en.wikipedia.org/wiki/Torch_(machine_learning))
* [PyTorch Documentation](https://pytorch.org/docs/stable/index.html)
* [PyTorch vs Tensorflow](https://towardsdatascience.com/pytorch-vs-tensorflow-spotting-the-difference-25c75777377b)
* [PyTorch dynamic computational graphs](https://medium.com/intuitionmachine/pytorch-dynamic-computational-graphs-and-modular-deep-learning-7e7f89f18d1)
* [Automatic differentiation in PyTorch](https://openreview.net/pdf?id=BJJsrmfCZ)
* [PyTorch articles](https://jhui.github.io/)
* [Deep Learning Frameworks](https://www.kdnuggets.com/2017/02/anatomy-deep-learning-frameworks.html)
* [Exploring PyTorch](https://blog.algorithmia.com/exploring-the-deep-learning-framework-pytorch/)