# Introduction
[PyTorch] is one of the two most popular Deep Learning frameworks in Python, alongside TensorFlow. Here are some key points when comparing the two:
- In terms of abstraction level, PyTorch falls somewhere between TensorFlow and Keras. It has no built-in fit-and-predict interface, so the training and prediction loops must be written by hand.
- PyTorch is preferred by the research community because it is easier to customize; newly published architectures are often implemented in PyTorch.
- TensorFlow/Keras is better for production due to high-level interface and large deployment ecosystem.

[PyTorch]: https://github.com/pytorch/pytorch

# 1. Data manipulation

## 1.1. Computation

In [1]:
import numpy as np
import pandas as pd
import janitor
import torch
import torch.nn as nn
import torch.nn.functional as F

### Tensor
In PyTorch, we work most of the time with
<code style='font-size:13px'><a href=https://pytorch.org/docs/stable/tensors.html>Tensor</a></code>
whose operations closely resemble those of NumPy arrays. Because tensors are central to PyTorch, tensor operations are provided directly in the [top-level package][mother package]. One thing to notice is that many operations, such as matrix multiplication, require their operands to share the same data type. When a data type mismatch error occurs, call the <code style='font-size:13px'>double()</code> method to convert a tensor to 64-bit float, or <code style='font-size:13px'>float()</code> to convert it to 32-bit float.

[mother package]: https://pytorch.org/docs/stable/torch.html

In [40]:
a = torch.tensor([
    [1., 2., 3.],
    [4., 5., 6.]
])

In [41]:
torch.rand_like(a).double()

tensor([[0.1039, 0.6032, 0.2776],
        [0.3127, 0.8976, 0.3040]], dtype=torch.float64)
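
As a minimal sketch of the data type rule above (the names `m32` and `m64` are illustrative and not part of the original notebook), the snippet below multiplies a float32 matrix by a float64 one, which is expected to raise a `RuntimeError`, and then aligns the two types before multiplying.

```python
import torch

m32 = torch.ones(2, 3)                       # default dtype: torch.float32
m64 = torch.ones(3, 2, dtype=torch.float64)  # explicitly float64

try:
    torch.matmul(m32, m64)                   # matrix product with mixed dtypes
    mismatchRaised = False
except RuntimeError:
    mismatchRaised = True

# align the dtypes before multiplying
prod64 = torch.matmul(m32.double(), m64)     # convert the left operand to float64
prod32 = torch.matmul(m32, m64.float())      # or convert the right operand to float32
print(mismatchRaised, prod64.dtype, prod32.dtype)
```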

### Autograd
PyTorch provides automatic differentiation via the module
<code style='font-size:13px'><a href=https://pytorch.org/docs/stable/autograd.html>autograd</a></code>,
with functions implemented as
<code style='font-size:13px'>Tensor</code>
methods. From a mathematical point of view, it distinguishes two types of tensor, *constant* and *variable*, indicated by the flag <code style='font-size:13px'>requires_grad</code>. All tensors are constants by default and become variables when this flag is enabled. With variables, we compute the *forward* pass using the same operations as for normal tensors. The difference lies in the *backward* pass: when we call the <code style='font-size:13px'>backward()</code> method on the output variable, PyTorch computes and accumulates partial derivatives for the leaf nodes. These derivatives can be accessed via the <code style='font-size:13px'>grad</code> attribute of the input variables.
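
To make the constant/variable distinction and the accumulation behaviour concrete, here is a small sketch on toy tensors (the names `t` and `c` are hypothetical). It runs the backward pass twice without resetting, so the second call adds to the gradient stored by the first.

```python
import torch

t = torch.tensor([1., 2., 3.], requires_grad=True)  # a variable (leaf node)
c = torch.tensor([4., 5., 6.])                      # a constant: requires_grad is False

out = (c * t ** 2).sum()     # forward pass; d(out)/dt = 2 * c * t
out.backward()               # backward pass fills t.grad
firstGrad = t.grad.clone()   # tensor([ 8., 20., 36.])

out = (c * t ** 2).sum()     # rebuild the graph and run backward again
out.backward()
secondGrad = t.grad.clone()  # gradients were added: tensor([16., 40., 72.])

t.grad.zero_()               # reset before the next backward pass
print(firstGrad, secondGrad, c.grad)
```

Since `c` never required a gradient, its `grad` attribute stays `None`.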

We demonstrate Autograd with a simple Linear Regression problem on the Boston dataset. Note that $\mathbf{X}$ and $\mathbf{y}$ are variables of the model, but from the perspective of the loss function $\mathcal{L}$ they are constants, while $\mathbf{w}$ and $b$ are the variables. In this section, $\mathbf{w}$ and $b$ are randomly initialized and will be optimized later.

In [13]:
# constants
df = pd.read_csv('data/boston.csv')
x = torch.tensor(df.drop(columns='price').values, dtype=torch.float32)
y = torch.tensor(df.price, dtype=torch.float32).reshape(-1, 1)
x.shape

torch.Size([506, 13])

In [14]:
# variables
w = torch.rand(13, 1, requires_grad=True)
b = torch.rand(1, requires_grad=True)

In [15]:
# forward pass
yPred = torch.matmul(x, w) + b
yTrue = y
loss = F.mse_loss(yPred, yTrue)

In [17]:
# backward pass
loss.backward()
w.grad

tensor([[5.3905e+03],
        [1.1662e+04],
        [1.4044e+04],
        [7.7154e+01],
        [6.5687e+02],
        [7.1741e+03],
        [8.2400e+04],
        [4.1105e+03],
        [1.2983e+04],
        [5.1313e+05],
        [2.1505e+04],
        [4.0734e+05],
        [1.5525e+04]])
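
As a sanity check on the backward pass, the gradient of the mean squared error has the closed form $\nabla_{\mathbf{w}} \mathcal{L} = \frac{2}{n}\mathbf{X}^\top(\mathbf{X}\mathbf{w} + b - \mathbf{y})$. The sketch below compares it with the result of Autograd. It uses small random tensors (`xs`, `ys`, `ws`, `bs`) in place of the Boston data, which is an assumption made only so the snippet runs on its own.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
xs = torch.rand(20, 3)                       # synthetic features
ys = torch.rand(20, 1)                       # synthetic target
ws = torch.rand(3, 1, requires_grad=True)
bs = torch.rand(1, requires_grad=True)

lossS = F.mse_loss(torch.matmul(xs, ws) + bs, ys)
lossS.backward()

with torch.no_grad():
    residual = torch.matmul(xs, ws) + bs - ys
    gradW = 2 / len(xs) * torch.matmul(xs.T, residual)  # (2/n) X^T (Xw + b - y)
    gradB = 2 * residual.mean()                          # (2/n) * sum of residuals

print(torch.allclose(ws.grad, gradW, atol=1e-5), torch.allclose(bs.grad, gradB, atol=1e-5))
```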

### Optimization

In [37]:
from torch.utils.data import Dataset, DataLoader

In [43]:
len(y)

506

In [50]:
x[2]

tensor([2.7290e-02, 0.0000e+00, 7.0700e+00, 0.0000e+00, 4.6900e-01, 7.1850e+00,
        6.1100e+01, 4.9671e+00, 2.0000e+00, 2.4200e+02, 1.7800e+01, 3.9283e+02,
        4.0300e+00], dtype=torch.float64)

In [51]:
y[2]

tensor([34.7000], dtype=torch.float64)

In [52]:
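class TabularData(Dataset):
    def __init__(self, df, labelName):
        # store features and label as tensors so that samples can be indexed by row
        self.features = torch.tensor(df.drop(columns=labelName).values, dtype=torch.float32)
        self.label = torch.tensor(df[labelName].values, dtype=torch.float32).reshape(-1, 1)
    
    def __len__(self):
        return len(self.label)
    
    def __getitem__(self, idx):
        # return one (features, label) pair
        return self.features[idx], self.label[idx]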

In [55]:
TabularData(df, 'price')

<__main__.TabularData at 0x1a72763b550>
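
A `Dataset` is usually consumed through a `DataLoader`, which takes care of batching and shuffling. The following sketch shows this on a toy dataset. The class `ToyData` and its contents are hypothetical and stand in for the tabular data above, so that the snippet runs without the CSV file.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ToyData(Dataset):
    """Row-indexable dataset over two tensors (hypothetical helper)."""
    def __init__(self, features, label):
        self.features = features
        self.label = label

    def __len__(self):
        return len(self.label)

    def __getitem__(self, idx):
        return self.features[idx], self.label[idx]

# 10 samples with 2 features each and a single label column
toy = ToyData(torch.arange(20.).reshape(10, 2), torch.arange(10.).reshape(-1, 1))
loader = DataLoader(toy, batch_size=4, shuffle=True)

# each iteration yields one batch of features and labels
batchShapes = [(tuple(xb.shape), tuple(yb.shape)) for xb, yb in loader]
print(len(loader), batchShapes)
```

With 10 samples and a batch size of 4, the loader yields three batches, the last of which holds the remaining 2 samples.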

In [29]:
params = w, b
optimizer = torch.optim.SGD(params, lr=0.1)

In [None]:
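nIter = 100
for i in range(nIter):  # note: lr=0.1 may diverge on unscaled features
    optimizer.zero_grad()                         # clear gradients accumulated so far
    loss = F.mse_loss(torch.matmul(x, w) + b, y)  # forward pass
    loss.backward()                               # backward pass
    optimizer.step()                              # update w and b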

In [7]:
import torch

x = torch.ones(5)  # input tensor
y = torch.zeros(3)  # expected output
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w)+b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)

In [8]:
loss.backward()

In [10]:
w.grad

tensor([[0.3269, 0.2952, 0.0765],
        [0.3269, 0.2952, 0.0765],
        [0.3269, 0.2952, 0.0765],
        [0.3269, 0.2952, 0.0765],
        [0.3269, 0.2952, 0.0765]])
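
To tie the pieces of this section together, here is a minimal sketch of a full optimization loop with `torch.optim.SGD`. It assumes synthetic, unit-scale features and a noiseless linear target (`xSyn`, `ySyn` and `wTrue` are made up for illustration), so that a learning rate of 0.1 converges.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
xSyn = torch.randn(200, 3)                      # features on a unit scale
wTrue = torch.tensor([[1.5], [-2.0], [0.5]])    # weights used to generate the target
ySyn = torch.matmul(xSyn, wTrue) + 0.3          # noiseless linear target

wFit = torch.zeros(3, 1, requires_grad=True)
bFit = torch.zeros(1, requires_grad=True)
optimizer = torch.optim.SGD([wFit, bFit], lr=0.1)

losses = []
for i in range(200):
    optimizer.zero_grad()                       # clear accumulated gradients
    loss = F.mse_loss(torch.matmul(xSyn, wFit) + bFit, ySyn)
    loss.backward()                             # compute new gradients
    optimizer.step()                            # update wFit and bFit in place
    losses.append(loss.item())

print(losses[0], losses[-1])
```

Each iteration follows the same three steps: reset the gradients, run the forward and backward passes, then let the optimizer update the variables.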

# 2. Neural networks

PyTorch has two APIs for creating layers,
<code style='font-size:13px'><a href=https://pytorch.org/docs/stable/nn.html>nn</a></code>
(short for *neural network*) and
<code style='font-size:13px'><a href=https://pytorch.org/docs/stable/nn.functional.html>nn.functional</a></code>.
The first module provides an object interface, whose layers store their own trainable parameters, and the second module provides a stateless function interface, which is lighter to use. Both support automatic differentiation. So, the best practice is to use the object interface for layers with trainable parameters, such as recurrent or convolutional layers, and the function interface for loss functions or activation functions.
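
The difference between the two interfaces can be sketched as follows (the layer sizes and the tensor `inp` are arbitrary choices for illustration). An `nn.Linear` object owns its weight and bias, while the functional counterpart needs them passed in explicitly; for a parameter-free operation such as ReLU, both forms give the same result.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
inp = torch.randn(4, 3)

linear = nn.Linear(3, 2)                                 # object interface: owns weight and bias
nParams = sum(p.numel() for p in linear.parameters())    # 3*2 weights + 2 biases

hidden = linear(inp)
outModule = nn.ReLU()(hidden)                            # parameter-free layer as an object
outFunc = F.relu(hidden)                                 # same operation as a plain function

# the functional form of a linear layer takes the parameters as arguments
outLinearFunc = F.linear(inp, linear.weight, linear.bias)
print(nParams, torch.equal(outModule, outFunc), torch.allclose(hidden, outLinearFunc))
```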

PyTorch has two APIs for creating models, of which the recommended one is
<code style='font-size:13px'><a href=https://pytorch.org/docs/stable/generated/torch.nn.Module.html>nn.Module</a></code>,
which is closest in spirit to model subclassing in Keras. To create a model, we inherit from this class, define the building blocks inside the <code style='font-size:13px'>\_\_init__()</code> method and design the neural network architecture in the <code style='font-size:13px'>forward()</code> method. We don't need to specify the backward pass, as the submodule
<code style='font-size:13px'><a href=https://pytorch.org/docs/stable/autograd.html>autograd</a></code>
will handle it for us. The second API,
<code style='font-size:13px'><a href=https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html>nn.Sequential</a></code>,
is good for simple architectures as well as for small blocks of large networks, for example the inception block of GoogLeNet.
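
As a brief sketch of the second API, the snippet below stacks three layers with `nn.Sequential` and runs a batch through them. The layer sizes and the name `block` are arbitrary choices for illustration.

```python
import torch
from torch import nn

# layers are applied in the order they are listed
block = nn.Sequential(
    nn.Linear(10, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
)

batch = torch.randn(8, 10)   # 8 samples with 10 features each
pred = block(batch)          # forward pass through all three layers
firstLayer = block[0]        # sub-modules can be accessed by position
print(pred.shape, firstLayer)
```

Because a `Sequential` container is itself a module, it can be assigned as a building block inside the constructor of a larger `nn.Module`.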

In [1]:
import torch
from torch import nn

In [2]:
torch.cuda.is_available()

False

In [56]:
x = torch.tensor([[1., 2., 3., 4., 5.]], requires_grad=True)
y = torch.sum(x**2)
y.backward() 
x.grad

tensor([[ 2.,  4.,  6.,  8., 10.]])

In [57]:
class TwoLayerNet(nn.Module):
    def __init__(self, D_in, H, D_out):
        """
        In the constructor we instantiate two nn.Linear modules and assign them as
        member variables.

        D_in: input dimension
        H: dimension of hidden layer
        D_out: output dimension
        """
        super(TwoLayerNet, self).__init__()
        self.linear1 = nn.Linear(D_in, H) 
        self.linear2 = nn.Linear(H, D_out)

    def forward(self, x):
        """
        In the forward function we accept a Tensor of input data and we must
        return a Tensor of output data. We can use Modules defined in the
        constructor as well as arbitrary operators on Tensors.
        """
        h_relu = nn.functional.relu(self.linear1(x))
        y_pred = self.linear2(h_relu)
        return y_pred
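
A model like this can be trained with the same loop as before, by passing `model.parameters()` to the optimizer. The sketch below restates the class in compact form so that it runs on its own. The random data (`xIn`, `yOut`), the layer sizes and the learning rate are assumptions chosen only for illustration.

```python
import torch
from torch import nn

class TwoLayerNet(nn.Module):        # restated so the snippet is self-contained
    def __init__(self, D_in, H, D_out):
        super().__init__()
        self.linear1 = nn.Linear(D_in, H)
        self.linear2 = nn.Linear(H, D_out)

    def forward(self, x):
        return self.linear2(nn.functional.relu(self.linear1(x)))

torch.manual_seed(0)
xIn = torch.randn(64, 10)            # hypothetical inputs: 64 samples, 10 features
yOut = torch.randn(64, 1)            # hypothetical targets

model = TwoLayerNet(D_in=10, H=32, D_out=1)
lossFn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

history = []
for epoch in range(50):
    optimizer.zero_grad()
    loss = lossFn(model(xIn), yOut)  # calling the model invokes forward()
    loss.backward()
    optimizer.step()
    history.append(loss.item())

print(history[0], history[-1])
```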

In [72]:
class Rectangle:
    def __init__(self, length, width):
        self.length = length
        self.width = width

    def area(self):
        return self.length * self.width

    def perimeter(self):
        return 2 * self.length + 2 * self.width

class Square(Rectangle):
    def __init__(self, length):
        super().__init__(length, length)
        
class Cube(Square):
    def surface_area(self):
        face_area = super().area()
        return face_area * 6

    def volume(self):
        face_area = super().area()
        return face_area * self.length

# References
- *pytorch.org - [Autograd mechanics](https://pytorch.org/docs/stable/notes/autograd.html)*
- *pytorch.org - [Automatic differentiation with Torch.Autograd](https://pytorch.org/tutorials/beginner/basics/autogradqs_tutorial.html)*
- *pytorch.org - [Deep Learning with PyTorch: A 60-minute blitz](https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html)*
- *towardsdatascience.com - [Understanding PyTorch with an example: a step-by-step tutorial](https://towardsdatascience.com/understanding-pytorch-with-an-example-a-step-by-step-tutorial-81fc5f8c4e8e)*
- *towardsdatascience.com - [PyTorch vs TensorFlow - spotting the difference](https://towardsdatascience.com/pytorch-vs-tensorflow-spotting-the-difference-25c75777377b)*
- *blog.paperspace.com - [PyTorch 101 advanced](https://blog.paperspace.com/pytorch-101-advanced/)*
- *poloclub.github.io - [CNN explainer](https://poloclub.github.io/cnn-explainer/)*
- *cs230.stanford.edu - [Introduction to PyTorch code examples](https://cs230.stanford.edu/blog/pytorch/)*

In [None]:
conda install pytorch torchvision torchaudio cpuonly -c pytorch

In [None]:
!pip3 install torch torchvision torchaudio --user

---
*&#9829; By Quang Hung x Thuy Linh &#9829;*