A Simple PyTorch Tutorial
=========================

PyTorch is a neural network library for Python. Before we dive into PyTorch, let us first consider what functionality a neural network library should support:

* Implement a neural network $\mathbf{y} = f_\mathbf{\theta}(\mathbf{x})$ parameterized by $\mathbf{\theta}$
* Support Inference: given an input example $\mathbf{x}$, compute the output of the network $\mathbf{y}$
* Support training of the network given a set of training examples $\langle \mathbf{x}, \mathbf{y} \rangle$
    - The library should support auto-differentiation: computing the gradient $\frac{\partial f_\mathbf{\theta}(\mathbf{x})}{\partial \mathbf{\theta}}$
    - The library should also provide implementations of common optimizers (e.g., SGD, Adam) to update the network's parameters $\mathbf{\theta}$ via gradient descent
    
PyTorch provides recipes to do all of these (and more)!
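
As a rough preview, the sketch below maps each requirement above onto a PyTorch call. It is only illustrative: the built-in ``nn.Linear`` layer, the squared-error loss, the target value, and the learning rate are assumptions chosen for brevity, not part of the toy example that follows.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # make the random initial parameters reproducible

# 1. A network y = f_theta(x); nn.Linear is a built-in layer (assumed here for brevity)
model = nn.Linear(in_features=2, out_features=1)

# 2. Inference: compute the output for one input example
x = torch.tensor([0.5, 0.3])
y_pred = model(x)

# 3. Training on a single (x, y) pair with a squared-error loss (illustrative choice)
y_true = torch.tensor([1.0])
loss = ((y_pred - y_true) ** 2).sum()

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
optimizer.zero_grad()  # clear any previously accumulated gradients
loss.backward()        # auto-differentiation: fills .grad for every parameter
weight_before = model.weight.detach().clone()
optimizer.step()       # gradient-descent update of the parameters

# Recompute the loss with the updated parameters
new_loss = ((model(x) - y_true) ** 2).sum()
print(f'loss before: {loss.item():.4f}, loss after: {new_loss.item():.4f}')
```

The rest of this tutorial builds the same pieces by hand, which makes it easier to see what each call does.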

Let us try to understand some basic concepts of PyTorch using a toy example. Assume the network we'd like to implement is a simple linear model:
$$ y = \mathbf{w}^\intercal \mathbf{x} + b  $$
where our model's parameter $\mathbf{\theta} = \langle \mathbf{w}, b \rangle$.

In [16]:
import torch
import torch.nn as nn  # where most neural network modules are

We can implement our simple neural network by subclassing ``nn.Module``, the base class for neural network modules in PyTorch. Concretely, we are going to

* Create two model parameters, $\mathbf{w}$ and $b$.
* Define the computation routine of the network: how $y$ is computed given $\mathbf{x}$.

In [3]:
class MyNeuralNet(nn.Module):
    def __init__(self):
        super(MyNeuralNet, self).__init__()
        
        # w is a 2-dimensional vector with initial value [1.0, 2.0]
        self.w = nn.Parameter(
            torch.tensor([1.0, 2.0])
        )
        # b is a scalar with initial value 0.5
        self.b = nn.Parameter(
            torch.tensor(0.5)
        )
        
    def forward(self, x):
        """The forward pass"""
        y = self.w.dot(x) + self.b
        
        return y
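
Wrapping a tensor in ``nn.Parameter`` and assigning it as an attribute registers it with the module, so it shows up when the module's parameters are enumerated. A minimal sketch of this, repeating the class above in compact form so the snippet runs on its own:

```python
import torch
import torch.nn as nn

class MyNeuralNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(torch.tensor([1.0, 2.0]))
        self.b = nn.Parameter(torch.tensor(0.5))

    def forward(self, x):
        return self.w.dot(x) + self.b

f = MyNeuralNet()

# named_parameters() yields (name, parameter) pairs for every registered parameter
names = [name for name, _ in f.named_parameters()]
for name, param in f.named_parameters():
    print(name, tuple(param.shape), param.requires_grad)
```

This is the same collection that an optimizer receives when it is constructed from ``f.parameters()``.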

Having defined our simple "neural net", let's create an instance and run it!

In [19]:
f = MyNeuralNet()
x = torch.tensor([0.5, 0.3])
y = f(x)

print(f'The output `y` is {y}')

The output `y` is 1.600000023841858
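
As a quick check by hand, plugging the initial parameters $\mathbf{w} = [1.0, 2.0]$, $b = 0.5$ and the input $\mathbf{x} = [0.5, 0.3]$ into the model gives
$$ y = \mathbf{w}^\intercal \mathbf{x} + b = 1.0 \times 0.5 + 2.0 \times 0.3 + 0.5 = 1.6 $$
The trailing digits in the printed value appear because PyTorch stores these tensors as 32-bit floats by default, and 1.6 has no exact binary floating-point representation.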


We then call ``.backward()`` on the output of the neural network ``y`` to perform back-propagation, which computes the gradients of ``y`` w.r.t. the model parameters $\mathbf{\theta} = \langle \mathbf{w}, b \rangle$.

In [9]:
y.backward()

To get the gradients of the model parameters $\mathbf{\theta} = \langle \mathbf{w}, b \rangle$, simply check the ``.grad`` attribute of each parameter:

In [15]:
print(f'gradient of $w$: {f.w.grad}')
print(f'gradient of $b$: {f.b.grad}')

gradient of $w$: tensor([0.5000, 0.3000])
gradient of $b$: 1.0


We can do a sanity check against the analytic gradients to make sure the results are correct:
$$\frac{\partial y}{\partial \mathbf{w}} = \mathbf{x}, \quad \frac{\partial y}{\partial b} = 1.0$$
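
The same check can also be done in code. The sketch below repeats the model in compact form so that it runs on its own, then compares the gradients from ``.backward()`` against the analytic values; using ``torch.allclose`` for the comparison is a choice made here, not something the steps above require.

```python
import torch
import torch.nn as nn

class MyNeuralNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(torch.tensor([1.0, 2.0]))
        self.b = nn.Parameter(torch.tensor(0.5))

    def forward(self, x):
        return self.w.dot(x) + self.b

f = MyNeuralNet()
x = torch.tensor([0.5, 0.3])
y = f(x)
y.backward()  # populates f.w.grad and f.b.grad

# Analytic gradients: dy/dw = x and dy/db = 1
w_grad_ok = torch.allclose(f.w.grad, x)
b_grad_ok = torch.allclose(f.b.grad, torch.tensor(1.0))
print(f'w gradient matches x: {w_grad_ok}')
print(f'b gradient matches 1: {b_grad_ok}')
```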