# PyTorch

In this part of the notes we use <a href="https://pytorch.org/">PyTorch</a> to develop deep learning models.
PyTorch is, at the time of writing, one of the most widely used frameworks for deep learning computations. Alongside its C++ implementation,
it exposes a Python API to simplify its use. There are lots of resources for learning PyTorch, so here we only go over the basic elements of the library and how to use them. In particular, we focus on the following core elements that deep learning algorithms use to train a model:

- ```Module``` which contains ```Layers```
- ```Optimizer```
- ```Loss```
- ```Trainer```

Let's look at each of these, starting with the ```Module```.

### ```Module```

In PyTorch, we build our networks around the ```nn.Module``` base class. For a simple case, we only have to define the structure of our network and implement the ```forward()``` function. This is shown below, where we use the example in <a href="https://towardsdatascience.com/understanding-pytorch-with-an-example-a-step-by-step-tutorial-81fc5f8c4e8e">Understanding PyTorch with an example: a step-by-step tutorial</a> to set up a linear regression model in PyTorch. Recall that in linear regression we assume a model of the form

$$y = a + bx + \epsilon$$

Typically, a model defines its parameters and operations in the ```__init__``` method and applies them in the ```forward``` function, which computes the prediction(s) of the model. This is shown below.


In [26]:
import torch
from torch import nn, Tensor
torch.manual_seed(42)

<torch._C.Generator at 0x7f3f311cff30>

In [27]:
class LinearRegression(nn.Module):
    def __init__(self):
        super().__init__()
        # To make "a" and "b" real parameters of the model, we need to wrap them with nn.Parameter
        self.a = nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float))
        self.b = nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float))
        
    def forward(self, x):
        # Computes the outputs / predictions
        return self.a + self.b * x

In [28]:
model = LinearRegression()
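
The list at the top mentions that a ```Module``` contains layers, whereas the model above stores its two parameters directly. As a minimal sketch, and not part of the original example, the same kind of model can be written around an ```nn.Linear``` layer. The class name ```LayerLinearRegression``` is a hypothetical name introduced here for illustration only.

```python
import torch
from torch import nn

torch.manual_seed(42)


class LayerLinearRegression(nn.Module):
    """Hypothetical variant of LinearRegression built from an nn.Linear layer."""

    def __init__(self):
        super().__init__()
        # nn.Linear(1, 1) holds one weight (the slope) and one bias (the intercept)
        self.linear = nn.Linear(in_features=1, out_features=1)

    def forward(self, x):
        # Delegates the computation weight * x + bias to the layer
        return self.linear(x)


layer_model = LayerLinearRegression()

# Parameters of sub-modules are registered automatically under a prefixed name
param_names = [name for name, _ in layer_model.named_parameters()]
print(param_names)  # ['linear.weight', 'linear.bias']

# Calling the model runs forward(); the input has shape (batch, 1)
x = torch.tensor([[0.0], [1.0], [2.0]])
y_hat = layer_model(x)
print(y_hat.shape)  # torch.Size([3, 1])
```

In this variant the layer's bias plays the role of the intercept and its weight the role of the slope, so there is no need to wrap anything in ```nn.Parameter``` by hand.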

### ```Optimizer``` and ```Loss```

As you know by now, parametric models, like deep neural networks, have to determine their parameters somehow.
Optimization is therefore an essential component of PyTorch. It is built around the concepts of
the ```Optimizer``` and the ```Loss```. When developing machine learning models, we need a metric that measures the error, or loss, and an optimization method that minimizes that metric. We have seen this strategy in almost all the previous lessons. The PyTorch ```optim``` module provides a number of widely used optimizers.

In [29]:
import torch.optim as optim

An optimizer is constructed with the parameters it should update, here those of our model:

In [30]:
sgd_optimizer = optim.SGD(model.parameters(), lr=0.001)
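
To make the role of the optimizer concrete, the following sketch checks that one ```step()``` of plain SGD (no momentum, no weight decay) matches the manual update $w \leftarrow w - \text{lr} \cdot \nabla_w \text{loss}$. The single parameter ```w```, the toy loss $w^2$ and the learning rate of 0.1 are assumptions chosen for illustration and do not appear in the example above.

```python
import torch
from torch import nn, optim

# A single trainable parameter, chosen only for illustration
w = nn.Parameter(torch.tensor([1.0]))
optimizer = optim.SGD([w], lr=0.1)

# Toy loss w^2, whose gradient with respect to w is 2w = 2.0
loss = (w ** 2).sum()
loss.backward()

# Manual SGD update: 1.0 - 0.1 * 2.0 = 0.8
expected = w.detach().clone() - 0.1 * w.grad

# The optimizer applies the same update in place, then we reset the gradients
optimizer.step()
optimizer.zero_grad()

print(w.detach(), expected)
```

Resetting the gradients matters because ```backward()``` accumulates into the stored gradient rather than overwriting it.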

Similarly, we can define loss functions. PyTorch provides a number of them in the ```nn``` module, for example:

```
mean_squared_error_loss = nn.MSELoss()
softmax_cross_entropy_loss = nn.CrossEntropyLoss()
```

Since we are doing linear regression, let's define an ```nn.MSELoss()```:

In [31]:
mse_loss = nn.MSELoss(reduction='mean')
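
As a quick sanity check, the sketch below compares ```nn.MSELoss``` with the mean squared error written out by hand, $\frac{1}{n}\sum_i (\hat{y}_i - y_i)^2$. The three prediction and target values are made up for illustration.

```python
import torch
from torch import nn

# Made-up predictions and targets, shape (3, 1)
y_hat = torch.tensor([[1.0], [2.0], [4.0]])
y = torch.tensor([[1.0], [3.0], [2.0]])

# nn.MSELoss expects (input, target), i.e. predictions first
mse_mean = nn.MSELoss(reduction='mean')(y_hat, y)
mse_sum = nn.MSELoss(reduction='sum')(y_hat, y)

# Squared errors are 0, 1 and 4, so the mean is 5/3 and the sum is 5
manual_mean = ((y_hat - y) ** 2).mean()

print(mse_mean.item(), manual_mean.item(), mse_sum.item())
```

With ```reduction='mean'``` the loss does not grow with the batch size, which makes learning rates easier to compare across batch sizes.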

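We now have everything we need to start training a model. Core PyTorch does not provide a ready-made ```Trainer``` class, so the training loop is written by hand. Let's first generate some data.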

In [32]:
import numpy as np

In [33]:
np.random.seed(42)
x = np.random.rand(100, 1)
y = 1 + 2 * x + .1 * np.random.randn(100, 1)

In [34]:
# Shuffles the indices
idx = np.arange(100)
np.random.shuffle(idx)

# Uses first 80 random indices for train
train_idx = idx[:80]

# Uses the remaining indices for validation
val_idx = idx[80:]

# Generates train and validation sets
x_train, y_train = x[train_idx], y[train_idx]
x_val, y_val = x[val_idx], y[val_idx]

In [35]:
# convert to PyTorch tensors
x_train = torch.tensor(x_train)
y_train = torch.tensor(y_train)
x_val = torch.tensor(x_val)
y_val = torch.tensor(y_val)
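
One detail worth knowing about this conversion: NumPy arrays of floats default to ```float64```, and ```torch.tensor``` keeps that dtype, whereas the model parameters above were created as ```torch.float``` (```float32```). PyTorch's type promotion lets the two be combined, but if a single precision is preferred the data can be cast explicitly. The sketch below uses a small made-up array; the variable names are illustrative only.

```python
import numpy as np
import torch

# A small made-up array; NumPy uses float64 by default
x_np = np.random.rand(4, 1)

# torch.tensor copies the data and keeps the NumPy dtype
x_double = torch.tensor(x_np)
print(x_double.dtype)  # torch.float64

# Option 1: request float32 at construction time
x_float = torch.tensor(x_np, dtype=torch.float)
print(x_float.dtype)  # torch.float32

# Option 2: wrap the array with from_numpy, then cast
x_float_alt = torch.from_numpy(x_np).float()
print(x_float_alt.dtype)  # torch.float32
```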

In [37]:
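for epoch in range(10):

    # Put the model in training mode and compute the predictions
    model.train()
    yhat = model(x_train)

    # No more manual loss!
    # error = y_train - yhat
    # loss = (error ** 2).mean()
    # nn.MSELoss expects (input, target), i.e. predictions first
    loss = mse_loss(yhat, y_train)

    print("Epoch> {0} loss {1}".format(epoch, loss))

    # Compute the gradients, update the parameters, then reset the gradients
    loss.backward()
    sgd_optimizer.step()
    sgd_optimizer.zero_grad()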

print(model.state_dict())

Epoch> 0 loss 1.7223373385163594
Epoch> 1 loss 1.714435262693462
Epoch> 2 loss 1.7065720638565374
Epoch> 3 loss 1.6987477402393498
Epoch> 4 loss 1.6909620047670972
Epoch> 5 loss 1.6832147139229239
Epoch> 6 loss 1.6755056818239793
Epoch> 7 loss 1.6678347659479926
Epoch> 8 loss 1.6602016409484506
Epoch> 9 loss 1.652606208820017
OrderedDict([('a', tensor([0.6355])), ('b', tensor([0.3060]))])
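
After only 10 epochs with a learning rate of 0.001 the parameters are still far from the values used to generate the data (an intercept of 1 and a slope of 2), and the validation split has not been used yet. The following self-contained sketch trains for longer and then evaluates on the held-out data. The learning rate of 0.1, the 1000 epochs and the cast to ```float32``` are assumptions made for illustration, not settings from the run above.

```python
import numpy as np
import torch
from torch import nn, optim

torch.manual_seed(42)
np.random.seed(42)

# Same synthetic data as above: y = 1 + 2x + noise
x = np.random.rand(100, 1)
y = 1 + 2 * x + .1 * np.random.randn(100, 1)
idx = np.arange(100)
np.random.shuffle(idx)
train_idx, val_idx = idx[:80], idx[80:]

# Cast to float32 so the data matches the parameter dtype (an assumption)
x_train = torch.tensor(x[train_idx], dtype=torch.float)
y_train = torch.tensor(y[train_idx], dtype=torch.float)
x_val = torch.tensor(x[val_idx], dtype=torch.float)
y_val = torch.tensor(y[val_idx], dtype=torch.float)


class LinearRegression(nn.Module):
    def __init__(self):
        super().__init__()
        self.a = nn.Parameter(torch.randn(1))
        self.b = nn.Parameter(torch.randn(1))

    def forward(self, x):
        return self.a + self.b * x


model = LinearRegression()
optimizer = optim.SGD(model.parameters(), lr=0.1)  # assumed learning rate
mse_loss = nn.MSELoss(reduction='mean')

for epoch in range(1000):  # assumed number of epochs
    model.train()
    loss = mse_loss(model(x_train), y_train)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Evaluation: switch to eval mode and disable gradient tracking
model.eval()
with torch.no_grad():
    val_loss = mse_loss(model(x_val), y_val)

print("train loss {0:.4f}, validation loss {1:.4f}".format(loss.item(), val_loss.item()))
print(model.state_dict())
```

For this tiny model ```model.eval()``` changes nothing, since it has no dropout or batch normalization layers, but calling it before evaluation is a habit that carries over to larger networks.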


For a quick review of PyTorch, have a look at the following links:

- <a href="https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html">Deep Learning with PyTorch: A 60 Minute Blitz</a>
- <a href="https://towardsdatascience.com/understanding-pytorch-with-an-example-a-step-by-step-tutorial-81fc5f8c4e8e">Understanding PyTorch with an example: a step-by-step tutorial</a>


Furthermore, <a href="https://www.fast.ai/">fast.ai</a> offers a free online course built around PyTorch.