# Making an MLP - Multilayer Perceptron - with PyTorch

In [1]:
import torch.nn as nn
import torch

In [2]:
class MLP(nn.Module):
    def __init__(self, input_size:int):
        super(MLP, self).__init__()
        self.hidden_layer = nn.Linear(in_features=input_size, out_features=64)
        self.output_layer = nn.Linear(64, 2)
        self.activation = nn.ReLU()
        # Single dimension input to the softmax layer, so dim=0
        self.softmax = nn.Softmax(dim=0)
        # https://discuss.pytorch.org/t/implicit-dimension-choice-for-softmax-warning/12314/17

    # def forward(self, x):
    #     hidden_pass = self.hidden_layer(x)
    #     activation_pass = self.activation(hidden_pass)
    #     output_pass = self.output_layer(activation_pass)
    #     return output_pass

    def forward(self, x):
        x = self.hidden_layer(x)
        x = self.activation(x)
        x = self.output_layer(x)
        x = self.softmax(x)
        
        return x

Let's go through this bit by bit.

```python
class MLP(nn.Module):
```

Here we're inheriting from `nn.Module`. Combined with `super(MLP, self).__init__()`, this creates a class that registers the layers you assign as attributes and provides many useful methods, such as `parameters()` for handing the weights to an optimizer. You must inherit from `nn.Module` when you create a class for your network. The name of the class itself can be anything.
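To see what that registration gives you, here is a minimal sketch. The class name `TinyNet` and its layer sizes are made up for illustration and are not part of the network above.

```python
import torch.nn as nn

class TinyNet(nn.Module):  # hypothetical class, for illustration only
    def __init__(self):
        super().__init__()
        # Assigning a module as an attribute registers it with nn.Module
        self.layer = nn.Linear(3, 2)

tiny = TinyNet()

# named_parameters() walks every registered submodule
param_names = [name for name, _ in tiny.named_parameters()]
n_params = sum(p.numel() for p in tiny.parameters())

print(param_names)  # ['layer.weight', 'layer.bias']
print(n_params)     # 3 * 2 weights + 2 biases = 8
```

Without the `super().__init__()` call, assigning `self.layer` would raise an error, because the bookkeeping that `nn.Module` relies on would not have been set up yet.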

```python
self.hidden_layer = nn.Linear(in_features=input_size, out_features=64)
```

This line creates a module for a linear transformation, $x\mathbf{W}^\top + \mathbf{b}$, with a user-determined number of inputs and 64 outputs, and assigns it to `self.hidden_layer`. The module automatically creates the weight and bias tensors, which are used when the layer is called in the `forward` method. Once the network (`mlp_net`) is created, you can access them with `mlp_net.hidden_layer.weight` and `mlp_net.hidden_layer.bias`.
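As a quick check on a standalone layer (a sketch, with `in_features=10` chosen to match the example network created later), note that PyTorch stores the weight with shape `(out_features, in_features)`, so the layer multiplies by the transposed weight:

```python
import torch
import torch.nn as nn

layer = nn.Linear(in_features=10, out_features=64)

print(layer.weight.shape)  # torch.Size([64, 10])
print(layer.bias.shape)    # torch.Size([64])

# Reproduce the layer's output by hand: x @ W^T + b
x = torch.rand(10)
manual = x @ layer.weight.T + layer.bias
matches = torch.allclose(layer(x), manual, atol=1e-5)
print(matches)
```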

```python
self.output_layer = nn.Linear(64, 2)
```

Similarly, this creates another linear transformation with 64 inputs and 2 outputs.

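```python
self.activation = nn.ReLU()
self.softmax = nn.Softmax(dim=0)
```

Here I defined operations for the ReLU activation and the softmax output. Setting `dim=0` in `nn.Softmax(dim=0)` computes softmax along the first dimension of the input. That is the right choice here because the network receives a single 1-D input vector, so the output has only one dimension. For a batch of inputs with shape `(batch, features)`, you would use `dim=1` instead so that each row is normalized separately.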
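Which `dim` is appropriate depends on the shape of the tensor that reaches the softmax. A small sketch with hand-picked values (the tensors below are illustrative, not taken from the network):

```python
import torch
import torch.nn as nn

single = torch.tensor([1.0, 2.0])                # shape (2,): one sample
batch = torch.tensor([[1.0, 2.0], [3.0, 3.0]])   # shape (2, 2): two samples

# 1-D input: the only dimension is dim=0
probs_single = nn.Softmax(dim=0)(single)

# 2-D input: dim=1 normalizes each row (each sample) separately
probs_batch = nn.Softmax(dim=1)(batch)

print(probs_single.sum())      # the two probabilities sum to 1
print(probs_batch.sum(dim=1))  # each row sums to 1
```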

```python
def forward(self, x):
```

PyTorch networks created with `nn.Module` must have a `forward` method defined. It takes in a tensor `x` and passes it through the operations you defined in the `__init__` method.
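In practice you rarely call `forward` by name. Calling the module object itself goes through `nn.Module.__call__`, which runs any registered hooks and then dispatches to `forward`. A minimal sketch, using a hypothetical `Doubler` module that is not part of this notebook:

```python
import torch
import torch.nn as nn

class Doubler(nn.Module):  # hypothetical module, for illustration only
    def forward(self, x):
        return 2 * x

net = Doubler()
x = torch.tensor([1.0, 2.0])

out_call = net(x)             # preferred: goes through __call__
out_forward = net.forward(x)  # same result here, but skips hooks

print(out_call)  # tensor([2., 4.])
```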

```python
x = self.hidden_layer(x)
x = self.activation(x)
x = self.output_layer(x)
x = self.softmax(x)
```

Here the input tensor `x` is passed through each operation and reassigned to `x`. We can see that the input tensor goes through the hidden layer, then an activation function, then the output layer, and finally the softmax function. It doesn't matter what you name the variables here, as long as the inputs and outputs of the operations match the network architecture you want to build. The order in which you define things in the `__init__` method doesn't matter, but you'll need to sequence the operations correctly in the `forward` method.
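To make the sequence concrete, the sketch below runs the same four operations outside a class and records the tensor shape after each step. It assumes an input size of 10, and the intermediate names (`h`, `a`, `o`, `p`) are chosen only for this illustration.

```python
import torch
import torch.nn as nn

hidden_layer = nn.Linear(10, 64)
output_layer = nn.Linear(64, 2)
activation = nn.ReLU()
softmax = nn.Softmax(dim=0)

x = torch.rand(10)   # shape (10,)
h = hidden_layer(x)  # shape (64,)
a = activation(h)    # shape (64,), negative values clamped to 0
o = output_layer(a)  # shape (2,)
p = softmax(o)       # shape (2,), non-negative and sums to 1

for name, t in [("x", x), ("h", h), ("a", a), ("o", o), ("p", p)]:
    print(name, tuple(t.shape))
```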

Now we can create an `MLP` object.

In [3]:
mlp_net = MLP(input_size=10)
print(mlp_net)

MLP(
  (hidden_layer): Linear(in_features=10, out_features=64, bias=True)
  (output_layer): Linear(in_features=64, out_features=2, bias=True)
  (activation): ReLU()
  (softmax): Softmax(dim=0)
)


In [4]:
torch.manual_seed(0)
mlp_net(torch.rand(10))  # calling the module runs forward()

tensor([0.5523, 0.4477], grad_fn=<SoftmaxBackward0>)
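The call above passes a single 1-D vector, which is what `dim=0` assumes. If a batch of shape `(batch, 10)` were passed to the same network, `dim=0` would normalize across the batch rather than across the two outputs. One possible way around this, sketched below, is `dim=-1`, which always normalizes over the last dimension. `BatchMLP` is a hypothetical variant written for this illustration and is not part of the original notebook.

```python
import torch
import torch.nn as nn

class BatchMLP(nn.Module):  # hypothetical variant of the MLP above
    def __init__(self, input_size: int):
        super().__init__()
        self.hidden_layer = nn.Linear(input_size, 64)
        self.output_layer = nn.Linear(64, 2)
        self.activation = nn.ReLU()
        # dim=-1 normalizes over the last dimension (the two output scores)
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x):
        x = self.activation(self.hidden_layer(x))
        return self.softmax(self.output_layer(x))

torch.manual_seed(0)
net = BatchMLP(input_size=10)

single_out = net(torch.rand(10))    # shape (2,)
batch_out = net(torch.rand(4, 10))  # shape (4, 2)

print(single_out.sum())        # sums to 1
print(batch_out.sum(dim=-1))   # each of the 4 rows sums to 1
```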