## Hands-on Challenge Solution

Now that we have seen the different non-linear functions, it's time to implement them ourselves. We will implement each function, together with a linear layer, as a layer for our MLP:

1.  linearlayer
2.  softmax
3.  relu
4.  sigmoid
5.  tanh

All the layers will have the same basic structure:

-   an init function that initializes the required data structure(s)
-   a forward function that is called to process input data
-   a reset function that clears out the stored outputs

**Note:** There is a reason to keep an *out* tensor rather than just passing the results through: later we will need the stored outputs to compute the gradients.
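
To illustrate why a stored output is useful, consider the sigmoid: its derivative can be written purely in terms of its output, as `out * (1 - out)`. The sketch below is an illustration only, not part of the layer classes in this notebook; the function names `sigmoid_forward` and `sigmoid_grad_from_out` are hypothetical.

```python
import numpy as np

def sigmoid_forward(x):
    # Same formula a sigmoid layer would use in its forward pass
    return 1 / (1 + np.exp(-x))

def sigmoid_grad_from_out(out):
    # Hypothetical helper: sigmoid derivative expressed via the stored output
    return out * (1 - out)

x = np.array([-2.0, 0.0, 2.0])
out = sigmoid_forward(x)            # the value a layer would keep in `out`
grad = sigmoid_grad_from_out(out)   # no need to recompute exp(-x)

# Compare against a central finite-difference estimate of the derivative
eps = 1e-6
numeric = (sigmoid_forward(x + eps) - sigmoid_forward(x - eps)) / (2 * eps)
print(np.allclose(grad, numeric, atol=1e-6))
```

Reusing the stored output this way avoids evaluating the exponential a second time during the backward pass.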



In [4]:
import torch
import torch.nn.functional as F  
import numpy as np

class layer():
    # Base class (parent class) for linearlayer and the non-linear functions.
    # It represents a single layer and only needs node_dim, the number of nodes.

    def __init__(self, node_dim):
        """
        This init should be called via super() with the number
        of nodes as an argument.
        """
        self.input = np.zeros(node_dim)
        self.input_grad = np.zeros(node_dim)
        self.out = np.zeros(node_dim)

    def forward(self, x):
        self.input = x

    def zero_grad(self):
        self.input_grad.fill(0.)

    def reset(self):
        # Clear the stored output; MLP.reset() calls this on every layer.
        self.out.fill(0.)

The linearlayer takes two more arguments: the input dimension and a switch to include a bias. Normally we want a bias, but we make it optional just in case.



In [5]:
class linearlayer(layer):
    def __init__(self, in_dim, node_dim, bias=True):
        super(linearlayer, self).__init__(node_dim)
        self.out = np.zeros(node_dim)
        self.weights = np.random.rand(node_dim,in_dim)
        
        # The bias is added to the output, so it needs one entry per node.
        self.bias = np.zeros(node_dim) if bias else None

    def forward(self, x):
        self.input = x
        self.out = np.dot(self.weights, x) # matrix multiplication of weights and inputs
        if self.bias is not None:
            self.out += self.bias
        return self.out
    
    def reset(self):
        self.out.fill(0.)
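
A quick shape check can help when wiring layers together. With weights of shape `(node_dim, in_dim)` and an input of shape `(in_dim,)`, `np.dot` returns one value per node, so anything added to the output needs length `node_dim`. The sketch below uses standalone arrays with illustrative sizes rather than the class itself.

```python
import numpy as np

in_dim, node_dim = 10, 20                     # illustrative sizes
weights = np.random.rand(node_dim, in_dim)    # one row of weights per node
bias = np.zeros(node_dim)                     # one bias value per node
x = np.random.random(in_dim)                  # a single input vector

out = np.dot(weights, x) + bias
print(out.shape)  # (20,)
```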

Next, the non-linear functions need to be defined.

In [6]:
class relu(layer):
    def __init__(self, node_dim):
        super(relu, self).__init__(node_dim)

    def forward(self, x):
        self.input = x
        return np.clip(x, 0, None)
     
class softmax(layer):
    def __init__(self, node_dim):
        super(softmax, self).__init__(node_dim)

    def forward(self, x):
        self.input = x
        # Numerically stable alternative (same result, avoids overflow for large x):
        # return np.exp(x - np.max(x)) / np.sum(np.exp(x - np.max(x)))
        return np.exp(x) / np.sum(np.exp(x))
    
class sigmoid(layer):
    def __init__(self, node_dim):
        super(sigmoid, self).__init__(node_dim)

    def forward(self, x):
        self.input = x
        return 1 / (1 + np.exp(-1 * x))
    
class tanh(layer):
    def __init__(self, node_dim):
        super(tanh, self).__init__(node_dim)

    def forward(self, x):
        self.input = x
        return (np.exp(2*x) - 1) / (np.exp(2*x) + 1)  
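
One thing to watch with softmax is that `np.exp` overflows for large inputs. Subtracting `np.max(x)` before exponentiating leaves the result unchanged mathematically, because the common factor cancels in the ratio, but keeps the exponentials in a safe range. A minimal sketch of the difference, using the hypothetical function names `softmax_naive` and `softmax_stable`:

```python
import numpy as np

def softmax_naive(x):
    # Direct formula: exp(x) normalized by the sum of exp(x)
    return np.exp(x) / np.sum(np.exp(x))

def softmax_stable(x):
    # Shift by the maximum so the largest exponent is exp(0) = 1
    shifted = np.exp(x - np.max(x))
    return shifted / np.sum(shifted)

small = np.array([1.0, 2.0, 3.0])
print(np.allclose(softmax_naive(small), softmax_stable(small)))

large = np.array([1000.0, 1001.0, 1002.0])
with np.errstate(over="ignore", invalid="ignore"):
    print(softmax_naive(large))   # exp overflows to inf, so inf / inf gives nan
print(softmax_stable(large))      # finite values that sum to 1
```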

Go ahead and write the code for all the non-linearity layers. Then build a simple MLP and try to run it.

Here's an example:


In [7]:
import numpy as np

class MLP():
    def __init__(self):
        
        # The MLP will be a list with each layer as an item.
        self.net = []

        self.net.append(linearlayer(10, 20))
        self.net.append(relu(20))
        self.net.append(linearlayer(20, 4))
        self.net.append(softmax(4))
     
        
    def forward(self, x):

        # Input x for each layer and return the result back into x,
        # ready as input for the next layer.
        for layer in self.net:
            x = layer.forward(x)
        
        return x

    def reset(self):
        # traverse the MLP and call each layer's 'reset' method
        for layer in self.net:
            layer.reset()

Let's see if it works:



In [8]:
model = MLP()

x = np.random.random(10) # inputs

print(x)
print(model.forward(x))

[0.20412878 0.20541899 0.24846197 0.24222301 0.50406675 0.82262963
 0.90324775 0.19573363 0.3249147  0.29344446]
[0.61230688 0.00155266 0.15343245 0.23270802]
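
Since the last layer is a softmax, the four outputs should form a probability distribution. A quick check on the values printed above (copied by hand, so the sum is only exact up to the printed precision):

```python
import numpy as np

# Output values copied from the run above
probs = np.array([0.61230688, 0.00155266, 0.15343245, 0.23270802])

print(np.all(probs >= 0))             # every entry is non-negative
print(np.isclose(probs.sum(), 1.0))   # entries sum to 1 within rounding
```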
