# TP: Artificial neural networks

The objective of this practical session is to implement an artificial neural network from scratch using the numpy library, then to train it to learn the non-linear function:

$$
f^*(x_1, x_2) = x_1^2 + x_2^2
$$


from training data:

$$
\{(x_n, y_n) \in \mathbb{R}^2 \times \mathbb{R}, n = 1, \dots, N \},
$$

for which:

$$
\forall n \in \{1, \dots, N \}, y_n = f^*(x_n) + \epsilon_n,
$$


where $\epsilon_n$ is Gaussian white noise with zero mean and variance $\sigma^2$.





In [1]:
from math import *
import numpy as np
import matplotlib.pyplot as plt

## 1. Generation of training and test data

The function below is used to generate data from the model.

In [2]:
def generate_dataset(nsamples, sigma=0.):
    
    """
    Generate a dataset with the specified number of samples
    
    :param nsamples: number of sample points
    :type nsamples: int
    :param sigma: standard deviation of the noise
    :type sigma: float
    
    :return: Generated dataset {(x_n, y_n), n = 1, ..., nsamples}
    :rtype: tuple of numpy arrays
    """

    x = np.zeros((nsamples, 2))
    x[:, 0] = np.random.uniform(-1, 1, nsamples)
    x[:, 1] = np.random.uniform(-1, 1, nsamples)

    eps = np.random.normal(loc=0, scale=sigma, size=nsamples)
    
    y = x[:, 0]**2 + x[:, 1]**2 + eps
    return x, y

In [3]:
nsamples = 50
sigma = 0.
xtrain, ytrain = generate_dataset(nsamples, sigma)
xtest, ytest = generate_dataset(nsamples, sigma)

## 2. Implementing a network layer

We will rely on a class, Layer, whose attributes are:

- "input_dim": the size of the signal at the input of the layer
- "output_dim": the number of neurons in the layer
- "activation": the activation function used by the layer ('RELU' or 'None')
- "weights": the weight matrix of the layer's neurons
- "biases": the bias vector
- "input": the input signal
- "output": the output signal
- "input_grad", "output_grad", "weights_grad", and "biases_grad": the gradients

This class allows us to implement what takes place inside a layer of the neural network. 

Complete the implementation of the class:

**Question 2.1.** In the constructor, initialize the weight matrix and the bias vector.

**Question 2.2.** Implement the computation of the output signal y of the layer for an input x. Recall that:

$$
y = \sigma (Wx + b),
$$

where $\sigma $ is the activation function of the layer and $b$ the bias.
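
As a rough illustration, the sketch below computes the output of a single layer on a hand-picked input, using the convention of Question 2.3 (the bias is added before the activation is applied). The helper names `relu` and `layer_forward` and all numeric values are assumptions made for this example; they are not part of the provided template.

```python
import numpy as np

def relu(t):
    """Rectified linear unit, applied element-wise."""
    return np.maximum(t, 0.0)

def layer_forward(W, b, x, activation="RELU"):
    """Affine map followed by an optional ReLU (hypothetical helper)."""
    t = W @ x + b
    return relu(t) if activation == "RELU" else t

# A layer with 2 inputs and 3 neurons (values chosen by hand).
W = np.array([[1.0, -2.0],
              [0.5, 1.0],
              [-1.0, 0.0]])
b = np.array([0.0, -1.0, 0.5])
x = np.array([1.0, 1.0])

y_relu = layer_forward(W, b, x)                    # negative entries are clipped to 0
y_lin = layer_forward(W, b, x, activation="None")  # no activation
print(y_relu, y_lin)
```

The output has one entry per neuron, so its size is the number of rows of `W`.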

**Question 2.3.** Implement the backpropagation through the layer of the gradient "output_grad" of the loss with respect to the output signal. For a layer $i$ with activation function $g_i$, we have:

\begin{eqnarray}
t_i &=& b_i + W_i x \\
y_i &=& g_i(t_i) \\
\end{eqnarray}

It helps to first compute the derivative $\frac{\partial t}{\partial W}$. Work out the gradients mathematically before implementing them.
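
One possible way to organise the computation is to apply the chain rule to $t = b + Wx$, $y = g(t)$. The notation here is an assumption that does not appear in the original statement: $\mathcal{L}$ is the training loss and $\odot$ the element-wise product.

\begin{eqnarray}
\frac{\partial \mathcal{L}}{\partial t} &=& \frac{\partial \mathcal{L}}{\partial y} \odot g'(t) \\
\frac{\partial \mathcal{L}}{\partial W} &=& \frac{\partial \mathcal{L}}{\partial t} \, x^\top \\
\frac{\partial \mathcal{L}}{\partial b} &=& \frac{\partial \mathcal{L}}{\partial t} \\
\frac{\partial \mathcal{L}}{\partial x} &=& W^\top \frac{\partial \mathcal{L}}{\partial t}
\end{eqnarray}

The snippet below is a minimal numerical check of these formulas on a small ReLU layer, using the toy loss $\mathcal{L} = \sum_j y_j$. The helpers `forward` and `backward` are hypothetical stand-ins, not the methods of the Layer class.

```python
import numpy as np

def forward(W, b, x):
    """Return the pre-activation t = b + W x and the ReLU output."""
    t = b + W @ x
    return t, np.maximum(t, 0.0)

def backward(W, x, t, output_grad):
    """Chain rule through y = relu(b + W x) (hypothetical helper)."""
    t_grad = output_grad * (t > 0)       # dL/dt, the ReLU derivative is 0 or 1
    weights_grad = np.outer(t_grad, x)   # dL/dW
    biases_grad = t_grad                 # dL/db
    input_grad = W.T @ t_grad            # dL/dx
    return weights_grad, biases_grad, input_grad

W = np.array([[1.0, -2.0],
              [0.5, 1.0],
              [-1.0, 0.0]])
b = np.array([0.0, -1.0, 0.5])
x = np.array([1.0, 1.0])

# Toy loss L = sum(y), so dL/dy is a vector of ones.
t, y = forward(W, b, x)
weights_grad, biases_grad, input_grad = backward(W, x, t, np.ones(3))

# Central finite difference on the single weight W[1, 0].
h = 1e-6
W_plus, W_minus = W.copy(), W.copy()
W_plus[1, 0] += h
W_minus[1, 0] -= h
numeric = (forward(W_plus, b, x)[1].sum() - forward(W_minus, b, x)[1].sum()) / (2 * h)
print(weights_grad[1, 0], numeric)
```

A finite-difference comparison like this is a cheap way to catch sign or transpose errors before training.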



**Question 2.4.** Implement a method that updates the weights and biases of the layer, given the learning rate.
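
The statement does not spell out the update rule. The sketch below assumes plain gradient descent, where each parameter moves against its gradient by a step proportional to the learning rate. The helper `gradient_step` and the numeric values are illustrative only.

```python
import numpy as np

def gradient_step(param, grad, learning_rate):
    """Return the parameter after one gradient-descent step (hypothetical helper)."""
    return param - learning_rate * grad

weights = np.array([[1.0, -2.0]])
weights_grad = np.array([[0.5, 0.5]])   # assumed to come from the backward pass

new_weights = gradient_step(weights, weights_grad, learning_rate=0.1)
print(new_weights)
```

The same rule applies to the bias vector with its own gradient.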

In [4]:
class Layer:

    """
    Neural network layer implementation
    """

    def __init__(self, input_dim, output_dim, activation='RELU'):

        """
        :param input_dim: dimension of the input vector
        :type input_dim: integer

        :param output_dim: dimension of the output vector 
         (i.e number of neurons in the layer)
        :type output_dim: integer

        :param activation: activation function
        :type activation: string
        """

        
        self.input_dim = input_dim
        self.output_dim = output_dim
        self.activation = activation

        #TODO, help: define the weight matrix and the bias vector with the appropriate dimensions


    def forward(self, x):

        """
        Computes a forward pass in the layer

        :param x: input signal
        :type x: numpy array of size input_dim

        :return: output of the layer
        :rtype: numpy array of size output_dim
        """

        
        #TODO, help: implement the output signal with the activation function


    def backward(self, output_grad):

        """
        Computes a backward pass in the layer

        :param output_grad: gradient of the training loss w.r.t.
         the output of the layer
        :type output_grad: numpy array of size output_dim

        :return: gradient of the training loss w.r.t.
         the input of the layer
        :rtype: numpy array of size input_dim
        """

        #TODO, help: write down the gradients mathematically for a generic activation function, then implement them
        


    def update(self, learning_rate):

        """
        Update the weights during the gradient descent

        :param learning_rate: learning rate
        :type learning_rate: float
        """
    
        #TODO

## 3. Linear model 

The Linear class below is used to implement the linear model

$$
f(x_1, x_2) = w_1 x_1 + w_2 x_2
$$


associated with the squared-error cost function, i.e. the one derived from the $L^2$ norm.


The implementation is based on the Layer class: the linear model is in fact nothing more than a network layer without an activation function.

The attributes of the class are:
- "input_dim" and "output_dim": the dimensions of the input and output signals
- "layer": the instance of the Layer class used to describe the model
- "output": the output signal
- "target": the target used during training
- "loss": the value of the cost function:
$$
\mathrm{loss} = (\mathrm{output} - \mathrm{target})^2
$$
- "loss_grad": the gradient of the cost function with respect to the output of the model
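
To make the last two attributes concrete, here is a small sketch of the squared-error loss and of its derivative with respect to the output, evaluated on made-up scalar values. The function names are assumptions for this example; the Linear class below stores these quantities as attributes instead.

```python
def squared_loss(output, target):
    """Squared error between a scalar prediction and its target."""
    return (output - target) ** 2

def squared_loss_grad(output, target):
    """Derivative of the squared error with respect to the output."""
    return 2.0 * (output - target)

output, target = 1.5, 1.0
loss = squared_loss(output, target)
loss_grad = squared_loss_grad(output, target)

# Moving the output a small step against the gradient lowers the loss.
new_output = output - 0.1 * loss_grad
new_loss = squared_loss(new_output, target)
print(loss, loss_grad, new_loss)
```

The sign of `loss_grad` tells in which direction the output must move: it is positive when the prediction is too large.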

In [5]:
class Linear:

    """
    Linear model implementation
    """

    def __init__(self, input_dim, output_dim=1):


        self.input_dim = input_dim
        self.output_dim = output_dim
        self.layer = Layer(self.input_dim, self.output_dim, activation='None')


    def forward(self, x):

        """
        Computes a forward pass in the neural network

        :param x: input signal
        :type x: numpy array of size input_dim

        :return: output of the neural network
        :rtype: float
        """
        self.output = self.layer.forward(x)
        return self.output


    def compute_loss(self, x, target):

        """
        Computes the loss 

        :param x: input signal
        :type x: numpy array of size input_dim

        :param target: target value
        :type target: float

        :return: loss
        :rtype: float
        """

        self.target = target
        self.forward(x)
        self.loss = (self.output - target)**2
        return self.loss


    def backward(self):

        """
        Backpropagation in the neural network 
        """

        self.loss_grad = 2*(self.output - self.target)
       
        #use the layer.backward method to compute the backprop
        self.layer.backward(self.loss_grad)


    def update(self, learning_rate):

        """
        Update the weights of the network during the gradient descent

        :param learning_rate: learning rate
        :type learning_rate: float
        """
        # use the layer.update method to apply the gradient-descent step
        self.layer.update(learning_rate)

## 4. Neural Network

**Question 4.1.** Complete the implementation of the TwoLayersNetwork class below, taking inspiration from the Linear class.
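
Before filling in the class, it may help to see the forward pass as the composition of two layers. The sketch below assumes a ReLU hidden layer followed by an output layer without activation; the statement does not fix this choice. The function `two_layer_forward` and the numeric values are hypothetical.

```python
import numpy as np

def two_layer_forward(W1, b1, W2, b2, x):
    """Hidden ReLU layer followed by a linear output layer (assumed design)."""
    hidden = np.maximum(W1 @ x + b1, 0.0)
    return W2 @ hidden + b2

# input_dim = 2, hidden_dim = 3, output_dim = 1
W1 = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [-1.0, -1.0]])
b1 = np.array([0.0, 0.0, 0.5])
W2 = np.array([[1.0, 1.0, 2.0]])
b2 = np.array([0.1])

x = np.array([0.5, -0.25])
output = two_layer_forward(W1, b1, W2, b2, x)
print(output)
```

In the class, the same composition is obtained by feeding the output of the first Layer instance to the second one, and the backward pass traverses the two layers in reverse order.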

In [6]:
class TwoLayersNetwork:

    """
    Two-layer neural network implementation
    """

    def __init__(self, input_dim, hidden_dim, output_dim=1):
        
        """
        Model initialization
        
        :param input_dim: Dimension of the input signal
        :type input_dim: int
        
        :param hidden_dim: Dimension of the hidden layer
        :type hidden_dim: int
        
        :param output_dim: Dimension of the output signal
        :type output_dim: int
        """

        #TODO, help: Use the layer class to build your layers


    def forward(self, x):

        """
        Computes a forward pass in the neural network

        :param x: input signal
        :type x: numpy array of size input_dim
        """

        #TODO


    def compute_loss(self, x, target):

        """
        Computes the loss 

        :param x: input signal
        :type x: numpy array of size input_dim

        :param target: target value
        :type target: float
        """

        #TODO


    def backward(self):

        """
        Backpropagation in the neural network 
        """

        #TODO


    def update(self, learning_rate):

        """
        Update the weights of the network during the gradient descent

        :param learning_rate: learning rate
        :type learning_rate: float
        """

        #TODO

## 5. Linear model training

The code below is used to train the linear model.

In [7]:
    # 1: Linear model

    input_dim = 2
    output_dim = 1
    nepochs = 100

    # Initializes the model
    linear = Linear(input_dim, output_dim)
    
    # Fix the learning rate
    learning_rate = 1e-1

    training_loss, test_loss = [], []
    for epoch in range(nepochs):
        
        train_err = []
        for n in range(nsamples):

            linear.compute_loss(xtrain[n], ytrain[n])
            linear.backward()
            linear.update(learning_rate)
            train_err.append(linear.loss)

        test_err = []
        for n in range(nsamples):

            linear.compute_loss(xtest[n], ytest[n])
            test_err.append(linear.loss)

        training_loss.append(np.array(train_err).mean())
        test_loss.append(np.array(test_err).mean())

    plt.plot(np.array(training_loss), label='training loss')
    plt.plot(np.array(test_loss), label='test loss')
    plt.legend()
    plt.show()

    print("Minimal training error: " + str(min(training_loss)))
    print("Minimal test error: " + str(min(test_loss)))

With the Layer class left unimplemented, Layer.forward returns None and this cell fails with:

TypeError: unsupported operand type(s) for -: 'NoneType' and 'float'

## 6. Neural network training

**Question 6.1.** Using the training of the linear model as inspiration, implement the training of the neural network on the data.

**Question 6.2.** What is the influence of the number of neurons in the hidden layer?

In [None]:
#TODO Question 6.1
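
For reference, here is a self-contained sketch of per-sample gradient descent on a two-layer network, written with plain numpy arrays rather than the Layer and TwoLayersNetwork classes. Everything in it is an assumption made for illustration: the ReLU hidden layer, the random initialisation, the values of `hidden_dim`, `learning_rate` and the number of epochs, and the 200 noise-free samples. It is one possible approach, not the expected solution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noise-free samples of f*(x1, x2) = x1^2 + x2^2 on [-1, 1]^2.
x = rng.uniform(-1, 1, size=(200, 2))
y = (x ** 2).sum(axis=1)

hidden_dim = 8          # assumed value, to be varied for Question 6.2
learning_rate = 0.02    # assumed value
W1 = rng.normal(scale=0.5, size=(hidden_dim, 2))
b1 = np.zeros(hidden_dim)
W2 = rng.normal(scale=0.5, size=(1, hidden_dim))
b2 = np.zeros(1)

def predict(xn):
    """Forward pass: ReLU hidden layer, then a linear output layer."""
    t1 = W1 @ xn + b1
    h = np.maximum(t1, 0.0)
    return t1, h, W2 @ h + b2

def mean_loss():
    """Mean squared error over the whole dataset."""
    return np.mean([(predict(xn)[2][0] - yn) ** 2 for xn, yn in zip(x, y)])

initial_loss = mean_loss()
for epoch in range(30):
    for xn, yn in zip(x, y):
        t1, h, out = predict(xn)
        out_grad = 2.0 * (out - yn)        # dL/d(out), shape (1,)
        h_grad = W2.T @ out_grad           # dL/dh, shape (hidden_dim,)
        t1_grad = h_grad * (t1 > 0)        # back through the ReLU
        # In-place gradient-descent updates, so predict() sees the new values.
        W2 -= learning_rate * np.outer(out_grad, h)
        b2 -= learning_rate * out_grad
        W1 -= learning_rate * np.outer(t1_grad, xn)
        b1 -= learning_rate * t1_grad
final_loss = mean_loss()
print(initial_loss, final_loss)
```

Rerunning the sketch with different values of `hidden_dim` is one way to start exploring Question 6.2.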