# LAB-4.1: Neural Networks in Numpy

### Objective

In this lab session, we tackle a regression task with a basic deep learning architecture. You will implement the neural network from scratch using only the numpy package.

### General Announcements

* The exercises on this sheet are graded out of a maximum of **20 points**. You will be asked to implement several functions.
* Teamwork is not allowed! Everybody implements their own code. Discussing issues with others is fine; sharing code with others is not.
* If you use any code fragments found on the Internet, make sure you reference them properly.
* You can send your questions via email to the TAs until the deadline.

In [1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# 1) Generating a Toy Dataset
- Generate a dataset of random values with the help of numpy. (<span style="color:green">2 points</span>)

In [2]:
def generate_data(num_samples: int, num_features: int) -> tuple:
    """
    Inputs:
        - num_samples: number of samples (int)
        - num_features: number of features per sample (int)
    Outputs:
        - data (numpy.ndarray | dtype: numpy.float64 | shape=(num_samples, num_features))
        - labels (numpy.ndarray | dtype: numpy.float64 | shape=(num_samples,))
    """
    np.random.seed(42)  # Keep the seed the same
    data = np.random.randn(num_samples, num_features)
    labels = np.random.randn(num_samples)
    return data, labels

X, Y = generate_data(1, 10)
print(X.shape, Y.shape)

(1, 10) (1,)


# 2) Designing Deep Learning Modules from Scratch
- All modules in a deep learning model share a common structure, shown below.

In [3]:
# DO NOT MODIFY
class GenericModule:
    def __init__(self) -> None:
        pass

    def forward(self, x: np.ndarray) -> np.ndarray:
        """
        Pass a tensor through the module. This is called a forward pass.
        
        :param x: Input tensor.
        :returns: Output of the layer after applying it to the input tensor.
        """
        raise NotImplementedError
    
    def backward(self, grad: np.ndarray) -> np.ndarray:
        """
        Calculate the gradients of all trainable weights in the layer based on the gradient information coming in from the next layer.
        Then calculate the gradient being passed down to the previous layer.
        This is called a backward pass.
        
        The gradient basically tells the layer in which direction it should adapt its outputs to reduce the loss.
        The layer then adapts its weights to increase the outputs towards the direction dictated by the gradient.
        
        :param grad: Gradient information derived from the next layer.
        :returns: Gradient information for the previous layer.
        """
        raise NotImplementedError
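
To make the forward/backward contract concrete, here is a minimal sketch of a module that is not part of the lab: a hypothetical `Scale` layer that multiplies its input by a constant. It is written as a standalone class (it does not inherit from `GenericModule`) so that it runs on its own; the name `Scale` and the parameter `factor` are illustrative assumptions.

```python
import numpy as np


class Scale:
    """Hypothetical example module: y = factor * x (no trainable weights)."""

    def __init__(self, factor: float) -> None:
        self.factor = factor

    def forward(self, x: np.ndarray) -> np.ndarray:
        # Forward pass: apply the function to the input.
        return self.factor * x

    def backward(self, grad: np.ndarray) -> np.ndarray:
        # Backward pass (chain rule): dL/dx = dL/dy * dy/dx, and dy/dx = factor.
        return grad * self.factor


layer = Scale(3.0)
y = layer.forward(np.array([1.0, -2.0]))
grad_x = layer.backward(np.ones(2))  # pretend dL/dy is 1 everywhere
print(y, grad_x)
```

The same pattern applies to the modules below: `forward` computes the output (and stores whatever `backward` will need), while `backward` multiplies the incoming gradient by the local derivative.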
        

- Now implement the following modules using numpy functions only.
    - Implement ReLU, which is the function $f(x) = \max(0, x)$ (<span style="color:green">1 point</span> for forward & <span style="color:green">1 point</span> for backward)

In [5]:
class ReLU(GenericModule):
    def __init__(self, name, shape):
        self.name = name
        self.non_zero_index = np.zeros(shape)
        
    def forward(self, x):
        # for the backpropagation you need to remember which indexes were not masked by the ReLU
        out = np.maximum(0, x)  
        self.non_zero_index[:] = 1
        self.non_zero_index[x < 0] = 0
        return out
    # 
    # def backward(self, grad):
    #     return ?

# Checking Module - DO NOT MODIFY
obj = ReLU('Temporary', 10)
output = obj.forward(X[0] - X[0].mean())
df_temp = pd.DataFrame({'input': X[0] - X[0].mean(), 'output': output, 'gradient': obj.non_zero_index})
print(df_temp)

      input    output  gradient
0  0.048653  0.048653       1.0
1 -0.586325  0.000000       0.0
2  0.199627  0.199627       1.0
3  1.074969  1.074969       1.0
4 -0.682214  0.000000       0.0
5 -0.682198  0.000000       0.0
6  1.131152  1.131152       1.0
7  0.319374  0.319374       1.0
8 -0.917535  0.000000       0.0
9  0.094499  0.094499       1.0


- Implement a dense layer: $f(x) = (w \cdot x) + b$ (<span style="color:green">1 point</span> for initialization & <span style="color:green">1 point</span> for forward & <span style="color:green">2 points</span> for backward)
- Initialize the weights uniformly at random in the range $[-1, 1]$

In [None]:
class Dense(GenericModule):
    def __init__(self, name, in_size, out_size):
        self.name = name
        # np.random.uniform takes the keyword `size`, not `shape`
        self.w = np.random.uniform(low=-1, high=1, size=(in_size, out_size))
        # one bias per output unit, so that the layer output has shape (out_size,)
        self.b = np.random.uniform(low=-1, high=1, size=out_size)
        
    def forward(self, x):
        out = ?
        self.x = x.reshape(-1,1)  # need to save x for backward
        return out
    
    def backward(self, grad):
        grad_x = ?
        self.grad_w = ?
        self.grad_b = ?
        self.grad_b = self.grad_b.reshape(-1)  # needed so you don't get shape mismatches
        return grad_x

# Checking Module - DO NOT MODIFY
obj = Dense('Temporary', 10, 32)
output = obj.forward(X[0] - X[0].mean())
print(output.shape, obj.w.shape, obj.b.shape)
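
The shapes involved in the affine map can be checked in isolation. The sketch below is not the lab solution; it uses standalone arrays with hypothetical names (`x_demo`, `w_demo`, `b_demo`) and assumes a weight matrix of shape `(in_size, out_size)` together with a bias vector of shape `(out_size,)`, which is one common convention.

```python
import numpy as np

rng = np.random.default_rng(0)
in_size, out_size = 10, 32

x_demo = rng.standard_normal(in_size)                   # one sample, shape (10,)
w_demo = rng.uniform(-1, 1, size=(in_size, out_size))   # shape (10, 32)
b_demo = rng.uniform(-1, 1, size=out_size)              # shape (32,)

out_demo = x_demo @ w_demo + b_demo                     # shape (32,)
print(x_demo.shape, w_demo.shape, b_demo.shape, out_demo.shape)
```

If the bias had the same shape as the weight matrix, broadcasting would silently produce a `(10, 32)` result instead of one value per output unit, so printing the shapes is a cheap sanity check.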

- Implement the squared error loss function: $L(x, y) = \|x - y\|_2^2$ (<span style="color:green">1 point</span> for forward & <span style="color:green">1 point</span> for backward)

In [None]:
class SE_loss(GenericModule):
    def forward(self, x, y):
        """
        :param x: Prediction for a single sample.
        :param y: Target of the sample.
        """
        self.y = y  # need to save for backward
        self.x = x  # need to save for backward
        out = ?
        return out
    
    def backward(self):
        grad = ?
        return grad.reshape(1,1)
    
# Checking Module
obj = SE_loss()
output = obj.forward(np.array([1]), np.array([10]))
print(output, obj.backward())
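
One way to sanity-check any `backward` implementation is to compare it against a finite-difference estimate. The helper below is a sketch, not part of the lab template: the name `numerical_gradient` and its parameters are assumptions, and it is demonstrated on an unrelated function with a known derivative rather than on the lab's modules.

```python
import numpy as np


def numerical_gradient(f, x: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Central-difference estimate of df/dx for a scalar-valued function f."""
    x = x.astype(float)  # work on a float copy of the input
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step.flat[i] = eps
        # Perturb one coordinate in both directions and take the slope.
        grad.flat[i] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad


# Example on a function with a known gradient: f(x) = sum(x**3), df/dx = 3 * x**2.
x0 = np.array([1.0, -2.0, 0.5])
estimate = numerical_gradient(lambda v: np.sum(v ** 3), x0)
print(np.allclose(estimate, 3 * x0 ** 2, atol=1e-4))
```

To check a module, wrap its `forward` in a scalar-valued function and compare the estimate with what `backward` returns; small differences are expected because the estimate is only approximate.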

# 3) Training a Neural Network
- Define the architecture as a list of dense and ReLU layers (<span style="color:green">2 points</span>)
    - Use the following layer configuration:
        1) Dense (??, 32)
        2) ReLU (??)
        3) Dense (??, 1)
    - Re-generate the dataset
        - Number of Samples = 15
        - Number of Features = 10
        

In [None]:
net = ?
X, Y = ?

- Forward each sample through the network and calculate the loss (<span style="color:green">2 points</span>)
- Run the backward pass and implement the gradient descent algorithm for the dense layers (<span style="color:green">2 points</span>)
- Note that you need to rerun the previous cell to reset the network before training again
- Note that we are currently not using batching, and instead only pass one sample through the network each time
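
The idea of per-sample gradient descent can be illustrated on a toy problem that is independent of the lab's classes. The sketch below fits a single weight `w` so that `w * x` matches `y`; the data (`xs`, `ys`), the true slope of 2.0, and the learning rate are made up for illustration.

```python
import numpy as np

xs = np.array([1.0, 2.0, 3.0])
ys = 2.0 * xs          # toy targets generated with slope 2.0
w = 0.0                # the single parameter to learn
learning_rate = 0.05

epoch_losses = []
for epoch in range(20):
    cumulative = 0.0
    for x, y in zip(xs, ys):          # one sample at a time, no batching
        pred = w * x
        cumulative += (pred - y) ** 2
        grad_w = 2 * (pred - y) * x   # derivative of (w*x - y)**2 w.r.t. w
        w -= learning_rate * grad_w   # step against the gradient
    epoch_losses.append(cumulative / len(xs))

print(round(w, 3), epoch_losses[0] > epoch_losses[-1])
```

The training loop in this lab has the same shape: a forward pass, a loss, a gradient, and a parameter update scaled by the learning rate, repeated for every sample in every epoch.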

In [None]:
# Hyperparameters for Training
learning_rate = 0.01
num_epoch = 10
loss = SE_loss()

print("-------Network-------")
for layer in net:
    print(layer.name)
    if isinstance(layer, Dense):
        print("   Weight shape:", layer.w.shape)
        print("   Bias shape:", layer.b.shape)
        
print("\n-------Training-------")

# Training Phase
for epoch_no in range(num_epoch):
    
    cumulative_loss = 0
    
    for sample_idx in range(len(X)):
        
        x = ?
        y = ?

        # Forward Pass
        for layer in net:
            x = ?

        loss_val = ?
        cumulative_loss += loss_val

        # Backward Pass
        grad = ?
        for layer in net[::-1]:
            grad = ?
            if isinstance(layer, Dense):  # Optimization: Gradient Descent
                layer.w = ?
                layer.b = ?
                
    print("Epoch:", epoch_no, "- Loss:", cumulative_loss / len(X))

- Right now, training progress is reported as a list of long, hard-to-read numbers. Write a function that plots the loss curve nicely instead. (<span style="color:green">1 point</span>)

In [None]:
def plot_loss_curve(loss_vals):
    # Hint: Use matplotlib.pyplot
    raise NotImplementedError()
    
plot_loss_curve(?)

- Your model should currently reach a loss between 0.05 and 0.08 after the first 10 epochs
- Try out multiple learning rates and find one that provides a loss < 0.04 after the first 10 epochs (<span style="color:green">1 point</span>)
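
Why the learning rate matters can be seen on a one-dimensional toy objective, $f(w) = w^2$, which is unrelated to the lab's network. In the sketch below, the function name `final_value` and the tested rates are illustrative assumptions; the rates that work for the network in this notebook will differ.

```python
def final_value(learning_rate: float, steps: int = 10, w: float = 1.0) -> float:
    """Run gradient descent on f(w) = w**2 and return f at the final w."""
    for _ in range(steps):
        w -= learning_rate * 2 * w   # the gradient of w**2 is 2*w
    return w ** 2


for lr in (0.01, 0.1, 0.5, 1.1):
    print(lr, final_value(lr))
```

A very small rate makes slow progress within the 10 steps, a moderate rate gets much closer to the minimum, and a rate that is too large overshoots on every step so that the objective grows instead of shrinking.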

In [None]:
# Learning rate:

- Name at least two important parts of a deep learning pipeline that we didn't cover in this notebook. (<span style="color:green">2 points</span>)

In [None]:
# Answer: 