# In today's task you will

- Implement a Linear (also called Dense or Fully-Connected) layer as a Perceptron.
- Allow your solution to stack multiple layers to form an MLP network.
- Perform forward propagation through your network.

This template implementation (and the later ones) is similar to the PyTorch framework.

## Task 1a:

Declare a simple perceptron (Linear layer) that inherits from the provided class `Module`, which is there to help you store all network layers.

The simple perceptron should consist of:
1. Input features
2. 1 Linear layer with a "single neuron"
3. Activation function

Then perform a forward pass for the example feature vectors `xInput1` and `xInput2`, each with `size = 10` features.
Use the prepared plot to view the results. (Repeat the process using all 4 activation functions.)

In [80]:
# Import
import numpy as np
from plotly.subplots import make_subplots
import plotly.graph_objects as go
from collections import OrderedDict

### Module

Most deep learning frameworks have one elementary building block.
In our project, we follow the structure of PyTorch, so the elementary building block is called **`Module`**.
For now it is pretty simple, but it will get more complex and more useful...
Note the `.backward` method: it will later contain the partial derivatives from the chain rule, used for the backward pass and parameter optimization.

In [81]:
class Module:
    def __init__(self):
        self.modules = OrderedDict()

    def add_module(self, module, name:str):
        if hasattr(self, name) and name not in self.modules:
            raise KeyError("attribute '{}' already exists".format(name))
        elif '.' in name:
            raise KeyError("module name can't contain \".\"")
        elif name == '':
            raise KeyError("module name can't be empty string \"\"")
        self.modules[name] = module

    def forward(self, *args, **kwargs) -> np.ndarray:
        pass

    def backward(self, *args, **kwargs):
        pass

    def __call__(self, *args, **kwargs):
        return self.forward(*args, **kwargs)
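
As a quick illustration of how `add_module` behaves, the sketch below registers two layers and then tries three invalid names. The `Identity` class is a hypothetical placeholder introduced only for this example, and `Module` is repeated (without `forward`/`backward`) so the snippet runs on its own.

```python
from collections import OrderedDict


class Module:
    """Reduced copy of the notebook's Module, repeated so the sketch is self-contained."""

    def __init__(self):
        self.modules = OrderedDict()

    def add_module(self, module, name: str):
        if hasattr(self, name) and name not in self.modules:
            raise KeyError("attribute '{}' already exists".format(name))
        elif '.' in name:
            raise KeyError("module name can't contain \".\"")
        elif name == '':
            raise KeyError("module name can't be empty string \"\"")
        self.modules[name] = module


class Identity(Module):
    """Hypothetical placeholder layer, used only for this illustration."""

    def forward(self, x):
        return x


net = Module()
net.add_module(Identity(), "first")
net.add_module(Identity(), "second")

# Modules are kept in the order in which they were registered.
registered_names = list(net.modules.keys())
print(registered_names)

# Invalid names are rejected with a KeyError.
rejected = []
for bad_name in ["", "block.layer", "modules"]:
    try:
        net.add_module(Identity(), bad_name)
    except KeyError:
        rejected.append(bad_name)
print(rejected)
```

The name `modules` is rejected because every `Module` instance already has an attribute with that name.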


## Linear Layer

In the lecture, we talked about a Perceptron and a Single Layer Perceptron as an object with one weight for every input value.
In the frameworks, the "Fully connected layer" is implemented with matrix algebra.

Also, the activation function and the layer logic are separated to make backward propagation (chain rule) and optimization easier (the topic of the 2nd and 3rd lectures).

(If you want to know more, you can go to the lecture, or study the implementation of forward and backward propagation on your own.)
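
To see why the matrix form is equivalent to one weighted sum per neuron, the following sketch computes both and compares them. The sizes and the fixed seed are arbitrary choices made for this illustration, not values from the assignment.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

in_features, out_features = 10, 3
W = rng.standard_normal((out_features, in_features))  # one row of weights per neuron
b = np.zeros((out_features, 1))
x = rng.standard_normal((in_features, 1))             # one sample as a column vector

# Per-neuron view: each neuron computes its own weighted sum plus bias.
z_loop = np.zeros((out_features, 1))
for j in range(out_features):
    z_loop[j, 0] = np.sum(W[j, :] * x[:, 0]) + b[j, 0]

# Matrix view: a single matrix product evaluates all neurons at once.
z_matrix = W.dot(x) + b

print(np.allclose(z_loop, z_matrix))
```

The matrix form is what the `Linear` class uses, since it handles any number of neurons and samples with one call.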

In [82]:
#------------------------------------------------------------------------------
#   Linear class
#------------------------------------------------------------------------------
class Linear(Module):
    def __init__(self, in_features, out_features):
        super(Linear, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
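        # One row of weights per neuron; biases start at zero.
        self.W = np.random.randn(out_features, in_features)
        self.b = np.zeros((out_features, 1))

    def forward(self, input: np.ndarray) -> np.ndarray:
        # input has shape (in_features, n_samples); b broadcasts across the sample columns.
        Z1 = self.W.dot(input) + self.b
        return Z1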
        # <<<<<<<<<

    def backward(self, dz):
        pass


## Activations

Below are the class definitions for the Sigmoid, Tanh, ReLU, and LeakyReLU activation functions, each with a forward and a backward pass.
Implement the forward pass. (For now, leave the backward pass as `pass`.)
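
For reference, these are the standard definitions of the four functions. The symbol $\alpha$ denotes the LeakyReLU slope for negative inputs and is assumed to satisfy $0 < \alpha < 1$.

$$
\begin{aligned}
\sigma(z) &= \frac{1}{1 + e^{-z}} \\
\tanh(z) &= \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}} \\
\mathrm{ReLU}(z) &= \max(0, z) \\
\mathrm{LeakyReLU}(z) &= \begin{cases} z & z \ge 0 \\ \alpha z & z < 0 \end{cases} \;=\; \max(\alpha z, z)
\end{aligned}
$$

The last equality holds only under the stated assumption on $\alpha$.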

In [83]:
#------------------------------------------------------------------------------
#   SigmoidActivationFunction class
#------------------------------------------------------------------------------
class Sigmoid(Module):
    def __init__(self):
        super(Sigmoid, self).__init__()

    def forward(self, input: np.ndarray) -> np.ndarray:

        return (1/(1 + np.exp(- input)))
        # <<<<<<<<<

    def backward(self, da):
        pass

#------------------------------------------------------------------------------
#   HyperbolicTangentActivationFunction class
#------------------------------------------------------------------------------
class Tanh(Module):
    def __init__(self):
        super(Tanh, self).__init__()

    def forward(self, input: np.ndarray) -> np.ndarray:

        return (np.exp(input) - np.exp(- input))/(np.exp(input) + np.exp(- input))

        # <<<<<<<<<

    def backward(self, da):
        pass

#------------------------------------------------------------------------------
#   RELUActivationFunction class
#------------------------------------------------------------------------------
class ReLU(Module):
    def __init__(self):
        super(ReLU, self).__init__()

    def forward(self, input: np.ndarray) -> np.ndarray:

        return np.maximum(0, input)
        # <<<<<<<<<

    def backward(self, da):
        pass

#------------------------------------------------------------------------------
#   LeakyRELUActivationFunction class
#------------------------------------------------------------------------------
class LeakyReLU(Module):
    # >>>>>>>>> add something here
    def __init__(self, alpha = 0.01):
        super(LeakyReLU, self).__init__()
        self.alpha = alpha  # slope applied to negative inputs; assumes 0 < alpha < 1

    def forward(self, input: np.ndarray) -> np.ndarray:

        return np.maximum(self.alpha*input, input)
    # <<<<<<<<<<<
    def backward(self, da):
        pass

### Plotting the functions
Verify your implementations of the activation functions: do your graphs look like they should?

In [84]:
activationsInput = np.linspace(-4,4,100)

sigmoid = Sigmoid()
y = sigmoid.forward(activationsInput)

fig = make_subplots(rows=2, cols=2)

fig.add_trace(
    go.Scatter(x=activationsInput, y=y, name='Sigmoid'),
    row=1, col=1
)

tanh = Tanh()
y = tanh.forward(activationsInput)
fig.add_trace(
    go.Scatter(x=activationsInput, y=y, name='Tanh'),
    row=1, col=2
)

relu = ReLU()
y = relu(activationsInput)
fig.add_trace(
    go.Scatter(x=activationsInput, y=y, name='ReLU'),
    row=2, col=1
)

leakyrelu = LeakyReLU()
y = leakyrelu(activationsInput)
fig.add_trace(
    go.Scatter(x=activationsInput, y=y, name='LeakyReLU'),
    row=2, col=2
)

fig.update_layout(height=600, width=800, title_text="Activation functions")
fig.show()
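
Besides inspecting the plots, a few numeric properties can be checked directly. The sketch below uses standalone function versions of the four activations (hypothetical helpers mirroring the `forward` methods, not the notebook's class instances) and tests them against known values.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def tanh(z):
    return (np.exp(z) - np.exp(-z)) / (np.exp(z) + np.exp(-z))


def relu(z):
    return np.maximum(0, z)


def leaky_relu(z, alpha=0.01):
    return np.maximum(alpha * z, z)


z = np.linspace(-4, 4, 100)  # same input range as the plots

checks = {
    "sigmoid(0) == 0.5": np.isclose(sigmoid(0.0), 0.5),
    "sigmoid stays in (0, 1)": bool(np.all((sigmoid(z) > 0) & (sigmoid(z) < 1))),
    "tanh matches np.tanh": np.allclose(tanh(z), np.tanh(z)),
    "tanh is odd": np.allclose(tanh(-z), -tanh(z)),
    "relu is never negative": bool(np.all(relu(z) >= 0)),
    "relu is identity for z > 0": np.allclose(relu(z[z > 0]), z[z > 0]),
    "leaky_relu scales negatives by alpha": np.allclose(leaky_relu(z[z < 0]), 0.01 * z[z < 0]),
}
for name, passed in checks.items():
    print(f"{name}: {passed}")
```

If any check prints `False`, the corresponding `forward` method is worth revisiting before moving on.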


### Perceptron feed forward

Model your Perceptron.
Define and initialize a perceptron with "1 neuron"!
Feed `xInput1` and `xInput2` to the perceptron and print the results.

In [85]:
# xInput1 is just a single sample - it contains 1 sample with 10 features
xInput1 = np.expand_dims(np.arange(10), axis=1)     # shape <10; 1>

# xInput2 is just a mini-batch! - it contains 4 samples with 10 features
xInput2 = np.random.randn(10, 4)                    # shape <10; 4>

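# >>>>>>>>> Initialize Your Perceptron Here
in_features = 10
# NOTE: the task asks for "1 neuron" (out_features = 1); 5 neurons are used here,
# which is why the outputs printed below have 5 rows.
out_features = 5

perceptron = Linear(in_features, out_features)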
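
The shapes involved can be traced with a small sketch that reproduces the computation inside `Linear.forward`. The seed and the nonzero bias are assumptions made here so that the broadcasting of `b` is visible; the notebook itself initializes the bias to zeros.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility

in_features, out_features = 10, 5
W = rng.standard_normal((out_features, in_features))  # shape (5, 10)
b = np.ones((out_features, 1))                        # shape (5, 1); nonzero on purpose

x_single = np.expand_dims(np.arange(10), axis=1)      # shape (10, 1): one sample
x_batch = rng.standard_normal((in_features, 4))       # shape (10, 4): four samples

z_single = W.dot(x_single) + b
z_batch = W.dot(x_batch) + b   # b is broadcast across the 4 sample columns

print(z_single.shape)
print(z_batch.shape)

# Each column of the batch result equals the result for that sample on its own.
first_sample = x_batch[:, [0]]
print(np.allclose(z_batch[:, [0]], W.dot(first_sample) + b))
```

Rows of the result correspond to neurons and columns to samples, so a mini-batch only adds columns.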


Your Perceptron with an activation function.
Take the previously defined perceptron and use its output as the input to the Sigmoid and LeakyReLU activation functions.
Feed `xInput1` and `xInput2` to the perceptron, then print and observe the results.
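
For comparison, a perceptron with literally one neuron could look like the sketch below. It is a standalone illustration with plain arrays and helper functions (hypothetical names, not the notebook's classes), assuming the same input shapes as `xInput1` and `xInput2`.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility

# Single neuron: one row of 10 weights and one bias.
W = rng.standard_normal((1, 10))
b = np.zeros((1, 1))


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def leaky_relu(z, alpha=0.01):
    return np.maximum(alpha * z, z)


x_single = np.expand_dims(np.arange(10), axis=1)  # shape (10, 1)
x_batch = rng.standard_normal((10, 4))            # shape (10, 4)

for label, x in [("single sample", x_single), ("mini-batch", x_batch)]:
    z = W.dot(x) + b  # pre-activation: one value per sample
    print(label, "pre-activation shape:", z.shape)
    print(label, "sigmoid:", sigmoid(z))
    print(label, "LeakyReLU:", leaky_relu(z))
```

With a single neuron the output has exactly one row, so the result is one number per sample.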

In [86]:
# >>>>>>>>> Initialize activations and feed them after perceptron
# (the sigmoid and leakyrelu instances created in the plotting cell are reused here)

output1 = perceptron.forward(xInput1)
output2 = perceptron.forward(xInput2)


sigmoid_output1 = sigmoid(output1)
sigmoid_output2 = sigmoid(output2)

leaky_relu_output1 = leakyrelu(output1)
leaky_relu_output2 = leakyrelu(output2)
print("Perceptron output for single sample:\n", output1)
print("Sigmoid output for single sample:\n", sigmoid_output1)
print("LeakyReLU output for single sample:\n", leaky_relu_output1)

print("\nPerceptron output for mini-batch:\n", output2)
print("Sigmoid output for mini-batch:\n", sigmoid_output2)
print("LeakyReLU output for mini-batch:\n", leaky_relu_output2)

Perceptron output for single sample:
 [[-0.64148785]
 [23.27585508]
 [17.06314461]
 [ 4.6571887 ]
 [ 5.75720582]]
Sigmoid output for single sample:
 [[0.34491029]
 [1.        ]
 [0.99999996]
 [0.99059616]
 [0.99685003]]
LeakyReLU output for single sample:
 [[-6.41487845e-03]
 [ 2.32758551e+01]
 [ 1.70631446e+01]
 [ 4.65718870e+00]
 [ 5.75720582e+00]]

Perceptron output for mini-batch:
 [[ 2.07059489 -0.15466929  1.35540997  1.31892875]
 [ 0.78890301 -1.27644715  2.65854787 -0.353657  ]
 [-3.65849429 -0.20248161 -0.24683223 -1.69723694]
 [ 3.64677367 -0.23259072  0.16178315 -1.67532563]
 [ 1.07740065  0.06345024  1.08517418  0.68462678]]
Sigmoid output for mini-batch:
 [[0.88801214 0.46140958 0.79501268 0.78900342]
 [0.68759574 0.2181556  0.93453588 0.41249589]
 [0.02512381 0.44955184 0.43860335 0.15482648]
 [0.97458751 0.44211305 0.5403578  0.15771543]
 [0.74600177 0.51585724 0.7474719  0.66477056]]
LeakyReLU output for mini-batch:
 [[ 2.07059489e+00 -1.54669289e-03  1.35540997e+00  1.

## Task 1b:

Finish the implementation of the class `Model`: complete the forward pass.
Declare a simple model consisting of:
 1. Input layer
 2. 3 Linear layers with an arbitrary number of neurons
 3. Output Linear layer with 1 neuron

...and activation functions to add non-linearity.

Declare your own input vector with 16 features.
Perform a forward pass through the network and print the results.

### Model class

Implementation of the **`Model`** class.
Define its forward function. The implementation of the forward and backward pass is sensitive to the order in which operations are called.
Each layer (module) of type **`Module`** can be saved to the attribute **`Module.modules`** using the **`add_module`** method.
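
Because `Module.modules` is an `OrderedDict`, one possible way to respect the call order is to register activations as modules too and let `forward` walk the dictionary. The sketch below is only one option, not the required solution: `SequentialModel` and its layer sizes are hypothetical, and the helper classes are reduced copies so the snippet runs on its own.

```python
from collections import OrderedDict
import numpy as np


class Module:
    """Reduced copy of the notebook's Module (name validation omitted for brevity)."""

    def __init__(self):
        self.modules = OrderedDict()

    def add_module(self, module, name: str):
        self.modules[name] = module

    def forward(self, *args, **kwargs):
        pass

    def __call__(self, *args, **kwargs):
        return self.forward(*args, **kwargs)


class Linear(Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.W = np.random.randn(out_features, in_features)
        self.b = np.zeros((out_features, 1))

    def forward(self, x):
        return self.W.dot(x) + self.b


class Sigmoid(Module):
    def forward(self, x):
        return 1 / (1 + np.exp(-x))


class SequentialModel(Module):
    """Hypothetical variant of Model: forward visits modules in insertion order."""

    def __init__(self):
        super().__init__()
        self.add_module(Linear(16, 8), "linear1")
        self.add_module(Sigmoid(), "act1")
        self.add_module(Linear(8, 1), "linear_out")
        self.add_module(Sigmoid(), "act_out")

    def forward(self, x):
        for name, layer in self.modules.items():
            x = layer(x)
            print(name, x.shape)  # trace the shape after every step
        return x


model = SequentialModel()
out = model(np.random.randn(16, 4))  # 16 features, 4 samples
```

The printed trace shows the feature dimension shrinking from 8 to 1 while the number of samples (columns) stays at 4.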


In [87]:
#------------------------------------------------------------------------------
#   Model class
#------------------------------------------------------------------------------
class Model(Module):
    def __init__(self):
        super(Model, self).__init__()
        self.add_module(Linear(16, 32) , "linearInput")  # input (16 features) -> 32 neurons
        self.add_module(Linear(32, 48) , "linear1" )  # 32 -> 48 neurons
        self.add_module(Linear(48, 64) , "linear2")  # 48 -> 64 neurons
        self.add_module(Linear(64, 32) , "linear3")  # 64 -> 32 neurons
        self.add_module(Linear(32, 1)  , "linear_out")  # 32 -> 1 neuron (output)

    def forward(self, input):
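        # The order of layers and activation functions matters. Several alternatives were tried;
        # it is unclear which one is better, or whether it matters in this simple case.
        # Note: sigmoid is the global instance created in the plotting cell.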
        x = self.modules["linearInput"].forward(input)
        x = sigmoid(x)
        #x = leakyrelu(x)
        x = self.modules["linear1"].forward(x)
        #x = relu(x)
        x = sigmoid(x)
        x = self.modules["linear2"].forward(x)
        #x = relu(x)
        x = sigmoid(x)
        x = self.modules["linear3"].forward(x)
        #x = relu(x)
        x = sigmoid(x)
        x = self.modules["linear_out"].forward(x)
        x = sigmoid(x)

        return x

    def backward(self, dA: np.ndarray):
        pass

In [88]:
model = Model()
# The task asks for 3 Linear layers; it is unclear whether the input and output layers count, so this model uses 1 + 3 + 1.
# >>>>>>>>> Build the model architecture with 2 hidden layers and one final output layer that can process - feed forward the xInput1 and xInput2
# The task says to feed xInput1 and xInput2, but those have 10 features and the model needs 16, so new inputs are created here.


x_input = np.random.randn(16, 16)
xInput1 = np.expand_dims(np.arange(16), axis=1)
xInput2 = np.random.randn(16, 4)

print("x_input:\n", x_input.shape)
print("xInput1:\n", xInput1.shape)
print("xInput2:\n", xInput2.shape)



output = model.forward(x_input)
output1 = model.forward(xInput1)
output2 = model.forward(xInput2)


x_input:
 (16, 16)
xInput1:
 (16, 1)
xInput2:
 (16, 4)


In [89]:
# What are the output shapes after feeding xInput1 and xInput2 to the model?
# How many samples do they contain? The outputs contain 16, 1 and 4 samples, the same as their inputs; only the feature dimension was reduced (16 -> 1).


print("Model output:\n", output.shape)
print("Model output:\n", output1.shape)
print("Model output:\n", output2.shape)



Model output:
 (1, 16)
Model output:
 (1, 1)
Model output:
 (1, 4)

