# MLP

**🎉 Implementation Credit & Introduction 🚀**

This Multilayer Perceptron (MLP) implementation is inspired by and based on Omar Aflak’s excellent work on building a neural network from scratch in Python. His clear, step-by-step Medium article and accompanying GitHub repository laid the foundation for this notebook:

- 📖 [Medium Article: “Math & Neural Network From Scratch in Python”](https://medium.com/data-science/math-neural-network-from-scratch-in-python-d6da9f29ce65)  
- 💻 [GitHub Repo: OmarAflak/Medium-Python-Neural-Network](https://github.com/OmarAflak/Medium-Python-Neural-Network)

---

## ✨ What Is a Multilayer Perceptron?

A Multilayer Perceptron is one of the simplest feedforward neural networks, yet with at least one non-linear hidden layer it can approximate a very broad class of functions. It consists of:

1. **Input Layer**: Receives the raw features (e.g., pixel values, sensor readings).  
2. **Hidden Layer(s)**: Perform non-linear transformations using activation functions like sigmoid, ReLU, or tanh.  
3. **Output Layer**: Produces the final predictions (e.g., class probabilities, regression outputs).

Each layer is “fully connected,” meaning every neuron in one layer connects to every neuron in the next. An MLP learns by gradient descent: **backpropagation** computes the gradient of a loss function (e.g., mean squared error) with respect to every weight and bias, and each parameter is then moved a small step against its gradient.
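
To make the layer structure concrete, here is a minimal sketch of a forward pass through an MLP with one hidden layer. The sizes (3 inputs, 4 hidden neurons, 2 outputs), the random weights, and the names `W1`, `b1`, `W2`, `b2` are illustrative assumptions, not values used elsewhere in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical sizes: 3 input features, 4 hidden neurons, 2 outputs.
W1, b1 = rng.uniform(-0.5, 0.5, (3, 4)), np.zeros((1, 4))
W2, b2 = rng.uniform(-0.5, 0.5, (4, 2)), np.zeros((1, 2))

x = np.array([[0.2, -1.0, 0.5]])     # one sample as a row vector, shape (1, 3)

hidden = sigmoid(x @ W1 + b1)        # fully connected layer + activation, shape (1, 4)
output = sigmoid(hidden @ W2 + b2)   # fully connected layer + activation, shape (1, 2)

print(hidden.shape, output.shape)    # (1, 4) (1, 2)
```

Every entry of `W1` links one input to one hidden neuron, which is what "fully connected" means in matrix form.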

---

## 🎨 Visual Overview

Below are some placeholders where you can insert illustrative diagrams:

![🖼️ Network Architecture Placeholder](path/to/architecture_diagram.png)  
*Figure 1: Schematic of an MLP with one hidden layer.*

![🖼️ Forward & Backward Pass Placeholder](path/to/backprop_diagram.png)  
*Figure 2: Illustration of forward propagation (left) and backpropagation (right).*

---

## 📝 Notebook Structure

1. **Import Libraries**  
2. **Activation & Loss Functions**  
3. **Layer Classes**  
   - Fully Connected Layer  
   - Activation Layer  
4. **Network Class & Training Loop**  
5. **Example: Training on Dummy Data**  
6. **Results & Visualization**

Let’s dive in! ✨  


## Imports

In [None]:
import numpy as np

## Activation Functions

In [None]:
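def sigmoid(x):
    # Logistic function: squashes any real input into the range 0 to 1.
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    # Derivative of the sigmoid: sigma'(x) = sigma(x) * (1 - sigma(x)).
    s = sigmoid(x)
    return s * (1.0 - s)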
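
A quick way to gain confidence in a hand-written derivative is to compare it with a finite-difference estimate. The sketch below repeats the two functions so it runs on its own; the test points and the step size `h` are arbitrary choices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)

x = np.linspace(-4.0, 4.0, 9)   # a few test points (arbitrary choice)
h = 1e-5                        # finite-difference step (arbitrary choice)

# Central difference: (f(x + h) - f(x - h)) / (2h) approximates f'(x).
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2.0 * h)
max_gap = np.max(np.abs(numeric - sigmoid_prime(x)))

print(max_gap)  # should be tiny if the analytic derivative is right
```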

## Loss Functions

In [None]:
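def mse(y_true, y_pred):
    # Mean over all outputs of half the squared error.
    return (0.5*(y_true - y_pred)**2).mean()

def mse_prime(y_true, y_pred):
    # Gradient of the summed half squared error with respect to y_pred.
    # The 1/n factor from the mean in mse is omitted, which only rescales
    # the effective learning rate.
    return y_pred - y_true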
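
The relationship between `mse` and `mse_prime` can be checked numerically. In the sketch below the sample vectors are made up; it shows that `mse_prime` equals the true gradient of `mse` multiplied by the number of outputs `n`, because `mse` averages while `mse_prime` does not divide by `n`.

```python
import numpy as np

def mse(y_true, y_pred):
    return (0.5 * (y_true - y_pred) ** 2).mean()

def mse_prime(y_true, y_pred):
    return y_pred - y_true

# Made-up target and prediction, shaped as row vectors.
y_true = np.array([[1.0, 0.0, 0.0]])
y_pred = np.array([[0.7, 0.2, 0.4]])
n = y_pred.size

# Central-difference gradient of mse with respect to each prediction.
h = 1e-6
numeric = np.zeros_like(y_pred)
for k in range(n):
    step = np.zeros_like(y_pred)
    step[0, k] = h
    numeric[0, k] = (mse(y_true, y_pred + step) - mse(y_true, y_pred - step)) / (2 * h)

print(numeric * n)                 # numerical gradient, scaled by n
print(mse_prime(y_true, y_pred))   # analytic value used during training
```

The constant factor does not change the direction of the update, so it can be absorbed into the learning rate.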

## Activation Layers

In [None]:
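class ActivationLayer:
    def forward(self, input_data):
        # Cache the pre-activation input; backward() needs it for the derivative.
        self.input = input_data
        return sigmoid(input_data)

    def backward(self, output_error):
        # Chain rule: dE/d(input) = sigmoid'(input) * dE/d(output), element-wise.
        return sigmoid_prime(self.input) * output_error

    def step(self, eta):
        # No trainable parameters, so there is nothing to update.
        pass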

## Fully Connected Layer

In [None]:
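class FullyConnectedLayer:
    def __init__(self, input_size, output_size):
        # Gradient accumulators, summed over the samples of one minibatch.
        self.delta_w = np.zeros((input_size, output_size))
        self.delta_b = np.zeros((1,output_size))
        self.passes = 0

        # Weights and biases start uniformly distributed in [-0.5, 0.5).
        self.weights = np.random.rand(input_size, output_size) - 0.5
        self.bias = np.random.rand(1, output_size) - 0.5

    def forward(self, input_data):
        # input_data must be a row vector of shape (1, input_size).
        self.input = input_data
        return np.dot(self.input, self.weights) + self.bias

    def backward(self, output_error):
        # output_error is dE/d(output), shape (1, output_size).
        input_error = np.dot(output_error, self.weights.T)
        weights_error = np.dot(self.input.T, output_error)

        # Accumulate; the update itself happens in step().
        self.delta_w += weights_error
        self.delta_b += output_error
        self.passes += 1
        return input_error

    def step(self, eta):
        # Nothing accumulated yet: skip to avoid dividing by zero.
        if self.passes == 0:
            return

        # Gradient descent with the gradient averaged over the minibatch.
        self.weights -= eta * self.delta_w / self.passes
        self.bias -= eta * self.delta_b / self.passes

        # Reset the accumulators for the next minibatch.
        self.delta_w = np.zeros(self.weights.shape)
        self.delta_b = np.zeros(self.bias.shape)
        self.passes = 0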
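
The two matrix products in `backward` are easiest to follow by tracking shapes. The sketch below uses stand-alone arrays with made-up sizes (3 inputs, 2 outputs) instead of the class itself, and shows why each sample has to be a row vector of shape `(1, input_size)`.

```python
import numpy as np

rng = np.random.default_rng(1)

weights = rng.uniform(-0.5, 0.5, (3, 2))   # (input_size, output_size)
x = rng.normal(size=(1, 3))                # one sample as a row vector
output_error = rng.normal(size=(1, 2))     # dE/d(output), same shape as the output

input_error = output_error @ weights.T     # dE/dx, passed to the previous layer
weights_error = x.T @ output_error         # dE/dW, one entry per weight

print(input_error.shape, weights_error.shape)   # (1, 3) (3, 2)

# .T is a no-op on 1-D arrays, so a flat (3,) sample would not
# produce the (3, 2) outer product above.
flat = x.ravel()
print(flat.T.shape)   # (3,)
```

Under these assumptions `weights_error` is the outer product of the input and the output error, which is why it has the same shape as `weights`.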


## MLP Network

In [None]:
class Network:
    def __init__(self, verbose=True):
        self.verbose = verbose
        self.layers = []

    def add(self, layer):
        self.layers.append(layer)

    def predict(self, input_data):
        result = []
        for i in range(input_data.shape[0]):
            output = input_data[i]
            for layer in self.layers:
                output = layer.forward(output)
            result.append(output)
        return result

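    def fit(self, x_train, y_train, minibatches, learning_rate, batch_size=64):
        # The layers expect each sample as a row vector, so x_train should
        # have shape (n_samples, 1, n_features).
        for i in range(minibatches):
            err = 0

            # Random minibatch without replacement. It holds fewer than
            # batch_size samples when the training set is smaller than that.
            idx = np.argsort(np.random.random(x_train.shape[0]))[:batch_size]
            x_batch = x_train[idx]
            y_batch = y_train[idx]

            for j in range(len(idx)):
                # Forward pass through every layer.
                output = x_batch[j]
                for layer in self.layers:
                    output = layer.forward(output)

                err += mse(y_batch[j], output)

                # Backward pass: each layer accumulates its own gradients.
                error = mse_prime(y_batch[j], output)
                for layer in reversed(self.layers):
                    error = layer.backward(error)
            
            # One parameter update per minibatch.
            for layer in self.layers:
                layer.step(learning_rate)

            if (self.verbose) and ((i%10) == 0):
                err /= len(idx)
                print('minibatch: %5d/%d   error=%0.9f' % (i, minibatches, err))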
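
The sampling line in `fit` draws indices without replacement, so a batch can never contain more rows than the training set. The sketch below isolates that expression; the dataset size of 4 is a made-up value, and the `Generator.choice` variant is an alternative shown for comparison, not what the notebook uses.

```python
import numpy as np

np.random.seed(0)               # fixed seed, only for reproducibility
n_samples, batch_size = 4, 64   # hypothetical: 4 training rows, default batch size

# Same expression as in fit(): sort random keys, keep the first batch_size positions.
idx = np.argsort(np.random.random(n_samples))[:batch_size]
print(len(idx))                 # 4: the slice cannot return more rows than exist
print(sorted(idx.tolist()))     # [0, 1, 2, 3]: every index appears at most once

# A comparable draw with NumPy's Generator API, with the cap written out.
rng = np.random.default_rng(0)
alt = rng.choice(n_samples, size=min(batch_size, n_samples), replace=False)
print(len(alt))                 # 4
```

Any loop over the batch should therefore run over the number of indices actually drawn rather than over the requested `batch_size`.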