# Super Simple Perceptron 🔥

Let's start with the perceptron and see what happens to the input as it goes through the model:

1. **Input**: The perceptron receives its input values as a vector $ \mathbf{x} $.

   Here's the example vector with two inputs: $ \mathbf{x} = [0.5, 1.5] $.

2. **Weights and Bias**:
   - **Weights ($ \mathbf{w} $)**: The perceptron learns these parameters during training. They set the importance, or how much `weight`, we give each input.
   - **Bias ($ b $)**: This is the other learned parameter. It helps the model fit the data better by shifting the weighted sum before the activation function is applied.
   > Essentially, the bias adjusts the output regardless of the inputs: it moves the point at which the activation function switches on, which matters most when the input values are zero or do not fully explain the output on their own. 🤔


   For our example, let's use weights $ \mathbf{w} = [0.8, 0.2] $ and bias $ b = 0.5 $.

3. **Weighted Sum**: The perceptron calculates a weighted sum of the inputs and adds the bias. The formula is:

   $$
   z = \mathbf{w} \cdot \mathbf{x} + b
   $$

   Substituting the values:

   $$
   z = (0.8 \times 0.5) + (0.2 \times 1.5) + 0.5 = 0.4 + 0.3 + 0.5 = 1.2
   $$

4. **Activation Function**: We pass the weighted sum through an activation function. Common choices, which we'll cover later, include the step function, sigmoid, and ReLU. For simplicity, we'll use a step function:

   $$
   \text{output} = \begin{cases}
      1 & \text{if } z \geq 0 \\
      0 & \text{if } z < 0
   \end{cases}
   $$

   Since $ z = 1.2 $, which is greater than 0, the output is 1.
   Because anything at or above 0 becomes 1 💪


In [None]:

import numpy as np

# Define input, weights, and bias
x = np.array([0.5, 1.5])
w = np.array([0.8, 0.2])
b = 0.5

# Calculate weighted sum
z = np.dot(w, x) + b
print(f'Weighted sum: {z}')

# Step activation function
output = 1 if z >= 0 else 0
print(f'Output: {output}')


Weighted sum: 1.2000000000000002
Output: 1
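
To see what the bias does on its own, the sketch below feeds an all-zero input through the same perceptron with three bias values. The helper name `perceptron` and the bias values `-0.5` and `0.0` are assumptions chosen for illustration; only `0.5` comes from the example above.

```python
import numpy as np

def perceptron(x, w, b):
    """Weighted sum followed by the step activation (1 if z >= 0, else 0)."""
    z = np.dot(w, x) + b
    return z, (1 if z >= 0 else 0)

w = np.array([0.8, 0.2])       # same weights as the example above
x_zero = np.array([0.0, 0.0])  # with zero inputs, z is just the bias

# Hypothetical bias values, chosen only to show the shift
for b in (-0.5, 0.0, 0.5):
    z, out = perceptron(x_zero, w, b)
    print(f"b = {b:+.1f} -> z = {z:+.1f}, output = {out}")
```

With zero inputs the weighted sum equals the bias, so the output flips from 0 to 1 as soon as the bias reaches 0.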


In [None]:
x[0], w[0], b, z

(0.5, 0.8, 0.5, 1.2000000000000002)
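
The step function is only one of the activation functions mentioned above, so here is a minimal sketch comparing step, sigmoid, and ReLU on the same weighted sums. The function names `step`, `sigmoid`, and `relu` are labels chosen for this sketch, and the test values `-2.0` and `0.0` are assumptions added next to our `1.2`.

```python
import numpy as np

def step(z):
    """1 where z >= 0, else 0."""
    return np.where(z >= 0, 1, 0)

def sigmoid(z):
    """Squashes any real number into the range (0, 1)."""
    return 1 / (1 + np.exp(-z))

def relu(z):
    """Keeps positive values and replaces negatives with 0."""
    return np.maximum(0, z)

# 1.2 is the weighted sum from the example; the other two are illustrative
z_values = np.array([-2.0, 0.0, 1.2])

print("step:   ", step(z_values))
print("sigmoid:", sigmoid(z_values))
print("relu:   ", relu(z_values))
```

The step function throws away how far `z` is from 0, while sigmoid and ReLU keep some of that information, which is one reason they show up in multilayer networks.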


# **Extending to Multilayer Perceptron (MLP)**

Here, we'll investigate how `inputs` are transformed as we move from a single perceptron to a Multilayer Perceptron (MLP), and what's happening inside!

1. **Input Layer**: Similar to a single perceptron, the MLP starts with an input layer—the same input $ \mathbf{x} = [0.5, 1.5] $.

2. **Hidden Layer(s)**: MLPs have one or more hidden layers. Each neuron in a hidden layer performs the same operations as a single perceptron.

   Let's take a look at an MLP with one hidden layer containing two neurons.

   **Neuron 1 in Hidden Layer:**
   - Weights: $ \mathbf{w_1} = [0.8, 0.2] $
   - Bias: $ b_1 = 0.5 $

   **Neuron 2 in Hidden Layer:**
   - Weights: $ \mathbf{w_2} = [0.4, 0.9] $
   - Bias: $ b_2 = -0.3 $

   **Calculations for Hidden Layer:**
   - Neuron 1: $ z_1 = (0.8 \times 0.5) + (0.2 \times 1.5) + 0.5 = 1.2 $
   - Neuron 2: $ z_2 = (0.4 \times 0.5) + (0.9 \times 1.5) - 0.3 = 0.2 + 1.35 - 0.3 = 1.25 $

   > Applying a ReLU activation function turns negative numbers to 0. ReLU applied to `[-1  0  1  2]` => `[0 0 1 2]`.

   > ReLU is often used in hidden layers of neural networks because it helps to mitigate the vanishing gradient problem and introduces non-linearity into the model.

   Here we apply ReLU:

   $$
   \text{output}_1 = \max(0, 1.2) = 1.2
   $$
   $$
   \text{output}_2 = \max(0, 1.25) = 1.25
   $$

3. **Output Layer**: The outputs from the hidden layer are then fed into the output layer.

   **Neuron in Output Layer:**
   - Weights: $ \mathbf{w_3} = [0.3, 0.7] $
   - Bias: $ b_3 = 0.1 $

   **Calculations for Output Layer:**
   - Input to this layer: $ [1.2, 1.25] $

   $$
   z_3 = (0.3 \times 1.2) + (0.7 \times 1.25) + 0.1 = 0.36 + 0.875 + 0.1 = 1.335
   $$

   Applying a sigmoid activation function for binary classification:

   $$
   \text{output} = \frac{1}{1 + e^{-1.335}} \approx 0.79
   $$
   $$
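
For binary classification, the sigmoid output is usually read as a score between 0 and 1 and then turned into a class label. The sketch below assumes a cutoff of `0.5`, which is a common convention but not something this notebook specifies; the name `to_label` and the sample `z` values are also hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Squashes any real number into the range (0, 1)."""
    return 1 / (1 + np.exp(-z))

def to_label(score, threshold=0.5):
    """Assumed rule: class 1 if the score reaches the threshold, else class 0."""
    return int(score >= threshold)

# Illustrative weighted sums, not taken from the network above
for z in (-2.0, 0.0, 2.0):
    score = sigmoid(z)
    print(f"z = {z:+.1f} -> score = {score:.3f} -> class {to_label(score)}")
```

Because sigmoid maps `z = 0` to exactly 0.5, a 0.5 cutoff on the score is the same as asking whether `z` is at or above 0.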


In [1]:
import numpy as np

# Define input
x = np.array([0.5, 1.5])

# Weights and biases for hidden layer
w1 = np.array([0.8, 0.2])
b1 = 0.5
w2 = np.array([0.4, 0.9])
b2 = -0.3

# Calculate outputs for hidden layer
z1 = np.dot(w1, x) + b1
z2 = np.dot(w2, x) + b2

# Apply ReLU activation
output1 = max(0, z1)
output2 = max(0, z2)

# Weights and bias for output layer
w3 = np.array([0.3, 0.7])
b3 = 0.1

# Calculate output for output layer
z3 = np.dot(w3, np.array([output1, output2])) + b3

# Apply sigmoid activation
output = 1 / (1 + np.exp(-z3))

print(f'Output: {output}')


Output: 0.7916664907298545
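
The same forward pass can also be written with one weight matrix per layer, which is how the calculation scales to more neurons and more layers. This is a sketch, not the notebook's own method: the names `W_hidden`, `b_hidden`, `W_out`, `b_out`, `layers`, and `forward` are assumptions, and each row of a weight matrix holds one neuron's weights.

```python
import numpy as np

def relu(z):
    """Elementwise ReLU: negatives become 0."""
    return np.maximum(0, z)

def sigmoid(z):
    """Squashes any real number into the range (0, 1)."""
    return 1 / (1 + np.exp(-z))

# Hidden layer: row 0 is w1, row 1 is w2 from the cell above
W_hidden = np.array([[0.8, 0.2],
                     [0.4, 0.9]])
b_hidden = np.array([0.5, -0.3])

# Output layer: a single neuron, so a single row (w3)
W_out = np.array([[0.3, 0.7]])
b_out = np.array([0.1])

def forward(x, layers):
    """Feed x through each (W, b, activation) triple in order."""
    a = x
    for W, b, activation in layers:
        a = activation(W @ a + b)  # this layer's output is the next layer's input
    return a

layers = [(W_hidden, b_hidden, relu), (W_out, b_out, sigmoid)]
x = np.array([0.5, 1.5])

print(f"Hidden layer: {relu(W_hidden @ x + b_hidden)}")
print(f"Output: {forward(x, layers)[0]}")
```

The printed output should agree with the value from the cell above up to floating-point rounding. Adding another layer only means appending one more `(W, b, activation)` triple to `layers`.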



### Explanation Recap

- **Perceptron:** A single neuron, one layer of computation, that turns its inputs into an output with a weighted sum and an activation function.
- **MLP:** Multiple layers, many neurons, where each layer’s output becomes the next layer’s input, allowing the network to learn complex patterns.

I hope this helps you understand how inputs are transformed as they travel through the network. 🔥
