# Perceptron – The Simplest Neural Network
🎯 Objective: <br>
Understand how a single-layer neural network (a perceptron) makes binary decisions.
$$
\hat{y} = f\left( \sum_{i=1}^{n} w_i x_i + b \right)
$$


**Where:**<br>
- x = [x_1, x_2, ..., x_n] : input vector  
- w = [w_1, w_2, ..., w_n] : weights  
- b : bias term  
- f : activation function (e.g., step function)  
- ŷ : predicted output (0 or 1)
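
As a quick numerical check of the formula, the sketch below evaluates it once by hand. The input, weight, and bias values are made up for illustration and are not taken from any trained model.

```python
import numpy as np

# Illustrative values (assumptions chosen for this example only)
x = np.array([1.0, 0.0])      # input vector
w = np.array([0.5, -0.3])     # weights
b = -0.2                      # bias term

z = np.dot(w, x) + b          # weighted sum: 0.5*1 + (-0.3)*0 + (-0.2)
y_hat = 1 if z >= 0 else 0    # step activation f

print(f"weighted sum z = {z:.2f}, prediction = {y_hat}")
```

Because the weighted sum is non-negative, the step function outputs 1 for this input.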

## What Makes This a Single-Layer Perceptron (SLP)?
A single-layer perceptron (SLP) is a network with exactly one layer of trainable weights between the input and the output.

### Key Characteristics of an SLP:
**Architecture:** Just one layer of weights connecting input directly to output<br>
**Activation Function:** Simple (e.g., step function or sigmoid), applied directly after weighted sum<br>
**Training Rule:** Uses the Perceptron Learning Rule (or gradient descent if the activation is differentiable). There is only one transformation before the activation.<br>

Inputs → [ Weights + Bias ] → Activation → Output


In [11]:
import numpy as np

In [13]:
class Perceptron:
    """Single-layer perceptron with a step activation."""

    def __init__(self, input_dim, lr=0.01, epochs=100):
        self.weights = np.zeros(input_dim)  # one weight per input feature
        self.bias = 0.0
        self.lr = lr          # learning rate: step size of each update
        self.epochs = epochs  # number of full passes over the training data

    def activation(self, x):
        return 1 if x >= 0 else 0  # Step function

    def predict(self, x):
        linear_output = np.dot(self.weights, x) + self.bias
        return self.activation(linear_output)

    def train(self, X, y):
        for _ in range(self.epochs):
            for xi, target in zip(X, y):
                pred = self.predict(xi)
                # Perceptron learning rule: update is zero when the prediction is correct
                update = self.lr * (target - pred)
                self.weights += update * xi
                self.bias += update

In [15]:
# Example usage
X = np.array([
    [0, 0],
    [0, 1],
    [1, 0],
    [1, 1]
])
y = np.array([0, 0, 0, 1])  # AND function

model = Perceptron(input_dim=2)
model.train(X, y)

In [17]:
for x in X:
    print(f"Input: {x}, Predicted: {model.predict(x)}")

Input: [0 0], Predicted: 0
Input: [0 1], Predicted: 0
Input: [1 0], Predicted: 0
Input: [1 1], Predicted: 1


## Perceptron Concept Quiz
1. What is the purpose of the activation function in a perceptron?<br>
    a) To initialize weights <br>
    b) To determine whether the perceptron fires or not <br>
    c) To normalize the input features <br>
    d) To reduce overfitting <br>

2. Which of the following problems cannot be solved by a single-layer perceptron?<br>
    a) AND logic gate<br>
    b) OR logic gate<br>
    c) XOR logic gate<br>
    d) NOT logic gate<br>

3. What happens during training when the perceptron makes a wrong prediction?<br>
    a) The weights remain unchanged<br>
    b) The weights are updated to minimize the error<br>
    c) A new model is created<br>
    d) The perceptron stops training<br>

4. Why is a bias term 𝑏 included in the perceptron?<br>
    a) To reduce the dimensionality<br>
    b) To scale the output<br>
    c) To shift the activation function<br>
    d) To prevent overfitting<br>

## Answers
**Q1: b**<br>
The activation function determines whether the perceptron "fires" (outputs 1) or not (outputs 0). It turns the weighted sum into the final decision.<br>
**Q2: c**<br>
A single-layer perceptron cannot solve XOR because XOR is not linearly separable. You need a multi-layer perceptron (MLP) for that.<br>
**Q3: b**<br>
When the perceptron makes a wrong prediction, it updates the weights and bias using the Perceptron Learning Rule, where η is the learning rate (`lr`), y is the target, and ŷ is the prediction:
$$
w_i \leftarrow w_i + \eta \, (y - \hat{y}) \, x_i, \qquad b \leftarrow b + \eta \, (y - \hat{y})
$$
**Q4: c**<br>
The bias term lets the decision boundary shift away from the origin, which improves flexibility in fitting data that isn't centered at the origin.<br>
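
The answer to Q2 can be checked empirically. The sketch below re-implements the same learning rule as a standalone function and trains it on both AND and XOR. The names `train_perceptron` and `predict_all` are introduced here for illustration only.

```python
import numpy as np

def train_perceptron(X, y, lr=0.01, epochs=100):
    """Perceptron learning rule with a step activation (hypothetical helper)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if np.dot(w, xi) + b >= 0 else 0
            update = lr * (target - pred)
            w += update * xi
            b += update
    return w, b

def predict_all(X, w, b):
    """Apply the trained perceptron to every row of X."""
    return np.array([1 if np.dot(w, xi) + b >= 0 else 0 for xi in X])

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
targets = {
    "AND": np.array([0, 0, 0, 1]),
    "XOR": np.array([0, 1, 1, 0]),
}

results = {}
for name, y in targets.items():
    w, b = train_perceptron(X, y)
    results[name] = float(np.mean(predict_all(X, w, b) == y))
    print(f"{name}: accuracy = {results[name]:.2f}")
```

AND is learned perfectly, while XOR always leaves at least one point misclassified, because no single straight line separates its two classes.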

### Analogy:
Think of it like this:<br>
A **single-layer perceptron** is like a simple light switch (a digital signal): it is either on or off, depending on one decision rule.<br>
A **multi-layer perceptron (MLP)** is like a smart thermostat (an analog signal): it makes more complex decisions by combining multiple intermediate rules (neurons in hidden layers).<br>

## Learning Phase = Transformation Phase
In machine learning, especially neural networks, the learning phase refers to the period during which the model:
- Sees data
- Makes predictions
- Calculates errors
- Updates parameters (weights and bias)

In effect, this is when the transformation function
$$
\hat{y} = f\left( \sum_{i=1}^{n} w_i x_i + b \right)
$$
is being tuned so that the output ŷ gets closer to the target y.

## Weights and Biases Are Learned During This Phase
- Initially, the weights 𝑤 and bias 𝑏 are set to random values (or to zero).
- During training:
    - The model sees inputs and computes outputs.
    - It compares outputs to ground truth.
    - Based on the error, it updates 𝑤 and 𝑏 to reduce future errors.

This happens over multiple epochs.<br>
**The learning phase is the transformation phase, and this is also where the appropriate values for weight and bias are determined.**
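
To make the cycle concrete, here is a minimal sketch of a single learning step on one sample. The function name `learning_step` and the starting parameters are assumptions chosen for illustration. The update itself follows the perceptron rule used in the `Perceptron` class.

```python
import numpy as np

def learning_step(w, b, x, target, lr=0.01):
    """One pass through the cycle: predict, measure the error, update w and b."""
    pred = 1 if np.dot(w, x) + b >= 0 else 0   # the model computes an output
    error = target - pred                      # compare with the ground truth
    w_new = w + lr * error * x                 # adjust the weights
    b_new = b + lr * error                     # adjust the bias
    return w_new, b_new, pred, error

# Hypothetical starting parameters and one AND sample: (1, 1) -> 1
w, b = np.zeros(2), -0.01
x, target = np.array([1.0, 1.0]), 1

w, b, pred, error = learning_step(w, b, x, target)
print(f"pred={pred}, error={error}, new w={w}, new b={b:.2f}")
```

The sample is misclassified (predicted 0, target 1), so both the weights and the bias move in the direction that makes an output of 1 more likely next time.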

## lr — Learning Rate
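Definition: A small positive number (e.g., 0.01 or 0.001) that controls how much the model adjusts its weights during each update.<br>
**Why It Matters:**<br>
It sets the step size of learning. In general, if it is too large, updates can overshoot and oscillate; if it is too small, training needs many more epochs.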
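
The short sketch below shows how the learning rate scales a single weight update. The sample and the error value are illustrative assumptions: a misclassified input where the target is 1 and the prediction is 0.

```python
import numpy as np

x = np.array([1.0, 1.0])   # a misclassified sample (illustrative)
error = 1                  # target - prediction

deltas = {}
for lr in (0.001, 0.01, 0.1):
    deltas[lr] = lr * error * x      # same rule as in train(): lr * (target - pred) * xi
    print(f"lr={lr}: weight change = {deltas[lr]}")
```

The direction of the update is the same in every case; only its size changes with `lr`.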

## epoch — Epoch
Definition: One full pass through the entire training dataset.<br>
**Why It Matters:**<br>
The model needs multiple exposures to the data to learn patterns. A single pass is rarely enough.<br>
🔁 Example:<br>
You have 1,000 training samples. If you train for 10 epochs, the model will see all 1,000 samples 10 times (in total: 10,000 update opportunities).
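
A rough way to see why one pass is not enough is to count misclassifications per epoch on the AND data. The sketch below uses the same update rule as the `Perceptron` class. The early stop on an error-free epoch is an addition made here for illustration and is not part of the original class.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])  # AND function

w, b, lr = np.zeros(2), 0.0, 0.01
max_epochs = 100
errors_per_epoch = []

for epoch in range(max_epochs):
    errors = 0
    for xi, target in zip(X, y):          # one epoch = one pass over all samples
        pred = 1 if np.dot(w, xi) + b >= 0 else 0
        update = lr * (target - pred)
        w += update * xi
        b += update
        errors += int(pred != target)
    errors_per_epoch.append(errors)
    if errors == 0:                       # a clean pass: nothing left to correct
        break

print("misclassifications per epoch:", errors_per_epoch)
print("update opportunities used:", len(errors_per_epoch) * len(X))
```

The first epoch still contains mistakes, and the count only reaches zero after several passes over the same four samples.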

## Activation Function
Activation functions introduce non-linearity, allowing networks to model complex functions and decision boundaries.
Without one, your model would just be a linear transformation no matter how many layers you stack.<br>
The activation function used in this case is called the **Heaviside Step Function** (or simply the **Step function**).<br>

$$
f(x) =
\begin{cases}
1 & \text{if } x \geq 0 \\
0 & \text{otherwise}
\end{cases}
$$
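
The definition above translates directly into NumPy. The helper name `step` is chosen here for illustration; `np.heaviside` is NumPy's built-in equivalent, whose second argument sets the value returned at exactly zero.

```python
import numpy as np

def step(z):
    """Heaviside step: 1 where z >= 0, else 0 (works on scalars and arrays)."""
    return np.where(np.asarray(z) >= 0, 1, 0)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(step(z))                # [0 0 1 1 1]
print(np.heaviside(z, 1.0))   # same values as floats; 1.0 is the output at z == 0
```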

### Alternatives to the Step Function (Used in Practice)
- **Sigmoid**: output range (0, 1); good for probabilities.
- **Tanh**: output range (-1, 1); zero-centered output, better than sigmoid for hidden layers.
- **ReLU**: output range [0, ∞); fast convergence and sparse activations, popular in deep nets.
- **Leaky ReLU**: output range (−∞, ∞); avoids dead neurons by fixing ReLU's zero-gradient issue.
- **Softmax**: outputs in (0, 1) that sum to 1; used in the output layer for multiclass classification, where it converts logits to probabilities.
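
For reference, each of these functions can be sketched in a few lines of NumPy. These are minimal illustrative versions, not library implementations. The `alpha=0.01` slope for Leaky ReLU is a commonly used value assumed here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    return np.tanh(z)

def relu(z):
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):    # alpha: small slope for negative inputs (assumed value)
    return np.where(z > 0, z, alpha * z)

def softmax(z):
    shifted = z - np.max(z)       # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

z = np.array([-2.0, 0.0, 3.0])
for fn in (sigmoid, tanh, relu, leaky_relu, softmax):
    print(f"{fn.__name__:>10}: {np.round(fn(z), 3)}")
```

Unlike the step function, all of these have a usable gradient over at least part of their domain, which is what makes gradient-based training possible.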

![Alt text](images/perceptron.png)

