# Day 3: What is a Perceptron? Forward and Backward Propagation

Day 3 implements a single perceptron that solves a binary classification problem (the OR gate) using forward propagation and backward propagation in TensorFlow. The code avoids Keras and relies on `tf.GradientTape` directly, in order to expose the foundational concepts of neural networks.

## What I Learned

### 1. **What is a Perceptron?**
- A perceptron is a type of artificial neuron used in machine learning. It is the simplest type of neural network and is used for binary classification. The perceptron computes a weighted sum of its inputs and passes the result through an activation function:
    - #### Weighted Sum:
        - \( z = W \cdot X + b \)
    - Where:
        - \( W \): Weights
        - \( X \): Input features
        - \( b \): Bias term
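
The weighted sum above can be sketched in a few lines of plain Python. The helper name `weighted_sum` and the numeric values of `W`, `X`, and `b` are hypothetical, chosen only so the arithmetic is easy to follow; they do not come from the trained model.

```python
def weighted_sum(weights, inputs, bias):
    """Return z = W . X + b for a single neuron."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return sum(w * x for w, x in zip(weights, inputs)) + bias


# Hypothetical parameters, for illustration only
W = [0.5, -0.25]
X = [1.0, 2.0]
b = 0.1

# 0.5 * 1.0 + (-0.25) * 2.0 + 0.1
z = weighted_sum(W, X, b)
print(z)
```

On its own, `z` is just a real number; the activation function described next turns it into a decision.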

### 2. **What is an Activation Function?**
- An **activation function** applies a transformation to the weighted sum of inputs, determining whether a neuron should be activated or not. It introduces **non-linearity** to the model, enabling the neural network to learn complex patterns.

    #### **Common Activation Functions:**
    1. **Step Function**:  
       A binary function that outputs 1 when the weighted sum reaches a threshold (0 in the formula below) and 0 otherwise.
    
       \[
       f(z) =
       \begin{cases}
       1 & \text{if } z \geq 0 \\
       0 & \text{if } z < 0
       \end{cases}
       \]
    
    2. **Sigmoid Function**:  
       Maps the input to a range between 0 and 1. It is commonly used in binary classification.
    
       \[
       f(z) = \frac{1}{1 + e^{-z}}
       \]
    
    3. **ReLU (Rectified Linear Unit)**:  
       Outputs the input directly if it is positive, and 0 if it is negative. It is commonly used in hidden layers of deep networks.
    
       \[
       f(z) = \max(0, z)
       \]
    
    4. **Tanh (Hyperbolic Tangent)**:  
       Maps the input to a range between -1 and 1, similar to the sigmoid but centered around 0.
    
       \[
       f(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}
       \]

    #### **Purpose:**
    - Introduces non-linearity, allowing the neural network to learn complex patterns.
    - Helps the model make decisions based on the weighted input sum.
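
As a rough illustration, the four activation functions listed above can be written with Python's standard `math` module. This is a minimal sketch for moderate values of `z`; it makes no attempt to guard against overflow for very large magnitudes.

```python
import math


def step(z):
    """Step function: 1 if z >= 0, else 0."""
    return 1 if z >= 0 else 0


def sigmoid(z):
    """Sigmoid: squashes z into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """ReLU: passes positive values through, clamps the rest to 0."""
    return max(0.0, z)


def tanh(z):
    """Tanh: squashes z into (-1, 1), centered on 0."""
    return (math.exp(z) - math.exp(-z)) / (math.exp(z) + math.exp(-z))


# Compare the four functions on a few sample inputs
for z in (-2.0, 0.0, 2.0):
    print(z, step(z), round(sigmoid(z), 4), relu(z), round(tanh(z), 4))
```

Printing them side by side shows the key difference: the step function jumps abruptly at 0, while sigmoid and tanh change smoothly, which is what makes gradient-based training possible.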

### 3. **Forward Propagation**

- **Forward propagation** is the process by which input data is passed through the layers of a neural network to generate an output. It involves calculating the weighted sum of inputs, adding the bias, and applying the activation function to produce the final prediction.

    #### **Steps in Forward Propagation:**
     1. **Input Layer**:  
        - The input data \( X \) is passed into the neural network. This could be an image, text, or any other type of data.

     2. **Weighted Sum**:  
        - The inputs are multiplied by the corresponding weights \( W \) and added to the bias term \( b \).
     
        \[
        z = W \cdot X + b
        \]
  
     3. **Activation Function**:  
        - The weighted sum \( z \) is passed through an activation function \( f(z) \) (such as sigmoid, ReLU, or tanh) to introduce non-linearity. This determines whether the neuron is activated.

        \[
        a = f(z)
        \]

     4. **Output**:  
        - The output \( a \) is passed to the next layer or, in the case of the final layer, returned as the model's prediction.

    #### **Forward Propagation in a Single Layer:**
    - In a simple neural network with one layer:
        - Each input feature \( X \) is multiplied by a weight \( W \).
        - The bias term \( b \) is added to the weighted sum.
        - The result is passed through an activation function to obtain the output.

This process is repeated layer by layer, from the input layer to the output layer, until a final prediction is made.
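
A single-neuron forward pass can be sketched as follows. The weights `[1.0, 1.0]` and bias `-0.5` are hand-picked assumptions for illustration, not learned values; the inputs are the four binary pairs of a two-input logic gate.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, weights, bias):
    """One forward pass: weighted sum, then activation."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)


# Hypothetical, hand-picked parameters (not the result of training)
weights = [1.0, 1.0]
bias = -0.5

inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
outputs = [forward(x, weights, bias) for x in inputs]

for x, a in zip(inputs, outputs):
    print(x, round(a, 3))
```

With these particular parameters only `[0, 0]` yields an activation below 0.5, so the neuron already behaves like an OR gate. Training replaces the hand-picking with an automatic search for such parameters.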


### 4. **Backward Propagation**
- **Backward propagation**, or **backpropagation**, is the process used to train a neural network by adjusting its weights and biases based on the error in its predictions. It calculates the gradients of the loss function with respect to each parameter using the chain rule and updates the parameters to minimize the error.

    #### **Steps in Backward Propagation:**
    
     1. **Compute the Loss**:  
        - The loss function \( L \) quantifies the difference between the predicted output and the actual target. Common loss functions include:
          - Mean Squared Error (MSE) for regression tasks.
          - Cross-Entropy Loss for classification tasks.
  
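     2. **Calculate Gradients**:  
        - The gradients of the loss function with respect to the weights (\( W \)) and biases (\( b \)) are computed using the **chain rule** of calculus. For a single neuron with \( z = W \cdot X + b \) and \( a = f(z) \):

        \[
        \frac{\partial L}{\partial W} = \frac{\partial L}{\partial a} \cdot \frac{\partial a}{\partial z} \cdot \frac{\partial z}{\partial W}, \qquad
        \frac{\partial L}{\partial b} = \frac{\partial L}{\partial a} \cdot \frac{\partial a}{\partial z} \cdot \frac{\partial z}{\partial b}
        \]

     3. **Weight and Bias Updates**:  
        - The weights and biases are updated by stepping against the gradients, scaled by a learning rate \( \eta \) (step size), to reduce the loss.

        \[
        W \leftarrow W - \eta \frac{\partial L}{\partial W}, \qquad b \leftarrow b - \eta \frac{\partial L}{\partial b}
        \]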
     
     4. **Iterate Through Layers**:  
        - Backpropagation starts at the output layer and propagates the error backward through each hidden layer, updating weights and biases at every step.
     
    #### **Purpose of Backward Propagation:**
     - **Minimize the Loss**: Helps the network learn by reducing the difference between predicted and actual outputs.
     - **Efficient Training**: Computes the gradients for all weights and biases in a single backward pass, reusing intermediate results from the chain rule instead of differentiating each parameter separately.

Backward propagation is an essential step in training neural networks and enables them to learn from data effectively.
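
To make the chain rule concrete, here is a minimal sketch of backpropagation written by hand for one sigmoid neuron with a mean squared error loss, trained on OR-gate data. It uses plain Python rather than TensorFlow's automatic differentiation, and it starts from zero-valued parameters (an assumption made so the run is reproducible). The function name `train_step` is hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_step(X, y, w, b, lr):
    """One epoch of gradient descent; returns updated (w, b) and the loss."""
    n = len(X)
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    loss = 0.0
    for x_i, y_i in zip(X, y):
        # Forward pass
        z = sum(wj * xj for wj, xj in zip(w, x_i)) + b
        a = sigmoid(z)
        loss += (y_i - a) ** 2 / n
        # Chain rule: dL/dz = dL/da * da/dz
        #   dL/da = 2 * (a - y) / n   (mean squared error)
        #   da/dz = a * (1 - a)       (sigmoid derivative)
        delta = (2.0 * (a - y_i) / n) * a * (1.0 - a)
        # dz/dw_j = x_j and dz/db = 1
        for j, xj in enumerate(x_i):
            grad_w[j] += delta * xj
        grad_b += delta
    # Gradient descent update
    w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
    b = b - lr * grad_b
    return w, b, loss


# OR-gate data
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [0.0, 1.0, 1.0, 1.0]

# Zero initialization is an assumption made here for reproducibility
w, b = [0.0, 0.0], 0.0
losses = []
for epoch in range(1000):
    w, b, loss = train_step(X, y, w, b, lr=0.1)
    losses.append(loss)

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Each line of the gradient computation corresponds to one factor in the chain rule, which is the same bookkeeping an automatic differentiation tool performs internally.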


```python
import tensorflow as tf
import numpy as np
```


```python
# Data: OR Gate (Linearly separable)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
y = np.array([[0], [1], [1], [1]], dtype=np.float32)
```


```python
# Parameters: Weights (one per input feature) and Bias, randomly initialized
weights = tf.Variable(tf.random.normal([2, 1]))
bias = tf.Variable(tf.random.normal([1]))
```


```python
# Learning rate
learning_rate = 0.1
```


```python
# Sigmoid activation function
def sigmoid(x):
    return 1 / (1 + tf.exp(-x))
```


```python
# Loss function: Mean Squared Error
def compute_loss(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred))
```


```python
# Training the perceptron
epochs = 1000
```


```python
for epoch in range(epochs):
    # Record the forward pass so TensorFlow can differentiate through it
    with tf.GradientTape() as tape:
        # Forward propagation: z = XW + b, then a = sigmoid(z)
        linear_output = tf.matmul(X, weights) + bias
        predictions = sigmoid(linear_output)

        # Compute loss
        loss = compute_loss(y, predictions)

    # Backward propagation: gradients of the loss w.r.t. weights and bias
    gradients = tape.gradient(loss, [weights, bias])
    # Gradient descent update: parameter -= learning_rate * gradient
    weights.assign_sub(learning_rate * gradients[0])
    bias.assign_sub(learning_rate * gradients[1])

    # Logging progress
    if epoch % 100 == 0:
        print(f"Epoch {epoch}, Loss: {loss.numpy()}")
```


```
Epoch 0, Loss: 0.637333869934082
Epoch 100, Loss: 0.16182009875774384
Epoch 200, Loss: 0.0965847447514534
Epoch 300, Loss: 0.08095353096723557
Epoch 400, Loss: 0.06975092738866806
Epoch 500, Loss: 0.06086064502596855
Epoch 600, Loss: 0.05366594344377518
Epoch 700, Loss: 0.04776450991630554
Epoch 800, Loss: 0.04286418482661247
Epoch 900, Loss: 0.038749516010284424
```


```python
# Final weights and bias
print(f"Weights: {weights.numpy()}, Bias: {bias.numpy()}")
```


```
Weights: [[2.5266373]
 [2.465813 ]], Bias: [-0.8859231]
```


```python
# Testing on two OR-gate inputs, [0, 1] and [1, 1]; both should be close to 1
test_data = np.array([[0, 1], [1, 1]], dtype=np.float32)
test_predictions = sigmoid(tf.matmul(test_data, weights) + bias)
print(f"Predictions: {test_predictions.numpy()}")
```


```
Predictions: [[0.8291889 ]
 [0.98380184]]
```
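
The predictions are probabilities rather than class labels. One common way to turn them into 0/1 decisions is to threshold at 0.5; that cutoff is an assumption here, not something the notebook specifies. The sketch below copies the weights and bias from the printed output above and checks all four OR-gate inputs in plain Python.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Parameters copied from the printed training result above
weights = [2.5266373, 2.465813]
bias = -0.8859231


def predict_proba(x):
    """Forward pass with the trained parameters."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)


def predict_label(x, threshold=0.5):
    """Threshold the probability; 0.5 is an assumed cutoff."""
    return 1 if predict_proba(x) >= threshold else 0


inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
probs = [predict_proba(x) for x in inputs]
labels = [predict_label(x) for x in inputs]

for x, p, label in zip(inputs, probs, labels):
    print(x, round(p, 4), label)
```

The thresholded labels come out as 0, 1, 1, 1, matching the OR truth table, even though the probability for `[0, 0]` is still well above 0.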
