Backpropagation (short for "backward propagation of errors") is the algorithm used in deep learning to compute the gradient of a neural network's loss with respect to every weight and bias. An optimizer such as gradient descent then uses those gradients to update the parameters based on the errors in the network's predictions, so the model improves over successive training iterations.


#### What Does It Do?
- Purpose: It supplies the gradients needed to minimize the cost (error) by adjusting the weights and biases in the network.
#### How It Works
- It calculates the error (loss) between predicted and actual values.
- The error is propagated back through the network, layer by layer, using partial derivatives.
- Adjustments are made to the weights to reduce the error in future predictions.
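
To make the role of the partial derivatives concrete, here is a minimal worked derivation for a single sigmoid neuron. The notation is an assumption introduced for illustration only: input $x$, weight $w$, bias $b$, pre-activation $z = wx + b$, prediction $\hat{y} = \sigma(z)$, target $y$, learning rate $\eta$, and loss $L = \tfrac{1}{2}(y - \hat{y})^2$.

$$\frac{\partial L}{\partial w} = \frac{\partial L}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial z} \cdot \frac{\partial z}{\partial w} = -(y - \hat{y}) \cdot \hat{y}(1 - \hat{y}) \cdot x$$

$$w \leftarrow w - \eta \frac{\partial L}{\partial w} = w + \eta \, (y - \hat{y}) \, \hat{y}(1 - \hat{y}) \, x$$

Each factor in the product belongs to one stage of the computation, which is why the error can be passed backward one layer at a time.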



#### Steps in Backpropagation
1. Forward Pass: Input flows through the network to produce predictions (output).
   - Example: Predicting y = 10 when actual y = 12.

2. Calculate Error: Compute the difference (error) between predictions and actual values using a cost function.
   - Example: Mean Squared Error (MSE): $$\text{Error} = (12 - 10)^2 = 4$$

3. Backward Pass: The error is sent backward through the network (from output to input) using the chain rule of derivatives.
   - Gradients are calculated to determine how much each weight contributes to the error.

4. Update Weights: Gradients are used to adjust weights and biases using an optimization algorithm (e.g., Gradient Descent).
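
The four steps can be traced by hand on a deliberately tiny model. The sketch below is illustrative only: it assumes a hypothetical one-weight linear model `y_pred = w * x` with no bias or activation, and the values of `x`, `w`, and `learning_rate` are chosen so that the first prediction is 10 against a target of 12, as in the example above.

```python
# Hypothetical one-weight model: y_pred = w * x (no bias, no activation).
x, y_true = 5.0, 12.0    # single training example (assumed values)
w = 2.0                  # initial weight, chosen so the first prediction is 10
learning_rate = 0.01     # assumed step size

# 1. Forward pass: compute the prediction.
y_pred = w * x                           # 2 * 5 = 10

# 2. Calculate error: squared error for this one example.
loss = (y_true - y_pred) ** 2            # (12 - 10)^2 = 4

# 3. Backward pass: chain rule, dLoss/dw = dLoss/dy_pred * dy_pred/dw.
grad_w = -2.0 * (y_true - y_pred) * x    # -2 * 2 * 5 = -20

# 4. Update weight: step against the gradient.
w_new = w - learning_rate * grad_w       # 2 - 0.01 * (-20), about 2.2

# Re-run the forward pass with the updated weight to see the error shrink.
new_loss = (y_true - w_new * x) ** 2     # prediction is now about 11

print(f"loss before: {loss:.3f}, gradient: {grad_w:.3f}")
print(f"updated weight: {w_new:.3f}, loss after: {new_loss:.3f}")
```

A single update moves the prediction from 10 to roughly 11, so the squared error falls from 4 to about 1. Repeating the same four steps continues to reduce it.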


#### Why Backpropagation Is Needed
- Efficiency: It computes the gradients for all layers in a single backward pass, reusing intermediate results instead of differentiating each weight separately.
- Scalability: The same procedure applies to complex, multi-layer architectures.
- Accuracy: It helps reduce error and improve predictions over multiple iterations.

In [1]:
# Step 1: Import Required Libraries
import numpy as np

In [2]:
# Step 2: Define Activation Function and Derivative
def sigmoid(x):
    # Sigmoid activation function: 1 / (1 + e^(-x))
    return 1 / (1 + np.exp(-x))

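def sigmoid_derivative(x):
    # Derivative of sigmoid for backpropagation.
    # NOTE: x must already be a sigmoid output s = sigmoid(z),
    # because sigmoid'(z) = s * (1 - s). Do not pass the raw input z.
    return x * (1 - x)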
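
Because `sigmoid_derivative` expects an already-activated value, it is easy to misuse. The following sketch (an addition for illustration, not part of the original notebook) compares it against a central finite-difference estimate. The test points `z` and the step `eps` are arbitrary assumptions.

```python
import numpy as np

def sigmoid(z):
    # Sigmoid activation function: 1 / (1 + e^(-z))
    return 1 / (1 + np.exp(-z))

def sigmoid_derivative(s):
    # Same formula as in the notebook; s is expected to be a sigmoid output
    return s * (1 - s)

z = np.linspace(-3.0, 3.0, 7)   # arbitrary test points (assumed)
eps = 1e-6                      # finite-difference step (assumed)

# Central difference approximation of d(sigmoid)/dz
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)

analytic = sigmoid_derivative(sigmoid(z))  # correct usage: pass the activation
misused = sigmoid_derivative(z)            # incorrect usage: pass the raw input

print("correct usage matches numeric estimate:",
      np.allclose(numeric, analytic, atol=1e-8))
print("raw-input usage matches numeric estimate:",
      np.allclose(numeric, misused, atol=1e-8))
```

The first check should agree and the second should not, which is why the training loop passes `predicted_output` and `hidden_layer_output` (both sigmoid outputs) to this function.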

In [None]:
# Step 3: Initialize Training Data (Dummy)
# Input features: 4 examples (rows) with 2 features each (columns)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Target outputs (expected results for XOR problem)
y = np.array([[0], [1], [1], [0]])

In [4]:
# Step 4: Initialize Weights and Bias
# Random weights initialization
np.random.seed(42)  # For consistent results
weights_input_hidden = np.random.rand(2, 2)  # Between input and hidden layer
weights_hidden_output = np.random.rand(2, 1)  # Between hidden and output layer
bias_hidden = np.random.rand(1, 2)  # Bias for hidden layer
bias_output = np.random.rand(1, 1)  # Bias for output layer

In [5]:
# Step 5: Implement Forward and Backward Pass

# Training process
learning_rate = 0.1  # Step size applied to every update
for epoch in range(10000):  # Number of iterations
    # Forward Pass
    # Input to hidden layer
    hidden_layer_input = np.dot(X, weights_input_hidden) + bias_hidden
    hidden_layer_output = sigmoid(hidden_layer_input)

    # Hidden to output layer
    output_layer_input = np.dot(hidden_layer_output, weights_hidden_output) + bias_output
    predicted_output = sigmoid(output_layer_input)

    # Backward Pass (Error Calculation and Weight Update)
    # Calculate error (target minus prediction)
    error = y - predicted_output

    # Compute gradients with the chain rule; sigmoid_derivative receives the
    # activated outputs, not the pre-activation inputs
    output_gradient = error * sigmoid_derivative(predicted_output)
    hidden_error = output_gradient.dot(weights_hidden_output.T)
    hidden_gradient = hidden_error * sigmoid_derivative(hidden_layer_output)

    # Update weights and biases. The terms are added because error is defined
    # as (y - predicted_output), so they already point in the downhill direction
    weights_hidden_output += hidden_layer_output.T.dot(output_gradient) * learning_rate
    weights_input_hidden += X.T.dot(hidden_gradient) * learning_rate
    bias_hidden += np.sum(hidden_gradient, axis=0, keepdims=True) * learning_rate
    bias_output += np.sum(output_gradient, axis=0, keepdims=True) * learning_rate
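
One way to gain confidence in a hand-written backward pass is a numerical gradient check. The sketch below is an illustrative addition under stated assumptions: it takes the loss to be half the sum of squared errors (which is consistent with the update formulas above, though the notebook never names its loss), and it uses shortened hypothetical names (`W1`, `b1`, `W2`, `b2`) and an arbitrary seed rather than the notebook's variables.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    # Same two-layer forward pass as the training loop
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)
    return hidden, output

def loss_fn(X, y, W1, b1, W2, b2):
    # Assumed loss: half the sum of squared errors
    _, output = forward(X, W1, b1, W2, b2)
    return 0.5 * np.sum((y - output) ** 2)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)  # arbitrary seed (assumed)
W1, b1 = rng.random((2, 2)), rng.random((1, 2))
W2, b2 = rng.random((2, 1)), rng.random((1, 1))

# Backpropagated quantities, using the same formulas as the training loop
hidden, output = forward(X, W1, b1, W2, b2)
output_gradient = (y - output) * output * (1 - output)
hidden_gradient = output_gradient.dot(W2.T) * hidden * (1 - hidden)
step_W2 = hidden.T.dot(output_gradient)  # added to W2 (before learning rate)
step_W1 = X.T.dot(hidden_gradient)       # added to W1 (before learning rate)

def numeric_grad(param, eps=1e-6):
    # Central-difference estimate of dLoss/dparam, perturbing one entry at a time
    grad = np.zeros_like(param)
    for idx in np.ndindex(param.shape):
        original = param[idx]
        param[idx] = original + eps
        loss_plus = loss_fn(X, y, W1, b1, W2, b2)
        param[idx] = original - eps
        loss_minus = loss_fn(X, y, W1, b1, W2, b2)
        param[idx] = original  # restore the entry
        grad[idx] = (loss_plus - loss_minus) / (2 * eps)
    return grad

# The update steps should equal the negative loss gradient
print("W2 step matches:", np.allclose(step_W2, -numeric_grad(W2), atol=1e-7))
print("W1 step matches:", np.allclose(step_W1, -numeric_grad(W1), atol=1e-7))
```

If both checks pass, the quantities added to the weights are the negative gradient of the assumed loss, so the loop is performing plain gradient descent.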

In [6]:
# Step 6: Test the Model
# Print final predictions
print("Final Predictions:")
print(predicted_output)

Final Predictions:
[[0.06029012]
 [0.94447222]
 [0.944367  ]
 [0.05997169]]
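
The raw sigmoid outputs can be turned into class labels for comparison with the XOR targets. This is a small illustrative sketch: the prediction values are copied from the output above, and the 0.5 decision threshold is an assumption, not something the notebook specifies.

```python
import numpy as np

# Final predictions copied from the printed output above
predicted_output = np.array([[0.06029012],
                             [0.94447222],
                             [0.944367],
                             [0.05997169]])
y = np.array([[0], [1], [1], [0]])  # XOR targets

threshold = 0.5  # assumed decision threshold
predicted_labels = (predicted_output > threshold).astype(int)

accuracy = np.mean(predicted_labels == y)    # fraction of correct labels
mse = np.mean((y - predicted_output) ** 2)   # remaining mean squared error

print("labels:", predicted_labels.ravel())
print("accuracy:", accuracy)
print("mse:", mse)
```

With these values every thresholded label matches its XOR target, even though the raw outputs never reach exactly 0 or 1.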
