# Forward and Backward Propagation


# 1. Explain the concept of forward propagation in a neural network.

Solution:-
What is Forward Propagation?
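Forward propagation is the process in which an input passes through a neural network’s layers, one after another, to produce an output. Each layer computes a weighted sum of its inputs plus a bias and then applies an activation function. During training it is the first phase of each step, followed by backpropagation, which computes the gradients used to update the weights.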
Goal: Compute the predicted output (ŷ) based on input features (X) and the network’s weights & biases.
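
For a network with one hidden layer, this computation can be written compactly. The symbols $W^{[l]}$, $b^{[l]}$, and $g^{[l]}$ (weights, biases, and activation function of layer $l$) are notation assumed here for illustration, with one example per row of $X$:

$$
\begin{aligned}
Z^{[1]} &= X W^{[1]} + b^{[1]}, & A^{[1]} &= g^{[1]}\big(Z^{[1]}\big) \\
Z^{[2]} &= A^{[1]} W^{[2]} + b^{[2]}, & \hat{y} &= g^{[2]}\big(Z^{[2]}\big)
\end{aligned}
$$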

# 2. What is the purpose of the activation function in forward propagation?

An activation function introduces non-linearity in a neural network by transforming the weighted sum of inputs into an output signal. It helps the network learn complex patterns beyond simple linear relationships.

Why Are Activation Functions Important in Forward Propagation?
1. Introduces Non-Linearity (Essential for Learning)
Without activation functions, a neural network collapses to a single linear (affine) transformation of its input, no matter how many layers it has, because a composition of linear maps is itself linear.
Non-linear activation functions allow the network to model complex relationships (e.g., recognizing faces, understanding speech).
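
The collapse of stacked linear layers can be checked numerically. The following is a minimal sketch with made-up shapes and random values (none of these names come from the text): two layers without an activation are compared against one equivalent layer, and then a ReLU is placed between them.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))  # 5 examples, 3 features (illustrative)
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=(1, 4))
W2, b2 = rng.normal(size=(4, 2)), rng.normal(size=(1, 2))

# Two layers with no activation function in between
two_layer = (X @ W1 + b1) @ W2 + b2

# The same mapping written as a single layer
W_eq = W1 @ W2
b_eq = b1 @ W2 + b2
one_layer = X @ W_eq + b_eq

print(np.allclose(two_layer, one_layer))  # True

# With a ReLU between the layers, the single-layer form no longer reproduces the output
with_relu = np.maximum(0, X @ W1 + b1) @ W2 + b2
print(np.allclose(with_relu, one_layer))
```

The first check holds for any choice of weights; the second differs whenever the ReLU zeroes out at least one hidden value that contributes to the output.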

# 3. Describe the steps involved in the backward propagation (backpropagation) algorithm.

Solution:-
Backpropagation (Backward Propagation of Errors) is the algorithm used to train neural networks. It calculates the gradient of the loss function with respect to each weight and bias using the chain rule of calculus; an optimizer such as gradient descent then uses those gradients to update the parameters and reduce the error (loss).
Steps of Backpropagation Algorithm
Backpropagation follows forward propagation; a complete training step consists of the following:
1. Forward Propagation (Compute Output)
2. Compute the Loss Gradient (Error Signal)
3. Backpropagate Error to Output Layer
4. Backpropagate Error to Hidden Layers
5. Update Weights & Biases Using Gradient Descent
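
The steps above can be sketched in NumPy for a network with one hidden layer. This is a minimal sketch under assumptions not stated in the text: a ReLU hidden layer, a sigmoid output, binary cross-entropy loss, toy data, and a learning rate of 0.1. The helper names `forward` and `bce` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 4 examples, 3 features, binary labels (illustrative values)
X = rng.normal(size=(4, 3))
y = np.array([[0.0], [1.0], [1.0], [0.0]])

# Parameters: 3 inputs -> 4 hidden units (ReLU) -> 1 output (sigmoid)
W1, b1 = rng.normal(size=(3, 4)) * 0.5, np.zeros((1, 4))
W2, b2 = rng.normal(size=(4, 1)) * 0.5, np.zeros((1, 1))
lr = 0.1  # learning rate (assumed)

def forward(X, W1, b1, W2, b2):
    Z1 = X @ W1 + b1
    A1 = np.maximum(0, Z1)          # ReLU
    Z2 = A1 @ W2 + b2
    A2 = 1 / (1 + np.exp(-Z2))      # sigmoid
    return Z1, A1, A2

def bce(A2, y, eps=1e-12):
    # Mean binary cross-entropy; eps guards against log(0)
    return float(-np.mean(y * np.log(A2 + eps) + (1 - y) * np.log(1 - A2 + eps)))

# Step 1: forward propagation
Z1, A1, A2 = forward(X, W1, b1, W2, b2)
loss_before = bce(A2, y)

# Steps 2-3: error signal at the output layer.
# For sigmoid + binary cross-entropy, dL/dZ2 simplifies to (A2 - y) / m.
m = X.shape[0]
dZ2 = (A2 - y) / m
dW2 = A1.T @ dZ2
db2 = dZ2.sum(axis=0, keepdims=True)

# Step 4: propagate the error back to the hidden layer
dA1 = dZ2 @ W2.T
dZ1 = dA1 * (Z1 > 0)                # derivative of ReLU is 1 where Z1 > 0, else 0
dW1 = X.T @ dZ1
db1 = dZ1.sum(axis=0, keepdims=True)

# Step 5: gradient descent update (new arrays, originals left untouched)
W1_new, b1_new = W1 - lr * dW1, b1 - lr * db1
W2_new, b2_new = W2 - lr * dW2, b2 - lr * db2

loss_after = bce(forward(X, W1_new, b1_new, W2_new, b2_new)[2], y)
print("loss before:", loss_before)
print("loss after: ", loss_after)
```

For a sufficiently small learning rate the loss after the update is expected to be lower than before; repeating the five steps in a loop gives a basic training procedure.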

# 4. What is the purpose of the chain rule in backpropagation?

Solution:-
The chain rule is a fundamental concept from calculus that is used in backpropagation to compute the gradients of the loss function with respect to each weight in a neural network. It is critical for efficiently calculating how the error (loss) changes as the weights and biases are adjusted during training.

The chain rule allows us to compute the derivative of a composite function. If you have a function that is composed of other functions, the chain rule helps to find the derivative of the entire function with respect to the variable of interest.

In backpropagation, we need to compute how the loss function (error) changes with respect to each weight, starting from the output layer all the way to the input layer. This requires calculating gradients layer by layer, and the chain rule is the mathematical tool for that.
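
As a concrete case, take a network with one hidden layer, where $Z^{[1]} = XW^{[1]} + b^{[1]}$, $A^{[1]} = g(Z^{[1]})$, $Z^{[2]} = A^{[1]}W^{[2]} + b^{[2]}$, and the loss $L$ is computed from $Z^{[2]}$ (this notation is assumed here for illustration). Written schematically, ignoring transposes and the exact ordering of the matrix products, the chain rule factors the gradients as:

$$
\frac{\partial L}{\partial W^{[2]}}
= \frac{\partial L}{\partial Z^{[2]}} \cdot \frac{\partial Z^{[2]}}{\partial W^{[2]}},
\qquad
\frac{\partial L}{\partial W^{[1]}}
= \frac{\partial L}{\partial Z^{[2]}} \cdot \frac{\partial Z^{[2]}}{\partial A^{[1]}} \cdot \frac{\partial A^{[1]}}{\partial Z^{[1]}} \cdot \frac{\partial Z^{[1]}}{\partial W^{[1]}}
$$

The factor $\partial L / \partial Z^{[2]}$ appears in both expressions, so it is computed once at the output layer and reused when moving back to the hidden layer.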

Why is the Chain Rule Important in Backpropagation?
Enables Efficient Gradient Computation
The chain rule allows us to break down complex derivatives into smaller, manageable parts, making the process of computing gradients efficient.

Backpropagates Error Through Layers
By applying the chain rule layer by layer, backpropagation propagates the error backward from the output to the input layer, allowing each weight in the network to be updated correctly.

Facilitates Optimization
The gradients calculated using the chain rule guide the weight updates during training, helping the model converge toward a better solution (i.e., minimizing the loss function).
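
To illustrate the points above, the following sketch applies the chain rule to a single sigmoid neuron and compares the result with a finite-difference estimate. The squared-error loss, the input values, and the function names are assumptions chosen for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, x=2.0, b=0.5, y=1.0):
    # Squared-error loss of one sigmoid neuron: L = 0.5 * (a - y)^2
    a = sigmoid(w * x + b)
    return 0.5 * (a - y) ** 2

w, x, b, y = 0.3, 2.0, 0.5, 1.0

# Forward pass, keeping the intermediate values
z = w * x + b
a = sigmoid(z)

# Local derivatives of each stage
dL_da = a - y          # derivative of 0.5 * (a - y)^2 with respect to a
da_dz = a * (1 - a)    # derivative of the sigmoid
dz_dw = x              # derivative of w * x + b with respect to w

# Chain rule: multiply the local derivatives
grad_chain = dL_da * da_dz * dz_dw

# Independent check with a central finite difference
eps = 1e-6
grad_numeric = (loss(w + eps) - loss(w - eps)) / (2 * eps)

print(grad_chain, grad_numeric)
```

The two printed values should agree to several decimal places, which is the same kind of check (gradient checking) often used to validate a hand-written backpropagation implementation.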



# 5. Implement the forward propagation process for a simple neural network with one hidden layer using NumPy.

Solution:-
Here's a simple implementation of forward propagation for a neural network with one hidden layer using NumPy. We will define:

- An input layer with three features (a fixed sample input).
- A hidden layer with four neurons and a ReLU activation function.
- An output layer with a sigmoid activation function (for binary classification).

Steps:
1. Input Layer: Take the input features X; the weights are initialized randomly and the biases to zero.
2. Hidden Layer: Compute weighted sum, apply ReLU activation.
3. Output Layer: Compute weighted sum, apply Sigmoid activation (for probability output).

In [1]:
import numpy as np

# Initialize parameters (weights and biases) for a neural network with 1 hidden layer
input_size = 3  # Number of features (input layer size)
hidden_size = 4  # Number of neurons in the hidden layer
output_size = 1  # Output layer size (binary classification)

# Random initialization of weights and biases
np.random.seed(42)  # For reproducibility
W1 = np.random.randn(input_size, hidden_size)  # Weights for input to hidden layer
b1 = np.zeros((1, hidden_size))  # Biases for hidden layer
W2 = np.random.randn(hidden_size, output_size)  # Weights for hidden to output layer
b2 = np.zeros((1, output_size))  # Biases for output layer

# Sample input data (let's assume batch size = 1 for simplicity)
X = np.array([[0.1, 0.2, 0.3]])  # Shape: (1, 3)

# Forward Propagation
# 1. Hidden Layer: Compute Z1, then apply ReLU activation
Z1 = np.dot(X, W1) + b1  # Weighted sum for hidden layer
A1 = np.maximum(0, Z1)  # ReLU activation

# 2. Output Layer: Compute Z2, then apply Sigmoid activation
Z2 = np.dot(A1, W2) + b2  # Weighted sum for output layer
A2 = 1 / (1 + np.exp(-Z2))  # Sigmoid activation (output layer)

# Output of forward propagation (A2 is the final predicted probability)
print("Output (Predicted Probability):")
print(A2)


Output (Predicted Probability):
[[0.33060082]]
