# **Forward and Backward Propagation Assignment**

## Q1. Explain the concept of forward propagation in a neural network.


### **Forward Propagation in Neural Networks**  
Forward propagation is the process of passing input data through the neural network to compute the output or prediction based on current weights and biases.  

### **Steps in Forward Propagation:**  
1. **Input Layer**: Input data is fed into the network.  
2. **Weighted Sum**: For each neuron in the hidden layer, a weighted sum of the inputs is computed. Each input feature is multiplied by a corresponding weight (which determines the importance of the feature), and then the bias term is added.

   Mathematically, this is expressed as:

   \[ z = w_1 x_1 + w_2 x_2 + \dots + w_n x_n + b \]

   where \(w_1, \dots, w_n\) are the weights, \(x_1, \dots, x_n\) are the input features, and \(b\) is the bias.
3. **Activation Function**: Apply an activation function (e.g., ReLU, sigmoid) to introduce non-linearity.  
4. **Hidden Layers**: Repeat the weighted sum and activation for all hidden layers.  
5. **Output Layer**: Generate the final output, such as probabilities (using softmax) or regression values.  

**Summary**: Forward propagation calculates the output by propagating inputs through the network, layer by layer, using weighted sums and activation functions. This forms the basis for making predictions.  
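As a small worked example of the weighted sum and activation steps, the sketch below evaluates a single neuron with three inputs. The input, weight, and bias values are made up purely for illustration, and sigmoid is just one possible activation.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical values, chosen only for illustration
x = np.array([1.0, 2.0, 3.0])     # input features x1, x2, x3
w = np.array([0.5, -0.25, 0.1])   # weights w1, w2, w3
b = 0.2                           # bias

# Weighted sum: z = w1*x1 + w2*x2 + w3*x3 + b
#             = 0.5*1 - 0.25*2 + 0.1*3 + 0.2 = 0.5
z = np.dot(w, x) + b

# Activation: a = sigmoid(z)
a = sigmoid(z)

print(f"z = {z:.4f}")
print(f"a = {a:.4f}")
```

A full layer repeats this computation for every neuron, which is why the weights are usually stored as a matrix and the sums computed with a single matrix product.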


## Q2. What is the purpose of the activation function in forward propagation?


### **Purpose of the Activation Function in Forward Propagation**  
The activation function introduces non-linearity to the neural network, enabling it to model complex, non-linear relationships in data.  

#### **Key Points:**  
1. **Avoids Linearity**: Without activation functions, the network behaves as a single linear model no matter how many layers it has, because a composition of linear (affine) transformations is itself linear.  
2. **Enables Complexity**: Non-linear activation functions (e.g., ReLU, sigmoid, tanh) allow the network to learn complex patterns and approximate non-linear functions.  
3. **Improves Learning**: They enable feature extraction, hierarchical learning, and adaptation to varying levels of abstraction, enhancing generalization and prediction accuracy.  

In summary, activation functions are essential for solving complex tasks like image recognition or natural language processing, as they transform simple linear computations into powerful non-linear models.
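The claim that stacked layers without an activation collapse into one linear model can be checked numerically. The sketch below uses small, hand-picked matrices (hypothetical values, not taken from any dataset) and compares a two-layer linear network with the equivalent single layer, then repeats the comparison with ReLU inserted between the layers.

```python
import numpy as np

# Hypothetical inputs and parameters, chosen only for illustration
X = np.array([[1.0, -1.0],
              [2.0, 0.5]])
W1 = np.array([[1.0, -1.0],
               [0.5, 2.0]])
b1 = np.array([0.0, 0.1])
W2 = np.array([[1.0],
               [-1.0]])
b2 = np.array([0.5])

# Two layers with NO activation in between
two_layer = (X @ W1 + b1) @ W2 + b2

# The same mapping written as a single linear layer
W_eq = W1 @ W2
b_eq = b1 @ W2 + b2
one_layer = X @ W_eq + b_eq

linear_collapses = np.allclose(two_layer, one_layer)

# With ReLU between the layers, the single linear layer no longer reproduces the output
relu_out = np.maximum(0.0, X @ W1 + b1) @ W2 + b2
relu_matches = np.allclose(relu_out, one_layer)

print("Linear stack equals one layer:", linear_collapses)
print("ReLU network equals that layer:", relu_matches)
```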

## Q3. Describe the steps involved in the backward propagation (backpropagation) algorithm.


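### **Steps in the Backward Propagation (Backpropagation) Algorithm:**  
Backpropagation computes the gradient of the loss with respect to every weight and bias by propagating the error backward from the output layer. An optimizer then uses these gradients to update the parameters and reduce the loss.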

**Key Steps:**  
1. **Forward Pass**: Compute the output of the network and calculate the loss using a loss function (e.g., MSE, Cross-Entropy).  
2. **Backward Pass**:  
   - **Output Layer**: Compute the gradient of the loss with respect to the output and backpropagate through the activation function.  
   - **Hidden Layers**: Calculate gradients for each layer using the chain rule, propagating the error backward layer by layer.  
3. **Weight Update**: Use the computed gradients to update weights and biases using an optimization algorithm (e.g., gradient descent). For gradient descent, the update rule is:

   \[ w_{\text{new}} = w_{\text{old}} - \eta \cdot \frac{\partial L}{\partial w} \]

   where \(\eta\) is the learning rate and \(\frac{\partial L}{\partial w}\) is the gradient of the loss with respect to the weight. In words: the new weight is the old weight minus the learning rate times the gradient.
4. **Repeat**: Perform forward and backward passes iteratively for multiple epochs until the loss converges.

**Summary**: Backpropagation calculates gradients to adjust weights and biases, minimizing error and improving the model's performance over iterations.
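The four steps above can be sketched as a short training loop. This is a minimal illustration, not a reference implementation: it assumes one hidden layer of 4 sigmoid units, an MSE loss, plain full-batch gradient descent, and a toy XOR dataset. The seed, learning rate, and epoch count are arbitrary choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(42)  # arbitrary seed so the run is repeatable

# Toy dataset (XOR), used here only as an example
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])

# Parameters: 2 inputs -> 4 hidden units -> 1 output
W1 = rng.normal(size=(2, 4))
b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1))
b2 = np.zeros(1)

eta = 0.5          # learning rate (assumed value)
n = X.shape[0]     # number of samples
losses = []

for epoch in range(2000):
    # 1. Forward pass and loss
    A1 = sigmoid(X @ W1 + b1)
    A2 = sigmoid(A1 @ W2 + b2)
    losses.append(np.mean((A2 - y) ** 2))

    # 2. Backward pass (chain rule, output layer first)
    dZ2 = (2.0 * (A2 - y) / n) * A2 * (1.0 - A2)  # dL/dZ2 through the output sigmoid
    dW2 = A1.T @ dZ2
    db2 = dZ2.sum(axis=0)
    dZ1 = (dZ2 @ W2.T) * A1 * (1.0 - A1)          # dL/dZ1 through the hidden sigmoid
    dW1 = X.T @ dZ1
    db1 = dZ1.sum(axis=0)

    # 3. Weight update: w_new = w_old - eta * dL/dw
    W1 -= eta * dW1
    b1 -= eta * db1
    W2 -= eta * dW2
    b2 -= eta * db2

# 4. Repeating the passes should drive the loss down
print(f"initial loss: {losses[0]:.4f}")
print(f"final loss:   {losses[-1]:.4f}")
```

How low the loss ends up depends on the initialization and the learning rate, so the printed values are best read as a trend rather than a benchmark.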

## Q4. What is the purpose of the chain rule in backpropagation?



### **Purpose of the Chain Rule in Backpropagation:**  
The chain rule is crucial in backpropagation because it enables the calculation of gradients for the loss function with respect to weights and biases in a neural network.  

**Key Points:**  
1. **Gradient Decomposition**: The chain rule breaks the gradient calculation into smaller, manageable parts by chaining the derivatives of activations and weights.  
2. **Error Propagation**: It helps propagate the error backward through layers by computing how the loss at the output depends on the parameters of each layer.  
3. **Efficient Updates**: This process ensures efficient gradient computation for deep networks, allowing proper updates to weights and biases to minimize loss.  

**Summary**: The chain rule allows backpropagation to compute gradients layer by layer, making it feasible to train deep neural networks efficiently.
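To make the gradient decomposition concrete, consider a single sigmoid neuron with squared-error loss, where \(z = wx + b\), \(a = \sigma(z)\), and \(L = (a - y)^2\). The chain rule gives \(\frac{\partial L}{\partial w} = \frac{\partial L}{\partial a} \cdot \frac{\partial a}{\partial z} \cdot \frac{\partial z}{\partial w}\). The sketch below computes that product and compares it with a finite-difference estimate. All numeric values are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, b, x, y):
    """Squared error of a single sigmoid neuron."""
    a = sigmoid(w * x + b)
    return (a - y) ** 2

# Hypothetical input, target, and parameters
x, y = 1.5, 1.0
w, b = 0.8, -0.3

# Forward pass
z = w * x + b
a = sigmoid(z)

# Chain rule: dL/dw = dL/da * da/dz * dz/dw
dL_da = 2.0 * (a - y)
da_dz = a * (1.0 - a)     # derivative of the sigmoid
dz_dw = x
grad_chain = dL_da * da_dz * dz_dw

# Independent check: central finite difference on the loss
eps = 1e-6
grad_numeric = (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2 * eps)

print(f"chain rule gradient: {grad_chain:.8f}")
print(f"numeric gradient:    {grad_numeric:.8f}")
```

The two values should agree to several decimal places. The same kind of check is a common way to catch mistakes in a hand-written backward pass.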

## Q5. Implement the forward propagation process for a simple neural network with one hidden layer using NumPy.

```python
import numpy as np

# Define the activation function (Sigmoid in this case)
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Define the input, weights, and biases for a simple network
# Assume input layer has 3 neurons, hidden layer has 4 neurons, and output layer has 1 neuron

# Input (batch size = 2, features = 3)
X = np.array([[0.1, 0.2, 0.3],
              [0.4, 0.5, 0.6]])

# Weights for the hidden layer (3 input neurons, 4 hidden neurons)
W_hidden = np.random.randn(3, 4)

# Biases for the hidden layer (4 neurons in the hidden layer)
b_hidden = np.random.randn(4)

# Weights for the output layer (4 hidden neurons, 1 output neuron)
W_output = np.random.randn(4, 1)

# Bias for the output layer (1 output neuron)
b_output = np.random.randn(1)

# Forward Propagation
# Step 1: Calculate the weighted sum for the hidden layer
Z_hidden = np.dot(X, W_hidden) + b_hidden

# Step 2: Apply activation function to the hidden layer (using Sigmoid)
A_hidden = sigmoid(Z_hidden)

# Step 3: Calculate the weighted sum for the output layer
Z_output = np.dot(A_hidden, W_output) + b_output

# Step 4: Apply activation function to the output (Sigmoid for a binary classification task)
A_output = sigmoid(Z_output)

# Print the output
print("Output of the network:")
print(A_output)
```

```text
Output of the network:
[[0.11254923]
 [0.10850343]]
```
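Because the weights and biases above are drawn at random without a fixed seed, the printed values change from run to run. One possible variation, sketched below, wraps the same forward pass in a function and draws the parameters from a seeded generator so that repeated runs give identical results. The helper names `init_params` and `forward` and the seed value are assumptions introduced for this example.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def init_params(seed):
    """Draw parameters for a 3 -> 4 -> 1 network from a seeded generator (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    W_hidden = rng.standard_normal((3, 4))
    b_hidden = rng.standard_normal(4)
    W_output = rng.standard_normal((4, 1))
    b_output = rng.standard_normal(1)
    return W_hidden, b_hidden, W_output, b_output

def forward(X, W_hidden, b_hidden, W_output, b_output):
    """Forward pass: hidden layer with sigmoid, then output layer with sigmoid."""
    A_hidden = sigmoid(X @ W_hidden + b_hidden)
    return sigmoid(A_hidden @ W_output + b_output)

# Same input as above (batch size = 2, features = 3)
X = np.array([[0.1, 0.2, 0.3],
              [0.4, 0.5, 0.6]])

out_first = forward(X, *init_params(seed=0))
out_second = forward(X, *init_params(seed=0))

print("Output shape:", out_first.shape)
print("Identical across runs with the same seed:", np.array_equal(out_first, out_second))
```

The output has one row per input sample and one column per output neuron, and every value lies strictly between 0 and 1 because the final activation is a sigmoid.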
