In [None]:
'''ASSIGNMENT 3
TITLE: BACK PROPAGATION FEED-FORWARD NEURAL NETWORK
PROBLEM STATEMENT: Write a Python program for creating a Back Propagation Feed-forward
neural network.'''

import numpy as np

# Sigmoid and its derivative
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

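def sigmoid_deriv(a):
    # Expects the sigmoid output a = sigmoid(z), not the pre-activation z:
    # d(sigmoid)/dz = a * (1 - a)
    return a * (1 - a)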

# Input dataset (XOR problem)
X = np.array([[0, 0],
              [0, 1],
              [1, 0],
              [1, 1]])

# Output labels
y = np.array([[0],
              [1],
              [1],
              [0]])

# Set seed for reproducibility
np.random.seed(1)

# Define network architecture
input_neurons = 2
hidden_neurons = 4
output_neurons = 1

# Initialize weights and biases
W1 = np.random.uniform(size=(input_neurons, hidden_neurons))
b1 = np.zeros((1, hidden_neurons))

W2 = np.random.uniform(size=(hidden_neurons, output_neurons))
b2 = np.zeros((1, output_neurons))

# Learning rate
lr = 0.5

# Training loop
for epoch in range(10000):
    # ---- Forward propagation ----
    Z1 = np.dot(X, W1) + b1
    A1 = sigmoid(Z1)

    Z2 = np.dot(A1, W2) + b2
    A2 = sigmoid(Z2)

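    # ---- Backpropagation ----
    # error = y - A2 is the negative gradient of 0.5 * sum((y - A2)**2) w.r.t. A2,
    # so the d* terms below are descent directions and are added in the update step.
    error = y - A2
    dZ2 = error * sigmoid_deriv(A2)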

    dW2 = np.dot(A1.T, dZ2)
    db2 = np.sum(dZ2, axis=0, keepdims=True)

    dA1 = np.dot(dZ2, W2.T)
    dZ1 = dA1 * sigmoid_deriv(A1)

    dW1 = np.dot(X.T, dZ1)
    db1 = np.sum(dZ1, axis=0, keepdims=True)

    # ---- Update weights and biases ----
    W2 += lr * dW2
    b2 += lr * db2
    W1 += lr * dW1
    b1 += lr * db1

    # Report the mean squared error (from this epoch's forward pass) every 1000 epochs
    if epoch % 1000 == 0:
        loss = np.mean(np.square(error))
        print(f"Epoch {epoch} | Loss: {loss:.4f}")

# ---- Final predictions ----
print("\nPredictions after training:")
output = sigmoid(np.dot(sigmoid(np.dot(X, W1) + b1), W2) + b2)
print(np.round(output))

r'''Answers to the assignment questions on **Back Propagation Feed-forward Neural Networks (BP-FFNN)**:

---

### **1) What is the purpose of implementing a Back Propagation Feed-forward Neural Network (BP-FFNN)?**

The purpose of implementing a **BP-FFNN** is to build a model that learns a mapping from inputs to outputs from labeled examples (**supervised learning**). It is widely used for classification, regression, pattern recognition, and function approximation.
Backpropagation computes the gradient of the error with respect to every weight, and gradient descent uses those gradients to adjust the weights and **minimize the error**.

---

### **2) Explain the architecture of a BP-FFNN and the role of each layer (input, hidden, output).**

A **BP-FFNN** consists of three types of layers:

* **Input Layer:**

  * Receives raw data features.
  * Each node represents a feature of the input.
  * No computation, just passes data to the next layer.

* **Hidden Layer(s):**

  * Performs computations using weighted inputs and activation functions.
  * Extracts patterns and representations from the data.
  * More layers and neurons increase learning capacity but also training cost and the risk of overfitting.

* **Output Layer:**

  * Produces the final result or prediction (e.g., class label, numeric value).
  * Activation function depends on the task (e.g., sigmoid for binary classification, softmax for multi-class classification, linear for regression).
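
As a rough illustration of how data flows through the three layer types, the sketch below runs one forward pass for a 2-4-1 network like the one in the program above and prints the array shape at each stage. The helper name `forward` and the random weights are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def forward(X, W1, b1, W2, b2):
    # Hypothetical helper: one forward pass through a single hidden layer.
    A1 = sigmoid(X @ W1 + b1)   # hidden layer: weighted sum + activation
    A2 = sigmoid(A1 @ W2 + b2)  # output layer: weighted sum + activation
    return A1, A2

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # input layer: 4 samples, 2 features
W1, b1 = rng.uniform(size=(2, 4)), np.zeros((1, 4))
W2, b2 = rng.uniform(size=(4, 1)), np.zeros((1, 1))

A1, A2 = forward(X, W1, b1, W2, b2)
print(X.shape, A1.shape, A2.shape)  # (4, 2) (4, 4) (4, 1)
```

The input layer does no computation; it is just the array `X`. Each later layer turns a `(samples, n_in)` array into a `(samples, n_out)` array.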

---

### **3) Describe the activation functions commonly used in the hidden and output layers of a BP-FFNN, and explain their significance.**

**Common activation functions:**

* **Hidden Layers:**

  * **ReLU (Rectified Linear Unit):** Cheap to compute and less prone to vanishing gradients than sigmoid or tanh, since its gradient is 1 for positive inputs.

    $$
    f(x) = \max(0, x)
    $$
  * **Sigmoid:** Squashes values between 0 and 1; useful for probabilistic interpretation.

    $$
    f(x) = \frac{1}{1 + e^{-x}}
    $$
  * **Tanh:** Outputs between -1 and 1; centered around zero.

    $$
    f(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}
    $$

* **Output Layer:**

  * **Sigmoid:** For binary classification.
  * **Softmax:** For multi-class classification.
  * **Linear:** For regression problems.
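
The functions listed above can be sketched in NumPy as follows. This is a minimal version for illustration; the function names are chosen here, and the max-subtraction in `softmax` is a common numerical-stability trick rather than part of its definition.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def softmax(x):
    # Subtracting the row-wise max avoids overflow and leaves the result unchanged.
    shifted = x - np.max(x, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

z = np.array([-2.0, 0.0, 2.0])
print(relu(z))           # [0. 0. 2.]
print(sigmoid(0.0))      # 0.5
print(tanh(z))           # values in (-1, 1), symmetric around 0
print(softmax(z).sum())  # sums to 1 up to floating-point rounding
```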

**Significance:** These functions introduce **non-linearity** into the model, allowing the network to learn complex relationships.
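
To make the non-linearity point concrete, the sketch below (with arbitrary random weights, assumed only for illustration) shows that two stacked layers without an activation collapse into a single linear map, while placing a sigmoid between them breaks that equivalence.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
W1 = rng.normal(size=(3, 4))
W2 = rng.normal(size=(4, 2))

# Two layers with no activation in between...
two_linear_layers = (X @ W1) @ W2
# ...are equivalent to one layer whose weight matrix is W1 @ W2.
one_linear_layer = X @ (W1 @ W2)
print(np.allclose(two_linear_layers, one_linear_layer))  # True

# With a sigmoid between the layers, the output is no longer that linear map.
with_activation = sigmoid(X @ W1) @ W2
print(np.allclose(with_activation, one_linear_layer))
```

Without an activation function, adding hidden layers therefore adds no expressive power, which is why a purely linear network cannot solve XOR.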

---

### **4) How does the backpropagation algorithm work in the context of training a BP-FFNN? Explain each step of the process.**

**Backpropagation Training Steps:**

1. **Forward Propagation:**

   * Compute output using current weights and activation functions.

2. **Loss Calculation:**

   * Use a loss function (e.g., MSE or cross-entropy) to measure the error between predicted and actual outputs.

3. **Backward Propagation:**

   * Calculate the **gradient of the loss function** with respect to each weight using the **chain rule** of calculus.

4. **Weight Updates:**

   * Adjust weights using gradient descent:

     $$
     w := w - \eta \cdot \frac{\partial L}{\partial w}
     $$

     where $\eta$ is the learning rate.

5. **Repeat:**

   * Iterate over multiple epochs until convergence.
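
The steps above can be sketched for a 2-4-1 sigmoid network as follows. This is an illustrative version under stated assumptions: the loss is taken to be half the summed squared error, the helper names `loss_fn` and `backprop` are invented here, and one analytic gradient entry is compared with a finite-difference estimate as a sanity check. The program at the top of this cell performs the same update with the sign folded into `error = y - A2`.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def loss_fn(params, X, y):
    # Step 2: loss L = 0.5 * sum((A2 - y)^2)
    W1, b1, W2, b2 = params
    A2 = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
    return 0.5 * np.sum((A2 - y) ** 2)

def backprop(params, X, y):
    W1, b1, W2, b2 = params
    # Step 1: forward propagation
    A1 = sigmoid(X @ W1 + b1)
    A2 = sigmoid(A1 @ W2 + b2)
    # Step 3: backward propagation via the chain rule
    dZ2 = (A2 - y) * A2 * (1 - A2)      # dL/dZ2
    dW2 = A1.T @ dZ2
    db2 = dZ2.sum(axis=0, keepdims=True)
    dZ1 = (dZ2 @ W2.T) * A1 * (1 - A1)  # dL/dZ1
    dW1 = X.T @ dZ1
    db1 = dZ1.sum(axis=0, keepdims=True)
    return [dW1, db1, dW2, db2]

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
params = [rng.uniform(size=(2, 4)), np.zeros((1, 4)),
          rng.uniform(size=(4, 1)), np.zeros((1, 1))]

# Sanity check: compare dL/dW1[0, 0] with a central finite difference.
grads = backprop(params, X, y)
eps = 1e-6
W1 = params[0]
W1[0, 0] += eps
loss_plus = loss_fn(params, X, y)
W1[0, 0] -= 2 * eps
loss_minus = loss_fn(params, X, y)
W1[0, 0] += eps  # restore the original weight
numeric = (loss_plus - loss_minus) / (2 * eps)
print(np.isclose(numeric, grads[0][0, 0], atol=1e-6))

# Step 4: gradient-descent update, w := w - eta * dL/dw
eta = 0.5
loss_before = loss_fn(params, X, y)
params = [p - eta * g for p, g in zip(params, grads)]
loss_after = loss_fn(params, X, y)
print(loss_before, loss_after)
```

Step 5 would wrap the gradient computation and update in a loop over epochs.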

---

### **5) Discuss the importance of hyperparameters such as learning rate, number of hidden layers, and number of neurons in each layer when training a BP-FFNN.**

* **Learning Rate (η):**

  * Controls the step size during weight updates.
  * Too small → slow convergence.
  * Too large → may overshoot minima or diverge.

* **Number of Hidden Layers:**

  * More layers let the network learn more abstract, hierarchical features but increase training time and the risk of overfitting.
  * Shallow networks may underfit complex data.

* **Number of Neurons per Layer:**

  * Affects the model's capacity to capture features.
  * Too few → underfitting; too many → overfitting and more computation.

**Balancing these hyperparameters** is crucial for building an effective and generalizable model.
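
One way to explore these effects is to rerun the XOR training from the program above with different settings and compare the final error. The sketch below assumes a hypothetical helper `train_xor` that mirrors that training loop; the specific learning rates and hidden-layer sizes are arbitrary choices, and the results depend on the random seed.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def train_xor(lr, hidden_neurons, epochs=10000, seed=1):
    # Hypothetical helper: trains a 2-hidden-1 sigmoid network on XOR
    # and returns the final mean squared error.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([[0], [1], [1], [0]])
    np.random.seed(seed)
    W1 = np.random.uniform(size=(2, hidden_neurons))
    b1 = np.zeros((1, hidden_neurons))
    W2 = np.random.uniform(size=(hidden_neurons, 1))
    b2 = np.zeros((1, 1))
    for _ in range(epochs):
        # Forward pass
        A1 = sigmoid(X @ W1 + b1)
        A2 = sigmoid(A1 @ W2 + b2)
        # Backward pass (computed before any weight is changed)
        dZ2 = (y - A2) * A2 * (1 - A2)
        dZ1 = (dZ2 @ W2.T) * A1 * (1 - A1)
        # Update
        W2 += lr * (A1.T @ dZ2)
        b2 += lr * dZ2.sum(axis=0, keepdims=True)
        W1 += lr * (X.T @ dZ1)
        b1 += lr * dZ1.sum(axis=0, keepdims=True)
    A2 = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
    return float(np.mean((y - A2) ** 2))

for lr in (0.01, 0.5):
    for hidden in (2, 4):
        mse = train_xor(lr, hidden)
        print(f"lr={lr}, hidden={hidden}: final MSE = {mse:.4f}")
```

Comparing the printed errors across rows gives a quick feel for how step size and layer width interact on this small problem; on real datasets the comparison should use held-out validation data rather than training error.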

'''

Epoch 0 | Loss: 0.3181
Epoch 1000 | Loss: 0.2125
Epoch 2000 | Loss: 0.0059
Epoch 3000 | Loss: 0.0020
Epoch 4000 | Loss: 0.0012
Epoch 5000 | Loss: 0.0008
Epoch 6000 | Loss: 0.0006
Epoch 7000 | Loss: 0.0005
Epoch 8000 | Loss: 0.0004
Epoch 9000 | Loss: 0.0004

Predictions after training:
[[0.]
 [1.]
 [1.]
 [0.]]
