# Analytical Quiz: Backpropagation & Gradient Descent

### Questions

1. In your own words, explain the goal of gradient descent. Why is it used in training neural networks?

2. A model’s loss decreases very slowly during training. What could be the causes and how would you troubleshoot this?

3. What is the role of the learning rate in gradient descent? Describe what might happen if it’s too small or too large.

4. How does backpropagation use the chain rule from calculus? Why is this important?

5. Suppose you are training a neural network and you notice the gradients are either vanishing or exploding. What are some ways to address this?


# Assignment: Simulating Gradient Descent and Backpropagation

### Task 1: Visualize Gradient Descent

- Plot the function `f(x) = x² + 4` and simulate one-variable gradient descent.
- Use a starting point like `x = 10`.
- At each step, update `x = x - learning_rate * gradient`.
- Visualize how the function value decreases over iterations.


In [None]:
# Your code here
import numpy as np
import matplotlib.pyplot as plt

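x = 10.0  # starting point
lr = 0.1  # learning rate
history = [(x, x**2 + 4)]  # record the starting point as well

for _ in range(30):
    grad = 2 * x  # derivative of f(x) = x² + 4
    x = x - lr * grad
    history.append((x, x**2 + 4))

xs, ys = zip(*history)

# Plot the function itself and overlay the descent path
grid = np.linspace(-10.5, 10.5, 200)
plt.plot(grid, grid**2 + 4, label="f(x) = x² + 4")
plt.plot(xs, ys, marker='o', label="gradient descent steps")
plt.title("Gradient Descent on f(x) = x² + 4")
plt.xlabel("x")
plt.ylabel("f(x)")
plt.legend()
plt.grid(True)
plt.show()

# Plot the function value against the iteration number
plt.plot(ys, marker='o')
plt.title("f(x) per iteration")
plt.xlabel("iteration")
plt.ylabel("f(x)")
plt.grid(True)
plt.show()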
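
To see how the learning rate changes the behaviour of the update rule from Task 1, the following sketch reruns the same descent on `f(x) = x² + 4` with three different rates. The helper name `run_gradient_descent` and the specific rates (0.01, 0.1, 1.1) are illustrative assumptions, not part of the assignment.

```python
def run_gradient_descent(lr, x0=10.0, steps=30):
    """Return the f(x) values visited by gradient descent on f(x) = x**2 + 4."""
    x = x0
    values = [x**2 + 4]
    for _ in range(steps):
        grad = 2 * x          # derivative of x**2 + 4
        x = x - lr * grad     # same update rule as in Task 1
        values.append(x**2 + 4)
    return values


# Hypothetical learning rates, chosen only to show three regimes
results = {lr: run_gradient_descent(lr) for lr in (0.01, 0.1, 1.1)}
for lr, values in results.items():
    print(f"lr={lr}: f(x) after 30 steps = {values[-1]:.4g}")
```

For this particular function each step multiplies `x` by `(1 - 2 * lr)`, so a small rate shrinks `x` slowly, while any rate above 1.0 makes `|x|` grow at every step.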

### Task 2: Manual Backpropagation (2-layer NN)

- Implement a simple neural network with 1 hidden layer (no libraries).
- Input: `x = 1.0`, `weight1 = 0.5`, `weight2 = -1.0`, learning rate `0.1`.
- Target output: `0.5`.
- Use sigmoid as the activation and MSE as the loss function.
- Manually compute the forward pass, the gradients, and one update step.
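
As a guide for the manual computation, the chain rule for this network can be written out as below. The symbols $z_1$, $z_2$, $h$, and $\hat{y}$ are notation introduced here for illustration, and the factor $\tfrac{1}{2}$ in the loss is a common convention (assumed here) that cancels the 2 from differentiating the square.

$$
\begin{aligned}
z_1 &= w_1 x, & h &= \sigma(z_1), \\
z_2 &= w_2 h, & \hat{y} &= \sigma(z_2), \\
L &= \tfrac{1}{2}\,(y - \hat{y})^2, & \sigma'(z) &= \sigma(z)\,\bigl(1 - \sigma(z)\bigr), \\
\frac{\partial L}{\partial w_2} &= (\hat{y} - y)\,\sigma'(z_2)\,h, \\
\frac{\partial L}{\partial w_1} &= (\hat{y} - y)\,\sigma'(z_2)\,w_2\,\sigma'(z_1)\,x.
\end{aligned}
$$

Both gradients share the factor $(\hat{y} - y)\,\sigma'(z_2)$, which is why backpropagation computes it once at the output and reuses it for the earlier layer.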


In [None]:
# Your code here
# Define sigmoid and its derivative
import numpy as np  # imported here so this cell runs on its own


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1 - s)

# Forward pass
x = 1.0
w1 = 0.5
w2 = -1.0
y_true = 0.5

h = sigmoid(x * w1)
y_pred = sigmoid(h * w2)

# Loss: squared error with a 1/2 factor, so that dL/dy_pred = y_pred - y_true
loss = 0.5 * (y_true - y_pred)**2
print("Loss before update:", loss)

# Backward pass
dL_dy = y_pred - y_true
dy_dz2 = sigmoid_prime(h * w2)
dz2_dw2 = h
dz2_dh = w2

dh_dz1 = sigmoid_prime(x * w1)
dz1_dw1 = x

# Gradients
dL_dw2 = dL_dy * dy_dz2 * dz2_dw2
dL_dw1 = dL_dy * dy_dz2 * dz2_dh * dh_dz1 * dz1_dw1

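# Update (learning rate = 0.1, as specified in the task)
lr = 0.1
w1 -= lr * dL_dw1
w2 -= lr * dL_dw2

print("Updated weights:", w1, w2)

# Re-run the forward pass to check how the loss changed
h = sigmoid(x * w1)
y_pred = sigmoid(h * w2)
print("Loss after update:", 0.5 * (y_true - y_pred)**2)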
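
One way to sanity-check hand-derived gradients like the ones above is a finite-difference gradient check. The sketch below is a minimal version under stated assumptions: the function names (`loss_fn`, `analytic_grads`, `numeric_grads`) and the step size `eps = 1e-6` are hypothetical choices, and it uses only the standard library so it runs on its own.

```python
import math


def sigmoid(z):
    return 1 / (1 + math.exp(-z))


def loss_fn(w1, w2, x=1.0, y_true=0.5):
    """Forward pass and 1/2 squared-error loss for the Task 2 network."""
    h = sigmoid(x * w1)
    y_pred = sigmoid(h * w2)
    return 0.5 * (y_true - y_pred) ** 2


def analytic_grads(w1, w2, x=1.0, y_true=0.5):
    """Gradients from the chain rule, mirroring the manual backward pass."""
    h = sigmoid(x * w1)
    y_pred = sigmoid(h * w2)
    delta2 = (y_pred - y_true) * y_pred * (1 - y_pred)  # dL/dz2
    dL_dw2 = delta2 * h
    dL_dw1 = delta2 * w2 * h * (1 - h) * x
    return dL_dw1, dL_dw2


def numeric_grads(w1, w2, eps=1e-6):
    """Central finite differences; eps is an assumed step size."""
    d1 = (loss_fn(w1 + eps, w2) - loss_fn(w1 - eps, w2)) / (2 * eps)
    d2 = (loss_fn(w1, w2 + eps) - loss_fn(w1, w2 - eps)) / (2 * eps)
    return d1, d2


analytic = analytic_grads(0.5, -1.0)
numeric = numeric_grads(0.5, -1.0)
print("analytic:", analytic)
print("numeric: ", numeric)
```

If the two pairs agree to several decimal places, the manual derivation is very likely correct; a large mismatch usually points to a missing factor in the chain.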