# 📌 Exercise 1.2: Implement Gradient Descent for a Single Neuron

💡 Goal: Adjust `w` and `b` to minimize the prediction error, measured with the Mean Squared Error (MSE) loss.

## 🔹 Steps:

- Define a loss function:

    $$\text{Loss} = \frac{1}{n} \sum_{i=1}^{n} \left( y_{\text{pred},i} - y_{\text{true},i} \right)^2$$

- Compute the gradients of the loss with respect to `w` and `b`
- Update `w` and `b` using **Gradient Descent**
- Train the neuron on some **sample data**
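
The gradients in the second step follow from the chain rule. Below is a sketch of the derivation for a single sample, using the sigmoid activation that the code in this exercise applies. The symbols $\sigma$ (sigmoid), $z$ (pre-activation), and $\eta$ (learning rate) are notation introduced here for illustration.

$$
\begin{aligned}
z &= w x + b, \qquad y_{\text{pred}} = \sigma(z) = \frac{1}{1 + e^{-z}}, \qquad L = (y_{\text{pred}} - y_{\text{true}})^2 \\
\frac{\partial L}{\partial w} &= \frac{\partial L}{\partial y_{\text{pred}}} \cdot \frac{\partial y_{\text{pred}}}{\partial z} \cdot \frac{\partial z}{\partial w} = 2\,(y_{\text{pred}} - y_{\text{true}})\,\sigma(z)\,(1 - \sigma(z))\,x \\
\frac{\partial L}{\partial b} &= \frac{\partial L}{\partial y_{\text{pred}}} \cdot \frac{\partial y_{\text{pred}}}{\partial z} \cdot \frac{\partial z}{\partial b} = 2\,(y_{\text{pred}} - y_{\text{true}})\,\sigma(z)\,(1 - \sigma(z)) \\
w &\leftarrow w - \eta\,\frac{\partial L}{\partial w}, \qquad b \leftarrow b - \eta\,\frac{\partial L}{\partial b}
\end{aligned}
$$

Averaging these per-sample gradients over all $n$ samples gives the gradient of the MSE loss defined in the first step.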

In [1]:
import numpy as np

# Step 1: Generate some sample training data
X_train = np.array([-2, -1, 0, 1, 2])  # Inputs (features)
y_train = np.array([0, 0, 1, 1, 1])    # Corresponding labels (desired outputs)

# Step 2: Initialize parameters (weights and bias)
w = np.random.randn()  # Randomly initialize the weight
b = np.random.randn()  # Randomly initialize the bias
learning_rate = 0.1    # Learning rate for gradient descent
epochs = 1000          # Number of training iterations

# Step 3: Define the Sigmoid activation function
def sigmoid(x):
    """
    Computes the sigmoid activation function.

    Parameters:
    x (float): The input value.

    Returns:
    float: The sigmoid output (a value between 0 and 1).
    """
    return 1 / (1 + np.exp(-x))

# Step 4: Compute the derivative of the Sigmoid function
def sigmoid_derivative(x):
    """
    Computes the derivative of the sigmoid function.

    Parameters:
    x (float): The input value.

    Returns:
    float: The derivative of the sigmoid function.
    """
    s = sigmoid(x)  # Evaluate the sigmoid once and reuse it
    return s * (1 - s)

# Step 5: Training loop using gradient descent
for epoch in range(epochs):
    total_loss = 0  # Sum of per-sample squared errors for this epoch

    for x, y_true in zip(X_train, y_train):  # Iterate over (input, true label) pairs

        # Step 6: Forward pass (calculate neuron output)
        z = w * x + b       # Linear combination of inputs
        y_pred = sigmoid(z) # Apply sigmoid activation

        # Step 7: Compute the squared error for this sample
        # (averaging these over all samples gives the MSE)
        loss = (y_pred - y_true) ** 2  # Per-sample squared error
        total_loss += loss  # Accumulate the squared errors for this epoch

        # Step 8: Backpropagation - Compute gradients
        dL_dy = 2 * (y_pred - y_true)  # Derivative of loss w.r.t y_pred
        dy_dz = sigmoid_derivative(z)  # Derivative of y_pred w.r.t z
        dz_dw = x  # Derivative of z w.r.t w
        dz_db = 1  # Derivative of z w.r.t b

        # Compute gradients for weight and bias
        dL_dw = dL_dy * dy_dz * dz_dw  # Gradient of loss w.r.t w
        dL_db = dL_dy * dy_dz * dz_db  # Gradient of loss w.r.t b

        # Step 9: Update parameters using gradient descent
        w -= learning_rate * dL_dw  # Update weight
        b -= learning_rate * dL_db  # Update bias

    # Step 10: Print the summed squared error every 100 epochs
    # (divide by len(X_train) to report the mean instead)
    if epoch % 100 == 0:
        print(f"Epoch {epoch}, Loss: {total_loss:.4f}")

# Step 11: Print final trained parameters
print("Final weight:", w)
print("Final bias:", b)

# Step 12: Test the trained neuron with the same input values
output = [sigmoid(w * x + b) for x in X_train]
print("Final Neuron Outputs:", output)


Epoch 0, Loss: 1.4978
Epoch 100, Loss: 0.1393
Epoch 200, Loss: 0.0817
Epoch 300, Loss: 0.0566
Epoch 400, Loss: 0.0428
Epoch 500, Loss: 0.0341
Epoch 600, Loss: 0.0282
Epoch 700, Loss: 0.0240
Epoch 800, Loss: 0.0208
Epoch 900, Loss: 0.0183
Final weight: 4.643631924726756
Final bias: 2.203567813349657
Final Neuron Outputs: [0.0008379648928901648, 0.0801681843926923, 0.9005694444835305, 0.9989387009482799, 0.9999897767077403]
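
The loop above updates `w` and `b` after every sample and prints the summed squared error, not its mean. As a point of comparison, here is a minimal sketch of full-batch gradient descent on the mean squared error itself. The function name `train_batch`, the zero initialization, and the `history` list are assumptions made for illustration and are not part of the original exercise.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def train_batch(X, y, w=0.0, b=0.0, learning_rate=0.1, epochs=1000):
    """Hypothetical helper: full-batch gradient descent on the MSE loss."""
    history = []  # MSE measured before each update
    for _ in range(epochs):
        y_pred = sigmoid(w * X + b)            # Forward pass on all samples
        error = y_pred - y
        history.append(np.mean(error ** 2))    # Mean squared error

        # Chain rule: dL/dz = 2 * error * sigmoid'(z), with sigmoid'(z) = y_pred * (1 - y_pred)
        grad_z = 2 * error * y_pred * (1 - y_pred)
        w -= learning_rate * np.mean(grad_z * X)  # Gradient averaged over the batch
        b -= learning_rate * np.mean(grad_z)
    return w, b, history

X_train = np.array([-2, -1, 0, 1, 2], dtype=float)
y_train = np.array([0, 0, 1, 1, 1], dtype=float)

# Assumption: start from w = 0, b = 0 so the run is reproducible
w_batch, b_batch, mse_history = train_batch(X_train, y_train)
print(f"Initial MSE: {mse_history[0]:.4f}, final MSE: {mse_history[-1]:.4f}")
print("Batch-trained parameters:", w_batch, b_batch)
```

With the zero initialization every first prediction is 0.5, so the first recorded MSE is 0.25. Because this variant makes one averaged update per epoch instead of five per-sample updates, it can be expected to need more epochs than the loop above to reach a comparable loss.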


## 📌 Analysis of the Output

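✅ Loss Decreasing: The printed loss (the sum of squared errors over the five samples) fell from **1.4978** at epoch 0 to **0.0183** at epoch 900, showing that gradient descent steadily reduced the error. Exact values vary between runs because `w` and `b` are initialized randomly without a fixed seed.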

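✅ **Final Weight** (`w`) and **Bias** (`b`):

- `w ≈ 4.64`: A large positive weight makes the sigmoid output rise steeply as the input increases.
- `b ≈ 2.20`: A positive bias shifts the output towards **1**. Together with `w`, it places the 0.5 crossover at `x = -b/w ≈ -0.47`, between the inputs -1 and 0.

✅ **Final Outputs**:

- **Inputs -2 and -1 → outputs close to 0** (about 0.001 and 0.080) ✅
- **Inputs 0, 1, and 2 → outputs close to 1** (about 0.901, 0.999, and 1.000) ✅
- With a 0.5 threshold, the neuron classifies all five training inputs correctly.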
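
To check the classification claim numerically, the sketch below plugs the final parameters printed above into the neuron and thresholds the outputs. The 0.5 threshold and the names `w_final`, `b_final`, `labels`, and `boundary` are assumptions introduced here for illustration, and the parameter values come from this particular run only.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, as defined in the exercise."""
    return 1.0 / (1.0 + np.exp(-z))

# Final parameters copied from the printed output of the run above;
# a different random initialization will give different values.
w_final = 4.643631924726756
b_final = 2.203567813349657

X_train = np.array([-2, -1, 0, 1, 2])
y_train = np.array([0, 0, 1, 1, 1])

probs = sigmoid(w_final * X_train + b_final)  # Neuron outputs in (0, 1)
labels = (probs >= 0.5).astype(int)           # Assumed 0.5 decision threshold
boundary = -b_final / w_final                 # Input where the output equals 0.5
accuracy = np.mean(labels == y_train)
mse = np.mean((probs - y_train) ** 2)

print("Predicted labels:", labels)
print(f"Decision boundary: x = {boundary:.3f}")
print(f"Accuracy: {accuracy:.2f}, MSE: {mse:.4f}")
```

The boundary lands between -1 and 0, which is the only gap in the training inputs where the labels switch from 0 to 1.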