# Deep Learning Fundamentals - Solutions

This notebook covers the fundamental building blocks of Deep Learning as discussed in the lecture.
You will implement perceptrons, activation functions, a small multi-layer perceptron, and one step of backpropagation.

## Topics Covered:
1. **Perceptrons & Logic Gates** (AND, OR, NOT)
2. **Activation Functions** (Sigmoid, ReLU, Tanh)
3. **Multi-Layer Perceptrons (MLP)** (Solving XOR)
4. **Forward & Backward Propagation** (Step-by-step implementation)

In [None]:
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline

## Part 1: The Perceptron as a Logic Gate

A single Perceptron can act as a basic Logic Gate. 

Recall the formula for a perceptron:
$$ Output = \begin{cases} 1 & \text{if } w \cdot x + b > 0 \\ 0 & \text{otherwise} \end{cases} $$
*(Note: The exercises below use the equivalent threshold form $w \cdot x > t$, where the threshold is folded into the bias as $b = -t$.)*
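
The bias form and the threshold form of the perceptron describe the same test: $w \cdot x + b > 0$ is $w \cdot x > t$ with $b = -t$. The snippet below is a minimal sketch of that equivalence. The helper names `fires_with_threshold` and `fires_with_bias` and the demo values are assumptions made for illustration, not part of the lecture code.

```python
import numpy as np

def fires_with_threshold(x, w, t):
    # Threshold form: output 1 when w . x > t
    return int(np.dot(w, x) > t)

def fires_with_bias(x, w, b):
    # Bias form: output 1 when w . x + b > 0
    return int(np.dot(w, x) + b > 0)

# Illustrative values (assumed for this sketch)
w_demo = [1.0, 1.0]
t_demo = 1.5

# Compare both forms on every binary input pair, using b = -t
agreement = [
    fires_with_threshold([a, c], w_demo, t_demo)
    == fires_with_bias([a, c], w_demo, -t_demo)
    for a in (0, 1)
    for c in (0, 1)
]
print(agreement)
```

Both forms agree on every input, so either convention can be used as long as the sign of the bias is flipped relative to the threshold.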

### Exercise 1.1: Implement Logic Gates
Using the weights and biases (or thresholds) derived in the lecture, implement the `AND`, `OR`, and `NOT` gates.

**Reference values from lecture:**
- **AND**: $w_1=1, w_2=1, t=1.5$
- **OR**: $w_1=1, w_2=1, t=0.5$
- **NOT**: $w_1=-1, t=-0.5$

In [None]:
def perceptron(x, w, t):
    """
    A simple binary threshold perceptron.
    Args:
        x: list or array of inputs
        w: list or array of weights
        t: threshold value
    Returns:
        0 or 1
    """
    # Weighted sum of the inputs, compared against the threshold
    weighted_sum = np.dot(w, x)
    return 1 if weighted_sum > t else 0

# --- AND GATE ---
def AND_gate(x1, x2):
    # Lecture values: w1=1, w2=1, t=1.5
    return perceptron([x1, x2], [1, 1], 1.5)

# --- OR GATE ---
def OR_gate(x1, x2):
    # Lecture values: w1=1, w2=1, t=0.5
    return perceptron([x1, x2], [1, 1], 0.5)

# --- NOT GATE ---
def NOT_gate(x1):
    # Lecture values: w1=-1, t=-0.5
    return perceptron([x1], [-1], -0.5)

# Testing
print("AND(0,0) =", AND_gate(0,0))
print("AND(0,1) =", AND_gate(0,1))
print("AND(1,1) =", AND_gate(1,1))
print("OR(0,1)  =", OR_gate(0,1))
print("NOT(0)   =", NOT_gate(0))
print("NOT(1)   =", NOT_gate(1))

## Part 2: Activation Functions

Neuron firing rates are rarely just 0 or 1. We use activation functions to introduce non-linearity and continuous output.

### Exercise 2.1: Implement Sigmoid, Tanh, and ReLU

Formulas:
- **Sigmoid**: $\sigma(x) = \frac{1}{1 + e^{-x}}$
- **Tanh**: $\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$
- **ReLU**: $f(x) = \max(0, x)$
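
The Tanh formula above is exact mathematically, but a direct translation can overflow in floating point once $|x|$ is large. The sketch below compares a naive implementation with NumPy's built-in `np.tanh`. The helper name `tanh_naive` and the test inputs are assumptions chosen for illustration.

```python
import numpy as np

def tanh_naive(x):
    # Direct translation of (e^x - e^-x) / (e^x + e^-x)
    e_pos, e_neg = np.exp(x), np.exp(-x)
    return (e_pos - e_neg) / (e_pos + e_neg)

# For moderate inputs the naive formula matches NumPy's tanh
moderate = np.linspace(-5.0, 5.0, 11)
matches_numpy = np.allclose(tanh_naive(moderate), np.tanh(moderate))

# exp(1000) overflows float64 to inf, so the naive ratio becomes inf / inf = nan.
# errstate only silences the resulting floating-point warnings.
with np.errstate(over="ignore", invalid="ignore"):
    naive_large = tanh_naive(np.float64(1000.0))
stable_large = np.tanh(1000.0)

print(matches_numpy, naive_large, stable_large)
```

For the plotting range used in this notebook the naive formula is fine; the built-in is the safer choice when inputs are unbounded.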

In [None]:
def sigmoid(x):
    # Sigmoid: 1 / (1 + e^-x), squashes inputs into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Tanh: (e^x - e^-x) / (e^x + e^-x), squashes inputs into (-1, 1)
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

def relu(x):
    # ReLU: element-wise max(0, x)
    return np.maximum(0, x)

# Visualization
x_range = np.linspace(-10, 10, 100)

# Plot the three activations side by side
plt.figure(figsize=(12, 4))
plt.subplot(1, 3, 1); plt.plot(x_range, sigmoid(x_range)); plt.title("Sigmoid")
plt.subplot(1, 3, 2); plt.plot(x_range, tanh(x_range)); plt.title("Tanh")
plt.subplot(1, 3, 3); plt.plot(x_range, relu(x_range)); plt.title("ReLU")
plt.show()

## Part 3: The MLP & The XOR Problem

A single perceptron cannot solve XOR because XOR is not linearly separable: no single straight line separates the inputs $(0,1)$ and $(1,0)$ from $(0,0)$ and $(1,1)$. We need a Multi-Layer Perceptron (MLP).
We can build an XOR gate by combining AND, OR, and NOT gates.

$$ XOR(x_1, x_2) = (x_1 \text{ OR } x_2) \text{ AND } (\text{NOT } (x_1 \text{ AND } x_2)) $$
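
The same decomposition can be written as a two-layer network in matrix form. The sketch below uses the OR and AND values from Part 1 for the hidden layer; the output neuron's weights (`w_out`, `t_out`) are an assumption made here, chosen so that it fires when OR is on and AND is off, which folds the NOT and the final AND into one unit. The names `step`, `W_hidden`, and `xor_mlp` are illustrative.

```python
import numpy as np

def step(z):
    # Binary threshold activation: 1 where z > 0, else 0
    return (z > 0).astype(int)

# Hidden layer: row 0 acts as OR (t=0.5), row 1 acts as AND (t=1.5)
W_hidden = np.array([[1.0, 1.0],
                     [1.0, 1.0]])
t_hidden = np.array([0.5, 1.5])

# Output neuron (assumed weights): fires when OR is 1 and AND is 0
w_out = np.array([1.0, -1.0])
t_out = 0.5

def xor_mlp(x1, x2):
    x = np.array([x1, x2], dtype=float)
    h = step(W_hidden @ x - t_hidden)      # hidden activations [OR, AND]
    return int(step(w_out @ h - t_out))    # output activation

# Evaluate the network on the full truth table
xor_table = {(a, c): xor_mlp(a, c) for a in (0, 1) for c in (0, 1)}
print(xor_table)
```

The hidden layer is what makes the problem solvable: in the (OR, AND) feature space the two classes can be separated by a single threshold.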

### Exercise 3.1: Implement XOR using your gates
Combine the functions you wrote in Part 1 to solve XOR.

In [None]:
def XOR_gate(x1, x2):
    # XOR(x1, x2) = (x1 OR x2) AND (NOT (x1 AND x2))
    either = OR_gate(x1, x2)
    not_both = NOT_gate(AND_gate(x1, x2))
    return AND_gate(either, not_both)

# Testing XOR
print("XOR(0,0) =", XOR_gate(0,0))
print("XOR(0,1) =", XOR_gate(0,1))
print("XOR(1,0) =", XOR_gate(1,0))
print("XOR(1,1) =", XOR_gate(1,1))

## Part 4: Training with Backpropagation (Single Step)

In a real network, we don't hardcode weights. We train them using **Gradient Descent** and **Backpropagation**.

Let's implement a single update step for a simple neuron with a Sigmoid activation:
$$ \hat{y} = \sigma(w \cdot x + b) $$
$$ Loss = \frac{1}{2}(y - \hat{y})^2 $$

### Exercise 4.1: Compute Gradients and Update Weights

1. **Forward Pass**: Compute output.
2. **Compute Error**: Difference between true $y$ and predicted $\hat{y}$.
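3. **Backward Pass**: Compute the gradient of the loss w.r.t. $w$ and $b$.
   $$ \frac{\partial L}{\partial w} = -(y - \hat{y}) \cdot \sigma'(z) \cdot x, \qquad \frac{\partial L}{\partial b} = -(y - \hat{y}) \cdot \sigma'(z) $$
   where $z = w \cdot x + b$ is the logit and $\sigma'(z) = \sigma(z)(1 - \sigma(z))$.
4. **Update**: $w_{new} = w_{old} - \eta \cdot \frac{\partial L}{\partial w}$, and likewise $b_{new} = b_{old} - \eta \cdot \frac{\partial L}{\partial b}$.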
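
A common sanity check for the backward pass is to compare the analytic gradient against a finite-difference estimate of the loss. The sketch below does this for the weight gradient, then confirms that one update step lowers the loss. The helper names, the `_demo` variables, and the step size `eps` are assumptions made for illustration; the input values mirror the ones used in the exercise cell.

```python
import numpy as np

def sigmoid_fn(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_fn(w, b, x, y):
    # Squared-error loss: 0.5 * (y - y_hat)^2
    y_hat = sigmoid_fn(np.dot(w, x) + b)
    return 0.5 * (y - y_hat) ** 2

def analytic_grad_w(w, b, x, y):
    # dL/dw = -(y - y_hat) * sigma'(z) * x, with sigma'(z) = y_hat * (1 - y_hat)
    y_hat = sigmoid_fn(np.dot(w, x) + b)
    return -(y - y_hat) * y_hat * (1.0 - y_hat) * x

def numeric_grad_w(w, b, x, y, eps=1e-6):
    # Central differences, one weight at a time
    grad = np.zeros_like(w)
    for i in range(len(w)):
        shift = np.zeros_like(w)
        shift[i] = eps
        grad[i] = (loss_fn(w + shift, b, x, y) - loss_fn(w - shift, b, x, y)) / (2 * eps)
    return grad

x_demo = np.array([0.5, -0.2])
y_demo = 1.0
w_demo = np.array([0.1, 0.5])
b_demo = 0.0
eta_demo = 0.1

grad_analytic = analytic_grad_w(w_demo, b_demo, x_demo, y_demo)
grad_numeric = numeric_grad_w(w_demo, b_demo, x_demo, y_demo)

# One gradient-descent step on the weights only
loss_before = loss_fn(w_demo, b_demo, x_demo, y_demo)
loss_after = loss_fn(w_demo - eta_demo * grad_analytic, b_demo, x_demo, y_demo)

print(grad_analytic, grad_numeric)
print(loss_before, loss_after)
```

If the two gradients disagree beyond rounding error, the usual culprits are a sign error on $(y - \hat{y})$ or applying $\sigma'$ to the wrong quantity.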

In [None]:
def sigmoid_derivative(output):
    # Derivative of sigmoid expressed via its output: f'(x) = f(x)(1-f(x))
    return output * (1 - output)

# 1. Initialize
x = np.array([0.5, -0.2])
y_true = 1.0
w = np.array([0.1, 0.5])
b = 0.0
learning_rate = 0.1

print(f"Initial Weights: {w}")

# 2. Forward Pass
logit = np.dot(w, x) + b
y_pred = 1.0 / (1.0 + np.exp(-logit))  # sigmoid, written inline so this cell runs on its own
print(f"Prediction: {y_pred}")

# 3. Calculate Error
error = y_true - y_pred
print(f"Error: {error}")

# 4. Backward Pass
# dL/dlogit = -(y - y_hat) * sigma'(logit); sigma' is computed from the sigmoid output
delta = -error * sigmoid_derivative(y_pred)
grad_w = delta * x
grad_b = delta

# 5. Update Weights
w_new = w - learning_rate * grad_w
b_new = b - learning_rate * grad_b

print(f"Updated Weights: {w_new}")
print(f"Updated Bias: {b_new}")