# Calculus & Gradients — FAANG-Level Lab

**Goal:** Implement and verify gradients like an ML engineer.

Key idea: *If you can’t gradient-check it, you don’t really trust it.*


In [None]:
import numpy as np

def check(name: str, cond: bool):
    if not cond:
        raise AssertionError(f'Failed: {name}')
    print(f'OK: {name}')

rng = np.random.default_rng(0)

## Section 1 — Finite Differences (Gradient Checking)

### Task 1.1: Implement numerical gradient for scalar f(w)

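**Hint:**
- Use the central difference for each coordinate i: `(f(w + eps*e_i) - f(w - eps*e_i)) / (2*eps)`, where `e_i` is the i-th standard basis vector (1 in position i, 0 elsewhere).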

**Explain:** Why is central difference more accurate than forward difference?
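
Before answering, it may help to measure both errors. The sketch below is an illustration only: it uses a hypothetical one-dimensional test function `g(x) = sin(x)` (not part of the lab) whose derivative is known, and records how each scheme's error shrinks as `eps` shrinks.

```python
import numpy as np

# Hypothetical scalar test function (an assumption for this demo):
# g(x) = sin(x), so the exact derivative is cos(x).
g = np.sin
x0 = 1.0
true_slope = np.cos(x0)

eps_values = (1e-2, 1e-3, 1e-4)
errs_forward, errs_central = [], []
for eps_demo in eps_values:
    forward = (g(x0 + eps_demo) - g(x0)) / eps_demo
    central = (g(x0 + eps_demo) - g(x0 - eps_demo)) / (2 * eps_demo)
    errs_forward.append(abs(forward - true_slope))
    errs_central.append(abs(central - true_slope))
    print(f"eps={eps_demo:.0e}  forward err={errs_forward[-1]:.2e}  central err={errs_central[-1]:.2e}")
```

Compare how much each error column drops when `eps` shrinks by a factor of 10, then relate that to the Taylor expansions of `g(x0 + eps)` and `g(x0 - eps)`.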

In [None]:
def numerical_grad(f, w, eps=1e-5):
    """Central-difference gradient of scalar-valued f at w; returns an array shaped like w."""
    # TODO
    ...

# sanity test: f(w)=sum(w^2) => grad=2w
w = rng.standard_normal(5)
f = lambda v: float(np.sum(v*v))
g_num = numerical_grad(f, w)
g_true = 2*w
check('grad_close', np.allclose(g_num, g_true, atol=1e-6))

## Section 2 — Chain Rule in Code

### Task 2.1: Gradient of MSE for linear model

- Model: `y_hat = Xw`
- Loss: `L(w) = (1/n) * sum_i (y_hat_i - y_i)^2`

**Hint:**
- Let r = Xw - y
- grad = (2/n) * X^T r
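
The hint's formula follows from one application of the chain rule. A short derivation, using only the symbols already defined above plus the dimensions n and d:

$$
L(w) = \frac{1}{n}\, r^\top r, \qquad r = Xw - y
$$

$$
\frac{\partial L}{\partial r} = \frac{2}{n}\, r, \qquad \frac{\partial r}{\partial w} = X \in \mathbb{R}^{n \times d}
$$

$$
\nabla_w L = \left(\frac{\partial r}{\partial w}\right)^{\top} \frac{\partial L}{\partial r} = \frac{2}{n}\, X^\top r
$$

The transpose is what makes the shapes agree: `X^T` is (d, n) and `r` is (n,), so the gradient is (d,), the same shape as `w`.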

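**FAANG gotcha:** shape mismatches. Keep `w` as `(d,)`, `X` as `(n, d)`, and `y` as `(n,)`. If `y` arrives as `(n, 1)`, then `Xw - y` silently broadcasts to `(n, n)` and the loss is wrong without any error being raised.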
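
A small sketch of how this failure mode stays hidden. All names ending in `_demo`, and `y_flat` / `y_col`, are hypothetical and exist only for this illustration. It uses its own random generator so the lab's `rng` stream is not disturbed.

```python
import numpy as np

rng_demo = np.random.default_rng(1)  # separate generator; the lab's rng is untouched
n_demo, d_demo = 6, 3
X_demo = rng_demo.standard_normal((n_demo, d_demo))
w_demo = rng_demo.standard_normal(d_demo)
y_flat = rng_demo.standard_normal(n_demo)   # shape (n,)
y_col = y_flat.reshape(-1, 1)               # shape (n, 1), e.g. a column kept 2-D

r_good = X_demo @ w_demo - y_flat   # (n,) - (n,)   -> (n,)
r_bad = X_demo @ w_demo - y_col     # (n,) - (n, 1) -> (n, n) via broadcasting

print(r_good.shape)  # (6,)
print(r_bad.shape)   # (6, 6)

# Both reduce to a scalar, so nothing crashes; only the value differs.
loss_good = np.mean(r_good ** 2)
loss_bad = np.mean(r_bad ** 2)
print(loss_good, loss_bad)
```

One defensive option is to assert `r.shape == (n,)` inside the loss function before reducing.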

In [None]:
def mse_loss_and_grad(X, y, w):
    """X: (n, d), y: (n,), w: (d,). Returns (loss, grad) with scalar loss and grad of shape (d,)."""
    # TODO: return (loss, grad)
    ...

n, d = 20, 4
X = rng.standard_normal((n, d))
w = rng.standard_normal(d)
y = rng.standard_normal(n)
loss, grad = mse_loss_and_grad(X, y, w)

# gradient check
f = lambda v: mse_loss_and_grad(X, y, v)[0]
g_num = numerical_grad(f, w)
check('mse_grad', np.allclose(grad, g_num, atol=1e-5))
print('loss', loss)

## Section 3 — Logistic Regression (Core Interview Gradient)

### Task 3.1: Binary cross-entropy gradient

Given labels `y` in {0, 1}:

- Probabilities: `p = sigmoid(Xw)`
- Loss: `L(w) = -(1/n) * sum_i (y_i log p_i + (1 - y_i) log(1 - p_i))`

**Hint:**
- `sigmoid(z) = 1 / (1 + exp(-z))`
- `grad = (1/n) * X^T (p - y)`
- For numerical stability, clip `p` away from exactly 0 and 1 before taking logs, so that `log(p)` and `log(1 - p)` stay finite.
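
To see why the clipping hint matters, the sketch below feeds one large logit through the plain sigmoid formula. The logit `z_big` and the threshold `clip_eps = 1e-12` are illustrative assumptions, not values prescribed by the lab.

```python
import numpy as np

# Hypothetical extreme logit. In float64, exp(-40) is far below machine
# epsilon, so 1 + exp(-40) rounds to 1 and p becomes exactly 1.0.
z_big = 40.0
p_raw = 1.0 / (1.0 + np.exp(-z_big))
print(p_raw == 1.0)  # True

with np.errstate(divide="ignore"):   # silence the divide-by-zero warning for the demo
    term_raw = np.log(1.0 - p_raw)   # log(0) -> -inf
print(term_raw)  # -inf

# Assumed clipping threshold, chosen only for illustration.
clip_eps = 1e-12
p_safe = np.clip(p_raw, clip_eps, 1.0 - clip_eps)
term_safe = np.log(1.0 - p_safe)
print(np.isfinite(term_safe))  # True
```

Clipping is the approach the hint suggests. An alternative that avoids clipping is to work with the logits `z = Xw` directly, since the per-example loss equals `log(1 + exp(z)) - y*z`, which `np.logaddexp(0, z) - y*z` evaluates stably.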


In [None]:
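def sigmoid(z):
    """Elementwise logistic function; returns an array shaped like z."""
    # TODO
    ...

def logreg_loss_and_grad(X, y, w):
    """X: (n, d), y: (n,) in {0, 1}, w: (d,). Returns (loss, grad) with grad of shape (d,)."""
    # TODO
    ...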

n, d = 50, 3
X = rng.standard_normal((n, d))
w = rng.standard_normal(d)
y = rng.integers(0, 2, size=n)
loss, grad = logreg_loss_and_grad(X, y, w)
f = lambda v: logreg_loss_and_grad(X, y, v)[0]
g_num = numerical_grad(f, w, eps=1e-5)
check('logreg_grad', np.allclose(grad, g_num, atol=1e-5))
print('loss', loss)

## Section 4 — Jacobian/Hessian Intuition

### Task 4.1: Compute Hessian of f(w)=sum(w^2) numerically

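**Hint:**
- The Hessian of `sum(w^2)` is `2I`.
- Differentiate the gradient: row i of the Hessian is the gradient of the scalar function `v -> numerical_grad(f, v)[i]`, so you can call `numerical_grad` on each component of the gradient in turn.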

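This is mainly about *shape thinking*: the gradient has shape (d,), so the Hessian, which differentiates each of those d components with respect to each of the d inputs, has shape (d, d).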
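
The shape argument can be checked on a function whose gradient is known in closed form. The sketch below is not the task's solution: it assumes a hypothetical quadratic form `q(v) = v^T A v` with analytic gradient `(A + A^T) v`, and builds the Hessian one column at a time by central-differencing that analytic gradient. Every name in it (`A`, `quad_grad`, `H_demo`, and so on) is introduced only for this illustration.

```python
import numpy as np

rng_h = np.random.default_rng(2)        # separate generator; the lab's rng is untouched
d_h = 3
A = rng_h.standard_normal((d_h, d_h))   # hypothetical, generally non-symmetric matrix
quad_grad = lambda v: (A + A.T) @ v     # analytic gradient of q(v) = v^T A v
w_h = rng_h.standard_normal(d_h)

eps_h = 1e-4
H_demo = np.zeros((d_h, d_h))
for j in range(d_h):
    e_j = np.zeros(d_h)
    e_j[j] = 1.0
    # Column j: how the whole gradient vector (d,) changes when w_j is nudged.
    H_demo[:, j] = (quad_grad(w_h + eps_h * e_j) - quad_grad(w_h - eps_h * e_j)) / (2 * eps_h)

print(H_demo.shape)                   # (3, 3)
print(np.allclose(H_demo, A + A.T))   # True
```

Each column is a difference of two (d,) gradient vectors, which is where the second axis of the (d, d) shape comes from. The result is symmetric even though `A` is not.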

In [None]:
def numerical_hessian(f, w, eps=1e-4):
    """Numerical Hessian of scalar-valued f at w of shape (d,); returns an array of shape (d, d)."""
    # TODO
    ...

w = rng.standard_normal(4)
f = lambda v: float(np.sum(v*v))
H = numerical_hessian(f, w)
check('H_shape', H.shape == (4,4))
check('H_close', np.allclose(H, 2*np.eye(4), atol=1e-3))
print(H)

---
## Submission Checklist
- All TODOs completed
- Gradient checks pass
- Explain prompts answered
