# Homework 1

<!--
## Part 1: Hand Calculations

### Problem 1.1: Closed-Form Solution
For \(f(x) = x^2 - 4x + 7\),

\(f'(x) = 2x - 4\). Setting \(f'(x)=0\) gives \(2x-4=0 \Rightarrow x=2\).
Since \(f''(x)=2>0\), this is a minimum. The minimum value is \(f(2)=4-8+7=3\).

### Problem 1.2: Gradient Descent by Hand
Update rule: \(x_{n+1} = x_n - \alpha f'(x_n)\), with \(x_0=0\), \(\alpha=0.1\), \(f'(x)=2x-4\).

| iteration (n) | \(x_n\) | \(f'(x_n)\) | \(f(x_n)\) |
|---:|---:|---:|---:|
| 0 | 0.0000 | -4.0000 | 7.0000 |
| 1 | 0.4000 | -3.2000 | 5.5600 |
| 2 | 0.7200 | -2.5600 | 4.6384 |
| 3 | 0.9760 | -2.0480 | 4.0486 |
| 4 | 1.1808 | -1.6384 | 3.6711 |

After 5 updates, \(x_5 = 1.34464\). The true minimum is at \(x=2\), so the distance to the minimizer is \(|1.34464-2|=0.65536\).

The sign of \(f'(x_n)\) gives the direction to move: a negative slope means move right, a positive slope means move left. The magnitude \(|f'(x_n)|\) scales the step: where the slope is steep (far from the minimum) the update is large, and where the slope is small (near the minimum) the update shrinks, which lets the iterates settle toward the minimizer when \(\alpha\) is small enough.
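
The table above can be reproduced with a short loop. This is a minimal sketch for checking the hand arithmetic, not the required implementation; the helper name `hand_gradient_descent` is hypothetical.

```python
def f(x):
    """Objective from Problem 1.1: f(x) = x^2 - 4x + 7."""
    return x**2 - 4 * x + 7


def df(x):
    """Derivative f'(x) = 2x - 4."""
    return 2 * x - 4


def hand_gradient_descent(x0=0.0, alpha=0.1, num_updates=5):
    """Hypothetical helper: return the iterates x_0, ..., x_num_updates."""
    xs = [x0]
    for _ in range(num_updates):
        xs.append(xs[-1] - alpha * df(xs[-1]))
    return xs


xs = hand_gradient_descent()
for n, x in enumerate(xs):
    # Columns match the table: n, x_n, f'(x_n), f(x_n)
    print(f"{n} | {x:.4f} | {df(x):.4f} | {f(x):.4f}")
```

The last printed row corresponds to \(x_5\), the value quoted after the table.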

### Problem 1.3: Fence Optimization
Area constraint: \(LW = 43{,}560\), so \(W=\frac{43{,}560}{L}\).

**Perimeter (in terms of \(L\) only):**
\(P(L) = 2L + 2W = 2L + \frac{87{,}120}{L}\).

**Cost function:**
\(C(L) = 8P(L) = 16L + \frac{16\cdot 43{,}560}{L}\).

**Domain:** \(L>0\) (positive length).

**Closed-form minimum:**
\(C'(L)=16 - \frac{16\cdot 43{,}560}{L^2}\). Setting \(C'(L)=0\) gives \(L^2=43{,}560\), so
\(L=\sqrt{43{,}560}\approx 208.71\) ft and \(W=\frac{43{,}560}{L}=\sqrt{43{,}560}\approx 208.71\) ft.

Minimum total cost:
\(C(L) = 16L + \frac{16\cdot 43{,}560}{L}\Rightarrow C(\sqrt{43{,}560})\approx \$6{,}678.73\).

**Observation:** The optimal fence is a square (length equals width).
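
The closed-form answer can be sanity-checked numerically. The sketch below assumes the cost of 8 dollars per foot implied by \(C(L) = 8P(L)\); the names `AREA`, `COST_PER_FOOT`, and `cost` are illustrative only.

```python
import math

AREA = 43_560        # square feet, from the constraint LW = 43,560
COST_PER_FOOT = 8    # dollars per foot, assumed from C(L) = 8 P(L)


def cost(length):
    """Total fence cost for a rectangle with the given length and fixed area."""
    width = AREA / length
    return COST_PER_FOOT * (2 * length + 2 * width)


best_length = math.sqrt(AREA)
print(f"L = {best_length:.2f} ft, cost = ${cost(best_length):,.2f}")

# Any other length tried here should cost at least as much as the optimum.
for length in (150.0, 200.0, 220.0, 300.0):
    assert cost(length) >= cost(best_length)
```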
-->


In [None]:
import numpy as np
import numpy.polynomial as poly
import matplotlib as mpl
import matplotlib.pyplot as plt
import pandas as pd


In [None]:

def gradient_descent(coef, x_start, learning_rate, num_iterations):
    """
    coef: polynomial coefficients in increasing-degree order,
          e.g. [4, -12, 1] represents x^2 - 12x + 4
    x_start: starting point
    learning_rate: step size
    num_iterations: number of iterations to run
    
    Returns: (x_history, f_history) - lists of x values and f(x) values
    """
    # Build the polynomial and its derivative from the coefficients.
    f = poly.Polynomial(coef)
    df = f.deriv()
    # Store plain Python floats so every history entry has the same type.
    x_history = [float(x_start)]
    f_history = [float(f(x_start))]

    x = x_start
    for i in range(num_iterations):
        x -= learning_rate * float(df(x))
        x_history.append(float(x))
        f_history.append(float(f(x)))

    return x_history, f_history


# function2(x) = x^2 - 12x + 4 (coefficients listed from the constant term up)
coef_f2 = [4, -12, 1]
learning_rates = [0.01, 0.1, 0.5, 0.9]

plt.figure(figsize=(10, 6))

results = {}
for lr in learning_rates:
    x_hist, f_hist = gradient_descent(coef_f2, x_start=0.0, learning_rate=lr, num_iterations=100)
    results[lr] = (x_hist, f_hist)
    plt.plot(range(len(f_hist)), f_hist, linewidth=2, label=f"α = {lr}")

plt.xlabel("Iteration")
plt.ylabel("f(x)")
plt.title("Gradient Descent Convergence for Different Learning Rates")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()

# Print the final x and f(x) reached with each learning rate
for lr in learning_rates:
    x_hist, f_hist = results[lr]
    print(f"α = {lr:0.2f} | final x = {x_hist[-1]:.10f} | final f(x) = {f_hist[-1]:.10f}")
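
The differences between the curves can be explained with a short calculation. For a quadratic \(f(x) = ax^2 + bx + c\) with minimizer \(x^* = -b/(2a)\), the update \(x_{n+1} = x_n - \alpha f'(x_n)\) gives \(x_{n+1} - x^* = (1 - 2a\alpha)(x_n - x^*)\), so each step multiplies the error by \(1 - 2a\alpha\). The sketch below evaluates that factor for the learning rates used above; `contraction_factor` is a hypothetical helper, not part of the assignment code.

```python
def contraction_factor(a, learning_rate):
    """Per-step multiplier of the error x_n - x* for f(x) = a x^2 + b x + c."""
    return 1 - 2 * a * learning_rate


# For x^2 - 12x + 4: a = 1, minimizer x* = 6, starting error x_0 - x* = -6.
for lr in [0.01, 0.1, 0.5, 0.9]:
    r = contraction_factor(1, lr)
    error_after_100 = -6 * r**100
    print(f"α = {lr:0.2f} | factor = {r:+.2f} | error after 100 steps = {error_after_100:.3e}")
```

With α = 0.5 the factor is zero, so a single step lands on the minimizer. With α = 0.9 the factor is negative, so the iterates alternate sides of the minimizer while still converging. With α = 0.01 the factor is close to 1, and an error of roughly 0.8 remains after 100 iterations.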


