# CIR Calibration with Steepest Descent, Newton, and Gauss–Newton

This notebook calibrates a Cox–Ingersoll–Ross (CIR) term structure model to U.S. Treasury yield data using three optimization algorithms: steepest descent, Newton, and Gauss–Newton.

In [1]:
import numpy as np
import math
import textwrap
from pprint import pprint


### Treasury bond data file format (tbonds.txt)

* The first non-empty line is a header:

```
w | t    1/12 0.25  0.5    1    3    5    7   10   20   30
```

  * "w" is a week label, "t" labels maturities.
  * The maturities (time to maturity in years) are: (1/12, 0.25, 0.5, 1, 3, 5, 7, 10, 20, 30).

* Remaining lines give one row per week, with blank lines between:

```
09/12/24 5.18 5.06 4.68 4.09 3.47 3.47 3.57 3.68 4.07 4.00

09/19/24 4.89 4.80 4.46 3.93 3.47 3.49 3.60 3.73 4.11 4.06
```

  * First token: date string.
  * Next 10 tokens: yields in percent for the 10 maturities.
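
As a minimal sketch of the format described above, the two sample lines can be parsed directly. The variable names here are illustrative only, and `fractions.Fraction` is used as one convenient way to read the `1/12` token; it is not the parser the notebook itself uses.

```python
from fractions import Fraction

# Sample header and data row copied from the format description above
header = "w | t    1/12 0.25  0.5    1    3    5    7   10   20   30"
row = "09/12/24 5.18 5.06 4.68 4.09 3.47 3.47 3.57 3.68 4.07 4.00"

# Drop the "|" separator, skip the "w" and "t" labels, convert the rest.
# Fraction accepts both "1/12" and decimal strings such as "0.25".
maturity_tokens = header.replace("|", " ").split()[2:]
maturities = [float(Fraction(tok)) for tok in maturity_tokens]

# First token is the date string, the remaining tokens are yields in percent
date, *yield_tokens = row.split()
yields = [float(tok) for tok in yield_tokens]

print(date, len(maturities), len(yields))
```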


In [2]:
def read_tbonds_file(filename="tbonds.txt"):
    maturities = []
    dates = []
    rates_rows = []

    with open(filename, "r") as f:
        lines = f.readlines()

    # Filter out lines that are completely empty (or whitespace only)
    non_empty_lines = [ln.strip() for ln in lines if ln.strip() != ""]

    if len(non_empty_lines) == 0:
        raise ValueError("File appears empty or only whitespace")

    # First non-empty line is header
    header_line = non_empty_lines[0]
    header_line = header_line.replace("|", " ")
    header_tokens = header_line.split()

    # header_tokens should look like:
    # ["w", "t", "1/12", "0.25", "0.5", "1", "3", "5", "7", "10", "20", "30"]
    if len(header_tokens) < 3:
        raise ValueError("Header line does not contain maturities")

    maturity_tokens = header_tokens[2:]  # skip "w" and "t"

    for tok in maturity_tokens:
        if "/" in tok:
            num, den = tok.split("/")
            maturities.append(float(num) / float(den))
        else:
            maturities.append(float(tok))

    # Remaining non-empty lines are data rows
    for line in non_empty_lines[1:]:
        tokens = line.split()
        if len(tokens) == 0:
            continue
        date_str = tokens[0]
        rates = [float(x) for x in tokens[1:]]
        dates.append(date_str)
        rates_rows.append(rates)

    t_vec = np.array(maturities, dtype=float)           # shape (M,)
    R = np.array(rates_rows, dtype=float)               # shape (W, M)
    return t_vec, dates, R


t_vec, dates, R = read_tbonds_file("tbonds.txt")
print("Maturities:", t_vec)
print("Dates:", dates)
print("R shape:", R.shape)
print("First row of rates:", R[0])
assert t_vec.shape == (10,), "Expected 10 maturities"
assert R.shape[1] == 10, "Expected 10 maturities per week"
print("Data load assertions passed.")


Maturities: [ 0.08333333  0.25        0.5         1.          3.          5.
  7.         10.         20.         30.        ]
Dates: ['09/12/24', '09/19/24', '09/26/24', '10/03/24', '10/10/24', '10/17/24', '10/24/24', '10/31/24', '11/07/24']
R shape: (9, 10)
First row of rates: [5.18 5.06 4.68 4.09 3.47 3.47 3.57 3.68 4.07 4.  ]
Data load assertions passed.



### CIR zero-coupon bond and yield formulas
We use the Cox–Ingersoll–Ross (CIR) model under no arbitrage, where the price of a zero-coupon bond is
\[
P(r,t;a,b,\sigma) = A(t;a,b,\sigma)\, e^{-B(t;a,\sigma)\,r}
\]
with
\[
h(a,\sigma) = \sqrt{a^2 + 2\sigma^2},
\]
\[
A(t;a,b,\sigma) =
\left[
\frac{2 h \exp\left(\tfrac{a+h}{2}\, t\right)}
{2h + (a+h)(e^{h t}-1)}
\right]^{\frac{2ab}{\sigma^2}},
\quad
B(t;a,\sigma) =
\frac{2(e^{h t}-1)}{2h + (a+h)(e^{h t}-1)}.
\]
The continuously compounded yield (rate of return in percent) is
\[
R_{\text{model}}(t; r,a,b,\sigma)
= -\frac{\log P(r,t;a,b,\sigma)}{t} \times 100.
\]
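
As a worked step, taking logs of the price formula expresses the yield directly in terms of \(A\) and \(B\). Two limits then follow from the definitions above and can serve as sanity checks on an implementation (assuming \(\sigma \neq 0\), so that \(h > |a|\)):
\[
\log P = \log A(t;a,b,\sigma) - B(t;a,\sigma)\,r
\quad\Longrightarrow\quad
R_{\text{model}}(t; r,a,b,\sigma) = \frac{100}{t}\left[B(t;a,\sigma)\,r - \log A(t;a,b,\sigma)\right],
\]
\[
\lim_{t\to 0^+} R_{\text{model}} = 100\,r,
\qquad
\lim_{t\to\infty} R_{\text{model}} = 100\cdot\frac{2ab}{a+h}.
\]
The short-maturity limit uses \(B(t) \approx t\) and \(\log A(t) = O(t^2)\) as \(t \to 0\); the long-maturity limit uses \(B(t) \to 2/(a+h)\) and \(h^2 - a^2 = 2\sigma^2\).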


In [3]:
def cir_h(a, sigma):
    return math.sqrt(a * a + 2.0 * sigma * sigma)


def cir_A_B(t, a, b, sigma):
    """
    Return (A(t; a,b,sigma), B(t; a,sigma)) for the CIR formula.
    The denominator, the base of A, and the exponent are clamped to
    avoid division by zero and floating-point overflow.
    """
    h = cir_h(a, sigma)
    exp_ht = math.exp(h * t)
    numerator = 2.0 * h * math.exp((a + h) * t / 2.0)
    denom = 2.0 * h + (a + h) * (exp_ht - 1.0)
    denom = denom if denom > 1e-12 else 1e-12
    C = numerator / denom
    C_clamped = min(max(C, 1e-12), 1e12)
    q = 2.0 * a * b / (sigma * sigma) if sigma != 0 else 0.0
    exp_arg = q * math.log(C_clamped)
    exp_arg = max(min(exp_arg, 700.0), -700.0)
    A = math.exp(exp_arg)
    B = 2.0 * (exp_ht - 1.0) / denom
    return A, B


def cir_price(r, t, a, b, sigma):
    try:
        A, B = cir_A_B(t, a, b, sigma)
        P = A * math.exp(-B * r)
    except (OverflowError, ValueError):
        return 1e-12
    return max(P, 1e-12)


def cir_yield_percent(r, t, a, b, sigma):
    """
    r, a, b, sigma are given as decimal rates,
    but the returned yield is in percent (%),
    to match the data units in tbonds.txt.
    """
    P = cir_price(r, t, a, b, sigma)
    t_safe = t if t > 1e-8 else 1e-8
    return - (math.log(P) / t_safe) * 100.0



### Weekly nonlinear least squares formulation
For a given week \(w\), we have maturities \(t_j\) and observed yields \(R_{w,j}^{\text{obs}}\) (in percent, from the dataset).
We collect the parameters in \(x = (r, a, b, \sigma)^\top\). The model yield is
\[
R_{\text{model}}(t_j; x) = R_{\text{model}}(t_j; r,a,b,\sigma).
\]
Residuals and objective:
\[
f_j(x) = R_{\text{model}}(t_j; x) - R_{w,j}^{\text{obs}}, \quad
\varphi(x) = \tfrac12 \sum_{j=1}^M f_j(x)^2.
\]
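
The residual vector and objective can also be written in vectorized NumPy form. The sketch below is an illustration, not the notebook's implementation: the names `cir_yield_percent_vec`, `residuals_vec`, and `objective_vec` are hypothetical, and it omits clamping, so it assumes \(\sigma \neq 0\), \(t > 0\), and moderate values of \(h t\).

```python
import numpy as np

def cir_yield_percent_vec(x, t):
    """Vectorized CIR yield in percent; x = (r, a, b, sigma), t in years."""
    r, a, b, sigma = x
    t = np.asarray(t, dtype=float)
    h = np.sqrt(a * a + 2.0 * sigma * sigma)
    em1 = np.expm1(h * t)                       # e^{h t} - 1
    denom = 2.0 * h + (a + h) * em1
    # log A = (2ab / sigma^2) * [log(2h) + (a + h) t / 2 - log(denom)]
    log_A = (2.0 * a * b / sigma**2) * (
        np.log(2.0 * h) + 0.5 * (a + h) * t - np.log(denom)
    )
    B = 2.0 * em1 / denom
    # Yield = -log(P) / t * 100 with log P = log A - B r
    return 100.0 * (B * r - log_A) / t

def residuals_vec(x, t, R_obs):
    """f_j(x) = model yield minus observed yield, for all maturities at once."""
    return cir_yield_percent_vec(x, t) - np.asarray(R_obs, dtype=float)

def objective_vec(x, t, R_obs):
    """phi(x) = 0.5 * sum_j f_j(x)^2."""
    F = residuals_vec(x, t, R_obs)
    return 0.5 * float(F @ F)

# Synthetic check: yields generated by the model itself give zero residuals
t_demo = np.array([1 / 12, 0.25, 0.5, 1, 3, 5, 7, 10, 20, 30])
x_demo = np.array([0.04, 0.5, 0.05, 0.10])
R_demo = cir_yield_percent_vec(x_demo, t_demo)
F_demo = residuals_vec(x_demo, t_demo, R_demo)
phi_demo = objective_vec(x_demo, t_demo, R_demo)
```

Generating the "observations" from the model is only a self-consistency check; with real data the same two functions would take a row of observed yields in percent.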


In [4]:

def residuals_week(x, t_vec, R_obs_row):
    """
    x = [r, a, b, sigma]
    t_vec: shape (M,)
    R_obs_row: shape (M,) observed yields in percent
    """
    r, a, b, sigma = x
    M = len(t_vec)
    F = np.zeros(M, dtype=float)
    for j in range(M):
        t = t_vec[j]
        R_model = cir_yield_percent(r, t, a, b, sigma)
        F[j] = R_model - R_obs_row[j]
    return F


def objective_week(x, t_vec, R_obs_row):
    F = residuals_week(x, t_vec, R_obs_row)
    return 0.5 * np.dot(F, F)



### Finite-difference Jacobian, gradient, and Gauss–Newton Hessian (weekly)
For least squares \(\varphi(x) = \tfrac12 \|F(x)\|^2\) we use
\(\nabla \varphi(x) = J_F(x)^T F(x)\) and the Gauss–Newton approximation
\(\nabla^2 \varphi(x) \approx J_F(x)^T J_F(x)\). We compute \(J_F(x)\) via forward finite differences.
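
To illustrate these three quantities on something that can be checked by hand, the sketch below applies a forward-difference Jacobian to a small made-up residual function whose exact Jacobian is known. The functions `toy_residuals`, `toy_jacobian`, and `forward_diff_jacobian` are hypothetical and exist only for this example.

```python
import numpy as np

def toy_residuals(x):
    # Hypothetical residual vector F: R^2 -> R^3 with a known Jacobian
    return np.array([x[0] ** 2 - x[1],
                     np.sin(x[0]) + 2.0 * x[1],
                     x[0] * x[1]])

def toy_jacobian(x):
    # Exact Jacobian of toy_residuals, used only to measure the FD error
    return np.array([[2.0 * x[0], -1.0],
                     [np.cos(x[0]), 2.0],
                     [x[1], x[0]]])

def forward_diff_jacobian(F, x, eps=1e-6):
    """Forward-difference Jacobian with a step scaled by max(1, |x_k|)."""
    x = np.asarray(x, dtype=float)
    F0 = F(x)
    J = np.zeros((F0.size, x.size))
    for k in range(x.size):
        step = eps * max(1.0, abs(x[k]))
        x_pert = x.copy()
        x_pert[k] += step
        J[:, k] = (F(x_pert) - F0) / step
    return J

x_toy = np.array([0.7, -1.3])
J_fd = forward_diff_jacobian(toy_residuals, x_toy)
J_exact = toy_jacobian(x_toy)

grad_fd = J_fd.T @ toy_residuals(x_toy)   # gradient of 0.5 * ||F||^2
hess_gn = J_fd.T @ J_fd                   # Gauss-Newton Hessian approximation
jac_error = np.max(np.abs(J_fd - J_exact))
print("max Jacobian error:", jac_error)
```

Forward differences carry a truncation error proportional to the step size, so the finite-difference gradient is only accurate to a few digits beyond `eps`; this matters when comparing gradient norms against tight tolerances.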


In [5]:

def jacobian_fd_week(x, t_vec, R_obs_row, eps=1e-6):
    """
    Finite-difference Jacobian for the residuals F(x) for a single week.
    J has shape (M, 4).
    """
    x = np.array(x, dtype=float)
    F0 = residuals_week(x, t_vec, R_obs_row)
    M = len(F0)
    n = len(x)
    J = np.zeros((M, n), dtype=float)

    for k in range(n):
        x_perturbed = x.copy()
        delta = eps * max(1.0, abs(x[k]))
        x_perturbed[k] = x[k] + delta
        Fk = residuals_week(x_perturbed, t_vec, R_obs_row)
        J[:, k] = (Fk - F0) / delta

    return J


def grad_week(x, t_vec, R_obs_row, eps=1e-6):
    F = residuals_week(x, t_vec, R_obs_row)
    J = jacobian_fd_week(x, t_vec, R_obs_row, eps=eps)
    return J.T @ F


def hess_gauss_newton_week(x, t_vec, R_obs_row, eps=1e-6):
    J = jacobian_fd_week(x, t_vec, R_obs_row, eps=eps)
    return J.T @ J



### Backtracking line search (Armijo)
All three methods use backtracking to select step lengths. Starting from \(\alpha_0\), shrink by a factor \(\rho\) until the Armijo condition
\( \varphi(x + \alpha p) \le \varphi(x) + c\, \alpha\, \nabla \varphi(x)^T p \) holds.
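
The shrinking rule can be traced on a small quadratic where the arithmetic is easy to follow. This is a standalone sketch with hypothetical names (`armijo_backtracking`, `phi_quad`), using the same defaults \(\alpha_0 = 1\), \(\rho = 0.5\), \(c = 10^{-4}\) as the notebook.

```python
import numpy as np

def armijo_backtracking(phi, x, p, grad, alpha0=1.0, rho=0.5, c=1e-4,
                        alpha_min=1e-10):
    """Return the first alpha in alpha0, rho*alpha0, ... satisfying Armijo.

    Returns 0.0 if no step above alpha_min is acceptable.
    """
    slope = float(grad @ p)          # directional derivative along p
    phi_x = phi(x)
    alpha = alpha0
    while alpha >= alpha_min:
        if phi(x + alpha * p) <= phi_x + c * alpha * slope:
            return alpha
        alpha *= rho
    return 0.0

# Ill-conditioned quadratic: phi(x) = 0.5 * (x0^2 + 100 * x1^2)
Q = np.diag([1.0, 100.0])

def phi_quad(x):
    return 0.5 * float(x @ Q @ x)

x_start = np.array([1.0, 1.0])
g_start = Q @ x_start                # exact gradient
p_start = -g_start                   # steepest-descent direction

alpha_star = armijo_backtracking(phi_quad, x_start, p_start, g_start)
x_next = x_start + alpha_star * p_start
print(alpha_star, phi_quad(x_start), phi_quad(x_next))
```

On this example the unit step overshoots badly along the stiff coordinate, and the step is halved six times before the condition holds, giving \(\alpha = 1/64\).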


In [6]:
def backtracking_line_search_week(x, p, t_vec, R_obs_row,
                                  alpha_init=1.0, rho=0.5, c=1e-4,
                                  eps=1e-6):
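    """
    Backtracking (Armijo) line search for the weekly objective.

    Returns only the step length alpha. If p is not a descent direction,
    the search falls back to the steepest-descent direction internally,
    but the caller's p is not modified, so the returned alpha then refers
    to -grad rather than to the p that was passed in.
    """
    x = np.array(x, dtype=float)
    p = np.array(p, dtype=float)

    f_x = objective_week(x, t_vec, R_obs_row)
    g_x = grad_week(x, t_vec, R_obs_row, eps=eps)
    # Fall back to steepest descent if p is not a descent direction
    if np.dot(g_x, p) >= 0:
        p = -g_x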

    alpha = alpha_init

    while True:
        x_new = x + alpha * p
        f_new = objective_week(x_new, t_vec, R_obs_row)
        if f_new <= f_x + c * alpha * np.dot(g_x, p):
            break
        alpha *= rho
        if alpha < 1e-10:
            break

    return alpha



### Optimization algorithms for a single week
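We implement steepest descent, Newton, and Gauss–Newton, each with an Armijo backtracking line search (optional for Gauss–Newton). `newton_week` uses the Gauss–Newton Hessian \(J_F^T J_F\) for stability, so with line search enabled it takes the same steps as `gauss_newton_week`.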
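
For comparison, here is a compact Gauss–Newton loop on a toy exponential fit. It is a sketch under stated assumptions rather than the notebook's solver: the model, data, and function names are made up for illustration, the Jacobian is analytic, and the step is obtained with `np.linalg.lstsq` on \(J p = -F\) (a swapped-in alternative to forming and solving \(J^T J\, p = -J^T F\)).

```python
import numpy as np

def model(c, t):
    # Hypothetical two-parameter model: c0 * exp(c1 * t)
    return c[0] * np.exp(c[1] * t)

def residuals(c, t, y):
    return model(c, t) - y

def jacobian(c, t):
    e = np.exp(c[1] * t)
    return np.column_stack([e, c[0] * t * e])

def gauss_newton_lstsq(c0, t, y, max_iters=50, tol=1e-10):
    c = np.asarray(c0, dtype=float)
    for _ in range(max_iters):
        F = residuals(c, t, y)
        J = jacobian(c, t)
        g = J.T @ F                              # gradient of 0.5 * ||F||^2
        if np.linalg.norm(g) < tol:
            break
        # Gauss-Newton step: minimize ||J p + F||_2 without forming J^T J
        p, *_ = np.linalg.lstsq(J, -F, rcond=None)
        # Armijo backtracking on phi = 0.5 * ||F||^2
        phi, slope, alpha = 0.5 * F @ F, g @ p, 1.0
        while alpha > 1e-10:
            F_new = residuals(c + alpha * p, t, y)
            if 0.5 * F_new @ F_new <= phi + 1e-4 * alpha * slope:
                c = c + alpha * p
                break
            alpha *= 0.5
        else:
            break                                # no acceptable step found
    return c

# Noise-free synthetic data generated from c = (2.0, -1.5)
t_toy = np.linspace(0.0, 1.0, 8)
y_toy = 2.0 * np.exp(-1.5 * t_toy)
c_hat = gauss_newton_lstsq([1.0, -1.0], t_toy, y_toy)
print(c_hat)
```

Solving the least-squares subproblem directly avoids squaring the condition number of \(J\), which can help when the normal-equations matrix is close to singular.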


In [7]:

def steepest_descent_week(x0, t_vec, R_obs_row,
                          max_iters=1000, tol=1e-6, eps=1e-6,
                          verbose=False):
    x = np.array(x0, dtype=float)
    for k in range(max_iters):
        g = grad_week(x, t_vec, R_obs_row, eps=eps)
        ng = np.linalg.norm(g)
        if verbose and k % 50 == 0:
            print(f"[SD] iter {k}, f={objective_week(x, t_vec, R_obs_row):.6e}, ||grad||={ng:.6e}")
        if ng < tol:
            break

        p = -g
        alpha = backtracking_line_search_week(x, p, t_vec, R_obs_row,
                                              alpha_init=1.0, rho=0.5, c=1e-4,
                                              eps=eps)
        x = x + alpha * p

    return x


In [8]:

def newton_week(x0, t_vec, R_obs_row,
                max_iters=50, tol=1e-8, eps=1e-6,
                use_gauss_newton_hessian=True,
                verbose=False):
    x = np.array(x0, dtype=float)

    for k in range(max_iters):
        g = grad_week(x, t_vec, R_obs_row, eps=eps)
        ng = np.linalg.norm(g)
        if verbose:
            print(f"[Newton] iter {k}, f={objective_week(x, t_vec, R_obs_row):.6e}, ||grad||={ng:.6e}")
        if ng < tol:
            break

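        if use_gauss_newton_hessian:
            H = hess_gauss_newton_week(x, t_vec, R_obs_row, eps=eps)
        else:
            # No exact (second-derivative) Hessian is implemented; this branch
            # also uses the Gauss-Newton approximation J^T J.
            H = hess_gauss_newton_week(x, t_vec, R_obs_row, eps=eps)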

        try:
            p = np.linalg.solve(H, -g)
        except np.linalg.LinAlgError:
            H_reg = H + 1e-6 * np.eye(len(x))
            p = np.linalg.solve(H_reg, -g)

        alpha = backtracking_line_search_week(x, p, t_vec, R_obs_row,
                                              alpha_init=1.0, rho=0.5, c=1e-4,
                                              eps=eps)
        x = x + alpha * p

    return x


In [9]:

def gauss_newton_week(x0, t_vec, R_obs_row,
                      max_iters=50, tol=1e-8, eps=1e-6,
                      use_line_search=True,
                      verbose=False):
    x = np.array(x0, dtype=float)

    for k in range(max_iters):
        F = residuals_week(x, t_vec, R_obs_row)
        J = jacobian_fd_week(x, t_vec, R_obs_row, eps=eps)
        g = J.T @ F
        ng = np.linalg.norm(g)
        if verbose:
            print(f"[GN] iter {k}, f={0.5 * np.dot(F, F):.6e}, ||J^T F||={ng:.6e}")
        if ng < tol:
            break

        H_gn = J.T @ J
        try:
            p = np.linalg.solve(H_gn, -g)
        except np.linalg.LinAlgError:
            H_reg = H_gn + 1e-6 * np.eye(len(x))
            p = np.linalg.solve(H_reg, -g)

        if use_line_search:
            alpha = backtracking_line_search_week(x, p, t_vec, R_obs_row,
                                                  alpha_init=1.0, rho=0.5, c=1e-4,
                                                  eps=eps)
        else:
            alpha = 1.0

        x = x + alpha * p

    return x


In [10]:
# Quick sanity check on a single week (w=0)
w = 0
R_obs_row = R[w, :]
x0 = np.array([0.04, 0.5, 0.05, 0.10])

print("Initial objective:", objective_week(x0, t_vec, R_obs_row))
print("Gradient norm:", np.linalg.norm(grad_week(x0, t_vec, R_obs_row)))

# Run each solver with verbose output to confirm convergence behavior
print("\nRunning steepest descent (week 0)...")
x_sd_test = steepest_descent_week(x0, t_vec, R_obs_row, max_iters=500, tol=1e-6, eps=1e-6, verbose=True)
print("Steepest descent solution:", x_sd_test)

print("\nRunning Newton (week 0)...")
x_nt_test = newton_week(x0, t_vec, R_obs_row, max_iters=50, tol=1e-8, eps=1e-6, use_gauss_newton_hessian=True, verbose=True)
print("Newton solution:", x_nt_test)

print("\nRunning Gauss-Newton (week 0)...")
x_gn_test = gauss_newton_week(x0, t_vec, R_obs_row, max_iters=50, tol=1e-8, eps=1e-6, use_line_search=True, verbose=True)
print("Gauss-Newton solution:", x_gn_test)

Initial objective: 4.260298551844023
Gradient norm: 422.5169821805291

Running steepest descent (week 0)...
[SD] iter 0, f=4.260299e+00, ||grad||=4.225170e+02
[SD] iter 50, f=8.374851e-01, ||grad||=1.611466e+00
[SD] iter 100, f=8.352473e-01, ||grad||=2.212027e+00
[SD] iter 150, f=8.330103e-01, ||grad||=1.257226e+00
[SD] iter 200, f=8.308135e-01, ||grad||=1.576509e+00
[SD] iter 250, f=8.286579e-01, ||grad||=2.194935e+00
[SD] iter 300, f=8.265004e-01, ||grad||=1.251149e+00
[SD] iter 350, f=8.243826e-01, ||grad||=1.603194e+00
[SD] iter 400, f=8.223072e-01, ||grad||=2.298650e+00
[SD] iter 450, f=8.202194e-01, ||grad||=1.284803e+00
Steepest descent solution: [0.04782967 0.52143138 0.03511569 0.09656174]

Running Newton (week 0)...
[Newton] iter 0, f=4.260299e+00, ||grad||=4.225170e+02
[Newton] iter 1, f=1.465456e+00, ||grad||=2.202436e+02
[Newton] iter 2, f=1.362893e+00, ||grad||=2.486522e+02
[Newton] iter 3, f=7.098055e-01, ||grad||=1.553335e+02
[Newton] iter 4, f=6.593088e-01, ||grad||=1.372369e+02
[Newton] iter 5, f=6.300849e-01, ||grad||=1.271348e+02
[Newton] iter 6, f=6.169244e-01, ||grad||=1.220680e+02
[Newton] iter 7, f=2.843867e-01, ||grad||=2.701061e+01
[Newton] iter 8, f=2.828558e-01, ||grad||=2.664491e+01
[Newton] iter 9, f=2.824113e-01, ||grad||=2.643331e+01
[Newton] iter 10, f=2.821853e-01, ||grad||=2.631853e+01
[Newton] iter 11, f=2.821564e-01, ||grad||=2.630317e+01
[Newton] iter 12, f=2.821420e-01, ||grad||=2.629545e+01
[Newton] iter 13, f=2.821384e-01, ||grad||=2.629351e+01
[Newton] iter 14, f=2.821366e-01, ||grad||=2.629254e+01
[Newton] iter 15, f=2.821357e-01, ||grad||=2.629206e+01
[Newton] iter 16, f=2.821352e-01, ||grad||=2.629182e+01
[Newton] iter 17, f=2.821351e-01, ||grad||=2.629176e+01
[Newton] iter 18, f=2.821351e-01, ||grad||=2.629174e+01
[Newton] iter 19, f=2.821351e-01, ||grad||=2.629173e+01
[Newton] iter 20, f=2.821351e-01, ||grad||=2.629173e+01
[Newton] iter 21, f=2.821351e-01, ||grad||=2.629173e+01
[Newton] iter 22, f=2.821351e-01, ||grad||=2.629173e+01
[Newton] iter 23, f=2.821351e-01, ||grad||=2.629173e+01
[Newton] iter 24, f=2.821351e-01, ||grad||=2.629173e+01
[Newton] iter 25, f=2.821351e-01, ||grad||=2.629174e+01
[Newton] iter 26, f=2.821351e-01, ||grad||=2.629174e+01
[Newton] iter 27, f=2.821351e-01, ||grad||=2.629174e+01
[Newton] iter 28, f=2.821351e-01, ||grad||=2.629174e+01
[Newton] iter 29, f=2.821351e-01, ||grad||=2.629174e+01
[Newton] iter 30, f=2.821351e-01, ||grad||=2.629

[GN] iter 36, f=2.821351e-01, ||J^T F||=2.629176e+01
[GN] iter 37, f=2.821351e-01, ||J^T F||=2.629176e+01
[GN] iter 38, f=2.821351e-01, ||J^T F||=2.629176e+01
[GN] iter 39, f=2.821351e-01, ||J^T F||=2.629177e+01
[GN] iter 40, f=2.821351e-01, ||J^T F||=2.629177e+01
[GN] iter 41, f=2.821351e-01, ||J^T F||=2.629177e+01
[GN] iter 42, f=2.821351e-01, ||J^T F||=2.629177e+01
[GN] iter 43, f=2.821351e-01, ||J^T F||=2.629177e+01
[GN] iter 44, f=2.821351e-01, ||J^T F||=2.629178e+01
[GN] iter 45, f=2.821351e-01, ||J^T F||=2.629178e+01
[GN] iter 46, f=2.821351e-01, ||J^T F||=2.629178e+01
[GN] iter 47, f=2.821351e-01, ||J^T F||=2.629178e+01
[GN] iter 48, f=2.821351e-01, ||J^T F||=2.629178e+01
[GN] iter 49, f=2.821351e-01, ||J^T F||=2.629179e+01
Gauss-Newton solution: [ 0.05336361  2.60068635  0.04840203 -2.56057867]



## Part (1): Weekly calibration (r_w, a_w, b_w, \sigma_w)
For each week we minimize
\(\varphi_w(r,a,b,\sigma) = \tfrac12 \sum_j (R_{\text{model}}(t_j; r,a,b,\sigma) - R_{w,j}^{\text{obs}})^2\)
using the three optimizers.


In [11]:
W, M = R.shape
print("Number of weeks:", W, "Number of maturities:", M)

def compute_sse_rmse_week(x, t_vec, R_obs_row):
    F = residuals_week(x, t_vec, R_obs_row)
    sse = np.dot(F, F)
    rmse = math.sqrt(sse / len(F))
    return sse, rmse

results_part1 = []

for w in range(W):
    R_obs_row = R[w, :]
    x0 = np.array([0.04, 0.5, 0.05, 0.10], dtype=float)
    print(f"=== Week {w} ({dates[w]}) ===")
    x_sd = steepest_descent_week(x0, t_vec, R_obs_row,
                                 max_iters=1000, tol=1e-6, eps=1e-6,
                                 verbose=False)
    x_nt = newton_week(x0, t_vec, R_obs_row,
                       max_iters=50, tol=1e-8, eps=1e-6,
                       use_gauss_newton_hessian=True,
                       verbose=False)
    x_gn = gauss_newton_week(x0, t_vec, R_obs_row,
                             max_iters=50, tol=1e-8, eps=1e-6,
                             use_line_search=True,
                             verbose=False)

    for method_name, x_hat in [("SteepestDescent", x_sd),
                               ("Newton", x_nt),
                               ("GaussNewton", x_gn)]:
        sse, rmse = compute_sse_rmse_week(x_hat, t_vec, R_obs_row)
        results_part1.append({
            "week_index": w,
            "week_date": dates[w],
            "method": method_name,
            "params": x_hat,
            "SSE": sse,
            "RMSE": rmse
        })
        print(f"{method_name}: params={x_hat}, SSE={sse:.6e}, RMSE={rmse:.6e}")

Number of weeks: 9 Number of maturities: 10
=== Week 0 (09/12/24) ===


SteepestDescent: params=[0.04795148 0.54210459 0.03509135 0.09330123], SSE=1.597879e+00, RMSE=3.997348e-01
Newton: params=[ 0.05336361  2.60068635  0.04840203 -2.56057867], SSE=5.642702e-01, RMSE=2.375437e-01
GaussNewton: params=[ 0.05336361  2.60068635  0.04840203 -2.56057867], SSE=5.642702e-01, RMSE=2.375437e-01
=== Week 1 (09/19/24) ===


SteepestDescent: params=[0.04523794 0.52875518 0.03625336 0.09188536], SSE=1.369426e+00, RMSE=3.700575e-01
Newton: params=[ 0.05030306  2.79426415  0.04800257 -2.62921392], SSE=6.023529e-01, RMSE=2.454288e-01
GaussNewton: params=[ 0.05030306  2.79426415  0.04800257 -2.62921392], SSE=6.023529e-01, RMSE=2.454288e-01
=== Week 2 (09/26/24) ===


SteepestDescent: params=[0.04473604 0.52387591 0.03715414 0.09143864], SSE=1.235810e+00, RMSE=3.515409e-01
Newton: params=[ 0.04754693  3.53625237  0.04447565 -2.87721843], SSE=9.726722e-01, RMSE=3.118769e-01
GaussNewton: params=[ 0.04754693  3.53625237  0.04447565 -2.87721843], SSE=9.726722e-01, RMSE=3.118769e-01
=== Week 3 (10/03/24) ===


SteepestDescent: params=[0.04508302 0.52249889 0.03788975 0.09116918], SSE=1.220432e+00, RMSE=3.493468e-01
Newton: params=[ 0.04538285  3.12929358  0.04873671 -2.74396432], SSE=7.581196e-01, RMSE=2.753397e-01
GaussNewton: params=[ 0.04538285  3.12929358  0.04873671 -2.74396432], SSE=7.581196e-01, RMSE=2.753397e-01
=== Week 4 (10/10/24) ===


SteepestDescent: params=[0.04580923 0.514335   0.04081025 0.09122192], SSE=8.396242e-01, RMSE=2.897627e-01
Newton: params=[ 0.05120384  5.47332215  0.04708388 -3.44162174], SSE=3.634155e-01, RMSE=1.906346e-01
GaussNewton: params=[ 0.05120384  5.47332215  0.04708388 -3.44162174], SSE=3.634155e-01, RMSE=1.906346e-01
=== Week 5 (10/17/24) ===


SteepestDescent: params=[0.04560451 0.51352333 0.04089998 0.09131481], SSE=8.058401e-01, RMSE=2.838732e-01
Newton: params=[ 0.04902807  4.63774165  0.04777486 -3.21032872], SSE=4.348910e-01, RMSE=2.085404e-01
GaussNewton: params=[ 0.04902807  4.63774165  0.04777486 -3.21032872], SSE=4.348910e-01, RMSE=2.085404e-01
=== Week 6 (10/24/24) ===




SteepestDescent: params=[0.04545215 0.50884252 0.0424087  0.09166493], SSE=6.090144e-01, RMSE=2.467822e-01
Newton: params=[ 0.04203759 12.8053046   0.04315798 -1.97291482], SSE=9.034148e-01, RMSE=3.005686e-01
GaussNewton: params=[ 0.04203759 12.8053046   0.04315798 -1.97291482], SSE=9.034148e-01, RMSE=3.005686e-01
=== Week 7 (10/31/24) ===


SteepestDescent: params=[0.04495336 0.50487463 0.04340649 0.09287679], SSE=3.894147e-01, RMSE=1.973359e-01
Newton: params=[ 0.04151005  0.14084971  0.04203578 -0.01499913], SSE=9.661711e-01, RMSE=3.108329e-01
GaussNewton: params=[ 0.04151005  0.14084971  0.04203578 -0.01499913], SSE=9.661711e-01, RMSE=3.108329e-01
=== Week 8 (11/07/24) ===


SteepestDescent: params=[0.04459167 0.503464   0.04397166 0.09286453], SSE=3.650213e-01, RMSE=1.910553e-01
Newton: params=[ 0.04214473  0.01244992  0.05982536 -0.02005375], SSE=5.826794e-01, RMSE=2.413875e-01
GaussNewton: params=[ 0.04214473  0.01244992  0.05982536 -0.02005375], SSE=5.826794e-01, RMSE=2.413875e-01



### Weekly calibration discussion
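In these runs, Newton and Gauss–Newton return identical parameters and SSE in every week, which is expected because `newton_week` uses the Gauss–Newton Hessian \(J_F^T J_F\) and the same line search. They reach a lower SSE than steepest descent in weeks 0 to 5 (for example 0.564 versus 1.598 in week 0) but a higher SSE in weeks 6 to 8. None of the solvers meets its gradient tolerance in the week 0 test run: the Newton and Gauss–Newton traces stall at a gradient norm of about 26.3, and steepest descent is still decreasing slowly after 450 iterations, so the reported points should not be read as converged minimizers. The Newton and Gauss–Newton estimates of \(\sigma\) are negative in every week; since \(\sigma\) enters the formulas only through \(\sigma^2\), the fit depends on \(|\sigma|\) alone. When \(J_F^T J_F\) is singular, both solvers fall back to adding \(10^{-6} I\) before solving.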
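
The effect of adding a small multiple of the identity to a nearly singular \(J^T J\) can be seen on a tiny example. This is an illustrative sketch: `damped_gn_step` is a hypothetical helper, the Jacobian and residuals are made up, and unlike the notebook's solvers (which add \(10^{-6} I\) only after `np.linalg.solve` raises `LinAlgError`) the damping here is applied unconditionally.

```python
import numpy as np

def damped_gn_step(J, F, lam=1e-6):
    """Solve (J^T J + lam * I) p = -J^T F for the step p."""
    H = J.T @ J
    g = J.T @ F
    return np.linalg.solve(H + lam * np.eye(H.shape[0]), -g)

# Made-up Jacobian with two almost collinear columns, so J^T J is
# nearly singular, plus an arbitrary residual vector
J_demo = np.array([[1.0, 1.0],
                   [1.0, 1.0 + 1e-9],
                   [1.0, 1.0]])
F_demo = np.array([1.0, 2.0, 3.0])

g_demo = J_demo.T @ F_demo
p_damped = damped_gn_step(J_demo, F_demo, lam=1e-6)

print("condition number of J^T J:", np.linalg.cond(J_demo.T @ J_demo))
print("damped step:", p_damped)
print("descent check g^T p:", g_demo @ p_damped)
```

Because \(J^T J + \lambda I\) is positive definite for any \(\lambda > 0\), the damped step is always a descent direction for \(\varphi\) when the gradient is nonzero, at the cost of a slightly shorter step along well-determined directions.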


## Part (1) summary
The weekly calibrations above run all three solvers for each of the nine weeks. SSE and RMSE are printed for each week and method so that convergence and fit quality can be inspected directly.
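
The two reported metrics relate to the objective by \(\text{SSE} = 2\varphi\) and \(\text{RMSE} = \sqrt{\text{SSE}/M}\). As a worked check using the week 0 Newton figures printed above, with \(M = 10\) maturities:
\[
\text{RMSE} = \sqrt{\frac{0.5642702}{10}} \approx 0.2375437,
\qquad
\varphi = \frac{0.5642702}{2} = 0.2821351,
\]
which agrees with the objective value at which the week 0 Newton trace levels off.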

## Part (2)
Part (2) global calibration is temporarily skipped to focus on stabilizing Part (1).