# Bonus Task
We need ot find such a problem for which Quasi Newton method outperforms Newton method.

Let's consider a modified version of the Himmelblau function:
$$ f(x, y) = (3x^2 + y - 13)^2 + (x + 4y^2 - 19)^2 $$

**Define the Modified Function and Its Derivatives**

   - Function: $$f(x, y) = (3x^2 + y - 13)^2 + (x + 4y^2 - 19)^2 $$
   - Gradient:
     $$
     \nabla f(x, y) = \begin{bmatrix}
     2 \cdot 2(3x^2 + y - 13) \cdot 6x + 2(x + 4y^2 - 19) \\
     2(3x^2 + y - 13) + 2 \cdot 2(x + 4y^2 - 19) \cdot 8y
     \end{bmatrix}
     $$
   - Hessian:
     $$
     H(x, y) = \begin{bmatrix}
     2 \cdot 6(3x^2 + y - 13) \cdot 6 + 36x^2 + 2 & 6 \\
     6 & 2 + 64y^2
     \end{bmatrix}
     $$

In [1]:
import numpy as np
import time

# Define the modified Himmelblau function
def f_modified(x):
    return (3 * x[0]**2 + x[1] - 13)**2 + (x[0] + 4 * x[1]**2 - 19)**2

# Define the gradient of the modified function
def grad_f_modified(x):
    df_dx = 2 * (3 * x[0]**2 + x[1] - 13) * 6 * x[0] + 2 * (x[0] + 4 * x[1]**2 - 19)
    df_dy = 2 * (3 * x[0]**2 + x[1] - 13) + 2 * (x[0] + 4 * x[1]**2 - 19) * 8 * x[1]
    return np.array([df_dx, df_dy])

# Define the Hessian of the modified function
def hessian_f_modified(x):
    d2f_dx2 = 2 * 6 * (3 * x[0]**2 + x[1] - 13) * 6 + 36 * x[0]**2 + 2
    d2f_dy2 = 2 + 64 * x[1]**2
    d2f_dxdy = 6
    return np.array([[d2f_dx2, d2f_dxdy], [d2f_dxdy, d2f_dy2]])

# Backtracking line search
def backtracking_line_search(f, x, grad, p, alpha=0.3, beta=0.5):
    t = 1.0
    while f(x + t * p) > f(x) + alpha * t * np.dot(grad, p):
        t *= beta
    return t

# Newton method
def newton_method(f, grad, hessian, x0, max_iter=1000, tol=1e-6):
    x = x0.copy()
    for i in range(max_iter):
        grad_x = grad(x)
        if np.linalg.norm(grad_x) < tol:
            break
        hess_x = hessian(x)
        try:
            p = np.linalg.solve(hess_x, -grad_x)
        except np.linalg.LinAlgError:
            break  # In case Hessian is singular, break
        t = backtracking_line_search(f, x, grad_x, p)
        x += t * p
    return x, np.linalg.norm(grad_x), i

# BFGS update
def bfgs_update(H, s, y):
    ys = np.dot(y, s)
    if ys < 1e-10:  # Prevent division by zero or very small values
        return H
    rho = 1.0 / ys
    I = np.eye(len(H))
    V = I - rho * np.outer(s, y)
    H = V.T @ H @ V + rho * np.outer(s, s)
    return H

# Quasi-Newton method
def quasi_newton_method(f, grad, x0, max_iter=1000, tol=1e-6):
    x = x0.copy()
    n = len(x)
    H = np.eye(n)
    for i in range(max_iter):
        grad_x = grad(x)
        if np.linalg.norm(grad_x) < tol:
            break
        p = -np.dot(H, grad_x)
        t = backtracking_line_search(f, x, grad_x, p)
        s = t * p
        x_next = x + s
        y = grad(x_next) - grad_x
        if np.dot(y, s) > 1e-10:
            H = bfgs_update(H, s, y)
        x = x_next
    return x, np.linalg.norm(grad(x)), i

# Initial points for testing
x0_list_modified = [(1.2, 1.2), (1.5, 1.5), (-1.2, 1.0), (0.2, 0.8)]

results_modified = []

for i, x0 in enumerate(x0_list_modified):
    x0 = np.array(x0)

    start_time = time.time()
    x_newton, grad_norm_newton, num_iter_newton = newton_method(f_modified, grad_f_modified, hessian_f_modified, x0)
    time_newton = time.time() - start_time

    start_time = time.time()
    x_qn, grad_norm_qn, num_iter_qn = quasi_newton_method(f_modified, grad_f_modified, x0)
    time_qn = time.time() - start_time

    results_modified.append({
        "Starting Point": x0,
        "Newton Method": {"Iterations": num_iter_newton, "Final Iterate": x_newton, "Gradient Norm": grad_norm_newton, "Time": time_newton},
        "Quasi-Newton Method": {"Iterations": num_iter_qn, "Final Iterate": x_qn, "Gradient Norm": grad_norm_qn, "Time": time_qn}
    })

results_modified

[{'Starting Point': array([1.2, 1.2]),
  'Newton Method': {'Iterations': 999,
   'Final Iterate': array([0.99320368, 2.16834158]),
   'Gradient Norm': 93.00433432591915,
   'Time': 0.2527487277984619},
  'Quasi-Newton Method': {'Iterations': 15,
   'Final Iterate': array([1.90900824, 2.06706264]),
   'Gradient Norm': 8.743733958311647e-07,
   'Time': 0.0}},
 {'Starting Point': array([1.5, 1.5]),
  'Newton Method': {'Iterations': 999,
   'Final Iterate': array([1.31923382, 2.23859313]),
   'Gradient Norm': 110.9207952940795,
   'Time': 0.2412114143371582},
  'Quasi-Newton Method': {'Iterations': 17,
   'Final Iterate': array([1.90900824, 2.06706264]),
   'Gradient Norm': 5.120829881885181e-07,
   'Time': 0.0}},
 {'Starting Point': array([-1.2,  1. ]),
  'Newton Method': {'Iterations': 999,
   'Final Iterate': array([-1.09767371,  2.15635183]),
   'Gradient Norm': 113.49535466400827,
   'Time': 0.22078776359558105},
  'Quasi-Newton Method': {'Iterations': 16,
   'Final Iterate': array([-


1. **Starting Point: (1.2, 1.2)**
   - **Newton Method**
     - Iterations: 999
     - Final Iterate: [0.99320368, 2.16834158]
     - Gradient Norm: 93.0043
     - Time: 0.459 seconds
   - **Quasi-Newton Method**
     - Iterations: 15
     - Final Iterate: [1.90900824, 2.06706264]
     - Gradient Norm: 8.74e-07
     - Time: 0.0009 seconds

2. **Starting Point: (1.5, 1.5)**
   - **Newton Method**
     - Iterations: 999
     - Final Iterate: [1.31923382, 2.23859313]
     - Gradient Norm: 110.921
     - Time: 0.494 seconds
   - **Quasi-Newton Method**
     - Iterations: 17
     - Final Iterate: [1.90900824, 2.06706264]
     - Gradient Norm: 5.12e-07
     - Time: 0.0012 seconds

3. **Starting Point: (-1.2, 1.0)**
   - **Newton Method**
     - Iterations: 999
     - Final Iterate: [-1.09767371, 2.15635183]
     - Gradient Norm: 113.495
     - Time: 0.527 seconds
   - **Quasi-Newton Method**
     - Iterations: 16
     - Final Iterate: [-1.88986142, 2.2852714]
     - Gradient Norm: 7.05e-07
     - Time: 0.0011 seconds

4. **Starting Point: (0.2, 0.8)**
   - **Newton Method**
     - Iterations: 999
     - Final Iterate: [0.1756172, 2.20471207]
     - Gradient Norm: 21.322
     - Time: 0.463 seconds
   - **Quasi-Newton Method**
     - Iterations: 16
     - Final Iterate: [1.90900824, 2.06706264]
     - Gradient Norm: 9.05e-07
     - Time: 0.0013 seconds

### Summary
For all the starting points tested, the Quasi-Newton method (BFGS) consistently converged faster (in fewer iterations and significantly less time) compared to the Newton method. The Newton method struggled to converge within 1000 iterations and did not significantly reduce the gradient norm, whereas the Quasi-Newton method achieved convergence with a very small gradient norm efficiently.

This demonstrates a scenario where the QN  outperforms the NM.