$\textbf{1.}$ What are the minimizer and the minimum function value of $f(x)$ and $g(x)$? Are both functions convex? What
is a suitable initial choice of $B$ (denoted by $B_0$, i.e. the replacement of the first ?? in Algorithm 3)? Justify
with proper reasons.




For $f(x) = \sum_{i=1}^{n-1} [4(x_i^2 - x_{i+1})^2 + (x_i - 1)^2]$ and $1 < i < n$:
$
\frac{\partial f}{\partial x_i} = 16 x_i (x_i^2 - x_{i+1}) + 2(x_i - 1) - 8(x_{i-1}^2 - x_i) = 0
$

For $i = 1$ the last term is absent, and for $i = n$ only $-8(x_{n-1}^2 - x_n)$ remains.

Substitute $x_i = 1$ for all $i$:
$
\frac{\partial f}{\partial x_i} \Big|_{x=1} = 16(1 - 1) + 2(1 - 1) - 8(1 - 1) = 0
$

Every bracket vanishes, so the gradient of $f$ is zero at $x = (1, \dots, 1)$.

For $g(x) = \sum_{i=1}^{n} [(x_1 - x_i^2)^2 + (x_i - 1)^2]$ and $j \geq 2$:
$
\frac{\partial g}{\partial x_j} = -4 x_j (x_1 - x_j^2) + 2(x_j - 1) = 0
$

The variable $x_1$ appears in every term of the sum:
$
\frac{\partial g}{\partial x_1} = 2(x_1 - x_1^2)(1 - 2x_1) + 2(x_1 - 1) + 2\sum_{i=2}^{n} (x_1 - x_i^2) = 0
$

Substitute $x_i = 1$ for all $i$:
$
\frac{\partial g}{\partial x_j} \Big|_{x=1} = -4(1 - 1) + 2(1 - 1) = 0, \qquad \frac{\partial g}{\partial x_1} \Big|_{x=1} = 2(1 - 1)(1 - 2) + 2(1 - 1) + 2\sum_{i=2}^{n} (1 - 1) = 0
$

Therefore, $x_i = 1$ for all $i$ is a stationary point of both functions. Both are sums of squares, hence non-negative, and both vanish at this point, so $x_i = 1$ is a global minimizer of $f(x)$ and $g(x)$.

For $f(x)$:
$f(x) = \sum_{i=1}^{n-1} [4(x_i^2 - x_{i+1})^2 + (x_i - 1)^2]$

Substitute $x_i = 1$ for all $i$:
$f(x) = \sum_{i=1}^{n-1} [4(1^2 - 1)^2 + (1 - 1)^2] = 0$

Every term is a square, so $f(x) \geq 0$ for all $x$ and the minimum value of $f(x)$ is $0$. Conversely, $f(x) = 0$ forces $x_i = 1$ for $i \leq n-1$ and $x_n = x_{n-1}^2 = 1$, so the minimizer is unique.

For $g(x)$:
$g(x) = \sum_{i=1}^{n} [(x_1 - x_i^2)^2 + (x_i - 1)^2]$

Substitute $x_i = 1$ for all $i$:
$g(x) = \sum_{i=1}^{n} [(1 - 1^2)^2 + (1 - 1)^2] = 0$

Again every term is a square, so $g(x) \geq 0$ and the minimum value of $g(x)$ is $0$. Since $g(x) = 0$ forces $x_i = 1$ for all $i$, the minimizer is unique.

Therefore, when $x_i = 1$ for all $i$, both $f(x)$ and $g(x)$ achieve their minimum values of $0$.
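
On the convexity question, neither function is convex. A single violation of the midpoint inequality $h(\tfrac{a+b}{2}) \leq \tfrac{1}{2}(h(a) + h(b))$ is enough to rule convexity out. The following is a minimal numerical sketch for $n = 2$; the helper `violates_midpoint_convexity` and the test points are illustrative choices made here, not part of the assignment.

```python
import numpy as np

def f(x):
    # f(x) = sum_{i=1}^{n-1} [4 (x_i^2 - x_{i+1})^2 + (x_i - 1)^2]
    return np.sum(4 * (x[:-1]**2 - x[1:])**2 + (x[:-1] - 1)**2)

def g(x):
    # g(x) = sum_{i=1}^{n} [(x_1 - x_i^2)^2 + (x_i - 1)^2]
    return np.sum((x[0] - x**2)**2 + (x - 1)**2)

def violates_midpoint_convexity(h, a, b):
    # A convex h must satisfy h((a + b) / 2) <= (h(a) + h(b)) / 2
    return h((a + b) / 2) > (h(a) + h(b)) / 2

# Hand-picked test points for n = 2 (assumptions for illustration)
a_f, b_f = np.array([-0.5, 1.0]), np.array([0.5, 1.0])
a_g, b_g = np.array([1.0, -0.5]), np.array([1.0, 0.5])

print(violates_midpoint_convexity(f, a_f, b_f))  # True: f is not convex
print(violates_midpoint_convexity(g, a_g, b_g))  # True: g is not convex
```

The same conclusion follows from second derivatives: $\partial^2 f / \partial x_1^2 = 48x_1^2 - 16x_2 + 2$ is negative at $(x_1, x_2) = (0, 1)$, and $\partial^2 g / \partial x_2^2 = 12x_2^2 - 4x_1 + 2$ is negative at $(x_1, x_2) = (1, 0)$.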

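**Initial Choice of $B_0$:**
A common and often effective choice for the initial Hessian approximation $B_0$ is the identity matrix $I$.

**Justification:**
1. **Positive definite:** The identity matrix is symmetric positive definite, so the first search direction $p_0 = -\nabla f(x_0)$ is a descent direction (the steepest-descent direction).
2. **Simplicity:** Choosing $B_0 = I$ needs no second-derivative information and no linear solve in the first iteration.
3. **General applicability:** The choice does not depend on the problem, and the BFGS update keeps $B_k$ positive definite as long as the curvature condition $y_k^T s_k > 0$ holds.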
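
To illustrate the first justification, the sketch below checks that with $B_0 = I$ the first quasi-Newton direction coincides with steepest descent and has a negative slope. The starting point `x0` is an arbitrary illustrative choice, and `grad_f` is the gradient implied by the definition of $f$; neither is taken from Algorithm 3 itself.

```python
import numpy as np

def grad_f(x):
    # Gradient of f(x) = sum_{i=1}^{n-1} [4 (x_i^2 - x_{i+1})^2 + (x_i - 1)^2]
    r = x[:-1]**2 - x[1:]
    grad = np.zeros_like(x)
    grad[:-1] = 16 * x[:-1] * r + 2 * (x[:-1] - 1)
    grad[1:] -= 8 * r
    return grad

x0 = np.zeros(5)                 # illustrative starting point (an assumption)
B0 = np.eye(len(x0))             # initial Hessian approximation
g0 = grad_f(x0)
p0 = -np.linalg.solve(B0, g0)    # quasi-Newton direction: solve B0 p = -g

print(np.all(np.linalg.eigvalsh(B0) > 0))  # B0 is positive definite
print(np.allclose(p0, -g0))                # identical to steepest descent
print(g0 @ p0 < 0)                         # negative slope, so a descent direction
```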







In [None]:
import numpy as np
from scipy.optimize import minimize

# f(x) = sum_{i=1}^{n-1} [4 (x_i^2 - x_{i+1})^2 + (x_i - 1)^2]
def f(x):
    return np.sum(4 * (x[:-1]**2 - x[1:])**2 + (x[:-1] - 1)**2)

# g(x) = sum_{i=1}^{n} [(x_1 - x_i^2)^2 + (x_i - 1)^2]
def g(x):
    return np.sum((x[0] - x**2)**2 + (x - 1)**2)

# Gradient of f(x)
def grad_f(x):
    gradient = np.zeros_like(x, dtype=float)
    r = x[:-1]**2 - x[1:]
    gradient[:-1] = 16 * x[:-1] * r + 2 * (x[:-1] - 1)
    gradient[1:] -= 8 * r
    return gradient

# Gradient of g(x)
def grad_g(x):
    r = x[0] - x**2
    gradient = -4 * x * r + 2 * (x - 1)
    gradient[0] += 2 * np.sum(r)  # x_1 appears in every term of the sum
    return gradient

# Algorithm 3: BFGS, with B approximating the Hessian
def bfgs_algorithm(func, grad, x0, tolerance, max_iter=1000):
    x = np.asarray(x0, dtype=float)
    B = np.eye(len(x))  # Initial choice of B_0 as the identity matrix
    k = 0

    while np.linalg.norm(grad(x)) > tolerance and k < max_iter:
        gk = grad(x)
        pk = -np.linalg.solve(B, gk)  # B approximates the Hessian, so solve B p = -g
        # Approximate exact line search along pk
        alpha_k = minimize(lambda alpha: func(x + alpha[0] * pk), 1.0).x[0]
        x_new = x + alpha_k * pk
        sk = x_new - x
        yk = grad(x_new) - gk

        # BFGS update of the Hessian approximation; skipped if y^T s > 0 fails
        if np.dot(yk, sk) > 1e-12:
            Bs = B.dot(sk)
            B = B + np.outer(yk, yk) / np.dot(yk, sk) - np.outer(Bs, Bs) / np.dot(sk, Bs)

        x = x_new
        k += 1

    return x, func(x), k

# Initial guess (this is already the minimizer, so the loop exits immediately)
initial_guess = np.ones(5)

# Set tolerance for stopping criterion
tolerance = 1e-6

# Run BFGS algorithm for function f(x)
result_f = bfgs_algorithm(f, grad_f, initial_guess, tolerance)

# Run BFGS algorithm for function g(x)
result_g = bfgs_algorithm(g, grad_g, initial_guess, tolerance)

# Display results
print("Results for f(x):")
print("Minimizer:", result_f[0])
print("Minimum function value:", result_f[1])
print("Number of iterations:", result_f[2])

print("\nResults for g(x):")
print("Minimizer:", result_g[0])
print("Minimum function value:", result_g[1])
print("Number of iterations:", result_g[2])

Results for f(x):
Minimizer: [1. 1. 1. 1. 1.]
Minimum function value: 0.0
Number of iterations: 0

Results for g(x):
Minimizer: [1. 1. 1. 1. 1.]
Minimum function value: 0.0
Number of iterations: 0


$\textbf{Question 2.}$

In [None]:
import numpy as np
import time
from scipy.optimize import minimize

# Objective: f(x) = sum_{i=1}^{n-1} [4 (x_i^2 - x_{i+1})^2 + (x_i - 1)^2]
def f(x):
    return np.sum(4 * (x[:-1]**2 - x[1:])**2 + (x[:-1] - 1)**2)

# Gradient of f(x)
def gradient_f(x):
    grad = np.zeros_like(x)
    r = x[:-1]**2 - x[1:]
    # Term i depends on x_i through both squares
    grad[:-1] += 16 * x[:-1] * r + 2 * (x[:-1] - 1)
    # Term i depends on x_{i+1} through the first square only
    grad[1:] += -8 * r
    return grad

# Quasi-Newton minimization with SciPy's L-BFGS-B (limited-memory BFGS with a built-in line search)
def bfgs_with_line_search(func, grad_func, x0):
    start_time = time.perf_counter()

    result = minimize(func, x0, jac=grad_func, method='L-BFGS-B', options={'maxiter': 100})

    elapsed_time = time.perf_counter() - start_time

    return result.x, result.fun, elapsed_time

# Test for different values of n
dimensions = [1000, 2500, 5000, 7500, 10000]
results = []

# Display header for the results table
print("{:<10} {:<20} {:<20}".format("Dimension", "Minimum Value", "Time Taken (s)"))
print("="*50)

for n in dimensions:
    x0 = np.zeros(n)
    minimizer, min_value, elapsed_time = bfgs_with_line_search(f, gradient_f, x0)
    results.append((n, min_value, elapsed_time))
    print("{:<10} {:<20} {:<20}".format(n, min_value, elapsed_time))


Dimension  Minimum Value        Time Taken (s)      
1000       3.886048115018858    0.01813983917236328 
2500       3.8866667640074484   0.012986421585083008
5000       3.8890669551610544   0.024234771728515625
7500       3.8886473663915497   0.12154912948608398 
10000      3.888699500343477    0.03904128074645996 
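
The minimum value of $f$ is $0$, so reported values near $3.89$ indicate that the run stopped away from the minimizer. Possible causes include the iteration cap or a mismatch between the objective and its analytic gradient. One hedged way to check the latter is to compare the analytic gradient with central finite differences. In the sketch below, `grad_f` follows from the definition of $f$, while `fd_gradient`, the step `h`, and the test point are illustrative assumptions.

```python
import numpy as np

def f(x):
    return np.sum(4 * (x[:-1]**2 - x[1:])**2 + (x[:-1] - 1)**2)

def grad_f(x):
    r = x[:-1]**2 - x[1:]
    grad = np.zeros_like(x)
    grad[:-1] = 16 * x[:-1] * r + 2 * (x[:-1] - 1)
    grad[1:] -= 8 * r
    return grad

def fd_gradient(func, x, h=1e-6):
    # Central differences, one coordinate at a time
    grad = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        grad[i] = (func(x + e) - func(x - e)) / (2 * h)
    return grad

x = np.linspace(-1.0, 1.0, 6)   # illustrative test point
error = np.max(np.abs(grad_f(x) - fd_gradient(f, x)))
print(error < 1e-5)             # small error means the two gradients agree
```

A large discrepancy in this check would point at the gradient rather than at the optimizer.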


$\textbf{Question 3.}$

In [None]:
import numpy as np
import time
from scipy.optimize import minimize

# Objective: g(x) = sum_{i=1}^{n} [(x_1 - x_i^2)^2 + (x_i - 1)^2]
def g(x):
    return np.sum((x[0] - x**2)**2 + (x - 1)**2)

# Gradient of g(x)
def gradient_g(x):
    r = x[0] - x**2
    # Term j depends on x_j through both squares
    grad = -4 * x * r + 2 * (x - 1)
    # x_1 also appears in the first square of every term
    grad[0] += 2 * np.sum(r)
    return grad

# Quasi-Newton minimization with SciPy's L-BFGS-B (limited-memory BFGS with a built-in line search)
def bfgs_with_line_search(func, grad_func, x0):
    start_time = time.perf_counter()

    result = minimize(func, x0, jac=grad_func, method='L-BFGS-B', options={'maxiter': 100})

    elapsed_time = time.perf_counter() - start_time

    return result.x, result.fun, elapsed_time

# Test for different values of n
dimensions = [1000, 2500, 5000, 7500, 10000]
results = []

# Display header for the results table
print("{:<10} {:<20} {:<20}".format("Dimension", "Minimum Value", "Time Taken (s)"))
print("="*50)

for n in dimensions:
    x0 = np.zeros(n)
    minimizer, min_value, elapsed_time = bfgs_with_line_search(g, gradient_g, x0)
    results.append((n, min_value, elapsed_time))
    print("{:<10} {:<20} {:<20}".format(n, min_value, elapsed_time))


Dimension  Minimum Value        Time Taken (s)      
1000       999.0                0.003514528274536133
2500       2499.0               0.004128932952880859
5000       4999.0               0.007193088531494141
7500       7499.0               0.010080575942993164
10000      9999.0               0.013151884078979492
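
Each reported value equals $n - 1$, far from the minimum value $0$ of $g$, which suggests the solver stopped at or near the starting point $x_0 = 0$. Stopping there would only be justified if the gradient vanished at the origin. The quick sketch below, which assumes $g$ as defined in the first question and uses a gradient derived from that definition, suggests that it does not.

```python
import numpy as np

def g(x):
    # g(x) = sum_{i=1}^{n} [(x_1 - x_i^2)^2 + (x_i - 1)^2]
    return np.sum((x[0] - x**2)**2 + (x - 1)**2)

def grad_g(x):
    r = x[0] - x**2
    grad = -4 * x * r + 2 * (x - 1)
    grad[0] += 2 * np.sum(r)
    return grad

x0 = np.zeros(5)
print(grad_g(x0))                   # every component is -2
print(np.linalg.norm(grad_g(x0)))   # nonzero, so x0 = 0 is not stationary

# One small steepest-descent step (step length 0.1 is an arbitrary choice)
step = x0 - 0.1 * grad_g(x0)
print(g(step) < g(x0))              # the objective decreases
```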


$\textbf{Question 4.}$

In [None]:
import numpy as np
import time

# f(x) = sum_{i=1}^{n-1} [4 (x_i^2 - x_{i+1})^2 + (x_i - 1)^2]
def objective_function(x):
    return np.sum(4 * (x[:-1]**2 - x[1:])**2 + (x[:-1] - 1)**2)

def gradient(x):
    r = x[:-1]**2 - x[1:]
    grad = np.zeros(len(x))
    grad[:-1] = 16 * x[:-1] * r + 2 * (x[:-1] - 1)
    grad[1:] -= 8 * r
    return grad

def hessian(x):
    # Tridiagonal, because term i couples only x_i and x_{i+1}.
    # Stored densely here, which costs O(n^2) memory.
    n = len(x)
    hess = np.zeros((n, n))
    idx = np.arange(n - 1)
    hess[idx, idx] = 48 * x[:-1]**2 - 16 * x[1:] + 2
    hess[idx + 1, idx + 1] += 8
    hess[idx, idx + 1] = -16 * x[:-1]
    hess[idx + 1, idx] = -16 * x[:-1]  # Symmetric
    return hess

def preconditioned_gradient(x, preconditioner):
    return preconditioner @ gradient(x)

def preconditioned_hessian(x, preconditioner):
    return preconditioner @ hessian(x) @ preconditioner.T

def backtracking_line_search(x, grad_x, direction, alpha0, rho, gamma):
    # Shrink alpha until the sufficient-decrease (Armijo) condition holds
    alpha = alpha0
    fx = objective_function(x)
    slope = np.dot(grad_x, direction)
    while objective_function(x + alpha * direction) > fx + gamma * alpha * slope:
        alpha = rho * alpha
    return alpha

def newtons_method_with_preconditioning(x0, tolerance, preconditioner, max_iter=100):
    x = x0.copy()
    k = 0
    start_time = time.time()

    while np.linalg.norm(gradient(x)) > tolerance and k < max_iter:
        precond_gradient = preconditioned_gradient(x, preconditioner)
        precond_hessian = preconditioned_hessian(x, preconditioner)

        # Newton step in the preconditioned variables, mapped back to x
        direction = -preconditioner.T @ np.linalg.solve(precond_hessian, precond_gradient)
        step_size = backtracking_line_search(x, gradient(x), direction, alpha0=0.9, rho=0.5, gamma=0.5)
        x = x + step_size * direction
        k += 1

    time_taken = time.time() - start_time

    return x, k, time_taken

# Test for different values of n
n_values = [1000, 2500, 5000, 7500, 10000]
tolerance = 1e-6  # Stopping tolerance

# Display results in a table
print("{:<10} {:<25} {:<15}".format("n", "Minimizer", "Time Taken (s)"))
for n in n_values:
    x0 = np.zeros(n)
    preconditioner = np.eye(n)  # Identity matrix as a simple preconditioner

    minimizer, iterations, time_taken = newtons_method_with_preconditioning(x0, tolerance, preconditioner)
    print("{:<10} {:<25} {:<15}".format(n, str(minimizer), time_taken))


NameError: name 'preconditioner' is not defined
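
Newton's method depends on a correct Hessian of $f$. Because each term of $f$ couples only $x_i$ and $x_{i+1}$, that Hessian is tridiagonal. The sketch below compares an analytic tridiagonal Hessian with central finite differences of the gradient; the names `hess_f` and `fd_hessian`, the step `h`, and the test point are assumptions made for illustration.

```python
import numpy as np

def grad_f(x):
    r = x[:-1]**2 - x[1:]
    grad = np.zeros_like(x)
    grad[:-1] = 16 * x[:-1] * r + 2 * (x[:-1] - 1)
    grad[1:] -= 8 * r
    return grad

def hess_f(x):
    n = len(x)
    H = np.zeros((n, n))
    i = np.arange(n - 1)
    H[i, i] = 48 * x[:-1]**2 - 16 * x[1:] + 2   # d^2/dx_i^2 of term i
    H[i + 1, i + 1] += 8                        # d^2/dx_{i+1}^2 of term i
    H[i, i + 1] = -16 * x[:-1]                  # mixed derivative
    H[i + 1, i] = -16 * x[:-1]                  # symmetric counterpart
    return H

def fd_hessian(grad, x, h=1e-6):
    # Column j is the central difference of the gradient along coordinate j
    n = len(x)
    H = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        H[:, j] = (grad(x + e) - grad(x - e)) / (2 * h)
    return H

x = np.linspace(-1.0, 1.0, 6)   # illustrative test point
print(np.allclose(hess_f(x), fd_hessian(grad_f, x), atol=1e-5))

# At the minimizer x = (1, ..., 1) the Hessian is positive definite
print(np.all(np.linalg.eigvalsh(hess_f(np.ones(6))) > 0))
```

Since only three diagonals are nonzero, a banded or sparse solver would avoid the $O(n^2)$ storage of a dense matrix for large $n$.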