The given function is: \
$\hspace{6ex} q(\mathbf{x}) = \sqrt{x_1^2 + 1} + \sqrt{x_2^2 + 1}$.

$\Rightarrow \ \text{Hessian} = \nabla^2 q(\mathbf{x}) =
\begin{bmatrix}
  \frac{\partial^2 q}{\partial x_1^2}(\mathbf{x}) &
    \frac{\partial^2 q}{\partial x_1 \partial x_2}(\mathbf{x}) \\
  \frac{\partial^2 q}{\partial x_2 \partial x_1}(\mathbf{x}) &
    \frac{\partial^2 q}{\partial x_2^2}(\mathbf{x})
\end{bmatrix}
=
\begin{bmatrix}
  \frac{1}{(x_1^2+1)^{3/2}} & 0 \\
  0 & \frac{1}{(x_2^2+1)^{3/2}}
\end{bmatrix}$
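
For reference, the intermediate steps behind this Hessian can be written out as follows. This is only a short derivation sketch that uses nothing beyond the definition of $q$ above; the index $i \in \{1, 2\}$ is introduced here for compactness.

$$
\frac{\partial q}{\partial x_i} = \frac{x_i}{\sqrt{x_i^2+1}}, \qquad
\frac{\partial^2 q}{\partial x_i^2}
= \frac{\sqrt{x_i^2+1} - \dfrac{x_i^2}{\sqrt{x_i^2+1}}}{x_i^2+1}
= \frac{1}{(x_i^2+1)^{3/2}}, \qquad
\frac{\partial^2 q}{\partial x_1 \partial x_2} = 0 .
$$

Both diagonal entries are strictly positive for every $\mathbf{x}$, so the Hessian is positive definite everywhere and $q$ is strictly convex, with its unique minimizer at $\mathbf{x} = (0, 0)$ where $q = 2$.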

In [32]:
import numpy as np 

#method to find Hessian matrix
def evalh(x): 
  assert type(x) is np.ndarray 
  assert len(x) == 2 
  return np.array([[(x[0]**2+1)**(-1.5), 0], [0, (x[1]**2+1)**(-1.5)]])

In [33]:
def evalf(x):  
  #Input: x is a numpy array of size 2 
  assert type(x) is np.ndarray and len(x) == 2 #do not allow arbitrary arguments 
  #after checking if the argument is valid, we can compute the objective function value
  #compute the function value and return it 
  return (x[0]**2+1)**0.5 + (x[1]**2+1)**0.5

In [34]:
def evalg(x):  
  #Input: x is a numpy array of size 2 
  assert type(x) is np.ndarray and len(x) == 2 #do not allow arbitrary arguments 
  #after checking if the argument is valid, we can compute the gradient value
  #compute the gradient value and return it 
  return np.array([x[0]/(x[0]**2+1)**0.5, x[1]/(x[1]**2+1)**0.5])
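
Before using the analytic gradient and Hessian inside an optimizer, it can help to compare them against finite differences. The block below is a minimal, self-contained sketch of such a check. The names `q`, `grad_q`, `hess_q`, `central_diff_grad` and `central_diff_jac`, the test point, and the step `h` are illustrative choices made here, not part of the notebook.

```python
import numpy as np

def q(x):
    # Same objective as above: sum of sqrt(x_i^2 + 1)
    return np.sum(np.sqrt(x**2 + 1.0))

def grad_q(x):
    # Analytic gradient: x_i / sqrt(x_i^2 + 1)
    return x / np.sqrt(x**2 + 1.0)

def hess_q(x):
    # Analytic (diagonal) Hessian: (x_i^2 + 1)^(-3/2)
    return np.diag((x**2 + 1.0) ** (-1.5))

def central_diff_grad(f, x, h=1e-6):
    # Central-difference approximation of the gradient of a scalar function
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

def central_diff_jac(g, x, h=1e-6):
    # Central-difference approximation of the Jacobian of a vector function
    J = np.zeros((x.size, x.size))
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        J[:, i] = (g(x + e) - g(x - e)) / (2.0 * h)
    return J

x_test = np.array([0.7, -1.3])  # arbitrary test point
grad_err = np.max(np.abs(central_diff_grad(q, x_test) - grad_q(x_test)))
hess_err = np.max(np.abs(central_diff_jac(grad_q, x_test) - hess_q(x_test)))
print("max gradient error:", grad_err)
print("max Hessian error:", hess_err)
```

Both errors should be small compared with the entries themselves; a large value would point to a mistake in the analytic formulas.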

In [35]:
def compute_D_k(x):
  assert type(x) is np.ndarray
  assert len(x) == 2
  H = evalh(x) #evaluate the Hessian once and reuse it
  if np.linalg.det(H) == 0:
    raise ValueError('Hessian is singular (determinant is zero), so it cannot be inverted. Please check!!')
  return np.linalg.inv(H)  #inverse of the Hessian, used as the scaling matrix D_k
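
As an aside, the Newton direction can also be obtained without forming the inverse explicitly, by solving the linear system $\nabla^2 q(\mathbf{x})\, d = -\nabla q(\mathbf{x})$. The sketch below shows one way this might look; `newton_direction` is a hypothetical helper introduced here and is not used elsewhere in the notebook.

```python
import numpy as np

def newton_direction(hessian, grad):
    # Hypothetical helper: solve H d = -grad instead of computing inv(H).
    # np.linalg.solve raises numpy.linalg.LinAlgError if H is exactly singular.
    if not np.all(np.isfinite(hessian)):
        raise ValueError("Hessian contains non-finite entries")
    return np.linalg.solve(hessian, -grad)

x = np.array([1.0, 1.0])
H = np.diag((x**2 + 1.0) ** (-1.5))   # Hessian of q at x
g = x / np.sqrt(x**2 + 1.0)           # gradient of q at x
d = newton_direction(H, g)
print("Newton direction at (1, 1):", d)
```

At the point $(1, 1)$ the direction is approximately $(-2, -2)$, which agrees with multiplying the negative gradient by the inverse Hessian.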

In [36]:
def compute_steplength_backtracking_Newton_method(x, gradf, alpha_start, rho, gamma):
  assert type(x) is np.ndarray and len(x) == 2
  assert type(gradf) is np.ndarray and len(gradf) == 2
  assert type(alpha_start) is float and alpha_start > 0. #a zero step would never move x
  assert type(rho) is float and 0. < rho < 1. #rho must shrink alpha, otherwise the loop cannot terminate
  assert type(gamma) is float and 0. < gamma < 1.
  alpha = alpha_start
  D_k = compute_D_k(x)
  direction = np.matmul(D_k, -gradf) #Newton direction: -inverse(Hessian) * gradient
  slope = np.dot(gradf, direction) #directional derivative along the Newton direction
  #shrink alpha until the sufficient decrease condition holds
  while evalf(x + alpha*direction) > evalf(x) + gamma*alpha*slope:
    alpha = rho*alpha
  return alpha
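
In symbols, the loop in this function returns the first step length in the sequence $\alpha_{\text{start}},\ \rho\,\alpha_{\text{start}},\ \rho^2\alpha_{\text{start}}, \dots$ that satisfies the sufficient decrease condition below. This is only a restatement of the code; the symbols $d_k$ and $D_k$ are shorthand introduced here for the Newton direction and the inverse Hessian.

$$
q(\mathbf{x}_k + \alpha\, d_k) \;\le\; q(\mathbf{x}_k) + \gamma\,\alpha\,\nabla q(\mathbf{x}_k)^\top d_k,
\qquad
d_k = -D_k\,\nabla q(\mathbf{x}_k),
\qquad
D_k = \left[\nabla^2 q(\mathbf{x}_k)\right]^{-1}.
$$

Since $D_k$ is positive definite for this function, $\nabla q(\mathbf{x}_k)^\top d_k < 0$ whenever the gradient is nonzero, so the right-hand side lies below $q(\mathbf{x}_k)$ and a sufficiently small $\alpha$ always satisfies the condition.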

In [37]:
#line search type 
CONSTANT_STEP_LENGTH = 1
BACKTRACKING_LINE_SEARCH = 2
BACKTRACKING_LINE_SEARCH_SCALING = 3

In [47]:
#complete the code for gradient descent with scaling to find the minimizer

def find_minimizer_Newton_method(start_x, tol, line_search_type, *args):
  #Input: start_x is a numpy array of size 2, tol denotes the tolerance and is a positive float value
  assert type(start_x) is np.ndarray and len(start_x) == 2 #do not allow arbitrary arguments 
  assert type(tol) is float and tol>=0 
  x = start_x
  g_x = evalg(x)

  #initialization for backtracking line search
  if(line_search_type == BACKTRACKING_LINE_SEARCH):
    alpha_start = args[0]
    rho = args[1]
    gamma = args[2]

  k = 0
  while (np.linalg.norm(g_x) > tol): #continue as long as the norm of gradient is not close to zero up to a tolerance tol
    D_k = compute_D_k(x) #inverse of the Hessian at the current point
    if line_search_type == CONSTANT_STEP_LENGTH: #pure Newton step with unit step length
      step_length = 1.0
    elif line_search_type == BACKTRACKING_LINE_SEARCH:
      step_length = compute_steplength_backtracking_Newton_method(x, g_x, alpha_start, rho, gamma) #backtracking along the Newton direction
    else:  
      raise ValueError('Line search type unknown. Please check!')    
    #Newton update: x = x - step_length*D_k*g_x
    x = np.subtract(x, np.multiply(step_length,np.matmul(D_k, g_x)))
    k += 1 #increment iteration
    #print('iter:',k)
    g_x = evalg(x) #compute gradient at new point
  return x, k, evalf(x)  

In [48]:
my_start_x = np.array([1.0,1.0])
my_tol= 1e-9

**2. ANSWER:**

In [25]:
print("For Newton's Method with CONSTANT_STEP_LENGTH procedure :")
x_opt, k, f_value = find_minimizer_Newton_method(my_start_x, my_tol, CONSTANT_STEP_LENGTH)
print("Minimizer = ",x_opt,",  Iteration = ",k,",  Minimum function value = ", f_value) 

print("\nFor Newton's Method with BACKTRACKING_LINE_SEARCH :")
x_opt_bls, k, f_value = find_minimizer_Newton_method(my_start_x, my_tol, BACKTRACKING_LINE_SEARCH, 1.0, 0.5,0.5)
print("Minimizer = ",x_opt_bls,",  Iteration = ", k , ",  Minimum function value = ",f_value)

For Newton's Method with CONSTANT_STEP_LENGTH procedure :
Minimizer =  [3.93389978e-12 3.93389978e-12] ,  Iteration =  36 ,  Minimum function value =  2.0

For Newton's Method with BACKTRACKING_LINE_SEARCH :
Minimizer =  [2.22044605e-16 2.22044605e-16] ,  Iteration =  1 ,  Minimum function value =  2.0


**COMMENTS:** \
1. From the above output, **Newton's Method with BACKTRACKING_LINE_SEARCH (number of iterations = 1)** reaches the tolerance in far fewer iterations than **Newton's Method with CONSTANT_STEP_LENGTH procedure (number of iterations = 36).** \
2. The minimum function value is the same in both cases, i.e. 2.0. \
3. The two returned points differ only within the stopping tolerance (entries of order $10^{-12}$ versus $10^{-16}$); both approximate the same minimizer $(0, 0)$.
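
The single iteration of the backtracking variant can be traced by hand from the start point $(1, 1)$. The block below is a small, self-contained sketch of that first line search, using the same parameters as the call above ($\alpha_{\text{start}} = 1$, $\rho = 0.5$, $\gamma = 0.5$); the variable names `newton_dir`, `slope` and `rejected` are illustrative and not taken from the notebook.

```python
import numpy as np

def f(x):
    # Objective q(x) = sum of sqrt(x_i^2 + 1)
    return np.sum(np.sqrt(x**2 + 1.0))

x = np.array([1.0, 1.0])
grad = x / np.sqrt(x**2 + 1.0)
# The Hessian is diagonal, so its inverse has entries (x_i^2 + 1)^(3/2)
newton_dir = -grad * (x**2 + 1.0) ** 1.5
slope = grad @ newton_dir  # directional derivative along the Newton direction

alpha, rho, gamma = 1.0, 0.5, 0.5
rejected = []  # step lengths that fail the sufficient decrease test
while f(x + alpha * newton_dir) > f(x) + gamma * alpha * slope:
    rejected.append(alpha)
    alpha *= rho

x_new = x + alpha * newton_dir
print("rejected step lengths:", rejected)
print("accepted step length:", alpha)
print("new point:", x_new)
```

The Newton direction at $(1, 1)$ is about $(-2, -2)$, so the full step lands near $(-1, -1)$, where the function value is unchanged and the step is rejected. The halved step lands essentially on the minimizer $(0, 0)$. With a constant unit step the update is $x_i \leftarrow -x_i^3$, which in exact arithmetic would alternate between $1$ and $-1$ forever; the 36 iterations reported above are consistent with round-off pushing $|x_i|$ slightly below 1, after which the iterates shrink.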

**3. SOLUTION:**

In [49]:
#Complete the module to compute the steplength by using the backtracking line search without scaling.
def compute_steplength_backtracking(x, gradf, alpha_start, rho, gamma): #add appropriate arguments to the function 
  assert type(x) is np.ndarray and len(x) == 2 
  assert type(gradf) is np.ndarray and len(gradf) == 2 
  
  alpha = alpha_start
  gr_t = np.matrix.transpose(gradf)
  #implement the backtracking line search
  while evalf(np.add(x,-alpha*gradf)) > evalf(x)-gamma*alpha*np.matmul(gr_t, gradf):
    alpha = rho*alpha
  #print('final step length:',alpha)
  return alpha

In [50]:
#complete the code for gradient descent to find the minimizer
def find_minimizer_gd(start_x, tol, line_search_type, *args):
  #Input: start_x is a numpy array of size 2, tol denotes the tolerance and is a positive float value
  assert type(start_x) is np.ndarray and len(start_x) == 2 #do not allow arbitrary arguments 
  assert type(tol) is float and tol>=0 
  x = start_x
  g_x = evalg(x)

  #initialization for backtracking line search
  if(line_search_type == BACKTRACKING_LINE_SEARCH):
    alpha_start = args[0]
    rho = args[1]
    gamma = args[2]
    #print('Params for Backtracking LS: alpha start:', alpha_start, 'rho:', rho,' gamma:', gamma)

  k = 0
  #print('iter:',k, ' x:', x, ' f(x):', evalf(x), ' grad at x:', g_x, ' gradient norm:', np.linalg.norm(g_x))

  while (np.linalg.norm(g_x) > tol): #continue as long as the norm of gradient is not close to zero up to a tolerance tol
  
    if line_search_type == BACKTRACKING_LINE_SEARCH:
      step_length = compute_steplength_backtracking(x,g_x, alpha_start,rho, gamma) #call the new function you wrote to compute the steplength
      #raise ValueError('BACKTRACKING LINE SEARCH NOT YET IMPLEMENTED')
    elif line_search_type == CONSTANT_STEP_LENGTH: #do a gradient descent with constant step length
      step_length = 0.1
    else:  
      raise ValueError('Line search type unknown. Please check!')
    
    #implement the gradient descent steps here   
    x = np.subtract(x, np.multiply(step_length,g_x)) #update x = x - step_length*g_x
    k += 1 #increment iteration
    g_x = evalg(x) #compute gradient at new point

    #print('iter:',k, ' x:', x, ' f(x):', evalf(x), ' grad at x:', g_x, ' gradient norm:', np.linalg.norm(g_x))
  return x , k , evalf(x)

**3. ANSWER:**

In [41]:
#check gradient descent (without scaling) with backtracking line search with starting point (1, 1).
print("\nFor BACKTRACKING_LINE_SEARCH WITHOUT_SCALING :")
x_opt_bls, k, f_value = find_minimizer_gd(my_start_x, my_tol, BACKTRACKING_LINE_SEARCH, 1.0, 0.5,0.5)
print("Minimizer = ",x_opt_bls,",  Iteration = ", k , ",   Minimum function value = ",f_value)


For BACKTRACKING_LINE_SEARCH WITHOUT_SCALING :
Minimizer =  [2.78991477e-19 2.78991477e-19] ,  Iteration =  4 ,   Minimum function value =  2.0


**COMMENTS:** \
1. Gradient descent without scaling needs 4 iterations to converge, which is more than Newton's Method with BACKTRACKING_LINE_SEARCH (= 1) and fewer than Newton's Method with CONSTANT_STEP_LENGTH procedure (= 36). \
2. The minimum function value is the same in all the cases (= 2).

**4. SOLUTION:**

In [51]:
my_start_x = np.array([10.0, 10.0])
my_tol= 1e-9

In [45]:
print("\nFor Newton's Method with BACKTRACKING_LINE_SEARCH :")
x_opt_bls, k, f_value = find_minimizer_Newton_method(my_start_x, my_tol, BACKTRACKING_LINE_SEARCH, 1.0, 0.5,0.5)
print("Minimizer = ",x_opt_bls,",  Iteration = ", k , ",  Minimum function value = ",f_value)


For Newton's Method with BACKTRACKING_LINE_SEARCH :
Minimizer =  [-9.92761578e-15 -9.92761578e-15] ,  Iteration =  17 ,  Minimum function value =  2.0


In [52]:
print("For Newton's Method with CONSTANT_STEP_LENGTH procedure :")
x_opt, k, f_value = find_minimizer_Newton_method(my_start_x, my_tol, CONSTANT_STEP_LENGTH)
print("Minimizer = ",x_opt,",  Iteration = ",k,",  Minimum function value = ", f_value) 


For Newton's Method with CONSTANT_STEP_LENGTH procedure :
iter: 1
iter: 2
iter: 3
iter: 4


ValueError: ignored

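**REASON FOR ERROR MESSAGE:** \
With unit step length, the Newton update for this function reduces to $x_i \leftarrow -x_i^3$ in each coordinate, so starting from $(10, 10)$ the iterates grow in magnitude ($10 \to 10^3 \to 10^9 \to 10^{27} \to 10^{81}$) instead of converging. At the 5th iteration the diagonal entries of the $Hessian = \nabla^2 q(\mathbf{x})$, namely $(x_i^2+1)^{-3/2} \approx 10^{-243}$, are so small that the determinant underflows to zero in floating point. `compute_D_k` therefore treats the Hessian as a singular matrix and raises the `ValueError`, so we cannot proceed further with Newton's Method with CONSTANT_STEP_LENGTH procedure.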

**COMMENTS:** \
1. For Newton's Method with CONSTANT_STEP_LENGTH procedure, the iterates diverge and at the 5th iteration the Hessian is numerically singular (its determinant evaluates to zero). \
2. For Newton's Method with BACKTRACKING_LINE_SEARCH, 17 iterations were required to converge.
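
The failure of the constant-step variant can be reproduced in a few lines. The sketch below follows one coordinate of the pure Newton iteration from the start value 10 (both coordinates behave identically here) and records the determinant the 2x2 Hessian would have at each point; the variable names are illustrative, and plain Python floats are used in place of NumPy arrays.

```python
# Pure Newton iteration (step length 1) for one coordinate of q, starting at 10.
x = 10.0
history = []  # (iteration index, current x, determinant of the 2x2 Hessian)
for k in range(4):
    h = (x * x + 1.0) ** (-1.5)   # Hessian diagonal entry at x
    g = x / (x * x + 1.0) ** 0.5  # gradient entry at x
    history.append((k, x, h * h)) # both diagonal entries are equal, so det = h*h
    x = x - g / h                 # Newton update, algebraically equal to -x**3

h_final = (x * x + 1.0) ** (-1.5)
det_final = h_final * h_final     # underflows to 0.0 in double precision

for k, xk, det in history:
    print("iter", k, " x =", xk, " det(Hessian) =", det)
print("after 4 updates: x =", x, " det(Hessian) =", det_final)
```

The determinant stays positive for the first four points and is exactly `0.0` at the fifth, which matches the four `iter:` lines printed before the error.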

**5. ANSWER:**

In [53]:
#check gradient descent without scaling and with backtracking line search 
print("\nFor BACKTRACKING_LINE_SEARCH WITHOUT_SCALING :")
x_opt_bls, k, f_value = find_minimizer_gd(my_start_x, my_tol, BACKTRACKING_LINE_SEARCH, 1.0, 0.5,0.5)
print("Minimizer = ",x_opt_bls,",  Iteration = ", k , ",   Minimum function value = ",f_value)


For BACKTRACKING_LINE_SEARCH WITHOUT_SCALING :
Minimizer =  [2.12455853e-14 2.12455853e-14] ,  Iteration =  13 ,   Minimum function value =  2.0


**COMMENTS:** \
1. In the previous question, Newton's Method with BACKTRACKING_LINE_SEARCH needed 17 iterations, whereas here gradient descent without scaling and with backtracking line search needs slightly fewer, i.e. 13. \
2. The minimum function value is the same in both cases, equal to 2. The returned minimizers differ only within the stopping tolerance; both approximate $(0, 0)$.