**The given function is:**
\begin{align}
h(x) = \sum_{i=1}^{N} \frac{1}{10^{i}}(x_i - i)^2
\end{align}

For $N = 3$, we can express $h(x)$ as:
\begin{align}
h(x) = \frac{1}{10^{1}}(x_1 - 1)^2 + \frac{1}{10^{2}}(x_2 - 2)^2 + \frac{1}{10^{3}}(x_3 - 3)^2
\end{align}

which can be equivalently expressed in the form:
\begin{align}
h(x) &= \begin{bmatrix}x_{1} & x_{2} & x_{3}\end{bmatrix}
\begin{bmatrix} \frac{1}{10} & 0 & 0 \\ 0 & \frac{1}{100} & 0 \\ 0 & 0 & \frac{1}{1000}\end{bmatrix}
\begin{bmatrix}x_{1} \\ x_{2} \\ x_{3}\end{bmatrix}
+ 2\begin{bmatrix}-\frac{1}{10} & -\frac{2}{100} & -\frac{3}{1000}\end{bmatrix}
\begin{bmatrix}x_{1} \\ x_{2} \\ x_{3}\end{bmatrix} + \frac{149}{1000} \\
&= \mathbf{x}^\top \mathbf{A} \mathbf{x} + 2 \mathbf{b}^\top \mathbf{x} + c
\end{align}
where
\begin{align}
\mathbf{x} = \begin{bmatrix}x_{1} \\ x_{2} \\ x_{3}\end{bmatrix}, \quad
\mathbf{A} = \begin{bmatrix} \frac{1}{10} & 0 & 0 \\ 0 & \frac{1}{100} & 0 \\ 0 & 0 & \frac{1}{1000}\end{bmatrix}, \quad
\mathbf{b} = \begin{bmatrix}-\frac{1}{10} \\ -\frac{2}{100} \\ -\frac{3}{1000}\end{bmatrix}, \quad
c = \frac{149}{1000}
\end{align}
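
The equivalence of the two forms can be checked numerically. The snippet below is a minimal sketch: the helper names `h_sum` and `h_quad` and the random test point are assumptions made for illustration and are not part of the assignment code.

```python
import numpy as np

# Matrix form of h for N = 3: h(x) = x^T A x + 2 b^T x + c
A = np.diag([1/10, 1/100, 1/1000])
b = np.array([-1/10, -2/100, -3/1000])
c = 149/1000

def h_sum(x):
    # Direct evaluation of the defining sum (hypothetical helper)
    return sum((x[i] - (i + 1))**2 / 10**(i + 1) for i in range(3))

def h_quad(x):
    # Evaluation through the quadratic form (hypothetical helper)
    return x @ A @ x + 2 * (b @ x) + c

rng = np.random.default_rng(0)  # arbitrary seed, used only for reproducibility
x_test = rng.standard_normal(3)
print(np.isclose(h_sum(x_test), h_quad(x_test)))
```

Both forms should agree up to floating-point rounding, and both should vanish at $(1, 2, 3)$.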


In [31]:
import numpy as np 

In [32]:
def evalf(x):  
  #Input: x is a numpy array of size 3 
  assert type(x) is np.ndarray and len(x) == 3 #do not allow arbitrary arguments 
  #after checking if the argument is valid, we can compute the objective function value
  #compute the function value and return it 
  return (1/10)*((x[0]-1)**2) + (1/100)*((x[1]-2)**2) + (1/1000)*((x[2]-3)**2)

In [33]:
def evalg(x):  
  #Input: x is a numpy array of size 3 
  assert type(x) is np.ndarray and len(x) == 3 #do not allow arbitrary arguments 
  #after checking if the argument is valid, we can compute the gradient value
  #compute the gradient value and return it 
  return np.array([0.2*((x[0]-1)), 0.02*((x[1]-2)), 0.002*((x[2]-3))])

In [34]:
#Complete the module to compute the steplength by using the closed-form expression
def compute_steplength_exact(gradf, A): #add appropriate arguments to the function 
  assert type(gradf) is np.ndarray and len(gradf) == 3 
  assert type(A) is np.ndarray and A.shape[0] == 3 and  A.shape[1] == 3 #allow only a 3x3 array 
  #Complete the code to compute step length
  #for h(x) = x^T A x + 2 b^T x + c the exact step along -gradf is (g^T g)/(2 g^T A g)
  step_length = np.dot(gradf, gradf)/(2*np.dot(gradf, np.dot(A, gradf)))
  return step_length
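
The closed-form step length used above follows from minimising $h$ along the negative gradient. A short derivation, using $\mathbf{A}$, $\mathbf{b}$ and $c$ as defined earlier and writing $\mathbf{g} = \nabla h(\mathbf{x}) = 2\mathbf{A}\mathbf{x} + 2\mathbf{b}$ (the symbols $\varphi$ and $\alpha^{*}$ are introduced here only for illustration):
\begin{align}
\varphi(\alpha) &= h(\mathbf{x} - \alpha \mathbf{g}) = h(\mathbf{x}) - \alpha\, \mathbf{g}^\top \mathbf{g} + \alpha^{2}\, \mathbf{g}^\top \mathbf{A} \mathbf{g} \\
\varphi'(\alpha) &= -\mathbf{g}^\top \mathbf{g} + 2\alpha\, \mathbf{g}^\top \mathbf{A} \mathbf{g} = 0
\quad \Longrightarrow \quad
\alpha^{*} = \frac{\mathbf{g}^\top \mathbf{g}}{2\, \mathbf{g}^\top \mathbf{A} \mathbf{g}}
\end{align}
Since $\mathbf{A}$ is positive definite, $\mathbf{g}^\top \mathbf{A} \mathbf{g} > 0$ whenever $\mathbf{g} \neq \mathbf{0}$, so $\alpha^{*}$ is well defined and is the unique minimiser of $\varphi$.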

In [35]:
#Complete the module to compute the steplength by using the backtracking line search
def compute_steplength_backtracking(x, gradf, alpha_start, rho, gamma): #add appropriate arguments to the function 
  assert type(x) is np.ndarray and len(x) == 3 
  assert type(gradf) is np.ndarray and len(gradf) == 3 
  
  alpha = alpha_start
  #implement the backtracking line search: shrink alpha until the sufficient decrease (Armijo) condition holds
  while evalf(x - alpha*gradf) > evalf(x) - gamma*alpha*np.dot(gradf, gradf):
    alpha = rho*alpha
  #print('final step length:',alpha)
  return alpha

In [36]:
#we define the types of line search methods that we have implemented
EXACT_LINE_SEARCH = 1
BACKTRACKING_LINE_SEARCH = 2
CONSTANT_STEP_LENGTH = 3

In [37]:
def find_minimizer(start_x, tol, line_search_type, *args):
  #Input: start_x is a numpy array of size 3, tol denotes the tolerance and is a positive float value
  assert type(start_x) is np.ndarray and len(start_x) == 3 #do not allow arbitrary arguments 
  assert type(tol) is float and tol>=0 
  # construct a suitable A matrix for the quadratic function 
  A = np.array([[1/10, 0, 0],[0, 1/100, 0],[0,0,1/1000]])
  x = start_x
  g_x = evalg(x)

  #initialization for backtracking line search
  if(line_search_type == BACKTRACKING_LINE_SEARCH):
    alpha_start = args[0]
    rho = args[1]
    gamma = args[2]
    #print('Params for Backtracking LS: alpha start:', alpha_start, 'rho:', rho,' gamma:', gamma)

  k = 0
  #print('iter:',k, ' x:', x, ' f(x):', evalf(x), ' grad at x:', g_x, ' gradient norm:', np.linalg.norm(g_x))

  while (np.linalg.norm(g_x) > tol): #continue as long as the norm of gradient is not close to zero upto a tolerance tol
  
    if line_search_type == EXACT_LINE_SEARCH:
      step_length = compute_steplength_exact(g_x, A) #call the new function you wrote to compute the steplength
      #raise ValueError('EXACT LINE SEARCH NOT YET IMPLEMENTED')
    elif line_search_type == BACKTRACKING_LINE_SEARCH:
      step_length = compute_steplength_backtracking(x,g_x, alpha_start,rho, gamma) #call the new function you wrote to compute the steplength
      #raise ValueError('BACKTRACKING LINE SEARCH NOT YET IMPLEMENTED')
    elif line_search_type == CONSTANT_STEP_LENGTH: #do a gradient descent with constant step length
      step_length = 0.1
    else:  
      raise ValueError('Line search type unknown. Please check!')
    
    #implement the gradient descent steps here   
    x = np.subtract(x, np.multiply(step_length,g_x)) #update x = x - step_length*g_x
    k += 1 #increment iteration
    g_x = evalg(x) #compute gradient at new point

    #print('iter:',k, ' x:', x, ' f(x):', evalf(x), ' grad at x:', g_x, ' gradient norm:', np.linalg.norm(g_x))
  return x , k , evalf(x)

**2nd & 3rd ANSWER :** 

In [38]:
# Assume the following starting point and stopping tolerance for the 2nd question:
my_start_x = np.array([1/100, 1/10, 1])
my_tol= 1e-9

x_opt_els, k, f_val_els = find_minimizer(my_start_x, my_tol, EXACT_LINE_SEARCH)
print("For Exact line search method:")
print("Minimizer = ",x_opt_els, ", iterations = ",k ,", Minimum function value = ", f_val_els)

x_opt_bls, k, f_val_bls = find_minimizer(my_start_x, my_tol, BACKTRACKING_LINE_SEARCH, 1, 0.5,0.5)
print("\nFor BACKTRACKING_LINE_SEARCH method:")
print("Minimizer = ",x_opt_bls, ", iterations = ",k ,", Minimum function value = ", f_val_bls)

For Exact line search method:
Minimizer =  [1.         2.         2.99999952] , iterations =  245 , Minimum function value =  2.2763158970246047e-16

For BACKTRACKING_LINE_SEARCH method:
Minimizer =  [1.        2.        2.9999995] , iterations =  7594 , Minimum function value =  2.4929930196397745e-16


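**Comment for 3rd question :**

With the same starting point and tolerance, exact line search reaches the stopping criterion in 245 iterations, whereas backtracking line search (with `alpha_start = 1`, `rho = 0.5`, `gamma = 0.5`) needs 7594 iterations, about 31 times as many. Both methods stop at essentially the same point, close to the true minimizer $(1, 2, 3)$, with the third coordinate still about $5 \times 10^{-7}$ short of 3. The final function values differ slightly ($2.28 \times 10^{-16}$ versus $2.49 \times 10^{-16}$), but both are numerically zero, which is the true minimum value of $h$.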
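
One plausible explanation for the large iteration count of backtracking with these parameters: for this quadratic and $\gamma = 0.5$, the sufficient decrease test $h(\mathbf{x} - \alpha \mathbf{g}) \le h(\mathbf{x}) - \gamma \alpha\, \mathbf{g}^\top \mathbf{g}$ reduces to $\alpha \le \mathbf{g}^\top \mathbf{g} / (2\, \mathbf{g}^\top \mathbf{A} \mathbf{g})$, i.e. $\alpha$ may not exceed the exact step. Because the largest diagonal entry of $\mathbf{A}$ is $1/10$, the exact step is at least 5, so a starting value of 1 would always be accepted and backtracking would behave like gradient descent with a constant step of 1. The sketch below checks this on random points; the helper names `h`, `grad_h`, `armijo_holds` and the sampling range are assumptions for illustration only.

```python
import numpy as np

A = np.diag([1/10, 1/100, 1/1000])
target = np.array([1.0, 2.0, 3.0])

def h(x):
    # Same function as evalf for N = 3, written as (x - target)^T A (x - target)
    d = x - target
    return d @ A @ d

def grad_h(x):
    # Gradient of h, matching evalg for N = 3
    return 2 * (A @ (x - target))

def armijo_holds(x, alpha, gamma):
    # Sufficient decrease test used by the backtracking loop
    g = grad_h(x)
    return h(x - alpha * g) <= h(x) - gamma * alpha * (g @ g)

rng = np.random.default_rng(0)  # arbitrary seed, used only for reproducibility
points = rng.uniform(-10, 10, size=(1000, 3))  # arbitrary sampling range

# Exact step g^T g / (2 g^T A g) at every sampled point
exact_steps = [(g @ g) / (2 * (g @ A @ g)) for g in map(grad_h, points)]
print("smallest exact step:", min(exact_steps))
print("alpha = 1 accepted everywhere:",
      all(armijo_holds(p, 1.0, 0.5) for p in points))
```

Under this reading, a larger `alpha_start` (for example a value well above 5) would give the backtracking loop room to pick longer steps; this is a suggestion to try, not a result reported above.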

**4. SOLUTION :**

For $N = 4$:

In [39]:
def evalf(x):  
  #Input: x is a numpy array of size 4 
  assert type(x) is np.ndarray and len(x) == 4 #do not allow arbitrary arguments 
  #after checking if the argument is valid, we can compute the objective function value
  #compute the function value and return it 
  return (1/10)*((x[0]-1)**2) + (1/100)*((x[1]-2)**2) + (1/1000)*((x[2]-3)**2) + (1/10000)*((x[3]-4)**2)

In [40]:
def evalg(x):  
  #Input: x is a numpy array of size 4 
  assert type(x) is np.ndarray and len(x) == 4 #do not allow arbitrary arguments 
  #after checking if the argument is valid, we can compute the gradient value
  #compute the gradient value and return it 
  return np.array([0.2*(x[0]-1), 0.02*(x[1]-2), 0.002*(x[2]-3), 0.0002*(x[3]-4)])

In [41]:
#Complete the module to compute the steplength by using the closed-form expression
def compute_steplength_exact(gradf, A): #add appropriate arguments to the function 
  assert type(gradf) is np.ndarray and len(gradf) == 4 
  assert type(A) is np.ndarray and A.shape[0] == 4 and  A.shape[1] == 4 #allow only a 4x4 array 
  #Complete the code to compute step length
  #for h(x) = x^T A x + 2 b^T x + c the exact step along -gradf is (g^T g)/(2 g^T A g)
  step_length = np.dot(gradf, gradf)/(2*np.dot(gradf, np.dot(A, gradf)))
  return step_length

In [42]:
#Complete the module to compute the steplength by using the backtracking line search
def compute_steplength_backtracking(x, gradf, alpha_start, rho, gamma): #add appropriate arguments to the function 
  assert type(x) is np.ndarray and len(x) == 4 
  assert type(gradf) is np.ndarray and len(gradf) == 4 
  
  alpha = alpha_start
  #implement the backtracking line search: shrink alpha until the sufficient decrease (Armijo) condition holds
  while evalf(x - alpha*gradf) > evalf(x) - gamma*alpha*np.dot(gradf, gradf):
    alpha = rho*alpha
  #print('final step length:',alpha)
  return alpha

In [43]:
#we define the types of line search methods that we have implemented
EXACT_LINE_SEARCH = 1
BACKTRACKING_LINE_SEARCH = 2
CONSTANT_STEP_LENGTH = 3

In [44]:
def find_minimizer(start_x, tol, line_search_type, *args):
  #Input: start_x is a numpy array of size 4, tol denotes the tolerance and is a positive float value
  assert type(start_x) is np.ndarray and len(start_x) == 4 #do not allow arbitrary arguments 
  assert type(tol) is float and tol>=0 
  # construct a suitable A matrix for the quadratic function 
  A = np.array([[1/10, 0, 0,0],[0, 1/100, 0,0],[0,0,1/1000,0],[0,0,0,1/10000]])
  x = start_x
  g_x = evalg(x)

  #initialization for backtracking line search
  if(line_search_type == BACKTRACKING_LINE_SEARCH):
    alpha_start = args[0]
    rho = args[1]
    gamma = args[2]
    #print('Params for Backtracking LS: alpha start:', alpha_start, 'rho:', rho,' gamma:', gamma)

  k = 0
  #print('iter:',k, ' x:', x, ' f(x):', evalf(x), ' grad at x:', g_x, ' gradient norm:', np.linalg.norm(g_x))

  while (np.linalg.norm(g_x) > tol): #continue as long as the norm of gradient is not close to zero upto a tolerance tol
  
    if line_search_type == EXACT_LINE_SEARCH:
      step_length = compute_steplength_exact(g_x, A) #call the new function you wrote to compute the steplength
      #raise ValueError('EXACT LINE SEARCH NOT YET IMPLEMENTED')
    elif line_search_type == BACKTRACKING_LINE_SEARCH:
      step_length = compute_steplength_backtracking(x,g_x, alpha_start,rho, gamma) #call the new function you wrote to compute the steplength
      #raise ValueError('BACKTRACKING LINE SEARCH NOT YET IMPLEMENTED')
    elif line_search_type == CONSTANT_STEP_LENGTH: #do a gradient descent with constant step length
      step_length = 0.1
    else:  
      raise ValueError('Line search type unknown. Please check!')
    
    #implement the gradient descent steps here   
    x = np.subtract(x, np.multiply(step_length,g_x)) #update x = x - step_length*g_x
    k += 1 #increment iteration
    g_x = evalg(x) #compute gradient at new point

    #print('iter:',k, ' x:', x, ' f(x):', evalf(x), ' grad at x:', g_x, ' gradient norm:', np.linalg.norm(g_x))
  return x , k , evalf(x)

In [45]:
my_start_x = np.array([1/1000,1/100, 1/10, 1])
my_tol= 1e-9

x_els, k, fval_els = find_minimizer(my_start_x, my_tol, EXACT_LINE_SEARCH)
print("For Exact line search method:")
print("Minimizer = ",x_els, ", iterations = ",k ,", Minimum function value = ", fval_els)

x_bls, k, fval_bls = find_minimizer(my_start_x, my_tol, BACKTRACKING_LINE_SEARCH, 1, 0.5,0.5)
print("\nFor BACKTRACKING_LINE_SEARCH method:")
print("Minimizer = ",x_bls, ", iterations = ",k ,", Minimum function value = ", fval_bls)

For Exact line search method:
Minimizer =  [1.         2.         3.         3.99999527] , iterations =  2249 , Minimum function value =  2.2403065892237764e-15

For BACKTRACKING_LINE_SEARCH method:
Minimizer =  [1.       2.       3.       3.999995] , iterations =  66517 , Minimum function value =  2.4997720868765857e-15


**Comment :**

Here we make the same observations as in the 3rd question:

**1.** Exact line search needs far fewer iterations (2249) than backtracking line search (66517) to reach the stopping criterion, about 30 times fewer.

**2.** Both methods stop at essentially the same point, close to the true minimizer $(1, 2, 3, 4)$. The final function values differ slightly ($2.24 \times 10^{-15}$ versus $2.50 \times 10^{-15}$), but both are numerically zero.

**5. ANSWER :**

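**Comment on the possible observations for N > 4 :**

1. Exact line search should again need fewer iterations than backtracking line search with the same parameters. The iteration count of both methods should also grow with $N$: going from $N = 3$ to $N = 4$ it rose from 245 to 2249 and from 7594 to 66517, roughly a factor of 9, while the ratio of the largest to the smallest diagonal entry of $\mathbf{A}$ grew by a factor of 10.
2. Both methods should converge to the same minimizer, $x_i = i$ for $i = 1, \dots, N$.
3. For a fixed gradient tolerance, the function value at termination, and with it the gap between the two methods, is expected to grow with $N$: it was of order $10^{-16}$ for $N = 3$ and of order $10^{-15}$ for $N = 4$.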
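
These expectations could be tested by generalising the $N = 3$ and $N = 4$ code to arbitrary $N$. The following is only a sketch under stated assumptions: the names `make_problem` and `gradient_descent`, the string flag for the line search type and the `max_iter` safeguard are hypothetical and do not appear in the cells above. The demo reuses the $N = 3$ starting point and tolerance.

```python
import numpy as np

def make_problem(n):
    # Diagonal entries 1/10^i of A and the minimizer (1, ..., n) of h
    weights = 1.0 / 10.0 ** np.arange(1, n + 1)
    target = np.arange(1, n + 1, dtype=float)
    return weights, target

def gradient_descent(n, start_x, tol, line_search,
                     alpha_start=1.0, rho=0.5, gamma=0.5, max_iter=10**7):
    # Gradient descent on h for general n; max_iter is an added safeguard
    weights, target = make_problem(n)

    def f(x):
        return np.sum(weights * (x - target) ** 2)

    def grad(x):
        return 2.0 * weights * (x - target)

    x = np.asarray(start_x, dtype=float)
    g = grad(x)
    k = 0
    while np.linalg.norm(g) > tol and k < max_iter:
        if line_search == "exact":
            # Closed-form step g^T g / (2 g^T A g) for diagonal A
            alpha = (g @ g) / (2.0 * np.sum(weights * g * g))
        elif line_search == "backtracking":
            alpha = alpha_start
            while f(x - alpha * g) > f(x) - gamma * alpha * (g @ g):
                alpha *= rho
        else:
            raise ValueError("Line search type unknown. Please check!")
        x = x - alpha * g
        g = grad(x)
        k += 1
    return x, k, f(x)

# Demo with the N = 3 starting point and tolerance used earlier
start = np.array([1/100, 1/10, 1])
x_exact, k_exact, f_exact = gradient_descent(3, start, 1e-9, "exact")
x_bt, k_bt, f_bt = gradient_descent(3, start, 1e-9, "backtracking")
print("exact:", x_exact, k_exact, f_exact)
print("backtracking:", x_bt, k_bt, f_bt)
```

Calling `gradient_descent` with `n = 5, 6, ...` and a starting point of matching length would show how the iteration counts and final function values evolve. Run times can be expected to grow quickly with `n`, which is why the iteration cap is included.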