In [11]:
import numpy as np
from scipy.optimize import approx_fprime
#from antidifferentiation import gradient
#test

In [4]:
conda install antidifferentiation


Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.

Note: you may need to restart the kernel to use updated packages.



PackagesNotFoundError: The following packages are not available from current channels:

  - antidifferentiation

Current channels:

  - https://repo.anaconda.com/pkgs/main/win-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/win-64
  - https://repo.anaconda.com/pkgs/r/noarch
  - https://repo.anaconda.com/pkgs/msys2/win-64
  - https://repo.anaconda.com/pkgs/msys2/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.




#PS5 

![image.png](attachment:image.png)



let $f(x_i)$ be the Rosenbrock function defined above. The lagrangian is:

$$ \mathcal{L}(x_{i}, \lambda) = f(x_{i}) - \lambda ( \sum_{i=1}^{n} x_{i}^{2} - r)$$

![image.png](attachment:image.png)

In [16]:
#first we define the rosenbrock function with unknown parameters a and b 
def rosenbrock(x,a = 1,b = 1):
    return sum( a*(1-x[:-1])**2.0 + b*(x[1:]-x[:-1]**2.0)**2.0 )


#lagrangian
def lagrangian(x, lamda =0.5, r=4): #output should be a 1 dimensional scalar value 
    return rosenbrock(x) - lamda * (np.sum(x**2) - r)





![image.png](attachment:image.png)


 The first n elements of the array correspond to the gradient of the Lagrangian with respect to the x variables, and the last element corresponds to the gradient with respect to the lambda variable.

 
The gradient with respect to the x variables is calculated by taking the derivative of the Lagrangian with respect to each x_i and summing the results. The gradient with respect to lambda is simply the sum of the squares of the x variables minus the value of the constraint r.

In [20]:


# def lagrangian_gradient_func(x, λ, r=4):
#     ''' inputs a function then, outputs a new function that calculates the gradient of the lagrangian'''
#     return gradient(lambda x: lagrangian(x, λ, r))(x)

#manual method

def lagrangian_gradient_func(x, lamda = 0.5, r =4, a=1 , b=1): #this calculates the true gradient(not approximation)

    n = len(x)
    gradient_x = np.zeros(n)
    for i in range(n - 1):
        gradient_x[i] = 2 * (1 - x[i]) * a - 2 * b * (x[i+1] - x[i]**2) * 2 * x[i]
        gradient_x[i+1] = 2 * b * (x[i+1] - x[i]**2)
    gradient_x = gradient_x - 2 * lamda * x
    gradient_lambda = np.sum(x**2) - r
    return np.concatenate((gradient_x, [gradient_lambda]))


#testcase
x = np.array([1, 2, 3])
lamda = 0.5
r = 4

gradient = lagrangian_gradient_func(x, lamda, r)
print(gradient)


[-5.  4. -5. 10.]


![image.png](attachment:image.png)



In [36]:
n = 5 #dimension of the vector /initial x
r = 4 # from the constraint

#first we define a first differencing function 
def finite_difference_gradient(x, lamda = 0.5, h = 1e-6):
    n = len(x)
    gradient_x = np.zeros(n)
    for i in range(n):
        x_plus = x.copy()
        x_plus[i] += h
        x_minus = x.copy()
        x_minus[i] -= h
        gradient_x[i] = (lagrangian(x_plus, lamda, r) - lagrangian(x_minus, lamda, r)) / (2 * h)
    lamda_plus = lamda + h
    lamda_minus = lamda - h
    gradient_lambda = (lagrangian(x, lamda_plus, r) - lagrangian(x, lamda_minus, r)) / (2 * h)
    return np.concatenate((gradient_x, [gradient_lambda]))

#we could also use scipy.optimize import approx_fprime to find the gradient.
#the problem with using this one is that im not sure where the lamda goes so i'll stick to the one ABOVE
# finite_difference = np.array([approx_fprime(x, lagrangian, epsilon=1e-6)])

#loop to investigate rmse 
num_samples = 100
error = 0
for i in range(num_samples):
    x = np.random.rand(n) #  generates an array  n numbers from a uniform distribution over  [0, 1]
    lamda = np.random.rand() #the 6th uniform number
    true_gradient = lagrangian_gradient_func(x, lamda, r=4)
    finite_diff_gradient = finite_difference_gradient(x, lamda)
    error += np.sum((np.abs(true_gradient - finite_diff_gradient))**2)
error = np.sqrt(error / num_samples)
print("Root mean squared error:", error)



Root mean squared error: 274.1269930545227


![image.png](attachment:image.png)

In this code, the hessian_DL function calculates the Hessian of the Lagrangian as a function of the point x and the multiplier lamda. The function returns the Hessian as a 2D numpy array, with shape (n+1, n+1), where n is the number of variables in x. The Hessian is calculated using the analytical expression for the second derivatives of the Lagrangian with respect to x and lambda. The resulting Hessian matrix is stored in the variable hessian_DL.

In [46]:



def hessian_DL(x, lamda = 0.5, r=4):
    n = len(x)
    hessian_x = np.zeros((n, n))
    for i in range(n - 1):
        hessian_x[i, i] = 2 * a + 2 * b * 2 * x[i]**2 - 4 * b * (x[i+1] - x[i]**2)
        hessian_x[i+1, i] = -4 * b * x[i]
        hessian_x[i, i+1] = -4 * b * x[i]
        hessian_x[i+1, i+1] = 2 * b
    hessian_x = hessian_x - 2 * np.eye(n) * lamda
    hessian_lambda = np.zeros((1, 1))
    hessian_lambda[0, 0] = np.sum(2 * x)
    hessian_xlambda = np.zeros((n, 1))
    hessian_lambda_x = np.zeros((1, n))
    return np.concatenate((np.concatenate((hessian_x, hessian_xlambda), axis=1), np.concatenate((hessian_lambda_x, hessian_lambda), axis=1)))

x = np.array([1, 2, 3])
lamda = 4
hessian = hessian_DL(x, lamda)
print(hessian)

[[-6. -4.  0.  0.]
 [-4. 14. -8.  0.]
 [ 0. -8. -6.  0.]
 [ 0.  0.  0. 12.]]


![image.png](attachment:image.png)
The search direction s^k is then calculated as the solution to the linear system D^2L(y^k)s^k = DL(y^k), which is obtained by multiplying the inverse of the Hessian by the gradient. The resulting search direction is returned as output. Note that the output will depend on the implementation of the hessian_DL and gradient_DL functions and may vary each time the code is run.

In [51]:
def search_direction(y):
    r = 4 #change this if we want to change r for the whole system 
    lamda = 0.5
    n = len(y) - 1
    x = y[:n]
    lamda = y[n]
    hessian = hessian_DL(x, lamda, r)
    gradient = lagrangian_gradient_func(x, lamda, r)
    s = - np.dot(np.linalg.inv(hessian), gradient)
    return s

y = np.array([1, 2, 3, 4])
s = search_direction(y)
print(s)

[-1.20325203 -1.19512195 -2.7398374  -0.83333333]


![image.png](attachment:image.png)

In [52]:
def newton_nextstep(y, r=4, alpha=1.0):
    s = search_direction(y, r)
    y_new = y + alpha * s
    return y_new

![image.png](attachment:image.png)

In [None]:
def check_convergence(y, epsilon, r=4):
    gradient = gradient_DL(y[:-1], y[-1], r)
    gradient_norm = np.linalg.norm(gradient)
    if gradient_norm < epsilon:
        return True
    else:
        return False