# Coordinate Descent
Consider the function    

$$f(x)=e^{x_1-x_2+1}+e^{x_2-x_3+2}+e^{x_3-x_1+3}$$

We try to find the minimum of $f$ with coordinate descent. 

## 2a (3pts)
Implement for each coordinate $x_i$ $(i\in\{1,2,3\})$ a function `argmin_xi(x)` that returns $\operatorname*{arg\,min}_{x_i} f(x)$.

In [1]:
# df/dx_1 = e^(x_1-x_2+1) - e^(x_3-x_1+3)
# Setting this to zero requires e^(x_1-x_2+1) = e^(x_3-x_1+3),
# thus x_1-x_2+1 = x_3-x_1+3,
# which means x_1 = (x_2 + x_3 + 2)/2.
# (Each step is an equivalence, so this is the only stationary point in x_1. The second
# derivative e^(x_1-x_2+1) + e^(x_3-x_1+3) is positive, so f is convex in x_1 and this
# point is the global minimum along that coordinate, not a maximum or inflection point.)
def argmin_x1(x):
    return (x[1] + x[2])/2 + 1

# df/dx_2 = e^(x_2-x_3+2) - e^(x_1-x_2+1)
# Setting this to zero requires e^(x_2-x_3+2) = e^(x_1-x_2+1),
# thus x_2-x_3+2 = x_1-x_2+1,
# which means x_2 = (x_1 + x_3 - 1)/2.
# (Same argument as for x_1: the stationary point is unique and f is convex in x_2,
# so it is the global minimum along that coordinate.)
def argmin_x2(x):
    return (x[0] + x[2] - 1)/2

# df/dx_3 = e^(x_3-x_1+3) - e^(x_2-x_3+2)
# Setting this to zero requires e^(x_3-x_1+3) = e^(x_2-x_3+2),
# thus x_3-x_1+3 = x_2-x_3+2,
# which means x_3 = (x_1 + x_2 - 1)/2.
# (Same argument as for x_1: the stationary point is unique and f is convex in x_3,
# so it is the global minimum along that coordinate.)
def argmin_x3(x):
    return (x[0] + x[1] - 1)/2

Compute the $\operatorname*{arg\,min}$ for each coordinate at $\underline{x}'=(2,3,4)$.

In [2]:
x_prime = (2.0,3.0,4.0)
print(argmin_x1(x_prime))
print(argmin_x2(x_prime))
print(argmin_x3(x_prime))

4.5
2.5
2.0
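As a sanity check on the closed-form expressions, the sketch below compares each of them with a brute-force search along the same coordinate at $\underline{x}'=(2,3,4)$. The helper `grid_argmin` and its search interval $[-10, 10]$ are assumptions made for this illustration only; they are not part of the assignment.

```python
import numpy as np

def f(x):
    # Objective from the problem statement.
    return np.exp(x[0] - x[1] + 1) + np.exp(x[1] - x[2] + 2) + np.exp(x[2] - x[0] + 3)

# Closed-form coordinate minimizers, repeated so the block runs on its own.
closed_form = [
    lambda x: (x[1] + x[2]) / 2 + 1,
    lambda x: (x[0] + x[2] - 1) / 2,
    lambda x: (x[0] + x[1] - 1) / 2,
]

def grid_argmin(i, x, lo=-10.0, hi=10.0, n=200001):
    # Hypothetical helper: evaluate f on a fine grid for coordinate i
    # while the other coordinates of x stay fixed, and return the best grid point.
    grid = np.linspace(lo, hi, n)
    pts = np.tile(np.asarray(x, dtype=float), (n, 1))
    pts[:, i] = grid
    return float(grid[np.argmin(f(pts.T))])

x_prime = (2.0, 3.0, 4.0)
for i in range(3):
    print(i + 1, closed_form[i](x_prime), grid_argmin(i, x_prime))
```

The two columns should agree up to the grid spacing of about $10^{-4}$.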


## 2b (9pts)
Implement a function `coordinate_descent(f, argmin, x_0, max_iter=100)` that performs max_iter coordinate descent steps, where
- `f` is the function to be minimized (check the function values at each iteration),
- `argmin` is an array of the `argmin_xi` functions for each coordinate, and
- `x_0` is the starting point (initialization).

At iteration `t`, go through all coordinates in order (indexed by `i`, from the first to the last coordinate index) and update each one with the rule `x_t[i] = argmin[i](x_t)`, so that later coordinates already use the updated values of earlier ones.

Starting at `x_0=(1,20,5)`, run your coordinate descent implementation and answer the following questions.


In [3]:
import numpy as np

#implementing the function f; the updates do not need it, but it is used to record the function values
def f(x):
    return np.exp(x[0]-x[1]+1)+np.exp(x[1]-x[2]+2)+np.exp(x[2]-x[0]+3)

#defining an array for the argmin functions for convenience
argmins = [argmin_x1, argmin_x2, argmin_x3]

#implementing coordinate descent
def coordinate_descent(f, argmin, x_0, max_iter=100):
    #initialize all values
    x = np.array(x_0, dtype=float) #float dtype so that an integer x_0 such as (1,20,5) is not truncated on update
    xs = [0]*(max_iter+1) #to keep track of the iterate after each iteration
    fxs = [0]*(max_iter+1) #to keep track of the function value after each iteration (for debugging)

    #values before iterations
    xs[0] = tuple(x_0) 
    fxs[0] = f(x_0)

    #iteration loop
    for t in range(max_iter):
        #loop through coordinates in each iteration
        for i in range(len(argmin)):
            #update the coordinate
            x[i] = argmin[i](x)
        #at the end of the iteration save the iterate and the corresponding function value
        xs[t+1] = tuple(x)
        fxs[t+1] = f(x)
    return tuple(x), xs, fxs
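The assignment asks for a fixed number of iterations. As a hedged sketch of one common alternative, the variant below stops once a full sweep no longer decreases `f` by more than a threshold. The name `coordinate_descent_tol` and the parameter `tol` are hypothetical and not part of the assignment; `f` and the coordinate minimizers are repeated so the block runs on its own.

```python
import numpy as np

def f(x):
    # Objective from the problem statement.
    return np.exp(x[0] - x[1] + 1) + np.exp(x[1] - x[2] + 2) + np.exp(x[2] - x[0] + 3)

argmins = [
    lambda x: (x[1] + x[2]) / 2 + 1,
    lambda x: (x[0] + x[2] - 1) / 2,
    lambda x: (x[0] + x[1] - 1) / 2,
]

def coordinate_descent_tol(f, argmin, x_0, max_iter=100, tol=1e-12):
    # Hypothetical variant: same sweeps as coordinate_descent, but it returns
    # early when one full sweep improves f by at most `tol`.
    x = np.array(x_0, dtype=float)
    f_prev = f(x)
    for t in range(max_iter):
        for i in range(len(argmin)):
            x[i] = argmin[i](x)
        f_curr = f(x)
        if f_prev - f_curr <= tol:
            return tuple(float(v) for v in x), t + 1
        f_prev = f_curr
    return tuple(float(v) for v in x), max_iter

x_min, n_sweeps = coordinate_descent_tol(f, argmins, (1, 20, 5))
print(x_min, n_sweeps)
```

On this problem the loop should finish well before the default limit of 100 sweeps, since the function values stop changing after a handful of iterations.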


What are the first three coordinate update results (for the first iteration)?

In [4]:
x_star, xs, fxs = coordinate_descent(f, argmins, (1.0, 20.0, 5.0))
xs[1]

(13.5, 8.75, 10.625)
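The tuple above is the state after the whole first sweep. To see the three individual coordinate updates, the following sketch replays that sweep one coordinate at a time; the coordinate minimizers are repeated so the block runs on its own, and the list `trace` is a name introduced only for this illustration.

```python
import numpy as np

# Closed-form coordinate minimizers from 2a.
argmins = [
    lambda x: (x[1] + x[2]) / 2 + 1,
    lambda x: (x[0] + x[2] - 1) / 2,
    lambda x: (x[0] + x[1] - 1) / 2,
]

x = np.array((1.0, 20.0, 5.0))
trace = []
for i, argmin_i in enumerate(argmins):
    # Each update already sees the coordinates changed earlier in the sweep.
    x[i] = argmin_i(x)
    trace.append(tuple(float(v) for v in x))
    print(f"after updating x_{i + 1}: {trace[-1]}")

# after updating x_1: (13.5, 20.0, 5.0)
# after updating x_2: (13.5, 8.75, 5.0)
# after updating x_3: (13.5, 8.75, 10.625)
```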

What is the minimizer coordinate descent converges to?

In [5]:
x_star

(11.0, 10.0, 10.0)
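A short check of this result, as a sketch: at $(11, 10, 10)$ all three exponents equal 2, so the gradient vanishes and $f = 3e^2$. Because $f$ depends only on differences of coordinates, adding the same constant to every coordinate leaves $f$ unchanged, so the minimizer is not unique and the point reached depends on the starting point. The function `grad_f` and the shift of 5 are introduced here for illustration only.

```python
import numpy as np

def f(x):
    # Objective from the problem statement.
    return np.exp(x[0] - x[1] + 1) + np.exp(x[1] - x[2] + 2) + np.exp(x[2] - x[0] + 3)

def grad_f(x):
    # Analytic gradient: each exponential term appears in two partial derivatives
    # with opposite signs.
    a = np.exp(x[0] - x[1] + 1)
    b = np.exp(x[1] - x[2] + 2)
    c = np.exp(x[2] - x[0] + 3)
    return np.array([a - c, b - a, c - b])

x_star = np.array((11.0, 10.0, 10.0))
shifted = x_star + 5.0  # same point moved along the direction (1, 1, 1)

print(grad_f(x_star))              # gradient at the reported minimizer
print(f(x_star), 3 * np.exp(2))    # function value versus 3e^2
print(f(shifted))                  # unchanged by the common shift
```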

In [6]:
#to show the steps taken in each iteration, not part of the assignment
for x_t, fx_t in zip(xs, fxs):
    print((x_t, fx_t))

((1.0, 20.0, 5.0), 24156049.38673374)
((13.5, 8.75, 10.625), 316.4569571918279)
((10.6875, 10.15625, 9.921875), 23.30523740506907)
((11.0390625, 9.98046875, 10.009765625), 22.186384455902566)
((10.9951171875, 10.00244140625, 9.998779296875), 22.167465219784063)
((11.0006103515625, 9.99969482421875, 10.000152587890625), 22.167172942577434)
((10.999923706054688, 10.000038146972656, 9.999980926513672), 22.167168369369886)
((11.000009536743164, 9.999995231628418, 10.000002384185791), 22.167168297926004)
((10.999998807907104, 10.000000596046448, 9.999999701976776), 22.16716829680967)
((11.000000149011612, 9.999999925494194, 10.000000037252903), 22.167168296792227)
((10.999999981373549, 10.000000009313226, 9.999999995343387), 22.167168296791957)
((11.000000002328306, 9.999999998835847, 10.000000000582077), 22.16716829679195)
((10.999999999708962, 10.00000000014552, 9.99999999992724), 22.16716829679195)
((11.00000000003638, 9.99999999998181, 10.000000000009095), 22.16716829679195)
((10.999999