# Optimization in Julia
We will use `Optim` for several applications, covering both univariate and multivariate optimization.


In [None]:
using JuMP, GLPK, Optim, Plots

## Optimizing a function without gradient information


### Optimizing a univariate function
For a univariate function, you need to provide a lower and an upper bound:
```Julia
optimize(f, lower, upper)
```
Try to minimize $x^3 - 6x + x^2 + 2$ and compare the results of the two available methods (`Brent()` and `GoldenSection()`).
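
As a starting point, the call pattern could look like the sketch below. The interval $[0, 3]$ is an assumption (the exercise does not fix the bounds); on this interval the function has a single minimum at $x = (-1 + \sqrt{19})/3$, which follows from setting the derivative $3x^2 + 2x - 6$ to zero.

```julia
using Optim

f(x) = x^3 - 6x + x^2 + 2
lower, upper = 0.0, 3.0   # assumed bounds, not given in the exercise

# same problem, two bracketing methods
res_brent  = optimize(f, lower, upper, Brent())
res_golden = optimize(f, lower, upper, GoldenSection())

for (name, res) in (("Brent", res_brent), ("GoldenSection", res_golden))
    println(name, ": x* = ", Optim.minimizer(res),
            ", f(x*) = ", Optim.minimum(res),
            ", iterations = ", Optim.iterations(res))
end
```

Comparing the iteration counts is a simple way to see the difference between the two methods.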


### Use case
Try to determine the optimal value of one or more parameters in a more complex system.

Let's revisit the employee planning problem from last week. Suppose we include a temporary worker who only works on Wednesdays and weekends. Furthermore, assume that the age of the temporary worker has an impact on the cost (insurance, taxes, expected pay, $\dots$). You want to find the ideal age to recruit at, keeping costs low. Suppose the salary behaves like $s(x) = 90 - 0.75x + 0.02x^2$, where $x$ is the age.

FYI: the problem was deliberately constructed so that the age optimization and the cost minimization (planning) problem are separable, just to illustrate what can be done.


In [None]:
# salary as a function of age
s(x) = 90 .- 0.75x .+ 0.02x.^2

function quicksim(k)
    # settings
    A = [1 0 0 1 1 1 1 0;
         1 1 0 0 1 1 1 0;
         1 1 1 0 0 1 1 1;
         1 1 1 1 0 0 1 0;
         1 1 1 1 1 0 0 0;
         0 1 1 1 1 1 0 1;
         0 0 1 1 1 1 1 1];
    b = [22; 17; 13; 14; 15; 18; 24 ];
    c = [5*96; 5*96; 5*96; 5*96; 5*96; 5*96; 5*96; 3*s(k)]
    # JuMP model (GLPK uses the simplex method by default;
    # JuMP versions before 0.21 used `with_optimizer` here instead)
    model = Model(GLPK.Optimizer)
    # decision variables
    @variable(model, x[1:8]>=0, Int)
    # constraints
    @constraint(model, A*x .>= b)
    # objective function
    @objective(model, Min, c'*x);
    # calculate
    optimize!(model)
    # return value
    objective_value(model)
end
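
Because the age only enters through the cost coefficient `3*s(k)` of the temporary worker, and that coefficient multiplies a non-negative variable, the optimal planning cost cannot decrease when $s$ increases. Minimizing $s$ on its own therefore already gives an optimal age. A minimal sketch, assuming candidate ages between 18 and 65 (these bounds are an assumption, not part of the exercise):

```julia
using Optim

# scalar version of the salary function from the exercise
s(x) = 90 - 0.75x + 0.02x^2

age_lower, age_upper = 18.0, 65.0   # assumed age range

res = optimize(s, age_lower, age_upper)   # Brent's method by default
best_age = Optim.minimizer(res)
println("ideal age ≈ ", best_age, ", salary ≈ ", Optim.minimum(res))
```

The result can be checked by hand: $s'(x) = -0.75 + 0.04x = 0$ gives $x = 18.75$. In a non-separable setting you would instead pass the simulation itself, e.g. `optimize(quicksim, 18.0, 65.0)`.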

### Optimizing a multivariate function - data fitting

Suppose we have a noisy periodic signal, i.e. $y_i = a \sin(x_i-b) + c + \epsilon_i$, and we want to estimate the parameters $a$, $b$ and $c$ from the data.




In [None]:
# generate data
using Distributions  # provides Normal and Uniform
a = 3; b = pi/3; c=10; e=Normal(0,0.1)
n = 20;
x = sort(rand(Uniform(0,20), n))
y = a*sin.(x .- b) .+ c .+ rand(e,n)
scatter(x,y)
X = 0:0.5:20;
plot!(X,a*sin.(X .- b) .+ c )
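
One way to fit the signal model above is to minimize the sum of squared residuals with a zero-order method. The sketch below is an illustration under assumptions: it regenerates similar data with Base random numbers (so it runs without `Distributions`), and both the starting guess `p0` and the name `sse` are choices made here, not part of the exercise.

```julia
using Optim, Random

Random.seed!(1)   # reproducible noise

# synthetic data following y = a*sin(x - b) + c + noise
a, b, c = 3.0, pi/3, 10.0
n = 20
x = sort(20 .* rand(n))
y = a .* sin.(x .- b) .+ c .+ 0.1 .* randn(n)

# sum of squared residuals for parameters p = [a, b, c]
sse(p) = sum((y .- (p[1] .* sin.(x .- p[2]) .+ p[3])) .^ 2)

p0 = [1.0, 0.0, sum(y) / n]               # assumed starting guess
res = optimize(sse, p0, NelderMead())     # no gradient information needed
p_fit = Optim.minimizer(res)
println("fitted [a, b, c] = ", p_fit, ", SSE = ", Optim.minimum(res))
```

Note that $(a, b)$ and $(-a, b + \pi)$ describe the same signal, so the fitted parameters may differ from the generating ones while giving the same curve.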

## Optimizing a function with gradient information

Suppose we want to minimize the function:
$$\min f(\bar{x}) = x_1 ^2 + 2.5\sin(x_2) - x_1^2x_2^2x_3^2 $$

Compare the results (and the computation time) when using:
1. the gradient and the Hessian
2. a zero-order method (i.e. no gradients used)



In [None]:
# setup: objective, starting point, analytic gradient and Hessian
using Optim, BenchmarkTools

f(x) = x[1]^2 + 2.5*sin(x[2]) - x[1]^2*x[2]^2*x[3]^2
initial_x = [-0.6;-1.2; 0.135];

# gradients
function g!(G, x)
    G[1] = 2*x[1] - 2*x[1]*x[2]^2*x[3]^2
    G[2] = 2.5*cos(x[2]) - 2*x[1]^2*x[2]*x[3]^2
    G[3] = -2*x[1]^2*x[2]^2*x[3]
end
    
function h!(H,x)
    H[1,1] = 2 - 2*x[2]^2*x[3]^2 
    H[1,2] = -4*x[1]*x[2]*x[3]^2 
    H[1,3] = -4*x[1]*x[2]^2*x[3]
    H[2,1] = -4*x[1]*x[2]*x[3]^2 
    H[2,2] = -2.5*sin(x[2]) - 2*x[1]^2*x[3]^2
    H[2,3] = -4*x[1]^2*x[2]*x[3]
    H[3,1] = -4*x[1]*x[2]^2*x[3]
    H[3,2] = -4*x[1]^2*x[2]*x[3]
    H[3,3] = -2*x[1]^2*x[2]^2
end
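
Hand-coded derivatives such as `g!` are easy to get wrong, so it can be worth checking them against finite differences before benchmarking. A minimal sketch using only Base Julia; `fd_gradient` is a hypothetical helper written for this check, not an `Optim` function.

```julia
f(x) = x[1]^2 + 2.5*sin(x[2]) - x[1]^2*x[2]^2*x[3]^2

# analytic gradient, same expressions as in the cell above
function g!(G, x)
    G[1] = 2*x[1] - 2*x[1]*x[2]^2*x[3]^2
    G[2] = 2.5*cos(x[2]) - 2*x[1]^2*x[2]*x[3]^2
    G[3] = -2*x[1]^2*x[2]^2*x[3]
    return G
end

# central finite-difference approximation of the gradient
function fd_gradient(f, x; h = 1e-6)
    G = similar(x)
    for i in eachindex(x)
        e = zeros(length(x))
        e[i] = h
        G[i] = (f(x .+ e) - f(x .- e)) / (2h)
    end
    return G
end

initial_x = [-0.6, -1.2, 0.135]
G_analytic = g!(zeros(3), initial_x)
G_numeric  = fd_gradient(f, initial_x)
max_diff   = maximum(abs.(G_analytic .- G_numeric))
println("largest gradient mismatch: ", max_diff)
```

A mismatch far above the finite-difference error would point to a mistake in the analytic expressions.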

### Optimize the optimizer
You could study the influence of the parameters of the optimization methods and try to optimize them as well (this is sometimes referred to as hyperparameter tuning). Try to create a method that minimizes the number of iterations by modifying the parameter $\eta$ from the `BFGS` method.

FYI: 
* This is merely a proof of concept and will not yield a significant improvement in this case.
* Look at the documentation for possible values of $\eta$.

## Quadratic Programming - Active set methods

Consider the following problem:
$$ \min f(x) = -8x_1 - 16x_2 + x_1^2 + 4x_2^2$$
$$\text{ST:} \begin{cases}x_1 + x_2 \le 5 \\ x_1 \le 3 \\ x_1 \ge 0 \\ x_2 \ge 0 \end{cases}$$

Solve this problem as a quadratic programming problem.

Reminder of the general formulation (see also the [documentation](https://github.com/oxfordcontrol/GeneralQP.jl)):

$$\min_{\vec{x}}f\left( \vec{x} \right)\overset{\vartriangle}{=} \frac{1}{2}\vec{x}^\mathsf{T} Q \vec{x} - \vec{c}^ \mathsf{T} \vec{x} $$
$$\text{ST: } A\vec{x} \le \vec{b}$$

In [None]:
using Plots, GeneralQP
# Illustration
f(x,y) = - 8x - 16y + x.^2 + 4y.^2
x = range(0,3,length=30)
y = range(0,3,length=30)
Plots.plot(x,y,f,st=:surface, camera=(40,25))
xlabel!("x")
ylabel!("y")

## Sequential Quadratic Programming
Reminder of the general formulation (see also the [documentation for NLopt](https://github.com/JuliaOpt/NLopt.jl)):
$$
\begin{align*}
\min\, & f\left(\vec{x}\right)\\
\textrm{subject to} & \begin{cases}
\vec{h}\left(\vec{x}\right)=\vec{0}\\
\vec{g}\left(\vec{x}\right)\leq\vec{0}
\end{cases}
\end{align*}
$$

$\text{where } f:\mathbb{R}^{n}\rightarrow\mathbb{R}, \vec{h}:\mathbb{R}^{n}\rightarrow\mathbb{R}^{m},
m\leq n, \text{and } \vec{g}:\mathbb{R}^{n}\rightarrow\mathbb{R}^{p}.$


This allows you to deal with non-linear constraints. Try to solve the following problem:
$$\min_{\bar{x}} f(\bar{x}) = -x_1 - x_2$$
$$\text{ST:} \begin{cases}-x_1^2 + x_2 \ge 0\\1- x_1^2 - x_2^2 \ge 0\end{cases}$$ 

1. Try it once with NLopt
2. Try it with JuMP (combined with Ipopt)
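
For the NLopt variant, a possible sketch is shown below. It assumes the `:LD_SLSQP` algorithm and the starting point `[0.5, 0.5]` (both are choices made here, not prescribed by the exercise). NLopt expects inequality constraints in the form $c(\vec{x}) \le 0$, so both constraints are rewritten with the sign flipped.

```julia
using NLopt

# objective f(x) = -x1 - x2; NLopt asks us to fill the gradient in place
function objective(x::Vector, grad::Vector)
    if length(grad) > 0
        grad[1] = -1.0
        grad[2] = -1.0
    end
    return -x[1] - x[2]
end

# -x1^2 + x2 >= 0  rewritten as  x1^2 - x2 <= 0
function parabola(x::Vector, grad::Vector)
    if length(grad) > 0
        grad[1] = 2 * x[1]
        grad[2] = -1.0
    end
    return x[1]^2 - x[2]
end

# 1 - x1^2 - x2^2 >= 0  rewritten as  x1^2 + x2^2 - 1 <= 0
function disc(x::Vector, grad::Vector)
    if length(grad) > 0
        grad[1] = 2 * x[1]
        grad[2] = 2 * x[2]
    end
    return x[1]^2 + x[2]^2 - 1
end

opt = Opt(:LD_SLSQP, 2)
opt.xtol_rel = 1e-8
opt.min_objective = objective
inequality_constraint!(opt, parabola, 1e-8)
inequality_constraint!(opt, disc, 1e-8)

# qualified call, since Optim also exports `optimize`
(minf, minx, ret) = NLopt.optimize(opt, [0.5, 0.5])
println("x* = ", minx, ", f(x*) = ", minf, ", status = ", ret)
```

As a sanity check, maximizing $x_1 + x_2$ on the unit disc gives $(\sqrt{2}/2, \sqrt{2}/2)$, which also satisfies $x_2 \ge x_1^2$, so the solver should end up near that point with $f \approx -\sqrt{2}$.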