# A Reminder: What do we care about when we solve optimization problems?

**We care about four distinct attributes:**

1) **Correctness** (Are "feasible" solutions feasible? Are "optimal" solutions optimal?)

2) **Time taken** (fast is every user's favourite feature).

3) **Bound quality** (how good is the lower bound?).

4) **Solution quality** (how good is the best solution identified?)


How do we obtain these four attributes?

1) **Test your code regularly, with real data**
  * There might be bugs in any of the following:
    * Your data (even if you acquired it from a well-known repository).
    * Your code.
    * Julia (the solver interfaces CPLEX.jl and Gurobi.jl, JuMP, or Julia Base).
    * The solver: I have personally seen bugs in both CPLEX and Gurobi, and open-source solvers are even less reliable.
  * You need to create test scripts before you start developing Julia code, and run them every time you change your code or update anything, to verify correctness.
      * This includes changing your Gurobi version or running Pkg.update().
  * Synthetic data makes for bad test cases: if possible, use real data (e.g. from repositories, the literature).
   
2) **Adjust your expectations**
   * Adjust the solve time: if you obtain a feasible solution, then you can terminate early.
   * Adjust the optimality tolerance: since your data is probably not error free, the benefits of closing the optimality gap from 1% to 0% are probably limited.
   * Improve bound quality and solution quality (items 3 and 4 below).
   
3) **Get Better Relaxations**
  * Cuts (improve the lower bounds).
    * Tune solver cuts.
    * Add clever problem-specific cuts.
    * Branching strategies (construct the tree in a smart way).
    * Subproblem solution strategies (how we solve each node of the branch-and-bound tree).
  * A priori lower bounds (e.g. from QCQP, SOCP, SDP).
    
4) **Use Heuristics for Warm-Starts**
  * Heuristics (improve the upper bounds via feasible solutions).
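
As a rough illustration of the "adjust your expectations" item, a time limit and a relative optimality gap can be passed to Gurobi when the model is created. This is only a configuration sketch, written in the JuMP 0.18 style used by the code in this session; TimeLimit and MIPGap are Gurobi parameters, while the values 60 and 0.01 are illustrative assumptions rather than recommendations.

```julia
using JuMP, Gurobi

# Stop after 60 seconds, or once the relative optimality gap is below 1%,
# whichever happens first (both values are assumptions for illustration).
model = Model(solver=GurobiSolver(TimeLimit=60, MIPGap=0.01))
```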

**In the rest of this session, we will discuss how to go about testing your optimization code.**

# Our running example: Robust Portfolio Optimization

We are going to use robust portfolio optimization as our running example for this session (based on code written by Iain Dunning [here](https://github.com/joehuchette/OR-software-tools-2015/blob/master/7-adv-optimization/Callbacks.ipynb)).

Portfolio optimization is the problem of constructing a portfolio of assets to maximize our risk-adjusted expected return. If we maximize the expected return while taking on arbitrary risk, we have a very high chance of going bankrupt. On the other hand, if we are unwilling to take on any risk then we will probably not be able to outperform US Treasury bonds. Our goal is to write a solver which allows us to explore portfolios between these two extremes, and verify its correctness.

Robust optimization is one approach to solving this problem. It says that we don't know the exact expected returns of each asset, because we only have access to noisy historical data. Therefore, a reasonable approach to take is to maximize the worst-case expected return, where the returns are drawn from a bounded set of outcomes centered on the historical expected returns. We refer to this set of outcomes as an uncertainty set.

It can be shown that robust optimization is equivalent to maximizing risk-adjusted expected returns, for some attitudes towards risk, so varying the size of the uncertainty set lets us explore portfolios with different attitudes towards risk.

# A formulation of the problem 

* Let $x_i$ be the proportion we invest in asset $i$. We are going to ban short-selling, and we need to invest all our money, so $\sum_i x_i=1$ and $x \geq 0$.
* We're also going to restrict ourselves to buying at most a quarter of the $N$ assets in the market (buying every asset in the market is called index tracking, and clients don't like it if you do index tracking while charging high fees). This means we need **binary** variables $y_i \in \{0, 1\}$ and the additional constraints $x \leq y$ and $\sum_i y_i \leq \frac{N}{4}$.
* Let $p_i$ be the expected return for asset $i$. We assume that $p \in U$, where...
* $U$ is our uncertainty set. We assume that we are given the following data: $\bar{p}_i$, the historical expected return, and $\sigma_i$, the historical standard deviation, for each asset $i$. Given this information, we can construct the uncertainty set $U^Γ$, which we define by:
\begin{align*}
U^Γ:=\{p: p_i=\bar{p}_i+\sigma_i d_i, \ \|d\|_2 \leq Γ\}.
\end{align*}


Given this information, we can formulate our problem as follows:

\begin{align*}
    \max_{z, x, y} \quad & z\\
    \text{s.t.} \quad & z \leq p^\top x, \quad \forall p \in U^Γ,\\
    & e^\top x = 1,\\
    & 0 \leq x \leq y,\\
    & e^\top y \leq N/4,\\
    & y \in \{0, 1\}^N.
\end{align*}

While this problem has infinitely many constraints, we can solve it via a cutting-plane method.
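
Here is a sketch of why each cut is cheap to find, assuming the norm in $U^Γ$ is the Euclidean norm (the choice made in the code for this session). For a fixed portfolio $x$, the worst case over $U^Γ$ has a closed form, by the Cauchy-Schwarz inequality. The symbol $\sigma \circ x$, the vector with entries $\sigma_i x_i$, is notation introduced only for this sketch.

\begin{align*}
    \min_{p \in U^Γ} p^\top x
    &= \bar{p}^\top x + \min_{\|d\|_2 \leq Γ} \sum_i \sigma_i x_i d_i \\
    &= \bar{p}^\top x - Γ \, \|\sigma \circ x\|_2.
\end{align*}

The minimum is attained at $d^\star = -Γ \, (\sigma \circ x) / \|\sigma \circ x\|_2$ (whenever $\sigma \circ x \neq 0$), which gives the worst-case returns $p^\star = \bar{p} + \sigma \circ d^\star$. At each iteration, a cutting-plane method adds the single constraint $z \leq (p^\star)^\top x$ if the current $(z, x)$ violates it, and stops once no violated constraint remains. The closed form also illustrates the risk-adjusted interpretation: the robust objective is the expected return $\bar{p}^\top x$ minus $Γ$ times a measure of risk.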

We will write this in JuMP as follows:

In [1]:
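# Note: this code uses the JuMP 0.18 API (solve, getvalue, addlazycallback, @lazyconstraint).
# These callback functions and macros are not defined in JuMP 0.19 and later.
using JuMP, Gurobi, LinearAlgebra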

# Generate data
n = 20
p̄ = [1.15 + i*0.05/150 for i in 1:n]
σ = [0.05/450*√(2*i*n*(n+1)) for i in 1:n]

function solve_portfolio()
    port = Model(solver=GurobiSolver())

    @variable(port, z <= maximum(p̄)) # We can't earn a higher expected return than the highest expected return of all stocks
    @objective(port, Max, z)
    @variable(port, 0 <= x[1:n] <= 1)
    @constraint(port, sum(x) == 1)
    
    @variable(port, y[1:n], Bin)
    @constraint(port, x.<=y)
    @constraint(port, sum(y) <= n/4)
    

    # Link z to x
    function portobj(cb)
        # Get values of z and x
        zval = getvalue(z)
        xval = getvalue(x)[:]
    
        # Find most pessimistic value of p'x
        # over all p in the uncertainty set
        rob = Model(solver=GurobiSolver(OutputFlag=0))
        @variable(rob, p[i=1:n])
        @variable(rob, d[i=1:n])
        @objective(rob, Min, LinearAlgebra.dot(xval,p))
        Γ = sqrt(10)
        @constraint(rob, sum(d[i]^2 for i=1:n) ≤ Γ)
        for i in 1:n
            @constraint(rob, p[i] == p̄[i] + σ[i]*d[i])
        end
        solve(rob)
        worst_z = getobjectivevalue(rob)
        @show (zval, worst_z)
        worst_p = getvalue(p)[:]
        
        # Is this worst_p going to change the objective
        # because worst_z is worse than the current z?
        if worst_z < zval - 1e-2
            # Yep, we've made things worse!
            # Gurobi should try to find a better portfolio now
            @lazyconstraint(cb, z <= LinearAlgebra.dot(worst_p,x))
        end
    end
    addlazycallback(port, portobj)
    
    solve(port)
    
    return getvalue(x)[:]
end

LoadError: LoadError: UndefVarError: @lazyconstraint not defined
in expression starting at In[1]:48

We can time how long the problem takes to solve, and measure its memory use, via the @time macro. Using this macro is a better idea than writing down the time reported by CPLEX/Gurobi, because the latter omits the time required in pre- and post-processing steps. This isn't a big deal when the only extra step is passing the problem to Gurobi, but it matters more if you are adding other nuts and bolts (e.g. a warm-start heuristic).


In [2]:
@time proportion=solve_portfolio()

LoadError: UndefVarError: solve_portfolio not defined

Notice that the @time macro (correctly) tells us that the time taken is actually about 0.04 seconds longer than Gurobi reported.
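
If the timing is needed as a number (for example, to log it or to assert on it in a test), the @timed macro from Julia Base returns it instead of printing it. The sketch below uses a hypothetical stand-in function, fake_solve, so that it runs without a solver; in practice the call would be solve_portfolio().

```julia
# Hypothetical stand-in for solve_portfolio(), so the sketch runs without a solver
function fake_solve()
    sleep(0.05)            # pretend the solver needs about 0.05 seconds
    return fill(0.2, 5)    # pretend portfolio: five equal positions
end

# @timed evaluates the expression and returns its value, the elapsed
# wall-clock time in seconds, and the number of bytes allocated
# (followed by further fields that are ignored here)
result, seconds, bytes = @timed fake_solve()

println("solve took ", round(seconds, digits=3), " s and allocated ", bytes, " bytes")
```

Keeping the measured time in a variable makes it easy to record it next to the solver's own reported time and compare the two.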

# Testing for correctness

The first thing we should do to test correctness is check whether the "optimal" solution satisfies all of our constraints. We can achieve this using the Test package from Julia's standard library, which evaluates expressions and reports a failure whenever one of them is not true. If you are writing a Julia package, the tests can also be linked to a continuous-integration service such as Travis CI, which gives a neat summary of what percentage of test cases your code is passing.

In [45]:
using Test

# Test that e'x=1
@test abs(sum(proportion)-1.0)<=1e-12

# Test that we are investing in at most N/4 assets
@test sum(proportion.>=1e-12*ones(n))<=n/4

# Test that we are not short-selling
@test minimum(proportion) >=-1e-12

[32m[1mTest Passed[22m[39m

A better way to write these tests is to wrap them in a TestSet, as follows:

In [46]:
@testset "Feasibility Tests" begin
    # Test that e'x=1
    @test abs(sum(proportion)-1.0)<=1e-12
    
    # Test that we are investing in at most N/4 assets
    @test sum(proportion.>=1e-12*ones(n))<=n/4
    
    # Test that we are not short-selling
    @test minimum(proportion) >=-1e-12
end

[37m[1mTest Summary:     | [22m[39m[32m[1mPass  [22m[39m[36m[1mTotal[22m[39m
Feasibility Tests | [32m   3  [39m[36m    3[39m


Test.DefaultTestSet("Feasibility Tests", Any[], 3, false)

Notice that we get a neat summary of which test cases we are passing. This means that every time we add something to a piece of code, we can test if the code is still correct (with respect to our test-set) by running an appropriate testing file. Whenever you are coding something more serious than a homework assignment in JuMP, you should do this every time you commit to GitHub.
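
One possible refinement, sketched below, is to collect the three checks in a helper with an explicit tolerance. The helper name is_feasible and the default tolerance of 1e-6 are assumptions made for this sketch; the idea is that the tolerance should be no tighter than the solver's own feasibility tolerance (Gurobi's default is 1e-6), since a check at 1e-12 may reject a solution that the solver legitimately considers feasible. The test vector is the portfolio printed at the end of this notebook.

```julia
using Test

# Feasibility check with an explicit tolerance (1e-6 is an assumed value;
# it should match the solver's feasibility tolerance)
function is_feasible(x; max_assets=length(x)/4, tol=1e-6)
    budget_ok = abs(sum(x) - 1.0) <= tol                # e'x = 1
    no_short  = minimum(x) >= -tol                      # x >= 0
    card_ok   = count(xi -> xi > tol, x) <= max_assets  # at most N/4 assets held
    return budget_ok && no_short && card_ok
end

# Fixed test vector: the portfolio printed at the end of this notebook
proportion = zeros(20)
proportion[[1, 2, 18, 19, 20]] = [0.41739159941369663, 0.2951403297190225,
                                  0.09838019104277032, 0.09575624277179975,
                                  0.09333163705271079]

@testset "Feasibility helper" begin
    @test is_feasible(proportion)
    @test !is_feasible(fill(1/20, 20))          # holds all 20 assets
    @test !is_feasible([1.5, -0.5, 0.0, 0.0])   # short-sells the second asset
end
```

Including inputs that should fail is a cheap way to confirm that the checks themselves can detect a violation.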

**Discussion: does passing the above test cases mean that we have found the optimal solution?**

* No! All that we have confirmed thus far is that we have a feasible solution.
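* **Actually, the solution which we found is wrong! This is because I "forgot" to square $Γ$ in the constraint:**
@constraint(rob, sum(d[i]^2 for i=1:n) ≤ Γ)

As written, this constraint enforces $\|d\|_2 \leq \sqrt{Γ}$ rather than $\|d\|_2 \leq Γ$; since $Γ = \sqrt{10} > 1$, the uncertainty set is too small and the resulting portfolio is less robust than intended.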

How could we have picked this out? Some suggestions:

1) Gold standard: solve the problem using a different approach, and test if you get the same answer. In this case, you can also solve the problem as one giant MIQP by taking the dual with respect to p, and then compare your answer from the two approaches (see the link to Iain's notebook for how to do this). If there is only one approach, you could ask a friend to code up a second solver (without looking at your solver) and see if you get the same result. 

2) Bronze standard: Include more test cases, such as testing if we get the analytical solution to some easy problems, and perform unit testing on the inner problem.
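
As one possible shape for such a unit test, the sketch below checks the inner problem against an analytical answer. It assumes the uncertainty set uses the Euclidean norm, in which case the worst-case return for a fixed portfolio $x$ is $\bar{p}^\top x - Γ \|\sigma \circ x\|_2$ (by the Cauchy-Schwarz inequality). The function name worst_case_return is hypothetical, and no solver is called, so the sketch runs on its own.

```julia
using LinearAlgebra, Test

# Closed-form worst-case return, assuming the uncertainty set ‖d‖₂ ≤ Γ
worst_case_return(x, p̄, σ, Γ) = dot(p̄, x) - Γ * norm(σ .* x)

# Same data as in the notebook
n = 20
p̄ = [1.15 + i*0.05/150 for i in 1:n]
σ = [0.05/450*√(2*i*n*(n+1)) for i in 1:n]
Γ = sqrt(10)

@testset "Inner problem (closed form)" begin
    # Easy case 1: all money in asset n, so the worst case is p̄[n] - Γ*σ[n]
    x = zeros(n); x[n] = 1.0
    @test worst_case_return(x, p̄, σ, Γ) ≈ p̄[n] - Γ*σ[n]

    # Easy case 2: with Γ = 0 there is no uncertainty at all
    @test worst_case_return(x, p̄, σ, 0.0) ≈ p̄[n]

    # The mistaken constraint sum(d.^2) <= Γ describes a ball of radius sqrt(Γ),
    # which gives a visibly different (too optimistic) worst case here
    @test worst_case_return(x, p̄, σ, sqrt(Γ)) > worst_case_return(x, p̄, σ, Γ) + 1e-2
end
```

In the notebook, the same idea would compare worst_z from the rob model with worst_case_return(xval, p̄, σ, Γ), using a tolerance suited to the solver; with the unsquared $Γ$ the two values would disagree.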



# Final (correct) version

In [47]:
using JuMP, Gurobi, LinearAlgebra

# Generate data
n = 20
p̄ = [1.15 + i*0.05/150 for i in 1:n]
σ = [0.05/450*√(2*i*n*(n+1)) for i in 1:n]

function solve_portfolio()
    port = Model(solver=GurobiSolver())

    @variable(port, z <= maximum(p̄)) # We can't earn a higher expected return than the highest expected return of all stocks
    @objective(port, Max, z)
    @variable(port, 0 <= x[1:n] <= 1)
    @constraint(port, sum(x) == 1)
    
    @variable(port, y[1:n], Bin)
    @constraint(port, x.<=y)
    @constraint(port, sum(y) <= n/4)
    

    # Link z to x
    function portobj(cb)
        # Get values of z and x
        zval = getvalue(z)
        xval = getvalue(x)[:]
    
        # Find most pessimistic value of p'x
        # over all p in the uncertainty set
        rob = Model(solver=GurobiSolver(OutputFlag=0))
        @variable(rob, p[i=1:n])
        @variable(rob, d[i=1:n])
        @objective(rob, Min, LinearAlgebra.dot(xval,p))
        Γ = sqrt(10)
        @constraint(rob, sum(d[i]^2 for i=1:n) <= Γ^2)
        for i in 1:n
            @constraint(rob, p[i] == p̄[i] + σ[i]*d[i])
        end
        solve(rob)
        worst_z = getobjectivevalue(rob)
        @show (zval, worst_z)
        worst_p = getvalue(p)[:]
        
        # Is this worst_p going to change the objective
        # because worst_z is worse than the current z?
        if worst_z < zval - 1e-2
            # Yep, we've made things worse!
            # Gurobi should try to find a better portfolio now
            @lazyconstraint(cb, z <= LinearAlgebra.dot(worst_p,x))
        end
    end
    addlazycallback(port, portobj)
    
    solve(port)
    
    return getvalue(x)[:]
end

solve_portfolio()

Academic license - for non-commercial use only
Optimize a model with 22 rows, 41 columns and 80 nonzeros
Variable types: 21 continuous, 20 integer (20 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [1e+00, 1e+00]
  Bounds range     [1e+00, 1e+00]
  RHS range        [1e+00, 5e+00]
Academic license - for non-commercial use only
(zval, worst_z) = (1.1566666666666665, 1.1119445303971691)
Presolve time: 0.00s
Presolved: 22 rows, 41 columns, 80 nonzeros
Variable types: 21 continuous, 20 integer (20 binary)
Academic license - for non-commercial use only
(zval, worst_z) = (1.1566666666666665, 1.1145993788007809)

Root relaxation: objective 1.156667e+00, 3 iterations, 0.00 seconds
Academic license - for non-commercial use only
(zval, worst_z) = (1.1566666666666665, 1.1111247140712786)
Academic license - for non-commercial use only
(zval, worst_z) = (1.1563333333333332, 1.1119445303971691)
Academic license - for non-commercial use only
(zval, worst_z) = (1.1

20-element Array{Float64,1}:
 0.41739159941369663
 0.2951403297190225 
 0.0                
 0.0                
 0.0                
 0.0                
 0.0                
 0.0                
 0.0                
 0.0                
 0.0                
 0.0                
 0.0                
 0.0                
 0.0                
 0.0                
 0.0                
 0.09838019104277032
 0.09575624277179975
 0.09333163705271079