In [1]:
using Laplacians

In [2]:
include("../src/flowUtils.jl")

cutCapacity

In [7]:
include("../src/min_cost_flow.jl")
#include("../src/min_cost_flow_ada.jl")
#include("/Users/spielman/Laplacians/src/primalDualIPM.jl")
#include("/Users/spielman/Laplacians/src/max_flow_IPM.jl")

mu_xsuz (generic function with 1 method)

In [8]:
function backSlash(A)
    function f(b)
        return A\b
    end
    return f
end

backSlash (generic function with 1 method)

We introduce a data type for minimum cost flow problems.
It has an edge list, a vector of edge capacities, a vector of edge costs, and a vector of demands.
We test this with a graph on 4 vertices with source 1 and destination 4.
We try to route 1 unit of flow.
The capacities on all edges are 0.7.
Each edge on route 1-2-4 costs 1, and each edge on route 1-3-4 costs 2.
So the optimal flow should saturate the cheaper route 1-2-4 and send the remainder on route 1-3-4, as it does.
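
The expected answer for this toy instance can be checked by hand. The sketch below is not part of the solver: it assumes Julia 1.x, and the names `cap`, `route_cost`, and `xs` are made up for illustration. It scans how much of the unit of flow goes on route 1-2-4 and picks the cheapest feasible split.

```julia
# Toy check for the 4-vertex instance (illustrative names, Julia 1.x).
cap = 0.7                 # capacity of every edge
cost_cheap = 1.0 + 1.0    # total cost per unit on route 1-2-4
cost_dear  = 2.0 + 2.0    # total cost per unit on route 1-3-4

# x units go on route 1-2-4, the remaining 1 - x on route 1-3-4.
route_cost(x) = cost_cheap * x + cost_dear * (1 - x)

# Feasibility needs x <= cap and 1 - x <= cap, so x lies in [1 - cap, cap].
xs = range(1 - cap, stop=cap, length=41)
best_x = xs[argmin(route_cost.(xs))]

println("flow on 1-2-4: ", best_x, ", cost: ", route_cost(best_x))
```

The cost is decreasing in `x`, so the scan ends at the capacity bound: 0.7 units on the cheap route and 0.3 on the expensive one.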

In [9]:
edges = [1 2; 2 4; 1 3; 3 4]
caps = ones(4)*0.7
costs = [1.0; 1; 2 ; 2]
dems = [1.0;0;0;-1]

mcfp = MCFproblem(edges, caps, costs, dems)

MCFproblem{Float64,Int64}([1 2; 2 4; 1 3; 3 4], [0.7, 0.7, 0.7, 0.7], [1.0, 1.0, 2.0, 2.0], [1.0, 0.0, 0.0, -1.0])

In [11]:
#using JLD
@time sol = min_cost_flow(mcfp, lapSolver=approxCholSddm)
flow = sol[1]

number of nodes =4, number of edges=4
  0.000105 seconds (65 allocations: 3.625 KiB)
maximum x.*s =1.567222e+00, minimum x.*s=1.100556e+00, mu=1.190000e+00
maximum (u-x).*z =1.162778e+00, minimum (u-x).*z=9.294444e-01, mu=1.190000e+00

Iteration 1, ||r_p||/||b||=3.905243e-02, ||r_d||/||c||=1.111421e+00, rel. gap=3.131579e-01, alpha=1.000000e+00
maximum theta =2.855357e+01, minimum theta=2.212500e+01, cond theta=1.290557e+00
maximum theta inverse =4.519774e-02, minimum theta inverse =3.502189e-02, cond theta inverse=1.290557e+00
Time taken to build is : 
  0.000095 seconds (177 allocations: 14.859 KiB)
Time taken to solve is : 
  0.000112 seconds (59 allocations: 2.875 KiB)
  0.000394 seconds (383 allocations: 9.813 KiB)
normal eq. residual =8.573815e-11
pred. norm(res_p_saddle_1) =8.573817e-11, norm(res_d_saddle_1)=2.220446e-16
Time taken to solve is : 
  0.000103 seconds (59 allocations: 2.875 KiB)
  0.000291 seconds (383 allocations: 9.813 KiB)
normal eq. residual =2.414918e-11
corr.

4-element Array{Float64,1}:
 0.7
 0.7
 0.3
 0.3

In [7]:
reportMCFresults(mcfp,flow)

Cost: 2.6000001168182707
Min flow: 0.30000006210723285
Min slack: 6.5831009865569e-8
Error on demands: 7.447635674839859e-9
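
The four numbers in this report can be reproduced by hand for the toy problem. The source of `reportMCFresults` is not shown in this notebook, so the sketch below is only an assumption about what each line measures (in particular, which norm is used for the demand error is a guess). It is written for Julia 1.x and uses the exact optimum rather than the solver's approximate flow.

```julia
using SparseArrays, LinearAlgebra

edges = [1 2; 2 4; 1 3; 3 4]
caps  = fill(0.7, 4)
costs = [1.0, 1.0, 2.0, 2.0]
dems  = [1.0, 0.0, 0.0, -1.0]
flow  = [0.7, 0.7, 0.3, 0.3]      # exact optimum of the toy instance

# Signed edge-vertex incidence matrix: +1 at the tail, -1 at the head.
m, n = size(edges, 1), maximum(edges)
B = sparse(collect(1:m), edges[:, 1], ones(m), m, n) -
    sparse(collect(1:m), edges[:, 2], ones(m), m, n)

total_cost = dot(costs, flow)          # "Cost"
min_flow   = minimum(flow)             # "Min flow"
min_slack  = minimum(caps .- flow)     # "Min slack": distance to the capacity bound
demand_err = norm(B' * flow - dems)    # "Error on demands" (2-norm assumed)

println((total_cost, min_flow, min_slack, demand_err))
```

An interior point method keeps the flow strictly inside the bounds, which would explain why the reported minimum slack is a small positive number rather than exactly zero.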


Now, let's try computing this using MOSEK, an IPM-based solver package. You can install it in Julia by typing `Pkg.add("Mosek")`. "A license file is required to use MOSEK (these are free for academic use)." You can request one here:
https://www.mosek.com/products/academic-licenses/
"MOSEK will look first for the environment variable MOSEKLM_LICENSE_FILE which, if defined, must point to the relevant license file. If this is not defined, MOSEK will look for a file called mosek.lic in the default install path, e.g.
$HOME/mosek/mosek.lic"

The quoted instructions, and more, can be found here: https://github.com/JuliaOpt/Mosek.jl

In [5]:
using MathProgBase, Mosek


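function MCFmosek(mcfp::MCFproblem)
    edge_list = mcfp.edge_list
    m = size(edge_list,1)   # number of edges
    n = maximum(edge_list)  # number of vertices
    # Signed edge-vertex incidence matrix: +1 at the tail, -1 at the head of each edge.
    B = sparse(collect(1:m), edge_list[:,1], 1.0, m, n) -
      sparse(collect(1:m), edge_list[:,2], 1.0, m, n)

    sense = '='  # flow conservation: B'*f == d
    l = 0        # lower bound on every edge flow
    u = mcfp.capacities
    d = mcfp.demands
    # Solve: minimize costs'*f subject to B'*f == d and l <= f <= u.
    @time sol = linprog(mcfp.costs,B',sense,d,l,u,MosekSolver())
    f = sol.sol
    return f
end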

MCFmosek (generic function with 1 method)
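
Written out, the linear program that `MCFmosek` hands to `linprog` is the following. The symbols are read off the code: $c$ is `mcfp.costs`, $u$ is `mcfp.capacities`, $d$ is `mcfp.demands`, and $B$ is the $m \times n$ signed edge-vertex incidence matrix with $+1$ at the tail and $-1$ at the head of each edge.

$$
\min_{f \in \mathbb{R}^m} \; c^\top f
\quad \text{subject to} \quad
B^\top f = d, \qquad 0 \le f \le u .
$$

Row $v$ of $B^\top f = d$ says that the flow leaving vertex $v$ minus the flow entering it equals $d_v$, which is why the source has demand $+1$ and the destination $-1$ in the toy problem.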

In [9]:
f = MCFmosek(mcfp)
reportMCFresults(mcfp,f)

Problem
  Name                   :                 
  Objective sense        : min             
  Type                   : LO (linear optimization problem)
  Constraints            : 4               
  Cones                  : 0               
  Scalar variables       : 4               
  Matrix variables       : 0               
  Integer variables      : 0               

Optimizer started.
Interior-point optimizer started.
Presolve started.
Linear dependency checker started.
Linear dependency checker terminated.
Eliminator started.
Freed constraints in eliminator : 3
Eliminator terminated.
Eliminator - tries                  : 1                 time                   : 0.00            
Lin. dep.  - tries                  : 1                 time                   : 0.00            
Lin. dep.  - number                 : 0               
Presolve terminated. Time: 0.00    
Interior-point optimizer terminated. Time: 0.01. 

Optimizer terminated. Time: 0.01    

Interior-point solution 

Stacktrace:
 [1] depwarn(::String, ::Symbol) at ./deprecated.jl:70
 [2] abs(::Array{Float64,1}) at ./deprecated.jl:57
 [3] reportMCFresults(::MCFproblem{Float64,Int64}, ::Array{Float64,1}) at /Users/anuprao/Documents/repos/Laplacians.jl/src/flowUtils.jl:33
 [4] include_string(::String, ::String) at ./loading.jl:515
 [5] include_string(::Module, ::String, ::String) at /Users/anuprao/.julia/v0.6/Compat/src/Compat.jl:407
 [6] execute_request(::ZMQ.Socket, ::IJulia.Msg) at /Users/anuprao/.julia/v0.6/IJulia/src/execute_request.jl:154
 [7] eventloop(::ZMQ.Socket) at /Users/anuprao/.julia/v0.6/IJulia/src/eventloop.jl:
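
The stack trace above comes from a deprecation warning, not a failure: Julia 0.6 deprecated calling `abs` on a whole array in favor of the broadcast form `abs.(x)`, and `reportMCFresults` triggers it. The lines below only illustrate the replacement idiom; they are not the actual code at `flowUtils.jl:33`, which is not shown here.

```julia
x = [-1.5, 0.25, -3.0]

# abs(x) on an array warns in Julia 0.6 and is a MethodError in Julia 1.x.
elementwise = abs.(x)        # broadcast abs over every entry
largest = maximum(abs, x)    # largest magnitude without allocating a temporary

println(elementwise, " ", largest)
```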

We would like to compare the performance of our code to standard codes on some benchmark examples. These benchmark examples usually come in the DIMACS format.
In this notebook, we use the example "goto_8_08a.min"
from <a href="http://lime.cs.elte.hu/~kpeter/data/mcf/goto/">http://lime.cs.elte.hu/~kpeter/data/mcf/goto/</a>.
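
For readers unfamiliar with the DIMACS min-cost flow format: comment lines start with `c`, the problem line is `p min <nodes> <arcs>`, node lines are `n <id> <supply>`, and arc lines are `a <src> <dst> <low> <cap> <cost>`. The reader below is a hypothetical sketch for Julia 1.x, not the `readDimacsMCF` used in this notebook. It assumes all lower bounds are zero and simply ignores them.

```julia
# Hypothetical DIMACS reader (illustrative only; ignores lower bounds).
function parseDimacsMin(io::IO)
    demands = Float64[]
    tails = Int[]; heads = Int[]
    caps = Float64[]; costs = Float64[]
    for line in eachline(io)
        t = split(line)
        isempty(t) && continue
        if t[1] == "p"                       # p min <nodes> <arcs>
            demands = zeros(parse(Int, t[3]))
        elseif t[1] == "n"                   # n <id> <supply>
            demands[parse(Int, t[2])] = parse(Float64, t[3])
        elseif t[1] == "a"                   # a <src> <dst> <low> <cap> <cost>
            push!(tails, parse(Int, t[2]))
            push!(heads, parse(Int, t[3]))
            push!(caps,  parse(Float64, t[5]))
            push!(costs, parse(Float64, t[6]))
        end
    end
    return hcat(tails, heads), caps, costs, demands
end

# The 4-vertex toy problem scaled by 10, since DIMACS files use integers.
sample = """
c toy instance
p min 4 4
n 1 10
n 4 -10
a 1 2 0 7 1
a 2 4 0 7 1
a 1 3 0 7 2
a 3 4 0 7 2
"""
edges, caps, costs, dems = parseDimacsMin(IOBuffer(sample))
println(edges, " ", caps, " ", costs, " ", dems)
```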

In [12]:
mcfp = readDimacsMCF("/Users/anuprao/Documents/DIMACSFlow/goto_8_08a.min")

MCFproblem{Float64,Int64}([1 2; 1 3; … ; 180 216; 216 252], [8642.0, 741.0, 453.0, 62.0, 341.0, 1029.0, 2640.0, 1762.0, 155.0, 31.0  …  55614.0, 55614.0, 55614.0, 55614.0, 55614.0, 55614.0, 55614.0, 55614.0, 55614.0, 55614.0], [28.0, 714.0, 451.0, 822.0, 0.0, 7.0, 518.0, 29.0, 298.0, 973.0  …  142.0, 142.0, 142.0, 142.0, 142.0, 142.0, 142.0, 142.0, 142.0, 142.0], [55614.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, -55614.0, 0.0, 0.0, 0.0, 0.0])

In [None]:
#sol = min_cost_flow(mcfp, lapSolver = (h->augTreeLap(h, tol=1e-12)), tol=1e-2)
@time sol = min_cost_flow(mcfp, lapSolver = approxCholSddm, tol = 1e-10, reg_p = 1.0e-10, reg_d = 1.0e-10, tol_ref = 1.0e-8)

flow = sol[1]
reportMCFresults(mcfp,flow)

That took around 5.6 seconds. Let us now try the backSlash solver. We will see that it is much faster!

In [10]:
@time sol = min_cost_flow(mcfp, lapSolver = backSlash, tol = 1e-5)
flow = sol[1]
reportMCFresults(mcfp,flow)

number of nodes =256, number of edges=2048
maximum x.*s =4.181963e+09, minimum x.*s=1.840784e+04, mu=3.146102e+08
maximum (u-x).*z =3.520945e+08, minimum (u-x).*z=1.751229e+04, mu=3.146102e+08

Iteration 1, ||r_p||/||b||=1.049937e+00, ||r_d||/||c||=5.714446e+01, rel. gap=1.362448e-01, alpha=1.000000e+00
maximum theta =2.048482e+05, minimum theta=9.590553e+00, cond theta=2.135937e+04
maximum theta inverse =1.042693e-01, minimum theta inverse =4.881664e-06, cond theta inverse=2.135937e+04
Time taken to build is : 
  0.000002 seconds (1 allocation: 16 bytes)
Time taken to solve is : 
  0.005688 seconds (56 allocations: 397.688 KiB, 81.80% gc time)
normal eq. residual =8.413071e-10
pred. norm(res_p_saddle_1) =8.408049e-10, norm(res_d_saddle_1)=7.677155e-10
Time taken to solve is : 
  0.000764 seconds (54 allocations: 397.656 KiB)
normal eq. residual =3.167853e-09
corr. norm(res_p_saddle_1) =3.168306e-09, norm(res_d_saddle_1)=1.273475e-07
normal eq. residual =2.219082e-16
corr. norm(res_p_s


For comparison, we check how MOSEK does (it does very well).

In [15]:
f = MCFmosek(mcfp)
mcfp.costs'*f

Problem
  Name                   :                 
  Objective sense        : min             
  Type                   : LO (linear optimization problem)
  Constraints            : 256             
  Cones                  : 0               
  Scalar variables       : 2048            
  Matrix variables       : 0               
  Integer variables      : 0               

Optimizer started.
Interior-point optimizer started.
Presolve started.
Linear dependency checker started.
Linear dependency checker terminated.
Eliminator - tries                  : 0                 time                   : 0.00            
Lin. dep.  - tries                  : 1                 time                   : 0.00            
Lin. dep.  - number                 : 0               
Presolve terminated. Time: 0.00    
GP based matrix reordering started.
GP based matrix reordering terminated.
Optimizer  - threads                : 4               
Optimizer  - solved problem         : the primal      
Optimiz

5.60870539e8

The comments about using Lemon are copied from Dan's notebook. I am keeping them here for comparison.

For comparison, we should also try the code from "Lemon", available at

<a href="http://lemon.cs.elte.hu/trac/lemon">http://lemon.cs.elte.hu/trac/lemon</a>

It was easy to install on my Mac (I just needed cmake, which I got by `brew install cmake`)

Here is a transcript of running it:

```
Dans17:tools spielman$ ./dimacs-solver ~/tmp/goto_8_08a.min 
Problem type: min
Num of nodes: 256
Num of arcs:  2048

Sum of supply values: 0
GEQ supply contraints are used for NetworkSimplex

Read the file: u: 0s, s: 0s, cu: 0s, cs: 0s, real: 0.00467205s
Setup NetworkSimplex class: u: 0s, s: 0s, cu: 0s, cs: 0s, real: 7.70092e-05s
Run NetworkSimplex: u: 0s, s: 0s, cu: 0s, cs: 0s, real: 0.00143886s

Feasible flow: found
Min flow cost: 560870539
```

## More Tests

`min_cost_flow` now generates SDDM matrices, because it adds a regularization term for better robustness. This means we always use SDDM solvers, and approxCholSddm seems to be pretty slow compared to its Laplacian counterpart.
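
To see why regularization changes the matrix class, here is a small sketch for Julia 1.x. It assumes the linear systems have the form `B' * Diagonal(w) * B` plus a multiple of the identity; the exact form used inside `min_cost_flow.jl` is not shown in this notebook, and the weights `w` and the shift `delta` are made-up values.

```julia
using SparseArrays, LinearAlgebra

edges = [1 2; 2 4; 1 3; 3 4]
m, n = size(edges, 1), maximum(edges)
B = sparse(collect(1:m), edges[:, 1], ones(m), m, n) -
    sparse(collect(1:m), edges[:, 2], ones(m), m, n)

w = [1.0, 2.0, 3.0, 4.0]      # made-up positive edge weights
L = B' * Diagonal(w) * B      # weighted graph Laplacian: rows sum to zero, singular
delta = 1e-3                  # made-up regularization strength
M = L + delta * I             # SDDM: strictly diagonally dominant, nonsingular

rowsums_L = vec(sum(L, dims=2))
rowsums_M = vec(sum(M, dims=2))
println(rowsums_L, " ", rowsums_M)
```

A Laplacian solver can exploit the zero row sums. Once the diagonal shift is added, that structure is gone and an SDDM solver is needed instead.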

In [16]:
mcfp = readDimacsMCF("/Users/anuprao/Documents/DIMACSFlow/goto_8_13a.min")

MCFproblem{Float64,Int64}([1 2; 1 3; … ; 7362 7771; 7771 8180], [8642.0, 741.0, 453.0, 62.0, 341.0, 1029.0, 2640.0, 1762.0, 155.0, 31.0  …  148449.0, 148449.0, 148449.0, 148449.0, 148449.0, 148449.0, 148449.0, 148449.0, 148449.0, 148449.0], [28.0, 714.0, 451.0, 822.0, 0.0, 7.0, 518.0, 29.0, 298.0, 973.0  …  50.0, 50.0, 50.0, 50.0, 50.0, 50.0, 50.0, 50.0, 50.0, 50.0], [148449.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])

In [17]:
f = MCFmosek(mcfp)
reportMCFresults(mcfp,f)

Problem
  Name                   :                 
  Objective sense        : min             
  Type                   : LO (linear optimization problem)
  Constraints            : 8192            
  Cones                  : 0               
  Scalar variables       : 65536           
  Matrix variables       : 0               
  Integer variables      : 0               

Optimizer started.
Interior-point optimizer started.
Presolve started.
Linear dependency checker started.
Linear dependency checker terminated.
Eliminator - tries                  : 0                 time                   : 0.00            
Lin. dep.  - tries                  : 1                 time                   : 0.01            
Lin. dep.  - number                 : 0               
Presolve terminated. Time: 0.12    
GP based matrix reordering started.
GP based matrix reordering terminated.
Optimizer  - threads                : 4               
Optimizer  - solved problem         : the primal      
Optimiz


In [18]:
@time sol = min_cost_flow(mcfp, lapSolver = approxCholSddm, tol = 1e-5)
flow = sol[1]
reportMCFresults(mcfp,flow)

number of nodes =8192, number of edges=65536
maximum x.*s =3.112202e+10, minimum x.*s=4.808836e+04, mu=2.121781e+09
maximum (u-x).*z =1.511367e+09, minimum (u-x).*z=4.761271e+04, mu=2.121781e+09

Iteration 1, ||r_p||/||b||=2.158152e+00, ||r_d||/||c||=1.706092e+02, rel. gap=2.867166e-02, alpha=1.000000e+00
maximum theta =5.536111e+05, minimum theta=1.555805e+01, cond theta=3.558358e+04
maximum theta inverse =6.427540e-02, minimum theta inverse =1.806322e-06, cond theta inverse=3.558358e+04
Time taken to build is : 
  0.308926 seconds (250.11 k allocations: 33.870 MiB, 0.64% gc time)
Time taken to solve is : 
  0.021249 seconds (134 allocations: 2.382 MiB)
  0.021511 seconds (432 allocations: 2.514 MiB)
normal eq. residual =2.228548e-01
pred. norm(res_p_saddle_1) =2.228548e-01, norm(res_d_saddle_1)=6.645769e-09
  0.017636 seconds (138 allocations: 2.507 MiB)
normal eq. residual =8.148017e-08
pred. norm(res_p_saddle_2) =8.147307e-08, norm(res_d_saddle_2)=1.780364e-08
Time taken to solve i


Just for comparison, we use the backSlash solver again. Note that it does not actually build an operator (it returns a closure that calls `A\b` on every solve), so the build time in the output is irrelevant. Here both solvers perform similarly.
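
The difference between "build" and "solve" can be made concrete with two closures that have the same shape as `backSlash`. This is a sketch for Julia 1.x. The names `lazySolver` and `factoredSolver` are made up, and the sparse Cholesky factorization stands in for a real build step; it is not how `approxCholSddm` works.

```julia
using SparseArrays, LinearAlgebra

# Same idea as backSlash: no work at build time, a full solve on every call.
lazySolver(A) = b -> A \ b

# Hypothetical alternative: pay for a factorization once, reuse it per solve.
function factoredSolver(A)
    F = cholesky(Symmetric(A))    # the build step
    return b -> F \ b             # each solve is only triangular solves
end

# Small tridiagonal SDDM test matrix.
n = 5
A = spdiagm(-1 => -ones(n - 1), 0 => 2.0 * ones(n), 1 => -ones(n - 1)) + 0.1 * I
b = collect(1.0:n)

x_lazy = lazySolver(A)(b)
x_fact = factoredSolver(A)(b)
println(norm(x_lazy - x_fact))
```

With the lazy closure, the reported build time is only the cost of creating the closure, which matches the microsecond build times in the backSlash logs.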

In [19]:
@time sol = min_cost_flow(mcfp, lapSolver = backSlash, tol = 1e-5)
flow = sol[1]
reportMCFresults(mcfp,flow)

number of nodes =8192, number of edges=65536
maximum x.*s =3.112202e+10, minimum x.*s=4.808836e+04, mu=2.121781e+09
maximum (u-x).*z =1.511367e+09, minimum (u-x).*z=4.761271e+04, mu=2.121781e+09

Iteration 1, ||r_p||/||b||=2.158152e+00, ||r_d||/||c||=1.706092e+02, rel. gap=2.867166e-02, alpha=1.000000e+00
maximum theta =5.536111e+05, minimum theta=1.555805e+01, cond theta=3.558358e+04
maximum theta inverse =6.427540e-02, minimum theta inverse =1.806322e-06, cond theta inverse=3.558358e+04
Time taken to build is : 
  0.000003 seconds (1 allocation: 16 bytes)
Time taken to solve is : 
  0.082188 seconds (61 allocations: 16.678 MiB, 1.19% gc time)
normal eq. residual =2.420795e-09
pred. norm(res_p_saddle_1) =2.422809e-09, norm(res_d_saddle_1)=6.835851e-09
Time taken to solve is : 
  0.094039 seconds (61 allocations: 16.678 MiB)
normal eq. residual =3.460362e-08
corr. norm(res_p_saddle_1) =3.494998e-08, norm(res_d_saddle_1)=5.040942e-06
normal eq. residual =1.365831e-16
corr. norm(res_p_sa


Here is what Lemon did:

```
Dans17:tools spielman$ ./dimacs-solver ~/tmp/goto_8_13a.min 
Problem type: min
Num of nodes: 8192
Num of arcs:  65536

Sum of supply values: 0
GEQ supply contraints are used for NetworkSimplex

Read the file: u: 0.08s, s: 0s, cu: 0s, cs: 0s, real: 0.0842741s
Setup NetworkSimplex class: u: 0s, s: 0s, cu: 0s, cs: 0s, real: 0.00165105s
Run NetworkSimplex: u: 0.43s, s: 0s, cu: 0s, cs: 0s, real: 0.439616s

Feasible flow: found
Min flow cost: 18217956686
```

It was absurdly faster.  But, I won't worry about that yet.


Here is how Goldberg's CS2 code does, 
which I obtained from

<a href="https://github.com/iveney/cs2">https://github.com/iveney/cs2</a>

```
Dans17:cs2 spielman$ time ./cs2 < ~/tmp/goto_8_13a.min > out.txt
warning: this program uses gets(), which is unsafe.

real	0m2.425s
user	0m2.344s
sys	0m0.027s

Dans17:cs2 spielman$ head out.txt 
c CS 4.6
c Commercial use requires a licence
c contact igsys@eclipse.net
c
c nodes:            8192     arcs:            65536
c scale-factor:       12     cut-off-factor:   79.1
c
c time:             2.25     cost:      18217956686
c refines:             4     discharges:    3409417
c pushes:        3621130     relabels:      1381573
```

## goto_8_14b.min

In [8]:
mcfp = readDimacsMCF("/Users/anuprao/Documents/DIMACSFlow/goto_8_14b.min")

MCFproblem{Float64,Int64}([1 2; 1 3; … ; 15120 15750; 15750 16380], [2481.0, 951.0, 346.0, 61.0, 8368.0, 9891.0, 6933.0, 783.0, 155.0, 37.0  …  202084.0, 202084.0, 202084.0, 202084.0, 202084.0, 202084.0, 202084.0, 202084.0, 202084.0, 202084.0], [807.0, 189.0, 40.0, 285.0, 4.0, 5.0, 323.0, 675.0, 464.0, 19.0  …  38.0, 38.0, 38.0, 38.0, 38.0, 38.0, 38.0, 38.0, 38.0, 38.0], [202084.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, -202084.0, 0.0, 0.0, 0.0, 0.0])

In [21]:
f = MCFmosek(mcfp)
reportMCFresults(mcfp,f)

Problem
  Name                   :                 
  Objective sense        : min             
  Type                   : LO (linear optimization problem)
  Constraints            : 16384           
  Cones                  : 0               
  Scalar variables       : 131072          
  Matrix variables       : 0               
  Integer variables      : 0               

Optimizer started.
Interior-point optimizer started.
Presolve started.
Linear dependency checker started.
Linear dependency checker terminated.
Eliminator - tries                  : 0                 time                   : 0.00            
Lin. dep.  - tries                  : 1                 time                   : 0.01            
Lin. dep.  - number                 : 0               
Presolve terminated. Time: 0.29    
GP based matrix reordering started.
GP based matrix reordering terminated.
Optimizer  - threads                : 4               
Optimizer  - solved problem         : the primal      
Optimiz


In [9]:
@time sol = min_cost_flow(mcfp, lapSolver = approxCholSddm, tol = 1e-5)
flow = sol[1]
reportMCFresults(mcfp,flow)

number of nodes =16384, number of edges=131072
maximum x.*s =5.830516e+10, minimum x.*s=6.598886e+04, mu=3.903648e+09
maximum (u-x).*z =2.352882e+09, minimum (u-x).*z=6.540233e+04, mu=3.903648e+09

Iteration 1, ||r_p||/||b||=2.485507e+00, ||r_d||/||c||=2.358563e+02, rel. gap=2.538213e-02, alpha=1.000000e+00
maximum theta =7.554044e+05, minimum theta=1.842139e+01, cond theta=4.100692e+04
maximum theta inverse =5.428473e-02, minimum theta inverse =1.323794e-06, cond theta inverse=4.100692e+04
Time taken to build is : 
  1.327878 seconds (485.87 k allocations: 61.225 MiB, 1.72% gc time)
Time taken to solve is : 
  0.040109 seconds (134 allocations: 4.757 MiB)
  0.040660 seconds (432 allocations: 5.014 MiB)
normal eq. residual =3.025736e-01
pred. norm(res_p_saddle_1) =3.025736e-01, norm(res_d_saddle_1)=1.152546e-08
  0.046729 seconds (138 allocations: 5.007 MiB)
normal eq. residual =1.459716e-07
pred. norm(res_p_saddle_2) =1.459739e-07, norm(res_d_saddle_2)=2.949383e-08
Time taken to solve

That converged! So the current min_cost_flow code is much more stable than before: as you can see in Dan's notebook, the previous version could not handle tol = 1e-5 on this problem. The build time for approxCholSddm is around 1.5 seconds (1.33 seconds in the first iteration above), while each solve takes only about 0.04 seconds. We will now compare with the backSlash solver. We will see that backSlash is faster overall.

In [23]:
@time sol = min_cost_flow(mcfp, lapSolver = backSlash, tol = 1e-5)
flow = sol[1]
reportMCFresults(mcfp,flow)

number of nodes =16384, number of edges=131072
maximum x.*s =5.830516e+10, minimum x.*s=6.598886e+04, mu=3.903648e+09
maximum (u-x).*z =2.352882e+09, minimum (u-x).*z=6.540233e+04, mu=3.903648e+09

Iteration 1, ||r_p||/||b||=2.485507e+00, ||r_d||/||c||=2.358563e+02, rel. gap=2.538213e-02, alpha=1.000000e+00
maximum theta =7.554044e+05, minimum theta=1.842139e+01, cond theta=4.100692e+04
maximum theta inverse =5.428473e-02, minimum theta inverse =1.323794e-06, cond theta inverse=4.100692e+04
Time taken to build is : 
  0.000004 seconds (1 allocation: 16 bytes)
Time taken to solve is : 
  0.168068 seconds (61 allocations: 34.819 MiB, 1.37% gc time)
normal eq. residual =3.193398e-09
pred. norm(res_p_saddle_1) =3.229308e-09, norm(res_d_saddle_1)=1.144650e-08
normal eq. residual =3.814071e-18
pred. norm(res_p_saddle_2) =2.960623e-10, norm(res_d_saddle_2)=2.929213e-08
Time taken to solve is : 
  0.210570 seconds (61 allocations: 34.819 MiB, 7.23% gc time)
normal eq. residual =1.687887e-07
co


And, here is Lemon:

```
Dans17:tools spielman$ ./dimacs-solver ~/tmp/goto_8_14b.min 
Problem type: min
Num of nodes: 16384
Num of arcs:  131072

Sum of supply values: 0
GEQ supply contraints are used for NetworkSimplex

Read the file: u: 0.18s, s: 0s, cu: 0s, cs: 0s, real: 0.187166s
Setup NetworkSimplex class: u: 0s, s: 0s, cu: 0s, cs: 0s, real: 0.00391507s
Run NetworkSimplex: u: 1.6s, s: 0.01s, cu: 0s, cs: 0s, real: 1.61133s

Feasible flow: found
Min flow cost: 41834525746
```


And, here is CS2:

```
Dans17:cs2 spielman$ time ./cs2 < ~/tmp/goto_8_14b.min > out.txt
warning: this program uses gets(), which is unsafe.

real	0m22.478s
user	0m21.870s
sys	0m0.096s
Dans17:cs2 spielman$ head out.txt 
c CS 4.6
c Commercial use requires a licence
c contact igsys@eclipse.net
c
c nodes:           16384     arcs:           131072
c scale-factor:       12     cut-off-factor:  107.3
c
c time:            21.70     cost:      41834525746
c refines:             5     discharges:    9131788
c pushes:        9571379     relabels:      3496218
```

## goto_8_16a
Lemon (which implements a network simplex code) starts to scale badly when given this problem, as does CS2.
Ours scales as we would expect.
This problem is 4 times bigger than `goto_8_14b`.

I stopped CS2 after 7 minutes, because it was driving me nuts.
Here is how Lemon did:

```
Dans17:tools spielman$ ./dimacs-solver ~/tmp/goto_8_16a.min 
Problem type: min
Num of nodes: 65536
Num of arcs:  524288

Sum of supply values: 0
GEQ supply contraints are used for NetworkSimplex

Read the file: u: 0.68s, s: 0.01s, cu: 0s, cs: 0s, real: 0.711415s
Setup NetworkSimplex class: u: 0.01s, s: 0.01s, cu: 0s, cs: 0s, real: 0.015645s
Run NetworkSimplex: u: 80.66s, s: 0.26s, cu: 0s, cs: 0s, real: 81.4035s
```
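
The scaling claim can be quantified from the four Lemon transcripts in this notebook. The sketch below (Julia 1.x, illustrative variable names) takes the "Run NetworkSimplex" wall-clock times and estimates the exponent `p` in `time ~ arcs^p` between consecutive problem sizes. It is a rough fit through single runs, not a careful benchmark.

```julia
# Arc counts and "Run NetworkSimplex" real times from the transcripts above:
# goto_8_08a, goto_8_13a, goto_8_14b, goto_8_16a.
arcs = [2048, 65536, 131072, 524288]
secs = [0.00143886, 0.439616, 1.61133, 81.4035]

# Empirical exponent between consecutive instances: log(time ratio) / log(size ratio).
exponents = [log(secs[i + 1] / secs[i]) / log(arcs[i + 1] / arcs[i])
             for i in 1:length(arcs) - 1]

println(exponents)
```

The last step is the telling one: the problem grows by a factor of 4 while the running time grows by a factor of about 50, an exponent near 2.8.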

The problem size is bigger now, but the SDDM solver does much worse than the `\` operator.

In [10]:
mcfp = readDimacsMCF("/Users/anuprao/Documents/DIMACSFlow/goto_8_16a.min")

MCFproblem{Float64,Int64}([1 2; 1 3; … ; 62244 63882; 63882 65520], [8642.0, 741.0, 453.0, 62.0, 341.0, 1029.0, 2640.0, 1762.0, 155.0, 31.0  …  331827.0, 331827.0, 331827.0, 331827.0, 331827.0, 331827.0, 331827.0, 331827.0, 331827.0, 331827.0], [28.0, 714.0, 451.0, 822.0, 0.0, 7.0, 518.0, 29.0, 298.0, 973.0  …  25.0, 25.0, 25.0, 25.0, 25.0, 25.0, 25.0, 25.0, 25.0, 25.0], [331827.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])

In [11]:
@time sol = min_cost_flow(mcfp, lapSolver = approxCholSddm, tol = 1e-2)
flow = sol[1]
reportMCFresults(mcfp,flow)

number of nodes =65536, number of edges=524288
  0.089174 seconds (110 allocations: 13.006 MiB)
maximum x.*s =1.598429e+11, minimum x.*s=1.093886e+05, mu=1.043113e+10
maximum (u-x).*z =4.322690e+09, minimum (u-x).*z=1.087647e+05, mu=1.043113e+10

Iteration 1, ||r_p||/||b||=2.919599e+00, ||r_d||/||c||=3.948697e+02, rel. gap=1.574089e-02, alpha=1.000000e+00
maximum theta =1.245791e+06, minimum theta=2.662204e+01, cond theta=4.679545e+04
maximum theta inverse =3.756285e-02, minimum theta inverse =8.027031e-07, cond theta inverse=4.679545e+04
Time taken to build is : 
  1.539400 seconds (2.03 M allocations: 273.725 MiB, 40.05% gc time)
Time taken to solve is : 
  0.418434 seconds (130 allocations: 18.007 MiB, 42.20% gc time)
  0.419498 seconds (1.23 k allocations: 19.053 MiB, 42.09% gc time)
normal eq. residual =1.176677e+00
pred. norm(res_p_saddle_1) =1.176677e+00, norm(res_d_saddle_1)=3.078707e-08
  0.348474 seconds (138 allocations: 20.007 MiB, 3.90% gc time)
normal eq. residual =9.9607

In [8]:
@time sol = min_cost_flow(mcfp, lapSolver = approxCholSddm, tol = 1e-2)
flow = sol[1]
reportMCFresults(mcfp,flow)

number of nodes =65536, number of edges=524288
maximum x.*s =1.598429e+11, minimum x.*s=1.093886e+05, mu=1.043113e+10
maximum (u-x).*z =4.322689e+09, minimum (u-x).*z=1.087647e+05, mu=1.043113e+10

Iteration 1, ||r_p||/||b||=2.919599e+00, ||r_d||/||c||=3.948697e+02, rel. gap=1.574089e-02, alpha=1.000000e+00
maximum theta =1.245791e+06, minimum theta=2.662205e+01, cond theta=4.679544e+04
maximum theta inverse =3.756285e-02, minimum theta inverse =8.027031e-07, cond theta inverse=4.679544e+04
Time taken to build is : 
  1.468512 seconds (2.03 M allocations: 273.723 MiB, 49.80% gc time)
Time taken to solve is : 
  0.157333 seconds (134 allocations: 19.007 MiB, 2.86% gc time)
  0.158791 seconds (1.15 k allocations: 20.046 MiB, 2.83% gc time)
normal eq. residual =1.164380e+00
pred. norm(res_p_saddle_1) =1.164380e+00, norm(res_d_saddle_1)=2.881475e-08
  0.190176 seconds (142 allocations: 21.008 MiB, 2.95% gc time)
normal eq. residual =8.481741e-07
pred. norm(res_p_saddle_2) =8.480871e-07, no

In [25]:
@time sol = min_cost_flow(mcfp, lapSolver = backSlash, tol=1e-2)
flow = sol[1]
reportMCFresults(mcfp, flow)

number of nodes =65536, number of edges=524288
maximum x.*s =1.598429e+11, minimum x.*s=1.093886e+05, mu=1.043113e+10
maximum (u-x).*z =4.322689e+09, minimum (u-x).*z=1.087647e+05, mu=1.043113e+10

Iteration 1, ||r_p||/||b||=2.919599e+00, ||r_d||/||c||=3.948697e+02, rel. gap=1.574089e-02, alpha=1.000000e+00
maximum theta =1.245791e+06, minimum theta=2.662205e+01, cond theta=4.679544e+04
maximum theta inverse =3.756285e-02, minimum theta inverse =8.027031e-07, cond theta inverse=4.679544e+04
Time taken to build is : 
  0.000002 seconds (1 allocation: 16 bytes)
Time taken to solve is : 
  1.260562 seconds (61 allocations: 165.589 MiB, 20.35% gc time)
normal eq. residual =4.927840e-09
pred. norm(res_p_saddle_1) =5.041946e-09, norm(res_d_saddle_1)=2.878446e-08
normal eq. residual =1.601956e-18
pred. norm(res_p_saddle_2) =5.343373e-10, norm(res_d_saddle_2)=7.059028e-08
Time taken to solve is : 
  1.018187 seconds (61 allocations: 165.589 MiB, 9.65% gc time)
normal eq. residual =8.418621e-07

Stacktrace:
 [1] depwarn(::String, ::Symbol) at ./deprecated.jl:70
 [2] abs(::Array{Float64,1}) at ./deprecated.jl:57
 [3] reportMCFresults(::MCFproblem{Float64,Int64}, ::Array{Float64,1}) at /Users/anuprao/Documents/repos/Laplacians.jl/src/flowUtils.jl:33
 [4] include_string(::String, ::String) at ./loading.jl:515
 [5] include_string(::Module, ::String, ::String) at /Users/anuprao/.julia/v0.6/Compat/src/Compat.jl:407
 [6] execute_request(::ZMQ.Socket, ::IJulia.Msg) at /Users/anuprao/.julia/v0.6/IJulia/src/execute_request.jl:154
 [7] eventloop(::ZMQ.Socket) at /Users/anuprao/.julia/v0.6/IJulia/src/eventloop.jl:
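
With `lapSolver = backSlash` the log reports a build time of 0.000002 seconds and solves of roughly 1.0 to 1.3 seconds each. That matches how `backSlash` is written: it only captures the matrix, so every call to the returned function pays for a fresh `A\b`. As an illustration only, here is a sketch of a wrapper that factors once at build time and reuses the factorization for each right-hand side. The name `cholSolverSketch` is hypothetical and not part of Laplacians.jl; the sketch assumes Julia 1.x standard libraries and a symmetric positive definite matrix, and it has not been checked against what `min_cost_flow` expects from a `lapSolver`.

```julia
using LinearAlgebra, SparseArrays

# Hypothetical alternative to backSlash: factor once, reuse for every solve.
# Assumes A is symmetric positive definite (as an SDDM matrix is).
function cholSolverSketch(A)
    F = cholesky(Symmetric(A))   # the expensive step, paid at build time
    return b -> F \ b            # each solve reuses the stored factorization
end

# Small SDDM example: path-graph Laplacian on 3 vertices plus the identity.
A = sparse([2.0 -1.0 0.0; -1.0 3.0 -1.0; 0.0 -1.0 2.0])
solve = cholSolverSketch(A)
b = [1.0, 0.0, -1.0]
x = solve(b)
println(norm(A * x - b))         # residual of the linear solve
```

The trade-off is the opposite of the one in the log: build becomes the costly phase and the repeated solves become cheap, which only pays off when the same matrix is solved against more than once.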

In [27]:
f = MCFmosek(mcfp)
reportMCFresults(mcfp,f)

Problem
  Name                   :                 
  Objective sense        : min             
  Type                   : LO (linear optimization problem)
  Constraints            : 65536           
  Cones                  : 0               
  Scalar variables       : 524288          
  Matrix variables       : 0               
  Integer variables      : 0               

Optimizer started.
Interior-point optimizer started.
Presolve started.
Linear dependency checker started.
Linear dependency checker terminated.
Eliminator - tries                  : 0                 time                   : 0.00            
Lin. dep.  - tries                  : 1                 time                   : 0.05            
Lin. dep.  - number                 : 0               
Presolve terminated. Time: 1.51    
GP based matrix reordering started.
GP based matrix reordering terminated.
Optimizer  - threads                : 4               
Optimizer  - solved problem         : the primal      
Optimiz

Stacktrace:
 [1] depwarn(::String, ::Symbol) at ./deprecated.jl:70
 [2] abs(::Array{Float64,1}) at ./deprecated.jl:57
 [3] reportMCFresults(::MCFproblem{Float64,Int64}, ::Array{Float64,1}) at /Users/anuprao/Documents/repos/Laplacians.jl/src/flowUtils.jl:33
 [4] include_string(::String, ::String) at ./loading.jl:515
 [5] include_string(::Module, ::String, ::String) at /Users/anuprao/.julia/v0.6/Compat/src/Compat.jl:407
 [6] execute_request(::ZMQ.Socket, ::IJulia.Msg) at /Users/anuprao/.julia/v0.6/IJulia/src/execute_request.jl:154
 [7] eventloop(::ZMQ.Socket) at /Users/anuprao/.julia/v0.6/IJulia/src/eventloop.jl:
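
The trace printed after each `reportMCFresults` call is a deprecation warning, not a failure: frame [2] shows `abs` being applied to an `Array{Float64,1}`, which Julia 0.6 deprecates in favour of the broadcast form `abs.(v)`. Line 33 of `flowUtils.jl` is not shown in this notebook, so the snippet below is a generic illustration with a made-up vector `v`, not the actual fix.

```julia
# Hypothetical stand-in for the vector that flowUtils.jl passes to abs.
v = [0.3, -0.7, 0.0, -0.1]

# abs(v) is deprecated on Julia 0.6 and raises a MethodError on Julia 1.x.
# The broadcast form works on 0.6 and later:
absv = abs.(v)

println(maximum(absv))
```

Applying the same one-character change wherever the source takes `abs` of a vector should silence the warning on 0.6 and keep the code working on later versions.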

We now try a larger instance, read from a DIMACS file. It has 8192 nodes and 741455 edges.

In [8]:
mcfp = readDimacsMCF("/Users/anuprao/Documents/DIMACSFlow/goto_sr_13e.min")

MCFproblem{Float64,Int64}([1 2; 1 3; … ; 7362 7771; 7771 8180], [7671.0, 8550.0, 5351.0, 246.0, 2704.0, 4032.0, 3524.0, 2699.0, 1568.0, 1584.0  …  5.92438e6, 5.92438e6, 5.92438e6, 5.92438e6, 5.92438e6, 5.92438e6, 5.92438e6, 5.92438e6, 5.92438e6, 5.92438e6], [24.0, 781.0, 979.0, 430.0, 402.0, 565.0, 629.0, 184.0, 71.0, 407.0  …  50.0, 50.0, 50.0, 50.0, 50.0, 50.0, 50.0, 50.0, 50.0, 50.0], [5.92438e6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
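
`readDimacsMCF` comes from the included source and its body is not shown here. For orientation, the sketch below reads the DIMACS minimum-cost flow text format: a `p min NODES ARCS` problem line, `n ID SUPPLY` lines for nodes with nonzero supply, `a SRC DST LOW CAP COST` lines for arcs, and `c` comment lines. The function name `parseDimacsMinSketch` is hypothetical, the sketch ignores the lower-bound field (it assumes all lower bounds are zero), and it returns plain arrays rather than an `MCFproblem`.

```julia
# Hypothetical sketch, not the notebook's readDimacsMCF.
function parseDimacsMinSketch(io::IO)
    dems = Float64[]
    src = Int[]; dst = Int[]
    caps = Float64[]; costs = Float64[]
    for line in eachline(io)
        tok = split(line)
        isempty(tok) && continue
        if tok[1] == "p"                 # p min NODES ARCS
            dems = zeros(Float64, parse(Int, tok[3]))
        elseif tok[1] == "n"             # n ID SUPPLY (unlisted nodes keep 0)
            dems[parse(Int, tok[2])] = parse(Float64, tok[3])
        elseif tok[1] == "a"             # a SRC DST LOW CAP COST (LOW ignored)
            push!(src, parse(Int, tok[2]))
            push!(dst, parse(Int, tok[3]))
            push!(caps, parse(Float64, tok[5]))
            push!(costs, parse(Float64, tok[6]))
        end                              # "c" comment lines fall through
    end
    return hcat(src, dst), caps, costs, dems
end

# Tiny made-up instance on a 4-vertex graph.
text = """
c tiny made-up instance
p min 4 4
n 1 10
n 4 -10
a 1 2 0 7 1
a 2 4 0 7 1
a 1 3 0 7 2
a 3 4 0 7 2
"""
edges, caps, costs, dems = parseDimacsMinSketch(IOBuffer(text))
```

The four returned arrays line up with the `MCFproblem(edges, caps, costs, dems)` constructor used at the start of the notebook, with positive entries of `dems` at sources as in the output above.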

In [29]:
f = MCFmosek(mcfp)
reportMCFresults(mcfp,f)

Problem
  Name                   :                 
  Objective sense        : min             
  Type                   : LO (linear optimization problem)
  Constraints            : 8192            
  Cones                  : 0               
  Scalar variables       : 741455          
  Matrix variables       : 0               
  Integer variables      : 0               

Optimizer started.
Interior-point optimizer started.
Presolve started.
Linear dependency checker started.
Linear dependency checker terminated.
Eliminator - tries                  : 0                 time                   : 0.00            
Lin. dep.  - tries                  : 1                 time                   : 0.04            
Lin. dep.  - number                 : 0               
Presolve terminated. Time: 1.39    
GP based matrix reordering started.
GP based matrix reordering terminated.
Optimizer  - threads                : 4               
Optimizer  - solved problem         : the primal      
Optimiz

Stacktrace:
 [1] depwarn(::String, ::Symbol) at ./deprecated.jl:70
 [2] abs(::Array{Float64,1}) at ./deprecated.jl:57
 [3] reportMCFresults(::MCFproblem{Float64,Int64}, ::Array{Float64,1}) at /Users/anuprao/Documents/repos/Laplacians.jl/src/flowUtils.jl:33
 [4] include_string(::String, ::String) at ./loading.jl:515
 [5] include_string(::Module, ::String, ::String) at /Users/anuprao/.julia/v0.6/Compat/src/Compat.jl:407
 [6] execute_request(::ZMQ.Socket, ::IJulia.Msg) at /Users/anuprao/.julia/v0.6/IJulia/src/execute_request.jl:154
 [7] eventloop(::ZMQ.Socket) at /Users/anuprao/.julia/v0.6/IJulia/src/eventloop.jl:

In [10]:
@time sol = min_cost_flow(mcfp, lapSolver = approxCholSddm, tol=1e-2)
flow = sol[1]
reportMCFresults(mcfp, flow)

number of nodes =8192, number of edges=741455
maximum x.*s =5.133115e+13, minimum x.*s=1.933079e+06, mu=2.894535e+11
maximum (u-x).*z =6.309197e+11, minimum (u-x).*z=1.932739e+06, mu=2.894535e+11

Iteration 1, ||r_p||/||b||=6.662155e-01, ||r_d||/||c||=5.711623e+03, rel. gap=1.163433e-01, alpha=1.000000e+00
maximum theta =2.175968e+07, minimum theta=5.483746e+01, cond theta=3.968033e+05
maximum theta inverse =1.823571e-02, minimum theta inverse =4.595655e-08, cond theta inverse=3.968033e+05
Time taken to build is : 
  1.538487 seconds (2.57 M allocations: 280.213 MiB, 37.42% gc time)
Time taken to solve is : 
  0.079909 seconds (114 allocations: 1.756 MiB)
  0.080322 seconds (1.10 k allocations: 1.919 MiB)
normal eq. residual =3.867029e+00
pred. norm(res_p_saddle_1) =3.867029e+00, norm(res_d_saddle_1)=2.530255e-06
  0.089413 seconds (114 allocations: 1.756 MiB)
normal eq. residual =1.847248e-06
pred. norm(res_p_saddle_2) =1.847781e-06, norm(res_d_saddle_2)=8.531868e-06
Time taken to sol
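
Each run opens by printing the extreme values of the complementarity products `x.*s` and `(u-x).*z` together with `mu`, followed by the extremes of `theta` and `cond theta`. The solver source is not shown, so the sketch below is only a guess at how such diagnostics could be computed. It assumes `mu` is the average of all 2m products, a common interior-point convention, and that `cond theta` is the ratio of the largest to the smallest entry, which the printed values do satisfy (2.175968e+07 / 5.483746e+01 is about 3.968033e+05). The names `complementaritySketch` and `spreadSketch` are hypothetical.

```julia
# Hypothetical helper; the names and the definition of mu are assumptions.
function complementaritySketch(x, s, z, u)
    xs = x .* s            # products tied to the lower bound x >= 0
    uz = (u .- x) .* z     # products tied to the upper bound x <= u
    mu = (sum(xs) + sum(uz)) / (2 * length(x))
    return (maximum(xs), minimum(xs), maximum(uz), minimum(uz), mu)
end

# Ratio of largest to smallest entry of a positive vector.
spreadSketch(theta) = maximum(theta) / minimum(theta)

# Made-up data: two edges with flows x, capacities u, and dual slacks s, z.
x = [1.0, 2.0]; s = [3.0, 4.0]
u = [2.0, 3.0]; z = [5.0, 6.0]
println(complementaritySketch(x, s, z, u))
println(spreadSketch([2.0, 8.0, 4.0]))
```

In the run above the products range from about 1.9e+06 to 5.1e+13 around `mu` = 2.9e+11, so under this reading the starting point is far from having all products equal.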

In [9]:
@time sol = min_cost_flow(mcfp, lapSolver = approxCholSddm, tol=1e-2)
flow = sol[1]
reportMCFresults(mcfp, flow)

number of nodes =8192, number of edges=741455
  0.042266 seconds (94 allocations: 1.129 MiB)
maximum x.*s =5.133115e+13, minimum x.*s=1.933079e+06, mu=2.894535e+11
maximum (u-x).*z =6.309198e+11, minimum (u-x).*z=1.932739e+06, mu=2.894535e+11

Iteration 1, ||r_p||/||b||=6.662155e-01, ||r_d||/||c||=5.711623e+03, rel. gap=1.163433e-01, alpha=1.000000e+00
maximum theta =2.175968e+07, minimum theta=5.483745e+01, cond theta=3.968033e+05
maximum theta inverse =1.823571e-02, minimum theta inverse =4.595655e-08, cond theta inverse=3.968033e+05
Time taken to build is : 
  1.235615 seconds (2.57 M allocations: 280.234 MiB, 18.72% gc time)
Time taken to solve is : 
  0.100657 seconds (118 allocations: 1.881 MiB)
  0.101171 seconds (1.13 k allocations: 2.050 MiB)
normal eq. residual =1.609928e+00
pred. norm(res_p_saddle_1) =1.609928e+00, norm(res_d_saddle_1)=2.697588e-06
  0.100768 seconds (118 allocations: 1.881 MiB)
normal eq. residual =5.079230e-07
pred. norm(res_p_saddle_2) =5.281414e-07, norm