
Improved benchmarks #259

Closed · pkofod wants to merge 2 commits

Conversation

@pkofod (Member) commented Aug 11, 2016

Oookay, this took way too long, but here's a first go.

Caveats: I'm using non-REQUIREd packages. This is not meant to be used by regular users, so let's not clutter REQUIRE.

So, what to show, what to show. This round of work on benchmarks is for internal benchmarking only; let's not worry about benchmarking against Matlab, R, or Python just yet. We should do that at some point to see whether our line searches are good or bad, and so on.

So we can start to ask ourselves some questions. For example, on the CUTEst problems with dimension between 1 and 100 (length(initial_x)), we can log the best objective value obtained by any solver (as far as I can see, CUTEst does not provide reference solutions), compute f_solver - f* for every problem and solver, and calculate the proportion of problems that fall within a given threshold.
[figure: f_err]
If we log x*, the x associated with the best of all objectives, we can also calculate how "close" each solver comes measured in x rather than f. This is relevant if you want to estimate parameters, for example, and aren't too concerned with hitting the exact minimum loss. Then we get the following.
[figure: x_err]
The Newton entries in the legend are a mistake.
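As a concrete illustration, here is a minimal sketch of the proportion-within-threshold calculation described above; the `results` dictionary and its layout are assumptions for the example, not the actual CSV format produced by the benchmark scripts.

```julia
# Sketch only: `results` maps (problem, solver) to the best objective value
# that solver reached; the layout is hypothetical, not the PR's CSV format.
function proportion_within(results::Dict, problems, solvers, tol)
    # The best value obtained by *any* solver stands in for the unknown f*.
    fstar = Dict(p => minimum(results[(p, s)] for s in solvers) for p in problems)
    props = Dict{Any,Float64}()
    for s in solvers
        f_err = [results[(p, s)] - fstar[p] for p in problems]
        props[s] = count(e -> e <= tol, f_err) / length(problems)
    end
    return props
end

# The x-based variant is analogous: replace f_err with norm(x_solver - xstar).
```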

*Notes*
Be aware that nothing is normalized. I do not want to normalize the objective values, since many of these problems have minima of exactly 0. I could normalize the xs, and probably should. (edit: or can I?)
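For what it's worth, one common way to sidestep the divide-by-(near-)zero issue when normalising the xs is to guard the denominator; a sketch of that option (my suggestion, not what the scripts currently do):

```julia
using LinearAlgebra  # for norm

# Relative x error with a guard so a tiny (or zero) reference norm
# does not blow up the ratio.
relative_x_err(x, xstar) = norm(x - xstar) / max(1.0, norm(xstar))
```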

We see that (L-)BFGS are the best of the first-order methods, followed (but not so closely) by momentum gradient descent. Nelder-Mead (the new one) actually does quite well for a zeroth-order method. Again, since we are looking at levels, 1e-30 is really, really low in floating-point terms, so the most interesting part is perhaps the middle of the figures.

For the good old UnconstrainedProblems we can do the same. There we actually have minima and minimizers available, so I compare to the true minimum rather than the "best obtained by a solver". The picture is a bit different here: quite a few of the problems are solved quite well by many solvers, and we also have second-order methods available. The good performance might be biased, because we've had these problems for so long and have used them to test and fine-tune algorithms when they didn't solve the problems.
[figure: f_err]
[figure: x_err]

TODO

  • We still need the numbers for the larger problems; I'm somewhat resource-constrained, memory- and time-wise.
  • Produce informative plots based on timings (for solvers and problems that actually solve correctly).
  • Should we keep the CSV files here? Will they clutter git, or is it fine if they're compressed?
  • Log error messages where solvers fail (see the sketch after this list). If the try-catch block ends up in the catch branch, it's almost always a line search problem. Could those be fixed?
  • Add line search benchmarks for the different solvers.
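Regarding the error-logging item, something along these lines would do; `run_solver` is a hypothetical helper standing in for whatever the benchmark script calls per problem/solver pair:

```julia
# Sketch of per-run error logging; `run_solver(problem, solver)` is a
# hypothetical stand-in for the actual benchmark call.
function benchmark_with_logging(problems, solvers)
    failures = Dict{Any,String}()
    for p in problems, s in solvers
        try
            run_solver(p, s)
        catch err
            # Record which problem/solver pair failed and the error message.
            failures[(p, s)] = sprint(showerror, err)
        end
    end
    return failures
end
```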

@codecov-io commented Aug 11, 2016

Current coverage is 85.04% (diff: 100%)

No coverage report found for master at 11c935e.

Powered by Codecov. Last update 11c935e...38acc08

@pkofod mentioned this pull request Aug 12, 2016
@pkofod (Member, Author) commented Aug 12, 2016

We should trace the total number of iterations, as suggested in #145. I even think that issue can be closed once that is done.

Edit: it's in there, so I guess that issue is fixed once this is merged.
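For reference, pulling the iteration and call counts out of a result object looks roughly like this with the current accessor names (the exact API at the time of this PR may have differed):

```julia
using Optim

rosenbrock(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
res = optimize(rosenbrock, zeros(2), LBFGS())

# Iteration and call counts that could be logged per benchmark run.
@show Optim.iterations(res) Optim.f_calls(res) Optim.g_calls(res)
```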

@pkofod force-pushed the master branch 2 times, most recently from 7b0207c to 276c98d on August 13, 2016
@pkofod (Member, Author) commented Aug 17, 2016

Would love to get feedback on storing the benchmark CSVs in the repo. Is it annoying git-wise?

@KristofferC (Contributor) commented:

Will they make the repo dirty if someone runs the benchmarks? In that case Julia will stop automatically updating the package for everyone who ran the benchmarks and didn't remember to clean the repo.

@pkofod (Member, Author) commented Aug 17, 2016

> Will they make the repo dirty if someone runs the benchmarks? In that case Julia will stop automatically updating the package for everyone who ran the benchmarks and didn't remember to clean the repo.

Good point. I guess we could add .csv to the .gitignore and only add the files that should be archived, using git add -f benchmarks/sha/cutest_benchmarks.csv (for example).

@johnmyleswhite (Contributor) commented:

I would suggest keeping these files in a separate repo. In the long run, if you run them frequently enough, you're going to end up with more of the repo's mass being CSV files than code.

@pkofod (Member, Author) commented Aug 17, 2016

What about a separate branch on this repo? The script can just check out the branch, do what it needs to do, commit, and check out whatever branch was active. Unless people request the branch, it won't clutter their .julia. I'd be fine with OptimBenchmarkReports.jl or whatever, though.

@johnmyleswhite (Contributor) commented:

But that branch still gets downloaded when you do git clone, doesn't it? Maybe it'll take years, but I'm concerned that you'll eventually end up in a state in which you stop adding more benchmarks because of download speed complaints.

@pkofod (Member, Author) commented Aug 17, 2016

> But that branch still gets downloaded when you do git clone, doesn't it? Maybe it'll take years, but I'm concerned that you'll eventually end up in a state in which you stop adding more benchmarks because of download speed complaints.

OptimBenchmarks.jl, then? I figure the benchmark scripts should go there as well. It could also be broader, like an OptimExtras.jl that could also contain Jupyter notebooks with examples.

@pkofod (Member, Author) commented Aug 17, 2016

Slightly larger test set (n in (1, 500)); the picture is the same, though.
CUTEst
[figure: f_err_cutest]
[figure: x_err_cutest]
UnconstrainedProblems
[figure: f_err_optim]
[figure: x_err_optim]

Edit: By the way, I added the error logging, and across all the unconstrained problems in CUTEst with dimension between 1 and 500 there were only 7 instances of (line search) errors, most of them from accelerated and momentum gradient descent. My guess is that they end up taking a very large step into some area of the domain where the function overflows, is flat, or similar.

@pkofod (Member, Author) commented Sep 4, 2016

JuliaSmoothOptimizers/CUTEst.jl#69 is very nice for the variable-size problems!

@pkofod mentioned this pull request Oct 22, 2016
@cortner (Contributor) commented Oct 22, 2016

@pkofod What do you think of a train function to which you give a few prototype problems? It would then try a wide range of optimisers with a range of parameters and return an OptimRecipe that works well for this class of problems.
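Purely to make the idea concrete, a speculative sketch of such an interface; the train signature, the OptimRecipe fields, and the assumption that each prototype problem carries f and x0 fields are all mine, not an existing API:

```julia
using Optim

struct OptimRecipe
    solver                 # e.g. LBFGS(), NelderMead(), ...
    options::Optim.Options
end

# Try each candidate solver on the prototype problems and keep the one with
# the lowest total objective value (one of many possible selection criteria).
function train(problems; candidates = (LBFGS(), BFGS(), NelderMead()))
    best, best_score = first(candidates), Inf
    for solver in candidates
        score = sum(Optim.minimum(optimize(p.f, p.x0, solver)) for p in problems)
        if score < best_score
            best, best_score = solver, score
        end
    end
    return OptimRecipe(best, Optim.Options())
end
```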

@pkofod (Member, Author) commented Dec 20, 2016

Moved to the more appropriate place: JuliaNLSolvers/OptimTests.jl#7

@pkofod closed this Dec 20, 2016
@pkofod deleted the pkm/benchmarks branch on April 8, 2017