
Improved benchmarks #259

Closed · pkofod wants to merge 2 commits

Conversation

@pkofod (Member) commented Aug 11, 2016

Oookay, this took way too long, but here's a first go.

Caveats: I'm using non-REQUIREd packages. This is not meant to be used by regular users, so let's not clutter REQUIRE.

So, what to show, what to show. This round of work on benchmarks is for internal benchmarking only; let's not worry about benchmarking against Matlab, R, or Python just yet. We should do that at some point to see whether our line searches are good or bad, and so on.

So we can start to ask ourselves some questions. For example, on the CUTEst problems with dimension between 1 and 100 (length(initial_x)), we can log the best objective value obtained by any solver (as far as I can see, CUTEst does not provide reference solutions), compute f_solver - f* for every problem and solver, and calculate the proportion of problems that fall within a given threshold.
[figure: f_err]
If we log x*, the x associated with the best of all objectives, we can also calculate how "close" each solver comes measured in x rather than f. This is relevant if you want to estimate parameters, for example, and aren't too concerned with hitting the exact minimum loss. Then we get the following.
[figure: x_err]
The Newton entries in the legend are a mistake.
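As a concrete illustration, here is a minimal sketch of the proportion-within-threshold calculation described above; the `results` dictionary and its layout are assumptions for the example, not the actual CSV format produced by the benchmark scripts.

```julia
# Sketch only: `results` maps (problem, solver) to the best objective value
# that solver reached; the layout is hypothetical, not the PR's CSV format.
function proportion_within(results::Dict, problems, solvers, tol)
    # The best value obtained by *any* solver stands in for the unknown f*.
    fstar = Dict(p => minimum(results[(p, s)] for s in solvers) for p in problems)
    props = Dict{Any,Float64}()
    for s in solvers
        f_err = [results[(p, s)] - fstar[p] for p in problems]
        props[s] = count(e -> e <= tol, f_err) / length(problems)
    end
    return props
end

# The x-based variant is analogous: replace f_err with norm(x_solver - xstar).
```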

*Notes*
Be aware that nothing is normalized. I do not want to normalize the objective values, since many of these problems have minima of exactly 0. I could normalize the xs, and probably should. (edit: or can I?)
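For what it's worth, one common way to sidestep the divide-by-(near-)zero issue when normalising the xs is to guard the denominator; a sketch of that option (my suggestion, not what the scripts currently do):

```julia
using LinearAlgebra  # for norm

# Relative x error with a guard so a tiny (or zero) reference norm
# does not blow up the ratio.
relative_x_err(x, xstar) = norm(x - xstar) / max(1.0, norm(xstar))
```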

We see that (L-)BFGS are the best of the first-order methods, followed (but not so closely) by momentum gradient descent. Nelder-Mead (the new one) actually does quite well for a zeroth-order method. Again, since we are looking at levels, 1e-30 is really, really low in floating-point terms, so the most interesting part is perhaps the middle of the figures.

For the good old UnconstrainedProblems we can do the same. There we actually have minima and minimizers available, so I compare to the true minimum rather than the "best obtained by a solver". The picture is a bit different here: quite a few of the problems are solved quite well by many solvers, and we also have second-order methods available. The good performance might be biased, because we've had these problems for so long and have used them to test and fine-tune algorithms when they didn't solve the problems.
[figure: f_err]
[figure: x_err]

TODO

  • We still need the numbers for the larger problems; I'm somewhat resource-constrained, memory- and time-wise.
  • Produce informative plots based on timings (for solvers and problems that actually solve correctly).
  • Should we keep the CSV files here? Will they clutter git, or is it fine if they're compressed?
  • Log error messages where solvers fail (see the sketch after this list). If the try-catch block ends up in the catch branch, it's almost always a line search problem. Could those be fixed?
  • Add line search benchmarks for the different solvers.
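Regarding the error-logging item, something along these lines would do; `run_solver` is a hypothetical helper standing in for whatever the benchmark script calls per problem/solver pair:

```julia
# Sketch of per-run error logging; `run_solver(problem, solver)` is a
# hypothetical stand-in for the actual benchmark call.
function benchmark_with_logging(problems, solvers)
    failures = Dict{Any,String}()
    for p in problems, s in solvers
        try
            run_solver(p, s)
        catch err
            # Record which problem/solver pair failed and the error message.
            failures[(p, s)] = sprint(showerror, err)
        end
    end
    return failures
end
```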

@codecov-io commented Aug 11, 2016

Current coverage is 85.04% (diff: 100%)

No coverage report found for master at 11c935e.

Powered by Codecov. Last update 11c935e...38acc08

@pkofod mentioned this pull request Aug 12, 2016
@pkofod (Member, Author) commented Aug 12, 2016

We should trace the total number of iterations, as suggested in #145. I even think that issue can be closed once that is done.

Edit: it's in there, so I guess that issue is fixed once this is merged.
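For reference, pulling the iteration and call counts out of a result object looks roughly like this with the current accessor names (the exact API at the time of this PR may have differed):

```julia
using Optim

rosenbrock(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
res = optimize(rosenbrock, zeros(2), LBFGS())

# Iteration and call counts that could be logged per benchmark run.
@show Optim.iterations(res) Optim.f_calls(res) Optim.g_calls(res)
```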

@pkofod force-pushed the master branch 2 times, most recently from 7b0207c to 276c98d on August 13, 2016
@pkofod (Member, Author) commented Aug 17, 2016

Would love to get feedback on storing the benchmark CSVs in the repo. Is it annoying git-wise?

@KristofferC (Contributor) commented:

Will they make the repo dirty if someone runs the benchmarks? In that case Julia will stop automatically updating the package for everyone who ran the benchmarks and didn't remember to clean the repo.

@pkofod (Member, Author) commented Aug 17, 2016

> Will they make the repo dirty if someone runs the benchmarks? In that case Julia will stop automatically updating the package for everyone who ran the benchmarks and didn't remember to clean the repo.

Good point. I guess we could add .csv to the .gitignore and only add the files that should be archived, using git add -f benchmarks/sha/cutest_benchmarks.csv (for example).

@johnmyleswhite (Contributor) commented:

I would suggest keeping these files in a separate repo. In the long run, if you run them frequently enough, you're going to end up with more of the repo's mass being CSV files than code.

@pkofod (Member, Author) commented Aug 17, 2016

What about a separate branch on this repo? The script can just check out the branch, do what it needs to do, commit, and check out whatever branch was active. Unless people request the branch, it won't clutter their .julia. I'd be fine with OptimBenchmarkReports.jl or whatever, though.

@johnmyleswhite (Contributor) commented:

But that branch still gets downloaded when you do git clone, doesn't it? Maybe it'll take years, but I'm concerned that you'll eventually end up in a state in which you stop adding more benchmarks because of download speed complaints.

@pkofod (Member, Author) commented Aug 17, 2016

> But that branch still gets downloaded when you do git clone, doesn't it? Maybe it'll take years, but I'm concerned that you'll eventually end up in a state in which you stop adding more benchmarks because of download speed complaints.

OptimBenchmarks.jl, then? I figure the benchmark scripts should go there as well. It could also be broader, like an OptimExtras.jl that could also contain Jupyter notebooks with examples.

@pkofod (Member, Author) commented Aug 17, 2016

Slightly larger test set (n in (1, 500)); the picture is the same, though.
CUTEst
[figure: f_err_cutest]
[figure: x_err_cutest]
UnconstrainedProblems
[figure: f_err_optim]
[figure: x_err_optim]

Edit: By the way, I added the error logging, and across all the unconstrained problems in CUTEst with dimension between 1 and 500 there were only 7 instances of (line search) errors, most of them from accelerated and momentum gradient descent. My guess is that they end up taking a very large step into some area of the domain where the function overflows, is flat, or similar.

@pkofod (Member, Author) commented Sep 4, 2016

JuliaSmoothOptimizers/CUTEst.jl#69 is very nice for the variable-size problems!

@pkofod mentioned this pull request Oct 22, 2016
@cortner (Contributor) commented Oct 22, 2016

@pkofod What do you think of a train function to which you give a few prototype problems? It would then try a wide range of optimisers with a range of parameters and return an OptimRecipe that works well for this class of problems.
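Purely to make the idea concrete, a speculative sketch of such an interface; the train signature, the OptimRecipe fields, and the assumption that each prototype problem carries f and x0 fields are all mine, not an existing API:

```julia
using Optim

struct OptimRecipe
    solver                 # e.g. LBFGS(), NelderMead(), ...
    options::Optim.Options
end

# Try each candidate solver on the prototype problems and keep the one with
# the lowest total objective value (one of many possible selection criteria).
function train(problems; candidates = (LBFGS(), BFGS(), NelderMead()))
    best, best_score = first(candidates), Inf
    for solver in candidates
        score = sum(Optim.minimum(optimize(p.f, p.x0, solver)) for p in problems)
        if score < best_score
            best, best_score = solver, score
        end
    end
    return OptimRecipe(best, Optim.Options())
end
```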

@pkofod (Member, Author) commented Dec 20, 2016

Moved to the more appropriate place: JuliaNLSolvers/OptimTests.jl#7

@pkofod closed this Dec 20, 2016
@pkofod deleted the pkm/benchmarks branch on April 8, 2017