Create a benchmark for LibLinear/LibSVM to quantify past and future improvements to the C code #16864

smarie · 2020-04-07T07:09:56Z

Following PR #13511, it appears that there is no reference benchmark for SVMs in scikit-learn or in any side project (sklearn-contrib).

This seems quite risky in the long run. Maybe we should create one, especially to quantify the impact of changes to the C code such as those in PR #13511.

I have worked quite a bit on creating reference benchmarks over the past few years, which led to two tools in the pytest ecosystem, pytest-cases and pytest-harvest, with the beginnings of a tutorial here (outdated, I'm afraid). I can therefore certainly try to help with a benchmark framework structure if you find the idea interesting.

However, I do not know of a good set of reference datasets to start with (apart from creating challenging ones "by hand").
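To make the idea concrete, here is a minimal sketch of what one benchmark row could record. It uses only the standard library and does not use pytest-cases or pytest-harvest. The names `benchmark_fit` and `DummyEstimator` are hypothetical and exist only for illustration. It assumes an estimator with a scikit-learn-like `fit(X, y)` method and reads `n_iter_` only when the estimator exposes it.

```python
import statistics
import time


def benchmark_fit(make_estimator, datasets, n_repeats=3):
    """Time estimator.fit on each named dataset; return one result row per dataset."""
    rows = []
    for name, (X, y) in datasets.items():
        durations = []
        for _ in range(n_repeats):
            estimator = make_estimator()  # fresh estimator for every run
            start = time.perf_counter()
            estimator.fit(X, y)
            durations.append(time.perf_counter() - start)
        rows.append({
            "dataset": name,
            "median_fit_seconds": statistics.median(durations),
            # Iteration count of the last run, if the estimator reports one.
            "n_iter": getattr(estimator, "n_iter_", None),
        })
    return rows


class DummyEstimator:
    """Hypothetical stand-in with a scikit-learn-like fit(); swap in a real SVM."""

    def fit(self, X, y):
        self.n_iter_ = len(X)  # placeholder value, not a real solver count
        return self


if __name__ == "__main__":
    toy = {"toy": ([[0.0], [1.0]], [0, 1])}
    for row in benchmark_fit(DummyEstimator, toy):
        print(row)
```

With scikit-learn installed, `make_estimator` could be something like `lambda: sklearn.svm.LinearSVC()`, and the dataset dictionary could be filled from `sklearn.datasets.make_classification` until a proper reference set is agreed on. Comparing the rows before and after a change to the C code would then show its effect on fit time and iteration count.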


rth · 2020-04-07T07:14:21Z

That would be a good idea. See #16723

Closing to avoid duplicates. Would you mind copying part of your message there? Thanks @smarie!

smarie added the New Feature label Apr 7, 2020

rth closed this as completed Apr 7, 2020

This was referenced Apr 7, 2020

[MRG] Libsvm and liblinear rand() fix for convergence on windows targets (and improvement on all targets) #13511

Merged

Add benchmarks #16723

Closed
