[MRG] Add benchmarking script for multilabel metrics #2643
def benchmark(metrics=[v for k, v in sorted(METRICS.items())],
              formats=[v for k, v in sorted(FORMATS.items())],
              samples=1000, classes=4, density=.2,
              n_times=5):
Can you use tuples instead of lists for the default function arguments?
You're concerned that they're mutable?
Yes, I think it is better to use immutable objects for default arguments.
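For readers unfamiliar with the concern, here is a minimal sketch of why mutable defaults are usually avoided in Python. The function names `append_bad` and `collect_good` are hypothetical and are not part of the benchmark script; the diff shown here does not tell us whether `benchmark` ever mutates its defaults, so this only illustrates the general pitfall.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # and the same object is reused on every call that omits `bucket`.
    bucket.append(item)
    return bucket


def collect_good(item, bucket=()):
    # A tuple default cannot be mutated in place, so each call
    # builds and returns a fresh list instead.
    return list(bucket) + [item]


first = append_bad(1)
second = append_bad(2)  # same list object as `first`, now holding both items

fresh_a = collect_good(1)
fresh_b = collect_good(2)  # independent of `fresh_a`
```

With a tuple default, state cannot leak between calls, which is the safety the review comment is asking for.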
Except for the minor comments, +1 to merge.
@arjoly Okay, yes, it's quick-and-dirty code. I don't think that's a big deal for benchmarks, but I'll get some of the lint out of it.
Thanks!
Your benchmark could be improved by adding dense C-layout and dense Fortran-layout.
Only if you want to see closely-overlapping curves...
Merged! Thanks for the bench!!
These are not very important metrics in the context of scikit-learn. Yet whenever metric implementations get changed, people seem to be interested in how the change affects execution time. This script makes such reports easy to produce.
It benchmarks metrics for different multilabel target formats, which also gives us an idea of their relative performance. Benchmarks are otherwise parametrised by (number of samples, number of classes, average density of positive labels), one of which may be plotted against time.
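As a rough illustration of the idea (not the script added in this PR), a format-by-format timing loop might look like the sketch below. It uses only the standard library. The helpers `make_indicator`, `to_label_sets`, `hamming_indicator` and `hamming_sets` are hypothetical stand-ins for real metric implementations and target formats; only the parameter names `samples`, `classes`, `density` and `n_times` are taken from the signature under review.

```python
import random
import timeit


def make_indicator(samples, classes, density, seed=0):
    # Dense 0/1 indicator rows; each label is positive with probability `density`.
    rng = random.Random(seed)
    return [[1 if rng.random() < density else 0 for _ in range(classes)]
            for _ in range(samples)]


def to_label_sets(indicator):
    # Same targets, expressed as one set of positive label indices per sample.
    return [{j for j, v in enumerate(row) if v} for row in indicator]


def hamming_indicator(y_true, y_pred):
    # Fraction of label slots that differ, computed on indicator rows.
    n_classes = len(y_true[0])
    diff = sum(a != b for rt, rp in zip(y_true, y_pred) for a, b in zip(rt, rp))
    return diff / (len(y_true) * n_classes)


def hamming_sets(y_true, y_pred, n_classes):
    # Same quantity, computed from symmetric differences of label sets.
    diff = sum(len(t ^ p) for t, p in zip(y_true, y_pred))
    return diff / (len(y_true) * n_classes)


def benchmark(samples=1000, classes=4, density=0.2, n_times=5):
    # Build one random problem, then time the metric once per target format.
    y_true = make_indicator(samples, classes, density, seed=0)
    y_pred = make_indicator(samples, classes, density, seed=1)
    s_true, s_pred = to_label_sets(y_true), to_label_sets(y_pred)
    timers = {
        "indicator": lambda: hamming_indicator(y_true, y_pred),
        "label_sets": lambda: hamming_sets(s_true, s_pred, classes),
    }
    # Mean seconds per call for each format.
    return {name: timeit.timeit(fn, number=n_times) / n_times
            for name, fn in timers.items()}


if __name__ == "__main__":
    # Vary one parameter (here the number of samples) and report time per format.
    for n in (100, 1000, 10000):
        print(n, benchmark(samples=n))
```

Sweeping `classes` or `density` instead of `samples` follows the same pattern, and the resulting timings can be passed to any plotting library.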