
Run benchmarks multiple times #49

Closed
benmccann opened this issue Nov 6, 2019 · 7 comments

@benmccann
Contributor

The benchmarks seem rather variable since they're based on only a single run currently. I think the numbers would be more reliable if the benchmark harness created a new chart in a loop 50 or 100 times and then took the average.
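Something like this rough sketch is what I have in mind; `createChart` here is a hypothetical stand-in for whatever the bench page actually does, not the repo's real code:

```js
// Rough sketch of the proposal: time N in-page chart creations and
// average them. `createChart` is a hypothetical placeholder.
function benchAverage(createChart, runs = 50) {
  const times = [];
  for (let i = 0; i < runs; i++) {
    const el = document.createElement("div");
    document.body.appendChild(el);
    const t0 = performance.now();
    createChart(el); // build one fresh chart
    times.push(performance.now() - t0);
    el.remove(); // tear down before the next iteration
  }
  return times.reduce((a, b) => a + b, 0) / times.length;
}
```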

@leeoniya
Owner

leeoniya commented Nov 6, 2019

my experience with benchmarking comes from having been involved with https://github.com/krausest/js-framework-benchmark for a number of years.

there are a bunch of issues here, really.

  • i don't want this repo to become the definitive chart-benchmarks repo. we can make a separate org, make a line-charts repo in it and co-maintain that.
  • i have no desire to bring an early death to my laptop's CPU fan by running this on my own machine; we should set up a CI or something and maybe gut the already-awesome Puppeteer/Lighthouse infrastructure of js-framework-benchmark (a rough harness sketch follows this list).
  • 50 or 100 runs is too many. i think 10-15 will be good enough.
  • Windows in general has high variability compared to Linux, and when running on a thermally-constrained device, you'll run into CPU throttling if you run everything as fast as possible in series.
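a harness in that spirit might look roughly like the sketch below: a fresh browser per run so JIT and caches stay cold, plus a cooldown between runs so a thermally-constrained CPU doesn't throttle. the bench URL handling and the `window.__renderTime` global the page is assumed to expose are hypothetical placeholders, not js-framework-benchmark's actual code:

```js
// Rough harness sketch (illustrative only). Assumes the bench page
// records its render time on a hypothetical window.__renderTime
// global once the chart is drawn.
const puppeteer = require("puppeteer");

const sleep = (ms) => new Promise((r) => setTimeout(r, ms));

async function coldRuns(url, runs = 10) {
  const times = [];
  for (let i = 0; i < runs; i++) {
    // fresh browser process each iteration: cold JIT, flushed caches
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: "networkidle0" });
    times.push(await page.evaluate(() => window.__renderTime));
    await browser.close();
    await sleep(5000); // cooldown to avoid thermal throttling between runs
  }
  return times;
}
```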

i don't have too many spare cycles these days (no pun intended) to get this off the ground properly, but if you want to take a stab at it and put in the work, then that would be a great thing.

the table i show here is a pretty lazy benchmark that just tries not to be too misleading, but it will go stale quickly. it's not sustainable for me to continue accepting everyone's PRs into this repo for the sake of keeping the bench table current. i think the table is good enough for ballpark insight (which is its purpose), but is poor if you're looking to measure +/-10% incremental improvements.

what do you think?

EDIT: also https://github.com/paulirish/speedline

@benmccann
Contributor Author

I imagine there's a lot we could do if we wanted to be more official, e.g. benchmarking different numbers of series, chart sizes, etc. But I wasn't thinking of anything too accurate or official.

The problem I was trying to solve is that when I run it on my own machine I get +/-50% between runs, so even for getting ballpark numbers I'm getting fairly varied results. I'd be happy to send you something to do 10-15 runs against the current repo if you're open to it, but I probably wouldn't want to set up something from scratch as that goes a little beyond what I was hoping to accomplish.

@leeoniya
Owner

leeoniya commented Nov 6, 2019

> I'd be happy to send you something to do 10-15 runs against the current repo if you're open to it

let me know what you're thinking and we'll see.

but it has to be 10-15 cold JIT / flushed cache runs. the only way i know to automate this is to basically do everything js-framework-benchmark does. you cannot just create 15 charts on the page and divide by 15 - that's a very different benchmark which stresses mem, GC, JIT, CPU & GPU very differently. with uPlot, i don't see anywhere near 50% variance...maybe +/-5%. the ride is much wilder for heavy libs that do tons of mem allocation, so even an average would not be very meaningful there.
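for what it's worth, here's a rough sketch of how the collected timings could be summarized - reporting the median next to the min/max spread blunts GC-induced outliers better than a bare mean (the function below is illustrative, not anything from js-framework-benchmark):

```js
// Summarize a set of cold-run timings. With heavy GC pressure the
// mean gets dragged around by outliers, so the median plus the
// min/max spread usually tells a more honest story.
function summarize(times) {
  const sorted = [...times].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median = sorted.length % 2
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
  const mean = times.reduce((a, b) => a + b, 0) / times.length;
  return { mean, median, min: sorted[0], max: sorted[sorted.length - 1] };
}
```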

@benmccann
Contributor Author

Ah, interesting that memory allocation makes performance so variable. Do some of the less memory-intensive libraries like Flot and CanvasJS also show relatively stable benchmarking? I'm wondering at what point things start to go wild. Certainly Chart.js has shown extremely variable results for me.

@leeoniya
Owner

leeoniya commented Nov 8, 2019

only Zing, Apex and amCharts have ever been +/-50% for me - that's odd unless you're testing on an unusually weak device or have a lot of stuff going on in the background. +/-20% is more typical. definitely less variation with lower mem allocation and GC pressure.

@benmccann
Contributor Author

Ah, ok. Good to know. I have a pretty old laptop and always have a million tabs open. The numbers you get benchmarking are always 3x better than what I get. Maybe I should finally get a new machine :-)

@benmccann
Contributor Author

I'll close this for now. You've given me lots of good stuff to think about. Thanks!!
