Efficient benchmarking strategy #2
Comments
To be clear: the list is not exhaustive. OTOH, some of the items are optional and relatively more time-consuming than others. We can discuss this issue here in a more structured way.
All of the above definitely seem like great ideas for the script, especially the SHA hash one. One addition I was thinking of was to vary not only the number of locales, but the per-locale core count as well. Currently I was planning on ramping up from 1 to 44 locales (with some variation in the steps in between), but I realize now that perhaps we should test the scalability of adding more CPU cores as well. The rationale is that there are really two bottlenecks here: communication from work stealing, and contention/concurrent usage of the queue. If we used a simple two-locked
Makes sense. Just to keep our benchmarking overhead at bay, do you think it makes sense to run the varying intranode parallelism tests only on a single locale? Put another way: is there anything new we can learn by running, say, a 2 threads/node, 44-locale test as opposed to a 2 threads/core, single-locale test?
Oh, that's one of the things I was thinking about: having, as you put it, intranode parallelism, where we'd have, say, 1 thread/core/node (maximum parallelism). Furthermore, I was thinking of varying the actual cores/node as well: 1 core/node, etc. This way it would test all forms of scalability. Do we gain performance by adding more cores to a node (sync variable: no, CCLock: yes)? Do we gain performance by adding locales (both: yes)? Do we gain performance by oversubscribing? And so on. Did I answer the question?
If you combine this test with an increasing number of nodes, that may be a bit redundant. But it is just a hunch; I may well be wrong.
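To make the trade-off discussed above concrete, here is a rough Python sketch that enumerates the benchmark configurations for both options: the full cross product of locales, cores per locale, and threads per core, and a reduced plan that varies the intranode settings only on a single locale. Everything except the 1 to 44 locale range is an assumption for illustration: the power-of-two step schedule, the core and thread choices, and the function names are all hypothetical, not something the script is known to use.

```python
from itertools import product

MAX_LOCALES = 44  # upper bound mentioned in the discussion


def locale_steps(max_locales=MAX_LOCALES):
    """Powers of two below max_locales, plus max_locales itself.

    The power-of-two schedule is an assumption; the thread only says
    "some variation in steps in between".
    """
    steps = []
    n = 1
    while n < max_locales:
        steps.append(n)
        n *= 2
    steps.append(max_locales)
    return steps


def full_sweep(locales, cores_per_locale, threads_per_core):
    """Every (locales, cores/locale, threads/core) combination."""
    return list(product(locales, cores_per_locale, threads_per_core))


def reduced_sweep(locales, cores_per_locale, threads_per_core):
    """Vary intranode settings on one locale only, then scale locales
    at a single baseline setting (all cores, 1 thread per core)."""
    baseline = (max(cores_per_locale), 1)
    intranode = [(1, c, t) for c, t in product(cores_per_locale, threads_per_core)]
    internode = [(n, *baseline) for n in locales if n != 1]
    return intranode + internode


if __name__ == "__main__":
    locales = locale_steps()
    cores = [1, 2, 4, 8]   # hypothetical core counts per locale
    threads = [1, 2]       # 2 threads/core stands in for oversubscription
    print(len(full_sweep(locales, cores, threads)), "configs in the full sweep")
    print(len(reduced_sweep(locales, cores, threads)), "configs in the reduced sweep")
```

With these hypothetical choices the full sweep has 56 configurations and the reduced one has 14. Whether the 42 skipped runs would reveal anything new is exactly the open question in this thread: the reduced plan cannot show an interaction between work-stealing communication and queue contention, since it never varies both at once.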
Opened this issue just to capture some ideas as we discuss them.
My random "in an ideal world" thoughts on this are as follows: