-
-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposed fix for #40 #41
Proposed fix for #40 #41
Conversation
I have addressed the issues found by the workflows, and added a new unit test that I confirmed (on my machine) fails on master but passes on this branch. |
Thank you for this! Once I have some time i can look into this more thoroughly and run it on my machine. I don't see any issues with merging it once I've don't that. Thanks again! |
Overall this looks really good. It seems that this has a significant performance impact on the thread pool as well but I'm skeptical on the numbers I'm seeing. Here are the new benchmark results with MSVC
@jtd-formlabs Do you have any input or insights on what could be causing this large of a performance uplift? Previous benchmarks showed my library edging out some other popular libraries but now it blows them out of the water. I'm not mad at that, but I'm wondering if there is something I'm missing here... |
This is really interesting! I'll run your benchmarks on my Ubuntu system to get some additional metrics as well. One theory is that this new scheduling system is relying on the work stealing process much less, as it will always defer to a thread that is ready for work first so there is less delay. It may be worth adding some code to before and after this PR to see how many workloads are stolen during the benchmarks. |
Yes I agree with your comments. This should result in the threads always having work to do in general and having to steal less (if at all). I'm curious to see the numbers on Ubuntu as well as I haven't tried running benchmarks there yet. |
So I ran two benchmarks, one with WITH:
WITHOUT:
This was on a laptop running Ubuntu 20.04 with the power cable plugged in. 20 core system with 32 gb of ram on an Intel processor. |
I also needed a new include to compile with gcc 12, so I pushed that change. Both benchmarks were run compiling in release mode. |
Hmm, very interesting results. I think your results are much more reasonable. I thought running benchmarks on windows might be a problem since there is no equivalent to Regardless, I like the direction of this PR and will merge, but I will be hesitant to publish any new benchmark numbers until I can get more stable results. |
Seems totally reasonable! thank you for taking the time to look this over and merge it! |
This PR edits thread_safe_queue to have more functionality and serve as a priority queue for the threads.
At the end of each thread execution, the threads bump themselves up to the top of the priority list for work assignment.