some more optimizations #503
oh, the rebase somehow changed the committer, weird
Codecov Report

@@            Coverage Diff             @@
##           master     #503      +/-   ##
==========================================
- Coverage   90.34%   90.34%   -0.01%
==========================================
  Files          35       35
  Lines        4414     4422       +8
==========================================
+ Hits         3988     3995       +7
- Misses        426      427       +1
In general, I would recommend avoiding rebase entirely. Merge is safer, and since we squash and merge, there's no pollution of our commit history anyway. It also avoids the need to force-push and makes it easier to read the sequential diffs on GitHub.
Let's see if we can pull master to simplify this diff, since it seems to have a bunch of previous PRs in it. Then I'll review, but this sounds great!
it seems that Apple Clang 14 does not support pstl
Should be simplified now.
I give up on the macOS fix. @elalish, maybe you can check which preprocessor macros macOS provides?
I don't know much about pmr, but it looks like it's not ready on Apple's compiler. Can we fall back without making the code too crazy?
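One possible shape for such a fallback, as a rough sketch only: it keys off the standard feature-test macro `__cpp_lib_memory_resource` rather than an Apple-specific macro, and it is not the approach taken in this PR. The names `SKETCH_HAS_PMR` and `sumViaList` are hypothetical and exist only for illustration.

```cpp
#include <cstddef>
#include <list>
#include <numeric>
#if __has_include(<version>)
#include <version>  // provides library feature-test macros where available
#endif

// Assumption: the standard library defines __cpp_lib_memory_resource only
// when std::pmr is actually usable on the target platform.
#if defined(__cpp_lib_memory_resource)
#include <memory_resource>
#define SKETCH_HAS_PMR 1
#else
#define SKETCH_HAS_PMR 0
#endif

// Hypothetical helper: pushes 0..n-1 into a list and returns their sum.
long sumViaList(int n) {
#if SKETCH_HAS_PMR
  // Nodes come from a stack buffer first; once it is exhausted the
  // resource falls back to the default (heap) memory resource.
  std::byte buffer[4096];
  std::pmr::monotonic_buffer_resource pool(buffer, sizeof(buffer));
  std::pmr::list<int> items(&pool);
#else
  // Fallback path: plain std::list with the default allocator.
  std::list<int> items;
#endif
  for (int i = 0; i < n; ++i) items.push_back(i);
  return std::accumulate(items.begin(), items.end(), 0L);
}
```

Both branches expose the same list interface, so the conditional stays confined to the declaration and the rest of the function is shared.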
Thanks!
* simple speedup from stl algorithms
* fix build
* std::list allocator optimization
* added documentation
* fix weird formatting issues
* try fixing MacOS build
  it seems that Apple Clang 14 does not support pstl
* fix again
* fix macos build
* fix apple
* clean up using macro
* fix apple?
* small fix...
* fix apple (again...)
This is mainly an optimization for the tbb backend, reducing running time by as much as 30% in some extreme examples.
thrust::tbb::par is specified (see "Thrust with tbb backend uses sequential versions for several algorithms", NVIDIA/cccl#635)
std::stack which uses deque backend to vector.
Performance differences with tbb
Surprisingly, the performance improvement is quite significant for such simple changes.
perfTest (max: -8%):
Selected long-running tests (max: -30%):
I did not observe notable performance differences for the wasm build (-60ms in total), so it should not cause a regression.
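For readers unfamiliar with the `std::stack` change mentioned in the description: `std::stack` uses `std::deque` as its underlying container by default, and the second template parameter selects a different one. The snippet below is a minimal illustration of switching to `std::vector`, not code from this PR; `reversed` is a hypothetical helper.

```cpp
#include <stack>
#include <vector>

// Hypothetical helper: reverses a sequence using a vector-backed stack.
std::vector<int> reversed(const std::vector<int>& in) {
  // Second template argument swaps the default std::deque for std::vector,
  // so elements live in one contiguous buffer.
  std::stack<int, std::vector<int>> st;
  for (int v : in) st.push(v);

  std::vector<int> out;
  out.reserve(in.size());
  while (!st.empty()) {
    out.push_back(st.top());
    st.pop();
  }
  return out;
}
```

Whether this helps in practice depends on the workload, so it is worth measuring, as the description does with perfTest.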