Integrate NVIDIA's S/R implementation into HPX #6431

Draft
wants to merge 43 commits into
base: master

@isidorostsa isidorostsa commented Feb 3, 2024

This PR replaces the existing senders/receivers implementation in HPX with stdexec, the implementation maintained by NVIDIA.

Big credit to the @pika-org team for their work on almost the same problem. This PR builds on much of that work.


codacy-production bot commented Mar 11, 2024

Coverage summary from Codacy

See diff coverage on Codacy

| Coverage variation | Diff coverage |
| --- | --- |
| -85.23% | 0.00% |

Coverage variation details

| | Coverable lines | Covered lines | Coverage |
| --- | --- | --- | --- |
| Common ancestor commit (9c53b99) | 206733 | 176197 | 85.23% |
| Head commit (eff5f1f) | 112922 (-93811) | 0 (-176197) | 0.00% (-85.23%) |

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details
| | Coverable lines | Covered lines | Diff coverage |
| --- | --- | --- | --- |
| Pull request (#6431) | 41 | 0 | 0.00% |

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%


- Completion signatures are a specialization of stdexec::completion_signatures<...>
- Senders/receivers need the is_sender/is_receiver tag
- sender_of<snd, tag> looks for completion signatures (e.g. tag = set_error(std::exception_ptr)) instead of looking for possible set_value(...) specializations
- The old calling convention for make_env was:
                     make_env<tag>(val, old_env)
  The new calling convention for make_env is:
                     make_env(old_env, with(tag, val))
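The change in argument order for make_env noted above can be illustrated with a small, self-contained sketch. Everything in it is a hypothetical stand-in written for this illustration (the `sketch` namespace, `with_t`, `env`, `make_env_old_style`, `get_answer_t`); it is not the stdexec or HPX implementation, only a model of the two call shapes, with the old shape forwarded to the new one.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// All names below are hypothetical stand-ins, not the HPX or stdexec API.
namespace sketch {

    struct empty_env {};

    // A (tag, value) pair, as produced by with(tag, val).
    template <typename Tag, typename Value>
    struct with_t
    {
        Value value;
    };

    template <typename Tag, typename Value>
    with_t<Tag, std::decay_t<Value>> with(Tag, Value&& value)
    {
        return {std::forward<Value>(value)};
    }

    // An environment that answers one query and keeps the old environment.
    template <typename Base, typename Tag, typename Value>
    struct env
    {
        Base base;
        Value value;

        Value const& query(Tag) const noexcept
        {
            return value;
        }
    };

    // New-style convention: old environment first, then with(tag, val).
    template <typename Base, typename Tag, typename Value>
    env<std::decay_t<Base>, Tag, Value> make_env(
        Base&& base, with_t<Tag, Value> w)
    {
        return {std::forward<Base>(base), std::move(w.value)};
    }

    // Old-style convention, forwarded to the new one: tag as a template
    // argument, value first, old environment last.
    template <typename Tag, typename Value, typename Base>
    auto make_env_old_style(Value&& value, Base&& base)
    {
        return make_env(
            std::forward<Base>(base), with(Tag{}, std::forward<Value>(value)));
    }

    // A made-up query tag used only to exercise the sketch.
    struct get_answer_t {};
}
```

In this model both `sketch::make_env(sketch::empty_env{}, sketch::with(sketch::get_answer_t{}, 42))` and `sketch::make_env_old_style<sketch::get_answer_t>(42, sketch::empty_env{})` yield an environment whose `query(get_answer_t{})` returns the stored value, so a thin forwarding helper of this kind is one possible way to keep old call sites compiling during a migration.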
Member

@hkaiser hkaiser left a comment

First batch of comments... more to come.

Comment on lines +65 to +74
#ifdef HPX_HAVE_STDEXEC
    auto scheduler =
        hpx::execution::experimental::get_completion_scheduler<
            hpx::execution::experimental::set_value_t>(
            hpx::execution::experimental::get_env(u));
#else
    auto scheduler =
        hpx::execution::experimental::get_completion_scheduler<
            hpx::execution::experimental::set_value_t>(u);
#endif
Member

I think we could factor out this difference, so that the differences stay very localized.

Contributor Author

We would have to alter our existing senders to provide environments; currently, most of them do not.

Member

Really? What about:

inline constexpr struct get_completion_scheduler_ex_t
{
    template <typename U>
    decltype(auto) operator()(U&& u) const
    {
#ifdef HPX_HAVE_STDEXEC
        return hpx::execution::experimental::get_completion_scheduler<
                    hpx::execution::experimental::set_value_t>(
                    hpx::execution::experimental::get_env(HPX_FORWARD(U, u)));
#else
        return hpx::execution::experimental::get_completion_scheduler<
                    hpx::execution::experimental::set_value_t>(HPX_FORWARD(U, u));
#endif
    }
} get_completion_scheduler_ex{};

Contributor Author

Ah yes, good point. I will get on that, and I will migrate to using it after sorting out the build system.
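To make the effect on call sites concrete, here is a self-contained sketch of the same wrapper pattern using mock types. The macro `SKETCH_HAVE_STDEXEC`, the `sketch` namespace, and the `scheduler`, `env`, `sender`, and `completion_scheduler_id` names are assumptions invented for this example; they only imitate the shape of the real queries and are not HPX or stdexec code.

```cpp
#include <cassert>
#include <utility>

// Flip to 0 to mimic a build without stdexec (stand-in for HPX_HAVE_STDEXEC).
#define SKETCH_HAVE_STDEXEC 1

// Hypothetical stand-ins; none of these are the real HPX or stdexec types.
namespace sketch {

    struct scheduler
    {
        int id;
    };

    struct env
    {
        scheduler sched;
    };

    struct sender
    {
        scheduler sched;

        env get_env() const
        {
            return env{sched};
        }
    };

    // With stdexec the query is answered by the sender's environment ...
    inline scheduler get_completion_scheduler(env const& e)
    {
        return e.sched;
    }

    // ... while the existing implementation asks the sender directly.
    inline scheduler get_completion_scheduler(sender const& s)
    {
        return s.sched;
    }

    // The only place that needs to know which convention is active.
    inline constexpr struct get_completion_scheduler_ex_t
    {
        template <typename U>
        decltype(auto) operator()(U&& u) const
        {
#if SKETCH_HAVE_STDEXEC
            return get_completion_scheduler(std::forward<U>(u).get_env());
#else
            return get_completion_scheduler(std::forward<U>(u));
#endif
        }
    } get_completion_scheduler_ex{};

    // Call sites stay free of preprocessor branches.
    template <typename Sender>
    int completion_scheduler_id(Sender&& s)
    {
        auto sched = get_completion_scheduler_ex(std::forward<Sender>(s));
        return sched.id;
    }
}
```

With either value of the macro, `sketch::completion_scheduler_id(s)` compiles unchanged for a `sketch::sender s`, which is the localization the review comment asks for: the conditional lives in one function object instead of at every use.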

static_assert(ex::is_sender_v<decltype(s), ex::empty_env>);
#endif
Member

Shouldn't we rather implement is_sender_in_v for our code base?

Contributor Author

Right, leaving that as a TODO; it would remove much of the duplication from the tests.

@isidorostsa
Contributor Author

Stdexec is now fetched by default when building with C++20, so the CI tests both with and without it.

@isidorostsa isidorostsa changed the title from "Integrate NVIDIA's Stdexec impl into HPX" to "Integrate NVIDIA's S/R implementation into HPX" on May 31, 2024
@StellarBot

Performance test report

HPX Performance

Comparison

| BENCHMARK | FORK_JOIN_EXECUTOR | PARALLEL_EXECUTOR | SCHEDULER_EXECUTOR |
| --- | --- | --- | --- |
| For Each | (=) | (=) | (=) |

Info

| Property | Before | After |
| --- | --- | --- |
| HPX Commit | d27ac2e | dceb6c4 |
| HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-05-31T22:31:20+00:00 |
| Clustername | rostam | rostam |
| Envfile | | |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Datetime | 2024-03-18T09:18:04.949759-05:00 | 2024-05-31T17:39:46.402280-05:00 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |

Comparison

| BENCHMARK | NO-EXECUTOR |
| --- | --- |
| Future Overhead - Create Thread Hierarchical - Latch | (=) |

Info

| Property | Before | After |
| --- | --- | --- |
| HPX Commit | d27ac2e | dceb6c4 |
| HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-05-31T22:31:20+00:00 |
| Clustername | rostam | rostam |
| Envfile | | |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Datetime | 2024-03-18T09:19:53.062988-05:00 | 2024-05-31T17:41:32.521757-05:00 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |

Comparison

| BENCHMARK | FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR | PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR | SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR |
| --- | --- | --- | --- |
| Stream Benchmark - Add | (=) | (=) | (=) |
| Stream Benchmark - Scale | (=) | (=) | (=) |
| Stream Benchmark - Triad | (=) | (=) | (=) |
| Stream Benchmark - Copy | (=) | (=) | - |

Info

| Property | Before | After |
| --- | --- | --- |
| HPX Commit | d27ac2e | dceb6c4 |
| HPX Datetime | 2024-03-18T14:00:30+00:00 | 2024-05-31T22:31:20+00:00 |
| Clustername | rostam | rostam |
| Envfile | | |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Datetime | 2024-03-18T09:20:13.002391-05:00 | 2024-05-31T17:41:49.286364-05:00 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |

Explanation of Symbols

| Symbol | MEANING |
| --- | --- |
| = | No performance change (confidence interval within ±1%) |
| (=) | Probably no performance change (confidence interval within ±2%) |
| (+)/(-) | Very small performance improvement/degradation (≤1%) |
| +/- | Small performance improvement/degradation (≤5%) |
| ++/-- | Large performance improvement/degradation (≤10%) |
| +++/--- | Very large performance improvement/degradation (>10%) |
| ? | Probably no change, but quite large uncertainty (confidence interval within ±5%) |
| ?? | Unclear result, very large uncertainty (±10%) |
| ??? | Something unexpected… |
