Add support for HIP explicit multipass #790
Merged
This PR adds support for the `hip.explicit-multipass` compilation flow in the HIP backend. In hipSYCL explicit multipass flows, hipSYCL takes control of multipass compilation, kernel embedding, kernel caching, and low-level module management. It also uses low-level kernel launch mechanisms instead of relying on clang-generated kernel launch stubs.

The motivation for this PR is the discussion regarding the latency of `hipLaunchKernel` itself in PR #761 (CC @sbalint98). On the CUDA side, explicit multipass is known to substantially outperform the CUDA runtime API in terms of kernel launch latency, presumably because our kernel cache is better than whatever the CUDA runtime does. Because of this, I wanted to see whether a similar speedup could be achieved on the HIP side.

With a quick test on my APU, I unfortunately do not yet see evidence of a difference in kernel launch latency between the new HIP explicit multipass flow and the old integrated multipass flow. But to draw proper conclusions, we will have to try again with a proper discrete GPU and look at profiler timelines, which I have not done yet.
It might also be the case that on HIP there is no such difference, because the HIP API ingests compiled binaries, not an IR like PTX on the CUDA side, and therefore the kernel cache performance is potentially less important.
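For context, the low-level module path on the HIP side looks roughly like the following sketch (error handling omitted; the embedded symbol and kernel name are hypothetical, not hipSYCL's actual identifiers). The key difference to CUDA is that `hipModuleLoadData` expects an already compiled code object, whereas the CUDA driver can JIT-compile PTX at this point:

```
#include <hip/hip_runtime.h>

// Rough sketch of the low-level launch path used by explicit multipass.
// `embedded_code_object` stands in for a device binary embedded at compile
// time (name is hypothetical). Error handling omitted for brevity.
extern const char embedded_code_object[];

void launch(void** kernel_args, hipStream_t stream) {
  hipModule_t module;
  hipFunction_t kernel;
  // HIP ingests a fully compiled code object here, not IR like PTX,
  // so there is no JIT step whose cost a kernel cache could hide.
  hipModuleLoadData(&module, embedded_code_object);
  hipModuleGetFunction(&kernel, module, "my_kernel"); // hypothetical name
  hipModuleLaunchKernel(kernel,
                        /*gridDim*/ 64, 1, 1,
                        /*blockDim*/ 256, 1, 1,
                        /*sharedMemBytes*/ 0, stream,
                        kernel_args, /*extra*/ nullptr);
}
```

In practice the loaded module and extracted function would of course be cached rather than reloaded per launch; the sketch only shows the API surface involved.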
HIP explicit multipass is only supported on clang 13+.
TODO: We still need to enforce that limitation; currently, with earlier clang versions things simply will not work, and there is no clear error message.