@slawekptak slawekptak commented Jul 3, 2025

This PR introduces a fully handler-less kernel submission path. The feature is not complete yet. For testing purposes, we introduce the __DPCPP_ENABLE_UNFINISHED_NO_CGH_SUBMIT macro, which enables unit tests for the new handler-less path. Applications should not define this macro; when it is not defined, the legacy handler-based path is used. Once the handler-less path is fully implemented, we will switch the corresponding APIs to use it unconditionally and remove the macro.

This PR covers:

  1. A parallel_for, nd_range-based kernel submit.
  2. The parallel_for queue shortcut, the enqueue_functions extension, and the KHR free functions extension.
  3. A scheduler-based kernel submission.
  4. A new unit test which covers the host task and kernel ordering for an in-order queue (including the handler-less path).
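The macro gating described above can be pictured with a small SYCL-free sketch. Everything here except the macro name is a hypothetical stand-in (`mock_queue`, `mock_handler`, `submit_no_handler`, `expected_path`); it only illustrates how one shortcut could pick between the two paths at compile time and is not the actual implementation in this PR.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for a command-group handler (not sycl::handler).
struct mock_handler {
  std::function<void()> kernel;
  void parallel_for(std::function<void()> k) { kernel = std::move(k); }
};

// Hypothetical stand-in for a queue (not sycl::queue).
struct mock_queue {
  std::string last_path;

  // Legacy path: the command group function fills in a handler first.
  template <typename CGF> void submit(CGF &&cgf) {
    mock_handler h;
    cgf(h);
    last_path = "handler";
    if (h.kernel)
      h.kernel();
  }

  // Assumed handler-less path: the kernel is submitted directly.
  void submit_no_handler(const std::function<void()> &k) {
    last_path = "no_handler";
    k();
  }

  // The shortcut selects a path at compile time based on the macro.
  void parallel_for(std::function<void()> k) {
#ifdef __DPCPP_ENABLE_UNFINISHED_NO_CGH_SUBMIT
    submit_no_handler(k);
#else
    submit([&](mock_handler &h) { h.parallel_for(k); });
#endif
  }
};

// Which path this translation unit was compiled to take.
inline std::string expected_path() {
#ifdef __DPCPP_ENABLE_UNFINISHED_NO_CGH_SUBMIT
  return "no_handler";
#else
  return "handler";
#endif
}
```

With the macro left undefined, as the description recommends for applications, a call such as `q.parallel_for(kernel)` goes through the handler-based branch; defining it at compile time switches the same call to the direct branch.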

@vinser52 vinser52 left a comment

In this PR, I would like to see at least one public interface implementation that utilizes this approach, just to ensure it works.

@slawekptak
Contributor Author

> In this PR, I would like to see at least one public interface implementation that utilizes this approach, just to ensure it works.

In the latest update, there are two public interfaces: the enqueue functions extension and queue.parallel_for. Both are enabled only if __DPCPP_ENABLE_UNFINISHED_NO_CGH_SUBMIT is defined.

expose the new APIs as public under a new define

@uditagarwal97 uditagarwal97 left a comment

LGTM.

Contributor

@intel/llvm-gatekeepers please consider merging

@uditagarwal97 uditagarwal97 merged commit 2978123 into intel:sycl Sep 26, 2025
28 checks passed
uditagarwal97 pushed a commit that referenced this pull request Sep 29, 2025
#19294 added a new _no_cgh version; the macros need to be passed to fix the build failures.
sycl::detail::lambda_arg_type<KernelType, nd_item<Dims>>;
static_assert(
std::is_convertible_v<sycl::nd_item<Dims>, LambdaArgType>,
"Kernel argument of a sycl::parallel_for with sycl::nd_range "
Contributor

Could the message text be changed in subsequent patches? This code can be called from more than just parallel_for.

Contributor Author

Yes, the plan is to extend this to other functions once parallel_for(nd_range) is complete.
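For readers unfamiliar with the check under discussion, here is a minimal, SYCL-free sketch of the same idea: extract the kernel lambda's argument type and require that an nd_item be convertible to it. `mock_nd_item`, `lambda_arg_t`, `accepts_nd_item`, and `check_kernel` are hypothetical names; the trait is a simplified stand-in for `sycl::detail::lambda_arg_type` and only handles non-generic, non-mutable, single-argument lambdas.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for sycl::nd_item<Dims>.
template <int Dims> struct mock_nd_item {};

// Simplified trait: pulls the single argument type out of a lambda's
// const operator(). The real sycl::detail::lambda_arg_type may differ.
template <typename T> struct first_arg;
template <typename C, typename R, typename A>
struct first_arg<R (C::*)(A) const> {
  using type = A;
};
template <typename KernelType>
using lambda_arg_t =
    typename first_arg<decltype(&KernelType::operator())>::type;

// True when an nd_item of the given dimensionality can be passed
// to the kernel lambda.
template <int Dims, typename KernelType>
inline constexpr bool accepts_nd_item =
    std::is_convertible_v<mock_nd_item<Dims>, lambda_arg_t<KernelType>>;

// Compile-time guard in the style of the static_assert under review.
// The message is deliberately not tied to one entry point.
template <int Dims, typename KernelType>
void check_kernel(const KernelType &) {
  static_assert(accepts_nd_item<Dims, KernelType>,
                "Kernel argument must be convertible from nd_item");
}
```

Calling `check_kernel<1>(kernel)` with a lambda that takes `mock_nd_item<1>` compiles, while a lambda that takes `int` fails at compile time with the message above. Keeping the wording independent of the entry point is one way the reviewer's concern could be addressed once other functions reuse the check.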


7 participants