From 33fc658c5176f27b30e19c2b4dbc851008fd8d54 Mon Sep 17 00:00:00 2001
From: Greg Lueck
Date: Wed, 1 Oct 2025 15:17:20 -0400
Subject: [PATCH 1/4] [SYCL][Doc] Add spec to wait on a device

Add a proposed extension specification which allows the application to
wait for all commands submitted to a device to complete.
---
 .../sycl_ext_oneapi_device_wait.asciidoc | 141 ++++++++++++++++++
 1 file changed, 141 insertions(+)
 create mode 100644 sycl/doc/extensions/proposed/sycl_ext_oneapi_device_wait.asciidoc

diff --git a/sycl/doc/extensions/proposed/sycl_ext_oneapi_device_wait.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_oneapi_device_wait.asciidoc
new file mode 100644
index 0000000000000..99fb2b4d4ae69
--- /dev/null
+++ b/sycl/doc/extensions/proposed/sycl_ext_oneapi_device_wait.asciidoc
@@ -0,0 +1,141 @@
+= sycl_ext_oneapi_device_wait
+
+:source-highlighter: coderay
+:coderay-linenums-mode: table
+
+// This section needs to be after the document title.
+:doctype: book
+:toc2:
+:toc: left
+:encoding: utf-8
+:lang: en
+:dpcpp: pass:[DPC++]
+:endnote: —{nbsp}end{nbsp}note
+
+// Set the default source code type in this document to C++,
+// for syntax highlighting purposes. This is needed because
+// docbook uses c++ and html5 uses cpp.
+:language: {basebackend@docbook:c++:cpp}
+
+
+== Notice
+
+[%hardbreaks]
+Copyright (C) 2025 Intel Corporation. All rights reserved.
+
+Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks
+of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by
+permission by Khronos.
+
+
+== Contact
+
+To report problems with this extension, please open a new issue at:
+
+https://github.com/intel/llvm/issues
+
+
+== Dependencies
+
+This extension is written against the SYCL 2020 revision 10 specification.
+All references below to the "core SYCL specification" or to section numbers in
+the SYCL specification refer to that revision.
+
+
+== Status
+
+This is a proposed extension specification, intended to gather community
+feedback.
+Interfaces defined in this specification may not be implemented yet or may be in
+a preliminary state.
+The specification itself may also change in incompatible ways before it is
+finalized.
+*Shipping software products should not rely on APIs defined in this
+specification.*
+
+
+== Overview
+
+This extension adds a way for the host to wait for all commands submitted to a
+device to complete.
+This functionality is similar to the CUDA API `cudaDeviceSynchronize`.
+
+
+== Specification
+
+=== Feature test macro
+
+This extension provides a feature-test macro as described in the core SYCL
+specification. An implementation supporting this extension must predefine the
+macro `SYCL_EXT_ONEAPI_DEVICE_WAIT` to one of the values defined in the table
+below. Applications can test for the existence of this macro to determine if
+the implementation supports this feature, or applications can test the macro's
+value to determine which of the extension's features the implementation
+supports.
+
+[%header,cols="1,5"]
+|===
+|Value
+|Description
+
+|1
+|The APIs of this experimental extension are not versioned, so the
+ feature-test macro always has this value.
+|===
+
+=== New member function for the device class
+
+This extension adds the following member functions to the `device` class.
+
+[source,c++]
+----
+namespace sycl {
+
+class device {
+  // ...
+  void ext_oneapi_wait_and_throw(async_handler h);
+  void ext_oneapi_wait_and_throw();
+};
+
+} // namespace sycl
+----
+
+'''
+
+[source,c++]
+----
+void ext_oneapi_wait_and_throw(async_handler h);
+----
+
+_Effects:_ Blocks the calling thread until all commands previously submitted to
+any queue on this device have completed.
+Any unconsumed asynchronous errors from these commands are reported to the
+`h` handler as defined in section 4.13.1.1 "Asynchronous error handler" of the
+core SYCL specification.
+
+'''
+
+[source,c++]
+----
+void ext_oneapi_wait_and_throw();
+----
+
+_Effects:_ Blocks the calling thread until all commands previously submitted to
+any queue on this device have completed.
+Any unconsumed asynchronous errors from these commands are reported to the
+default async handler as defined in section 4.13.1.2 "Behavior without an async
+handler" of the core SYCL specification.
+
+[_Note:_ The default async handler terminates the application when an
+asynchronous error occurs, so applications should use the other overload of
+`ext_oneapi_wait_and_throw` if they want to handle these errors.
+_{endnote}_]
+
+'''
+
+
+== Implementation notes
+
+Note that these functions wait for "commands", which includes host tasks and
+memory copy operations.
+The implementation and the tests should cover these cases too.

From 48acaf0857076e95135553a93389010ac4a35543 Mon Sep 17 00:00:00 2001
From: Greg Lueck
Date: Wed, 1 Oct 2025 16:01:47 -0400
Subject: [PATCH 2/4] Add aspect

---
 .../sycl_ext_oneapi_device_wait.asciidoc | 36 ++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/sycl/doc/extensions/proposed/sycl_ext_oneapi_device_wait.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_oneapi_device_wait.asciidoc
index 99fb2b4d4ae69..91338e7d6d011 100644
--- a/sycl/doc/extensions/proposed/sycl_ext_oneapi_device_wait.asciidoc
+++ b/sycl/doc/extensions/proposed/sycl_ext_oneapi_device_wait.asciidoc
@@ -83,7 +83,35 @@ supports.
   feature-test macro always has this value.
 |===
 
-=== New member function for the device class
+=== New aspect
+
+This extension adds the following aspect.
+
+[source,c++]
+----
+namespace sycl {
+
+enum class aspect {
+  // ...
+  ext_oneapi_device_wait
+};
+
+} // namespace sycl
+----
+
+'''
+
+[source,c++]
+----
+ext_oneapi_device_wait
+----
+
+Indicates that the device supports the `device::ext_oneapi_wait_and_throw`
+member functions.
+
+'''
+
+=== New member functions for the device class
 
 This extension adds the following member functions to the `device` class.
 
@@ -113,6 +141,9 @@ Any unconsumed asynchronous errors from these commands are reported to the
 `h` handler as defined in section 4.13.1.1 "Asynchronous error handler" of the
 core SYCL specification.
 
+_Throws:_ A synchronous `exception` with the `errc::feature_not_supported`
+error code if the device does not have `aspect::ext_oneapi_device_wait`.
+
 '''
 
 [source,c++]
@@ -126,6 +157,9 @@ Any unconsumed asynchronous errors from these commands are reported to the
 default async handler as defined in section 4.13.1.2 "Behavior without an async
 handler" of the core SYCL specification.
 
+_Throws:_ A synchronous `exception` with the `errc::feature_not_supported`
+error code if the device does not have `aspect::ext_oneapi_device_wait`.
+
 [_Note:_ The default async handler terminates the application when an
 asynchronous error occurs, so applications should use the other overload of
 `ext_oneapi_wait_and_throw` if they want to handle these errors.

From 24d3a3c38dc9f67f6632ee70320c116d3bf123a5 Mon Sep 17 00:00:00 2001
From: Greg Lueck
Date: Thu, 2 Oct 2025 14:11:41 -0400
Subject: [PATCH 3/4] Revamp API

After discussing various designs, we think we will need to keep a list
of unconsumed async errors in the device object anyway. Therefore, it
won't be hard to implement the same `wait`, `wait_and_throw`, and
`throw_asynchronous` APIs as `queue`.
---
 .../sycl_ext_oneapi_device_wait.asciidoc | 37 ++++++++++++-------
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/sycl/doc/extensions/proposed/sycl_ext_oneapi_device_wait.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_oneapi_device_wait.asciidoc
index 91338e7d6d011..308e6df6b63ff 100644
--- a/sycl/doc/extensions/proposed/sycl_ext_oneapi_device_wait.asciidoc
+++ b/sycl/doc/extensions/proposed/sycl_ext_oneapi_device_wait.asciidoc
@@ -106,8 +106,7 @@ enum class aspect {
 ext_oneapi_device_wait
 ----
 
-Indicates that the device supports the `device::ext_oneapi_wait_and_throw`
-member functions.
+Indicates that the device supports the member functions described below.
 
 '''
 
@@ -121,8 +120,9 @@ namespace sycl {
 
 class device {
   // ...
-  void ext_oneapi_wait_and_throw(async_handler h);
+  void ext_oneapi_wait();
   void ext_oneapi_wait_and_throw();
+  void ext_oneapi_throw_asynchronous();
 };
 
 } // namespace sycl
@@ -132,14 +132,11 @@ class device {
 
 [source,c++]
 ----
-void ext_oneapi_wait_and_throw(async_handler h);
+void ext_oneapi_wait();
 ----
 
 _Effects:_ Blocks the calling thread until all commands previously submitted to
 any queue on this device have completed.
-Any unconsumed asynchronous errors from these commands are reported to the
-`h` handler as defined in section 4.13.1.1 "Asynchronous error handler" of the
-core SYCL specification.
 
 _Throws:_ A synchronous `exception` with the `errc::feature_not_supported`
 error code if the device does not have `aspect::ext_oneapi_device_wait`.
@@ -153,17 +150,29 @@ void ext_oneapi_wait_and_throw();
 
 _Effects:_ Blocks the calling thread until all commands previously submitted to
 any queue on this device have completed.
-Any unconsumed asynchronous errors from these commands are reported to the
-default async handler as defined in section 4.13.1.2 "Behavior without an async
-handler" of the core SYCL specification.
+
+At least all unconsumed asynchronous errors held by any queue (or its associated
+context) on this device are passed to the appropriate async_handler as described
+in section 4.13.1.3 "Priorities of async handlers" of the core SYCL
+specification.
 
 _Throws:_ A synchronous `exception` with the `errc::feature_not_supported`
 error code if the device does not have `aspect::ext_oneapi_device_wait`.
 
-[_Note:_ The default async handler terminates the application when an
-asynchronous error occurs, so applications should use the other overload of
-`ext_oneapi_wait_and_throw` if they want to handle these errors.
-_{endnote}_]
+'''
+
+[source,c++]
+----
+void ext_oneapi_throw_asynchronous();
+----
+
+_Effects:_ Checks to see if any unconsumed asynchronous errors have been
+produced by any queue (or its associated context) on this device.
+If so, they are passed to the appropriate async_handler as described in section
+4.13.1.3 "Priorities of async handlers" of the core SYCL specification.
+
+_Throws:_ A synchronous `exception` with the `errc::feature_not_supported`
+error code if the device does not have `aspect::ext_oneapi_device_wait`.
 
 '''
 

From e688815d67c8bc2c87f3be8d875e572bf984198c Mon Sep 17 00:00:00 2001
From: Greg Lueck
Date: Thu, 9 Oct 2025 10:08:14 -0400
Subject: [PATCH 4/4] Add open issue about CUDA

---
 .../sycl_ext_oneapi_device_wait.asciidoc | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/sycl/doc/extensions/proposed/sycl_ext_oneapi_device_wait.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_oneapi_device_wait.asciidoc
index 308e6df6b63ff..251ddfca0e912 100644
--- a/sycl/doc/extensions/proposed/sycl_ext_oneapi_device_wait.asciidoc
+++ b/sycl/doc/extensions/proposed/sycl_ext_oneapi_device_wait.asciidoc
@@ -182,3 +182,20 @@ error code if the device does not have `aspect::ext_oneapi_device_wait`.
 
 Note that these functions wait for "commands", which includes host tasks and
 memory copy operations.
 The implementation and the tests should cover these cases too.
+
+
+== Issues
+
+* The API described above is implementable on Level Zero.
+  If we are being pedantic, we cannot easily implement this API on CUDA because
+  `cudaDeviceSynchronize` waits only for tasks that were submitted to the device
+  using the current context.
+  (The current context can only be changed from the CUDA driver API.)
+  We cannot implement that semantic on Level Zero because Level Zero provides
+  only a function that waits for all tasks on the device (from all contexts).
+  If we wanted to support this extension on CUDA in the future, one option would
+  be to expose the difference to users.
+  In that case, we'd add an additional aspect and additional APIs that take a
+  context.
+  CUDA would support the APIs that take a context and Level Zero would support
+  the APIs that do not.
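
The behaviour that patch 3/4 specifies for `ext_oneapi_wait`, `ext_oneapi_wait_and_throw`, and `ext_oneapi_throw_asynchronous` may be easier to review with a small model in hand. The sketch below is plain standard C++ and does not use SYCL at all. `mock_device`, `submit`, `handler_fn`, and `demo_device_wait` are hypothetical names invented for illustration, and collapsing every queue into one device-level `submit` is a simplifying assumption. It only mirrors the idea from the commit message that the device keeps a list of unconsumed asynchronous errors; it is not the DPC++ implementation.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <exception>
#include <functional>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a device; NOT sycl::device and NOT the DPC++ runtime.
class mock_device {
public:
  using error_list = std::vector<std::exception_ptr>;
  using handler_fn = std::function<void(const error_list &)>;

  explicit mock_device(handler_fn handler) : handler_(std::move(handler)) {}
  mock_device(const mock_device &) = delete;
  mock_device &operator=(const mock_device &) = delete;

  // Join every worker so no thread touches the members after destruction.
  ~mock_device() {
    for (std::thread &worker : workers_) worker.join();
  }

  // Models a command submitted through any queue on this device.
  void submit(std::function<void()> command) {
    std::lock_guard<std::mutex> lock(mutex_);
    ++in_flight_;
    workers_.emplace_back([this, command = std::move(command)] {
      std::exception_ptr error;
      try {
        command();
      } catch (...) {
        error = std::current_exception();  // becomes an unconsumed async error
      }
      std::lock_guard<std::mutex> done_lock(mutex_);
      if (error) errors_.push_back(error);
      --in_flight_;
      idle_.notify_all();
    });
  }

  // Models ext_oneapi_wait: block until no command is in flight.
  // (Simplification: the spec only requires waiting for previously submitted commands.)
  void wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    idle_.wait(lock, [this] { return in_flight_ == 0; });
  }

  // Models ext_oneapi_throw_asynchronous: hand unconsumed errors to the handler.
  void throw_asynchronous() {
    error_list pending;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      pending.swap(errors_);
    }
    if (!pending.empty()) handler_(pending);
  }

  // Models ext_oneapi_wait_and_throw: wait, then report errors.
  void wait_and_throw() {
    wait();
    throw_asynchronous();
  }

private:
  handler_fn handler_;
  std::mutex mutex_;
  std::condition_variable idle_;
  std::size_t in_flight_ = 0;
  error_list errors_;
  std::vector<std::thread> workers_;
};

// Returns how many asynchronous errors reached the handler.
inline std::size_t demo_device_wait() {
  std::size_t delivered = 0;
  std::atomic<int> completed{0};
  mock_device dev([&delivered](const mock_device::error_list &errors) {
    delivered += errors.size();
  });
  for (int i = 0; i < 4; ++i) dev.submit([&completed] { ++completed; });
  dev.submit([] { throw std::runtime_error("simulated asynchronous error"); });

  dev.wait();               // waits for all five commands, reports nothing
  assert(completed == 4);
  assert(delivered == 0);
  dev.wait_and_throw();     // the stored error is passed to the handler
  dev.throw_asynchronous(); // nothing is left, so the handler is not called again
  return delivered;
}
```

Calling `demo_device_wait()` from a `main` function exercises all three operations. Because the model stores errors on the device rather than on a queue, a plain `wait` leaves them unconsumed until one of the throwing variants is called, which is the distinction the revamped API draws.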