-
Notifications
You must be signed in to change notification settings - Fork 794
[SYCL][Doc] Add specialization constant-length alloca extension proposal #12660
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
18 commits
Select commit
Hold shift + click to select a range
001b54d
[SYCL][Doc] Add specialization constant-length alloca extension proposal
victor-eds 50bcc0b
Address comments
victor-eds b4eaaaa
Add precondition
victor-eds a754145
Change precondition with remark
victor-eds ee4fca6
Do not mention stack
victor-eds 975b7fe
NIT
victor-eds bdaa346
Add two issues
victor-eds d100f37
Add aligned version
victor-eds 6fc644a
Fix formatting
victor-eds 92be499
Change memory storage duration to automatic
victor-eds c18c5c5
Add storage duration example
victor-eds 0831685
Fix formattign
victor-eds 09742ab
Try another fix
victor-eds 66d583a
Use substitution
victor-eds 3949fef
Change section headers
victor-eds 609a863
Apply renaming
victor-eds 1d4c7f6
Fix alignment claim
victor-eds 33ba677
Rephrase alignment
victor-eds File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
266 changes: 266 additions & 0 deletions
266
sycl/doc/extensions/experimental/sycl_ext_private_alloca.asciidoc
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,266 @@ | ||
| = sycl_ext_oneapi_private_alloca | ||
|
|
||
| :source-highlighter: coderay | ||
| :coderay-linenums-mode: table | ||
|
|
||
| // This section needs to be after the document title. | ||
| :doctype: book | ||
| :toc2: | ||
| :toc: left | ||
| :encoding: utf-8 | ||
| :lang: en | ||
| :dpcpp: pass:[DPC++] | ||
| :cpp: pass:[C++] | ||
|
|
||
| // Set the default source code type in this document to C++, | ||
| // for syntax highlighting purposes. This is needed because | ||
| // docbook uses c++ and html5 uses cpp. | ||
| :language: {basebackend@docbook:c++:cpp} | ||
|
|
||
|
|
||
| == Notice | ||
|
|
||
| [%hardbreaks] | ||
| Copyright (C) Codeplay Software Limited. All rights reserved. | ||
|
|
||
| Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks | ||
| of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by | ||
| permission by Khronos. | ||
|
|
||
|
|
||
| == Contact | ||
|
|
||
| To report problems with this extension, please open a new issue at: | ||
|
|
||
| https://github.com/intel/llvm/issues | ||
|
|
||
|
|
||
| == Dependencies | ||
|
|
||
| This extension is written against the SYCL 2020 revision 8 specification. All | ||
| references below to the "core SYCL specification" or to section numbers in the | ||
| SYCL specification refer to that revision. | ||
|
|
||
|
|
||
| == Status | ||
|
|
||
| This is an experimental extension specification, intended to provide early | ||
| access to features and gather community feedback. Interfaces defined in this | ||
| specification are implemented in {dpcpp}, but they are not finalized and may | ||
| change incompatibly in future versions of {dpcpp} without prior notice. | ||
| *Shipping software products should not rely on APIs defined in this | ||
| specification.* | ||
|
|
||
| == Backend support status | ||
|
|
||
| The APIs in this extension may be used only on a device that has | ||
| `aspect::ext_oneapi_private_alloca`. The application must check that the device | ||
| has this aspect before submitting a kernel using any of the APIs in this | ||
| extension. If the application fails to do this, the implementation throws a | ||
| synchronous exception with the `errc::kernel_not_supported` error code when the | ||
| kernel is submitted to the queue. | ||
|
|
||
| == Overview | ||
|
|
||
| {cpp} arrays and `std::array` can be used in SYCL code to represent fixed-size | ||
| sequences of objects. However, these constructs have a significant restriction: | ||
| the number of elements must be known at compile time. In host-code context, | ||
| users can make use of dynamic memory allocations, e.g., `std::vector`, but this | ||
| is not the case in SYCL device code. | ||
|
|
||
| SYCL specialization constants (SYCL Section 4.9.5.) can be used to represent | ||
| constants whose values can be set dynamically during the execution of a SYCL | ||
| application, but that will not change when a SYCL kernel function is | ||
| invoked. This way, specialization constants could be used to implement SYCL | ||
| private arrays whose size is given during the execution of the SYCL | ||
| application. There is no possible way of implementing this using `std::array`, | ||
| as the size of such containers must be known at compile time, so we propose to | ||
| define a new `private_alloca` function whose size is specified using SYCL | ||
| specialization constants. | ||
|
|
||
| == Specification | ||
|
|
||
| === Feature test macro | ||
|
|
||
| This extension provides a feature-test macro as described in the core SYCL | ||
| specification. An implementation supporting this extension must predefine the | ||
| macro `SYCL_EXT_ONEAPI_PRIVATE_ALLOCA` to one of the values defined in the table | ||
| below. Applications can test for the existence of this macro to determine if the | ||
| implementation supports this feature, or applications can test the macro's value | ||
| to determine which of the extension's features the implementation supports. | ||
|
|
||
| [%header,cols="1,5"] | ||
| |=== | ||
| |Value | ||
| |Description | ||
|
|
||
| |1 | ||
| |The APIs of this experimental extension are not versioned, so the | ||
| feature-test macro always has this value. | ||
| |=== | ||
|
|
||
| === New aspect for specialization constant-length allocations | ||
|
|
||
| This extension adds a new device aspect: | ||
|
|
||
| [source,c++] | ||
| ---- | ||
| namespace sycl { | ||
| enum class aspect : /*unspecified*/ { | ||
| ext_oneapi_private_alloca | ||
| }; | ||
| } // namespace sycl | ||
| ---- | ||
|
|
||
| The `ext_oneapi_private_alloca` aspect indicates that the device is capable of | ||
| using the `private_alloca` API defined in the following sections. | ||
|
|
||
| === The `private_alloca` API | ||
|
|
||
| [source,c++] | ||
| ---- | ||
| namespace sycl::ext::oneapi::experimental { | ||
| template <typename ElementType, auto &SpecName, | ||
| access::decorated DecorateAddress> | ||
| private_ptr<ElementType, DecorateAddress> | ||
| private_alloca(kernel_handler &kh); | ||
|
|
||
| template <typename ElementType, std::size_t Alignment, auto &SpecName, | ||
| access::decorated DecorateAddress> | ||
| private_ptr<ElementType, DecorateAddress> | ||
| aligned_private_alloca(kernel_handler &kh); | ||
| } // namespace sycl::ext::oneapi::experimental | ||
| ---- | ||
|
|
||
| _Mandates_: `ElementType` must be a cv-unqualified trivial type and `SpecName` | ||
| must be a reference to a specialization constant of integral `value_type`. In | ||
victor-eds marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| the case of `aligned_private_alloca`, `Alignment` must be an alignment value and | ||
| must be a positive multiple of `alignof(ElementType)`. If `Alignment` is an | ||
| extended alignment, it must be supported by the implementation. | ||
gmlueck marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| _Effects_: `h.get_specialization_constant<size>()` elements of type | ||
victor-eds marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| `ElementType` are allocated and default initialized in private memory. | ||
|
|
||
| _Returns_: A pointer to a default initialized region of private memory of | ||
| `h.get_specialization_constant<size>()` elements of type | ||
victor-eds marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| `ElementType`. `DecorateAddress` defines whether the returned `multi_ptr` is | ||
| decorated. In the case of `private_alloca`, the pointer is suitably aligned for | ||
| an object of type `ElementType`. In the case of `aligned_private_alloca`, the | ||
| pointer is aligned to the specified `Alignment`. | ||
|
|
||
| _Remarks_: In case of private memory exhaustion, the implementation must report | ||
| an error in the same fashion as if the allocation size were static. In case of a | ||
| successful call, allocated memory has automatic storage duration. Additionally, | ||
| `SpecName` must have a default value of at least 1 and not be set to a value | ||
victor-eds marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| less than 1 during program execution. Violation of these conditions results in | ||
| undefined behaviour. | ||
|
|
||
| == Example usage | ||
|
|
||
| This non-normative section shows some example usages of the extension. | ||
|
|
||
| === Basic usage | ||
|
|
||
| [source,c++] | ||
| ---- | ||
| constexpr specialization_id<int> size(1); | ||
|
|
||
| class Kernel; | ||
|
|
||
| SYCL_EXTERNAL void impl(const float *in, float *out, size_t n, | ||
| decorated_private_ptr<float> ptr); | ||
|
|
||
| void run(queue q, const float *in, float *out, size_t n) { | ||
| q.submit([&](handler &h) { | ||
| h.set_specialization_constant<size>(n); | ||
| h.parallel_for<Kernel>(n, [=](id<1> i, kernel_handler kh) { | ||
| // Allocate memory for 'n' 'float's | ||
| auto ptr = private_alloca<float, size, access::decorated::yes>(kh); | ||
| // Use pointer in implementation | ||
| impl(in, out, kh.get_specialization_constant<size>(), ptr); | ||
victor-eds marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| }); | ||
| }); | ||
| ---- | ||
|
|
||
| === Storage duration clarification | ||
|
|
||
| The following example is intended to clarify storage duration of memory | ||
| allocated by `private_alloca`. | ||
|
|
||
| [source,c++] | ||
| ---- | ||
| constexpr specialization_id<int> size(1); | ||
|
|
||
| class Kernel; | ||
|
|
||
| SYCL_EXTERNAL void impl(const float *in, float *out, size_t n, | ||
| raw_private_ptr<float> ptr); | ||
|
|
||
| void run(queue q, const float *in, float *out, size_t n) { | ||
| q.submit([&](handler &h) { | ||
| h.set_specialization_constant<size>(n); | ||
| h.parallel_for<Kernel>(n, [=](id<1> i, kernel_handler kh) { | ||
| raw_private_ptr<float> ptr; | ||
| { | ||
| ptr = private_alloca<float, size, access::decorated::no>(kh); | ||
| // 'private_alloca' has allocated a private memory region we can use in | ||
| // this block. | ||
| impl(in, out, kh.get_specialization_constant<size>(), ptr); | ||
| } | ||
| // Memory allocated by 'private_alloca' has been deallocated. | ||
| // Dereferencing 'ptr' at this program point is undefined behaviour. | ||
| }); | ||
| }); | ||
| ---- | ||
|
|
||
| == Design constraints | ||
|
|
||
| The big design constraint stems from the unknown allocation size at compile | ||
| time. {cpp} does not support variable length arrays and complete type sizes must | ||
| be known at compile time. Thus, the free function interface returning a pointer | ||
| to private memory is the better way to represent this construct in | ||
| {cpp}. Lifetime of the underlying memory region was a concern too, but the | ||
| current design with automatic storage duration for the allocated memory region | ||
| closely follows what the user would get from a stack-allocated array. | ||
|
|
||
| == Issues | ||
|
|
||
| === Default `DecorateAddress` value | ||
|
|
||
| At the time this extension was first proposed, there was no consensus for a | ||
| default value for `sycl::access::decorate` in SYCL. The SYCL specification | ||
| chooses `sycl::access::decorate::legacy` to avoid making breaking changes, but | ||
| this would not justify using that value in this extension. | ||
|
|
||
| Although it would be desirable to have one, the `private_alloca` extension will | ||
| not commit to a default value until the SYCL community has come to an agreement. | ||
|
|
||
| === Passing size as an argument | ||
|
|
||
| Initial design passes size as a `sycl::specialization_id<Integral> &` template | ||
| argument and receives a `sycl::kernel_handler &` as an argument. This decision | ||
| comes from the current situation in which `sycl::specialization_id` is a unique | ||
| identifier to represent a specialization constant and `sycl::kernel_handler` is | ||
| used to query the *value* of specialization constants with | ||
| `sycl::kernel_handler::get_specialization_constant`. Having a | ||
| `sycl::specialization_constant` class representing specialization constants | ||
| would enable cleaner interfaces to this function like: | ||
|
|
||
| [source,c++] | ||
| ---- | ||
| namespace sycl::ext::oneapi::experimental { | ||
| template <typename ElementType, access::decorated DecorateAddress> | ||
| private_ptr<ElementType, DecorateAddress> | ||
| private_alloca(const specialization_constant<std::size_t> &size); | ||
| } // namespace sycl::ext::oneapi::experimental | ||
| ---- | ||
|
|
||
| == Revision history | ||
|
|
||
| [cols="5,15,15,70"] | ||
| [grid="rows"] | ||
| [options="header"] | ||
| |======================================== | ||
| |Rev|Date|Authors|Changes | ||
| |1|2024-02-08|Victor Lomüller, Lukas Sommer, Victor Perez, Julian Oppermann, Tadej Ciglaric, Romain Biessy|*Initial draft* | ||
| |======================================== | ||
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.