Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] Implement initial parts of sycl_ext_oneapi_kernel_properties #6941

Closed
wants to merge 16 commits into from

Conversation

steffenlarsen
Copy link
Contributor

@steffenlarsen steffenlarsen commented Oct 3, 2022

This commit implements the following parts of the
sycl_ext_oneapi_kernel_properties extension:

  • The work_group_size, work_group_size_hint, and sub_group_size properties.
  • The new overloads for single_task, parallel_for, and parallel_for_work_group in handler.
  • The new shortcuts for single_task and parallel_for in queue.
  • Support for merging property lists which is used when kernel functors have a get member for properties. The changes do not include any changes related to the device_has property and corresponding interfaces.

Changes to the Clang frontend:

  • Fixed an issue preventing initializer lists of characters to be used as strings in add_ir_{attributes|annotations}_* attributes.
  • The add_ir_attributes_function will now be copied onto the generated kernel declaration from the recognized kernel_* functions.
  • Added a warning diagnostic for when a kernel has both an add_ir_attributes_function (with values) and one or more potentially conflicting SYCL attributes.

Changes to sycl-post-link:

  • sycl-post-link is now able to generate metadata for recognized SYCL LLVM IR attributes on functions. These are "sycl-work-group-size", "sycl-work-group-size-hint", and "sycl-sub-group-size".
  • The previously mentioned new recognized LLVM IR attributes are translated into existing metadata the SPIR-V Translator translates into corresponding SPIR-V execution modes. If these metadata nodes already exist on a function (e.g. added through a SYCL 2020 attribute), the related property is ignored.

Additionally these changes fixes a minor bug with how work-group sizes were not reported in the correct order when invalid work-group sizes are given during launch of a kernel.

This commit implements the following parts of the
sycl_ext_oneapi_kernel_properties extension:
* The work_group_size, work_group_size_hint, and sub_group_size
  properties.
* The new overloads for single_task, parallel_for, and
  parallel_for_work_group in handler.
* The new shortcuts for single_task and parallel_for in queue.
* Support for merging property lists which is used when kernel functors
  have a get member for properties.
The changes do not include any changes related to the device_has
property and corresponding interfaces.

Changes to the Clang frontend:
* Fixed an issue preventing initializer lists of characters to be used
  as strings in add_ir_{attributes|annotations}_* attributes.
* The add_ir_attributes_function will now be copied onto the generated
  kernel declaration from the recognized kernel_* functions.

Changes to sycl-post-link:
* sycl-post-link is now able to generate metadata for recognized SYCL
  LLVM IR attributes on functions. These are "sycl-work-group-size",
  "sycl-work-group-size-hint", and "sycl-sub-group-size".
* The previously mentioned new recognized LLVM IR attributes are
  translated into existing metadata the SPIR-V Translator translates
  into corresponding SPIR-V execution modes. If these metadata nodes
  already exist on a function (e.g. added through a SYCL 2020
  attribute), the related property is ignored.

Additionally these changes fixes a minor bug with how work-group sizes
were not reported in the correct order when invalid work-group sizes are
given during launch of a kernel.

Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
llvm/tools/sycl-post-link/CompileTimePropertiesPass.cpp Outdated Show resolved Hide resolved
llvm/tools/sycl-post-link/CompileTimePropertiesPass.cpp Outdated Show resolved Hide resolved
Comment on lines +1117 to +1118
using NameT =
typename detail::get_kernel_name_t<KernelName, KernelType>::name;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just want to check -- what happens if somebody enqueues the same kernel function multiple times, once with properties and once without? I ask because I thought the name was the only connection between a C++ representation of a kernel with the SPIR-V, and I'm wondering if the properties need to be included in the mangling somehow...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a very good question! I suspect you might be right that we need to include the properties in the name. I will add some tests for this ASAP!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on my testing, it looks like creating two unnamed submissions with the same kernel but different properties (including one with a property and one without) will cause kernel naming conflicts. That is, the compiler will fail because it tries to create two different kernels with the same name.

An option for fixing this is to create a wrapper class to be used by get_kernel_name_t when there are properties. A small proof-of-concept of this is steffenlarsen@b4bc767 but the frontend causes an error in DiagnoseKernelNameType because the kernel isn't unnamed and the instantiation comes from sycl::queue.

@elizabethandrews & @Fznamznon - Do you have some ideas of how we can get around this? Should we somehow force it to consider the wrapper as unnamed too, or would that break other assumptions?

@Pennycook - Do you think this should block this PR or are you okay with having conflicting properties for the same kernel cause name conflicts for now? It is definitely something that should be addressed, but luckily it will be a relaxation rather than a regression in functionality when we get it fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to block this PR. I want developers to be able to start experimenting with compile-time properties as soon as possible! 😄

Since this is only a partial implementation of something that will be going into the "experimental" namespace initially, I'm not concerned about there being some corner cases that don't work. If it wouldn't be much work, an error that specifically says it's not possible to provide two sets of properties for the same unnamed lambda might save somebody some head-scratching.... But if that's a lot of work, maybe just a FIXME or a TODO would do it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might also be worth adding a NOTE to the extension specification about the limitation. This will also remind us to address it before moving the extension out of "experimental".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a note in the extension specification about the current limitation and have added TODO's here and in similar locations.

@asudarsa
Copy link
Contributor

asudarsa commented Oct 3, 2022

Adding @sarnex who is working on a similar support. Thanks

Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
@steffenlarsen steffenlarsen marked this pull request as ready for review October 3, 2022 21:52
@steffenlarsen steffenlarsen requested review from a team as code owners October 3, 2022 21:52
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
clang/include/clang/Basic/DiagnosticSemaKinds.td Outdated Show resolved Hide resolved
clang/lib/Sema/SemaDeclAttr.cpp Show resolved Hide resolved
clang/include/clang/Basic/Attr.td Outdated Show resolved Hide resolved
diag::warn_sycl_old_and_new_kernel_attributes)
<< ReqdWGSizeAttr;

if (const auto *ReqdSGSizeAttr = D->getAttr<IntelReqdSubGroupSizeAttr>())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did I get it right that we issue a warning unconditionally if a kernel has any property and the three chosen attributes? Should we check which property is attached to a kernel?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is right. The warning is issued if add_ir_attributes_function has values, regardless of the values. #6941 (comment) goes into more detail, but in summary we do it to avoid needing the frontend knowing about select values in this attribute while the warning is there to alert the user that they are using analogous features.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, these three attributes are all that we can apply to SYCL kernel or is there more?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the only three added in this PR, but there will also be a device_has property in a follow-up. I suspect even more after that, but those are the ones defined by sycl_ext_oneapi_kernel_properties.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, when there is a new property which conflicts with existing attribute, we have to update clang each time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly yes, though this should be rare after these, device_has, and any existing SYCL extensions with attributes, as we will be leaning towards adding compile-time properties extensions rather than extensions adding new attributes, avoiding any more conflicts.

clang/lib/Sema/SemaDeclAttr.cpp Outdated Show resolved Hide resolved
else if (const auto *IntegerVal =
dyn_cast<IntegerLiteral>(InitNoImpCastE))
C = static_cast<char>(IntegerVal->getValue().getZExtValue());
else
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, only character and integer are supported? Can part of property value be a double?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The design doc for compile-time properties allows floating point property values also, but I don't think we currently have any like that. If this PR is intending to implement the general support for compile-time properties, we should support FP types also:

https://github.com/intel/llvm/blob/sycl/sycl/doc/design/CompileTimeProperties.md#property-representation-in-c-attributes-and-in-ir

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a bit of confusion here. This is only for parsing a string of characters defined as an initializer list. For this we should only need to be able to parse actual characters and integral values. Float, bool, etc are handled separately.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. But how we can do "support" (as you mentioned here #6941 (comment) ) of multi value property if we would like it to have a non-char/int as one of the values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What we do in this case is that we convert a list of integers to a string with comma-separated values. For example 1, 2, 3 becomes "1,2,3".

@Fznamznon
Copy link
Contributor

BTW, can we split this to separate patches that target only one component? I don't think that giant PRs with many different changes and reviewers is a good idea.

@gmlueck
Copy link
Contributor

gmlueck commented Oct 4, 2022

What is the plan for moving the extension specification out of "proposed" and into "experimental"? Do we want to make the APIs in this PR available soon, or do we plan to wait until the full extension is implemented? If we want to make them available before implementing the full extension, the extension specification should be split as described here.

I'd also like to update the extension specification to use the official template.

If it's easier, the extension spec can be split, moved, and updated in a separate PR.

@steffenlarsen
Copy link
Contributor Author

BTW, can we split this to separate patches that target only one component? I don't think that giant PRs with many different changes and reviewers is a good idea.

Having it in one PR means we can do full E2E testing out-of-the-gate and since we now don't dismiss approvals when new changes are made I figured having the different components in the same would be okay as reviewers for the different components could focus on their select parts and potentially unsubscribe upon approval if the following messages are an annoyance. If you do not agree with that assessment I am okay with splitting it up in 3; sycl-post-link, cfe, and runtime headers. Keep in mind though that the testability of the last part would be severely limited until we have the other parts merged.

@steffenlarsen
Copy link
Contributor Author

What is the plan for moving the extension specification out of "proposed" and into "experimental"? Do we want to make the APIs in this PR available soon, or do we plan to wait until the full extension is implemented? If we want to make them available before implementing the full extension, the extension specification should be split as described here.

I'd also like to update the extension specification to use the official template.

If it's easier, the extension spec can be split, moved, and updated in a separate PR.

The device_has part of the extension should follow soon, which is why this PR does not move the extension nor does it define the feature macro. The motivation for getting this in ASAP is to unblock development of other features leveraging the kernel properties infrastructure, independent of whether or not device_has is ready. If you think splitting it is preferred in the meantime, I am okay with that solution.

I'd also like to update the extension specification to use the official template.

I think it would make sense to do when we also move it.

@gmlueck
Copy link
Contributor

gmlueck commented Oct 4, 2022

If you think splitting it is preferred in the meantime, I am okay with that solution.

No need. I'm OK if we wait for device_has before considering the extension moved to "experimental".

Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
@Fznamznon
Copy link
Contributor

Having it in one PR means we can do full E2E testing out-of-the-gate and since we now don't dismiss approvals when new changes are made I figured having the different components in the same would be okay as reviewers for the different components could focus on their select parts and potentially unsubscribe upon approval if the following messages are an annoyance. If you do not agree with that assessment I am okay with splitting it up in 3; sycl-post-link, cfe, and runtime headers. Keep in mind though that the testability of the last part would be severely limited until we have the other parts merged.

What concerns me is that even though the approvals are retained there will be a lot of reviewers and therefore a ton of comments in the same PR. Because of that it will be hard to find previous conversations (for example if I want to see what are your replies on my comments). In order to perform full E2E testing you could open a separate draft PR which contains all changes and cherry pick commits that contain review comments from separate component-specific branches there. But If you still find it convenient, I'm not opposed to continue moving with this particular PR "as-is", but I would be happy to see my suggestion applied to future changes.

Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
@steffenlarsen
Copy link
Contributor Author

What concerns me is that even though the approvals are retained there will be a lot of reviewers and therefore a ton of comments in the same PR. Because of that it will be hard to find previous conversations (for example if I want to see what are your replies on my comments). In order to perform full E2E testing you could open a separate draft PR which contains all changes and cherry pick commits that contain review comments from separate component-specific branches there. But If you still find it convenient, I'm not opposed to continue moving with this particular PR "as-is", but I would be happy to see my suggestion applied to future changes.

That is a fair concern. I will split this into separate PRs ASAP and move this to draft. Any open discussions can either continue here or move. I will comment with links when it is done.

steffenlarsen added a commit to steffenlarsen/llvm that referenced this pull request Oct 11, 2022
This commit makes the following changes to the Clang frontend in
preparation for the initial implementation of the
sycl_ext_oneapi_kernel_properties extension:
* Fixed an issue preventing initializer lists of characters to be used
as strings in add_ir_{attributes|annotations}_* attributes.
* The add_ir_attributes_function will now be copied onto the generated
kernel declaration from the recognized kernel_* functions.
* Added a warning diagnostic for when a kernel has both an
add_ir_attributes_function (with values) and one or more potentially
conflicting SYCL attributes.

This is split from intel#6941.

Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
steffenlarsen added a commit to steffenlarsen/llvm that referenced this pull request Oct 11, 2022
This commit adds parsing of compile-time properties for kernels to
sycl-post-link. This includes the following:
* sycl-post-link is now able to generate metadata for recognized SYCL
LLVM IR attributes on functions. These are "sycl-work-group-size",
"sycl-work-group-size-hint", and "sycl-sub-group-size".
* The previously mentioned new recognized LLVM IR attributes are
translated into existing metadata the SPIR-V Translator translates into
corresponding SPIR-V execution modes. If these metadata nodes already
exist on a function (e.g. added through a SYCL 2020 attribute), the
related property is ignored.

This is split from intel#6941.

Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
steffenlarsen added a commit to steffenlarsen/llvm that referenced this pull request Oct 11, 2022
This commit implements the following parts of the
sycl_ext_oneapi_kernel_properties extension:
* The work_group_size, work_group_size_hint, and sub_group_size
properties.
* The new overloads for single_task, parallel_for, and
parallel_for_work_group in handler.
* The new shortcuts for single_task and parallel_for in queue.
* Support for merging property lists which is used when kernel functors
have a get member for properties. The changes do not include any changes
related to the device_has property and corresponding interfaces.

This is split from intel#6941.

Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
@steffenlarsen
Copy link
Contributor Author

This has been split into three parts:

I will keep this open as draft for testing. Please feel free to move discussions to the other PRs or continue them here.

@steffenlarsen steffenlarsen marked this pull request as draft October 11, 2022 11:32
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
steffenlarsen added a commit that referenced this pull request Oct 11, 2022
This commit adds parsing of compile-time properties for kernels to
sycl-post-link. This includes the following:
* sycl-post-link is now able to generate metadata for recognized SYCL
LLVM IR attributes on functions. These are "sycl-work-group-size",
"sycl-work-group-size-hint", and "sycl-sub-group-size".
* The previously mentioned new recognized LLVM IR attributes are
translated into existing metadata the SPIR-V Translator translates into
corresponding SPIR-V execution modes. If these metadata nodes already
exist on a function (e.g. added through a SYCL 2020 attribute), the
related property is ignored.

This is split from #6941.

Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
steffenlarsen added a commit that referenced this pull request Oct 13, 2022
This commit makes the following changes to the Clang frontend in
preparation for the initial implementation of the
sycl_ext_oneapi_kernel_properties extension:
* Fixed an issue preventing initializer lists of characters to be used
as strings in add_ir_{attributes|annotations}_* attributes.
* The add_ir_attributes_function will now be copied onto the generated
kernel declaration from the recognized kernel_* functions.
* Added a warning diagnostic for when a kernel has both an
add_ir_attributes_function (with values) and one or more potentially
conflicting SYCL attributes.

This is split from #6941.

Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
steffenlarsen added a commit that referenced this pull request Oct 18, 2022
This commit implements the following parts of the
sycl_ext_oneapi_kernel_properties extension:
* The work_group_size, work_group_size_hint, and sub_group_size
properties.
* The new overloads for single_task, parallel_for, and
parallel_for_work_group in handler.
* The new shortcuts for single_task and parallel_for in queue.
* Support for merging property lists which is used when kernel functors
have a get member for properties. The changes do not include any changes
related to the device_has property and corresponding interfaces.

This is split from #6941.

Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
@steffenlarsen
Copy link
Contributor Author

All three parts have been merged. Thanks all!

steffenlarsen added a commit to intel/llvm-test-suite that referenced this pull request Oct 19, 2022
Tests for intel/llvm#6941

Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
aelovikov-intel pushed a commit to aelovikov-intel/llvm that referenced this pull request Mar 27, 2023
Tests for intel#6941

Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants