-
Notifications
You must be signed in to change notification settings - Fork 798
[SYCL] Run CompileTimePropertiesPass early in the pipeline #20602
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Some compile time properties work as a replacement for kernel attributes. For example, work_group_size semantics must be identical to sycl::reqd_work_group_size kernel attribute. The problem is kernel attributes are lowered to LLVM metadata by Clang, but work_group_size represented as an LLVM attribute. CompileTimePropertiesPass converts attribute to canonical metadata representation, but does it late in the opimization pipeline. This patch moves CompileTimePropertiesPass to the beginning of the optimization pipeline to keep canonical representation for SYCL kernel attributes information passes via compile-time properties.
|
Discussed shortly offline. It looks like some of the metadata gets lost if added early. An option could be to run the pass twice and separate the transformations into early and late transformations. The kernel properties, i.e. the ones that have OpenCL parallels (for example |
|
Out of curiosity how/why is metadata lost if pass is earlier? |
Seemingly it affects the cache control properties the worst. Looks to me like the |
|
@aratajew, can we change the cache control LLVM representation from instruction metadata to something that can survive LLVM optimizations (e.g. llvm.ptr.annotation)? Today, SYCL compiler emits llvm.ptr.annotation intrinsic annotating pointer with cache control hints, which works well for that purpose, but we have to maintain another LLVM pass to convert the intrinsic to SPIR-V metadata with the restriction to run this pass as close to SPIR-V CodeGen (or LLVM-SPIRV-Translator) as possible. I would prefer to have a single representation in LLVM for cache control hints respected by the LLVM optimizations. SYCL compiler choice of llvm.ptr.annotation intrinsic might not be the best solution due to semantic definition (https://llvm.org/docs/LangRef.html#llvm-ptr-annotation-intrinsic):
So far, all standard LLVM passes seem to do a good job with preserving annotations. Better alternatives are welcome. |
…ontrols handling.
@steffenlarsen, I implemented this suggestion in the 2c8d041. |
maarquitos14
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a nit, otherwise LGTM.
Do you generate cache control metadata attached to the pointer used by load/store instruction, or to the load/store instruction itself? The Khronos SPIRV-LLVM Translator initially supported only the former approach, which was indeed very prone to being optimized out. However, when the Triton Compiler faced this issue, the solution was this change: KhronosGroup/SPIRV-LLVM-Translator#2587. This change allows cache control metadata to be generated directly on a load/store instruction. The SPIRV-LLVM Translator then automatically generates a dummy GEP and reattaches the metadata to it, ensuring that proper SPIR-V can be generated. |
| class CompileTimePropertiesPass | ||
| : public PassInfoMixin<CompileTimePropertiesPass> { | ||
| public: | ||
| CompileTimePropertiesPass(bool ConvertCacheControls = true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit; I think I would have preferred to call it something like EarlyRun or EarlyPass, so if other features are added that care whether it's transformed early or late, the bool still represents their needs.
We generate cache control metadata attached to the instruction emitting the pointer argument for load/store instruction.
@aratajew, thanks for the hint! Let me try it out. @steffenlarsen, @maarquitos14, @elizabethandrews, for the reviews. I really don't like the current solution with running the pass twice, so I'm going to try attaching the metadata to load/store instructions. If it doesn't work, I'll address your comments for the current patch. Ideally, I would like Clang's IRGen to emit SPIR-V metadata for cache controls and avoid using CompileTimeProperties pass for that. |
… cache controls handling." This reverts commit 2c8d041.
…ions producing a pointer argument for load/store instructions.
|
|
||
| // CHECK: spir_kernel{{.*}}cache_control_read_hint_func | ||
| // CHECK: {{.*}}addrspacecast ptr addrspace(1){{.*}}!spirv.Decorations [[RHINT:.*]] | ||
| // CHECK: store float 5.500000e+01, ptr addrspace(1) %{{.*}}, !spirv.Decorations ![[RHINT:[0-9]+]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NOTE: There is a bug in the test, which I reported here: #20718.
RHINT must be applied to load instructions.
|
@aratajew, thanks again! New approach works like a charm. The metadata attached to the load/store instructions survives optimization. The least for the test we have in our pre-commit. @steffenlarsen, I'm not sure if cache control hints feature is covered well. I updated the test checking LLVM IR, but it would be nice to check that hints are applied correctly at SPIR-V level as well. Do we have such tests? |
https://github.com/KhronosGroup/SPIRV-LLVM-Translator/tree/2f2a95e686e72ec77e6d0dfbf22413cf46c0e338/test/extensions/INTEL/SPV_INTEL_cache_controls has some tests related to the SPIR-V code generation. Is this what you had in mind? |
steffenlarsen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the changes here, we may need to make sure that the SPIR-V translator is ready to make the proper conversions from the load/store instructions rather than the GEPs. Based on the testing in test/extensions/INTEL/SPV_INTEL_cache_controls it doesn't look like we have testing for such a case.
Tag @MrSidims & @maarquitos14
Co-authored-by: Steffen Larsen <steffen.larsen@intel.com>
This is a bare minimum. In addition to these, it might be worth adding a test checking the SPIR-V emitted from SYCL sources. What do you think? |
Off the top of my head, it's not something we usually do. Typically we would check the resulting LLVM-IR, then we could have LLVM SPIR-V translator tests that check that uses the LLVM-IR output from the SYCL tests as the input of their tests. I personally like that structure as it separates the responsibilities of the tooling. |
I'm okay with that. If there are no objections, this patch should be ready for merge. I hope integration is validated by the end-to-end tests. |
|
The pre-commit failure is not related to the patch and tracked by #20750. |
Some compile time properties work as a replacement for kernel
attributes. For example, work_group_size semantics must be identical to
sycl::reqd_work_group_size kernel attribute. The problem is kernel
attributes are lowered to LLVM metadata by Clang, but work_group_size
represented as an LLVM attribute.
CompileTimePropertiesPass converts attribute to canonical metadata
representation, but does it late in the opimization pipeline.
This patch moves CompileTimePropertiesPass to the beginning of the
optimization pipeline to keep canonical representation for SYCL kernel
attributes information passes via compile-time properties.