-
Notifications
You must be signed in to change notification settings - Fork 699
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Generalize local accessor to shared mem pass #5149
Merged
Merged
Changes from all commits
Commits
Show all changes
8 commits
Select commit
Hold shift + click to select a range
2022ae9
[SYCL] Generalize local accessor to shared mem pass
jchlanda df343be
[SYCL] [HIP] Fix alignment of local arguments
jchlanda 82dbbdb
[SYCL] Port local acessors tests to amdgcn
jchlanda 437d5c8
[SYCL] Add command line option for local accessor to shared mem pass
jchlanda 2b2a968
Update llvm/test/CodeGen/AMDGPU/local-accessor-to-shared-memory-valid…
jchlanda 151368c
Update llvm/test/CodeGen/AMDGPU/local-accessor-to-shared-memory-tripl…
jchlanda 30283c0
Remove redundand test
jchlanda a49c751
Add test for -sycl-enable-local-accessor
jchlanda File filter
Filter by extension
Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
/// Check the correct handling of sycl-enable-local-accessor option. | ||
|
||
// REQUIRES: clang-driver | ||
|
||
// RUN: %clang -fsycl -### %s 2>&1 \ | ||
// RUN: | FileCheck -check-prefix=CHECK-NO-OPT %s | ||
// CHECK-NO-OPT-NOT: "-sycl-enable-local-accessor" | ||
|
||
// RUN: %clang -fsycl -fsycl-targets=nvptx64-nvidia-cuda -### %s 2>&1 \ | ||
// RUN: | FileCheck %s | ||
// CHECK: "-sycl-enable-local-accessor" |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mdtoguchi, shouldn't the condition check
IsSYCLOffloadDevice
instead ofIsSYCL
?abi/user_mangling.cpp
andregression/fsycl-save-temps.cpp
from check-sycl suite fails on my system.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes. If this is to be only set for device, then this should be using
IsSYCLOffloadDevice
, asIsSYCL
is for all compilations (host and device) when SYCL device offloading is enabled.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get following error:
clang (LLVM option parsing): Unknown command line argument '-sycl-enable-local-accessor'. Try: 'clang (LLVM option parsing) --help'
clang (LLVM option parsing): Did you mean '--enable-local-reassign'?
Honestly, I don't really understand why the options is not visible for FE in host mode, but using
IsSYCLOffloadDevice
fixed the problem and I don't think running the pass enabled by-sycl-enable-local-accessor
is needed in the host mode.Another mystery is why this issue is not exposed by CI system. I suppose it some how related to the difference in cmake configuration - I don't build NVPTX and AMDGPU targets, which I suppose link the library with "unknown" option.
We definitely need to do more investigation on this issue.
@jchlanda, FYI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional problem:
-sycl-enable-local-accessor
is only being set when we do some kind of code generation step. As the device compilation does not do this,-sycl-enable-local-accessor
is never set for device. It is only being emitted for host compilations as that goes through the assembling step.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at a high level,
-sycl-enable-local-accessor
does only emit for the code generation step with thenvptx64
target. The steps are not combined fornvptx64
allowing the option to only be emitted for device there. Kind of a round-about way to restrict the option, but it can leak out to host if-S
is used.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, @jchlanda is away this week.
@AerialMantis, can someone else to take a look into this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right, I'll have someone else take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a PR up to fix this #5408
I believe the CI did not catch this as it builds for both cuda and hip then runs for each backend, correct me if I am wrong. If you build for CUDA or HIP then -sycl-enable-local-accessor is then usable.
I think that
-sycl-enable-local-accessor
is not available when not building for CUDA/HIP because the pass is initialised within the llvm NVPTX and AMDGPU backends.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm almost glad that this bug surfaced @bader , @mdtoguchi . TBH,
-sycl-enable-local-accessor
is a workaround, that I never liked. The problem I had was that there wasn't a way to tell that a kernel was compiled from SYCL. Simply relying on the calling convention (CallingConv::AMDGPU_KERNEL
) is not enough, as there are multiple paths that use it (OpenCL, OpenMP, SYCL). I was wondering if it would be better to follow NVIDIA here and use metadata nodes to denote kernels (https://github.com/intel/llvm/blob/HEAD/clang/lib/CodeGen/TargetInfo.cpp#L7242). This would work for all the passes that we only want to run on SYCL kernels (for instance we'd like to generalize https://github.com/intel/llvm/blob/sycl/llvm/lib/Target/NVPTX/SYCL/GlobalOffset.cpp would benefit from it).