-
Notifications
You must be signed in to change notification settings - Fork 798
[NewOffloadModel][SYCL DeviceLib] Generate SYCL device library objects using new offload model #13579
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NewOffloadModel][SYCL DeviceLib] Generate SYCL device library objects using new offload model #13579
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All recent changes related to unbundling of device objects have been removed. Once we turn on the new offloading model by default, all library developers are expected to recompile their files using the new compiler. This proposal will be discussed with the 'SYCL upstreaming' team and we will add backward compatibility if required in a later PR.
|
Test will be added in upcoming commit. Also existing tests might need to be fixed. |
|
This PR can be closed if #13524 goes in soon. Thanks |
…s using new offload model Following are changes in this PR: 1. Generate new SYCL device library files using new offload model 2. Pass these new files to clang-linker-wrapper 3. Avoid use of these files in old offload model 4. Remove support for bundled objects in clang-linker-wrapper Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
… targets Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
fb0599b to
ff0fd35
Compare
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
| break; | ||
| } | ||
|
|
||
| // Backend/Assemble actions are not used for the SYCL device side |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change (7883-7887) provided by @mdtoguchi. Thanks Mike. For SYCL offloading, we want -c compilation to NOT invoke backend tools. For new offloading model, we see that the backend tools are being called for -c compilation. This change prevents the backend tools from being invoked for -c compilation.
|
merge issues have been resolved. Most of the merge issues were caused when merging #13604 Thanks |
|
This is now ready for review. Thanks |
| // RUN: | FileCheck -check-prefix WRAPPER_OPTIONS %s | ||
| // WRAPPER_OPTIONS: clang-linker-wrapper{{.*}} "--triple=spir64" | ||
| // WRAPPER_OPTIONS-SAME: "-sycl-device-libraries=libsycl-crt.bc,libsycl-complex.bc,libsycl-complex-fp64.bc,libsycl-cmath.bc,libsycl-cmath-fp64.bc,libsycl-imf.bc,libsycl-imf-fp64.bc,libsycl-imf-bf16.bc,libsycl-itt-user-wrappers.bc,libsycl-itt-compiler-wrappers.bc,libsycl-itt-stubs.bc" | ||
| // WRAPPER_OPTIONS-SAME: "-sycl-device-libraries=libsycl-crt.new.o,libsycl-complex.new.o,libsycl-complex-fp64.new.o,libsycl-cmath.new.o,libsycl-cmath-fp64.new.o,libsycl-imf.new.o,libsycl-imf-fp64.new.o,libsycl-imf-bf16.new.o,libsycl-itt-user-wrappers.new.o,libsycl-itt-compiler-wrappers.new.o,libsycl-itt-stubs.new.o" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you help me to understand this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I have a similar question. Arvind had mentioned supporting raw BC files in the linker wrapper was difficult so we were using fat objects, but in this test we are giving it raw BC files. Is that just a test thing and it wouldn't have actually worked with raw BC files and always required fat objects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the older version with .bc files was a recent change introduced by @mdtoguchi when he added change to replace bundled fat devicelib object with .bc files. .bc files can be easily consumed in the old offloading model. However, .bc files cannot be consumed by the current flow in clang-linker-wrapper. It requires packaged fat object files. All .new.o files are devicelib bc files packaged and thus can be consumed easily by the clang-linker-wrapper.
| if (IsNVPTX) | ||
| if (IsNVPTX) { | ||
| LibPostfix = ".cubin"; | ||
| NewLibPostfix = ".new.cubin"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we miss tests with ".new.cubin".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.cubin are intermediate filenames that are created in the old offloading driver flow. So, we cannot add tests for this. I will discuss with relevant experts on why we need this renaming.
Thanks
| static bool IsSYCLDeviceLibObj(std::string ObjFilePath, bool isMSVCEnv) { | ||
| StringRef ObjFileName = llvm::sys::path::filename(ObjFilePath); | ||
| StringRef ObjSuffix = isMSVCEnv ? ".obj" : ".o"; | ||
| StringRef NewObjSuffix = isMSVCEnv ? ".new.obj" : ".new.o"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we miss tests with ".new.obj".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let me add them. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a new test in upcoming commit. Thanks
| for (auto &File : UnbundledDeviceLibFiles) | ||
| const llvm::Triple Triple(Args.getLastArgValue(OPT_triple_EQ)); | ||
| SmallVector<std::string, 16> ExtractedDeviceLibFiles; | ||
| for (auto &File : DeviceLibFiles) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please move this loop in a dedicated function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why we would want to do that? I do not see any benefits as we do not need this functionality elsewhere....Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My rationale was that the function name linkDevice suggests that some linking is happening here while this particular loop is performing some extraction rather than linking. In order to understand that a reader will have to make a cognitive effort to read the actual code.
Anyway, I don't insist that it should be a blocker.
| "-nocudalib" | ||
| "--cuda-gpu-arch=sm_50") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please clarify what this change is about?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. For SYCL offloading, we generate .bc files for -c compilation for all JIT and AOT targets. For -c compilation with nvptx targets, there is a call to getTargetDeviceOptions(..) at some stage and this relies on an arch being present, thought this arch information is not used in the .bc file generation. We just set it to a dummy arch. This is done in our existing compilation flow as well.
This is some text from our doc:
"The CUDA backend should work on Windows or Linux operating systems with any GPU with compute capability (SM version) sm_50 or above. The default SM version for the NVIDIA CUDA backend is sm_50."
Thanks
Hi @sarnex That's a good question. I had discussed this with Mike. Let me record it here. PR #13524 added .bc versions of the device library files. Current version of the clang-linker-wrapper is set up to consume 'packaged' device IR and it requires extra work to support unpackaged raw .bc files. Hence, PR #13524 does not help here. |
Please ignore comment. Reason: #13579 (comment) Thanks. |
againull
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
libdevice/cmake/modules/SYCLLibdevice.cmake
Looks good to me.
|
@asudarsa Got it, thanks. I will try to review this on Monday. |
sarnex
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM mostly, just a few questions
| DESTINATION ${install_dest_lib} | ||
| COMPONENT libsycldevice) | ||
|
|
||
| set(devicelib-obj-file-new-offload ${obj_new_offload_binary_dir}/${obj_filename}.${new-offload-lib-suffix}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand we're generating fat objects because the offload packager isn't set up to take in raw BC files, but probably that is simplier in the long term. so I'm okay with generating fat objects for now but we should have an internal tracker to enhance clang-linker-wrapper to accept raw BC inputs and then we can simplify/remove some stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have been thinking about this. In my experience, I feel that it's a lot cleaner and in sync with community code to handle packaged files instead of bc files. I suppose there are pros and cons, This is something we can discuss further.
| // RUN: | FileCheck -check-prefix WRAPPER_OPTIONS %s | ||
| // WRAPPER_OPTIONS: clang-linker-wrapper{{.*}} "--triple=spir64" | ||
| // WRAPPER_OPTIONS-SAME: "-sycl-device-libraries=libsycl-crt.bc,libsycl-complex.bc,libsycl-complex-fp64.bc,libsycl-cmath.bc,libsycl-cmath-fp64.bc,libsycl-imf.bc,libsycl-imf-fp64.bc,libsycl-imf-bf16.bc,libsycl-itt-user-wrappers.bc,libsycl-itt-compiler-wrappers.bc,libsycl-itt-stubs.bc" | ||
| // WRAPPER_OPTIONS-SAME: "-sycl-device-libraries=libsycl-crt.new.o,libsycl-complex.new.o,libsycl-complex-fp64.new.o,libsycl-cmath.new.o,libsycl-cmath-fp64.new.o,libsycl-imf.new.o,libsycl-imf-fp64.new.o,libsycl-imf-bf16.new.o,libsycl-itt-user-wrappers.new.o,libsycl-itt-compiler-wrappers.new.o,libsycl-itt-stubs.new.o" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I have a similar question. Arvind had mentioned supporting raw BC files in the linker wrapper was difficult so we were using fat objects, but in this test we are giving it raw BC files. Is that just a test thing and it wouldn't have actually worked with raw BC files and always required fat objects?
| for (auto &File : InputFiles) | ||
| CmdArgs.push_back(File); | ||
| for (auto &File : InputFiles) { | ||
| auto IRFile = sycl::convertSPIRVToIR(File, Args); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really understand this line. Don't we need to check if it's a SPIR-V file? Or can we only hit this with SPV files and not with fat objects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
convertSPIRVToIR will handle cases where the file is not a SPIR-V file. it will return the original file if it's not a SPIR-V file.
However, I do think it should be possible to check here if the image type is SPIR-V. i will add a TODO to address this soon. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say we rename the function then, the name sounds like it only accepts SPV and always returns IR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about a supporting comment? :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment and a TODO to check for SPIR-V image type and exit early on top of function definition.
Thanks
Signed-off-by: Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Sudarsanam <arvind.sudarsanam@intel.com>
|
Compilation fails on this branch using: |
Hi @hdelan thanks so much for taking a look. This is a known issue. We currently support only spir-v JIT targets for the new offload model. AOT support for Intel and non- Intel hardware will be added in a subsequent step. In the meanwhile, I can try to add a more graceful way to exit for such cases. In a separate PR hopefully. thanks |
maksimsab
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left some nits.
Apart from that, it looks good.
| } | ||
| if (!CompatibleBinaryFound) | ||
| WithColor::warning(errs(), LinkerExecutable) | ||
| << "Compatible SYCL device library binary not found\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit.
According to LLVM style I would propose the following format:
not found SYCL device library binary compatible with @Triple, where @Triple is the function argument.
| // LLVM-SPIRV is not called in dry-run | ||
| // CHK-CMDS-NEXT: offload-wrapper: input: [[LLVMSPIRVOUT:.*]].table, output: [[WRAPPEROUT:.*]].bc | ||
| // CHK-CMDS-NEXT: "{{.*}}llc.exe" -filetype=obj -o [[LLCOUT:.*]].o [[WRAPPEROUT]].bc | ||
| // CHK-CMDS-NEXT: "{{.*}}/ld" -- HOST_LINKER_FLAGS -dynamic-linker HOST_DYN_LIB -o a.out [[LLCOUT]].o HOST_LIB_PATH HOST_STAT_LIB {{.*}}test-sycl.o |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit.
It seems for me that there is no ld linker present in the Windows environment. There is different linker specific to windows - link.exe. It is not relevant to the current PR.
| for (auto &File : UnbundledDeviceLibFiles) | ||
| const llvm::Triple Triple(Args.getLastArgValue(OPT_triple_EQ)); | ||
| SmallVector<std::string, 16> ExtractedDeviceLibFiles; | ||
| for (auto &File : DeviceLibFiles) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My rationale was that the function name linkDevice suggests that some linking is happening here while this particular loop is performing some extraction rather than linking. In order to understand that a reader will have to make a cognitive effort to read the actual code.
Anyway, I don't insist that it should be a blocker.
|
Hi @maksimsab Thanks for the review. |
|
Hi @intel/llvm-gatekeepers This is ready for merge. Can you please take a look? Thanks |
Following are changes in this PR:
Thanks