[SYCL][Doc] Add device code split feature design #631
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -394,6 +394,51 @@ llvm-no-spir-kernel host.bc | |
|
||
It returns 0 if no kernels are present and 1 otherwise. | ||
|
||
#### Device code split | ||
|
||
Putting all device code into a single SPIR-V module does not work well in the | ||
following cases: | ||
1. There are thousands of kernels defined and only a small fraction of them is | ||
used at run time. Having them all in one SPIR-V module significantly increases | ||
JIT time. | ||
2. Device code can be specialized for different devices. For example, kernels | ||
that are supposed to be executed only on FPGA can use extensions available for | ||
FPGA only. This will cause JIT compilation failure on other devices even if this | ||
particular kernel is never called on them. | ||
|
||
To resolve these problems, the compiler can split a single module into smaller | ||
ones. The following features are supported: | ||
* Emitting a separate module for each source (translation unit) | ||
* Emitting a separate module for each kernel | ||
|
||
The current approach is: | ||
* Generate special meta-data with translation unit ID for each kernel in SYCL | ||
Review thread: "How this ID is used later?" Reply: "This ID will be used to group kernels per translation unit." "Ping." "Done." |
||
front-end. This ID will be used to group kernels on a per-translation-unit basis | ||
* Link all device LLVM modules using llvm-link | ||
* Perform split on a fully linked module | ||
* Generate a symbol table (list of kernels) for each produced device module for | ||
proper module selection at run time | ||
* Perform SPIR-V translation and AOT compilation (if requested) on each produced | ||
module separately | ||
* Add information about the kernels present to a wrapper object for each device | ||
image | ||
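
The grouping step in the list above can be sketched as follows. This is only an
illustration under stated assumptions: `Kernel`, `SplitMode`, `SymbolTable` and
`splitModule` are hypothetical names, not the data structures the tool actually
uses, and a module is reduced to the list of kernel names it contains.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; not the real compiler structures.
enum class SplitMode { PerSource, PerKernel, Off };

struct Kernel {
  std::string name; // kernel entry point name
  int tuId;         // translation unit ID attached by the front-end
};

// One output device module, represented only by its symbol table
// (the list of kernel names it contains).
using SymbolTable = std::vector<std::string>;

// Groups the kernels of a fully linked module into output modules.
std::vector<SymbolTable> splitModule(const std::vector<Kernel> &kernels,
                                     SplitMode mode) {
  // Assumption of this sketch: every kernel has a non-empty name.
  for (const Kernel &k : kernels)
    assert(!k.name.empty() && "every kernel needs a name");

  std::vector<SymbolTable> result;
  if (mode == SplitMode::Off) {
    // No split: a single module holding every kernel.
    SymbolTable all;
    for (const Kernel &k : kernels)
      all.push_back(k.name);
    result.push_back(all);
    return result;
  }
  if (mode == SplitMode::PerKernel) {
    // One module per kernel.
    for (const Kernel &k : kernels)
      result.push_back(SymbolTable{k.name});
    return result;
  }
  // PerSource: group by translation unit ID; std::map keeps the groups
  // ordered by ID.
  std::map<int, SymbolTable> groups;
  for (const Kernel &k : kernels)
    groups[k.tuId].push_back(k.name);
  for (const auto &entry : groups)
    result.push_back(entry.second);
  return result;
}
```

Each returned `SymbolTable` corresponds to one produced device module, which
would then go through SPIR-V translation and wrapping on its own.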
|
||
Device code splitting process: | ||
 | ||
|
||
The "split" box is implemented as functionality of the dedicated tool | ||
`sycl-post-link`. The tool runs a set of LLVM passes to split the input module | ||
and generates a symbol table (list of kernels) for each produced device module. | ||
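
To illustrate why a per-module symbol table is needed, here is a minimal sketch
of how a runtime could pick the device image that contains a requested kernel.
`DeviceImage` and `findImageForKernel` are hypothetical names introduced for
this example; the actual runtime structures and lookup strategy may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical device image descriptor; not the real runtime structure.
struct DeviceImage {
  std::string id;                       // illustrative label only
  std::vector<std::string> kernelNames; // symbol table emitted for the module
};

// Returns the index of the first image whose symbol table lists the kernel,
// or -1 if no image contains it.
int findImageForKernel(const std::vector<DeviceImage> &images,
                       const std::string &kernelName) {
  assert(!kernelName.empty() && "kernel name must not be empty");
  for (std::size_t i = 0; i < images.size(); ++i)
    for (const std::string &name : images[i].kernelNames)
      if (name == kernelName)
        return static_cast<int>(i);
  return -1;
}
```

With such a lookup, only the module that holds the requested kernel has to be
JIT-compiled, which is the motivation for splitting in the first place.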
|
||
To enable device code split, a special option must be passed to the clang | ||
driver: | ||
|
||
`-fsycl-device-code-split=<value>` | ||
|
||
There are three possible values for this option: | ||
* `per_source` - enables emitting a separate module for each source (translation | ||
unit) | ||
* `per_kernel` - enables emitting a separate module for each kernel | ||
* `off` - disables device code split | ||
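
The three values map naturally onto a small enumeration. The sketch below is
not the clang driver's actual option handling; `SplitMode`, `parseSplitValue`
and `splitValueName` are hypothetical names used only to show the mapping
between the option strings and the split modes.

```cpp
#include <cassert>
#include <string>

// Hypothetical enumeration of the split modes described above.
enum class SplitMode { PerSource, PerKernel, Off };

// Parses the <value> part of -fsycl-device-code-split=<value>.
// Returns false and leaves `mode` untouched for an unrecognized value.
bool parseSplitValue(const std::string &value, SplitMode &mode) {
  if (value == "per_source") {
    mode = SplitMode::PerSource;
    return true;
  }
  if (value == "per_kernel") {
    mode = SplitMode::PerKernel;
    return true;
  }
  if (value == "off") {
    mode = SplitMode::Off;
    return true;
  }
  return false;
}

// Inverse mapping: the option string for a given mode.
std::string splitValueName(SplitMode mode) {
  switch (mode) {
  case SplitMode::PerSource:
    return "per_source";
  case SplitMode::PerKernel:
    return "per_kernel";
  case SplitMode::Off:
    return "off";
  }
  assert(false && "unhandled SplitMode");
  return "";
}
```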
|
||
### Integration with SPIR-V format | ||
|
||
|