New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CUDA: Fix and deprecate inspect_ptx()
, fix NVVM option setup for device functions
#6953
Conversation
This fixes issues with device functions: - `inspect_ptx` was broken by PR numba#6735 (commit 11a7397). The device function should use the code library to obtain the PTX instead of attempting to compile directly itself. - Optimization and debug options were not passed to NVVM when compiling device functions. These options are now set up and passed to `compile_cuda` when a device function is compiled. This commit also deprecates the use of `inspect_ptx` - the preferred API for compiling Python to PTX is to use the `compile_ptx` function instead. The `inspect_ptx()` method had a couple of issues: - Allowing it to receive a dict of options for NVVM enabled it to produce different PTX to what might actually have been compiled so far, as highlighted by the fact that optimization and debug flags failed to be passed to NVVM for the `compile()` method, but not in the `inspect_ptx()` method. - The NVVM options dict kwarg was unsafely initialized anyway (`nvvm_options={}`) - See issue numba#5811. - It returned encoded bytes, which differs from similar APIs (e.g. `inspect_llvm()`, and the Dispatcher's `inspect_asm()`, which return `str` instead. As there is no easy way to correctly and consistently pass NVVM options through `inspect_ptx()`, a warning is emitted stating that these are ignored if a user passes any options in.
Some small Numba-related changes: - Testing on CI appears to be picking up Numba 0.53.1, and there have been quite a lot of CUDA changes between Numba 0.49 (the old minimum) and 0.53.1 - increase the minimum required Numba version to 0.53.1 accordingly. - Remove old import guards and alternatives for Numba versions < 0.49 - these were needed because of Numba's internals refactor between 0.48 and 0.49. - Replace the use of `inspect_ptx()` with `compile_ptx()` in `test_ptx_generic()` - `inspect_ptx()` has always had various issues so it is being deprecated and will be removed in a later version. `compile_ptx()` provides a better alternative. See numba/numba#6953
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch, looks good. Just a couple of minor queries to resolve. Thanks again.
msg = ('inspect_ptx for device functions is deprecated. Use ' | ||
'compile_ptx instead.') | ||
warn(msg, category=NumbaDeprecationWarning) | ||
|
||
if nvvm_options: | ||
msg = ('nvvm_options are ignored. Use compile_ptx if you want to ' | ||
'set NVVM options.') | ||
warn(msg, category=NumbaDeprecationWarning) | ||
return self.compile(args).library.get_asm_str().encode() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a note in the deprecation notices?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Doc build https://numba--6953.org.readthedocs.build/en/6953/reference/deprecation.html#deprecation-of-the-inspect-ptx-method looks good.
@gmarkall many thanks for the updates. Please could you resolve the conflict and then this can be run through the farm? Thanks again! |
Thanks for the review - the conflicts are fixed but the tests fail due to master being broken for CUDA at the moment, e.g.:
|
Some small Numba-related changes: - Testing on CI appears to be picking up Numba 0.53.1, and there have been quite a lot of CUDA changes between Numba 0.49 (the old minimum) and 0.53.1 - increase the minimum required Numba version to 0.53.1 accordingly. - Remove old import guards and alternatives for Numba versions < 0.49 - these were needed because of Numba's internals refactor between 0.48 and 0.49. - Replace the use of `inspect_ptx()` with `compile_ptx()` in `test_ptx_generic()` - `inspect_ptx()` has always had various issues so it is being deprecated and will be removed in a later version. `compile_ptx()` provides a better alternative. See numba/numba#6953 Authors: - Graham Markall (https://github.com/gmarkall) - Keith Kraus (https://github.com/kkraus14) Approvers: - Keith Kraus (https://github.com/kkraus14) - AJ Schmidt (https://github.com/ajschmidt8) - GALI PREM SAGAR (https://github.com/galipremsagar) URL: #8017
@stuartarchibald Now that master is fixed, this PR is working locally for me again - I've merged in master too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch and fixes.
Buildfarm ID: |
Passed. |
This fixes issues with device functions:
inspect_ptx
was broken by PR CUDA: Don't parse IR for modules with llvmlite #6735 (commit 11a7397). The device function should use the code library to obtain the PTX instead of attempting to compile directly itself.compile_cuda
when a device function is compiled.This commit also deprecates the use of
inspect_ptx
- the preferred API for compiling Python to PTX is to use thecompile_ptx
function instead. Theinspect_ptx()
method had a couple of issues:compile()
method, but not in theinspect_ptx()
method.nvvm_options={}
) - See issue Eliminate all dangerous default values found by pylint #5811.inspect_llvm()
, and the Dispatcher'sinspect_asm()
, which returnstr
instead.As there is no easy way to correctly and consistently pass NVVM options through
inspect_ptx()
, a warning is emitted stating that these are ignored if a user passes any options in.Some tests are also added, to ensure that it works correctly, and warns the user appropriately.
Fixes #6950.