CUDA: Fix source location on kernel entry and enable breakpoints to be set on kernels by mangled name #6841
Merged: stuartarchibald merged 7 commits into numba:master from gmarkall:grm-cuda-debug-improvements on Jun 23, 2021
Commits on Mar 18, 2021
Separate DIBuilder API from Numba IR API
Marking a subprogram, variable, or location in the DIBuilder API requires passing in a `numba.core.ir.Loc` object. This makes sense when all debuginfo generation comes from lowering, but creates weird dependencies when generating code that doesn't relate to a specific piece of IR - for example, the kernel wrapper generated in the CUDA target. To mark locations in the kernel wrapper, one must import the `Loc` class from `numba.core.ir`, and construct instances of it to pass in, just to mark locations in generated code (or do something more odd with a "pretend" `Loc` object). Since the `Loc` instances are only used for the line number, this commit changes the interface of the relevant `DIBuilder` methods to accept a line number instead of a `Loc`, breaking the spurious coupling.
Commit 7d226f3
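The decoupling described in this commit can be illustrated with a small sketch. The `Loc` and `ToyDIBuilder` classes below are hypothetical stand-ins, not Numba's actual classes or signatures; they only show why accepting a plain line number lets generated code, such as a kernel wrapper, be marked without constructing an IR location object.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Loc:
    """Hypothetical stand-in for an IR source location."""
    filename: str
    line: int


@dataclass
class ToyDIBuilder:
    """Hypothetical debug-info recorder; not Numba's DIBuilder."""
    marked_lines: list = field(default_factory=list)

    def mark_location(self, line: int) -> None:
        # Only the line number is used, so that is all the method accepts.
        self.marked_lines.append(line)


dibuilder = ToyDIBuilder()

# Lowering code that already holds a Loc passes its line attribute.
ir_loc = Loc("kernel.py", 12)
dibuilder.mark_location(ir_loc.line)

# Generated code with no IR behind it passes a bare integer,
# with no need to import or fake a Loc object.
wrapper_line = 10
dibuilder.mark_location(wrapper_line)
```

Callers that have a `Loc` lose nothing, while callers that do not have one no longer need to depend on the IR module at all.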
CUDA: Mark kernel wrapper with subprogram and source location
Doing this provides a couple of improvements:
- Breaking on any kernel launch, for example with `set cuda break_on_launch application`, now results in GDB being able to find the source line at the break location. This enables the backtrace to show the mangled function name, source file, and line number. It also enables the text user interface to display the source when the kernel launches. From this point, stepping through line-by-line is feasible with the `next` command.
- Breakpoints can be set on a specific function and argument types, by setting a breakpoint on the mangled name.
(Neither of these worked prior to this change.)
Issues that remain include:
- GDB doesn't demangle the names of functions, like it seems to in the CPU target examples. Oddly, the act of marking the subprogram seems to break demangling: prior to this commit, the location of the wrapper kernel isn't known, but it does demangle correctly.
- Most locals appear to be optimized out even with `opt=0`. This also needs further investigation.
Commit 7f77680
Commit 1f9d320
Commits on Mar 19, 2021
CUDA: Compile modules with debug one at a time with NVVM
Includes:
- Calls `llvm_to_ptx` once for each IR module when compiling for debug.
- Don't adjust linkage of functions in linked modules when debugging, because we need device functions to be externally visible.
- Fixed setting of NVVM options when calling `compile_cuda` from kernel compilation and device function template compilation.
- Removes `debug_pubnames` patch.
Outcomes:
- The "Error: Debugging support cannot be enabled when number of debug compile units is more than 1" message is no longer produced with NVVM 3.4.
- NVVM 7.0: Everything still seems to "work" as much as it did before. Stepping may be more stable, but this needs a bit more verification.
Fixes numba#5311.
Commit 448b0d8
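The per-module strategy from the first bullet of this commit can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: `compile_modules` and `fake_backend` are hypothetical names, the backend is a placeholder for whatever call turns IR into PTX, and the non-debug branch (one call covering all modules) is assumed for contrast rather than taken from the commit message.

```python
from typing import Callable, List


def compile_modules(
    modules: List[str],
    backend: Callable[[List[str]], str],
    debug: bool,
) -> List[str]:
    """Return one PTX string per backend invocation.

    `backend` is a hypothetical stand-in for the IR-to-PTX call.
    """
    if debug:
        # One backend call per module, so each call sees exactly one
        # debug compile unit.
        return [backend([ir]) for ir in modules]
    # Assumed non-debug behaviour: a single call handles every module.
    return [backend(modules)]


def fake_backend(irs: List[str]) -> str:
    # Placeholder "compiler" that only records what it was given.
    return "ptx(" + ", ".join(irs) + ")"


debug_ptx = compile_modules(["module_a", "module_b"], fake_backend, debug=True)
plain_ptx = compile_modules(["module_a", "module_b"], fake_backend, debug=False)
```

With two modules, the debug path yields two separate outputs and the non-debug path yields one, which is the shape of the change that avoids presenting more than one debug compile unit to a single compilation.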
Commits on May 25, 2021
Commit 772ed7f
Commits on Jun 16, 2021
Commit 10d57b3
Commits on Jun 21, 2021
Changes in response to PR numba#6841 feedback
- Use `self.py_func.__code__` to get the filename and line number for the function for `prepare_cuda_kernel`, which seems a bit cleaner / neater than digging into the type annotation for it.
- Fix typo and explicitly link to an issue comment.
Commit 02fad5d
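For reference, reading the filename and line number from a function's code object looks like this in plain Python. The `axpy` function is a hypothetical example, and the snippet shows only the standard `__code__` attributes, not how `prepare_cuda_kernel` consumes them.

```python
def axpy(a, x, y):
    # Example function; the body is irrelevant to the lookup below.
    return a * x + y


code = axpy.__code__
filename = code.co_filename    # file in which the function was defined
linenum = code.co_firstlineno  # first line number of the function

print(f"{code.co_name} defined at {filename}:{linenum}")
```

Both attributes are available on any pure-Python function, so no type annotation or IR needs to be consulted to locate the source.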