CUDA: Fix source location on kernel entry and enable breakpoints to be set on kernels by mangled name #6841

Merged: 7 commits, Jun 23, 2021

Commits on Mar 18, 2021

  1. Separate DIBuilder API from Numba IR API

    Marking a subprogram, variable, or location in the DIBuilder API
    requires passing in a `numba.core.ir.Loc` object. This makes sense when
    all debuginfo generation comes from lowering, but creates weird
    dependencies when generating code that doesn't relate to a specific
    piece of IR - for example, the kernel wrapper generated in the CUDA
    target. To mark locations in the kernel wrapper, one must import the
    `Loc` class from `numba.core.ir` and construct instances of it solely
    to pass them in (or resort to something odder, such as a "pretend"
    `Loc` object).
    
    Since the `Loc` instances are only used for the line number, this commit
    changes the interface of the relevant `DIBuilder` methods to accept a
    line number instead of a `Loc`, breaking the spurious coupling.
    gmarkall committed Mar 18, 2021 (7d226f3)
  2. CUDA: Mark kernel wrapper with subprogram and source location

    Doing this provides a couple of improvements:
    
    - Breaking on any kernel launch, for example with
      `set cuda break_on_launch application`, now lets GDB find the source
      line at the break location. The backtrace can then show the mangled
      function name, source file, and line number, and the text user
      interface can display the source when the kernel launches. From this
      point, stepping through line-by-line is feasible with the `next`
      command.
    - A breakpoint can be set for a specific function and set of argument
      types by breaking on the mangled name.
    
    (Neither of these worked prior to this change.)
    
    Issues that remain include:
    
    - GDB doesn't demangle the names of functions, as it seems to in the
      CPU target examples. Oddly, the act of marking the subprogram seems to
      break demangling: prior to this commit, the location of the wrapper
      kernel wasn't known, but its name did demangle correctly.
    - Most locals appear to be optimized out even with `opt=0`. This also
      needs further investigation.
    gmarkall committed Mar 18, 2021 (7f77680)
  3. Commit 1f9d320
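
The decoupling described in commit 7d226f3 above can be pictured with a small sketch. The classes below are hypothetical stand-ins, not Numba's actual `DIBuilder` or `Loc`, and the method name and signature are assumptions made for illustration. The point is only that a caller with no IR at hand, such as code generating a kernel wrapper, can pass a plain integer line number.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLoc:
    """Hypothetical stand-in for an IR location object."""
    filename: str
    line: int


@dataclass
class ToyDIBuilder:
    """Hypothetical debug-info recorder; not Numba's real DIBuilder."""
    marked_lines: list = field(default_factory=list)

    def mark_location(self, line: int) -> None:
        # Only the line number is needed, so that is all the method accepts.
        self.marked_lines.append(int(line))


dibuilder = ToyDIBuilder()

# Lowering code that already holds a location object passes its line number.
loc = ToyLoc("kernel.py", 12)
dibuilder.mark_location(loc.line)

# Generated code with no IR behind it (e.g. a wrapper) passes a bare integer.
dibuilder.mark_location(10)
```

With this shape, the wrapper-generating code needs no import of the location class at all.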

Commits on Mar 19, 2021

  1. CUDA: Compile modules with debug one at a time with NVVM

    Includes:
    
    - Call `llvm_to_ptx` once for each IR module when compiling for debug.
    - Don't adjust linkage of functions in linked modules when debugging,
      because we need device functions to be externally visible.
    - Fix setting of NVVM options when calling `compile_cuda` from kernel
      compilation and device function template compilation.
    - Remove the `debug_pubnames` patch.
    
    Outcomes:
    
    - NVVM 3.4: The "Error: Debugging support cannot be enabled when number
      of debug compile units is more than 1" message is no longer produced.
    - NVVM 7.0: Everything still seems to "work" as much as it did before.
      Stepping may be more stable, but this needs a bit more verification.
    
    Fixes numba#5311.
    gmarkall committed Mar 19, 2021 (448b0d8)
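
The "one module at a time" approach in commit 448b0d8 above can be sketched roughly as follows. Everything here is a toy: `toy_llvm_to_ptx` is a hypothetical stand-in that only mimics the "more than 1 debug compile unit" limitation quoted in the commit message, and compiling all modules in a single call for non-debug builds is an assumption rather than something the commit message states.

```python
def toy_llvm_to_ptx(modules, debug=False):
    """Hypothetical stand-in for a compiler entry point.

    Mimics the limitation quoted above: with debug enabled, it rejects
    input holding more than one module (one compile unit per module).
    """
    if debug and len(modules) > 1:
        raise RuntimeError("more than 1 debug compile unit")
    return "\n".join(f"// ptx for {m}" for m in modules)


def compile_modules(modules, debug=False):
    """Return a list of PTX strings for the given IR modules."""
    if debug:
        # Debug build: one compiler call per module.
        return [toy_llvm_to_ptx([m], debug=True) for m in modules]
    # Non-debug build (assumed): a single call with all modules together.
    return [toy_llvm_to_ptx(modules, debug=False)]


ptx_debug = compile_modules(["kernel", "device_func"], debug=True)
ptx_release = compile_modules(["kernel", "device_func"], debug=False)
```

In the debug case each module yields its own PTX output, which is also why device functions would need to stay externally visible: they are resolved across separately compiled outputs at link time.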

Commits on May 25, 2021

  1. Commit 772ed7f

Commits on Jun 16, 2021

  1. Commit 10d57b3

Commits on Jun 21, 2021

  1. Changes in response to PR numba#6841 feedback

    - Use `self.py_func.__code__` to get the filename and line number of
      the function for `prepare_cuda_kernel`, which seems a bit cleaner
      than digging into the type annotation for it.
    - Fix typo and explicitly link to an issue comment.
    gmarkall committed Jun 21, 2021 (02fad5d)
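
For reference, the `__code__` lookup mentioned in commit 02fad5d above uses standard Python function attributes. The sketch below applies it to a plain function named `axpy`, which is a made-up example; in the commit, the same attributes are read from the Python function held in `self.py_func`.

```python
def axpy(r, a, x, y):
    r[0] = a * x[0] + y[0]


code = axpy.__code__
filename = code.co_filename        # path of the file defining the function
first_line = code.co_firstlineno   # line number where the definition starts
```

Both values come straight from the code object, so no type annotation or IR needs to be consulted.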