Skip to content

rapidsai/detect-weak-linking

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Overview

Built to detect weak external linkage of CUDA kernels that can lead to unexpected behavior.

Weak external linkage of CUDA kernels is a source of incorrect behavior that is non-obvious and incredibly hard to track. Incorrect behavior can include:

  • kernel not being executed at all
  • __global__ function call is not configured errors
  • kernel for wrong SM being launched
  • random segfaults

This happens due to the fact that the host side entry point for the kernel has weak linkage, and therefore if multiple DSO have that symbol only one will exist at runtime. This collapsing of symbols breaks certain assumptions in the CUDA runtime and in effect creates undefined behavior.

Command Line Syntax

The detect.py tool offers a suite of options/flags to allow users to extract the specific details they are looking for. Any detect issues will cause the program to return an exit code of 1, and output json dictonary of the issues.

detect.py -m <libA> <libB> # List only kernels that exist in more than one of the provided library
detect.py -m <directory> # List only kernels that exist in more than one of the libraries inside the given directory
detect.py -m -r <exe|lib> # Look at the provided library, and all ldd (`DT_NEEDED`) dependencies

Options and Flags:

  • r Scan all dependencies of the provided input(s).
  • m Only consider kernels that exist in multiple files ( includes dependencies )
  • b Only output new symbol usage that doesnt't exist in the json baseline file
  • e Filter out certain c++ symbols from output with provided regex
  • u Include public global process unique variables in the output

Json Output Format

When detect.py outputs any issues to standard output using the following format:

{
  "_Z9TrickySigPfPiPdi": {
    "/path/to/file/example_A.so": [
      "SASS",
      "SASS",
      "SASS",
      "PTX"
    ],
    "/path/to/file/example_B.so": [
      "SASS"
    ],
    "symbol": "TrickySig(float*, int*, double*, int)"
  }
}

The root object contains a key/value map of all symbols where the key is the mangled symbol name and the value is a dictonary object which contains the follwing entries:

symbol Contains the C++ demangled name of the symbol

<path> Each file that has the symbol will have a separate key / value entry. The file path ( including name and extension ) is the key, and the value is a list of symbol instance types. Due to CUDA whole compilation mode it is possible for the same symbol to exist multiple times inside the same library ( de-duplicated at runtime ).

Symbol instance types:

  • "PTX" Means that kernel instance will be compiled at runtime ( compute90 )
  • "SASS" Means the kernel instance is for a specific for GPU version ( sm90 )
  • "u" Means the symbol is a globally unique variable across the entire process space

Baselines

Baseline files allow projects to easily detect when new weak symbols are introduced into a project. This can by running the detect.py as part of a CI pipeline, or part of a git bisect script.

A baseline file can be generated by running detect.py on the project from a 'good' configuration and piping the output to a file.

When detect.py is executed with the baseline option (-b) the following rules are applied to determine what output will be shown.

  1. The execution of detect.py is done as normal reading all input and obeyin the -r flag
  2. Any other filtering flags ( -m, -e, -u ) will be processed
  3. The output json after all filtering will be compared to the json in the baseline file. - Symbols will be considered to match if the mangled names are the same - Files will be matched on just the filename plus extension, with any preceding path ( absolute or relateive ) ignored

A symbol will be considered to be new and part of the output, if it never existed in the baseline file, or if the baseline has smaller total number of files, or symbols per file.

Example Use Cases

Detect multiple instances of a kernel weak linkage in an existing application

detect.py -r exe

Detect addition of new weak linkage between builds

detect.py -r exe -b <path/to/baseline.json>

Detect internal variables that will be fused across DSO at runtime

detect.py -r -m -u exe

About

No description, website, or topics provided.

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages