forked from numba/numba
Starting with CUDA 11.2, a new version of NVVM is provided that is based on LLVM 7.0. Supporting it requires a number of changes, which must be maintained in parallel with the existing support for NVVM based on LLVM 3.4. This PR adds these changes, which consist of:

- Addition of a function to query the NVVM IR version, and a property indicating whether the NVVM in use is based on LLVM 3.4 or 7.0 (`is_nvvm70`).
- The CAS hack (inserting a text-based implementation of `cmpxchg` with pre-LLVM 3.5 semantics in a function) is only needed with NVVM 3.4. On NVVM 7.0, llvmlite is used to build `cmpxchg` instructions directly instead.
- Templates for other atomics (inc, dec, min, max) have the right form of the `cmpxchg` instruction inserted depending on the NVVM version.
- The datalayout shorthand is now only replaced for NVVM 3.4.
- There are now two variants of the function that rewrites the IR: `llvm100_to_70_ir` and `llvm100_to_34_ir`. `llvm100_to_34_ir` is the old `llvm_39_to_34_ir` with a name reflecting what it currently does.
- `llvm100_to_70_ir` removes the `willreturn` attribute from functions, as it is not supported by LLVM 7.0. It also converts DISPFlags to main subprogram DIFlags. For example, `spflags: DISPFlagDefinition | DISPFlagOptimized` is rewritten as `isDefinition: true, isOptimized: true`.
- For NVVM 7.0, the `DIBuilder` also used for the CPU target can be used, instead of the `NvvmDIBuilder` that was needed to support NVVM 3.4.
- Some tests are updated to support modified function names, and also to expect a CUDA version of 11.2.
- `test_nvvm_driver` is updated to include appropriate IR for both NVVM 3.4 and 7.0. Some refactoring also makes its code clearer (e.g. renaming `get_ptx()` to `get_nvvmir()`, because it returns NVVM IR and not PTX).
- Some optimizations in LLVM 7.0 result in different code generation in `test_constmem`, so alternative expected results are added for when NVVM 7.0 is used. Note that this recovers some optimizations that were lost when IR optimization using llvmlite was switched off (PR numba#6030, "Don't optimize IR before sending it to NVVM").
- `test_debuginfo` is updated to match the format of the debuginfo section produced by both NVVM 3.4 and 7.0 (there is some variation in whitespace between these versions).
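To illustrate the kind of text rewrite described for `llvm100_to_70_ir`, here is a minimal sketch. It is not the code from this commit: the function name `downgrade_ir_sketch`, the regular expressions, the flag table, and the choice to strip `willreturn` only on `attributes #` lines are all assumptions made for illustration. Only the two flags named in the example above are handled.

```python
import re

# Hypothetical table: only the two flags mentioned in the commit message.
_SPFLAG_TO_FIELD = {
    "DISPFlagDefinition": "isDefinition: true",
    "DISPFlagOptimized": "isOptimized: true",
}

# Matches e.g. "spflags: DISPFlagDefinition | DISPFlagOptimized".
_SPFLAGS_RE = re.compile(r"spflags: (DISPFlag\w+(?:\s*\|\s*DISPFlag\w+)*)")
# Matches the attribute name plus one trailing whitespace character, if any.
_WILLRETURN_RE = re.compile(r"\bwillreturn\s?")


def _rewrite_spflags(match):
    flags = [flag.strip() for flag in match.group(1).split("|")]
    if any(flag not in _SPFLAG_TO_FIELD for flag in flags):
        # Unknown flag: leave the text untouched rather than guess.
        return match.group(0)
    return ", ".join(_SPFLAG_TO_FIELD[flag] for flag in flags)


def downgrade_ir_sketch(ir_text):
    """Rewrite newer-style IR text into a form closer to LLVM 7.0 (sketch)."""
    out = []
    for line in ir_text.splitlines():
        # Assumption: willreturn only needs removing from attribute groups.
        if line.startswith("attributes #"):
            line = _WILLRETURN_RE.sub("", line)
        line = _SPFLAGS_RE.sub(_rewrite_spflags, line)
        out.append(line)
    return "\n".join(out)


example_ir = (
    "attributes #0 = { nounwind willreturn }\n"
    '!5 = distinct !DISubprogram(name: "f", '
    "spflags: DISPFlagDefinition | DISPFlagOptimized, unit: !0)"
)
print(downgrade_ir_sketch(example_ir))
```

A line-by-line textual rewrite like this is simple to apply to IR that has already been serialized, which is the setting the commit message describes. A real implementation would need to cover whatever other flags and attributes the generated IR actually contains.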
Showing 10 changed files with 234 additions and 118 deletions.