[NVPTX] Add ELF flags for Nvidia cubin files #75624
Summary: Nvidia uses ELF as its file format for cubin files. This patch adds support to allow detecting the architecture using the ELF flags only. This will be used in the offloading runtime in the future. These values are completely undocumented. They were determined by manually modifying the ELF header of the cubin and checking the output of the `nvdisasm` tool.
@llvm/pr-subscribers-llvm-binary-utilities
Author: Joseph Huber (jhuber6)
Full diff: https://github.com/llvm/llvm-project/pull/75624.diff
1 Files Affected:
diff --git a/llvm/include/llvm/BinaryFormat/ELF.h b/llvm/include/llvm/BinaryFormat/ELF.h
index da38f6ef064f95..0f968eac36e72f 100644
--- a/llvm/include/llvm/BinaryFormat/ELF.h
+++ b/llvm/include/llvm/BinaryFormat/ELF.h
@@ -846,6 +846,49 @@ enum {
#include "ELFRelocs/AMDGPU.def"
};
+// NVPTX specific e_flags.
+enum : unsigned {
+ // Processor selection mask for EF_CUDA_SM* values.
+ EF_CUDA_SM = 0xff,
+
+ // SM based processor values.
+ EF_CUDA_SM20 = 0x14,
+ EF_CUDA_SM21 = 0x15,
+ EF_CUDA_SM30 = 0x1e,
+ EF_CUDA_SM32 = 0x20,
+ EF_CUDA_SM35 = 0x23,
+ EF_CUDA_SM37 = 0x25,
+ EF_CUDA_SM50 = 0x32,
+ EF_CUDA_SM52 = 0x34,
+ EF_CUDA_SM53 = 0x35,
+ EF_CUDA_SM60 = 0x3c,
+ EF_CUDA_SM61 = 0x3d,
+ EF_CUDA_SM62 = 0x3e,
+ EF_CUDA_SM70 = 0x46,
+ EF_CUDA_SM72 = 0x48,
+ EF_CUDA_SM75 = 0x4b,
+ EF_CUDA_SM80 = 0x50,
+ EF_CUDA_SM86 = 0x56,
+ EF_CUDA_SM87 = 0x57,
+ EF_CUDA_SM89 = 0x59,
+ // The sm_90a variant uses the same machine flag.
+ EF_CUDA_SM90 = 0x5a,
+
+ // Unified texture binding is enabled.
+ EF_CUDA_TEXMODE_UNIFIED = 0x100,
+ // Independent texture binding is enabled.
+ EF_CUDA_TEXMODE_INDEPENDANT = 0x200,
+ // The target is using 64-bit addressing.
+ EF_CUDA_64BIT_ADDRESS = 0x400,
+ // Set when using the sm_90a processor.
+ EF_CUDA_ACCELERATORS = 0x800,
+ // Undocumented software feature.
+ EF_CUDA_SW_FLAG_V2 = 0x1000,
+
+ // Virtual processor selection mask for EF_CUDA_VIRTUAL_SM* values.
+ EF_CUDA_VIRTUAL_SM = 0xff0000,
+};
+
// ELF Relocation types for BPF
enum {
#include "ELFRelocs/BPF.def"
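
For readers who want to see how these flags might be consumed, here is a minimal sketch of decoding an `e_flags` word. It relies on one observation from the diff: every `EF_CUDA_SM*` value is the SM number itself written in hex (`0x14` is 20, `0x5a` is 90). The helper names `cudaArchName` and `cudaVirtualSM` are hypothetical and not part of this patch, the constants are copied locally so the snippet builds without LLVM, and the 16-bit shift for the virtual SM field is an assumption inferred from the position of the `EF_CUDA_VIRTUAL_SM` mask.

```cpp
#include <cstdint>
#include <string>

// Constants copied from the patch; in LLVM they live in
// llvm/BinaryFormat/ELF.h under the llvm::ELF namespace.
enum : unsigned {
  EF_CUDA_SM = 0xff,             // Processor selection mask.
  EF_CUDA_ACCELERATORS = 0x800,  // Set when using the sm_90a processor.
  EF_CUDA_VIRTUAL_SM = 0xff0000, // Virtual processor selection mask.
};

// Hypothetical helper: build an "sm_XX" style name from e_flags.
// The low byte holds the SM number directly (e.g. 0x5a == 90).
std::string cudaArchName(uint32_t EFlags) {
  unsigned SM = EFlags & EF_CUDA_SM;
  std::string Name = "sm_" + std::to_string(SM);
  // Per the patch comments, sm_90a shares the sm_90 value and is
  // distinguished only by the accelerators bit.
  if (EFlags & EF_CUDA_ACCELERATORS)
    Name += "a";
  return Name;
}

// Hypothetical helper: extract the virtual SM field.
// Assumption: the field is the masked byte shifted down by 16 bits.
unsigned cudaVirtualSM(uint32_t EFlags) {
  return (EFlags & EF_CUDA_VIRTUAL_SM) >> 16;
}
```

With these definitions, an `e_flags` value of `0x5a` decodes to `sm_90`, and the same value with `EF_CUDA_ACCELERATORS` set decodes to `sm_90a`.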
LGTM
This is the first I'm learning about this; what is it?
It's the OpenMP offloading runtime, the one under
EF_CUDA_SM90 = 0x5a,

// Unified texture binding is enabled.
EF_CUDA_TEXMODE_UNIFIED = 0x100,
Is there a specification for these values? Or are they derived from .headerflags output?
No specification as far as I'm aware; Nvidia does not document much of anything about their binaries. I had to reverse engineer this from the tools, but it's consistent.