Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NVPTX] Add ELF flags for Nvidia cubin files #75624

Merged
merged 1 commit into from
Dec 15, 2023
Merged

Conversation

jhuber6
Copy link
Contributor

@jhuber6 jhuber6 commented Dec 15, 2023

Summary:
Nvidia uses ELF as its file format for cubin files. This patch adds
support to allow detecting the architecture using the ELF flags only.
This will be used in the offloading runtime in the future.

These values are completely undocumented. They were determined by
manually modifying the ELF header of the cubin and checking the output
of the nvisasm tool.

Summary:
Nvidia uses ELF as its file format for cubin files. This patch adds
support to allow detecting the architecture using the ELF flags only.
This will be used in the offloading runtime in the future.

These values are completely undocumented. They were determined by
manually modifying the ELF header of the cubin and checking the output
of the `nvisasm` tool.
@llvmbot
Copy link
Collaborator

llvmbot commented Dec 15, 2023

@llvm/pr-subscribers-llvm-binary-utilities

Author: Joseph Huber (jhuber6)

Changes

Summary:
Nvidia uses ELF as its file format for cubin files. This patch adds
support to allow detecting the architecture using the ELF flags only.
This will be used in the offloading runtime in the future.

These values are completely undocumented. They were determined by
manually modifying the ELF header of the cubin and checking the output
of the nvisasm tool.


Full diff: https://github.com/llvm/llvm-project/pull/75624.diff

1 Files Affected:

  • (modified) llvm/include/llvm/BinaryFormat/ELF.h (+43)
diff --git a/llvm/include/llvm/BinaryFormat/ELF.h b/llvm/include/llvm/BinaryFormat/ELF.h
index da38f6ef064f95..0f968eac36e72f 100644
--- a/llvm/include/llvm/BinaryFormat/ELF.h
+++ b/llvm/include/llvm/BinaryFormat/ELF.h
@@ -846,6 +846,49 @@ enum {
 #include "ELFRelocs/AMDGPU.def"
 };
 
+// NVPTX specific e_flags.
+enum : unsigned {
+  // Processor selection mask for EF_CUDA_SM* values.
+  EF_CUDA_SM = 0xff,
+
+  // SM based processor values.
+  EF_CUDA_SM20 = 0x14,
+  EF_CUDA_SM21 = 0x15,
+  EF_CUDA_SM30 = 0x1e,
+  EF_CUDA_SM32 = 0x20,
+  EF_CUDA_SM35 = 0x23,
+  EF_CUDA_SM37 = 0x25,
+  EF_CUDA_SM50 = 0x32,
+  EF_CUDA_SM52 = 0x34,
+  EF_CUDA_SM53 = 0x35,
+  EF_CUDA_SM60 = 0x3c,
+  EF_CUDA_SM61 = 0x3d,
+  EF_CUDA_SM62 = 0x3e,
+  EF_CUDA_SM70 = 0x46,
+  EF_CUDA_SM72 = 0x48,
+  EF_CUDA_SM75 = 0x4b,
+  EF_CUDA_SM80 = 0x50,
+  EF_CUDA_SM86 = 0x56,
+  EF_CUDA_SM87 = 0x57,
+  EF_CUDA_SM89 = 0x59,
+  // The sm_90a variant uses the same machine flag.
+  EF_CUDA_SM90 = 0x5a,
+
+  // Unified texture binding is enabled.
+  EF_CUDA_TEXMODE_UNIFIED = 0x100,
+  // Independent texture binding is enabled.
+  EF_CUDA_TEXMODE_INDEPENDANT = 0x200,
+  // The target is using 64-bit addressing.
+  EF_CUDA_64BIT_ADDRESS = 0x400,
+  // Set when using the sm_90a processor.
+  EF_CUDA_ACCELERATORS = 0x800,
+  // Undocumented software feature.
+  EF_CUDA_SW_FLAG_V2 = 0x1000,
+
+  // Virtual processor selection mask for EF_CUDA_VIRTUAL_SM* values.
+  EF_CUDA_VIRTUAL_SM = 0xff0000,
+};
+
 // ELF Relocation types for BPF
 enum {
 #include "ELFRelocs/BPF.def"

@jlebar
Copy link
Member

jlebar commented Dec 15, 2023

LGTM

This will be used in the offloading runtime in the future.

This is the first I'm learning about this; what is it?

@jhuber6
Copy link
Contributor Author

jhuber6 commented Dec 15, 2023

LGTM

This will be used in the offloading runtime in the future.

This is the first I'm learning about this; what is it?

It's the OpenMP offloading runtime, the one under openmp/libomptarget. There's currently some discussions about moving it around and making it generic for other users. This was needed so we can determine if a given GPU image is compatible with the system just using the ELF.

@MaskRay MaskRay requested a review from jh7370 December 15, 2023 18:44
EF_CUDA_SM90 = 0x5a,

// Unified texture binding is enabled.
EF_CUDA_TEXMODE_UNIFIED = 0x100,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specification for these values? Or are they derived from .headerflags output?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No specification as far as I'm aware, Nvidia does not document much of anything about their binaries. I had to reverse engineer this from the tools, but it's consistent.

@jhuber6 jhuber6 merged commit 8c262ed into llvm:main Dec 15, 2023
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants