Skip to content

Conversation

@OCHyams
Copy link
Contributor

@OCHyams OCHyams commented Oct 21, 2025

DW_TAG_base_type DIEs are permitted to have both byte_size and bit_size attributes "If the value of an object of the given type does not fully occupy the storage described by a byte size attribute"

Change Clang to emit the actual bit-size of _BitInt in debug metadata, and change LLVM to add DW_AT_bit_size to base_type DIEs when the condition above is true.

Fixes #61952

DW_TAG_base_type DIEs are permitted to have both byte_size and bit_size
attributes "If the value of an object of the given type does not fully occupy
the storage described by a byte size attribute"

Change Clang to emit the actual bit-size of _BitInt in debug metadata, and
change LLVM to add DW_AT_bit_size to base_type DIEs when the condition above is
true.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. llvm:codegen debuginfo labels Oct 21, 2025
@llvmbot
Copy link
Member

llvmbot commented Oct 21, 2025

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-clang-codegen

Author: Orlando Cazalet-Hyams (OCHyams)

Changes

DW_TAG_base_type DIEs are permitted to have both byte_size and bit_size attributes "If the value of an object of the given type does not fully occupy the storage described by a byte size attribute"

Change Clang to emit the actual bit-size of _BitInt in debug metadata, and change LLVM to add DW_AT_bit_size to base_type DIEs when the condition above is true.

Fixes #61952


Full diff: https://github.com/llvm/llvm-project/pull/164372.diff

5 Files Affected:

  • (modified) clang/lib/CodeGen/CGDebugInfo.cpp (+1-3)
  • (added) clang/test/DebugInfo/Generic/bit-int.c (+8)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp (+5)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp (+11-2)
  • (added) llvm/test/DebugInfo/bit-int-size.ll (+38)
diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp
index 12e2813ef2ec7..2ad187de16acc 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -1152,9 +1152,7 @@ llvm::DIType *CGDebugInfo::CreateType(const BitIntType *Ty) {
   llvm::dwarf::TypeKind Encoding = Ty->isUnsigned()
                                        ? llvm::dwarf::DW_ATE_unsigned
                                        : llvm::dwarf::DW_ATE_signed;
-
-  return DBuilder.createBasicType(Name, CGM.getContext().getTypeSize(Ty),
-                                  Encoding);
+  return DBuilder.createBasicType(Name, Ty->getNumBits(), Encoding);
 }
 
 llvm::DIType *CGDebugInfo::CreateType(const ComplexType *Ty) {
diff --git a/clang/test/DebugInfo/Generic/bit-int.c b/clang/test/DebugInfo/Generic/bit-int.c
new file mode 100644
index 0000000000000..16b3d7a2582c9
--- /dev/null
+++ b/clang/test/DebugInfo/Generic/bit-int.c
@@ -0,0 +1,8 @@
+// RUN: %clang_cc1 -x c++ %s -debug-info-kind=standalone -gno-column-info -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 -x c   %s -debug-info-kind=standalone -gno-column-info -emit-llvm -o - | FileCheck %s
+
+unsigned _BitInt(2) a;
+_BitInt(2) b;
+
+// CHECK: !DIBasicType(name: "_BitInt", size: 2, encoding: DW_ATE_signed)
+// CHECK: !DIBasicType(name: "unsigned _BitInt", size: 2, encoding: DW_ATE_unsigned)
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
index 518121e200190..6f68548aa25f4 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
@@ -1795,6 +1795,11 @@ void DwarfCompileUnit::createBaseTypeDIEs() {
     // Round up to smallest number of bytes that contains this number of bits.
     addUInt(Die, dwarf::DW_AT_byte_size, std::nullopt,
             divideCeil(Btr.BitSize, 8));
+    // If the value of an object of the given type does not fully occupy the
+    // storage described by a byte size attribute, the base type entry may also
+    // have a DW_AT_bit_size [...] attribute.
+    if (Btr.BitSize && (Btr.BitSize % 8))
+      addUInt(Die, dwarf::DW_AT_bit_size, std::nullopt, Btr.BitSize);
 
     Btr.Die = &Die;
   }
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
index e40fb768027b8..1253d9a463cbb 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
@@ -766,8 +766,17 @@ void DwarfUnit::constructTypeDIE(DIE &Buffer, const DIBasicType *BTy) {
     addUInt(Buffer, dwarf::DW_AT_encoding, dwarf::DW_FORM_data1,
             BTy->getEncoding());
 
-  uint64_t Size = BTy->getSizeInBits() >> 3;
-  addUInt(Buffer, dwarf::DW_AT_byte_size, std::nullopt, Size);
+  uint64_t SizeInBytes = divideCeil(BTy->getSizeInBits(), 8);
+  addUInt(Buffer, dwarf::DW_AT_byte_size, std::nullopt, SizeInBytes);
+  if (BTy->getTag() == dwarf::Tag::DW_TAG_base_type) {
+    // DW_TAG_base_type:
+    // If the value of an object of the given type does not fully occupy the
+    // storage described by a byte size attribute, the base type entry may also
+    // have a DW_AT_bit_size [...] attribute.
+    if (uint64_t SizeInBits = BTy->getSizeInBits();
+        SizeInBits && SizeInBits % 8)
+      addUInt(Buffer, dwarf::DW_AT_bit_size, std::nullopt, SizeInBits);
+  }
 
   if (BTy->isBigEndian())
     addUInt(Buffer, dwarf::DW_AT_endianity, std::nullopt, dwarf::DW_END_big);
diff --git a/llvm/test/DebugInfo/bit-int-size.ll b/llvm/test/DebugInfo/bit-int-size.ll
new file mode 100644
index 0000000000000..69ab756f4e288
--- /dev/null
+++ b/llvm/test/DebugInfo/bit-int-size.ll
@@ -0,0 +1,38 @@
+; RUN: %llc_dwarf %s -filetype=obj -o - | llvm-dwarfdump - | FileCheck %s
+; REQUIRES: object-emission
+
+;; Check base types with bit-sizes that don't fit fully fit within a byte
+;; multiple get both a a byte_size and bit_size attribute.
+
+; CHECK: DW_TAG_base_type
+; CHECK-NEXT: DW_AT_name      ("unsigned _BitInt")
+; CHECK-NEXT: DW_AT_encoding  (DW_ATE_unsigned)
+; CHECK-NEXT: DW_AT_byte_size (0x02)
+; CHECK-NEXT: DW_AT_bit_size  (0x09)
+
+; CHECK: DW_TAG_base_type
+; CHECK-NEXT: DW_AT_name      ("unsigned _BitInt")
+; CHECK-NEXT: DW_AT_encoding  (DW_ATE_signed)
+; CHECK-NEXT: DW_AT_byte_size (0x01)
+; CHECK-NEXT: DW_AT_bit_size  (0x02)
+
+@a = global i8 0, align 1, !dbg !0
+@b = global i8 0, align 1, !dbg !5
+
+!llvm.dbg.cu = !{!2}
+!llvm.module.flags = !{!10, !11}
+!llvm.ident = !{!12}
+
+!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
+!1 = distinct !DIGlobalVariable(name: "a", scope: !2, file: !7, line: 4, type: !9, isLocal: false, isDefinition: true)
+!2 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !3, producer: "clang version 22.0.0git", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, globals: !4, splitDebugInlining: false, nameTableKind: None)
+!3 = !DIFile(filename: "bit-int.c", directory: "/")
+!4 = !{!0, !5}
+!5 = !DIGlobalVariableExpression(var: !6, expr: !DIExpression())
+!6 = distinct !DIGlobalVariable(name: "b", scope: !2, file: !7, line: 5, type: !8, isLocal: false, isDefinition: true)
+!7 = !DIFile(filename: "bit-int.c", directory: "/")
+!8 = !DIBasicType(name: "_BitInt", size: 2, encoding: DW_ATE_signed)
+!9 = !DIBasicType(name: "unsigned _BitInt", size: 2, encoding: DW_ATE_unsigned)
+!10 = !{i32 2, !"Debug Info Version", i32 3}
+!11 = !{i32 1, !"wchar_size", i32 4}
+!12 = !{!"clang version 22.0.0git"}

@llvmbot
Copy link
Member

llvmbot commented Oct 21, 2025

@llvm/pr-subscribers-debuginfo

Author: Orlando Cazalet-Hyams (OCHyams)

Changes

DW_TAG_base_type DIEs are permitted to have both byte_size and bit_size attributes "If the value of an object of the given type does not fully occupy the storage described by a byte size attribute"

Change Clang to emit the actual bit-size of _BitInt in debug metadata, and change LLVM to add DW_AT_bit_size to base_type DIEs when the condition above is true.

Fixes #61952


Full diff: https://github.com/llvm/llvm-project/pull/164372.diff

5 Files Affected:

  • (modified) clang/lib/CodeGen/CGDebugInfo.cpp (+1-3)
  • (added) clang/test/DebugInfo/Generic/bit-int.c (+8)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp (+5)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp (+11-2)
  • (added) llvm/test/DebugInfo/bit-int-size.ll (+38)
diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp
index 12e2813ef2ec7..2ad187de16acc 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -1152,9 +1152,7 @@ llvm::DIType *CGDebugInfo::CreateType(const BitIntType *Ty) {
   llvm::dwarf::TypeKind Encoding = Ty->isUnsigned()
                                        ? llvm::dwarf::DW_ATE_unsigned
                                        : llvm::dwarf::DW_ATE_signed;
-
-  return DBuilder.createBasicType(Name, CGM.getContext().getTypeSize(Ty),
-                                  Encoding);
+  return DBuilder.createBasicType(Name, Ty->getNumBits(), Encoding);
 }
 
 llvm::DIType *CGDebugInfo::CreateType(const ComplexType *Ty) {
diff --git a/clang/test/DebugInfo/Generic/bit-int.c b/clang/test/DebugInfo/Generic/bit-int.c
new file mode 100644
index 0000000000000..16b3d7a2582c9
--- /dev/null
+++ b/clang/test/DebugInfo/Generic/bit-int.c
@@ -0,0 +1,8 @@
+// RUN: %clang_cc1 -x c++ %s -debug-info-kind=standalone -gno-column-info -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 -x c   %s -debug-info-kind=standalone -gno-column-info -emit-llvm -o - | FileCheck %s
+
+unsigned _BitInt(2) a;
+_BitInt(2) b;
+
+// CHECK: !DIBasicType(name: "_BitInt", size: 2, encoding: DW_ATE_signed)
+// CHECK: !DIBasicType(name: "unsigned _BitInt", size: 2, encoding: DW_ATE_unsigned)
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
index 518121e200190..6f68548aa25f4 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
@@ -1795,6 +1795,11 @@ void DwarfCompileUnit::createBaseTypeDIEs() {
     // Round up to smallest number of bytes that contains this number of bits.
     addUInt(Die, dwarf::DW_AT_byte_size, std::nullopt,
             divideCeil(Btr.BitSize, 8));
+    // If the value of an object of the given type does not fully occupy the
+    // storage described by a byte size attribute, the base type entry may also
+    // have a DW_AT_bit_size [...] attribute.
+    if (Btr.BitSize && (Btr.BitSize % 8))
+      addUInt(Die, dwarf::DW_AT_bit_size, std::nullopt, Btr.BitSize);
 
     Btr.Die = &Die;
   }
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
index e40fb768027b8..1253d9a463cbb 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
@@ -766,8 +766,17 @@ void DwarfUnit::constructTypeDIE(DIE &Buffer, const DIBasicType *BTy) {
     addUInt(Buffer, dwarf::DW_AT_encoding, dwarf::DW_FORM_data1,
             BTy->getEncoding());
 
-  uint64_t Size = BTy->getSizeInBits() >> 3;
-  addUInt(Buffer, dwarf::DW_AT_byte_size, std::nullopt, Size);
+  uint64_t SizeInBytes = divideCeil(BTy->getSizeInBits(), 8);
+  addUInt(Buffer, dwarf::DW_AT_byte_size, std::nullopt, SizeInBytes);
+  if (BTy->getTag() == dwarf::Tag::DW_TAG_base_type) {
+    // DW_TAG_base_type:
+    // If the value of an object of the given type does not fully occupy the
+    // storage described by a byte size attribute, the base type entry may also
+    // have a DW_AT_bit_size [...] attribute.
+    if (uint64_t SizeInBits = BTy->getSizeInBits();
+        SizeInBits && SizeInBits % 8)
+      addUInt(Buffer, dwarf::DW_AT_bit_size, std::nullopt, SizeInBits);
+  }
 
   if (BTy->isBigEndian())
     addUInt(Buffer, dwarf::DW_AT_endianity, std::nullopt, dwarf::DW_END_big);
diff --git a/llvm/test/DebugInfo/bit-int-size.ll b/llvm/test/DebugInfo/bit-int-size.ll
new file mode 100644
index 0000000000000..69ab756f4e288
--- /dev/null
+++ b/llvm/test/DebugInfo/bit-int-size.ll
@@ -0,0 +1,38 @@
+; RUN: %llc_dwarf %s -filetype=obj -o - | llvm-dwarfdump - | FileCheck %s
+; REQUIRES: object-emission
+
+;; Check base types with bit-sizes that don't fit fully fit within a byte
+;; multiple get both a a byte_size and bit_size attribute.
+
+; CHECK: DW_TAG_base_type
+; CHECK-NEXT: DW_AT_name      ("unsigned _BitInt")
+; CHECK-NEXT: DW_AT_encoding  (DW_ATE_unsigned)
+; CHECK-NEXT: DW_AT_byte_size (0x02)
+; CHECK-NEXT: DW_AT_bit_size  (0x09)
+
+; CHECK: DW_TAG_base_type
+; CHECK-NEXT: DW_AT_name      ("unsigned _BitInt")
+; CHECK-NEXT: DW_AT_encoding  (DW_ATE_signed)
+; CHECK-NEXT: DW_AT_byte_size (0x01)
+; CHECK-NEXT: DW_AT_bit_size  (0x02)
+
+@a = global i8 0, align 1, !dbg !0
+@b = global i8 0, align 1, !dbg !5
+
+!llvm.dbg.cu = !{!2}
+!llvm.module.flags = !{!10, !11}
+!llvm.ident = !{!12}
+
+!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
+!1 = distinct !DIGlobalVariable(name: "a", scope: !2, file: !7, line: 4, type: !9, isLocal: false, isDefinition: true)
+!2 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !3, producer: "clang version 22.0.0git", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, globals: !4, splitDebugInlining: false, nameTableKind: None)
+!3 = !DIFile(filename: "bit-int.c", directory: "/")
+!4 = !{!0, !5}
+!5 = !DIGlobalVariableExpression(var: !6, expr: !DIExpression())
+!6 = distinct !DIGlobalVariable(name: "b", scope: !2, file: !7, line: 5, type: !8, isLocal: false, isDefinition: true)
+!7 = !DIFile(filename: "bit-int.c", directory: "/")
+!8 = !DIBasicType(name: "_BitInt", size: 2, encoding: DW_ATE_signed)
+!9 = !DIBasicType(name: "unsigned _BitInt", size: 2, encoding: DW_ATE_unsigned)
+!10 = !{i32 2, !"Debug Info Version", i32 3}
+!11 = !{i32 1, !"wchar_size", i32 4}
+!12 = !{!"clang version 22.0.0git"}

@llvmbot
Copy link
Member

llvmbot commented Oct 21, 2025

@llvm/pr-subscribers-clang

Author: Orlando Cazalet-Hyams (OCHyams)

Changes

DW_TAG_base_type DIEs are permitted to have both byte_size and bit_size attributes "If the value of an object of the given type does not fully occupy the storage described by a byte size attribute"

Change Clang to emit the actual bit-size of _BitInt in debug metadata, and change LLVM to add DW_AT_bit_size to base_type DIEs when the condition above is true.

Fixes #61952


Full diff: https://github.com/llvm/llvm-project/pull/164372.diff

5 Files Affected:

  • (modified) clang/lib/CodeGen/CGDebugInfo.cpp (+1-3)
  • (added) clang/test/DebugInfo/Generic/bit-int.c (+8)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp (+5)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp (+11-2)
  • (added) llvm/test/DebugInfo/bit-int-size.ll (+38)
diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp
index 12e2813ef2ec7..2ad187de16acc 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -1152,9 +1152,7 @@ llvm::DIType *CGDebugInfo::CreateType(const BitIntType *Ty) {
   llvm::dwarf::TypeKind Encoding = Ty->isUnsigned()
                                        ? llvm::dwarf::DW_ATE_unsigned
                                        : llvm::dwarf::DW_ATE_signed;
-
-  return DBuilder.createBasicType(Name, CGM.getContext().getTypeSize(Ty),
-                                  Encoding);
+  return DBuilder.createBasicType(Name, Ty->getNumBits(), Encoding);
 }
 
 llvm::DIType *CGDebugInfo::CreateType(const ComplexType *Ty) {
diff --git a/clang/test/DebugInfo/Generic/bit-int.c b/clang/test/DebugInfo/Generic/bit-int.c
new file mode 100644
index 0000000000000..16b3d7a2582c9
--- /dev/null
+++ b/clang/test/DebugInfo/Generic/bit-int.c
@@ -0,0 +1,8 @@
+// RUN: %clang_cc1 -x c++ %s -debug-info-kind=standalone -gno-column-info -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 -x c   %s -debug-info-kind=standalone -gno-column-info -emit-llvm -o - | FileCheck %s
+
+unsigned _BitInt(2) a;
+_BitInt(2) b;
+
+// CHECK: !DIBasicType(name: "_BitInt", size: 2, encoding: DW_ATE_signed)
+// CHECK: !DIBasicType(name: "unsigned _BitInt", size: 2, encoding: DW_ATE_unsigned)
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
index 518121e200190..6f68548aa25f4 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
@@ -1795,6 +1795,11 @@ void DwarfCompileUnit::createBaseTypeDIEs() {
     // Round up to smallest number of bytes that contains this number of bits.
     addUInt(Die, dwarf::DW_AT_byte_size, std::nullopt,
             divideCeil(Btr.BitSize, 8));
+    // If the value of an object of the given type does not fully occupy the
+    // storage described by a byte size attribute, the base type entry may also
+    // have a DW_AT_bit_size [...] attribute.
+    if (Btr.BitSize && (Btr.BitSize % 8))
+      addUInt(Die, dwarf::DW_AT_bit_size, std::nullopt, Btr.BitSize);
 
     Btr.Die = &Die;
   }
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
index e40fb768027b8..1253d9a463cbb 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
@@ -766,8 +766,17 @@ void DwarfUnit::constructTypeDIE(DIE &Buffer, const DIBasicType *BTy) {
     addUInt(Buffer, dwarf::DW_AT_encoding, dwarf::DW_FORM_data1,
             BTy->getEncoding());
 
-  uint64_t Size = BTy->getSizeInBits() >> 3;
-  addUInt(Buffer, dwarf::DW_AT_byte_size, std::nullopt, Size);
+  uint64_t SizeInBytes = divideCeil(BTy->getSizeInBits(), 8);
+  addUInt(Buffer, dwarf::DW_AT_byte_size, std::nullopt, SizeInBytes);
+  if (BTy->getTag() == dwarf::Tag::DW_TAG_base_type) {
+    // DW_TAG_base_type:
+    // If the value of an object of the given type does not fully occupy the
+    // storage described by a byte size attribute, the base type entry may also
+    // have a DW_AT_bit_size [...] attribute.
+    if (uint64_t SizeInBits = BTy->getSizeInBits();
+        SizeInBits && SizeInBits % 8)
+      addUInt(Buffer, dwarf::DW_AT_bit_size, std::nullopt, SizeInBits);
+  }
 
   if (BTy->isBigEndian())
     addUInt(Buffer, dwarf::DW_AT_endianity, std::nullopt, dwarf::DW_END_big);
diff --git a/llvm/test/DebugInfo/bit-int-size.ll b/llvm/test/DebugInfo/bit-int-size.ll
new file mode 100644
index 0000000000000..69ab756f4e288
--- /dev/null
+++ b/llvm/test/DebugInfo/bit-int-size.ll
@@ -0,0 +1,38 @@
+; RUN: %llc_dwarf %s -filetype=obj -o - | llvm-dwarfdump - | FileCheck %s
+; REQUIRES: object-emission
+
+;; Check base types with bit-sizes that don't fit fully fit within a byte
+;; multiple get both a a byte_size and bit_size attribute.
+
+; CHECK: DW_TAG_base_type
+; CHECK-NEXT: DW_AT_name      ("unsigned _BitInt")
+; CHECK-NEXT: DW_AT_encoding  (DW_ATE_unsigned)
+; CHECK-NEXT: DW_AT_byte_size (0x02)
+; CHECK-NEXT: DW_AT_bit_size  (0x09)
+
+; CHECK: DW_TAG_base_type
+; CHECK-NEXT: DW_AT_name      ("unsigned _BitInt")
+; CHECK-NEXT: DW_AT_encoding  (DW_ATE_signed)
+; CHECK-NEXT: DW_AT_byte_size (0x01)
+; CHECK-NEXT: DW_AT_bit_size  (0x02)
+
+@a = global i8 0, align 1, !dbg !0
+@b = global i8 0, align 1, !dbg !5
+
+!llvm.dbg.cu = !{!2}
+!llvm.module.flags = !{!10, !11}
+!llvm.ident = !{!12}
+
+!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
+!1 = distinct !DIGlobalVariable(name: "a", scope: !2, file: !7, line: 4, type: !9, isLocal: false, isDefinition: true)
+!2 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !3, producer: "clang version 22.0.0git", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, globals: !4, splitDebugInlining: false, nameTableKind: None)
+!3 = !DIFile(filename: "bit-int.c", directory: "/")
+!4 = !{!0, !5}
+!5 = !DIGlobalVariableExpression(var: !6, expr: !DIExpression())
+!6 = distinct !DIGlobalVariable(name: "b", scope: !2, file: !7, line: 5, type: !8, isLocal: false, isDefinition: true)
+!7 = !DIFile(filename: "bit-int.c", directory: "/")
+!8 = !DIBasicType(name: "_BitInt", size: 2, encoding: DW_ATE_signed)
+!9 = !DIBasicType(name: "unsigned _BitInt", size: 2, encoding: DW_ATE_unsigned)
+!10 = !{i32 2, !"Debug Info Version", i32 3}
+!11 = !{i32 1, !"wchar_size", i32 4}
+!12 = !{!"clang version 22.0.0git"}

@tbaederr tbaederr changed the title [DebugInfo] Emit DW_AT_bit_size for _BitSize types and others [DebugInfo] Emit DW_AT_bit_size for _BitInt types and others Oct 21, 2025
@Michael137
Copy link
Member

There was a similar PR some time ago which never landed: #69741

Will need to remind myself of the discussion there but IIRC I think that PR was trying to do it for structure types, and the motivating example wasn't compelling enough. Whereas this is for base types, and aligns with the DWARF spec, so it does seem reasonable (and required for https://dwarfstd.org/issues/230529.1.html)

@@ -766,8 +766,17 @@ void DwarfUnit::constructTypeDIE(DIE &Buffer, const DIBasicType *BTy) {
addUInt(Buffer, dwarf::DW_AT_encoding, dwarf::DW_FORM_data1,
BTy->getEncoding());

uint64_t Size = BTy->getSizeInBits() >> 3;
addUInt(Buffer, dwarf::DW_AT_byte_size, std::nullopt, Size);
uint64_t SizeInBytes = divideCeil(BTy->getSizeInBits(), 8);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know GCC emits both bit-size and byte-size, but is the byte-size here really useful if we have the bit-size?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm personally in favour of doing whatever is least surprising. I need to check with our debugger folks if they'd be ok with only keeping one, but if that's not the case we could stick it behind SCE tuning, if you feel strongly here about only emitting one? (@dwblaikie might also have opinions?)

FWIW LLDB doesn't seem to understand bit_size in this context (see this comment), though I haven't tried the same example emitting only bit_size (it could reasonably be that seeing both is confusing it? equally it might just not expect bit_size in this context).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this thing has a byte storage size, even if the arithmetic is done exactly to the number of bits specified? So I'd imagine we could want both - though it'd be a bit out-of-DWARF-spec which says a base type should have only one or the other. I guess really the thing has a byte_size, and a "number of bits used for arithmetic" - it's not really the size of the object, it's to do with the encoding/how the bits are used.

But I don't especially mind if we use the bit_size to encode "number of bits used for arithmetic", especially if GCC's already doing it.

Maybe file a DWARF issue as well.

Copy link
Contributor Author

@OCHyams OCHyams Oct 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

though it'd be a bit out-of-DWARF-spec which says a base type should have only one or the other.

My optimisitic interpretation of the DWARF spec is that it especially is it allows it for base types (chpt 5, page end of 103 to top of 104):

A base type entry has a DW_AT_byte_size attribute or a DW_AT_bit_size attribute. [...] If the value of an object of the given type does not fully occupy the storage described by a byte size attribute, the base type entry may also have a DW_AT_bit_size and a DW_AT_data_bit_offset attribute

It says both "or" and "and" which is a bit ambiguous, but other types only say "or".

Copy link
Contributor Author

@OCHyams OCHyams Oct 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry, I misunderstood what you were saying about the byte storage size. I get it now, I'll look into it tomorrow

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think the quoted parts you mentioned do refute/address my concerns - though perhaps the spec expects us to also emit a DW_AT_data_bit_offset of zero in this case? Again, I don't mind matching GCC if it's just a bit weird - though it wouldn't be hard or expensive to add DW_AT_data_bit_offset as well (especially if it's just always 0 and can be stored in the abbrev with implicit_const - but I guess that depends where the padding is and where the bits are stored (if they're LSB or MSB, etc))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the missing data_bit_offset in this case is ok too - the bit I quoted continues on a few sentences later:

The data bit offset attribute is the offset in bits from the beginning of the containing storage to the beginning of the value. Bits that are part of the offset are padding. If this attribute is omitted a default data bit offset of zero is assumed.

GCC gives us this DWARF:

0x000000a1:   DW_TAG_base_type
                DW_AT_byte_size (0x02)
                DW_AT_encoding  (DW_ATE_signed)
                DW_AT_bit_size  (0x0f)
                DW_AT_name      ("_BitInt(15)")

The only difference between that and my patched Clang is that our name is _BitInt and GCC's is _BitInt(15). (I'm happy to change that? not sure what's best - I suppose there's value to users if the bit-count is showing up in types in the debugger)

Copy link
Member

@Michael137 Michael137 Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only difference between that and my patched Clang is that our name is _BitInt and GCC's is _BitInt(15)

At least for LLDB, I'd have to double check how the type-name ends up being displayed. If it just uses the DW_AT_name then that would be great! But if we eventually can just create a clang::BitIntType from this DIE and the bit-ness, I think Clang's TypePrinter is capable of printing the typename as we would expect:

void TypePrinter::printBitIntBefore(const BitIntType *T, raw_ostream &OS) {
if (T->isUnsigned())
OS << "unsigned ";
OS << "_BitInt(" << T->getNumBits() << ")";
spaceBeforePlaceHolder(OS);
}

In fact including the bit-ness in the name might make this a bit harder for LLDB. But probably still doable.

Not opposed to aligning with GCC either though. And might be useful for other consumers 🤷‍♂️ But maybe best for a separate patch?

There was an LLDB issue around _BitInt a while ago where we talked about what would need to be done to support the type properly: #110273 (comment)

@OCHyams
Copy link
Contributor Author

OCHyams commented Oct 21, 2025

Thanks for taking a look

@OCHyams
Copy link
Contributor Author

OCHyams commented Oct 22, 2025

Breaking out of the inline comments - I've realised this patch doesn't always give us the expected byte_size. E.g. For _BitInt(23) GCC gives us:

0x000000a1:   DW_TAG_base_type
                DW_AT_byte_size (0x04)
                DW_AT_encoding  (DW_ATE_signed)
                DW_AT_bit_size  (0x17)
                DW_AT_name      ("_BitInt(23)")

Whereas this patch gives us:

0x00000042:   DW_TAG_base_type
                DW_AT_name      ("_BitInt")
                DW_AT_encoding  (DW_ATE_signed)
                DW_AT_byte_size (0x03)
                DW_AT_bit_size  (0x17)

Notice Clang says it's 3 bytes (because it's just finding the next byte size that 0x17 bits fits into) whereas GCC says it's 4 bytes (which reflects the sizeof size).

It's probably cleanest to pass both byte_size and bit_size down from the front end. I'll have a go at that later/tomorrow if others agree.

@OCHyams
Copy link
Contributor Author

OCHyams commented Oct 23, 2025

Done - this is now very close to @dstenb 's https://reviews.llvm.org/D147625

  • Add 32 bit field DataSizeInBits / data_size to DIBasicType.
  • Add roundtrip test.
  • Undo changes base-type-size.ll as DW_ATE_unsigned_1 doesn't use data_size.

I'll open a separate pull request to include the bit-count in the name, so that can be discussed separately (it's not a regression to leave it as it is now).

@dstenb
Copy link
Collaborator

dstenb commented Oct 23, 2025

Thanks for picking this up, @OCHyams, and sorry for some delay in reviewing this!

I see that you discussed omitting DW_AT_data_bit_offset of value 0. I think that we need to amend the patch so that it emits that attribute for non byte size-multiple types on big-endian targets. I think I may have only experimented with x86 when I made that initial patch (I see that there is a commented out line adding that attribute in my patch that I don't remember the history for).

I have confirmed this by trying out this patch on our downstream target with GDB, and I think it can be seen e.g. with the following MIPS example:

struct { uint16_t a : 1; uint16_t b : 15; } bf = { .b = 0x3fffU };
unsigned _BitInt(15) bi = 0x3fffU;

Memory-wise the two globals are the same, with bf.a "emulating" the padding bit in the stored bi value. The bf.b field gets a DW_AT_data_bit_offset of 1, so I think we need to do the same for bi.

0x0000003c:     DW_TAG_member
                  DW_AT_name    ("b")
                  DW_AT_type    (0x00000047 "uint16_t")
                  DW_AT_decl_file       ("bif.c")
                  DW_AT_decl_line       (3)
                  DW_AT_bit_size        (0x0f)
                  DW_AT_data_bit_offset (0x01)

@OCHyams
Copy link
Contributor Author

OCHyams commented Oct 23, 2025

Hi @dstenb thanks for helping out with this and taking a look.

Regarding endianness, I wonder if that's right. My impression is DWARF tries to leave some things up to the implementation based on the target. I'm not 100% sure on this, so if you are confident on this please do say so. But looking at a side-by-side BE and LE example of a bitfield https://godbolt.org/z/GEWcG1GYG GCC emits the same DW_AT_data_bit_offset for both. (I can't find a BE architecture / option for GCC that supports _BitInt, so I couldn't check what it does for those.)

That suggests to me that the position of the bit_size data within the byte_size can probably reasonably be assumed based on the target endianess? As I said, not 100% sure on this one so would definitely appreciate additional input.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category debuginfo llvm:codegen llvm:ir

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Incomplete debug information for _BitInt(N) when N differs from store size

5 participants