Michael137

Instead of hardcoding these in several places, let's make these part of the LLVMConstants enum.

There's no official C++03 language version code so I left it out (though we have some support for it when converting between DW_LNAME_ and DW_LANG_C_plus_plus_03).


llvmbot commented Oct 13, 2025

@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-llvm-binary-utilities

Author: Michael Buch (Michael137)


Full diff: https://github.com/llvm/llvm-project/pull/163199.diff

3 Files Affected:

  • (modified) llvm/include/llvm/BinaryFormat/Dwarf.h (+33-17)
  • (modified) llvm/lib/BinaryFormat/Dwarf.cpp (+9-9)
  • (modified) llvm/unittests/BinaryFormat/DwarfTest.cpp (+9-9)
diff --git a/llvm/include/llvm/BinaryFormat/Dwarf.h b/llvm/include/llvm/BinaryFormat/Dwarf.h
index 5039a3fe7ecc7..7a0fda1d041f5 100644
--- a/llvm/include/llvm/BinaryFormat/Dwarf.h
+++ b/llvm/include/llvm/BinaryFormat/Dwarf.h
@@ -66,6 +66,22 @@ enum LLVMConstants : uint32_t {
   DW_ARANGES_VERSION = 2,  ///< Section version number for .debug_aranges.
   /// \}
 
+  /// DWARFv6 DW_AT_language_version constants:
+  /// https://dwarfstd.org/languages-v6.html
+  /// \{
+  DW_LANG_VERSION_C89 = 198912,
+  DW_LANG_VERSION_C99 = 199901,
+  DW_LANG_VERSION_C11 = 201112,
+  DW_LANG_VERSION_C17 = 201710,
+  DW_LANG_VERSION_C23 = 202311,
+  DW_LANG_VERSION_C_plus_plus_98 = 199711,
+  DW_LANG_VERSION_C_plus_plus_11 = 201103,
+  DW_LANG_VERSION_C_plus_plus_14 = 201402,
+  DW_LANG_VERSION_C_plus_plus_17 = 201703,
+  DW_LANG_VERSION_C_plus_plus_20 = 202002,
+  DW_LANG_VERSION_C_plus_plus_23 = 202302,
+  /// \}
+
   /// Identifiers we use to distinguish vendor extensions.
   /// \{
   DWARF_VENDOR_DWARF = 0, ///< Defined in v2 or later of the DWARF standard.
@@ -246,29 +262,29 @@ inline std::optional<SourceLanguage> toDW_LANG(SourceLanguageName name,
   case DW_LNAME_C: // YYYYMM, K&R 000000
     if (version == 0)
       return DW_LANG_C;
-    if (version <= 198912)
+    if (version <= DW_LANG_VERSION_C89)
       return DW_LANG_C89;
-    if (version <= 199901)
+    if (version <= DW_LANG_VERSION_C99)
       return DW_LANG_C99;
-    if (version <= 201112)
+    if (version <= DW_LANG_VERSION_C11)
       return DW_LANG_C11;
-    if (version <= 201710)
+    if (version <= DW_LANG_VERSION_C17)
       return DW_LANG_C17;
     return {};
   case DW_LNAME_C_plus_plus: // YYYYMM
     if (version == 0)
       return DW_LANG_C_plus_plus;
-    if (version <= 199711)
+    if (version <= DW_LANG_VERSION_C_plus_plus_98)
       return DW_LANG_C_plus_plus;
     if (version <= 200310)
       return DW_LANG_C_plus_plus_03;
-    if (version <= 201103)
+    if (version <= DW_LANG_VERSION_C_plus_plus_11)
       return DW_LANG_C_plus_plus_11;
-    if (version <= 201402)
+    if (version <= DW_LANG_VERSION_C_plus_plus_14)
       return DW_LANG_C_plus_plus_14;
-    if (version <= 201703)
+    if (version <= DW_LANG_VERSION_C_plus_plus_17)
       return DW_LANG_C_plus_plus_17;
-    if (version <= 202002)
+    if (version <= DW_LANG_VERSION_C_plus_plus_20)
       return DW_LANG_C_plus_plus_20;
     return {};
   case DW_LNAME_Cobol: // YYYY
@@ -384,25 +400,25 @@ toDW_LNAME(SourceLanguage language) {
   case DW_LANG_C:
     return {{DW_LNAME_C, 0}};
   case DW_LANG_C89:
-    return {{DW_LNAME_C, 198912}};
+    return {{DW_LNAME_C, DW_LANG_VERSION_C89}};
   case DW_LANG_C99:
-    return {{DW_LNAME_C, 199901}};
+    return {{DW_LNAME_C, DW_LANG_VERSION_C99}};
   case DW_LANG_C11:
-    return {{DW_LNAME_C, 201112}};
+    return {{DW_LNAME_C, DW_LANG_VERSION_C11}};
   case DW_LANG_C17:
-    return {{DW_LNAME_C, 201710}};
+    return {{DW_LNAME_C, DW_LANG_VERSION_C17}};
   case DW_LANG_C_plus_plus:
     return {{DW_LNAME_C_plus_plus, 0}};
   case DW_LANG_C_plus_plus_03:
     return {{DW_LNAME_C_plus_plus, 200310}};
   case DW_LANG_C_plus_plus_11:
-    return {{DW_LNAME_C_plus_plus, 201103}};
+    return {{DW_LNAME_C_plus_plus, DW_LANG_VERSION_C_plus_plus_11}};
   case DW_LANG_C_plus_plus_14:
-    return {{DW_LNAME_C_plus_plus, 201402}};
+    return {{DW_LNAME_C_plus_plus, DW_LANG_VERSION_C_plus_plus_14}};
   case DW_LANG_C_plus_plus_17:
-    return {{DW_LNAME_C_plus_plus, 201703}};
+    return {{DW_LNAME_C_plus_plus, DW_LANG_VERSION_C_plus_plus_17}};
   case DW_LANG_C_plus_plus_20:
-    return {{DW_LNAME_C_plus_plus, 202002}};
+    return {{DW_LNAME_C_plus_plus, DW_LANG_VERSION_C_plus_plus_20}};
   case DW_LANG_Cobol74:
     return {{DW_LNAME_Cobol, 1974}};
   case DW_LANG_Cobol85:
diff --git a/llvm/lib/BinaryFormat/Dwarf.cpp b/llvm/lib/BinaryFormat/Dwarf.cpp
index 55fa2df632bfa..9c0dabd0e746f 100644
--- a/llvm/lib/BinaryFormat/Dwarf.cpp
+++ b/llvm/lib/BinaryFormat/Dwarf.cpp
@@ -513,30 +513,30 @@ StringRef llvm::dwarf::LanguageDescription(dwarf::SourceLanguageName Name,
   case DW_LNAME_C: {
     if (Version == 0)
       break;
-    if (Version <= 198912)
+    if (Version <= DW_LANG_VERSION_C89)
       return "C89";
-    if (Version <= 199901)
+    if (Version <= DW_LANG_VERSION_C99)
       return "C99";
-    if (Version <= 201112)
+    if (Version <= DW_LANG_VERSION_C11)
       return "C11";
-    if (Version <= 201710)
+    if (Version <= DW_LANG_VERSION_C17)
       return "C17";
   } break;
 
   case DW_LNAME_C_plus_plus: {
     if (Version == 0)
       break;
-    if (Version <= 199711)
+    if (Version <= DW_LANG_VERSION_C_plus_plus_98)
       return "C++98";
     if (Version <= 200310)
       return "C++03";
-    if (Version <= 201103)
+    if (Version <= DW_LANG_VERSION_C_plus_plus_11)
       return "C++11";
-    if (Version <= 201402)
+    if (Version <= DW_LANG_VERSION_C_plus_plus_14)
       return "C++14";
-    if (Version <= 201703)
+    if (Version <= DW_LANG_VERSION_C_plus_plus_17)
       return "C++17";
-    if (Version <= 202002)
+    if (Version <= DW_LANG_VERSION_C_plus_plus_20)
       return "C++20";
   } break;
 
diff --git a/llvm/unittests/BinaryFormat/DwarfTest.cpp b/llvm/unittests/BinaryFormat/DwarfTest.cpp
index ba7d59182ea53..a219a1ab51376 100644
--- a/llvm/unittests/BinaryFormat/DwarfTest.cpp
+++ b/llvm/unittests/BinaryFormat/DwarfTest.cpp
@@ -296,32 +296,32 @@ LanguageDescriptionTestCase LanguageDescriptionTestCases[] = {
     {DW_LNAME_Fortran, 2019, "ISO Fortran"},
     {DW_LNAME_C, 0, "C (K&R and ISO)"},
     {DW_LNAME_C, 198911, "C89"},
-    {DW_LNAME_C, 198912, "C89"},
-    {DW_LNAME_C, 199901, "C99"},
+    {DW_LNAME_C, DW_LANG_VERSION_C89, "C89"},
+    {DW_LNAME_C, DW_LANG_VERSION_C99, "C99"},
     {DW_LNAME_C, 199902, "C11"},
     {DW_LNAME_C, 201111, "C11"},
-    {DW_LNAME_C, 201112, "C11"},
+    {DW_LNAME_C, DW_LANG_VERSION_C11, "C11"},
     {DW_LNAME_C, 201201, "C17"},
     {DW_LNAME_C, 201709, "C17"},
-    {DW_LNAME_C, 201710, "C17"},
+    {DW_LNAME_C, DW_LANG_VERSION_C17, "C17"},
     {DW_LNAME_C, 201711, "C (K&R and ISO)"},
     {DW_LNAME_C_plus_plus, 0, "ISO C++"},
     {DW_LNAME_C_plus_plus, 199710, "C++98"},
-    {DW_LNAME_C_plus_plus, 199711, "C++98"},
+    {DW_LNAME_C_plus_plus, DW_LANG_VERSION_C_plus_plus_98, "C++98"},
     {DW_LNAME_C_plus_plus, 199712, "C++03"},
     {DW_LNAME_C_plus_plus, 200310, "C++03"},
     {DW_LNAME_C_plus_plus, 200311, "C++11"},
     {DW_LNAME_C_plus_plus, 201102, "C++11"},
-    {DW_LNAME_C_plus_plus, 201103, "C++11"},
+    {DW_LNAME_C_plus_plus, DW_LANG_VERSION_C_plus_plus_11, "C++11"},
     {DW_LNAME_C_plus_plus, 201104, "C++14"},
     {DW_LNAME_C_plus_plus, 201401, "C++14"},
-    {DW_LNAME_C_plus_plus, 201402, "C++14"},
+    {DW_LNAME_C_plus_plus, DW_LANG_VERSION_C_plus_plus_14, "C++14"},
     {DW_LNAME_C_plus_plus, 201403, "C++17"},
     {DW_LNAME_C_plus_plus, 201702, "C++17"},
-    {DW_LNAME_C_plus_plus, 201703, "C++17"},
+    {DW_LNAME_C_plus_plus, DW_LANG_VERSION_C_plus_plus_17, "C++17"},
     {DW_LNAME_C_plus_plus, 201704, "C++20"},
     {DW_LNAME_C_plus_plus, 202001, "C++20"},
-    {DW_LNAME_C_plus_plus, 202002, "C++20"},
+    {DW_LNAME_C_plus_plus, DW_LANG_VERSION_C_plus_plus_20, "C++20"},
     {DW_LNAME_C_plus_plus, 202003, "ISO C++"},
     {DW_LNAME_ObjC_plus_plus, 0, LanguageDescription(DW_LNAME_ObjC_plus_plus)},
     {DW_LNAME_ObjC_plus_plus, 1, LanguageDescription(DW_LNAME_ObjC_plus_plus)},

+  /// DWARFv6 DW_AT_language_version constants:
+  /// https://dwarfstd.org/languages-v6.html
+  /// \{
+  DW_LANG_VERSION_C89 = 198912,
Member Author

I'm not sure what the best naming convention would be here. The DW_LANG_ prefix on these constants could be confused with the DW_LANG_ language codes.

     if (version == 0)
       return DW_LANG_C;
-    if (version <= 198912)
+    if (version <= DW_LANG_VERSION_C89)
Collaborator

@adrian-prantl adrian-prantl Oct 13, 2025

What do you think about adding a table to Dwarf.def that ties these together? For example:

HANDLE_DW_LANG_VERSION(DW_LANG_C99, "C99", 199901)
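
For illustration, here is a minimal sketch of how a `HANDLE_DW_LANG_VERSION`-style table could be expanded more than once so that the language code, its description, and its version stay in one place. Everything named `SK_*`, `SKETCH_*`, or `sketch*` is hypothetical and is not part of LLVM. The rows are wrapped in a list macro here only so the example fits in a single file, whereas a real `.def` file would be included by its users. The version numbers are the ones from the diff above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for rows that might live in Dwarf.def.
// Each row ties a per-version language code to a description and a
// YYYYMM language version.
#define SKETCH_LANG_VERSIONS(HANDLE)                                           \
  HANDLE(SK_LANG_C89, "C89", 198912)                                           \
  HANDLE(SK_LANG_C99, "C99", 199901)                                           \
  HANDLE(SK_LANG_C11, "C11", 201112)                                           \
  HANDLE(SK_LANG_C17, "C17", 201710)

// Expansion 1: one enumerator per row (hypothetical mirror of DW_LANG_*).
enum SketchLanguage {
#define SKETCH_ENUMERATOR(ID, DESC, VERSION) ID,
  SKETCH_LANG_VERSIONS(SKETCH_ENUMERATOR)
#undef SKETCH_ENUMERATOR
};

// Expansion 2: language code -> YYYYMM version.
inline uint32_t sketchVersionOf(SketchLanguage L) {
  switch (L) {
#define SKETCH_VERSION_CASE(ID, DESC, VERSION)                                 \
  case ID:                                                                     \
    return VERSION;
    SKETCH_LANG_VERSIONS(SKETCH_VERSION_CASE)
#undef SKETCH_VERSION_CASE
  }
  return 0; // Unknown language code.
}

// Expansion 3: language code -> human-readable description.
inline const char *sketchDescriptionOf(SketchLanguage L) {
  switch (L) {
#define SKETCH_DESCRIPTION_CASE(ID, DESC, VERSION)                             \
  case ID:                                                                     \
    return DESC;
    SKETCH_LANG_VERSIONS(SKETCH_DESCRIPTION_CASE)
#undef SKETCH_DESCRIPTION_CASE
  }
  return ""; // Unknown language code.
}
```

Adding a new language version in this scheme would mean adding one row to the list, and all three expansions would pick it up.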

Member Author

Hmm, how would we be able to reuse it in this function, though? We're enumerating DW_LNAME_ here, not DW_LANG_, and DW_LNAME_ doesn't encode versions.

Collaborator

Right, we would either need one macro per language or a real lookup table. And whether the table lives in code or data doesn't make a huge difference.
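
As a rough sketch of the "real lookup table" option, the rows below are keyed by language name and carry an inclusive upper bound on the version, which mirrors the `version <= ...` chains in the diff. The names `SketchName`, `VersionEntry`, `VersionTable`, and `describeVersion` are hypothetical and do not exist in LLVM. Returning `nullptr` for unknown versions is a simplification, where the real `LanguageDescription` falls back to a generic description.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for DW_LNAME_C and DW_LNAME_C_plus_plus.
enum SketchName { SK_LNAME_C, SK_LNAME_C_plus_plus };

struct VersionEntry {
  SketchName Name;
  uint32_t MaxVersion;     // Inclusive upper bound, encoded as YYYYMM.
  const char *Description; // Label for versions up to MaxVersion.
};

// Rows for each name must be sorted by ascending MaxVersion, because the
// lookup returns the first row whose bound covers the requested version.
static const VersionEntry VersionTable[] = {
    {SK_LNAME_C, 198912, "C89"},
    {SK_LNAME_C, 199901, "C99"},
    {SK_LNAME_C, 201112, "C11"},
    {SK_LNAME_C, 201710, "C17"},
    {SK_LNAME_C_plus_plus, 199711, "C++98"},
    {SK_LNAME_C_plus_plus, 200310, "C++03"},
    {SK_LNAME_C_plus_plus, 201103, "C++11"},
    {SK_LNAME_C_plus_plus, 201402, "C++14"},
    {SK_LNAME_C_plus_plus, 201703, "C++17"},
    {SK_LNAME_C_plus_plus, 202002, "C++20"},
};

// Returns the description for (Name, Version), or nullptr when the version
// is 0 (unversioned) or newer than every row known for that name.
inline const char *describeVersion(SketchName Name, uint32_t Version) {
  if (Version == 0)
    return nullptr;
  for (const VersionEntry &Entry : VersionTable)
    if (Entry.Name == Name && Version <= Entry.MaxVersion)
      return Entry.Description;
  return nullptr;
}
```

With this layout the same rows could also serve the reverse mapping, since each row already pairs a name with the exact version constant for that standard.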

@Michael137
Member Author

Closing in favour of #163348

@Michael137 Michael137 closed this Oct 17, 2025