[utils][TableGen] Unify converting names to upper-camel case #141762

Merged
kparzysz merged 4 commits into main from users/kparzysz/spr/t05-upper-camel
Jun 5, 2025

kparzysz
Contributor

There were three different functions in DirectiveEmitter.cpp doing essentially the same thing: taking a name whose words are separated by _ or whitespace and converting it to upper camel case. Extract that logic into a single function that accepts the set of separator characters as a parameter.
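
As a rough illustration of the idea described above, here is a minimal standalone sketch that uses only the C++ standard library. The name `upperCamel` and the use of `std::string_view` are assumptions made for this example; the actual patch adds `BaseRecord::getUpperCamelName`, which takes `llvm::StringRef` arguments and asserts on leading or repeated separators, whereas this sketch simply skips them.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical standalone approximation of the unified helper (not the
// LLVM code itself). Every character found in Sep is treated as a word
// separator: it is dropped, and the next kept character is upper-cased.
std::string upperCamel(std::string_view Name, std::string_view Sep) {
  std::string Out;
  Out.reserve(Name.size());
  bool Cap = true; // The first kept character is always capitalized.
  for (char Ch : Name) {
    if (Sep.find(Ch) != std::string_view::npos) {
      Cap = true; // Drop the separator, capitalize whatever follows.
      continue;
    }
    unsigned char C = static_cast<unsigned char>(Ch);
    Out.push_back(Cap ? static_cast<char>(std::toupper(C)) : Ch);
    Cap = false;
  }
  return Out;
}
```

With this sketch, `upperCamel("num_threads", "_")` produces `NumThreads`, which matches the `num_threads -> NumThreads` example in the comment on `getFormattedParserClassName`. Passing `" _"` instead makes spaces count as separators as well, which is what `getClangAccSpelling` for directives needs.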

kparzysz added 2 commits May 28, 2025 08:23
The class "ClauseVal" actually represents a definition of an enumeration
value, and in itself it is not bound to any clause. Rename it to EnumVal
and add a comment clarifying how it's translated into an actual enum
definition in the generated source code.

There is no change in functionality.
There were three different functions in DirectiveEmitter.cpp doing essentially
the same thing: taking a name whose words are separated by _ or whitespace and
converting it to upper camel case. Extract that logic into a single function
that accepts the set of separator characters as a parameter.
@kparzysz kparzysz requested review from erichkeane and jurahul May 28, 2025 13:29
@llvmbot
Member

llvmbot commented May 28, 2025

@llvm/pr-subscribers-tablegen

Author: Krzysztof Parzyszek (kparzysz)

Changes

There were three different functions in DirectiveEmitter.cpp doing essentially the same thing: taking a name whose words are separated by _ or whitespace and converting it to upper camel case. Extract that logic into a single function that accepts the set of separator characters as a parameter.


Full diff: https://github.com/llvm/llvm-project/pull/141762.diff

2 Files Affected:

  • (modified) llvm/include/llvm/TableGen/DirectiveEmitter.h (+32-44)
  • (modified) llvm/utils/TableGen/Basic/DirectiveEmitter.cpp (+1-1)
diff --git a/llvm/include/llvm/TableGen/DirectiveEmitter.h b/llvm/include/llvm/TableGen/DirectiveEmitter.h
index 8615442ebff9f..48e18de0904c0 100644
--- a/llvm/include/llvm/TableGen/DirectiveEmitter.h
+++ b/llvm/include/llvm/TableGen/DirectiveEmitter.h
@@ -113,14 +113,39 @@ class BaseRecord {
 
   // Returns the name of the directive formatted for output. Whitespace are
   // replaced with underscores.
-  static std::string formatName(StringRef Name) {
+  static std::string getSnakeName(StringRef Name) {
     std::string N = Name.str();
     llvm::replace(N, ' ', '_');
     return N;
   }
 
+  static std::string getUpperCamelName(StringRef Name, StringRef Sep) {
+    std::string Camel = Name.str();
+    // Convert to uppercase
+    bool Cap = true;
+    llvm::transform(Camel, Camel.begin(), [&](unsigned char C) {
+      if (Sep.contains(C)) {
+        assert(!Cap && "No initial or repeated separators");
+        Cap = true;
+      } else if (Cap) {
+        C = llvm::toUpper(C);
+        Cap = false;
+      }
+      return C;
+    });
+    size_t Out = 0;
+    // Remove separators
+    for (size_t In = 0, End = Camel.size(); In != End; ++In) {
+      unsigned char C = Camel[In];
+      if (!Sep.contains(C))
+        Camel[Out++] = C;
+    }
+    Camel.resize(Out);
+    return Camel;
+  }
+
   std::string getFormattedName() const {
-    return formatName(Def->getValueAsString("name"));
+    return getSnakeName(Def->getValueAsString("name"));
   }
 
   bool isDefault() const { return Def->getValueAsBit("isDefault"); }
@@ -172,26 +197,13 @@ class Directive : public BaseRecord {
 
   // Clang uses a different format for names of its directives enum.
   std::string getClangAccSpelling() const {
-    std::string Name = Def->getValueAsString("name").str();
+    StringRef Name = Def->getValueAsString("name");
 
     // Clang calls the 'unknown' value 'invalid'.
     if (Name == "unknown")
       return "Invalid";
 
-    // Clang entries all start with a capital letter, so apply that.
-    Name[0] = std::toupper(Name[0]);
-    // Additionally, spaces/underscores are handled by capitalizing the next
-    // letter of the name and removing the space/underscore.
-    for (unsigned I = 0; I < Name.size(); ++I) {
-      if (Name[I] == ' ' || Name[I] == '_') {
-        Name.erase(I, 1);
-        assert(Name[I] != ' ' && Name[I] != '_' &&
-               "No double spaces/underscores");
-        Name[I] = std::toupper(Name[I]);
-      }
-    }
-
-    return Name;
+    return BaseRecord::getUpperCamelName(Name, " _");
   }
 };
 
@@ -218,19 +230,7 @@ class Clause : public BaseRecord {
   //     num_threads -> NumThreads
   std::string getFormattedParserClassName() const {
     StringRef Name = Def->getValueAsString("name");
-    std::string N = Name.str();
-    bool Cap = true;
-    llvm::transform(N, N.begin(), [&Cap](unsigned char C) {
-      if (Cap == true) {
-        C = toUpper(C);
-        Cap = false;
-      } else if (C == '_') {
-        Cap = true;
-      }
-      return C;
-    });
-    erase(N, '_');
-    return N;
+    return BaseRecord::getUpperCamelName(Name, "_");
   }
 
   // Clang uses a different format for names of its clause enum, which can be
@@ -241,20 +241,8 @@ class Clause : public BaseRecord {
         !ClangSpelling.empty())
       return ClangSpelling.str();
 
-    std::string Name = Def->getValueAsString("name").str();
-    // Clang entries all start with a capital letter, so apply that.
-    Name[0] = std::toupper(Name[0]);
-    // Additionally, underscores are handled by capitalizing the next letter of
-    // the name and removing the underscore.
-    for (unsigned I = 0; I < Name.size(); ++I) {
-      if (Name[I] == '_') {
-        Name.erase(I, 1);
-        assert(Name[I] != '_' && "No double underscores");
-        Name[I] = std::toupper(Name[I]);
-      }
-    }
-
-    return Name;
+    StringRef Name = Def->getValueAsString("name");
+    return BaseRecord::getUpperCamelName(Name, "_");
   }
 
   // Optional field.
diff --git a/llvm/utils/TableGen/Basic/DirectiveEmitter.cpp b/llvm/utils/TableGen/Basic/DirectiveEmitter.cpp
index f459e7c98ebc1..9e79a83ed6e18 100644
--- a/llvm/utils/TableGen/Basic/DirectiveEmitter.cpp
+++ b/llvm/utils/TableGen/Basic/DirectiveEmitter.cpp
@@ -839,7 +839,7 @@ static void generateGetDirectiveLanguages(const DirectiveLanguage &DirLang,
         D.getSourceLanguages(), OS,
         [&](const Record *L) {
           StringRef N = L->getValueAsString("name");
-          OS << "SourceLanguage::" << BaseRecord::formatName(N);
+          OS << "SourceLanguage::" << BaseRecord::getSnakeName(N);
         },
         " | ");
     OS << ";\n";
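
One way to convince oneself that the refactoring preserves behavior for well-formed names is to compare the removed erase-in-place loop with the new two-pass approach side by side. The sketch below is an approximation for that purpose only: `oldStyle` and `newStyle` are hypothetical names, both use `std::string` rather than `llvm::StringRef`, and the bounds checks in `oldStyle` are additions that the removed code did not have.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

static char up(char Ch) {
  return static_cast<char>(std::toupper(static_cast<unsigned char>(Ch)));
}

// Adapted from the removed body of Directive::getClangAccSpelling:
// erase each separator and capitalize the character that moves into
// its place. The empty/end-of-string guards are assumptions added here.
std::string oldStyle(std::string Name) {
  if (Name.empty())
    return Name;
  Name[0] = up(Name[0]);
  for (std::size_t I = 0; I < Name.size(); ++I) {
    if (Name[I] == ' ' || Name[I] == '_') {
      Name.erase(I, 1);
      if (I < Name.size())
        Name[I] = up(Name[I]);
    }
  }
  return Name;
}

// Mirrors the structure of the added getUpperCamelName: first
// capitalize the character after each separator, then strip separators.
std::string newStyle(std::string Camel, const std::string &Sep) {
  bool Cap = true;
  for (char &Ch : Camel) {
    if (Sep.find(Ch) != std::string::npos) {
      Cap = true;
    } else if (Cap) {
      Ch = up(Ch);
      Cap = false;
    }
  }
  Camel.erase(std::remove_if(Camel.begin(), Camel.end(),
                             [&](char Ch) {
                               return Sep.find(Ch) != std::string::npos;
                             }),
              Camel.end());
  return Camel;
}
```

For names without leading, trailing, or repeated separators, both functions should return the same string when `newStyle` is called with `" _"`; for instance, each maps `num_threads` to `NumThreads`. Inputs outside that well-formed set are exactly the cases the assertions in the patch are meant to reject.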

Contributor

@mrkajetanp mrkajetanp left a comment

LGTM

Contributor

@jurahul jurahul left a comment

LGTM with a small comment

Base automatically changed from users/kparzysz/spr/t04-enumval to main June 4, 2025 13:16
@kparzysz kparzysz merged commit 2b3e07f into main Jun 5, 2025
11 checks passed
@kparzysz kparzysz deleted the users/kparzysz/spr/t05-upper-camel branch June 5, 2025 12:34