Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lldb] Support custom LLVM formatting for variables #81196

Conversation

kastiglione
Copy link
Contributor

@kastiglione kastiglione commented Feb 8, 2024

Adds support for applying LLVM formatting to variables.

The reason for this is to support cases such as the following.

Let's say you have two separate bytes that you want to print as a combined hex value. Consider the following summary string:

${var.byte1%x}${var.byte2%x}

The output of this will be: 0x120x34. That is, a 0x prefix is unconditionally applied to each byte. This is unlike printf formatting where you must include the 0x yourself.

Currently, there's no way to do this with summary strings, instead you'll need a summary provider in python or c++.

This change introduces formatting support using LLVM's formatter system. This allows users to achieve the desired custom formatting using:

${var.byte1:x-}${var.byte2:x-}

Here, each variable is suffixed with :x-. This is passed to the LLVM formatter as {0:x-}. For integer values, x declares the output as hex, and - declares that no 0x prefix is to be used. Further, one could write:

${var.byte1:x-2}${var.byte2:x-2}

Where the added 2 results in these bytes being written with a minimum of 2 digits.

An alternative considered was to add a new format specifier that would print hex values without the 0x prefix. The reason that approach was not taken is because in addition to forcing a 0x prefix, hex values are also forced to use leading zeros. This approach lets the user have full control over formatting.

@kastiglione kastiglione marked this pull request as draft February 8, 2024 22:02
@llvmbot llvmbot added the lldb label Feb 8, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Feb 8, 2024

@llvm/pr-subscribers-lldb

Author: Dave Lee (kastiglione)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/81196.diff

1 Files Affected:

  • (modified) lldb/source/Core/FormatEntity.cpp (+23-2)
diff --git a/lldb/source/Core/FormatEntity.cpp b/lldb/source/Core/FormatEntity.cpp
index fa5eadc6ff4e9a..0e929203935304 100644
--- a/lldb/source/Core/FormatEntity.cpp
+++ b/lldb/source/Core/FormatEntity.cpp
@@ -883,8 +883,29 @@ static bool DumpValue(Stream &s, const SymbolContext *sc,
   }
 
   if (!is_array_range) {
-    LLDB_LOGF(log,
-              "[Debugger::FormatPrompt] dumping ordinary printable output");
+    if (!entry.printf_format.empty()) {
+      auto type_info = target->GetTypeInfo();
+      if (type_info & eTypeIsInteger) {
+        if (type_info & eTypeIsSigned) {
+          bool success = false;
+          auto integer = target->GetValueAsSigned(0, &success);
+          if (success) {
+            LLDB_LOGF(log, "dumping using printf format");
+            s.Printf(entry.printf_format.c_str(), integer);
+            return true;
+          }
+        } else {
+          bool success = false;
+          auto integer = target->GetValueAsUnsigned(0, &success);
+          if (success) {
+            LLDB_LOGF(log, "dumping using printf format");
+            s.Printf(entry.printf_format.c_str(), integer);
+            return true;
+          }
+        }
+      }
+    }
+    LLDB_LOGF(log, "dumping ordinary printable output");
     return target->DumpPrintableRepresentation(s, val_obj_display,
                                                custom_format);
   } else {

@kastiglione
Copy link
Contributor Author

Note that the implementation here is a draft for illustration. I am first interested in high level agreement that allowing custom printf formatting of variables is good.

@jimingham
Copy link
Collaborator

The way the current FormatEntity strings work, the first % says "formatter coming next" then say S is "the kind of formatter" (in this case "return summary instead of value". So in your case, like:

${var%%1x}

The "kind of formatter" is "%" which seems like a reasonable name for "printf format string following". To be entirely strict, if the data following the "kind" is the format string, this should be:

${var%%%1x}

but that's kind of silly. I think it's fine to say the format kind % means printf formatting string, and then the data is everything after the % in a printf formatting string.

If you didn't like the two percents, you could do this by having p be the format kind that meant printf formatting string but I'm not sure that would gain much.

This seems fine to me.

@kastiglione kastiglione marked this pull request as ready for review February 13, 2024 21:34
@adrian-prantl
Copy link
Collaborator

Should this be documented in lldb/docs/use/variable.rst or a similar place on the website?

@kastiglione
Copy link
Contributor Author

@adrian-prantl good call, done.

@kastiglione
Copy link
Contributor Author

@jimingham I've updated the PR to a non-draft form if you want to check out the implementation.

static bool DumpValueWithPrintf(Stream &s, llvm::StringRef format,
ValueObject &target) {
auto type_info = target.GetTypeInfo();
if (type_info & eTypeIsInteger) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason not to do the same thing for eTypeIsFloat?

@kastiglione kastiglione changed the title [lldb] Support custom printf formatting for variables [lldb] Support custom LLVM formatting for variables Mar 19, 2024
@kastiglione
Copy link
Contributor Author

I've updated this PR to use llvm formatting instead of printf. For the following reasons:

  1. For printf, users would have to know the system's size of the value, eg %d vs %ld etc
  2. Users would have to provide different values for different systems, which limits the use/convenience of such summary strings

Getting the size wrong could be any of undefined behavior, buggy, crashy, insecure.

@kastiglione
Copy link
Contributor Author

@jimingham now that I've switched to llvm format, I'll loop back and follow up on your comments.

static bool DumpValueWithLLVMFormat(Stream &s, llvm::StringRef options,
ValueObject &target) {
std::string formatted;
std::string llvm_format = ("{0:" + options + "}").str();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any way we can make this string static, by switching over the supported options?
Or let me ask another way — what happens if options contained "}{1}" is this well-defined in llvm::formatv because it knows the template arguments and thus will not lead to corruption and crashes?
If the answer is yes, then this is okay.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by switching over the supported options?

There are many supported options, and they vary from type to type (int has different options than say strings). See FormatProviders.h for the builtins.

It would be possible to exhaustively iterate all the options llvm includes, but I'm not sure it would be worth it. Switching over them seems fragile to changes made to the source of truth in llvm.

I will check/test what llvm does for invalid options.

Copy link

github-actions bot commented Mar 21, 2024

✅ With the latest revision this PR passed the Python code formatter.

lldb/source/Core/FormatEntity.cpp Outdated Show resolved Hide resolved
@kastiglione kastiglione merged commit 7a8d15e into llvm:main Apr 30, 2024
5 checks passed
@kastiglione kastiglione deleted the lldb-Support-custom-printf-formatting-for-variables branch April 30, 2024 17:45
@aaupov
Copy link
Contributor

aaupov commented Apr 30, 2024

This diff broke buildbot: https://lab.llvm.org/buildbot/#/builders/68/builds/73367

kastiglione added a commit that referenced this pull request Apr 30, 2024
@kastiglione
Copy link
Contributor Author

revert: 0f628fd

apologies, thanks for reporting @aaupov.

kastiglione added a commit that referenced this pull request May 15, 2024
Re-apply #81196, with a fix that handles the 
absence of llvm formatting: 
3ba650e
b47d425f
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants