Skip to content

[lldb] colorize symbols in image lookup with a regex pattern #69422

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Dec 8, 2023

Conversation

talha-10xE
Copy link
Contributor

@talha-10xE talha-10xE commented Oct 18, 2023

Fixes #57372

Previously some work has already been done on this. A PR was generated but it remained in review:
https://reviews.llvm.org/D136462

In short previous approach was following:
Changing the symbol names (making the searched part colorized) -> printing them -> restoring the symbol names back in their original form.

The reviewers suggested that instead of changing the symbol table, this colorization should be done in the dump functions itself. Our strategy involves passing the searched regex pattern to the existing dump functions responsible for printing information about the searched symbol. This pattern is propagated until it reaches the line in the dump functions responsible for displaying symbol information on screen.

At this point, we've introduced a new function called "PutCStringColorHighlighted," which takes the searched pattern and the expected output and applies colorization to highlight the pattern in the output. This approach aims to streamline the symbol search process to improve readability of search results.

Co-authored-by: José Junior josejunior@10xengineers.ai

@github-actions
Copy link

github-actions bot commented Oct 18, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@DavidSpickett
Copy link
Collaborator

So, the automatic checkers here are going to complain about formatting and style - ignore them for now. We'll review all that once the important stuff is done.

You should put this PR into "draft" mode, there is a button somewhere on the page (that only you can see). Just to make it clear to other reviewers.

I have a few major things to mention, I wouldn't try to handle these all at once. Tackle them incrementally.

I think you could extend the existing Dump functions by adding a new new name argument with a default value of nullptr.
See https://en.cppreference.com/w/cpp/language/default_arguments and

bool SetSectionLoadAddress(const lldb::SectionSP &section,

For an example in tree. This means that all existing callers of all these dump functions can have the same result. Inside the functions you can say if (name) print colours else print without.

By doing that, you do not need to make new versions of each dump function.

Second, I'm looking at this printRed function thinking that this is a thing lldb must have done already elsewhere. That we'd have a utility function for. And we do.

auto debugger_sp = m_debugger_wp.lock();

That file does essentially what you're doing using it. You'd use the strings found here instead of the ASNI_...

So then you don't need to duplicate printRed everywhere.

To be clear, I'm not criticizing you for not knowing this. Once you've been around a while you get a sense for these things and, well, it's also what reviewers are for :).

Which brings me onto the fact that this feature will have to respect the use-color setting. The function mentioned above allows you to pass a bool to it to do that, you can look at other callers of it to see how they get that bool. To set this setting in lldb itself:

(lldb) process status
<stuff with colours>
(lldb) settings set use-color false
(lldb) process status
<stuff without colour>

And you can test it out manually.

Next I think you'll benefit from writing some tests. Best thing to do is adapt existing ones. The tests for target modules lookup -r -n are in a shell test lldb/test/Shell/Commands/command-target-modules-lookup.test. So that's where I'd start.

There are other shell tests that check colour (I'm British, color :) ) output, or lack of. If you search lldb/test for use-color you'll see them. You might have to turn colour off for the existing test, and add a new one that enables it and checks the ansi codes are there.

To run just this test:

$ ./bin/llvm-lit ../llvm-project/lldb/test/Shell/Commands/command-target-modules-lookup.test -a

-a is not required but it means that if it fails, it will print out all the commands that it just ran so you can debug it.

You showed some example output on Discourse and that's a decent starting point for the checks themselves. You had one where there was one match, many matches, etc. that's about all you'd need.

And the reason I say to get some tests done is it'll preserve your sanity if you want to refactor the implementation.

So my final thing is, I'm seeing const char* name passed around a lot and you having to strstr to find it. Even though the symbol lookup must have found at least 1 match already. I wonder if we can have an alternative interface that not only returns the indexes of symbols but the details of their matches.

This might be too big of a change for this feature though, so let's deal with everything else first.

As it's possible that the symbol table breaks after the first match in the string, so it wouldn't be able to return the full set of matches anyway. So it may mean adding an interface that does (in Python terms) re.findall instead of re.search.

Anyway, that'll seem like a lot but take it one step at a time and feel free to update this as you go and/or ask more questions. Great work so far!

@talha-10xE talha-10xE marked this pull request as draft October 18, 2023 12:31
Copy link
Contributor Author

@talha-10xE talha-10xE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi David! We're done with point no. 1 i.e., extending the existing dump functions by adding name (searched symbol) argument in the function definition. Can you please have a look at the changes. Also here I'm doing a lot of this in these dump functions:

if(name) {Pass name to next dump function}
else {Use the next dump function without name}

Instead of this, should I call the next dump function with name parameter and handle the name=nullptr in the PrintRed function. Because at the end, the program is going to land in the PrintRed function in the case of if. Even if I land the program at the PrintRed function in case of else as well, will it be okay?
Then instead of having writing if else in every dump function, I'll have to write this in PrintRed function only:

if(name == nullptr){print without Red}
else(){Print with Red}

Also, Can you please confirm if the current version is done the right way as per you suggested :)
Next: We will be working on point 2 (updating PrintRed function) and point 3 (Making test cases) and will update you shortly.

@DavidSpickett DavidSpickett self-requested a review October 23, 2023 14:23
pointer_sc.DumpStopContext(s, exe_scope, so_addr, true, false,
false, true, true, name);
else
pointer_sc.DumpStopContext(s, exe_scope, so_addr, true, false,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you can simplify this a lot.

If if (name) is true, then name is not nullptr. So:

pointer_sc.DumpStopContext(s, exe_scope, so_addr, true, false,
                                             false, true, true, name);

Is fine. Ok so far.

In the else, name must be nullptr. DumpStopContext's name parameter defaults to nullptr anyway, so by passing it here too, you wouldn't create any problems.

So the final code could be:

              if(name)
                pointer_sc.DumpStopContext(s, exe_scope, so_addr, true, false,
                                         false, true, true, name);
              else
                pointer_sc.DumpStopContext(s, exe_scope, so_addr, true, false,

Here you can simplify this a lot.

If if (name) is true, then name is not nullptr. So:

pointer_sc.DumpStopContext(s, exe_scope, so_addr, true, false,
                                             false, true, true, name);

Is fine. Ok so far.

In the else, name must be nullptr. DumpStopContext's name parameter defaults to nullptr anyway, so by passing it here too, you wouldn't create any problems.

So the final code could be:

                    pointer_sc.DumpStopContext(s, exe_scope, so_addr, true, false,
                                             false, true, true, name);

If name is nullptr, we pass a nullptr, if it isn't, we pass it on. DumpStopContext is happy with either.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may help you see what's going on behind the scenes: https://godbolt.org/z/vxc56798P

You can ignore the assembly specifics but for the fact that fn2 and fn3 generate identical code. Just in the source code one "explicitly" passes nullptr, and one "implicitly" passes it.

@@ -185,6 +226,111 @@ bool SymbolContext::DumpStopContext(Stream *s, ExecutionContextScope *exe_scope,
return dumped_something;
}

void SymbolContext::GetDescription(Stream *s, lldb::DescriptionLevel level,
Target *target, const char* name) const {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we need a second copy of this function? I assume it existed prior.

const std::string reset_color = ANSI_ESC_START + std::to_string(ANSI_CTRL_NORMAL) + ANSI_ESC_END;

const char *match = text;
size_t name_len = strlen(name);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://en.cppreference.com/w/c/string/byte/strlen

Specifically:

The behavior is undefined if str is not a pointer to a null-terminated byte string.

I won't go into what that means according to the standard but here specifically, if clang knew a call to PrintRed took a nullptr always, it could make essentially choose the result of strlen to be almost anything. Which is not what we want.

So, as you suggested in your comment on the PR page, starting with some if a nullptr then just print it out normally is a good idea.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://en.cppreference.com/w/c/string/byte/strstr

The behavior is undefined if either str or substr is not a pointer to a null-terminated byte string.

You see the theme, a lot of these functions do not check for you.

It may be helpful that a pointer to the empty string "" will not be nullptr, a nullptr literally cannot point to anything, it's an invalid pointer (sort of, see the standards if you want the super specific language).

The point being that when you're coming from a language without strings that can be null and can only be empty string or some characters, that difference can be confusing.

@DavidSpickett
Copy link
Collaborator

Cool, some good improvements so far.

Instead of this, should I call the next dump function with name parameter and handle the name=nullptr in the PrintRed function.

It would be a good idea to do this yes. I left some comments suggesting things to check there.

I would do that and remove the redundant elses, then move on to replacing PrintRed with the built in function I mentioned.


# CHECK: 3 symbols match the regular expression 'ma' in {{.*}}
# The [[ confuses FileCheck so regex match it.
# CHECK-NEXT: Name: {{.+}}31mma{{.+}}0min.c
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the lines that are supposed to have colour are important here, and ordering isn't really an issue.

Keep the "3 symbols match" check, just as a sanity check in case the output changes, but then only check for the lines with colour (you won't need -NEXT).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, ordering is something we want to be correct - but that's tested by other tests. You can assume it will be correct.

UNSUPPORTED: system-windows

# RUN: %clang_host -g %S/Inputs/main.c -o %t
# RUN: %lldb %t -b -o 'image lookup -r -s ma' | FileCheck %s
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have brought this up earlier but the way you're coloring the output doesn't actually do a regex match, it looks for substrings. This works fine if the input is itself basically a substring like here, but would not for m[abc].

(keep these test lines they're good, but add more to cover this)

So you'll need some way to either do the matching again while formatting, or pass instead of name, a list of matched locations from the initial search.

I suggest you start by doing the first idea, redo the regex match in the formatting function. That'll give you an idea of what data you'd require, if we were to instead pass the result of the first matching to the formatter.

@talha-10xE
Copy link
Contributor Author

Hi David, I thought writing a few simple and very basic test cases will be better before updating the printRed function so that while updating the printRed function it easier and quicker for us to test :)
We've written 2 test cases for colorized output search in tests/Shell/Commands/TestImageLookupColor by taking help from the following 2 already written test cases:

First test is a basic regex symbol search for "ma"
The second test (which is failing right now as we have not added the support of | in printRed function) searches for 2 symbols using or (|). I'll update the printRed function to first split the input string i.e., name on | and then apply colors to each of the string in the searched result

Also, we've incorporated the previous feedback.

Next we'll update the printRed function:

  • Will add or (|) feature in printRed.
  • Will update printRed so that we don't have to write it in every separate file.
  • Will add support of no use color i.e., settings set use-color false

Can you please confirm weather the test cases good for now and the future plan looks okay? ;)

P.S: Currently The test case tests/Shell/Commands/command-target-modules-lookup is failing as it is not checking for ANSI color codes in the output. Once we update the printRed function, We'll add no use color in this test case and hopefully it will work fine.

@DavidSpickett
Copy link
Collaborator

Can you please confirm weather the test cases good for now and the future plan looks okay? ;)

Sounds good but see my comment about substrings vs. regex. If you go and special case all the characters you're gonna end up reinventing a regex engine, but you don't need to do that, we have functions to handle that already.

@talha-10xE
Copy link
Contributor Author

Hi David! I've done the following:

  1. Used regex search in the printRed format function instead of sub string search.
  2. Used ANSI function which you mentioned above
    inline std::string FormatAnsiTerminalCodes(llvm::StringRef format,
  3. In the above function, to cater with the bool variable, which takes weather the Use-Color settings are switched on or not, I've passed the CommandInterpreter object to the PrintRed function (as it was available to me in the dump functon) and fetching the Use-colors settings option (true or false) through it using:
    interpreter->GetDebugger().GetUseColor()
    And passing this bool to the ANSI function.
  4. Updated the Test-cases by writing a few more test cases which covers some commonly used regex symbols for searching. Now all previous test cases are working fine including tests/Shell/Commands/command-target-modules-lookup and in my test cases, I've passed the use-color = true option in run commands:
    # RUN: %lldb %t -b -o 'settings set use-color true' -o 'image lookup -r -s ma' | FileCheck %s --check-prefix CHECK1
    Next:
    Will figure out a way to remove duplicate PriintRed functions in different files.

Can you please confirm if passing the pointer to CommandInterpreter object in PrintRed function to get the use-color options a good idea? Are the test cases good enough and also does the future plan sounds Okay?

@DavidSpickett
Copy link
Collaborator

DavidSpickett commented Oct 30, 2023

Can you please confirm if passing the pointer to CommandInterpreter object in PrintRed function to get the use-color options a good idea?

Yes but it can be better :) See my comments.

Are the test cases good enough and also does the future plan sounds Okay?

I suggested two more test cases, otherwise they look good to me.

@talha-10xE
Copy link
Contributor Author

Hi David!

That file does essentially what you're doing using it. You'd use the strings found here instead of the ASNI_...

Which brings me onto the fact that this feature will have to respect the use-color setting. The function mentioned above allows you to pass a bool to it to do that, you can look at other callers of it to see how they get that bool. To set this setting in lldb itself

I searched for the function FormatAnsiTerminalCodes to see how the bool option is passed in other parts of LLDB. In most of the places, GetUseColor() function of Debugger class was used to check the use-color settings. So I decided to use this approach interpreter->GetDebugger().GetUseColor() inside printRed function.
Few instances:
https://lldb.llvm.org/cpp_reference/Debugger_8cpp_source.html#:~:text=std%3A%3Astring%20str%20%3D,226%20%20%20%20%20%20%20%20%20%20new_prompt%2C%20GetUseColor())%3B
https://lldb.llvm.org/cpp_reference/Debugger_8cpp_source.html#:~:text=const%20bool%20use_color,%2C%20message.c_str())%3B
I guess I could not fully understand your comment :( If possible can you please explain it a little bit more.

@DavidSpickett
Copy link
Collaborator

My point is not that interpreter->GetDebugger().GetUseColor() is the wrong source for the information. This is fine.

It's that passing interpreter down through 2/3/4 levels of calls, just to then do ``interpreter->GetDebugger().GetUseColor() ` on it is overkill and obscures what you're actually doing.

Imagine you are reading one of those first calls and you see it passes in an interpreter pointer. You would have to assume it could call anything on that object. This means you've got to go and read the entire set of calls to understand what happens to it. Ok that's not a problem now but what if I want to refactor interpreter, or this code in general?

If instead, that call just did ``interpreter->GetDebugger().GetUseColor()` and passed the result down 2/3/4 levels of calls, that's much clearer. Now we know it only needs that one property.

(and as it turns out, you might not even need another parameter for it, see my comment about the value of name)

@DavidSpickett
Copy link
Collaborator

Also side note: if you ever find yourself passing a pointer that is never going to be nullptr, use a reference instead. This also helps refactoring later.

@DavidSpickett
Copy link
Collaborator

DavidSpickett commented Oct 31, 2023

I made an example to illustrate: https://godbolt.org/z/zxdq5jMGT

The _ functions get only the data they need and so don't have to care about the Interpreter type at all. Anyone refactoring can easily see that.

Also, anyone writing tests for the _ functions doesn't have to mock up a Interpreter object, just for it to call one method on it.

...hope all of that makes some sense :)

@talha-10xE
Copy link
Contributor Author

Hi David!
Thanks for the detailed explanation. I got you point that instead of passing the whole object, we should only pass the information required by the next functions. Now I' passing a vector down the track that holds a pair of two things:
std::vector<std::pair<bool, const char*>> info;

  • The searched regex string
  • Use-color settings

Copy link
Collaborator

@DavidSpickett DavidSpickett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently I didn't submit the last set of review comments sorry about that.

The one about vector is the key one this time.

return;
}

bool use_color = interpreter->GetDebugger().GetUseColor();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. If we only need this one part of the CommandInterpreter, we should only be passing that one thing down.
    Try to find the earliest place you can to do the GetUseColor call, then pass the result down to here.

  2. You can combine that approach with one observation, that if name is nullptr, this will never use colour regardless of the setting. Therefore instead of having another parameter you could just set name to nullptr if colours are disabled.

As in:

  • No regex search - name is nullptr, colour setting is unused
  • Regex search - name is not nullptr, colour setting must be read

So if early in the callstack you know that GetUseColor returns false, you could just pass name=nullptr. No extra parameters needed. So it becomes something like:

interpreter->GetDebugger().GetUseColor() ? name : nullptr;

std::sregex_iterator end;

std::string red_start = lldb_private::ansi::FormatAnsiTerminalCodes("${ansi.fg.red}", use_color);
std::string reset_color = lldb_private::ansi::FormatAnsiTerminalCodes("${ansi.normal}", use_color);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally you'd use this by making a format string like:

${ansi.fg.red}%s${ansi.normal}

Then the function will remove the ansi bits as needed and you can use it as a format string to printf. strm.Printf(fmt, ...).

What you have isn't wrong, but changing it will save a call or two.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though on second thought, doing it your way saves making a new string later when you do strm.Write(text + match.position(), match.length());.

So leave it as is for now.

strm.PutCString(reset_color.c_str());

last_pos = match.position() + match.length();
++next;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally prefer a for loop for this sort of thing, so that one doesn't forget this last bit.

for ( ; next != end; ++next) {
...
}

//===========================================================================================

// This function is the one which colorizes the regex symbol searched
static void PrintRed(Stream &strm, const char *text, const char *name, CommandInterpreter *interpreter= nullptr) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider changing name to re_pattern, something that tells us that it is in fact a regex.


bool use_color = interpreter->GetDebugger().GetUseColor();

std::string str_text(text);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can avoid some string copies by:

  1. Passing the StringRef to this function, instead of converting it to std::string and then getting a char* from that.
  2. Then using StringRef's .begin and .end in the next iterator. If it doesn't like the type you can build the begin/end from .bytes_begin and .bytes_end instead. Those give you char* instead.

Basically StringRef is a "view" onto a string, so as long as we know where it ends we can iterate over that instead of a new copy of it.

# CHECK4: Name: Scrt{{.+}}31m1{{.+}}0m.o
# CHECK4: Summary: {{.+}}`completed.{{.+}}31m0{{.+}}0m
# CHECK4: Name: __libc_start_main@GLIBC_{{.+}}31m2{{.+}}0m_{{.+}}31m3{{.+}}0m{{.+}}31m4{{.+}}0m
# CHECK4: Name: __cxa_finalize@GLIBC_{{.+}}31m2{{.+}}0m_{{.+}}31m2{{.+}}0m_{{.+}}31m5{{.+}}0m
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one will be a bit tricky running on all the OS we support. I suggest you keep your tests to symbols we know are defined in the user's program.

As if I go to FreeBSD for example, there will be no glibc in the process' address space, it'll be the FreeBSD libc with different names.

So stick to names from main.c.

# CHECK3: Name: {{.+}}31mma{{.+}}0min.c
# CHECK3: Summary: {{.+}}`{{.+}}31mma{{.+}}0min at main.c:2
# CHECK3: Summary: {{.+}}`___lldb_unnamed_sy{{.+}}31mmb{{.+}}0mol36
# CHECK3: Summary: {{.+}}`___lldb_unnamed_sy{{.+}}31mmb{{.+}}0mol37
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up to here is fine, I think you can drop the last test. It's fine to assume the regex engine works as expected, as long as you've demonstrated you've called it properly, which you have.

# CHECK3: 5 symbols match the regular expression 'm[abc]' in {{.*}}
# CHECK3: Name: {{.+}}31mma{{.+}}0min.c
# CHECK3: Summary: {{.+}}`{{.+}}31mma{{.+}}0min at main.c:2
# CHECK3: Summary: {{.+}}`___lldb_unnamed_sy{{.+}}31mmb{{.+}}0mol36
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also these __lldb_unnamed tend to come from inlined functions or libc functions we don't have debug info for. Don't add checks for them as they'll vary by platform.

So I would say here to maybe drop the __lldb lines and not check for the 5 symbols match line either. It's not part of what you're testing here.

@@ -0,0 +1,31 @@
UNSUPPORTED: system-windows

# RUN: %clang_host -g %S/Inputs/main.c -o %t
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you can, a test where the regular expression matches at the very end of the name would be a good addition.

This would check the tail end of your PrintRed loop, where you print any remaining text.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also just for sanity checking, add one where you don't match anything at all. It shouldn't do any matching or even attempt to, so the test would just ensure that it doesn't try and end up crashing.

The classic "nothing in, nothing out" test case. So it will match 0 symbols but that's what we expect.

@@ -247,7 +247,17 @@ class Address {
bool Dump(Stream *s, ExecutionContextScope *exe_scope, DumpStyle style,
DumpStyle fallback_style = DumpStyleInvalid,
uint32_t addr_byte_size = UINT32_MAX,
bool all_ranges = false) const;
bool all_ranges = false,
std::vector<std::pair<bool, const char*>>* info = nullptr) const;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gonna choose this one as the example. You're on the right lines again but implementation is a bit odd. Let's take it bit by bit.

You're using a pointer not a reference, which makes some sense as you can't default construct a reference. So just FYI in future, try to use const & if you are never going to pass a nullptr and can default construct the type (or better, use optional).

Next bit, pair is a decent choice but I don't see the need for a vector here. A single pair would do, so std::pair<bool, const char*>* would suffice.

If you were to do that, you could default construct the parameter and pass via copy. No reference or pointer needed (and pointer/bool are small types cheap to pass by copy).

So you may choose to apply those idea maybe not after I tell you the next bit, which makes them academic.

If you were to write out a table of values of name (aka the pattern) and whether we're using colour, what would it look like? This:

const char* name use_colour Do we use name?
non-null false No
nullptr false No
non-null true Yes
nullptr true No

(This is a https://en.wikipedia.org/wiki/Truth_table if you haven't seen one before)

Now look where we use the name. What's the equation for that state?

if (name is non-null) and (use_colour is true)

Which means that any function currently taking this pair/vector could just take name. Except that we change the value of name depending on the value of use_colour.

If name is non-null but colour is disabled, make name nullptr. The result is the same, but functions only need name to know what to do.

Example: https://godbolt.org/z/1W3EWTPqa

By doing this you can combine name and use_colour earlier in the callstack, and only pass name to the subsequent functions.

If we later added a way to highlight names that did not use colour, then yes, you would need a separate bool. But that is not what you're doing here.

Copy link
Collaborator

@DavidSpickett DavidSpickett Nov 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way to put this is:

  • When name is nullptr, we won't print any colours.
  • When use_colour is false, we won't print any colours.

So why not use name == nullptr to also mean that colour is disabled? The effect is the same.

@DavidSpickett
Copy link
Collaborator

Oh and good work on the updates! (because that feedback is probably hard to see with all these comments :) )

@talha-10xE
Copy link
Contributor Author

Hi David!
I've updated the feedback. I got your point of somehow embedding the use-color and name in a single argument. Now we're doing the following:
In the very first function (LookupSymbolInModule) which calls other dump:

bool use_color = interpreter.GetDebugger().GetUseColor();
if (name && use_color){
   DumpAddress(
          interpreter.GetExecutionContext().GetBestExecutionContextScope(),
          symbol->GetAddressRef(), verbose, all_ranges, strm, name);
 }
 else{
      DumpAddress(
         interpreter.GetExecutionContext().GetBestExecutionContextScope(),
          symbol->GetAddressRef(), verbose, all_ranges, strm);
  }

As the program should only go inside the re-pattern (new name of printRed ;-) function if use-color is true and name is not null. Hence if this condition
if (name && use_color)
is not true, I'm using the Dump function which has nullptr as the default value of variable name.
Plus I've also updated the test cases and re-pattern function according to the feedback. Just 2 things are left:

  1. Writing a test case for no symbol match.

Also just for sanity checking, add one where you don't match anything at all. It shouldn't do any matching or even attempt to, so the test would just ensure that it doesn't try and end up crashing.

The classic "nothing in, nothing out" test case. So it will match 0 symbols but that's what we expect.

Was facing some issues while checking for empty string using:

# RUN: %lldb %t -b -o 'settings set use-color true' -o 'image lookup -r -s .*o132we$' | FileCheck %s --check-prefix CHECK5
# CHECK5-EMPTY:

But getting this error:

error: found non-empty check string for empty check with prefix 'CHECK5:'
# CHECK5-EMPTY:

  1. Avoiding string copy in re-pattern function

You can avoid some string copies by:

  1. Passing the StringRef to this function, instead of converting it to std::string and then getting a char* from that.
  2. Then using StringRef's .begin and .end in the next iterator. If it doesn't like the type you can build the begin/end from .bytes_begin and .bytes_end instead. Those give you char* instead.

Basically StringRef is a "view" onto a string, so as long as we know where it ends we can iterate over that instead of a new copy of it.

Was having issues while using both .begin() and .bytes_begin()

But I'll look into both of these remaining things. For the test case maybe I will try to check the length of the output to be zero and for String reference I'll look more in to the begin and bytes_begin functions. Will update you shortly on these two.
Other than that can you please provide feedback on this commit and any other stuff which needs to be done :)

@talha-10xE
Copy link
Contributor Author

Added the wrong test case in the previous commit by mistake. Have removed it in the new one :/ Sorry for the inconvenience

@talha-10xE
Copy link
Contributor Author

Hi @JDevlieghere! did you got a chance to look at the code 🙂

Copy link
Member

@JDevlieghere JDevlieghere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an awesome feature, thank you for working on this! I left a few comments but overall this PR is in pretty good shape.

uint32_t addr_byte_size = UINT32_MAX,
bool all_ranges = false) const;
uint32_t addr_byte_size = UINT32_MAX, bool all_ranges = false,
const char *pattern = nullptr) const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is pattern a const char* and not a StringRef with an empty string as the default argument? Or potentially even better an std::optional<llvm::Regex>?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to StringRef. I could not make it work with std::optional<llvm::Regex>. Maybe I was doing something wrong.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What errors did you get?

Remember that optionals are "pointer like" when you want to get at the value within them, *foo, foo->method() etc. that trips me up sometimes. Also the default value would be = std::nullopt.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I apologize, I should have noted the errors. But as far as I remember, it was related to use of deleted functions and type conversions. I also tried passing the parameter as a reference thinking the error could be caused by copying.

Remember that optionals are "pointer like" when you want to get at the value within them, *foo, foo->method() etc. that trips me up sometimes.

Hmm, I was trying to get pattern.value(), perhaps that was part of the problem.

Also the default value would be = std::nullopt

At least I got that right.

Copy link
Collaborator

@DavidSpickett DavidSpickett Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tbf I had to look up nullopt, I don't see it much. .value returns a reference to the contained value (assuming it has one), -> does the same so either is fine.

Can you post the error message(s)? Don't need to update the PR, just describe how you're constructing and passing it, that'll likely be enough.

Copy link
Contributor Author

@talha-10xE talha-10xE Dec 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @DavidSpickett . Tried this incremental approach:

static void test_func1(Stream &strm, llvm::Regex *regex_param = nullptr) {
  strm.Printf("Inside test_func1\n");
}

static void test_func2(Stream &strm, llvm::Regex &regex_param) {
  strm.Printf("Inside test_func2\n");
}

static void test_func3(Stream &strm, const std::optional<llvm::Regex> regex_param = std::nullopt) {
  strm.Printf("Inside test_func3\n");
}

static void test_func4(Stream &strm, const std::optional<llvm::Regex> &regex_param = std::nullopt) {
  strm.Printf("Inside test_func4\n");
}

static uint32_t LookupSymbolInModule(CommandInterpreter &interpreter,
                                     Stream &strm, Module *module,
                                     const char *name, bool name_is_regex,
                                     bool verbose, bool all_ranges) {
  llvm::Regex regex1(name);
  std::optional<llvm::Regex> regex2 = std::make_optional<llvm::Regex>(name);

  test_func1(strm, &regex1);
  test_func2(strm, regex1);
  // test_func3(strm, regex2);
  test_func4(strm, regex2);
  ...
  ...
  }

In test1, used the same logic like used previously i.e., function accepting pointer to llvm::Regex and default value is null pointer.
In test2 passed llvm::regex by reference but could not give a default value.
In test 3 used the std::optional<llvm::Regex> but was getting the error mentioned above i.e., use of deleted functions. The exact error is:

error: use of deleted function ‘std::optional<llvm::Regex>::optional(const std::optional<llvm::Regex>&)’
 1611 |   test_func3(strm, regex2)

I believe it's due to the deletion of copy constructor which you mentioned.
In test 4 passing the reference of std::optional<llvm::Regex> to the function and giving nullopt as default value. It is working fine as expected.
I'm a little confused. Right now, we're passing the pattern as llvm::StringRef, and finally when we reach PutCStringColorHighlighted we convert it to llvm::Regex:
llvm::Regex reg_pattern(pattern);
What will be the benefit of converting const char* name to std::optional<llvm::Regex> from the very beginning and than passing it till we reach PutCStringColorHighlighted? 🤔

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't speak for Jonas here so just my interpretation -

If you pass this as llvm::Regex, you're using the type system to signal to anyone reading this code that this is a regex pattern. Whereas using const char* and a comment that it is a regex could be missed by the reader.

test_func3 is the ideal here (though you may have to remove const, you'll find that out later). Deleting the copy constructor is usually done when you have a move only type, which is usually something that contains a resource where copying it doesn't make sense.

I think something like test_func3(strm, std::move(regex2)); might fix it. At least, that's the obvious way to try to move construct the regex inside the optional.

Can you pass std::make_optional<llvm::Regex>(name) to func3 maybe?

If all this is too much hassle in the end, using a pointer is the fallback option. As long as the lifetimes are ok which I'm guessing they are. By which I mean, nothing earlier in the callstack is going to destroy the pattern before you end up using it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @DavidSpickett! Sorry for the delay 🙏

test_func3(strm, std::move(regex2)) works but it gives the following problem.
When we use std::move(regex2) to pass it to test_func3, it's essentially moved, and after the move, the regex2 in LookupSymbolInModule becomes garbage.

So in functions where pattern is moved multiple times like this:

static void DumpAddress(ExecutionContextScope *exe_scope,
                        const Address &so_addr, bool verbose, bool all_ranges,
                        Stream &strm, std::optional<llvm::Regex> pattern = std::nullopt) {
     ...
     ...
     so_addr.Dump(&strm, exe_scope, Address::DumpStyleResolvedDescription,
               Address::DumpStyleInvalid, UINT32_MAX, false, std::move(pattern));
     ...
     ...
     if (verbose) {
    strm.EOL();
    so_addr.Dump(&strm, exe_scope, Address::DumpStyleDetailedSymbolContext,
                 Address::DumpStyleInvalid, UINT32_MAX, all_ranges, std::move(pattern));
  }
     ...
}

The second std::move(pattern), does not move the original pattern value but some garbage value. The only way is to make a copy of the original pattern and then use it with std::move(copy_pattern) every time (which doesn't look okay :-/).

This is the example code and output if I couldn't explain it properly above:
CODE:

static void test_func3(Stream &strm, const std::optional<llvm::Regex> pattern = std::nullopt) {
  if (pattern.has_value() && pattern.value().match("main")) {
        strm.Printf("Pattern matches 'main' inside test_func3\n");
    }
}

static uint32_t LookupSymbolInModule(CommandInterpreter &interpreter,
                                     Stream &strm, Module *module,
                                     const char *name, bool name_is_regex,
                                     bool verbose, bool all_ranges) {
  // test_func1(strm, std::make_optional<llvm::Regex>(name));
  std::optional<llvm::Regex> regex2 = std::make_optional<llvm::Regex>(name);
  if (regex2.has_value()) {
    strm.Printf("Pattern has a value\n");
    if(regex2.value().match("main")) {
      strm.Printf("Pattern matches main\n");
    }
    else{
      strm.Printf("Pattern doesn't match main\n");
    }
  } else {
    strm.Printf("Pattern doesn't have a value\n");
  }
  test_func3(strm, std::move(regex2));
  if (regex2.has_value()) {
    strm.Printf("Pattern has a value\n");
    if(regex2.value().match("main")) {
      strm.Printf("Pattern matches main\n");
    }
    else{
      strm.Printf("Pattern doesn't match main\n");
    }
  } else {
    strm.Printf("Pattern doesn't have a value\n");
  }  ...
  ...
}

OUTPUT:

talha@talha-ROG-Strix-G513QE-G513QE:~/Desktop/LLDB/build/bin$ ./lldb a.out 
(lldb) target create "a.out"
Current executable set to '/home/talha/Desktop/LLDB/build/bin/a.out' (x86_64).
(lldb) image lookup -r -s main
Pattern has a value
Pattern matches main
Pattern matches 'main' inside test_func3
Pattern has a value
Pattern doesn't match main
...
...

So after std::move(pattern), pattern != "main" within the same function

Similarly passing std::make_optional<llvm::Regex>(name) to func3 also causes problem problem:
i.e., test_func3(strm, std::make_optional<llvm::Regex>(name));

When we directly pass std::make_optional<llvm::Regex>(name), it creates a temporary std::optional<llvm::Regex> object that is passed directly to test_func3. This temporary object can be moved without requiring an explicit std::move because I guess temporaries are considered rvalues, and std::optional has appropriate move constructors to handle them.
But storing the result of std::make_optional<llvm::Regex>(name) in a variable and then passing it without using std::move, if the variable is passed as an argument to the function, it attempts to perform a copy unless we use std::move. So, initially we can pass it directly to the next dumping function but further passing it will require std::move and the same problem comes back.

Also 1 last thing for clarification, as discussed later i.e.,

I think we may be able to reduce the boilerplate by putting the prefix/suffix/pattern in a struct, and passing that as an optional assuming we can figure out the move issue with regex. But this is for a future change if at all.

The struct will be optional and not the member variables inside it as we can check them in the very first function right?
So the struct will look like:

struct information {
pattern, prefix, sufix
}

And information will be passed as optional argument rather than

struct information {
optional pattern = EMPTY, optional prefix= EMPTY, optional sufix = EMPTY
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we use std::move(regex2) to pass it to test_func3, it's essentially moved, and after the move, the regex2 in LookupSymbolInModule becomes garbage.

This is where I get on my Rust powered soapbox and preach about C++ move not being what you'd think it is. :)

Good research anyway. Sound like it's what I suspected is that the regex is carrying around the match "engine"'s state.

On the struct, yes, the struct itself would be optional. If you do pass an instance of it you must provide all 3.

To be honest, once we're packing all this into a struct, that's quite an obvious signal to the programmer that this is no ordinary string. So you could do just try the struct idea without using the Regex type.

And ok, it's not perfect because all 3 struct members have the same type still, but if making the pattern a Regex is this much trouble it's probably not worth it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in code that would be roughly:

struct information {
pattern, prefix, sufix
}

PutCStringHighlighted(....., std::optional<information> pattern_info);

Which tells the reader clearly that if they want to provide any one of the 3 they must provide all 3. Otherwise they must provide none of them.

@@ -1593,6 +1595,7 @@ static uint32_t LookupSymbolInModule(CommandInterpreter &interpreter,
return 0;

SymbolContext sc;
bool use_color = interpreter.GetDebugger().GetUseColor();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlike in llvm we do const variables (and there's examples of that in the surrounding code) so let's make this const bool use_color.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

return;
}

// If pattern is not nullptr, we should use color
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: seems more informative to have the inverse of this comment at the start of the function where we have an early return for the trivial case of not having a patter to highlight.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Changed this.

Comment on lines 86 to 87
std::string format_str = lldb_private::ansi::FormatAnsiTerminalCodes(
"${ansi.fg.red}%.*s${ansi.normal}");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most (all?) other colors are configurable. I think this one should be too. Some examples of this are the show-autosuggestion-ansi-prefix and show-progress-ansi-prefix. This pattern should work here as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I got this suggestion right. What I did was:

  1. Defined these properties in lldb/source/Core/CoreProperties.td:
def ShowRegexMatchAnsiPrefix: Property<"show-regex-match-ansi-prefix", "String">,
    Global,
    DefaultStringValue<"${ansi.fg.red}">,
    Desc<"When displaying a regex match in a color-enabled terminal, use the ANSI terminal code specified in this format immediately before the match.">;
def ShowRegexMatchAnsiSuffix: Property<"show-regex-match-ansi-suffix", "String">,
    Global,
    DefaultStringValue<"${ansi.normal}">,
    Desc<"When displaying a regex match in a color-enabled terminal, use the ANSI terminal code specified in this format immediately after the match.">;
  1. Declared the getter functions in Debugger.h and defined them in Debugger.cpp following the pattern:
llvm::StringRef Debugger::GetRegexMatchAnsiPrefix() const {
  const uint32_t idx = ePropertyShowRegexMatchAnsiPrefix;
  return GetPropertyAtIndexAs<llvm::StringRef>(
      idx, g_debugger_properties[idx].default_cstr_value);
}

llvm::StringRef Debugger::GetRegexMatchAnsiSuffix() const {
  const uint32_t idx = ePropertyShowRegexMatchAnsiSuffix;
  return GetPropertyAtIndexAs<llvm::StringRef>(
      idx, g_debugger_properties[idx].default_cstr_value);
}
  1. Added two parameters to Stream::PutCStringColorHighlighted: prefix and suffix.
  2. In Address.cpp and Symbol.cpp, there was a Target object that I used to get the debugger and its properties. In CommandObjectTarget.cpp, there was an CommandInterpreter object that was used for the same. But in SymbolContext.cpp, there was a TargetSP object and I tried using it to get the debugger properties, I got a segmentation fault, that's why I put the following two lines hard coded (while we don't solve this):
llvm::StringRef ansi_prefix = "${ansi.fg.red}";
llvm::StringRef ansi_suffix = "${ansi.normal}";

Please let me know if there's a better way to do this or if I completely misunderstood the suggestion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I was going to add a test case for different colors, but I will only add when that error is fixed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might be able to get it from the exe_scope like this does:

bool Address::Dump(Stream *s, ExecutionContextScope *exe_scope, DumpStyle style,
                   DumpStyle fallback_style, uint32_t addr_size,
                   bool all_ranges) const {

But you won't always have a target. Which is a bit odd because there'd always be a settings object somewhere, if only you could get to it.

Seems like symbolcontext can have a target or not, and methods that definitely need one have it as a parameter. So that's the other way to go, add a parameter for it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could have sworn I did this yesterday and it did not work. But I tried (again ?) and it works now. Just checking if that's okay... This is how it is now:

llvm::StringRef ansi_prefix = exe_scope->CalculateTarget()->GetDebugger().GetRegexMatchAnsiPrefix();
llvm::StringRef ansi_suffix = exe_scope->CalculateTarget()->GetDebugger().GetRegexMatchAnsiSuffix();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, before pushing these, check-lldb failed because of this change.

You can check the log here: https://pastebin.com/atfvLauV

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's either find another way to get to the settings, or assume no colour if the target isn't present.

Jonas may know a way to do it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are your thoughts on this approach? (In SymbolContext.cpp - line 98). I tested it and it only fails on the new test this PR creates, so we would only need to adapt it.

if (name) {
        llvm::StringRef ansi_prefix;
        llvm::StringRef ansi_suffix;
        if (target_sp) {
          ansi_prefix = target_sp->GetDebugger().GetRegexMatchAnsiPrefix();
          ansi_suffix = target_sp->GetDebugger().GetRegexMatchAnsiSuffix();
        }
        s->PutCStringColorHighlighted(name.GetStringRef(), pattern, ansi_prefix,
                                      ansi_suffix);
      }

Or even forcing the red color when there is no target? Honestly, I do not completely understand what would mean not having a target.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume if we have no target set we won't know whether to colour anything, so we'd default to no colours. Which means the prefix/suffix can default to empty string. Which is exactly what you've suggested.

Worst case in some situation it doesn't highlight and if the user cares they can open an issue for it.

I expect the SymbolContext can live without a target in some scenario, if you want to find out what, find somewhere in SymbolContext that null checks target, remove the check and see what tests fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

This is how it is working now. Basically no color for SymbolContext. IMO, it is a little weird, but the alternative option of forcing red color when no target is available could be weirder when the color was changed by the user.

@junior-jl
Copy link
Contributor

Addressed @JDevlieghere suggestions.

  • A new debugger option was created for custom colors.
  • The test needed some adaptation and we added a new case to test for other color option.
  • The SymbolContext situtation is working as discussed with David: no color when there is no target.
  • The recent changes are using StringRef for pattern. @taalhaataahir0102 is trying to make it work with std::optional<llvm::Regex>.
  • Some missing comments were added.

Copy link
Member

@JDevlieghere JDevlieghere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize it was going to be so tedious to pass the prefix and suffix down, but I still think it's the right thing to do, so thanks for making that work. It would be awesome if we can make the optional<Regex> work, but that's not a blocker so this LGTM.

Comment on lines 80 to 83
if (pattern.empty()) {
PutCString(text);
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the prefix and suffix are empty, we can also take this shortcut, right?

@DavidSpickett
Copy link
Collaborator

@JDevlieghere thanks for reviewing.

Once Jonas' last comment is addressed let me know if you're happy with the changes and PR message and I can merge this for you.

I think we may be able to reduce the boilerplate by putting the prefix/suffix/pattern in a struct, and passing that as an optional assuming we can figure out the move issue with regex. But this is for a future change if at all.

@junior-jl
Copy link
Contributor

In the last commit:

  • Added the conditions prefix.empty() and suffix.empty() to the 'no color' path.
  • Added missing arguments to PutCStringColorHighlighted call in SymbolContext (line 177).

I had to copy the same statements from line 99 here. Is it better if I declared llvm::StringRef ansi_prefix and llvm::StringRef ansi_suffix outside the conditions even if they are not used?

      llvm::StringRef ansi_prefix;
      llvm::StringRef ansi_suffix;
      if (target_sp) {
        ansi_prefix = target_sp->GetDebugger().GetRegexMatchAnsiPrefix();
        ansi_suffix = target_sp->GetDebugger().GetRegexMatchAnsiSuffix();
      }
      s->PutCStringColorHighlighted(symbol->GetName().GetStringRef(), pattern,
                                    ansi_prefix, ansi_suffix);

Also, since these changes were approved, we opted to finalize this PR like it is now. Since we are creating another PR to do the same thing for function search, we are planning on refactoring this with an optional struct and llvm::Regex type as suggested. This change would also make this if (pattern.empty() || prefix.empty() || suffix.empty()) prettier, I guess. Is that good?

I would also like to understand better the relation between SymbolContext and the target (or another way of getting Debugger properties).

llvm::StringRef ansi_prefix =
interpreter.GetDebugger().GetRegexMatchAnsiPrefix();
llvm::StringRef ansi_suffix =
interpreter.GetDebugger().GetRegexMatchAnsiSuffix();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are doing a follow up, I would put these calls inside the call to PutCStringColorHighlighted. To limit the scope of the values even more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to suggest that too, but the Stream class lives in utility so you can't pass in the debugger or the target without breaking the layering.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you might be mistaking putting the calls inside PutCStringColorHighlighted vs. putting the calls inside the call to PutCStringColorHighlighted.

I'm suggesting:

PutCStringColorHighlighted(<...>, interpreter.GetDebugger().GetRegexMatchAnsiPrefix(), interpreter.GetDebugger().GetRegexMatchAnsiSuffix())

(should have included that example in my first comment tbh :) )

@DavidSpickett
Copy link
Collaborator

DavidSpickett commented Dec 7, 2023

Is it better if I declared llvm::StringRef ansi_prefix and llvm::StringRef ansi_suffix outside the conditions even if they are not used?

As far as I can tell, that's how you must do it, or at least, one of the simpler ways. As the call must be passed something if target is nulptr.

You could do:

      s->PutCStringColorHighlighted(symbol->GetName().GetStringRef(), pattern,
             target_sp ? target_sp->GetDebugger().GetRegexMatchAnsiPrefix() : "", 
             target_sp ? target_sp->GetDebugger().GetRegexMatchAnsiSuffix() : "");

If you prefer. There's not much difference to it. One puts it all in a single call which is good because the scope of the values is limited. The other has 1 conditional instead of two.

Leave it as is. Sounds like you will end up refactoring it anyway.

@junior-jl
Copy link
Contributor

You could do:

  s->PutCStringColorHighlighted(symbol->GetName().GetStringRef(), pattern,
         target_sp ? target_sp->GetDebugger().GetRegexMatchAnsiPrefix() : "", 
         target_sp ? target_sp->GetDebugger().GetRegexMatchAnsiSuffix() : "");

I really thought about doing this, but it looked too long, so I gave up, good to know it's ok.

Leave it as is. Sounds like you will end up refactoring it anyway.

Nice! Thank you for all the help so far!

@DavidSpickett
Copy link
Collaborator

DavidSpickett commented Dec 7, 2023

Wait, one last thing :)

If the prefix and suffix are empty, we can also take this shortcut, right?

If the prefix and suffix are empty. Not prefix or suffix is empty.

If there's no pattern or (the prefix is empty and the suffix is empty) -> just print the string normally.

Also please add a some tests where you set only suffix, and only prefix, neither. To validate that.

@DavidSpickett
Copy link
Collaborator

Or rather, set the suffix to "", prefix to "" and both to "" (since we default to some red/normal sequence don't we).

@junior-jl
Copy link
Contributor

If the prefix and suffix are empty. Not prefix or suffix is empty.

I tried this now and I guess it's not correct. For example, if I have a prefix (red color) and no suffix, the text does not go back to normal and will be forever red. Or am I doing something wrong?

Or rather, set the suffix to "", prefix to "" and both to "" (since we default to some red/normal sequence don't we).

Yes, I'm adding new tests like this now.

@junior-jl
Copy link
Contributor

I tried this now and I guess it's not correct. For example, if I have a prefix (red color) and no suffix, the text does not go back to normal and will be forever red. Or am I doing something wrong?

image

The condition is: if (pattern.empty() || (prefix.empty() && suffix.empty()))

@DavidSpickett
Copy link
Collaborator

I tried this now and I guess it's not correct. For example, if I have a prefix (red color) and no suffix, the text does not go back to normal and will be forever red. Or am I doing something wrong?

This is expected. We don't expect many people to do this but if they do, that's their mistake. A more realistic use might be prefix=!!!! and suffix="". If you wanted to "highlight" matches in a way that wasn't colour, niche, but it's possible with the existing highlighters so might as well go with it.

@junior-jl
Copy link
Contributor

I tried this now and I guess it's not correct. For example, if I have a prefix (red color) and no suffix, the text does not go back to normal and will be forever red. Or am I doing something wrong?

This is expected. We don't expect many people to do this but if they do, that's their mistake. A more realistic use might be prefix=!!!! and suffix="". If you wanted to "highlight" matches in a way that wasn't colour, niche, but it's possible with the existing highlighters so might as well go with it.

I see now. Thank you for the explanation.

The last commit adds the test cases and change the if condition.

@DavidSpickett DavidSpickett merged commit c90cb6e into llvm:main Dec 8, 2023
@DavidSpickett
Copy link
Collaborator

This is failing tests but I'm pretty sure it's a missing nullptr check. I'll fix it.

@DavidSpickett
Copy link
Collaborator

ffd61c1

@DavidSpickett
Copy link
Collaborator

Ok bots look good, so congratulations on landing this change! 🥳

(you will have some email from build bots, it's a good idea to go look at the failures just to see what they look like, but I have fixed them)

Sounds like you already have ideas for follow ups. You should also add a release note (https://github.com/llvm/llvm-project/blob/main/llvm/docs/ReleaseNotes.rst#changes-to-lldb) at some point when you're happy with the feature overall. Or in other words, when it can be described in 1 or 2 sentences without a lot of caveats.

I'm trying the feature myself and noticing the thing you said about the symbol context. Even when looking up symbols on a running process, when we print a Summary: line, target_sp is always nullptr. Surprising, and a bit annoying as a majority of the symbols in my searches didn't have a Name: line.

(though one can work around this by adding -v)

Could be that there's never been a need for target here, and as you said, what you really want is the debugger settings. So perhaps the better route is to add a link back to those.

@DavidSpickett
Copy link
Collaborator

DavidSpickett commented Dec 8, 2023

Spoke too soon, had to disable the test on Windows: 810d09f

Making it work there may require making function lookup work, not sure.

And Mac OS: 61f1825

Which I think is a matter of finding the right regex for the CHECK. If either of you have access to Windows/Mac, please give it a go.

Otherwise, @JDevlieghere please could you (/one of your colleagues) try the test on Mac and see what can be done?

@talha-10xE
Copy link
Contributor Author

Sure will try to make it work for windows. Hoping it will be as easier to build lldb on windows as it's for linux 🙂

DavidSpickett pushed a commit that referenced this pull request Jan 22, 2024
Follow-up to #69422.

This PR puts all the highlighting settings into a single struct for
easier handling

Co-authored-by: Talha Tahir <talha.tahir@10xengineers.ai>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Colorize output when searching for symbols in lldb
7 participants