-
Notifications
You must be signed in to change notification settings - Fork 14.8k
[lldb] Add support for updating string during debug process #67782
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Depends on: #67309 (comment) |
You can test this locally with the following command:darker --check --diff -r 9ffecef1c651a5c1a3f284b3257ec01ff526b49c...ccc9fb6be2f390cd894e0632cfded98f329f3059 lldb/test/API/python_api/value/change_values/libcxx/string/TestChangeStringValue.py View the diff from darker here.--- TestChangeStringValue.py 2024-04-10 11:45:49.000000 +0000
+++ TestChangeStringValue.py 2024-04-10 11:53:32.558069 +0000
@@ -5,12 +5,12 @@
import lldb
from lldbsuite.test.decorators import *
from lldbsuite.test.lldbtest import *
from lldbsuite.test import lldbutil
+
class LibcxxChangeStringValueTestCase(TestBase):
-
def setUp(self):
# Call super's setUp().
TestBase.setUp(self)
def do_test_value(self, frame, var_name, new_value, str_prefix):
@@ -19,38 +19,43 @@
# update whole string
err = lldb.SBError()
result = str_value.SetValueFromCString(new_value, err)
self.assertTrue(result, "Setting val returned error: {}".format(err))
- result = str_value.GetSummary() # str_value is a summary
+ result = str_value.GetSummary() # str_value is a summary
expected = '{}"{}"'.format(str_prefix, new_value)
- self.assertTrue(result == expected, "Got value: ({}), expected: ({})"
- .format(result, expected))
+ self.assertTrue(
+ result == expected,
+ "Got value: ({}), expected: ({})".format(result, expected),
+ )
@add_test_categories(["libc++"])
@expectedFailureAll(oslist=["windows"], archs=["arm"], bugnumber="llvm.org/pr24772")
- @expectedFailureAll(archs=["arm"]) # arm can't jit
+ @expectedFailureAll(archs=["arm"]) # arm can't jit
def test(self):
"""Test that we can change values of libc++ string."""
self.build()
self.runCmd("file " + self.getBuildArtifact("a.out"), CURRENT_EXECUTABLE_SET)
bkpt = self.target().FindBreakpointByID(
lldbutil.run_break_set_by_source_regexp(
- self, "Set break point at this line."))
+ self, "Set break point at this line."
+ )
+ )
self.runCmd("run", RUN_SUCCEEDED)
# Get Frame #0.
target = self.dbg.GetSelectedTarget()
process = target.GetProcess()
self.assertState(process.GetState(), lldb.eStateStopped)
- thread = lldbutil.get_stopped_thread(
- process, lldb.eStopReasonBreakpoint)
+ thread = lldbutil.get_stopped_thread(process, lldb.eStopReasonBreakpoint)
self.assertTrue(
thread.IsValid(),
- "There should be a thread stopped due to breakpoint condition")
+ "There should be a thread stopped due to breakpoint condition",
+ )
frame0 = thread.GetFrameAtIndex(0)
self.assertTrue(frame0.IsValid(), "Got a valid frame.")
- for var_name, str_prefix in zip(("s", "l", "ws", "wl", "u16s", "u32s"),
- ('', '', 'L', 'L', 'u', 'U')):
+ for var_name, str_prefix in zip(
+ ("s", "l", "ws", "wl", "u16s", "u32s"), ("", "", "L", "L", "u", "U")
+ ):
self.do_test_value(frame0, var_name, "new_value", str_prefix)
|
'[1] = (first = 2, second = "smart")', | ||
'[2] = (first = 3, second = "!!!")', | ||
'[3] = (first = 85, second = "goofy")', | ||
'size=4', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is it about this change that is defeating the ValueObject printer from compressing this output onto one line? It looks like the contents that get printed are the same, so there's something about switching from a Summary provider to a child provider that's causing problems. We should fix that as people are really picky about variable printing being as space efficient as possible.
To clear up terminology... Strings had data formatters before AND after this change. The difference is that you've switched from a "Summary Provider" data formatter to a "Synthetic Child Provider" data formatter. It looks like you've made the printing of std::strings less space efficient. That shouldn't be necessary, and isn't desirable. We should figure out why that's happening and fix that before this change is going to not cause complaints. |
BTW, I have no problem with the general direction of this change. It makes a lot more sense to ask a synthetic child provider to change a value - since it does represent the value of the ValueObject - rather than the summary which is just some free-form text. And being able to change characters in a string seems a reasonable thing to do, so switching the std::string comprehension from a Summary provider to a Synthetic Child Provider is the way to do that. So that part if fine. But std::strings abound in code, and so if we're going to make this change I don't think we can make that printing less space efficient, which this change seems to have done. We should figure out why that's the case and fix for this to be a really good change. |
As there is mentioned in a comment for
So, there is a condition for that:
This patch adds StringSynthetic and this synthetic |
I don't know what the status of lldb-mi is, but lldb-vscode which does the same job (be a DAP server) is under active development. So "this adaptor doesn't use the better method to do X" shouldn't be a reason to not employ the better method. We should just fix the adaptors. I entertained the idea of "value setting summaries" but rejected it after some thought. Summaries are unstructured short developer notes about the value, they really aren't suitable as general stand-in's for the value. So that's really not an appropriate route for changing values. But the Synthetic Child Providers are a re-presentation of the value, so they are a suitable route to do value changing for more complex object types. I haven't read carefully through the ValueObject printer in a while, but we do compress the printing of "small" structures in some cases, so it should be possible to get it not to expand this output. I think people will not see it as a positive if these common objects start taking more space, so it would be good to investigate how to do that. |
433d397
to
374784c
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
On Oct 5, 2023, at 1:05 AM, Pavel Kosov ***@***.***> wrote:
BTW, I have no problem with the general direction of this change. It makes a lot more sense to ask a synthetic child provider to change a value - since it does represent the value of the ValueObject - rather than the summary which is just some free-form text. And being able to change characters in a string seems a reasonable thing to do, so switching the std::string comprehension from a Summary provider to a Synthetic Child Provider is the way to do that. So that part if fine.
But std::strings abound in code, and so if we're going to make this change I don't think we can make that printing less space efficient, which this change seems to have done. We should figure out why that's the case and fix for this to be a really good change.
So, is it Ok to use current SetValueFromCString api from ValueObject to ask synthetic provider to update the underlying string?
I'm going to use SCP for "synthetic child provider" as I was getting tired of typing it out... And I think you know most of what I'm saying here, but I wanted to make sure we are talking about it the same way...
Since "SetValueFromCString" is an API on ValueObject, we are free to give that whatever meaning makes sense.
However, I think it's confusing here to start with the model that ValueObjects with SCPs have "underlying strings". The vast majority of data types represented by synthetic child providers have no single underlying string, they are mapping aggregates to aggregates or reinterpreting various regions (contiguous or not) of memory as though it were an aggregate of some sort. So there will for the most part be no "underlying string".
More generally, we don't really do "SetValueFromCString" for aggregate types, only for scalar values. So I don't see how SetValueFromCString is a useful general API on the aggregate type that actually holds the SCP. You really have to specify the particular child you want to change.
What does makes pretty clear sense is that if SetValueFromCString is called on one of the synthetic children produced by the ValueObject holding the SCP, the child ValueObject should ask the parent's SCP: "can you change my value to the incoming string". The SCP API would have to grow another interface "SetValueOfChildFromCString" to support this. The SCP has to know how to do that in a way that round-trips (i.e. if I make a second ValueObject from the same variable, it will also show the changed value). If it can do that, then it will make the change. Note, it may very well have to change others of the synthetic child values to make that happen, depending on how the child is generated from the underlying data being represented. It's also not guaranteed to be 1-1, since there might be many ways to change the underlying data that produce the requested change on the synthetic child. We should probably decide whether "I can change to represent this, but not uniquely" should be an error or not.
You mentioned previously that we may add SetSummaryFromCString api - in fact currently this is what I am doing - changing the whole string through its summary (please check attached gif for example).
I think this is also a useful API though it is a bit confusingly named. Sadly I can't think of a better one that isn't overly verbose. You aren't really setting the summary - that's not a terribly useful operation if the underlying value doesn't produce that new summary when asked for it again. It's really "ask the summary provider if it knows how to change the underlying data such that it will produce this new summary string". But anyway, we can make this clear in the API docs.
By the way, as a side note, watching the part of your example where you change the raw string guts, it looks like we don't update summaries of a ValueObject if one of its children gets changed. Be good to file bug on that one so we don't forget.
But the problem with the new API is the same as for the changing characters through SBValue::GetData - IDE doesn't support it
In the case of std::strings, which seems to be your primary motivator here, I think considering the string value to be the summary makes the most sense. There really isn't a scalar type of "a buffer of memory starting at 0x123 and ending sometime later (maybe signaled by there being a `\0` somewhere)." So representing std::string with a value providing SCP is not at all natural. Also in this case, it's useful to be able to see its raw contents easily in the UI as you were doing in your example, and if you swapped the std::string ValueObject for some kind of a "C-String" ValueObject, it would be awkward to get back to the raw fields.
The UI knows that it is inserting a string it got from GetSummary in that slot in the UI, so it can easily know to call SetSummaryFromCString when that slot is edited. I am not in favor of using a strained model for as ubiquitous a data type as std::string in lldb just because we don't want to have to teach the UI how to handle it properly.
Jim
… —
Reply to this email directly, view it on GitHub <#67782 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ADUPVW7EA2Y2567SD5JCJMTX5ZS6PAVCNFSM6AAAAAA5MH6OUGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTONBYGMZTCNJXHE>.
You are receiving this because you commented.
|
This is the last patch needed for adding an ability to update std::string/wstring/etc during debug process. std::string/std::wstring/std::u16(32)string synthetic frontend implemented Also tests for the frontend added. ~~ Huawei RRI, OS Lab
374784c
to
ccc9fb6
Compare
Hello, @jimingham , first of all - sorry for the long delay in reply. I carefully read through all your messages once again and found, that I totally misunderstood some places, i.e. "Why isn't it enough to have the ValueObjectSynthetic's SetValueFromCString do this" - I think it should be enough, I'll fix it.
I agree with this, but I think there is not so much aggregates for which it makes sense to change their length as for strings (at least in libcxx) - I mean it is natural to update whole string to completely different value, but it is not natural to do so for e.g. vector. In case of strings, one might want to set string longer, than the one he has now for the debug purposes, so this will indeed invalidate code, that relies on the pointers that was obtained through
Sorry, I think that I didn't express my thoughts carefully, by the underlying string I didn't mean, that we have some string in the SCP or ValueObjects, I meant the strings in the code that is under debug.
I'm not sure that this bug might be reproduced without the string example, I don't know which type have the summary which represent all its children. Is it ok, to file a bug with current strings example or how to do it better?
And in the end, may I kindly ask you to clarify your position about these changes please? Do you suggest to return to the |
@jimingham Please take a look: #67782 (comment) |
Target *target = exe_ctx.GetTargetPtr(); | ||
StackFrame *frame = exe_ctx.GetFramePtr(); | ||
|
||
if (target && frame) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the right way to implement this. By relying on the base name of the backend and using that directly you will only be able to handle cases where the std::string is a local variable or function argument. It won't work on std::strings inside other objects - you would need the full path. It also won't work for globals, which don't have frames, even though there's no reason it shouldn't.
A better approach would be to get the address of the object you are trying to change, and using that address properly cast to do the assignment. That way you wouldn't have to care how that value got referred to. You should also make sure that address is of eAddressTypeLoad, since running expressions on data in the host isn't likely to work.
For the most part we try not to run expressions in the ValueObject code, but setting values is less likely to happen for large objects, or to lots of variables at once the way rendering their values/Summaries, etc does. So I'm not too worried about that.
But you certainly shouldn't run expressions on a ValueObject's behalf that allow more than one thread to run. Otherwise you get into the state where I stop in thread A, see something interesting, go to thread B and change a value and switch back to thread A and it's not where I left it!
It's not just pointers, of course. If you've done: size_t string_len = my_string.size(); then string_len will be wrong. But I agree, that's more the lookout of whoever is messing with their strings while running.
The more I think about it, the more it seems that SetSummaryFromCString is just the wrong API to use to change the value of something. After all, Summaries don't actually represent the "value" of the ValueObject, they are just whatever bits of that value the Summary Provider decided to extract, plus whatever additional text they wanted to provide. Summaries may not even be about the value at all, for instance for checked pointers they might just say the checked state. It's clear that's not the entity you would ask to change that object's value.
So I don't think trying to change underlying values makes sense as part of the Summary feature. Changing values of a ValueObject with a Synthetic Child Provider definitely makes sense if directed to the children the synthetic child provider shows. If the SCP for some value produces an If we were presenting the std::string contents as an array of char's then changing the individual characters by addressing the members of the array would make sense. Note, that's also the only interface that will work for std::string's in full generality because std::strings aren't required to hold null-terminated C-strings. I'm a little bothered by the notion of changing the whole of a SyntheticValue by providing some C-string. That makes sense for a restricted representation of the new values of something that stores string-y objects, but doesn't make sense without further syntax for anything else. It's also not the right implementation internally for changing the child provided by an SCP, since it that case you'd call SetValueFromCString on the child, and route that to the parent. But having this in place wouldn't preclude adding machinery to get the SCP to effect a change for one of the children they produced. And I guess we can make some attempt to define what this actually means the second time someone uses it. |
Before this update strings has only DataFormatters, and therefore the way they were printed was slightly different (see comment here: https://github.com/llvm/llvm-project/blob/main/lldb/source/DataFormatters/FormatManager.cpp#L498), so tests which rely on this formatting also changed. std::string/std::wstring/std::u16(32)string synthetic frontend implemented Also tests for the frontend added.
~~
Huawei RRI, OS Lab