New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
demangle: More elaborated demangling for Rust #1625
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. Can you please split the ".." part and trait handling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you rewrite the header line of the second commit message? The same message confuses readers and I'd prefer being more specific like 'demangle: convert ".." to "::" for Rust'.
utils/demangle.c
Outdated
num = dollar - p; | ||
dd_append_len(dd, p, num); | ||
/* consume extra '_' before '$' */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this. How can we make sure if it's not a valid identifier. For example it might have my_mod_
as a module name, then any dollar sign after that would delete the trailing underscore, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. I'll revert this code. However, for clearer context, the first underscore would be deleted IMO.
return xstrdup(str); | ||
} | ||
|
||
// TODO: implement demangling Rust v0 mangling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this. Could you please create an issue with this? It'd be great if it can have some examples.
utils/demangle.c
Outdated
@@ -1469,6 +1470,11 @@ static int dd_source_name(struct demangle_data *dd) | |||
if (add_name) | |||
dd_append_len(dd, p, num); | |||
__dd_consume_n(dd, num, NULL); | |||
separator = strstr(dd->new, ".."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid it might have multiple ".."
. Does this handle it properly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why, but in my test case (position at last of demangle_rust1
), it seems to work properly. But to make sure, I'll add a loop for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it worked by luck. It just converts a single instance, but later dd_source_name()
call would convert it again. If the original name contains more than two ".." and it has a dd_source_name()
after that, not all ".." would be converted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. Thanks for the good point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add some explanation in the second commit message. Like when it needs to be converted or just include an example.
Also I think we should add a test for more complex case like in the #1620.
utils/demangle.c
Outdated
separator[1] = ':'; | ||
} | ||
else | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I prefer reducing the nesting level by checking the break condition early. Something liike below..
while (true) {
separator = strstr(dd->new, "..");
if (separator == NULL)
break;
separator[0] = ':';
separator[1] = ':';
}
utils/demangle.c
Outdated
@@ -1382,6 +1382,7 @@ static int dd_source_name(struct demangle_data *dd) | |||
{ | |||
int num = dd_number(dd); | |||
char *dollar; | |||
char *separator; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may put it inside the loop below.
utils/demangle.c
Outdated
} | ||
else | ||
break; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another concern is that it's suboptimal. It checks ".." whenever it calls dd_source_name()
. I think we can do it only when the dollar sign was found. Also it doesn't need to search the whole (new) string. Could we check the newly added region only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that the ".." is always inside the <>. I'll try to find a counterexample, and if there's nothing, I'll fix this commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I collect the LLVM IR of large-scale projects and search the pattern like this:
$ cat *.ll | grep "define[^w]*\.\." | grep -v "$"
and it returns nothing.
So ".." only generated with a dollar sign is quite a safe guess. I'll work on the optimal solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your work. I only have a few nitpicks.
@@ -1426,7 +1426,6 @@ static int dd_source_name(struct demangle_data *dd) | |||
/* check special symbol mappings (e.g. '$LT$') for Rust */ | |||
while (dollar != NULL && dollar < end) { | |||
bool found = false; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary change. Please leave it.
DEMANGLE_TEST("_ZN3foo3bar17h05af221e174051e9E", "foo::bar"); | ||
DEMANGLE_TEST( | ||
"_ZN61_$LT$$RF$std..io..stdio..Stdout$u20$as$u20$std..io..Write$GT$9write_fmt17h75c561f414a62159E", | ||
"_<&std::io::stdio::Stdout>::write_fmt"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be moved to the next commit.
@@ -1426,8 +1426,17 @@ static int dd_source_name(struct demangle_data *dd) | |||
/* check special symbol mappings (e.g. '$LT$') for Rust */ | |||
while (dollar != NULL && dollar < end) { | |||
bool found = false; | |||
char *separator = p; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general guide is to have a blank line between declaration and body.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Can you please rebase it onto the current master?
You need to add a blank line after the commit header to separate subject and body. Please take a look at other commit messages. |
For Rust mangling, they use ".." as separator. This commit replace it to standard separator for more intuitive transformation. Signed-off-by: ChoKyuWon <kyuwoncho18@gmail.com>
This commit fixes following things:
However, because of 2, it replace the '..' to '::', it need a write permission for the source string. It makes segfault for the demangle test because the source string is const.
Signed-off-by: ChoKyuWon kyuwoncho18@gmail.com