Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

demangle: More elaborated demangling for Rust #1625

Merged
merged 2 commits into from Feb 6, 2023
Merged

demangle: More elaborated demangling for Rust #1625

merged 2 commits into from Feb 6, 2023

Conversation

ChoKyuWon
Copy link
Contributor

This commit fixes following things:

  1. Ignore traits (See Rust symbol demangling for Traits #1620)
  2. Demangle '..' to '::'
  3. Consume extra '_' at starting point of string
  4. Add placeholder to support Rust v0 mangling rule (start with '_R')

However, because of 2, it replace the '..' to '::', it need a write permission for the source string. It makes segfault for the demangle test because the source string is const.

Signed-off-by: ChoKyuWon kyuwoncho18@gmail.com

Copy link
Owner

@namhyung namhyung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. Can you please split the ".." part and trait handling?

utils/demangle.c Outdated Show resolved Hide resolved
Copy link
Owner

@namhyung namhyung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rewrite the header line of the second commit message? The same message confuses readers and I'd prefer being more specific like 'demangle: convert ".." to "::" for Rust'.

utils/demangle.c Outdated
num = dollar - p;
dd_append_len(dd, p, num);
/* consume extra '_' before '$' */
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this. How can we make sure if it's not a valid identifier. For example it might have my_mod_ as a module name, then any dollar sign after that would delete the trailing underscore, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I'll revert this code. However, for clearer context, the first underscore would be deleted IMO.

return xstrdup(str);
}

// TODO: implement demangling Rust v0 mangling
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this. Could you please create an issue with this? It'd be great if it can have some examples.

utils/demangle.c Outdated
@@ -1469,6 +1470,11 @@ static int dd_source_name(struct demangle_data *dd)
if (add_name)
dd_append_len(dd, p, num);
__dd_consume_n(dd, num, NULL);
separator = strstr(dd->new, "..");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid it might have multiple "..". Does this handle it properly?

Copy link
Contributor Author

@ChoKyuWon ChoKyuWon Feb 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why, but in my test case (position at last of demangle_rust1), it seems to work properly. But to make sure, I'll add a loop for that.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it worked by luck. It just converts a single instance, but later dd_source_name() call would convert it again. If the original name contains more than two ".." and it has a dd_source_name() after that, not all ".." would be converted.

Copy link
Contributor Author

@ChoKyuWon ChoKyuWon Feb 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. Thanks for the good point.

Copy link
Owner

@namhyung namhyung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some explanation in the second commit message. Like when it needs to be converted or just include an example.

Also I think we should add a test for more complex case like in the #1620.

utils/demangle.c Outdated
separator[1] = ':';
}
else
break;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I prefer reducing the nesting level by checking the break condition early. Something liike below..

while (true) {
    separator = strstr(dd->new, "..");
    if (separator == NULL)
        break;
    separator[0] = ':';
    separator[1] = ':';
}

utils/demangle.c Outdated
@@ -1382,6 +1382,7 @@ static int dd_source_name(struct demangle_data *dd)
{
int num = dd_number(dd);
char *dollar;
char *separator;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may put it inside the loop below.

utils/demangle.c Outdated
}
else
break;
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another concern is that it's suboptimal. It checks ".." whenever it calls dd_source_name(). I think we can do it only when the dollar sign was found. Also it doesn't need to search the whole (new) string. Could we check the newly added region only?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that the ".." is always inside the <>. I'll try to find a counterexample, and if there's nothing, I'll fix this commit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I collect the LLVM IR of large-scale projects and search the pattern like this:

$ cat *.ll | grep "define[^w]*\.\." | grep -v "$"

and it returns nothing.
So ".." only generated with a dollar sign is quite a safe guess. I'll work on the optimal solution.

Copy link
Owner

@namhyung namhyung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work. I only have a few nitpicks.

@@ -1426,7 +1426,6 @@ static int dd_source_name(struct demangle_data *dd)
/* check special symbol mappings (e.g. '$LT$') for Rust */
while (dollar != NULL && dollar < end) {
bool found = false;

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary change. Please leave it.

DEMANGLE_TEST("_ZN3foo3bar17h05af221e174051e9E", "foo::bar");
DEMANGLE_TEST(
"_ZN61_$LT$$RF$std..io..stdio..Stdout$u20$as$u20$std..io..Write$GT$9write_fmt17h75c561f414a62159E",
"_<&std::io::stdio::Stdout>::write_fmt");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be moved to the next commit.

@@ -1426,8 +1426,17 @@ static int dd_source_name(struct demangle_data *dd)
/* check special symbol mappings (e.g. '$LT$') for Rust */
while (dollar != NULL && dollar < end) {
bool found = false;
char *separator = p;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general guide is to have a blank line between declaration and body.

Copy link
Owner

@namhyung namhyung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Can you please rebase it onto the current master?

@namhyung
Copy link
Owner

namhyung commented Feb 6, 2023

You need to add a blank line after the commit header to separate subject and body. Please take a look at other commit messages.

This commit fixes following things:
1. Ignore traits (See #1620)
2. Add placeholder to support Rust v0 mangling rule (start with '_R')
(See #1626)

Signed-off-by: ChoKyuWon <kyuwoncho18@gmail.com>
For Rust mangling, they use ".." as separator.
This commit replace it to standard separator for more intuitive
transformation.

Signed-off-by: ChoKyuWon <kyuwoncho18@gmail.com>
@namhyung namhyung merged commit 2083cdc into namhyung:master Feb 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants