New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Demangle Function Names #270
Conversation
387 ┊ 13.74% ┊ func[2] | ||
378 ┊ 13.42% ┊ ⤷ wee_alloc::alloc_with_refill::hb32c1bbce9ebda8e | ||
387 ┊ 13.74% ┊ func[2]: wee_alloc::alloc_with_refill::hb32c1bbce9ebda8e | ||
378 ┊ 13.42% ┊ ⤷ code[2]: wee_alloc::alloc_with_refill::hb32c1bbce9ebda8e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's what we see now when running dominators
now 😸
-25 ┊ data[1] | ||
-25 ┊ data[2] | ||
+15 ┊ hello | ||
+15 ┊ code[0]: hello | ||
+15 ┊ import env::rust_oom |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe give this and custom section ...
a prefix like the others too.
Sorry for the delay -- I'll get to this tomorrow for sure. It is a bit bigger than I expected it to be :-p |
1 similar comment
Sorry for the delay -- I'll get to this tomorrow for sure. It is a bit bigger than I expected it to be :-p |
(Woops, mis-click) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @data-pup
I have concerns about the usability/ergonomics of this new output, and I'm not sure it is an improvement over what we have right now. Additionally, non-wasm inputs don't have the func vs code distinction, but are still getting the annotation prefix, so their output is particularly worse with this in that they get the annotation prefix but it doesn't even need disambiguation.
I think I would prefer, only on wasm where there is an ambiguity in the first place, output that is something like
123 | foo::bar::Baz::quux (func[7])
456 | foo::bar::Baz::quux (code[7])
over putting the annotation in the front.
Even this I am not super happy with, though. I wonder if it makes sense to collapse func[123]
and code[123]
into the same IR item, since no body is used by more than one func and vice versa. They are essentially the same, even if they are split at teh physical binary level (to allow eager type checking and validation). Twiggy's original goal/design was to try and model exactly what the binary layout was, but in this case, since there is no sharing, we might want to loosen that for usability.
ir/ir.rs
Outdated
ItemKind::Func(func) => { | ||
if let Some(name) = func.name() { | ||
// format!("{}: {}", func.decorator(), name) | ||
func.decorator().to_string() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this is WIP? both sides of the if/else are the same, but there is some commented out code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wooops, good catch. Thanks 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out that this did get cleaned up in a later commit, was using this to make sure I was still generating the same output when annotations were removed 👍
@@ -517,6 +530,9 @@ pub enum ItemKind { | |||
/// with the executable code. | |||
Data(Data), | |||
|
|||
/// Function definition. Declares the signature of a function. | |||
Func(Function), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty wasm-specific, but I guess that's fine, since other kinds of binaries can simply not generate any IR items of this variant.
ir/ir.rs
Outdated
} | ||
|
||
/// Get this item's decorated name. | ||
/// TODO: This needs better documentation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto here, the TODO
was removed, but I still have the same sentiment. Maybe a simple example, something like:
/// Get this item's decorated name, i.e. 'foo (code[7])'
That might warrant adding some other info to the existing name
method, which I'm also happy to add if you think that makes sense 🙂
7917 ┊ 0.38% ┊ Subroutine[68][131]: je_stats_print | ||
7852 ┊ 0.37% ┊ Subroutine[52][2800]: mallocx | ||
7019 ┊ 0.33% ┊ Subroutine[53][4212]: je_arena_boot | ||
6221 ┊ 0.30% ┊ Subroutine[71][86]: je_malloc_vsnprintf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned that this is a regression in usability for our output, and not an improvement. It seems that every item has an extra prefix now that is not meaningful or actionable to the user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, totally agree. Maybe an alternative would be to represent subroutines in an ELF/etc. file with a different variant that wouldn't display this prefix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar to the Func
variant, other things need not create a Subroutine
variant if that isn't applicable, and in the case of DWARF info, we can hide that behind a cfg
flag anyhow
Totally agree w/ you about the ergonomics of this re: output being iffy. I can refactor the output to resemble the format you provided, & agree w/ you that the extra info would be best listed after the name (if at all). I've noticed that the code/func relationship is confusing to people when explaining some of the output of this tool, so maybe collapsing these together would be useful. Happy to wait for Alex's feedback on this. At that point, it might be worth deciding whether I should try solving this w/ a different approach altogether, or refactoring this into something w/ nicer output 🙂 Thanks for the review @fitzgen ❕ |
code.demangled().unwrap_or(&self.name) | ||
} else { | ||
&self.name | ||
match &self.kind { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another note, while we're in review: I sort of detest that this is called the item's "kind" if it now also stores the name due to this approach. Would this be best referred to as ItemMetadata
or something along those lines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heh, we used kind
in rust-bindgen
, so it is familiar to me, but I don't have much of a first-principles argument here
I think I agree with @fitzgen that the best way to handle this on WebAssembly is to merge the func/code nodes. When looking at the size of a function we may as well account for the bytes that the function type pointer takes up (although it's like 1-3 at most realistically). It'll definitely make reading the output a bit easier as I know in the past I've had to follow chains a few levels deep to get to the real part of a function! |
That sounds good to me! Thanks for the review/advice y'all 🎉 🙏 I'm going to close this for now, because it is probably easiest to start over if we should go that route. If I run into any trouble I'll ping for advice back in #174. |
Thanks @data-pup! Sorry for the delay followed by the back-and-forth :( |
@fitzgen not a problem at all! I know you're all kinda busy w/ gloo & other projects atm 🙂 Most of this work did feel speculative & exploratory anyhow, so it's nice to feel confident about why merging func/code items in the IR model is a better approach than this. |
Fixes #174.
This is a much larger PR than I expected it to be, and apologies on that regard. 😬
I tried a simpler fix here: master...data-pup:simple-function-name-demangle-fix
Notably, we need to include the type of item in the output, so that we can differentiate between code and function items. Unfortunately, this meant that the user would need to specify items by their entire name and type/location when using subcommands that accept arguments.
This change, while more involved, allows us to query for items by name, while still showing the demangled names of functions and code blocks! By moving the name of an item from the
ir::Item
struct into their::ItemKind
field, we can handle these properly.The other notable part is that this involves storing the "annotation" (i.e.
func[1]
) in that struct, and adding anannotated_name
method which is used when displaying items. This is required because thediff
command breaks if we use the annotated name, because we can't tell that two items are related if an item moved in the index space.Because this is a hefty PR, I kept changes in separate commits, to help make reviewing this easier. I'll try and find some time to annotate the important parts over the weekend, or early next week 🙂