We need a solid wasm code size profiler #20

Closed
fitzgen opened this issue Jan 17, 2018 · 11 comments

@fitzgen
Member

fitzgen commented Jan 17, 2018

I hacked up a little script here: https://github.com/fitzgen/source-map-mappings/blob/master/source-map-mappings-wasm-api/who-calls.py

But that was just a hacky thing out of necessity. What I really want is:

  • Lists for functions, crates, and data segments, sorted by largest size

    • This is table-stakes for a size profiler
    • The who-calls.py script does this, and shows callers as well, displayed as a tree
    • When looking at functions, I should be able to select a function from crate "dependency_i_dont_care_about" and have all of the functions in that crate collapse into a single item representing that whole crate.
  • Tree map visualizations grouped by function, crate, and data segment

  • List the path(s) in the call graph from any given private function back to an exported function

    • This helps me understand why some function is emitted in the .wasm even though I didn't expect it to be
    • who-calls.py does a very basic version of this
    • I want to see the Rust source for each of these call graph edges, so I can determine if maybe they really can't happen in practice, even if wasm-gc can't statically prove that the code is dead
  • Dominator trees to tell me which function F transitively keeps the most code size from other child functions "rooted" in the call graph, even if F itself has a small code size (and so you might otherwise ignore it). By "rooted" I mean "reachable from exported functions in the call graph", i.e. it won't get removed by wasm-gc.

  • I want to know which logical functions got inlined into any given physical function in the wasm

    • list inlined functions by the total code size of all of their inlined blocks
    • and how many times a particular logical function was inlined across the whole .wasm file?
    • how much space would I save if this logical function was never inlined, and instead existed as a physical function?
  • Similarly, I want to be able to investigate monomorphizations of generic functions

    • list generic functions by the total code size of all of their monomorphizations
    • how many times was this generic function monomorphized?
    • how much space would I save if I switched to dynamic dispatch and trait objects for this generic function instead of monomorphization?
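The dominator tree item above can be sketched roughly like this. It is only a sketch under assumptions, not what who-calls.py does: it assumes the call graph has already been extracted as a dict from caller to callees, and the names `immediate_dominators`, `retained_sizes`, and the `<exports>` pseudo-root are made up for illustration. It uses the simple set-based dataflow formulation of dominators rather than a faster algorithm.

```python
from collections import defaultdict

ROOT = "<exports>"  # hypothetical pseudo-node that "calls" every exported function


def immediate_dominators(calls, exports):
    """calls: dict caller -> list of callees; exports: exported function names."""
    graph = dict(calls)
    graph[ROOT] = list(exports)

    # Only functions reachable from an export are "rooted" (would survive wasm-gc).
    nodes, stack = set(), [ROOT]
    while stack:
        f = stack.pop()
        if f not in nodes:
            nodes.add(f)
            stack.extend(graph.get(f, ()))

    preds = defaultdict(set)
    for caller in nodes:
        for callee in graph.get(caller, ()):
            preds[callee].add(caller)

    # dom[n] = set of nodes that appear on every path from ROOT to n.
    dom = {n: set(nodes) for n in nodes}
    dom[ROOT] = {ROOT}
    changed = True
    while changed:
        changed = False
        for n in nodes - {ROOT}:
            new = set.intersection(*(dom[p] for p in preds[n])) | {n}
            if new != dom[n]:
                dom[n], changed = new, True

    # Strict dominators of n form a chain; the closest one has the largest dom set.
    return {
        n: max(dom[n] - {n}, key=lambda d: len(dom[d]))
        for n in nodes - {ROOT}
    }


def retained_sizes(calls, exports, sizes):
    """Size of each function plus everything it dominates in the call graph."""
    idom = immediate_dominators(calls, exports)

    def depth(n):
        k = 0
        while n in idom:
            n, k = idom[n], k + 1
        return k

    retained = {f: sizes.get(f, 0) for f in idom}
    # Fold children into parents, deepest dominator-tree nodes first.
    for f in sorted(idom, key=depth, reverse=True):
        if idom[f] in retained:
            retained[idom[f]] += retained[f]
    return retained


calls = {
    "main": ["small", "util"],
    "small": ["big_a", "big_b"],
    "big_a": ["util"],
    "other": ["util"],
    "dead": ["big_a"],  # not reachable from any export
}
sizes = {"main": 10, "small": 5, "big_a": 100, "big_b": 200,
         "util": 50, "other": 8, "dead": 999}
retained = retained_sizes(calls, ["main", "other"], sizes)
```

In this made-up graph `small` is only 5 bytes on its own but retains 305, because `big_a` and `big_b` are only reachable through it. `util` is reachable through several independent paths, so nothing but the pseudo-root dominates it.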
@fitzgen
Member Author

fitzgen commented Jan 17, 2018

The inlined functions feature requires some kind of debugging information that does not exist. Source maps don't have that information; however, DWARF does (section 3.3.8.2):

Each inline expansion of a subroutine is represented by a debugging information
entry with the tag DW_TAG_inlined_subroutine. Each such entry is a direct
child of the entry that represents the scope within which the inlining occurs.

Each inlined subroutine entry may have either a DW_AT_low_pc and
DW_AT_high_pc pair of attributes or a DW_AT_ranges attribute whose values
encode the contiguous or non-contiguous address ranges, respectively, of the
machine instructions generated for the inlined subroutine (see Section 2.17
following). An inlined subroutine entry may also contain a DW_AT_entry_pc
attribute, representing the first executable instruction of the inline expansion (see
Section 2.18 on page 55).
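Given entries like the ones DWARF describes, the "list inlined functions by total size" report could look roughly like the sketch below. It assumes some DWARF reader has already turned each DW_TAG_inlined_subroutine entry into a record with the origin function's name and half-open `[low_pc, high_pc)` ranges. The `InlinedSubroutine` class and `inlined_totals` function are hypothetical names, not part of any existing tool.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class InlinedSubroutine:
    # Hypothetical record for one DW_TAG_inlined_subroutine entry.
    origin: str                    # name of the logical function that was inlined
    ranges: List[Tuple[int, int]]  # half-open [low_pc, high_pc) address ranges


def inlined_totals(entries):
    """Return (origin, total_bytes, times_inlined) rows, largest total first."""
    totals = defaultdict(lambda: [0, 0])
    for entry in entries:
        row = totals[entry.origin]
        row[0] += sum(high - low for low, high in entry.ranges)
        row[1] += 1
    return sorted(
        ((name, size, count) for name, (size, count) in totals.items()),
        key=lambda r: r[1],
        reverse=True,
    )


entries = [
    InlinedSubroutine("alloc::vec::push", [(0, 10)]),
    InlinedSubroutine("core::fmt::write", [(10, 40), (60, 70)]),
    InlinedSubroutine("alloc::vec::push", [(100, 115)]),
]
report = inlined_totals(entries)
```

Note that nested inlined subroutines would be counted once per nesting level with this approach, so a real tool would need to decide whether to attribute bytes to the innermost or outermost expansion.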

@est31

est31 commented Jan 17, 2018

There is cargo-bloat, which has some initial work on this, but it doesn't support wasm (yet). cc @RazrFalcon.

@RazrFalcon

AFAIU, we need to parse WASM binaries(?), and that's out of scope for cargo-bloat, because it works on top of goblin/object. So first we need a crate that can parse WASM binaries.

For more advanced features we need better name mangling in rustc, because the current one does not preserve all the information. See rust-lang/rust#45691 (comment)
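To give a sense of how much parsing the basic per-function sizes need, here is a minimal sketch that walks the sections of a wasm binary and reads the body sizes out of the code section (section id 10). It does not use goblin, object, or any other crate mentioned in this thread. The function names are illustrative, and it does no validation beyond checking the magic bytes.

```python
def read_uleb128(buf, pos):
    """Decode an unsigned LEB128 integer; return (value, next_position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def function_body_sizes(wasm):
    """Return the byte size of each function body, in code-section order."""
    if wasm[:4] != b"\0asm":
        raise ValueError("not a wasm binary")
    pos = 8  # skip the 4-byte magic and the 4-byte version
    while pos < len(wasm):
        section_id = wasm[pos]
        size, pos = read_uleb128(wasm, pos + 1)
        end = pos + size
        if section_id == 10:  # code section
            count, pos = read_uleb128(wasm, pos)
            sizes = []
            for _ in range(count):
                body_size, pos = read_uleb128(wasm, pos)
                sizes.append(body_size)
                pos += body_size  # skip locals and instructions
            return sizes
        pos = end  # not the code section: jump over its contents
    return []


# Hand-assembled module with two functions of type () -> ().
module = (
    b"\0asm" + b"\x01\x00\x00\x00"
    + bytes([0x01, 0x04, 0x01, 0x60, 0x00, 0x00])  # type section
    + bytes([0x03, 0x03, 0x02, 0x00, 0x00])        # function section
    + bytes([0x0A, 0x08, 0x02,                     # code section, 2 bodies
             0x02, 0x00, 0x0B,                     # body 0: no locals, end
             0x03, 0x00, 0x01, 0x0B])              # body 1: no locals, nop, end
)
body_sizes = function_body_sizes(module)
```

Each size excludes the length prefix itself. The i-th body corresponds to function index i plus the number of imported functions, which is what you would use to match bodies against entries in the name section.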

@mgattozzi
Contributor

mgattozzi commented Jan 18, 2018

I'm not sure if parity-wasm could do what you need @RazrFalcon but maybe @pepyakin could shed some light on that or if they know of some other crate that might if it exists today.

@pepyakin
Member

Yeah, parity-wasm is capable of parsing WASM binaries, although it still lacks support for reading the name section, which might be useful for this use case. However, there is an ongoing PR.

@pepyakin
Member

Btw, @emk already started something called wasm-bloat!

@pepyakin
Member

cc WebAssembly/wabt#724

@emk

emk commented Jan 22, 2018

@pepyakin wasm-bloat is not moving very quickly, because I'm juggling too many projects. But I see that you finished and merged my work on paritytech/parity-wasm#132 (thank you!), which means that parity-wasm can now parse all the function bodies in a WASM file, and match them to mangled function names in the source code.

This wouldn't allow you to detect inlined functions (that would require source map support, I think), but for function-level stuff, it means that parity-wasm should give you almost everything you need to write a nice bloat tool. You could de-mangle the Rust names (there's a crate for that), organize them by module, and figure out the size of each in maybe 250 lines of code, I'd bet.
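The "organize them by module and figure out the size of each" step might look something like the sketch below, assuming you already have (demangled name, body size) pairs from a parser. The `size_by_crate` function and the regex are illustrative only. Taking the first path segment as the crate is a crude heuristic that gives up on names such as trait impls written as `<T as Trait>::method`.

```python
import re
from collections import Counter

# Heuristic: optional leading "<", then the first path segment before "::".
_CRATE = re.compile(r"^<?([A-Za-z_][A-Za-z0-9_]*)::")


def size_by_crate(functions):
    """functions: iterable of (demangled_name, size) pairs.

    Returns (crate, total_size) rows, largest first.
    """
    totals = Counter()
    for name, size in functions:
        match = _CRATE.match(name)
        totals[match.group(1) if match else "<unknown>"] += size
    return totals.most_common()


functions = [
    ("core::fmt::write", 300),
    ("<alloc::vec::Vec<T>>::push", 120),
    ("my_app::run", 80),
    ("core::str::from_utf8", 50),
    ("<T as core::fmt::Debug>::fmt", 40),
]
by_crate = size_by_crate(functions)
```

Grouping by a longer prefix (say, the first two segments) would give per-module totals instead of per-crate ones.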

I may continue work on wasm-bloat at some point if nobody beats me to adding this to cargo bloat, but I have a couple of other major disruptions coming up this winter that will reduce my open source bandwidth for a while.

@fitzgen changed the title from "We need a solid wasm code size profile" to "We need a solid wasm code size profiler" on Feb 16, 2018
@fitzgen
Member Author

fitzgen commented Feb 16, 2018

Ok, I've got the start of a code profiler that does a lot of what I laid out in the original comment.

I've filed lots of issues, too, for folks who want to help build it! Happy to mentor / help anyone get up to speed with it.

https://github.com/fitzgen/svelte

@anp

anp commented Feb 16, 2018

@fitzgen small heads up that svelte is also the name of a JS UI framework: https://github.com/sveltejs/svelte. Name collisions are unavoidable, and it doesn't seem like they're actively exploring wasm right now, but I do see the potential for confusing searches in the future.

@alexcrichton
Contributor

Given the advent and integration of Twiggy, I'm gonna close this!


8 participants