
Lazy-loading support #9697

Merged
kripken merged 25 commits into incoming from lazhy on Oct 25, 2019

Conversation

@kripken (Member) commented Oct 24, 2019

This adds emscripten_lazy_load_code(), a function that, when called, blocks (using Asyncify) and loads the complete program. This lets the initial download omit code that can be proven to be used only after those calls.

How this works: we emit the initial downloaded wasm after modifying it with Binaryen to assume those calls never rewind, since we only unwind using that binary (the optimizer can then remove a lot of code). We also emit the later downloaded wasm optimized the other way, assuming we only rewind and never unwind. These transforms are done using the ModAsyncify passes from Binaryen, WebAssembly/binaryen#2404.

A new option ASYNCIFY_LAZY_LOAD_CODE is necessary to use this (as well as ASYNCIFY itself).

The test shows an example where the initial download is less than half the size of the later download. Note that in this model we do download some code twice (the later download contains the code in the initial one), and we do add some code size due to Asyncify support. So this will not be an obvious win for every codebase. But if the later download contains unlikely code that may never be used, and there is a lot of such code, and we can ignore indirect calls for Asyncify's purposes, then the win can be significant.

cc @surma @RReverser

Comment thread tools/shared.py
return cmd

@staticmethod
def get_wasm_opt_command(args=[], debug=False):
Collaborator:

Can you have this take an infile and an outfile argument? All the call sites want that; I think the command requires it.

Member Author (@kripken), Oct 25, 2019:

Sounds good, but perhaps as a followup? That's going to be a bunch more changes not directly relevant to this PR.

Collaborator:

How about splitting the get_wasm_opt_command part out of this change? Seems like a refactor that would make this change smaller. I leave that up to you though; I know you want to get this landed today if possible, so I don't want to delay you too much.

Member Author (@kripken):

Well, the reason I added it here is that it made this shorter to write, and it also seemed like a useful change. I don't see a benefit to splitting it out, but I agree this can be improved more; I'll look at that as a followup.

Comment thread tools/shared.py
# as well minify wasm exports to regain some of the code size loss that setting DECLARE_ASM_MODULE_EXPORTS=1 caused.
if not Settings.STANDALONE_WASM and not Settings.AUTODEBUG and not Settings.ASSERTIONS and not Settings.SIDE_MODULE and emitting_js:
if not Settings.STANDALONE_WASM and not Settings.AUTODEBUG and not Settings.ASSERTIONS and not Settings.SIDE_MODULE and emitting_js and not Settings.ASYNCIFY_LAZY_LOAD_CODE:
js_file = Building.minify_wasm_imports_and_exports(js_file, wasm_file, minify_whitespace=minify_whitespace, minify_exports=Settings.DECLARE_ASM_MODULE_EXPORTS, debug_info=debug_info)
Collaborator:

Why can't we minify with lazy loading? Is that something that can be enabled as a followup?

Member Author (@kripken):

ASYNCIFY_LAZY_LOAD_CODE disables minification because it runs after the wasm is completely finalized, and we still need to be able to identify import names at that time. To avoid that, we would need to keep a mapping of the names and send it to Binaryen. That's possible, but not trivial. I'm adding a comment to clarify.

Comment thread tests/test_core.py
'conditional': (True,),
'unconditional': (False,),
})
def test_emscripten_lazy_load_code(self, conditional):
Collaborator:

How about just "test_lazy_load_code"?

Member Author (@kripken):

It's the name of the C function, though?

Collaborator:

OK, SGTM either way. I was thinking "lazy_load_code" is the kind of concept you are testing, since it's also the suffix of ASYNCIFY_LAZY_LOAD_CODE.

@@ -0,0 +1,2 @@
foo_start
Collaborator:

Can you call these files "test_lazy_load_code.*" to match the test name?

Comment thread tests/test_core.py Outdated
Comment thread tests/test_core.py

# attempts to "break" the wasm by adding an unreachable in $foo_end. returns whether we found it.
def break_wasm(name):
wat = run_process([os.path.join(Building.get_binaryen_bin(), 'wasm-dis'), name], stdout=PIPE).stdout
Collaborator:

get_binaryen_command? here and below.

Member Author (@kripken):

wasm-dis and wasm-as don't support feature flags, so that utility function doesn't work for them. But given that they don't need those flags, it's about the same length to type either way.

Comment thread tests/test_core.py Outdated
})

EM_JS(void, log_ended, (), {
out("foo_end");
Collaborator:

Why not simply use puts/printf here?

Member Author (@kripken):

stdio increases code size, and it's nice in the test to use it only after the lazy loading, to see the size difference more clearly.


void foo_end(int n) {
log_ended();
// prevent inlining
Collaborator:

How about __attribute__((noinline)) to be explicit?

Member Author (@kripken):

That's not enough for binaryen, though :(

Comment thread tests/core/emscripten_lazy_load_code.cpp
Comment thread tools/shared.py
# that it will only rewind, after which optimizations can remove some code
cmd = Building.get_wasm_opt_command(debug=debug)
cmd += [wasm_binary_target, '-o', wasm_binary_target + '.lazy.wasm']
cmd += ['--remove-memory']
Collaborator:

Why this?

Member Author (@kripken):

The memory? See the comment a few lines up: we remove the memory segments from it, as they have already been applied by the initial wasm.

kripken merged commit 9e389cb into incoming on Oct 25, 2019
The delete-merged-branch bot deleted the lazhy branch on Oct 25, 2019 at 19:59
kripken added a commit that referenced this pull request Oct 25, 2019

Asyncify functions
==================
Fastcomp Asyncify functions
Collaborator:

Why was this changed to say "Fastcomp"? AFAIK same applies to Asyncify transform on top of wasm backend?

Member Author (@kripken):

This PR refactored these docs, and now this part has the fastcomp funcs, while lower down there is "Upstream Asyncify".

(Or maybe I didn't understand what you meant by "same" in that sentence?)

Collaborator:

By "same" I mean that the description in the first sentence (both -s ASYNCIFY=1 as well as link to Asyncify docs) seems generic enough, but only included in Fastcomp section now.

Also, aren't the functions that are currently included in "Fastcomp functions" - e.g. emscripten_sleep - available with upstream backend as well?

If they are, then the current structure seems confusing, as it seems to imply that these functions are Fastcomp-only, while others are upstream-only. I'd suggest turning the first section into "Asyncify functions" and then nesting the other one under it as "Upstream-only Asyncify functions".

Member Author (@kripken):

Thanks! Now I see what you mean. Yeah, sleep looks wrong - it's mentioned in fastcomp, as you say, but should only be in the shared section earlier. I also see sleep with yield is in the wrong place. I opened #9819 now for these issues.

@gabrielcuvillier (Contributor):

@kripken A question on this interesting feature: does this work as expected if the program has dynamically allocated memory (malloc, new, etc.)?

I mean, suppose the program has instantiated classes with virtual methods, and one of the virtual methods calls lazy_load(). When this function is called and the lazy load occurs, the program seems to be "replaced" by the newer version: but does the memory follow along, with everything "remapped" accordingly?

Not sure if I am clear :)

@kripken (Member Author) commented Nov 15, 2019

Yes, memory is preserved exactly as it was before, so dynamic allocation is fine.

We replace the code, but do not replace the data in memory. So we do not apply the memory initialization again, for example.

belraquib pushed a commit to belraquib/emscripten that referenced this pull request Dec 23, 2020