
Lazy-loading support #9697

Merged
kripken merged 25 commits into incoming from lazhy on Oct 25, 2019

Conversation

@kripken (Member) commented Oct 24, 2019

This adds emscripten_lazy_load_code(), a function that, when called, blocks (using Asyncify) and loads the complete program. This lets the initial download omit code that can be proven to be used only after those calls.

How this works: we emit the initial downloaded wasm after modifying it with Binaryen to assume those calls never rewind, since we only unwind using that binary (the optimizer can then remove a lot of code). We also emit the later downloaded wasm optimized the other way, assuming we only rewind and never unwind. These transforms are done using the ModAsyncify passes from Binaryen, WebAssembly/binaryen#2404.

A new option ASYNCIFY_LAZY_LOAD_CODE is necessary to use this (as well as ASYNCIFY itself).

The test shows an example where the initial download is less than half the size of the later download. Note that in this model we do download some code twice (the later download contains the code in the initial one), and we do add some code size due to Asyncify support. So this will not be an obvious win for every codebase. But if the later download contains unlikely code that may never be used, and there is a lot of such code, and we can ignore indirect calls for Asyncify's purposes, then the win can be significant.

cc @surma @RReverser

Comment thread tools/shared.py
return cmd

@staticmethod
def get_wasm_opt_command(args=[], debug=False):
Collaborator:

Can you have this take an infile and an outfile argument? All the call sites want that; I think the command requires it.

Member Author (@kripken), Oct 25, 2019:

Sounds good, but perhaps as a followup? That's going to be a bunch more changes not directly relevant to this PR.

Collaborator:

How about splitting the get_wasm_opt_command part out of this change? Seems like a refactor that would make this change smaller. I leave that up to you though; I know you want to get this landed today if possible, so I don't want to delay you too much.

Member Author (@kripken):

Well, the reason I added it here is that it made this shorter to write, and it also seemed like a useful change. I don't see a benefit to splitting it out, but I agree this can be improved more; I'll look at that as a followup.

Comment thread tools/shared.py
# as well minify wasm exports to regain some of the code size loss that setting DECLARE_ASM_MODULE_EXPORTS=1 caused.
if not Settings.STANDALONE_WASM and not Settings.AUTODEBUG and not Settings.ASSERTIONS and not Settings.SIDE_MODULE and emitting_js:
if not Settings.STANDALONE_WASM and not Settings.AUTODEBUG and not Settings.ASSERTIONS and not Settings.SIDE_MODULE and emitting_js and not Settings.ASYNCIFY_LAZY_LOAD_CODE:
js_file = Building.minify_wasm_imports_and_exports(js_file, wasm_file, minify_whitespace=minify_whitespace, minify_exports=Settings.DECLARE_ASM_MODULE_EXPORTS, debug_info=debug_info)
Collaborator:

Why can't we minify with lazy loading? Is that something that can be enabled as a followup?

Member Author (@kripken):

ASYNCIFY_LAZY_LOAD_CODE disables minification because it runs after the wasm is completely finalized, and we still need to be able to identify import names at that time. To avoid that, we would need to keep a mapping of the names and send it to Binaryen. That's possible, but not trivial. I'm adding a comment to clarify.

Comment thread tests/test_core.py
'conditional': (True,),
'unconditional': (False,),
})
def test_emscripten_lazy_load_code(self, conditional):
Collaborator:

How about just "test_lazy_load_code"?

Member Author (@kripken):

It's the name of the C function, though?

Collaborator:

OK, SGTM either way. I was thinking "lazy_load_code" is the kind of concept you are testing, since it's also the suffix of ASYNCIFY_LAZY_LOAD_CODE.

@@ -0,0 +1,2 @@
foo_start
Collaborator:

Can you call these files "test_lazy_load_code.*" to match the test name?

Comment thread tests/test_core.py Outdated
Comment thread tests/test_core.py

# attempts to "break" the wasm by adding an unreachable in $foo_end. returns whether we found it.
def break_wasm(name):
wat = run_process([os.path.join(Building.get_binaryen_bin(), 'wasm-dis'), name], stdout=PIPE).stdout
Collaborator:

get_binaryen_command? here and below.

Member Author (@kripken):

wasm-dis and wasm-as don't support feature flags, so that utility function doesn't work for them. But given that they don't need those flags, it's about the same length to type either way.

Comment thread tests/test_core.py Outdated
})

EM_JS(void, log_ended, (), {
out("foo_end");
Collaborator:

Why not simply use puts/printf here?

Member Author (@kripken):

stdio increases code size, and it's nice in the test to use it only after the lazy loading, to see the size difference more clearly.


void foo_end(int n) {
log_ended();
// prevent inlining
Collaborator:

How about __attribute__((noinline)) to be explicit?

Member Author (@kripken):

That's not enough for binaryen, though :(

Comment thread tests/core/emscripten_lazy_load_code.cpp
Comment thread tools/shared.py
# that it will only rewind, after which optimizations can remove some code
cmd = Building.get_wasm_opt_command(debug=debug)
cmd += [wasm_binary_target, '-o', wasm_binary_target + '.lazy.wasm']
cmd += ['--remove-memory']
Collaborator:

Why this?

Member Author (@kripken):

The memory? See the comment a few lines up: we remove the memory segments from it, as they have already been applied by the initial wasm.

kripken merged commit 9e389cb into incoming on Oct 25, 2019
The delete-merged-branch bot deleted the lazhy branch on Oct 25, 2019 at 19:59
kripken added a commit that referenced this pull request Oct 25, 2019

Asyncify functions
==================
Fastcomp Asyncify functions
Collaborator:

Why was this changed to say "Fastcomp"? AFAIK same applies to Asyncify transform on top of wasm backend?

Member Author (@kripken):

This PR refactored these docs, and now this part has the fastcomp funcs, while lower down there is "Upstream Asyncify".

(Or maybe I didn't understand what you meant by "same" in that sentence?)

Collaborator:

By "same" I mean that the description in the first sentence (both -s ASYNCIFY=1 as well as link to Asyncify docs) seems generic enough, but only included in Fastcomp section now.

Also, aren't the functions that are currently included in "Fastcomp functions" - e.g. emscripten_sleep - available with upstream backend as well?

If they are, then the current structure seems confusing, as it seems to imply that these functions are Fastcomp-only, while others are upstream-only. I'd suggest turning the first section into "Asyncify functions" and then nesting the other one under it as "Upstream-only Asyncify functions".

Member Author (@kripken):

Thanks! Now I see what you mean. Yeah, sleep looks wrong - it's mentioned in fastcomp, as you say, but should only be in the shared section earlier. I also see sleep with yield is in the wrong place. I opened #9819 now for these issues.

@gabrielcuvillier (Contributor):

@kripken A question on this interesting feature: does this work as expected if the program has dynamically allocated memory (malloc, new, etc.)?

I mean, suppose the program has instantiated classes with virtual methods, and one of the virtual methods calls lazy_load(). When this function is called and the lazy load occurs, the program seems to be "replaced" by the newer version: but does the memory follow along, with everything "remapped" accordingly?

Not sure if I am clear :)

@kripken (Member Author) commented Nov 15, 2019

Yes, memory is preserved exactly as it was before, so dynamic allocation is fine.

We replace the code, but do not replace the data in memory. So we do not apply the memory initialization again, for example.

belraquib pushed a commit to belraquib/emscripten that referenced this pull request Dec 23, 2020