Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why are JavaScript inline snippets embedded in wasm data? #9366

Closed
greggman opened this issue Sep 2, 2019 · 8 comments · Fixed by #13443
Closed

Why are JavaScript inline snippets embedded in wasm data? #9366

greggman opened this issue Sep 2, 2019 · 8 comments · Fixed by #13443
Assignees

Comments

@greggman
Copy link
Contributor

greggman commented Sep 2, 2019

I hope it's okay to ask a question here

As a test I put some JavaScript inline in C++ using the EM_ASM_ macro

#include <emscripten.h>

extern "C" {
  extern void x2(float v) {
    float t = v * 2.0f;
    EM_ASM_({
      foo($0, "supercalif");
    }, t);
  }
}

I then compiled with

emcc -O3 -s EXPORTED_FUNCTIONS='["_x2"]' -s EXTRA_EXPORTED_RUNTIME_METHODS='["ccall", "cwrap"]'  -o foo.html foo.cc

Like I expected the JavaScript is put in the support .js file (foo.js in this case)

var ASM_CONSTS=[function($0){foo($0,"supercalif")}];
function _emscripten_asm_const_id(code,a0){
  return ASM_CONSTS[code](a0)
}

And disassembling the wasm I see the call to that JavaScript (recompiled with -g3 for the snippet below for the variable names but verified the same code is in the normal version just with short names)

  (func $legalstub$_x2 (export "_x2") (type $t7) (param $p0 f64)
    (drop
      (call $_emscripten_asm_const_id
        (i32.const 0)
        (f64.promote_f32
          (f32.mul
            (f32.demote_f64
              (local.get $p0))
            (f32.const 0x1p+1 (;=2;)))))))

But I also see the string of the JavaScript embedded in the wasm as well

  (data $d0 (i32.const 1024) "{ foo($0, \22supercalif\22); }"))

What is that string for given the JavaScript itself is already in the .js file?

@sbc100
Copy link
Collaborator

sbc100 commented Sep 3, 2019

you are right it would be nice if this could be stripped out of the data section of the wasm binary. The reason this is not done, I believe, is because the layout of the static data is all performed by lld and binaryen can't move or remove existing data without breaking data references embedded in the code section. @jgravelle-google am I right about this?

@jgravelle-google
Copy link
Contributor

Two bad reasons with a simple fix. First, because we're can't be 100% sure that nobody else is using that string. For example:

EM_ASM({ console.log('hi'); });
puts("{ console.log('hi'); }");

would emit something like

(data (i32.const 1024) "{ console.log('hi'); }")
;; ...
(call $_emscripten_asm_const_v
  (i32.const 0))
(call $_puts
  (i32.const 1024))

so removing that data section would break your program. Yes that's highly contrived.

The second reason is that like @sbc100 said, we can't actually reclaim that static allocation because any data afterwards would need to be moved down, and by the time we do the EM_ASM magic we no longer have the information needed to do so.

The solution to both of these is to explicitly put the EM_ASM strings in a separate custom data section, and then strip it in binaryen. It's probably slightly more complicated than that because we haven't looked into it in detail, but we'll want to do so one of these days. And also something similar for EM_JS.

@sbc100
Copy link
Collaborator

sbc100 commented Sep 3, 2019

Oh.. good idea. I actually think we should do that, if only because its such an elegant fix. I might take a quick look.

@sbc100
Copy link
Collaborator

sbc100 commented Sep 3, 2019

Does EM_JS not share the same problem? (sorry I misread.. I see that you are suggesting fixing that too).

@stale
Copy link

stale bot commented Sep 2, 2020

This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant.

@stale stale bot added the wontfix label Sep 2, 2020
@kripken
Copy link
Member

kripken commented Sep 2, 2020

It would still be good to fix this.

@juj
Copy link
Collaborator

juj commented Nov 15, 2020

I wonder if such a custom data section would be able to make external --js-library functions also linkable with same semantics as EM_JS functions currently are?

@sbc100
Copy link
Collaborator

sbc100 commented Nov 15, 2020

ooo.. that is kind of an interesting idea. Basically we could compile all .js library files to native object files and let wasm-ld perform all the linking?

I think the one roadblock to this approach would be the fact that JS libraries can include arbirary #if statements so would need to be recompiled whenever settings (or internal settings) change. Effectively re-compiled for each link command. Perhaps we could start by looking into this approach for any library that doesn't contain pre-processor directives?

sbc100 added a commit that referenced this issue Feb 17, 2021
This way, once binaryen has extracted that string data
it can potentially remove the segment or at least zero it
out.

Fixes #9366
sbc100 added a commit that referenced this issue Feb 17, 2021
This way, once binaryen has extracted that string data
it can potentially remove the segment or at least zero it
out.

Fixes #9366
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants