Skip to content

Conversation

@kripken
Copy link
Member

@kripken kripken commented Dec 29, 2017

This should help with emscripten-core/emscripten#5998 , but I don't have time right now to test it completely.

  • First, stop including the filesystem. I'm not sure why it's even needed. We should investigate that, but this is a useful change anyhow.
  • Second, new emscripten changed how Runtime works, those methods are now toplevel functions. That should get things to work with latest emscripten, but if the bots check older, the same code won't work on both. We may need to implement a small shim to support both temporarily.

@kripken
Copy link
Member Author

kripken commented Dec 29, 2017

The filesystem was "needed" because of iostream usage.

@kripken
Copy link
Member Author

kripken commented Dec 29, 2017

Ok, this should fix everything now. Added a wasm.js fix for recent emscripten and testing fixes and improvements (run binaryen.js tests in all engines).

One change here is to move binaryen.js to use standard emscripten MODULARIZE (like wasm.js). I needed that because things had broken in non-node.js for some reason. @dcodeIO , I think you wrote that code, is that ok?

Note that it means we emit Binaryen as a function, so the test suite and users now do Binaryen = Binaryen() to get an instance, since that's what MODULARIZE does by default. We could change that though.

One odd thing is that this modified some of the test outputs - what I think are pointer values changed. We should probably remove those from the output, as they will change over time. It might even fail now on the bot given it's using an older emsdk version, we'll see...

@dcodeIO
Copy link
Contributor

dcodeIO commented Dec 29, 2017

As long as the exported interface is the same (that is, Binaryen() is called in postJs - is it?), I don't expect any issues with MODULARIZE.

Regarding pointer values:; Yeah, these tend to change every time something is changed and binaryen.js is recompiled. There are a few occasions where something seems wrong, like for the names in the tests (probably missing a Pointer_stringify), but there are others like body in functionInfo where it's hard to work around, unless we decide to test them in another form without printing.

@kripken
Copy link
Member Author

kripken commented Dec 29, 2017

About changing the interface, I'm not sure what's best. In MODULARIZE the exported Binaryen is a constructor, and the user needs to do Binaryen = Binaryen() to get an instance. That's awkward in a way, but it does let the user create multiple instances. Although for binaryen that doesn't make sense I guess, so I changed it back now, so it should be the same interface as before.

About the pointer numbers, I guess we can leave them as-is for now, as the tests pass here. But I'm not sure exactly why it's not complaining already, something seems odd.

@kripken
Copy link
Member Author

kripken commented Dec 29, 2017

Ah, I see what's been going wrong... we build -g on the bot, but test the old build... :(

@kripken
Copy link
Member Author

kripken commented Dec 29, 2017

Good, and now it fails as expected on the bot.

@kripken
Copy link
Member Author

kripken commented Dec 29, 2017

And with the last commit it should work, without printing the pointer values that the tests now clean out.

This PR should be finished now, assuming tests pass.

@dcodeIO
Copy link
Contributor

dcodeIO commented Dec 30, 2017

Just noticed that the module loader code is gone entirely. Does MODULARIZE add similar checks for require/define and the global object on its own?

@kripken
Copy link
Member Author

kripken commented Dec 30, 2017

I'm not sure how - I've never really understood how require and module loading etc. work, to be honest. But we have tests check for require, for example https://github.com/kripken/emscripten/blob/incoming/tests/test_other.py#L5420

@dcodeIO
Copy link
Contributor

dcodeIO commented Dec 30, 2017

It's relatively simple, actually:

CommonJS is synchronous: Check if a global require function is present to be sure plus check if there is a non-null module object with an exports property (CommonJS initializes this to an empty object). If so, assign the exported interface to module.exports, which is what a CommonJS module exposes to the outside when requireed from another file.

if (typeof require === "function" && typeof module !== "undefined" && module && module.exports)
  module.exports = the_interface;
// other file would do
var Binaryen = require("./path/to/binaryen");
// Binaryen is `module.exports` from above

AMD is asynchronous: Check if a global define function is present plus check that it has a trueish amd property to be sure. If so, call it with a suitable factory function that returns the exported interface, which is evaluated when used as a dependency of another module or the main file.

if (typeof define === "function" && define.amd)
  define(function() { return the_interface });
// other file would do
define(["./path/to/binaryen"], function(Binaryen) {
  // Binaryen is what is returned from the factory function above
});

Additionally it's always a good idea to expose the module on the global object (window in a browser, global in node, self in web workers) in case a CommonJS or AMD environment is present but not used, just like if defining a global var the_module_name.

var globalObject = typeof window !== "undefined" && window || typeof global !== "undefined" && global || self;
globalObject.the_module_name = the_interface;

Unordered seems to be ideal, that is export unconditionally in every way possible so it "just works" no matter what.

If you want I'll take a look after this is merged. Could also be an enhancement to MODULARIZE in general if all of this was implicit.

@kripken
Copy link
Member Author

kripken commented Dec 30, 2017

Thanks, I understand better now. Is it standard though to add it to the global object, don't most people use the proper require etc. technique? I think MODULARIZE does the require stuff, but doesn't add to the global (but I'm not sure).

In any case, yeah, let's look into this as a general improvement of MODULARIZE. People on the emscripten repo will know more about this than me there.

Otherwise this looks ok to merge?

@dcodeIO
Copy link
Contributor

dcodeIO commented Dec 30, 2017

Is it standard though to add it to the global object, don't most people use the proper require etc. technique?

There is something called universal module definition (UMD) that doesn't go as far as my suggestions above, e.g., doesn't unconditionally export on the global object. That's somewhat how it was before this PR.

In my experience, though, additional issues originate from mixing these things, which is sometimes inevitable when different modules from different authors with different loaders combined with different bundlers are used in a single application.

Not sure what's best. UMD as above forces everyone to create compatible modules while doing more avoids more issues.

Otherwise this looks ok to merge?

Might fail with AMD loaders at first. I am not using AMD myself, but I can take a look and send a PR as a follow-up, probably simply consisting of adding the following two lines to binaryen.js-post.js:

if (typeof define === "function" && define.amd)
  define(function() { return Module; });

That should do for now if Binaryen() is called automatically anyway.

@kripken
Copy link
Member Author

kripken commented Dec 30, 2017

Ok, let's merge this in. For followups, let's discuss those two lines in emscripten since it seems like a general thing for modularize?

@dcodeIO
Copy link
Contributor

dcodeIO commented Dec 30, 2017

I think I made a mistake by suggesting to call the Binaryen() function implicitly because exports aren't consistent now. AMD and CommonJS export the factory function while the var also calls it and reassigns to itself.

@kripken
Copy link
Member Author

kripken commented Jan 1, 2018

Hmm, so should we assign the output of Binaryen() to the export? Perhaps this should be an emscripten option, like MODULARIZE_INSTANCE which does modularization and exports an instance instead of a constructor?

@kripken
Copy link
Member Author

kripken commented Jan 1, 2018

Alternatively, maybe there is a use for exporting the constructor, people may want multiple Binaryen instances for some reason? In which case, users should do the instantiation?

@dcodeIO
Copy link
Contributor

dcodeIO commented Jan 1, 2018

Hmm. From a JS/node module perspective I'd say that a user expects a ready-to-use API on require("binaryen"), not an API one has to instantiate. While there might be valid reasons to do so in special cases, instantiating multiple APIs that might then instantiate multiple modules each doesn't feel right to me in the usual case. This is specific to binaryen.js, of course, and other emscripten-compiled modules might actually intend to be instantiated multiple times.

In this context MODULARIZE_INSTANCE seems like the right thing to implement as it lets the developer decide what's the usual use case. Maybe, a MODULARIZE_INSTANCEed module could still provide a way to create another instance beside the default one to cover rare special cases. Like a static instantiate function on the default export (the default instance)?

It pretty much boils down to the question whether exporting a default instance of a module makes sense for a specific emscripten-compiled module or not. A static instantiate function could also become the common denominator between MODULARIZEed and MODULARIZE_INSTANCEed emscripten modules (instead of just exporting a function one has to call), with the sole difference that one exports a default instance and the other does not.

Let's say, when MODULARIZEed, a function to instantiate a new module would be exported for backwards-compatibility that also has an instantiate static property referencing itself. A MODULARIZE_INSTANCEed module would export a default instance that also has an instantiate static property on the exported default instance.

@yurydelendik
Copy link
Contributor

If there ever be instantiate static property, can it be async (return Promise)? That will simply initialization for modules that needs to load binary WebAssembly and memory.

@dcodeIO
Copy link
Contributor

dcodeIO commented Jan 4, 2018

Yeah, that was my thought. Like providing a similar API like WebAssembly in the browser, while still providing a way to export something like a synchronously instantiated node module.

@kripken
Copy link
Member Author

kripken commented Jan 8, 2018

In a hypothetical MODULARIZE_INSTANCE mode it would be more efficient not to also allow creating more instances, as a singleton instance could be better optimized by JS engines and packers. But maybe that's not a big deal compared to the benefit?

@dcodeIO
Copy link
Contributor

dcodeIO commented Jan 8, 2018

I don't know the full picture, but if the instantiation code runs once anyway, there's not much a tree-shaking/dead-code-eliminating compiler could optimize away - unless singleton instantiation is simpler and can exclude a heap of code. Can it?

@kripken
Copy link
Member Author

kripken commented Jan 10, 2018

Yeah, singleton instantiation is simpler for a VM. It can see that the code will never be used again and throw it away immediately after it runs, also it can avoid adding type profiling code as it compiles it (except for loops) etc. But, as I said I'm not sure it would matter much here. We should probably err on the side of the nicest API for users.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants