Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Undocumented Module.then causes infinite loop #5820

Closed
saschanaz opened this issue Nov 20, 2017 · 70 comments · Fixed by #10697
Closed

Undocumented Module.then causes infinite loop #5820

saschanaz opened this issue Nov 20, 2017 · 70 comments · Fixed by #10697

Comments

@saschanaz
Copy link
Collaborator

saschanaz commented Nov 20, 2017

https://github.com/kripken/emscripten/blob/93479ecbd390aec4f8a3765fe04bcb365d0b31b2/src/postamble.js#L118-L141

postamble.js adds Module.then when -s MODULARIZE=1 but using it causes indefinite loop on Promise.resolve() and ES2017 await statement.

By the spec the resolving strategy is greedy: if an object has then method then it will be called with two function arguments including resolver and rejector. If the resolver is called with an object with a then method it will be called again with two function arguments, and then again, again.

Module.then resolves with Module which still has then, so the await statement loops indefinitely.

PS: Fixed some wrong concepts.

@saschanaz
Copy link
Collaborator Author

This is anyway an undocumented feature, we should be able to rename this to whenInitialized (example), etc.

@curiousdannii
Copy link
Contributor

curiousdannii commented Nov 20, 2017

I have production code using Module.then, but not the return value.

This is an odd situation, because if I understand correctly, await handles normal promises fine even though they always have a then property.

Remove .then when initialized. All following then calls will cause exception as they will be calling undefined.

This would remove the entire point of making Module a thenable, rather than just accepting a callback.

If it was changed to make Module an actual Promise object (or for then to return a true Promise), would that work with await? That would then make emscripten depend on Promises being present. One option could be for it to check if Promise is defined, and if not to return the current Module. So on old platforms you either provide a Promise polyfill, or you can't call Module.then.

It would probably be worth putting all the loading errors through the same system so that you could call Module.catch as well.

And it's not completely undocumented, it is mentioned here: http://kripken.github.io/emscripten-site/docs/porting/connecting_cpp_and_javascript/WebIDL-Binder.html#modular-output

It should really be mentioned elsewhere in the docs too.

@saschanaz
Copy link
Collaborator Author

saschanaz commented Nov 20, 2017

It seems I had misunderstood the concept of thenable. Any thenable object will receive two function arguments as the Promise constructor callback does, which means that it won't be resolved until one of the arguments are called.

PS: So I modified the original post.

@saschanaz
Copy link
Collaborator Author

This is the minimal repro of the Promise thenable behavior:

var o = { then(resolve, reject) { console.log("called then()"); resolve(o) } };
Promise.resolve(o);
// logger logs indefinitely 

@curiousdannii
Copy link
Contributor

curiousdannii commented Nov 20, 2017

Sorry, I'm not really sure what you're trying to do. But a proper thenable should call resolve/onFulfilled with the value, not the whole thenable object, and it should return the object (this).

Here's the simplest example I could find from noq:

	then: function noQ_Promise_then(p, n)
	{
		if(this.__val === undefined)
		{
			(this.__c || (this.__c = [])).push({p: p, n: n});
			if(n)
				this.fail(n);
		}
		else if(this.__val === true)
			p(this.__res);
		else if(n && (this.__val === false))
			n(this.__res);
		return(this);
	},

Where __val is the status and __res is the value.

@saschanaz
Copy link
Collaborator Author

Sorry, I'm not really sure what you're trying to do. But a proper thenable should call resolve/onFulfilled with the value, not the whole thenable object, but it should return the object.

This is not what I'm trying to do, it is what the current postamble.js do.

https://github.com/kripken/emscripten/blob/93479ecbd390aec4f8a3765fe04bcb365d0b31b2/src/postamble.js#L129

@curiousdannii
Copy link
Contributor

curiousdannii commented Nov 20, 2017

Ah, true. I think emscripten is making a valid thenable (aside from ignoring the onRejected arg), but it's not a proper promise. I could be wrong. It probably should be a full proper promise though.

@saschanaz
Copy link
Collaborator Author

A valid thenable should not resolve by passing itself, as the greedy algorithm will cause infinite loop. (Forget about the return value story from my previous deleted comment, it was just wrong. Sorry 😭)

@saschanaz
Copy link
Collaborator Author

saschanaz commented Nov 20, 2017

(Ignore my another deleted comment. Sorry)

Converting module to a real Promise still may not solve the problem, as the Promise resolver still uses greedy algorithm.

new Promise((resolve, reject) => {
    resolve(Promise.resolve(3))
}).then(console.log); // logs 3, not Promise

(As it will still have to resolve by passing itself to keep compatibility)

@saschanaz
Copy link
Collaborator Author

So I would:

  1. Break the compatibility by resolving with no value.
  2. Or, export a new thenable object which resolves with no value. (so that one can instead await module.forInit for example)

@curiousdannii
Copy link
Contributor

Or should the MODULARIZE mode Module be effectively a factory? And you only get access to the true Module when your then function is called?

@saschanaz
Copy link
Collaborator Author

That would be neat 👍 but is there any use case that requires module access before initialized? Probably not?

@curiousdannii
Copy link
Contributor

curiousdannii commented Nov 30, 2017

Um, possibly if you have no external files (either no memory or inline memory) then it might be available immediately?

It would probably help to outline the use cases for MODULARIZE, so then we can see whether they would be better served by being separated out a little.

  1. Node/Browserify/UMD access (Including an emscripten module in browserify build process #5864) -> This should really be fixed for non MODULARIZE. Probably would be very easy too.
  2. Wrapping the code so it doesn't leak variables and compresses better etc. pre/post js is still an option, as would be browserifying etc. I'd be inclined to sacrifice this if it makes it easier for other use cases. Or sacrifice the non-wrapped option.
  3. Thenable interface (eventually a true Promise ideally). IMO this should move towards being the main/only way of hooking in to when the module is ready.

Can you think of any others?

@saschanaz
Copy link
Collaborator Author

My previous use case for MODULARIZE was to run a CLI command multiple times in a single worker. (as calling main() multiple times caused error)

@andreasscherman
Copy link

Was bitten by this exact behaviour -- thanks for the detailed flesh-out of the issue. If someone stumbles upon this, inspiration as a hack-around could look something like this:

Module.then(m => {
  delete m['then'];
  resolve(m);
});

@mischnic
Copy link

Promise.all([instance]) also causes an infinite loop.

On the referenced issue on binaryen, a proposed solution would be to make Module.ready a thenable instead of Module itself. (This might of course break some old code).

@kripken
Copy link
Member

kripken commented Feb 13, 2018

.ready might be the best option, then.

I think we can do it as a breaking change, if it's necessary to get a fully working API, and if we add a clear error if people use the old API - could we set Module.then to suggest using the new API?

@mischnic
Copy link

mischnic commented Feb 14, 2018

I don't think there is a way to fix the old then and getting out of the recursion hell.
This

Module.then = function(f){
    f(Module);

    // something like this as the error message?
    console.error("Please use Module.ready.then(...) instead.");
    
    delete Module.then;
    return Promise.resolve(Module);
}

might seem like a solution, but not for long:

Module.then(v => console.log(v));
Module.then(v => console.log(v)); // Uncaught TypeError: Instance.then is not a function

Alternatively resolving with no value (#5820 (comment)).


The new ready would be

Module['ready'] = new Promise(function(func) { 
  // We may already be ready to run code at this time. if 
  // so, just queue a call to the callback. 
  if (Module['calledRun']) { 
      func(Module); 
  } else { 
      // we are not ready to call then() yet. we must call it 
      // at the same time we would call onRuntimeInitialized. 
      var old = Module['onRuntimeInitialized']; 
      Module['onRuntimeInitialized'] = function() { 
          if (old) old(); 
          func(Module); 
      }; 
  } 
});

then?

@anthumchris
Copy link
Contributor

anthumchris commented Mar 29, 2018

Stumbled upon this thread when searching for a Promise-based alternative to onRuntimeInitialized. I also needed repeated calls from a Web Worker's onmessage handler. I ended up writing a Module.ready Promise implementation by building with emcc --post-js 'src/module-post.js'. Code is below. Hope this is useful to anyone in a similar situation:

https://gist.github.com/AnthumChris/c1b5f5526b966011dac39fbb17dacafe

@david-polak
Copy link

david-polak commented May 1, 2018

For anybody stuck on this bug, this is my workaround. Thanks @anthumchris for directions

--post-js

Module['ready'] = new Promise(function (resolve, reject) {
  delete Module['then']
  Module['onAbort'] = function (what) {
    reject(what)
  }
  addOnPostRun(function () {
    resolve(Module)
  })
})

Instantiate

new EXPORTED_NAME().ready.then(module => {})

@ghost
Copy link

ghost commented May 2, 2018

Couldn't we have the function at EXPORT_NAME return an actual Promise instead (or make it an async function)? So for the default EXPORT_NAME, we could get something like:

function Module() {
  const staticContext = {
    ALLOC_DYNAMIC: 3,
    ALLOC_NONE: 4,
    ALLOC_NORMAL: 0,
    // …
  };
  return new Promise((resolve, reject) => {
     // fetch and compile wasm, add exports to `staticContext`
     // resolve with modified staticContext if successful, reject otherwise
  });
}

This would fix ES2017 await statements and save us from workarounds. In order to instantiate the module, we could then use

Module().then(({ export1, export2 }) => {
  /* use exports here */
});

from non-asynchronous contexts and

const { export1, export2 } = await Module();
/* use exports here */

from asynchronous contexts.

@surma
Copy link
Collaborator

surma commented May 22, 2018

Merging myself from #6563 into this issue: This is confusing because then() is undocumented and you don’t even need to use it to end up in the infinite loop:

import myModule from './myModule.js'; // ← This is the file generated by Emscripten

const module = new Promise(resolve => {
  myModule({
    onRuntimeInitialized() {
      resolve(myModule);
    }
  });
});

I think re-naming then to ready is a great solution although I think it’d be preferable to NOT make ready() resolve to the module, but just undefined.

@surma
Copy link
Collaborator

surma commented May 22, 2018

... a second solution would be to remove the then property before resolving here. That would basically make the module then-able just once and prevent the infinite loop. Not sure that’s a good practice tho.

I’m happy to whip up a PR if one of the maintainers can give me a hint which solution is preferred.

@mischnic
Copy link

mischnic commented May 22, 2018

a second solution would be to remove the then property before resolving here. [...] then-able just once [...] Not sure that’s a good practice tho.

Yes, I mentioned that here as well: #5820 (comment)

But the Module would still be a promise that doesn't work as one would expect (just as now).

@curiousdannii
Copy link
Contributor

curiousdannii commented May 22, 2018

I think it would be best to follow @kdex's mock up and make (the external) Module return a true promise. Which I guess would mean making the internal Module object be that promise, and adding all the usual properties onto it instead of onto a plain object.

I think this might have too many difficulties, and now favour having a ready property.

@DJMcNab
Copy link
Contributor

DJMcNab commented Aug 24, 2018

This is just throwing a (probably bad) idea into the mix:

would it be possible to have the global Module with a then function which, after instantiation completes starts to inherit [1] from a private ModuleReal, which the then resolves to. Alternatively, all the properties from ModuleReal could be copied to Module (using Object.assign()). Unfortunately, this is likely to have negative performance implications no matter which way is chosen, so should not really be considered.

[1]: Using Object.setPrototypeOf() or by recreating the Module using Object.create().

@curiousdannii
Copy link
Contributor

curiousdannii commented Jan 8, 2019

@Taytay That specific problem you're running in to really has nothing to do with this issue. Emscripten expects to be run in a CommonJS environment, and if you use it in another environment that still has a module of course it will get into trouble. I'd recommend you use browserify/webpack/rollup instead of cating files.

It would be reasonable to add an option to not do any exporting, but that should be a new issue.

@Taytay
Copy link

Taytay commented Jan 9, 2019

@curiousdannii : I am running in a CommonJS environment, but I still wanted control over the modularization code. I've created a new issue/feature request here, as you suggested: #7835

@s-h-a-d-o-w
Copy link

s-h-a-d-o-w commented Dec 24, 2019

This might be a naive question but why don't you simply delete then from the module once it was executed?

I just modified the code like so:

Module['then'] = function(func) {
  delete Module.then;
  // ...

... and it works like a charm.

It may be hacky but then... couldn't the same be said for other current methods of communicating when the module is ready? (Through the global namespace)

@curiousdannii
Copy link
Contributor

@s-h-a-d-o-w Because a .then method is meant to be able to be called multiple times, and we should assume that many Emscripten users are currently doing so.

@s-h-a-d-o-w
Copy link

Ah, of course! (I even persist promises in a memoization library I wrote but it's generally something that I very rarely use, so I didn't think of that.)

Well... how about documenting here how users can write a wrapper for it?

Like:

  const importWASM = () => {
    return new Promise((resolve) => {
      require('./webp_wasm.js')().then((module) => {
        // Since the module contains a .then(), it needs to be 
        // wrapped to prevent infinite calls to it.
        resolve({module});
      });
    });
  };

  const {module} = await importWASM();

@lourd
Copy link
Contributor

lourd commented Feb 6, 2020

Just wanted to +1 this and add another case. When using Emscripten with TypeScript, the compiler will not allow the factory function to be awaited on. It gives the error: Type is referenced directly or indirectly in the fulfillment callback of its own 'then' method.. This playground demonstrates the compiler outcome. This is actually quite helpful, as it avoids the infinite loop at runtime, but for this, loading a module becomes a confusing mess because even returning the module from an async function causes this compiler error.

I'm happy to take on the work to fix this. Instead of adding the ready property, what about just not passing the Module to the callback in then? Or is ready the approach people would prefer?

Edit: It's also strange that Module is returned from then. Could that be changed as well? I can't imagine there's many users out there relying on the then API given this footgun.

@kripken
Copy link
Member

kripken commented Feb 7, 2020

Thanks @lourd !

I really would like us to make progress here. I have not been able to do much myself because I am not knowledgeable enough about conventions in the JS ecosystem, and no one else has had the time to work on it it seems.

Your suggestion sounds reasonable. I think we can even make it error in a clear way for existing users by, in ASSERTIONS builds, proving an argument to then() that throws with a message if it is used (and the message would explain what to do and/or point to the PR for more details etc.). So we can avoid silent / weird breakage.

Alternatively, given the severity of this issue, another option is as discussed above to delete .then after the first use. That would also be a breaking change, and also I think it's a reasonable one, which we can mitigate with a nice error message in ASSERTIONS etc.

I don't have a strong preference between the two (again, I don't know enough about this) so I'd really like to hear from the people that discussed this at length previously, which of those two is best. But I do think we should move forward with one of the two, and soon, as this has been a big footgun for far too long.

(Sounds good about the other proposed change, to not return Module from then() - again, with some method to make sure users get a nice error in ASSERTIONS.)

@curiousdannii
Copy link
Contributor

curiousdannii commented Feb 8, 2020

Alternatively, given the severity of this issue, another option is as discussed above to delete .then after the first use.

No, this is a bad option as it's meant to be able to be called more than once.

There's some good discussion at #9325 but I'm not sure if that PR should be merged as it's doing too many other things as well.

One more thought: it's not actually possible to add ready without removing then (from any builds that have ready), because if the object returned when ready resolves contains a then, the then will be called automatically.

Making it more restrictive is what I was aiming to do. Personally at least, for a very long time working with Emscripten I would consistently either forget to wait for initialization or forget how. Returning an opaque promise makes it obvious: wait for the promise and then you have a fully initialized Emscripten module.

I like these comments from @kainino0x and think there's a good argument for making the modularised function directly return a promise, because that way there's no accidentally using it before it's ready. For non-modularised builds a ready property could be made. In the first case it has to resolve with the module (otherwise you can't access it), and in the second it doesn't have to, but in both cases it couldn't have a then property. (Possibly this is exactly what your PR did, except that it requires you to change to MODULARIZE=2.)

Things would be simpler if the non-modularised option was deprecated. (And I think a terminology change would help - we have the confusing fact that "module" refers to what you pass in to it, as well as the object at the end of the process with all of the ASM functions and memory views, while "modularize" is really a factory function.)

@pwuertz
Copy link

pwuertz commented Feb 8, 2020

Changing the module-API during lifetime, e.g. being thenable at first, not thenable later, feels wrong. Never seen that kind of behavior in any other API.

Removing the then argument may fix the await for module readiness. But if tried to return a module from another async function, then() will still get into your way.

Just remove then() in favor of ready.

@bvibber
Copy link
Collaborator

bvibber commented Feb 8, 2020

I agree changing the API at runtime is icky, and will break on static analysis tools. The most natural API for modern JS I feel is for the module factory function to return a Promise, hiding the actual Module object until it's resolved through the promise. This would be a breaking change, but so would removing the then method on its own without making the change to the factory func.

@kainino0x
Copy link
Collaborator

kainino0x commented Feb 8, 2020

Another +1 to not removing then dynamically.

Removing then (statically) would be a nice easy fix, but I personally think we should go all the way to being completely opaque until the promise resolves.

I would greatly appreciate if someone else would pick up this task. I'm not working on it anymore; I have another way to achieve what I needed.

@lourd
Copy link
Contributor

lourd commented Feb 8, 2020

Thanks for the input everyone. It sounds like a good approach would be:

  • For modularized builds, calling the module factory function should return a Promise. That Promise should resolve with the module instance. If initialization fails, it should be rejected with... what? What currently happens?
  • For non-modularized builds, there should be a property named ready on the module that is a Promise. That Promise should be resolved once the Promise has been loaded.
  • Removing the then method altogether.

Is that all correct? Should I also make any changes to onRuntimeInitialized? Is there anywhere I should put warnings on assertion builds?

Any codebase guidance folks have would be appreciated too. Will this change be contained to postamble.js and tests, or will this change affect more source files?

@bvibber
Copy link
Collaborator

bvibber commented Feb 9, 2020

In case of failure, the promise should be rejected with an Error instance; for instance if an exception gets thrown and caught during a sync operation, you'd funnel that into the reject function (unless you want to replace the specific error with a more generic error condition, but it's easier to diagnose problems with the original one I find)

IIRC what happens now on failure is that an exception gets thrown and then your 'then' callback just never gets called. This is ok for manual debugging but terrible for automatic fallback behavior where you might want to replace the emscripten module with a nice error message or alternate experience. :)

@pwuertz
Copy link

pwuertz commented Feb 9, 2020

Edit: @Brion Ah sorry, totally misunderstood the point you were making! ( Hi again by the way ;) )

I'll just keep this here as a possible example for loading modules as discussed earlier:

// Promise with success/error handlers
loadModule().then(onModuleLoaded, onModuleFail);

// Promise with fallback and fail handling
loadModule()
.catch(error => { /* try returning fallback module or re-throw */ });
.then(module => { /* use module */ })
.catch(error => { /* show error */ });

// Async style
try {
  const module = await loadModule();
  /* use module */
} catch (error) {
  /* show error */
}

@juj
Copy link
Collaborator

juj commented Feb 9, 2020

Whatever the fix, it would be great if someone can contribute minimal test cases for the use cases/loading patterns that are being used that showcase this failure. Reading through the long conversation thread, I don't think I was able to find a test case. (sorry if I missed one)

If there was a PR that showcases "I would like to do this, but it fails - if you fix this loading code to work, all would be good", I think that would help towards what is needed. (also, do people have different loading patterns that exhibit this problem from different angles?)

@bvibber
Copy link
Collaborator

bvibber commented Feb 9, 2020

@pwuertz I mean that not getting a rejection is terrible. When you get one, you can handle it and that's great! :)

@juj the failure cases I want to catch are mostly things like "oops we were asked to load module with SIMD instructions but no SIMD support in browser" or "the .wasm file is missing or inaccessible, so can't fully instantiate". The former is dependent on actual browser behavior so might not be suitable for test cases; the latter could be simulated in tests by deliberately corrupting or deleting the wasm file, perhaps.

Similar to @pwuertz's samples above, a basic instantiation in classic (non-async) code would look something like:

MyCoolModule({
  locateFile: blah,
  options: foo,
}).then(function(module) {
  module._do_something();
}).catch(function(err) {
  // This will catch any errors during instantiation or
  // at runtime during the 'then' handler.
  // We could also put this before the 'then' if we wanted
  // to treat instantiation errors separately from runtime errors.
  displayError("Error loading module: " + err);
});

You could be fancier with a fallback to loading another module, but the important thing is just getting the 'catch' callback on failure.

@kripken
Copy link
Member

kripken commented Feb 12, 2020

Thanks for the feedback everyone! Sounds like there is general consensus on the solution @lourd and @Brion describe here? It sounds good to me. Perhaps a good next step is a PR from @lourd , including tests (which would also address @juj's point above)?

Regarding code, yes, I'd expect it to be just postamble.js + tests, unless I'm forgetting something. There is some modularize code emitted from emcc.py (see modularize()) but at a glance I don't think that handles .then() related things.

@lourd
Copy link
Contributor

lourd commented Mar 16, 2020

Just submitted a PR to start work on this. Would love any review or collaboration on it from folks subscribed here.

@lourd
Copy link
Contributor

lourd commented Mar 18, 2020

Running into some hairy issues with the changes needed to do this as related to pthreads and the MODULARIZE_INSTANCE option. See the PR for details, thanks in advance for your help.

kripken pushed a commit that referenced this issue May 8, 2020
This removes the Module.then function and replaces it with a proper
Promise being returned from the initialization function.

That is, when in MODULARIZE mode, calling `Module()` does not
return an instance, but instead returns a Promise you can wait on
for when it is ready.

Fixes #5820
@jeffRTC
Copy link

jeffRTC commented Jan 28, 2021

I'm investigating a browser crash. I came across this issue while browsing issues related to it. Can anyone here confirm that the below pattern is okay or not?

let ModuleInstance;

Module().then(instance => {
    ModuleInstance= instance;
    
    ModuleInstance.foo();
});

@lourd
Copy link
Contributor

lourd commented Jan 29, 2021

Yeah, that looks fine

@curiousdannii
Copy link
Contributor

Or alternatively:

const ModuleInstance = {}

Module(ModuleInstance).then(() => {ModuleInstance.foo()})

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.