What changed for proxying between 2.0.8 and 2.0.23? #14312

Open
jozefchutka opened this issue May 27, 2021 · 14 comments

@jozefchutka

jozefchutka commented May 27, 2021

Hi, in emsdk 2.0.8 my emcc args included:

-s USE_PTHREADS=1
-s PROXY_TO_PTHREAD=1
-s MODULARIZE=1
-s EXPORTED_FUNCTIONS="[_main, _proxy_main]"

followed by javascript:

core.ccall('proxy_main', ...);

However, building with 2.0.23 fails with: emcc: error: undefined exported symbol: "_proxy_main" [-Wundefined] [-Werror]

I can remove _proxy_main from the list, but then how do I run main proxied?

@jozefchutka
Author

It seems I can ccall('emscripten_proxy_main'), is that the intended change?

@kripken
Member

kripken commented May 27, 2021

I think proxy_main has always been an internal detail - you should not need to export it, or call it, or even be aware of it. Just setting PROXY_TO_PTHREAD will do everything for you automatically.

For examples, you can look in tests/*, which has many tests for PROXY_TO_PTHREAD, and it never mentions proxy_main.

@jozefchutka
Author

Hi @kripken ,

Following "Build FFmpeg WebAssembly version", my compilation uses -s PROXY_TO_PTHREAD=1; however, there is a significant difference between running:

core.ccall('proxy_main', ...); // 2.0.8
core._emscripten_proxy_main(...) // 2.0.23

vs.

core._main(...)

The difference is that if I do not call the proxy variants, the execution seems to happen on the main thread. Am I doing anything wrong?

@sbc100
Collaborator

sbc100 commented May 28, 2021

Normally one does not call main directly with emscripten; it gets called from callMain(), which is called from run(), which in turn is called automatically during startup.

See:

emscripten/src/postamble.js

Lines 118 to 138 in 087ca39

function callMain(args) {
#if ASSERTIONS
assert(runDependencies == 0, 'cannot call main when async dependencies remain! (listen on Module["onRuntimeInitialized"])');
assert(__ATPRERUN__.length == 0, 'cannot call main when preRun functions remain to be called');
#endif
#if STANDALONE_WASM
#if EXPECT_MAIN
var entryFunction = Module['__start'];
#else
var entryFunction = Module['__initialize'];
#endif
#else
#if PROXY_TO_PTHREAD
// User requested the PROXY_TO_PTHREAD option, so call a stub main which pthread_create()s a new thread
// that will call the user's real main() for the application.
var entryFunction = Module['_emscripten_proxy_main'];
#else
var entryFunction = Module['_main'];
#endif
#endif

Because you are building with INVOKE_RUN=0, this does not happen in your case. Looking at the docs for INVOKE_RUN, it seems to recommend that you call Module.callMain():

// Whether we will run the main() function. Disable if you embed the generated
// code in your own, and will call main() yourself at the right time (which you
// can do with Module.callMain(), with an optional parameter of commandline args).
// [link]
var INVOKE_RUN = 1;

@jozefchutka
Author

jozefchutka commented Jun 2, 2021

Hi @sbc100 ,
thanks for the instructions, I am trying to follow, but my module has no callMain()

These are my make instructions: https://github.com/jozefchutka/ffmpeg.wasm-core/blob/yscene/wasm/build-scripts/build-ffmpeg.sh . After loading and calling const module = await createFFmpegCore(), the module has module._main(), module._emscripten_proxy_main(), module.run() etc., but there is no module.callMain() ...

I am still not sure why it diverges from what you suggested. Do I need to explicitly expose callMain?

I will try to expose it and see if it works. Should I expect any different behavior when calling callMain vs. _emscripten_proxy_main?

@sbc100
Collaborator

sbc100 commented Jun 2, 2021

I think you need to add -sEXPORTED_RUNTIME_METHODS=callMain.
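
As a rough illustration of this suggestion, the sketch below assumes a build with -s MODULARIZE=1, -s INVOKE_RUN=0 and -sEXPORTED_RUNTIME_METHODS=callMain. `runMain` is a hypothetical helper name, not part of Emscripten, and `createModule` stands for the generated factory (createFFmpegCore in this thread).

```javascript
// Sketch only: `runMain` is a hypothetical helper, not an Emscripten API.
// `createModule` stands for the MODULARIZE factory (e.g. createFFmpegCore).
async function runMain(createModule, args) {
  const module = await createModule();
  if (typeof module.callMain !== "function") {
    // callMain is only attached to the module object when it is listed
    // in EXPORTED_RUNTIME_METHODS at link time.
    throw new Error(
      "callMain is not exported; link with -sEXPORTED_RUNTIME_METHODS=callMain"
    );
  }
  // callMain sets up argc/argv itself; per the postamble.js excerpt above,
  // it picks _emscripten_proxy_main when PROXY_TO_PTHREAD is enabled.
  module.callMain(args);
  return module;
}
```

Usage would look like `await runMain(createFFmpegCore, ['-hide_banner'])`. The `args` array should hold only the options, on the assumption that callMain supplies the program name as argv[0] itself.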

@jozefchutka
Author

I have exported callMain() and am able to call it now. But sometimes, when it is called multiple times in a row, it throws:

ErrnoError {node: undefined, errno: 44, message: "FS error", stack: "<generic error, no stack>", setErrno: ƒ}
errno: 44
message: "FS error"
node: undefined
setErrno: ƒ (errno)
stack: "<generic error, no stack>"

...which differs from calling _emscripten_proxy_main() directly, which throws no exception.

Any ideas?

@jozefchutka
Author

It seems to have something to do with how the arguments are generated.

The algorithm in callMain does

var argc=args.length+1;
var argv=stackAlloc((argc+1)*4);
GROWABLE_HEAP_I32()[argv>>>2]=allocateUTF8OnStack(thisProgram);
for(var i=1;i<argc;i++){
	GROWABLE_HEAP_I32()[(argv>>2)+i>>>0]=allocateUTF8OnStack(args[i-1])
}
GROWABLE_HEAP_I32()[(argv>>2)+argc>>>0]=0;

while the one from the FFmpeg article does

const args = ['ffmpeg', '-hide_banner'];
const argsPtr = Module._malloc(args.length * Uint32Array.BYTES_PER_ELEMENT);
args.forEach((s, idx) => {
  const buf = Module._malloc(s.length + 1);
  Module.writeAsciiToMemory(s, buf);
  Module.setValue(argsPtr + (Uint32Array.BYTES_PER_ELEMENT * idx), buf, 'i32');
})

The latter seems more stable in my case.

I am considering sticking with emsdk 2.0.8, which:

  • seems more stable
  • allows me to re-call my main method while using EXIT_RUNTIME=1
  • nicely disposes workers when exit() is called

2.0.23 falls short of 2.0.8 on all three points.

@kripken
Member

kripken commented Jun 3, 2021

main() is meant to be called once, and so calling callMain multiple times is not supported. It is possible that calling main in an internal/direct way happens to work, but that is not safe in general - it depends on how things are set up in the startup process.

I'd recommend avoiding calling main multiple times, and instead export a function for that purpose, and call it as many times as you need.

(The "dispose of workers when exit" issue seems separate, and may be worth filing a bug if you see an issue there?)

@jozefchutka
Author

Hi @kripken ,

I managed to export an additional function, notmain. FFmpeg does a good job of reinitializing everything necessary inside main(), so the code looks like:

int notmain(int argc, char **argv)
{
    return main(argc, argv);
}

Can you please point me in the right direction, or to documentation, on how to call it multiple times via pthreads?

  1. What's the proper way of calling my notmain from JavaScript? I can see Module._notmain but not Module._emscripten_proxy_notmain, and I need the executions to happen in pthreads/workers.
  2. How do I properly pass arguments? Which method from #14312 (comment) should I use, and is there anything available in Module I could call instead of rebuilding this algorithm myself?
  3. I find it hard to follow the docs when it comes to EXIT_RUNTIME=1. My intention is to call notmain() multiple times on the same Module instance, as each execution creates files in the filesystem etc. However, at some later point I want to exit the Module and make sure there are no leaks.

I have seen some related activity on #14367, thanks for that. I would love to see more docs or solid instructions on how to proceed when main() (or any other function) is to be called multiple times.

@kripken
Member

kripken commented Jun 4, 2021

notmain calling main is also not valid, I think. main is a very special function in the C world - it's called automatically by the runtime for you, and exactly once.

It sounds like your codebase wants to actually run main multiple times - perhaps using MODULARIZE and creating a new instance for each invocation is the right thing?
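
A minimal sketch of that per-invocation approach follows, assuming a MODULARIZE build whose factory returns a promise and which exports callMain via EXPORTED_RUNTIME_METHODS. `runEachInFreshInstance` is a hypothetical name used only for illustration.

```javascript
// Sketch only: `runEachInFreshInstance` is a hypothetical helper.
// `createModule` stands for the MODULARIZE factory; callMain is assumed
// to be exported via EXPORTED_RUNTIME_METHODS.
async function runEachInFreshInstance(createModule, jobs) {
  const modules = [];
  for (const args of jobs) {
    // Each instance has its own memory and runtime state, so main()
    // still runs exactly once per instance.
    const module = await createModule();
    module.callMain(args);
    modules.push(module);
  }
  return modules;
}
```

One trade-off to keep in mind: each instance also has its own in-memory filesystem, so files written by one run are not visible to the next unless they are copied across.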

@sbc100
Collaborator

sbc100 commented Jun 4, 2021

Although calling main multiple times is technically not a good idea, I think in this case it is probably OK.

You could solve both issues by modifying the code or by using -Dmain=notmain on the command line. That way main no longer exists and you would just export _notmain. This would avoid the wrapper function and the "calling main more than once" issue.

The important thing is that all the static constructors and startup code in emscripten's run() method are called just once (i.e. just as if we were using the module as a library).

@sbc100
Collaborator

sbc100 commented Jun 4, 2021

  1. What's the proper way of calling my notmain from JavaScript? I can see Module._notmain but not Module._emscripten_proxy_notmain, and I need the executions to happen in pthreads/workers.

You probably want to create your own wrapper that creates the thread and just returns. PROXY_TO_PTHREAD only really works for programs with a single main that is called once. So in this case you don't want PROXY_TO_PTHREAD but would need to roll your own runner. The code is pretty simple:

int emscripten_proxy_main(int argc, char** argv) {
  pthread_attr_t attr;
  pthread_attr_init(&attr);
  // Use the size of the current stack, which is the normal size of the stack
  // that main() would have without PROXY_TO_PTHREAD.
  pthread_attr_setstacksize(&attr, emscripten_stack_get_base() - emscripten_stack_get_end());
  // Pass special ID -1 to the list of transferred canvases to denote that the thread creation
  // should instead take a list of canvases that are specified from the command line with
  // -s OFFSCREENCANVASES_TO_PTHREAD linker flag.
  emscripten_pthread_attr_settransferredcanvases(&attr, (const char*)-1);
  _main_argc = argc;
  _main_argv = argv;
  pthread_t thread;
  int rc = pthread_create(&thread, &attr, _main_thread, NULL);
  pthread_attr_destroy(&attr);
  return rc;
}

Alternatively, you could create your worker in JS and use postMessage to send the work over to it.
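
One possible shape for that JS-side alternative is sketched below. The message format (`id`, `args`, `exitCode`, `error`) and the names `runInWorker` and `makeJobHandler` are made up for illustration; only `postMessage` and the `message` event come from the standard Worker API.

```javascript
// Sketch only: the message shape and both helper names are hypothetical.
let nextJobId = 0;

// Main-thread side: send one job and resolve when the matching reply arrives.
function runInWorker(worker, args) {
  const id = nextJobId++;
  return new Promise((resolve, reject) => {
    const onMessage = (event) => {
      if (event.data.id !== id) return; // reply belongs to a different job
      worker.removeEventListener("message", onMessage);
      if (event.data.error !== undefined) reject(new Error(event.data.error));
      else resolve(event.data.exitCode);
    };
    worker.addEventListener("message", onMessage);
    worker.postMessage({ id, args });
  });
}

// Worker side: wrap a synchronous `runJob(args)` into a message handler.
// `post` is expected to forward its argument to the main thread.
function makeJobHandler(runJob, post) {
  return (event) => {
    const { id, args } = event.data;
    try {
      post({ id, exitCode: runJob(args) });
    } catch (err) {
      post({ id, error: String(err) });
    }
  };
}
```

Inside the worker script this might be wired up as `self.onmessage = makeJobHandler(runJob, (msg) => self.postMessage(msg))`, where `runJob` would call the exported entry point (for example `_notmain`) of a module instance loaded in that worker.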

  2. How do I properly pass arguments? Which method from #14312 (comment) should I use, and is there anything available in Module I could call instead of rebuilding this algorithm myself?

The code in that comment looks reasonable, assuming you need to pass strings. You can write a simple wrapper that takes JS strings and converts them to an array of strings.

BTW, when building a library it is often easier to avoid the generic argv-style argument passing and use something more specific.
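
A sketch of such a string-converting wrapper is shown below. `withArgv` is a hypothetical helper, not an Emscripten API; it assumes `_malloc` and `_free` are listed in EXPORTED_FUNCTIONS and that `lengthBytesUTF8`, `stringToUTF8` and `setValue` are listed in EXPORTED_RUNTIME_METHODS. Unlike the snippet from the article, it sizes buffers by UTF-8 byte length and NULL-terminates the argv array.

```javascript
// Sketch only: `withArgv` is a hypothetical helper, not an Emscripten API.
// Assumes _malloc/_free are exported functions and that lengthBytesUTF8,
// stringToUTF8 and setValue are exported runtime methods.
function withArgv(Module, args, fn) {
  const PTR = 4; // pointer size on wasm32
  const ptrs = [];
  // One extra slot so that argv[argc] is a NULL terminator, as C expects.
  const argv = Module._malloc((args.length + 1) * PTR);
  try {
    args.forEach((s, i) => {
      const size = Module.lengthBytesUTF8(s) + 1; // + trailing '\0'
      const buf = Module._malloc(size);
      Module.stringToUTF8(s, buf, size);
      ptrs.push(buf);
      Module.setValue(argv + i * PTR, buf, "i32");
    });
    Module.setValue(argv + args.length * PTR, 0, "i32");
    return fn(args.length, argv);
  } finally {
    // Only safe when `fn` has finished using argv by the time it returns.
    ptrs.forEach((p) => Module._free(p));
    Module._free(argv);
  }
}
```

A call might look like `withArgv(Module, ['ffmpeg', '-hide_banner'], (argc, argv) => Module._notmain(argc, argv))`. If the callee hands argv to another thread and returns immediately, as a proxying wrapper would, the buffers must outlive that thread, so freeing them in `finally` is only appropriate for synchronous calls.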

  3. I find it hard to follow the docs when it comes to EXIT_RUNTIME=1. My intention is to call notmain() multiple times on the same Module instance, as each execution creates files in the filesystem etc. However, at some later point I want to exit the Module and make sure there are no leaks.

You can set EXIT_RUNTIME=1 and it should do what you want. The module will stay alive until someone calls exit(). One issue to be aware of: if any of your calls to notmain ever calls exit (or, for example, abort or assert), the module becomes technically unusable, since it will "exit" and run any static destructors. (This might not matter for your app since it is C and not C++, but technically you should not call into the module after it has exited, and we have asserts in debug builds that let you know if you do this.)

@jozefchutka
Author

Hi @kripken , @sbc100 thanks for your valuable instructions. I think I have a solid starting point now. It would be also nice to have this in documentation so it can help more people struggling with the same.
