Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bysyncify support: Asyncify/Emterpreter-like functionality for the wasm backend #8808

Merged
merged 32 commits into from Jun 21, 2019

Conversation

Projects
None yet
4 participants
@kripken
Copy link
Member

commented Jun 17, 2019

Basically gets us up to parity with fastcomp in terms of async support. This support should be faster and better. However, it doesn't have whitelist/blacklist and coroutine support yet (unsure how important those are). But the main features all work:

  • Sleep support (emscripten_sleep, SDL_Delay, etc.)
  • Emscripten sync APIs (emscripten_wget* and others).
  • Synchronous fsync syscall.
  • Browser main loop integration.

This enables on the wasm backend a bunch of previously emterpreter-only tests.

cc @gabrielcuvillier - if you test this on Doom let me know how it works!

@kripken kripken requested review from sbc100 and jgravelle-google Jun 17, 2019

@sbc100

This comment has been minimized.

Copy link
Collaborator

commented Jun 20, 2019

Amazing work alon! I didn't see the binaryen side yet but I guess that is where the heavy lifting is.

@sbc100

sbc100 approved these changes Jun 20, 2019

Show resolved Hide resolved tests/runner.py Outdated
Show resolved Hide resolved tests/test_browser.py

@kripken kripken merged commit 7eb910a into incoming Jun 21, 2019

26 of 27 checks passed

ci/circleci: test-browser-firefox Your tests failed on CircleCI
Details
ci/circleci: build Your tests passed on CircleCI!
Details
ci/circleci: build-docs Your tests passed on CircleCI!
Details
ci/circleci: build-upstream Your tests passed on CircleCI!
Details
ci/circleci: flake8 Your tests passed on CircleCI!
Details
ci/circleci: test-ab Your tests passed on CircleCI!
Details
ci/circleci: test-browser-chrome Your tests passed on CircleCI!
Details
ci/circleci: test-c Your tests passed on CircleCI!
Details
ci/circleci: test-d Your tests passed on CircleCI!
Details
ci/circleci: test-e Your tests passed on CircleCI!
Details
ci/circleci: test-f Your tests passed on CircleCI!
Details
ci/circleci: test-ghi Your tests passed on CircleCI!
Details
ci/circleci: test-jklmno Your tests passed on CircleCI!
Details
ci/circleci: test-other Your tests passed on CircleCI!
Details
ci/circleci: test-p Your tests passed on CircleCI!
Details
ci/circleci: test-qrst Your tests passed on CircleCI!
Details
ci/circleci: test-sanity Your tests passed on CircleCI!
Details
ci/circleci: test-upstream-browser-chrome Your tests passed on CircleCI!
Details
ci/circleci: test-upstream-browser-firefox Your tests passed on CircleCI!
Details
ci/circleci: test-upstream-other Your tests passed on CircleCI!
Details
ci/circleci: test-upstream-wasm0 Your tests passed on CircleCI!
Details
ci/circleci: test-upstream-wasm2 Your tests passed on CircleCI!
Details
ci/circleci: test-uvwxyz Your tests passed on CircleCI!
Details
ci/circleci: test-wasm0 Your tests passed on CircleCI!
Details
ci/circleci: test-wasm1 Your tests passed on CircleCI!
Details
ci/circleci: test-wasm2 Your tests passed on CircleCI!
Details
ci/circleci: test-wasm3 Your tests passed on CircleCI!
Details

@kripken kripken deleted the bysyncify branch Jun 21, 2019

@Keno

This comment has been minimized.

Copy link

commented Jun 23, 2019

How do I try this out? I installed latest emscripten/binaryen using emsdk, and just to test added a simple emscripten_sleep call in my code, as well as setting -s BYSYNCIFY=1. However, I'm seeing TypeError: Module._bysyncify_stop_unwind is not a function in the console. Is there some other setting I need to set?

@kripken

This comment has been minimized.

Copy link
Member Author

commented Jun 23, 2019

@Keno That should work. But you may need to use the bysyncify2 branch in this repo, and very latest master in binaryen, as I'm in the process of fixing and optimizing some final things.

If that doesn't fix things for you, does say ./tests/runner.py browser.test_async_iostream work? If nothing obvious explains what you see, please share your full testcase, could be a bug.

I hope to finish this and have a summarizing blogpost out in a week or two.

@Keno

This comment has been minimized.

Copy link

commented Jun 23, 2019

I think the problem might be that I'm not using the LLVM wasm backend (which unfortunately doesn't work for me for reasons I don't quite understand).

@kripken

This comment has been minimized.

Copy link
Member Author

commented Jun 23, 2019

Ah yes, it doesn't work without the wasm backend. I'll add an error for fastcomp.

Btw, please report anything that doesn't work in the wasm backend - it should be stable at this point, we intend to switch to it by default soon.

@Keno

This comment has been minimized.

Copy link

commented Jun 23, 2019

Yeah, will do. Right now the symptom is that mpfr fails to find gmp when I try to build both with the LLVM backend (worked fine with fastcomp). I'll dig one level deeper and then file an issue with whatever project is responsible.

@Keno

This comment has been minimized.

Copy link

commented Jun 23, 2019

Found my error. GMP was using a ranlib that doesn't understand wasm, causing the .a to not have a symbol table and wasm-ld to fail to find the symbols in it (but not otherwise complaining). Seems like that'll be a fairly common problem, so probably a good idea for wasm-ld to warn or error about.

@Keno

This comment has been minimized.

Copy link

commented Jun 24, 2019

Alright, after many hours and a bunch of fixes and workarounds up and down the tool chain, I've finally gotten everything built with the LLVM backend. Good news and bad news: The unwind works, and the timeout gets scheduled properly also. However, it errors during the rewind with a rather cryptic RuntimeError: unreachable executed at bysyncify_start_rewind http://localhost:8888/hello.wasm:24290850. I'll dig in further, but let me know if there's something obvious I need to look at.

@kripken

This comment has been minimized.

Copy link
Member Author

commented Jun 25, 2019

@Keno thanks for testing!

I've been working on setting up fuzzing, and looks like there are some fuzz bugs. So maybe you're seeing one of those that I haven't fixed yet. However, another possibility is that the unreachable in those functions is basically an assertion for the data structure being valid. Are you using the normal emscripten code emscripten_sleep etc. or the lower level handleSleep? Or did you manually call bysyncify_start_rewind yourself?

@Keno

This comment has been minimized.

Copy link

commented Jun 25, 2019

I'm just using the regular emscripten_sleep for now, with no special handling anywhere else. I was gonna look at the data structure that's passed to rewind, but unfortunately doing so crashes Firefox and chrome won't even get that far due to stack size limits at the moment, so I'll have to probably resolve those issues, before I can continue debugging this.

@kripken

This comment has been minimized.

Copy link
Member Author

commented Jun 25, 2019

I see, thanks @Keno. In that case maybe wait until I finish all the fuzz bugs, as then things should be a lot more stable. I can let you know when that is.

@kripken

This comment has been minimized.

Copy link
Member Author

commented Jun 25, 2019

@Keno I fixed the fuzz issues I was seeing, and landed bysyncify2 meanwhile. So worth testing on latest binaryen master and emscripten incoming now, I think.

@Akaricchi

This comment has been minimized.

Copy link

commented Jun 26, 2019

How feasible would it be to implement coroutines? I'm currently experimenting with coroutines in my C game project, if I could keep emscripten compatibility that'd be awesome. You can definitely expect some testing from me.

@kripken

This comment has been minimized.

Copy link
Member Author

commented Jun 26, 2019

It's definitely possible to add, if someone is interested to do that. Bysyncify implements unwinding and rewinding, and for sleeping there is just one "context" that is unwound and rewound. For coroutines there would be multiple ones being juggled, and also the C stack may need to be saved, but that's mostly it I think. Here is where the emterpreter support was added, might be helpful (but the mechanisms are different), #6014

@Keno

This comment has been minimized.

Copy link

commented Jun 29, 2019

@kripken I've attempted to do some more debugging and I think my problem has to do with bysyncify unwinding through functions on the stack that are using longjmp. Is the expected to be supported?

@Keno

This comment has been minimized.

Copy link

commented Jun 29, 2019

In particular, if I'm interpreting what I'm seeing correctly, it looks like the binaryen pass doesn't resolve the invoke_* call targets, just treating them as imports not in the bysyncify import list and thus assuming that they cannot start an unwind.

@kripken

This comment has been minimized.

Copy link
Member Author

commented Jun 29, 2019

Thanks @Keno, yeah, I forot that and I guess the emterpreter/asyncify tests didn't cover it. I'll have a fix soon.

@Keno

This comment has been minimized.

Copy link

commented Jul 1, 2019

Can confirm that things work with #8895. There's two remaining issues, though I can work around them on my side and run with this a bit more:

  1. The unwind buffer is not checked against overflow (or if it's support to be that isn't working), so if it's too small (which the default is most of the time in my case), it goes around overwriting random portions of heap (and failing to rewind).
  2. Activating bysyncify seems to blow up stack usage (?). In any case, Chrome starts throwing RangeError: Maximum call stack size exceeded, with very few (tens) stack frames on the stack when bysyncify is turned on. Not sure if this is from the extra frames introduced by instrumenting the exports of if Chrome measures something about the transformed wasm functions.
@kripken

This comment has been minimized.

Copy link
Member Author

commented Jul 1, 2019

@Keno - do you have a testcase for the second issue? That seems very surprising, not sure what could cause that.

For the first issue, it should check for overflow (after unwinding completes), so sounds like something's wrong, I'll look into that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.