Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tfjs 3.4.0 regression: wasm causes browser hang #4932

Closed
vladmandic opened this issue Apr 14, 2021 · 9 comments · Fixed by #4957
Closed

tfjs 3.4.0 regression: wasm causes browser hang #4932

vladmandic opened this issue Apr 14, 2021 · 9 comments · Fixed by #4957
Assignees
Labels
comp:wasm type:bug Something isn't working

Comments

@vladmandic
Copy link
Contributor

vladmandic commented Apr 14, 2021

tfjs 3.4.0 was just released and unfortunately wasm module is somewhat broken

to be precise, tfjs-backend-wasm.wasm and tfjs-backend-wasm-simd.wasm seem to work (although haven't tested in depth)
what's definitely broken is tfjs-backend-wasm-threaded-simd.wasm

and symptoms are simple - complete browser hang,
up to a point where browser window is nonresponsive to close requests and requires a hard kill

to reproduce:

  • enable WebAssembly SIMD support in chrome://flags
  • enable WebAssembly threads support in chrome://flags
  • set backend to wasm
  • run any model inference

this used to work in all recent versions of tfjs

environment: tfjs 3.4.0 on windows 10 with chrome 89 or edge 89

@vladmandic vladmandic added the type:bug Something isn't working label Apr 14, 2021
@rthadur
Copy link
Contributor

rthadur commented Apr 14, 2021

#4796 this might be related , will try to test in my local as well and update , thank you

cc @mattsoulanille

@lina128
Copy link
Collaborator

lina128 commented Apr 15, 2021

I test in my local, the problem is gone after turning off the SIMD flag in chrome. We are working on a fix. Thank you for reporting the bug!

@pyu10055 pyu10055 assigned mattsoulanille and unassigned rthadur Apr 15, 2021
@pyu10055
Copy link
Collaborator

@mattsoulanille This maybe related to the recent changes to bazel wasm build, #4769

@vladmandic
Copy link
Contributor Author

any updates on this one? as much as i'd love to use tfjs 3.4.0 due to support of tf2, this is a blocker for me.

@mattsoulanille
Copy link
Member

No updates yet, but I've been able to reproduce it. I agree with Ping that this is likely related to changes made in #4769, and I'm taking a look.

@vladmandic
Copy link
Contributor Author

@mattsoulanille - thanks - 'taking a look' is an update, just wanted to know what's going on as it's a blocker for me.

mattsoulanille added a commit that referenced this issue Apr 22, 2021
Without the linker option `PTHREAD_POOL_SIZE` set, the wasm backend does not create any webworkers and spins forever waiting for them to be created (crashing the browser). This PR sets `PTHREAD_POOL_SIZE` to `Math.min(4, Math.max(1, (navigator.hardwareConcurrency || 1) / 2))`, meaning it will use half of the logical cores available up to a max of 4 (or only 1 if `navigator.hardwareConcurrency` is not available). See [this comment](emscripten-core/emscripten#10263 (comment)) for details on why this works. This option apparently supersedes the value passed to [`pthreadpool_create` in `backend.cc`](https://github.com/tensorflow/tfjs/blob/c9dfebddfa34f531d2f0c363b6ea574a9fe9745d/tfjs-backend-wasm/src/cc/backend.cc#L65-L66), but its current setting should match the value computed in the `.cc` file. Fixes #4932 

This PR also fixes a bug where webworkers created by the wasm backend were not removed when `dispose` is called. Fixes #4796.

Fixes #4934 as well.
@google-ml-butler
Copy link

Are you satisfied with the resolution of your issue?
Yes
No

@vladmandic
Copy link
Contributor Author

@mattsoulanille thanks!

given it's a regression that causes hangs, does this warrant spinning up a tfjs service build (e.g., 3.4.1)?
(cc @pyu10055)

@lina128
Copy link
Collaborator

lina128 commented Apr 22, 2021

Hi @vladmandic , we are releasing 3.5.0 with this fix today.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:wasm type:bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants