-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[wasm] Fix wasm backend not starting any threads #4957
Conversation
Add PTHREAD_POOL_SIZE=4 to linker options to ensure threaded wasm creates workers.
The thread count computed in |
@@ -19,6 +19,7 @@ import './flags_wasm'; | |||
import {backend_util, BackendTimingInfo, DataStorage, DataType, deprecationWarn, engine, env, KernelBackend, TensorInfo, util} from '@tensorflow/tfjs-core'; | |||
|
|||
import {BackendWasmModule, WasmFactoryConfig} from '../wasm-out/tfjs-backend-wasm'; | |||
import {BackendWasmThreadedSimdModule} from '../wasm-out/tfjs-backend-wasm-threaded-simd'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the Simd module need a separate instance too or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The simd module doesn't need a separate interface because its API is the same as the non-simd API. The threaded simd module, however, does need a separate interface because it now has the PThread.terminateAllThreads
function exposed (via the type declarations).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 approvals obtained (waiting on @mattsoulanille)
tfjs-backend-wasm/src/backend_wasm.ts, line 22 at r1 (raw file):
Previously, mattsoulanille (Matthew Soulanille) wrote…
The simd module doesn't need a separate interface because its API is the same as the non-simd API. The threaded simd module, however, does need a separate interface because it now has the
PThread.terminateAllThreads
function exposed (via the type declarations).
Got it,
tfjs-backend-wasm/src/backend_wasm.ts, line 164 at r1 (raw file):
this.wasm.tfjs.dispose(); if ('PThread' in this.wasm) { this.wasm.PThread.terminateAllThreads();
Is PThread in both wasm types? Is it better to use a type guard, like:
function isWasmThreaded(object: BackendWasmModule|BackendWasmThreadedSimdModule): object is BackendWasmThreadedSimdModule {
return object != null && 'PThread' in object;
}
if (isWasmThreaded(this.wasm)) {
this.wasm.PThread.terminateAllThreads();
}
"-s PROXY_TO_PTHREAD=1", | ||
# Many x86-64 processors have 2 threads per core, so we divide by 2. | ||
"-s PTHREAD_POOL_SIZE=" + | ||
"'Math.min(4, Math.max(1, (navigator.hardwareConcurrency || 1) / 2))'", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it doable that we register a flag WASM_PTHREAD_POOL_SIZE
in flags_wasm.ts
and initialized with 'Math.min(4, Math.max(1, (navigator.hardwareConcurrency || 1) / 2))'
. Then in here and backend.cc
, we both use the flag value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I've found the root cause of this bug, and I'm pretty sure it's caused by the PROXY_TO_PTHREADS
link option not working. During my testing, I noticed that emscripten was running on the main thread, and that the browser freeze was due to the _emscripten_futex_wait
function running a busy wait loop. This matches the behavior described in the second paragraph of Blocking on the Main Browser Thread.
I'll look into why PROXY_TO_PTHREADS
isn't working as expected.
Reviewable status: 0 of 1 approvals obtained (waiting on @lina128, @mattsoulanille, and @qjia7)
tfjs-backend-wasm/src/backend_wasm.ts, line 164 at r1 (raw file):
Previously, lina128 (Na Li) wrote…
Is PThread in both wasm types? Is it better to use a type guard, like:
function isWasmThreaded(object: BackendWasmModule|BackendWasmThreadedSimdModule): object is BackendWasmThreadedSimdModule { return object != null && 'PThread' in object; } if (isWasmThreaded(this.wasm)) { this.wasm.PThread.terminateAllThreads(); }
PThread is only in the BackendWasmThreadedSimdModule
type (and it only actually appears in the JS object when the simd threaded build is run). We could use a type guard if that's the recommended style, but tsserver knows that this.wasm
is a BackendWasmThreadedSimdModule
after I check if ('PThread' in this.wasm)
, so I don't think a type guard is necessary.
tfjs-backend-wasm/src/cc/BUILD, line 58 at r1 (raw file):
Previously, qjia7 (Jiajia Qin) wrote…
Is it doable that we register a flag
WASM_PTHREAD_POOL_SIZE
inflags_wasm.ts
and initialized with'Math.min(4, Math.max(1, (navigator.hardwareConcurrency || 1) / 2))'
. Then in here andbackend.cc
, we both use the flag value?
This is a good idea, but I think it will have to be done in a separate PR after this fix is out. Feel free to create a feature request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you Matt. This is great! LGTM
Reviewable status:
complete! 1 of 1 approvals obtained (waiting on @lina128, @mattsoulanille, and @qjia7)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 1 approvals obtained (waiting on @lina128, @mattsoulanille, and @qjia7)
tfjs-backend-wasm/src/cc/BUILD, line 58 at r1 (raw file):
Previously, mattsoulanille (Matthew Soulanille) wrote…
This is a good idea, but I think it will have to be done in a separate PR after this fix is out. Feel free to create a feature request.
Hi Jiajia, the flag you proposed is different than this one, #4942, right? For both flags, do you expect them to be tunable?
Fixes #4934: |
Yes. #4942 is just a get method. I hope we can control the threads number or pool size. Do you know the relationship between the |
They should be same. When pthread_create() is called, if we need to create a new Web Worker, then that requires returning the main event loop. That is, you cannot call pthread_create and then keep running code synchronously that expects the worker to start running - it will only run after you return to the event loop. This is a violation of POSIX behavior and will break common code which creates a thread and immediately joins it or otherwise synchronously waits to observe an effect such as a memory write. There are several solutions to this: |
Without the linker option
PTHREAD_POOL_SIZE
set, the wasm backend does not create any webworkers and spins forever waiting for them to be created (crashing the browser). This PR setsPTHREAD_POOL_SIZE
toMath.min(4, Math.max(1, (navigator.hardwareConcurrency || 1) / 2))
, meaning it will use half of the logical cores available up to a max of 4 (or only 1 ifnavigator.hardwareConcurrency
is not available). See this comment for details on why this works. This option apparently supersedes the value passed topthreadpool_create
inbackend.cc
, but its current setting should match the value computed in the.cc
file. Fixes #4932This PR also fixes a bug where webworkers created by the wasm backend were not removed when
dispose
is called. Fixes #4796.Fixes #4934 as well.
To see the logs from the Cloud Build CI, please join either our discussion or announcement mailing list.
This change is![Reviewable](https://camo.githubusercontent.com/23b05f5fb48215c989e92cc44cf6512512d083132bd3daf689867c8d9d386888/68747470733a2f2f72657669657761626c652e696f2f7265766965775f627574746f6e2e737667)