Fix malloc() to work with -sWASM_WORKERS #16703

Open

juj wants to merge 8 commits into main from malloc_with_wasm_workers
Conversation

Collaborator
@juj juj commented Apr 12, 2022

Fix malloc() to work with -sWASM_WORKERS. Currently only emmalloc works with Wasm Workers; dlmalloc does not yet. Fix emscripten_semaphore_async_acquire(). Add a test.

EDIT: It is actually simpler to add both dlmalloc and emmalloc support in the same PR instead of deferring the dlmalloc part to a later PR, so this PR has been updated to do both.
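For context, the scenario this enables looks roughly like the sketch below: a Wasm Worker calling malloc()/free() off the main thread. This is only an illustration using the emscripten/wasm_worker.h API, not the actual test added in this PR.

#include <emscripten/wasm_worker.h>
#include <emscripten/console.h>
#include <stdlib.h>

// Runs inside the Wasm Worker: exercise malloc()/free() off the main thread.
void worker_main(void)
{
  void *ptr = malloc(1024);
  emscripten_console_log(ptr ? "malloc() succeeded in a Wasm Worker" : "malloc() failed in a Wasm Worker");
  free(ptr);
}

int main(void)
{
  // Spawn a Wasm Worker with a small stack and post a function for it to run.
  emscripten_wasm_worker_t worker = emscripten_malloc_wasm_worker(/*stackSize=*/1024);
  emscripten_wasm_worker_post_function_v(worker, worker_main);
  return 0;
}

// Build with e.g.: emcc test.c -sWASM_WORKERS -o test.html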

juj added 2 commits April 12, 2022 20:36
…l work with Wasm Workers, but dlmalloc does not yet work. Fix emscripten_semaphore_async_acquire(). Add a test.
@juj juj mentioned this pull request Apr 12, 2022
…lloc-ww-debug-noerrno' but try to use 'libemmalloc-debug-ww-noerrno'
@juj juj force-pushed the malloc_with_wasm_workers branch from da7571c to d170d1a on April 13, 2022 09:59
@juj juj enabled auto-merge (squash) April 13, 2022 10:37
@juj juj disabled auto-merge April 13, 2022 15:10
…ince they need to configure different in Wasm Workers vs pthreads build modes.
@juj juj force-pushed the malloc_with_wasm_workers branch from 44dcb4d to 778842c on April 13, 2022 15:12
#define USE_LOCKS 1
#define USE_SPIN_LOCKS 0 // Ensure we use pthread_mutex_t.
#endif

Collaborator

I think I'd rather see this land as a separate PR since it doesn't just affect WASM_WORKERS.

It's also rather involved to figure out what this change does. As far as I can tell it means that USE_SPIN_LOCKS gets defined to 1 by default, which ends up also defining USE_LOCKS, and then USE_SPIN_LOCKS ends up using __sync_lock_test_and_set and __sync_lock_release for locking instead of pthread_mutex_t? Is that right?
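For reference, the spin-lock pattern built on those two builtins looks roughly like this (a minimal generic sketch, not dlmalloc's actual code):

static volatile int lock_word = 0; /* hypothetical lock variable: 0 = unlocked, 1 = locked */

static void spin_acquire(void)
{
  /* __sync_lock_test_and_set() atomically stores 1 and returns the previous value;
     keep spinning while the previous value was already 1 (lock held by someone else). */
  while (__sync_lock_test_and_set(&lock_word, 1))
    ; /* busy-wait */
}

static void spin_release(void)
{
  /* Atomically clears the lock word, with release semantics. */
  __sync_lock_release(&lock_word);
}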

Anyway, it seems involved enough for its own PR, I think.

Collaborator Author

@juj juj Apr 14, 2022

I don't think it is a productive use of time to artificially split this so that Wasm Workers is first fixed to work with emmalloc, and only later with dlmalloc.

This change just moves the definition of USE_LOCKS and USE_SPIN_LOCKS from dlmalloc.c into system_libs.py. I had to do that migration for the following reason:

Earlier in system_libs.py, I had decided to save on the permutational explosion by not building all four combinations of {no threads, pthreads} x {no wasm workers, wasm workers} of the system libraries. Instead, the combination "pthreads but no wasm workers" is omitted as redundant, so for each multithreaded system library we only build three variants: "singlethreaded" (no thread safety), "wasm workers" (use atomics and shared memory so the user can link in wasm workers), and "wasm workers+pthreads" (use atomics, shared memory and the pthread API; the user can link in wasm workers).

However, now with dlmalloc it looks like we do want to configure it specifically for the "pthreads but not wasm workers" case, because that is the scenario where we want dlmalloc to use pthread mutex locks. When building with pthreads+wasm workers we want to use spinlocks instead, because malloc could be called from wasm workers, which cannot use the pthread mutex API.

Maybe there will be more code in the future that requires such a special case, and we will need to build all four variants, but so far this seems to be the only place, so I opted to do this configuration specifically for libmalloc only, to keep embuilder build times smaller.

# Target dlmalloc multithreading aware implementation if using pthreads or Wasm Workers
cflags += [f'-DUSE_LOCKS={self.thread_safe}']
# When using Wasm Workers, target spinlocks. Otherwise target pthread mutexes when using pthreads.
cflags += [f'-DUSE_SPIN_LOCKS={self.use_spinlocks}']
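In other words, the effect of the two macros inside dlmalloc is roughly the following (a simplified sketch of my reading, not the verbatim dlmalloc source):

#if !USE_LOCKS
  /* No locking at all: the single-threaded build. */
#elif USE_SPIN_LOCKS
  /* Spinlock-based locking: also callable from Wasm Workers, which cannot take a pthread mutex. */
#else
  /* pthread_mutex_t based locking: the "pthreads but no Wasm Workers" build. */
#endif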
Collaborator

Why not just always define USE_SPIN_LOCKS for consistency ... and it looks like USE_LOCKS gets defined to 1 when USE_SPIN_LOCKS is set.

Collaborator Author

If I used USE_SPIN_LOCKS always, the obvious review comment would be "can you benchmark that it does not regress performance for any user?", which I really don't want to do. I wanted to keep the existing behavior for existing pthreads users, to be sure that there is no danger of regressing.

Collaborator Author

@juj juj Apr 14, 2022

"it looks like USE_LOCKS gets defined to 1 when USE_SPIN_LOCKS is set."

I did not want to change the structure of these flags from the way they were originally set. The documentation in dlmalloc reads:

Thread-safety: NOT thread-safe unless USE_LOCKS defined non-zero
When USE_LOCKS is defined, each public call to malloc, free,
etc is surrounded with a lock. By default, this uses a plain
pthread mutex, win32 critical section, or a spin-lock if if
available for the platform and not disabled by setting
USE_SPIN_LOCKS=0. However, if USE_RECURSIVE_LOCKS is defined,

suggesting that the intent is that one first sets USE_LOCKS to 1 to enable thread safety, and then further adjusts, e.g. by setting USE_SPIN_LOCKS to 0, to customize which type of locking to use.

Originally, on the dlmalloc.c side, Emscripten configured both of these flags, so I opted to preserve the same behavior. Passing -DUSE_SPIN_LOCKS without defining -DUSE_LOCKS would not follow the wording in the docs, and there is no harm in explicitly passing both?

Collaborator

"If I used USE_SPIN_LOCKS always, the obvious review comment would be "can you benchmark that it does not regress performance for any user?", which I really don't want to do. I wanted to keep the existing behavior for existing pthreads users, to be sure that there is no danger of regressing."

I think this is a worthwhile thing to do, if only to avoid the extra complexity of adding an additional library variant. It seems highly beneficial to be able to use the same wasm code both in wasm workers mode and without, no? And then we can avoid this new -ww variant of the library.

Collaborator

If spinlocks are good enough for wasm workers, why wouldn't they be good enough for the normal pthread use case?

…ime rounding". It does look like optimizer does sweep the runtime rounding away, so no bother optimizing manually.

This reverts commit 837c731.
@@ -1271,6 +1271,8 @@ def __init__(self, **kwargs):
    self.memvalidate = kwargs.pop('memvalidate')
    self.verbose = kwargs.pop('verbose')
    self.is_debug = kwargs.pop('is_debug') or self.memvalidate or self.verbose
    self.thread_safe = kwargs.pop('thread_safe')
Collaborator

This library already inherits from MTLibrary and has the is_mt attribute. Is this not the same thing?
