Conversation

@kripken (Member) commented Nov 23, 2021

This includes one of the two methods from #14666. Back then we did not have
a clear use case for them, but now wasmfs will use sync_to_async, and in fact
@tlively and @ethanalee found a bug in that code, which further justifies
writing this logic once in a utility and reusing it rather than expecting users
(and ourselves) to keep reimplementing it.

The specific helper here, sync_to_async, spins up a pthread and then provides
an API to run code on that pthread. For wasmfs this can be useful for a backend
that proxies all operations to another thread, for example.

@kripken kripken requested review from sbc100 and tlively November 23, 2021 19:14
ChangeLog.md (Outdated)

3.0.1
-----
- Add `emscripten/thread_utils.h` helper header, which includes C++ utilities

Collaborator commented:

How about we add this as an internal detail of wasmfs for now, and we can publicize it later if/when we think it is something that is widely useful?

Also, that will allow us to iterate on it while developing wasmfs without breaking any folks who start using it.

In general I think we should try to avoid creating more API surfaces that we need to support.

@kripken (Member, Author) replied on Nov 23, 2021:

Fair points, changed to be internal in wasmfs for now.

I do think this is general enough, and hard enough to get right, that neither users nor we should be rewriting it. But we can consider that later.

// a thread, or when using PROXY_TO_PTHREAD), but you have code that is hard to
// refactor to be async, but that requires some async operation (like waiting
// for a JS event).
class sync_to_async {

Collaborator commented:

Isn't the coding style to use CamelCase for class names?

Member Author replied:

Fixed.

@tlively (Member) left a comment:

The worker thread should always either be holding the lock, preventing other threads from sending it new work, or it should be waiting on the condition variable for new work. Any time it unlocks without waiting on the condition variable is an opportunity for other threads to send work that could be overwritten and lost.

threadIter therefore needs to be holding childLock as both a pre- and post-condition.

That means that threadMain needs to acquire the lock before scheduling threadIter.

Furthermore, SyncToAsync() needs to ensure that the worker thread is waiting for work before it returns, otherwise the worker thread might not be ready and waiting by the time the first work arrives. So it needs to acquire the lock, start the worker thread, and wait on the condition variable. threadMain then needs to acquire the lock, notify the original thread, then schedule threadIter, all without releasing the lock. Once threadIter runs and waits, releasing the lock, SyncToAsync will be able to safely resume and return.
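To make that invariant concrete, here is a minimal, self-contained sketch of the protocol described above. It is not the code under review, and the names (SyncToAsyncSketch, State, invoke) are illustrative only:

#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

class SyncToAsyncSketch {
  // Protects all of the state below; the worker either holds it or is
  // waiting on `condition`, so no work can be published behind its back.
  std::mutex mutex;
  std::condition_variable condition;
  std::function<void()> work;
  enum class State { Idle, WorkAvailable, WorkDone };
  State state = State::Idle;
  bool workerReady = false;
  bool quit = false;
  std::thread worker;

public:
  SyncToAsyncSketch() {
    // Hold the lock so the worker cannot race ahead before we wait below.
    std::unique_lock<std::mutex> lock(mutex);
    worker = std::thread([&]() {
      // Acquire the lock, announce readiness, then wait for work. Waiting is
      // the only point where the worker releases the lock, which is what lets
      // the constructor resume; by then the worker is parked on the condvar.
      std::unique_lock<std::mutex> workerLock(mutex);
      workerReady = true;
      condition.notify_all();
      while (true) {
        condition.wait(workerLock, [&]() {
          return state == State::WorkAvailable || quit;
        });
        if (quit) {
          return;
        }
        // Run the work while still holding the lock so nothing can overwrite
        // `work` in the meantime.
        work();
        state = State::WorkDone;
        condition.notify_all();
      }
    });
    // Do not return until the worker is definitely waiting for work.
    condition.wait(lock, [&]() { return workerReady; });
  }

  ~SyncToAsyncSketch() {
    {
      std::unique_lock<std::mutex> lock(mutex);
      quit = true;
      condition.notify_all();
    }
    worker.join();
  }

  // Run `newWork` on the worker thread and block until it has finished.
  void invoke(std::function<void()> newWork) {
    std::unique_lock<std::mutex> lock(mutex);
    // Wait for the slot to be free so concurrent callers cannot trample
    // each other's pending work.
    condition.wait(lock, [&]() { return state == State::Idle; });
    work = std::move(newWork);
    state = State::WorkAvailable;
    condition.notify_all();
    condition.wait(lock, [&]() { return state == State::WorkDone; });
    state = State::Idle;
    condition.notify_all();
  }
};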

Comment on lines +94 to +97
parent->childLock.lock();
parent->condition.wait(parent->childLock, [&]() {
  return parent->readyToWork;
});

Member commented:

It looks like it's possible for this code to miss the first message that some work should be done. Would it make sense to have the spawning thread wait to be notified that the worker thread is ready to accept work before continuing?

Comment on lines +104 to +105
parent->childLock.unlock();
parent->condition.notify_one();

Member commented:

This unlock also looks fishy. Multiple invoker threads could send new work while the worker thread does not hold the lock and is not sleeping, and they would all trample each other, so work would be lost.

Comment on lines +119 to +123
SyncToAsync() : childLock(mutex) {
  // The child lock is associated with the mutex, which takes the lock as we
  // connect them, and so we must free it here so that the child can use it.
  // Only the child will lock/unlock it from now on.
  childLock.unlock();

Member commented:

This looks like it could be better handled with a std::unique_lock and its constructor that takes a std::defer_lock_t. (Although I think that actually we shouldn't be releasing the lock at all here.)
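For reference, a minimal illustration of that constructor, independent of the surrounding code:

#include <mutex>

void example() {
  std::mutex m;
  // Associate the unique_lock with the mutex without acquiring it, so there
  // is no need to lock in the constructor and immediately unlock afterwards.
  std::unique_lock<std::mutex> lock(m, std::defer_lock);
  // ... later, acquire it explicitly only when it is actually needed ...
  lock.lock();
}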

Comment on lines +148 to +149
finishedWork = false;
readyToWork = true;

Member commented:

I think it would be simpler if these were combined into a single piece of state.
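For example, the two flags could collapse into a single enum that each thread waits on; this is the shape used in the sketch after the review summary above, and the names are illustrative only:

enum class State { Idle, WorkAvailable, WorkDone };
// Invoker: wait for Idle, publish work, set WorkAvailable, wait for WorkDone.
// Worker:  wait for WorkAvailable, run the work, set WorkDone.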

@kripken (Member, Author) commented Nov 23, 2021

Thanks for the feedback @tlively !

You know this sort of thing better than me - would you prefer to pick up this PR? I'd be ok either way - happy to do the work to get through the comments, but if you'd prefer to rewrite this, that sgtm too.

@tlively (Member) commented Nov 24, 2021

Sure, I'll push some edits to this branch as a new commit. Reviewing the diff will probably be more productive than trying to work through my comments.

@tlively (Member) commented Nov 24, 2021

Proposed changes to this branch uploaded as a separate PR: #15621.
