Permalink
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
364 lines (266 sloc) 15.1 KB

Atomics.waitAsync (Stage 2 proposal)

Author: Lars T Hansen (lhansen@mozilla.com), April 2017

Overview and background

We provide a new API, Atomics.waitAsync, that an agent can use to wait on a shared memory location (to later be awoken by some agent calling Atomics.wake on that location) without waiting synchronously (ie, without blocking). Notably this API is useful in agents whose [[CanBlock]] attribute is false, such as the main thread of a web browser document, but the API is not restricted to such agents.

The API is promise-based. Very high performance is not a requirement, but good performance is desirable.

This API was not included in ES2017 so as to simplify the initial API of shared memory and atomics, but it is a desirable API for smooth integration of shared memory into the idiom of ECMAScript as used in web browsers. A simple polyfill is possible but a native implementation will likely have much better performance (both memory footprint and execution time) than the polyfill.

Examples of usage: see example.html in this directory.

Prior history: Related APIs have been proposed before, indeed one variant was in early drafts of the shared memory proposal under the name Atomics.futexWaitCallback.

Synopsis

Atomics.waitAsync(i32a, index, value, [timeout]) => result

The arguments are interpreted and checked as for Atomics.wait. If argument checking fails an exception is thrown synchronously, as for Atomics.wait.

  • i32a is an Int32Array mapping a SharedArrayBuffer
  • index is a valid index within i32a
  • value will be converted to Int32 and compared against the contents of i32a[index]
  • timeout, if present, is a timeout value

The result is a promise. The promise can be resolved with a string value, one of "ok", "timed-out", "not-equal"; the value has the same meaning as for the return type of Atomics.wait. The promise is never rejected.

Agents can call Atomics.wake on some location corresponding to i32a[index] to wake any agent waiting with Atomics.waitAsync. The agent performing the wake does not need to know how that waiter is waiting, whether with wait or with waitAsync.

Notable facts (informal semantics)

Multiple agents can waitAsync on the same location at the same time. A wake on the location will resolve all the waiters' promises (as many as the count argument to wake allows for).

A single agent can waitAsync multiple times on a single location before any of the waits are resolved. A wake on the location will resolve (sequentially) all the promises (as many as the count argument allows for).

Some agents can wait and other agents can waitAsync on the same location at the same time, and a wake will wake waiters regardless of how they are waiting.

A single agent can first waitAsync on a location and then, before that wait is resolved, wait on the same location.

A single agent can waitAsync on a location and can then wake on that location to resolve that promise within itself.

More generally, an agent can waitAsync on a location and only subsequent to that take action that will cause some agent to perform a wake. For this reason, an implementation of waitAsync that blocks is not viable.

Agents that wait with waitAsync participate in the same fairness scheme as agents that wait with wait: When an agent performs a wake with a count s.t. it does not wake all waiting agents, waiting agents are woken in a predictable order. This order is the order in which agents called wait or waitAsync, except when an agent calls wait after waitAsync (on the same location) when it is more complicated; see below.

Semantics

Several changes to the ES2017 spec are necessary to accomodate Atomics.waitAsync. These changes are not in themselves intended to change semantics.

Primarily, timeouts are made less magical by introducing a notion of concurrent "alarms" which are thunks that are run in response to expired timeouts and perform wakeups using the standard wakeup mechanisms. Alarms are named by IDs and sets of these IDs hang off the WaiterLists. The sets are in some sense optional, but clarify the coordination between alarms and explict wakeups as well as the lifetime of the alarm thunks.

Secondarily, the job of removing a waiter has been moved to the thread performing a wakeup, whether that is another agent performing a wake or the concurrent alarm thread.

With that background, the changes necessary to add waitAsync boil down to generalizing the WaiterList to hold both sync and async waiters, and to inserting waiters into that list in an order that ensures the desired wakeup order.

24.4.1.3, GetWaiterList

Change this section as follows.

Before the first paragraph, insert this paragraph:

"A Waiter is a pair (agent, null | promise) where agent is the agent signifier of the agent that is waiting. Agents that are waiting in Atomics.wait will be designated by a Waiter where the second element of the pair is null; agents that are waiting in Atomics.waitAsync will be designated by a Waiter where the second element of the pair is the promise that is to be resolved when the agent is awoken."

In the current first paragraph, change the word "agent" to "Waiter".

After the current first paragraph, insert these two paragraphs:

"There can be multiple entries in a WaiterList with the same agent signifier, but at most one of those will have null in the second element of the entry."

"The WaiterList has an attached alarm set, a set of truthy values. This set is manipulated only when the thread manipulating it is in the critical section for the WaiterList."

In the current third paragraph, add "adding and removing alarms" to the list of guarded operations.

Spec note: The alarm system may perhaps be further formalized so that the clause about the possibly absent alarm ID is not needed in the spec of CancelAlarm, but I'm not sure this would actually clarify anything.

24.4.1.6, AddWaiter( WL, W, promise )

The abstract operation AddWaiter takes three arguments, a WaiterList WL, an agent signifier W, and promise, which is either null or a Promise object. It performs the following steps:

  1. Assert: The calling agent is in the critical section for WL.
  2. Assert: (W, promise) is not on the list of waiters in any WaiterList.
  3. If promise is null:
    1. Insert (W, null) into the list of waiters in WL directly before the first other entry (W, x) in the list, or to the end of the list if there is no such entry.
  4. Else,
    1. Add (W, promise) to the end of the list of waiters in WL.

Spec note: The purpose of prioritizing sync waits is to avoid surprises in a corner case: Suppose an agent A async-waits and then sync waits on the same location, and another agent B performs a wakeup on that location with a count of 1. If waits are inserted strictly in order nothing will happen, because it's A's async-wait that will end, while A's sync wait keeps A sleeping; it is only when A is woken again that the sync wait ends, A runs to completion, and the promise for the async-wait can be resolved. With the prioritization A will run more quickly, which seems most reasonable.

24.4.1.7, RemoveWaiter( WL, W, promise )

The abstract operation RemoveWaiter takes three arguments, a WaiterList WL, an agent signifier W, and promise, which is either null or a Promise object. It performs the following steps:

  1. Assert: The calling agent is in the critical section for WL.
  2. Assert: (W, promise) is on the list of waiters in WL.
  3. Remove (W, promise) from the list of waiters in WL.

24.4.1.9, Suspend( WL, W)

Change this function not to take a 'timeout' argument. Timeouts are now handled in the caller. (Not intended as a semantic change.)

Change step 5 to be this:

  1. Perform LeaveCriticalSection(WL) and suspend W, performing the combined operation in such a way that a wakeup that arrives after the critical section is exited but before the suspension takes effect is not lost. W can wake up only because it was woken explicitly by another agent calling WakeWaiter(WL, W), not for any other reasons at all.

Remove steps 7 and 8.

24.4.1.10, WakeWaiter( WL, W, promise )

The abstract operation WakeWaiter takes three arguments, a WaiterList WL, an agent signifier W, and promise, which is either null or a Promise object. It performs the following steps:

  1. Assert: The calling agent is in the critical section for WL.
  2. Assert: (W, promise) is on the list of waiters in WL.
  3. If promise is null:
    1. Wake the agent W.
  4. Else,
    1. Make promise resolveable. (Spec details TBD.)

24.4.1.13, AddAlarm(WL, alarmFn, timeout)

Spec note: A new section.

AddAlarm takes three arguments, a WaiterList WL, a thunk alarmFn, and a nonnegative finite number timeout. It performs the following steps:

  1. Assert: the current thread is in the critical section on WL.
  2. Let alarm be a truthy value that is not in WL's alarm set.
  3. Add alarm to WL's alarm set.
  4. After timeout milliseconds has passed, perform the following actions on a concurrent thread:
    1. Perform EnterCriticalSection(WL).
    2. If alarm is in WL's alarm set:
      1. Remove alarm from WL's alarm set.
      2. Perform alarmFn().
    3. Perform LeaveCriticalSection(WL).
    4. Note: alarmFn is now dead.
  5. Return alarm.

24.4.1.4, CancelAlarm(WL, alarm)

Spec note: A new section.

CancelAlarm takes two arguments, a WaiterList WL and an alarm ID alarm. It performs the following steps:

  1. Assert: the current thread is in the critical section on WL
  2. Assert: alarm is a truthy value.
  3. Remove alarm from WL's alarm set (it may not be present).
  4. Note: No alarm that subsequently triggers for alarm (in the concurrent thread referenced in AddAlarm) will have any effect. The thunk associated with alarm is now dead and can be reclaimed; any scheduled timeout associated with alarm can be canceled.

24.4.11, Atomics.wait(typedArray, index, value, timeout)

Replace Steps 16-19 with the following:

  1. Perform AddWaiter(WL, W, null).
  2. Let awoken be true.
  3. Let alarm be false.
  4. If q is finite then:
    1. Let alarmFn be a function of no arguments that does the following:
      1. Set awoken to false.
      2. Perform RemoveWaiter(WL, W, null)
      3. Perform WakeWaiter(WL, W, null).
    2. Set alarm to AddAlarm(WL, alarmFn, q).
  5. Perform Suspend(WL, W)
  6. If awoken is true and alarm is not false:
    1. Perform CancelAlarm(WL, alarm)
  7. Assert: (W, null) is not on the list of waiters in WL.

24.4.12, Atomics.wake(typedArray, index, count)

RemoveWaiters no longer returns a list of pairs, but a list of waiters, which are (agent, null | Promise) pairs, so modify step 12 of this algorithm as follows:

  1. Repeat, while S is not an empty List,
    1. Let W be the first waiter in S.
    2. Remove W from the front of S.
    3. Assert: W is a pair (A, p)
    4. Perform WakeWaiter(WL, A, p).
    5. Add 1 to n.

Also note the subtle change of the word agent to the work waiter in substep a.

24.4.15, Atomics.waitAsync( typedArray, index, value, timeout )

Spec note: A new section. This is substantially similar to Atomics.wait.

  1. Let buffer be ? ValidateSharedIntegerTypedArray(typedArray, true).
  2. Let i be ? ValidateAtomicAccess(typedArray, index).
  3. Let v be ? ToInt32(value).
  4. Let q be ? ToNumber(timeout).
  5. If q is NaN, let t be + infinity, else let t be max(q, 0).
  6. Let block be buffer.[[ArrayBufferData]].
  7. Let offset be typedArray.[[ByteOffset]].
  8. Let indexedPosition be (i × 4) + offset.
  9. Let WL be GetWaiterList(block, indexedPosition).
  10. Perform EnterCriticalSection(WL).
  11. Let w be ! AtomicLoad(typedArray, i).
  12. If v is not equal to w, then
    1. Perform LeaveCriticalSection(WL).
    2. Return a new Promise resolved with the String "not-equal" (as for Promise.resolve)
  13. Let W be AgentSignifier().
  14. Let awoken be true.
  15. Let alarm be false.
  16. Let executor be a function of two arguments, resolve and reject, that does the following:
    1. If awoken is true then
      1. If alarm is not false:
        1. Perform CancelAlarm(WL, alarm)
      2. Invoke resolve on the string "ok"
    2. else
      1. Invoke resolve on the string "timed-out"
  17. Let P be a new Promise created with the executor executor
  18. If q is finite:
    1. Let alarmFn be a function of no arguments that does the following:
      1. Set awoken to false.
      2. Perform RemoveWaiter(WL, W, P)
      3. Perform WakeWaiter(WL, W, P)
    2. Set alarm to AddAlarm(WL, alarmFn, q)
  19. Perform AddWaiter(WL, W, P)
  20. Perform LeaveCriticalSection(WL).
  21. Return P.

Polyfills

A simple polyfill is possible.

We can think of the semantics as being those of an implementation that creates a new helper agent for each waitAsync call; this helper agent performs a normal synchronous wait on the location; and when the wait is complete, it sends an asynchronous signal to the originating agent to resolve the promise with the appropriate result value.

This polyfill models every situation, except for the situation where an agent performs a waitAsync followed by a wait and another agent subsequently asks to wake just one waiter - in the real semantics, the wait is woken first, in the polyfill the waitAsync is woken first.

As suggested by the semantics, in the Web domain it uses a helper Worker that performs a synchronous wait on behalf of the agent that is performing the waitAsync; that agent and the helper communicate by message passing to coordinate waiting and promise resolution.

As Workers are heavyweight and message passing is relatively slow, the polyfill does not have excellent performance, but it is a reasonable implementation and has good fidelity with the semantics. (Helpers can be reused within the agent that created them.)

The polyfill will not work in agents that cannot create new Worker objects, either if they are too limited (worklets?) or if nested Workers are not allowed (some browsers) or if a Worker cannot be created from a data: URL.

See polyfill.js in this directory for the polyfill and example.html for some test cases.

Implementation challenges

It would seem that multiple implementation strategies are possible, from having a thread per async wait (as the polyfill has) to having no additional threads at all, instead dispatching runnables to existing event loops in response to wakeups when records for async waits are found in the wait/wake data structures (a likely strategy for Firefox, for example).

Performance and optimizations

Synchronous resolution (rejected)

For performance reasons it might appear that it is desirable to "resolve the promise synchronously" if possible. Leaving aside what that would mean for a minute, this was explored and the committee decided it did not like the complexity of it. Additionally, it's easy enough to handle manually when it is necessary. See SYNC-RESOLVE.md for a writeup of the details around this idea.