Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.
Sign upAllow Atomics.wait() to throw if a deadlock is guaranteed #1018
Comments
addaleax
changed the title from
Allow Atomics.wait() to throw if a deadlock is guranteed
to
Allow Atomics.wait() to throw if a deadlock is guaranteed
Oct 7, 2017
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
jfbastien
Oct 7, 2017
It would be useful to understand why you think this is useful, compared to a deadlock.
jfbastien
commented
Oct 7, 2017
|
It would be useful to understand why you think this is useful, compared to a deadlock. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
TimothyGu
Oct 7, 2017
Member
@jfbastien No (sane, at least) program would want to deadlock, period. I know this sounds a bit curt but I simply cannot identify a use case for deadlocking in such situations.
|
@jfbastien No (sane, at least) program would want to deadlock, period. I know this sounds a bit curt but I simply cannot identify a use case for deadlocking in such situations. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
addaleax
Oct 7, 2017
@jfbastien This is mostly coming from a huge appreciation of deadlock detection mechanisms in other MT libraries for other languages (edit to clarify: as a consumer of those libraries, I mean).
I realize Atomics.wait() is supposed to be a building block for low-level mechanisms, but I don’t think this particular thing could be implemented in userland as well as in the threading implementations itself – also, people are going to make mistakes.
addaleax
commented
Oct 7, 2017
•
|
@jfbastien This is mostly coming from a huge appreciation of deadlock detection mechanisms in other MT libraries for other languages (edit to clarify: as a consumer of those libraries, I mean). I realize |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
syg
Oct 8, 2017
Member
In general, I would not mind some quick set of shallow, eager, and cheap deadlock detection, but we should tread carefully (and I am skeptical), for at least the following reasons.
-
The deadlock detection should not be implementation-dependent and needs to be fully defined.
-
JS having very different multiple embeddings greatly constrains the kind of deadlock detection that is feasible. Your proposal might be desirable for node, but the concepts of "agent" and "agent cluster" in the spec are loose enough that it's not clear at all to me what the minimal desirable set is, or if there is one.
It may be the case in node that, if at the time of calling
Atomics.wait, if the calling agent is the only with access to the underlying shared memory, no new agents that can access that shared memory may be spawned in the agent cluster until thatAtomics.waitcall returns. Is that true of all embeddings now? And if it is, is that a constraint we wish to impose on all embeddings? -
Spec-wise, it is too late to add this behavior. The bar is higher than 1) it'd be nice to have and 2) it might not break the web in reality. And throwing is not in line with the rest of the
Atomics.waitAPI. It would return a string literal indicating deadlock if anything.
(I tried searching the issues in the old repo and didn't see any find any existing discussion about simple deadlock detection. @lars-t-hansen should chime in, this might very well have been discussed already.)
|
In general, I would not mind some quick set of shallow, eager, and cheap deadlock detection, but we should tread carefully (and I am skeptical), for at least the following reasons.
(I tried searching the issues in the old repo and didn't see any find any existing discussion about simple deadlock detection. @lars-t-hansen should chime in, this might very well have been discussed already.) |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
addaleax
Oct 8, 2017
The deadlock detection should not be implementation-dependent and needs to be fully defined.
I’m not disagreeing, but it would be helpful for me to have a short summary of why that is a hard and fast rule?
And throwing is not in line with the rest of the
Atomics.waitAPI.
Fwiw, Atomics.wait already throws if it’s not allowed, and unlike the possible return values of Atomics.wait, this condition always points to a programmer error. (Not that I’m set on this specific behavior, just pointing this out.)
Spec-wise, it is too late to add this behavior.
Is that a positive answer to the question “Is this something that should/needs to go through all stages as a proposal”?
addaleax
commented
Oct 8, 2017
I’m not disagreeing, but it would be helpful for me to have a short summary of why that is a hard and fast rule?
Fwiw,
Is that a positive answer to the question “Is this something that should/needs to go through all stages as a proposal”? |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
syg
Oct 8, 2017
Member
I’m not disagreeing, but it would be helpful for me to have a short summary of why that is a hard and fast rule?
The point of the standard is to give a hard guarantee of interop among standard-conforming implementations. Having something like Atomics.wait throw on some implementations and not others is a nonstarter. We've made an effort to have SAB in general have no UB.
Fwiw, Atomics.wait already throws if it’s not allowed, and unlike the possible return values of Atomics.wait, this condition always points to a programmer error. (Not that I’m set on this specific behavior, just pointing this out.)
Ah that's right, it does throw on agents whose [[CanBlock]] is false. My bad.
Is that a positive answer to the question “Is this something that should/needs to go through all stages as a proposal”?
It should be a proposal that goes through the stages, yeah. I don't know if the ship has already sailed though. Making error cases non-errors is backwards compatible, the inverse isn't.
The point of the standard is to give a hard guarantee of interop among standard-conforming implementations. Having something like
Ah that's right, it does throw on agents whose [[CanBlock]] is
It should be a proposal that goes through the stages, yeah. I don't know if the ship has already sailed though. Making error cases non-errors is backwards compatible, the inverse isn't. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
addaleax
Oct 8, 2017
Okay, thank you, these were helpful answers. I’ll think about it a bit.
Making error cases non-errors is backwards compatible, the inverse isn't.
I wholeheartedly agree, but I am not sure actual deadlocks can count as non-error cases. ;)
addaleax
commented
Oct 8, 2017
|
Okay, thank you, these were helpful answers. I’ll think about it a bit.
I wholeheartedly agree, but I am not sure actual deadlocks can count as non-error cases. ;) |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
jasnell
Oct 8, 2017
The idea of not throwing but returning a value that indicates deadlock is an interesting approach on this. It would certainly provide the caller with a bit of flexibility on handling that avoids having to handle a throw in the process.
On the part of use cases, there are some rather abstract academic use cases I can imagine where a deadlock is actually helpful but none that would apply in uses of this API. I would argue that such detection is definitely worthwhile. That said, reading through this, I definitely agree the discussion should be raised with tc39 as a possible new stage-0 proposal.
jasnell
commented
Oct 8, 2017
|
The idea of not throwing but returning a value that indicates deadlock is an interesting approach on this. It would certainly provide the caller with a bit of flexibility on handling that avoids having to handle a throw in the process. On the part of use cases, there are some rather abstract academic use cases I can imagine where a deadlock is actually helpful but none that would apply in uses of this API. I would argue that such detection is definitely worthwhile. That said, reading through this, I definitely agree the discussion should be raised with tc39 as a possible new stage-0 proposal. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
erights
Oct 8, 2017
What would a deterministic spec that said exactly which deadlocks are detected look like? (Only MUST and MUST NOT, no SHOULDs.) Which genuine deadlocks should we omit? Which are infeasible to detect on some platform? (and why?)
Note that I am not asking about whether the occurrences of the deadlocks could be made deterministic. They clearly cannot be.
erights
commented
Oct 8, 2017
|
What would a deterministic spec that said exactly which deadlocks are detected look like? (Only MUST and MUST NOT, no SHOULDs.) Which genuine deadlocks should we omit? Which are infeasible to detect on some platform? (and why?) Note that I am not asking about whether the occurrences of the deadlocks could be made deterministic. They clearly cannot be. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
binji
Oct 8, 2017
It may be the case in node that, if at the time of calling Atomics.wait, if the calling agent is the only with access to the underlying shared memory, no new agents that can access that shared memory may be spawned in the agent cluster until that Atomics.wait call returns. Is that true of all embeddings now?
At least in Chrome, it is currently possible to postMessage a SAB and not know immediately whether it will arrive in an agent that can receive it or not. (The issue comes from having transferable message ports and multiple processes; seems like other browsers should have the same issue, not sure if they have some clever way of dealing with this.) So at the time of a call to Atomics.wait, it is not clear whether it is a guaranteed deadlock or not. You could always explicitly check for this case ("is there an in-flight SAB?"), but I'd imagine it might be tricky to specify.
It's also worth mentioning that we are planning to support a wait instruction in WebAssembly as well (see the threads proposal). This uses a WebAssembly.Memory instead, but a SAB can be extracted from this, so we'd want to coordinate there too.
binji
commented
Oct 8, 2017
At least in Chrome, it is currently possible to postMessage a SAB and not know immediately whether it will arrive in an agent that can receive it or not. (The issue comes from having transferable message ports and multiple processes; seems like other browsers should have the same issue, not sure if they have some clever way of dealing with this.) So at the time of a call to It's also worth mentioning that we are planning to support a |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
lars-t-hansen
Oct 9, 2017
Contributor
Atomics.wait() could, at least in many cases, and maybe optionally, could identify situations in which a deadlock is guaranteed to happen. In particular:
- When the current Agent is the only one that has access to the underlying SharedArrayBuffer memory
- When all other Agents that have access to the underlying SharedArrayBuffer memory are inside an Atomics.wait() call themselves
Those basically boil down to the same case (all agents sharing a SAB would be waiting), which is fine.
Deadlock detection was discussed a little bit during the design work, mostly in the context of tools though, I don't think we ever really talked about building it in. Partly this is because, on the web, the main thread can't block but will likely be the coordinating agent in many cases and responsible for distributing the SAB. As a consequence, there aren't likely to be many cases where deadlock can actually be detected - any SAB referenced by the main thread is effectively deadlock-free so far as any simple algorithm is concerned.
Suppose A and B share a SAB and for the sake of argument, both are allowed to wait. Then:
- Suppose A goes away, or more interestingly, A GC's its copy of the SAB. Now B waits; this is a deadlock. Should we detect this? If so, @erights wants to talk to you about observing GC. Also, deadlock detection becomes racy, but this is just @binji's issue inverted, I think.
- Suppose B waits. Then A GC's its copy of the SAB. Should B's wait now throw a deadlock exception?
- Suppose A waits and B then waits; there's a deadlock. Do A and B both get an exception or just B? (Or just A?)
- Suppose there's another shared SAB, and A waits on a location in the first SAB and B waits on a location in the second SAB. Do we detect this? (I think we'd almost have to, and I don't think it would necessarily be complicated, it just needs to be in there somehow.)
I suspect it would be more fruitful to work on a standard/popular locking library that could build in the notion of deadlocks by detecting lock cycles, perhaps. Most programs and programmers should not be using raw wait/wake calls in the first place. (For instance, I have a locking library at https://github.com/lars-t-hansen/js-lock-and-condition, though it does not detect deadlocks. On the other hand it has a notion of main-thread locks, so deadlock detection could become more meaningful.)
Those basically boil down to the same case (all agents sharing a SAB would be waiting), which is fine. Deadlock detection was discussed a little bit during the design work, mostly in the context of tools though, I don't think we ever really talked about building it in. Partly this is because, on the web, the main thread can't block but will likely be the coordinating agent in many cases and responsible for distributing the SAB. As a consequence, there aren't likely to be many cases where deadlock can actually be detected - any SAB referenced by the main thread is effectively deadlock-free so far as any simple algorithm is concerned. Suppose A and B share a SAB and for the sake of argument, both are allowed to wait. Then:
I suspect it would be more fruitful to work on a standard/popular locking library that could build in the notion of deadlocks by detecting lock cycles, perhaps. Most programs and programmers should not be using raw wait/wake calls in the first place. (For instance, I have a locking library at https://github.com/lars-t-hansen/js-lock-and-condition, though it does not detect deadlocks. On the other hand it has a notion of main-thread locks, so deadlock detection could become more meaningful.) |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
jfbastien
Oct 9, 2017
@jfbastien No (sane, at least) program would want to deadlock, period. I know this sounds a bit curt but I simply cannot identify a use case for deadlocking in such situations.
Sure, I agree there's little practical use to a deadlock. I'd still like to know why the proposed approach is the right one. Will a SAB user do something useful when a deadlock is detected? I'd venture that any deadlock handling code won't be tested because... well they'd have avoided the deadlock in the first place if they could.
At least from my POV, whenever I encounter a deadlock I know exactly how to debug it. Are you saying that having an exception would allow devs to collect backtraces on their user's machines and know "hey I have a deadlock somewhere"? That seems worthwhile, but I'd like you to detail these use cases.
@jfbastien This is mostly coming from a huge appreciation of deadlock detection mechanisms in other MT libraries for other languages (edit to clarify: as a consumer of those libraries, I mean).
What was done in other MT libraries, along with their rationale, would be useful information to add to this proposal.
I realize
Atomics.wait()is supposed to be a building block for low-level mechanisms, but I don’t think this particular thing could be implemented in userland as well as in the threading implementations itself
That claim is useful to add to the proposal.
– also, people are going to make mistakes.
Sure, but is your proposal going to help anyone when they do make a mistake? I'm not saying it won't, I even propose something useful above, I'm asking you to explain why anything other than a deadlock is better. I'm asking because I'd vote strongly against any proposal that makes matters worst, or which doesn't claim to improve anything.
Also remember than the browser is allowed to terminate pages. ECMAScript can say what it wants, but an implementation which seems unresponsive can get terminated. I think you need to make the case that throwing is better than a deadlock or page termination.
Lastly, please also consider the interaction with non-JavaScript agents as mentioned by others above. In a browser we expect SAB and WebAssembly to communicate with each other, and be able to wait on each other. I'd like to make sure that what you proposal won't break down in such a setup.
jfbastien
commented
Oct 9, 2017
•
Sure, I agree there's little practical use to a deadlock. I'd still like to know why the proposed approach is the right one. Will a SAB user do something useful when a deadlock is detected? I'd venture that any deadlock handling code won't be tested because... well they'd have avoided the deadlock in the first place if they could. At least from my POV, whenever I encounter a deadlock I know exactly how to debug it. Are you saying that having an exception would allow devs to collect backtraces on their user's machines and know "hey I have a deadlock somewhere"? That seems worthwhile, but I'd like you to detail these use cases.
What was done in other MT libraries, along with their rationale, would be useful information to add to this proposal.
That claim is useful to add to the proposal.
Sure, but is your proposal going to help anyone when they do make a mistake? I'm not saying it won't, I even propose something useful above, I'm asking you to explain why anything other than a deadlock is better. I'm asking because I'd vote strongly against any proposal that makes matters worst, or which doesn't claim to improve anything. Also remember than the browser is allowed to terminate pages. ECMAScript can say what it wants, but an implementation which seems unresponsive can get terminated. I think you need to make the case that throwing is better than a deadlock or page termination. Lastly, please also consider the interaction with non-JavaScript agents as mentioned by others above. In a browser we expect SAB and WebAssembly to communicate with each other, and be able to wait on each other. I'd like to make sure that what you proposal won't break down in such a setup. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
addaleax
Oct 19, 2017
I’m closing this. It seems determinism is a must-have for this feature, and that restricts it too much for what I had in mind – if anybody else thinks this is a good idea and wants to push it, feel free to do so. :)
addaleax
commented
Oct 19, 2017
|
I’m closing this. It seems determinism is a must-have for this feature, and that restricts it too much for what I had in mind – if anybody else thinks this is a good idea and wants to push it, feel free to do so. :) |
addaleax commentedOct 7, 2017
•
edited
Hey everyone! I’m not sure this is the right place or form to bring this up, please send me somewhere else if there’s something better.
I’m working on a Workers implementation for Node (or rather, a fork of it).
SharedArrayBufferandAtomicsare supported right now; One thing I’ve been thinking about is thatAtomics.wait()could, at least in many cases, and maybe optionally, could identify situations in which a deadlock is guaranteed to happen. In particular:SharedArrayBuffermemorySharedArrayBuffermemory are inside anAtomics.wait()call themselvesSo, what I would like to propose is to allow an
Atomics.wait()call to throw an error if it is guaranteed to never be fulfilled.A few notes:
pthread_mutex_lock()is allowed to fail withEDEADLK(although with somewhat different semantics)SharedArrayBuffer+Atomicusage is widespread enough in browsers for this to be a breaking change, e.g. it is still behind a flag in implementations, and there are to my knowledge no other implementations of it outside of browsersI would expect this feature to be optional to implementations, but I don’t feel like I can make a good call on that.
Now, my questions:
/cc @TimothyGu @lars-t-hansen @binji @jasnell