unbounded supply {} + react {} = pseudo-hang #6056
Comments
From @japhb

See the following gist: https://gist.github.com/japhb/40772099ed24e20ec2c37c06f434594b (If you run that at the command line, you'll probably want to pipe it to Essentially it appears that unlike the friendly one-at-a-time behavior of

From @jnthn

On Fri, 03 Feb 2017 21:20:59 -0800, gjb@google.com wrote:
Firstly, the boring observations: there are two mistakes in the gist.

1) A role is not a closure, so:
2) In the react example, there is $s1.done, when I presume $s2.done was meant.

Even with these corrected, the behavior under consideration still occurs.

The deadlock we're seeing here is thanks to the intersection of two individually reasonable things.

The first is the general principle that supplies are about taming, not introducing, concurrency. There are, of course, a number of Supply factory methods that will introduce concurrency (Supply.interval, for example), together with a number of supply operators that also will - typically, anything involving time, such as the delay method. Naturally, schedule-on also can. But these are all quite explicitly asking for the concurrency (and all are delegating to something else - a scheduler - to actually provide it).

The second, which is in some ways a follow-on from the first, is the actor-like semantics of supply and react blocks. Only one thread may be inside of a given instance of a supply or react block at a time, including any of the whenever blocks inside of it. This has two important consequences:

1) You can be sure your setup logic inside of the supply or react block will complete before any messages are processed.
2) You can be sure that you'll never end up with data races on any of the variables declared inside of your supply or react block because only one message will be processed at a time.

This all works out well if the supply being tapped truly *is* an asynchronous source of data - which is what supplies are primarily aimed at. In the case we're considering here, however, it is not. Thanks to the first principle, we don't introduce concurrency, so we tap the supply on the thread running the react block's body. It never hands back control due to the loop inside of it, running straight into the concurrency control mechanism.
A one-word fix is to introduce a bit of concurrency explicitly: react { With this, the react block's setup can complete, and then it starts processing the messages.

Longer term, a back-pressure model for supplies is something that wants looking into, designing, and implementing. I put this off on the basis that Rx.Net is plenty useful without one, and RxJava introduced one after its initial release. Taken together, there was no incentive to rush one in. However, we might be able to find a solution in that space for this particular case.

That said, back when I was teaching async programming, I always made a point to note that the places where synchrony and asynchrony meet are often sources of trouble. Here, a supply block whose body runs synchronously runs up against a construct (react) and data structure (Supply) whose designs are optimized for dealing with asynchronous data. Reduced to its essence, the code submitted here and the C# code I would show my students to illustrate the problem look strikingly similar: a blocking subscription prevents message processing, leading to a deadlock.

It's worth noting that this general problem can *not* be solved through a back-pressure mechanism; it can only solve cases like the one in this ticket, where emit can serve as a preemption point in the case of back-pressure being applied. The consequences of making emit have such semantics, however, will probably run deep once we get into non-toy examples. (For example, will it end up with us declaring `emit` as being like `await` in 6.d where you may be on a different OS thread afterwards if you do it inside of the thread pool?)

A perhaps simpler solution space to explore is providing an API that separates the obtaining of a Tap from the starting of processing. That would allow us to run the setup logic to completion. But...then what? Again, it's easy to make this toy example work because there's only one whenever block.
But if there are more, then we're just moving the problem, and making it harder to diagnose, because instead of a "where are we deadlocked" backtrace showing the whenever line, it'd instead show...some other location in supply internals. So, a back-pressure model that allows us to round-robin is probably a bit better than this.

tl;dr use "start whenever $supply { }" when $supply is going to work synchronously.

We should also consider implementing a missing "tap-on" supply operator, so you can also write:

whenever $supply.tap-on(ThreadPoolScheduler) { }

Or do it at the source: supply {

A simple implementation would likely be: method tap-on(Scheduler:D $scheduler) {

Making the code as originally submitted work is an interesting problem to ponder, but raises a bunch of non-trivial questions, and should be considered together with various other challenges.

Hope this helps,

/jnthn

The RT System itself - Status changed from 'new' to 'open'
From @jnthn

On Sat, 04 Feb 2017 07:08:04 -0800, jnthn@jnthn.net wrote:

That would also be a wrong implementation, since cue doesn't bring dynamic context along for the ride, which is how a whenever locates its enclosing supply block. Would need to be: method tap-on(Scheduler:D $scheduler) {

That'll teach me to prematurely optimize. :-)

/jnthn
From @japhb

Responses inline ...

On Sat, Feb 4, 2017 at 7:08 AM, jnthn@jnthn.net via RT <
It took me a minute to realize this was true, because if you move the `my That said, this raises two questions: A. How did this work in the first place? Was the role's reference to $done B. Why isn't a role declaration a closure? I understand that the 2) In the react example, there is $s1.done, when I presume $s2.done was
Yup, didn't notice that pasto because the gist was essentially a merge of
OK, the above makes sense to me, but why does the .act version work
Well ... that kinda works. As I tried this (with a `sleep 2` added at With all the things I tried, at this point I'm not even sure which problems Longer term, a back-pressure model for supplies is something that wants
I can understand that problem -- though it does lead me to wonder what
I thought the initial point of Supply was to address a few fundamental We should also consider implementing a missing "tap-on" supply operator, so
From @jnthn

On Sun, 05 Feb 2017 16:14:15 -0800, gjb@google.com wrote:
Yes.
It's because classes and roles are constructed at compile time; by runtime we're just referencing the one thing that was created at compile time. So, the role meta-object points to the single static instance of the role body block, and runs that every time. (The role body does run at runtime since this is a runtime mixin. However, even those get interned. Even if they didn't, we'd still be in the same situation, however, since there's a single static instance of the role body block.)
The `act` version does have a problem too, in a sense. `act` returns a Tap object, and calling `.close` on that would be the correct way to close the supply being tapped. However, since that supply works synchronously, the call to `act` gets control and the `Tap` object doesn't become available.

Really, `.act` just means `.serialize.sanitize.tap`. However, a `supply` block cannot emit multiple concurrent messages, so is already serial. It's also sanitary (follows the supply protocol), so the behavior in this case really is just `.tap`. So, the "actor-like" behavior of `.act` just means that there will never be concurrent calls to any of the blocks passed to that particular `.act` call.

The `react` and `supply` constructs allow establishing of richer actors. The one-at-a-time applies to all of the whenever blocks. So: my $i = 0; Is a data race on $i, but: my $i = 0; Is not.

The problem you're running into is that we also promise that: my $i = 0; Will not be a data race - that is, the code inside of the main body of the react block holds the "lock" until it completes and all subscriptions and state are set up. But if $s1 or $s2 here do not give control back upon being tapped, then the react block's main body never completes and releases the lock either, and so it's impossible to process messages.
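The one-at-a-time guarantee across `whenever` blocks can be sketched roughly as below. This is an illustration only, not the code elided from the comment above: the two `Supply.interval` sources and the counts are assumptions chosen so that messages really do arrive from thread-pool threads.

```raku
# Illustrative sketch (not the ticket's code): two timer-driven supplies
# emit on thread-pool threads, so their messages may arrive concurrently.
my $i = 0;

react {
    # Only one whenever block of this react instance runs at a time,
    # so the unsynchronized increments below do not race with each other.
    whenever Supply.interval(0.001).head(100) { $i++ }
    whenever Supply.interval(0.001).head(100) { $i++ }
}

# The react block exits once both supplies are done.
say $i;
```

If the serialization holds as described, every increment is counted and the final value is the total number of messages (200 here). Tapping the same two supplies with separate `.tap` calls would give no such guarantee across the two callbacks.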
Turns out the support for `last` inside of whenever blocks didn't get merged yet (it's in a PR, which I've looked at today, but seems to have some issues that need looking over beforehand). So that's the issue with `last`. Even if it was merged, there'd still be trouble. The real difficulty here is down to react/supply so far making the assumption that they are dealing with supplies that will deliver data asynchronously, and that will not block upon subscription. When the tap handle from subscription is not handed back before messages are emitted, there's no way for it to close the Supply. I'm still considering various ways we might be able to address this limitation, but it'll need some thinking time.
I'd suggest something like this: sub make-supply() { say "\nUSING react";
It's a bit deeper than that. Supply and Channel are for different processing models.

Channels are for when you want a queue that a producer can place things into quickly, without blocking on whatever will process them. Something else receives and processes the messages. A channel is typically used to *introduce* parallel processing, and has the concurrency control in place to cope with that being 1:N, M:1, or N:M.

Supplies were introduced to provide for the reactive paradigm, where values are being produced asynchronously and we wish to react to them in various ways and compose those various reactors. These values may come from a range of sources and arrive concurrently. Supplies are thus about taming/controlling concurrency. So, supplies don't replace channels; they solve a different set of problems that channels would not be suited to.

It is true that supplies will process messages on the producing thread unless you explicitly say otherwise. Note that in the solution above, while we introduce a worker with the `start` block, all the `whenever` blocks inside of the `react` will be run on the thread of that `start` block. The thread doing the react will just wait for the react to be done (or, in 6.d.PREVIEW, if the react is in the thread pool, it will return the thread to the pool to get on with other work).
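To make the channel half of that contrast concrete, here is a minimal sketch, assuming a single producer and a single worker; the variable names and the summing workload are made up for illustration.

```raku
# Illustrative sketch: a Channel as a queue between a producer and a worker.
my $channel = Channel.new;

# The worker runs on a thread-pool thread and drains the queue.
my $worker = start {
    my $total = 0;
    $total += $_ for $channel.list;   # iteration ends once the channel is closed
    $total
}

# The producer enqueues values without waiting for them to be processed.
$channel.send($_) for 1..10;
$channel.close;

my $sum = await $worker;
say $sum;   # 55
```

Here the producer never runs the consumer's code; with a supply, by default, the emitting thread would run the reacting code itself.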
It essentially does, but in the reactive space the yield just becomes a call (to process the reaction), and the resume is the return from that call. In common they have that the call is "abstract" (that is, with `take` the code is abstracted from a particular consumer, and with `emit` from a particular reactor).
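The `take`/`emit` parallel can be sketched side by side. This is a small illustration with assumed names, not code from the ticket: in the pull case the consumer asks for values, while in the push case each `emit` is a call into the tapped block and the producer continues when that call returns.

```raku
# Pull: `take` hands each value to whichever consumer iterates the sequence.
my @pulled = gather { take $_ * 2 for 1..3 };

# Push: `emit` calls whichever block tapped the supply; the producer
# resumes when that call returns.
my @pushed;
my $doubles = supply { emit $_ * 2 for 1..3 };
$doubles.tap: -> $value { @pushed.push($value) };

say @pulled;   # [2 4 6]
say @pushed;   # [2 4 6]
```

Because this supply emits synchronously and nothing asks for concurrency, the `tap` call returns only after all three values have been pushed.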
Because, as hopefully clarified above, react is trying to perform more concurrency control than act. Hope this helps, and I'll keep the issue under consideration.

From @jnthn

On Wed, 15 Feb 2017 08:35:09 -0800, jnthn@jnthn.net wrote:
So the time came to tackle getting supply/react/whenever syntax capable of playing nice with non-blocking await, and I decided as part of those changes to look at both this problem and also the more general problem of lack of good back-pressure and, related to that, lack of fairness when using the supply/react syntax.

To recap, until very recently, a `supply` or `react` block instance had its own processing queue. If it was empty, the emitting thread would enter and run the required code. If any messages were emitted to it in the meantime, they would be queued asynchronously. When the sender of the currently-being-processed message was done, it would check if there was anything added to the queue in the meantime, and if so it would process those too. This mechanism also handled recursive messages by queuing them up (this occurs when some code running in a `supply` block instance results in another emit being sent to the same `supply` block instance). The asynchronous queuing, however, meant that the cost of processing a message didn't push back on senders as it should have.

I've just finished (I hope :-)) re-working the Supply internals to instead use asynchronous locking throughout. An asynchronous lock returns a Promise that will already be Kept if the lock is already available, or will be Kept once it is available. Multiple contenders queue for the lock. This, in combination with non-blocking `await` of the Promise, forms the new supply concurrency control model, used consistently in `supply` blocks and elsewhere in the supplies implementation (previously, the code elsewhere used a real mutex, which gave its own set of issues).

On its own this cannot replace the previous mechanism, however, because the queuing was used in preventing various forms of deadlock, especially recursion. It also would cause problems for any `whenever` tapping a `supply` that emitted values synchronously after being tapped (as in your case).
The former is resolved by detecting lock recursion and, in that case, falling back to queuing the work to run later, using the thread pool. The latter is resolved with a custom Awaiter implementation: if anything during the processing of setting up a `whenever` block does an `await`, a continuation is taken, and then - after the setup work of the `supply` block is completed - the continuations are invoked.

This latter case is relevant to the original subject of this ticket, because with the supply concurrency control mechanism now being asynchronous locking, the outcome of the `emit` that previously queued endlessly is now an `await` instead. Thus the setup of the consumer is allowed to complete, before the producer is resumed. Any further awaits are also collected and handled in the same way, until we run out of them.

The effect is that if we rewrite the original code (to use the CLOSE phaser, not a hack with a role): sub make-supply() { my $s2 = make-supply; Then it will produce: Emitting ...

Furthermore, if it is written as just: sub make-supply() { my $s2 = make-supply; The output is similar: Emitting ... Note the one extra "Emitting ...". The `emit` operation will now check if the supply block is still active; in this case, it was closed by its consumer, so it won't bother emitting and won't bother resuming either (emit is a control exception, which is why we can unwind the stack and thus exit the loop).

Finally, this model means that: sub make-supply() { my $s2 = make-supply; Works - as in, the second `whenever` block gets its fair chance to have a message processed too, so the output is something like: Emitting 2000 messages

I've added some tests to S17-supply/syntax.t. Thanks for the original ticket; hopefully this solution will make things better for those writing "source" supplies.

@jnthn - Status changed from 'open' to 'resolved'
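For readers who want to try the resolved behavior, the following is a minimal sketch of the shape of the problem, not the ticket's original code (which is not reproduced here). The names `$producer` and `$seen` and the limit of five messages are assumptions for illustration, and it presumes a Rakudo that includes the reworked supply internals described above.

```raku
# Illustrative sketch: an unbounded supply that emits synchronously
# as soon as it is tapped, consumed by a react block.
my $producer = supply {
    loop { emit 1 }
};

my $seen = 0;
react {
    whenever $producer {
        # `done` ends the react block and closes the tap; the next `emit`
        # then finds the supply block inactive and the loop is unwound.
        done if ++$seen >= 5;
    }
}

say $seen;
```

Before the change, code of this shape never let the `react` block finish its setup, which is the pseudo-hang named in the ticket title.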
Migrated from rt.perl.org#130716 (status was 'resolved')
Searchable as RT130716$