# Race-condition with fan-in pattern #644
### Shared Queue As a Serialization Mechanism

Basically, every solution we can come up with boils down to serialization for connections with shared receivers. To do so, we need to get rid of concurrency for them. The simplest way to do this is:

```go
connect() {
    for msg := range q {
        broadcast(msg)
    }
}
```

However, there's a couple of problems with this solution...

### Problems
#### Performance Problem (Making the Bottleneck Wider)

There are basically 3 ways to improve the performance of this solution:
It's easy to do the first two, but the last one is a bit tricky. The idea is to have a queue for each fan-in pattern. I.e. each time we have N senders sharing the same receiver, they need a shared queue: their messages need to be serialized.

#### Intermediate Connections Handling (Fixing Correctness)

As soon as we got rid of "goroutine per connection" we created a problem. Let's look at:

```go
broadcast(msg) {
    for r := range receivers {
        r <- msg
    }
}
```

The problem is with the `r <- msg` line. It's not a problem if we have some runtime function that reads from this port, but what if it's an _intermediate connection_? By "intermediate" connection I mean a connection that exists because there was a user-defined component. Let's take a look at these 2 versions:
They are equal in functionality, but the first one is going to have more ports and connections (channels and goroutines), thus more message passing (more synchronization), which is bad BTW. This could be solved by an optimization step (whether that's a source-code-level optimizer or an IR optimizer doesn't matter). However, we don't have an optimizer now, and it's not clear whether every program can be optimized up to the point where it has no "intermediate connections". So the thing is: after the rewrite, the runtime will handle the latter case but not the former, because we don't have a goroutine that could transfer our message from the intermediate port.

### Solution

There are 2 ways to solve it.
The answer is... we need both! And we need to start from the second one. The reason is that optimisation should be _optimisation_. We should not (probably) think of slow programs as incorrect ones (even though I personally would love to). Second, as has been said, it's not clear whether we can optimize any program this way. Can any program be 100% inlined into one giant component? So we need to support this. How?

```go
connect(prog) {
    for item := range q {
        broadcast(item, prog)
    }
}

broadcast(item, prog) {
    finalReceivers := getFinalReceivers(item, prog)
    for finalReceiver := range finalReceivers {
        finalReceiver <- item.Msg
    }
}
```

I didn't describe `getFinalReceivers` here. As an optimisation, instead of resolving receivers on every message, we could make the runtime remember the addr of the leaf once it's found, so we don't have to do it every time for each msg. That... feels like that optimization we talked about before, right? I know it's weird. But I feel like the runtime should be able to work with this. We should not just push it fully onto the compiler's shoulders. Why? Debugging.
### Graph Reduction Algorithm In IR

This is the actual solution for #644 (comment). The idea is to add an extra step to IR generation. It's not the most efficient (it's probably possible to do in one go), but good enough.

This way we'll have a graph where basically a bunch of runtime functions do message passing via a queue (serialized pub-sub, basically), and we will have no race conditions. You can see the highlighted ports on this picture; they will be removed. And this is how this

Becomes this

We don't need a goroutine to do the intermediate step anymore. By not having several goroutines sending to the same receiver, we avoid the race condition.
Related to #677

This specific problem is solved in #670 (doesn't mean we don't have races/out-of-order completely, though)
Every time you have `N>1` senders for `1` receiver, you can have race conditions: `s1` sent a message `m1` at `0:00:00`, `s2` sent a message `m2` at `0:00:01`, yet `m2` is received by receiver `r1` before `m1`. That's a data race. The reason for that is the way the runtime interprets the program.
For each connection there is a separate goroutine. Goroutines are concurrent, which means that if we have 2 goroutines, they compete for resources.
So, 2 goroutines await a signal from their senders (those are different) to deliver a message to the receiver (which is the same). Even though the first sender sent first, there's no guarantee the Go scheduler will activate the corresponding connection goroutine first. Sometimes it does, sometimes it doesn't.
Here's the example.
It's clear that `true` must be printed first, yet that's not always the case. This behaviour is considered a bug caused by an incorrect implementation. When there is a language spec, we will describe the correct behaviour and each implementor will follow it.
We need to serialize message passing for connections that follow fan-in pattern.
This is the root cause of #575 and #635