This repository has been archived by the owner. It is now read-only.

Lessons to be taken from channels in Go? #39

Closed
stjepang opened this issue Mar 4, 2018 · 11 comments

6 participants
@stjepang
Member

commented Mar 4, 2018

I'd like to take a step back and challenge some of the core design decisions in crossbeam-channel, std::sync::mpsc and futures::sync::mpsc.

Motivation: select! for crossbeam-channel is hard to get right

The current select_loop! macro is kinda silly (it's a loop, you can't use break/continue inside it, it causes potentially subtle side effects on every iteration). I'm trying to come up with a new, nicer macro, with fewer surprises, and without the implicit loop.

This is what seems like the best solution so far:

select! {
    recv(receiver, msg)     => {} // `msg` is of type `T`
    send(sender, foo + bar) => {} // sends message `foo + bar`
    closed(receiver)        => {} // fires when `receiver` is closed
    closed(sender)          => {} // fires when `sender` is closed
    default(timeout)        => {} // fires when `timeout` expires
}

And this is how it works:

  1. If any of the recv or send cases can complete without blocking, a random one is chosen. The chosen operation is executed and its block (in this example {} for simplicity) is evaluated.
  2. If any of the closed cases are ready (because a channel is closed), it is evaluated.
  3. If the timeout has expired, the default case is evaluated.
  4. Otherwise, we wait until something changes.

Pros:

  • No surprises (apart from the complexity) - it behaves just as one would expect.
  • Very flexible - allows a lot of freedom in choosing how to react to channel events.

Cons:

  • Using a recv or a send case doesn't nudge you into handling the closed case. For example, this is in contrast to a bare Receiver::recv operation, which returns a Result so the compiler advises you to do something with it (like call .unwrap()).
  • The macro looks kind of complicated.
  • Internal implementation is a little challenging (although not too bad).

There were some other different ideas for the macro but I won't go there.

Folks from Servo (more concretely, @SimonSapin and @nox) were cool with this macro idea, although it wasn't the absolute best option for their use case.

@matthieu-m had a good point that the select! macro should ideally be exhaustive in the sense that it makes sure you don't forget edge cases like a channel being unexpectedly closed.

Now let's see if we can somehow force the user to handle the closed case for every recv and send. One idea is to change recv so that it returns a Result<T, RecvError>:

select! {
    recv(receiver, msg) => {
        match msg {
            Ok(m) => println!("got {:?}", m),
            Err(RecvError) => println!("the channel is closed"),
        }
    }
}

But what about the send case? Here's a try:

let mut greeting = String::from("Howdy!");
select! {
    send(sender, greeting, result) => {
        match result {
            Ok(()) => println!("successfully sent"),
            Err(SendError(m)) => greeting = m,
        }
    }
}

That'd work and eliminate the need for the closed case, but it doesn't look very nice. To be fair, users of this select! syntax would typically just panic on 'closed' events:

let greeting = String::from("Howdy!");
select! {
    recv(receiver, msg) => {
        let m = msg.expect("the channel is closed");
        println!("got {:?}", m);
    }
    send(sender, greeting, result) => {
        result.expect("the channel is closed");
    }
}

Looks a bit better, but still kind of clumsy.

When does a channel become closed/disconnected?

Depends on the implementation.

  • In std::sync::mpsc: when either the Receiver or all Senders get dropped.
  • In crossbeam-channel: when either all Receivers or all Senders get dropped.
  • In futures::sync::mpsc: when either the Receiver or all Senders get dropped, or when you call Receiver::close.
  • In chan: when all Senders get dropped.
  • In Go: when you call close(sender). It is not possible to close from the receiver side.
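The std::sync::mpsc entry in this list can be checked directly with the standard library. The following is a minimal sketch; the function names are made up for illustration, and only std::sync::mpsc calls are used.

```rust
use std::sync::mpsc::{self, TryRecvError};

/// Sending fails as soon as the single Receiver has been dropped.
fn send_fails_after_receiver_drop() -> bool {
    let (tx, rx) = mpsc::channel::<i32>();
    drop(rx);
    tx.send(1).is_err()
}

/// Receiving reports disconnection only after the last Sender is dropped.
/// Returns (still open with one sender left, closed with none left).
fn recv_fails_only_after_last_sender_drop() -> (bool, bool) {
    let (tx, rx) = mpsc::channel::<i32>();
    let tx2 = tx.clone();
    drop(tx);
    // One sender is still alive: the channel is empty but not disconnected.
    let still_open = rx.try_recv() == Err(TryRecvError::Empty);
    drop(tx2);
    // The last sender is gone: the channel is now disconnected.
    let closed = rx.try_recv() == Err(TryRecvError::Disconnected);
    (still_open, closed)
}

fn main() {
    println!("send fails: {}", send_fails_after_receiver_drop());
    println!("open/closed: {:?}", recv_fails_only_after_last_sender_drop());
}
```

In the Go and chan models, the first function would have no equivalent, because dropping the receiving side does not close anything.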

These differences are important. The chan crate follows Go's model very closely, so the two behave very similarly. Go allows closing only from the sender side because it follows the principle of unidirectionality.

Unidirectionality means that information flows in one direction only. Closing a channel signals to the receivers that no more messages will arrive, ever. Note that even if all receivers get dropped, sending into the channel works as usual. The whole idea is that receiver side cannot signal anything to the sender side!

Why do we want unidirectionality? See this and this comment by Russ Cox for an explanation.

Channels in Go

Channels in Go are quite peculiar. They come with a bunch of rules that seem arbitrary at first sight, but are actually well-thought-out. See Channel Axioms and Curious Channels by Dave Cheney. The controversial blog post titled Go channels are bad and you should feel bad is worth a read, too, despite the title.

Here are a few interesting rules:

  1. Sending into a closed channel panics. The idea is that senders have to coordinate among themselves so that the last sender closes the channel.

  2. Receiving from a closed channel returns a 'zero value'. This is like returning None, but Go doesn't have sum types.

  3. Sending into a nil channel blocks forever. This is handy when you want to disable a send operation inside a select - just set the sender to nil!

  4. Receiving from a nil channel blocks forever. This is useful for the same reason the previous rule is. See this StackOverflow answer for an example.

How would we port Go's channels to Rust?

Let's solve some of the quirks in Go's channels by using two useful language features of Rust: destructors and sum types.

Keeping the idea of unidirectionality, we change the disconnection behavior: the channel gets closed only when all Senders get dropped. That's it. This means sending into a channel cannot panic because having a sender implies the channel is not closed. Next, receiving from a closed channel returns an Option<T> rather than a 'zero value'. The chan crate follows the same design - see here and here.

I'm feeling very tempted to redesign crossbeam-channel around the philosophy of Go channels:

impl<T> Sender<T> {
    fn send(&self, msg: T);
}

impl<T> Receiver<T> {
    fn recv(&self) -> Option<T>; // `None` if closed.
}

let greeting = String::from("Howdy!");
select! {
    recv(receiver, msg) => {
        // `msg` is of type `Option<T>`
        let m = msg.expect("the channel is closed");
        println!("got {}", m);
    }
    send(sender, greeting) => {
        println!("successfully sent");
    }
    default(timeout) => println!("timed out!"),
}

This is beautifully simple:

  • Easy to understand, especially for programmers coming from the world of Go.
  • No need for those annoying unwraps in sender.send(foo).unwrap(). @BurntSushi is going to like this. :)
  • select! macro doesn't need the closed case.
  • You're advised to handle the possibility of the channel being closed in the recv case. Perhaps we might want to change the Option type to Result.

Some drawbacks:

  • Dropping all receivers doesn't prevent senders from sending. This might or might not be a drawback - we could argue both ways.
  • select! is not as powerful as before. But it probably doesn't matter since all real-world cases should be covered by this simpler version.

Note that we could also accept Option<Sender<T>> and Option<Receiver<T>> in the recv and send cases. That would be equivalent to using nil channels in Go.

We can get all the benefits of Go channels without their weak spots like accidental panics (e.g. sending into a closed channel), accidental deadlocks (e.g. receiving from a nil channel), and incorrect closing (we close automatically when the last Sender gets dropped).
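To make the proposed semantics concrete, here is a minimal sketch of an unbounded Go-like channel built from a Mutex and a Condvar. It is not how crossbeam-channel or chan are implemented; the names Inner, Shared and unbounded are chosen for illustration. It only shows the two rules: send returns nothing, and recv returns None once every Sender has been dropped and the queue is empty.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Condvar, Mutex};

// Shared state: the queue plus the number of live senders.
struct Inner<T> {
    queue: VecDeque<T>,
    senders: usize,
}

struct Shared<T> {
    inner: Mutex<Inner<T>>,
    ready: Condvar,
}

struct Sender<T>(Arc<Shared<T>>);
struct Receiver<T>(Arc<Shared<T>>);

fn unbounded<T>() -> (Sender<T>, Receiver<T>) {
    let shared = Arc::new(Shared {
        inner: Mutex::new(Inner { queue: VecDeque::new(), senders: 1 }),
        ready: Condvar::new(),
    });
    (Sender(Arc::clone(&shared)), Receiver(shared))
}

impl<T> Sender<T> {
    // No Result: holding a Sender implies the channel is still open.
    fn send(&self, msg: T) {
        self.0.inner.lock().unwrap().queue.push_back(msg);
        self.0.ready.notify_one();
    }
}

impl<T> Clone for Sender<T> {
    fn clone(&self) -> Self {
        self.0.inner.lock().unwrap().senders += 1;
        Sender(Arc::clone(&self.0))
    }
}

impl<T> Drop for Sender<T> {
    fn drop(&mut self) {
        let mut inner = self.0.inner.lock().unwrap();
        inner.senders -= 1;
        if inner.senders == 0 {
            // Last sender gone: wake blocked receivers so they observe the close.
            drop(inner);
            self.0.ready.notify_all();
        }
    }
}

impl<T> Receiver<T> {
    // Blocks until a message arrives; `None` means closed and fully drained.
    fn recv(&self) -> Option<T> {
        let mut inner = self.0.inner.lock().unwrap();
        loop {
            if let Some(msg) = inner.queue.pop_front() {
                return Some(msg);
            }
            if inner.senders == 0 {
                return None;
            }
            inner = self.0.ready.wait(inner).unwrap();
        }
    }
}

fn main() {
    let (s, r) = unbounded();
    let s2 = s.clone();
    let t = std::thread::spawn(move || s2.send(2));
    s.send(1);
    drop(s);
    t.join().unwrap();
    while let Some(m) = r.recv() {
        println!("got {}", m);
    }
    println!("the channel is closed");
}
```

Note that dropping the Receiver has no effect on senders in this sketch, which is exactly the drawback listed above.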

Final words

The simple Go-like channel interface currently seems to me to be sitting in some kind of sweet spot of the design space. I've been thinking about this problem for way too long now, constantly switching from one idea to another. In the end, I'm not sure whether this one is the way to go and need your opinion.

Any thoughts?

cc @arcnmx - you might be interested in this comment, too.

@stjepang

Member Author

commented Mar 4, 2018

To answer this comment from @danburkert:

I'm fine with removing Receiver::close, although I find the 'unidirectional' argument pretty flimsy. I read the linked discussions and as far as I can tell they explained what unidirectional means in this context, but not why it's useful. In other words, I've never personally needed Receiver::close, but I also don't see a reason to discourage it either.

Honestly, this seems to be a matter of opinion on which different people strongly disagree. I'm feeling pretty ambivalent here, however.

Making Sender::send() not return an error when the Receiver has been dropped would be a huge mistake. This is just asking for task leaks where producer tasks never realize that there's no one listening on the other side, and they continue on their merry way indefinitely.

Go would answer "it's your own fault". I think in Go this is not as much of a problem because all channels are bounded, and most channels are even of zero-capacity. But in Rust we do have unbounded channels, where this is a much more important issue to consider.

This is in my opinion one of the strongest reasons against the proposed Go-like channels in the original comment.
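The difference between bounded and unbounded channels when nobody is reading can be illustrated with std::sync::mpsc. This is only a sketch: std disconnects the channel when the Receiver is dropped, so the receiver is kept alive but idle here to imitate a Go-like channel with no active consumer. The function names are hypothetical.

```rust
use std::sync::mpsc::{self, TrySendError};

/// Counts how many messages a producer can queue in a bounded channel
/// before it is pushed back, while the receiver is alive but never reads.
fn queued_before_pushback(capacity: usize, attempts: usize) -> usize {
    // `_rx` keeps the receiver alive until the end of the function.
    let (tx, _rx) = mpsc::sync_channel::<usize>(capacity);
    let mut queued = 0;
    for i in 0..attempts {
        match tx.try_send(i) {
            Ok(()) => queued += 1,
            // The buffer is full: a blocking `send` would wait here.
            Err(TrySendError::Full(_)) => break,
            Err(TrySendError::Disconnected(_)) => unreachable!("receiver is still alive"),
        }
    }
    queued
}

/// Under the same conditions an unbounded channel accepts every message.
fn queued_unbounded(attempts: usize) -> usize {
    let (tx, _rx) = mpsc::channel::<usize>();
    (0..attempts).filter(|&i| tx.send(i).is_ok()).count()
}

fn main() {
    println!("bounded(2): {}", queued_before_pushback(2, 1000));
    println!("unbounded:  {}", queued_unbounded(1000));
}
```

With a bounded channel the producer stalls after a fixed number of messages, whereas the unbounded producer keeps allocating, which is the task-leak scenario described above.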

Sender::close() is extremely useful for implementing more complex shutdown sequences. If all Sender instances have to be dropped before the channel is closed, that means all producer contexts need a separate way to be notified of shutdown. If Sender::send() can already fail due to the Receiver being dropped (as I argue it should in the previous bullet), then there's no reason not to support Sender::close().

That makes sense. Adding Sender::close() wouldn't be difficult, although then it would make Sender::send() panic if the channel is closed. Alternatively, we could make Sender::send() return a Result so that the user bears the burden of handling the possibility of the channel being closed, but that's somewhat unergonomic. A few days ago, @BurntSushi said he thinks crossbeam-channel should have a method for sending that doesn't require .unwrap()s, probably for ergonomic reasons.

Currently, the main goal of crossbeam-channel is to be a sort of a 'mega-channel' that can do pretty much anything and doesn't impose any artificial restrictions on the user. It's just that I've come to question this goal now that certain complexities inside select! have shown up.

@glaebhoerl

commented Mar 4, 2018

I think I'd like to hear more argumentation about why unidirectionality is a desirable property rather than merely a sensible one.

From the link:

There almost always need to be two steps in a cancellation: a request for the cancellation and an acknowledgement that work has in fact stopped. Close can serve as the latter; it cannot serve as both, and we make it as hard as possible for people to do that accidentally.

Maybe somebody can elaborate on what this means and why? It seems to me that the receiver doing a close() followed by the sender detecting the failure of its send and reacting appropriately could serve as those two steps, at least under one way of interpreting them. (Is the "cannot serve as both" claim only intended to be true within the context of Go, e.g. because it happens to have no way to detect the failure of a send, or is it something more fundamental?)

And as a potential counterpoint to unidirectionality: I'm sure many of us have noticed the symmetry between:

impl<T> Sender<T> {
    fn send(&self, value: T) -> Result<(), T>;
}
impl<T> Receiver<T> {
    fn recv(&self) -> Result<T, ()>; // could be thought of as having `value: ()`!
}

It's almost as if these can be generalized to:

struct Endpoint<A, B> { ... }
impl<A, B> Endpoint<A, B> {
    fn new() -> (Endpoint<A, B>, Endpoint<B, A>);
    fn rendezvous(&self, value: A) -> Result<B, A>;
}
type Sender<T> = Endpoint<T, ()>;
type Receiver<T> = Endpoint<(), T>;

Which is what OCaml is thinking about doing. Of course, as I meant to imply with the name there, this only makes sense for 0-buffered channels. But, at least, it suggests to me that bidirectionality can also be a sensible property.

@stjepang

Member Author

commented Mar 4, 2018

@glaebhoerl

I think I'd like to hear more argumentation about why unidirectionality is a desirable property rather than merely a sensible one.

Unidirectionality is strictly less powerful than bidirectionality, so one can only argue for it from the standpoint of simplicity and ergonomics.

Unidirectionality makes selection straightforward. Any operation declared inside a select will behave exactly as if it was a standalone operation (outside select). No surprises there.

With bidirectional closing, however, we have to introduce a special closed case in select, which will not be automatically enforced, unfortunately. Alternatively, we must reach for more complicated variants of recv and send cases, which adds a considerable penalty on ergonomics.

In one sentence, unidirectionality makes the interface smaller, simpler, and more consistent (in terms of select vs standalone operations, at least).

Maybe somebody can elaborate on what this means and why? It seems to me that the receiver doing a close() followed by the sender detecting the failure of its send and reacting appropriately could serve as those two steps, at least under one way of interpreting them.

My understanding is that in Go this is simply considered an antipattern.

If any send operation can fail, that means most users will just ignore the error and prefer to panic instead. Basically every use of channels in Rust is littered with .unwrap()s, which sometimes looks silly. This is reminiscent of the tedious .unwrap()s one has to write after locking a Mutex, which prompted some people to create Result-free Mutexes as in antidote. parking_lot decided to do away with Results, too. The chan crate sends into a channel without a Result, too.

Also, signalling to the sender that the receiver is shutting down is not hard to do using a 0-capacity channel anyway, and may lead to cleaner design. At least the Go team believes so.

Finally, it's interesting to ponder how one would actually implement channel closing. In Go, each channel is wrapped into a single mutex so there's a closed: bool flag in there, but what if we had a lock-free queue and no mutexes behind the scenes? While it's tempting to use a closed: AtomicBool flag, that would be wrong and cause linearizability issues. Instead, a close operation in crossbeam-channel marks a special bit in the sender index.
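The idea of marking a bit in the sender index can be sketched as follows. This is not the actual crossbeam-channel code; the Tail type, the bit layout and the method names are assumptions made for illustration. The point is that reserving a slot and observing the closed state happen in one atomic operation on one word. With a separate AtomicBool, a sender could read "open", then the channel could be closed and its final length observed, and only afterwards would the sender reserve a slot.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// The lowest bit of the tail word marks "closed"; the remaining bits hold
// the index of the next free slot.
const CLOSED: usize = 1;
const ONE_SLOT: usize = 2;

struct Tail(AtomicUsize);

impl Tail {
    fn new() -> Self {
        Tail(AtomicUsize::new(0))
    }

    /// Reserves the next slot index, or returns `None` if the channel is closed.
    fn reserve(&self) -> Option<usize> {
        let mut cur = self.0.load(Ordering::Acquire);
        loop {
            if cur & CLOSED != 0 {
                return None;
            }
            // The CAS fails if another sender advanced the index *or* if the
            // closed bit was set in the meantime, so no slot can be reserved
            // after a close has taken effect.
            match self.0.compare_exchange_weak(
                cur,
                cur + ONE_SLOT,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => return Some(cur / ONE_SLOT),
                Err(actual) => cur = actual,
            }
        }
    }

    /// Sets the closed bit and returns how many slots were reserved before
    /// the close, i.e. the final number of messages. Idempotent.
    fn close(&self) -> usize {
        let prev = self.0.fetch_or(CLOSED, Ordering::AcqRel);
        prev / ONE_SLOT
    }
}

fn main() {
    let tail = Tail::new();
    println!("{:?}", tail.reserve());
    println!("{:?}", tail.reserve());
    println!("closed with {} messages", tail.close());
    println!("{:?}", tail.reserve());
}
```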

Note that "close the channel" is literally just a synonym for "freeze the sending side". Since channel closing is only concerned with the sending side, perhaps only the sending side should be allowed to close? The answer to this question probably depends on whether your mental model of a channel is "a simple concurrent queue" or "like a unix pipe where both sides can signal closing".

(Is the "cannot serve as both" claim only intended to be true within the context of Go, e.g. because it happens to have no way to detect the failure of a send, or is it something more fundamental?)

While Go doesn't have a way of detecting failure of a send operation, they could've easily implemented this feature (e.g. ok := channel <- msg). To omit the feature was a deliberate decision.

Which is what OCaml is thinking about doing. Of course, as I meant to imply with the name there, this only makes sense for 0-buffered channels. But, at least, it suggests to me that bidirectionality can also be a sensible property.

This is an interesting example. Indeed, the symmetry is beautiful. However... :)

I've done a lot of research on queues and channels in other languages and libraries. They will often boast about performance, features, and so on. The rosy story always falls apart when it comes to select. Go is the only language where channel selection is simple, easy, and just works. Many channel implementations don't have any kind of selection at all! Let's take a look at the mpsc_select feature in Rust as an example. The macro has fragile syntax (poor compilation errors), only accepts receive operations (no sends!), and has seen no progress in years. The macro is basically deprecated (see here) and at this point it is only kept alive because Servo still depends on it.

The chan crate is an alternative that sacrifices performance, but provides a nice Go-like interface with a proper chan_select! macro. Apart from a few quirks and bad performance, it works very well. In 2015, a request for adding a send operation returning a Result was submitted. @BurntSushi was initially hesitant:

The second design decision I made was, "sending a value that can never be received is either a bug or an intentional leak." This may be a bad decision, but it is one that has a large body of evidence that suggests it may not be a horrible idea. (e.g., Go.) The idea here is to encourage users of chan to write code that doesn't enter that state. I was motivated to do things this way because so many uses of the std::sync::mpsc channel are tx.send(...).unwrap(). In other words, it makes the common case verbose.

In the end, he was sold on the feature request and decided to implement the non-panicking send method. However, it turned out that fitting the concept of non-panicking send into chan_select! was a very difficult design problem, and that's where the story ends.

crossbeam-channel is a successor to chan and std::sync::mpsc. I've put a lot more time and effort into its development and managed to overcome some key difficulties other channels have run into. However, when it comes to selection, I too am now facing the same old infamous problem. There seem to be only two ways forward:

  1. Keep the std::sync::mpsc-like design and implement a slightly complicated select! with explicit closed case.

  2. Take a more opinionated stance and implement chan/Go-like channels with simple select!.

@BurntSushi

commented Mar 5, 2018

@stjepang It was a joy reading your comments in this thread, and I can definitely identify with your toil. However, when I wrote chan, I made a beeline straight to what I knew worked: Go's channels. I didn't even come close to exploring the design space as thoroughly as you have, so I don't think I can quite identify with all of your toil! :-) Speaking as someone who has written Go since before it was 1.0, and worked with others that have written Go, its channel implementation has been a raging success from my perspective. There are lots of downsides (all of which you are clearly aware of), but it works well in practice. So with that said, I think my vote would be towards being opinionated. The reasoning is simple: select is the single most important aspect of an ergonomic channel implementation, and if people have trouble using it, it's always going to compare unfavorably to other choices (like Go's).

One thing that came up towards the end of BurntSushi/chan#2 was what to do when a channel send occurs on a non-closed channel that is guaranteed to never be received. chan today will lock forever, which is what Go will do. However, it seems possible that you could panic in that case, and indeed, if Go's runtime detects that no goroutine will make progress, it will actually panic because it effectively detected a deadlock. To keep select simple, you could simply not expose this panic at all and keep the send() -> () type signature.

@matthieu-m

commented Mar 5, 2018

After thinking for a longer moment about send() failure, I think bi-directionality is weirder than I initially thought.

I have used (distributed) queuing systems quite extensively for nigh on 5 years at my previous job: I was actually developing the middleware applications on top of them, which were used among other things to... make sure that the luggage you drop at the departure airport arrives at destination. One of the most important properties of our queuing systems was to never lose an item. A queue which acknowledges an item but fails to deliver it was buggy, plain and simple.

Which is why closing from the receiving side is slightly weird. Even if the senders immediately stop sending, what of all the items in flight?

To design a queue which does not lose items, the receiving side must "request" a close, and then continue processing until it is guaranteed that no other item will be enqueued. This is easily implemented if the queue is locked for each send/receive, but faster implementations are preferred, and then things become racy.

And suddenly, we seem very close to the unidirectionality advocated by Go: the receiver may only signal an intent, the senders decide when to stop.

Also, note that the absence of a receiver should not necessarily be fatal. That is, even if the receiver dies (unexpected data in the queue causes a panic which unwinds the thread), it may be desirable to have the ability to spawn a new receiver to continue processing. Being able to use the current queue (already wired in on the sender side, already containing some non-processed items) is quite desirable then.

@danburkert

Contributor

commented Mar 5, 2018

Which is why closing from the receiving side is slightly weird. Even if the senders immediately stop sending, what of all the items in flight?

The receiver keeps receiving buffered items until there are no more, at which point further calls to recv indicate that the channel has been closed.
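This draining behavior can be observed with std::sync::mpsc. As a caveat, std has no Receiver::close, so the sketch below closes the channel by dropping the only Sender instead; the point being shown is just that buffered items are still delivered before recv starts reporting the closed state. The function name is made up for illustration.

```rust
use std::sync::mpsc;

/// Buffers three messages, closes the channel, then drains it.
/// Returns the drained messages and whether a later `recv` reports closure.
fn drain_after_close() -> (Vec<u32>, bool) {
    let (tx, rx) = mpsc::channel();
    for i in 1..=3 {
        tx.send(i).unwrap();
    }
    // Close the channel from the sending side.
    drop(tx);
    // `iter` yields every buffered message and stops at disconnection.
    let drained: Vec<u32> = rx.iter().collect();
    // Only now does `recv` indicate that the channel is closed.
    let closed = rx.recv().is_err();
    (drained, closed)
}

fn main() {
    let (drained, closed) = drain_after_close();
    println!("drained {:?}, closed afterwards: {}", drained, closed);
}
```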

@stjepang

Member Author

commented Mar 6, 2018

@BurntSushi

One thing that came up towards the end of BurntSushi/chan#2 was what to do when a channel send occurs on a non-closed channel that is guaranteed to never be received. chan today will lock forever, which is what Go will do. However, it seems possible that you could panic in that case, and indeed, if Go's runtime detects that no goroutine will make progress, it will actually panic because it effectively detected a deadlock. To keep select simple, you could simply not expose this panic at all and keep the send() -> () type signature.

I like this idea. So basically when sending into a bounded channel that is full and has no receivers left, rather than waiting forever, we panic with an appropriate message. Sounds good to me! We could do something similar in select!, too.

By the way, I'm sure you know the pains of dealing with poor error messages due to trivial mistakes in macros. Well, you're going to like this. :)

An example:

select! {
    recv(r, msg) => msg,
    send(s, 1 + 2) => println!("sent")
    default(timeout) => ()
}

Error message:

error: did you mean to put a comma after `println!("sent")`?

The new macro seems to give a sensible message no matter what kind of mistake I try planting in there, and isn't nitpicky about things like trailing commas. I'm pleasantly surprised to discover that macros can be so friendly at all!

@matthieu-m

Thank you for the comment! This is a very compelling argument in favor of Go-like channels, in my opinion. I find the parts about draining the remaining messages and spawning a new receiver pretty convincing.

In the end, reacting to the automatic close signal when all receivers go away is probably not the most robust way of shutting down anyway. I think this is the key point Russ Cox was trying to get across in his comments.

I'd just like to add a demonstration of how it's still easy to implement a signal in the backwards direction if one needs it:

let (sender, receiver) = channel::unbounded();
let (signal, done) = channel::bounded(0);

// The producer thread.
thread::spawn(move || {
    while !done.is_closed() {
        select! {
            recv(done) => break,
            send(sender, generate_message()) => (),
        }
    }
    // Do something with the remaining messages.
    // Maybe spawn a new worker thread?
});

// The worker thread.
thread::spawn(move || {
    let signal = signal; // Will get dropped when this thread exits.

    for msg in receiver {
        if process_message(msg).is_err() {
            // Something went wrong. Shut down.
            break;
        }
    }
});

@danburkert

The receiver keeps receiving buffered items until there are no more, at which point further calls to recv indicate that the channel has been closed.

This confuses me, too. Perhaps @matthieu-m meant to write the following?

Which is why closing from the receiving side is slightly weird. Even if the receivers immediately stop receiving, what of all the items in flight?

@BurntSushi

commented Mar 6, 2018

By the way, I'm sure you know the pains of dealing with poor error messages due to trivial mistakes in macros. Well, you're going to like this. :)

Wow. I had no idea that was possible. 😍

@coder543

commented Apr 27, 2018

Wow. I had no idea that was possible. 😍

@BurntSushi considering that I sent a pull request like 5 months ago that does the same thing for chan... Idk. I just hoped you might have at least looked at the PR at some point. :/ oh well.

@stjepang after reading all of this discussion and thinking about it, I'm starting to lean towards the opinionated, unidirectional channel interface, even though I would have said it makes sense for the receiver to be able to close the channel. The one question I have with that: what happens if the receiver thread panics? Does the sender just keep adding items to the channel's buffer? As @matthieu-m said, losing the items in flight would not be good, but I'm not sure how you would approach creating a new receiver into the same channel from a dead thread, from an API perspective. If there is a solution, that would be interesting to see.

@BurntSushi

commented Apr 27, 2018

considering that I sent a pull request like 5 months ago that does the same thing for chan... Idk. I just hoped you might have at least looked at the PR at some point. :/ oh well.

Sorry, but with the volume of PRs/emails I get that are blocked personally on me, some of them just fall through the cracks. PRs that touch write-once code that I no longer understand naturally go to the bottom of my list, and are often forgotten about unless someone makes noise. It's just the way it is.

@stjepang

Member Author

commented Apr 29, 2018

@coder543

but I'm not sure how you would approach creating a new receiver into the same channel from a dead thread, from an API perspective. If there is a solution, that would be interesting to see.

Perhaps something like this?

fn producer() {
    let (s, r) = bounded(BUFFER);
    let (alive, mut dead) = bounded(0);
    let mut t = {
        let r = r.clone();
        thread::spawn(move || consumer(r, alive))
    };

    while !finished() {
        let msg = produce_message();

        select! {
            send(s, msg) => {}
            recv(dead, _) => {
                let (alive, d) = bounded(0);
                dead = d;
                let r = r.clone();
                t = thread::spawn(move || consumer(r, alive));
            }
        }
    }

    drop(r);
    t.join().ok();
}

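fn consumer(r: Receiver<Message>, alive: Sender<()>) {
    // `alive` is never used for sending. It is dropped when this thread
    // exits or panics, which closes the channel and fires `recv(dead, _)`.
    for msg in r {
        consume_message(msg); // may panic
    }
}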