Recovering from errors during writes #229

jonhoo · 2018-02-27T22:53:49Z

In most cases, a write to an io::Write will only fail if the writer is no longer valid. In those cases, the is no reason to continue writing, and the serialization should simply be aborted. However, in other cases, it would be useful to resume a write into an io::Write following an error. This applies particularly to transient errors like ErrorKind::TimedOut, ErrorKind::WouldBlock, and ErrorKind::Interrupted.

Unfortunately, since bincode currently writes things out piece by piece (and calls into the writer multiple times) without keeping track of how far it has gotten, resuming from these errors is basically impossible. One has to first serialize the value into a Vec<u8>, and then write from that value into the underlying writer, which introduces some unfortunate double-buffering.

What's the proposed solution for handling these kinds of errors?

The text was updated successfully, but these errors were encountered:

TyOverby · 2018-02-27T23:43:12Z

If you can't afford buffering, my proposed solution would be to create your own Write implementating struct that keeps track of how many bytes were written into it, and when it is retried, it drops the first n bytes, but continues to write the bytes past the last successful write.

This does mean that you do wind up going through the effort of serializing a lot of your content into the void when you don't really need to, but I don't think that there's any other way to solve this given Bincodes constraints.

jonhoo · 2018-02-27T23:59:14Z

I guess such a wrapper would also need to know when serialization starts and finishes (i.e., the first and last write of a serialization) so that it knows when to start dropping, and when to reset n. Though managing that would then presumably be the responsibility of the thing calling bincode.

The alternative proposal here would be for bincode to keep track of its serialization progress somehow, but that would likely also come with some pretty substantial API changes that we'd want to avoid. This should probably be mentioned in the docs for bincode::serialize_into, since ErrorKind::Interrupted is an error that can relatively easily come up in normal execution, and would completely break most implementations that aren't prepared for this.

TyOverby · 2018-02-28T00:06:23Z

Though managing that would then presumably be the responsibility of the thing calling bincode.

Yep!

The alternative proposal here would be for bincode to keep track of its serialization progress somehow, but that would likely also come with some pretty substantial API changes that we'd want to avoid.

Also, because bincode uses Serde for serialization/deserialization, it would require breaking Serde changes. Last time I talked to them, they have no plans to ever break API compat.

ErrorKind::Interrupted is an error that can relatively easily come up in normal execution

For this error in particular, could you have Write wrapper that retries the internal write call on Interrupted "errors"?

struct RetryWriteOnInterrupted<W: Write>(W);
impl Write for ......

bincode::serialize_into(&my_struct, RetryOnInterrupted(my_writer));

TyOverby · 2018-02-28T00:07:01Z

And yeah, this would be a good thing to document, thanks for bringing it up!

jonhoo · 2018-02-28T00:07:51Z

Yes, that would work. This is basically what write_all does, but given that bincode does many small writes, you don't actually get that benefit. This sounds like a good idea for crate (both retry and "drop-n") :)

Previously, send_back would not block on WouldBlock or Interrupted, and would silently just not send acks to the clients. That's be bad. This commit fixes that. See also bincode-org/bincode#229. This has the unfortunate side-effect of introducing a race between mutators and domains. If a mutator doesn't wait for acks sufficiently often, it may end up blocking on sending to the domain, while the domain is blocking on sending acks to the client. It *also* means that domains now wait for slow clients...

ZoeyR · 2020-04-16T04:02:08Z

Closing since it seems that this is best solved by an external Write wrapper.

ZoeyR closed this as completed Apr 16, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Recovering from errors during writes #229

Recovering from errors during writes #229

jonhoo commented Feb 27, 2018

TyOverby commented Feb 27, 2018

jonhoo commented Feb 27, 2018

TyOverby commented Feb 28, 2018

TyOverby commented Feb 28, 2018

jonhoo commented Feb 28, 2018

ZoeyR commented Apr 16, 2020

Recovering from errors during writes #229

Recovering from errors during writes #229

Comments

jonhoo commented Feb 27, 2018

TyOverby commented Feb 27, 2018

jonhoo commented Feb 27, 2018

TyOverby commented Feb 28, 2018

TyOverby commented Feb 28, 2018

jonhoo commented Feb 28, 2018

ZoeyR commented Apr 16, 2020