Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recovering from errors during writes #229

Closed
jonhoo opened this issue Feb 27, 2018 · 6 comments
Closed

Recovering from errors during writes #229

jonhoo opened this issue Feb 27, 2018 · 6 comments

Comments

@jonhoo
Copy link

jonhoo commented Feb 27, 2018

In most cases, a write to an io::Write will only fail if the writer is no longer valid. In those cases, the is no reason to continue writing, and the serialization should simply be aborted. However, in other cases, it would be useful to resume a write into an io::Write following an error. This applies particularly to transient errors like ErrorKind::TimedOut, ErrorKind::WouldBlock, and ErrorKind::Interrupted.

Unfortunately, since bincode currently writes things out piece by piece (and calls into the writer multiple times) without keeping track of how far it has gotten, resuming from these errors is basically impossible. One has to first serialize the value into a Vec<u8>, and then write from that value into the underlying writer, which introduces some unfortunate double-buffering.

What's the proposed solution for handling these kinds of errors?

@TyOverby
Copy link
Collaborator

If you can't afford buffering, my proposed solution would be to create your own Write implementating struct that keeps track of how many bytes were written into it, and when it is retried, it drops the first n bytes, but continues to write the bytes past the last successful write.

This does mean that you do wind up going through the effort of serializing a lot of your content into the void when you don't really need to, but I don't think that there's any other way to solve this given Bincodes constraints.

@jonhoo
Copy link
Author

jonhoo commented Feb 27, 2018

I guess such a wrapper would also need to know when serialization starts and finishes (i.e., the first and last write of a serialization) so that it knows when to start dropping, and when to reset n. Though managing that would then presumably be the responsibility of the thing calling bincode.

The alternative proposal here would be for bincode to keep track of its serialization progress somehow, but that would likely also come with some pretty substantial API changes that we'd want to avoid. This should probably be mentioned in the docs for bincode::serialize_into, since ErrorKind::Interrupted is an error that can relatively easily come up in normal execution, and would completely break most implementations that aren't prepared for this.

@TyOverby
Copy link
Collaborator

Though managing that would then presumably be the responsibility of the thing calling bincode.

Yep!

The alternative proposal here would be for bincode to keep track of its serialization progress somehow, but that would likely also come with some pretty substantial API changes that we'd want to avoid.

Also, because bincode uses Serde for serialization/deserialization, it would require breaking Serde changes. Last time I talked to them, they have no plans to ever break API compat.

ErrorKind::Interrupted is an error that can relatively easily come up in normal execution

For this error in particular, could you have Write wrapper that retries the internal write call on Interrupted "errors"?

struct RetryWriteOnInterrupted<W: Write>(W);
impl Write for ......

bincode::serialize_into(&my_struct, RetryOnInterrupted(my_writer));

@TyOverby
Copy link
Collaborator

And yeah, this would be a good thing to document, thanks for bringing it up!

@jonhoo
Copy link
Author

jonhoo commented Feb 28, 2018

Yes, that would work. This is basically what write_all does, but given that bincode does many small writes, you don't actually get that benefit. This sounds like a good idea for crate (both retry and "drop-n") :)

jonhoo added a commit to mit-pdos/noria that referenced this issue Feb 28, 2018
Previously, send_back would not block on WouldBlock or Interrupted, and
would silently just not send acks to the clients. That's be bad. This
commit fixes that. See also
bincode-org/bincode#229. This has the unfortunate
side-effect of introducing a race between mutators and domains. If a
mutator doesn't wait for acks sufficiently often, it may end up blocking
on sending to the domain, while the domain is blocking on sending acks
to the client. It *also* means that domains now wait for slow clients...
@ZoeyR
Copy link
Collaborator

ZoeyR commented Apr 16, 2020

Closing since it seems that this is best solved by an external Write wrapper.

@ZoeyR ZoeyR closed this as completed Apr 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants