Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dist_ffi fixes and receive error handling #3

Closed
wants to merge 2 commits into from
Closed

Conversation

RalfJung
Copy link
Contributor

@RalfJung RalfJung commented May 6, 2021

I figured it'd be a good time to go to the basics and (re-)consider our network / FFI model. In particular I wanted to make it so that Receive will actually return Err=true sometimes, namely when the socket is gone.

But then I realized that this doesn't make sense since when the socket is gone, we try to reconnect! So there's a discussion we need to have about which behavior we want dist_ffi to have. Should a channel created via Connect transparently reconnect? If yes, should Receive on such a channel ever return an error (and when)? Also when (if at all) should Receive on a channel created via Listen return an error?

From a uRPC perspective, it is quite nice that Send reconnects transparently and we anyway could not do much with the information that Receive failed. So maybe we should rename Err in ReceiveRet to Timeout and set some arbitrary timeout (60s) just to keep the operation fallible (which is required from a formal perspective), but not even attempt to actually provide information about network errors on the Receive side?

Also I realized the current "transparent reconnect" code is buggy -- or rather, it is racy; there could be other Send operations going on while we reconnect. So I added a lock to sender, which I then also use to avoid having to copy the data unnecessarily.

@RalfJung RalfJung requested a review from upamanyus May 6, 2021 16:08
@RalfJung
Copy link
Contributor Author

We're going to take a different approach to error handling that I'll implement shortly; let's close this one.

@RalfJung RalfJung closed this May 26, 2021
tchajed added a commit that referenced this pull request Mar 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant