-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Active socket, implementation with state machine #68
Conversation
…sed or :einval depending on timing of send call)
Confirmed the latest changes fix this issue #65 (comment) |
end | ||
|
||
defp reply_to_caller(reply, %{caller: caller, context: context} = data) do | ||
caller && :gen_statem.reply(caller, format_reply(reply, context)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can avoid the boolean logic here, in fact if we're ending up here when caller is nil
, we have a bug somewhere further up and it would be best to crash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to avoid situations where the receiver sends more packets after full MLLP has been received.
It would not be a bug on our side, but rather a receiver misbehaving. Do you think we should crash in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we'd hit the clause at line 577, no? This raises an interesting question none the less. Crashing aside, one place a RST
is appropriate is when the the remote sends data that is unexpected. I wonder if we should shutdown in this case? Feels heavy handed, but yeah 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we'd hit the clause at line 577, no?
You are correct, we are probably covered for the "more packets" case. Although I'm not entirely sure, as there could be a race condition with the timer in the "receiving" state. There is actually a more serious reason for checking if we have a caller - the receiver can disconnect at any given time. If it happens while the 'send' call is in progress, we notify the caller, otherwise, we don't (no caller there).
If we want to shut down on the "more packets", the natural place is around line 577 indeed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I'm not entirely sure, as there could be a race condition with the timer in the "receiving" state.
Not sure, will have to stare a bit more and we'll both have to attempt to put it through the ringer :)
I think shutting down makes sense, I think we can just use maybe_close
, we might need to make a modification though (i.e., maybe sometimes we want to do a graceful close and maybe some times an abortive close).
Right now maybe_close will attempt to do a graceful close, in this case it's not what we want. We want :gen_tcp.shutdown(socket, :read_write)
. This is the same behavior as send_timeout_close: true
, which ends up resulting in a sock_select()
call that does an abortive close (RST
). This should be double checked though.
Speaking of send_timeout_close: true
, if we don't have it this set as the default in sock opts we should, the only reason it's false in otp is for backwards compat reasons (though it should be configurable by the user).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the send_timout_close bit was resolved by your last commit @bokner. WE still need to hang up the phone on receive_timeout
as well. Basically, per the nature of MLLP and the implicit behaviour when shipping HL7 around, there's may be no recourse in a situation where the receive timed out.
Specifically, the client can either hang up the phone or make assumptions about how the server is implemented. In this regard, both the client and server may end up in a bad state and form an error loop. One example where a receive timeout can result in bad things happening is you peel off part of the response, you time out waiting for the rest. The caller of the client decides to send again, yet there still data in transit (i.e., not in the local buffers and bubbled up to our app yet), the send happens, and now the rest of the ack from a previous response arrives. Meanwhile, the remote might get what you just sent and either accept it, and send back a nack or hang up the phone.
Tricky business 😁
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The downside of dropping the connection on the client side for whatever reason (say, receive_timeout
) is that then it would have to reconnect again. Maybe it is a better recourse after all.
I guess the sane client would try to check for connection on any error anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, and to add to that, there needs to be options for all of it, that really bothers me, I think having too many options is never great, but sadly since MLLP isn't a much of a protocol, the behaviour is going to differ from system to system. All we can do is try to provide a mostly sane set of defaults that cover most cases. If I go by the "protocol" and how tcp works, it leads me here to these decisions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tell ya what, let's not agonize over this right now. We're not at 1.0, code is easy to change, etc. Let's get test passing, merge, and take it from there.
Co-authored-by: Bryan Paxton <39971740+starbelly@users.noreply.github.com>
Co-authored-by: Bryan Paxton <39971740+starbelly@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beautiful! ❤️🧡💛💚💙💜
* hca/main: Active socket, implementation with state machine (HCA-Healthcare#68)
Switch to use active mode for the socket. Addresses #66.