Allow input and output streams of connected_socket to be closed independently #232
Fixed by 2494d4c.
Turns out that this wasn't fixed for the Perhaps
Do you mean that streams::close should not relate to socket_shutdown? If so, should that not be the case in a normal tcp socket as well? And what is the result of stream close then? Just setting flags?
2017-08-21 11:18 GMT+02:00 Calle Wilund <notifications@github.com>:

> Do you mean that streams::close should not relate to socket_shutdown? If
> so, should that not be the case in normal tcp socket as well? And what is
> the result of stream close then? Just setting flags?
That's not what I wanted, at least not intentionally. The problem is that
we want streams in both directions to be independent, so you can for
example close the read side before you close the write side, or the other
way around. Currently that is not the case for the TLS impl:
1) closing the write side closes the session output socket, which will be
needed to close the read side (closing the read side writes to that
socket). If you close the write side before you close the read side, then
read side closing will fail, you won't be able to perform SHUTDOWN_RDWR.
2) if you close the read side before you close the write side, that issues
SHUTDOWN_RDWR, which implicitly disallows further writes.
The question is, can we rearrange the implementation of the TLS socket so that
the requirement of independence of the streams can be met? Or is TLS
incompatible with this abstraction?

Note that our (current) contract is that the shutdown operations wake up a
pending reader, which is also propagated through the stream close right now.
To give more context here, I've noticed this while trying to add auto-close to streams (https://github.com/scylladb/seastar-dev/commit/e516f105ae31c47665c3de6eb3d4595ce7cf1a46), where a stream is closed automatically when it goes out of scope unless explicitly closed earlier. This depends on the streams being independent. Currently this causes TLS tests to fail due to an exception being thrown from input stream shutdown if it is closed after the output side is closed.
Yes, the gnutls goodbye semantics are indeed incompatible with independent close. We could 100% abandon well-behaved-ness when operating transparently, and require a caller to be TLS-aware and do a different termination with a handshake if they want to adhere to procedure; if not, just use socket-level close. The problem with that is that currently all TLS is hidden in the socket_impl layer.
On Mon, Aug 21, 2017 at 02:18:31AM -0700, Calle Wilund wrote:

> Do you mean that streams::close should not relate to socket_shutdown? If so, should that not be the case in normal tcp socket as well? And what is the result of stream close then? Just setting flags?
>
> Note that our (current) contract is that the shutdown operations wake up a pending reader, which is also propagated through the stream close right now.

As I noted in scylladb/scylladb#2244 (comment), the commit already broke the contract for regular tcp.

--
Gleb.
@elcallio What is the well-behaved way of using the streams of the TLS socket regarding the life cycle of the input/output streams? Can I shut down each side concurrently, or do I have to do it in a certain order?
It should be ok to do from both ends, though only one of the goodbyes will be the actual goodbye (I think). But as stated earlier, it is probably "better" (outside Seastar) to make an application aware of the handshake and build it into whatever state machine you run, i.e. the "client" side does the goodbye. Again, the whole bye-bye handshaking does not fully sit well with transparent TLS.
FYI, another prominent case of TLS transparent layering, Java, does not allow the streams to operate separately. I.e. closing either the input or the output stream handshakes and closes the entire connection. (Unless I read the code wrong...)
Wow, this issue was known for all these years, and I just rediscovered it now in #906. As I explain in #906, the fact that closing just one stream (input or output) immediately closes the entire connection not only is an inconsistent API (because Seastar's TCP sockets don't do this), it also breaks Seastar's HTTP server, which assumes it can continue to write to the output stream even after closing the input stream.
I don't understand the problem here, maybe you can explain this again assuming I know nothing about gnutls and knew in the past - but forgot - the details of the TLS protocol. Let me explain why I'm asking: The Seastar "tradition" is that streams (and files, and almost anything else) need to be |
These again are known issues. The close-all approach was done in 5832c7f; before this we did indeed try to do partial shutdowns, but this caused a number of issues with clients, esp. when doing transparent io. Partially because our general socket abstractions have some issues when it comes to enforcing proper state management (no real raii-behaviour for example). As stated above, the current approach instead models on JDK behaviour of sockets/streams. Which in turn means that one can argue that the non-tls socket usage in seastar apps is borked and should be fixed instead.

The problem again is that to ensure proper TLS handshaking when actually disconnecting (be it shutdown_wr or shutdown_rdwr), both sides need to ack the other. (The initiator can skip acking - but if you have two sides both shutting down write, for example, they need to ack each other.) And to relate to your question: it is quite vital IIRC that a shutdown(WR) causes reactable side effects on the reading side, i.e. wakes things up, otherwise things like rpc etc will deadlock. The only way to ensure this via TLS is via a shutdown message. And I can't 100% remember if shutdown_wr is enough.
What is "transparent io"? I saw you used this phrase a few times, but didn't understand the context.
Right. The main issue is that normal (TCP) sockets and TLS sockets behave differently, so code written and heavily tested for TCP (including Seastar's own HTTP server, see #906, but potentially other Seastar-based applications as well) and then changed to use TLS will stop working correctly.
I don't know if we allow two input streams (what if two fibers read from these two input streams?), but if we do, the solution is simple - use a reference count. The connection will be closed when all the references - both writers and readers - are gone.
After the application has closed the (last?) input stream, and you don't allow calling the function to get the input stream again (you can throw an exception in this case), there is no way that the application can read more - it won't have any way to do it. The low-level implementation may read and discard data if it does arrive, I guess.
This is not relevant to the Seastar API - if the user calls
I agree, but I am not sure it's relevant to Seastar's API. I don't think (but maybe @tgrabiec or @avikivity can correct me) that we ever suggested that
This appears to be the heart of the matter. When you "shutdown(WR)" (I assume you mean close the output stream), it appears you want to wake up the remote side. But since with the current code, at that point it can't send you a reply (you already closed the TLS session!), you have no use for the input stream either. So in the existing code, if RPC wants to close the connection and alert the remote side, it can close both input and output streams. If it does this, then under my suggestion the TLS connection would be closed and the remote side woken up like you wanted. What am I missing?
Using a TLS socket as if it was any other socket, i.e. abstracted away.
Which typically means one has to avoid things like closing input before being done with output. You can just as well argue that our non-tls sockets allow too many things.
But what if the client does not call close? Which is very well known to happen. Of course, it can be handled with extra statefulness in source/sink, but it also needs to be baked into the session. I would argue that enforcing that a client keeps input alive, and closes output first, is not that hard a condition for transparent (I used it again!) socket usage.
There is nothing enforcing this in streams.
Guarantee and "happen to rely on it" are two different things.
IIRC, said close of everything did/does not happen. Just like you want the TLS behaviour to change so it fits into one set of "legacy" code, the current mode exists in large part to ensure it does work with other existing "legacy" code.
I agree that the problem that the non-TLS and TLS sockets differ doesn't immediately say that the TLS one needs to be fixed and not the other way around. However, what I am suggesting is that it is likely that Seastar applications are developed and tested with non-TLS first, and then break when modified to use TLS, and not the other way around. Furthermore, I resurrected this discussion because Seastar's HTTP server had exactly the same problem - it was developed and tested mostly for non-TLS, and it turns out it doesn't work correctly (in a specific scenario) for TLS because of this difference. Yes, it can be modified to work for both, but wouldn't it be nice if both just had the same interface so it wouldn't need to be modified in the first place?
We could decrease the reference count in the destructor, not close(), and then close() would just check if it's the last reference, and if it is, wait for the session destruction. Is this good enough for your use case? Anyway, it should be documented whether or not it is necessary to close() these streams. You are suggesting now that it is fine to not close() both streams, and close just one. But is it also fine not to close either of them? If that's not fine, why is it fine to close just one? Isn't that weird?
It's not hard, but it's completely arbitrary, and as a matter of fact, whoever wrote the HTTP server code (not me...) didn't know this condition existed. And why didn't he know this condition existed? Because it doesn't in the non-TLS case. Which makes this condition non-transparent, instead of transparent. I added in #906 a comment on how I can indeed work around this condition - close both streams together and this will work on both TLS and non-TLS sockets. But this will just solve the HTTP bug and leave us with the inconsistent API that can bite other users in the future.
As usual, we have no useful documentation saying what close() is supposed to mean on a stream.
I think I see now - you're saying that the legacy Seastar-using code doesn't close() both streams - just one, and should continue to work unmodified.
Sure, but I am not certain that it can ever 100% happen. I think there are pitfalls where trying to ensure that the TLS state machine is executed as it should, while allowing all the permutations of semi-allowed socket usage you have with non-TLS sockets, will fail eventually, and in places where it remains hidden until it is at a customer's and things escalate. Because that is what traditionally happened.
Yes, I am aware of how you can do it. My point is the added statefulness, and that stateful things always break. Always.
The problem is that it is ok for "normal" sockets. And this has happened/does happen over and over. Our posix/native IP stack is quite forgiving w/r to violating the "explicit destructors", just as it is forgiving vis-à-vis the order of closing things. So we have had code that does things haphazardly.
But my point is again that the non-TLS case is a bit too forgiving. The http server was initially written way before the TLS layer existed, so it can be partially forgiven for taking liberties.
Socket streams will defer to the source close, which will typically close the actual socket in the proper direction (i.e. shutdown RD). So in this case, closing one input stream will close any other as well.
I can't recall right now where the various issues were. But doing things in the destructor is IIRC not good enough in some cases and will cause problems. I really hope I am wrong about this, because honestly I would love for the state machine to just be automatic and super forgiving like you suggest. I am just pointing out that things are like they are for very real and terrible reasons. And poking them might have side effects that do not show up until they are in data centers somewhere causing angry bug reports.
Btw, one more really terrible scenario: what if a client gets the input stream, reads some, then closes it? There are no references to writers, so we would ideally want to do a full shutdown handshake now (avoiding having background tasks running). But then it would be impossible, after this, to get the output stream and write, something that a "normal" socket usually allows just fine.
For posix_connected_socket_impl (and tls::session [1]), when either the input or the output stream is closed, the whole socket is implicitly closed by closing the pollable_fd. After the socket is closed, no further actions are allowed on it. It also cannot be closed concurrently with any operations on the socket (pollable_fd::close() destroys pollable_fd_state and closes the file descriptor). Because of that, the use of the input and output streams must be synchronized with regard to closing each of them. We cannot close the output stream when the input stream is still in use, and vice versa. We cannot close the input stream after closing the output stream.

One problem is that this makes using the streams correctly complicated. Another problem is that the general contract for input and output streams is that they should be closed before they're destroyed. Here we're violating that by requiring one of the streams to not be closed. This is a problem if the ownership of one of the streams is passed to generic code which will try to close it.
I propose to change this so that:
[1] Similar applies to tls::session: if the input stream is closed after the output stream, we will crash.