Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid triggering SIGPIPE after stream_socket_shutdown(SHUT_WR) of a SSL stream #2605

Closed
wants to merge 1 commit into from

Conversation

bwoebi
Copy link
Member

@bwoebi bwoebi commented Jul 1, 2017

Let $sock be a SSL stream, then

stream_socket_shutdown($sock, STREAM_SHUT_WR);
fclose($sock);

shall not trigger a SIGPIPE.

SSL_shutdown() is called from within fclose() currently and will try to write to the socket, thus that function must be called before SHUT_WR.

I am not sure about the correct behavior within non-blocking I/O streams when the buffer is full. … What will happen then? Will it try to dispatch the SSL alert a second time inside the SSL_shutdown() within fclose() then?

Do I need to set sslsock->ssl_active = 0; here? (but won't that then break the read pipe?!)

please review - ping @bukka @DaveRandom

case STREAM_XPORT_OP_SHUTDOWN:
if (sslsock->ssl_active && (xparam->how == STREAM_SHUT_WR || xparam->how == STREAM_SHUT_RDWR)) {
if (SSL_shutdown(sslsock->ssl_handle) == -1) {
php_stream_socket_ops.set_option(stream, option, value, ptrparam); /* ensure socket shutdown either way, but report failure */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this actually going to work for non-blocking socket? The thing is that if it is -1 then in case of the non-blocking socket the operation is not completed and it will have to be finished once the data can be read or written. It basically depends on SSL_get_error result. So closing socket might not be a good idea maybe.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we should just add a new stream_socket_shutdown_crypto function?

@bukka
Copy link
Member

bukka commented Jul 2, 2017

I think there should be definitely a test for non-block with full buffer. The base could be similar like the one that I added for non-block write issue: https://github.com/php/php-src/blob/f066f59eab719d0171febf7bcd7f70edb5ffd6bd/ext/openssl/tests/bug72333.phpt . It would be also cool to have a test for blocking socket as well which should be even easier. ;)

@bukka
Copy link
Member

bukka commented Jul 2, 2017

About the non-blocking I'm not sure how it should work from the stream point of view. It should definitely fail and not close socket immediately. Maybe something like error first and then allow using select so user can finish the closing when ready. It would probably need some extra logic though.

@krakjoe krakjoe added the Bug label Jul 4, 2017
@krakjoe
Copy link
Member

krakjoe commented Jul 4, 2017

Is it possible to add a test for this ?

@DaveRandom
Copy link
Contributor

DaveRandom commented Jul 25, 2017

TL;DR for a blocking socket, it would be fine to do the SSL_shutdown() in stream_socket_shutdown(), or in fclose() if it hasn't been done yet, for a non-blocking socket this approach is fixing the problem in the wrong place,

In non-blocking mode, stream_socket_shutdown() does not have an API that is suitable for enabling a graceful shutdown of SSL, specifically a boolean return value is not capable of conveying enough information to the caller about what happened - and it's also logically the wrong approach when you compare it to the underlying actions that need to be taken.

Really in order to do this for a non-blocking socket the logic needs to go like this (pseudo):

while (!$result = stream_socket_enable_crypto($sock, false)) {
    if ($result === false) {
        goto failure;
    }

    poll_socket_for_read_or_write($sock);
}

stream_socket_shutdown($sock, STREAM_SHUT_WR);

failure:
fclose($sock);

There are still a couple of problems with this in terms of the existing implementation, though.

  1. stream_socket_enable_crypto() still does not report whether the operation is in SSL_ERROR_WANT_READ or SSL_ERROR_WANT_WRITE, so our poll_socket_for_read_or_write() implementation needs to watch for both readability and writability, which makes the code inefficient and risks a busy wait when the socket is waiting for readable data and the socket is continuously writable (this is true for enabling crypto for non-blocking sockets as well).
  2. The stream_socket_enable_crypto() implementation isn't up to scratch for this approach (and while we're at it, this might not be good enough in non-blocking mode either, although I haven't applied much thought to that yet).

I cannot come up with a way to truly resolve (1) without BC breaks, or introducing a new function.

@DaveRandom
Copy link
Contributor

DaveRandom commented Jul 25, 2017

A possible solution that does not break BC would be to add a third optional by-ref out param to stream_socket_enable_crypto() which reports whether the operation is in WANT_READ or WANT_WRITE state when the return value is 0.

In order to make this work internally I see two options, equally distasteful in different ways:

  1. Add another member to php_stream_xport_crypto_param.outputs to pass the state back to the caller. This is probably the "right" way to do it, but it breaks ABI.
  2. Add more possible values for php_stream_xport_crypto_param.outputs.returncode (e.g. -2 for WANT_READ, -3 for WANT_WRITE). This doesn't break ABI in the sense that existing code which uses explicit values for that member will continue to work, but it does break the semantics of it.

I'm happy to work on a patch for this, though, as I think it would be worth doing, and that particular element of it is a relatively small part.

@DaveRandom
Copy link
Contributor

DaveRandom commented Jul 27, 2017

After looking into this further, it seems that any I/O operation can potentially result in WANT_READ/WANT_WRITE, including a write operation resulting in WANT_READ and vice-versa. As such I propose a new PHP function:

openssl_get_last_error(resource $stream): int

This will allow userland to access this information when one of the following occurs for a non-blocking stream:

fwrite($stream, $data) < strlen($data)
($data = fread($stream, $len)) !== false && strlen($data) < $len
stream_socket_enable_crypto($stream, ...) === 0
// other situations as well?

This would be implemented internally by proxying all calls to SSL_get_error() through an inlined function which will store the result in a thread-global variable. openssl_get_last_error() will simply return this value. This should allow applications to handle all these states correctly, and will not require any breaking changes to existing PHP API.

While exploring xp_ssl.c and researching this, it strikes me that it might be worth refactoring it such that the read/write/close/flush implementations are replaced when a stream is put in/out of blocking mode. The existing implementation is somewhat confusing because of the number of branches around sslsock->s.is_blocked, and the poll loops in the context of non-blocking sockets produced a number of wtfs and took me quite a long time to wrap my head around. If we have separate implementations of these operation handlers for blocking and non-blocking sockets, I believe it will be easier to read and thus more maintainable, It will also potentially be more efficient as there will be fewer branches.

Thoughts please @bukka @bwoebi

@kelunik
Copy link
Member

kelunik commented Jul 27, 2017

fwrite($stream, $data) < strlen($data)
($data = fread($stream, $len)) !== false && strlen($data) < $len

Both of these are quite common for non-blocking reads / writes. Does an app really need to check the error in these cases or can they be trimmed down?

@DaveRandom
Copy link
Contributor

DaveRandom commented Jul 27, 2017

The way I read it, in these cases applications would need to check the error code.

Consider the following situation in terms of how concurrency frameworks currently handle it: fwrite() writes zero bytes but does not fail (i.e. does not return false). At present this is assumed to be because the write buffer is full, so we register a write watcher and try again when it is satisfied. If the reason is actually that the socket is in WANT_READ state, two unintended things will happen:

  1. The socket is writable, so the watcher will be invoked immediately (on the next loop cycle) regardless of whether any data has arrived. This will result in something that I can only describe as an asynchronous busy-wait, where it's effectively polling the socket trying to write the data. This may or may not eventually succeed because...
  2. If there is a read watcher registered, this may end up being fired before the write watcher when the data finally does arrive. This will likely result in an fread() that returns an empty string, which will at present be interpreted as the remote peer closing the connection.

In the opposite situation, where fread() succeeds with an empty string and the socket is in WANT_WRITE state, this would again be erroneously treated as the remote peer closing the connection.

One thing that is notable about the fwrite() situation is that it is a potential DoS vector, if a remote party does something which puts the socket into WANT_READ state while the server has data to send and then stops sending data, it will cause the aforementioned busy-wait state potentially for a long period of time. Multiply this by many concurrent connections and it's a way to make the server burn CPU cycles.

There's no way (that I can see) to mitigate either of these without directly exposing SSL_get_error() to userland in some way.

This change would also solve the original problem that this PR was aiming for, in that it would enable a graceful SSL shutdown before shutting down the TCP layer.

@kelunik
Copy link
Member

kelunik commented Jul 27, 2017

fwrite($stream, $data) < strlen($data)
vs.
fwrite() writes zero bytes but does not fail

An empty read is only considered as a stream end if the resource is no longer valid or feof returns true.

So the conditions would actually be fwrite() writing 0 bytes or fread() returning an empty string.

What would an application actually call if it gets WANT_READ from fwrite()?

@DaveRandom
Copy link
Contributor

When you get WANT_READ from fwrite() you need to call fwrite() again, the difference is that you need to wait for the socket to become readable before you do it, rather than waiting for it to become writable.

@bwoebi
Copy link
Member Author

bwoebi commented Jul 27, 2017

To summarize the socket shutdown case:

We'd call stream_socket_shutdown() repeatedly on a SSL stream resource each time the stream is marked as writable by select/epoll/kqueue etc. until the openssl_get_last_error() isn't returning WANT_WRITE anymore? (WANT_WRITE should be the only error shutting the write stream down should ever produce as long as there aren't other catastrophical failures?)

Am I understanding that correctly? If yes, that sounds like a nice API solution to me, given that there's no real break either.

@kelunik
Copy link
Member

kelunik commented Jul 27, 2017

@bwoebi No, we'd not use stream_socket_shutdown() at all, but stream_socket_enable_crypto.

@bwoebi
Copy link
Member Author

bwoebi commented Jul 27, 2017

uh, yes, confused the functions. You're right ;-)

@DaveRandom
Copy link
Contributor

@bwoebi yes that's correct

@bukka
Copy link
Member

bukka commented Jul 30, 2017

@DaveRandom I think it definitely makes sense to somehow expose the result from SSL_get_error. I'm not sure about adding a new function in openssl however. I think it should be more stream related and it should go to the stream funcs to be more consistent with other stream operations - e.g.
stream_socket_enable_crypto. Maybe stream_socket_get_ssl_error or something like that. But that's just a detail anyway.

I think it's a good idea to separate just php_openssl_sockop_io to 2 functions for non-blocking and blocking socket. That would definitely make it more readable and maintainable. Other parts (flush/close) don't seem necessary as there are not too many differences unless I'm missing something.

Also I think we should resolve the issue for stream_socket_shutdown when user doesn't call stream_socket_enable_crypto. It means handling shutdown nicely. Basically internally disabling crypto in the same way as it would be done by stream_socket_enable_crypto.

@DaveRandom
Copy link
Contributor

@bukka agree on the naming completely, I was thinking the same to myself this morning. I was pondering suggesting renaming (with alias for BC) stream_socket_enable_crypto() to stream_socket_crypto_enable() (or possibly _toggle is more descriptive?) and then it could be stream_socket_crypto_get_error() or something - grouping related functions by prefix is good where possible IMO - but I feel like that might meet resistance and it's not really necessary.

I've not really looked at the flush/close implementations. Intuitively I feel like flush should be different for a non-blocking stream, because if buffer size > send window then logically it could be a blocking call and/or return without completing in non-blocking mode, I will look into it further though. close() IIRC is always a blocking call when it comes to the underlying sockets API, and it's the responsibility of the application to ensure that the send buffer is empty before calling if it doesn't want to block. That might be wrong though, I was planning to give myself a refresher course this week.

When a socket is in blocking mode, it absolutely does make sense to gracefully shut down SSL before shutting down the socket. In non-blocking mode I'm not sure it does because again, it may or may not be a blocking call for any given socket depending on the state of the connection and what's in the send buffer... my personal view is that when you put a socket into non-blocking mode you are essentially saying "I am taking responsibility for ensuring everything happens in the right order" and as such if you attempt to shutdown the socket without shutting down SSL it should maybe be a best-effort approach, that might result in an unclean shutdown of the SSL layer or perhaps will fail entirely - I'm very much open to persuasion on this point, though.

What I'm bothered about is ensuring it is possible to do it "correctly" in non-blocking mode. I guess I might be OK with forcing the socket into blocking mode for the shutdown operation, since in the case of a socket where SSL has been properly shut down by the application it would still be an atomic operation, and if you didn't do that then it's "your fault" that it's now a blocking operation.

I am planning to try to work up a patch this week, but I am also happy if someone else who can do it faster than me wants to take it on. I don't have loads of spare time and I will have to rtfm a bit to remind myself of how some stuff works.

@bukka
Copy link
Member

bukka commented Aug 2, 2017

@DaveRandom I wouldn't probably rename stream_socket_enable_crypto. I definitely agree stream_socket_crypto_toggle or even stream_socket_tls_toggle would be better names but don't think it's worth the change. Maybe adding something like stream_socket_get_crypto_error would be sort of consistent. :)

I'm not sure what we should do in terms of flush actually? Currently it just flushes the underlying socket. Not too sure about close either. I guess it might need maybe some work to make sure that it is safe to close it for non-blocking but not sure atm.

In terms of shutdown I think it would make sense just to return false if SSL_shutdown returns negative value for non-blocking and let user handle it in the same way as stream_socket_enable_crypto. It means just poll till the socket is available for read or write (in dependency on the SSL error).

I won't have time to write a patch as I have bunch of other things on my list so it would be great if you find a time. I should be able to help with testing and review possibly.

@krakjoe
Copy link
Member

krakjoe commented Oct 2, 2019

There's been no movement here in more than two years. It seems as if the discussion raised the possibility of another (API based) solution, which may be superior anyway.

I'm closing this, if there is to be work on this issue please reference this PR from the new pull request (or RFC).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants