Skip to content
Permalink
Browse files

Bug #41631: Observe socket read timeouts in SSL streams

  • Loading branch information...
rdlowrey committed Aug 7, 2014
1 parent 7189039 commit 6569db88081562f68a4f79e52cba83482bdf05fc
Showing with 46 additions and 0 deletions.
  1. +46 −0 ext/openssl/xp_ssl.c
@@ -204,13 +204,59 @@ static size_t php_openssl_sockop_write(php_stream *stream, const char *buf, size
return didwrite;
}

static void php_openssl_stream_wait_for_data(php_netstream_data_t *sock TSRMLS_DC)
{
int retval;
struct timeval *ptimeout;

if (sock->socket == -1) {
return;
}

sock->timeout_event = 0;

if (sock->timeout.tv_sec == -1)
ptimeout = NULL;
else
ptimeout = &sock->timeout;

while(1) {
retval = php_pollfd_for(sock->socket, PHP_POLLREADABLE, ptimeout);

if (retval == 0)
sock->timeout_event = 1;

if (retval >= 0)
break;

if (php_socket_errno() != EINTR)
break;
}
}

static size_t php_openssl_sockop_read(php_stream *stream, char *buf, size_t count TSRMLS_DC)
{
php_openssl_netstream_data_t *sslsock = (php_openssl_netstream_data_t*)stream->abstract;
php_netstream_data_t *sock;
int nr_bytes = 0;

if (sslsock->ssl_active) {
int retry = 1;
sock = (php_netstream_data_t*)stream->abstract;

/* The SSL_read() function will block indefinitely waiting for data on a blocking
socket. If we don't poll for readability first this operation has the potential
to hang forever. To avoid this scenario we poll with a timeout before performing
the actual read. If it times out we're finished.
*/
if (sock->is_blocked) {
php_openssl_stream_wait_for_data(sock);
if (sock->timeout_event) {
stream->eof = 1;
php_error_docref(NULL TSRMLS_CC, E_WARNING, "SSL read operation timed out");
return nr_bytes;
}
}

do {
nr_bytes = SSL_read(sslsock->ssl_handle, buf, count);

22 comments on commit 6569db8

@oerdnj

This comment has been minimized.

Copy link
Contributor

replied Aug 27, 2014

Daniel, it looks like this update broke TLS connections to IMAP servers:

http://bugs.horde.org/ticket/13491
and
http://bugs.debian.org/759381

@Tyrael

This comment has been minimized.

Copy link
Member

replied Aug 27, 2014

based on the report we can't be sure that it isn't the socket timeout is set too low or that the server is doing the ssl handshake too slowly.
or am I missing something?
btw. Lior already poked me about this and I've sent an email to Daniel, hope he can reply soon, but it would be nice if we could have more info or a repro script.

@oerdnj

This comment has been minimized.

Copy link
Contributor

replied Aug 27, 2014

I have no idea, but it was reported both in IMP/Horde and Roundcube, so it sounds like a regression.

@Tyrael

This comment has been minimized.

Copy link
Member

replied Aug 27, 2014

this change makes sure that the socket timeout is also honoured by the ssl handshake.
so while it can break applications, but it doesn't mean that it is doing the wrong thing.
(one could even argue, that it is fixing a DoS vulnerability)

@Tyrael

This comment has been minimized.

Copy link
Member

replied Aug 27, 2014

but this should have an entry in the UPGRADING docs, so we won't forget mentioning this in the official migration docs.

@oerdnj

This comment has been minimized.

Copy link
Contributor

replied Aug 27, 2014

I am comfortable to have this change in 5.6.x, but I am not sure if such regression (from user POV) should land in stable updates (5.4 & 5.5)...

@staabm

This comment has been minimized.

Copy link
Contributor

replied Aug 27, 2014

we experienced this DoS problem, which was the reason that I poked rdlowrey to have a look at it.

@staabm

This comment has been minimized.

Copy link
Contributor

replied Aug 27, 2014

@oerdnj do you use a self-configured timeout or do you rely on the default timeout?

@rdlowrey

This comment has been minimized.

Copy link
Contributor Author

replied Aug 27, 2014

@oerdnj The only thing I see that might impact correct code (i.e. code where the timeout is sufficiently high) is pending data already buffered by OpenSSL during the handshake. This could lead to the underlying socket never reporting as readable as our polling mechanisms are unaware of such data. If this is the case, it is related to a fairly recent bug that @DaveRandom and I encountered (and for which he currently has a PR in the works).

If this is indeed the problem, changing Line 252 from this:

    if (sock->is_blocked) {

to this:

    if (sock->is_blocked && SSL_pending(sslsock->ssl_handle) == 0) {

should solve the problem. I can apply this change in a new commit either way to be on the safe side (and because we only discovered the possibility for bugs in this area in the last couple of days). However, if this change does not resolve the reported problem I'd be hard-pressed to say there is an actual bug in the php-src implementation and it's likely an issue with a socket timeout setting that is too low in the userland script.

If someone at Debian/Horde/RoundCube is available to verify whether or not the above change resolves the issue it would be helpful.

@Tyrael, would you prefer I commit the additional SSL_pending() check now (as it's the only possible source of the problem on our end that I see) or would you prefer to revert the relevant commits and we can re-apply these changes in the next round of bugfixes?

@Tyrael

This comment has been minimized.

Copy link
Member

replied Aug 27, 2014

the problem is that we are really close to the planned release date (tomorrow) and 5.6.0 is already tagged, but if you think that it is a likely cause for the reported problem, I think we should fix it.
if we can agree that this is a problem, I would prefer reverting this(reverting 84a4041, 98e67ad and removing the NEWS entry should be enough, right?) fix from the 5.6.0 final release instead of risking to have an incomplete fix without enough time to properly test it.
@jpauli, what do you think?

@jpauli

This comment has been minimized.

Copy link
Contributor

replied Aug 27, 2014

Ok guys, the release is tomorrow.

Let's revert the commits please, play it safe , we can't release something we are "barely sure" it works, that's not serious.

The changes may be added to a 5.6.X.

@oerdnj

This comment has been minimized.

Copy link
Contributor

replied Aug 27, 2014

Thank you for reverting the change for now. I am sure the original reporter will show up, so he can confirm or deny the timeouts. (I can also prepare special packages with the changes up there, so we can be sure that the fix is correct for 5.6.x+1).

@staabm

This comment has been minimized.

Copy link
Contributor

replied Aug 27, 2014

@jpauli as PHP 5.3 already reached EOL does it mean that this bug 41631 will never be fixed on a PHP 5.3?

@jpauli

This comment has been minimized.

Copy link
Contributor

replied Aug 27, 2014

@staabm yes : upgrade

@oerdnj

This comment has been minimized.

Copy link
Contributor

replied Aug 27, 2014

JFTR the original reporter said that he hasn't done anything with the timeouts at all. So the default was used... well, unless Roundcube and Horde/IMP are mangling the timeouts in the code.

@rdlowrey

This comment has been minimized.

Copy link
Contributor Author

replied Aug 27, 2014

@jpauli You're right, of course :)

We can make the SSL_pending() addition for 5.6.x (and earlier branches). This is really the only possible issue. If that change does not solve the problem then the issue is almost certainly insufficiently low userland timeout settings.

@staabm There will not be a 5.3 fix forthcoming from me, for sure. The relevant commit here was also not applied to 5.3. Fixing things in EOL releases is only harming the community in my opinion.

@DaveRandom

This comment has been minimized.

Copy link
Contributor

replied Aug 27, 2014

@oerdnj yes, whether the timeouts have been manually defined in code may have no bearing on this issue, as the default_socket_timeout will be used. We believe Daniel's original suggested fix is the only possible solution to this, is there any way you can test this against the problematic code?

It's quite difficult to reproduce this specific bug reliably as there are so many external factors that may influence when/whether it will actually occur.

@arjenschol

This comment has been minimized.

Copy link

replied Sep 8, 2014

This causes a regression in 5.5.17RC1, see https://bugs.php.net/bug.php?id=67965

@rdlowrey

This comment has been minimized.

Copy link
Contributor Author

replied Sep 8, 2014

Thanks -- I have only been in the office sporadically since this was originally reported. Will fix in all relevant branches later today and post updates here.

@DaveRandom

This comment has been minimized.

Copy link
Contributor

replied Sep 9, 2014

A fix for this has been created and will be merged in later today.

@rdlowrey

This comment has been minimized.

Copy link
Contributor Author

replied Sep 9, 2014

@arjenschol this is actually a separate issue from the linked issue you provided. However, that has now been addressed in commit f86b2193a.

@rdlowrey

This comment has been minimized.

Copy link
Contributor Author

replied Sep 9, 2014

This issue should now be resolved via 3728449 which will make its way into the next round of bugfix releases.

Please sign in to comment.
You can’t perform that action at this time.