Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenSSL error 140E0197 with Qt >= 5.12.2 #3679

Closed
an0nfunc opened this issue Apr 29, 2019 · 42 comments · Fixed by #4032
Closed

OpenSSL error 140E0197 with Qt >= 5.12.2 #3679

an0nfunc opened this issue Apr 29, 2019 · 42 comments · Fixed by #4032

Comments

@an0nfunc
Copy link

an0nfunc commented Apr 29, 2019

After upgrading to Qt 5.12.2 all clients got a disconnect after about 5 minutes of connection time. Reproducible with Qt 5.12.2 and Qt 5.12.3, downgrading to 5.12.1 solves this issue.

Not sure if this is a Qt bug or not, so I'm reporting it here first, however it looks to me like a bug in Qt.

murmur.log

Mumble/Murmur 1.3.0_rc1
Archlinux

Edit: Only the Qt version of murmur is relevant, client runs latest Qt just fine with murmur on Qt 5.12.1.

@metsuke0
Copy link

Confirmed this exact issue happens to me and another user using the rc1 client.

@reelsense
Copy link

reelsense commented Jul 15, 2019

Same thing on FreeBSD 11.3-RELEASE with murmur-1.3.0.r1_4 via pkg.

Clients connect and get disconnected instantly with this log entry:

Connection closed: Error while reading: erro r:140E0197:SSL routines:SSL_shutdown:shutdown while in init

Same thing with murmur-1.3.0.r1_5 via ports.

@davidebeatrici davidebeatrici pinned this issue Jul 15, 2019
@davidebeatrici
Copy link
Member

I failed to reproduce the issue on FreeBSD 12.0 (Qt 5.12.2) and Fedora 30 (Qt 5.12.4).

@an0nfunc
Copy link
Author

Just confirmed it's happening with Qt 5.13.0 and murmur 1.3.0-rc2 as well.

Steps to reproduce:

  1. Start murmur
  2. Connect with client (version seems to be unrelated)
  3. Sit in server for about 5 minutes (can vary depending on murmur uptime)
  4. Get disconnected with server-side error message Connection closed: Error while reading: error:140E0197:SSL routines:SSL_shutdown:shutdown while in init [20]

@an0nfunc
Copy link
Author

an0nfunc commented Sep 9, 2019

After some testing it seems like the database used is somehow related. I got no problems after switching the same config from the QMYSQL adapter to sqlite. Same settings otherwise.

Just to confirm, @metsuke0 & @reelsense, what database backend are you using?

EDIT: I'm gonna try and get a SQL log from mariadb for this later today.

@reelsense
Copy link

I'll try to look into this again some time in the next week. I have a feeling it has to do with my large(900MB) sqlite database and murmur 1.3. I moved my murmur server to Ubuntu 18.04 and it runs 1.2.19 PPA.

@reelsense
Copy link

I didn't have time to figure out what dependency was having a problem. So I moved all my mumble servers to Ubuntu 18.04. And since I upgraded them to the new murmur 1.3 release I haven't had a problem.

@davidebeatrici davidebeatrici added bug A bug (error) in the software server priority/P0 - Blocker labels Mar 19, 2020
@davidebeatrici davidebeatrici added this to the 1.3.1 milestone Mar 19, 2020
@davidebeatrici
Copy link
Member

This is a critical issue, we received a private report this week explaining exactly how to trigger it.

We will probably disclose it through a CVE as soon as it's fixed and 1.3.1 is released.

@davidebeatrici davidebeatrici changed the title Connection drops with Qt 5.12.2+ (SSL_shutdown:shutdown while in init) SSL routines:SSL_shutdown:shutdown while in init Mar 19, 2020
@cfstras
Copy link
Contributor

cfstras commented Mar 22, 2020

We've also noticed this trying to build with a new OpenSSL (using the Dockerfile, with the unreleased Ubuntu focal instead of disco).

Interestingly, only some users randomly get kicked out every few minutes.

Example log:

mumble_1       | <W>2020-03-22 21:43:05.034 1 => <34:(-1)> Connection closed: Error during SSL handshake: error:14209102:SSL routines:tls_early_post_process_client_hello:unsupported protocol [13]
mumble_1       | <W>2020-03-22 21:43:05.050 1 => <33:user1234 (12)> Connection closed: Error while reading: error:140E0197:SSL routines:SSL_shutdown:shutdown while in init [20]

@bin
Copy link

bin commented Mar 26, 2020

@davidebeatrici Seeing this issue frequently as well, also with an arch server. Is there any mitigation for this aside from an update, or if not, an ETA on 1.3.1? I haven't seen it on plumble (third-party android app), but have had pymumble bots with the same issue. It also seems to occur with more users more frequently. The server sees clients as ghosts and disconnects them, but client reports the remote host closed the connection. Also mentions being unable to send UDP packets and switching to TCP mode, and as @AnonFuncsAreAwesome mentioned, disconnects usually every five minutes or so (though has disconnected twice in a row within ~15 seconds before).

Update: tested on an alpine server and had the same result.

cfstras pushed a commit to flipdot/mumble-docker that referenced this issue Mar 28, 2020
- don't use the github images for now, because we don't want to authenticate
- use phlak/mumble instead of our own build, until mumble-voip/mumble#3679 is resolved
@cfstras
Copy link
Contributor

cfstras commented Mar 28, 2020

@bin as mentioned above, downgrading Qt on the server should help. Alternatively, you can use a static build, such as the one used in this docker image: https://github.com/PHLAK/docker-mumble

@bin
Copy link

bin commented Mar 28, 2020

@cfstras Do this for client, server, or both?

@cfstras
Copy link
Contributor

cfstras commented Mar 28, 2020

@bin on the server. I wouldn't expect any user of my mumble server to use specific versions of anything...

@bin
Copy link

bin commented Mar 28, 2020

Thanks, so static linking against QT 15.2.1 should be a safe work-around?

@cfstras
Copy link
Contributor

cfstras commented Mar 28, 2020

@bin you will have to try that for yourself ;)

@TredwellGit
Copy link
Contributor

I stumbled upon an astonishingly easy way to trigger this reliably as a denial of service. @davidebeatrici, do you need help tracking down exactly where this issue is? Reason I ask is because I don't know if you are working on this privately and I don't want to waste time debugging this if you already know how to fix it.

@davidebeatrici
Copy link
Member

davidebeatrici commented Apr 2, 2020

Thank you for asking!

We're currently finishing up the new build environment and system, so that all developers can easily build Mumble.

Help is of course appreciated to find the root cause of this issue.

The following function is called right after the warning by OpenSSL:

mumble/src/murmur/Server.cpp

Lines 1424 to 1475 in b54cf63

void Server::connectionClosed(QAbstractSocket::SocketError err, const QString &reason) {
Connection *c = qobject_cast<Connection *>(sender());
if (! c)
return;
if (c->bDisconnectedEmitted)
return;
c->bDisconnectedEmitted = true;
ServerUser *u = static_cast<ServerUser *>(c);
log(u, QString("Connection closed: %1 [%2]").arg(reason).arg(err));
if (u->sState == ServerUser::Authenticated) {
MumbleProto::UserRemove mpur;
mpur.set_session(u->uiSession);
sendExcept(u, mpur);
emit userDisconnected(u);
}
Channel *old = u->cChannel;
{
QWriteLocker wl(&qrwlVoiceThread);
qhUsers.remove(u->uiSession);
qhHostUsers[u->haAddress].remove(u);
quint16 port = (u->saiUdpAddress.ss_family == AF_INET6) ? (reinterpret_cast<sockaddr_in6 *>(&u->saiUdpAddress)->sin6_port) : (reinterpret_cast<sockaddr_in *>(&u->saiUdpAddress)->sin_port);
const QPair<HostAddress, quint16> &key = QPair<HostAddress, quint16>(u->haAddress, port);
qhPeerUsers.remove(key);
if (old)
old->removeUser(u);
}
if (old && old->bTemporary && old->qlUsers.isEmpty())
QCoreApplication::instance()->postEvent(this, new ExecEvent(boost::bind(&Server::removeChannel, this, old->iId)));
if (static_cast<int>(u->uiSession) < iMaxUsers * 2)
qqIds.enqueue(u->uiSession); // Reinsert session id into pool
if (u->sState == ServerUser::Authenticated) {
clearTempGroups(u); // Also clears ACL cache
recheckCodecVersions(); // Maybe can choose a better codec now
}
u->deleteLater();
if (qhUsers.isEmpty())
stopThread();
}

Returning early from the function prevents the client from being disconnected.

It's definitely an issue related to Qt, but I'm not sure whether it's due to a bug or wrong API usage and we're only now seeing the effects due to internal Qt changes.

@TredwellGit
Copy link
Contributor

Found it. In qt/qtbase@93a803a Qt incorrectly calls SSL_shutdown() in OpenSSL mid-handshake. I compiled qt5-base on Arch Linux with that commit reverted and Murmur no longer disconnects clients or shows the error message.

@bin
Copy link

bin commented Apr 2, 2020

@TredwellGit What condition has caused this to trigger for you on the murmur server?

@TredwellGit
Copy link
Contributor

I don't feel that comfortable sharing how to trigger it because it is a very effective denial of service attack and since @davidebeatrici already stated in #3679 (comment) that he wants to wait for it to be fixed.

@davidebeatrici
Copy link
Member

davidebeatrici commented Apr 2, 2020

Found it. In qt/qtbase@93a803a Qt incorrectly calls SSL_shutdown() in OpenSSL mid-handshake. I compiled qt5-base on Arch Linux with that commit reverted and Murmur no longer disconnects clients or shows the error message.

@TredwellGit Good job!

I actually had found that commit too, but didn't investigate further because I mistakenly thought QSslSocketBackendPrivate::destroySslContext() was only called after the session is closed.

From https://www.openssl.org/docs/manmaster/man3/SSL_shutdown.html:

SSL_shutdown() only closes the write direction. It is not possible to call SSL_write() after calling SSL_shutdown(). The read direction is closed by the peer.

The behaviour of SSL_shutdown() additionally depends on the underlying BIO. If the underlying BIO is blocking, SSL_shutdown() will only return once the handshake step has been finished or an error occurred.

If the underlying BIO is non-blocking, SSL_shutdown() will also return when the underlying BIO could not satisfy the needs of SSL_shutdown() to continue the handshake. In this case a call to SSL_get_error() with the return value of SSL_shutdown() will yield SSL_ERROR_WANT_READ or SSL_ERROR_WANT_WRITE. The calling process then must repeat the call after taking appropriate action to satisfy the needs of SSL_shutdown(). The action depends on the underlying BIO. When using a non-blocking socket, nothing is to be done, but select() can be used to check for the required condition. When using a buffering BIO, like a BIO pair, data must be written into or retrieved out of the BIO before being able to continue.

After SSL_shutdown() returned 0, it is possible to call SSL_shutdown() again to wait for the peer's close_notify alert. SSL_shutdown() will return 1 in that case. However, it is recommended to wait for it using SSL_read() instead.

SSL_shutdown() can be modified to only set the connection to "shutdown" state but not actually send the close_notify alert messages, see SSL_CTX_set_quiet_shutdown(3). When "quiet shutdown" is enabled, SSL_shutdown() will always succeed and return 1.

@bin Please send me an email, I will reply with details.

@Krzmbrzl
Copy link
Member

Krzmbrzl commented Apr 2, 2020

Qt incorrectly calls SSL_shutdown() in OpenSSL mid-handshake.

Does this mean this is ultimately a Qt bug? If so has it been reported already? If it has not, we should probably report it in order to get it fixed...

@an0nfunc
Copy link
Author

an0nfunc commented Apr 2, 2020

More interestingly is why only mumble seems to be affected. Is mumble doing something unique with Qt's SSL implementation?

@bendem
Copy link
Contributor

bendem commented Apr 2, 2020

@davidebeatrici
Copy link
Member

davidebeatrici commented Apr 2, 2020

I actually had found that commit too, but didn't investigate further because I mistakenly thought QSslSocketBackendPrivate::destroySslContext() was only called after the session is closed.

Even then, I wouldn't expect other SSL sessions to be affected.

Possible bug in OpenSSL?

OpenSSL's behavior seems to be fine: when returning early from Server::connectionClosed(), preventing the function from effectively disconnecting the user, its session is not affected at all.

Server::connectionClosed() is triggered by Connection::connectionClosed():

connect(u, SIGNAL(connectionClosed(QAbstractSocket::SocketError, const QString &)), this, SLOT(connectionClosed(QAbstractSocket::SocketError, const QString &)));

Connection::connectionClosed() is emitted when QSslSocket::error() is:

connect(qtsSocket, SIGNAL(error(QAbstractSocket::SocketError)), this, SLOT(socketError(QAbstractSocket::SocketError)));

mumble/src/Connection.cpp

Lines 140 to 142 in b54cf63

void Connection::socketError(QAbstractSocket::SocketError err) {
emit connectionClosed(err, qtsSocket->errorString());
}

It seems like Qt is emitting error() in the wrong QSslSocket.

@TredwellGit
Copy link
Contributor

Does this mean this is ultimately a Qt bug? If so has it been reported already? If it has not, we should probably report it in order to get it fixed...

It appears to be a Qt bug. Calling SSL_shutdown() mid-handshake is incorrect and not handling the resulting error is really incorrect. See how NGINX handles this: https://github.com/nginx/nginx/blob/65ae8b315211988a821bdc32050768f41571ddae/src/event/ngx_event_openssl.c#L2732-L2824

More interestingly is why only mumble seems to be affected. Is mumble doing something unique with Qt's SSL implementation?

What other server programs are you aware of that use Qt's TLS implementation?

Even then, I wouldn't expect other SSL sessions to be affected.

This is what is confusing me.

@davidebeatrici davidebeatrici changed the title SSL routines:SSL_shutdown:shutdown while in init OpenSSL error 140E0197 with Qt >= 5.12.2 Apr 4, 2020
@davidebeatrici
Copy link
Member

Workaround in #4032.

@TredwellGit Could you open an issue in Qt Bug Tracker, please?

I can do it too, but you deserve the credits since you're the one who pinpointed the exact line causing the bug.

@Krzmbrzl Krzmbrzl added upstream-bug qt and removed bug A bug (error) in the software server labels Apr 4, 2020
@TredwellGit
Copy link
Contributor

Do you know why Qt emits error() from unrelated QSslSocket(s)? Is this an additional bug?

I will open an issue report, but I am really more concerned about this getting fixed than I am getting credit.

@davidebeatrici
Copy link
Member

I believe it's an additional bug that was already present and didn't cause any issues until qt/qtbase@93a803a.

@davidebeatrici
Copy link
Member

@TredwellGit Did you open an issue in Qt's tracker yet?

@TredwellGit
Copy link
Contributor

https://bugreports.qt.io/browse/QTBUG-83450

@davidebeatrici
Copy link
Member

Excellent, the description clearly explains the issue in detail.

@Krzmbrzl Krzmbrzl unpinned this issue Apr 14, 2020
@davidebeatrici
Copy link
Member

@davidebeatrici
Copy link
Member

Fixed: qt/qtbase@36a8bdb

@Krzmbrzl
Copy link
Member

Any indication in which version this will make it?

@davidebeatrici
Copy link
Member

No, but the fix has been merged in the 5.15, 5.14 and 5.12 branches.

@Krzmbrzl
Copy link
Member

So I guess we can surround our workaround by an #if QT_VERSION < QT_VERSION_CHECK(5, 15, 0), right?

@davidebeatrici
Copy link
Member

Yes, we can.

However, I would also like to disable the workaround for the next 5.12 and 5.14 versions.

@Krzmbrzl
Copy link
Member

However, I would also like to disable the workaround for the next 5.12 and 5.14 versions.

I guess as long as we're checking for the minor version number as well, this should be fine :)

@davidebeatrici
Copy link
Member

Qt 5.12.9: qt/qtbase@36a8bdb
Qt 5.14.3: qt/qtbase@8ddffc6

@an0nfunc
Copy link
Author

an0nfunc commented Jun 8, 2020

Glad that this was resolved 🥇

@davidebeatrici
Copy link
Member

davidebeatrici commented Jun 9, 2020

We requested a CVE ID for Mumble, but MITRE created it for Qt because it's effectively an upstream issue: CVE-2020-13962

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants