Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DTLSv1_listen() regression on 1.1.x #6934

Open
richardweinberger opened this issue Aug 13, 2018 · 30 comments
Open

DTLSv1_listen() regression on 1.1.x #6934

richardweinberger opened this issue Aug 13, 2018 · 30 comments
Assignees
Labels
branch: master Merge to master branch triaged: feature The issue/pr requests/adds a feature
Milestone

Comments

@richardweinberger
Copy link

The rewrite of DTLSv1_listen() makes multi client DTLS server impossible on OpenSSL 1.1.x.
DTLSv1_listen() peeks into the kernel queue, but does not consume the client hello. Therefore
the common approach, creating a new socket for the client and off loading it, is doomed to failure since the hello is pending and the application logic thinks a new client approached.

This PR seems to address the regression but sadly didn't get much attention yet:
#5024

For more details see my mail:
https://mta.openssl.org/pipermail/openssl-users/2018-August/008498.html

@mattcaswell mattcaswell added this to the Assessed milestone Aug 13, 2018
@levitte levitte added 1.1.0 branch: 1.1.1 Merge to OpenSSL_1_1_1-stable branch labels Aug 29, 2018
@richardweinberger
Copy link
Author

Is there anything I can do?

@mattcaswell mattcaswell self-assigned this Sep 9, 2018
@mattcaswell
Copy link
Member

All our recent focus has been on getting the 1.1.1 release out the door - which (all being well) should happen on Tuesday. I'm hoping to get some cycles to look at this after that.

@richardweinberger
Copy link
Author

kind ping

I don't expect you to fix the problem, just have a look at it and comment on the PR, please.
I'm willing to prepare a fix myself, but I need to know how you want it to be resolved.

@mattcaswell
Copy link
Member

Oops. Thanks for the reminder and your patience. I've been working on this today. I think I have a fix, but it needs more work yet.

@richardweinberger
Copy link
Author

Oh wow! If you have something to test, just throw it at me! :-)

mattcaswell added a commit to mattcaswell/openssl that referenced this issue Oct 9, 2018
Previously when a ClientHello arrives with a valid cookie using
DTLSv1_listen() we only "peeked" at the message and left it on the
underlying fd. This works fine for single threaded applications but for
multi-threaded apps this does not work since the fd is typically reused for
the server thread, while a new fd is created and connected for the client.
By "peeking" we leave the message on the server fd, and consequently we
think we've received another valid ClientHello and so we create yet another
fd for the client, and so on until we run out of fds.

In this new approach we remove the ClientHello and buffer it in the SSL
object.

Fixes openssl#6934
@mattcaswell
Copy link
Member

Please try out the fix in #7375.

In the email you linked to you mentioned the dtls_udp_echo.c program. I've tried using this with the patched version. It works fine on FreeBSD. However it doesn't seem to work on Linux. I also tried the pre-rewrite version of DTLSv1_listen() in OpenSSL 1.0.2 and that didn't work on Linux either. Most of the time it fails to complete the handshake. Every now and then it all goes through smoothly.

AFAICT linux seems to have a problem with creating multiple fds bound to the same address. So if you do this:

mainfd = socket(...);
/* Set mainfd socket opts to reuse addr and port */
bind(mainfd, ...);
/* Set mainfd in the read/write BIOs */
while (DTLSv1_listen(s, caddr) <= 0);
/* now create fd for this peer */
peerfd = socket(...);
/* Set peerfd socket opts to reuse addr and port */
bind(peerfd, ...);
connect(peerfd, ...);
/* Set peerfd as the fd for the BIOs */

Then peerfd will (most of the time) never receive any data. Sometimes it does and everything goes through smoothly. If however you do this:

mainfd = socket(...);
/* Set mainfd socket opts to reuse addr and port */
bind(mainfd, ...);
/* Set mainfd in the read/write BIOs */
while (DTLSv1_listen(s, caddr) <= 0);
/* CLOSE THE MAIN FD FIRST!!! */
close(mainfd);
/* now create fd for this peer */
peerfd = socket(...);
/* Set peerfd socket opts to reuse addr and port */
bind(peerfd, ...);
connect(peerfd, ...);
/* Set peerfd as the fd for the BIOs */

Then it works everytime. Obviously though that isn't very helpful ;-)

That's what happens on my machine anyway. It looks a bit like a kernel bug. Or possibly (more likely) me be stupid somehow.

@mattcaswell
Copy link
Member

Looks like its a linux load balancing feature:

https://lwn.net/Articles/542629/

Which leaves me wondering how you are supposed to do this on Linux.

@richardweinberger
Copy link
Author

I don't understand, why do you need SO_REUSEPORT?
SO_REUSEADDR should work just fine.

@mattcaswell
Copy link
Member

Just doing SO_REUSEADDR has the same issue.

@richardweinberger
Copy link
Author

Give me a day or two such I can test it and talk to Linux netdev folks in case this is a kernel issue.

@mattcaswell
Copy link
Member

mattcaswell commented Oct 9, 2018

Just doing SO_REUSEADDR has the same issue.

PBCAK. I take it back (I had a toy reproducer where it didn't make a difference...but looks like there is another problem with that). Commenting out the SO_REUSEPORT code from the dtls_udp_echo.c program solves the issue. I suspect when that code was written (a long time ago) SO_REUSEPORT support was not in the linux kernel so this conditionally compiled code never got run on linux, and it worked just fine. SO_REUSEPORT now is available on Linux but doesn't work the way dtls_udp_echo.c expects it to.

Anyway problem solved.

@mattcaswell
Copy link
Member

Grr...I meant SO_REUSEPORT not SO_REUSEADDR...I edited the above comment accordingly.

levitte pushed a commit that referenced this issue Oct 19, 2018
Previously when a ClientHello arrives with a valid cookie using
DTLSv1_listen() we only "peeked" at the message and left it on the
underlying fd. This works fine for single threaded applications but for
multi-threaded apps this does not work since the fd is typically reused for
the server thread, while a new fd is created and connected for the client.
By "peeking" we leave the message on the server fd, and consequently we
think we've received another valid ClientHello and so we create yet another
fd for the client, and so on until we run out of fds.

In this new approach we remove the ClientHello and buffer it in the SSL
object.

Fixes #6934

Reviewed-by: Ben Kaduk <kaduk@mit.edu>
(Merged from #7375)

(cherry picked from commit 079ef6b)
@mattcaswell
Copy link
Member

This is now fixed in master and 1.1.1 (1.1.0 is only receiving security fixes now).

@neoxic
Copy link

neoxic commented Dec 28, 2019

I hope it's ok to comment on this issue, but the advertised approach to create a connected a socket like:

bind(peerfd, ...);
connect(peerfd, ...);

is totally flawed because this operation is not atomic.

The window of opportunity between bind() and connect() makes peerfd briefly capable to receive unfiltered messages, e.g. ClientHellos from other peers. As far as I can see, what happens next is that those messages will get dropped by SSL_read() due to failed consistency checks. Will this make SSL_read() signal an unrecoverable SSL error or will it silently carry on?

Yes, one may say that UDP is unreliable by definition, but artificially dropping packets when they have already reached their destination looks fishy at best.

Unfortunately, there's no way to bind a socket both to local and remote addresses atomically on a modern system (except for macOS which has connectx() now). The reliable solution to this problem is to be always ready to receive packets from other peers on connected sockets like peerfd by peeking into them using recvfrom() and comparing their source address to the one that the socket is connected to. In particular, this means that when a packet is received from an unfamiliar peer, it must be treated as a ClientHello request.

I have this approach running reliably without DTLS and now need to implement it for DTLS. The problem with DTLSv1_listen() is that it reads datagrams until EAGAIN and thus is not suitable for processing one request at a time as far as I can see. This makes it impossible to process ClientHellos on a connected socket. Am I missing something?

@mattcaswell
Copy link
Member

This makes it impossible to process ClientHellos on a connected socket.

Maybe I'm misunderstanding what you wrote - but why would you want to use DTLSv1_listen() on a connected socket? It's whole purpose is to listen on an unconnected socket.

@neoxic
Copy link

neoxic commented Dec 28, 2019

Thanks for your quick reply! I guess the best way to illustrate the problem is to provide a code snippet.

Please check out this gist: https://gist.github.com/neoxic/0d9314ec756d37ca4303bced49b94543

@neoxic
Copy link

neoxic commented Dec 28, 2019

Again in a nutshell, peerfd will receive ClientHellos from other peers for some short time under certain load conditions due to a race between bind() and connect().

How will OpenSSL handle this situation? What error will SSL_read() return for a ClientHello from an arbitrary peer on a peerfd? Will it be silently skipped or will it demand that the underlying connection be closed due to having beed compromised? What can be done to mitigate this problem, in particular, is it possible to process ClientHellos on peerfd?

@mattcaswell
Copy link
Member

How will OpenSSL handle this situation? What error will SSL_read() return for a ClientHello from an arbitrary peer on a peerfd?

I believe it will just be silently skipped.

What can be done to mitigate this problem, in particular, is it possible to process ClientHellos on peerfd?

Not at the moment...at least not with any kind of libssl support. I suppose you could do something application side to check for this, but it wouldn't be pretty. I'm not sure what a good libssl solution would like like. libssl is inherently connection based.

hmmmm.....I wonder if you could have some kind of multiplexing BIO the worked with an unconnected fd that had a set of associated child BIOs (one for each peer) which were effectively "connected". The job of the parent multiplexing BIO would be to forward packets to the right child BIO based on the source of the packet.

@neoxic
Copy link

neoxic commented Dec 28, 2019

Writing a custom BIO is always an option indeed. Though it doesn't look like an easy solution for anyone who just wants to implement reliable DTLS support.

The other way around this problem would be a DTLSv1_listen() which could process only one ClientHello request at a time. On my side, I could then check the source address of each packet and schedule either SSL_read() or DTLSv1_listen(). It could have a different name though, let's say, DTLSv1_process_hello() which would be used internally in DTLSv1_listen().

@richardweinberger
Copy link
Author

Thanks for bringing up this topic @neoxic!
I wonder how other DTLS capable SSL libraries handle this case?

@mattcaswell
Copy link
Member

Writing a custom BIO is always an option indeed. Though it doesn't look like an easy solution for anyone who just wants to implement reliable DTLS support.

I was thinking more of providing such a BIO in libssl.

The other way around this problem would be a DTLSv1_listen() which could process only one ClientHello request at a time.

This could possibly be achieved with DTLSv1_listen() as it is and using a mem BIO, i.e. you create a mem bio, and then peek at the packets in your real fd. If they should be handled by DTLSv1_listen(), then read them out and push them into the mem bio. DTLSv1_listen() then only ever reads packets from the mem bio.

@kroeckx
Copy link
Member

kroeckx commented Dec 28, 2019

I assume that we'd still have a main "listen" socket. Either that should also push it in that new bio, or DTLSv1_listen() should look at both.

@kroeckx kroeckx reopened this Dec 28, 2019
@neoxic
Copy link

neoxic commented Dec 29, 2019

I wonder how other DTLS capable SSL libraries handle this case?

I can speak only for MbedTLS which handles TLS/DTLS very consistently using a similar yet much more simple concept of basic I/O (bio) over a session, connection or whatever you call it. Please note that MbedTLS is also connection oriented, but what differs is that MbedTLS gives full control to the user over how a (D)TLS session is handled by allowing to process a handshake and a shutdown for a session in progressions, not just until the user's intervention is unavoidable (WANT_READ/WANT_WRITE). Important stages of mbedtls_handshake() return an intermediate result, thus a (D)TLS handshake can be performed in a successive manner. The same could be easily achieved in OpenSSL, for instance, if DTLSv1_listen() returned intermediate results instead of reading/writing until EAGAIN.

So, instead of handling certain stages of a handshake on the BIO level by messing around with custom multiplexing, peeking into packets, etc., it is much easier to just let the calling party know exactly what needs to be scheduled/done next by returning corresponding results from SSL_do_handshake() for example.

MbedTLS also has "demo" mbedtls_net_* functions which serve illustrative purpose and shouldn't be used in production. For DTLS, those functions have the same flaw (even worse) when a new DTLS connection is created. The new connection is created from a "main" file descriptor by first peeking into a packet with recvfrom() and then calling connect() on it effectively converting it into a session. A new "main" file descriptor is then created by calling bind() to an address fetched via getsockaddr() on the old one. This leads to a much wider window of opportunity where packets not only get lost when no socket is bound, i.e. between connect() and bind(), but they also get lost for the same reason as described above after connect() was called on a session socket (queued packets from other peers). But again, the above doesn't cause any problems because the demo functions serve educative purpose.

@mattcaswell
Copy link
Member

I assume that we'd still have a main "listen" socket. Either that should also push it in that new bio, or DTLSv1_listen() should look at both.

Needs a bit of thought to try and figure out exactly how that would work.

@neoxic
Copy link

neoxic commented Dec 30, 2019

The other way around this problem would be a DTLSv1_listen() which could process only one ClientHello request at a time.

This could possibly be achieved with DTLSv1_listen() as it is and using a mem BIO, i.e. you create a mem bio, and then peek at the packets in your real fd. If they should be handled by DTLSv1_listen(), then read them out and push them into the mem bio. DTLSv1_listen() then only ever reads packets from the mem bio.

I'll stick to this approach for now. Thanks, Matt!

@neoxic
Copy link

neoxic commented Jan 1, 2020

Unfortunately, the mem BIO trick doesn't work with DTLSv1_listen() because a mem bio doesn't know what BIO_dgram_set_peer() is. Consequently, I can't generate a cookie in a cookie callback because I can't get a client's address.

Also, I was quite surprised to discover that there's no way to pass an "opaque user pointer" to a callback in OpenSSL to circumvent the above issue. The s_server.c example uses a static variable ourpeer to pass an address to SSL_stateless() callbacks, wow!

There seems to be no other way but to implement my own custom BIO to solve this issue. OpenSSL's API looks more and more outlandish with each passing day.

@mattcaswell
Copy link
Member

Also, I was quite surprised to discover that there's no way to pass an "opaque user pointer" to a callback in OpenSSL to circumvent the above issue.

Many callbacks do allow that - but apparently not the cookie gen callback. However it does pass a pointer to the SSL. You could use SSL_set_app_data() and SSL_get_app_data() to associate and retrieve arbitrary data with the SSL object - which you can then get at from your callback.

@neoxic
Copy link

neoxic commented Jan 2, 2020

Oh, I didn't know about SSL_set_app_data() and SSL_get_app_data(), thanks! They could still make the mem bio trick work.

@t8m t8m added branch: master Merge to master branch and removed branch: 1.1.0 labels May 12, 2021
@t8m t8m removed this from the 1.1.1a milestone May 12, 2021
@t8m t8m added triaged: feature The issue/pr requests/adds a feature and removed branch: 1.1.1 Merge to OpenSSL_1_1_1-stable branch labels Jul 19, 2021
@t8m t8m added this to the Post 3.0.0 milestone Jul 19, 2021
@t8m
Copy link
Member

t8m commented Jul 19, 2021

Improving the DTLSv1_listen() situation would be rather a feature as it would require API changes and additions.

@tlsalmin
Copy link

tlsalmin commented Jun 9, 2023

Just wanted to drop in and thank @mattcaswell on the memory BIO hint as that solved accepting multiple simultaneous connections for me. The full solution maybe worth a writeup.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch: master Merge to master branch triaged: feature The issue/pr requests/adds a feature
Projects
None yet
Development

No branches or pull requests

8 participants