TLS communication for secure sockets #8814
Codecov Report
@@ Coverage Diff @@
## master #8814 +/- ##
=======================================
Coverage 52.28% 52.28%
=======================================
Files 196 196
Lines 24768 24768
Branches 5151 5151
=======================================
Hits 12951 12951
Misses 9738 9738
Partials 2079 2079
Continue to review full report at Codecov.
Looks good, minor style issues
subsys/net/lib/sockets/sockets_tls.c
Outdated
Remove empty line, we normally do not have empty line between variable setting and if() statement that checks it.
subsys/net/lib/sockets/sockets_tls.c
Outdated
Remove empty line
subsys/net/lib/sockets/sockets_tls.c
Outdated
remove empty line
subsys/net/lib/sockets/sockets_tls.c
Outdated
same here
subsys/net/lib/sockets/sockets_tls.c
Outdated
extra space char in the middle of the string
3faa7e2 to 3a16a9a
@jukkar Thanks, done.
I missed one compile issue in earlier review.
subsys/net/lib/sockets/sockets_tls.c
Outdated
This will give an error if MBEDTLS_X509_CRT_PARSE_C is not defined; the function would not return anything in that case.
Right, I've removed the ifdef and the call to mbedtls_ssl_conf_cert_profile; as we do not verify certs at this point yet, there is no use in setting the profile anyway.
Just FYI, this entire function is going to be rewritten once the credential support is added.
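The compile issue raised in this thread can be illustrated with a minimal sketch. The names below (`tls_config_sketch`, `tls_setup`, `FEATURE_CERT_PARSE`) are hypothetical stand-ins, not the PR's code or mbedTLS identifiers; the point is only that the `return` should sit outside the conditional block.

```c
#include <stddef.h>

/* Hypothetical stand-in for the TLS configuration state; not the PR's type. */
struct tls_config_sketch {
	int profile_set;
};

/*
 * Problematic shape (shown as a comment, not compiled): when the feature
 * macro is undefined, control reaches the end of a non-void function.
 *
 *   static int tls_setup(struct tls_config_sketch *cfg)
 *   {
 *   #if defined(FEATURE_CERT_PARSE)
 *           cfg->profile_set = 1;
 *           return 0;
 *   #endif
 *   }
 */

/* Safer shape: only the optional work is conditional, the return is not. */
static int tls_setup(struct tls_config_sketch *cfg)
{
	if (cfg == NULL) {
		return -1;
	}

#if defined(FEATURE_CERT_PARSE) /* hypothetical feature macro */
	cfg->profile_set = 1;
#endif

	return 0;
}
```

With the feature macro undefined, the function still compiles and returns a value on every path.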
3a16a9a to fbceddd
I'm on vacation until Thurs.
Thanks for this PR!
One general comment...
subsys/net/lib/sockets/sockets_tls.c
Outdated
Now that tls has been moved up to the socket layer, and since offloading is still being done at the net_context layer, we may need to do a little extra to support wifi drivers which completely offload the TLS support to the coprocessor.
In a secure socket offloading case, do we need to allocate a TLS context? Maybe (context->tls == NULL) can be the switch to fall back to the non-TLS zsock_ APIs.
Also, there need not be any calls to mbedtls APIs in that case.
There is a CONFIG_NET_OFFLOAD macro which, if not defined, could compile out these TLS calls from sockets.
For example, see CONFIG_NET_OFFLOAD code sections in https://github.com/zephyrproject-rtos/zephyr/blob/master/subsys/net/ip/net_context.c#L261
However, some wifi drivers may not actually handle secure socket offload, so we may need to introduce a new definition (eg: CONFIG_SECURE_SOCKET_OFFLOAD) to compile out the TLS and mbedtls calls from the socket layer.
Then, I imagine secure socket offload would work by:
- setsockopt() calls setting up the socket for TLS, which calls get passed down to net_context_set_opt() - TBD, which offloads to the driver.
- all the rest of the standard zsock_ socket calls proceed down to net_context_* operations as usual, which offloads to the wifi driver (when CONFIG_NET_OFFLOAD is set).
Is this how we see secure socket offload working?
Otherwise, PR looks good to me.
Thank you for looking into this. Let me address the comments inline.
In a secure socket offloading case, do we need to allocate a TLS context? Maybe (context->tls == NULL) can be the switch to fall back to the non-TLS zsock_ APIs.
Also, there need not be any calls to mbedtls APIs in that case.
With TLS offloading, the TLS context does not need to be allocated, as the context contains mostly mbedTLS data. As for the fallback, that's already implemented. We only allocate a TLS context when a secure socket is created (i.e. the proto type is TLS, for instance IPPROTO_TLS_1_2). When the TLS context is not allocated, we fall back to regular sockets.
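The allocation and fallback rule described here could look roughly like the sketch below. All types, names and the protocol values are hypothetical and chosen only for illustration; they are not the PR's structures or Zephyr's real constants.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical protocol numbers; the values are arbitrary for this sketch. */
enum proto_sketch {
	PROTO_TCP_SKETCH = 1,
	PROTO_TLS_SKETCH = 2,
};

/* Hypothetical TLS context, taken from a small static pool. */
struct tls_ctx_sketch {
	bool in_use;
};

struct sock_sketch {
	enum proto_sketch proto;
	struct tls_ctx_sketch *tls; /* NULL means "regular socket" */
};

static struct tls_ctx_sketch tls_pool[2];

static struct tls_ctx_sketch *tls_alloc(void)
{
	for (size_t i = 0; i < sizeof(tls_pool) / sizeof(tls_pool[0]); i++) {
		if (!tls_pool[i].in_use) {
			tls_pool[i].in_use = true;
			return &tls_pool[i];
		}
	}

	return NULL; /* pool exhausted */
}

/* Allocate a TLS context only when a TLS protocol type is requested. */
static int sock_open(struct sock_sketch *s, enum proto_sketch proto)
{
	s->proto = proto;
	s->tls = NULL;

	if (proto == PROTO_TLS_SKETCH) {
		s->tls = tls_alloc();
		if (s->tls == NULL) {
			return -1;
		}
	}

	return 0;
}

/* Dispatch decision: the TLS path is taken only if a context exists. */
static bool sock_uses_tls(const struct sock_sketch *s)
{
	return s->tls != NULL;
}
```

Under this shape, a socket opened with a non-TLS protocol never touches the TLS pool, so every later call can branch on `s->tls` alone.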
There is a CONFIG_NET_OFFLOAD macro which, if not defined, could compile out these TLS calls from sockets.
However, some wifi drivers may not actually handle secure socket offload, so we may need to introduce a new definition (eg: CONFIG_SECURE_SOCKET_OFFLOAD) to compile out the TLS and mbedtls calls from the socket layer.
Looks like we'd rather need another switch, CONFIG_SECURE_SOCKET_OFFLOAD, just as you mention. At the socket shim layer, it would be enough to block TLS context allocation in case we do support TLS offloading.
net_context would need to accept new protocol types though, so that they can be passed to the offloaded driver. In this solution, we use for instance IPPROTO_TLS_1_2 proto type to create a secure socket. Currently, this information is kept at the socket shim layer though and net_context is not aware of it.
setsockopt() calls setting up the socket for TLS, which calls get passed down to net_context_set_opt() - TBD, which offloads to the driver.
This will be needed as well, for socket configuration. We need a way to set certs/keys/ciphers, and that'd be a job for socket options.
Is this how we see secure socket offload working?
Well, I guess that we have consensus here, that it is doable with CONFIG_NET_OFFLOAD.
But from Nordic's point of view though, a better solution would be something like proposal from #4821. I saw that PR got closed due to lack of consensus and of socket support in application protocols. But as Zephyr's networking subsystem is undergoing a transition to the socket API being the default, and there are more parties interested in the solution (Nordic), perhaps it would be a good idea to bring back the topic? Forcing the transition sockets<->net_context<->sockets sounds like unnecessary complexity to me.
Thanks for the replies...
net_context would need to accept new protocol types though, so that they can be passed to the offloaded driver. In this solution, we use for instance IPPROTO_TLS_1_2 proto type to create a secure socket.
Yes, either net_context_get() could allow a new protocol number (non-IANA standard) in enum net_ip_protocol, or we could pass another option via setsockopt() to promote an open non-secure socket to a secure socket. (The SimpleLink stack can handle either). It may be nice to support both, eventually (if there's a use case).
But from Nordic's point of view though, a better solution would be something like proposal from #4821.
As I understand it, the main blockers for that PR were:
- It bypasses the net_context APIs, so existing protocols built on net_context do not benefit from the socket offload;
- This seems to be getting resolved via the plan to move protocols to sockets, and perhaps by the work to split up net_app into libraries.
- This socket offload method bypasses the Zephyr IP routing tables, so only one network interface is ever bound to the sockets.
- So, for example, packets cannot be routed between network interfaces. Eg: no multi-bearer support (https://en.wikipedia.org/wiki/Multi-bearer_network).
- Any other potential uses of the routing table would also be bypassed. Eg: to bind a socket to a particular network interface.
- This actually is still OK for our customer's typical use case: an IoT edge device connected via a single network interface (WiFi device). But, it's a constraint that would need to be known/accepted building for a pure "socket offload" device.
perhaps it would be a good idea to bring back the topic?
Given there is interest from at least two important vendors of IoT devices (TI, Nordic), and one of the major issues is getting resolved (protocols migrating to sockets), it seems it may be time to do so.
include/net/socket.h
Outdated
This commit is nice, clean and easy to review, I can only +1 it. But I think that commit message should emphasize that the purpose of this commit is to establish "switching infrastructure", but the actual implementation in this commit is null, just redirects to normal socket calls.
subsys/net/lib/sockets/sockets_tls.c
Outdated
This file is not part of this commit, and following commit removes this #include (apparently, the decision was made to make "sockets_tls.h" contents inline). So, I guess it makes sense to give another thought whether it should be inline or standalone, and if the former, then remove this line right from this commit (i.e. make each commit buildable across e.g. bisecting).
subsys/net/lib/sockets/sockets_tls.c
Outdated
EBADF is the error for such a case, see e.g. http://pubs.opengroup.org/onlinepubs/9699919799/functions/close.html
subsys/net/lib/sockets/sockets_tls.c
Outdated
EBADF
subsys/net/lib/sockets/sockets_tls.c
Outdated
Well, this should be atomic, right? So, mutex locking shouldn't be required.
subsys/net/lib/sockets/sockets_tls.c
Outdated
So, given the EBADF comments above, the way to handle errors might be more involved. It should finalize both the TLS context and the socket, but if either returned an error, it should return it too (with errno set appropriately). If both error out, it's up to you to choose which prevails ;-).
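One way to read the suggestions above is sketched below: reject an invalid socket with EBADF, always finalize both the TLS context and the socket, and report the first error seen. The types and helpers (`sock_sketch`, `tls_release`, `plain_close`) are hypothetical stand-ins with injectable failures, not the PR's functions, and letting the TLS error prevail is just one possible choice.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical TLS context; release_err simulates a failing teardown. */
struct tls_ctx_sketch {
	int release_err; /* errno value to report, 0 on success */
};

/* Hypothetical socket; close_err simulates a failing close. */
struct sock_sketch {
	int valid;
	int close_err;
	struct tls_ctx_sketch *tls;
};

static int tls_release(struct tls_ctx_sketch *t)
{
	if (t->release_err != 0) {
		errno = t->release_err;
		return -1;
	}
	return 0;
}

static int plain_close(struct sock_sketch *s)
{
	s->valid = 0; /* the descriptor is gone even if an error is reported */
	if (s->close_err != 0) {
		errno = s->close_err;
		return -1;
	}
	return 0;
}

static int tls_sock_close(struct sock_sketch *s)
{
	int first_err = 0;

	if (s == NULL || !s->valid) {
		errno = EBADF; /* not an open socket */
		return -1;
	}

	/* Finalize the TLS context first, remembering any error. */
	if (s->tls != NULL) {
		if (tls_release(s->tls) < 0) {
			first_err = errno;
		}
		s->tls = NULL;
	}

	/* Always close the underlying socket, even after a TLS error. */
	if (plain_close(s) < 0 && first_err == 0) {
		first_err = errno;
	}

	if (first_err != 0) {
		errno = first_err;
		return -1;
	}

	return 0;
}
```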
subsys/net/lib/sockets/sockets_tls.c
Outdated
Probably worth adding TODO that this won't work for non-blocking sockets.
subsys/net/lib/sockets/sockets_tls.c
Outdated
So, in a sense, K_FOREVER contradicts here a case where is_block == false. Overall, all this stuff adds a lot of complication for unknown benefit.
subsys/net/lib/sockets/sockets_tls.c
Outdated
Why "poll-like" instead of the proper poll? Overall, all this stuff adds a lot of complication for unknown benefit.
Originally I implemented it with poll, which indeed was simpler. It didn't work out though, as poll isn't reentrant (it uses a static event array), so running two separate sockets on two separate threads simply wasn't possible.
I should start by saying that I very much appreciate this refactor - it's more than I could expect. I made some comments, which I think should either be addressed (like making each commit buildable) or responded to. But here's one thing I find an overcomplication and problematic:
I didn't understand what's the problem in the discussion in #7118, and I don't understand it now. I guess, to approach in general manner, we should start with conservative assumption:
Then, we should challenge that by trying to find explicit specification in POSIX where it mandates doing it on socket (i.e. OS) level instead. This specification would include:
Let's consider case 1. Among the domain of possibilities there would be: a) the "first" call finishes and gets as much data as possible, and only then does the "second" call proceed; b) they interleave the received data in a particular pattern; c) they interleave data in an arbitrary pattern. Well, we know that POSIX already allows a read()/recv() call to return fewer bytes than requested. That logically entails choice c), and there's no need to specify that explicitly (indeed, I bet you won't find any spec like the above in POSIX). But then, calling recv() concurrently without additional application-aware synchronization is absolutely useless for apps.
So, suggestion: from "net: tls: Handle TLS socket send and recv", a) remove "struct k_mutex mbedtls_lock;" b) remove the blocking-ops-using-non-blocking-ops trickery. Each of these changes should be the subject of a separate PR (in that order), with specific arguments given for why it is needed and what the drawbacks are. For example, even if we added "struct k_mutex mbedtls_lock", any well-behaving application (if you call an app which shares a socket among threads "well-behaving"; 95% of apps will never do that in the first place) would need to use its own mutex in addition to that, so we'd have double locking, which obviously wastes resources. Is there any benefit to be found in having "struct k_mutex mbedtls_lock" then? Yeah, with some vigour it can be found, but then it should be written down in the commit message and discussed from all sides. (OTOH, I can assure you that there is no benefit, and can't be any, in emulating blocking calls using non-blocking ones, unless exact no-nonsense code showing otherwise is provided.)
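The argument about short reads and application-level locking can be made concrete with a small sketch. It uses plain POSIX `recv()` and a pthread mutex rather than Zephyr's `k_mutex`, and the `shared_sock` type and `recv_message()` helper are hypothetical: an application that shares a socket between threads has to hold its own lock for a whole message anyway, because any single `recv()` may return fewer bytes than requested.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical application-side wrapper: a socket plus the app's own lock. */
struct shared_sock {
	int fd;
	pthread_mutex_t lock;
};

/*
 * Read one fixed-size message. The lock is held across the whole loop so
 * that another thread cannot steal part of the message between two short
 * reads. Returns the number of bytes read (less than len on EOF), or -1
 * on error.
 */
static ssize_t recv_message(struct shared_sock *s, void *buf, size_t len)
{
	size_t got = 0;
	ssize_t ret = 0;

	pthread_mutex_lock(&s->lock);

	while (got < len) {
		/* recv() may legitimately return fewer bytes than asked for. */
		ret = recv(s->fd, (char *)buf + got, len - got, 0);
		if (ret <= 0) {
			break; /* EOF (0) or error (-1) */
		}
		got += (size_t)ret;
	}

	pthread_mutex_unlock(&s->lock);

	return (ret < 0) ? -1 : (ssize_t)got;
}
```

A per-call lock inside the socket layer would not provide this message-level guarantee, which is the double-locking concern raised above.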
To make sure that points in #8814 (comment) are properly considered, let me give this a temporary -1 to avoid premature merge. Thanks.
@pfalcon Thank you for the feedback. Regarding mutex protection, for me it's fine to protect the socket (and therefore mbedTLS) access at application level. But as this protection was added in reply to an explicit concern from @d3zd3z, I'd like to hear from him that he's fine with it as well. Nonetheless, I'll split the mutex-related stuff into separate commit(s), so it can be easily removed in case we agree on that. So @d3zd3z, I'd like to hear your opinion here. As for the other stuff, I'll just fix it and update the PR soon, nothing to disagree with there.
That's why I suggest to split off mutex introduction, etc. to separate focused PR(s), while in the meantime we can merge no-concurrent-socket-access case with this PR ASAP.
Add switch to a socket layer that will enable switching socket API to TLS secure sockets. At this point there is no secure sockets implementation, so secure socket calls redirect to regular socket calls. Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
Add tls_context structure that stores data required by TLS socket implementation. This structure is allocated from global pool during socket creation and freed during socket closure. Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
Add entropy source for mbedTLS. If no entropy driver is available, use non-secure, software entropy source. Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
Add mbedTLS logging function to enable logs from mbedTLS. Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
Implement TLS handshake handling in socket connect/accept functions. Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
fbceddd to a47c871
Implement socket recv/recvfrom and send/sendto functions. Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
Implement socket poll function for TLS socket. In addition to regular poll checks, we have to check if there is some decrypted data pending on mbedTLS. Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
Add config file that allows running the http_get and big_http_download samples with TLS enabled and receiving the data through HTTPS. Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
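The extra poll check mentioned in the poll commit above can be sketched as follows. The struct and function names are hypothetical; in a real mbedTLS-based implementation the pending count would plausibly come from `mbedtls_ssl_get_bytes_avail()`, but whether the PR uses that call is an assumption here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of a TLS socket for the purpose of a readability check. */
struct tls_sock_sketch {
	size_t decrypted_pending; /* bytes already decrypted and buffered by TLS */
	bool transport_readable;  /* what a regular poll on the socket reports */
};

/*
 * A TLS socket is readable if the TLS layer already holds decrypted data,
 * even when the underlying transport has nothing new to deliver.
 */
static bool tls_sock_readable(const struct tls_sock_sketch *s)
{
	if (s->decrypted_pending > 0) {
		return true;
	}

	return s->transport_readable;
}
```

Without the first check, a caller could block in poll while application data is already sitting in the TLS layer's buffer.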
a47c871 to 5e169ff
I've applied fixes and extracted mbedTLS mutex implementation to a separate branch. If there are voices that we need mutex at socket level, I can submit another PR.
Thanks! I think this is a good starting point to be merged, and other things can be implemented/addressed in follow-up PRs.
One question I'd still have is about poll() implementation, but I don't want to complicate matters by requesting splitting it out to another PR, let's go with it as it is for now.
This PR is a follow up of #7118.
It provides an alternative TLS implementation for secure sockets, implemented at socket shim layer instead of net_context layer.
Most of this code was rewritten, but I've kept in mind comments from #7118 wherever applicable. Especially, I've added mutex protection for TLS context allocation and mbedTLS calls.
To avoid long mbedTLS mutex lockups during mbedtls_ssl_read/mbedtls_ssl_write, the underlying socket works in a non-blocking manner, and the actual blocking takes place outside of these calls (as suggested by @d3zd3z).
To avoid bloating this PR, it contains only the TLS communication part. It allows creating a TLS socket and establishing a TLS connection, yet it does not provide socket options to configure the TLS session (credentials, ciphersuites etc.), therefore the communication cannot be considered secure yet.
What is not added in this PR, but will be submitted in consecutive PRs in the near future:
As a proof of concept, I've included two initially ported samples (http_get and big_http_download). They do use TLS for communication, but do not verify certs.