crypto/tls: support kernel-provided TLS #44506
Comments
cc @FiloSottile |
I would love to have this happen as well! It's a major use case for L7 load balancers written in golang, and could transparently provide significant performance boosts for a lot of systems (including Kubernetes) |
Can we get some benchmarks and numbers for the performance improvement? My patch linked above might be a good starting point. It's a lot of complexity and it would have to be justified by very good numbers. |
Hi all, I have updated the kernel TLS support based on @FiloSottile's original code. It now supports more ciphers, like AES_GCM_256, AES_CCM_128 and CHACHA20_POLY1305. Code: https://github.com/jim3ma/go/tree/dev.ktls.1.16.3. I also fixed some kernel issues while coding: torvalds/linux@974271e, torvalds/linux@d8654f4. In my simple tests, enabling kernel TLS decreased the time cost by 30%. |
I made some real-world tests with one of our internal applications (CDN node specialised in delivering video segments for DASH and HLS streams).
I compared https vs http, vs http + sendfile and ktls + sendfile. Most of the TLS stuff is working, except TLS 1.3 with Chrome and k6. k6 reports With ktls, the latency is increased - but this can also be related to the difference in the used Go-Versions. The ktls implementation reduces overall CPU usage, around 10%. We'll deploy the Nvidia ConnectX-6 (200 Gbit/s) in our latest hardware setup, and we hope we can use the TLS NIC offloading in the future. https://docs.google.com/spreadsheets/d/1XaiFczae9GLixu__8y2kuKPsw7RGqW9vMDkYxuTLx28/edit#gid=0 |
@totallyunknown If the latency issue is related to the kernel implementation (rule out golang side) we can take a look at kernel side improvements. We've been using the openssl implementation lately so I'll check there as well, but I don't recall extra latency last time I did metrics. Having a golang implementation would be very useful on my side as well. fwiw I'm one of the ktls maintainers on kernel side so we shouldn't have trouble getting improvements there as needed and happy to help where I can to get this moving forward. |
Which version did you test? I have updated some Go code for HTTP with kTLS. |
@jim3ma Your branch: https://github.com/jim3ma/go/tree/dev.ktls.1.16.3 |
Okay, I will merge some optimized code into this branch tomorrow. |
This comment was marked as duplicate.
@jim3ma are there any plans to introduce the changes into the Go code? |
Sorry, I have been busy with work. I will rebase the kTLS code onto the latest branch and test it again. |
any updates? |
@jim3ma curious about the updates too been checking here @FiloSottile i've been watching ktls progress for golang since you started the blog in 2021. once this is ktls-ed, i believe will be one of golang's greatest milestone ever. |
I did some rough benchmarks late last year where I had Golang call into rust's TLS library via CGO to do the handshake and then handed off the established TCP connection to Golang. I found that the performance (throughput/latency on sustained traffic) ended up being about the same as golang's built-in TLS or slightly worse. I'm not sure why to be honest - maybe I did something wrong? But I would like to see some numbers hopefully from someone else on the actual performance of the kTLS implementation in the linux kernel. |
@ShivanshVij u hv the code for helping to debug? but ktls is better for sure. |
Based on my understanding, kTLS does not magically work. It's used for zero copy, so you have to send an fd through a syscall. |
Yep - so the implementation was really straight forward.
Start a TCP listener, wait for a TCP connection to get accepted, and read some N bytes from it and send them to `rustls` via CGO. If we needed more bytes the `rustls` library would signal that, otherwise it would give us some bytes to write back to the connection - which we would do in Go by blindly writing the byte slice into the `net.Conn`.
Once the handshake was complete, we'd pull out the required kTLS secrets from the handshake in `rustls`, and then do the required syscalls in Go to tell the kernel that the `fd` that backed the `net.Conn` is a kTLS `fd`.
After that, future reads/writes on the `net.Conn` would result in proper TLS encryption/decryption without any userspace overhead.
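For readers who want to see what "the required syscalls" operate on, here is a minimal sketch of the key material handed to the kernel for TLS 1.2 with AES-128-GCM. The constants and field order follow `linux/tls.h`; the helper name `cryptoInfoAESGCM128` is made up for illustration, and the encoding assumes a little-endian host (amd64, arm64). It only builds the bytes; it does not call `setsockopt`.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Values from the Linux UAPI headers (linux/tls.h, linux/tcp.h, linux/socket.h).
const (
	solTLS             = 282    // SOL_TLS, setsockopt level for kTLS
	tcpULP             = 31     // TCP_ULP, set to "tls" before installing keys
	tlsTX              = 1      // TLS_TX, transmit direction
	tlsRX              = 2      // TLS_RX, receive direction
	tls12Version       = 0x0303 // TLS_1_2_VERSION
	tlsCipherAESGCM128 = 51     // TLS_CIPHER_AES_GCM_128
)

// cryptoInfoAESGCM128 (hypothetical helper) serialises
// struct tls12_crypto_info_aes_gcm_128:
// version(2) cipher_type(2) iv[8] key[16] salt[4] rec_seq[8] = 40 bytes.
// Assumption: little-endian host, so the two __u16 fields are little-endian.
func cryptoInfoAESGCM128(key, salt, iv, recSeq []byte) ([]byte, error) {
	if len(key) != 16 || len(salt) != 4 || len(iv) != 8 || len(recSeq) != 8 {
		return nil, fmt.Errorf("bad key material lengths")
	}
	b := make([]byte, 4, 40)
	binary.LittleEndian.PutUint16(b[0:2], tls12Version)
	binary.LittleEndian.PutUint16(b[2:4], tlsCipherAESGCM128)
	b = append(b, iv...)
	b = append(b, key...)
	b = append(b, salt...)
	b = append(b, recSeq...)
	return b, nil
}

func main() {
	info, err := cryptoInfoAESGCM128(make([]byte, 16), make([]byte, 4), make([]byte, 8), make([]byte, 8))
	if err != nil {
		panic(err)
	}
	fmt.Printf("crypto info: %d bytes, for setsockopt(fd, %d, %d or %d, ...) after TCP_ULP (%d) = \"tls\"\n",
		len(info), solTLS, tlsTX, tlsRX, tcpULP)
}
```

The same structure is installed once per direction (`TLS_TX` and `TLS_RX`), each with that direction's key, IV, salt and record sequence number taken from the finished handshake.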
|
One more thing: many higher-end network cards have crypto acceleration, which can be accessed through the kTLS API, so by supporting kTLS in Go we would be able to offload encryption to the network card. |
@rolandshoemaker and @FiloSottile to work out an API. It sounds like we should work on an API where Go keeps the handshake and then hands off the key so the kernel can do the record layer. |
@FiloSottile and I discussed this, and we wonder if this can be done without any new secret-sharing API at all: if kTLS is good enough, then Go should arrange to use it by default, right? We'd probably also need to add ReadFrom and WriteTo methods to the tls.Conn implementations so that io.Copy goes straight to sendfile, but no new TLS-related API would be needed. Is there a flaw in this thinking? Are there Go or Rust kTLS implementations already that are worth looking at to understand the kernel interaction details? We spent a while reading linux/tls.h but it's not terribly well documented. And are there other operating systems with kTLS that we should look at? |
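To illustrate the `ReadFrom` idea, here is a small sketch of how `io.Copy` picks up an `io.ReaderFrom` on the destination. The `kernelConn` type is a hypothetical stand-in, not `tls.Conn`; a real implementation would hand the source file to `sendfile` inside `ReadFrom` instead of buffering in memory.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// kernelConn is a hypothetical stand-in for a connection whose record
// layer lives in the kernel. It records whether io.Copy used ReadFrom.
type kernelConn struct {
	sent         bytes.Buffer
	usedReadFrom bool
}

func (c *kernelConn) Write(p []byte) (int, error) { return c.sent.Write(p) }

// ReadFrom implements io.ReaderFrom. io.Copy calls it instead of its
// generic read/write loop when the source has no WriteTo method.
func (c *kernelConn) ReadFrom(r io.Reader) (int64, error) {
	c.usedReadFrom = true
	return c.sent.ReadFrom(r) // a real version would try sendfile here
}

// copyThrough copies payload into a kernelConn and reports what was
// sent and whether the ReadFrom fast path was taken.
func copyThrough(payload string) (string, bool) {
	conn := &kernelConn{}
	// Wrap the reader so only Read is exposed; strings.Reader has its
	// own WriteTo, which io.Copy would otherwise prefer.
	src := struct{ io.Reader }{strings.NewReader(payload)}
	if _, err := io.Copy(conn, src); err != nil {
		panic(err)
	}
	return conn.sent.String(), conn.usedReadFrom
}

func main() {
	sent, fast := copyThrough("video segment")
	fmt.Println(sent, fast)
}
```

Note that `io.Copy` checks the source's `WriteTo` before the destination's `ReadFrom`, so which side wins depends on the types involved.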
I believe it would be the right thing to get kTLS going as a default on supported systems. Having a secret-sharing API might be useful for some developers though, maybe something one can meddle with explicitly. Maybe this helps with the kernel interaction. Looking at FreeBSD would probably be a good idea. The implementation seems quite mature. |
Nginx has had support since around 2021. Although I think it just delegates the hard work to OpenSSL. Still might be worth a look here; https://hg.nginx.org/nginx/rev/65946a191197 |
Meddlers can always use reflect and unsafe. No need to add API for them. |
here, some of the unverified and broken ktls on my radar: when's the eta for this? been looking at this thread since 2021. :D @totallyunknown 's doing 165GBits/s on a 400Gbits/s line is really weak. i'm hoping for the performance too. @rsc possible for meddlers to live with one without the alloc/op too? that'll be heaven. talk about zero alloc/op... i really wish arena feature is fully supported as non-experimental. |
This is not kernel TLS. It looks like some TLS secret as Kubernetes secret |
Possibly on the roadmap for 1.24 if we have the time. |
@sprappcom 400G is the future goal. 165 Gbit/s is with 2x100G (NVIDIA Connect-X6 + AMD Rome). |
@totallyunknown ok. 82.5% is impressive. mine can only do 60% on laptop |
It sounds like people are on board for "no new API", implementation on by default once it works, with a GODEBUG like tlskernel=0 to turn off. Do I have that right? |
I have a few concerns about on-by-default: There is a meaningful difference in data being written to the kernel in plaintext vs encrypted, from a debugging, tooling, and even security POV. (I am not claiming there are legitimate threat vectors here, but some people are quite paranoid, and I am not an expert -- so I suspect others might). kTLS may work on a wide-ish range of Linux versions which we can check against, but it doesn't necessarily work well on all of them. https://people.kernel.org/kuba/tls-1-3-rx-improvements-in-linux-5-20 for instance shows there are some very recent critical performance improvements. This makes it tricky to know what is the right bar to implicitly turn this on. Is it the oldest Linux version that supports the features indicated? The oldest one we measured as "fast enough"? What if changes to Go or Linux change whether it is "fast enough", or different kernel or hardware configurations do? For instance, kTLS may be "not fast enough" on Linux 5.0 in general, but I may have a NIC that supports kTLS offload making it suitable even on that version. I won't claim to be a TLS expert, it just feels like there are a tremendous number of variables to consider. I could see maybe in many years there is enough clarity on real world use cases, production testing, etc that we get to the point where we could turn it on by default. However, I don't think that would happen for many years likely, and even when it did I would think it is controversial enough to warrant first class configuration rather than just a GODEBUG. FWIW, around a year ago I did some performance testing of Go+kTLS with some fork I found (sorry, I forget a lot of details at this point). The performance was pretty rough, and, surprisingly, even worse when using |
@howardjohn about the point of which is the first good enough version: we already conditionally use Linux features based on the kernel's version in the codebase (lines 879 to 894 in 52ce25b).
These old benchmarks showed worse latency even when throughput is significantly higher.
If this happens again, because handshakes will still happen in userland Go, we don't need to transition right away to kTLS; we can handle the first 1MB (or X bytes) in userland to keep a healthy time-to-first-byte. |
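A rough sketch of the "first X bytes in userland" idea, to make the control flow concrete. Everything here is hypothetical: `handoffWriter`, its fields, and the two plain `io.Writer` paths stand in for the userland record layer and the kTLS-enabled socket. A real implementation would also have to switch on a record boundary and hand the current sequence number to the kernel.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// handoffWriter (hypothetical) sends the first `threshold` bytes through
// the userland path and everything after that through the kernel path.
type handoffWriter struct {
	userland  io.Writer
	kernel    io.Writer
	threshold int64
	written   int64
}

func (w *handoffWriter) Write(p []byte) (int, error) {
	dst := w.kernel
	if w.written < w.threshold {
		// Simplification: a write that straddles the threshold still
		// goes entirely through userland.
		dst = w.userland
	}
	n, err := dst.Write(p)
	w.written += int64(n)
	return n, err
}

// demoHandoff writes the chunks in order and returns what each path saw.
func demoHandoff(threshold int64, chunks ...string) (string, string) {
	var user, kern bytes.Buffer
	w := &handoffWriter{userland: &user, kernel: &kern, threshold: threshold}
	for _, c := range chunks {
		if _, err := io.WriteString(w, c); err != nil {
			panic(err)
		}
	}
	return user.String(), kern.String()
}

func main() {
	user, kern := demoHandoff(4, "ab", "cd", "ef")
	fmt.Printf("userland=%q kernel=%q\n", user, kern)
}
```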
The fact that deciding when to enable kTLS is hard is not a reason to delegate the choice to the user, but the opposite: it means we should do the research, measure the performance, weigh the tradeoffs, and make the judgement calls, so our users don't have to. We're in the business of building a TLS stack, our users are in the business of writing Go programs. More concretely, yes, I think we should figure out which Linux version has "good enough" kTLS, and require that. Users that want kTLS can upgrade their kernel version, and as time passes that will be less and less of a problem. I am not too concerned about weird combinations of old kernels and powerful NICs, if they are common, we'll hear about it and reassess. |
ktls are for extreme users, just go extreme in this order (seriously, i doubt there will be more than 100 proj adopting this coz i'm looking at this thread since 2021)
get it out already by now, pls at go 1.24. make it an experimental feature at least. thx in advance. @rsc @FiloSottile GOEXPERIMENT=ktls those who want edge cases can fork it and do the package they want on their own os. |
Let's keep the GODEBUG name starting with tls like all the other tls names, so tlskernel=1 not ktls=1. It sounds like we all agree not to add new API other than the GODEBUG. With the proposal being just to add the GODEBUG and work on an implementation, I think this is moving towards acceptance. Do I have that right? |
Have all remaining concerns about this proposal been addressed? The proposal is to develop transparent kTLS support behind GODEBUG=tlskernel=1. |
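For anyone unfamiliar with the GODEBUG format, here is a sketch of how a `tlskernel` setting could be read from a `key=value,key=value` string. This is an illustration only: `godebugValue` and `kernelTLSEnabled` are hypothetical helpers, not how the Go runtime or `crypto/tls` actually look up settings, and the opt-in reading (`tlskernel=1` turns it on) follows the proposal text above.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// godebugValue (hypothetical helper) returns the value of key in a
// comma-separated GODEBUG-style string. Later settings override earlier
// ones. It returns "" if the key is absent.
func godebugValue(godebug, key string) string {
	val := ""
	for _, kv := range strings.Split(godebug, ",") {
		k, v, ok := strings.Cut(kv, "=")
		if ok && k == key {
			val = v
		}
	}
	return val
}

// kernelTLSEnabled reports whether tlskernel=1 is set, i.e. kTLS is
// opt-in as described in the proposal.
func kernelTLSEnabled(godebug string) bool {
	return godebugValue(godebug, "tlskernel") == "1"
}

func main() {
	fmt.Println("kTLS requested:", kernelTLSEnabled(os.Getenv("GODEBUG")))
}
```

Under this reading, a program would be started with `GODEBUG=tlskernel=1` in its environment to opt in.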
Hi guys, I have a use case you may want to hear. Dropping it here even though I'm not 100% sure kTLS would be the solution, my TLS knowledge is very limited, sorry about that. My use case is seamless application upgrade. My TCP clients use long-lived (weeks or more) connections and allowing software updates without disconnecting them would be a killer. I understand this could very well be achieved without kTLS with lots of knowledge and hacking, but I don't want to do it the hacky way and my readings make me think this use case is not supported by the current crypto/tls API and there is no plan to do so. Am I right? Client TLS session resumption is not an option unfortunately, I have no control on the many possible client software. Hope this helps 😃 |
I wonder if there would be an issue passing the open socket to a separate process if kTLS is involved... my initial research says there wouldn't be, but definitely worth testing. |
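As a starting point for testing this, here is a sketch of passing an open descriptor over a Unix socket with `SCM_RIGHTS`, using the standard `syscall` package on Linux. To stay runnable it passes the read end of a pipe within a single process; the assumption being tested in practice would be that a kTLS-enabled TCP socket handed over the same way keeps its kernel-side TLS state, which this sketch does not demonstrate. The helper names are made up.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"syscall"
)

// sendFD passes fd over the Unix socket sock as an SCM_RIGHTS message.
// One byte of ordinary data carries the ancillary message.
func sendFD(sock, fd int) error {
	return syscall.Sendmsg(sock, []byte{0}, syscall.UnixRights(fd), nil, 0)
}

// recvFD receives a single descriptor sent with sendFD.
func recvFD(sock int) (int, error) {
	data := make([]byte, 1)
	oob := make([]byte, syscall.CmsgSpace(4)) // room for one 32-bit fd
	_, oobn, _, _, err := syscall.Recvmsg(sock, data, oob, 0)
	if err != nil {
		return -1, err
	}
	msgs, err := syscall.ParseSocketControlMessage(oob[:oobn])
	if err != nil {
		return -1, err
	}
	if len(msgs) != 1 {
		return -1, fmt.Errorf("expected 1 control message, got %d", len(msgs))
	}
	fds, err := syscall.ParseUnixRights(&msgs[0])
	if err != nil {
		return -1, err
	}
	if len(fds) != 1 {
		return -1, fmt.Errorf("expected 1 fd, got %d", len(fds))
	}
	return fds[0], nil
}

// roundTrip writes msg into a pipe, passes the pipe's read end over a
// socketpair, and reads msg back through the received descriptor.
func roundTrip(msg string) (string, error) {
	pair, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return "", err
	}
	defer syscall.Close(pair[0])
	defer syscall.Close(pair[1])

	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()
	if _, err := w.WriteString(msg); err != nil {
		return "", err
	}
	w.Close() // so the reader sees EOF after msg

	if err := sendFD(pair[0], int(r.Fd())); err != nil {
		return "", err
	}
	fd, err := recvFD(pair[1])
	if err != nil {
		return "", err
	}
	received := os.NewFile(uintptr(fd), "received")
	defer received.Close()
	out, err := io.ReadAll(received)
	return string(out), err
}

func main() {
	got, err := roundTrip("still connected")
	fmt.Println(got, err)
}
```

In a real upgrade scenario the sender and receiver would be the old and new process, connected by a Unix domain socket instead of a socketpair.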
I hadn't even imagined passing a TLS enabled socket to another process. That would definitely be useful for me also. Currently we use a separate process to handle TLS and then proxy the connection over unix sockets to another process that does the actual work. This causes unwanted resource usage and latency, but allows us to keep the connections open while recycling the worker process. If kTLS could make this whole design a lot simpler, we'd be very happy campers. |
Not giving you a complete solution but do take a look at pidfd_open(2), pidfd_getfd(2), pidfd_send_signal(2), @astrolox, if you wish to avoid unix socket with recent (5.6?, 5.10?) linux kernels. |
I got a proof-of-concept working here. I basically adapted the first attempt from @FiloSottile to up to date Go TLS stack. I also added support for enabling TLS_RX on the socket. (example from Filippo only configured TLS_TX) In the new process, you end up with a net.Conn in which you need to:
I still need to have this run for a long time with lots of data to make sure there is no catch. Potential drawbacks with the new process:
Also, it needs support for all ciphers; I only implemented it in |
Based on the discussion above, this proposal seems like a likely accept. The proposal is to develop transparent kTLS support behind GODEBUG=tlskernel=1. |
No change in consensus, so accepted. 🎉 The proposal is to develop transparent kTLS support behind GODEBUG=tlskernel=1. |
Lots of background and an implementation, albeit from 3+ years ago: https://blog.filippo.io/playing-with-kernel-tls-in-linux-4-13-and-go/
Basically, Linux now supports handling TLS encryption in the kernel. The primary benefit here is the possibility of `sendfile`/`splice` to work with TLS. Currently, we need to choose between TLS and `splice` (or a custom TLS implementation, I suppose).
It would be great to have first class support in go for this.