crypto/tls: support kernel-provided TLS #44506

Open
howardjohn opened this issue Feb 22, 2021 · 48 comments
Labels
Proposal Proposal-Accepted Proposal-Crypto Proposal related to crypto packages or other security issues
Milestone

Comments

@howardjohn
Contributor

Lots of background and an implementation, albeit from 3+ years ago: https://blog.filippo.io/playing-with-kernel-tls-in-linux-4-13-and-go/

Basically, Linux now supports handling TLS encryption in the kernel. The primary benefit here is the possibility of sendfile/splice to work with TLS. Currently, we need to choose between TLS and splice (or a custom TLS implementation, I suppose).

It would be great to have first-class support for this in Go.

@seankhliao seankhliao changed the title crypto/tls: support Kernal TLS proposal: crypto/tls: support Kernal TLS Feb 22, 2021
@seankhliao seankhliao added Proposal Proposal-Crypto Proposal related to crypto packages or other security issues labels Feb 22, 2021
@gopherbot gopherbot added this to the Proposal milestone Feb 22, 2021
@seankhliao
Member

cc @FiloSottile

@rsc rsc changed the title proposal: crypto/tls: support Kernal TLS proposal: crypto/tls: support kernel-provided TLS Feb 24, 2021
@ShivanshVij

I would love to have this happen as well! It's a major use case for L7 load balancers written in golang, and could transparently provide significant performance boosts for a lot of systems (including Kubernetes)

@FiloSottile
Contributor

Can we get some benchmarks and numbers for the performance improvement? My patch linked above might be a good starting point. It's a lot of complexity and it would have to be justified by very good numbers.

@jim3ma

jim3ma commented Nov 29, 2021

Hi, all

I have updated kernel tls support based on @FiloSottile's original code. It now supports more ciphers like AES_GCM_256, AES_CCM_128 and CHACHA20_POLY1305.

Code: https://github.com/jim3ma/go/tree/dev.ktls.1.16.3.

I also fixed some kernel issues along the way: torvalds/linux@974271e, torvalds/linux@d8654f4

In my simple tests, enabling kernel TLS reduced the time cost by about 30%.

@totallyunknown

I made some real-world tests with one of our internal applications (CDN node specialised in delivering video segments for DASH and HLS streams).

  • Kernel 5.13.12
  • Curve: prime256v1

I compared https vs http, vs http + sendfile and ktls + sendfile.

Most of the TLS stuff is working, except TLS 1.3 with Chrome and k6. k6 reports tls: oversized record received with length 62464.

With ktls, the latency is increased - but this can also be related to the difference in the used Go-Versions.

The ktls implementation reduces overall CPU usage, around 10%. We'll deploy the Nvidia ConnectX-6 (200 Gbit/s) in our latest hardware setup, and we hope we can use the TLS NIC offloading in the future.

https://docs.google.com/spreadsheets/d/1XaiFczae9GLixu__8y2kuKPsw7RGqW9vMDkYxuTLx28/edit#gid=0

@jrfastab

@totallyunknown If the latency issue is related to the kernel implementation (rule out golang side) we can take a look at kernel side improvements. We've been using the openssl implementation lately so I'll check there as well, but I don't recall extra latency last time I did metrics. Having a golang implementation would be very useful on my side as well. fwiw I'm one of the ktls maintainers on kernel side so we shouldn't have trouble getting improvements there as needed and happy to help where I can to get this moving forward.

@jim3ma

jim3ma commented Feb 15, 2022

I made some real-world tests with one of our internal applications (CDN node specialised in delivering video segments for DASH and HLS streams).

  • Kernel 5.13.12
  • Curve: prime256v1

I compared https vs http, vs http + sendfile and ktls + sendfile.

Most of the TLS stuff is working, except TLS 1.3 with Chrome and k6. k6 reports tls: oversized record received with length 62464.

With ktls, the latency is increased - but this can also be related to the difference in the used Go-Versions.

The ktls implementation reduces overall CPU usage, around 10%. We'll deploy the Nvidia ConnectX-6 (200 Gbit/s) in our latest hardware setup, and we hope we can use the TLS NIC offloading in the future.

https://docs.google.com/spreadsheets/d/1XaiFczae9GLixu__8y2kuKPsw7RGqW9vMDkYxuTLx28/edit#gid=0

Which version did you test? I have updated some Go code for HTTP with kTLS.

@totallyunknown

@jim3ma

jim3ma commented Feb 15, 2022

@jim3ma Your branch: https://github.com/jim3ma/go/tree/dev.ktls.1.16.3

Okay, I will merge some optimized code into this branch tomorrow.

@VirrageS

@jim3ma are there any plans to introduce the changes into the Go code?

@jim3ma

jim3ma commented Jul 29, 2023

@jim3ma are there any plans to introduce the changes into the Go code?

Sorry, I have been busy with work. I will rebase the kTLS code onto the latest branch and test it again.

@zyxkad
Contributor

zyxkad commented Mar 18, 2024

any updates?

@ouvaa

ouvaa commented Mar 28, 2024

@jim3ma curious about the updates too

been checking here
https://github.com/0-haha/gnet-tls-go1-20/
and ref:
panjf2000/gnet#534

@FiloSottile i've been watching ktls progress for golang since you started the blog in 2021.
this is sort of the final huge golang performance benchmark penalty ever.

once this is ktls-ed, i believe it will be one of golang's greatest milestones ever.

@ShivanshVij

I did some rough benchmarks late last year where I had Golang call into rust's TLS library via CGO to do the handshake and then handed off the established TCP connection to Golang.

I found that the performance (throughput/latency on sustained traffic) ended up being about the same as golang's built-in TLS or slightly worse.

I'm not sure why to be honest - maybe I did something wrong? But I would like to see some numbers hopefully from someone else on the actual performance of the kTLS implementation in the linux kernel.

@ouvaa

ouvaa commented Mar 28, 2024

@ShivanshVij u hv the code for helping to debug? but ktls is better for sure.

@zyxkad
Contributor

zyxkad commented Mar 29, 2024

I did some rough benchmarks late last year where I had Golang call into rust's TLS library via CGO to do the handshake and then handed off the established TCP connection to Golang.

Based on my understanding, kTLS does not magically work; it's used for zero copy, so you have to pass an fd through a syscall.

@ShivanshVij

ShivanshVij commented Mar 29, 2024 via email

@kanocz

kanocz commented Mar 29, 2024

One more thing: many higher-end network cards have crypto acceleration that can be accessed through the kTLS API, so by supporting kTLS in Go we would be able to offload encryption to the network card.
So please don't compare only software encryption in Go vs software encryption in the kernel; that is not so relevant for many production environments.

@rsc
Contributor

rsc commented Jun 5, 2024

@rolandshoemaker and @FiloSottile to work out an API. It sounds like we should work on an API where Go keeps the handshake and then hands off the key so the kernel can do the record layer.
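
As a rough illustration of what "hands off the key" involves on Linux: the kernel's TLS documentation describes attaching the "tls" upper layer protocol to the TCP socket with setsockopt(TCP_ULP) and then installing per-direction state with setsockopt(SOL_TLS, TLS_TX or TLS_RX). The state is a small C struct. The sketch below only serializes that struct for TLS 1.2 AES-128-GCM, following the field order in linux/tls.h; the function name is made up, a little-endian host is assumed, and how the four inputs are derived from the handshake is deliberately left out.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Values from the Linux UAPI header linux/tls.h.
const (
	tls12Version       = 0x0303 // TLS_1_2_VERSION
	tlsCipherAESGCM128 = 51     // TLS_CIPHER_AES_GCM_128
)

// cryptoInfoAESGCM128 serializes struct tls12_crypto_info_aes_gcm_128:
// a 4-byte header (version, cipher type) followed by iv[8], key[16],
// salt[4] and rec_seq[8], 40 bytes in total.
// Assumption: little-endian host (amd64, arm64), because the two header
// fields are plain C integers in host byte order.
func cryptoInfoAESGCM128(key, salt, iv, recSeq []byte) ([]byte, error) {
	if len(key) != 16 || len(salt) != 4 || len(iv) != 8 || len(recSeq) != 8 {
		return nil, errors.New("unexpected key material length")
	}
	buf := make([]byte, 4, 40)
	binary.LittleEndian.PutUint16(buf[0:2], tls12Version)
	binary.LittleEndian.PutUint16(buf[2:4], tlsCipherAESGCM128)
	buf = append(buf, iv...)
	buf = append(buf, key...)
	buf = append(buf, salt...)
	buf = append(buf, recSeq...)
	return buf, nil
}

func main() {
	// Placeholder key material; real values come out of the handshake.
	key := make([]byte, 16)
	salt := make([]byte, 4)
	seq := make([]byte, 8)

	info, err := cryptoInfoAESGCM128(key, salt, seq, seq)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("crypto info length:", len(info))
}
```

Other cipher suites and TLS 1.3 use sibling structs with different field sizes, so a real implementation would need one such layout per supported cipher.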

@rsc
Contributor

rsc commented Jun 11, 2024

@FiloSottile and I discussed this, and we wonder if this can be done without any new secret-sharing API at all: if kTLS is good enough, then Go should arrange to use it by default, right? We'd probably also need to add ReadFrom and WriteTo methods to the tls.Conn implementations so that io.Copy goes straight to sendfile, but no new TLS-related API would be needed.

Is there a flaw in this thinking?

Are there Go or Rust kTLS implementations already that are worth looking at to understand the kernel interaction details? We spent a while reading linux/tls.h but it's not terribly well documented.

And are there other operating systems with kTLS that we should look at?

@4xoc

4xoc commented Jun 11, 2024

I believe it would be the right thing to get kTLS going as a default on supported systems. Having a secret sharing API might be useful for some developers though, maybe something one can meddle with explicitly.

Maybe this helps with the kernel interaction.

Looking at FreeBSD would probably be a good idea. The implementation seems quite mature.

@astrolox

Nginx has had support since around 2021. Although I think it just delegates the hard work to OpenSSL. Still might be worth a look here; https://hg.nginx.org/nginx/rev/65946a191197

@rsc
Contributor

rsc commented Jun 12, 2024

I believe it would be the right thing to get kTLS going as a default on supported systems. Having a secret sharing API might be useful for some developers though, maybe something one can meddle with explicitly.

Meddlers can always use reflect and unsafe. No need to add API for them.

@sprappcom

sprappcom commented Jun 14, 2024

here, some of the unverified and broken ktls on my radar:
https://github.com/0-haha/gnet-tls-go1-20/blob/dev/ktls_linux.go
https://github.com/soluble-ai/go-ktls/blob/master/ktls.go

when's the eta for this? been looking at this thread since 2021. :D

@totallyunknown 's doing 165GBits/s on a 400Gbits/s line is really weak. i'm hoping for the performance too.

@rsc possible for meddlers to live with one without the alloc/op too? that'll be heaven.

talk about zero alloc/op... i really wish arena feature is fully supported as non-experimental.

@harshavardhana
Contributor

https://github.com/soluble-ai/go-ktls/blob/master/ktls.go

This is not kernel TLS. It looks like some TLS secret as Kubernetes secret

@rolandshoemaker
Member

Possibly on the roadmap for 1.24 if we have the time.

@totallyunknown

totallyunknown commented Jun 17, 2024

@totallyunknown 's doing 165GBits/s on a 400Gbits/s line is really weak. i'm hoping for the performance too.

@sprappcom 400G is the future goal. 165 Gbit/s is with 2x100G (NVIDIA Connect-X6 + AMD Rome).

@sprappcom

sprappcom commented Jun 19, 2024

@totallyunknown ok. 82.5% is impressive. mine can only do 60% on laptop

@rsc
Contributor

rsc commented Jun 20, 2024

It sounds like people are on board for "no new API", implementation on by default once it works, with a GODEBUG like tlskernel=0 to turn off.

Do I have that right?

@howardjohn
Contributor Author

I have a few concerns about on-by-default:

There is a meaningful difference in data being written to the kernel in plaintext vs encrypted, from a debugging, tooling, and even security POV. (I am not claiming there are legitimate threat vectors here, but some people are quite paranoid, and I am not an expert, so I suspect others might.)

kTLS may work on a wide-ish range of Linux versions which we can check against, but it doesn't necessarily work well on all of them. https://people.kernel.org/kuba/tls-1-3-rx-improvements-in-linux-5-20 for instance shows there are some very recent critical performance improvements. This makes it tricky to know what is the right bar to implicitly turn this on. Is it the oldest Linux version that supports the features indicated? The oldest one we measured as "fast enough"? What if changes to Go or Linux change whether it is "fast enough", or different kernel or hardware configurations do? For instance, kTLS may be "not fast enough" on Linux 5.0 in general, but I may have a NIC that supports kTLS offload making it suitable even on that version.

I won't claim to be a TLS expert, it just feels like there are a tremendous number of variables to consider. I could see maybe in many years there is enough clarity on real world use cases, production testing, etc that we get to the point where we could turn it on by default. However, I don't think that would happen for many years likely, and even when it did I would think it is controversial enough to warrant first class configuration rather than just a GODEBUG.

FWIW, around a year ago I did some performance testing of Go+kTLS with some fork I found (sorry, I forget a lot of details at this point). The performance was pretty rough, and, surprisingly, even worse when using splice which is what should be the big win. I wouldn't put too much weight on that given the vague claims + age + unofficial implementation, but something to keep an eye on.

@Jorropo
Member

Jorropo commented Jun 21, 2024

@howardjohn about the question of which version is the first good-enough one:
I don't see why we shouldn't default to using it from the Linux version where Linux's software implementation is good enough to match or beat Go's, because at worst performance is similar but you can now use sendfile and friends.
5.0 being slow is not a good reason to arbitrarily slow down 5.20 (or whatever the threshold is).
We can make it a ternary option with GODEBUG=tlskernel=always for users on 5.0 who happen to have compatible hardware.

We already conditionally use linux features based on the kernel's version in the codebase:

return major > 5 || (major == 5 && minor >= 3)
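
For illustration only, a version gate of this shape could be sketched as below. The function names and the idea of parsing the `uname -r` release string are assumptions, not the standard library's internal code, and the thresholds in `main` are just the ones mentioned in this thread.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseKernelRelease extracts major and minor from a release string such as
// "5.13.12" or "6.1.0-18-amd64" (the format printed by uname -r).
func parseKernelRelease(release string) (int, int, bool) {
	parts := strings.SplitN(release, ".", 3)
	if len(parts) < 2 {
		return 0, 0, false
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, false
	}
	// The minor component may carry a suffix ("4-rc1"); keep leading digits.
	digits := parts[1]
	for i, r := range digits {
		if r < '0' || r > '9' {
			digits = digits[:i]
			break
		}
	}
	minor, err := strconv.Atoi(digits)
	if err != nil {
		return 0, 0, false
	}
	return major, minor, true
}

// kernelAtLeast reports whether release is at least wantMajor.wantMinor.
// Unparseable release strings are treated as "too old".
func kernelAtLeast(release string, wantMajor, wantMinor int) bool {
	major, minor, ok := parseKernelRelease(release)
	if !ok {
		return false
	}
	return major > wantMajor || (major == wantMajor && minor >= wantMinor)
}

func main() {
	fmt.Println(kernelAtLeast("5.13.12", 5, 3))
	fmt.Println(kernelAtLeast("4.13.0", 5, 3))
}
```

Treating a parse failure as "too old" keeps the userspace implementation as the safe default when the kernel version cannot be determined.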

These old benchmarks showed worse latency even when throughput is significantly higher.
However this is already a problem in the TLS protocol due to AEAD framing tradeoffs and here is our strategy against it:

go/src/crypto/tls/conn.go

Lines 879 to 894 in 52ce25b

// maxPayloadSizeForWrite returns the maximum TLS payload size to use for the
// next application data record. There is the following trade-off:
//
//   - For latency-sensitive applications, such as web browsing, each TLS
//     record should fit in one TCP segment.
//   - For throughput-sensitive applications, such as large file transfers,
//     larger TLS records better amortize framing and encryption overheads.
//
// A simple heuristic that works well in practice is to use small records for
// the first 1MB of data, then use larger records for subsequent data, and
// reset back to smaller records after the connection becomes idle. See "High
// Performance Web Networking", Chapter 4, or:
// https://www.igvita.com/2013/10/24/optimizing-tls-record-size-and-buffering-latency/
//
// In the interests of simplicity and determinism, this code does not attempt
// to reset the record size once the connection is idle, however.

If this latency problem shows up again, we don't need to transition to kTLS right away: because handshakes will still happen in userland Go, we can also handle the first 1MB (or X bytes) in userland to keep a healthy time-to-first-byte.

@FiloSottile
Contributor

The fact that deciding when to enable kTLS is hard is not a reason to delegate the choice to the user, but the opposite: it means we should do the research, measure the performance, weigh the tradeoffs, and make the judgement calls, so our users don't have to. We're in the business of building a TLS stack, our users are in the business of writing Go programs.

More concretely, yes, I think we should figure out which Linux version has "good enough" kTLS, and require that. Users that want kTLS can upgrade their kernel version, and as time passes that will be less and less of a problem. I am not too concerned about weird combinations of old kernels and powerful NICs, if they are common, we'll hear about it and reassess.

@sprappcom

sprappcom commented Jun 23, 2024

ktls are for extreme users, just go extreme in this order (seriously, i doubt there will be more than 100 proj adopting this coz i'm looking at this thread since 2021)

  1. "future proof" linux compat (coz i'm into linux and the userbase for this is the largest and of course most important)
  2. security (i thought it's just api so no need to worry that much about this ktls security coz linux side would have taken care of it. just allow us to upgrade to the latest kernel will do)

get it out already by now, pls at go 1.24. make it an experimental feature at least. thx in advance.
this is really tough to get right i understand.

@rsc @FiloSottile GOEXPERIMENT=ktls

those who want edge cases can fork it and do the package they want on their own os.

@rsc
Contributor

rsc commented Jun 26, 2024

Let's keep the GODEBUG name starting with tls like all the other tls names, so tlskernel=1 not ktls=1.
GODEBUG not GOEXPERIMENT because it is a runtime decision.
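
To show what "a runtime decision" means in practice, here is a minimal user-level sketch that reads a `tlskernel` setting out of the GODEBUG environment variable when the program starts. The standard library reads GODEBUG through its own internal package, so the `godebugSetting` helper below is purely illustrative, and letting the last occurrence win is a choice made for this sketch.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// godebugSetting returns the value of key in a GODEBUG-style string
// ("name=value,name=value"). If key appears more than once, the last
// occurrence wins.
func godebugSetting(godebug, key string) (string, bool) {
	value, found := "", false
	for _, field := range strings.Split(godebug, ",") {
		name, v, ok := strings.Cut(field, "=")
		if ok && name == key {
			value, found = v, true
		}
	}
	return value, found
}

func main() {
	v, ok := godebugSetting(os.Getenv("GODEBUG"), "tlskernel")
	fmt.Printf("tlskernel set=%v value=%q enabled=%v\n", ok, v, ok && v == "1")
}
```

Running the same binary with `GODEBUG=tlskernel=1` and without it flips the result with no rebuild, which is the difference from a GOEXPERIMENT chosen at build time.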

It sounds like we all agree not to add new API other than the GODEBUG. With the proposal being just to add the GODEBUG and work on an implementation, I think this is moving towards acceptance. Do I have that right?

@rsc
Contributor

rsc commented Jun 27, 2024

Have all remaining concerns about this proposal been addressed?

The proposal is to develop transparent kTLS support behind GODEBUG=tlskernel=1.
A future proposal can discuss the conditions under which it should be enabled by default.

@scwgoire

scwgoire commented Jul 9, 2024

Hi guys, I have a use case you may want to hear. Dropping it here even though I'm not 100% sure kTLS would be the solution, my TLS knowledge is very limited, sorry about that.

My use case is seamless application upgrade.

My TCP clients use long-lived (weeks or more) connections and allowing software updates without disconnecting them would be a killer feature.
The application is not very complex, I can already serialize its internal state and pass it with the open sockets to the upgraded instance of the software. However when TLS joins the game, it gets tougher, using kTLS may eliminate this problem (considering there is nothing more to do at application level once the kernel takes over the socket, which I honestly don't know).

I understand this could very well be achieved without kTLS with lots of knowledge and hacking, but I don't want to do it the hacky way and my readings make me think this use case is not supported by the current crypto/tls API and there is no plan to do so. Am I right?

Client TLS session resumption is not an option unfortunately, I have no control on the many possible client software.

Hope this helps 😃

@ShivanshVij

I wonder if there would be an issue passing the open socket to a separate process if kTLS is involved... my initial research says there wouldn't be, but definitely worth testing.

@astrolox

astrolox commented Jul 10, 2024

I hadn't even imagined passing a TLS enabled socket to another process. That would definitely be useful for me also. Currently we use a separate process to handle TLS and then proxy the connection over unix sockets to another process that does the actual work. This causes unwanted resource usage and latency, but allows us to keep the connections open while recycling the worker process. If kTLS could make this whole design a lot simpler, we'd be very happy campers.

@kajtzu

kajtzu commented Jul 10, 2024

Not giving you a complete solution but do take a look at pidfd_open(2), pidfd_getfd(2), pidfd_send_signal(2), @astrolox, if you wish to avoid unix socket with recent (5.6?, 5.10?) linux kernels.

@scwgoire

scwgoire commented Jul 10, 2024

I wonder if there would be an issue passing the open socket to a separate process if kTLS is involved... my initial research says there wouldn't be, but definitely worth testing.

I got a proof of concept working here. I basically adapted the first attempt from @FiloSottile to the up-to-date Go TLS stack. I also added support for enabling TLS_RX on the socket (the example from Filippo only configured TLS_TX).

In the new process, you end up with a net.Conn in which you need to:

  • write TLS record header + plaintext data
  • read plaintext data only
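
The first bullet mentions writing a TLS record header ahead of the plaintext. How the proof of concept frames its writes is not shown in the comment, so purely as a reference for the header layout itself: a TLS record header is five bytes (content type, a two-byte version, a two-byte big-endian length). The helper name below is made up for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	recordTypeApplicationData = 23     // ContentType application_data
	versionTLS12              = 0x0303 // record-layer version field
	maxPlaintext              = 16384  // 2^14, largest plaintext fragment
)

// recordHeader returns the 5-byte TLS record header for a plaintext
// fragment of the given length.
func recordHeader(contentType byte, version uint16, length int) ([5]byte, error) {
	var h [5]byte
	if length < 0 || length > maxPlaintext {
		return h, fmt.Errorf("fragment length %d out of range", length)
	}
	h[0] = contentType
	binary.BigEndian.PutUint16(h[1:3], version)
	binary.BigEndian.PutUint16(h[3:5], uint16(length))
	return h, nil
}

func main() {
	h, err := recordHeader(recordTypeApplicationData, versionTLS12, 5)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("% x\n", h[:]) // prints "17 03 03 00 05"
}
```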

I still need to have this run for a long time with lots of data to make sure there is no catch.

Potential drawbacks with the new process:

  • can't handle incoming non-data records
  • can't issue non-data records (the Conn is a net.Conn, not a tls.Conn)

Also, it needs support for all ciphers; I only implemented it for prefixNonceAEAD.

@rsc
Contributor

rsc commented Jul 25, 2024

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

The proposal is to develop transparent kTLS support behind GODEBUG=tlskernel=1.
A future proposal can discuss the conditions under which it should be enabled by default.

@rsc
Contributor

rsc commented Jul 31, 2024

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

The proposal is to develop transparent kTLS support behind GODEBUG=tlskernel=1.
A future proposal can discuss the conditions under which it should be enabled by default.

@rsc rsc changed the title proposal: crypto/tls: support kernel-provided TLS crypto/tls: support kernel-provided TLS Jul 31, 2024
@rsc rsc modified the milestones: Proposal, Backlog Jul 31, 2024
Projects
Status: Accepted