Proposal Details
Background
The current implementations of the base32 and base64 codecs are not ideal for cryptographic use cases. The linked document describes practical timing side-channels against general-purpose RFC 4648 codecs when they are used to encode or decode cryptographic secrets. You can verify that the current implementation uses a table look-up in both directions, which introduces the risk of cache-timing attacks.
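For illustration, the variable-time pattern at issue looks roughly like this (a simplified sketch, not the stdlib's exact code):

```go
// Simplified sketch of table-driven decoding. The index into decodeMap
// depends on the input byte, so when that byte is part of a secret, which
// cache line gets touched is secret-dependent too.
var decodeMap [256]byte // 0xFF marks bytes outside the alphabet

func decodeChar(ch byte) (byte, bool) {
	v := decodeMap[ch] // secret-dependent memory access
	return v, v != 0xFF
}
```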
Specialized implementations of these codecs exist. @Sc00bz wrote a constant-time implementation of base64 in C, which I used as a basis for both base32 and base64 in PHP.
Proposed Change
New functions in encoding/base32 and encoding/base64:
- EncodeToStringConstTime as an alternative to EncodeToString
- DecodeStringConstTime as an alternative to DecodeString
These functions could then be used in other packages that handle cryptographic secrets, and auditors could be certain that no timing leaks are present.
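A rough usage sketch, assuming the new names land as methods on Encoding alongside their existing counterparts (exact signatures are open to discussion):

```go
// Hypothetical API (names assumed from this proposal); mirrors the
// existing EncodeToString/DecodeString methods on *Encoding.
secret := []byte("key material that must not leak via timing")

s := base64.StdEncoding.EncodeToStringConstTime(secret)

got, err := base64.StdEncoding.DecodeStringConstTime(s)
if err != nil {
	// handle corrupt input; the error path must not depend on secret bytes
}
_ = got
```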
(I'm happy to provide a patch for this, if accepted.)
mateusz834 commented on May 28, 2025
CC @golang/security
tob-scott-a commented on May 28, 2025
Just in case there's any confusion: I do not think this qualifies as a security vulnerability (or I would have reported it as one, rather than filing an issue with the intent to follow up with a patch).
Jorropo commented on May 28, 2025
👍 However, do we want new functions? It would be simpler for everyone if we made the current functions constant-time.
If you could send your patch now, that would be nice; I would like to benchmark it.
tob-scott-a commented on May 28, 2025
A constant-time alternative will certainly be slower than the general-purpose functions: it replaces a map look-up with many arithmetic operations.
In the context of handling cryptographic secrets, this trade-off is worth it. However, making every use of a base64 codec constant-time would slow most Go programs down unnecessarily.
It is for this reason that my opening offer was to implement new functions rather than change the current behavior.
Jorropo commented on May 28, 2025
A single map lookup is far from cheap. I don't want to risk guessing the performance impact; a benchmark would really help 🙂.
Also, in Go we often don't mind trading a few percent of performance if it means making the API easier to use.
mateusz834 commented on May 28, 2025
We can always do this in reverse: make the current API constant-time, but add unsafe variants (if needed) for speed. Then folks would actually use the constant-time APIs; otherwise we might see low usage of such (new) APIs.
renthraysk commented on May 28, 2025
Doesn't the fact that the stdlib has configurable alphabets for base64/32 rule out replacing them?
tob-scott-a commented on May 29, 2025
@renthraysk It's not actually a total show-stopper (although it does make the patch a bit trickier to write). See #73909 for a first pass at a patch that adheres to the suggestions provided in this discussion so far.
gopherbot commented on May 29, 2025
Change https://go.dev/cl/676917 mentions this issue:
encoding/base64: add constant-time behavior, enabled by default
rolandshoemaker commented on May 29, 2025
I don't think we'd want to turn on constant-time behavior by default for all uses of encoding/{base32,base64}. I suspect the vast majority of their usage is for entirely benign inputs. That said, we probably would want to use it by default for encoding/pem, where the contents are much more likely to be sensitive (of course, we should probably still add some way to disable this, for people who, e.g., need to parse large numbers of certificates, etc.).
I haven't thought through this too deeply, but based on my understanding of how people typically do constant-time base64, could we add a generic decoder/encoder that generates the required bit-shift offsets based on the alphabet, rather than needing to implement per-alphabet encoders and decoders? We'd then just need to add a WithConstantTime (or something) method to Encoder.
tob-scott-a commented on May 29, 2025
The problem with that sort of check is that the RFC 4648 alphabets for base64 are not contiguous, which requires arithmetic like https://go-review.googlesource.com/c/go/+/676917/1/src/encoding/base64/mapping.go to implement without branches.
I'm of the same opinion here.
In this case, these lines can be omitted and the corresponding logic moved into other places (e.g., encoding/pem) as needed.
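For reference, the general shape of the branchless range arithmetic in question, shown as a minimal sketch for the standard alphabet (this is the well-known technique, not the code from the CL):

```go
// Branchless decode of one standard-base64 character. Each range test is
// computed with subtraction and masking: (((lo-1 - ch) & (ch - (hi+1))) >> 8)
// is all ones iff lo <= ch <= hi (arithmetic shift of a small negative int),
// so no branch and no table indexed by secret data is needed.
func base64DecodeChar(ch int) int {
	ret := -1
	ret += (((64 - ch) & (ch - 91)) >> 8) & (ch - 64)  // 'A'..'Z' -> 0..25
	ret += (((96 - ch) & (ch - 123)) >> 8) & (ch - 70) // 'a'..'z' -> 26..51
	ret += (((47 - ch) & (ch - 58)) >> 8) & (ch + 5)   // '0'..'9' -> 52..61
	ret += (((42 - ch) & (ch - 44)) >> 8) & 63         // '+'      -> 62
	ret += (((46 - ch) & (ch - 48)) >> 8) & 64         // '/'      -> 63
	return ret // -1 for bytes outside the alphabet
}
```

A full decoder applies this per byte and accumulates the sign of ret, so invalid input can be rejected without branching on the data.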
rolandshoemaker commented on May 29, 2025
Can this arithmetic not be precomputed for the discontiguous ranges in the alphabet such that there don't need to be branches? Or is there something obvious I'm missing?
tob-scott-a commented on May 29, 2025
Sorry, I misread the code. It looks like your approach will work:
Compare https://go.dev/play/p/EEz9V8fgsCC with https://go.dev/play/p/VHwr1LZpst9
EDIT: Looks like there's a bit of a bug with the logic for handling ret and valid. This fixes it: https://go.dev/play/p/vQeJ3QP2VWf
Sc00bz commented on May 30, 2025
I wrote this a while ago. I think it's easier to read, and you can edit the constants for the different character sets, although these decode functions have an optimization for a single character range (which is obvious to change). If you want Crockford's base32 decode, or case-insensitive hex or base32 decode, there are a few modifications, like adding ch |= 'A' ^ 'a' // convert to lowercase (wait, unless EBCDIC... see footer) after any non-letter ranges. Then, for Crockford's base32 decode, check for o, i, and l, mapping them to the values 0, 1, and 1.
Note: base64StandardDecodeCharFaster should save 4 operations on x86 vs base64StandardDecodeChar. Due to instruction latency and throughput it should be the same speed, just slightly smaller code output.
Also, for custom character sets there will be a point at which scanning the character set will be faster, but I don't know where that is, and I assume it's higher than for the common character sets (in ASCII). Basically, doing ranges leaks info about the character set, which only matters if the character set has been randomized "for security"; cases like that don't really matter, because they will always find a way to break their scheme.
This all assumes ASCII. If Go can compile to IBM mainframes, then strings are likely in EBCDIC, which will break things.
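As a concrete example of the lowercase-folding trick mentioned above, a hedged sketch of a case-insensitive hex digit decoder (ASCII assumed, per the caveat):

```go
// Case-insensitive hex decode of one character, using ch |= 'A' ^ 'a' to
// fold ASCII letters to lowercase after the non-letter (digit) range is
// handled, so each input byte is tested against only two ranges.
func hexDecodeCharCI(ch int) int {
	ret := -1
	ret += (((47 - ch) & (ch - 58)) >> 8) & (ch - 47) // '0'..'9' -> 0..9
	ch |= 'A' ^ 'a' // fold to lowercase ('A' ^ 'a' == 0x20)
	ret += (((96 - ch) & (ch - 103)) >> 8) & (ch - 86) // 'a'..'f' -> 10..15
	return ret // -1 for non-hex bytes
}
```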
doggedOwl commented on May 30, 2025
No, actually, it would not be simpler for everyone. Base64 encoding is used extensively in various formats (JSON, XML, etc.) without any need to be cryptographically secure. It would be a big regression if all those cases got slowed down for no good reason.
rolandshoemaker commented on May 30, 2025
Ignoring the actual implementation details for a moment, I think a reasonable proposal here is to make the following changes (a sketch of their shape follows below):
- a change to encoding/base{32,64},
- changes to the existing encoding/pem functions, and
- new functions added to encoding/pem.
Another option would be to add a new field to Block that lets you set variable-time behavior for encoding, but we'd still need a VariableTimeDecode function. I think I like adding the new methods more, since it provides better symmetry in the API.
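A plausible shape for the changes listed above, inferred from the names used in this thread (WithConstantTime, VariableTimeDecode) and the three existing encoding/pem functions; the signatures below are assumptions, not the actual CL:

```go
// In encoding/base32 and encoding/base64 (assumed signature):

// WithConstantTime returns a copy of the Encoding whose encode and decode
// paths run in time independent of the data being processed.
func (enc *Encoding) WithConstantTime() *Encoding

// In encoding/pem, the existing Encode, EncodeToMemory, and Decode become
// constant-time by default, and variable-time variants are added for
// callers that only handle public data (names and signatures assumed):

func EncodeVariableTime(out io.Writer, b *Block) error
func EncodeToMemoryVariableTime(b *Block) []byte
func DecodeVariableTime(data []byte) (p *Block, rest []byte)
```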
tob-scott-a commented on May 30, 2025
@rolandshoemaker That does seem like a better user experience than the API changes in my hasty patch.
neild commented on May 30, 2025
What's the use case for variable-time encoding in encoding/pem?
encoding/base{32,64} can't be constant-time by default (performance regression, most users don't need constant-time encoding). If encoding/pem.{Encode,Decode} become constant-time by default, then we're saying that most programs either need constant-time encoding or don't care about the performance penalty. That seems plausible to me. Are there programs that use PEM, don't need constant-time encoding, and care about the (hopefully small) performance cost of constant-time base64 encoding?
tob-scott-a commented on May 30, 2025
Consider the use-case of a Key and/or Certificate Transparency system that accepts PEM-encoded public keys (which are not secret) or certificates (which in turn only contain public information) in order to validate them before committing them to a Merkle tree.
In such a hypothetical system, slowing down the base64 decoding by default would add a performance bottleneck with no real upside for the system in question.
This is the sort of consideration I had in mind, and why I opened with not enabling it by default.
neild commented on May 30, 2025
That's an example of a system that doesn't care about constant-time encoding. Is PEM encoding/decoding going to be in the critical performance path in a system like that, though? How much of a cost are we talking about here?
If we're going to double the API surface of the encoding/pem package and require all future users to spend time deciding whether they need constant-time encoding or not, then I think we should quantify what the actual performance difference is.
rolandshoemaker commented on May 30, 2025
I think we need benchmarks to make reasonable decisions here about the trade-offs, but I will note that encoding/pem is currently only three functions. Doubling the API surface is perhaps not ideal, but concretely that comes down to adding three new functions, which isn't particularly horrific.
The point about requiring users to decide is valid, but I think the combination of naming and reasonable documentation makes it less scary. Additionally, since the existing Encode/Decode will become "safe", it's unlikely people will unintentionally begin using the "unsafe" permutations unless they are explicitly looking for them for performance reasons.
Of course, all of this is moot if the benchmarks show that the performance degradation is relatively low, which is why we need numbers before we can make a decision.
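A minimal sketch of the kind of benchmark being asked for; it measures the stock decoder, and a constant-time variant would be measured the same way once there is an implementation to compare against:

```go
package base64bench

import (
	"encoding/base64"
	"testing"
)

// 1 KiB of input, roughly the size of a PEM-encoded certificate body.
var encoded = base64.StdEncoding.EncodeToString(make([]byte, 1024))

func BenchmarkDecodeString(b *testing.B) {
	b.SetBytes(int64(len(encoded)))
	for i := 0; i < b.N; i++ {
		if _, err := base64.StdEncoding.DecodeString(encoded); err != nil {
			b.Fatal(err)
		}
	}
}

// BenchmarkDecodeStringConstTime would mirror the above once a
// constant-time implementation exists to measure.
```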
aclements commented on Jun 19, 2025
This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— aclements for the proposal review group