Skip to content

proposal: encoding/{base32,base64}: constant-time implementations of RFC 4648 codecs for use with cryptography keys #73901

Open
@tob-scott-a

Description

@tob-scott-a

Proposal Details

Background

The current implementations of the base32 and base64 codecs are not ideal for cryptographic use cases. The linked document describes practical timing side-channels against general purpose RFC 4648 codecs when used to encode/decode cryptographic secrets. You can verify the current implementation uses a table look-up in both directions, which introduces the risk of cache-timing attacks.

Specialized implementations of these codecs exist. @Sc00bz wrote a constant-time implementation of base64 in C, which I used as a basis for both base32 and base64 in PHP.

Proposed Change

New functions in encoding/base32 and encoding/base64:

  • EncodeToStringConstTime as an alternative to EncodeToString
  • DecodeStringConstTime as an alternative to DecodeString

These functions could then be used in other packages that handle cryptographic secrets and auditors can be absolutely sure that timing leaks are not present.

(I'm happy to provide a patch for this, if accepted.)

Activity

added this to the Proposal milestone on May 28, 2025
mateusz834

mateusz834 commented on May 28, 2025

@mateusz834
Member

CC @golang/security

added
Proposal-CryptoProposal related to crypto packages or other security issues
on May 28, 2025
tob-scott-a

tob-scott-a commented on May 28, 2025

@tob-scott-a
Author

Just in case there's any confusion: I do not think this qualifies as a security vulnerability (or I would have reported it as one, rather than filing an issue with the intent to follow up with a patch).

added
LibraryProposalIssues describing a requested change to the Go standard library or x/ libraries, but not to a tool
on May 28, 2025
Jorropo

Jorropo commented on May 28, 2025

@Jorropo
Member

👍 however do we want new functions ? It would be simpler for everyone if we made the current functions constant time.
If you could send your patch already it would be nice, I would like to benchmark it.

tob-scott-a

tob-scott-a commented on May 28, 2025

@tob-scott-a
Author

A constant-time alternative will certainly be slower than the general-purpose functions: It replaces a map look-up with many operations.

In the context of handling cryptographic secrets, this trade-off is worth it. However, making all use-cases for a base64 codec constant-time would slow most Go programs down unnecessarily.

It is for this reason that my opening offer was to implement new functions rather than change the current behavior.

Jorropo

Jorropo commented on May 28, 2025

@Jorropo
Member

A single map lookup is far from cheap, I don't want to risk myself to guessing the performance impact, a benchmark would really help 🙂.
Also in go we often do not mind trading a few % of performance if it means making the API easier to use.

mateusz834

mateusz834 commented on May 28, 2025

@mateusz834
Member

Also in go we often do not mind trading a few % of performance if it means making the API easier to use.

We can always do this in reverse, make the current API constant-time, but add unsafe variants (if needed) for speed. Then folks would actually use the constant-time APIs, otherwise we might have low usage of such (new) APIs.

renthraysk

renthraysk commented on May 28, 2025

@renthraysk

Doesn't the fact that stdlib have configurable alphabets for the base64/32 rule out replacing them?

tob-scott-a

tob-scott-a commented on May 29, 2025

@tob-scott-a
Author

@renthraysk It's not actually a total show-stopper (although it does make the patch a bit trickier to write). See #73909 for a first pass at a patch that adheres to the suggestions provided in this discussion so far.

gopherbot

gopherbot commented on May 29, 2025

@gopherbot
Contributor

Change https://go.dev/cl/676917 mentions this issue: encoding/base64: add constant-time behavior, enabled by default

rolandshoemaker

rolandshoemaker commented on May 29, 2025

@rolandshoemaker
Member

I don't think we'd want to turn on constant-time behavior by default for all uses of encoding/{base32,base64}. I suspect the vast majority of their usage is for entirely benign inputs. That said we probably would want to use it by default for encoding/pem, where the contents are much more likely to be sensitive (of course we should probably still add some way to disable this, for people who i.e. need to parse large numbers of certificates etc).

I haven't thought through this too deeply, but based on my understanding of how people typically do constant time base64, could we add a generic decoder/encoder that generates the required bit shift offsets based on the alphabet, rather than needing to implement per-alphabet encoders and decoders? We'd then just need to add a WithConstantTime (or something) method to Encoder.

1 remaining item

tob-scott-a

tob-scott-a commented on May 29, 2025

@tob-scott-a
Author

I'm not sure this is actually correct (threw it together very quickly), but something along the lines of this https://go.dev/play/p/4toQNnegoOh?

The problem with that sort of check is that the RFC 4648 alphabets for base64 are not contiguous, which requires arithmetic like https://go-review.googlesource.com/c/go/+/676917/1/src/encoding/base64/mapping.go to implement without branches.

I don't think we'd want to turn on constant-time behavior by default for all uses of encoding/{base32,base64}. I suspect the vast majority of their usage is for entirely benign inputs.

I'm of the same opinion here.

In this case, these lines can be omitted and the corresponding logic moved into other places (e.g., encoding/pem) as needed.

rolandshoemaker

rolandshoemaker commented on May 29, 2025

@rolandshoemaker
Member

The problem with that sort of check is that the RFC 4648 alphabets for base64 are not contiguous, which requires arithmetic like https://go-review.googlesource.com/c/go/+/676917/1/src/encoding/base64/mapping.go to implement without branches.

Can this arithmetic not be precomputed for the discontiguous ranges in the alphabet such that there don't need to be branches? Or is there something obvious I'm missing.

tob-scott-a

tob-scott-a commented on May 29, 2025

@tob-scott-a
Author

Sorry, I misread the code. It looks like your approach will work:

Compare https://go.dev/play/p/EEz9V8fgsCC with https://go.dev/play/p/VHwr1LZpst9

EDIT: Looks like there's a bit of a bug with the logic for handling ret and valid. This fixes it: https://go.dev/play/p/vQeJ3QP2VWf

Sc00bz

Sc00bz commented on May 30, 2025

@Sc00bz

I wrote this awhile ago. I think it's easier to read. Also you can edit the constants for the different character sets. Although these decode functions have an optimization for a single character range (which is obvious to change). Oh also if you want Crockford's base32 decode or case insensitive hex or base32 decode then there's a few modifications like adding ch |= 'A' ^ 'a' // convert to lowercase (wait unless EBCDIC... see footer) after any non-letter ranges. Then for Crockford's base32 decode check for o, i, and l for values 0, 1, and 1.

Note base64StandardDecodeCharFaster should save 4 operations in x86 vs base64StandardDecodeChar. Also due to instruction latency and throughput it should be the same speed but a little smaller code output.

Also for custom character sets there will be a point when scanning the character set will be faster. But I don't know when that is and I assume it's higher than common character sets (in ASCII). Basically unless the character set is randomised "for security" then doing ranges will leak info about the character set, but cases like that don't really matter because they will always find a way to break their scheme.

func base64StandardDecodeChar(ch int) int {
	const LO1 = int('A') - 1
	const HI1 = int('Z') + 1
	const LO2 = int('a') - 1
	const HI2 = int('z') + 1
	const LO3 = int('0') - 1
	const HI3 = int('9') + 1
	const LO4 = int('+') - 1
	const HI4 = int('+') + 1
	const LO5 = int('/') - 1
	const HI5 = int('/') + 1
	const COUNT1 = 0
	const COUNT2 = COUNT1 + HI1 - LO1 - 1
	const COUNT3 = COUNT2 + HI2 - LO2 - 1
	const COUNT4 = COUNT3 + HI3 - LO3 - 1
	const COUNT5 = COUNT4 + HI4 - LO4 - 1

	value := -1

	//    if (  LO  < ch  && ch < HI        )   { value += COUNT + ch - LO }
	value += (((LO1 - ch) & (ch - HI1)) >> 8) & (COUNT1 + ch - LO1)
	value += (((LO2 - ch) & (ch - HI2)) >> 8) & (COUNT2 + ch - LO2)
	value += (((LO3 - ch) & (ch - HI3)) >> 8) & (COUNT3 + ch - LO3)
	value += (((LO4 - ch) & (ch - HI4)) >> 8) & (COUNT4 + 1) // ch - LO4 == 1
	value += (((LO5 - ch) & (ch - HI5)) >> 8) & (COUNT5 + 1) // ch - LO5 == 1

	return value
}

func base64StandardDecodeCharFaster(ch int) int {
	const LO1 = int('A') - 1
	const HI1 = int('Z') + 1
	const LO2 = int('a') - 1
	const HI2 = int('z') + 1
	const LO3 = int('0') - 1
	const HI3 = int('9') + 1
	const CHAR4 = int('+')
	const CHAR5 = int('/')
	const COUNT1 = 0
	const COUNT2 = COUNT1 + HI1 - LO1 - 1
	const COUNT3 = COUNT2 + HI2 - LO2 - 1
	const COUNT4 = COUNT3 + HI3 - LO3 - 1
	const COUNT5 = COUNT4 + 1

	var borrow uint

	value := -1

	//    if (  LO  < ch  && ch < HI        )   { value += ch - LO + COUNT }
	value += (((LO1 - ch) & (ch - HI1)) >> 8) & (ch - LO1 + COUNT1)
	value += (((LO2 - ch) & (ch - HI2)) >> 8) & (ch - LO2 + COUNT2)
	value += (((LO3 - ch) & (ch - HI3)) >> 8) & (ch - LO3 + COUNT3)

	// if (                   ch == CHAR      ) { value +=  COUNT + 1 }
	_, borrow = bits.Sub(uint(ch ^ CHAR4), 1, 0); value += (COUNT4 + 1) & -int(borrow)
	_, borrow = bits.Sub(uint(ch ^ CHAR5), 1, 0); value += (COUNT5 + 1) & -int(borrow)

	return value
}

func base64StandardEncodeChar(src int) byte {
	const LO1 = int('A')
	const HI1 = int('Z')
	const LO2 = int('a')
	const HI2 = int('z')
	const LO3 = int('0')
	const HI3 = int('9')
	const LO4 = int('+')
	const HI4 = int('+')
	const LO5 = int('/')

	src += LO1

	//  if ( HI  < src      )   { src += LO_NEXT - HI - 1 }
	src += ((HI1 - src) >> 8) & (LO2 - HI1 - 1)
	src += ((HI2 - src) >> 8) & (LO3 - HI2 - 1)
	src += ((HI3 - src) >> 8) & (LO4 - HI3 - 1)
	src += ((HI4 - src) >> 8) & (LO5 - HI4 - 1)

	return byte(src)
}

This all assumes that it's in ASCII. If go can compile to IBM mainframes then strings are likely in EBCDIC. Which will break things.

doggedOwl

doggedOwl commented on May 30, 2025

@doggedOwl

however do we want new functions ? It would be simpler for everyone if we made the current functions constant time.
If you could send your patch already it would be nice, I would like to benchmark it.

no actually it would not be simpler for everyone. base64 encoding is used extensively in various format (json, xml) etc without the need to be cryptographically secure. it would be a big regression if all those cases got slowed down for no good reason.

rolandshoemaker

rolandshoemaker commented on May 30, 2025

@rolandshoemaker
Member

Ignoring the actual implementation details for a moment, I think a reasonable proposal here is to make the following change to encoding/base{32,64}:

// WithConstantTime creates a new encoding identical to enc except that the
// Encode/Decode operations happen in constant time. This is useful when the
// inputs to these operations are sensitive, and measuring the timing of the
// operations could leak information about them.  Note that Encode/Decode
// operations will still leak the length of the inputs.
func (enc Encoding) WithConstantTime() *Encoding

and the following changes to the existing encoding/pem functions:

// ...
// Encode uses constant-time base64 encoding, and will only leak the length of
// b.Bytes. To restore the old variable-time behavior, use the
// GODEBUG=vartimepem=0 environment variable.
func Encode(out io.Writer, b *Block) error

// ...
// EncodeToMemory uses constant-time base64 encoding, and will only leak the
// length of b.Bytes. To restore the old variable-time behavior, use the
// GODEBUG=vartimepem=0 environment variable.
func EncodeToMemory(b *Block) []byte

// ...
// Decode uses constant-time base64 decoding, and will only leak the length of
// p.Bytes. To restore the old variable-time behavior, use the
// GODEBUG=vartimepem=0 environment variable.
func Decode(data []byte) (p *Block, rest []byte)

and add these new functions to encoding/pem:

// VariableTimeEncode provides the same functionality as Encode, but uses
// variable-time base64 encoding. VariableTimeEncode should only be used when
// encoding non-sensitive values, or when the timing of VariableTimeEncode is
// not exposed.
func VariableTimeEncode(out io.Writer, b *Block) error

// VariableTimeEncodeToMemory provides the same functionality as EncodeToMemory,
// but uses variable-time base64 encoding. VariableTimeEncodeToMemory should
// only be used when encoding non-sensitive values, or when the timing of
// VariableTimeEncodeToMemory is not exposed.
func VariableTimeEncodeToMemory(b *Block) []byte

// VariableTimeDecode provides the same functionality as Decode, but uses
// variable-time base64 decoding. VariableTimeDecode should only be used when
// decoding non-sensitive values, or when the timing of VariableTimeDecode is
// not exposed.
func VariableTimeDecode(data []byte) (p *Block, rest []byte)

Another option would be to add a new field to Block, that lets setting variable time behavior for encoding, but we'd still need a VariableTimeDecode function. I think I like adding the new methods more, since it provides better symmetry in the API.

tob-scott-a

tob-scott-a commented on May 30, 2025

@tob-scott-a
Author

@rolandshoemaker That does seem like a better user experience than the API changes in my hasty patch.

neild

neild commented on May 30, 2025

@neild
Contributor

What's the use case for variable-time encoding in encoding/pem?

encoding/base{32,64} can't be constant-time by default (performance regression, most users don't need constant-time encoding). If encoding/pem.{Encode,Decode} become constant-time by default, then we're saying that most programs either need constant-time encoding or don't care about the performance penalty. That seems plausible to me. Are there programs that use PEM, don't need constant-time encoding, and care about the (hopefully small) performance cost of constant-time base64 encoding?

tob-scott-a

tob-scott-a commented on May 30, 2025

@tob-scott-a
Author

Are there programs that use PEM, don't need constant-time encoding, and care about the (hopefully small) performance cost of constant-time base64 encoding?

Consider a use-case of a Key and/or Certificate Transparency system that accepts PEM-encoded public keys (which are not public) or certificates (which in turn only contain public information) in order to validate them before committing them to a Merkle tree.

In such a hypothetical system, slowing down the base64 decoding by default would add a performance bottleneck with no real upside for the system in question.

This is the sort of consideration I had in mind, and why I opened with not enabling it by default.

neild

neild commented on May 30, 2025

@neild
Contributor

That's an example of a system that doesn't care about constant time encoding. Is PEM encoding/decoding going to be in critical performance path in a system like that, though? How much of a cost are we talking about here?

If we're going to double the API surface of the encoding/pem package and require all future users to spend time trying to decide whether they need constant-time encoding or not, than I think we should quantify what the actual performance difference is.

rolandshoemaker

rolandshoemaker commented on May 30, 2025

@rolandshoemaker
Member

I think we need benchmarks to make reasonable decisions here about the trade-offs, but I will note that encoding/pem is currently only three functions. Doubling the API surface is perhaps not ideal, but concretely that comes down to adding three new functions, which isn't particularly horrific.

The point about requiring users to decide is valid, but I think the combination of naming and reasonable documentation makes it less scary. Additionally, since the existing Encode/Decode will become "safe" means it's unlikely people will unintentionally begin using the "unsafe" permutations unless they are explicitly looking for them for performance reasons.

Of course all of this is moot if the benchmarks show that the performance degradation is relatively low, hence why we need numbers before we can make a decision.

moved this to Incoming in Proposalson Jun 19, 2025
moved this from Incoming to Active in Proposalson Jun 19, 2025
aclements

aclements commented on Jun 19, 2025

@aclements
Member

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— aclements for the proposal review group

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    LibraryProposalIssues describing a requested change to the Go standard library or x/ libraries, but not to a toolProposalProposal-CryptoProposal related to crypto packages or other security issues

    Type

    No type

    Projects

    Status

    Active

    Relationships

    None yet

      Development

      Participants

      @neild@Sc00bz@aclements@rolandshoemaker@gopherbot

      Issue actions

        proposal: encoding/{base32,base64}: constant-time implementations of RFC 4648 codecs for use with cryptography keys · Issue #73901 · golang/go