Configurable encoding / byte-oriented units for length of encoded data #27

stuartpb · 2020-10-22T03:01:47Z

Is your feature request related to a problem? Please describe.

I have an application I want to generate a SHARED_SECRET for (Pomerium). Pomerium specifically needs a 256-bit key for this, and it expects that the value of the secret it reads will be a Base64-encoded string representing 32 bytes.

With secret-generator as it is currently constituted, I can't generate this key: either I set a length of "40" and Pomerium fails to start while complaining that 30 bytes is too short, I set a length of "44" and Pomerium fails to start while complaining that 33 bytes is too long, or I set a length between the two and Pomerium fails to start while complaining that Base64 is malformed.

Describe the solution you'd like

The main shortcoming of the current system underlying this problem is that the length always refers to the length of the generated string, not the length of the byte sequence encoded by that string. Even in situations where the underlying binary data of the secret doesn't matter, it can be helpful to calculate strength in terms of bits of randomness.

To meet these needs, I propose recognizing a B suffix for the (yet-to-be-documented) secret-generator.v1.mittwald.de/length annotation to specify the length of the random secret in bytes, similar to how other resources can be specified in Kubernetes with a distinguishing suffix to change their meaning (such as the m suffix on CPU requests/limits, which specifies the CPU quota in terms of millicores).
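
As a rough sketch of how such a suffix could be parsed in Go: the function name `parseLength` and its return shape are assumptions made for illustration, not the project's actual code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseLength is a hypothetical helper. It returns the numeric length and
// whether that length counts bytes of random data (B suffix) or characters
// of the generated string (no suffix).
func parseLength(value string) (n int, isBytes bool, err error) {
	digits := strings.TrimSuffix(value, "B")
	isBytes = digits != value

	n, err = strconv.Atoi(digits)
	if err != nil {
		return 0, false, fmt.Errorf("invalid length %q: %w", value, err)
	}
	if n <= 0 {
		return 0, false, fmt.Errorf("invalid length %q: must be positive", value)
	}
	return n, isBytes, nil
}

func main() {
	for _, v := range []string{"40", "32B", "32b", "B"} {
		n, isBytes, err := parseLength(v)
		fmt.Println(v, n, isBytes, err)
	}
}
```

An older parser that expects a bare integer would reject `32B` outright rather than silently generating a secret of the wrong size.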

That resolves my current use case: however, there are many other applications for Secrets, and many of them expect their random secret data to be encoded in another form than a Base64 string (which is, itself, encoded into Base64 again in the data field of the Secret). As such, I also propose a secret-generator.v1.mittwald.de/encoding annotation, which can generate secrets for all of these applications:

Systems that take key material directly from a file could use an encoding type of raw (as they are still encoded as Base64 internally by Kubernetes).
Other encodings, many of which are already included in Go's core encoding routines, could be specified to use different representations for the underlying random bits, while also serving as a shorthand for the character set, if looking to just generate a random string (by specifying a non-B-suffixed length):
- base10 (digits)
- hex / base16 (hexadecimal)
- base32 (homograph-avoidant lowercase alphanumeric set)
- base36 (full lower-case alphanumeric set)
- base62 (full alphanumeric set, a la Sprig's randAlphaNum)
- base64url (URL-safe characters)
- ascii85 (full ASCII, a la Sprig's randAscii)
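
To illustrate how several of these names could map onto Go's standard library, here is a minimal sketch. The `encode` function and the mapping from annotation values to encoders are assumptions, not the project's implementation. Note that Go's `base32.StdEncoding` uses the uppercase RFC 4648 alphabet rather than a lowercase one, and base10, base36 and base62 have no direct equivalent in the `encoding` packages, so they are left out here.

```go
package main

import (
	"crypto/rand"
	"encoding/base32"
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// encode renders raw bytes in the named encoding. The names follow the
// proposal; which Go encoder each name selects is an assumption.
func encode(raw []byte, encoding string) (string, error) {
	switch encoding {
	case "raw":
		return string(raw), nil
	case "hex", "base16":
		return hex.EncodeToString(raw), nil
	case "base32":
		return base32.StdEncoding.EncodeToString(raw), nil
	case "base64":
		return base64.StdEncoding.EncodeToString(raw), nil
	case "base64url":
		return base64.URLEncoding.EncodeToString(raw), nil
	default:
		return "", fmt.Errorf("unsupported encoding %q", encoding)
	}
}

func main() {
	// 32 random bytes, as a "32B" length would request.
	raw := make([]byte, 32)
	if _, err := rand.Read(raw); err != nil {
		panic(err)
	}
	for _, enc := range []string{"hex", "base32", "base64", "base64url"} {
		s, err := encode(raw, enc)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-10s %3d chars  %s\n", enc, len(s), s)
	}
}
```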

Describe alternatives you've considered

The most flexible form of encoding would be to let the user specify their own "alphabet" for converting binary values to arbitrary digit systems. However, this much fine-tuned control over a random secret, at this level of the cluster, would most likely be the wrong solution to the wrong problem: even the most sensitive of differences can generally be solved with one of the encodings, or a common variation thereof (e.g. Douglas Crockford's version of Base32).

For lengths, I originally considered a separate unit annotation that could be either bytes or digits. However, not only would that be a lot of annotation bloat for a pretty trivial change, it would also result in a false compatibility. Suppose a document needing a 32-byte base64-encoded string were deployed to a cluster running an older version of the secret-generator. Rather than failing to generate the secret (as it wouldn't recognize the length) and potentially leaving that up to a different component of the system (e.g. one that tries to unseal secrets that couldn't be randomly generated), the older version would ignore the unit annotation, misinterpret the length value, and generate a secret of the wrong size (which may even result in the secret being unparseable).

I also considered an alternative bytes annotation that would be mutually exclusive to length. However, on top of this also being a certain amount of annotation bloat (with a confusing distinction between the two annotations), this has no clear way to extend the annotation values if the value of autogenerate is a comma-separated list, where each key may have a different encoding or length. With the proposed design, both encoding and length could potentially be extended out to comma-separated lists in the future as well, where the first length or encoding refers to the first key of autogenerate, the second describing the second key, etc. (If this extension were implemented, their behavior when the lists are not the same length would need to be defined: such consideration is out of scope for this proposal.)


japsu · 2020-10-26T15:11:15Z

I find myself in need of something covered by this or similar.

Specifically, I need to give some applications a database URL such as postgres://foouser:secureButUrlSafePassword@postgres/foodatabase. This effectively disallows : and / from appearing in the password.

In another case, I need exactly 32 hex digits.

martin-helmich · 2020-10-27T10:34:35Z

Hey all! 👋 Thanks for the suggestion and the detailed proposal.

Just to sum up to see if I got everything correctly:

Add a different notation for the secret-generator.v1.mittwald.de/length annotation (B suffix), to specify the desired length of the binary input data, instead of the encoded output data.
Add a new annotation secret-generator.v1.mittwald.de/encoding that allows specifying the exact encoding mechanism to be used (with base64url also solving @japsu's issue). If omitted, this should probably default to base64 for backwards compatibility.

Does this sum up your proposal correctly, @stuartpb?

A big 👍 from me on the proposed solution. However, I cannot make any guarantees as to when (or even if) we'll get to implementing this. In the meantime, any PRs going (even partly) in this direction are more than welcome.

mittwald-machine · 2020-11-11T01:33:51Z

There has not been any activity to this issue in the last 30 days. It will automatically be closed after 7 more days. Remove the stale label to prevent this.

stuartpb · 2020-11-13T02:28:12Z

I remember diving into the implementation and thinking that it seemed like it'd be pretty straightforward to implement this: I'll see if I can put a PR together some time this or next week.

ghost · 2020-11-19T09:24:06Z

@stuartpb anything to try out yet? This would be really useful.

stuartpb · 2020-11-25T04:59:43Z

@tomhau01 Thanks for checking in - it's a bit of a hectic time for me. I'll see if I can get to this this coming Friday.

Hermsi1337 · 2020-11-25T13:05:14Z

@stuartpb
We're currently on this, so you don't need to invest your Friday.
But we'd appreciate your feedback as soon as the PR is opened (=

stuartpb added the enhancement label Oct 22, 2020

martin-helmich added the help-wanted label Oct 27, 2020

mittwald-machine added the stale label Nov 11, 2020

martin-helmich removed the stale label Nov 11, 2020

Hermsi1337 added the work-in-progress label and removed the help-wanted label Nov 25, 2020

martin-helmich assigned YannikBramkamp Nov 25, 2020

YannikBramkamp mentioned this issue Nov 25, 2020

Add support for secret length in bytes and additional encodings #29

Merged

YannikBramkamp closed this as completed in #29 Nov 27, 2020
