Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Percent-encode illegal characters in user.Info.Extra keys #65799

Merged
merged 1 commit into from Jul 27, 2018

Conversation

dekkagaijin
Copy link
Contributor

@dekkagaijin dekkagaijin commented Jul 4, 2018

This percent-encodes characters in X-Remote-Extra- and Impersonate-Extra- keys which aren't valid for header names per RFC 7230 (plus "%" to avoid breaking keys which contain them). The API server then blindly unescapes these keys.

Reviewer note:
Old clients sending keys which were %-escaped by the user will have their values unescaped by new API servers. New clients sending keys containing illegal characters (or "%") to old API servers will not have their values unescaped. This version skew incompatibility is a compromise discussed in #63682.

Fixes #63682

PTAL @mikedanese

Release note:

action required: the API server and client-go libraries have been fixed to support additional non-alpha-numeric characters in UserInfo "extra" data keys. Both should be updated in order to properly support extra data containing "/" characters or other characters disallowed in HTTP headers.

@k8s-ci-robot k8s-ci-robot added release-note-action-required Denotes a PR that introduces potentially breaking changes that require user action. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 4, 2018
@idealhack
Copy link
Member

/ok-to-test
/sig auth

@k8s-ci-robot k8s-ci-robot added sig/auth Categorizes an issue or PR as relevant to SIG Auth. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 4, 2018
@dekkagaijin
Copy link
Contributor Author

dekkagaijin commented Jul 4, 2018

FYI @mikedanese @liggitt I'm %-encoding "%" characters in addition to the characters which are illegal under RFC 7230 so that we can also correctly propagate "%"-containing keys. This way, new client -> old server requests are garbled, and new client -> new server requests are not, rather than the other way around:

encoded %s

String Charset Old Client, Old API server Old Client, New API server New Client, Old API server New Client, New API server
legal OK potentially misinterpreted potentially garbled OK
illegal request fails request fails garbled OK

unencoded %s

String Charset Old Client, Old API server Old Client, New API server New Client, Old API server New Client, New API server
legal OK potentially misinterpreted OK potentially misinterpreted
illegal request fails request fails garbled potentially misinterpreted

@mikedanese mikedanese self-assigned this Jul 9, 2018
@@ -160,6 +161,14 @@ func allHeaderValues(h http.Header, headerNames []string) []string {
return ret
}

func unescapeExtraKey(encodedKey string) string {
key, err := url.PathUnescape(encodedKey) // Decode %-encoded bytes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does PathUnescape work here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PathEscape is under-aggressive for our needs, but PathUnescape blindly converts %-encoded bytes back to regular bytes: https://golang.org/src/net/url/url.go?s=5075:5118#L173

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you show some examples where url.PathEscape/url.QueryEscape are lacking?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PathEscape is under-aggressive for our needs

that's unfortunate... what characters does it not escape for us?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it's "@", "=", and ":" are unreserved in path components but forbidden in header keys. I'll add explicit tests for that.
QueryEscape is lacking because we want to maximize compatibility with (i.e. minimize escaping of) existing legal header key strings. We don't want to be over-aggressivley escaping legal strings (which unnecessarily garbles legal keys keys being sent from a new client to an old API server).

I set up a playground to test the interactions between new/old clients/servers with various escaping/unescaping algorithms:
https://play.golang.org/p/eors6oEbRpT

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, can we offload bulk of escaping to Path/QueryEscape and post-process the output to catch any extra characters we care about?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does QueryEscape's aggressive escaping break anything?

The '+' -> ' ' seems problematic

Copy link
Contributor

@awly awly Jul 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The '+' -> ' ' seems problematic

That would be for Unescape, right? QueryEscape is the other way around, which seems fine?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does QueryEscape's aggressive escaping break anything?
I'm very skeptical about having to maintain complex encoding logic for the sake of making headers less garbled/readable.

Yes, anyone currently using non-alphanumeric strings and mismatched client/server versions.

Also, can we offload bulk of escaping to Path/QueryEscape and post-process the output to catch any extra characters we care about?

Only if it's acceptable to permanently garble keys that are percent-encoded by the caller.
Double-encoding seems more complicated/fragile than simply whitelisting all header chars.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The '+' -> ' ' seems problematic

That would be for Unescape, right? QueryEscape is the other way around, which seems fine?

If it's acceptable to additionally break old clients sending '+'-containing keys to new API servers, sure.

func unescapeExtraKey(encodedKey string) string {
key, err := url.PathUnescape(encodedKey) // Decode %-encoded bytes.
if err != nil {
return encodedKey // Always record extra strings, even if malformed/unencoded.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The why of this comment is not obvious to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the discussion in the original issue, it seemed like it was decided to always record these extra values (even when 'malformed'), so I made it a no-op on error and added test cases for strings like "foo%xxbar" which can't be PathUnescaped.

return fmt.Sprintf("%%%x", b)
}

func percentEncodeRune(r rune) string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A rune is always a byte so you can make this just be

return fmt.Sprintf("%%%x", byte(r))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

runes are int32s, but if we are OK with restricting extra key strings to the ascii subset of utf-8, that'd be OK. I deliberately encoded entire runes here, rather than bytes, to avoid extra-garbled wire-format strings.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, for some reason I thought go strings were UTF-8. Let's encode the whole int32.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go strings are utf-8, but utf-8 runes can be 1-4 bytes long ;)

func headerEscape(key string) string {
encoded := ""
for _, r := range key {
encoded += percentEncodeIfIllegal(r)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this require a lot of allocations? Should we use a byte buffer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

potentially...
I deliberately chose string concatenation because bytes.Buffer.WriteFoo() returns an error, and I thought that it would be weird to do error handling when such a case does not arise from string concatenation (which I assume panics when malloc or something fails)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scratch that, the docs are explicit in that Write(Byte|Rune|String) always returns a nil error. I'll use the buffer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use strings.Builder

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

groupHeaders: []string{"X-Remote-Group"},
extraPrefixHeaders: []string{"X-Remote-Extra-"},
requestHeaders: http.Header{
"X-Remote-User": {"Bob"},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a test specifically with the + character, to make sure that round trips correctly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added one in the roundtripper test, but I can do so here, too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

b := key[i]
if shouldEscape(b) {
buf.WriteByte('%')
buf.WriteByte("0123456789abcdef"[b>>4])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs an explanatory comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


func headerKeyEscape(key string) string {
buf := strings.Builder{}
buf.Grow(headerKeyEscapedLen(key))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just write to buf without pre-scanning key to allocate memory

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize that this is a micro-optimization, but it was done in response to feedback earlier about repetitive string concatenations being malloc-heavy.

@@ -422,3 +422,130 @@ func (rt *debuggingRoundTripper) RoundTrip(req *http.Request) (*http.Response, e
func (rt *debuggingRoundTripper) WrappedRoundTripper() http.RoundTripper {
return rt.delegatedRoundTripper
}

func isLegalHeaderKey(key string) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is no longer used

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

@dekkagaijin
Copy link
Contributor Author

/retest

func shouldEscape(b byte) bool {
// url.PathUnescape() returns an error if any '%' is not followed by two
// hexadecimal digits, so we'll intentionally encode it.
return !legalHeaderByte(b) || b == '%'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about removing % from legalHeaderKeyBytes and remove this func?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I originally had.
I feel that explicitly declaring '%' as header-key-legal, referencing the relevant RFC and appropriated code, and then explicitly encoding it anyway is more readable and less surprising.

for i := 0; i < len(key); i++ {
b := key[i]
if shouldEscape(b) {
// %-encode bytes that should be escaped.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant explain how the magic string indexing and shifting below works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if shouldEscape(b) {
// %-encode bytes that should be escaped:
// https://tools.ietf.org/html/rfc3986#section-2.1
buf.WriteByte('%')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer fmt.Fprintf(buf, "%%%x", b) here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to ensure a proper encoding, it'd have to be fmt.Fprintf(buf, "%%%x%x", b>>4, b&15):
https://play.golang.org/p/gVydKNv7qq-

FWIW the current scheme is how things are done in the golang stdlib: https://golang.org/src/net/url/url.go?s=7512:7544#L304

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@awly
Copy link
Contributor

awly commented Jul 12, 2018

LGTM but someone else should comment on how breaking this is.

@dekkagaijin
Copy link
Contributor Author

@liggitt @mikedanese ping

@mikedanese
Copy link
Member

Can you squash? This looks good to me.

/approve

Signed-off-by: Jake Sanders <jsand@google.com>
@dekkagaijin
Copy link
Contributor Author

squashed

@dekkagaijin
Copy link
Contributor Author

/test pull-kubernetes-e2e-kops-aws

@dekkagaijin
Copy link
Contributor Author

@liggitt ping
Would also be good to know of someone that could review the release note

@dekkagaijin
Copy link
Contributor Author

@sttts @deads2k @lavalamp any approvers that could take a look?

@liggitt
Copy link
Member

liggitt commented Jul 27, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 27, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dekkagaijin, liggitt, mikedanese

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 27, 2018
@k8s-github-robot
Copy link

/test all

Tests are more than 96 hours old. Re-running tests.

@liggitt
Copy link
Member

liggitt commented Jul 27, 2018

thanks for all the work on this. I updated the release note, please take a look. would you mind opening a doc PR against the impersonation and requestheader documentation describing the safe set of characters to use, and against the webhook authenticator doc referencing the safe set of characters to use up through v1.11.x?

@liggitt liggitt added the kind/bug Categorizes issue or PR as related to a bug. label Jul 27, 2018
@dekkagaijin
Copy link
Contributor Author

@liggitt thanks, will do

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 66225, 66648, 65799, 66630, 66619). If you want to cherry-pick this change to another branch, please follow the instructions here.

@liggitt
Copy link
Member

liggitt commented Aug 8, 2018

just noticed the doc PR referenced 1.11.2, but this hasn't been picked back to 1.11.x, has it?

@dekkagaijin
Copy link
Contributor Author

@liggitt whoops... had a few changes in flight and only cherry picked one of them. I'll revise and send you the PR

k8s-github-robot pushed a commit that referenced this pull request Aug 16, 2018
…65799-upstream-release-1.10

Automatic merge from submit-queue.

Automated cherry pick of #65799: Escape illegal characters in remote extra keys

Cherry pick of #65799 on release-1.10.

#65799: Escape illegal characters in remote extra keys
k8s-github-robot pushed a commit that referenced this pull request Aug 21, 2018
…65799-upstream-release-1.11

Automatic merge from submit-queue.

Automated cherry pick of #65799: Escape illegal characters in remote extra keys

Cherry pick of #65799 on release-1.11.

#65799: Escape illegal characters in remote extra keys

```release-note
action required: the API server and client-go libraries have been fixed to support additional non-alpha-numeric characters in UserInfo "extra" data keys. Both should be updated in order to properly support extra data containing "/" characters or other characters disallowed in HTTP headers.
```
k8s-ci-robot added a commit that referenced this pull request Sep 10, 2018
…65799-upstream-release-1.9

Automated cherry pick of #65799: Escape illegal characters in remote extra keys
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-action-required Denotes a PR that introduces potentially breaking changes that require user action. sig/auth Categorizes an issue or PR as relevant to SIG Auth. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

header key validation stricter than user.Info.Extra key validation
10 participants