encoding/base64: URLEncoding padding is optional #4237
encoding/base64 specifies that it follows RFC 4648, but there's no mention in that RFC that padding is optional for the URL-safe alphabet. We could accept the lack of padding silently, but then again it's pretty easy to add in calling code:

    if m := len(enc) % 4; m != 0 {
        enc += strings.Repeat("=", 4-m)
    }

For something so well-defined and easy to work around, I'm not sure whether we want to get in the business of accepting potentially corrupt input silently.

Status changed to Thinking.
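For readers who want to try the workaround above, here is a minimal, self-contained sketch. The helper name padBase64 and the sample input "YQ" (the unpadded encoding of "a") are illustrative assumptions, not part of the original report.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// padBase64 is a hypothetical helper: it appends "=" characters until the
// length of enc is a multiple of 4, as in the snippet from the comment above.
func padBase64(enc string) string {
	if m := len(enc) % 4; m != 0 {
		enc += strings.Repeat("=", 4-m)
	}
	return enc
}

func main() {
	// "YQ" is the URL-safe base64 encoding of "a" with its padding removed.
	dec, err := base64.URLEncoding.DecodeString(padBase64("YQ"))
	fmt.Printf("%q %v\n", dec, err)
}
```

Note that a length of 1 modulo 4 can never be valid base64; the helper pads such input anyway, and the decoder is left to reject it.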
http://tools.ietf.org/html/rfc4648, section 5:

"The pad character "=" is typically percent-encoded when used in an URI [9], but if the data length is known implicitly, this can be avoided by skipping the padding; see section 3.2."
That refers to the encoding, and specifically when used in a URI. Indeed, the section 3.2 that your quote refers to says:

"In some circumstances, the use of padding ("=") in base-encoded data is not required or used. In the general case, when assumptions about the size of transported data cannot be made, padding is required to yield correct decoded data. Implementations MUST include appropriate pad characters at the end of encoded data unless the specification referring to this document explicitly states otherwise."

encoding/base64 is a general-purpose package and doesn't know the context in which its output will be used, so it should always include the padding. It seems oddly asymmetric to allow it to be absent when decoding.
I agree we can always encode with padding, but we'd better accept non-padded data when decoding. Also, because we always encode with padding, I think it's preferable that the docs explicitly state that, if the output is used in a URI, the caller should url.QueryEscape it.

The problem with the reported program is that our Decoder silently truncates the output without any error (except that n < len(input)); however, if we call DecodeString or Decode directly, we get CorruptInputError. This arises from the fact that we're using ReadAtLeast in Read. I'm not sure this is correct.
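The url.QueryEscape suggestion above could look like the following sketch. The helper name encodeForQuery is an assumption made for illustration; the calls themselves are the standard encoding/base64 and net/url functions.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net/url"
)

// encodeForQuery is a hypothetical helper: it encodes data with the padded
// URL-safe alphabet, then percent-encodes the result so that any trailing
// "=" is safe inside a query string.
func encodeForQuery(data []byte) string {
	return url.QueryEscape(base64.URLEncoding.EncodeToString(data))
}

func main() {
	// "a" encodes to "YQ==", and each "=" becomes "%3D".
	fmt.Println(encodeForQuery([]byte("a"))) // prints YQ%3D%3D
}
```

The receiving side would undo the percent-encoding first (for example with url.QueryUnescape) and then decode with base64.URLEncoding.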
Well, that's what I was pondering. Some of our APIs are liberal in what they accept (e.g. net/http), but some are not (flag, most of encoding/*, etc.). Even with the padding, isn't the output safe to use? It should be. We shouldn't be throwing away errors, for sure. That sounds like a bug to me.
Out of curiosity, I grepped the standard library for ReadAtLeast, and apart from some tests, only encoding/base32 and encoding/base64 use it.

Update: dec.Read() does return an error (I was wrong about this). It's io.ErrUnexpectedEOF; should it be a base64.CorruptInputError to match that of dec.DecodeString?
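To compare the two code paths discussed above, a small sketch like this one prints the error from each. The helper names decodeDirect and decodeStream and the input "YQ" are assumptions for illustration; io.ReadAll requires Go 1.16 or later, and the exact error reported by the streaming decoder may differ between Go versions, so no output is shown.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"io"
	"strings"
)

// decodeDirect decodes s in one call using the padded URL-safe encoding.
func decodeDirect(s string) ([]byte, error) {
	return base64.URLEncoding.DecodeString(s)
}

// decodeStream decodes s through the streaming decoder returned by
// base64.NewDecoder, reading until EOF or the first error.
func decodeStream(s string) ([]byte, error) {
	return io.ReadAll(base64.NewDecoder(base64.URLEncoding, strings.NewReader(s)))
}

func main() {
	const unpadded = "YQ" // "a" encoded with the padding stripped

	_, err := decodeDirect(unpadded)
	fmt.Printf("DecodeString: %T: %v\n", err, err)

	_, err = decodeStream(unpadded)
	fmt.Printf("NewDecoder:   %T: %v\n", err, err)
}
```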
Scratch the update in #7; I confused myself. Please see the behavior for yourself: http://play.golang.org/p/BAnLexAKwr
Decoding the Google OAuth2 JWT fails with the current Go encoding/base64 implementation. I worked around this by adding the following before decoding:

    if l := len(s) % 4; l > 0 {
        s += string([]byte{'=', '=', '='}[3-l:]) // or strings.Repeat("=", 4-l)
    }

where "s" is the encoded string.

Info: Google OAuth2 (OpenID) returns JWTs (JSON Web Tokens) as the authentication result, which are "base64url" encoded with the padding removed.
https://developers.google.com/accounts/docs/OAuth2Login#exchangecode

From the JWT specification:

"Base64url Encoding: Base64 encoding using the URL- and filename-safe character set defined in Section 5 of RFC 4648 [RFC4648], with all trailing '=' characters omitted (as permitted by Section 3.2) and without the inclusion of any line breaks, white space, or other additional characters."
http://tools.ietf.org/html/draft-ietf-oauth-json-web-token-20
Perhaps this issue can be closed now? The base64 package has changed a bit since it was filed. Specifically, since a change in 2014, RawURLEncoding can be used if you wish to decode data which uses no padding. So for the example given above, to avoid errors you can just use the appropriate raw encoding.
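A minimal sketch of that approach, assuming a JWT-style segment as input. The helper name decodeUnpadded and the sample string "eyJhIjoxfQ" (the unpadded base64url form of {"a":1}) are illustrative assumptions; base64.RawURLEncoding is the standard library encoding named in the comment.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeUnpadded is a hypothetical helper: it decodes URL-safe base64 that
// was produced without trailing "=" padding.
func decodeUnpadded(s string) ([]byte, error) {
	return base64.RawURLEncoding.DecodeString(s)
}

func main() {
	// "eyJhIjoxfQ" is {"a":1} in base64url with the padding omitted.
	dec, err := decodeUnpadded("eyJhIjoxfQ")
	fmt.Printf("%s %v\n", dec, err)
}
```

The raw encodings are strict in the other direction: input that still carries "=" padding is rejected, so the caller needs to know which form to expect.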
@kennygrant, thanks! Closing.