Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/base64: URLEncoding padding is optional #4237

Closed
gjemiller opened this issue Oct 11, 2012 · 20 comments
Closed

encoding/base64: URLEncoding padding is optional #4237

gjemiller opened this issue Oct 11, 2012 · 20 comments

Comments

@gjemiller
Copy link

It appears that under certain circumstances, padding characters are not needed in a
valid base64 encoding.  Specifically, when using the "url encoding" they are
optional.  This presents a problem when trying to decode base64 code from a third party
that omits them.

What steps will reproduce the problem?
http://play.golang.org/p/RWV_Nd34wr

// reproduced here:
package main

import (
    "bytes"
    "encoding/base64"
    "fmt"
    "io"
    "os"
)

func main() {
    decode("c3VyZS4=")    // example from wikipedia
    decode("c3VyZS4") // should be the same without padding
}

func decode(s string) {
    dec := base64.NewDecoder(base64.URLEncoding, bytes.NewBufferString(s))
    n, err := io.Copy(os.Stdout, dec)
    fmt.Printf("\n%d %v\n", n, err)
}


What is the expected output?
sure.
5 <nil>
sure.
5 <nil>

What do you see instead?
sure.
5 <nil>
sur
3 <nil>
@minux
Copy link
Member

minux commented Oct 12, 2012

Comment 1:

it is optional in URLEncoding, if exists, '=' must be escaped to %3D

Labels changed: added priority-later, removed priority-triage.

Status changed to Accepted.

@dsymonds
Copy link
Contributor

Comment 2:

encoding/base64 specifies that it follows RFC 4648, but there's no mention in that RFC
that padding is optional for the URL-safe alphabet. We could accept the lack of padding
silently, but then again it's pretty easy to add in calling code:
    if m := len(enc) % 4; m != 0 {
        enc += strings.Repeat("=", 4-m)
    }
For something so well-defined and easy to work around, I'm not sure whether we want to
get in the business of accepting potentially corrupt input silently.

Status changed to Thinking.

@minux
Copy link
Member

minux commented Oct 14, 2012

Comment 3:

http://tools.ietf.org/html/rfc4648
section 5.
The pad character "=" is typically percent-encoded when used in an
   URI [9], but if the data length is known implicitly, this can be
   avoided by skipping the padding; see section 3.2.

@dsymonds
Copy link
Contributor

Comment 4:

That refers to the encoding, and specifically when used in a URI. Indeed the section 3.2
that your quote refers to says
   In some circumstances, the use of padding ("=") in base-encoded data
   is not required or used.  In the general case, when assumptions about
   the size of transported data cannot be made, padding is required to
   yield correct decoded data.
   Implementations MUST include appropriate pad characters at the end of
   encoded data unless the specification referring to this document
   explicitly states otherwise.
encoding/base64 is a general-purpose package and doesn't know the context of where its
output will be used, so it should always be including the padding. It seems oddly
asymmetric to allow it to be absent when decoding.

@minux
Copy link
Member

minux commented Oct 14, 2012

Comment 5:

I agree we can always encode with padding, but we'd better accept non-padded
data in decode.
also, because we always encode with padding, i think it's preferable the docs 
explicitly state that if used in URI, the caller should url.QueryEscape it.
the problem with reported program is that our Decoder silently truncate the
output without any error (except that n < len(input)); however, if we call
DecodeString or Decode directly, we get CorruptInputError.
This arises from the fact that we're using ReadAtLeast in Read. I'm not sure
this is correct.

@dsymonds
Copy link
Contributor

Comment 6:

Well, that's what I was pondering. Some of our APIs are liberal in
what they accept (e.g. net/http), but some are not (flag, most of
encoding/*, etc.).
Even with the padding, isn't the output safe to use? It should be.
We shouldn't be throwing away errors, for sure. That sounds like a bug to me.

@minux
Copy link
Member

minux commented Oct 14, 2012

Comment 7:

out of curiosity, i grepped the std library for ReadAtLeast, and except some tests,
only encoding/base32 and encoding/base64 use it.
Update: dec.Read() does return an error (i was wrong about this), it's
io.ErrUnexpectedEOF,
should it be a base64.CorruptInputError to match that of dec.DecodeString?

@minux
Copy link
Member

minux commented Oct 14, 2012

Comment 8:

scratch the update in #7, I was confused by myself.
please See the behavior for yourself:
http://play.golang.org/p/BAnLexAKwr

@dsymonds
Copy link
Contributor

dsymonds commented Nov 6, 2012

Comment 9:

Labels changed: added go1.1maybe, removed go1.1.

@robpike
Copy link
Contributor

robpike commented Mar 7, 2013

Comment 10:

Labels changed: removed go1.1maybe.

@rsc
Copy link
Contributor

rsc commented Jul 30, 2013

Comment 11:

Labels changed: added go1.2maybe.

@rsc
Copy link
Contributor

rsc commented Jul 30, 2013

Comment 12:

Labels changed: added feature.

@robpike
Copy link
Contributor

robpike commented Aug 30, 2013

Comment 13:

Not for 1.2.

Labels changed: removed go1.2maybe.

@dsymonds
Copy link
Contributor

Comment 14:

It's not even clear we want to do this at all.

Labels changed: added priority-someday, packagechange, removed priority-later.

@rsc
Copy link
Contributor

rsc commented Dec 4, 2013

Comment 15:

Labels changed: added repo-main.

@gopherbot
Copy link
Contributor

Comment 16 by oleku.konko:

Is this change likely for 1.3 ?

@dsymonds
Copy link
Contributor

Comment 17:

It seems unlikely.

Labels changed: added release-none.

@gopherbot
Copy link
Contributor

Comment 18 by tais.hansen:

Decoding the Google OAuth2 JWT fails with the current go encoding/base64 implementation.
I worked around this by adding the following before decoding:
    if l := len(s) % 4; l > 0 {
        s += string([]byte{'=', '=', '='}[3-l:]) // or strings.Repeat("=", 4-l)
    }
Where "s" is the encoded string.
Info:
Google OAuth2 (OpenID) returns JWT (JSON Web Tokens) as the authentication result which
is "base64url" encoded with padding removed.
https://developers.google.com/accounts/docs/OAuth2Login#exchangecode
From the JWT specification:
Base64url Encoding
      Base64 encoding using the URL- and filename-safe character set
      defined in Section 5 of RFC 4648 [RFC4648], with all trailing '='
      characters omitted (as permitted by Section 3.2) and without the
      inclusion of any line breaks, white space, or other additional
      characters.
http://tools.ietf.org/html/draft-ietf-oauth-json-web-token-20

@kennygrant
Copy link
Contributor

Perhaps this issue can be closed now?

The base64 pkg has changed a bit since it was filed. Specifically with this change in 2014:

2e0a1a7

RawURLEncoding can be used if you wish to decode data which uses no padding, so for the example given above, to avoid errors you can just use the appropriate raw encoding:

https://play.golang.org/p/skWYiMHk3j

@bradfitz
Copy link
Contributor

@kennygrant, thanks! Closing.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

8 participants