-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Describe OAuth's use of application/x-www-form-urlencoded encoding #128
Comments
|
Does OAuth do something different than what's describe in the HTTP Basic Authentication Scheme spec? |
|
Yes. The client_id and client_secret are encoded before being passed to HTTP Basic as the username and password tokens. |
|
I'm going to need someone who knows more about string encodings than me to write this section. I think it would be useful to have a new appendix that describes this encoding, but we can update it to avoid mentioning the out of date info in the 6749 appendix B. |
|
In the mean time, there is a placeholder for this section and the internal references to it have been fixed https://drafts.oauth.net/oauth-v2-1/draft-ietf-oauth-v2-1.html#appendix-B |
|
String encoding does what it does, it escapes certain characters in the input string. Below are some JS examples with the intent to show the differences in both cases, we get the same results now, let's use "secret=" instead of "secret" we get different results. The OAuth specification requires us to use "application/x-www-form-urlencoded" encoding algorithm, and it's very often a source for interoperability issues as @panva mentions. |
|
The correct js code that matches original appendix b behaviour to encode e.g. the client_id before passing it as username to basic auth is: |
I believe the important bit is that |
|
This continues to be a problematic point in the implementations. Both client and server implementations sometimes just rely on Basic Authentication Scheme plugins completely, or partially disregarding the requirement to encode the username/password tokens using application/x-www-form-urlencoded. This makes interop for client implementations particularly difficult. I've encountered clients that
I've encountered servers that
The biggest offender here is Google who uses cc @jogu This issue is even present in the Google Federated Identity listing on https://openid.net/certification where the test suite back then did not properly encode. I hope that if the suite ran against google's implementation today the certification would fail. What a mess.
I'd like this issue to be discussed in future OAuth WG meetings. |
I'd be very much in favour of something along these lines. The experience of the OIDF certification team very much matches Filip's experience - any time values end up with characters in them that require encoding the number of interoperability problems sky rockets, and often in ways that are complex and time consuming to resolve (as these kind of encodings are often happening automatically in libraries and not always in obvious ways). It really doesn't seem like a good use of any developer's time to try and make these things work correctly when there's really no great advantage to using the characters that do require encoding. (This is perhaps also a more general problem, e.g. similar interoperability issues arise if a client actually tries to use a state value that uses the current full set of allowed characters in state as per https://www.rfc-editor.org/rfc/rfc6749#appendix-A.5 ) |
|
OpenID Federation Client IDs are HTTPS URLs. Some OpenID4VC Client IDs are as well. We need to keep the full character set. |
|
So interestingly, it isn't possible to use an HTTPS URL as a Updated reference: https://datatracker.ietf.org/doc/html/rfc7617#section-2 |
|
I'm very glad this was discussed at IETF 119. Unfortunately it doesn't seem like any changes are likely to be acceptable. This leaves client and server implementers in a tough spot where neither one can be compliant and/or strict about encoding/decoding the username/password tokens when interoperability is a goal because as seen - this current definition is already not followed. What guidance should we give to implementers, primarily clients, to be interoperable then? Don't use client_secret_basic, use client_secret_post instead? That's explicitly not meant to be the default and is defined as NOT RECOMMENDED.
|
|
I suppose the other way to go is to open it up more rather than restrict it, for example saying that only the While this wouldn't automatically make everything interoperable, it would at least give people a faster path to interoperability for new implementations. Existing implementations that do some mix of pre-encoding the client ID/secret could be described as essentially defining their own encoding of the client ID/secret that the client developer would have to know by reading the docs, just like they already have to read the docs to know what scopes to use. |
|
@aaronpk unfortunately this would also not be compatible, in this case with existing implementations, which would continue to decode the unencoded resulting in malformed data and failed client authentications. |
|
Right, but as was mentioned in the meeting, changing the spec also won't necessarily make existing implementations interoperable either. So my suggestion is more about making the spec reflect the reality of the variety of implementations instead. |
This does not help IT departments that require "special characters" in generated passwords (please don't ask), which are non-ASCII (i.e. §). If clients encode the client secret as per the spec, the character set is well defined as UTF-8. If they don't encode it, it isn't. I'd personally say the reasoning for the encoding is clear. Making it known to implementers that they have to take care here is the key problem. :-( |
Non-ASCII characters are outside of the ABNF definition for client_id and client_secret in the first place.
I agree with this. Putting my server developer hat on. I can deal with plenty of non-conformities coming from the clients on the server side. Putting my client developer hat on. I cannot deal with the different server implementations when they're not flexible. I would like to default to client_secret_post instead of client_secret_basic because of this issue but the fact that it's, a) a MAY support and b) as per spec, NOT RECOMMENDED, stops me from doing so. If that wasn't the case I'd be happy with the current character set and encoding rules, I could implement them properly in the client software and know that by default POST params are used that work with likely every server implementation. |
|
Sadly I couldn't join the meeting last night because of timezones, but I watched the recording at least. A couple of observations: The Openbanking use cases that were mentioned that use urls as client ids don't use client secrets. I believe most federation use cases don't allow client secrets either. But I do agree with the conclusion that there's (unfortunately) not much we can/should do to restrict the client_id character set in general. I believe the problem only applies to client_secret_basic. I'm not sure how many of the problem authorization servers also support client_secret_post, but it seems google at least does. I agree with Aaron's suggestion of making the reality clearer, and Filip's suggestion of adding examples that cover this case. I think we could also encourage people to use client_secret_post instead of basic. (Although OAuth2.1 already recommends using asymmetric client authentication so already recommends against all forms of client secrets...) |
|
@panva (I didn't see your comment before posting mine)
I don't think I was aware of that - where abouts is that recommendation? |
|
|
Thanks Filip. That's a pain. I guess my suggestion is useless then. |
|
As discussed in the May 14 interim: Resolved to:
|
499a8f8 removed Appendix B but unlinked references to it are still within the document next to almost every
x-www-form-urlencodedmention.Throughout the years this Appendix was THE resource used over and over to explain why and how the OAuth use of the Basic authorization scheme encodes the username and password tokens.
If this Appendix is removed I would propose to add examples of client secret basic with username and password tokens where the client_id and client_secret encoding changes the octets that go into the basic authorization scheme base64 encoding. This is an often overlooked implementation detail that both client and server implementers get wrong and end up inoperable, further driving users to use client secret post which this document marks as NOT RECOMMENDED.
The text was updated successfully, but these errors were encountered: