Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #3368 Use UTF-8 for basic authentication #3555

Merged
merged 5 commits into from Jul 25, 2020
Merged

Fixes #3368 Use UTF-8 for basic authentication #3555

merged 5 commits into from Jul 25, 2020

Conversation

ashwinbhaskar
Copy link
Collaborator

@ashwinbhaskar ashwinbhaskar commented Jul 1, 2020

No description provided.

core/src/main/scala/org/http4s/Credentials.scala Outdated Show resolved Hide resolved
def apply(token: String): BasicCredentials = {
val bytes = Base64.getDecoder.decode(token)
val userPass = new String(bytes, StandardCharsets.ISO_8859_1)
val userPass: String = decode(bytes, utf8CharsetDecoder)
.fold({ _ => decode(bytes, StandardCharsets.ISO_8859_1) }, identity)
Copy link
Member

@rossabaker rossabaker Jul 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a suggested behavior specified in Appendix B. It would be nice to make this whole thing configurable, but I don't see a clear path to do that yet.

Copy link
Collaborator Author

@ashwinbhaskar ashwinbhaskar Jul 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appendix B suggests that you fallback even when there is a user-pass mismatch

When processing as UTF-8 fails (due to a failure to decode as UTF-8 or a mismatch of user-id/password), a server might try a fallback...

But IMO - if the decoding is successful, we don't need to fallback in case of user-pass mismatch. We only need to fall back in case of an exception when decoding with utf-8. WDYT?

Copy link
Member

@rossabaker rossabaker Jul 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think falling back in a user-pass mismatch would be too aggressive a default. Furthermore, we're just decoding. Auth is the caller's problem.

What concerns me is that if an http4s server was interpreting them as Latin1, and a client doesn't respect the new challenge parameter, we're quietly changing behavior and don't have a good recourse. That's why I'd like to make the challenge and this behavior configurable, even though I like your new defaults. I just don't see a great place to thread that configuration through.

Copy link
Collaborator Author

@ashwinbhaskar ashwinbhaskar Jul 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rossabaker can we try passing an options or charset parameter to apply of BasicAuth? Since this apply is what returns AuthMiddleWare, it might be the point to start threading the configuration from.

Copy link
Collaborator Author

@ashwinbhaskar ashwinbhaskar Jul 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rossabaker I tried passing charset to BasicAuth. But like you said I am not able to wire it up all the way. Will have to spend more time on that. Do you want that implementation in this PR? Or can we do that as a separate issue? This PR has implemented the RFC suggestion w.r.t charset and fall back.

Copy link
Member

@rossabaker rossabaker Jul 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So sorry, I fell behind again and hadn't seen the updates here. I think we can track that as a separate issue. As it stands, this is a 1.0 change, but I think it's a good one.

@ashwinbhaskar ashwinbhaskar changed the title WIP Fixes #3368 Use UTF-8 for basic authentication Fixes #3368 Use UTF-8 for basic authentication Jul 4, 2020
@satorg
Copy link
Contributor

satorg commented Jul 4, 2020

My 2 cents on this... Do we really need a fallback encoding here? I mean, is it even possible that some byte array fails to decode to a string with UTF-8, while succeeds with ISO-8859-1?

@satorg
Copy link
Contributor

satorg commented Jul 5, 2020

Just have checked this:

import java.nio.charset.StandardCharsets._
println(UTF_8 contains ISO_8859_1)

prints true, i.e. everything that can be encoded with ISO_8859_1, can be encoded with UTF_8 as well (but result bytes may differ for some particular characters).

JavaDocs on Charset.contains:

<p> A charset <i>C</i> is said to <i>contain</i> a charset <i>D</i> if,
and only if, every character representable in <i>D</i> is also
representable in <i>C</i>.  If this relationship holds then it is
guaranteed that every string that can be encoded in <i>D</i> can also be
encoded in <i>C</i> without performing any replacements.

<p> That <i>C</i> contains <i>D</i> does not imply that each character
representable in <i>C</i> by a particular byte sequence is represented
in <i>D</i> by the same byte sequence, although sometimes this is the
case.

@satorg
Copy link
Contributor

satorg commented Jul 5, 2020

Ah, ok, I see my mistake: although every char in ISO_8859_1 can be represented in UTF-8, but not every byte, encoded with ISO_8859_1 can be decoded with UTF-8. So the fallback approach may make sense.

@ChristopherDavenport ChristopherDavenport merged commit 452a200 into http4s:master Jul 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants