Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: UTF-8 HTTP Basic authentication (RFC-7617) #47

Merged
merged 1 commit into from
May 31, 2022

Conversation

H--o-l
Copy link
Contributor

@H--o-l H--o-l commented May 30, 2022

Username/password encoding in HTTP Basic authentication is broken for non-latin1
char.

UTF-8 Basic authentication is allowed since RFC-7617. Previously, RFC-2616
only allowed to use ISO-8859-1 text which is basically latin1.

CouchDB correctly encodes and decodes credentials using UTF-8:
https://github.com/apache/couchdb/blob/0059b8f90e58e10b199a4b768a06a762d12a30d3/dev/pbkdf2.py#L65

aiohttp still uses latin1 as default:
https://github.com/aio-libs/aiohttp/blob/72d3d4b1f68cca5ad15ef50bffb0419b798c7f23/aiohttp/helpers.py#L139

aiocouch is made for CouchDB, and CouchDB follows RFC-7617, so I think
aiocouch should follow RFC-7617 too.

@codecov
Copy link

codecov bot commented May 30, 2022

Codecov Report

Merging #47 (745d629) into master (56167f8) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master      #47   +/-   ##
=======================================
  Coverage   97.13%   97.13%           
=======================================
  Files          13       13           
  Lines         944      944           
=======================================
  Hits          917      917           
  Misses         27       27           
Impacted Files Coverage Δ
aiocouch/remote.py 96.73% <100.00%> (ø)

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

@adrienverge
Copy link
Contributor

I completely agree with this proposal.

For reference, I made an equivalent pull request on couchdb-python: djc/couchdb-python#301

Copy link
Member

@bmario bmario left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense for me, but can you add a test with an UTF-8 user login. Thanks.

@H--o-l H--o-l force-pushed the fix/basic-auth-utf-8 branch 2 times, most recently from c5f2800 to abaf844 Compare May 31, 2022 06:47
@H--o-l
Copy link
Contributor Author

H--o-l commented May 31, 2022

(don't re-review my PR right now I'm working on adding the test)

@H--o-l H--o-l force-pushed the fix/basic-auth-utf-8 branch 3 times, most recently from ef33fa4 to 6b694ad Compare May 31, 2022 09:07
@H--o-l
Copy link
Contributor Author

H--o-l commented May 31, 2022

Hi, @bmario thanks for your review and feedback.

I added the test. I wasn't able to use the CouchDB test admin account directly because iamssen/couchdb-github-action doesn't allow to set a custom admin password (changing the CouchDB test admin password was probably be a bad idea anyway).

Thus I created my own aiocouch_test_user_utf8 following the model of aiocouch_test_user but limited to the scope of my test. This user has a UTF-8 password, passwor§.

The added test fail before this PR and pass after it, I verified it locally on my computer.

Can't wait for your feedback.

Copy link
Member

@bmario bmario left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your work, but I'm a bit confused now. Isn't the § character part of latin-1 and hence the test shouldn't fail even without the patch? As such, I'd like to see an obviously non-latin1 character, like emoji's.

Comment on lines 96 to 104
users = await couchdb.create("_users", exists_ok=True)
doc = await users.create("org.couchdb.user:aiocouch_test_user_utf8")
doc["name"] = "aiocouch_test_user_utf8"
doc["password"] = "passwor§"
doc["roles"] = ["aiocouch_test_role"]
doc["type"] = "user"
await doc.save()
assert doc["_id"] == "org.couchdb.user:aiocouch_test_user_utf8"
assert doc.id == "org.couchdb.user:aiocouch_test_user_utf8"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather see this as another fixture akin to couchdb_user_account in conftest.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have made the change (thanks for the feedback).


async with CouchDB(
hostname,
user="aiocouch_test_user_utf8",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can usernames be in UTF-8 as well? If so, I'd like to see an obviously non-latin1 character in the username.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed they can, I have made the change (thanks for the feedback).

Username/password encoding in HTTP Basic authentication is broken for non-latin1
char.

UTF-8 Basic authentication is allowed since RFC-7617. Previously, RFC-2616
only allowed to use ISO-8859-1 text which is basically `latin1`.

CouchDB correctly encodes and decodes credentials using UTF-8:
https://github.com/apache/couchdb/blob/0059b8f90e58e10b199a4b768a06a762d12a30d3/dev/pbkdf2.py#L65

aiohttp still uses `latin1` as default:
https://github.com/aio-libs/aiohttp/blob/72d3d4b1f68cca5ad15ef50bffb0419b798c7f23/aiohttp/helpers.py#L139

aiocouch is made for CouchDB, and CouchDB follows RFC-7617, so I think
aiocouch should follow RFC-7617 too.
@H--o-l
Copy link
Contributor Author

H--o-l commented May 31, 2022

Thank you for your work, but I'm a bit confused now. Isn't the § character part of latin-1 and hence the test shouldn't fail even without the patch? As such, I'd like to see an obviously non-latin1 character, like emoji's.

@bmario Indeed I should have checked latin-1 table directly and pick a char outside it.
The § fails with the code before this PR because his encoding is different in latin-1 than in UTF-8, see the following demonstration which use the same code as aiohttp:

import base64
print(base64.b64encode('§'.encode('latin1')).decode('latin1'))  # pw==
print(base64.b64encode('§'.encode('utf-8')).decode('utf-8'))  # wqc=

I updated the PR to use the sofa emoji's 🛋️ (and with the other change you requested).

Copy link
Member

@bmario bmario left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thank you.

@bmario bmario merged commit 8bbae9d into metricq:master May 31, 2022
@H--o-l H--o-l deleted the fix/basic-auth-utf-8 branch May 31, 2022 12:48
@H--o-l
Copy link
Contributor Author

H--o-l commented May 31, 2022

Thanks @bmario!
There is a chance this change ends up in a new release not too far from now?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants