Digest auth middleware #305

yeraydiazdiaz · 2019-09-02T14:57:52Z

This PR implements Digest authentication as a middleware.

Digest is mentioned as pending work in the README.

Tested against httpbin.org's digest auth endpoint; the implementation mimics that of Requests.

Note that this PR does not handle the auth-int Quality of Protection (qop) option from the RFC, mainly because Requests does not handle it either. I'm happy to add it in a separate PR if it is deemed necessary, though.

sethmlarson

Whole bunch of little comments, looks pretty good so far! :)

httpx/middleware/auth.py

httpx/utils.py

florimondmanca

Really great piece of work here — I didn't realize Digest Auth was this complex to put together.

Obviously then, we need to make sure the implementation is nailed down and as easy to understand as possible. I've left a couple of comments with a focus on readability, as I reviewed this following along the Digest access authentication page on Wikipedia. 😄

As a general comment on ._build_auth_header(), perhaps we can clearly separate the various parts of the function with some comments that give a hint on the general algorithm:

Extract all useful variables.
Compute the client nonce.
Compute the digest response. (This is where we compute HA1, HA2, and the final call to digest().)
Assemble parts of the authorization header. (Construction of format_args.)
Build the final authorization header.

I haven't looked at the test code yet.

httpx/middleware/auth.py

yeraydiazdiaz · 2019-09-03T07:52:58Z

Thanks for the reviews @sethmlarson and @florimondmanca, brace for incoming commits 😄

yeraydiazdiaz · 2019-09-03T10:55:11Z

This is now ready for re-review after implementing most of the comments.

@florimondmanca I couldn't find a clean way of applying your suggestion of laying the code out as a hint for the general algorithm, since values produced while calculating the response are also used in the final digest value. I felt it was already cleaner after the minor refactorings, but if you have suggestions let me know.

florimondmanca

The body of the header building logic looks good to me, a couple more comments. I’ll take a look at tests soon. :)

httpx/middleware/auth.py

yeraydiazdiaz · 2019-09-04T10:26:21Z

@encode/httpx-maintainers this is ready for re-review. Thanks for all the helpful comments and links. 🌟

sethmlarson

One tiny comment from me then I'm happy with this implementation. 🎉

tests/client/test_auth.py

docs/quickstart.md

tomchristie · 2019-09-04T12:17:21Z

This is fabulous stuff!

I've got one suggestion here, to consider...

Currently auth= accepts either a tuple, or a request->request callable.
I think we should change that so that auth= accepts either a tuple, or a request->request callable, or an instance of BaseMiddleware.

Then we can:

Drop the DigestAuth model.
Just name the basic auth middleware BasicAuth.
Just name the digest auth middleware DigestAuth.
Import both BasicAuth and DigestAuth from the top level of the package.

That way we're not introducing an extra model, and we have a nice consistency where auth=httpx.BasicAuth(username=..., password=...) is also a valid expression.
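
As a rough sketch of the proposal, the three accepted forms of auth= could be normalised into a single middleware object as below. Everything here is a simplified stand-in and an assumption for illustration: requests are plain dicts, the middleware is synchronous, and `resolve_auth` and `CustomAuth` are hypothetical names rather than the library's API.

```python
import typing
from base64 import b64encode


class BaseMiddleware:
    """Stand-in for the library's middleware base class."""

    def __call__(self, request: dict) -> dict:
        raise NotImplementedError


class BasicAuth(BaseMiddleware):
    def __init__(self, username: str, password: str) -> None:
        self.username = username
        self.password = password

    def __call__(self, request: dict) -> dict:
        token = b64encode(f"{self.username}:{self.password}".encode()).decode()
        request["headers"]["Authorization"] = f"Basic {token}"
        return request


class CustomAuth(BaseMiddleware):
    """Hypothetical wrapper around a plain request -> request callable."""

    def __init__(self, func: typing.Callable[[dict], dict]) -> None:
        self.func = func

    def __call__(self, request: dict) -> dict:
        return self.func(request)


def resolve_auth(auth: typing.Any) -> BaseMiddleware:
    # Middleware instances are callable too, so check for them first.
    if isinstance(auth, BaseMiddleware):
        return auth
    if isinstance(auth, tuple):
        username, password = auth
        return BasicAuth(username, password)
    if callable(auth):
        return CustomAuth(auth)
    raise TypeError(f"Unsupported auth value: {auth!r}")


# A tuple and an explicit BasicAuth instance end up equivalent.
from_tuple = resolve_auth(("user", "pass"))({"headers": {}})
from_instance = resolve_auth(BasicAuth("user", "pass"))({"headers": {}})
```

With this shape, a DigestAuth middleware would need no separate model: it would simply be another `BaseMiddleware` subclass accepted by the first branch.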

Make sense?

httpx/client.py

tomchristie · 2019-09-04T12:25:46Z

Plus, if a user is implementing a custom auth style that needs to be introduced as middleware, then they can still install it using auth=..., rather than installing it as middleware.

That'll also give us better ability to control any "only run auth if we're still on the initial request origin". (Because if a user installed auth as middleware instead, we don't exactly know that it's an authenticating thing.)

yeraydiazdiaz · 2019-09-04T13:21:28Z

Sounds good to me! The only thing that comes to mind is the check for BaseMiddleware may be a bit broad, but given that we're only allowing users to pass them as auth I think it should be fine. 👍

yeraydiazdiaz · 2019-09-04T16:45:18Z

@tomchristie this is ready for review with your suggested changes

florimondmanca

Just a few final comments from me as well.

About accepting any BaseMiddleware as auth — should we introduce a BaseAuthMiddleware? This base middleware wouldn’t have anything special but at least we’d have some semantic enforcement, and we wouldn’t be able to pass eg a RedirectMiddleware in there.

httpx/middleware/auth.py

httpx/client.py

florimondmanca · 2019-09-05T08:00:00Z

One question: have we validated that this works when making a real request to a server exposing digest auth, eg httpbin? :)

httpx/models.py

httpx/middleware/auth.py

yeraydiazdiaz · 2019-09-05T08:16:02Z

Glad you asked! I have been testing this against httpbin.org which has been quite useful. So much in fact that I thought we should consider having an "integration" test suite using its Docker image.

Not sure how we feel about that, but just now I noticed that my test script fails for some of the algorithms 😓

florimondmanca · 2019-09-05T08:33:47Z

@yeraydiazdiaz Did you know about pytest-httpbin? It spins up a local httpbin instance (Flask server) for requesting during tests. I provide an example in #298 since the Requests test suite uses it and I didn’t want to change the test setup too much.

We’re already live-testing most of the time via the uvicorn server, but I agree providing integration tests for cases when we use a mock dispatcher (here and in HTTP/2) would be nice. Not sure if httpbin supports HTTP/2 though.

Anyway, I think if we agree that we’ll end up using pytest-httpbin via #298, you could set up an integration test with it here?

yeraydiazdiaz · 2019-09-05T08:39:56Z

Oh cool! I don't know how I missed that. I'll definitely add it to the PR.

Any thoughts on whether we should keep the mock dispatch tests?

florimondmanca · 2019-09-05T09:16:59Z

Any thoughts on whether we should keep the mock dispatch tests?

I think we should keep those that can’t be addressed via httpbin (eg special error cases or situations).

yeraydiazdiaz · 2019-09-05T09:46:31Z

So turns out it wasn't a bug in the implementation but rather a fairly subtle usability bug.

Since the DigestAuth holds state you cannot pass an instance to the client and reuse it for subsequent requests; you must pass a new instance to each request for it to work correctly. I've pushed a check to disallow passing DigestAuth instances on client initialization.

Side note about tests with pytest-httpbin, I think I'll leave it out of this PR for clarity. If this PR is merged before #298 I'm happy to add the tests there.

sethmlarson · 2019-09-05T11:48:28Z

I'd think we want people passing in auth as a part of client init. This will be a problem when people implement custom middleware as well.

florimondmanca · 2019-09-05T11:48:45Z

Since the DigestAuth holds state you cannot pass an instance to the client and reuse it for subsequent requests; you must pass a new instance to each request for it to work correctly.

Ah, good point. Since DigestAuth() only accepts static arguments (username and password), I think we should be able to rework the implementation to hide this detail from the user.

For example, the whole content of DigestAuth.call() could be moved into a separate (stateful) helper class which DigestAuth would instantiate and defer request handling to — if that makes sense?

(One learning point of this is that middleware shouldn’t hold per-request state themselves.)
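
A minimal sketch of that idea, under simplifying assumptions: requests and responses are plain dicts, `send` is a hypothetical coroutine standing in for the rest of the middleware stack, and the retry attaches a placeholder header instead of a real digest. The helper name `_RequestState` is invented for the example.

```python
import asyncio


class _RequestState:
    """Per-request state: a fresh instance is created for every request."""

    def __init__(self, username: str, password: str) -> None:
        self.username = username
        self.password = password
        self.num_401_responses = 0

    async def handle(self, request: dict, send) -> dict:
        response = await send(request)
        if response["status"] == 401 and self.num_401_responses == 0:
            self.num_401_responses += 1
            # Placeholder header; a real one would carry the digest response.
            retry = dict(request, authorization=f'Digest username="{self.username}"')
            response = await send(retry)
        return response


class DigestAuth:
    """Holds only static arguments, so one instance can be shared freely."""

    def __init__(self, username: str, password: str) -> None:
        self.username = username
        self.password = password

    async def __call__(self, request: dict, send) -> dict:
        state = _RequestState(self.username, self.password)
        return await state.handle(request, send)


async def fake_send(request: dict) -> dict:
    # Toy server: challenges unauthenticated requests, accepts the rest.
    return {"status": 200 if "authorization" in request else 401}


async def main() -> list:
    auth = DigestAuth("user", "secret")  # a single, reused instance
    return await asyncio.gather(*(auth({"url": "/"}, fake_send) for _ in range(3)))


statuses = [response["status"] for response in asyncio.run(main())]
```

Because the 401 counter lives on the per-request helper, three concurrent requests through the same `DigestAuth` instance each complete their own challenge/response round trip.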

Side note about tests with pytest-httpbin, I think I'll leave it out of this PR for clarity. If this PR is merged before #298 I'm happy to add the tests there.

Sounds good to me. :)

httpx/middleware/auth.py

sethmlarson · 2019-09-06T12:21:13Z

httpx/middleware/auth.py

+        self.per_nonce_count = per_nonce_count
+        self.num_401_responses = 0
+
+    async def __call__(


I'm still worried about this middleware not being able to handle multiple requests concurrently. Unless we mitigate somehow we're going to have to warn users implementing middleware that hold onto state about how to properly implement them and the pitfalls.

I'm still worried about this middleware not being able to handle multiple requests concurrently

Not sure what you mean? It is capable of handling concurrent requests, it's just that there's global state to keep track of. Or am I missing something?

yeraydiazdiaz · 2019-09-06T13:13:22Z

httpx/middleware/auth.py

+
+
+class DigestAuth(BaseMiddleware):
+    per_nonce_count: typing.Dict[bytes, int] = defaultdict(lambda: 0)


A note on this global variable. Requests has a much more pragmatic approach to nonce counting: it simply keeps track of the last nonce and increases the nonce count if it is the same. However, I chose to adhere a bit more closely to the RFC, at the potential risk of this structure growing uncontrollably.

I'm not sure what the best solution for this would be. Maybe keep the last N nonce values? Reset on successful authentication?
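
For comparison, the "last nonce" strategy attributed to Requests above can be sketched as follows. The class name `LastNonceCounter` is hypothetical, and the code only illustrates the idea; it is not a copy of Requests' implementation.

```python
import typing


class LastNonceCounter:
    """Remembers only the most recent nonce and how often it was used."""

    def __init__(self) -> None:
        self.last_nonce: typing.Optional[bytes] = None
        self.count = 0

    def next_nc(self, nonce: bytes) -> str:
        if nonce == self.last_nonce:
            self.count += 1
        else:
            # A different nonce restarts the count, even if seen before.
            self.last_nonce = nonce
            self.count = 1
        return "%08x" % self.count  # nc is sent as 8 hex digits


counter = LastNonceCounter()
nc_values = [counter.next_nc(nonce) for nonce in (b"a", b"a", b"b", b"a")]
```

The trade-off shows in the last call: returning to nonce `b"a"` restarts its count at 1, whereas a per-nonce mapping would continue from 3, at the cost of storing every nonce seen.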

Just make it a global. It's actually a benefit for it to be shared across clients, across threads, across tasks.

Make it essentially be an LRU dict, storing N items. (1000, 10000, or whatever). Whenever it's more than N items, delete the first entry. (Easy now that python dicts are ordered)

We don't need the _RequestDigestAuth class. Nonce state stuff can just be a dumb global, and everything can just be implemented on the middleware class.

Just make it a global

It is a mutable class variable so it should qualify as global, do you mean moving it to the module?

We don't need the _RequestDigestAuth class. Nonce state stuff can just be a dumb global, and everything can just be implemented on the middleware class.

I had this intuition as well, but num_401_responses can't really be global, since it would prevent concurrent non-authenticated requests from going through the flow correctly. For instance, if I fire 3 concurrent requests, only the first one would go through the flow; the rest would assume the legitimate 401 responses are due to the credentials being invalid and return the 401 as the final response.

Following the RFC closely, the ideal solution would be to prevent the additional requests from being sent until the first authentication succeeds, and then reuse the auth header (which the RFC suggests we should).

However this logic can become even more complex as the server might decide to reject one of the requests and ask for reauthentication so we'd have to block concurrent requests again.

It seemed like quite a complex solution. There are some benefits, like potentially saving a round trip for authentication, but the RFC says the server may or may not take advantage of it. In the end I decided to take the pragmatic approach Requests seems to take.

Of course, if we decide otherwise I'm happy to implement the full-fledged solution.

Make it essentially be an LRU dict, storing N items. (1000, 10000, or whatever). Whenever it's more than N items, delete the first entry. (Easy now that python dicts are ordered)

Yep, that's what I was thinking as well 👍
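
A bounded nonce counter along the lines suggested above might look like the sketch below. The class name `BoundedNonceCounts` and the default size are assumptions for illustration (the LRUDict utility added in this PR is not shown in the thread), and the sketch is not thread-safe.

```python
import typing


class BoundedNonceCounts:
    """Per-nonce counters that keep at most `max_size` nonces."""

    def __init__(self, max_size: int = 1000) -> None:
        self.max_size = max_size
        # Plain dicts preserve insertion order (Python 3.7+).
        self._counts: typing.Dict[bytes, int] = {}

    def increment(self, nonce: bytes) -> int:
        # Pop and re-insert so the most recently used nonce moves to the end.
        count = self._counts.pop(nonce, 0) + 1
        self._counts[nonce] = count
        if len(self._counts) > self.max_size:
            # The first key is the least recently used one: drop it.
            oldest = next(iter(self._counts))
            del self._counts[oldest]
        return count

    def __contains__(self, nonce: bytes) -> bool:
        return nonce in self._counts

    def __len__(self) -> int:
        return len(self._counts)


counts = BoundedNonceCounts(max_size=2)
first = [counts.increment(nonce) for nonce in (b"a", b"b", b"a")]
counts.increment(b"c")  # exceeds max_size, so b"b" is evicted
```

Re-inserting on every use is what makes this least-recently-used rather than first-in-first-out: `b"a"` survives the eviction because it was touched after `b"b"`.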

yeraydiazdiaz · 2019-09-07T17:15:03Z

@sethmlarson @florimondmanca any opinions on #305 (comment)?

Personally I'm torn between having a more complete implementation and introducing much more complexity in this middleware.

For reference, Requests' implementation takes a much more pragmatic approach that has been valuable for many years (the fact that auth-int has not been implemented in that time is probably a testament to how few servers actually require it).

sethmlarson · 2019-09-07T18:39:00Z

@yeraydiazdiaz Hmm requests uses thread locals to hold onto state... I'm a lot more comfortable with how they're implementing it than with putting so much state into globals / class variables. Could we do something with contextvars potentially?

florimondmanca · 2019-09-07T20:34:26Z

@yeraydiazdiaz In retrospect, I agree with @tomchristie that the request middleware is not strictly necessary. I think we should be able to convert all utility methods into regular functions, and end up with the typical __init__()/__call__() pair of methods, with the logic in __call__ delegating as much as possible to the helper functions.

I also think @sethmlarson's idea about context variables is interesting. I have the intuition we can use them to ditch num_401_responses, and only track whether the server has already rejected our credentials. It's hard to explain in words without implementing the whole thing, so I might draft a PR to illustrate?

Lastly, to help reduce the scope of this PR I'm going to submit one for refactoring the middleware into separate modules. This way we'll be able to solely focus on middleware/digest_auth.py, if that sounds alright?

yeraydiazdiaz · 2019-09-08T08:27:50Z

Thanks all.

Hmm requests uses thread locals to hold onto state... I'm a lot more comfortable with how they're implementing it than with putting so much state into globals / class variables. Could we do something with contextvars potentially?

This is something I suggested in a TODO comment on my first attempt at introducing Digest auth; @tomchristie was not too keen on them and suggested using class variables. Personally I agree with him, but I'm not particularly familiar with contextvars and their benefits. Maybe someone can break them down for me?

I have the intuition we can use them to ditch num_401_responses, and only track whether the server has already rejected our credentials. It's hard to explain in words without implementing the whole thing, so I might draft a PR to illustrate?

Sure, actually I'd be happy to close this PR in favor of yours if that's preferable. Frankly this one has gotten unwieldy and there's just too much information for me to figure out what the correct approach would be. Maybe it's best if someone with a fresh pair of eyes approaches the problem.

I would suggest scoping out how closely we want to stick to the RFC. There are quite a few SHOULDs and MAYs in there that Requests does not implement, on top of Requests not having to support concurrency.

Digest auth: fix merge conflicts with master

florimondmanca · 2019-09-08T18:26:07Z

Personally I agree with him but I'm not particularly familiar with contextvars and their benefits, maybe someone can break them down for me?

Not a contextvars expert either, but the way I see them, context variables are attached to a particular async context, i.e. a stack of coroutines that await one another. (Put differently, coroutines started concurrently via loop.create_task() or a trio nursery won't share the same async context.)

One typical use case I can think of is setting a context-local REQUEST_ID of some kind, e.g. for application monitoring. Gist
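
The behaviour described above can be observed with a small asyncio sketch (Python 3.7+). The variable name REQUEST_ID follows the comment; the coroutine names are made up for the example.

```python
import asyncio
import contextvars

REQUEST_ID = contextvars.ContextVar("REQUEST_ID", default=None)


async def inner() -> str:
    # Awaited from handle(), so it runs in the same context and sees its value.
    await asyncio.sleep(0)  # yield so the two tasks interleave
    return REQUEST_ID.get()


async def handle(request_id: str) -> str:
    REQUEST_ID.set(request_id)
    return await inner()


async def main():
    # Each task gets its own copy of the current context when it is created.
    tasks = [asyncio.create_task(handle(rid)) for rid in ("req-1", "req-2")]
    results = await asyncio.gather(*tasks)
    # Values set inside the tasks do not leak back to the parent context.
    return results, REQUEST_ID.get()


results, outer_value = asyncio.run(main())
```

Each task sees only the value it set itself, and the context that spawned the tasks still sees the default.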

That said, when I wrote…

…and only track whether the server has already rejected our credentials

we don't have to use context variables to track that kind of information. A dirty trick would be to add _credentials_rejected=False to the middleware signature, and set the flag to True when we make the recursive call. In fact, seeing the difficulties faced in #326 I also think using contextvars when we don't have to is probably not the best way to go. :-)
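
A sketch of that flag-based approach, with the same caveats as before: requests and responses are plain dicts, `send` is a hypothetical coroutine for the rest of the stack, and the retry header is a placeholder rather than a real digest.

```python
import asyncio


class DigestAuth:
    def __init__(self, username: str, password: str) -> None:
        self.username = username
        self.password = password

    async def __call__(self, request: dict, send, _credentials_rejected: bool = False) -> dict:
        response = await send(request)
        if response["status"] != 401 or _credentials_rejected:
            # Either success, or a second 401 after we already sent credentials.
            return response
        # Placeholder header; a real one would carry the digest response.
        retry = dict(request, authorization=f'Digest username="{self.username}"')
        # The flag travels with the recursive call, so nothing is stored on self.
        return await self(retry, send, _credentials_rejected=True)


calls = []


async def always_reject(request: dict) -> dict:
    # Toy server that never accepts the credentials.
    calls.append(request)
    return {"status": 401}


final = asyncio.run(DigestAuth("user", "wrong")({"url": "/"}, always_reject))
```

The state that num_401_responses tracked now lives in the call arguments, so the middleware instance stays stateless and the exchange stops after exactly one authenticated retry.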

tomchristie · 2019-09-09T14:56:37Z

Lastly, to help reduce the scope of this PR I'm going to submit one for refactoring the middleware into separate modules. This way we'll be able to solely focus on middleware/digest_auth.py, if that sounds alright?

Fantastic, yes!

florimondmanca · 2019-09-09T15:09:38Z

@yeraydiazdiaz

Sure, actually I'd be happy to close this PR in favor of yours if that's preferable.

I think we're nearly there, though! If this PR has too much old/now irrelevant content (which I'd definitely agree with), we can perfectly well close it and re-open one to start from a clean slate. I can also open a PR to your fork with a refactoring proposal, if you'd like? :-)

tomchristie · 2019-09-09T15:11:35Z

I'd suggest that the best tack on this would be to start with a PR for Digest Auth that simply doesn't include any nonce checking at all.

We can then issue another PR on top of that that adds the nonce checking and nothing else. I think I have a clear handle on that, but it will make it far easier to discuss and tackle if we're treating it in isolation. 👍

httpx/utils.py

yeraydiazdiaz · 2019-09-09T16:49:33Z

Thanks all, I'll close this one and pick off bits on a new one as Tom suggests 🌟 🌟 🌟

yeraydiazdiaz mentioned this pull request Sep 2, 2019

[WIP] Digest auth #136

Closed

sethmlarson requested review from florimondmanca and sethmlarson September 2, 2019 15:15

sethmlarson suggested changes Sep 2, 2019

florimondmanca requested changes Sep 2, 2019

florimondmanca reviewed Sep 3, 2019

httpx/middleware/auth.py (outdated)

httpx/middleware/auth.py (outdated)

httpx/middleware/auth.py (outdated)

tomchristie reviewed Sep 3, 2019

httpx/middleware/auth.py (outdated)

florimondmanca self-requested a review September 4, 2019 11:41

sethmlarson suggested changes Sep 4, 2019

tests/client/test_auth.py

tomchristie reviewed Sep 4, 2019

docs/quickstart.md

tomchristie reviewed Sep 4, 2019

httpx/client.py (outdated)

yeraydiazdiaz requested a review from tomchristie September 4, 2019 13:57

florimondmanca requested changes Sep 4, 2019

httpx/middleware/auth.py (outdated)

httpx/client.py (outdated)

florimondmanca reviewed Sep 5, 2019

httpx/models.py (outdated)

httpx/middleware/auth.py (outdated)

yeraydiazdiaz force-pushed the digest-auth-middleware branch from 466ce58 to 79b2f75 Compare September 5, 2019 09:45

Yeray Diaz Diaz added 4 commits September 6, 2019 11:51

Remove unnecessary type check

99b13de

Avoid circular import on type checking

9bb823b

Add helper class holding per-request Digest auth state

e436bab

Hold nonce count globally

b2b7078

yeraydiazdiaz force-pushed the digest-auth-middleware branch from 8db1063 to b2b7078 Compare September 6, 2019 10:51

Do not quote algorithm, qop, and nc

ed89de7

sethmlarson suggested changes Sep 6, 2019

Yeray Diaz Diaz added 2 commits September 6, 2019 13:57

Fix variable name

36d8a33

Handle ', ' separated qop values

6f486f9

yeraydiazdiaz commented Sep 6, 2019

Add LRUDict util and use it for the global nonce count

33c072f

florimondmanca mentioned this pull request Sep 7, 2019

Refactor middleware #325

Merged

Fix merge conflicts with master

5ee7ade

florimondmanca mentioned this pull request Sep 7, 2019

Refactor redirect middleware using contextvars #326

Closed

Merge pull request #1 from encode/digest-auth-middleware

8a4ec2e

Digest auth: fix merge conflicts with master

StephenBrown2 reviewed Sep 9, 2019

httpx/utils.py

StephenBrown2 reviewed Sep 9, 2019

httpx/utils.py

yeraydiazdiaz closed this Sep 9, 2019

yeraydiazdiaz mentioned this pull request Sep 10, 2019

DigestAuth as middleware - No nonce count #332

Merged

florimondmanca mentioned this pull request Sep 21, 2019

Some API oddities/possible oversights #365

Closed

florimondmanca mentioned this pull request Feb 16, 2021

Keep digest authentication state, and reuse for following requests. #1467

Closed

2 tasks
