-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Resolve potential concurrency issue #114
Conversation
When used from different threads (or potentially as part of an event loop), it is possible that two different requests to the same host occurring concurrently could stomp on one anothers kerberos context in the context dictionary. This could result in client or server (mutual) authentication failures as the wrong context would be used at different times. To resolve this, we do two things: 1. Within a call to `generate_request_header`, we bind the context we generate to a name local to the function, and always use the context via that name. As a result, any concurrently executing call to `generate_request_header` will be unable to rebind a new context to that name. 2. When populating the context dictionary, we index by `PreparedRequest` object instead of request urls `hostname`. This is possible because when performing mutual authentication on a response, the `PreparedRequest` object which resulted in the response being processed is available as the `request` property on the response. Unfortunately, this breaks our WinRM support because it would require that the user provide the prepared request to the `winrm_wrap` and `winrm_unwrap` functions. Since this is is intended to be a module facilitating Kerberos/GSSAPI authentication for HTTP, we drop explicit WinRM support, and instead attach the established context to the response we return so that callers can implement that application specific logic themselves. Note: In addition to removing `winrm_wrap` and `winrm_unwrap`, this changes the type signature of `generate_request_header`, so that could be considered a breaking change. We may want to prefix some of these intended to be internal-use functions with underscores as a warning to users.
We should also start a discussion about renaming a bunch of these methods and object properties so that they're prefixed with an |
Is anyone watching this willing to review it? |
Don't know the code enough to review it, but I was getting intermittent errors when using threading, and installing your fixed version resolved them, so the fix appears to work. |
Sorry, I had my GH notifications dumping to /dev/null for awhile... I agree that we need a thread-safe alternative here, but can't figure out without actually writing it if smuggling the context out in the request would be sufficient to allow the WinRM encryption to work. Also- isn't storing all the requests in the extension a huge memory leak? |
Agree with the above, at a minimum we shouldn't be straight up removing methods and properties in a PR without a deprecation cycle. The WinRM stuff is used by pywinrm to support message encryption over http and with this change it will no longer work as the methods are now gone. |
i'll revert this. it modified far more than intended. |
Thanks @mkomitee, and sorry this fell through the cracks for so long... It's been awhile since I've spent any time in this code, but IIRC, we don't have anything to definitively tie together a logical chain of requests that should share a Kerberos context. Passing the previous request around might sort of work, but it's arguably a weird pattern, and if someone screws up the bookkeeping, it's going to fall over. Ultimately, I'm not sure there's a sane way to do this with anything that's built-in to requests, since this stateful connection fun is non-RFC-compliant. Maybe we need an external object to be wired up on the request object for a kerberos session context- if we get one, we smuggle the context in there and always refer to it from there, if it's missing, we go with the current host-based behavior that assumes no overlapping requests to the same host. That at least allows the user to control the lifetime of the context, and sidesteps the memory leak that (I think?) the proposed impl has... Thoughts? |
the problem is, the winrm stuff should never have been merged. |
They're separate but related problems IMO- IIUC, to make it work properly with overlapping requests, you'd still need an altered call pattern, yeah? (winrm stuff aside, which I agree we could handle another way if we can smuggle the context out) |
I reverted, can you verify that things are back as they were? This was intended to be a module for facilitating Kerberos/GSSAPI authentication for HTTP, the winrm stuff should not have been merged to begin with, as it's a specific application which happens to use kerberos authentication, but requires the library to do heavier lifting than is appropriate. My change to make the context available on the response was to make it possible for applications to be built around it with the needs of something like winrm, but to not saddle this authentication library with implementing them at it's core. As it stands we have a real concurrency issue which makes it so that this library cannot effectively be used from concurrent (threaded applications). I wrote this patch a while ago so it's possible that not everything here was strictly needed for the fix, but it's possible that the fix for it breaks winrm support. |
Where were we storing all of the requests in the extension? We were storing the context on the request, and we're storing the context on the authentication object --- nothing was being stored in the module near as I can tell. edit: this is incorrect, I was storing all of the prepared requests as the index to the context dictionary. |
I swapped things back in to my brain and see what I was thinking again. The only way to resolve the concurrency issue is to take advantage of the fact that the prepared request is available on the response, so we can use that as an index into the context dictionary. Once we're done with it (having authenticated the server), to deal with the aforementioned memory leak, we can pop it from the dict so it can be garbage collected. The only issue that remains is winrm support. We need to keep the context forever (a memory leak we already have), or attach the context to the response we return to the user, and let them implement winrm themselves in another library. I really think winrm shouldn't be in this library, so I think that's the right way to go. Now that I have your attention (breaking changes do that :)) What do you think would be a reasonable means of deprecation, or do you have an alternate suggestion? |
The trick with payload encryption/decryption is that the cipher state is tied to the state of the current request, so it cannot be done without cooperation from the request layer. At an absolute minimum, the Kerberos context used to generate the current request has to be available to an application that needs to do this, and the application has to know that the request headers have already been generated and it's thus safe to do something that will increment the sequence number (Kerberos' anti-replay sequencing). Without a helper to expose something from the underlying kerb impl, the caller must also make a lot of assumptions about what requests-kerberos is doing under the covers (eg, that it's using pykerberos, what order things are happening in, etc). It's already somewhat fragile, but just exposing the context and letting users figure it out arguably makes it a lot more so. I agree it's fugly and maybe could have been done in other ways, but none were proposed, and that ship has long since sailed. If you want to work with us on a long-term plan to deprecate it, that's a discussion we can start. Regardless, the overlapping requests issue is the bigger one to solve, and it shouldn't require breaking existing functionality. |
Also, I'm curious if you've been testing using multiple requests or just one-shot? At a glance, it kinda looks like the latter... Remember that the way this stuff works is non-RFC compliant, so a series of requests to the same origin from the same session should use the same Kerberos context on the same underlying HTTP(S) connection so it doesn't have to renegotiate on each connection or potentially mix the implicit credentials (at least on Windows targets, never tried it on a Linux impl). That's what I meant about maybe needing a new call pattern regardless- wouldn't you need to pass the previous response on the next request to keep that working? If that's the case, I think an explicit surrogate context object makes more sense than requiring custom bookkeeping on the requests. If I'm smoking crack on that last part (ie it doesn't require a new calling pattern), we could probably hack something up with weakrefs... |
To be clear, I've never tested the winrm stuff, and like you I assume it's built on lots of assumptions. The only reliable way I can see it working due to all of the things you've brought up here & in #133 is that users only ever make one http request with the auth object so that they can know for certain that the context they established sticks around for the lifetime of their winrm session. With the library as it was originally written there was absolutely no intent to ever reuse an established context for a second request. Each request gets it's own context which is independently authenticated whether or not it reuses a previous tcp connection. edit: That's why generate_request_header always generates a new context by calling |
When used from different threads (or potentially as part of an event loop), it is possible that two different requests to the same host occurring concurrently could stomp on one anothers kerberos context in the context dictionary. This could result in client or server (mutual) authentication failures as the wrong context would be used at different times.
To resolve this, we do two things:
generate_request_header
, we bind the context we generate to a name local to the function, and always use the context via that name. As a result, any concurrently executing call togenerate_request_header
will be unable to rebind a new context to that name.PreparedRequest
object instead of request urlshostname
. This is possible because when performing mutual authentication on a response, thePreparedRequest
object which resulted in the response being processed is available as therequest
property on the response.Unfortunately, this breaks our WinRM support because it would require that the user provide the prepared request to the
winrm_wrap
andwinrm_unwrap
functions. Since this is is intended to be a module facilitating Kerberos/GSSAPI authentication for HTTP, we drop explicit WinRM support, and instead attach the established context to the response we return so that callers can implement that application specific logic themselves.Note: In addition to removing
winrm_wrap
andwinrm_unwrap
, this changes the type signature ofgenerate_request_header
, so that could be considered a breaking change. We may want to prefix some of these intended to be internal-use functions with underscores as a warning to users.This may be of particular interest to @nitzmahone because of the changes around winrm support and potentially @sigmavirus24 / @Lukasa since we're attaching the context directly to the response object, which may be frowned upon in the requests ecosystem in general.
This is in reference to #113