Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concurrency issue in generate_request_header #8

Open
mkomitee opened this issue Mar 20, 2018 · 10 comments
Open

Concurrency issue in generate_request_header #8

mkomitee opened this issue Mar 20, 2018 · 10 comments

Comments

@mkomitee
Copy link
Contributor

Since there's a lot of code in common between this and requests/requests-kerberos, I believe requests-gssapi has some concurrency bugs as well.

requests/requests-kerberos#113 has details, and requests/requests-kerberos#114 has a fix which can be adapted.

The problem stems from how we index gssapi.SecurityContext objects by hostname. If you have multiple threads sharing an HTTPSPNEGOAuth object, and they both send requests to the same host:

  1. thread1 generates a SecurityContext to authenticate a request destined to foo.com and caches it in self.context["foo.com"] and uses it to step().
  2. thread2 generates a SecurityContext to authenticate another request, also destined to foo.com and caches it in self.context["foo.com"], overwriting the SecurityContext placed there by thread1, and uses it to step().
  3. thread1 receives a response, retrieves the SecurityContext (generated by thread2) to authenticate the response from foo.com, attempts to step() with the token it received, and ...

At this point, we're likely to either get a mutual authentication exception because the SecurityContext was used for a different request, or it may happen to work due to implementation details (though probably not). And even if it did work, when thread2 receives its response and attempts to authenticate it, it's likely to get a mutual authentication exception because the SecurityContext was fully established.

And the potential is also there for there to be a concurrency issue within generate_request_header alone, since it always accesses the SecurityContext via the self.context dictionary. Depending on how frequently the interpreter re-schedules threads (and the GIL is released), it's possible for thread1 to generate its context, then have the interpreter schedule thread2 which will replace it with another context. Then whichever one calls step() on it first will succeed, and whichever one calls step() on it second will likely fail since they'll both use the same context since they're accessing it via the self.context dictionary.

The solution for this is to index by the request (or PreparedRequest) object, not the hostname in the url in the request as we currently do.

As the original author of the code in question, my bad.

@frozencemetery
Copy link
Member

Thanks a lot for the heads up @mkomitee!

@frozencemetery
Copy link
Member

Hmm, we're at 11 months with no activity from the requests-kerberos folks. I'm considering merging your fix, but wanted to check first - do you have interest in adapting the PR to our (pretty similar) codebase, or should I just do that?

@mkomitee
Copy link
Contributor Author

I think there may be some confusion about ownership, authority, & responsibility for the project. I'm the original author as I submit the original code which was merged into the requests library itself originally. They did some refactoring and split out most of the authentication methods into a separate project.

I have the ability to merge it, but I don't think I have the ability to cut a release. I'm also reluctant to merge PRs without someone willing to review it.

@frozencemetery
Copy link
Member

I took a deeper look at the change. If requests-kerberos isn't going to take it (and they don't seem likely to at this point), then we have a bit of a problem with regards to the compat shim.

It's pretty difficult for generate_request_header to have different types between the two. I spent some time playing with it and was unable to come up with a solution to the problem of TypeError: generate_request_header() got multiple values for argument 'response'.

So if you have a patch that preserves the compat shim while fixing the issue, I'd be happy to take that.

aiudirog added a commit to pythongssapi/httpx-gssapi that referenced this issue Jan 26, 2021
Solves the concurrency issue brought up in pythongssapi/requests-gssapi#8 by passing the context around through the flow instead of storing it in a global dictionary.

Inlined authenticate_user() because it didn't do anything more than set the auth header and handle_401() because it became unnecessarily complicated.
@michael-o
Copy link
Contributor

michael-o commented Mar 31, 2021

Suffer from this one too with weird exceptions. Using a ThreadPoolExecutor which bangs mod_gssapi and Tomcat SPNEGO Autheticator really fast.

Any chance to have this ported to this module?

@frozencemetery
Copy link
Member

Any chance to have this ported to this module?

Per #8 (comment) I will consider any patch that preserves the requests-kerberos shim while fixing the issue.

@michael-o
Copy link
Contributor

michael-o commented Mar 31, 2021

I have now applied the following locally:

diff --git a/requests_gssapi/gssapi_.py b/requests_gssapi/gssapi_.py
index 6c56e66..b1a6e3a 100644
--- a/requests_gssapi/gssapi_.py
+++ b/requests_gssapi/gssapi_.py
@@ -120,7 +120,7 @@ class HTTPSPNEGOAuth(AuthBase):
         self.mech = mech
         self.sanitize_mutual_error_response = sanitize_mutual_error_response

-    def generate_request_header(self, response, host, is_preemptive=False):
+    def generate_request_header(self, host, request=None, response=None, is_preemptive=False):
         """
         Generates the GSSAPI authentication token

@@ -129,6 +129,9 @@ class HTTPSPNEGOAuth(AuthBase):

         """

+        if request is None:
+            request = response.request
+
         gssflags = [gssapi.RequirementFlag.out_of_sequence_detection]
         if self.delegate:
             gssflags.append(gssapi.RequirementFlag.delegate_to_peer)
@@ -143,15 +146,17 @@ class HTTPSPNEGOAuth(AuthBase):
                     name = "%s@%s" % (name, host)

                 name = gssapi.Name(name, gssapi.NameType.hostbased_service)
-            self.context[host] = gssapi.SecurityContext(
+            ctx = gssapi.SecurityContext(
                 usage="initiate", flags=gssflags, name=name,
                 creds=self.creds, mech=self.mech)

+            self.context[request] = ctx
+
             gss_stage = "stepping context"
             if is_preemptive:
-                gss_response = self.context[host].step()
+                gss_response = ctx.step()
             else:
-                gss_response = self.context[host].step(
+                gss_response = ctx.step(
                     _negotiate_value(response))

             return "Negotiate {0}".format(b64encode(gss_response).decode())
@@ -169,7 +174,7 @@ class HTTPSPNEGOAuth(AuthBase):
         host = urlparse(response.url).hostname

         try:
-            auth_header = self.generate_request_header(response, host)
+            auth_header = self.generate_request_header(host, response=response)
         except SPNEGOExchangeError:
             # GSS Failure, return existing response
             return response
@@ -260,7 +265,7 @@ class HTTPSPNEGOAuth(AuthBase):

         try:
             # If the handshake isn't complete here, nothing we can do
-            self.context[host].step(_negotiate_value(response))
+            self.context[response.request].step(_negotiate_value(response))
         except gssapi.exceptions.GSSError as error:
             log.exception("authenticate_server(): context stepping failed:")
             log.exception(error.gen_message())
@@ -307,7 +312,7 @@ class HTTPSPNEGOAuth(AuthBase):
             host = urlparse(request.url).hostname

             try:
-                auth_header = self.generate_request_header(None, host,
+                auth_header = self.generate_request_header(host, request=request,
                                                            is_preemptive=True)
                 log.debug(
                     "HTTPSPNEGOAuth: Preemptive Authorization header: {0}"

and started sending 200 tuples of POST, HEAD, GET, DELETE with 18 parallel threads. Some requests never arrive in my with context altough connectionpool shows this request. Looking again at HTTPSPNEGOAuth class it is conceptionally not threadsafe because it contains: pos and target_name which cannot be shared in one session. It seems to me that even the request as dict key is still not scaling for me. I could be wrong.

The only realiable way I see is to associate the context with the connection and remove it immediately as soon as the context is complete.

@michael-o
Copy link
Contributor

Yet another update: Reverted the update and used the autheticator on a per request basis. Works as designed. Now I need to patch WebDAV client because it does not access auth object on a per-request basis.

@damelLP
Copy link

damelLP commented Feb 1, 2023

I have a similar usecase (downstream lib is sync and I call it in an async context with an executor) I made it work by prefixing the negotiate check i.e.
here: https://github.com/pythongssapi/requests-gssapi/blob/main/requests_gssapi/gssapi_.py#L199-L200
instead of this:

        if _negotiate_value(response) is not None:
            _r = self.authenticate_user(response, **kwargs)

this:

        if is_http_error and _negotiate_value(response) is not None:
            _r = self.authenticate_user(response, **kwargs)

Other than potentially actually needing to re-authenticate at some point is there a reason we can't do this?

@michael-o
Copy link
Contributor

I have a similar usecase (downstream lib is sync and I call it in an async context with an executor) I made it work by prefixing the negotiate check i.e. here: https://github.com/pythongssapi/requests-gssapi/blob/main/requests_gssapi/gssapi_.py#L199-L200 instead of this:

        if _negotiate_value(response) is not None:
            _r = self.authenticate_user(response, **kwargs)

this:

        if is_http_error and _negotiate_value(response) is not None:
            _r = self.authenticate_user(response, **kwargs)

Other than potentially actually needing to re-authenticate at some point is there a reason we can't do this?

A better solution is to do per-request basis, as mentioned. Benefit: does not require changes in this lib. OR: client_lock = threading.Lock() and lock the sensitive spots. Works with the WebDAV module.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants