Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support dynamically updating authentication headers #6915

Open
1 task done
faph opened this issue Sep 15, 2022 · 24 comments
Open
1 task done

Support dynamically updating authentication headers #6915

faph opened this issue Sep 15, 2022 · 24 comments

Comments

@faph
Copy link
Contributor

faph commented Sep 15, 2022

Is your feature request related to a problem?

See #6908 for discussion with @Dreamsorcerer

We are have a continuously running service that makes HTTP requests using aiohttp's client. These requests are sent with a bearer token in the authorization header. This token needs to be refreshed from time to time.

Therefore we cannot set the header on the client object. We need to evaluate the header prior to making individual requests. This is cumbersome for an app developer.

Describe the solution you'd like

We would like to configure the client with the required data which can be used at each request to populate an authorization header. In its simplest form, this could be a simple co-routine that evaluates a token cache, generates a new token if required, and populate the header as required.

For the avoidance of doubt, the proposal is for wherever aiohttp supports an auth argument (client session, request), it would support the "enhanced" auth handler.

Currently, aiohttp supports a BasicAuth object which is a customized namedtuple (fields: username, password) with an encode method which returns the base64 encoded header value. Runtime checks are performed ensuring the auth value is an instance of that class.

TODO: expand on preferred optionSee options in comment below.

Describe alternatives you've considered

Both the requests library and the httpx library have similar functionality. For reference:

The httpx approach supports a more defined base class/interface for an auth handler, whereas requests just requires a callable. Fundamentally they do the same thing and inject an authorization header in the request headers.

Related component

Client

Additional context

I am probably able to contribute a PR with an implementation, depending on the agreed scope of the changes.

Code of Conduct

  • I agree to follow the aio-libs Code of Conduct
@faph
Copy link
Contributor Author

faph commented Sep 16, 2022

Implementation options

  1. Do nothing. App developers can pass in headers manually as and when required. Or alternatively they could subclass ClientSession or ClientRequest (ClientSession is marked as "final" and should not be subclassed).

  2. Add support for a generic coroutine function to be passed in for the auth argument. This would be in addition to the currently supported BasicAuth class. The coroutine function would accept a single argument, the ClientRequest instance and the coroutine would be free to modify the request (with the intention to just modify headers...). ClientRequest.send would call/await this coroutine.

  3. Similar to option 2, but a single AuthHandler base class would be implemented with an empty async __call__ method. A __call__ method could be added to existing BasicAuth class which simply delegates to the existing encode synchronous method. [I guess the base class is really optional here since it’s-just-something-that-returns-an-awaitable]

  4. Tweak to options 2 and 3 above: the coroutine function would not accept any arguments and should just return a value for an Authorization header (or nothing). In that case, we could also accept a coroutine object instead.

  5. Add support for more generic "pre-send" and "post-receive" hooks (both on client instance, and individual request methods). These hooks would be free to modify anything on the request and the response respectively.

@faph
Copy link
Contributor Author

faph commented Sep 20, 2022

My personal preference would be towards option 3. It’s simple, clean and would feel very familiar for people coming from “requests” lib.

@faph faph changed the title [DRAFT] Support dynamically updating authentication headers Support dynamically updating authentication headers Sep 20, 2022
@faph
Copy link
Contributor Author

faph commented Sep 20, 2022

@Dreamsorcerer and maintainers, I welcome feedback on this.

@faph
Copy link
Contributor Author

faph commented Sep 21, 2022

I created draft PR #6954 just to test out how the existing BasicAuth functionality could be generalized.

@faph
Copy link
Contributor Author

faph commented Sep 23, 2022

Just picking up some feedback from @webknjaz here: #6954 (comment) (thanks!)

Personally, I don't feel adding an auth handler base class adds that much if the contract really is that it must be a co-routine function and just that. I'm not sure if that is related to the comment regarding the protocol...

Libraries like requests and httpx do both, they allow plain functions but also provide a base class to inherit from if one wishes. In case of httpx that does makes sense since it supports both async and async logic. But if it's a class with just a single method, you may as well not bother. Unless you want to give a hint to app developers that they can provide a "rich object" with its own state instead of a stateless co-routine.

@faph
Copy link
Contributor Author

faph commented Sep 23, 2022

For illustration, the expected usage is something like this:

async def my_auth(request: ClientRequest) -> None:
    token = await token_from_cache()
    if not token:
        token = await generate_token()
    request.headers["Authorization"] = f"Bearer {token}"

async with ClientSession(auth=my_auth) as session:
    url = "..."
    async with session.get(url) as resp:
        print(resp.status)

# OR

async with ClientSession() as session:
    url = "..."
    async with session.get(url, auth=my_auth) as resp:
        print(resp.status)

As a class-based solution, it could look like this:

class MyAuth:
    def __init__(self, identity_provider):
        self._identity_provider = identity_provider
        self._token_cache = ...

    async def __call__(self, request: ClientRequest) -> None:
        token = await self._token_cache.get()
        if not token:
            token = await generate_token(self._identity_provider)
            await self._token_cache.set(token)
        request.headers["Authorization"] = f"Bearer {token}"

my_identity_provider = ...

async with ClientSession(auth=MyAuth(my_identity_provider) as session:
    url = "..."
    async with session.get(url) as resp:
        print(resp.status)

@Dreamsorcerer
Copy link
Member

Dreamsorcerer commented Sep 23, 2022

Personally, I don't feel adding an auth handler base class adds that much if the contract really is that it must be a co-routine function and just that.

My feeling is that we should probably just accept a coroutine and keep it generic. You can still pass a method from a more complex object, if you want to store state or something.

I also think that calling this auth would be a mistake, even if that's what other libraries are doing. The API doesn't seem to do anything specific to auth, I'd say this is a generic middleware, not dissimilar to the middlewares we support on the server side.

So, for the existing BasicAuth, you could end up with an approach like:

auth = BasicAuth("user")
async with ClientSession(middleware=auth.middleware):
    ...

With that new method being defined something like: https://github.com/aio-libs/aiohttp/pull/6954/files#diff-b797dd8733928df191ba2061121ab8b69976c185fcbfad4534891d3252b9ac30R165


async def my_auth(request: ClientRequest) -> None:
    token = await token_from_cache()
    if not token:
        token = await generate_token()
    request.headers["Authorization"] = f"Bearer {token}"

async with ClientSession(auth=my_auth) as session:
    ...

In this example though, I'm wondering why we wouldn't have access to the ClientSession in our middleware. Would it be useful to access the session as well? Would it be an improvement if this code only updated session.headers each time the token expires?


Though I still think this would be more efficient without something being called on every request. How does token_from_cache() know that the token has expired? If it's just time based, it seems to me that it would make more sense to just setup a callback, something like this (which should work without any changes):

async def set_token(session, event):
    while True:
        async with session.post("get_token") as resp:
            token = await resp.json()
            session.headers["Authorization"] = token["auth"]
            event.set()
            await asyncio.sleep(token["valid_duration"])

async with ClientSession() as session:
    ready = asyncio.Event()
    t = asyncio.create_task(set_token(session, event))
    await ready.wait()
    ...   # Your application code here...
    t.cancel()
    await t

Maybe to make that safer, there could be a Lock passed to ClientSession that is acquired during the auth call and blocks any requests until the new token is set. However, in most cases you can probably get away with just using valid_duration - 10 or similar, to ensure the request happens before the token gets expired.

@Dreamsorcerer
Copy link
Member

@webknjaz Any thoughts on this?

@faph
Copy link
Contributor Author

faph commented Sep 24, 2022

@Dreamsorcerer thanks! Let me separate this out into a few distinct decision points below

@faph
Copy link
Contributor Author

faph commented Sep 26, 2022

Auth-specific vs generic middleware/hook

This is really options 2-4 vs option 5 in #6915 (comment). I don't have a strong preference for either approach.

Some advantages for doing something auth-specific:

  1. There is an existing auth argument.
  2. It's recognizable for app devs coming from other HTTP client libs.

Generic advantages:

  1. More flexible.
  2. Avoids a generic solution, mislabeled as auth-specific.

The only thing with the generic solution is that there is some proxy-related auth logic in the client, so we need to make sure that this interacts as expected.

@faph
Copy link
Contributor Author

faph commented Sep 26, 2022

Client session property vs request property

I agree that in many case, one would set an auth handler on the session. However, the auth handler is in reality a property of an individual request, not the session. Different requests within a single session may require different authentication (or no authentication).

The existing logic is correct in this sense whereby the session can provide a default auth value for use by any request. Similarly, the implicit session API (aiohttp.request()) should equally allow auth to be set.

@faph
Copy link
Contributor Author

faph commented Sep 26, 2022

Background token refresh vs on-send function call

In a typical token expiring case, one could rotate the auth header using a background task. But I think it would involve setting ClientSession._default_auth or ClientSession._default_headers. I guess making this publicly settable could be considered.

However, this would only work for default auth handlers on a per-session basis. Not for individual requests.

I don't understand the concern about an on-the-fly function call to evaluate/populate the request's headers. Each ClientRequest object instantiation already goes through a series of functional calls. The only real change here that there's a pluggable hook which accepts a co-routine function. Obviously, any token cache fetch logic should be as fast as it can be...

@faph
Copy link
Contributor Author

faph commented Sep 26, 2022

Hook signature/API

If we're going the auth-specific route, we may wish to adopt an interface whereby the given co-routine function when called and awaited, returns the actual header value. For example:

async def my_auth() -> str:
    token = await ...
    return f"Bearer {token}"

It would definitely be more restrictive. I'm not sure how many authentication flows there are that require request context or that require other request properties to be modified...

In the more generic case, I suppose one could adopt the same, or a very similar, interface as used for the server. I guess the next issue would be whether it would support multiple (chained) middlewares. It could get complex quite quickly...

@Dreamsorcerer
Copy link
Member

All valid points, maybe we need some real world use cases to help decide what is and isn't needed.

  • I want to refresh an auth header before the token expires, so a request shouldn't fail or be delayed.
  • When I receive a 401 response (because token has been revoked etc.), I want to refresh the auth header and retry the original request.

I'm not sure I can think of a reason why you'd want to handle multiple auth headers in a single auth handler, but feel free to come up with scenarios and add them to the list. Then we can check that any given solution will work for all of these.

@faph
Copy link
Contributor Author

faph commented Sep 27, 2022

To be clear, I was thinking about multiple auth handlers used for different connections within a single session. For example:

my_auth = ...

async with ClientSession() as session:
    url1 = "..."
    async with session.post(url1, auth=my_auth) as resp:
        print(resp.status)
    
    url2 = "..."
    async with session.get(url2, auth=None) as resp:
        print(resp.status)

@faph
Copy link
Contributor Author

faph commented Sep 27, 2022

Use case: session-wide OAuth Bearer token, periodically expiring

In this use case, we use a typical OAuth JWT bearer token which can be retrieved from an Identity Provider. The token contains an expiry time. The token should be refreshed for requests made after this time. Early token refresh is OK.

Any request made in the session should have the Authorization header set (or Proxy-Authorization depending on proxy settings).

@faph
Copy link
Contributor Author

faph commented Sep 27, 2022

One thought here, as an MVP, to support the above use case, could we add a setter to the property ‘ClientSession.headers’? Or are there strong reasons why this should remain a read-only property?

@faph
Copy link
Contributor Author

faph commented Sep 28, 2022

I have created a PR for this, just to get started: PR #6983. If that's acceptable I will update docs and changelog to get this merged first before we decide on how to extend the auth arg API.

Let me know if you're able to review this @Dreamsorcerer

@faph
Copy link
Contributor Author

faph commented Sep 30, 2022

As discussed on PR #6983 this change is redundant.

The other question still outstanding is whether we want to enhance the auth arg to support a simple coroutine function without having to wire up a background task. Let me formulate a use case to support that decision.

@faph
Copy link
Contributor Author

faph commented Oct 3, 2022

Referencing other discussion thread requiring a custom authentication header based on the request body itself: Discussion #6627

One thought here is that if the authentication header value is just a function of the request body, one can easily compute the header value when generating the body on a request by request basis and pass both header and body as keyword arguments to the request method.

Dreamsorcerer added a commit that referenced this issue Oct 16, 2022
<!-- Thank you for your contribution! -->

## What do these changes do?

Adds a dedicated section on client authentication using the `auth`
argument for HTTP Basic Auth. Also explains how to configure other
authentication flows including periodic renewal of credentials.

## Are there changes in behavior for the user?

No. Just makes it easier for developers to access authentication
`aiohttp` features. Avoids users asking questions like this:
#6908

## Related issue number

Issue #6915 

## Checklist

- [x] I think the code is well written
- [x] Unit tests for the changes exist
- [x] Documentation reflects the changes
- [x] If you provide code modification, please add yourself to
`CONTRIBUTORS.txt`
  * The format is &lt;Name&gt; &lt;Surname&gt;.
  * Please keep alphabetical order, the file is sorted by names.
- [x] Add a new news fragment into the `CHANGES` folder
  * name it `<issue_id>.<type>` for example (588.bugfix)
* if you don't have an `issue_id` change it to the pr id after creating
the pr
  * ensure type is one of the following:
    * `.feature`: Signifying a new feature.
    * `.bugfix`: Signifying a bug fix.
    * `.doc`: Signifying a documentation improvement.
    * `.removal`: Signifying a deprecation or removal of public API.
* `.misc`: A ticket has been closed, but it is not of interest to users.
* Make sure to use full sentences with correct case and punctuation, for
example: "Fix issue with non-ascii contents in doctest text files."

Co-authored-by: Sam Bull <aa6bs0@sambull.org>
@aryadovoy
Copy link

Could you please show full example of updating headers dynamically with getting token? I cannot find this in docs.

@Dreamsorcerer
Copy link
Member

See the linked commit above your message: 981665a

@aryadovoy
Copy link

aryadovoy commented Jun 18, 2023

See the linked commit above your message: 981665a

Yep, it's ok, but I cannot understand how can I update headers every hour (for example). I have ClientSession instance, can I use something like this?

class Periodic:
    def __init__(self, func: Any, time: int) -> None:
        self.func = func
        self.time = time
        self.is_started = False
        self._task = None

    async def start(self) -> None:
        if not self.is_started:
            self.is_started = True
            self._task = asyncio.ensure_future(self._run())

    async def stop(self) -> None:
        if self.is_started:
            self.is_started = False
            self._task.cancel()
            with suppress(asyncio.CancelledError):
                await self._task

    async def _run(self) -> None:
        while True:
            await asyncio.sleep(self.time)
            self.func()


class SomeAiohttpClient:

    …

    @classmethod
    def _update_authorization_header(cls, key: str, secret: str) -> None:
        token = get_token(oauthkey=key, oauthsecret=secret)
        cls.aiohttp_client.headers["Authorization"] = f"Bearer {token}"
        logger.info("Bitbucket | Headers were updated")

    @classmethod
    async def get_aiohttp_client(
        cls, config: SomeConfig
    ) -> ClientSession:
        headers = cls._get_authorization_headers(config=config)

        if cls.aiohttp_client is None:
            cls.aiohttp_client = cls._get_aiohttp_client(headers=headers)

        cls.update_headers_periodically = Periodic(
            func=lambda: cls._update_authorization_header(
                key=config.key, secret=config.secret
            ),
            time=3600,
        )
        await cls.update_headers_periodically.start()

        return cls.aiohttp_client

@faph, have you implemented this part?

@Dreamsorcerer
Copy link
Member

Dreamsorcerer commented Jun 18, 2023

Looks overly complicated to me. Why not just something as simple as:

async def _update_headers(sess):
    while True:
        await asyncio.sleep(3600)
        sess.headers.update(get_headers())

@asynccontextmanager
async def client(headers):
    async with ClientSession(headers=headers) as sess:
        t = asyncio.create_task(_update_headers(sess))
        yield sess
        t.cancel()
        await t

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants