
Allow token refresh for multiple requests in a small window #36872

Closed
epixa opened this issue Dec 20, 2018 · 19 comments · Fixed by #39631 or #38382
Labels
>enhancement :Security/Authentication Logging in, Usernames/passwords, Realms (Native/LDAP/AD/SAML/PKI/etc)


@epixa
Contributor

epixa commented Dec 20, 2018

The problem

The current behavior for refreshing a token is to immediately invalidate a refresh token when it is used the first time. In principle this is a sensible way to prevent the refresh token from being used maliciously, but in practice it can trivially break a client that is making many requests to Elasticsearch.

In Kibana, the consequence is that sessions using our token-based providers (saml and token) occasionally get destroyed as multiple requests race to refresh an expired access token. This problem will only get worse as we expand the usage of canvas expressions, which can result in more parallel requests.

Let's say a user allows their session to idle for a bit and their access token expires; then they click refresh on a dashboard. Requests A, B, and C are fired off to Elasticsearch in parallel, all three get rejected due to an expired access token, and all three attempt to refresh the session using the same refresh token. A succeeds and returns a new session cookie to the client. B and C fail since the refresh token has already been used. The client sees the failures of B and C and assumes the session must be dead, so it either errors or sends the user to the login form with a cleared session.

Only processing a single refresh token in Kibana for parallel requests isn't practical because there could be multiple Kibana instances behind a load balancer handling requests for the same session.
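To make the failure mode concrete, here is a minimal sketch in TypeScript (hypothetical names; not actual Kibana or Elasticsearch code) of how a one-time-use refresh token turns concurrent refresh attempts into session loss:

class TokenService {
  // A refresh token is invalidated on first use, mirroring the current behavior.
  private usedRefreshTokens = new Set<string>();

  async refresh(refreshToken: string): Promise<string> {
    if (this.usedRefreshTokens.has(refreshToken)) {
      throw new Error("invalid_grant: refresh token already used");
    }
    this.usedRefreshTokens.add(refreshToken);
    return "new-access-token";
  }
}

async function main() {
  const tokenService = new TokenService();
  // Requests A, B, and C all saw an expired access token and race to refresh
  // with the same refresh token from the shared session.
  const results = await Promise.allSettled([
    tokenService.refresh("shared-refresh-token"), // A: succeeds, new session cookie
    tokenService.refresh("shared-refresh-token"), // B: rejected -> session cleared
    tokenService.refresh("shared-refresh-token"), // C: rejected -> session cleared
  ]);
  results.forEach((result, i) => console.log("ABC"[i], result.status));
}

main();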

The proposal

I propose that we add an optional nonce property to the refresh_token grant type. If the first request to refresh a token contains a nonce, then subsequent refresh token requests for that same refresh token using the same nonce will return the same new access token.

A client like Kibana can generate a random nonce value each time a user does a page reload on Kibana, and it will include the nonce in any refresh token requests that occur for that user during this time. Unlike the access token and refresh token, the nonce is never stored in the session itself.

There can be a "refresh window" associated with the nonce feature as well, maybe 5 minutes or something like that, after which the token cannot be refreshed even with a matching nonce.
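For illustration, a refresh call under this proposal might look like the following; the grant itself is the existing token API, while the nonce parameter is the proposed addition and does not exist today (token and nonce values are placeholders):

POST /_xpack/security/oauth2/token
{
  "grant_type" : "refresh_token",
  "refresh_token" : "vLBPvmAB6KvwvJZr27cS",
  "nonce" : "d27c5dd0e4d3f534d8b0c3e2"
}

A second request within the window carrying the same refresh_token and nonce would receive the same newly issued access token instead of an invalid_grant error.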

cc @elastic/kibana-security

@epixa epixa added >enhancement :Security/Authentication Logging in, Usernames/passwords, Realms (Native/LDAP/AD/SAML/PKI/etc) labels Dec 20, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-security

@jkakavas
Member

That's a tricky one :) My initial reaction (and I shared this with @kobelb in the issue where we came across this) is that this needs to be fixed on the client side. All the solutions that come to mind, though, either require shared server-side state in Kibana or do not handle the "HA with many Kibana instances" scenario.

I think the suggested approach adds a lot of complexity, and I'd like to propose a different solution: one where we do not invalidate refresh tokens on first use. The original decision to revoke refresh tokens on use still makes sense and is in the spirit of the specification:

The authorization server MAY revoke the old refresh token after issuing a new refresh token to the client.

but I consider revisiting it a sensible trade-off in order to resolve the issue at hand with little to no risk involved.

Suggested Solution

We stop revoking refresh tokens on first use. This will resolve the issue at hand: Requests A, B, and C are fired off to Elasticsearch in parallel, all three get rejected due to an expired access token, and all three attempt to refresh the session using the same refresh token. A succeeds, gets a new access token, and returns a new session cookie to the client. B and C also succeed, but get new access tokens and refresh tokens that are stored in the user's session cookie and overwrite the existing one. The end result is that the last of the concurrent requests to get a response dictates which access token and refresh token are stored in the user's session cookie.

This means that nothing needs to change on Kibana's side. Any concerns regarding the possibility of A, B, and C returning different access/refresh tokens, all of which would be valid and have the same expiration (give or take a few ms)?
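For reference, the refresh grant under discussion is the existing token API call (token value is a placeholder); under this suggestion, B and C would simply repeat it successfully with the same refresh_token:

POST /_xpack/security/oauth2/token
{
  "grant_type" : "refresh_token",
  "refresh_token" : "vLBPvmAB6KvwvJZr27cS"
}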

Standards and Best Practices compliance

There is nothing in the original standard, the OAuth 2.0 Threat Model and Security Considerations, or the recent OAuth 2.0 Security Best Current Practice draft that mandates revoking refresh tokens on use.

Threat Model

Threat: Retaining persistent access

Attack 1: A legitimate client could use the same refresh token multiple times.

Countermeasures: The legitimate client would get a new access token and a new refresh token each time, but

  • the new access token would grant them the same privileges as the original access token (or as the access token they would have received from a single refresh);
  • the new refresh tokens would extend the duration of their access (each new one grants the option to refresh access for another 24h), but this is no different from sequentially refreshing with a new refresh token each time.

As such, this threat degrades into normal functionality and should not be considered a risk.

Attack 2: An attacker could use the same leaked refresh token multiple times.

Countermeasures: As above, this doesn't give any additional access to an attacker. The ability to explicitly invalidate tokens (either specific tokens or all tokens for a user/realm combination) is another mitigating factor for this attack. Finally, we authenticate the client on the refresh grant, so leaked refresh tokens cannot be used by other clients (outside Kibana).

Threat: Performance degradation / Denial of Service

Attack 1: A malicious user can continuously refresh an existing refresh token. Each of these requests would give them a new refresh token and access token, and they could then also start using this new refresh token to get additional ones, etc. Compared to the old approach, this would allow the malicious user to create documents in the security index at an exponential rate, which, depending on deployment-specific factors, could hurt the performance of the Token Service.

Countermeasures: We already delete expired tokens, so there is an upper limit to the number of documents they can create, as we periodically delete documents containing expired refresh tokens. Additionally, we can decide on a sensible maximum number of refreshes that would still resolve the issue at hand while further mitigating this threat.

@kobelb
Contributor

kobelb commented Dec 21, 2018

Do we need to take the logout/token-invalidation behavior into consideration here? If we could potentially have multiple access/refresh tokens that were created from the initial refresh token, it becomes quite complicated (perhaps impossible) to ensure that on logout "all" of the refresh/access tokens are invalidated. This is one of the main complaints I hear from users about our existing basic auth provider: that on logout they can re-use the previous session cookies (which would contain the previous access/refresh tokens) to authenticate subsequently.

@jkakavas
Member

@kobelb wouldn't a call to invalidate all of the user's tokens satisfy the logout requirement?

DELETE /_xpack/security/oauth2/token
{
  "username" : "user"
}

@kobelb
Contributor

kobelb commented Dec 21, 2018

@jkakavas I forgot we could do that now, that's awesome! Would there be a way to limit this to only the tokens associated with the instance of Kibana, so it wouldn't log them out of all other instances of Kibana or their other usages of tokens?

@jkakavas
Member

Would there be a way to limit this to only the tokens associated with the instance of Kibana

No, that's a good catch. I didn't think about this scenario at all. I don't think that we have a good way of differentiating that. We could add an API parameter to also filter by OAuth2 client, but there is also a high probability that these different Kibana instances will use the same user to authenticate to Elasticsearch (that principal is what we track as the client). The original proposal will also need a solution around this, as B, C, and D would create new refresh tokens too, and on logout we would revoke only the one that D created.

I'll go back to the drawing board and re-evaluate @epixa's proposal as well.

@albertzaharovits
Contributor

I agree with @jkakavas's assessment that we will not be introducing new attack vectors by not invalidating the refresh token on use. But I worry that we might increase the possibility of leaking refresh tokens if we generate new ones on each reuse. I would propose that we not generate a new refresh token for each use of another refresh token, and instead keep the original one valid for subsequent uses.

@jkakavas
Member

jkakavas commented Jan 9, 2019

I would propose that we not generate a new refresh token for each use of another refresh token, and instead keep the original one valid for subsequent uses.

In terms of token leakage, that would also broaden the attack surface in the time dimension, as we would have fewer refresh tokens but with a longer validity period (i.e. an attacker finds a refresh token logged somewhere from 4-5 days ago which might still be active).
This also suffers from the same issue with regard to access token invalidation on logout as the previous proposal.

In general, the fact that the refresh token can only be used by the original client makes token leakage a reduced-impact threat, as an attacker would also need to compromise the client's credentials (usually Kibana for now) in order to exploit a leaked refresh token.

I'll go back to the drawing board and re-evaluate @epixa's proposal as well.

I'll get to it in the next couple of days.

@jkakavas
Member

We discussed this today in our team meeting. Jay came up with an idea to simplify the original proposal by not requiring a nonce and using a window of 1 min. We can probably bring this down to a few seconds, as we're trying to solve for concurrent requests. The idea is that we would return the same access token for the duration of this window, as the original proposal also suggests.
I have an idea for an implementation that would allow this and plan to give it a shot unless someone brings up a concern that we have missed. @epixa did you have any specific reasons why the nonce would be useful in a way that is not covered by just a time window?

@epixa
Contributor Author

epixa commented Jan 17, 2019

I don't have strong opinions about the nonce, but I do think it solves a different problem that is possible within the time window.

In a UI client, if an attacker gets access to the access token for another user's session, they almost certainly have access to the refresh token as well. Both of these things must be associated with a session in order to be useful.

We can't prevent an attacker from initiating the session reset themselves, but without the nonce feature, two things are true:

  1. If the attacker is the one that first initiates the token refresh, the original user remains unaware that something is wrong as their session will also seamlessly refresh (to an already-exploited new token). If a nonce was in place, the original user's session would fail to refresh and we could potentially error with something like "this refresh token has already been used" which might tip them off to a problem.
  2. If the original user is the one that first initiates the token refresh, the attacker can still safely refresh the token anyway (still unbeknownst to the user). If the nonce was in place, the attacker's request would be rejected.

To "quietly" exploit the nonce feature, the attacker needs to attack a different mechanism entirely in order to get a valid nonce. The act of getting the session info isn't enough.

With the time window alone, no additional exploit is necessary; the timing just needs to be right. This window could be made easier to exploit if the attacker can determine approximately when a token was due to expire. I notice the token service itself returns expires_in, so if that info is available on an ongoing basis from ES, then it would be pretty easy for an attacker to exploit the window. If it's not easily available through ES, it's also not unlikely that a client would store the whole token service payload (including expires_in) along with a created_at timestamp in the session, in which case the attacker could exploit the window using the info they had on hand.
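For context, the create token response does include the expiry; a response looks roughly like this (example values in the shape of the token API response):

{
  "access_token" : "dGhpcyBpcyBub3QgYSByZWFsIHRva2VuIGJ1dCBpdCBpcw==",
  "type" : "Bearer",
  "expires_in" : 1200,
  "refresh_token" : "vLBPvmAB6KvwvJZr27cS"
}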

I'm comfortable leaving it to this team to decide the extent to which this is acceptable in our auth threat model.

@albertzaharovits
Contributor

@epixa If I'm following correctly, the nonce is a use-once token that, unlike the refresh token (which is also use-once atm), is generated by Kibana and not by ES. For that matter, the nonce is generated instead of stored.

I think the nonce is a raw method of binding the access-refresh token pair to a Kibana instance.
Instead, and this is not mentioned in the previous description, the refresh token is bound to the credentials of the Kibana process that generated the access-refresh token pair in the first place. So, an attacker does not simply have to race for the refresh, or guess the generated nonce; they also have to know the secret Kibana password.

I guess this is not how things work right now, but each authenticator process in a multiple-Kibana deployment should use different secret credentials. That way, the nonce is not required to identify a Kibana instance (it is still useful if credentials are shared, but then I don't think the refresh token is useful anymore).

@jkakavas
Member

I don't think the nonce binds the access-refresh token pair to a client. It binds the pair to a user

A client like Kibana can generate a random nonce value each time a user does a page reload on Kibana,

and Court's argument was that this can be used as a canary token of sorts, to indicate that only one user is allowed to refresh a refresh token (someone else attempting to refresh again with a leaked refresh token within the proposed allowed window will not be successful, as they won't send the same nonce).

In a UI client, if an attacker gets access to the access token for another user's session, they almost certainly have access to the refresh token as well. Both of these things must be associated with a session in order to be useful.

The premise of this attack presupposes that someone (in increasing order of complexity / decreasing order of probability) either

a. happens to get a Kibana auth cookie lying around and tries to use it as is
b. has access to the encryptionKey value (thus access to kibana.yml and probably the kibana user credentials) and gets a Kibana auth cookie somewhere (logged in debug logs on the server? - assuming physical access), so that they can decrypt an auth cookie they got somehow, extract the values of the access token and refresh token, and use them with the token API
c. has access to the kibana process
d. is in a MITM position between Elasticsearch and a client and can see raw traffic, which means that they need to be doing SSL stripping, as the token service requires TLS on the HTTP layer of Elasticsearch.

We can't protect against c and d. They have the kibana user credentials, access to the values of access_token and refresh_token, can see the nonce being sent, etc. Silently refreshing a token is the least they could do.

b has physical access to the Kibana server; I believe they can do far more destructive things than silently refreshing a user's token.

So, we're trying to solve this for a: this attacker can't see the refresh token, as it is encrypted, and even if they could decrypt it, they cannot use it on their own because they don't have the credentials of the client that originally got the refresh token. They can only trick Kibana into refreshing it for them, taking advantage of our refresh window.

Let's assume the refresh window is 10sec and the access token is valid for 10 mins.

  • Without the refresh window and multiple refreshes, the attacker would be able to refresh the token anytime within the first 10 mins of access token validity, before the legitimate user refreshed it because their access token expired. The legitimate user would get an error when their access token expired and would need to re-authenticate.
  • With the refresh window and multiple refreshes, the attacker would still be able to do the above, plus do it without the user noticing, if they could time the refresh (by sending a request to Kibana) somewhere between 9m50s and 10m10s. They would then need to keep doing this every 10 min to retain access without logging out the legitimate user.

I'm leaning toward this being an acceptable risk, given the prerequisites for exploiting it. The stealth property of this attack (which is the only thing an attacker gains here compared to our current behavior) is, I think, not particularly interesting from an attacker's perspective in web environments, where "you were logged out, please reauthenticate" messages are common enough not to make a legitimate user suspicious.

@epixa if you think that there is an easy way to have Kibana predictably send the same value for all XHR requests of the same user that might come concurrently, then this doesn't complicate our implementation that much. But if it's a significant effort on your side, I'm not sure we gain much by doing so.

@albertzaharovits
Contributor

I don't think the nonce binds the access-refresh token pair to a client. It binds the pair to a user

A client like Kibana can generate a random nonce value each time a user does a page reload on Kibana,

and Court's argument was that this can be used as a canary token of sorts, to indicate that only one user is allowed to refresh a refresh token (someone else attempting to refresh again with a leaked refresh token within the proposed allowed window will not be successful, as they won't send the same nonce (1)).

Understood, thanks for clarifying this @jkakavas! But from (1) above, and this below:

They can only trick Kibana into refreshing it for them, taking advantage of our refresh window.

I understand that the threat model is that the attacker can trick Kibana into refreshing the token, but cannot trick it into also using the correct nonce?

If that is so, it sounds pretty low risk to me.

@epixa
Contributor Author

epixa commented Jan 17, 2019

@kobelb I'd like your thoughts on this:

Generating a nonce on page load in Kibana and passing it through on all requests to Elasticsearch should be pretty easy. We'd pass the nonce to the UI via injectVars, add it as a header for all HTTP requests to the backend, and then update the authenticate logic to pass the nonce through in the event of refresh.
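A rough sketch of that plumbing, with all names hypothetical (injectVars, the header name, and the refresh helper are illustrative, not actual Kibana APIs):

import { randomBytes } from "crypto";

// Assumed shapes for the pieces involved (illustrative only).
declare function injectVars(vars: { sessionNonce: string }): void; // server -> UI handoff
declare function addDefaultRequestHeader(name: string, value: string): void; // UI HTTP layer

// 1. On page load, the Kibana server generates a nonce and hands it to the UI.
const sessionNonce = randomBytes(16).toString("hex");
injectVars({ sessionNonce });

// 2. The UI echoes the nonce back as a header on every request to the Kibana server.
addDefaultRequestHeader("kbn-session-nonce", sessionNonce);

// 3. On an expired access token, the authenticate logic forwards the nonce on the
//    refresh grant, so concurrent refreshes carrying the same nonce would all
//    receive the same newly issued tokens under the original proposal.
async function refreshAccessToken(refreshToken: string, nonce: string): Promise<void> {
  const body = {
    grant_type: "refresh_token",
    refresh_token: refreshToken,
    nonce, // the proposed parameter from this issue
  };
  // ... POST body to /_xpack/security/oauth2/token ...
}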

However, multiple browser tabs complicate the situation as each would have its own nonce, which at best would be a bad experience as we break other tabs and at worst would log the user out anyway as the logout operation would fire when refresh failed on another tab.

We could address this cross-tab issue by generating the nonce in sessionStorage instead. The downside to this approach is that the nonce is a little more sticky than it was before, and it's hard to say how much that impacts the value of the nonce from a security perspective. Since our tokens are stored in a cookie and the nonce is in sessionStorage, pwning the session token doesn't mean pwning the nonce, but that feels like an implementation detail of storage that would be easy to overlook and could realistically change in the future.

At this point, I'm leaning toward just abandoning the nonce concept entirely which is the prevailing preference from the ES side anyway.

@kobelb
Contributor

kobelb commented Jan 17, 2019

However, multiple browser tabs complicate the situation as each would have its own nonce, which at best would be a bad experience as we break other tabs and at worst would log the user out anyway as the logout operation would fire when refresh failed on another tab.

We could address this cross-tab issue by generating the nonce in sessionStorage instead. The downside to this approach is that the nonce is a little more sticky than it was before, and it's hard to say how much that impacts the value of the nonce from a security perspective. Since our tokens are stored in a cookie and the nonce is in sessionStorage, pwning the session token doesn't mean pwning the nonce, but that feels like an implementation detail of storage that would be easy to overlook and could realistically change in the future.

At this point, I'm leaning toward just abandoning the nonce concept entirely which is the prevailing preference from the ES side anyway.

I think this is the biggest issue that we're going to run into when trying to use a nonce for this mechanism. If the attacker is able to access the cookie using some mechanism in the browser (which is even less likely because it has the HttpOnly flag, so they can't read it client-side), they'll definitely be able to access sessionStorage. This means that they're most likely going to get access to the nonce by intercepting a network request to Kibana, which will have both the nonce and the session, so we're already compromised in this scenario.

I'm also leaning toward abandoning the nonce concept, given the recent discussions.

@epixa
Contributor Author

epixa commented Jan 17, 2019

Let's consider the nonce detail of this proposal abandoned from Kibana's side then.

@jkakavas
Member

@epixa thanks for raising this, and thanks for the iterations @kobelb, @albertzaharovits, @jaymode. I'll move forward with the original proposal without the nonce.

jkakavas added a commit that referenced this issue Mar 1, 2019
This change adds support for the concurrent refresh of access
tokens as described in #36872.
In short, it allows subsequent client requests to refresh the same token that
come within a predefined window of 60 seconds to be handled as duplicates
of the original one, and thus receive the same response with the same newly
issued access token and refresh token.
In order to support that, two new fields are added to the token document. One
contains the instant (in epoch millis) when a given refresh token is refreshed,
and one contains a pointer to the token document that stores the new
refresh token and access token created by the original refresh.
A side effect of this change, which was however also an intended enhancement
for the token service, is that we needed to stop encrypting the string
representation of the UserToken while serializing. (This was necessary because we
correctly used a new IV every time we encrypted a token during serialization, so
subsequent serializations of the exact same UserToken would produce
different access token strings.)

This change also handles the serialization/deserialization BWC logic:

- In mixed clusters we keep creating tokens in the old format and
consume only old format tokens
- In upgraded clusters, we start creating tokens in the new format but
still remain able to consume old format tokens (that could have been
created during the rolling upgrade and are still valid)

Resolves #36872

Co-authored-by: Jay Modi jaymode@users.noreply.github.com
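To illustrate the two new fields, a token document after a first refresh might look roughly like this (field names and values are illustrative, not the actual index mapping):

{
  "access_token" : {
    "user_token" : "<serialized UserToken>"
  },
  "refresh_token" : {
    "token" : "vLBPvmAB6KvwvJZr27cS",
    "invalidated" : false,
    "refresh_time" : 1551434400000,
    "superseded_by" : "token_6U7dGMCAQTaQ4_YTHiDSbQ"
  }
}

Here refresh_time would be the instant of the first refresh in epoch millis, and superseded_by would point to the document holding the newly issued token pair, which is what lets duplicate refreshes within the 60-second window return the same response.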
jkakavas added a commit to jkakavas/elasticsearch that referenced this issue Mar 1, 2019
(Same commit message as above.)
jkakavas added a commit that referenced this issue Mar 1, 2019
This is a backport of #38382

(Same commit message as above.)
@epixa
Contributor Author

epixa commented Mar 2, 2019

I think the change was temporarily reverted based on the latest comments in the PR, so I'm reopening this. Feel free to close if I'm mistaken.

@epixa epixa reopened this Mar 2, 2019
@jkakavas
Member

jkakavas commented Mar 2, 2019

You're right Court, I will re-close this on Monday upon submitting the PR again.

jkakavas added a commit that referenced this issue Mar 4, 2019
Co-authored-by: Jay Modi jaymode@users.noreply.github.com

(Same commit message as above, with one addition to the BWC logic:)

    When reading/writing TokensInvalidationResult objects, we take into
    consideration that pre-7.1.0 these contained an integer field that carried
    the attempt count

Resolves #36872
jkakavas added a commit that referenced this issue Mar 5, 2019
This is a backport of #39631

Co-authored-by: Jay Modi jaymode@users.noreply.github.com

(Same commit message as above.)