-
Notifications
You must be signed in to change notification settings - Fork 8.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kibana throws internal server error when auth fails due to missing saml token #22905
Comments
@kobelb in https://github.com/elastic/kibana/blob/master/x-pack/plugins/security/server/lib/authentication/providers/saml.js#L43, Kibana handles invalid refresh tokens with some hardcoded response body strings. Is this a case of ES not returning back the proper message body for an error? |
@elastic/es-security Kibana currently has the two error descriptions hard-coded to determine whether the SAML refresh token is invalid: We're seeing an additional error description |
No, this might be a bug. Did the user logged out (called ES We would be grateful for:
Maybe someone else from @elastic/es-security has a better idea, that might not require work from the reporter. |
@dmlittle we've had trouble recreating this issue that you're seeing, would you mind elaborating on your SAML configuration? Which identity provider are you using? Do you use single logout? |
Ι had missed this originally :/ The behavior described here can happen as we are removing tokens periodically. For this to manifest, all the above needs to happen:
The event of a user logging out or someone using the invalidation API can trigger the deletion of the now expired access and refresh tokens that the user got on Arguably, ES should respond with a 401 instead of a 500 here. |
@jkakavas that makes sense, we can update the SAML auth provider in Kibana to take this into consideration, thanks! |
@kobelb I'm treating this as an ES bug, so hopefully this will be resolved for the next minor. I'll update this issue with the solution once it's merged |
Gotcha, this is in regard to the |
If I understand the manifestation above correctly, Kibana gets the It was originally thought that the |
I'll try to reproduce locally and see if there is anything we'll need to do in Kibana once we start getting |
Thanks Aleh! To manually cause this to occur, I was running a query to explicitly delete the tokens from Elasticsearch. Otherwise, we'd have to wait until a certain time period after the refresh token has expired for ES to delete them themselves. |
Thanks for the suggestion! This time I just tweaked Elasticsearch source code a bit :) So yeah, if ES starts responding with 401 in case access token document is missing we'll clear session cookie, drop refresh token and re-initiate login [*]. We try to use refresh token only if we get Refresh token errors are different though, we get 400 error in most (all?) of the cases:
For the first two we'll start new SAML Handshake, but for the last one user will get Hey @jkakavas,
[*] If user hits this error during full page reload they will see blank page with JSON error, so user will have to refresh page once again. We can improve that if we want to, e.g. start SAML handshake if access token validation fails with 401 or something like that. |
No. For both explicitly invalidated and for expired access tokens, we return:
There is the question of what happens when you use a wrong/malformed access token ( This now returns 500 but will be fixed to return 401 ), what should be returned as error description in that 401 and how you should handle it in Kibana. I think that we should still try to refresh as there is no way of knowing if the access token is wrong (as in we can't find the relevant document ) because it is actually a wrong attempt, or a bit flipped, or the access token was expired and then deleted but it's associated refresh token is still good to use. So in summary I think we can say that if you try auth with an access token and you get a 401, try to use the refresh token you have.
As we're discussing in elastic/elasticsearch#39808, we should be able to handle all |
Nice, these error_description string based checks really bothered me, great that we can fully rely on status codes now. |
Fixed via #33777. The fix should be available starting from 7.1. |
@azasypkin we will still return 500 for a token that is deleted (elastic/elasticsearch#38866) , does #33777 handle this ? ( we do plan to fix it for 7.1 , but I think we'd better leave this issue open for tracking until then ) |
Yes, we added a workaround for this specific error that we'll remove once elastic/elasticsearch#38866 is merged. |
I know this issue has been closed for a while, but we seem to be encountering it again on a 7.5 cluster: https://github.com/elastic/infra/issues/17590 Should we re-open this issue, or create a new one? |
@mindbat The original problem has been addressed. This sounds similar and might as well be related ( the error message is the same but the part of the code that throws it is different ) but I think we should handle this in a separate issue. If @elastic/kibana-security feels otherwise, I will open an ES issue to track this. Since I can see in the infra ticket that this is reproducible, it would be immensely helpful if we can get the ES logs when that happens (with |
Gotcha.
Makes sense to me, and thanks!
Agreed! Since this is an ESS cluster, we don't have direct access to the logs...I'll see what we can do, though. |
No objections from our side, that looks weird indeed. In the worst case I'd expect to see |
The only place we throw with that error message is https://github.com/elastic/elasticsearch/blob/7.5/x-pack/plugin/security/src/main/java/org/elasticsearch/xpack/security/authc/TokenService.java#L1503 where we search for a document in the tokens index. However, to get to there we have already called There is always a possibility of a bug ( a race condition somewhere ) but the logs will be very helpful in troubleshooting so I'll defer further analysis until we get them |
@jkakavas What about https://github.com/elastic/elasticsearch/blob/7.5/x-pack/plugin/security/src/main/java/org/elasticsearch/xpack/security/authc/TokenService.java#L444 ? I think we need the ES server logs for this, and should open a separate issue to track it. |
Apologies everyone, I mixed up versions. This is a well known issue that we fixed in 7.6 ( and not in 7.5 ). Kibana, on the other hand, should have dealt with the |
That's correct. And if this happens since 7.2 we restart SAML handshake for non-AJAX requests or return @mindbat can we get verbose Kibana logs ( |
@azasypkin It's an ESS cluster, so we don't have direct access to logs. I'll ping Cloud |
Ok, I don't have a copy of all the logs, but I did find an Internal Server Error being logged over and over again on the day we experienced this for the first time (I've stripped out some non-essential ESS-only info from the below). The query it's logging is one of the ones we use in the Kibana Dashboard that's throwing the Internal Server Error
|
Unfortunately we can't get much value from this log, we really need full Kibana verbose logs to see the entire request authentication flow to know what's going on. |
@azasypkin Gotcha. I'll see about getting verbose logging turned on, and then replicating the error. |
Kibana version: 6.3
Elasticsearch version: 6.3
A user authenticates to Kibana via an Active Directory service, and can use Kibana for the day. When they stop using Kibana for a long time (usually when returning the next day), the next time they open Kibana they see an error on the screen:
Elasticsearch logs will show
One can work around the problem by clearing browser cookies and refreshing the page.
Expected result:
If the token used for authentication is no longer valid, the user should be asked to log in again.
The text was updated successfully, but these errors were encountered: