Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EZP-29253: Make session refresh re-try on timeouts and server errors #90

Merged
merged 3 commits into from Jun 11, 2018

Conversation

andrerom
Copy link
Contributor

@andrerom andrerom commented Jun 1, 2018

In order to try to improve stability of rest client, re-try session refresh which is often the one holding others back on timeout and 5xx errors. Do this for up to 180 seconds which can happen on slow deployment / security updates of backend, on cases where service needs to be taken down and cache needs to be re-generated.

NOTE: This does not solve issues when backend is down and client sends other requests to it, for instance when session has just been refreshed. However 1. depending on the call that will have to be identified on a case by case basis, can't handle it generically afaik. 2. v2 solves this.

Todo:

  • Rebase to right branch
  • In case of 5xx errors, maybe add some sleep / wait time in order to make sure we can survive a up to 90 180 sec downtime / update / deployment window.

@andrerom andrerom force-pushed the session_refresh_retry branch 3 times, most recently from 007178f to a4eda34 Compare June 1, 2018 13:45
In order to try to improve stability of rest client, re-try session
refresh which is often the one helding others back on timout and 5xx errors.
@andrerom andrerom changed the base branch from master to 1.5 June 1, 2018 16:01
@andrerom andrerom changed the title [WIP] EZP-29253: Make session refresh re-try on timeouts and server errors EZP-29253: Make session refresh re-try on timeouts and server errors Jun 1, 2018
@jacek-foremski
Copy link
Contributor

I tested the solution. It correctly retries the session refresh requests. However, after three requests (if all of them fail), the end result is the same - the Editor will see the login form. The reporter mentioned (and I also think that it would be a good idea) that Users should not be presented with the login form while they are logged in and still have a valid session - even if the session refresh requests are failing. Perhaps a better idea would be not to retry the requests, but let them fail silently (ofc without refreshing the session)?

@andrerom
Copy link
Contributor Author

andrerom commented Jun 4, 2018

while they are logged in and still have a valid session

Do we have anyway of knowing that? Or you suggest we just assume it.

let them fail silently (ofc without refreshing the session)?

We could do that after 3 attempts, only thing I see as a problem is that if this endpoint is down, all other endpoints are probably down, and typically calls to other endpoints are just waiting for session refresh to complete before they are sent by client to server.

Unless of course we manage to get more details from customer on cases where this fails which is not related to deployment / server being restarted and such which is what I'm trying to cover with timeout and 5xx error handling. If we get something we can reproduce on that, I would be much more comfortable changing the logic here.

@jacek-foremski
Copy link
Contributor

I was suggesting just assuming that the Editor still has a valid session. This is checked anyway on every request if I understand this correctly, so that shouldn't be an issue.
You are of course right in saying that if this endpoint is down, the other endpoints are also probably down. The reporter just seems to have an issue with showing the login form to the Editor when the errors happen (since the Editor is not really logged out).
In any case, I'll ask the reporter if retrying the session refresh requests is an acceptable solution for them and if not, I'll ask them to provide use case(s) not related to deployment.

Copy link
Member

@adamwojs adamwojs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good. I'm missing only the configuration for the number of retries and timeout between then (maybe as the third parameter).

@andrerom

This comment has been minimized.

// We re-try session refresh for 180 seconds on timeouts/server errors as it is often the requests holding
// others back(see ConnectionManager), and is thus the one request we typically need to re-try a few
// times if backend has temporary downtime or is undergoing brief maintenance.
if (date - (new Date()) >= (timeout || 180000)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be (new Date()) - date (other way around).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed 😅

@andrerom
Copy link
Contributor Author

andrerom commented Jun 5, 2018

Updated again, cleaned up for review.

@andrerom
Copy link
Contributor Author

Merging to make aware on beta tag.

@andrerom andrerom merged commit 4624430 into 1.5 Jun 11, 2018
@andrerom andrerom deleted the session_refresh_retry branch June 11, 2018 08:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
4 participants