Always set status to RollingUpdate if operation exists #93

Merged
victor-carvalho merged 4 commits into master from fix/update-scheduler-config-zero-replicas
Jul 10, 2020

Conversation

victor-carvalho
Contributor

When an operation requires recreating the pods (changing the image, env vars, etc.) and there are no pods, the operation gets stuck until it times out. This happens because GetOperationStatus checks whether the operation is in RollingUpdate status and, if it is not, always returns zero progress, while EnsureCorrectRooms does not set the status when the number of invalid pods is zero.

This PR makes EnsureCorrectRooms always set the status to RollingUpdate if there is a current operation, even if there are zero pods to update.
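
For readers who want to see the interaction in isolation, here is a minimal sketch of the behavior described above. It is not the code from this PR: the `operation` type, the `progress` and `ensureCorrectRooms` functions, the `Pending` status, the `alwaysSetStatus` flag, and the choice to report 100% when there is nothing to update are all assumptions made for illustration. Only the RollingUpdate status and the two roles (one function reporting progress, one setting the status) come from the description.

```go
package main

import "fmt"

// Hypothetical status values. Only "RollingUpdate" is named in the PR;
// "Pending" is an assumed placeholder for any other status.
const (
	statusPending       = "Pending"
	statusRollingUpdate = "RollingUpdate"
)

// operation is a hypothetical stand-in for the current scheduler operation.
type operation struct {
	status string
}

// progress mimics the described GetOperationStatus check: unless the
// operation is in RollingUpdate status, progress is always zero.
// Reporting 100 when there are no pods to update is an assumption.
func progress(op *operation, updatedPods, totalPods int) float64 {
	if op == nil || op.status != statusRollingUpdate {
		return 0
	}
	if totalPods == 0 {
		return 100
	}
	return 100 * float64(updatedPods) / float64(totalPods)
}

// ensureCorrectRooms mimics the described status handling. With
// alwaysSetStatus=false the status is set only when there are invalid pods
// (the old behavior); with alwaysSetStatus=true it is set whenever an
// operation exists (the behavior this PR describes).
func ensureCorrectRooms(op *operation, invalidPods int, alwaysSetStatus bool) {
	if op == nil {
		return // no current operation, nothing to mark
	}
	if invalidPods > 0 || alwaysSetStatus {
		if op.status != statusRollingUpdate {
			// Record the previous status before overwriting it.
			fmt.Printf("changing operation status from %q to %q\n", op.status, statusRollingUpdate)
		}
		op.status = statusRollingUpdate
	}
}

func main() {
	// Old behavior: zero pods, status never changes, progress stays at zero.
	stuck := &operation{status: statusPending}
	ensureCorrectRooms(stuck, 0, false)
	fmt.Println("old progress:", progress(stuck, 0, 0))

	// New behavior: status is set even with zero pods, so progress can complete.
	fixed := &operation{status: statusPending}
	ensureCorrectRooms(fixed, 0, true)
	fmt.Println("new progress:", progress(fixed, 0, 0))
}
```

In this sketch the old path leaves the operation outside RollingUpdate forever, which is the stuck-until-timeout case; the new path lets the progress check move past zero.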

@victor-carvalho victor-carvalho changed the title from "Fix/update scheduler config zero replicas" to "Always set status to RollingUpdate if operation exists" on Jul 9, 2020
@coveralls

coveralls commented Jul 9, 2020

Coverage Status

Coverage decreased (-0.03%) to 74.533% when pulling 851e2ff on fix/update-scheduler-config-zero-replicas into 82080ea on master.

@gabrielcorado
Contributor

gabrielcorado commented Jul 9, 2020

Can this change interfere with an Operation that has already finished?

EDIT: As we discussed, it will be a problem only if two workers get it, but since the Operation key is deleted by the API when the operation finishes, we think this is an edge case that is very unlikely to happen. I also suggested adding a log line so we can track what the description was before we change it to rolling update.

@leohahn
Contributor

leohahn commented Jul 10, 2020

Would it be hard to add a regression test for this issue?

@victor-carvalho victor-carvalho merged commit 1e7e89b into master Jul 10, 2020
@victor-carvalho victor-carvalho deleted the fix/update-scheduler-config-zero-replicas branch July 10, 2020 19:24
4 participants