MGMT-12970: don't reset auto-assign for irrelevant hosts #4891
Codecov Report
@@            Coverage Diff             @@
##           master    #4891      +/-   ##
==========================================
- Coverage   67.77%   67.63%   -0.15%
==========================================
  Files         202      202
  Lines       30100    30183      +83
==========================================
+ Hits        20401    20414      +13
- Misses       7892     7963      +71
+ Partials     1807     1806       -1
/assign @ori-amizur
/test ci/prow/edge-subsystem-kubeapi-aws
The monitor resets the 'suggested-role' field on all hosts of a cluster if one of its hosts is in auto-assign. This was done to eliminate race conditions between 'register host' and the calculation of auto-assign, which runs asynchronously in the monitor loop.
The reset is expected to happen only once in the lifetime of a host.
However, in production we found several types of obsolete hosts, mostly orphaned ones, that trigger the role clearance all the time.
The amended query no longer resets the roles when the auto-assign host is not relevant, that is, when the host is:
* in status disabled (obsolete)
* disconnected
* a day2 host (old hosts; the current day 2 flow sets the role to worker before registering)
* stuck in the discovering state
This PR, combined with future improvements to the GC code, will eliminate the spurious resets of auto-assign roles.
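The filtering rule described above can be sketched as a small predicate. This is a minimal Python illustration under stated assumptions, not the project's actual query: the `Host` fields, the status and role strings, and the function names are all hypothetical, and "stuck in discovering" is simplified here to any host currently in the discovering status.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical status names; statuses in which an auto-assign host
# is treated as irrelevant and must not trigger a reset.
IRRELEVANT_STATUSES = {"disabled", "disconnected", "discovering"}


@dataclass
class Host:
    id: str
    role: str           # hypothetical values: "auto-assign", "master", "worker"
    status: str         # hypothetical values: "known", "disabled", ...
    kind: str = "day1"  # hypothetical values: "day1" or "day2"


def triggers_reset(host: Host) -> bool:
    """Return True if this host alone justifies resetting suggested roles."""
    if host.role != "auto-assign":
        return False  # only auto-assign hosts can trigger a reset
    if host.kind == "day2":
        return False  # day2 hosts are excluded from the check
    return host.status not in IRRELEVANT_STATUSES


def should_reset_suggested_roles(hosts: Iterable[Host]) -> bool:
    """Reset the cluster's suggested roles only if a relevant host is in auto-assign."""
    return any(triggers_reset(h) for h in hosts)
```

With this rule, a cluster whose only auto-assign host is disabled, disconnected, day2, or discovering keeps its suggested roles, while a single relevant auto-assign host still causes the reset.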
/retest-required
@slaviered: all tests passed!
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: ori-amizur, slaviered