pkg/controller: resync on unavailable #601
Add a check to syncStatusOnly so that status is resynced when an unavailable machine or unupdated machines are found.
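A minimal sketch of what such a check could look like, not the actual change in this PR. It assumes the sync handler signals "resync needed" by returning an error; `poolStatus`, `errResync`, and `checkSettled` are hypothetical names introduced only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// poolStatus is a hypothetical, pared-down stand-in for the machine
// counts carried on a pool's status.
type poolStatus struct {
	MachineCount            int
	UpdatedMachineCount     int
	UnavailableMachineCount int
}

// errResync is a hypothetical sentinel. Returning it from a sync handler
// is assumed to make the caller requeue the key.
var errResync = errors.New("pool not settled, resync required")

// checkSettled returns nil when every machine is updated and available,
// and an error wrapping errResync otherwise.
func checkSettled(s poolStatus) error {
	if s.UnavailableMachineCount > 0 {
		return fmt.Errorf("%d unavailable machine(s): %w",
			s.UnavailableMachineCount, errResync)
	}
	if s.UpdatedMachineCount < s.MachineCount {
		return fmt.Errorf("%d of %d machine(s) not updated: %w",
			s.MachineCount-s.UpdatedMachineCount, s.MachineCount, errResync)
	}
	return nil
}
```

Returning an error rather than silently succeeding is one way to make a work-queue-style controller retry the key; whether that matches the real controller's requeue path is an assumption here.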
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: kikisdeliveryservice
A few unit tests are broken; I need to update them.
Not sure why, so retesting.
So the unit tests were initially failing because they called f.run() instead of f.runExpectError(), which is what we now expect given the new unavailable/not-updated check.
@@ -816,7 +816,7 @@ func TestPaused(t *testing.T) {
 	expMcp.Status = expStatus
 	f.expectUpdateMachineConfigPoolStatus(expMcp)

-	f.run(getKey(mcp, t))
+	f.runExpectError(getKey(mcp, t))
Should we now also have a successful test where f.run just runs (by mocking the machines len)?
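A rough sketch of how both paths might be exercised side by side. The `fixture` type and `syncWithMachines` helper below are hypothetical stand-ins, not the repository's test fixture; they only mirror the f.run / f.runExpectError split shown in the diff, with the machine counts faked directly.

```go
package main

import "fmt"

// syncFunc is a hypothetical stand-in for the controller's sync handler.
type syncFunc func(key string) error

// fixture loosely mirrors the f.run / f.runExpectError split in the diff.
type fixture struct {
	sync syncFunc
}

// run returns an error if the sync handler fails.
func (f *fixture) run(key string) error { return f.runSync(key, false) }

// runExpectError returns an error if the sync handler succeeds.
func (f *fixture) runExpectError(key string) error { return f.runSync(key, true) }

func (f *fixture) runSync(key string, expectError bool) error {
	err := f.sync(key)
	switch {
	case !expectError && err != nil:
		return fmt.Errorf("unexpected error syncing %q: %v", key, err)
	case expectError && err == nil:
		return fmt.Errorf("expected error syncing %q, got nil", key)
	}
	return nil
}

// syncWithMachines fakes the machine counts: the returned handler fails
// while any machine is unavailable or not yet updated.
func syncWithMachines(total, updated, unavailable int) syncFunc {
	return func(key string) error {
		if unavailable > 0 || updated < total {
			return fmt.Errorf("pool %q needs resync", key)
		}
		return nil
	}
}
```

With counts of (3, 3, 0) the plain run path would pass, while (3, 2, 1) would pass only through runExpectError, which is the pairing the comment above asks about.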
/hold after taking a closer look at https://bugzilla.redhat.com/show_bug.cgi?id=1695721, it looks like the MCO isn't causing any issue, so I'm holding this until Trevor further clarifies why the MCO error (apart from the transient one) is causing the upgrade to fail.
makes sense @runcom
/hold cancel not sure how we would ever get into a scenario where we don't resync anymore (and need this), but it looks like we're getting reports that we can fall into this...
/hold I shouldn't have removed the hold tho 😄
/skip
In an effort to clean up the MCO repo, closing old open PRs with no recent activity. Feel free to reopen.
- What I did
Add a check to syncStatusOnly so that status is resynced when an unavailable machine or unupdated machines are found.
Closes BZ 1695721
- How to verify it
CI runs should now pass without the error.