Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release-1.1] virt-handler: wait up to 2 seconds for the network-status downwardAPI to appear #10646

Merged

Conversation

jean-edouard
Copy link
Contributor

What this PR does / why we need it:
The network-status downward API file can take some time to get populated, and since virt-handler reads it just once on startup, the dedicated migration network can get ignored.
This is only a problem in k8s 1.27+.
This PR is a workaround, for 1.1 only (unless we decide to backport it). We are working on a proper fix in main, but it is taking longer than expected due to a potential bug in virt-operator.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes the "falky" dedicated migration test (the code is flaky, not the test)

Special notes for your reviewer:

Checklist

This checklist is not enforcing, but it's a reminder of items that could be relevant to every PR.
Approvers are expected to review this list.

Release note:

The dedicated migration network should now always be properly detected by virt-handler

@kubevirt-bot kubevirt-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Oct 27, 2023
Comment on lines 36 to 44
for i := 0; i < 4; i++ {
dat, err = os.ReadFile(networkStatusPath)
if err != nil {
return "", fmt.Errorf("failed to read network status from downwards API")
}
if len(dat) != 0 {
break
}
time.Sleep(500 * time.Millisecond)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From your experience, are 1.5 seconds enough? Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. 90% of the time, the file is there right away, and if not I've never seen it appear later than after the first sleep.
What I really want is to limit the delay on nodes that don't have the annotation, and therefore a file that will stay empty.
However, looking at this code again, there's a useless sleep on last iteration that I need to remove.
/hold

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, also added an extra loop to be extra cautious and match the PR/commit title. PTAL!
/hold cancel

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😎

@kubevirt-bot kubevirt-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 30, 2023
… to appear

This is a band-aid to work around the downwardAPI slowness seen in k8s 1.27+.
Future versions of the project will have a proper fix for this.

Signed-off-by: Jed Lejosne <jed@redhat.com>
@kubevirt-bot kubevirt-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 30, 2023
Copy link
Contributor

@fossedihelm fossedihelm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jean-edouard
/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Oct 30, 2023
Copy link
Member

@xpivarc xpivarc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@kubevirt-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: xpivarc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 31, 2023
@kubevirt-commenter-bot
Copy link

/retest-required
This bot automatically retries required jobs that failed/flaked on approved PRs.
Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@kubevirt-bot
Copy link
Contributor

@jean-edouard: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kubevirt-e2e-k8s-1.28-sig-compute c753f79 link unknown /test pull-kubevirt-e2e-k8s-1.28-sig-compute-1.1

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@kubevirt-commenter-bot
Copy link

/retest-required
This bot automatically retries required jobs that failed/flaked on approved PRs.
Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@kubevirt-bot kubevirt-bot merged commit 2480172 into kubevirt:release-1.1 Nov 1, 2023
35 of 36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/S
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants