Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 2096226: Check chosen node-ip can be used #181

Merged
merged 1 commit into from
Jun 13, 2022

Conversation

derekhiggins
Copy link
Contributor

In IPv6, if the chosen IP address is still tentative
then it can't be bound to, resulting in crio crashing.
Check that the IP address can be used and if not fail
or retry as appropriate.

@derekhiggins derekhiggins changed the title Check chosen node-ip can be used Bug 2096226: Check chosen node-ip can be used Jun 13, 2022
@openshift-ci openshift-ci bot added bugzilla/severity-unspecified Referenced Bugzilla bug's severity is unspecified for the PR. bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. labels Jun 13, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 13, 2022

@derekhiggins: This pull request references Bugzilla bug 2096226, which is invalid:

  • expected the bug to target the "4.11.0" release, but it targets "---" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Bug 2096226: Check chosen node-ip can be used

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

// If using IPv6, verify that the choosen address isn't tentative
// i.e. we can actually bind to it
if net.IPv6len == len(chosen[0]) {
_, err := net.Listen("tcp", "["+chosen[0].String()+"]:10010")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Binding to a hard-coded port seems a little hacky - although we know that the way things are deployed on startup this should run prior to crio.service, this would fail if someone runs the node-ip check manually after it's running, right?

I'm wondering if we should either use a random high port, or ideally check the connection flags e.g can we determine if the state is/isn't tentative via the golang rtnetlink bindings?

https://pkg.go.dev/github.com/vishvananda/go-netlink/rtnetlink/addr#Flags

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can remove the port number and a random on is used, I didn't do this as it may pick a port another service wants, but its so short lived I guess it should be ok

I can Look into checking the flags

In IPv6, if the chosen IP address is still tentative
then it can't be bound to, resulting in crio crashing.
Check that the IP address can be used and if not fail
or retry as appropriate.
@cybertron
Copy link
Member

lgtm pending ipv6 job pass
/bugzilla refresh

Note that it is important for this to work after crio has already started because the node-ip logic is also used in the resolv-prepender script that makes sure our local DNS server is first in resolv.conf. I'm a little confused how the ipv6 job passed on the previous version of this.

@openshift-ci openshift-ci bot added bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. and removed bugzilla/severity-unspecified Referenced Bugzilla bug's severity is unspecified for the PR. bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. labels Jun 13, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 13, 2022

@cybertron: This pull request references Bugzilla bug 2096226, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.11.0) matches configured target release for branch (4.11.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

No GitHub users were found matching the public email listed for the QA contact in Bugzilla (vvoronko@redhat.com), skipping review request.

In response to this:

lgtm pending ipv6 job pass
/bugzilla refresh

Note that it is important for this to work after crio has already started because the node-ip logic is also used in the resolv-prepender script that makes sure our local DNS server is first in resolv.conf. I'm a little confused how the ipv6 job passed on the previous version of this.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 13, 2022

@derekhiggins: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@cybertron
Copy link
Member

/lgtm

Looks like this will unblock ipv6 ci and it's not an unreasonable thing to check, so let's go with it.

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 13, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 13, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cybertron, derekhiggins

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 13, 2022
@openshift-ci openshift-ci bot merged commit 70d770d into openshift:master Jun 13, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 13, 2022

@derekhiggins: All pull requests linked via external trackers have merged:

Bugzilla bug 2096226 has been moved to the MODIFIED state.

In response to this:

Bug 2096226: Check chosen node-ip can be used

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

cybertron added a commit to cybertron/baremetal-runtimecfg that referenced this pull request Aug 24, 2022
In openshift#181 we fixed a problem where ipv6 addresses are initially in a
state that is not usable by services. However, this was only done
for the code path where VIPs are in use, so it only applies to IPI
deployments that are not using remote workers. It turns out this is
also happening on non-IPI deployments (and likely for IPI deployments
with remote workers), so we need to run the same check on that code
path.
openshift-cherrypick-robot pushed a commit to openshift-cherrypick-robot/baremetal-runtimecfg that referenced this pull request Aug 29, 2022
In openshift#181 we fixed a problem where ipv6 addresses are initially in a
state that is not usable by services. However, this was only done
for the code path where VIPs are in use, so it only applies to IPI
deployments that are not using remote workers. It turns out this is
also happening on non-IPI deployments (and likely for IPI deployments
with remote workers), so we need to run the same check on that code
path.
cybertron added a commit to cybertron/baremetal-runtimecfg that referenced this pull request Sep 2, 2022
In openshift#181 we fixed a problem where ipv6 addresses are initially in a
state that is not usable by services. However, this was only done
for the code path where VIPs are in use, so it only applies to IPI
deployments that are not using remote workers. It turns out this is
also happening on non-IPI deployments (and likely for IPI deployments
with remote workers), so we need to run the same check on that code
path.

(cherry picked from commit 998bbd4)
tsorya pushed a commit to tsorya/baremetal-runtimecfg that referenced this pull request Sep 9, 2022
In openshift#181 we fixed a problem where ipv6 addresses are initially in a
state that is not usable by services. However, this was only done
for the code path where VIPs are in use, so it only applies to IPI
deployments that are not using remote workers. It turns out this is
also happening on non-IPI deployments (and likely for IPI deployments
with remote workers), so we need to run the same check on that code
path.
tsorya pushed a commit to tsorya/baremetal-runtimecfg that referenced this pull request Sep 9, 2022
In openshift#181 we fixed a problem where ipv6 addresses are initially in a
state that is not usable by services. However, this was only done
for the code path where VIPs are in use, so it only applies to IPI
deployments that are not using remote workers. It turns out this is
also happening on non-IPI deployments (and likely for IPI deployments
with remote workers), so we need to run the same check on that code
path.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants