Bug 1847705: Stop forcing dhclient in baremetal and friends #1840
@cybertron: This pull request references Bugzilla bug 1847705, which is valid. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
Note that my baremetal deployments are still failing for other reasons, but this seems to have gotten DHCP working correctly. /cc @mandre @rgolangh @patrickdillon I doubt this will cause any new breakage since dhclient is not on the 4.6 images anyway, but please verify that this won't mess up your platforms. |
Hmm. But there's something tricky going on here because the Ignition here can only affect the real root. So what's happening in the initramfs isn't being configured here. If this somehow fixes things then we have a drift between the initramfs and the real root. BTW this is tangentially related to coreos/fedora-coreos-tracker#513 I think. |
Did you test this with the current scenario of pivoting from 4.5, or with openshift/installer#3763?
Yeah, this still gets the wrong address initially, but at least it gets the right one in the end so the deployment works. I'm not entirely sure that's a change in behavior since previously we'd have had the same issue with the duid. I'll have to look at a 4.5 deployment to see how that behaved. So far I've only tested with the installer patch to use the 4.6 image. I can try it without, although we might be able to tell how it worked from the ci job. It won't pass, but it might get far enough to see how DHCP worked. |
Tested locally and I can confirm this resolves the DHCP issue we were seeing with dev-scripts and IPv6.
/lgtm
I just tested both with and without openshift/installer#3763 - in both cases the masters now get their expected DHCPv6 reservations so I think this part of the fix is working OK. |
Also, looking at the e2e-metal-ipi CI output (the libvirt serial console output for each master VM), we see:
$ cat ostest_master_0-serial0.log | grep ^enp2s0 | uniq
This again shows the expected IPs, although I'm a little surprised to see the ingressVIP show up on master-0. This is a deployment with 2 workers, so the ingress VIP should only ever point to workers, I think? /cc @yboaron @celebdor. That's unrelated to this fix, though; it may be a separate issue we need to address.
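The check above can be repeated across all masters with a small loop. This is only a sketch: the `ostest_master_*-serial0.log` glob is an assumption extrapolated from the one file name quoted in the comment, and `iface_lines` is a hypothetical helper name, not something from the CI tooling.

```shell
#!/bin/sh
# Print the distinct consecutive lines for one interface from a serial
# console log. Same pipeline as the command quoted above, without the cat.
iface_lines() {
    log="$1"
    iface="${2:-enp2s0}"   # interface name taken from the quoted command
    grep "^${iface}" "$log" | uniq
}

# Walk every master log in the current directory.
# ASSUMPTION: the other masters' logs follow the same naming pattern
# as ostest_master_0-serial0.log.
for log in ostest_master_*-serial0.log; do
    [ -e "$log" ] || continue   # skip cleanly when the glob matches nothing
    echo "== $log"
    iface_lines "$log"
done
```

Running it from the directory holding the serial logs prints one section per master, which makes it easier to spot a VIP landing on an unexpected node.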
Did the build you're using have #1817? For a long time baremetal masters have been schedulable regardless of the number of workers due to the old kubelet hack still being there.
Yes, I just fetched this PR branch, which includes that commit, so I guess there must still be something that causes the ingressVIP to end up on a master. I haven't looked into it in detail yet.
Edit: actually the results were from the CI run; there it's also NUM_WORKERS=2 AFAICS.
Was ingress responsive? |
/lgtm |
In 4.6 the RHCOS image no longer has dhclient. In order to maintain the same behavior for DHCPv6 it is also necessary to set the ipv6.dhcp-iaid option to mac (which is what dhclient always uses).
Note that I applied the same config for ovirt even though it was only setting dhcp=dhclient before. I think making all of these configs consistent is the right thing to do, even if a platform doesn't care about IPv6.
Also, there was previously a dhclient.conf file that overrode the search domain. Since we moved the resolv prepender to a dispatcher script I don't think we need that setting anymore. The prepender script already overrides the search domain. It looks like dhclient.conf was also being used to prepend the DNS VIP on at least one platform. I've still removed that file because I believe it has no effect on the 4.6 images so if that breaks anything it will need to be fixed separately anyway.
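For illustration, the NetworkManager side of this change might look like the drop-in written by the script below. This is a sketch, not the file from this PR: the file name `99-dhcp-iaid-example.conf` and the scratch directory are hypothetical, and only the `ipv6.dhcp-iaid=mac` setting comes from the description above. Placing it under `[connection]` assumes the usual NetworkManager.conf mechanism for connection defaults.

```shell
#!/bin/sh
# Sketch only: writes an illustrative NetworkManager drop-in to a scratch
# directory rather than /etc/NetworkManager/conf.d.
set -eu

# ASSUMPTION: CONF_DIR and the file name are hypothetical, chosen for this example.
conf_dir="${CONF_DIR:-$(mktemp -d)}"
conf="$conf_dir/99-dhcp-iaid-example.conf"

cat > "$conf" <<'EOF'
[connection]
# Derive the DHCPv6 IAID from the interface MAC address, which the
# description above says matches what the old DHCP client did.
ipv6.dhcp-iaid=mac
EOF

echo "wrote $conf"
```

Note that the sketch no longer sets `dhcp=dhclient`, since the description says that client is absent from the 4.6 image.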
Just rebased. Hopefully this will pass metal ci now. |
/test e2e-metal-ipi |
Is metal-ipi expected to pass? I thought we still had OVN issues |
#1830 was the fix for the OVN issue. The rebase was to pull that in so it should pass now. |
/lgtm |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: celebdor, cybertron, hardys, kikisdeliveryservice, runcom
/retest Please review the full test history for this PR and help us cut down flakes. |
@cybertron: The following tests failed, say
@cybertron: All pull requests linked via external trackers have merged: openshift/machine-config-operator#1840, openshift/installer#3763. Bugzilla bug 1847705 has been moved to the MODIFIED state.