Bug 1972776: improve dual-stack install-config validation #5005

danwinship · 2021-06-16T15:56:55Z

tl;dr: require dual-stack install-configs to be "IPv4-primary". To be backported to 4.8.

At some point in early 4.7 we noticed that "IPv6-primary" dual-stack configurations (eg, where the first value for clusterNetworks/serviceNetworks is IPv6 rather than IPv4) didn't work right due to cluster-etcd-operator issues. I think that specific bug was fixed, but other problems have popped up, and more importantly, we aren't testing IPv6-primary installation anywhere, and dev-scripts actually can't bring up an IPv6-primary cluster currently. Someone just pointed out that a customer had tried this with a 4.8 fc release and was running into weird problems, so let's just block it.

(eg, see https://coreos.slack.com/archives/CQQ7P8UTT/p1623765189017300, https://coreos.slack.com/archives/CQQ7P8UTT/p1624015997022800)

The second commit is another fail-early-on-bad-config fix; it makes us reject values for serviceNetwork that will later be rejected by kube-apiserver.

The third and fourth commits are drive-by fixes, and don't necessarily have to be backported to 4.8.

openshift-ci · 2021-06-16T15:57:04Z

@danwinship: This pull request references Bugzilla bug 1972776, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug

bug is open, matching expected state (open)
bug target release (4.9.0) matches configured target release for branch (4.9.0)
bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

No GitHub users were found matching the public email listed for the QA contact in Bugzilla (gpei@redhat.com), skipping review request.

In response to this:

Bug 1972776: improve dual-stack install-config validation

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

danwinship · 2021-06-18T12:38:29Z

/assign @aojea

openshift-ci · 2021-06-18T12:39:52Z

@danwinship: This pull request references Bugzilla bug 1972776, which is valid.

3 validation(s) were run on this bug

bug is open, matching expected state (open)
bug target release (4.9.0) matches configured target release for branch (4.9.0)
bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

No GitHub users were found matching the public email listed for the QA contact in Bugzilla (gpei@redhat.com), skipping review request.

In response to this:

Bug 1972776: improve dual-stack install-config validation

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

pkg/types/validation/installconfig.go

sadasu · 2021-06-23T12:18:15Z

/test e2e-metal-ipi-ovn-dualstack

stbenjam · 2021-06-24T18:35:54Z

The core change makes sense to me. One nit, could the error message a user gets be a little more clear to say that the first network is the primary one?

If we want to use the bugzilla bot, the drive-by fixes should probably be moved to a different PR. Also: where do we specify the service network needs to be a /112?

aojea · 2021-06-24T22:37:35Z

LGTM

where do we specify the service network needs to be a /112?

this is a kubernetes limitations, is that the question? or if we should document it?

sadasu · 2021-06-28T20:46:30Z

Since this (IPv4 addresses have to specified 1st in a dual-stack install-config) is an artificial limitation we are placing, we should also document this so that the customer gets it right the 1st time.

jianlinliu · 2021-06-29T07:30:33Z

/lgtm
/label qe-approved
/bugzilla cc-qa

verify result:
06-29 14:53:45.307 [INFO] Generating manifests files.....
06-29 14:53:45.307 level=fatal msg=failed to fetch Master Machines: failed to load asset "Install Config": invalid "install-config.yaml" file: [networking: Invalid value: "DualStack": dual-stack IPv4/IPv6 is not supported for this platform, specify only one type of address, networking.machineNetwork: Invalid value: "fc00::/48, 10.0.0.0/16": IPv4 addresses must be listed before IPv6 addresses, networking.serviceNetwork: Invalid value: "fd02::/112, 10.1.0.0/16": IPv4 addresses must be listed before IPv6 addresses, networking.clusterNetwork: Invalid value: "fd01::/48, 10.128.0.0/14": IPv4 addresses must be listed before IPv6 addresses]

openshift-ci · 2021-06-29T07:30:37Z

@jianlinliu: This pull request references Bugzilla bug 1972776, which is valid.

3 validation(s) were run on this bug

bug is open, matching expected state (open)
bug target release (4.9.0) matches configured target release for branch (4.9.0)
bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @jianlinliu

In response to this:

/lgtm
/label qe-approved
/bugzilla cc-qa

verify result:
06-29 14:53:45.307 [INFO] Generating manifests files.....
06-29 14:53:45.307 level=fatal msg=failed to fetch Master Machines: failed to load asset "Install Config": invalid "install-config.yaml" file: [networking: Invalid value: "DualStack": dual-stack IPv4/IPv6 is not supported for this platform, specify only one type of address, networking.machineNetwork: Invalid value: "fc00::/48, 10.0.0.0/16": IPv4 addresses must be listed before IPv6 addresses, networking.serviceNetwork: Invalid value: "fd02::/112, 10.1.0.0/16": IPv4 addresses must be listed before IPv6 addresses, networking.clusterNetwork: Invalid value: "fd01::/48, 10.128.0.0/14": IPv4 addresses must be listed before IPv6 addresses]

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

jianlinliu

/lgtm

Require the serviceNetwork subnet lengths to be in the range accepted by kube-apiserver.

Print the full CIDR (eg "172.30.0.0/16" rather than just "172.30.0.0"), and if there are multiple elements, print them in the order they appear, not sorted, since the order is relevant.

We had code to allow you to not specify an IPv6 machineNetwork value on dual-stack AWS clusters, but this is a no-op since we no longer allow you to configure dual-stack on AWS anyway. And if we support it in the future, it will likely be after improvements to IPv6 support on the AWS end, so this exception may no longer make sense at that point anyway.

danwinship · 2021-07-14T15:55:32Z

Since this (IPv4 addresses have to specified 1st in a dual-stack install-config) is an artificial limitation we are placing, we should also document this so that the customer gets it right the 1st time.

Yes, this is documented. (I can't point you to it because the 4.8 docs aren't up yet, but it's there:

[IMPORTANT]
====
The IPv4 entries must go *before* the IPv6 entries.
====

where do we specify the service network needs to be a /112?

but this isn't...

stbenjam · 2021-07-15T10:27:40Z

/lgtm

stbenjam · 2021-07-15T10:28:43Z

/assign @jhixson74 @patrickdillon

patrickdillon · 2021-07-27T20:11:14Z

/approve

openshift-ci · 2021-07-27T20:11:23Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: patrickdillon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [patrickdillon]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-bot · 2021-07-27T20:17:31Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-ci · 2021-07-27T23:46:07Z

@danwinship: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Rerun command
ci/prow/e2e-aws-workers-rhel7	`cc1d3e9`	link	`/test e2e-aws-workers-rhel7`
ci/prow/e2e-crc	`cc1d3e9`	link	`/test e2e-crc`
ci/prow/e2e-openstack-kuryr	`cc1d3e9`	link	`/test e2e-openstack-kuryr`
ci/prow/e2e-libvirt	`cc1d3e9`	link	`/test e2e-libvirt`
ci/prow/e2e-metal-ipi-ovn-ipv6	`cc1d3e9`	link	`/test e2e-metal-ipi-ovn-ipv6`

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

openshift-bot · 2021-07-28T01:17:41Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-07-28T01:29:39Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-07-28T06:29:38Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-07-28T06:41:39Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-ci · 2021-07-28T06:46:08Z

@danwinship: All pull requests linked via external trackers have merged:

openshift/installer#5005

Bugzilla bug 1972776 has been moved to the MODIFIED state.

In response to this:

Bug 1972776: improve dual-stack install-config validation

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

danwinship · 2021-07-30T14:01:28Z

/cherry-pick release-4.8

openshift-cherrypick-robot · 2021-07-30T14:02:21Z

@danwinship: #5005 failed to apply on top of branch "release-4.8":

Applying: Require dual-stack clusters to be IPv4-primary
Using index info to reconstruct a base tree...
M	pkg/types/validation/installconfig.go
M	pkg/types/validation/installconfig_test.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/types/validation/installconfig_test.go
Auto-merging pkg/types/validation/installconfig.go
CONFLICT (content): Merge conflict in pkg/types/validation/installconfig.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Require dual-stack clusters to be IPv4-primary
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-4.8

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci bot added bugzilla/severity-low Referenced Bugzilla bug's severity is low for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Jun 16, 2021

openshift-ci bot requested review from jhixson74 and patrickdillon June 16, 2021 15:58

danwinship force-pushed the ipv6-validation branch 2 times, most recently from 33c2cc7 to b833aa1 Compare June 18, 2021 12:37

openshift-ci bot assigned aojea Jun 18, 2021

aojea reviewed Jun 18, 2021

View reviewed changes

pkg/types/validation/installconfig.go Show resolved Hide resolved

aojea reviewed Jun 18, 2021

View reviewed changes

pkg/types/validation/installconfig.go Outdated Show resolved Hide resolved

danwinship force-pushed the ipv6-validation branch from b833aa1 to b3914b2 Compare June 18, 2021 20:29

openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Jun 29, 2021

openshift-ci bot requested a review from jianlinliu June 29, 2021 07:30

openshift-ci bot assigned jianlinliu Jun 29, 2021

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 29, 2021

jianlinliu removed their assignment Jun 29, 2021

jianlinliu reviewed Jun 29, 2021

View reviewed changes

openshift-ci bot assigned jianlinliu Jun 29, 2021

jianlinliu removed their assignment Jun 29, 2021

openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 12, 2021

danwinship added 2 commits July 14, 2021 11:23

Require dual-stack clusters to be IPv4-primary

6d7f92c

Validate serviceNetwork subnet length

ac4f82b

Require the serviceNetwork subnet lengths to be in the range accepted by kube-apiserver.

danwinship added 2 commits July 14, 2021 11:23

Fix error reporting with bad IPv4/IPv6 install-config networks

9bd0181

Print the full CIDR (eg "172.30.0.0/16" rather than just "172.30.0.0"), and if there are multiple elements, print them in the order they appear, not sorted, since the order is relevant.

danwinship force-pushed the ipv6-validation branch from b3914b2 to cc1d3e9 Compare July 14, 2021 15:23

openshift-ci bot removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. lgtm Indicates that a PR is ready to be merged. labels Jul 14, 2021

openshift-ci bot assigned stbenjam Jul 15, 2021

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 15, 2021

openshift-ci bot assigned jhixson74 and patrickdillon Jul 15, 2021

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 27, 2021

openshift-merge-robot merged commit 17398dd into openshift:master Jul 28, 2021

danwinship deleted the ipv6-validation branch July 30, 2021 14:01

danwinship mentioned this pull request Jul 30, 2021

Bug 1973421: [4.8] improve dual-stack install-config validation #5114

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug 1972776: improve dual-stack install-config validation #5005

Bug 1972776: improve dual-stack install-config validation #5005

danwinship commented Jun 16, 2021 •

edited

openshift-ci bot commented Jun 16, 2021

danwinship commented Jun 18, 2021

openshift-ci bot commented Jun 18, 2021

sadasu commented Jun 23, 2021

stbenjam commented Jun 24, 2021

aojea commented Jun 24, 2021

sadasu commented Jun 28, 2021

jianlinliu commented Jun 29, 2021

openshift-ci bot commented Jun 29, 2021

jianlinliu left a comment

danwinship commented Jul 14, 2021

stbenjam commented Jul 15, 2021

stbenjam commented Jul 15, 2021

patrickdillon commented Jul 27, 2021

openshift-ci bot commented Jul 27, 2021

openshift-bot commented Jul 27, 2021

openshift-ci bot commented Jul 27, 2021 •

edited

openshift-bot commented Jul 28, 2021

openshift-bot commented Jul 28, 2021

openshift-bot commented Jul 28, 2021

openshift-bot commented Jul 28, 2021

openshift-ci bot commented Jul 28, 2021

danwinship commented Jul 30, 2021

openshift-cherrypick-robot commented Jul 30, 2021

Bug 1972776: improve dual-stack install-config validation #5005

Bug 1972776: improve dual-stack install-config validation #5005

Conversation

danwinship commented Jun 16, 2021 • edited

openshift-ci bot commented Jun 16, 2021

danwinship commented Jun 18, 2021

openshift-ci bot commented Jun 18, 2021

sadasu commented Jun 23, 2021

stbenjam commented Jun 24, 2021

aojea commented Jun 24, 2021

sadasu commented Jun 28, 2021

jianlinliu commented Jun 29, 2021

openshift-ci bot commented Jun 29, 2021

jianlinliu left a comment

Choose a reason for hiding this comment

danwinship commented Jul 14, 2021

stbenjam commented Jul 15, 2021

stbenjam commented Jul 15, 2021

patrickdillon commented Jul 27, 2021

openshift-ci bot commented Jul 27, 2021

openshift-bot commented Jul 27, 2021

openshift-ci bot commented Jul 27, 2021 • edited

openshift-bot commented Jul 28, 2021

openshift-bot commented Jul 28, 2021

openshift-bot commented Jul 28, 2021

openshift-bot commented Jul 28, 2021

openshift-ci bot commented Jul 28, 2021

danwinship commented Jul 30, 2021

openshift-cherrypick-robot commented Jul 30, 2021

danwinship commented Jun 16, 2021 •

edited

openshift-ci bot commented Jul 27, 2021 •

edited