Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 1972776: improve dual-stack install-config validation #5005

Merged
merged 4 commits into from Jul 28, 2021

Conversation

danwinship
Copy link
Contributor

@danwinship danwinship commented Jun 16, 2021

tl;dr: require dual-stack install-configs to be "IPv4-primary". To be backported to 4.8.

At some point in early 4.7 we noticed that "IPv6-primary" dual-stack configurations (eg, where the first value for clusterNetworks/serviceNetworks is IPv6 rather than IPv4) didn't work right due to cluster-etcd-operator issues. I think that specific bug was fixed, but other problems have popped up, and more importantly, we aren't testing IPv6-primary installation anywhere, and dev-scripts actually can't bring up an IPv6-primary cluster currently. Someone just pointed out that a customer had tried this with a 4.8 fc release and was running into weird problems, so let's just block it.

(eg, see https://coreos.slack.com/archives/CQQ7P8UTT/p1623765189017300, https://coreos.slack.com/archives/CQQ7P8UTT/p1624015997022800)

The second commit is another fail-early-on-bad-config fix; it makes us reject values for serviceNetwork that will later be rejected by kube-apiserver.

The third and fourth commits are drive-by fixes, and don't necessarily have to be backported to 4.8.

@openshift-ci openshift-ci bot added bugzilla/severity-low Referenced Bugzilla bug's severity is low for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Jun 16, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 16, 2021

@danwinship: This pull request references Bugzilla bug 1972776, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.9.0) matches configured target release for branch (4.9.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

No GitHub users were found matching the public email listed for the QA contact in Bugzilla (gpei@redhat.com), skipping review request.

In response to this:

Bug 1972776: improve dual-stack install-config validation

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@danwinship danwinship force-pushed the ipv6-validation branch 2 times, most recently from 33c2cc7 to b833aa1 Compare June 18, 2021 12:37
@danwinship
Copy link
Contributor Author

/assign @aojea

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 18, 2021

@danwinship: This pull request references Bugzilla bug 1972776, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.9.0) matches configured target release for branch (4.9.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

No GitHub users were found matching the public email listed for the QA contact in Bugzilla (gpei@redhat.com), skipping review request.

In response to this:

Bug 1972776: improve dual-stack install-config validation

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sadasu
Copy link
Contributor

sadasu commented Jun 23, 2021

/test e2e-metal-ipi-ovn-dualstack

@stbenjam
Copy link
Member

The core change makes sense to me. One nit, could the error message a user gets be a little more clear to say that the first network is the primary one?

If we want to use the bugzilla bot, the drive-by fixes should probably be moved to a different PR. Also: where do we specify the service network needs to be a /112?

@aojea
Copy link

aojea commented Jun 24, 2021

LGTM

where do we specify the service network needs to be a /112?

this is a kubernetes limitations, is that the question? or if we should document it?

@sadasu
Copy link
Contributor

sadasu commented Jun 28, 2021

Since this (IPv4 addresses have to specified 1st in a dual-stack install-config) is an artificial limitation we are placing, we should also document this so that the customer gets it right the 1st time.

@jianlinliu
Copy link
Contributor

/lgtm
/label qe-approved
/bugzilla cc-qa

verify result:
06-29 14:53:45.307 [INFO] Generating manifests files.....
06-29 14:53:45.307 level=fatal msg=failed to fetch Master Machines: failed to load asset "Install Config": invalid "install-config.yaml" file: [networking: Invalid value: "DualStack": dual-stack IPv4/IPv6 is not supported for this platform, specify only one type of address, networking.machineNetwork: Invalid value: "fc00::/48, 10.0.0.0/16": IPv4 addresses must be listed before IPv6 addresses, networking.serviceNetwork: Invalid value: "fd02::/112, 10.1.0.0/16": IPv4 addresses must be listed before IPv6 addresses, networking.clusterNetwork: Invalid value: "fd01::/48, 10.128.0.0/14": IPv4 addresses must be listed before IPv6 addresses]

@openshift-ci openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Jun 29, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 29, 2021

@jianlinliu: This pull request references Bugzilla bug 1972776, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.9.0) matches configured target release for branch (4.9.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @jianlinliu

In response to this:

/lgtm
/label qe-approved
/bugzilla cc-qa

verify result:
06-29 14:53:45.307 [INFO] Generating manifests files.....
06-29 14:53:45.307 level=fatal msg=failed to fetch Master Machines: failed to load asset "Install Config": invalid "install-config.yaml" file: [networking: Invalid value: "DualStack": dual-stack IPv4/IPv6 is not supported for this platform, specify only one type of address, networking.machineNetwork: Invalid value: "fc00::/48, 10.0.0.0/16": IPv4 addresses must be listed before IPv6 addresses, networking.serviceNetwork: Invalid value: "fd02::/112, 10.1.0.0/16": IPv4 addresses must be listed before IPv6 addresses, networking.clusterNetwork: Invalid value: "fd01::/48, 10.128.0.0/14": IPv4 addresses must be listed before IPv6 addresses]

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot requested a review from jianlinliu June 29, 2021 07:30
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 29, 2021
@jianlinliu jianlinliu removed their assignment Jun 29, 2021
Copy link
Contributor

@jianlinliu jianlinliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@jianlinliu jianlinliu removed their assignment Jun 29, 2021
@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 12, 2021
Require the serviceNetwork subnet lengths to be in the range accepted
by kube-apiserver.
Print the full CIDR (eg "172.30.0.0/16" rather than just
"172.30.0.0"), and if there are multiple elements, print them in the
order they appear, not sorted, since the order is relevant.
We had code to allow you to not specify an IPv6 machineNetwork value
on dual-stack AWS clusters, but this is a no-op since we no longer
allow you to configure dual-stack on AWS anyway. And if we support it
in the future, it will likely be after improvements to IPv6 support on
the AWS end, so this exception may no longer make sense at that point
anyway.
@openshift-ci openshift-ci bot removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. lgtm Indicates that a PR is ready to be merged. labels Jul 14, 2021
@danwinship
Copy link
Contributor Author

Since this (IPv4 addresses have to specified 1st in a dual-stack install-config) is an artificial limitation we are placing, we should also document this so that the customer gets it right the 1st time.

Yes, this is documented. (I can't point you to it because the 4.8 docs aren't up yet, but it's there:

[IMPORTANT]
====
The IPv4 entries must go *before* the IPv6 entries.
====

where do we specify the service network needs to be a /112?

but this isn't...

@stbenjam
Copy link
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 15, 2021
@stbenjam
Copy link
Member

/assign @jhixson74 @patrickdillon

@patrickdillon
Copy link
Contributor

/approve

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jul 27, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: patrickdillon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 27, 2021
@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jul 27, 2021

@danwinship: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Rerun command
ci/prow/e2e-aws-workers-rhel7 cc1d3e9 link /test e2e-aws-workers-rhel7
ci/prow/e2e-crc cc1d3e9 link /test e2e-crc
ci/prow/e2e-openstack-kuryr cc1d3e9 link /test e2e-openstack-kuryr
ci/prow/e2e-libvirt cc1d3e9 link /test e2e-libvirt
ci/prow/e2e-metal-ipi-ovn-ipv6 cc1d3e9 link /test e2e-metal-ipi-ovn-ipv6

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

3 similar comments
@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 17398dd into openshift:master Jul 28, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jul 28, 2021

@danwinship: All pull requests linked via external trackers have merged:

Bugzilla bug 1972776 has been moved to the MODIFIED state.

In response to this:

Bug 1972776: improve dual-stack install-config validation

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@danwinship
Copy link
Contributor Author

/cherry-pick release-4.8

@danwinship danwinship deleted the ipv6-validation branch July 30, 2021 14:01
@openshift-cherrypick-robot

@danwinship: #5005 failed to apply on top of branch "release-4.8":

Applying: Require dual-stack clusters to be IPv4-primary
Using index info to reconstruct a base tree...
M	pkg/types/validation/installconfig.go
M	pkg/types/validation/installconfig_test.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/types/validation/installconfig_test.go
Auto-merging pkg/types/validation/installconfig.go
CONFLICT (content): Merge conflict in pkg/types/validation/installconfig.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Require dual-stack clusters to be IPv4-primary
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-4.8

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-low Referenced Bugzilla bug's severity is low for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged. qe-approved Signifies that QE has signed off on this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants