Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 1813012: Remove unused etcd discovery domain #4067

Conversation

retroflexer
Copy link
Contributor

@retroflexer retroflexer commented Aug 18, 2020

etcd has not been using DNS entries since 4.4 and EtcdDiscoveryDomain has been a redundant flag. This PR attempts to get rid of the unused flag.

This is an attempt to get #4024 split into smaller PRs to understand the build errors.

@retroflexer
Copy link
Contributor Author

/retest

@sallyom
Copy link
Contributor

sallyom commented Aug 19, 2020

/test e2e-ovirt

@retroflexer
Copy link
Contributor Author

/retest

1 similar comment
@retroflexer
Copy link
Contributor Author

/retest

@retroflexer
Copy link
Contributor Author

/reetest

@retroflexer
Copy link
Contributor Author

/retest

@cgwalters
Copy link
Member

/approve
Seems sane to me. I think this one is adequately covered by e2e - we're passing on the main clouds. So this one should be good for someone on the installer team to lgtm.

@retroflexer
Copy link
Contributor Author

/test e2e-metal-ipi

@retroflexer
Copy link
Contributor Author

/test e2e-ovirt

@abhinavdahiya
Copy link
Contributor

/test e2e-gcp
/test e2e-azure
/test e2e-aws-upi
/test e2e-metal
/test e2e-vsphere

@abhinavdahiya
Copy link
Contributor

So e2e-metal-ipi, e2e-openstack, e2e-vsphere each one is failing to bootstrap, and all these use hosted-networking. so it is possible something needs to be fixed here.

@cybertron
Copy link
Member

This is failing in the on-prem platforms because we're still using this field to get the cluster domain. For example: https://github.com/openshift/machine-config-operator/blob/9a6ef99e2dbac4c6c00ba890894ed003390f5c0a/templates/common/baremetal/files/baremetal-coredns-corefile.yaml#L8

If there's another place we can get that in MCO then we can certainly switch, but in the meantime the ci failures are legitimate.

@abhinavdahiya
Copy link
Contributor

This is failing in the on-prem platforms because we're still using this field to get the cluster domain. For example: https://github.com/openshift/machine-config-operator/blob/9a6ef99e2dbac4c6c00ba890894ed003390f5c0a/templates/common/baremetal/files/baremetal-coredns-corefile.yaml#L8

If there's another place we can get that in MCO then we can certainly switch, but in the meantime the ci failures are legitimate.

If you look at the DNS cluster config it includes the basedomain for the cluster, https://github.com/openshift/api/blob/e9036946d01a899fa1eacaddaf79310f47dff04f/config/v1/types_dns.go#L24-L31

I seems like that is what you really want..

@retroflexer
Copy link
Contributor Author

/hold

This PR cannot be merged until MCO finds a different way to extract base domain.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 20, 2020
@retroflexer
Copy link
Contributor Author

Ben has a PR on MCO to move away from using EtcdDiscoveryDomain. Thanks, Ben. Once that merges, we can proceed with this PR.
openshift/machine-config-operator#2025

@retroflexer retroflexer force-pushed the remove-unused-etcd-discovery-domain branch from 8011c6b to 768e56a Compare August 26, 2020 14:08
@retroflexer
Copy link
Contributor Author

Since PR #4062 is merged now, I rebased so that this PR can progress.

@retroflexer retroflexer changed the title Remove unused etcd discovery domain Bug 1813012: Remove unused etcd discovery domain Aug 26, 2020
@openshift-ci-robot openshift-ci-robot added bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Aug 26, 2020
@openshift-ci-robot
Copy link
Contributor

@retroflexer: This pull request references Bugzilla bug 1813012, which is valid. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.6.0) matches configured target release for branch (4.6.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

Bug 1813012: Remove unused etcd discovery domain

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot
Copy link
Contributor

@retroflexer: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/prow/e2e-vsphere 8011c6b link /test e2e-vsphere
ci/prow/e2e-aws-upi 8011c6b link /test e2e-aws-upi
ci/prow/e2e-aws-fips 768e56a link /test e2e-aws-fips
ci/prow/e2e-metal-ipi 768e56a link /test e2e-metal-ipi
ci/prow/e2e-openstack 768e56a link /test e2e-openstack
ci/prow/e2e-crc 768e56a link /test e2e-crc
ci/prow/e2e-libvirt 768e56a link /test e2e-libvirt
ci/prow/e2e-aws-workers-rhel7 768e56a link /test e2e-aws-workers-rhel7
ci/prow/e2e-aws 768e56a link /test e2e-aws

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@retroflexer
Copy link
Contributor Author

/test e2e-aws-workers-rhel7

@retroflexer
Copy link
Contributor Author

/test e2e-aws

@retroflexer
Copy link
Contributor Author

@abhinavdahiya could you PTAL and possibly approve?

@staebler
Copy link
Contributor

/test e2e-gcp
/test e2e-azure
/test e2e-aws-upi
/test e2e-metal
/test e2e-vsphere

@staebler
Copy link
Contributor

/approve

@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, staebler

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 28, 2020
@retroflexer
Copy link
Contributor Author

/test e2e-aws-upi

2 similar comments
@retroflexer
Copy link
Contributor Author

/test e2e-aws-upi

@retroflexer
Copy link
Contributor Author

/test e2e-aws-upi

@hexfusion
Copy link
Contributor

not sure why this was not tagged as it has been approved I think its time we get this done.

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 13, 2020
@retroflexer
Copy link
Contributor Author

/retest

@retroflexer
Copy link
Contributor Author

@abhinavdahiya do you see why e2e-aws-upi is failing? The apiserver is not coming up after the bootstrap, but I can't see how any of my changes is causing the failure.

@abhinavdahiya
Copy link
Contributor

abhinavdahiya commented Nov 18, 2020

@abhinavdahiya do you see why e2e-aws-upi is failing? The apiserver is not coming up after the bootstrap, but I can't see how any of my changes is causing the failure.

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_installer/4067/pull-ci-openshift-installer-master-e2e-aws-upi/1328264163889254400/artifacts/e2e-aws-upi/nodes/

we can see that all the instances are waiting to ignite, failing to connect to the bootstrap instance's machine-config-server to get ignition.

https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_installer/4067/pull-ci-openshift-installer-master-e2e-aws-upi/1328264163889254400#1:build-log.txt%3A263

This means that bootstrap host could not be reached to gather bootstrap log bundle, and it looks like bootstrap ec2 instance console logs are not captured.

So i think the bootstrap host is also maybe failing to boot..

@staebler
Copy link
Contributor

This may be a similar issue to the one that was fixed recently with the azure upi tests. The upi tests are still using version 2.1.0 for the ignition config files.

@staebler
Copy link
Contributor

/retest

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

1 similar comment
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot
Copy link
Contributor

openshift-merge-robot commented Nov 19, 2020

@retroflexer: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/prow/e2e-ovirt 24e2573 link /test e2e-ovirt
ci/prow/e2e-aws-upi 24e2573 link /test e2e-aws-upi

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit f6bb837 into openshift:master Nov 19, 2020
@openshift-ci-robot
Copy link
Contributor

@retroflexer: All pull requests linked via external trackers have merged:

Bugzilla bug 1813012 has been moved to the MODIFIED state.

In response to this:

Bug 1813012: Remove unused etcd discovery domain

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@retroflexer retroflexer deleted the remove-unused-etcd-discovery-domain branch November 24, 2020 14:33
staebler added a commit to staebler/installer that referenced this pull request Dec 9, 2020
With openshift#4067, the etcdDiscoveryDomain
field is no longer set in the infrastructure.config.openshift.io manifest.
However, the cluster-network-operator relies on that field when syncing the
status of proxy.config.openshift.io [1].

[1] https://github.com/openshift/cluster-network-operator/blob/c23495cf6e6ffeffc0290c85ee4608102f7b47d1/pkg/util/proxyconfig/no_proxy.go#L113
staebler added a commit to staebler/installer that referenced this pull request Dec 9, 2020
With openshift#4067, the etcdDiscoveryDomain
field is no longer set in the infrastructure.config.openshift.io manifest.
However, the cluster-network-operator relies on that field when syncing the
status of proxy.config.openshift.io [1].

[1] https://github.com/openshift/cluster-network-operator/blob/c23495cf6e6ffeffc0290c85ee4608102f7b47d1/pkg/util/proxyconfig/no_proxy.go#L113
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants