Retain CoreDNS corefile when migration fails in kubeadm #84523

Merged
merged 2 commits into kubernetes:master on Nov 8, 2019

Conversation

rajansandeep
Contributor

What type of PR is this?

/kind bug

What this PR does / why we need it:

Currently, when a user chooses to skip the preflight check of CoreDNS corefile migration, the kubeadm upgrade fails. This fix bypasses the migration step and retains the existing Corefile in case there is a preflight error and the user chooses to skip.
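
The intended behaviour can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the kubeadm implementation: `migrateCorefile` and `corefileForUpgrade` are hypothetical names, the `canMigrate` flag simply simulates a migration failure, and the standard `log` package stands in for klog.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// migrateCorefile is a hypothetical stand-in for the real migration call.
// It fails when canMigrate is false, simulating a Corefile that the
// migration tool cannot handle.
func migrateCorefile(corefile string, canMigrate bool) (string, error) {
	if !canMigrate {
		return "", errors.New("unsupported plugin in Corefile")
	}
	return corefile + " # migrated", nil
}

// corefileForUpgrade returns the Corefile to deploy. If migration fails
// (for example after the user chose to ignore the failing preflight check),
// it logs a warning and retains the existing Corefile instead of aborting.
func corefileForUpgrade(existing string, canMigrate bool) string {
	migrated, err := migrateCorefile(existing, canMigrate)
	if err != nil {
		log.Printf("warning: the CoreDNS configuration was not migrated: %v; the existing Corefile has been retained", err)
		return existing
	}
	return migrated
}

func main() {
	corefile := ".:53 { forward . /etc/resolv.conf }"
	fmt.Println(corefileForUpgrade(corefile, true))
	fmt.Println(corefileForUpgrade(corefile, false))
}
```

The key design point is that a migration error becomes a logged warning rather than a returned error, so the rest of the upgrade can continue.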

Which issue(s) this PR fixes:

Fixes #84326

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 29, 2019
@rajansandeep
Contributor Author

/cc @chrisohaver @neolit123

@k8s-ci-robot k8s-ci-robot added area/kubeadm sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 29, 2019
-		return err
+		// Errors in Corefile Migration is verified during preflight checks. This part will be executed should a user choose
+		// to ignore preflight check errors.
+		klog.V(2).Infof("the Corefile Migration did not occur due to an error: %v. The existing CoreDNS Corefile configuration has been retained.", err)
Member

probably better to have this as a klog.Warning(

Member

@neolit123 neolit123 Oct 29, 2019

a user choose -> a user chose
chose in past tense here?

@@ -108,7 +108,7 @@ func checkMigration(client clientset.Interface) error {

 	_, err = migration.Migrate(currentInstalledCoreDNSversion, kubeadmconstants.CoreDNSVersion, corefile, false)
 	if err != nil {
-		return err
+		return errors.Errorf("the CoreDNS configuration migration will not occur due to an error: %v", err)
Member

should be errors.Wrap(err, ....)

@neolit123
Member

thanks @rajansandeep
just noting, i think this is the first time we add logic in kubeadm where ignoring a preflight check propagates into areas outside of preflight. but it feels fine to me.

/priority important-longterm
/approve
leaving LGTM to the folks on CC.

@k8s-ci-robot k8s-ci-robot added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Oct 29, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: neolit123, rajansandeep

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 29, 2019
@neolit123
Member

/hold
for the minor nits.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 29, 2019
Contributor

@chrisohaver chrisohaver left a comment

wording tweaks, and spell out possible repercussions of not migrating the coredns config

@@ -108,7 +108,7 @@ func checkMigration(client clientset.Interface) error {

 	_, err = migration.Migrate(currentInstalledCoreDNSversion, kubeadmconstants.CoreDNSVersion, corefile, false)
 	if err != nil {
-		return err
+		return errors.Errorf("the CoreDNS configuration migration will not occur due to an error: %v", err)
Contributor

Suggested change
-		return errors.Errorf("the CoreDNS configuration migration will not occur due to an error: %v", err)
+		return errors.Errorf("the CoreDNS configuration will not be migrated, and may be incompatible with the upgraded version of CoreDNS: %v", err)

-		return err
+		// Errors in Corefile Migration is verified during preflight checks. This part will be executed should a user choose
+		// to ignore preflight check errors.
+		klog.V(2).Infof("the Corefile Migration did not occur due to an error: %v. The existing CoreDNS Corefile configuration has been retained.", err)
Contributor

Suggested change
-		klog.V(2).Infof("the Corefile Migration did not occur due to an error: %v. The existing CoreDNS Corefile configuration has been retained.", err)
+		klog.V(2).Infof("the CoreDNS Configuration was not migrated: %v. The existing CoreDNS Corefile configuration has been retained.", err)

@chrisohaver
Contributor

chrisohaver commented Oct 29, 2019

IMO, this fix isn't going to solve custom CoreDNS problems alone.

It needs to be documented somewhere that users with custom CoreDNS need to use upgrade phases instead, skipping the CoreDNS upgrade, and doing the CoreDNS upgrade manually (and manually migrating CoreDNS config if necessary).
If a user of a custom CoreDNS upgrades using the normal kubeadm upgrade process, and ignores the warning, then their custom CoreDNS will get replaced with a stock version of CoreDNS. In most cases this would result in a non-functioning DNS after the upgrade finishes. The most common reason for using a custom CoreDNS compile is to use external plugins, and an external plugin present in a Corefile will cause a stock CoreDNS to exit before starting.

Perhaps we could succinctly explain this in the preflight warning...

"... clusters with CoreDNS deployments should use upgrade phases to skip CoreDNS upgrade ..."

@neolit123
Member

"... clusters with CoreDNS deployments should use upgrade phases to skip CoreDNS upgrade ..."

adding something like that SGTM, we can do that once phase support for upgrade apply is added:
kubernetes/kubeadm#1318

@chrisohaver
Contributor

i think this is the first time we add logic in kubeadm where ignoring a preflight check propagates into areas outside of preflight. but it feels fine to me.

I don't think this change actually does that.

@neolit123
Member

I don't think this change actually does that.

my mistake, it does not.

@chrisohaver
Contributor

Actually, now i'm thinking it makes more sense to retain both the original corefile AND the original coredns deployment when the migration fails for any reason.

That changes the preflight warning to something along the lines of: "CoreDNS will not be upgraded: (error)"

@neolit123
Member

@rajansandeep WDYT about @chrisohaver 's last comment?
it seems fine to me.

@rajansandeep
Contributor Author

Actually, now i'm thinking it makes more sense to retain both the original corefile AND the original coredns deployment when the migration fails for any reason.
That changes the preflight warning to something along the lines of: "CoreDNS will not be upgraded: (error)"

@neolit123 Yes, I'll push out a commit to make the changes @chrisohaver suggested.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Nov 6, 2019
@@ -108,7 +108,7 @@ func checkMigration(client clientset.Interface) error {

 	_, err = migration.Migrate(currentInstalledCoreDNSversion, kubeadmconstants.CoreDNSVersion, corefile, false)
 	if err != nil {
-		return errors.Wrap(err, "the CoreDNS configuration will not be migrated, and may be incompatible with the upgraded version of CoreDNS")
+		return errors.Wrap(err, "CoreDNS will not be upgraded")
Contributor

I think we should include the error in the message.

Never mind - it's already doing that. didn't notice the "Wrap"

@neolit123
Member

SGTM, thanks for the updates.
the PR is approved.
leaving LGTM to @chrisohaver

@chrisohaver
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 7, 2019
@rajansandeep
Contributor Author

/retest

@rajansandeep
Contributor Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 8, 2019
@k8s-ci-robot k8s-ci-robot merged commit ab1b374 into kubernetes:master Nov 8, 2019
@k8s-ci-robot k8s-ci-robot added this to the v1.17 milestone Nov 8, 2019
k8s-ci-robot added a commit that referenced this pull request Dec 2, 2019
…#84523-upstream-release-1.16

Automated cherry pick of #84523: retain corefile when migration fails
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubeadm cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note-none Denotes a PR that doesn't merit a release note. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Upgrade to 1.16.x failed if CoreDNS was previously deployed with self-built image
4 participants