kubeadm: fix the bug that 'kubeadm init --dry-run --upload-certs' command failed with 'secret not found' error #108002

Merged (1 commit) on Feb 9, 2022


@SataQiu SataQiu commented Feb 8, 2022

What type of PR is this?

/kind bug

What this PR does / why we need it:

kubeadm: fix the bug that kubeadm init --dry-run --upload-certs command failed with secret not found error

Which issue(s) this PR fixes:

Fixes kubernetes/kubeadm#2649

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Feb 8, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: SataQiu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 8, 2022
@k8s-ci-robot k8s-ci-robot added area/kubeadm sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 8, 2022
@SataQiu
Member Author

SataQiu commented Feb 8, 2022

/test pull-kubernetes-e2e-kind-ipv6

Member

@neolit123 neolit123 left a comment

thanks for the PR @SataQiu

} else {
	secret, err = client.CoreV1().Secrets(metav1.NamespaceSystem).Get(context.TODO(), secretName, metav1.GetOptions{})
	if err != nil {
		return nil, errors.Wrap(err, "error to get token reference")

we can fix this typo

Suggested change
- return nil, errors.Wrap(err, "error to get token reference")
+ return nil, errors.Wrap(err, "error getting token reference")

Comment on lines 169 to 178
if isDryRun {
	// If dry-running, just generate a fake bootstrap token Secret instead of doing a real query
	secret = &v1.Secret{
		ObjectMeta: metav1.ObjectMeta{
			Namespace: metav1.NamespaceSystem,
			Name:      secretName,
			UID:       uuid.NewUUID(),
		},
	}
} else {

@neolit123 neolit123 Feb 8, 2022

i think the better way to do this is modifying the "init getter".
this will result in no need to pass the isDryRun bool to functions in the stack.

here https://github.com/kubernetes/kubernetes/blob/f104ae885fb100278d9cbf2ec3a15b11f465872c/cmd/kubeadm/app/cmd/init.go#L533-L534
kubeadm constructs an init getter and dry run client.

inside the init getter there are some handlers:

idr.handleKubernetesService,
idr.handleGetNode,
idr.handleSystemNodesClusterRoleBinding,
idr.handleGetBootstrapToken,

we can add a new handler for resource type "secret" called something like handleCopyCertsSecret
and in the handler we can make sure the secret name matches the prefix as defined here:

func BootstrapTokenSecretName(tokenID string) string {
	return fmt.Sprintf("%s%s", api.BootstrapTokenSecretPrefix, tokenID)
}

and return a usable / blank secret object for dryrun.

i have not tested, but it should work..
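
The suggestion above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code that was merged: `getAction` and `secret` are simplified stand-ins for client-go's `core.GetAction` and `v1.Secret`, and `handleCopyCertsSecret` is the hypothetical handler name floated in the comment.

```go
package main

import (
	"fmt"
	"strings"
)

// getAction and secret are simplified stand-ins (assumptions for this sketch)
// for the client-go testing GetAction and the core v1 Secret types.
type getAction struct {
	Namespace string
	Resource  string
	Name      string
}

type secret struct {
	Namespace string
	Name      string
}

const (
	namespaceSystem            = "kube-system"
	bootstrapTokenSecretPrefix = "bootstrap-token-"
)

// handleCopyCertsSecret is a hypothetical dry-run handler. It reports whether it
// handled the GET, and if so returns a blank Secret carrying the requested name.
func handleCopyCertsSecret(a getAction) (bool, *secret, error) {
	if a.Resource != "secrets" || a.Namespace != namespaceSystem ||
		!strings.HasPrefix(a.Name, bootstrapTokenSecretPrefix) {
		// Not a bootstrap token Secret lookup; let another handler try.
		return false, nil, nil
	}
	return true, &secret{Namespace: a.Namespace, Name: a.Name}, nil
}

func main() {
	handled, s, err := handleCopyCertsSecret(getAction{
		Namespace: namespaceSystem,
		Resource:  "secrets",
		Name:      bootstrapTokenSecretPrefix + "abcdef",
	})
	fmt.Println(handled, s, err)
}
```

Note that such a handler would match the same Secret names as the existing `handleGetBootstrapToken`, so the order in which handlers run would decide which response a GET receives.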

Member Author

Thank you for your patient explanation @neolit123

But according to the following code:

// handleGetBootstrapToken handles the case where kubeadm init creates the default token; and the token code GETs the
// bootstrap token secret first in order to check if it already exists
func (idr *InitDryRunGetter) handleGetBootstrapToken(action core.GetAction) (bool, runtime.Object, error) {
	if !strings.HasPrefix(action.GetName(), "bootstrap-token-") || action.GetNamespace() != metav1.NamespaceSystem || action.GetResource().Resource != "secrets" {
		// We can't handle this event
		return false, nil, nil
	}
	// We can safely return a NotFound error here as the code will just proceed normally and create the Bootstrap Token
	return true, nil, apierrors.NewNotFound(action.GetResource().GroupResource(), "secret not found")
}

It looks like the handler will always return a NotFound error whether or not the bootstrap token Secret has already been created. It's hard to decide when to return the NotFound error and when to return the fake Secret.

And we never actually store objects to the backend, because of:

&core.SimpleReactor{
	Verb:     "create",
	Resource: "*",
	Reaction: successfulModificationReactorFunc,
},

// successfulModificationReactorFunc is a no-op that just returns the POSTed/PUTed value if present; but does nothing to edit any backing data store.
func successfulModificationReactorFunc(action core.Action) (bool, runtime.Object, error) {
	objAction, ok := action.(actionWithObject)
	if ok {
		return true, objAction.GetObject(), nil
	}
	return true, nil, nil
}

Another approach might be to allow the bootstrap token Secret to be stored on the backend and then remove the handleGetBootstrapToken handler.
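
The difference between the two behaviours discussed here can be shown with a toy model. This is only a sketch: `dryRunClient`, its `persist` flag, and `errNotFound` are invented for illustration and do not correspond to kubeadm's actual dry-run client types.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for the API server's NotFound error (assumption).
var errNotFound = errors.New("secret not found")

// dryRunClient is a toy stand-in for a dry-run client. When persist is false,
// Create echoes the object back without storing it, like a no-op reactor.
type dryRunClient struct {
	store   map[string]string
	persist bool
}

func newDryRunClient(persist bool) *dryRunClient {
	return &dryRunClient{store: map[string]string{}, persist: persist}
}

// Create returns the submitted object and stores it only when persist is set.
func (c *dryRunClient) Create(name, data string) string {
	if c.persist {
		c.store[name] = data
	}
	return data
}

// Get returns the stored object, or errNotFound if nothing was stored.
func (c *dryRunClient) Get(name string) (string, error) {
	if v, ok := c.store[name]; ok {
		return v, nil
	}
	return "", errNotFound
}

func main() {
	for _, persist := range []bool{false, true} {
		c := newDryRunClient(persist)
		c.Create("bootstrap-token-abcdef", "token-data")
		_, err := c.Get("bootstrap-token-abcdef")
		fmt.Printf("persist=%v, error from Get after Create: %v\n", persist, err)
	}
}
```

With a no-op create, a later read of the same Secret cannot succeed, which mirrors the failure this PR addresses; storing the object on create lets the later read find it.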

Member Author

I tried the above approach. It also works. @neolit123

@SataQiu
Member Author

SataQiu commented Feb 9, 2022

/test pull-kubernetes-e2e-kind

Member

@neolit123 neolit123 left a comment

/lgtm
i tested it locally and this does fix the problem. thanks.

it seems that it happens if users do kubeadm init --dry-run --upload-certs as well.
that is because --upload-certs triggers an optional code path.

i've logged this, because we need e2e tests for dry-run in kubeadm:
kubernetes/kubeadm#2653

in terms of the handlers in the dry-run client, i must admit i'm not very familiar with this area of the code, but i think they are a bit strange. maybe we can remove the handlers entirely...
for a dry-run client, i would expect it to always return some fake object...or allow an object to be pushed to the fake storage.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 9, 2022
@k8s-triage-robot

The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass.

This bot retests PRs for certain kubernetes repos according to the following rules:

  • The PR does not have any do-not-merge/* labels
  • The PR does not have the needs-ok-to-test label
  • The PR is mergeable (does not have a needs-rebase label)
  • The PR is approved (has cncf-cla: yes, lgtm, approved labels)
  • The PR is failing tests required for merge

You can:

/retest

@cpanato
Member

cpanato commented Feb 27, 2022

/triage accepted
/priority important-soon

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Feb 27, 2022
k8s-ci-robot added a commit that referenced this pull request Feb 27, 2022
…002-upstream-release-1.23

Automated cherry pick of #108002: kubeadm: fix the bug that 'kubeadm init --dry-run
k8s-ci-robot added a commit that referenced this pull request Feb 27, 2022
…002-upstream-release-1.21

Automated cherry pick of #108002: kubeadm: fix the bug that 'kubeadm init --dry-run
k8s-ci-robot added a commit that referenced this pull request Feb 27, 2022
…002-upstream-release-1.22-1644916160

Automated cherry pick of #108002: kubeadm: fix the bug that 'kubeadm init --dry-run
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubeadm cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note-none Denotes a PR that doesn't merit a release note. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Development

Successfully merging this pull request may close these issues.

kubeadm init --dry-run --upload cert fails in upload-certs phase with 'secret not found'
5 participants