Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get the etcd data path from kubeadm config or etcd pod manifest on kubeadm reset #70003

Merged
merged 2 commits into from
Oct 29, 2018

Conversation

yagonobre
Copy link
Member

@yagonobre yagonobre commented Oct 19, 2018

What type of PR is this?
/kind bug

What this PR does / why we need it:
This clean up the correct etcd data path on kubeadm reset

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes kubernetes/kubeadm#1169

Does this PR introduce a user-facing change?:

Kubeadm reset now clean up custom etcd data path

/sig cluster-lifecycle

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 19, 2018
@neolit123
Copy link
Member

/kind bug
/assign @fabriziopandini

@yagonobre
thank you, could you please add a release note explaining the change in one sentence.

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Oct 19, 2018
@k8s-ci-robot k8s-ci-robot removed the needs-kind Indicates a PR lacks a `kind/foo` label and requires one. label Oct 19, 2018
Copy link
Member

@neolit123 neolit123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yagonobre

thank you for this bug fix PR!
i have added some minor comments but in general - LGTM.

if err == nil {
dirsToClean = append(dirsToClean, etcdDataDir)
} else {
glog.Warningf("[reset] WARNING: invalid etcd manifest, etcd data directorie will not be cleaned")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please change to:

glog.Warningf("[reset] etcd data path will not be cleaned: %v", err)

i'm assuming this is what we are going to get on external etcd?

"k8s.io/kubernetes/pkg/util/initsystem"
utilsexec "k8s.io/utils/exec"
)

const etcdVolumeName = "etcd-data"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please move this constant to the function that uses it.

@@ -175,6 +183,25 @@ func (r *Reset) Run(out io.Writer) error {
return nil
}

func getEtcdDataDir(manifestPath string) (string, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM, using the podspec volume.


dataDir, err := getEtcdDataDir(manifestPath)
if (err != nil) != test.expectErr {
t.Errorf(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

t.Fatalf()?

dataDir, err := getEtcdDataDir(manifestPath)
if (err != nil) != test.expectErr {
t.Errorf(
"getEtcdDataDir failed\n%s\nexpected error: %t\n\tgot: %t\nerror: %v",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use https://github.com/lithammer/dedent
we already vendor it in kubeadm.

}

if dataDir != test.dataDir {
t.Errorf("getEtcdDataDir failed\n%s\n\texpected: %s\ngot: %s", name, test.dataDir, dataDir)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should use t.Fatalf() + dedent here too.

if test.writeManifest {
err := ioutil.WriteFile(manifestPath, []byte(test.podYaml), 0644)
if err != nil {
t.Fatalf("failed to write pod manifest\n%s\n\tfatal error: %v", name, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use https://github.com/lithammer/dedent
we already vendor it in kubeadm.

@yagonobre
Copy link
Member Author

@neolit123 updated

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Oct 19, 2018
Copy link
Member

@fabriziopandini fabriziopandini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yagonobre thanks for this PR!

My first impression reading this PR is that the etcd folder detection process should initially try to read the kubeadm config and only as a fallback parse the etcd manifest. This could have several advantages, especially if we consider all the variants of etcd we supports (external etcd Vs local etcd Vs stacked etcd)
What do you think about?

@timothysc @detiber opinions?

@yagonobre
Copy link
Member Author

yagonobre commented Oct 19, 2018

@fabriziopandini This will just apply when used local/stacked etcd and both use the same spec. But where I can get the kubeadm config? in the configmap?

@fabriziopandini
Copy link
Member

@yagonobre
Yes, kubeadm config is a config map under kube-system namespace and we are using it as a "source of truth" about kubeadm clusters. Utilities for working with this config map are in util\config (even if some refactoring is still ongoing).

However, as suggested in my previous I would appreciate some other opinion about this and then decide if to iterate on this PR or if to leave this PR as it is (+ eventually open an issue for keeping track of the possible improvement with kubeadm config map)

@timothysc
Copy link
Member

My first impression reading this PR is that the etcd folder detection process should initially try to read the kubeadm config and only as a fallback parse the etcd manifest. This could have several advantages, especially if we consider all the variants of etcd we supports (external etcd Vs local etcd Vs stacked etcd)
What do you think about?

agreed

/assign @timothysc

@yagonobre yagonobre changed the title Get the etcd data path from etcd pod manifest on kubeadm reset Get the etcd data path from kubeadm config or etcd pod manifest on kubeadm reset Oct 25, 2018
@yagonobre
Copy link
Member Author

@fabriziopandini @timothysc updated!

@fabriziopandini
Copy link
Member

/approve
/ok-to-test

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 26, 2018
@yagonobre
Copy link
Member Author

/retest

1 similar comment
@yagonobre
Copy link
Member Author

/retest

@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Oct 26, 2018
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Oct 26, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fabriziopandini, yagonobre

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Oct 26, 2018
@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Oct 26, 2018

@yagonobre: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-kubernetes-local-e2e-containerized 8948828b8827bcdeea6b33cbde9f1fb9a349d240 link /test pull-kubernetes-local-e2e-containerized

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@yagonobre
Copy link
Member Author

/retest

@fabriziopandini
Copy link
Member

@yagonobre it seems you have to run /hack/update-bazel.sh to fix your test

@yagonobre
Copy link
Member Author

@fabriziopandini done

@yagonobre
Copy link
Member Author

/remove-area kubelet
/remove-area kubectl
/remove-kind documentation
/remove-sig cli
/remove-sig node
/remove-sig storage
/remove-sig testing

@k8s-ci-robot k8s-ci-robot removed area/kubelet area/kubectl kind/documentation Categorizes issue or PR as related to documentation. sig/cli Categorizes an issue or PR as relevant to SIG CLI. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/storage Categorizes an issue or PR as relevant to SIG Storage. sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Oct 26, 2018
@neolit123
Copy link
Member

neolit123 commented Oct 28, 2018

@yagonobre please squash the commits to 1 or 2 (one for code, one for autogenerated bazel).
thanks.

@yagonobre
Copy link
Member Author

@neolit123 done

@neolit123
Copy link
Member

LGTM
waiting on CI and further comments, if any.
@kubernetes/sig-cluster-lifecycle-pr-reviews

Copy link
Member

@timothysc timothysc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx for the patch!

Comments for next time, and they are non-blocking

/lgtm

@@ -175,6 +185,33 @@ func (r *Reset) Run(out io.Writer) error {
return nil
}

func getEtcdDataDir(manifestPath string, client clientset.Interface) (string, error) {
const etcdVolumeName = "etcd-data"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const for this really shouldn't be localized.


etcdPod, err := utilstaticpod.ReadStaticPodFromDisk(manifestPath)
if err != nil {
return "", err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's weird to return empty strings vs. nil objects.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 29, 2018
@k8s-ci-robot k8s-ci-robot merged commit ab61dcd into kubernetes:master Oct 29, 2018
@yagonobre yagonobre deleted the kubeadm-clean-etcd-dir branch November 4, 2018 02:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubeadm cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kubeadm reset doesn't clean up custom etcd datadir
5 participants