Automated cherry pick of #73288 to release-1.11: Decouple node-problem-detector release from kubernetes #75518

@wangzhen127
Member

commented Mar 20, 2019

Cherry pick of #73288 on release-1.11.

#73288: Decouple node-problem-detector release from kubernetes

Also manually cherry picked the test fixes from #74922 and #75063.

Part of kubernetes/node-problem-detector#236.

Does this PR introduce a user-facing change?:

Node-Problem-Detector configuration is now decoupled from the Kubernetes release on GKE/GCE.

Manual cluster test:
With this PR applied, run the following command, which uses an older NPD version together with a customized release path and flags. release-1.11 currently uses NPD v0.6.0; the test changes it to v0.5.0.

NODE_PROBLEM_DETECTOR_VERSION="v0.5.0" \
NODE_PROBLEM_DETECTOR_TAR_HASH="650ecfb2ae495175ee43706d0bd862a1ea7f1395" \
NODE_PROBLEM_DETECTOR_RELEASE_PATH=https://storage.googleapis.com/zhenw-gke-dev-public \
NODE_PROBLEM_DETECTOR_CUSTOM_FLAGS="--v=2 --logtostderr --system-log-monitors=/home/kubernetes/node-problem-detector/config/kernel-monitor.json --custom-plugin-monitors=/home/kubernetes/node-problem-detector/config/kernel-monitor-counter.json --port=20256" \
kubetest --build --up \
--provider=gce \
--gcp-project=zhenw-gke-dev \
--gcp-zone=us-central1-c
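
When pointing NODE_PROBLEM_DETECTOR_RELEASE_PATH at a custom bucket, it can help to check the tarball against NODE_PROBLEM_DETECTOR_TAR_HASH locally before bringing a cluster up. The following is only a minimal sketch: it assumes the hash is a SHA-1 digest (it is 40 hex characters) and that `sha1sum` is installed. The function name `verify_sha1` is hypothetical and is not part of this PR.

```shell
# verify_sha1 FILE EXPECTED_HEX
# Hypothetical helper (not from the PR): compares the SHA-1 of FILE
# with EXPECTED_HEX and reports the result.
verify_sha1() {
  file="$1"
  expected="$2"
  # Reading from stdin makes sha1sum print "<digest>  -"; keep the digest only.
  actual=$(sha1sum < "$file" | cut -d' ' -f1)
  if [ "$actual" = "$expected" ]; then
    echo "OK"
  else
    echo "MISMATCH: expected $expected, got $actual"
    return 1
  fi
}
```

For example, `verify_sha1 /path/to/npd-tarball "$NODE_PROBLEM_DETECTOR_TAR_HASH"`, where `/path/to/npd-tarball` is a placeholder for wherever the tarball was downloaded.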

After the cluster is up, manually verify the following on a node:

  1. kube-env of the VM instance: environment variables are correctly set.
  2. sudo journalctl -u kube-node-installation.service: correct NPD version is downloaded.
  3. sudo cat /etc/systemd/system/node-problem-detector.service: NPD flags are set correctly.
  4. sudo journalctl -u node-problem-detector.service: NPD is running fine.
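
Step 3 above can be partly scripted. This is a rough sketch, not something from this PR: the helper `check_npd_flags` is a hypothetical name, and it only does a fixed-string search of the unit file for each expected flag, so it does not confirm that the flags sit on the right line.

```shell
# check_npd_flags UNIT_FILE FLAG...
# Hypothetical helper (not from the PR): prints every expected flag that
# does not appear in UNIT_FILE and returns non-zero if any is missing.
check_npd_flags() {
  unit_file="$1"
  shift
  missing=0
  for flag in "$@"; do
    # -F: match the flag literally; --: stop grep from parsing it as an option.
    if ! grep -qF -- "$flag" "$unit_file"; then
      echo "missing: $flag"
      missing=1
    fi
  done
  return "$missing"
}
```

On a node this might be run as `sudo sh -c '. ./check.sh; check_npd_flags /etc/systemd/system/node-problem-detector.service --v=2 --logtostderr --port=20256'`, where `check.sh` is an assumed file holding the function.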

Node E2E test:
In the node E2E tests, the default NPD version is now v0.6.2. With this PR applied, run the following command using a different NPD version.

make test-e2e-node FOCUS="NodeProblemDetector" SKIP="" REMOTE=true CLEANUP=true DELETE_INSTANCES=true EXTRA_ENVS="NODE_PROBLEM_DETECTOR_IMAGE=k8s.gcr.io/node-problem-detector:v0.5.0"
  1. Verify the tests pass.
  2. Verify in the log that NPD v0.5.0 is used (instead of v0.6.2), which shows that the NPD version can be customized in node e2e tests.

@k8s-ci-robot k8s-ci-robot requested review from bowei and eparis Mar 20, 2019

@wangzhen127 wangzhen127 force-pushed the wangzhen127:automated-cherry-pick-of-#73288-upstream-release-1.11 branch 2 times, most recently from 3ff2f22 to 8372f22 Mar 20, 2019

@wangzhen127

Member Author

commented Mar 20, 2019

/sig node
/sig testing
/kind cleanup
/priority important-soon

@wangzhen127 wangzhen127 changed the title from "Automated cherry pick of #73288: allows configuring NPD release and flags on GCI and add" to "Automated cherry pick of #73288 to release-1.11: allows configuring NPD release and flags on GCI and add" Mar 20, 2019

@wangzhen127

Member Author

commented Mar 21, 2019

/assign @Random-Liu
Will fix the failure and update shortly.

@wangzhen127 wangzhen127 force-pushed the wangzhen127:automated-cherry-pick-of-#73288-upstream-release-1.11 branch 3 times, most recently from 8a2c4cc to 11b4c01 Mar 22, 2019

@wangzhen127 wangzhen127 force-pushed the wangzhen127:automated-cherry-pick-of-#73288-upstream-release-1.11 branch from 11b4c01 to 529eef6 Mar 22, 2019

@k8s-ci-robot k8s-ci-robot added size/XL and removed size/L labels Mar 22, 2019

@wangzhen127 wangzhen127 force-pushed the wangzhen127:automated-cherry-pick-of-#73288-upstream-release-1.11 branch from 529eef6 to ffa6f47 Mar 22, 2019

@wangzhen127 wangzhen127 changed the title from "Automated cherry pick of #73288 to release-1.11: allows configuring NPD release and flags on GCI and add" to "Automated cherry pick of #73288 to release-1.11: Decouple node-problem-detector release from kubernetes" Mar 22, 2019

@wangzhen127

Member Author

commented Mar 23, 2019

Just tested this cherry-pick by following the steps in the PR description.

@Random-Liu

Member

commented Mar 27, 2019

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm label Mar 27, 2019

@Random-Liu Random-Liu added this to the v1.11 milestone Mar 27, 2019

@dchen1107

Member

commented Mar 27, 2019

/lgtm
/approve

@k8s-ci-robot

Contributor

commented Mar 27, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dchen1107, Random-Liu, wangzhen127

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@wangzhen127

Member Author

commented Mar 27, 2019

/assign @calebamiles
/assign @foxish

Assigning to branch manager and patch release manager. This cherry pick is for configuring NPD on GKE/GCE only.

@foxish

Member

commented Apr 1, 2019

This looks like it needs a release note. Please update @wangzhen127

@k8s-ci-robot k8s-ci-robot merged commit 9575832 into kubernetes:release-1.11 Apr 1, 2019

17 checks passed

cla/linuxfoundation wangzhen127 authorized
pull-kubernetes-bazel-build Job succeeded.
pull-kubernetes-bazel-test Job succeeded.
pull-kubernetes-conformance-image-test Skipped.
pull-kubernetes-cross Skipped.
pull-kubernetes-e2e-gce Job succeeded.
pull-kubernetes-e2e-gce-100-performance Skipped.
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
pull-kubernetes-godeps Skipped.
pull-kubernetes-integration Job succeeded.
pull-kubernetes-kubemark-e2e-gce-big Job succeeded.
pull-kubernetes-local-e2e Skipped.
pull-kubernetes-node-e2e Job succeeded.
pull-kubernetes-typecheck Job succeeded.
pull-kubernetes-verify Job succeeded.
pull-publishing-bot-validate Skipped.
tide In merge pool.

@wangzhen127

Member Author

commented Apr 1, 2019

> This looks like it needs a release note.

Updated

@wangzhen127 wangzhen127 deleted the wangzhen127:automated-cherry-pick-of-#73288-upstream-release-1.11 branch Apr 1, 2019
