Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

migrate scheduler/taint_manager.go structured logging #98259

Merged
merged 1 commit into from
Feb 5, 2021

Conversation

tanjing2020
Copy link
Contributor

@tanjing2020 tanjing2020 commented Jan 21, 2021

What type of PR is this?
/kind cleanup

What this PR does / why we need it:
Ref:
https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/1602-structured-logging
https://github.com/kubernetes/community/blob/master/contributors/devel/sig-instrumentation/migration-to-structured-logging.md

Which issue(s) this PR fixes:

Verify log output (taint_manager.go):

before
taint_manager.go:106] NoExecuteTaintManager is deleting Pod: default/pod1
taint_manager.go:163] Sending events to api server.
taint_manager.go:166] kubeClient is nil when starting NodeController
taint_manager.go:187] Starting NoExecuteTaintManager
taint_manager.go:352] Not all taints are tolerated after update for Pod default/pod1 on node1
taint_manager.go:361] Current tolerations for default/pod1 tolerate forever, cancelling any scheduled deletion.
taint_manager.go:385] Noticed pod deletion: types.NamespacedName{Namespace:"default", Name:"pod1"}
taint_manager.go:400] Noticed pod update: types.NamespacedName{Namespace:"default", Name:"pod1"}
taint_manager.go:424] Noticed node deletion: "node1"
taint_manager.go:435] Noticed node update: scheduler.nodeUpdateItem{nodeName:"node1"}
taint_manager.go:440] Updating known taints on node node1: [{testTaint1 test1 NoExecute 2021-01-21 17:32:28.9701214 +0800 CST m=+0.016594301}]
taint_manager.go:453] failed to get Pods assigned to node node1
taint_manager.go:461] All taints were removed from the Node node1. Cancelling all evictions...
after
taint_manager.go:106] "NoExecuteTaintManager is deleting Pod" pod="default/pod1"
taint_manager.go:163] "Sending events to api server."
taint_manager.go:166] "kubeClient is nil when starting NodeController"
taint_manager.go:187] "Starting NoExecuteTaintManager"
taint_manager.go:352] "Not all taints are tolerated after update for Pod on node" pod="default/pod1" node="node1"
taint_manager.go:361] "Current tolerations for pod tolerate forever, cancelling any scheduled deletion." pod="default/pod1"
taint_manager.go:385] "Noticed pod deletion" pod="default/pod1"
taint_manager.go:400] "Noticed pod update" pod="default/pod1"
taint_manager.go:424] "Noticed node deletion" node="node1"
taint_manager.go:435] "Noticed node update" node={nodeName:node1}
taint_manager.go:440] "Updating known taints on node" node="node1" taints=[{Key:testTaint1 Value:test1 Effect:NoExecute TimeAdded:2021-01-21 17:33:14.1921732 +0800 CST m=+0.016589601}]
taint_manager.go:453] "Failed to get Pods assigned to node" err="failed to get Pods assigned to node node1" node="node1"
taint_manager.go:461] "All taints were removed from the Node. Cancelling all evictions..." node="node1"

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 21, 2021
@k8s-ci-robot
Copy link
Contributor

@tanjing2020: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jan 21, 2021
@k8s-ci-robot
Copy link
Contributor

Hi @tanjing2020. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added sig/apps Categorizes an issue or PR as relevant to SIG Apps. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. labels Jan 21, 2021
@tanjing2020 tanjing2020 force-pushed the taint_manager_log branch 2 times, most recently from 221a6e6 to 992b2c2 Compare January 21, 2021 11:27
pkg/controller/nodelifecycle/scheduler/taint_manager.go Outdated Show resolved Hide resolved
@@ -349,7 +349,7 @@ func (tc *NoExecuteTaintManager) processPodOnNode(
}
allTolerated, usedTolerations := v1helper.GetMatchingTolerations(taints, tolerations)
if !allTolerated {
klog.V(2).Infof("Not all taints are tolerated after update for Pod %v on %v", podNamespacedName.String(), nodeName)
klog.V(2).InfoS("Not all taints are tolerated after update for Pod on node", "pod", podNamespacedName.String(), "node", nodeName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Pod/pod

@@ -358,7 +358,7 @@ func (tc *NoExecuteTaintManager) processPodOnNode(
minTolerationTime := getMinTolerationTime(usedTolerations)
// getMinTolerationTime returns negative value to denote infinite toleration.
if minTolerationTime < 0 {
klog.V(4).Infof("Current tolerations for %v tolerate forever, cancelling any scheduled deletion.", podNamespacedName.String())
klog.V(4).InfoS("Current tolerations for pod tolerate forever, cancelling any scheduled deletion.", "pod", podNamespacedName.String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can drop the ending dot.

pkg/controller/nodelifecycle/scheduler/taint_manager.go Outdated Show resolved Hide resolved
@@ -450,15 +450,15 @@ func (tc *NoExecuteTaintManager) handleNodeUpdate(nodeUpdate nodeUpdateItem) {
// tc.PodUpdated which will use tc.taintedNodes to potentially delete delayed pods.
pods, err := tc.getPodsAssignedToNode(node.Name)
if err != nil {
klog.Errorf(err.Error())
klog.ErrorS(err, "Failed to get Pods assigned to node", "node", node.Name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Pods/pods

pkg/controller/nodelifecycle/scheduler/taint_manager.go Outdated Show resolved Hide resolved
@ingvagabund
Copy link
Contributor

@tanjing2020 is the PR ready for review?

@ingvagabund
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 25, 2021
@ingvagabund
Copy link
Contributor

cc @k82cn @gmarek

@tanjing2020
Copy link
Contributor Author

/cc @gmarek

@k82cn
Copy link
Member

k82cn commented Feb 5, 2021

/lgtm
/approve

Thanks for the contribution :)

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 5, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ingvagabund, k82cn, tanjing2020

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 5, 2021
@tanjing2020
Copy link
Contributor Author

/test pull-kubernetes-bazel-test

@k8s-ci-robot k8s-ci-robot merged commit 6dc0047 into kubernetes:master Feb 5, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.21 milestone Feb 5, 2021
@tanjing2020 tanjing2020 deleted the taint_manager_log branch February 5, 2021 08:09
@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Mar 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note-none Denotes a PR that doesn't merit a release note. sig/apps Categorizes an issue or PR as relevant to SIG Apps. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants