Convert cmd/kubelet/app/server.go to structured logging #98334

Merged
k8s-ci-robot merged 1 commit into kubernetes:master from the fix-98292 branch on Mar 17, 2021

Conversation

Member

@wawa0210 wawa0210 commented Jan 24, 2021

What type of PR is this?

/kind bug

What this PR does / why we need it:

When the parameters are incorrect, kubelet now outputs only the error message instead of the entire stack trace.
This improves the user experience and is consistent with other components (such as kube-proxy).

Which issue(s) this PR fixes:

Fixes #98292

Special notes for your reviewer:
For compatibility with klog.Fatalf, the exit code remains 255.

Does this PR introduce a user-facing change?:

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


/sig node
/area kubelet

@k8s-ci-robot k8s-ci-robot added release-note kind/bug size/XS sig/node area/kubelet cncf-cla: yes needs-triage needs-priority labels Jan 24, 2021
@k8s-ci-robot k8s-ci-robot requested review from mtaufen and yujuhong Jan 24, 2021
Member Author

@wawa0210 wawa0210 commented Jan 24, 2021

/assign @vishh

@ehashman ehashman added this to Needs Reviewer in SIG Node PR Triage Jan 25, 2021
Member

@ehashman ehashman commented Jan 25, 2021

/hold

Bug wasn't accepted and I'm not sure this is an improved user experience.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold label Jan 25, 2021
Member

@ehashman ehashman commented Jan 25, 2021

Note: I think Fatal logs are going away with the structured logging migration this release, so this PR will be replaced soon anyway. See: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-instrumentation/migration-to-structured-logging.md#replacing-fatal-calls

It would be better to migrate this entire file to use structured logging rather than just making this change.

@ehashman ehashman moved this from Needs Reviewer to Waiting on Author in SIG Node PR Triage Jan 25, 2021
Member Author

@wawa0210 wawa0210 commented Jan 26, 2021

> It would be better to migrate this entire file to use structured logging rather than just making this change.

Understood. If possible, I would like this PR to fix only this bug (to keep it focused), and then open a separate PR to migrate the entire file to structured logging. What do you think?

Member

@ehashman ehashman commented Jan 26, 2021

> > It would be better to migrate this entire file to use structured logging rather than just making this change.
>
> Understood. If possible, I would like this PR to fix only this bug (to keep it focused), and then open a separate PR to migrate the entire file to structured logging. What do you think?

I would prefer you fix it all in one go, because migrating away from using Fatalf is part of the log migration, and we have limited reviewer/approver bandwidth.

@k8s-ci-robot k8s-ci-robot added size/L and removed size/XS labels Jan 29, 2021
Member

@ehashman ehashman left a comment

Thanks for doing this! I have a bit of feedback on the log changes.

@@ -117,7 +117,8 @@ func NewKubeletCommand() *cobra.Command {
kubeletConfig, err := options.NewKubeletConfiguration()
// programmer error
if err != nil {
- klog.Fatal(err)
+ klog.ErrorS(err, "Failed create a new kubelet configuration")
Member

@ehashman ehashman Jan 29, 2021

Suggested change
- klog.ErrorS(err, "Failed create a new kubelet configuration")
+ klog.ErrorS(err, "Failed to create a new kubelet configuration")

@@ -117,7 +117,8 @@ func NewKubeletCommand() *cobra.Command {
kubeletConfig, err := options.NewKubeletConfiguration()
// programmer error
if err != nil {
- klog.Fatal(err)
+ klog.ErrorS(err, "Failed create a new kubelet configuration")
+ os.Exit(255)
Member

@ehashman ehashman Jan 29, 2021

Please use exit code 1 throughout this file, and not 255, to be consistent with the rest of the files in cmd/: https://cs.k8s.io/?q=os.Exit&i=nope&files=cmd%2F.*&repos=kubernetes/kubernetes

Member Author

@wawa0210 wawa0210 Jan 30, 2021

klog.Fatalf exits with code 255, so I returned 255 here to keep the status quo and avoid breaking compatibility.
Should it be changed to 1 here? Does this need to be discussed?

Member

@ehashman ehashman Feb 1, 2021

I'll ask in Slack about it.

Member

@neolit123 neolit123 Feb 1, 2021

We should not be using 255, but rather 1.
Also, ideally applications should have a single point of os.Exit() instead of multiple ones.

But as mentioned on Slack, this change would need an ACTION REQUIRED if users are expecting an exact value rather than just != 0.

Member Author

@wawa0210 wawa0210 Feb 13, 2021

> Also, ideally applications should have a single point of os.Exit() instead of multiple ones.

os.Exit(1) is called for various reasons in this file. If there should be only one os.Exit() call, do you have any suggestions for how to structure it?

Member

@neolit123 neolit123 Feb 13, 2021

Deferring to the kubelet maintainers, but ideally errors should bubble up to a single point of os.Exit(x).
There, different exit codes can be returned depending on the error type.

Kubernetes components tend to just call os.Exit at arbitrary locations with fixed exit codes, which is not a great pattern.

}

// check if there are non-flag arguments in the command line
cmds := cleanFlagSet.Args()
if len(cmds) > 0 {
cmd.Usage()
- klog.Fatalf("unknown command: %s", cmds[0])
+ klog.InfoS("Unknown command", "command", cmds[0])
Member

@ehashman ehashman Jan 29, 2021

@@ -534,7 +548,7 @@ func run(ctx context.Context, s *options.KubeletServer, kubeDeps *kubelet.Depend
return err
}
if cloud != nil {
- klog.V(2).Infof("Successfully initialized cloud provider: %q from the config file: %q\n", s.CloudProvider, s.CloudConfigFile)
+ klog.V(2).InfoS("Successfully initialized cloud provider", "cloud provider", s.CloudProvider, "config file", s.CloudConfigFile)
Member

@ehashman ehashman Jan 29, 2021

Keys should not have spaces. I think the convention is to use camelCase (e.g. "cloudProvider").
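
As a rough illustration of the naming rule in this comment, the sketch below checks candidate keys against a lowerCamelCase pattern. `validKey` and its regular expression are hypothetical, written only for this example; they are not part of klog, and the real convention may allow more than this pattern does.

```go
package main

import (
	"fmt"
	"regexp"
)

// camelKey is an assumed approximation of "lowerCamelCase, no spaces":
// a lowercase ASCII letter followed by ASCII letters or digits.
var camelKey = regexp.MustCompile(`^[a-z][a-zA-Z0-9]*$`)

// validKey reports whether key matches the assumed naming pattern.
func validKey(key string) bool {
	return camelKey.MatchString(key)
}

func main() {
	// Keys taken from the log calls discussed in this review.
	for _, k := range []string{"cloud provider", "config file", "cloudProvider", "nodeIP"} {
		fmt.Printf("%q valid=%t\n", k, validKey(k))
	}
}
```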

@@ -611,14 +625,14 @@ func run(ctx context.Context, s *options.KubeletServer, kubeDeps *kubelet.Depend
cgroupRoots = append(cgroupRoots, nodeAllocatableRoot)
kubeletCgroup, err := cm.GetKubeletContainer(s.KubeletCgroups)
if err != nil {
- klog.Warningf("failed to get the kubelet's cgroup: %v. Kubelet system container metrics may be missing.", err)
+ klog.InfoS("Failed to get the kubelet's cgroup. Kubelet system container metrics may be missing.", "error", err)
Member

@ehashman ehashman Jan 29, 2021

@@ -677,15 +691,15 @@ func run(ctx context.Context, s *options.KubeletServer, kubeDeps *kubelet.Depend

if reservedSystemCPUs.Size() > 0 {
// at cmd option valication phase it is tested either --system-reserved-cgroup or --kube-reserved-cgroup is specified, so overwrite should be ok
- klog.Infof("Option --reserved-cpus is specified, it will overwrite the cpu setting in KubeReserved=\"%v\", SystemReserved=\"%v\".", s.KubeReserved, s.SystemReserved)
+ klog.InfoS("Option --reserved-cpus is specified, it will overwrite the cpu setting in KubeReserved=\"%v\", SystemReserved=\"%v\".", s.KubeReserved, s.SystemReserved)
Member

@ehashman ehashman Jan 29, 2021

This log line needs to be updated; it still uses format-string verbs in the message.

if s.KubeReserved != nil {
delete(s.KubeReserved, "cpu")
}
if s.SystemReserved == nil {
s.SystemReserved = make(map[string]string)
}
s.SystemReserved["cpu"] = strconv.Itoa(reservedSystemCPUs.Size())
- klog.Infof("After cpu setting is overwritten, KubeReserved=\"%v\", SystemReserved=\"%v\"", s.KubeReserved, s.SystemReserved)
+ klog.InfoS("After cpu setting is overwritten, KubeReserved=\"%v\", SystemReserved=\"%v\"", s.KubeReserved, s.SystemReserved)
Member

@ehashman ehashman Jan 29, 2021

Also needs its string formatting updated
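
To show why the two lines above still need work, here is a small sketch of the message-plus-key/value calling shape. `kvString` is a hypothetical stand-in written for this example, not klog itself, and its output format only approximates what a structured logger might print. The point it illustrates is that the message is a constant string (format verbs in it are not expanded) and the values travel as named pairs.

```go
package main

import (
	"fmt"
	"strings"
)

// kvString renders a constant message followed by key=value pairs.
// It is a simplified stand-in for a structured logging call; a trailing
// key without a value is ignored.
func kvString(msg string, keysAndValues ...interface{}) string {
	var b strings.Builder
	fmt.Fprintf(&b, "%q", msg)
	for i := 0; i+1 < len(keysAndValues); i += 2 {
		fmt.Fprintf(&b, " %v=%q", keysAndValues[i], fmt.Sprint(keysAndValues[i+1]))
	}
	return b.String()
}

func main() {
	kubeReserved := map[string]string{"memory": "1Gi"}
	systemReserved := map[string]string{"cpu": "2"}

	// Leftover format verbs stay in the message verbatim.
	fmt.Println(kvString("KubeReserved=%v"))

	// Converted form: constant message, values passed as named pairs.
	fmt.Println(kvString("After cpu setting is overwritten",
		"kubeReserved", kubeReserved, "systemReserved", systemReserved))
}
```

The key names `kubeReserved` and `systemReserved` are assumptions that follow the camelCase convention mentioned earlier in this review.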

@@ -791,7 +805,7 @@ func run(ctx context.Context, s *options.KubeletServer, kubeDeps *kubelet.Depend
go wait.Until(func() {
err := http.ListenAndServe(net.JoinHostPort(s.HealthzBindAddress, strconv.Itoa(int(s.HealthzPort))), mux)
if err != nil {
- klog.Errorf("Starting healthz server failed: %v", err)
+ klog.ErrorS(err, "Failed to starting healthz server")
Member

@ehashman ehashman Jan 29, 2021

Suggested change
- klog.ErrorS(err, "Failed to starting healthz server")
+ klog.ErrorS(err, "Failed to start healthz server")

@@ -1093,7 +1107,7 @@ func RunKubelet(kubeServer *options.KubeletServer, kubeDeps *kubelet.Dependencie
for _, ip := range strings.Split(kubeServer.NodeIP, ",") {
parsedNodeIP := net.ParseIP(strings.TrimSpace(ip))
if parsedNodeIP == nil {
- klog.Warningf("Could not parse --node-ip value %q; ignoring", ip)
+ klog.InfoS("Could not parse --node-ip ignoring", "nodeIp", ip)
Member

@ehashman ehashman Jan 29, 2021

Suggested change
- klog.InfoS("Could not parse --node-ip ignoring", "nodeIp", ip)
+ klog.InfoS("Could not parse --node-ip ignoring", "nodeIP", ip)

@@ -1114,7 +1128,7 @@ func RunKubelet(kubeServer *options.KubeletServer, kubeDeps *kubelet.Dependencie
})

credentialprovider.SetPreferredDockercfgPath(kubeServer.RootDirectory)
- klog.V(2).Infof("Using root directory: %v", kubeServer.RootDirectory)
+ klog.V(2).InfoS("Using root directory", "directory", kubeServer.RootDirectory)
Member

@ehashman ehashman Jan 29, 2021

@serathius do you have an opinion on directory vs. dir?

Contributor

@serathius serathius Mar 9, 2021

"path" :P

@k8s-ci-robot k8s-ci-robot requested a review from serathius Jan 29, 2021
Member

@ehashman ehashman commented Jan 29, 2021

/priority important-longterm
/triage accepted
/retitle Convert cmd/kubelet/app/server.go to structured logging

@k8s-ci-robot k8s-ci-robot added the priority/important-longterm label Jan 29, 2021
@k8s-ci-robot k8s-ci-robot changed the title Fix kubelet flag verification failed, output error message instead of stack error Convert cmd/kubelet/app/server.go to structured logging Jan 29, 2021
@k8s-ci-robot k8s-ci-robot added triage/accepted and removed needs-priority labels Jan 29, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.21 milestone Mar 11, 2021
@ehashman ehashman moved this from Waiting on Author to Needs Approver in SIG Node PR Triage Mar 11, 2021
@ehashman ehashman moved this from Waiting on Author to Needs Approver in Structured Logging Migration for Kubelet, 1.21 Mar 11, 2021
@k8s-ci-robot k8s-ci-robot removed the needs-rebase label Mar 11, 2021
Member

@ehashman ehashman commented Mar 11, 2021

/remove-kind bug
/kind cleanup

@k8s-ci-robot k8s-ci-robot added kind/cleanup and removed kind/bug labels Mar 11, 2021
Contributor

@mrunalp mrunalp commented Mar 16, 2021

/approve

@mrunalp mrunalp moved this from Needs Approver to Done in SIG Node PR Triage Mar 16, 2021
Contributor

@k8s-ci-robot k8s-ci-robot commented Mar 16, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mrunalp, wawa0210

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved label Mar 16, 2021
@ehashman ehashman moved this from Needs Approver to Done in Structured Logging Migration for Kubelet, 1.21 Mar 16, 2021
@k8s-ci-robot k8s-ci-robot added the needs-rebase label Mar 16, 2021
@k8s-ci-robot k8s-ci-robot removed the lgtm label Mar 17, 2021
Member

@ehashman ehashman left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Mar 17, 2021
Member

@pacoxu pacoxu commented Mar 17, 2021

/retest

Member

@pacoxu pacoxu commented Mar 17, 2021

/remove-label needs-rebase

Contributor

@k8s-ci-robot k8s-ci-robot commented Mar 17, 2021

@pacoxu: The label(s) /remove-label needs-rebase cannot be applied. These labels are supported: api-review, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash

In response to this:

/remove-label needs-rebase

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot removed the needs-rebase label Mar 17, 2021
Member

@pacoxu pacoxu commented Mar 17, 2021

/retest

Member Author

@wawa0210 wawa0210 commented Mar 17, 2021

/test pull-kubernetes-e2e-kind-ipv6

@k8s-ci-robot k8s-ci-robot merged commit 1dce898 into kubernetes:master Mar 17, 2021
13 of 14 checks passed
@wawa0210 wawa0210 deleted the fix-98292 branch Mar 17, 2021
@k8s-ci-robot k8s-ci-robot added release-note-none and removed release-note labels Mar 29, 2021
Labels
approved area/kubelet cncf-cla: yes kind/cleanup lgtm priority/important-longterm release-note-none sig/node size/L triage/accepted
9 participants