Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get kubelet log from journald when kubelet is running as a systemd service in e2e tests #19306

Merged
merged 1 commit into from Jan 13, 2016

Conversation

yifan-gu
Copy link
Contributor

@yifan-gu yifan-gu commented Jan 5, 2016

When the kubelet is running as a systemd service, we need to get the kubelet log from journald instead of /var/log/kubelet.log

I tested by running some jenkins e2e jobs against my own coreos/e2e branch. This works.The kubelet logs can be found here (kube-proxy log is still empty, because I haven't run them as static pods on that branch yet).

cc @dchen1107 @jonboulle @ixdy @zmerlynn

@k8s-github-robot
Copy link

Labelling this PR as size/M

@k8s-github-robot k8s-github-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jan 5, 2016
@yifan-gu yifan-gu changed the title Kubelet service e2e Get kubelet log from journald when kubelet is running as a systemd service Jan 5, 2016
@yifan-gu yifan-gu changed the title Get kubelet log from journald when kubelet is running as a systemd service Get kubelet log from journald when kubelet is running as a systemd service in e2e tests Jan 5, 2016
@k8s-bot
Copy link

k8s-bot commented Jan 6, 2016

GCE e2e test build/test passed for commit 887ed273227fdda07011af6ba26f8ebf765691c9.

@@ -36,6 +36,11 @@ NODE_IMAGE_PROJECT=${KUBE_GCE_NODE_PROJECT:-"${MASTER_IMAGE_PROJECT}"}
CONTAINER_RUNTIME=${KUBE_CONTAINER_RUNTIME:-docker}
RKT_VERSION=${KUBE_RKT_VERSION:-0.5.5}

# If the distro is CoreOS, kubelet will run as a systemd service.
if [[ "${OS_DISTRIBUTION}" == "coreos" ]]; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at https://github.com/kubernetes/kubernetes/blob/master/cluster/saltbase/pillar/systemd.sls, I worry this is a bit too fragile.

Can we instead somehow detect at run-time in the code whether the kubelet is running under systemd or not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ixdy ACK, I will give a try tomorrow, thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ixdy How about ssh'ing to all the hosts, and then check the return code of 'sudo systemctl status kubelet.service' ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, sorry,missed exactly where you're checking this from - I doubt you want to expose this from within the kubelet.

@k8s-bot
Copy link

k8s-bot commented Jan 6, 2016

GCE e2e test build/test passed for commit 03fc3c27715e08083eacea5ac2371784f175341a.

@yifan-gu
Copy link
Contributor Author

yifan-gu commented Jan 7, 2016

Pushed a new version, which checks the return code of systemctl status kubelet.service at runtime to determine whether kubelet is running as a systemd service.
Verified by running e2e tests against gce/coreos on my branch. It successfully gets the kubelet logs

@k8s-bot
Copy link

k8s-bot commented Jan 7, 2016

GCE e2e build/test failed for commit ed5ff44bf8bb4264e9ba3d3c9508417ff7c20b5d.

@yifan-gu
Copy link
Contributor Author

yifan-gu commented Jan 7, 2016

@ixdy This hardcodes "kubelet.service" as the name of the kubelet systemd service. Might not be the best solution, but should be better than my original one. Feedback is welcome :)

@@ -97,3 +123,29 @@ func logCore(cmds []command, hosts []string, dir, provider string) {
}
wg.Wait()
}

func findKubeletSystemdService(provider string, hosts ...string) (bool, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two minor nits:

  • maybe "isUsingSystemdKubelet" is a better name for this function?
  • you never use the error return value - should this return only bool?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ixdy Didn't follow what you meant by "never use the error return value"? If there's an error, a message will be printed.

@k8s-bot
Copy link

k8s-bot commented Jan 8, 2016

GCE e2e test build/test passed for commit ddddb4aa7d6efbce7fa18cf85db5589475db47fc.

@@ -97,3 +123,29 @@ func logCore(cmds []command, hosts []string, dir, provider string) {
}
wg.Wait()
}

func isUsingSystemdKubelet(provider string, hosts ...string) (bool, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are only two return points from this function on lines 147 and 150, and neither one returns an error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ixdy Got you. Removed the returned error. Thanks!

@k8s-bot
Copy link

k8s-bot commented Jan 8, 2016

GCE e2e build/test failed for commit c3405f54ea6d7f563630569520fe78d4cdbf1b3a.

cmds := []command{{"cat /var/log/kube-proxy.log", "kube-proxy"}}

found := isUsingSystemdKubelet(provider, hosts...)
if found {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one more minor nit (sorry): rather than using the variable found, maybe just have

if isUsingSystemdKubelet(provider, hosts...) {
   ...
} else {
  ...
}

similarly for the master.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ixdy done

@k8s-bot
Copy link

k8s-bot commented Jan 8, 2016

GCE e2e build/test failed for commit f01c621888ab693cecc10b347d60bb7196fd995d.

@yifan-gu
Copy link
Contributor Author

yifan-gu commented Jan 8, 2016

@k8s-bot test this please

{"cat /var/log/supervisor/supervisord.log", "supervisord"},
cmds := []command{{"cat /var/log/kube-proxy.log", "kube-proxy"}}
if isUsingSystemdKubelet(provider, hosts...) {
cmds = append(cmds, command{"sudo journalctl --output=cat -u kubelet.service", "kubelet"})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we let the name of kubelet.service get passed down somehow rather than hard-code it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could factor out "kubelet.service" into a constant, sure. how likely is this to change, though?

@k8s-bot
Copy link

k8s-bot commented Jan 8, 2016

GCE e2e test build/test passed for commit f01c621888ab693cecc10b347d60bb7196fd995d.

@yifan-gu
Copy link
Contributor Author

yifan-gu commented Jan 8, 2016

can we let the name of kubelet.service get passed down somehow rather than hard-code it?

@derekwaynecarr I feel it's better. Also if pass the name of the service, do we still need to check kubelet service at runtime?
For example, If we pass a flag --kubelet-service-name="", can we assume if the flag is non-empty, the kubelet is running as a service?
@ixdy

@ixdy
Copy link
Member

ixdy commented Jan 9, 2016

I think this is probably fine as-is.

@yifan-gu
Copy link
Contributor Author

yifan-gu commented Jan 9, 2016

I think this is probably fine as-is.

@ixdy Sorry, do you mean the current approach (which hardcodes the kubelet.service) is fine, or passing --kubelet-service-name flag is fine?

@ixdy
Copy link
Member

ixdy commented Jan 9, 2016

Approach as currently implemented (everything hard-coded) is fine with me.

if err != nil {
fmt.Printf("Error running command: %v\n", err)
}
results[i] = result.Code
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to set to 0 if err != nil, is that right/expected?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonboulle So in this case we assume the kubelet is running as systemd service even when there are errors during SSH().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonboulle raises a good point. if a node has gone away (and thus this ssh call fails), we don't want that to influence the choice of logging location for the rest of the nodes.

I think we should probably be using only successful ssh connections to detect whether nodes are using systemd or not. Could probably even shortcut - return true/false for the first host that we successfully connect to, rather than checking all. I imagine that's the easiest fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ixdy Actually SSH failures doens't affect this as the code is still 0 when there is an error. But I made a change so instead of assume "kubelet is running as systemd service", now we assume "kubelet is not running as systemd service" and try to find the first successful detection of the kubelet.service.

@yifan-gu
Copy link
Contributor Author

@ixdy So is this OK to merge or you have more concerns?

@yifan-gu
Copy link
Contributor Author

ping @ixdy ?

If there is any successful detection of kubelet.service,
fetch the kubelet logs using journalctl.
@yifan-gu
Copy link
Contributor Author

@ixdy Updated.
Now if there's any successful detection of kubelet.service, then use journalctl to fetch the log.

@ixdy
Copy link
Member

ixdy commented Jan 13, 2016

LGTM, thanks

@ixdy ixdy added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 13, 2016
@k8s-bot
Copy link

k8s-bot commented Jan 13, 2016

GCE e2e test build/test passed for commit 3b66179039f36f10722c08dfb034db588f2ba603.

@k8s-bot
Copy link

k8s-bot commented Jan 13, 2016

GCE e2e test build/test passed for commit fb5ee73.

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Jan 13, 2016

GCE e2e test build/test passed for commit fb5ee73.

@k8s-github-robot
Copy link

Automatic merge from submit-queue

k8s-github-robot pushed a commit that referenced this pull request Jan 13, 2016
@k8s-github-robot k8s-github-robot merged commit e1a71fe into kubernetes:master Jan 13, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants