
glog hates reliability, tries to write to read-only rootfs #427

Closed
leo-baltus opened this issue Apr 22, 2019 · 11 comments

Comments

@leo-baltus

commented Apr 22, 2019

I have noticed that the controller and the speaker seem to exit after some time, mostly after 8min intervals. kubectl logs does not give any clues. With strace however:

11467 write(5, "log: exiting because of error: log: cannot create log: open /tmp/controller.controller-7f47947cf9-qpmqm.unknownuser.log.WARNING.20190422-084350.1: read-only file system\n", 169 <unfinished ...>

After changing the deployment to readOnlyRootFilesystem: false the exiting has stopped, and I am now able to enter the container to view the logs:

cat controller.controller-5d56fc6ddd-ld7tz.unknownuser.log.WARNING.201904
Log file created at: 2019/04/22 10:46:58
Running on machine: controller-5d56fc6ddd-ld7tz
Binary: Built with gc go1.11.5 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
W0422 10:46:58.861347       1 reflector.go:270] pkg/mod/k8s.io/client-go@v10.0.0+incompatible/tools/cache/reflector.go:95: watch of *v1.ConfigMap ended with: too old resource version: 21070598 (21071202)

Which seems very similar to e.g. containous/traefik#1785. In short, glog (part of client-go) seems to prefer to log to files rather than stderr.

I am running image: metallb/controller:master on a kubernetes-v1.12.3 single-host cluster on-prem using flannel. No ipvs/iptables/kube-proxy tweaking going on.

I could create a PR if you like.

@riaan53


commented Apr 22, 2019

Noticed this as well on master; it does not happen on v0.7.3.

@leo-baltus

Author

commented Apr 23, 2019

kubernetes/client-go#358 (comment) suggests moving to klog instead.

@danderson

Owner

commented Apr 23, 2019

Bleh, glog kicks me in the shins again.

Yes, we should move off it. If k8s itself has stopped using it, that removes the main blocker. I'll have to see if klog can be interfaced with to pull data out of it and into MetalLB's own logs.

Thanks for the report!

@danderson danderson added the bug label Apr 23, 2019

@voron


commented Apr 24, 2019

I hit the same problem w/ master builds, both controller and speaker

@dano0b


commented Jun 3, 2019

It would be awesome if that could be addressed. After 1-2 weeks the controller/speaker just stop working and services are no longer accessible (recovering requires manually deleting the pods).

@Elegant996


commented Jun 3, 2019

@dano0b Switch the tag you're using from master to update-versions. It is best to avoid the latest tags such as master when pulling images for critical services.

@dano0b


commented Jun 4, 2019

I'm using kubernetes 1.14 and need the latest version because of it. At the moment my k8s cluster provisioning uses https://github.com/danderson/metallb/blob/master/manifests/metallb.yaml (with some customization) to deploy metallb. For now I am waiting for 0.7.4 with, hopefully, a fix for this issue.

@Elegant996


commented Jun 4, 2019

@dano0b I'm also running Kubernetes 1.14 and am having no issues with the solution I listed. Did you try using the above suggestion? What errors were you presented with?

Just because there's a commit that mentions Kubernetes 1.14 does not make it mandatory.

@dano0b


commented Jun 4, 2019

Thanks for your suggestion, but I don't want to customize the manifest further to support custom docker image tags. I think it would be just awesome to have a non-crashing release that includes the latest changes.
(I can't remember what didn't work for me, but I had issues using 0.7.3.)

@aleks-mariusz


commented Jul 8, 2019

My metallb components have stayed up for more than ten minutes now with the master branch (using this until there's an official update). It seems I was indeed hit by this issue (thanks @trevex), so I added the following to my helm-based deployment script to apply the patches automatically:

kubectl -n $NAMESPACE patch deployment $(kubectl -n $NAMESPACE get deployment -l app=metallb,component=controller -o jsonpath='{.items[0].metadata.name}') -p '{"spec": {"template": {"spec": {"containers": [{"name": "controller", "securityContext": {"readOnlyRootFilesystem": false}}]}}}}'
kubectl -n $NAMESPACE patch daemonset $(kubectl -n $NAMESPACE get daemonset -l app=metallb,component=speaker -o jsonpath='{.items[0].metadata.name}') -p '{"spec": {"template": {"spec": {"containers": [{"name": "speaker", "securityContext": {"readOnlyRootFilesystem": false}}]}}}}'

This patches both the controller Deployment and the speaker DaemonSet to set the readOnlyRootFilesystem securityContext value to false.

@danderson danderson changed the title spurious restart glog hates reliability, tries to write to read-only rootfs Jul 8, 2019

danderson added a commit that referenced this issue Jul 8, 2019

@danderson

Owner

commented Jul 9, 2019

Removed glog entirely in d42e599 and replaced it with klog, which supports non-hacky log redirection, so this should no longer be a problem as of the next release.

@danderson danderson closed this Jul 9, 2019

danderson added a commit that referenced this issue Jul 9, 2019
