Coherence Operator Pod keeps restarting #371

Closed
dpauly opened this issue Nov 19, 2019 · 3 comments

dpauly commented Nov 19, 2019

When upgrading the Operator from 2.0.1 to 2.0.2, I saw the operator pod repeatedly restarting; see the logs from the pod below:

W1118 22:24:59.015756 1 reflector.go:270] pkg/mod/k8s.io/client-go@v0.0.0-20190228174230-b40b2a5939e4/tools/cache/reflector.go:95: watch of *v1.CoherenceCluster ended with: too old resource version: 428512 (428627)
log: exiting because of error: log: cannot create log: open /tmp/coherence-operator.args-coherence-operator-695f64855f-jlm86.unknownuser.log.WARNING.20191118-222459.1: no such file or directory

Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T23:41:55Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:09:08Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}

This has since stabilised, so it may be a transient issue.

thegridman self-assigned this Nov 19, 2019

dpauly commented Nov 20, 2019

Attached a more comprehensive operator log:
logs-from-coherence-operator-in-coherence-operator-74cb5f9dd8-4mjdz.txt

thegridman (Member) commented

The error in the logs comes from a component deep inside the Kubernetes client libraries, not from anything in our code base. From searching for similar reports, it appears to be a problem with the logging library used by one of the k8s components, which tries to open a log file on disk in order to write the warning.

We cannot simply change the version of that particular library; we can only move to a newer version of the Operator SDK (which is built on newer k8s libraries) and see whether that improves things. Moving to a new Operator SDK version is not a trivial task and would be a few days' work (assuming it goes smoothly).
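For context, a minimal sketch of the behaviour described above, assuming the library in question is glog (or its klog fork): unless it is told to log to stderr, it tries to create per-severity log files under its log directory (which defaults to the OS temp directory, hence the /tmp path in the error) and exits if that file cannot be created. Whether the Operator exposes these flags is an assumption; the snippet only illustrates the library default.

// Minimal sketch (not Operator code) of the glog default behaviour: warnings
// are written to files named like /tmp/<program>.<host>.<user>.log.WARNING.<ts>.<pid>,
// and the library exits if that file cannot be created. Forcing stderr logging
// sidesteps the file-creation path entirely.
package main

import (
	"flag"

	"github.com/golang/glog"
)

func main() {
	// glog registers the "logtostderr" and "log_dir" flags in its init();
	// setting logtostderr=true avoids writing any files under /tmp.
	flag.Set("logtostderr", "true")
	flag.Parse()

	glog.Warning("watch of *v1.CoherenceCluster ended; logged to stderr, no /tmp file needed")
	glog.Flush()
}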

thegridman mentioned this issue Dec 13, 2019
thegridman added a commit that referenced this issue Dec 13, 2019
thegridman (Member) commented

I'm closing this as I believe the changes in 2.0.3 have resolved the issue.
