Image 3.4.14 or 3.5.5 not starting on Kubernetes #67

Closed
srilumpa opened this issue Jun 3, 2019 · 4 comments

Comments


srilumpa commented Jun 3, 2019

Hi,

I am not sure whether this is linked to the use of openjdk:8-jre-slim as the base image for ZooKeeper, but since #63 and #55 were accepted there has been an issue with the default user used to start ZooKeeper.

I have a ZooKeeper 3.4.13 cluster running on a Kubernetes cluster. I tried to upgrade it to 3.4.14 by "simply" changing the image tag. Unfortunately, the pod crashes immediately with the following logs:

ZooKeeper JMX enabled by default
Using config: /conf/zoo.cfg
2019-06-03 12:45:25,165 [myid:] - INFO  [main:QuorumPeerConfig@136] - Reading configuration from: /conf/zoo.cfg
2019-06-03 12:45:25,171 [myid:] - INFO  [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
2019-06-03 12:45:25,171 [myid:] - INFO  [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
2019-06-03 12:45:25,171 [myid:] - INFO  [main:DatadirCleanupManager@101] - Purge task is not scheduled.
2019-06-03 12:45:25,172 [myid:] - WARN  [main:QuorumPeerMain@116] - Either no config or no quorum defined in config, running  in standalone mode
2019-06-03 12:45:25,184 [myid:] - INFO  [main:QuorumPeerConfig@136] - Reading configuration from: /conf/zoo.cfg
2019-06-03 12:45:25,185 [myid:] - INFO  [main:ZooKeeperServerMain@98] - Starting server
2019-06-03 12:45:25,190 [myid:] - INFO  [main:Environment@100] - Server environment:zookeeper.version=3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf, built on 03/06/2019 16:18 GMT
2019-06-03 12:45:25,191 [myid:] - INFO  [main:Environment@100] - Server environment:host.name=zookeeper-0.zookeeper-headless.default.svc.cluster.local
2019-06-03 12:45:25,191 [myid:] - INFO  [main:Environment@100] - Server environment:java.version=1.8.0_212
2019-06-03 12:45:25,191 [myid:] - INFO  [main:Environment@100] - Server environment:java.vendor=Oracle Corporation
2019-06-03 12:45:25,191 [myid:] - INFO  [main:Environment@100] - Server environment:java.home=/usr/local/openjdk-8
2019-06-03 12:45:25,191 [myid:] - INFO  [main:Environment@100] - Server environment:java.class.path=/zookeeper-3.4.14/bin/../zookeeper-server/target/classes:/zookeeper-3.4.14/bin/../build/classes:/zookeeper-3.4.14/bin/../zookeeper-server/target/lib/*.jar:/zookeeper-3.4.14/bin/../build/lib/*.jar:/zookeeper-3.4.14/bin/../lib/slf4j-log4j12-1.7.25.jar:/zookeeper-3.4.14/bin/../lib/slf4j-api-1.7.25.jar:/zookeeper-3.4.14/bin/../lib/netty-3.10.6.Final.jar:/zookeeper-3.4.14/bin/../lib/log4j-1.2.17.jar:/zookeeper-3.4.14/bin/../lib/jline-0.9.94.jar:/zookeeper-3.4.14/bin/../lib/audience-annotations-0.5.0.jar:/zookeeper-3.4.14/bin/../zookeeper-3.4.14.jar:/zookeeper-3.4.14/bin/../zookeeper-server/src/main/resources/lib/*.jar:/conf:
2019-06-03 12:45:25,192 [myid:] - INFO  [main:Environment@100] - Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2019-06-03 12:45:25,192 [myid:] - INFO  [main:Environment@100] - Server environment:java.io.tmpdir=/tmp
2019-06-03 12:45:25,192 [myid:] - INFO  [main:Environment@100] - Server environment:java.compiler=<NA>
2019-06-03 12:45:25,193 [myid:] - INFO  [main:Environment@100] - Server environment:os.name=Linux
2019-06-03 12:45:25,193 [myid:] - INFO  [main:Environment@100] - Server environment:os.arch=amd64
2019-06-03 12:45:25,193 [myid:] - INFO  [main:Environment@100] - Server environment:os.version=4.15.0
2019-06-03 12:45:25,194 [myid:] - INFO  [main:Environment@100] - Server environment:user.name=?
2019-06-03 12:45:25,194 [myid:] - INFO  [main:Environment@100] - Server environment:user.home=?
2019-06-03 12:45:25,194 [myid:] - INFO  [main:Environment@100] - Server environment:user.dir=/zookeeper-3.4.14
2019-06-03 12:45:25,200 [myid:] - ERROR [main:ZooKeeperServerMain@66] - Unexpected exception, exiting abnormally
java.io.IOException: Unable to create data directory /datalog/version-2
	at org.apache.zookeeper.server.persistence.FileTxnSnapLog.<init>(FileTxnSnapLog.java:87)
	at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:112)
	at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:89)
	at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:55)
	at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:119)
	at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)

Notice Server environment:user.name=? and Server environment:user.home=? in the logs. I changed the entrypoint command to sleep indefinitely so that the pod would keep running while I investigated, and I found this:

I have no name!@zookeeper-0:/zookeeper-3.4.14$ whoami
whoami: cannot find name for user ID 1000
I have no name!@zookeeper-0:/zookeeper-3.4.14$ id zookeeper
uid=999(zookeeper) gid=999(zookeeper) groups=999(zookeeper)

I am not sure why, but it seems that the container is not started as the right user. It runs as the user with ID 1000 (which does not exist) instead of the zookeeper user (ID 999). I noticed the same behaviour with the 3.5.5 image.
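
A quick way to confirm this kind of mismatch from inside the container is to compare the UID of the running process with the owner of the data directory. The following is only a sketch: it assumes a Linux image where stat supports -c and where users are listed in /etc/passwd. The helper names uid_name and dir_owned_by_me are hypothetical, and the /datalog path is taken from the stack trace above.

```shell
#!/bin/sh
# Hypothetical helpers for diagnosing a UID mismatch inside a container.

# Print the user name that /etc/passwd maps to a numeric UID,
# or "<no name>" if there is no entry for it.
uid_name() {
  name=$(awk -F: -v uid="$1" '$3 == uid { print $1; exit }' /etc/passwd)
  echo "${name:-<no name>}"
}

# Succeed only if the directory is owned by the UID running this script.
dir_owned_by_me() {
  [ "$(stat -c '%u' "$1")" = "$(id -u)" ]
}

echo "running as UID $(id -u) ($(uid_name "$(id -u)"))"

# /datalog is the directory named in the exception; skip the check if absent.
if [ -d /datalog ] && ! dir_owned_by_me /datalog; then
  echo "/datalog is owned by UID $(stat -c '%u' /datalog), not by the current user"
fi
```

If the first line reports "<no name>" and the second check prints a different owner, the process is running under a UID that the image does not define and that does not own the data directory, which would be consistent with the "Unable to create data directory" error.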

Again, I am not sure this is linked to how the new image is built, but the image with tag 3.4.13 runs correctly.

Do you have any idea what causes this issue?


mingfang commented Jun 5, 2019

The user ID changed from 1000 in v3.4.13 to 999 in v3.4.14 and v3.5.5.
You can check it like this:

root:~# docker run --rm zookeeper:3.4.13 bash -c 'su-exec $ZOO_USER whoami && su-exec $ZOO_USER id'
zookeeper
uid=1000(zookeeper) gid=1000(zookeeper) groups=1000(zookeeper)

root:~# docker run --rm zookeeper:3.4.14 bash -c 'gosu $ZOO_USER whoami && gosu $ZOO_USER id'
zookeeper
uid=999(zookeeper) gid=999(zookeeper) groups=999(zookeeper)

root:~# docker run --rm zookeeper:3.5.5 bash -c 'gosu $ZOO_USER whoami && gosu $ZOO_USER id'
zookeeper
uid=999(zookeeper) gid=999(zookeeper) groups=999(zookeeper)

One solution is to change your security context to:

securityContext:
  fsGroup: 999
  runAsUser: 999
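
For a workload that is already deployed, the same two fields could be set with a JSON merge patch instead of editing the manifest by hand. This is a sketch under assumptions: the StatefulSet name "zookeeper" is inferred from the pod name zookeeper-0 in the logs and may differ in your setup, and the helper security_context_patch is hypothetical. The kubectl command is left as a comment so that the script has no side effects.

```shell
#!/bin/sh
# Build a JSON merge patch that sets runAsUser and fsGroup on the pod template.
# The single argument is used for both fields, matching the snippet above.
security_context_patch() {
  printf '{"spec":{"template":{"spec":{"securityContext":{"runAsUser":%d,"fsGroup":%d}}}}}\n' "$1" "$1"
}

# Print the patch so it can be reviewed before it is applied.
security_context_patch 999

# To apply it (assumes a StatefulSet named "zookeeper" in the current namespace):
#   kubectl patch statefulset zookeeper --type merge -p "$(security_context_patch 999)"
```

Changing the pod template causes the pods to be recreated under the StatefulSet's update strategy, so it is worth reviewing the printed patch first.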

31z4 (owner) commented Jun 8, 2019

Thanks @mingfang and @srilumpa 👍
UID and GID are explicitly set now.


mingfang commented Jun 8, 2019

@31z4 Can you set the GID and UID back to 1000 to match the previous version?
That would help people who are upgrading.

groupadd -r zookeeper --gid=999; \

31z4 (owner) commented Jun 8, 2019

Done! Please wait until PR docker-library/official-images#6058 is merged.

@31z4 31z4 closed this as completed Jun 12, 2019