K8s projected volumes don't work when using systemd inside the container #767
Comments
Hi @jojonium, Thanks for giving Sysbox a shot.
That's so strange; whether systemd is PID 1 or not should not make a difference at all.
That looks normal to me; what do you see under Also, do you have access to the K8s node where the Sysbox pod is running? If so, please re-launch the Sysbox pod and do a Thanks! |
There's no serviceaccount directory created under /var/run/
This is what
Thanks @jojonium, that helps. At host level, is The way it should work is:
Now, I don't know yet why systemd being PID 1 causes a problem, except that I did notice that when systemd is PID 1, sysbox automatically mounts Another question: how does |
At the host level, You're right about systemd mounting This is from inside the container with systemd enabled:
And without:
This is also without systemd:
Thanks @jojonium, very helpful info.
I think I see the problem; in the above output, the I suspect that Sysbox (incorrectly) did the |
It seems the bug is here in sysbox-runc. That code ensures the mounts are ordered such that they don't obscure each other (e.g., mount Normally the higher-level container manager (e.g., Docker or K8s) sends the mounts in the correct order, but because Sysbox implicitly adds some mounts of its own (e.g., tmpfs on If it's OK, I can try patching it and send you a new sysbox-runc binary that you can then use on the K8s node, to see if it fixes the problem. Unfortunately, I've not been able to reproduce it locally with Docker yet.
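To illustrate the ordering problem described in this comment, here is a minimal Python sketch. It is not the sysbox-runc code (which is not shown in this thread); the function names and the paths `/mnt/parent` and `/mnt/parent/child` are hypothetical, and sorting by path depth is just one simple way to make parent mounts precede their children.

```python
from pathlib import PurePosixPath


def order_mounts(destinations):
    """Sort mount destinations so that parent directories come before children.

    Sorting by the number of path components is an assumption made for this
    sketch; sorted() is stable, so mounts at the same depth keep their order.
    """
    return sorted(destinations, key=lambda d: len(PurePosixPath(d).parts))


def shadowed(destinations):
    """Return destinations that a later mount on an ancestor directory would hide."""
    hidden = []
    for i, earlier in enumerate(destinations):
        for later in destinations[i + 1:]:
            # A later mount on an ancestor covers everything mounted beneath it.
            if PurePosixPath(later) in PurePosixPath(earlier).parents:
                hidden.append(earlier)
                break
    return hidden


# Hypothetical mount list: the child is mounted first, then its parent.
mounts = ["/mnt/parent/child", "/mnt/parent"]
print(shadowed(mounts))                # the child mount ends up hidden
print(shadowed(order_mounts(mounts)))  # nothing is hidden after reordering
```

If an implicitly added mount is appended after the mounts supplied by the container manager, a check like `shadowed` would flag the ones it covers.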
Sure, I can try out the patched binary and see if that fixes it. |
Hi @jojonium, OK, I've attached a patched sysbox-runc binary. It's based on this PR. Please stop all sysbox pods on the K8s node, then gunzip the patched sysbox-runc and copy it to the K8s node, to the location where the original sysbox-runc is located (I suggest you back up the original one just in case). Then relaunch the sysbox pod and please let me know if it fixes the problem. I've done basic testing on it, but haven't run the full sysbox test suite on the patch yet. It should work fine though 🤞.
I copied the new binary onto the node and restarted the sysbox pod but it doesn't seem to have fixed the issue: From the node:
And from the container:
I will note that the easy workaround is to set the token's |
Hi @jojonium,
Oh too bad, thanks. So strange that even with the fix I provided, the
Silly question, but just to double check: did you update the sysbox-runc on all nodes of the K8s cluster? (to make sure the pod is in fact using the updated sysbox-runc)? |
After experimenting some more I found that the token mount only fails when I have |
Ah ... interesting; since ENTRYPOINT always executes, then I believe the command must be creating a redundant execution of
From Sysbox's perspective, it knows nothing about ENTRYPOINT or command; it's simply told by the higher level runtime (e.g., K8s or Docker) what program to start the container with and with what arguments. It then checks if the program is systemd ( How does findmnt look when things work? |
Does the ENTRYPOINT really always run? For example if I set the command to
Whereas if I omit the And I think I got mixed up before,
What doesn't work is |
I see, my mistake: K8s does not use ENTRYPOINT, it's a Docker thing only.
Makes sense.
That looks good, thanks.
Yes, that explains it; I guess we could improve the detection logic in Sysbox, but in general there's no need to use the shell to execute systemd, so it's not something we would prioritize. So I think this resolves the issue: the Thanks for getting to the bottom of it!
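As a rough illustration of why wrapping systemd in a shell defeats the detection discussed above, here is a small Python sketch. It is an assumption-laden model, not Sysbox's actual check: the function name, the set of recognized program names, and the example commands are all hypothetical. The only point is that a runtime inspecting the first program it is told to start sees the shell, not systemd.

```python
import os

# Assumption for this sketch: program names treated as "systemd".
INIT_NAMES = {"systemd", "init"}


def looks_like_systemd(argv):
    """Return True if the container's first program appears to be systemd.

    Only argv[0] is inspected, mirroring a runtime that is simply told which
    program to start and with what arguments.
    """
    return bool(argv) and os.path.basename(argv[0]) in INIT_NAMES


# Hypothetical commands: the second one starts systemd through a shell.
print(looks_like_systemd(["/sbin/init"]))
print(looks_like_systemd(["sh", "-c", "exec /sbin/init"]))
```

Under this model the shell-wrapped command is not recognized, so any systemd-specific setup would be skipped even though systemd eventually runs as PID 1.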
Alright I think I fully understand it now. My use case is I want to do some initial set up in the container a shell script in |
Oh I see; is that setup something you could do as a systemd service unit, or does it need to be done before systemd starts? If the latter, is it something you could do in the Dockerfile for the image (such that when the container starts the setup is already in place), or does it need to be done at runtime? |
Yeah, it can either be baked into the image or run as a systemd service. Running shell commands at container startup was just the easiest first method I thought of, which led me to this issue.
OK cool, glad we got to the bottom of the issue then. Closing the issue now. Thanks again. |
Using an Ubuntu 20.04 (kernel 5.15.0) node running Kubernetes 1.26 and sysbox 0.6.3.
I'm trying to inject a ServiceAccount token into my pod using a Kubernetes projected volume.
Systemd works fine from inside the container as expected, but if I try to get the injected token:
If I simply change the command to
["sh", "-c", "sleep 1000"]
so it doesn't start systemd as PID 1, the token is injected successfully and I can read it.
I can see the mount with
findmnt
so I'm not sure why it's failing to actually get mounted:
I came across issue #728 while looking into this, so I thought to check the logs from
sysbox-mgr
in case shiftfs wasn't working properly, but it doesn't seem to be the same issue reported there:
I don't know why this would only happen when systemd is started as the container's PID 1; any insight is appreciated.