Dangerous file modifications by chown init container #485

Closed
lindhe opened this issue Jul 12, 2022 · 7 comments
@lindhe
Contributor

lindhe commented Jul 12, 2022

Hello!

We are Splunk Observability Cloud customers using this chart to collect logs and metrics from our Kubernetes clusters. We have been having issues caused by the chown init container.

Background

A while back, we noticed some very strange behaviour in some of our pods. Our Ingress Controllers crashed seemingly at random and were stuck in CrashLoopBackOff. Upon investigating, it turned out they could no longer read /etc/resolv.conf, which is obviously a pretty big issue for a reverse proxy... To our surprise, the file access permissions looked very strange. The file was missing the read permission it normally has, it had the setgid bit set, and, seemingly out of nowhere, its group ownership was gid 20000. How peculiar... It also turned out that this affected all pods in our cluster; it just happened to be the Ingress Controllers that showed clear symptoms.

# Reference: a pod's container for "clean" system (minikube using docker)
root@minikube:/var/lib/docker/containers# ll d23205e41c27ad199e6972751f662e3c776fab00210284cfb99fdb0aab9db68d/
total 40
drwx--x---  4 root root 4096 Jul  8 08:27 ./
drwx--x--- 20 root root 4096 Jul  8 08:28 ../
drwx------  2 root root 4096 Jul  8 08:27 checkpoints/
-rw-------  1 root root 3058 Jul  8 08:27 config.v2.json
-rw-r-----  1 root root    0 Jul  8 08:27 d23205e41c27ad199e6972751f662e3c776fab00210284cfb99fdb0aab9db68d-json.log
-rw-r--r--  1 root root 1562 Jul  8 08:27 hostconfig.json
-rw-r--r--  1 root root    3 Jul  8 08:27 hostname
-rw-r--r--  1 root root  164 Jul  8 08:27 hosts
drwx--x---  3 root root 4096 Jul  8 08:27 mounts/
-rw-r--r--  1 root root  103 Jul  8 08:27 resolv.conf
-rw-r--r--  1 root root   71 Jul  8 08:27 resolv.conf.hash

After a lot of debugging, we realized that the culprit was a change resulting from a security audit, in which we had decided to disallow running containers as root. When implementing that, we changed all of our applications to run as non-root users, including this OpenTelemetry chart. So that was quite the surprise – lowering Otel's privileges caused harm to our system!

When we understood the problem, we were able to mostly rescue things by reverting our non-root changes in Otel and manually restoring file permissions to more reasonable settings. We think we found reasonable settings, but we cannot be sure, since it is very hard to know what the state of the file system was before the chown init container did its thing (especially considering the setgid bits, which make it impossible to know the intended group ownership of newly created files)!
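In hindsight, a snapshot of the permissions taken before installing the chart would have made recovery straightforward. A hypothetical sketch, demonstrated on a scratch directory (on a real node you would point it at /var/lib/docker/containers instead):

```shell
# Hypothetical pre-install snapshot: record the mode, owner and group of
# every entry, so the pre-chown state can be diffed against later.
dir=$(mktemp -d)
touch "$dir/resolv.conf"
snapshot=$(mktemp)
find "$dir" -printf '%m %u %g %p\n' > "$snapshot"
cat "$snapshot"
```

Note that `-printf` is GNU find syntax; the snapshot records one line per file with its octal mode, user, group and path.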

Ideally, we should read through all changes in all charts and all images before installing them in our clusters. In practice, however, that is pretty hard to do. We missed this snippet, partly because we didn't expect such damaging changes from enabling rootless operation.

Problems

There are a multitude of issues with the current design, and it is worth listing them so we can weigh their impact when implementing a solution. Please help us identify any other problems that we have missed in our initial analysis!

Show-stopping problems

  • The current design will cause breaking changes to the filesystem on any Docker-based Kubernetes distribution. Since /var/lib/docker/containers holds not only logs but also other system configuration files (at least hostname, hosts and resolv.conf), it is unreasonable to change the permissions of every file in that directory.

Uncomfortable problems

  • When Otel is installed, it makes system-wide changes and never cleans up after itself when uninstalled.
  • There may be severe security implications of setting the setgid bit, particularly on files.
  • It causes changes even outside of Kubernetes! A new container started with Docker directly on the node host, outside Kubernetes, will also inherit the file permissions because of the setgid bit.
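The group inheritance described in the last point comes from the setgid bit on directories (the `s` in the group triad of the listings above; not to be confused with the sticky bit): files created inside a setgid directory inherit the directory's group rather than the creating process's group. A minimal sketch on a scratch directory; on a node, the same mechanism propagates gid 20000 from /var/lib/docker/containers into files created for every new container:

```shell
dir=$(mktemp -d)
chmod g+rxs "$dir"                    # set the setgid bit on the directory
touch "$dir/newfile"                  # new files inherit the directory's group
stat -c '%A %G %n' "$dir" "$dir/newfile"
```

Both lines of the `stat` output show the same group, and the directory's mode shows the `s` bit.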

Solutions

Granted, this is a hard problem to solve. The pod has to be able to read the logs somehow, and if it is not running as the root user then it needs some trick up its sleeve to do its job. But we think the current solution is far too damaging and dangerous to be allowed in production, so we should try to fix this.

One way to make these kinds of issues more likely to be caught would be to test on at least one cluster per container runtime, e.g. one Docker-based cluster (minikube, RKE, etc.), one containerd-based cluster (AKS) and one CRI-O-based cluster (OpenShift, maybe?).

S1

One solution, which is probably sub-optimal, is to only allow running as root:root. That's the nature of some applications. ¯\_(ツ)_/¯

S2

Another solution might be to show a clear warning to the user when this option is enabled. We think that this is not a safe enough solution, so it should be avoided. But if you insist on keeping the current implementation, then a very clear warning would at least be a minor improvement.

S3

A third solution may be to disallow running as 20000:20000 and instead run as 20000:root. That is what we have gone with currently. As far as we can tell, it grants read permission to all logs while not requiring any system-wide file changes. While this seems like the best compromise between security and usability, we are not 100% sure it always works as intended. It works fine for us (tested on AKS, RKE and minikube), but it is really hard to know exactly what one gets access to when running as 20000:root on arbitrary clusters, and the exact security implications of running with group root are unclear.

# Install otel chart with runAsUser:20000, runAsGroup:20000
# -> will trigger "chown" init-container executing these commands
#    ```
#    chgrp -Rv 20000 /var/lib/docker/containers;
#    chmod -R g+rxs /var/lib/docker/containers;
#    setfacl -n -Rm d:g:20000:rx,g:20000:rx /var/lib/docker/containers;
#    ```

# File permissions for pod created BEFORE otel chart installed
# * Ownership/permissions changed for all types of files - not only log files
# * Also note the setgid bits on several files
root@minikube:/var/lib/docker/containers# ll 3343e28b97487d101695a95753818c6ae2a82933daafc872f08cb9dac40bcbb6/
total 40
drwxr-s---+  4 root 20000 4096 Jul  8 08:19 ./
drwxr-s---+ 38 root 20000 4096 Jul  8 08:23 ../
-rw-r-s---+  1 root 20000    0 Jul  8 08:19 3343e28b97487d101695a95753818c6ae2a82933daafc872f08cb9dac40bcbb6-json.log*
drwxr-s---+  2 root 20000 4096 Jul  8 08:19 checkpoints/
-rw-r-s---+  1 root 20000 3058 Jul  8 08:19 config.v2.json*
-rw-r-sr--+  1 root 20000 1562 Jul  8 08:19 hostconfig.json*
-rw-r-sr--+  1 root 20000    3 Jul  8 08:19 hostname*
-rw-r-sr--+  1 root 20000  164 Jul  8 08:19 hosts*
drwxr-s---+  3 root 20000 4096 Jul  8 08:19 mounts/
-rw-r-sr--+  1 root 20000  103 Jul  8 08:19 resolv.conf*
-rw-r-sr--+  1 root 20000   71 Jul  8 08:19 resolv.conf.hash*

# File permissions for pod created AFTER otel chart installed
# * We don't end up with files having the setgid bit set
# * BUT - the "o+r" permission is missing on (e.g.) resolv.conf
root@minikube:/var/lib/docker/containers# ll fae1bb3fc0e04ecf54f9d68c7ff1cbaf49c94cbf9b186f06005abc5e41c17d28/
total 40
drwx--s---+  4 root root  4096 Jul  8 08:23 ./
drwxr-s---+ 38 root 20000 4096 Jul  8 08:23 ../
drwx--S---+  2 root root  4096 Jul  8 08:23 checkpoints/
-rw-------+  1 root root  3082 Jul  8 08:23 config.v2.json
-rw-r-----+  1 root root     0 Jul  8 08:23 fae1bb3fc0e04ecf54f9d68c7ff1cbaf49c94cbf9b186f06005abc5e41c17d28-json.log
-rw-r--r--+  1 root root  1562 Jul  8 08:23 hostconfig.json
-rw-r-----+  1 root root     9 Jul  8 08:23 hostname
-rw-r-----+  1 root root   170 Jul  8 08:23 hosts
drwx--s---+  3 root root  4096 Jul  8 08:23 mounts/
-rw-r-----+  1 root root   103 Jul  8 08:23 resolv.conf
-rw-r--r--+  1 root root    71 Jul  8 08:23 resolv.conf.hash

# The permissions of resolv.conf from within the pod (container) file system
$ kubectl run -it --image=alpine test-shell
ls -l /etc/resolv.conf
-rw-r-----    1 root     root           103 Jul  8 08:53 /etc/resolv.conf
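To see exactly what the init container's `chmod -R g+rxs` does to a file, its effect can be reproduced safely against a scratch tree (the chgrp and setfacl steps are omitted here; chgrp to gid 20000 requires root):

```shell
# Reproduce the init container's chmod against a scratch tree instead of
# /var/lib/docker/containers, and inspect the resulting modes.
dir=$(mktemp -d)
touch "$dir/resolv.conf"
chmod 644 "$dir/resolv.conf"
chmod -R g+rxs "$dir"
stat -c '%A %n' "$dir" "$dir/resolv.conf"   # note the "s" in each group triad
```

The regular file ends up with mode `-rw-r-sr--`: group read/execute plus the setgid bit, exactly as seen in the listings above.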
@hvaghani221
Contributor

hvaghani221 commented Jul 13, 2022

The agent only needs access to the checkpoint directory and log files. So, maybe we should modify the permissions of log files only.

PR: #486

@lindhe
Contributor Author

lindhe commented Jul 13, 2022

That definitely sounds like a step in the right direction. Do we have any way to confirm that checkpoint/ is expected to hold all relevant log files but nothing else?

@hvaghani221
Contributor

@lindhe It turned out that there is no need to change the group owner of /var/lib/docker/containers:
setfacl grants read access to non-root users without modifying the group owner.

It is fixed in #486

@lindhe
Contributor Author

lindhe commented Aug 2, 2022

Wow, that's great! Sounds like a really good solution. I'm no expert in Linux ACLs, but as far as we can tell, we identified no issues with that line.

Thanks a lot for this fix! I will test this out and report back if anything seems broken.

@lindhe
Contributor Author

lindhe commented Aug 3, 2022

A question to someone more knowledgeable about setfacl than I am: setfacl has a --restore flag; could that be used for a clean-up operation when uninstalling OpenTelemetry?

@hvaghani221
Contributor

@lindhe, the --restore flag can be used to delete ACLs, but it requires root access. So the collector itself cannot do the cleanup; you would need to run a separate job for that.

@lindhe
Contributor Author

lindhe commented Aug 3, 2022

@harshit-splunk perhaps a new pod (with root access) could be started upon uninstall via Helm's post-delete hook? I think it would be very nice if Otel could clean up after itself properly, even if the current state is much less of a mess than before.
