Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix docker image permissions of /pulsar for OpenShift #8242

Closed
wants to merge 1 commit into from

Conversation

roelrymenants
Copy link

Motivation

On OpenShift the user id in the container is random. In this case, the /pulsar folder has problematic permissions. The VOLUME statement even locks these into place for derivative images.
e.g. as a random user the /pulsar/conf folder is read-only, while the pulsar helm chart assumes it to be present and writable.

Modifications

Change ownership of everything in the /pulsar directory to be root:root, and make sure the owning group has the same rights as the owner.
This solves the problem, as OpenShift ensures the user in the container is a member of the root group.

@sijie
Copy link
Member

sijie commented Oct 21, 2020

The change seems to fix a very narrow problem. Does this approach work on other platforms?

@fransguelinckx
Copy link

fransguelinckx commented Oct 28, 2020

@sijie it is indeed a narrow problem, but it prevents us from deploying on openshift as it has stricter security requirements by default than vanilla kubernetes. On the other hand it is a change with very limited impact. Not tested on other platforms. Default group for a pod is the root group though.

@codelipenghui
Copy link
Contributor

move to 2.8.0 first.

@codelipenghui codelipenghui modified the milestones: 2.7.0, 2.8.0 Nov 17, 2020
@michaeljmarshall
Copy link
Member

I just opened #8751 to request that the pulsar Dockerfile is fixed to ensure that pulsar containers are not given the root user, as they do not need to run as root.

I am not familiar with OpenShift, so I cannot speak to a solution for this specific problem, but I do think it is a security best practice to give applications the least necessary permission to run.

As such, I do not think this PR should be merged.

@fransguelinckx
Copy link

@michaeljmarshall The PR only gives the root group the necessary permissions to access the /pulsar folder. This is because on an openshift cluster, containers are run using a random user ID.

An extract from the openshift documentation, see section "Support arbitrary user ids" (https://docs.openshift.com/container-platform/4.5/openshift_images/create-images.html)

By default, OpenShift Container Platform runs containers using an arbitrarily assigned user ID. This provides additional security against processes escaping the container due to a container engine vulnerability and thereby achieving escalated permissions on the host node.

For an image to support running as an arbitrary user, directories and files that may be written to by processes in the image should be owned by the root group and be read/writable by that group. Files to be executed should also have group execute permissions.

Adding the following to your Dockerfile sets the directory and file permissions to allow users in the root group to access them in the built image:

RUN chgrp -R 0 /some/directory && \
    chmod -R g=u /some/directory

Because the container user is always a member of the root group, the container user can read and write these files.

In conclusion, I think this PR allows for better security by supporting running pulsar by an arbitrary user ID.

@michaeljmarshall
Copy link
Member

michaeljmarshall commented Feb 2, 2021

@fransguelinckx - thanks for the added context. I am not familiar with OpenShift's container platform, so the context is helpful.

It seems to me that the challenge here comes from competing paradigms between how OpenShift runs containers by default and how EKS (and other platforms) recommend running containers. OpenShift is prescriptive about running the user as part of the root group (so it can ensure containers have the right file permissions) and yet EKS recommends just the opposite (because it expects users to ensure file permissions are correct). Here are the Amazon docs that detail best practices for using a PodSecurityPolicy to prevent containers from running as the root user and the root group (https://aws.github.io/aws-eks-best-practices/security/docs/pods/#restrict-the-containers-that-can-run-as-privileged). Here is the brief part of the policy with an explicit comment about forbidding the root group.

    fsGroup:
        rule: 'MustRunAs'
        ranges:
        # Forbid adding the root group.
        - min: 1
          max: 65535

The docker image produced by the Dockerfile in this PR would not be able to run with the PodSecurityPolicy recommended by EKS.

I'm not entirely sure how to best accommodate both paradigms. While OpenShift allows users to override a pod's UID/GID by specifying runAsUser and runAsGroup, they discourage it because it could affect other pods running in other namespaces. Here is the reference https://www.openshift.com/blog/a-guide-to-openshift-and-uids and here is the main quote:

When the application also needs to be executed under a specific custom UID, then the Pod definition needs to use a ServiceAccount with the RunAsAny privilege and the Pod needs to use the runAsUser to set the UID to the hardcoded UID. NOTE: Using a hardcoded UID is NOT recommended. Among issues with this approach is that it is prone to UID collisions with system UIDs or with UIDs of the same or different application running in a different Namespace expecting to use the same UID.

If you're deploying your bookkeeper pods on dedicated hosts, I think that removes their main concern because other namespaces won't be deployed to the nodes. I'm not sure how you're deploying these pods though.

One thing that might be beneficial to all is to remove the VOLUME instruction from pulsar's Dockerfile because it prevents file ownership updates to the /pulsar directories in derivative images. Docker does claim that specifying volumes in the Dockerfile is best practice here:

The VOLUME instruction should be used to expose any database storage area, configuration storage, or files/folders created by your docker container. You are strongly encouraged to use VOLUME for any mutable and/or user-serviceable parts of your image.

and they mention the limitation of derivative images here:

Changing the volume from within the Dockerfile: If any build steps change the data within the volume after it has been declared, those changes will be discarded.

All that said, I don't know how essential it is to use the VOLUME instruction. There are people who seem to think that it does not add enough value to a docker image, as seen in the top answer to this stack overflow.

By removing the instruction, it'd make it easier for end users to make derivative images based on the official docker images instead of having to build their own image from source.

It might also be relevant to note that kubernetes 1.20 now has a feature to manage the file permissions of volumes here. I haven't looked into this yet, but I'm wondering if it simplifies this problem by allowing the runtime to make sure the filesystem has the appropriate permissions set.

Personally, I would think there are more users who are looking for it to run as a non root user and group out of the box, which is why I submitted #8796.

Another option is to produce two images to satisfy the different requirements.

@sijie - what are your thoughts here? I think we'll need some input from the larger pulsar community.

@fransguelinckx
Copy link

Thanks, TIL about different (conflicting) paradigms on EKS and Openshift. Very curious indeed about what the broader pulsar community has to say about this.

@michaeljmarshall
Copy link
Member

michaeljmarshall commented Feb 3, 2021

@fransguelinckx - it sounds like the way to solve this (and make the container usable on both platforms) is to remove the VOLUME instruction in the Dockerfile. I brought up this issue in yesterday's pulsar community meeting, and it sounded like there was consensus that removing the instruction was the right way to go. I have just a little bit more research to do before I submit an update to my PR. I'm hoping to submit an update within 36 hours.

I believe the main reason this works for both platforms is that by leaving the volume open ended (and not declared in the Dockerfile), the file system permissions will be inherited from the host. In that way, it's up to end users (or the orchestration platform) to ensure their hosts are properly configured. I still need to double check that this is all correct, but that's my current understanding.

@michaeljmarshall
Copy link
Member

I finally understand the nuance to this problem, and I am pretty sure I overstepped when I said that the two platforms were incompatible. There is a way to make this work for both platforms.

I think some of my confusion came from the actual contents of this PR. The PR proposes the following:

RUN chown -R 0:0 /pulsar \
    && chmod -R g=u /pulsar

however, as @fransguelinckx shows, what is actually required is

RUN chgrp -R 0 /pulsar && \
    chmod -R g=u /pulsar

The key nuance here is that members of the root group need to have write permissions. The user does not have to be root, as the PR proposes. Instead, we can have the user that owns all of the files be the pulsar user and then make sure that the files are owned by the root group with the proper write permissions.

That change will work properly for the OpenShift case because the root group will have sufficient permissions and it will work for the case where we're running as a non-root user and non-root group as long as the user is the pulsar user (the group won't actually matter, but at least it will be non-root, which will satisfy the pod security policy).

As an aside, using the VOLUME instruction would make sense if everyone were specifying a volume to mount for /pulsar/data and /pulsar/conf. As it is, I don't believe that is the case, which makes it less meaningful. Also, if the volumes were mounted, they would inherit the host's filesystem permissions, which would have been one way to override the default behavior of the container.

@roelrymenants
Copy link
Author

@michaeljmarshall sounds very sensible to me.

I only added the root user chown because file ownership in the conf folder wasn't consistent. Making pulsar the owner, and root the group would indeed resolve both issues.

If you implement this in your PR, I will close this one

@michaeljmarshall
Copy link
Member

@roelrymenants - thanks for the context, that makes sense. A major part of the confusion is my inexperience with the nuances of file ownership for docker, declared volumes, and the different systems mentioned. I learned a number of things in finding the final solution mentioned above. I plan on updating my PR tonight to work for both use cases.

@roelrymenants
Copy link
Author

This issue is resolved by #8796

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants