Fix docker image permissions of /pulsar for OpenShift #8242

roelrymenants · 2020-10-12T09:12:29Z

Motivation

On OpenShift the user id in the container is random. In this case, the /pulsar folder has problematic permissions. The VOLUME statement even locks these into place for derivative images.
e.g. as a random user the /pulsar/conf folder is read-only, while the pulsar helm chart assumes it to be present and writable.

Modifications

Change ownership of everything in the /pulsar directory to be root:root, and make sure the owning group has the same rights as the owner.
This solves the problem, as OpenShift ensures the user in the container is a member of the root group.

…the pulsar docker image

sijie · 2020-10-21T05:32:21Z

The change seems to fix a very narrow problem. Does this approach work on other platforms?

fransguelinckx · 2020-10-28T08:55:00Z

@sijie it is indeed a narrow problem, but it prevents us from deploying on openshift as it has stricter security requirements by default than vanilla kubernetes. On the other hand it is a change with very limited impact. Not tested on other platforms. Default group for a pod is the root group though.

codelipenghui · 2020-11-17T03:08:49Z

move to 2.8.0 first.

michaeljmarshall · 2020-11-30T19:09:16Z

I just opened #8751 to request that the pulsar Dockerfile is fixed to ensure that pulsar containers are not given the root user, as they do not need to run as root.

I am not familiar with OpenShift, so I cannot speak to a solution for this specific problem, but I do think it is a security best practice to give applications the least necessary permission to run.

As such, I do not think this PR should be merged.

fransguelinckx · 2021-02-01T10:31:03Z

@michaeljmarshall The PR only gives the root group the necessary permissions to access the /pulsar folder. This is because on an openshift cluster, containers are run using a random user ID.

An extract from the openshift documentation, see section "Support arbitrary user ids" (https://docs.openshift.com/container-platform/4.5/openshift_images/create-images.html)

By default, OpenShift Container Platform runs containers using an arbitrarily assigned user ID. This provides additional security against processes escaping the container due to a container engine vulnerability and thereby achieving escalated permissions on the host node.

For an image to support running as an arbitrary user, directories and files that may be written to by processes in the image should be owned by the root group and be read/writable by that group. Files to be executed should also have group execute permissions.

Adding the following to your Dockerfile sets the directory and file permissions to allow users in the root group to access them in the built image:

RUN chgrp -R 0 /some/directory && \
    chmod -R g=u /some/directory

Because the container user is always a member of the root group, the container user can read and write these files.

In conclusion, I think this PR allows for better security by supporting running pulsar by an arbitrary user ID.

michaeljmarshall · 2021-02-02T19:57:29Z

@fransguelinckx - thanks for the added context. I am not familiar with OpenShift's container platform, so the context is helpful.

It seems to me that the challenge here comes from competing paradigms between how OpenShift runs containers by default and how EKS (and other platforms) recommend running containers. OpenShift is prescriptive about running the user as part of the root group (so it can ensure containers have the right file permissions) and yet EKS recommends just the opposite (because it expects users to ensure file permissions are correct). Here are the Amazon docs that detail best practices for using a PodSecurityPolicy to prevent containers from running as the root user and the root group (https://aws.github.io/aws-eks-best-practices/security/docs/pods/#restrict-the-containers-that-can-run-as-privileged). Here is the brief part of the policy with an explicit comment about forbidding the root group.

    fsGroup:
        rule: 'MustRunAs'
        ranges:
        # Forbid adding the root group.
        - min: 1
          max: 65535

The docker image produced by the Dockerfile in this PR would not be able to run with the PodSecurityPolicy recommended by EKS.

I'm not entirely sure how to best accommodate both paradigms. While OpenShift allows users to override a pod's UID/GID by specifying runAsUser and runAsGroup, they discourage it because it could affect other pods running in other namespaces. Here is the reference https://www.openshift.com/blog/a-guide-to-openshift-and-uids and here is the main quote:

When the application also needs to be executed under a specific custom UID, then the Pod definition needs to use a ServiceAccount with the RunAsAny privilege and the Pod needs to use the runAsUser to set the UID to the hardcoded UID. NOTE: Using a hardcoded UID is NOT recommended. Among issues with this approach is that it is prone to UID collisions with system UIDs or with UIDs of the same or different application running in a different Namespace expecting to use the same UID.

If you're deploying your bookkeeper pods on dedicated hosts, I think that removes their main concern because other namespaces won't be deployed to the nodes. I'm not sure how you're deploying these pods though.

One thing that might be beneficial to all is to remove the VOLUME instruction from pulsar's Dockerfile because it prevents file ownership updates to the /pulsar directories in derivative images. Docker does claim that specifying volumes in the Dockerfile is best practice here:

The VOLUME instruction should be used to expose any database storage area, configuration storage, or files/folders created by your docker container. You are strongly encouraged to use VOLUME for any mutable and/or user-serviceable parts of your image.

and they mention the limitation of derivative images here:

Changing the volume from within the Dockerfile: If any build steps change the data within the volume after it has been declared, those changes will be discarded.

All that said, I don't know how essential it is to use the VOLUME instruction. There are people who seem to think that it does not add enough value to a docker image, as seen in the top answer to this stack overflow.

By removing the instruction, it'd make it easier for end users to make derivative images based on the official docker images instead of having to build their own image from source.

It might also be relevant to note that kubernetes 1.20 now has a feature to manage the file permissions of volumes here. I haven't looked into this yet, but I'm wondering if it simplifies this problem by allowing the runtime to make sure the filesystem has the appropriate permissions set.

Personally, I would think there are more users who are looking for it to run as a non root user and group out of the box, which is why I submitted #8796.

Another option is to produce two images to satisfy the different requirements.

@sijie - what are your thoughts here? I think we'll need some input from the larger pulsar community.

fransguelinckx · 2021-02-03T19:23:33Z

Thanks, TIL about different (conflicting) paradigms on EKS and Openshift. Very curious indeed about what the broader pulsar community has to say about this.

michaeljmarshall · 2021-02-03T23:32:15Z

@fransguelinckx - it sounds like the way to solve this (and make the container usable on both platforms) is to remove the VOLUME instruction in the Dockerfile. I brought up this issue in yesterday's pulsar community meeting, and it sounded like there was consensus that removing the instruction was the right way to go. I have just a little bit more research to do before I submit an update to my PR. I'm hoping to submit an update within 36 hours.

I believe the main reason this works for both platforms is that by leaving the volume open ended (and not declared in the Dockerfile), the file system permissions will be inherited from the host. In that way, it's up to end users (or the orchestration platform) to ensure their hosts are properly configured. I still need to double check that this is all correct, but that's my current understanding.

michaeljmarshall · 2021-02-04T06:01:53Z

I finally understand the nuance to this problem, and I am pretty sure I overstepped when I said that the two platforms were incompatible. There is a way to make this work for both platforms.

I think some of my confusion came from the actual contents of this PR. The PR proposes the following:

RUN chown -R 0:0 /pulsar \
    && chmod -R g=u /pulsar

however, as @fransguelinckx shows, what is actually required is

RUN chgrp -R 0 /pulsar && \
    chmod -R g=u /pulsar

The key nuance here is that members of the root group need to have write permissions. The user does not have to be root, as the PR proposes. Instead, we can have the user that owns all of the files be the pulsar user and then make sure that the files are owned by the root group with the proper write permissions.

That change will work properly for the OpenShift case because the root group will have sufficient permissions and it will work for the case where we're running as a non-root user and non-root group as long as the user is the pulsar user (the group won't actually matter, but at least it will be non-root, which will satisfy the pod security policy).

As an aside, using the VOLUME instruction would make sense if everyone were specifying a volume to mount for /pulsar/data and /pulsar/conf. As it is, I don't believe that is the case, which makes it less meaningful. Also, if the volumes were mounted, they would inherit the host's filesystem permissions, which would have been one way to override the default behavior of the container.

roelrymenants · 2021-02-04T09:05:17Z

@michaeljmarshall sounds very sensible to me.

I only added the root user chown because file ownership in the conf folder wasn't consistent. Making pulsar the owner, and root the group would indeed resolve both issues.

If you implement this in your PR, I will close this one

michaeljmarshall · 2021-02-04T19:30:50Z

@roelrymenants - thanks for the context, that makes sense. A major part of the confusion is my inexperience with the nuances of file ownership for docker, declared volumes, and the different systems mentioned. I learned a number of things in finding the final solution mentioned above. I plan on updating my PR tonight to work for both use cases.

roelrymenants · 2021-02-16T07:01:17Z

This issue is resolved by #8796

Give root:root the necessary permissions on the /pulsar directory in …

6d54827

…the pulsar docker image

roelrymenants force-pushed the docker-openshift branch from 5791541 to 6d54827 Compare October 12, 2020 13:55

sijie assigned roelrymenants Oct 21, 2020

sijie added this to the 2.7.0 milestone Oct 21, 2020

codelipenghui modified the milestones: 2.7.0, 2.8.0 Nov 17, 2020

michaeljmarshall mentioned this pull request Feb 4, 2021

[Issue 8751] Update Dockerfile for Pulsar and Dashboard to Create and Use pulsar User (nonroot user) #8796

Merged

1 task

roelrymenants closed this Feb 16, 2021

roelrymenants deleted the docker-openshift branch February 16, 2021 07:01

michaeljmarshall mentioned this pull request Jul 8, 2021

[Docker Images] Default Pulsar docker images to run as a non-root user #11269

Closed

sijie mentioned this pull request Jul 9, 2021

ISSUE-11269: [Docker Images] Default Pulsar docker images to run as a non-root user streamnative/pulsar-archived#2755

Open

michaeljmarshall mentioned this pull request Nov 24, 2021

[Docker] Enable filesystem permission updates in derivative docker im… #12956

Merged

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix docker image permissions of /pulsar for OpenShift #8242

Fix docker image permissions of /pulsar for OpenShift #8242

roelrymenants commented Oct 12, 2020

sijie commented Oct 21, 2020

fransguelinckx commented Oct 28, 2020 •

edited

Loading

codelipenghui commented Nov 17, 2020

michaeljmarshall commented Nov 30, 2020

fransguelinckx commented Feb 1, 2021

michaeljmarshall commented Feb 2, 2021 •

edited

Loading

fransguelinckx commented Feb 3, 2021

michaeljmarshall commented Feb 3, 2021 •

edited

Loading

michaeljmarshall commented Feb 4, 2021

roelrymenants commented Feb 4, 2021

michaeljmarshall commented Feb 4, 2021

roelrymenants commented Feb 16, 2021

Fix docker image permissions of /pulsar for OpenShift #8242

Fix docker image permissions of /pulsar for OpenShift #8242

Conversation

roelrymenants commented Oct 12, 2020

Motivation

Modifications

sijie commented Oct 21, 2020

fransguelinckx commented Oct 28, 2020 • edited Loading

codelipenghui commented Nov 17, 2020

michaeljmarshall commented Nov 30, 2020

fransguelinckx commented Feb 1, 2021

michaeljmarshall commented Feb 2, 2021 • edited Loading

fransguelinckx commented Feb 3, 2021

michaeljmarshall commented Feb 3, 2021 • edited Loading

michaeljmarshall commented Feb 4, 2021

roelrymenants commented Feb 4, 2021

michaeljmarshall commented Feb 4, 2021

roelrymenants commented Feb 16, 2021

fransguelinckx commented Oct 28, 2020 •

edited

Loading

michaeljmarshall commented Feb 2, 2021 •

edited

Loading

michaeljmarshall commented Feb 3, 2021 •

edited

Loading