OSDOCS12626 GA User Name Space in OpenShift 4.20 #84966

Merged
mburke5678 merged 1 commit into openshift:main from mburke5678:node-user-namespace-ga on Sep 17, 2025

Conversation

@mburke5678
Contributor

@mburke5678 mburke5678 commented Nov 14, 2024

https://issues.redhat.com/browse/OSDOCS-12626

Preview: Running pods in Linux user namespaces -- New assembly and included module.

QE review:

  • QE has approved this change.

@mburke5678 mburke5678 added this to the Planned for 4.18 GA milestone Nov 14, 2024
@openshift-ci openshift-ci bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Nov 14, 2024
<3> Specifies the UID the container is run with.
<4> Specifies which primary GID the container is run with.
<5> Requests that the pod is to be run in a user namespace. If `true`, the pod runs in the host user namespace. If `false`, the pod runs in a new user namespace that is created for the pod. The default is `true`.
<3> Specifies the type of proc mount to use for the containers. The `unmasked` value ensures that a container's `/proc` is mounted as read-write by the container process. This bypasses the default masking behavior of the container runtime, and should only be used with an SCC that sets `hostUsers` to `false`. The default is `Default`.
Collaborator

🤖 [error] RedHat.TermsErrors: Use 'read/write' rather than 'read-write'. For more information, see RedHat.TermsErrors.

@openshift-ci openshift-ci bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 22, 2024
@mburke5678
Contributor Author

@haircommander Can you PTAL?

After you save the changes, new machine configs are created, the machine config pools are updated, and scheduling on each node is disabled while the change is being applied.
Also, configuring workloads as `procMount: unmasked` is generally considered safe within a user namespace. Setting `procMount` to `unmasked` has benefits that are beyond the scope of this documentation.

To ensure that the namespace functionality exists on all nodes that you want to run in a user namespace, you can configure the minimum version of kubelet that is required for the nodes in your cluster. If the kubelet version in your cluster is lower than this version, new nodes are not scheduled and existing nodes are marked as degraded. For existing nodes with a lower version, the kubelet can only read the node object by using `oc get` or `oc update` commands, and the actions that the node can perform, by using a `SelfSubjectAccessReview`. The node is not allowed to gain access to any other API objects.
Member

If the kubelet version in your cluster is lower than this version, new nodes are not scheduled and existing nodes are marked as degraded

actually minimumKubeletVersion cannot be set if the current nodes in the cluster are too old. minimumKubeletVersion, once set, only guarantees that new nodes in the cluster are new enough

Member

For existing nodes with a lower version, the kubelet can only read the node object by using oc get or oc update commands

the kubelet doesn't use oc; you can just say the kubelet can only read and update its own node object

Contributor Author

@mburke5678 mburke5678 Dec 2, 2024

@haircommander

actually minimumKubeletVersion cannot be set if the current nodes in the cluster are too old.

Does "too old" suggest v1.29 and lower?

Member

older than the defined MinimumKubeletVersion

Contributor Author

@haircommander

For existing nodes with a lower version, the kubelet can only read its node object.

Is this statement true only if I set a minimum kubelet version? Are there any negative ramifications that the user who runs lower kubelet version(s) should be aware of?

Member

Is this statement true only if I set a minimum kubelet version

yeah

Are there any negative ramifications that the user who runs lower kubelet version(s) should be aware of?

There are two cases:

  • if a lower kubelet version exists in the cluster when you attempt to set the min kubelet version, validation rejects the min kubelet update
  • if a node with a lower version is added after the min kubelet version is established, that node cannot connect to the cluster meaningfully, and it takes manual intervention to fix
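
The two cases above can be sketched roughly as follows. This is only an illustration of the described semantics, not the actual validation code in OpenShift; the function names (`parse_version`, `can_set_minimum`, `node_meets_minimum`) and the version strings are hypothetical.

```python
import re


def parse_version(version: str) -> tuple[int, ...]:
    """Parse a string such as 'v1.30.4' or '1.30.4' into (1, 30, 4)."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return tuple(int(part) for part in match.groups())


def can_set_minimum(min_version: str, existing_node_versions: list[str]) -> bool:
    """Case 1: setting the minimum is rejected if any existing node is older."""
    minimum = parse_version(min_version)
    return all(parse_version(v) >= minimum for v in existing_node_versions)


def node_meets_minimum(node_version: str, min_version: str) -> bool:
    """Case 2: a node added after the minimum is set must be at least that version."""
    return parse_version(node_version) >= parse_version(min_version)
```

Under these assumptions, `can_set_minimum("1.30.0", ["v1.29.5", "v1.31.0"])` models the rejected update, and `node_meets_minimum("v1.29.9", "1.30.0")` models a late-joining node that would need manual intervention.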

@mburke5678
Contributor Author

@lyman9966 When appropriate, can you please review this PR for QE? I believe you have not started testing the feature itself.

@mburke5678 mburke5678 changed the title from "GA User Name Space" to "GA User Name Space in OpenShift 4.19" Jan 27, 2025
@mburke5678 mburke5678 changed the title from "GA User Name Space in OpenShift 4.19" to "OSDOCS12626 GA User Name Space in OpenShift 4.19" Jan 27, 2025
@kalexand-rh kalexand-rh removed this from the Planned for 4.18 GA milestone Feb 20, 2025
@bergerhoffer
Contributor

The branch/enterprise-4.19 label has been added to this PR.

This is because your PR targets the main branch and is labeled for enterprise-4.18. And any PR going into main must also target the latest version branch (enterprise-4.19).

If the update in your PR does NOT apply to version 4.19 onward, please re-target this PR to go directly into the appropriate version branch or branches (enterprise-4.x) instead of main.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 24, 2025
@mburke5678 mburke5678 changed the title from "OSDOCS12626 GA User Name Space in OpenShift 4.19" to "OSDOCS12626 GA User Name Space in OpenShift 4.20" May 14, 2025
@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 13, 2025
@mburke5678
Contributor Author

@lyman9966 Can you PTAL for QE?


:FeatureName: Support for Linux user namespaces
include::snippets/technology-preview.adoc[]
You can configure Linux user namespace use by setting the `hostUsers` parameter to `false` in the pod spec, and a few other configurations, as shown in the following procedure.

I think we can change "You can configure Linux user namespace use" to "You can configure Linux user namespace"

<2> Specifies whether the pod is to be run in a user namespace. If `false`, the pod runs in a new user namespace that is created for the pod. If `true`, the pod runs in the host user namespace. The default is `true`.
<3> `capabilities` permit privileged actions without giving full root access. Technically, setting capabilities inside of a user namespace is safer than setting them outside, as the scope of the capabilities is limited by being inside a user namespace, and can generally be considered to be safe. However, giving capabilities like `CAP_SYS_ADMIN` to any untrusted workload could increase the potential kernel surface area that a containerized process has access to and could find exploits in. Thus, capabilities inside of a user namespace are allowed at `baseline` level in pod security admission.
<4> Specifies that processes inside the container run with a user that has any UID other than 0.
<5> Optional: Specifies the type of proc mount to use for the containers. The `unmasked` value ensures that a container's `/proc` file system is mounted as read/write by the container process. The default is `Default`.

[The default is Default] should be [The default is masked]?

Contributor Author

@haircommander Can you help with this ^^?

Contributor Author

@lyman9966 This is from the help text in the web console:

procMount denotes the type of proc mount to use for the containers. The default value is Default which uses the container runtime defaults for readonly paths and masked paths. This requires the ProcMountType feature flag to be enabled. Note that this field cannot be set when spec.os.name is windows.

Possible enum values:

- `"Default"` uses the container runtime defaults for readonly and masked paths for /proc. Most container runtimes mask certain paths in /proc to avoid accidental security exposure of special devices or information.

- `"Unmasked"` bypasses the default masking behavior of the container runtime and ensures the newly created /proc the container stays in tact with no modifications.

Allowed Values:

Default
Unmasked

Contributor Author

I created a pod without specifying a procMount. But I am not sure where the parameter is shown.

I tried with the following:

  1. Set procMount: Default.
  2. Do not specify procMount.

Both lead to the same result as procMount: Unmasked (that is, rw permission). I'm not sure whether this is a bug.

    % oc exec nested-podman -- mount | grep "/proc type proc"
    proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)

@haircommander Can you take a look? Thanks

Member

@lyman9966 can you show me the pod spec? I used

apiVersion: v1
kind: Pod
metadata:
  name: nested-podman-1
  annotations:
    io.kubernetes.cri-o.Devices: "/dev/net/tun"
spec:
  hostUsers: false # user namespace
  containers:
  - name: nested-podman
    image: docker.io/lyman9966/baseline-nested-container:v1.0
    args:
    - sleep
    - "1000000"
    securityContext:
      runAsUser: 0
      #      procMount: Unmasked
      capabilities:
        add:
        - "SETGID"
        - "SETUID"

and got

 $ oc exec -ti pod/nested-podman-1 -- /bin/sh                                                              
sh-5.2# ls ^C      
sh-5.2#       
sh-5.2# mount | grep proc
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
tmpfs on /proc/acpi type tmpfs (ro,relatime,context="system_u:object_r:container_file_t:s0:c228,c594",size=0k,uid=3725918208,gid=3725918208,inode64)
devtmpfs on /proc/interrupts type devtmpfs (ro,nosuid,seclabel,size=4096k,nr_inodes=1998457,mode=755,inode64)
devtmpfs on /proc/kcore type devtmpfs (ro,nosuid,seclabel,size=4096k,nr_inodes=1998457,mode=755,inode64)
devtmpfs on /proc/keys type devtmpfs (ro,nosuid,seclabel,size=4096k,nr_inodes=1998457,mode=755,inode64)
tmpfs on /proc/scsi type tmpfs (ro,relatime,context="system_u:object_r:container_file_t:s0:c228,c594",size=0k,uid=3725918208,gid=3725918208,inode64)
devtmpfs on /proc/timer_list type devtmpfs (ro,nosuid,seclabel,size=4096k,nr_inodes=1998457,mode=755,inode64)
proc on /proc/bus type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/fs type proc (ro,nosuid,nodev,noexec,relatime) 
proc on /proc/irq type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sys type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sysrq-trigger type proc (ro,nosuid,nodev,noexec,relatime)

as expected

Member

setting to default gives me the same

@haircommander I think I got the same results as you, but I misunderstood the meaning of a masked procMount. I thought the "rw" mount of /proc should not appear in the output of "mount | grep proc":
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
But it is indeed OK, because there are "ro" mounts for the paths under /proc.

Member

yeah unmasked means one rw mount, rather than the default which has many ro mounts (basically it's a limited view into proc)
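
As a rough way to check this distinction from `mount` output, the following sketch lists the read-only mounts found beneath `/proc`. It assumes the line format seen in the transcript earlier in this thread; the function name `read_only_proc_submounts` and the `SAMPLE` constant are hypothetical, and the sample lines are copied from that transcript.

```python
import re

# Matches lines such as: proc on /proc/sys type proc (ro,nosuid,nodev)
MOUNT_LINE = re.compile(r"^(\S+) on (\S+) type (\S+) \(([^)]*)\)")

# A few lines taken from the default (masked) transcript in this thread.
SAMPLE = """\
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
devtmpfs on /proc/kcore type devtmpfs (ro,nosuid,seclabel,size=4096k,nr_inodes=1998457,mode=755,inode64)
proc on /proc/sys type proc (ro,nosuid,nodev,noexec,relatime)
"""


def read_only_proc_submounts(mount_output: str) -> list[str]:
    """Return mount targets under /proc/ that carry the 'ro' option."""
    paths = []
    for line in mount_output.splitlines():
        match = MOUNT_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that do not look like mount entries
        target = match.group(2)
        options = match.group(4).split(",")
        if target.startswith("/proc/") and "ro" in options:
            paths.append(target)
    return paths
```

With the default proc mount, the returned list is non-empty (the masked paths). With `procMount: Unmasked`, the expectation from the discussion above is a single rw `/proc` mount and therefore an empty list.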

----
<1> Specifies the machine config pool label.
<2> Specifies the container runtime to deploy.
To require a specific SCC for a workload, set the `openshift.io/required-scc` annotation in the object specification. For more information, see "Configuring a workload to require a specific SCC". Alternatively, you can add an SCC to a specific user or group by using the `oc adm policy add-scc-to-user` or `oc adm policy add-scc-to-group` command. For more information, see the "OpenShift CLI administrator command reference".

In my test, setting only the openshift.io/required-scc annotation in the object specification does not pass pod admission.
For a non-admin user, if I set only the openshift.io/required-scc annotation, the following error is returned:
unable to validate against any security context constraint: provider "restricted-v3": Forbidden: not usable by user or serviceaccount

Contributor Author

@haircommander Can you help with this ^^?

Member

you can remove the required-scc part I think

@mburke5678 mburke5678 force-pushed the node-user-namespace-ga branch 2 times, most recently from e82c4e2 to 1d38662 Compare September 15, 2025 15:56
@mburke5678 mburke5678 added the merge-review-needed Signifies that the merge review team needs to review this PR label Sep 15, 2025
@lyman9966

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 16, 2025
@haircommander
Member

/lgtm

@lahinson lahinson added merge-review-in-progress Signifies that the merge review team is reviewing this PR and removed merge-review-needed Signifies that the merge review team needs to review this PR labels Sep 17, 2025
Contributor

@lahinson lahinson left a comment

@mburke5678 Looks pretty good. I caught one tiny thing to fix (a missing word) and made a couple of suggestions to handle in a future PR. Feel free to merge this when you're ready.

@lahinson lahinson removed the merge-review-in-progress Signifies that the merge review team is reviewing this PR label Sep 17, 2025
@mburke5678 mburke5678 force-pushed the node-user-namespace-ga branch from 1d38662 to cfbac32 Compare September 17, 2025 17:39
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Sep 17, 2025
@openshift-ci

openshift-ci bot commented Sep 17, 2025

New changes are detected. LGTM label has been removed.

@openshift-ci

openshift-ci bot commented Sep 17, 2025

@mburke5678: all tests passed!

@mburke5678
Contributor Author

/cherrypick enterprise-4.20

@openshift-cherrypick-robot

@mburke5678: once the present PR merges, I will cherry-pick it on top of enterprise-4.20 in a new PR and assign it to you.

@mburke5678 mburke5678 merged commit 9306b4f into openshift:main Sep 17, 2025
2 checks passed
@mburke5678 mburke5678 deleted the node-user-namespace-ga branch September 17, 2025 19:51
@mburke5678
Contributor Author

/cherrypick enterprise-4.20

@openshift-cherrypick-robot

@mburke5678: new pull request created: #99308

@openshift-cherrypick-robot

@mburke5678: new pull request could not be created: failed to create pull request against openshift/openshift-docs#enterprise-4.20 from head openshift-cherrypick-robot:cherry-pick-84966-to-enterprise-4.20: status code 422 not one of [201], body: {"message":"Validation Failed","errors":[{"resource":"PullRequest","code":"custom","message":"A pull request already exists for openshift-cherrypick-robot:cherry-pick-84966-to-enterprise-4.20."}],"documentation_url":"https://docs.github.com/rest/pulls/pulls#create-a-pull-request","status":"422"}

Labels

branch/enterprise-4.20 size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
