support user namespace #8170
Comments
Hi, I would like to work on this if that's OK. Thanks a lot! |
Currently, virtiofs in the kernel does not seem to support idmapped mounts, so inside the guest VM it seems unlikely that we can idmap-mount virtiofs without modifying virtiofs itself (correct me if I am wrong).
So, is it necessary to add idmap mount support for virtiofs first? @bergwolf |
@bergwolf, currently, I have made some changes to make The rest of the work is how to use the Linux new mount API and This is what inside the container: This is inside the guest VM: |
Hi! Let me know if you have any questions regarding userns support. I'm the author of the k8s KEP and worked on the implementation in k8s, containerd, runc, some bits of CRI-O, crun and the Linux kernel. I'll be happy to help with any questions to adopt this in Kata! :) There are two avenues I see to explore in Kata, regarding the advantages that userns has for containers based on Linux namespaces only:
|
Hi @rata, thanks a lot! It's going to be a great help. I am currently working on the second avenue you mentioned. As for the first one, I think the kata VM processes are all started by the root user for now, but I am not sure; correct me if I am wrong. :) For the second avenue, we do need idmap support in virtiofs, which kata uses heavily for the rootfs and volumes. |
Hi @rata, sorry to bother you again, I was wondering if the |
@yawqi cool, do you plan to work on the kernel side for virtiofs? I don't know if I'd say, let's see how to support this in kata and we can see the other containerd debugging command on the side :) |
Hi @rata, from my understanding, we do need kernel-side support in virtiofs to fully support userns for kata inside the guest VM. What do you think? Are you interested in the kernel side of virtiofs? If you are, I would be very grateful and willing to help (if there is anything I can do). If not, I would certainly like to do it myself, and any help is appreciated. :) I just hope that I can help and learn something from it. Thanks again! |
@yawqi I'm not familiar with kata, if you exec into the container and run I'm not sure if we can create an idmap mount if the fs for the rootfs (assuming the rootfs is in a fs that supports it, like ext4) and then create the virtiofs on top of that idmapped mount. Can you try that? Or is it not trivial to do so? |
@rata The output of
And this is the
And this is the mountinfo of kata runtime on the host:
Sorry for attaching so much output, I want to provide detailed info about it. :) |
Actually, in this example, the rootfs is overlayfs on the host and the v1 volume is ext4, and both support idmapped mounts. Did you mean that if the underlying fs supports idmap, virtiofs may be able to take advantage of it? Is my understanding correct? Thanks a lot, I will look into it. If I make any progress, I will let you know. :) And if you have any suggestions, please tell me! |
@yawqi sure, let me know if that works once you have tried it. Thanks! :) If the fs of the rootfs doesn't support idmap mounts, containerd does a recursive chown (it is very expensive, but it should work). But that is for regular OCI runtimes; I don't know if that happens when kata is used, or if kata can benefit from it. Let me know when you have tried it. Are you a maintainer in kata? |
@rata I am not a maintainer of kata, just trying to help and learn something about it. :) Did you mean the chown is performed by containerd when using runc? I thought it was the runtime's job to do it, so this is good to know. |
Sorry to bother you again. If it is convenient for you, could you share how you test while you develop? Could you share the config.json you use for idmapping development tests with me? It would be of great help! :) |
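To illustrate why the recursive chown fallback is expensive, here is a minimal sketch of the idea. It is not containerd's actual implementation: the function name `shift_ownership` and the single fixed `offset` are assumptions made for illustration. The point is that every inode under the rootfs has to be visited, whereas an idmapped mount remaps ownership at mount time without touching the files.

```python
import os
from typing import List, Tuple


def shift_ownership(root: str, offset: int, dry_run: bool = True) -> List[Tuple[str, int, int]]:
    """Walk `root` and shift the owner of every entry by `offset`.

    Hypothetical helper, for illustration only. With dry_run=True it only
    returns the planned (path, new_uid, new_gid) changes; actually applying
    them needs privileges (CAP_CHOWN).
    """
    changes: List[Tuple[str, int, int]] = []

    def visit(path: str) -> None:
        st = os.lstat(path)  # lstat: inspect symlinks themselves, not their targets
        new_uid, new_gid = st.st_uid + offset, st.st_gid + offset
        changes.append((path, new_uid, new_gid))
        if not dry_run:
            os.lchown(path, new_uid, new_gid)

    visit(root)
    # os.walk does not follow symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            visit(os.path.join(dirpath, name))
    return changes
```

Calling `shift_ownership("/path/to/rootfs", 100000)` only lists what would change; the cost of the real operation grows with the number of files in the image.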
Currently, to support a separate userns inside the guest VM for each container, I think there are two main problems to solve. First, setting up the userns for a container; this part can follow the runc implementation. Second, idmap-mounting the volume mounts. For the rootfs, idmap is already handled by containerd in the overlayfs snapshotter (either by idmap mount or by chown, depending on the kernel version). So I believe we should first support idmap mounts for volume mounts, and the rootfs part should be taken care of by the snapshotter, I guess? As for idmap mount support for volume mounts, containerd/containerd#7063 proposed 3 methods to handle the ownership of the rootfs, which can also provide some insight for us:
Ideally, (a) would be the best option, but it has kernel version requirements and, more importantly, the underlying filesystem needs to support idmapped mounts, which in our case is virtiofs. @rata has mentioned that we should look into whether virtiofs can make use of the underlying filesystem's idmap mount support; I will look into it, but I think it will take some time. For method (c), in our case, I believe https://github.com/cloud-hypervisor/fuse-backend-rs/pull/159 provides what we need. So maybe we can start there for volume idmap mounts, and then try to provide support via method (a). How do you feel about it? Please feel free to leave any comments and suggestions, thanks a lot! :) @bergwolf @rata |
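For readers less familiar with what a user namespace ID mapping actually does, here is a small sketch of the range translation that both problems above rely on. The names `IDMap`, `to_host` and `to_container` are hypothetical; the three fields mirror the `containerID` / `hostID` / `size` triples used in an OCI config.json. This only models the arithmetic and is not how the kernel or kata implements it.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class IDMap:
    container_id: int  # first ID of the range as seen inside the userns
    host_id: int       # first ID of the range as seen outside the userns
    size: int          # number of consecutive IDs in the range


def to_host(mappings: Sequence[IDMap], container_id: int) -> Optional[int]:
    """Translate an ID seen inside the userns to the ID outside it."""
    for m in mappings:
        if m.container_id <= container_id < m.container_id + m.size:
            return m.host_id + (container_id - m.container_id)
    return None  # unmapped ID


def to_container(mappings: Sequence[IDMap], host_id: int) -> Optional[int]:
    """Translate an outside ID to the ID seen inside the userns."""
    for m in mappings:
        if m.host_id <= host_id < m.host_id + m.size:
            return m.container_id + (host_id - m.host_id)
    return None  # unmapped; the kernel shows such owners as the overflow ID
```

With a single mapping `IDMap(0, 100000, 65536)`, root in the container (`0`) corresponds to `100000` outside, and a file owned by an ID outside the range has no valid owner from the container's point of view. That is why volumes need either an idmapped mount or a chown to be usable.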
@yawqi Sorry for being silent. I'll be away until the end of the month, sorry :( Some notes:
No, that is for the rootfs. For volumes we are using idmap mounts, but the OCI runtime is doing them (runc or crun, for example). If you are not a maintainer here, I wonder what the maintainers think about what the integration with userns should look like. Their input is probably valuable before doing more work.
I don't have one handy now. But you can see the tests for that (https://github.com/opencontainers/runc/blob/a2ba98557d25996532fae62fe560cdb7688a74a2/tests/integration/idmap.bats). It basically runs "runc spec" to get a default config.json and runs the "update_config" function, which just uses jq to modify the JSON. You should be able to infer the config from that, or even run the tests and add a step that copies the config.json to another dir so you can use it. |
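As a rough sketch of what such a config.json modification looks like, the following Python does the same kind of edit on a parsed config. It is not the `update_config` function from the runc tests; the helper names `with_userns` and `with_idmapped_bind`, the paths, and the ID values are assumptions for illustration. The field names (`linux.namespaces`, `linux.uidMappings`, `linux.gidMappings`, and per-mount `uidMappings` / `gidMappings`) follow the OCI runtime spec as I understand it, so please check them against the spec version your runtime supports.

```python
import copy


def _mapping(host_id: int, size: int) -> list:
    # One contiguous range: container ID 0 maps to host_id. Values are illustrative.
    return [{"containerID": 0, "hostID": host_id, "size": size}]


def with_userns(config: dict, host_id: int = 100000, size: int = 65536) -> dict:
    """Return a copy of `config` that requests a user namespace with one ID range."""
    cfg = copy.deepcopy(config)
    linux = cfg.setdefault("linux", {})
    namespaces = linux.setdefault("namespaces", [])
    if not any(ns.get("type") == "user" for ns in namespaces):
        namespaces.append({"type": "user"})
    linux["uidMappings"] = _mapping(host_id, size)
    linux["gidMappings"] = _mapping(host_id, size)
    return cfg


def with_idmapped_bind(config: dict, source: str, destination: str,
                       host_id: int = 100000, size: int = 65536) -> dict:
    """Return a copy of `config` with an extra bind mount carrying its own ID mappings."""
    cfg = copy.deepcopy(config)
    cfg.setdefault("mounts", []).append({
        "source": source,
        "destination": destination,
        "options": ["bind"],
        "uidMappings": _mapping(host_id, size),
        "gidMappings": _mapping(host_id, size),
    })
    return cfg
```

Typical use would be to load the output of `runc spec` with `json.load`, pass it through both helpers, and write the result back with `json.dump`.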
Is your feature request related to a problem? Please describe.
Following the Kubernetes user namespace story and KEP, containerd has merged idmapped mount support and the relevant runc bits.
Now it is time for kata to join the party and integrate with containerd to handle idmapped mounts and user namespaces, so that rootless containers can be truly rootless even inside the guest.