v2 initContainer can't run as non-root #407
It seems that the problem is that an MPIJob cannot run in an istio-enabled namespace. In a plain namespace, it runs OK. Is this expected? Is mpi-operator currently incompatible with istio?
The problem with an istio namespace is that an istio proxy sidecar is injected. Until the istio proxy is running, no pod can reach the network, including init containers. However, the istio proxy does not start until all init containers finish. The mpi-operator runs a kubectl-delivery init container that needs to access the kube API, which is not possible until the istio proxy starts; the proxy, in turn, does not start until the init container finishes. Hence the deadlock. This is a well-known istio bug: istio/istio#11130. The situation could be resolved if the kubectl-delivery init container could somehow be forced to start before the istio-validation init container. Not sure if this is possible.
Can you add a label to disable injection?
I could, but I guess that is the same as a new namespace without istio, isn't it? The question is whether the mpi-operator can work in istio-enabled namespaces or not. If not, maybe it is worth mentioning in the README, as the whole Kubeflow framework actually needs istio ;)
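For reference, istio's standard per-pod opt-out is the sidecar.istio.io/inject annotation on the pod template. A sketch of where it would go in an MPIJob (the replica layout below is assumed from the v1 API, and all names are placeholders):

```yaml
apiVersion: kubeflow.org/v1
kind: MPIJob
metadata:
  name: pi-example                # placeholder name
spec:
  mpiReplicaSpecs:
    Launcher:
      replicas: 1
      template:
        metadata:
          annotations:
            sidecar.istio.io/inject: "false"   # istio skips sidecar injection for this pod
        spec:
          containers:
          - name: launcher
            image: example/mpi-pi:latest       # placeholder image
```

The same annotation would have to go on the Worker template as well if the workers are also deadlocking on injected sidecars.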
/cc @andreyvelich I remember that we encountered this problem before in Katib.
We disabled the istio sidecar for Katib Trials because we use an external source to download the dataset: https://github.com/kubeflow/katib/blob/master/examples/v1beta1/mxnet-mnist/mnist.py#L44. What do you think @alculquicondor @terrytangyuan ?
I haven't tried it with Istio before, so I would recommend adding a note to the README regarding this, unless others have had success using them together.
The v2 controller doesn't have init containers that access the network. Please consider giving it a try. I'm currently working on documenting it, but it's already fully functional. You can build it with
Then you can deploy it with
Ok, thanks, I will give it a try. Does it use ssh, or does it work without root? I am interested in the latter. Not sure whether to create a new issue or just mention it here, but it seems that if an MPI job fails (e.g. due to the istio problem) and you delete and recreate it, then no pods ever get created on the second try. I need to rename the job.
Yes, it uses ssh, and it works without root, but you need to make sure the image supports it. See the example in
Please open a separate issue with precise steps, wait times, log lines, etc.
Unfortunately, it requires root for the init-ssh container. If you have a restrictive PSP, it fails. I will try to fix it so it can run on a PSP-enabled cluster and provide a patch if there is interest. edit:
This is the launcher definition:
and I believe this code should set runAsUser for the init-ssh container as well, but apparently it does not.
Ah, nope, my bad. It is not implemented for the init-ssh container.
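For context, the fix being discussed would amount to the operator stamping a securityContext onto the init container as well as the main containers. A hand-written sketch of the desired pod spec (the UID and image names are assumptions, not what the operator actually emits):

```yaml
spec:
  initContainers:
  - name: init-ssh
    image: mpioperator/init-ssh:latest   # placeholder image
    securityContext:
      runAsUser: 1000                    # assumed non-root UID, same as the main container
  containers:
  - name: launcher
    image: example/mpi-pi:latest         # placeholder image
    securityContext:
      runAsUser: 1000
```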
Unfortunately, the init container needs root permissions to be able to change the owner of the volumes. This is a kubernetes limitation. One thing we could do is to have an option to disable the init container, but then your image would need to know how to grab the ssh keys from
This was actually my initial approach, but it put a burden on the author of the image to know how to configure the
Would this be acceptable to you? I'm happy to review a change like that.
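One possibly relevant mechanism, as an aside (and only applicable to volume types the kubelet manages, such as emptyDir and secrets): pod.spec.securityContext.fsGroup asks Kubernetes itself to set group ownership and permissions on volumes, which can sometimes replace a root chown in an init container. A sketch with assumed IDs:

```yaml
spec:
  securityContext:
    fsGroup: 1000        # kubelet sets group ownership of supported volumes to GID 1000
  containers:
  - name: launcher
    image: example/mpi-pi:latest   # placeholder image
    securityContext:
      runAsUser: 1000
      runAsGroup: 1000
```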
What about using a PVC for home? The init containers are good because you do not need any special initialization in the MPI containers. PVCs for home already work OK. So perhaps add an option for home storage instead of emptyDir, and allow the init container to run as the user?
Do you mean having
I don't think it being a PVC or emptyDir would make a difference.
You are probably right. However, I found how to mitigate this problem. Just set
The only thing I needed in the operator was to remove the chmod 0700 /mnt/home-ssh and the chown of /mnt/home-ssh from the init container. If this could be parametrized somehow, the operator would support running without root access.
Oh, I didn't know about
I think you should be able to change the port to 2222 and the ssh_config on your own images. I prefer to keep the sample simple. The operator doesn't need to know about the port.
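Changing the port on the image side is a plain OpenSSH configuration edit. A sketch of what could be baked into the images (the paths are the usual OpenSSH defaults, not something the operator dictates):

```
# client side (launcher image), e.g. /etc/ssh/ssh_config
Host *
    Port 2222

# server side (worker image), /etc/ssh/sshd_config
Port 2222
```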
It sounds like you have a working prototype. Can you send a PR?
I think we can simply remove those lines and recommend that users set
Since the workloads run in a container, there is no need to "secure" the home directory and ssh keys further.
I use emptyDir for ~/.ssh now, with terrible access rights (0777), but it just works with
Yeah, port 2222 is good, perhaps for a non-root example. The prototype is currently like this:
But if you are ok with just removing them, no other change is needed. So if you are happy to remove those lines, I can provide at least a working PI non-root example.
Can we go further and remove the initContainer altogether? The only reason we needed it was the tight file-permission requirements. If we can get rid of those checks with
/retitle v2 initContainer can't run as non-root
That didn't work. @xhejtman, can you retitle the issue as suggested above?
We could, if it is ok that ~/.ssh/known_hosts is not writable. Mounted secrets are read-only.
I think we would have to mount the secret somewhere else, for example
Or you can just use UserKnownHostsFile to point known_hosts to a different location. Or, maybe a bit safer, pre-generate host keys and populate /etc/ssh/ssh_known_hosts. Not sure if that is easy to do currently.
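The UserKnownHostsFile route is a one-line client-side setting; a sketch (the writable path is just an example):

```
# /etc/ssh/ssh_config on the launcher image (sketch)
Host *
    UserKnownHostsFile /tmp/known_hosts   # writable location outside the read-only secret mount
```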
Generating known_hosts in advance would be impossible, because we don't know the Pod IPs before they start. Having a different
So it comes down to what is easier to use. Probably directly mounting the Volume in
Yes, you are right, letting the user decide where to mount the ssh keys is the best option. Whether they mount it at ~/.ssh or not is up to them. So if you release a new version of the operator, I can test it and provide a PI example without root.
I'm still working on it. After removing the init container, the non-root use case works, but root doesn't. This is the error from the launcher:
Are you aware of a way to disable that check?
Are you mounting a secret? I think you can set the mode for individual files: defaultMode: 400. Just checked: for an ssh connection, access rights 0777 on .ssh do not matter.
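For reference, the Secret volume mode knob looks like this (the secret and volume names are placeholders; Kubernetes interprets the mode as octal, conventionally written 0400):

```yaml
volumes:
- name: ssh-auth
  secret:
    secretName: mpi-ssh-keys   # placeholder secret name
    defaultMode: 0400          # octal 0400 = owner read-only (decimal 256)
```

Per-file modes can also be set via items[].mode if only some keys in the secret need to be restricted.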
Yes, I'm mounting the secret directly. However, I cannot use
Hello,
I use kubeflow 1.3.0 and the latest mpi-operator from Jul 5 2021. I tried the horovod MPI Job example; however, the launcher deployment has an error in the kubectl-delivery pod:
Failed to list *v1.Pod: Get https://10.43.0.1:443/api/v1/namespaces/adrian/pods?limit=500&resourceVersion=0: dial tcp 10.43.0.1:443: connect: connection refused
I tested it: the pod is unable to reach any other node or the external internet on ports 80 or 443. I tested the two workers and they are OK; they can reach anything. One of the workers is running on the same physical node as the launcher, and the worker can reach the network while the launcher cannot. The launcher can ping the internet (e.g. 8.8.8.8).
Is it some istio misconfiguration problem? I tried to forcibly start the launcher on a different node; the result is the same. Thanks for your help.