Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[openmpi] support non-root users run jobs (introducing runAsUser, runAsGroup, supplementalGroups) #820

Merged
merged 3 commits into from May 19, 2018

Conversation

everpeace
Copy link
Contributor

@everpeace everpeace commented May 17, 2018

Fixes #793

Changes

  • introduce runAsUser, runAsGroup, supplementalGroups parameters.
  • If these parameters set, this package will put securityContext to the master/worker pods.
  • changed init.sh to support non-root users can spawn sshd servers.
  • changed openmpi-controller path from /root to /openmpi-controller so that non-root users can execute the controller
    • this bump version of openmpi-controller docker image.

Note

new openmpi-controller's docker image needs to be published before merging this.

@jiezhang I need your assistance. Could you build and publish jiezhang/openmpi-controller:0.0.3 before this pr is approved 🙇 ??


This change is Reviewable

@everpeace
Copy link
Contributor Author

/assign @jiezhang @inc0
/cc @alsrgv @jlewi

@pdmack
Copy link
Member

pdmack commented May 17, 2018

@everpeace, please autoformat_jsonnet.sh.

@jiezhang
Copy link

Review status: 0 of 7 files reviewed at latest revision, all discussions resolved, some commit checks failed.


components/openmpi-controller/Dockerfile, line 7 at r1 (raw file):

ENV HOME /root

RUN mkdir -p /openmpi-controller

Can you move it to /kubeflow/openmpi/openmpi-controller? Just be consistent with other artifacts (like assets) deployed by this package.


kubeflow/openmpi/assets/init.sh, line 61 at r1 (raw file):

cp ${OPENMPI_DIR}/assets/ssh_config ${HOME}/.ssh/config

SSHD_CONFIG=${OPENMPI_DIR}/assets/sshd_config

nit: all constants declared at the top of the file.


Comments from Reviewable

@jiezhang
Copy link

@everpeace Sorry I'm unable to move the package to kubeflow due to #197. I'll publish new package once this is merged.

@jiezhang
Copy link

Review status: 0 of 7 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


kubeflow/openmpi/assets/init.sh, line 64 at r1 (raw file):

# when this runs by non-root user.
# we have to create ephemeral hostkeys
if [ "$(id -u)" != "0" ]; then

Consider moving this into a helper function.


kubeflow/openmpi/prototypes/openmpi.jsonnet, line 28 at r1 (raw file):

// @optionalParam uploadDataFrom string null Local path where data are uploaded from.
// @optionalParam uploadDataTo string null URI where data are uploaded to. Namespace, name, and pod will be appended to the URI. Only S3 bucket is supported at the moment.
// @optionalParam runAsUser string null uid of the first process of containers in master/worker pods.  If not set, this will be default value of your cluster configuration.

nit: one space


Comments from Reviewable

@everpeace
Copy link
Contributor Author

Review status: 0 of 7 files reviewed at latest revision, 4 unresolved discussions.


components/openmpi-controller/Dockerfile, line 7 at r1 (raw file):

Previously, jiezhang (Jie Zhang) wrote…

Can you move it to /kubeflow/openmpi/openmpi-controller? Just be consistent with other artifacts (like assets) deployed by this package.

Done.


kubeflow/openmpi/assets/init.sh, line 61 at r1 (raw file):

Previously, jiezhang (Jie Zhang) wrote…

nit: all constants declared at the top of the file.

Done.


kubeflow/openmpi/assets/init.sh, line 64 at r1 (raw file):

Previously, jiezhang (Jie Zhang) wrote…

Consider moving this into a helper function.

Done.


kubeflow/openmpi/prototypes/openmpi.jsonnet, line 28 at r1 (raw file):

Previously, jiezhang (Jie Zhang) wrote…

nit: one space

Done.


Comments from Reviewable

@everpeace
Copy link
Contributor Author

everpeace commented May 18, 2018

@jiezhang I fixed my pr about your feedback and rebased it. Would you mind reviewing it again?

Sorry I'm unable to move the package to kubeflow due to #197. I'll publish new package once this is merged.

@jiezhang Ok. I understood the situation. I'll wait for it. Thank you in advance.

please autoformat_jsonnet.sh.

@pdmack apologies 🙇 I did it.

@jlewi
Copy link
Contributor

jlewi commented May 18, 2018

/lgtm
/hold

Putting a hold on this because I think someone more familiar with this should approve.

@jiezhang
Copy link

/lgtm

@jiezhang
Copy link

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jiezhang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jiezhang
Copy link

/hold cancel

@jiezhang
Copy link

/test kubeflow-presubmit

@jiezhang
Copy link

/retest

@jiezhang
Copy link

Review status: 0 of 7 files reviewed at latest revision, 4 unresolved discussions.


kubeflow/openmpi/util.libsonnet, line 16 at r2 (raw file):

      [std.splitLimit(keyValue, "=", 1)[0]]: std.splitLimit(keyValue, "=", 1)[1]
      for keyValue in $.toArray(str)
    } else {},

I just realized that resource limits are not set properly when customResources is null. Thanks for fixing this.


Comments from Reviewable

@jiezhang jiezhang mentioned this pull request May 18, 2018
@everpeace
Copy link
Contributor Author

/test kubeflow-presubmit

@everpeace
Copy link
Contributor Author

Review status: 0 of 7 files reviewed at latest revision, 5 unresolved discussions.


kubeflow/openmpi/util.libsonnet, line 16 at r2 (raw file):

Previously, jiezhang (Jie Zhang) wrote…

I just realized that resource limits are not set properly when customResources is null. Thanks for fixing this.

no worry. sorry for fixing this without mentioning...


Comments from Reviewable

@k8s-ci-robot k8s-ci-robot merged commit 48791c4 into kubeflow:master May 19, 2018
saffaalvi pushed a commit to StatCan/kubeflow that referenced this pull request Feb 11, 2021
…runAsGroup`, `supplementalGroups`) (kubeflow#820)

* openmpi: support non-root user in `init.sh`

* openmpi: change openmpi-controller path outside of /root so non-root users can read

* openmpi: support `runAsUser`, `runAsGroup`, `supplementalGroups`.
yanniszark pushed a commit to arrikto/kubeflow that referenced this pull request Feb 15, 2021
* Add tfjob and pytorch examples to e2e

* Fix tests

* Fix tests

* Fix tests

* Fix tests

* Install crds before katib

* Fix tests

* Adding timeout to 30 min
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants