
openmpi: make 'schedulerName' configurable to use custom schedulers. #683

Merged

Conversation

@everpeace (Contributor) commented Apr 19, 2018

What the PR changes

This PR introduces a schedulerName parameter in the openmpi prototype.

Why do we need this?

With the Kubernetes default scheduler, scheduling multiple openmpi prototypes can sometimes lead to deadlocks, as discussed in kubeflow/training-operator#165. Imagine the case where two openmpi prototypes with 100 workers each are scheduled simultaneously.

In that case, users would want gang scheduling (scheduling a group of pods all together). Currently, kube-arbitrator can achieve such scheduling.
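
Concretely, the change threads the new parameter from the prototype params into the pod spec. A minimal sketch of the plumbing (the file layout and field names below are illustrative, not the exact diff):

    // kubeflow/openmpi/workloads.libsonnet (illustrative excerpt only).
    // params.schedulerName comes from the new prototype parameter; when
    // it is left at "default-scheduler", behavior is unchanged.
    {
      podSpec(params):: {
        terminationGracePeriodSeconds: 30,
        dnsPolicy: "ClusterFirst",
        // Hand the pods to a custom scheduler (e.g. a gang scheduler)
        // instead of the Kubernetes default scheduler.
        schedulerName: params.schedulerName,
      },
    }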



@everpeace (Contributor, Author)

@jiezhang would you mind reviewing this, too? Thank you in advance.

@pdmack (Member) commented Apr 19, 2018

/ok-to-test

@jiezhang

:lgtm:



@jiezhang

/lgtm

@jiezhang

@everpeace Thanks for contributing to the package.

@jiezhang



kubeflow/openmpi/workloads.libsonnet, line 41 at r1 (raw file):

          terminationGracePeriodSeconds: 30,
          dnsPolicy: "ClusterFirst",
          schedulerName: params.schedulerName,

Actually I modified the workload to use Pod instead of StatefulSet in #671 (to support graceful shutdown).

IIUC we'll be scheduling all the pods at the same time now. Do we still need this change?



@jiezhang

/lgtm cancel

@k8s-ci-robot removed the lgtm label Apr 19, 2018
@everpeace (Contributor, Author)



kubeflow/openmpi/workloads.libsonnet, line 41 at r1 (raw file):

Previously, jiezhang (Jie Zhang) wrote…

Actually I modified the workload to use Pod instead of StatefulSet in #671 (to support graceful shutdown).

IIUC we'll be scheduling all the pods at the same time now. Do we still need this change?

I don't think so. Even if we use Pods instead of StatefulSets, the situation is unchanged in terms of scheduling: the default scheduler still schedules pending pods (created by the openmpi prototype) one by one. This can lead to deadlock. For example, if two 100-worker jobs are pending and the cluster only has room for 150 pods, each job can end up with a partial set of scheduled pods and neither job can ever start.
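
For illustration, with this change a user could hand the whole prototype to a gang scheduler by overriding the parameter in the ksonnet app's params.libsonnet. A minimal sketch, where "my-gang-scheduler" is a placeholder for whatever schedulerName a kube-arbitrator deployment actually registers:

    // params.libsonnet (illustrative sketch, not part of this PR).
    {
      components: {
        openmpi: {
          // Placeholder name; use the schedulerName your kube-arbitrator
          // (gang) scheduler registers with the cluster.
          schedulerName: "my-gang-scheduler",
        },
      },
    }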



@jiezhang

/lgtm



@jiezhang

Reviewed 2 of 2 files at r1.

@jiezhang

/cc @jlewi

@k8s-ci-robot requested a review from jlewi April 20, 2018 01:15
@jiezhang

/assign @jlewi

With the Kubernetes default scheduler, scheduling multiple openmpi packages can sometimes lead to deadlocks, as discussed in kubeflow/training-operator#165.

In that case, users would want to perform gang scheduling (scheduling a group of pods all together). Currently, kube-arbitrator supports it.

To achieve that, we need to make 'schedulerName' customizable.
@everpeace force-pushed the feature/openmpi-custom-scheduler branch from 8305dcd to 4c4e4a2 on April 20, 2018 02:13
@k8s-ci-robot removed the lgtm label Apr 20, 2018
@everpeace (Contributor, Author)

I rebased my branch onto the latest master because it had conflicts.

@pdmack (Member) commented Apr 20, 2018

/approve
/lgtm

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jiezhang, pdmack

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot merged commit 8e4f669 into kubeflow:master Apr 20, 2018
saffaalvi pushed a commit to StatCan/kubeflow that referenced this pull request Feb 11, 2021
surajkota pushed a commit to surajkota/kubeflow that referenced this pull request Jun 13, 2022
unittests should compare result of kustomize build to golden set of YAML resources. (kubeflow#1016)

* unittests should compare result of kustomize build to golden set of YAML resources.

* Per kubeflow/manifests#306 to allow reviewers to verify that the expected
  output is correct we should check in the result of "kustomize build -o"
  so that reviewers can review the diff and verify that it is correct.

* This also simplifies the test generation code; the python script
  generate_tests.py just recurses over the directory tree, runs
  "kustomize build -o", and checks the output into the test_data directory.

* This is different from what the tests are currently doing.

  * Currently what the generation scripts do is generate "kustomization.yaml" files and then generate the expected output from that when the test is run.

  * This makes it very difficult to validate the expected output and to
    debug whether the expected output is correct.

  * Going forward, per kubeflow#1014, I think what we want to do is check in
    test cases: kustomization.yaml files covering the various kustomizations
    that we want to validate work correctly

  * Our generate scripts would then run "kustomize build" to generate expected
    output and check that in so that we can validate that the expected output
    is correct.

* Also change the test data structure so that it mirrors the kustomize directory tree rather than flattening the tests into the "tests" directory.

  * Fix kubeflow#683

* Right now running the unittests takes a long time

  * The problem is that we generate unittests for every "kustomization.yaml"
    file
  * Per kubeflow#1014 this is kind of pointless/redundant because most of these
    tests aren't actually testing kustomizations.
  * We will address this in follow on PRs which will add more appropriate
    tests and remove some of these unnecessary/redundant tests.

* Cherry pick AWS fixes.

* Regenerate the tests.

* Fix the unittests; the generate logic needed updating to remove tests
  that aren't part of this PR.

* Address comments.

* Rebase on master and regenerate the tests.