[MPIJob] Change hyper-param loop to reuse the same CRD/containers #1247

Merged
merged 39 commits into from Aug 24, 2021

Conversation

yaronha
Collaborator

@yaronha yaronha commented Aug 24, 2021

Previously, every hyper-param iteration used a separate CRD and its own containers, consuming a large amount of resources.
This is a problem since MpiJobs are likely to use GPUs.

This change runs the iteration loop inside the MpiJob launcher, so the same containers/resources are used for all iterations.
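
To illustrate the difference, here is a minimal sketch and not the actual MLRun code. `FakeCluster`, `create_mpijob`, `expand_grid`, `run_per_iteration` and `run_in_launcher` are hypothetical names made up for this example, and a full grid expansion of the hyper-params is assumed. The fake cluster only counts how many MpiJob resources would be created under each strategy.

```python
from itertools import product


def expand_grid(hyperparams):
    """Expand {'lr': [0.1, 0.01], 'batch': [32, 64]} into one dict per combination."""
    keys = list(hyperparams)
    value_lists = [hyperparams[key] for key in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


class FakeCluster:
    """Hypothetical stand-in for the cluster: it only counts created MpiJob resources."""

    def __init__(self):
        self.jobs_created = 0

    def create_mpijob(self):
        # In a real cluster this would allocate a CRD, a launcher and workers (often GPUs).
        self.jobs_created += 1


def run_per_iteration(cluster, handler, hyperparams):
    """Old behaviour (as described above): a separate MpiJob per iteration."""
    results = []
    for params in expand_grid(hyperparams):
        cluster.create_mpijob()
        results.append(handler(**params))
    return results


def run_in_launcher(cluster, handler, hyperparams):
    """New behaviour (as described above): one MpiJob, the loop runs inside it."""
    cluster.create_mpijob()
    return [handler(**params) for params in expand_grid(hyperparams)]
```

With a 2 x 2 grid, `run_per_iteration` asks the fake cluster for four jobs while `run_in_launcher` asks for one, and both return the same list of results.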

yaron haviv added 30 commits April 27, 2021 23:48
This reverts commit 2538e36
@yaronha yaronha changed the title [Runtimes] MpiJob hyper-param loop will reuse the same CRD/containers [Runtimes] MpiJob hyper-param loop reuse the same CRD/containers Aug 24, 2021
@yaronha yaronha requested a review from Hedingber August 24, 2021 20:43
@Hedingber Hedingber changed the title [Runtimes] MpiJob hyper-param loop reuse the same CRD/containers [MPIJob] Change hyper-param loop to reuse the same CRD/containers Aug 24, 2021
@Hedingber Hedingber merged commit f269bfa into mlrun:development Aug 24, 2021
@yaronha yaronha deleted the fix-mpijob-hparam branch November 2, 2021 15:49

2 participants