Add elastic run api #3503
Unit Test Results (with flaky tests): 968 files (+19), 968 suites (+19), 10h 21m 50s (+40m 19s). Results for commit af09b9a, compared against base commit a304c81. This comment has been updated with latest results.
Signed-off-by: Enrico Minack <github@enrico.minack.dev>
I managed to move the |
…hanges Reverts changes to run_task.py, launch.py and http_client.py. Signed-off-by: Enrico Minack <github@enrico.minack.dev>
Currently, the elastic training mode can only be used through `horovodrun` and not through the existing `horovod.run` API.

This PR allows calling `horovod.run` with `min_num_proc` or `host_discovery_script` set, which runs a `func` in elastic mode.
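
As a rough illustration of the description above, the following sketch shows how a caller might opt into elastic mode through `horovod.run`. Only `horovod.run`, `func`, `min_num_proc` and `host_discovery_script` come from this PR; the helpers `elastic_kwargs`, `is_elastic` and `run_func`, as well as the script path `./discover_hosts.sh`, are hypothetical names used for illustration. A real call will likely need further arguments (for example the number of processes or hosts), which are omitted here.

```python
def elastic_kwargs(min_num_proc=None, host_discovery_script=None):
    """Collect only the elastic-related arguments that were actually set.

    Hypothetical helper: not part of Horovod.
    """
    candidates = {
        "min_num_proc": min_num_proc,
        "host_discovery_script": host_discovery_script,
    }
    return {name: value for name, value in candidates.items() if value is not None}


def is_elastic(min_num_proc=None, host_discovery_script=None):
    """Mirror the rule in the PR description: setting either argument
    requests elastic mode."""
    return bool(elastic_kwargs(min_num_proc, host_discovery_script))


def run_func(func, args=(), min_num_proc=None, host_discovery_script=None):
    """Hypothetical wrapper that forwards to horovod.run.

    Horovod is imported lazily so this sketch can be loaded without it.
    """
    import horovod

    return horovod.run(
        func,
        args=args,
        **elastic_kwargs(min_num_proc, host_discovery_script),
    )


# Assumed usage (requires Horovod and a host discovery script on disk):
# results = run_func(train, args=(10,), min_num_proc=2,
#                    host_discovery_script="./discover_hosts.sh")
```

Leaving both arguments unset keeps the call in the existing, non-elastic code path, which is why the helper forwards only the values that were explicitly provided.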