Catch any exception to prevent the discovery thread from silently dying (#3436)

Signed-off-by: Mohamed Yousef <myb@imachines.com>
ASDen committed Mar 30, 2022
1 parent 12f9f9a commit afb0497
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion horovod/runner/elastic/driver.py
@@ -194,7 +194,7 @@ def _discover_hosts(self):
     if update_res != HostUpdateResult.no_update:
         self._notify_workers_host_changes(self._host_manager.current_hosts, update_res)
         self._wait_hosts_cond.notify_all()
-except RuntimeError as e:
+except BaseException as e:
     if first_update:
         # Misconfiguration, fail the job immediately
         self._shutdown.set()
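
The effect of widening the except clause can be illustrated with a toy loop. This is a minimal sketch, not Horovod's code: `run_discovery_loop`, `discover_hosts`, and the `ValueError` failure are hypothetical stand-ins chosen for illustration. It assumes the discovery step raises something other than `RuntimeError`, which is the case the narrower clause would miss.

```python
import threading


def run_discovery_loop(caught, attempts=3):
    """Run a toy discovery loop in a thread; return the errors it handled."""
    handled = []

    def discover_hosts():
        # Hypothetical failure that is NOT a RuntimeError.
        raise ValueError("malformed host list")

    def loop():
        for attempt in range(attempts):
            try:
                discover_hosts()
            except caught as e:
                # Treated as transient: record it and retry on the next iteration.
                handled.append(f"attempt {attempt}: {e}")

    worker = threading.Thread(target=loop)
    worker.start()
    worker.join()
    return handled


# With the narrow clause the ValueError escapes and the thread exits on the
# first attempt; the caller only sees that nothing was handled.
narrow = run_discovery_loop(RuntimeError)
# With the broad clause every failure is caught and the loop keeps running.
broad = run_discovery_loop(BaseException)
print(len(narrow), len(broad))  # prints "0 3"
```

In the narrow case Python's default thread exception hook writes a traceback to stderr, but the main thread continues without being notified. One trade-off to keep in mind: `BaseException` also covers `SystemExit` and `KeyboardInterrupt`, so a loop that catches it needs its own exit condition, such as the shutdown event used in the diff above.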
