Commit
Add synchronization barriers to the ends of the test_*_duplicate_name_error tests (including reducescatter test)

Without this, deadlocks in the subsequent test were possible: one process would already have enqueued a collective op like hvd.broadcast(), while the other would still block in hvd.init() [specifically in _get_process_set_ids_and_ranks()].

I could not use hvd.barrier() for this second barrier because that would somehow cause a segmentation fault. Went for an allreduce instead.

Signed-off-by: Max H. Gerlach <git@maxgerlach.de>
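
The allreduce workaround mentioned above could look roughly like the following. This is a minimal sketch, not the code from the commit: the helper name `allreduce_barrier` and the op name `end_of_test_sync` are hypothetical, and `hvd` is assumed to be an already initialized Horovod framework module (for example `horovod.torch`) whose `allreduce` blocks until the result is available. An allreduce cannot finish before every rank has contributed, so waiting for its result has the effect of a barrier.

```python
def allreduce_barrier(hvd, tensor, name="end_of_test_sync"):
    """Wait until every rank has reached this call.

    `hvd` is the imported Horovod module (assumption: a variant such as
    horovod.torch, where allreduce returns only after the collective is done).
    `tensor` is any small tensor of the matching framework; its value is
    irrelevant, only the synchronization side effect matters.
    `name` is a hypothetical op name; every rank must pass the same one.
    """
    # The collective completes only after all ranks have submitted it,
    # so returning from this call implies all ranks got this far.
    return hvd.allreduce(tensor, name=name)
```

At the end of a test body this might be called as `allreduce_barrier(hvd, torch.zeros(1))`, assuming PyTorch is the framework under test. In a graph-mode setting the returned op would additionally have to be evaluated for the synchronization to take place.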