Increase timeout for ProcessGroupGlooTest (pytorch#85474)
We see spurious failures due to timeouts in `test_allreduce_coalesced_basics`, but only when running the whole test suite with
`python run_test.py --verbose -i distributed/test_c10d_gloo`. Increasing the timeout from 5s to 50s should provide enough leeway to avoid this. Note that the default for `_timeout` is 30 minutes.

Originally reported in EasyBuild at easybuilders/easybuild-easyconfigs#15137 (comment); patch proposed by @casparvl.
Pull Request resolved: pytorch#85474
Approved by: https://github.com/rohan-varma
Flamefire authored and alvgaona committed Oct 11, 2022
1 parent 3288f03 commit f010669
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion test/distributed/test_c10d_gloo.py
@@ -215,7 +215,7 @@ def setUp(self):

     def opts(self, threads=2):
         opts = c10d.ProcessGroupGloo._Options()
-        opts._timeout = 5.0
+        opts._timeout = 50.0
         opts._devices = [create_device(interface=LOOPBACK)]
         opts._threads = threads
         return opts
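
The reasoning behind the larger timeout can be illustrated with a minimal standard-library sketch. This is not PyTorch code: `wait_for_operation` and all durations are hypothetical, scaled-down stand-ins for a collective operation and for the 5s and 50s timeout values. It only shows that an operation slowed down by load on the machine fails against a tight timeout and passes against a tenfold larger one.

```python
import threading


def wait_for_operation(work_seconds: float, timeout_seconds: float) -> bool:
    """Return True if a simulated operation completes before the timeout.

    Hypothetical helper: a timer thread stands in for the real work.
    """
    done = threading.Event()
    worker = threading.Timer(work_seconds, done.set)
    worker.daemon = True
    worker.start()
    finished = done.wait(timeout=timeout_seconds)
    worker.cancel()  # no effect if the timer has already fired
    return finished


# Scaled-down, assumed durations (seconds); the 10x ratio mirrors 5.0 -> 50.0.
SHORT_TIMEOUT = 0.1
LONG_TIMEOUT = 1.0

# Running the test alone: the operation is fast and the short timeout suffices.
quiet_run = wait_for_operation(0.01, SHORT_TIMEOUT)

# Running the whole suite: the same operation is slower and the short timeout trips.
loaded_short = wait_for_operation(0.3, SHORT_TIMEOUT)

# The same slow operation fits comfortably within the larger timeout.
loaded_long = wait_for_operation(0.3, LONG_TIMEOUT)
```

A larger timeout only raises the upper bound on waiting; a run that finishes quickly still returns as soon as the operation completes.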