CI: Add timeout to torch ucc tests #1204

Merged
Sergei-Lebedev merged 1 commit into openucx:master from dpressle:ci_ucc_test_timeout
Oct 20, 2025

Conversation

@dpressle
Collaborator

What

Add timeout to ucc torch tests in CI

Why ?

In some cases the tests can hang, and the job stays stuck for 5 hours until the job's global timeout is reached.

How ?

This commit adds a 90-minute timeout to the torch ucc test command so that the job terminates quickly if the test hangs.
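
The diff itself is not shown in this conversation, so the following is only a minimal sketch of the general technique (running a test command under a wall-clock limit), not the change that was merged. The helper name `run_with_timeout`, the placeholder script name in the usage comment, and the choice of exit code 124 (borrowed from the convention of GNU `timeout`) are all assumptions made for illustration.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its exit code, or 124 if it exceeds timeout_s seconds.

    Hypothetical helper for illustration; not taken from the UCC CI scripts.
    """
    try:
        # subprocess.run kills the child process and waits for it
        # before raising TimeoutExpired.
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        print(f"timed out after {timeout_s} s: {cmd}", file=sys.stderr)
        # 124 mirrors the exit status GNU `timeout` reports; chosen here by convention.
        return 124


# Example use in a CI step (the script name is a placeholder):
#   sys.exit(run_with_timeout(["./run_torch_ucc_tests.sh"], 90 * 60))
```

With a limit of 90 * 60 seconds, a hung test would fail the step after 90 minutes instead of holding the job until the 5-hour global timeout. Note that `subprocess.run` only kills the direct child, so a real CI wrapper may also need to clean up any processes the test spawns.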

Issue: HPCINFRA-3985
Signed-off-by: Daniel Pressler <danielpr@nvidia.com>
@dpressle
Collaborator Author

bot:retest

@dpressle
Collaborator Author

bot:retest

@dpressle
Collaborator Author

@dpressle
Collaborator Author

bot:retest

@dpressle dpressle marked this pull request as ready for review October 19, 2025 15:07
@dpressle dpressle requested a review from ikryukov October 19, 2025 15:07
@Sergei-Lebedev Sergei-Lebedev enabled auto-merge (squash) October 20, 2025 13:40
@Sergei-Lebedev Sergei-Lebedev merged commit f66f66a into openucx:master Oct 20, 2025
9 checks passed
@dpressle dpressle deleted the ci_ucc_test_timeout branch October 20, 2025 15:53
4 participants