Fix special tests CI #7790

@awaelchli

🚀 CI / Tests

We currently have about 3 tests failing on master. The failures are a mix of NCCL errors and "RuntimeError: Address already in use".
We need to find the cause of the failures and adjust CI so that it reports them properly.

FAILED tests/plugins/test_deepspeed_plugin.py::test_deepspeed_multigpu_stage_2_accumulated_grad_batches[True]
FAILED tests/plugins/test_deepspeed_plugin.py::test_deepspeed_multigpu_stage_2_accumulated_grad_batches[False]
FAILED tests/plugins/test_sharded_plugin.py::test_ddp_sharded_plugin_manual_optimization[ddp_sharded_spawn]
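
One possible source of "Address already in use" (not confirmed for these tests) is two distributed test runs trying to bind the same rendezvous port. A minimal sketch of one mitigation, assuming the tests initialize `torch.distributed` through the `MASTER_PORT` environment variable: ask the OS for a free port before each run. The helper names `find_free_port` and `set_master_port` are hypothetical and not part of the code base.

```python
import os
import socket


def find_free_port() -> int:
    """Ask the OS for a TCP port that is unused right now (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Binding to port 0 lets the OS pick any free port.
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def set_master_port(env=os.environ) -> int:
    """Store a fresh port in MASTER_PORT (assumed to be read at process-group init)."""
    port = find_free_port()
    env["MASTER_PORT"] = str(port)
    return port


if __name__ == "__main__":
    print("MASTER_PORT =", set_master_port())
```

This is not race-free: the port is released when the probing socket closes, so another process could claim it before the test binds it. It only lowers the chance of a collision between consecutive runs.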

Labels

ci (Continuous Integration), feature (Is an improvement or enhancement), help wanted (Open to be worked on)
