Closed
Labels
bug (Something isn't working), distributed (Generic distributed-related topic), help wanted (Open to be worked on), priority: 0 (High priority task)
Milestone
Description
🐛 Bug
When a trainer is created with accelerator="ddp_cpu" together with plugins=[DDPPlugin(find_unused_parameters=True)], the program tries to re-run itself (recreating the trainer each time) and finally fails when checking GPU devices.
Please reproduce using the BoringModel
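from pytorch_lightning import Trainer
from pytorch_lightning.plugins import DDPPlugin

trainer = Trainer(
    max_epochs=1,
    gpus=0,  # no GPUs requested
    accelerator="ddp_cpu",
    num_processes=4,
    plugins=[DDPPlugin(find_unused_parameters=True)],
)
To Reproduce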
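To check whether the script really is being re-executed, a small diagnostic like the one below could be placed just before the Trainer(...) call. This is only a sketch: `describe_process` and `RANK_ENV_VARS` are hypothetical names introduced here, and the environment variables listed are an assumption about what a distributed launcher might set, not something taken from this report.

```python
import os
import sys

# Assumption: variables a distributed launcher may set for worker processes.
# The report does not say which of these (if any) are present.
RANK_ENV_VARS = ("LOCAL_RANK", "NODE_RANK", "WORLD_SIZE")


def describe_process(environ=None):
    """Return a one-line summary of the current process for debugging."""
    env = os.environ if environ is None else environ
    ranks = ", ".join(
        f"{name}={env.get(name, '<unset>')}" for name in RANK_ENV_VARS
    )
    return f"pid={os.getpid()} argv={sys.argv!r} {ranks}"


if __name__ == "__main__":
    # Place this line right before the Trainer(...) call.
    print(describe_process(), flush=True)
```

If the line is printed several times with different PIDs, the top-level script (including trainer creation) is being run once per process, which would match the behavior described above.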
Expected behavior
Environment
- PyTorch Version (e.g., 1.0): 1.7.1
- OS (e.g., Linux): OS X
- How you installed PyTorch (conda, pip, source): pip
- Build command you used (if compiling from source):
- Python version: 3.8.7
- CUDA/cuDNN version: No
- GPU models and configuration: gpus=None
- Any other relevant information:
Additional context
awaelchli