WandB Logger in Single Machine + Multi-GPU DDP setting #17225
I'm switching from MLFlowLogger to WandbLogger, and one problem I've run into is that when using more than one GPU, logging through the WandbLogger breaks after the first GPU process is initialized. Reading the WandB documentation, it looks like the WandB Run object is not available on any rank > 0. So have the logging commands from the PL docs been tested in a multi-GPU setting?
I keep getting these errors before Trainer.fit(...) is called:
![Screenshot 2023-03-29 at 8 42 14 AM](https://user-images.githubusercontent.com/8902328/228598506-49062312-53d7-49f8-8010-961ac939bc9d.png)
This is happening with PyTorch Lightning 2.0.0 and PyTorch 1.13.1 on 2 V100 GPUs.
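For reference, a minimal sketch of the pattern in question, assuming a standard WandbLogger setup (the LitModel class, the log_extra helper, and the metric names below are placeholders, not code from the PL docs): self.log is rank-aware and safe on every process, while direct self.logger.experiment access touches the underlying wandb Run, which exists only on rank 0, so it is typically wrapped in rank_zero_only:

```python
import torch
import pytorch_lightning as pl
from pytorch_lightning.utilities import rank_zero_only


class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 1)

    def training_step(self, batch, batch_idx):
        loss = self.layer(batch).mean()  # toy loss for illustration
        # self.log is rank-aware and safe to call from every DDP process.
        self.log("train/loss", loss)
        return loss

    @rank_zero_only
    def log_extra(self, value):
        # self.logger.experiment is the real wandb Run only on rank 0;
        # other ranks get a dummy object, so any wandb-specific call
        # is guarded with rank_zero_only. log_extra is a hypothetical
        # helper, not a Lightning hook.
        self.logger.experiment.log({"custom/metric": value})

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)
```

Note that this guard only covers calls made from inside the LightningModule; if the errors appear before Trainer.fit(...) is even called, the failure may instead be in the logger initialization itself.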
-
@carmocca Can you help with this please?