🐛 Bug

Hi, one of the new features in #2528, gather_all_tensors_if_available, has a list copy bug. It causes the gathered tensors on all GPUs to wrongly end up identical to the tensor from a single GPU, because every entry of the gather list shares the same storage:
https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/metrics/converters.py#L304
The list construction there needs to be changed so that each entry is a separate tensor.
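A minimal sketch of the buggy pattern and the proposed change, assuming the gather buffer is built with Python list multiplication (which is what makes every entry alias one storage); the surrounding function body is illustrative, not the verbatim converters.py code:

```python
import torch
import torch.distributed as dist

def gather_all_tensors_if_available(result: torch.Tensor, group=None):
    """Gather a tensor from all processes (sketch of the reported pattern)."""
    if dist.is_available() and dist.is_initialized():
        if group is None:
            group = dist.group.WORLD
        world_size = dist.get_world_size(group)

        # Buggy: list multiplication repeats the *same* tensor object
        # world_size times, so every entry aliases one storage and
        # all_gather keeps overwriting it.
        # gathered_result = [torch.zeros_like(result)] * world_size

        # Fixed: allocate an independent buffer per rank.
        gathered_result = [torch.zeros_like(result) for _ in range(world_size)]

        dist.all_gather(gathered_result, result, group=group)
        result = gathered_result
    return result
```

With the list comprehension, all_gather writes each rank's tensor into its own buffer, so gathered_result[i] really holds rank i's value instead of every entry ending up the same.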
@ShomyLiu good catch, would you be up for sending a PR? Please note that the function is not used anywhere yet, but it is there for future changes to the metrics package.
@SkafteNicki It would be my pleasure to send a PR. I will finish this as soon as possible.
Yeah, it's a new function that wraps torch.distributed.all_gather. But I think it is a very common use case; especially when using DDP mode, we always need to gather the outputs across all the GPUs.
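For context, a sketch of that common DDP pattern: gathering a per-rank output tensor onto every process. It assumes torch.distributed has already been initialized and that every rank passes a tensor of the same shape; gather_outputs is just an illustrative name, not a Lightning API:

```python
import torch
import torch.distributed as dist

def gather_outputs(local_output: torch.Tensor) -> torch.Tensor:
    """Collect one tensor per process so every rank sees all outputs."""
    world_size = dist.get_world_size()
    # One independent buffer per rank (note: not [buf] * world_size).
    buffers = [torch.zeros_like(local_output) for _ in range(world_size)]
    dist.all_gather(buffers, local_output)
    # Shape: (world_size, *local_output.shape); reduce however the metric needs.
    return torch.stack(buffers)
```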