
Multi gpu training #283

Merged: 5 commits into main from multi-gpu-training on Jun 7, 2024
Conversation

constantinpape (Owner)

Multi-GPU training using DistributedDataParallel, based on https://pytorch.org/tutorials/intermediate/ddp_tutorial.html.

This seems to be working (the network trains in the example script), but I haven't verified that this implementation does exactly what it's supposed to.
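
For reference, here is a minimal, self-contained sketch of the DDP pattern from the linked tutorial. The model, data, and hyperparameters are placeholders for illustration, not the actual code in this PR:

```python
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def setup(rank, world_size):
    # Each process joins the group through this rendezvous address.
    os.environ["MASTER_ADDR"] = "localhost"
    os.environ["MASTER_PORT"] = "12355"
    dist.init_process_group("nccl", rank=rank, world_size=world_size)


def train(rank, world_size):
    setup(rank, world_size)

    # Placeholder model; in practice this would be the network being trained.
    model = nn.Linear(10, 1).to(rank)
    # DDP wraps the model and synchronizes gradients across processes.
    ddp_model = DDP(model, device_ids=[rank])

    optimizer = torch.optim.Adam(ddp_model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    for _ in range(10):
        optimizer.zero_grad()
        x = torch.randn(8, 10, device=rank)
        y = torch.randn(8, 1, device=rank)
        loss = loss_fn(ddp_model(x), y)
        loss.backward()  # gradients are all-reduced across GPUs here
        optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    world_size = torch.cuda.device_count()
    # Spawn one training process per GPU.
    mp.spawn(train, args=(world_size,), nprocs=world_size, join=True)
```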

anwai98 (Contributor) commented Jun 3, 2024

Hi @constantinpape,

I found the tutorial covering the DistributedSampler setup for the dataloaders: https://pytorch.org/tutorials/intermediate/FSDP_tutorial.html
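
For context, a minimal sketch of how DistributedSampler is typically wired into a DataLoader. The dataset here is a placeholder, and `rank`/`world_size` are assumed to come from an already-initialized process group, as in the DDP setup above:

```python
import torch
import torch.distributed as dist
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

# Assumes dist.init_process_group(...) has already been called in this process.
rank = dist.get_rank()
world_size = dist.get_world_size()

# Placeholder dataset standing in for the real training data.
dataset = TensorDataset(torch.randn(1024, 10), torch.randn(1024, 1))

# The sampler partitions the dataset so each rank sees a disjoint shard.
sampler = DistributedSampler(dataset, num_replicas=world_size, rank=rank, shuffle=True)

# Note: shuffle is left off on the DataLoader; the sampler handles shuffling.
loader = DataLoader(dataset, batch_size=8, sampler=sampler)

for epoch in range(5):
    # set_epoch re-seeds the sampler so the shuffling order differs per epoch.
    sampler.set_epoch(epoch)
    for x, y in loader:
        pass  # training step goes here
```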

constantinpape merged commit be6bdcd into main on Jun 7, 2024 (4 checks passed).
constantinpape deleted the multi-gpu-training branch on June 7, 2024 at 18:00.