# torch-distributed-bench

A simple utility for benchmarking the throughput of `torch.distributed` connections.

## Usage

Gloo on local CPU (for testing purposes):

```
torchrun --nnodes 1 --nproc-per-node 8 bench.py --iterations 1000
```

NCCL on a single 8-GPU node:

```
torchrun --nnodes 1 --nproc-per-node 8 bench.py --iterations 1000 --backend nccl
```

4 GPU nodes with 8 GPUs each (32 GPUs total):

```
torchrun --nnodes 4 --nproc-per-node 8 bench.py --iterations 1000 --backend nccl
```

Depending on how your environment is configured, you may need to write your own `bench()` function to set up torch correctly.
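As a rough illustration only (this is not the repository's actual code; the `bench()` signature, the choice of all-reduce as the collective, and the GB/s units are all assumptions), a minimal throughput benchmark might look like:

```python
import os
import time

import torch
import torch.distributed as dist


def bench(iterations: int = 100, numel: int = 1 << 20, backend: str = "gloo") -> float:
    """Time repeated all-reduces and return approximate throughput in GB/s.

    Hypothetical sketch: the real bench.py may measure different collectives
    or report different units.
    """
    if not dist.is_initialized():
        # torchrun normally sets these; default them for a single-process run.
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
        os.environ.setdefault("MASTER_PORT", "29500")
        os.environ.setdefault("RANK", "0")
        os.environ.setdefault("WORLD_SIZE", "1")
        dist.init_process_group(backend)

    tensor = torch.randn(numel)
    # Warm-up pass so connection setup is not included in the timing.
    dist.all_reduce(tensor)

    start = time.perf_counter()
    for _ in range(iterations):
        dist.all_reduce(tensor)
    elapsed = time.perf_counter() - start

    bytes_moved = iterations * tensor.numel() * tensor.element_size()
    return bytes_moved / elapsed / 1e9


if __name__ == "__main__":
    print(f"~{bench(iterations=10):.3f} GB/s")
```

A custom `bench()` in this spirit would be the place to add any environment-specific setup (device selection, NCCL environment variables, interface binding) before the timed loop.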
