Quick Start
Mateusz Kapusta edited this page Jun 7, 2026
rixa can be used with both PyTorch and NVSHMEM.
A PyTorch distributed job can be started with a single call:

    import torch.distributed
    import rixa

    rixa.pytorch.init_process_group("gloo")
    ML_training_loop()  # placeholder for your training code
    torch.distributed.destroy_process_group()

More options are available for the NCCL backend. If the nccl backend is specified and gpu_assign_method is "local_rank" (the default), each process is assigned a GPU based on its local rank.
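
To make "local rank" concrete, the sketch below derives it from a list of hostnames indexed by global rank and maps it to a GPU index. It only illustrates the idea and is not rixa's implementation: the helper names `local_ranks` and `gpu_for_rank`, the host names, and the wrap-around used when a node runs more processes than it has GPUs are all assumptions made for this example.

```python
from collections import defaultdict


def local_ranks(hostnames):
    """Return the local rank of every global rank.

    hostnames[i] is the node that global rank i runs on; the local rank
    counts processes on the same node in order of global rank.
    """
    seen = defaultdict(int)  # processes counted so far on each node
    result = []
    for host in hostnames:
        result.append(seen[host])
        seen[host] += 1
    return result


def gpu_for_rank(local_rank, gpus_per_node):
    # Hypothetical policy: wrap around if a node is oversubscribed.
    return local_rank % gpus_per_node


hosts = ["node0", "node0", "node1", "node1"]  # hypothetical 2-node job
ranks = local_ranks(hosts)
devices = [gpu_for_rank(r, gpus_per_node=2) for r in ranks]
print(ranks)    # [0, 1, 0, 1]
print(devices)  # [0, 1, 0, 1]
```

Under this scheme, two processes on the same node never share a GPU as long as the node has at least as many GPUs as processes.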
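With local-rank assignment, initialization looks like this:

    import torch.distributed
    import rixa

    rixa.pytorch.init_process_group("nccl", gpu_assign_method="local_rank")
    # No need to call torch.cuda.set_device
    GPU_training_loop()  # placeholder for your GPU training code
    torch.distributed.destroy_process_group()

NVSHMEM example: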

    import nvshmem.core as nvshmem  # assumption: NVSHMEM Python bindings (nvshmem4py)
    import rixa
    from cuda.core import Device

    store = rixa.PMIxStore(30)  # timeout in seconds
    dev = Device(store.get_local_rank())  # use the local rank as the device index
    rixa.nvshmem.init(dev, store)
    nvshmem.finalize()
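
Each example on this page pairs an initialization call with a teardown call (`destroy_process_group` or `finalize`). If the work in between raises an exception, the teardown is skipped. One possible way to guard against that is a small context manager, sketched here. The name `distributed_session` and the stand-in functions `fake_init` and `fake_teardown` are hypothetical and are not part of rixa.

```python
from contextlib import contextmanager


@contextmanager
def distributed_session(init, teardown):
    """Run `init`, yield its result, and always run `teardown` afterwards."""
    handle = init()
    try:
        yield handle
    finally:
        teardown()


# Stand-ins for the real init/teardown calls (hypothetical, for illustration).
events = []


def fake_init():
    events.append("init")
    return "handle"


def fake_teardown():
    events.append("teardown")


try:
    with distributed_session(fake_init, fake_teardown):
        raise RuntimeError("training failed")
except RuntimeError:
    pass

print(events)  # ['init', 'teardown']
```

In a real job, `init` could be `lambda: rixa.pytorch.init_process_group("gloo")` and `teardown` could be `torch.distributed.destroy_process_group`.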