Note
Each process adds a GPU memory overhead (approximately 1 GB, although it can be much higher) due to PyTorch's CUDA kernels. See PyTorch Issue #12873 for more details.
Note
At the moment, only the simultaneous training and evaluation of agents with local memory (no memory sharing) is implemented.
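To illustrate the execution model described above (one process per agent, each with its own local memory and no sharing between processes), here is a minimal sketch using only the Python standard library. It is a hypothetical illustration of the pattern, not the skrl implementation; the function names (`train_agent`, `run`) and the list standing in for an agent's memory are assumptions for the example.

```python
import multiprocessing as mp


def train_agent(agent_id, timesteps, queue):
    # Each process keeps its own local memory (a plain list here,
    # standing in for an agent's replay memory); it is never shared
    # with the other processes.
    local_memory = []
    for t in range(timesteps):
        local_memory.append((agent_id, t))  # stand-in for a stored transition
    queue.put((agent_id, len(local_memory)))


def run(num_agents=3, timesteps=100):
    # "fork" is Unix-only; it avoids re-importing the module in children
    ctx = mp.get_context("fork")
    queue = ctx.Queue()
    processes = [ctx.Process(target=train_agent, args=(i, timesteps, queue))
                 for i in range(num_agents)]
    for p in processes:
        p.start()
    # drain the queue before joining to avoid blocking on a full pipe
    results = dict(queue.get() for _ in processes)
    for p in processes:
        p.join()
    return results


if __name__ == "__main__":
    print(run())
```

Each spawned process collects its full set of transitions independently, which is also why the per-process GPU memory overhead mentioned in the note above accumulates with the number of agents.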
Snippet
../snippets/trainer.py
../../../skrl/trainers/torch/parallel.py
API
skrl.trainers.torch.parallel.ParallelTrainer
__init__