Hello,
I'm following your GRPO multi-GPU approach in a SLURM environment. I understand why we would want to use separate GPU card(s) for the vLLM deployment, but I'm getting errors related to NPROC_PER_NODE. I was told that its value should equal the number of GPUs, which seems to contradict the idea of reserving separate cards for vLLM. I can use at most 4 GPUs per node, and I get the following error:
AssertionError: Colocate mode requires device_count(4) == num_infer_workers(4). Please check if your device count matches NPROC_PER_NODE setting.
Any ideas why?